Re: Allow replication roles to use file access functions

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Allow replication roles to use file access functions
Date: 2015-09-04 05:14:08
Message-ID: CAB7nPqQq+-CZMng8+Pii2Wss23_acg6L3KLNd3xzM5uXv0FnvQ@mail.gmail.com
Lists: pgsql-hackers

On Thu, Sep 3, 2015 at 9:53 PM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> * Michael Paquier (michael(dot)paquier(at)gmail(dot)com) wrote:
>> On Thu, Sep 3, 2015 at 11:20 AM, Stephen Frost wrote:
>> >> Not only, +clog, configuration files, etc.
>> >
>> > Configuration files? Perhaps you could elaborate?
>>
>> Sure. Sorry for being unclear. It copies everything that is not a
>> relation file, a kind of base backup without the relation files then.
>
> How does that work on systems where the configuration files aren't
> stored under PGDATA (Debian and derivatives, at least)?

Files outside PGDATA are not fetched. If I recall correctly, symlinks
inside PGDATA have their contents fetched as well.

> I guess I don't
> quite see why it's necessary for pg_rewind to copy the configuration
> files in the first place, it doesn't have the same role as
> pg_basebackup, at least as I understand it.

Of course, fetching them is not mandatory. It is also not worth the
complication of applying a filter to skip a portion of the files, and
I think that's why Heikki took the approach of fetching everything in
PGDATA (except relation files): that was simply easier to implement,
and filtering would have brought little gain.

>> I guess that what you are suggesting instead is an approach where
>> caller sends something like that through the replication protocol with
>> a relation OID and a block list:
>> BLOCK_DIFF relation_oid BLOCK_LIST m,n,[o, ...]
>
> Right, something along those lines is what I had been thinking. We
> would probably need to provide independent commands for the different
> file types, with the parameters expressed in terms appropriate for each
> kind of file (block numbers for heap, XIDs for WAL and CLOG?).
> Essentially, whatever API would be both simple for pg_rewind and general
> enough to be useful for other clients in the future. At least, I
> imagine that pg_rewind would be a bit simpler if it could communicate
> with the backend in the 'language of PG' rather than having to specify
> file names and paths.
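
For illustration, here is a minimal Python sketch of how a server side
might parse the BLOCK_DIFF command quoted above. The command is only a
proposal in this thread, not an existing replication command; the
function name parse_block_diff and the exact grammar (one relation OID,
then a comma-separated block list) are assumptions.

```python
from typing import List, Tuple


def parse_block_diff(command: str) -> Tuple[int, List[int]]:
    """Parse 'BLOCK_DIFF <relation_oid> BLOCK_LIST <m>,<n>,...'.

    Hypothetical syntax taken from the proposal in this thread;
    it is not a command PostgreSQL implements.
    """
    # Split into at most four tokens so the block list may contain spaces.
    tokens = command.split(None, 3)
    if len(tokens) != 4 or tokens[0] != "BLOCK_DIFF" or tokens[2] != "BLOCK_LIST":
        raise ValueError(f"malformed command: {command!r}")
    relation_oid = int(tokens[1])
    # int() tolerates surrounding whitespace, so "1, 2" and "1,2" both work.
    blocks = [int(b) for b in tokens[3].split(",")]
    return relation_oid, blocks
```

The caller speaks in terms of a relation OID and block numbers, and
the mapping to file names and paths stays on the server side.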

I guess that makes sense if we want to remove the superuser-only
barrier. Still, this would require inventing a new command each time a
new file type is added, as this new type of file may be needed as well
in the rewound node (imagine a pg_clog2 for example). I would rather
take the other approach and apply an exclude list in an existing
command, rather than an include list or a new set of commands.
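
To make the difference concrete, here is a minimal Python sketch. The
directory sets and function names are hypothetical, and excluding whole
top-level directories is a deliberate simplification, not how
pg_rewind or BASE_BACKUP select files.

```python
from typing import Iterable, List

# Hypothetical lists, for illustration only.
INCLUDE_DIRS = {"pg_clog", "pg_multixact"}  # must name every wanted file type
EXCLUDE_DIRS = {"base", "pg_tblspc"}        # names only what should be skipped


def top_dir(path: str) -> str:
    """Return the first path component, e.g. 'pg_clog' for 'pg_clog/0000'."""
    return path.split("/", 1)[0]


def select_by_include(paths: Iterable[str]) -> List[str]:
    """Keep only files whose top-level directory is explicitly listed."""
    return [p for p in paths if top_dir(p) in INCLUDE_DIRS]


def select_by_exclude(paths: Iterable[str]) -> List[str]:
    """Keep every file except those under an excluded top-level directory."""
    return [p for p in paths if top_dir(p) not in EXCLUDE_DIRS]


paths = ["base/1/1259", "pg_clog/0000", "pg_clog2/0000"]
print(select_by_include(paths))  # the imagined pg_clog2 is silently dropped
print(select_by_exclude(paths))  # the imagined pg_clog2 is picked up unchanged
```

With the include list, a new file type is missed until the list (or
the command set) is updated. With the exclude list, it is fetched by
default.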

> Other clients that might find such an interface useful are incremental
> pg_basebackup or possibly parallel pg_basebackup.

Possible.

>> We would need as well to extend BASE_BACKUP so as it does not include
>> relation files though for this use case.
>
> ... huh? I'm not following this comment at all. We might need to
> provide explicit start/stop backup commands and/or extend BASE_BACKUP
> for things like parallel pg_basebackup, but I'm not following why we
> would need to change it for pg_rewind. Further BASE_BACKUP clearly does
> include relation files today..

The whole point of pg_rewind is not to have to fetch relation files,
so my idea would basically be to extend BASE_BACKUP a bit so that it
accepts a set of regular expressions, provided by the client, that
filter out the files not to include in the base backup.
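
As a rough sketch of such a filter in Python: BASE_BACKUP has no such
option today, so the function name filter_backup_files is hypothetical,
and the RELATION_FILE pattern is only an approximation of relation file
names under base/ and global/ (it ignores tablespaces), not the check
pg_rewind actually performs.

```python
import re
from typing import Iterable, List


def filter_backup_files(paths: Iterable[str],
                        exclude_patterns: Iterable[str]) -> List[str]:
    """Return the paths that match none of the client-supplied patterns."""
    compiled = [re.compile(p) for p in exclude_patterns]
    return [p for p in paths if not any(rx.search(p) for rx in compiled)]


# Approximate pattern (assumption): numeric file names under base/<dboid>/
# or global/, with an optional fork suffix and an optional segment number.
RELATION_FILE = r"^(base/\d+|global)/\d+(_(fsm|vm|init))?(\.\d+)?$"

candidates = [
    "base/16384/16385",
    "base/16384/16385_fsm",
    "base/16384/16385.1",
    "base/16384/PG_VERSION",
    "global/1262",
    "global/pg_control",
    "pg_clog/0000",
    "postgresql.conf",
]
print(filter_backup_files(candidates, [RELATION_FILE]))
```

Everything that does not look like a relation file is kept, including
any file type added later.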

Still, for now it seems that this patch is taking the wrong approach
and that the general consensus would be to use the replication
protocol instead, so I am marking this patch as returned with
feedback.
Thanks!
--
Michael
