Re: base backup client as auxiliary backend process

From: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: base backup client as auxiliary backend process
Date: 2019-07-11 20:56:31
Message-ID: fb069b86-9a5f-915e-c8cf-d34fa5730805@2ndquadrant.com
Lists: pgsql-hackers

On 2019-07-11 22:20, Robert Haas wrote:
> On Thu, Jul 11, 2019 at 4:10 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>>> On Thu, Jul 11, 2019 at 10:36 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>>> Gotta have config files in place already, no?
>>
>>> Why?
>>
>> How's the postmaster to know that it's supposed to run pg_basebackup
>> rather than start normally? Where will it get the connection information?
>> Seem to need configuration data *somewhere*.
>
> Maybe just:
>
> ./postgres --replica='connstr' -D createme

What you are describing is of course theoretically possible, but it
doesn't really fit with how existing tooling normally deals with this,
which is one of the problems I want to address.

initdb has all the knowledge of how to create the data *directory*, how
to set permissions, how to deal with existing and non-empty directories,
and how to set up a separate WAL directory. Packaged environments might
wrap this further by using the correct OS users, creating the directory
first as root, then changing owner, etc. This is all logic that we can
reuse and probably don't want to duplicate elsewhere.

Furthermore, we have for the longest time encouraged packagers *not* to
create data directories automatically when a service is started, because
this might store data in places that will be hidden by a later mount.
Keeping this property requires making the initialization of the data
directory a separate step somehow. That step doesn't have to be called
"initdb", it could be a new "pg_mkdirs", but for the reasons described
above, this would create a fair amount of code duplication and not really
gain anything.

Finally, many installations want to have the configuration files under
control of some centralized configuration management system. The way
those want to work is usually: (1) create file system structures, (2)
install configuration files from some templates, (3) start service.
This is of course how setting up a primary works. Having such a system
set up a standby currently seems impossible to do in an elegant way,
because the order and timing of the steps are all wrong. My
proposed change would fix this because things would be set up in the
same three-step process. (As has been pointed out, this would require
that the base backup does not copy over the configuration files from the
remote, which my patch currently doesn't do correctly.)
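
For illustration, the three-step sequence above could be sketched as a
small shell function. This is only a sketch and not part of the patch:
the function name provision, the RUN dry-run variable, and the template
path are hypothetical. initdb -D and pg_ctl -D ... start are the
existing commands one would use for a primary. For a standby under the
proposal, the template installed in step 2 would presumably carry the
connection information; that part is an assumption.

```shell
#!/bin/sh
# Sketch of the three-step provisioning order (hypothetical helper).
# RUN defaults to "echo", so the commands are only printed (dry run).
# Set RUN to the empty string to actually execute them.
RUN=${RUN-echo}

provision() {
    pgdata=$1      # data directory to create
    template=$2    # config template from the management system (hypothetical)

    # (1) create file system structures
    $RUN initdb -D "$pgdata" || return 1
    # (2) install configuration files from some templates
    $RUN cp "$template" "$pgdata/postgresql.conf" || return 1
    # (3) start service
    $RUN pg_ctl -D "$pgdata" start
}
```

In the default dry-run mode, calling provision with a data directory and
a template path just prints the three commands in order, which makes the
ordering easy to check before running anything for real.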

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
