Re: Improving Physical Backup/Restore within the Low Level API

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
Cc: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Improving Physical Backup/Restore within the Low Level API
Date: 2023-10-17 18:28:22
Message-ID: CA+Tgmob-17TSvjbZ24DhCt+gd9dn31=oTeh+pwAUuTv8BCvsiA@mail.gmail.com
Lists: pgsql-hackers

On Mon, Oct 16, 2023 at 5:21 PM David G. Johnston
<david(dot)g(dot)johnston(at)gmail(dot)com> wrote:
> But no, by default, and probably so far as pg_basebackup is concerned, a server crash during backup results in requiring outside intervention in order to get the server to restart.

Others may differ, but I think such a proposal is dead on arrival. As
Laurenz says, that's just reinventing one of the main problems with
exclusive backup mode.

The underlying issue here is that, fundamentally, there's no way for
postgres itself to tell the difference between the data directory on
the primary and an exact copy of it on a standby. There has to be some
mechanism by which the user tells us whether this is the original
directory or a clone of it -- and that's what backup_label,
recovery.signal, and standby.signal are for. Your proposal rejiggers
the details of how we distinguish primary from standby, but it
doesn't, and can't, avoid the need for users to actually follow the
directions, and I don't see why they'd be any more likely to follow
the directions that this proposal would require than the directions
we're giving them now.

I wish I had a better idea here, because the status quo is definitely
not great. The only thought that really occurs to me is that we might
do better if PostgreSQL did more of the work itself and left fewer
steps to the user to perform. If you could click the "take a backup
here" button and the "restore a backup there" button and not think
about what was actually happening, you'd not have the opportunity to
mess up. But, as I understand it, the main motivation for the
continued existence of the low-level API is that the data directory
might be really big, and you might need to clone it using some kind of
special magic that your system has available instead of copying all
the bytes. And that makes it hard to move more of the responsibility
into PostgreSQL itself, because we don't know how that special magic
works.

--
Robert Haas
EDB: http://www.enterprisedb.com
