Re: Improving Physical Backup/Restore within the Low Level API

From: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
To: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Improving Physical Backup/Restore within the Low Level API
Date: 2023-10-16 18:18:38
Message-ID: CAKFQuwbdOBh+4xTmK3d1+27M9rrU1je4-=Ye9xo5tiWfFb_HoA@mail.gmail.com
Lists: pgsql-hackers

On Mon, Oct 16, 2023 at 10:26 AM Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
wrote:

> On Mon, 2023-10-16 at 09:26 -0700, David G. Johnston wrote:
> > This email is a first pass at a user-visible design for how our backup and restore
> > process, as enabled by the Low Level API, can be modified to make it more mistake-proof.
> > In short, it requires pg_start_backup to further expand upon what it means for the
> > system to be in the midst of a backup, pg_stop_backup to reverse those things,
> > and modifying the startup process to deal with the server having crashed while the
> > system is in that backup state. Notes at the end extend the design to handle concurrent backups.
> >
> > The core functional changes are:
> > 1) pg_backup_start modifies a newly added "in backup" state flag in pg_control to on.
> > 2) pg_backup_stop modifies that flag back to off.
> > 3) postmaster will refuse to start if that flag is on, unless one of:
> >    a) crash.signal exists in the data directory
> >    b) recovery.signal exists in the data directory
> >    c) standby.signal exists in the data directory
> > 4) Signal file processing causes the in-backup flag in pg_control to be set to off
> >
> > The newly added crash.signal file is required to handle the case where the server
> > crashes after pg_backup_start and before pg_backup_stop. It initiates a crash recovery
> > of the instance just as is done today but with the added change of flipping the flag
> > to off when recovery is complete just before going live.
>
> I see a couple of problems and/or things that need clarification with that idea:
>
> - Two backups can run concurrently. How do you reconcile that with the "in backup"
>   flag and crash.signal?
> - I guess crash.signal is created during pg_start_backup(). So that file will be
>   included in the backup. How do you handle that during recovery? Ignore it if
>   another signal file is present? And if the user forgets to create a signal file
>   for recovery, how do you prevent PostgreSQL from performing crash recovery?
>
crash.signal is created in the pg_backup_metadata directory, not the root
directory. Should the server crash while any backup is in progress,
pg_control would still record that fact (in_backup=true persists;
in_backup=false only comes back after all backups have completed) and the
server will not restart without user intervention: specifically, moving the
crash.signal file from (one of) the pg_backup_metadata subdirectories to the
root directory. As there is nothing special about the crash.signal files in
the pg_backup_metadata subdirectories, "touch crash.signal" in the root
directory could be used instead.
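
To make the proposed startup rule concrete, here is a minimal sketch in
Python. It is an illustration of the design as described in this thread, not
PostgreSQL code: the function name may_start and the in_backup parameter are
hypothetical, and the flag is passed in as a plain boolean rather than read
from pg_control. The point it shows is that only a signal file in the root of
the data directory unblocks startup; copies under pg_backup_metadata do not.

```python
from pathlib import Path

# Signal files that would permit startup while the in-backup flag is on.
# crash.signal is the newly proposed one; the other two exist today.
SIGNAL_FILES = ("crash.signal", "recovery.signal", "standby.signal")


def may_start(data_dir: Path, in_backup: bool) -> bool:
    """Return True if startup would be allowed under the proposed rule.

    in_backup stands in for the proposed pg_control flag (an assumption:
    the real flag would be read from the control file).
    """
    if not in_backup:
        # No backup was in progress: ordinary startup.
        return True
    # Flag is on: require a signal file directly in the data directory.
    # Files under pg_backup_metadata/ are intentionally not considered.
    return any((data_dir / name).is_file() for name in SIGNAL_FILES)
```

After a crash during a backup, may_start(data_dir, True) stays False until
the user moves or touches crash.signal in the data directory root, which is
the explicit intervention described above.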

The backed-up pg_control file will have in_backup=true (I haven't pondered
the torn-read dynamics of this; I'm supposing that placing a copy of
pg_control into the pg_backup_metadata directory might be part of solving
that problem).

David J.
