Re: Postgres restart in the middle of exclusive backup and the presence of backup_label file

From: Nathan Bossart <nathandbossart(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Chapman Flack <chap(at)anastigmatix(dot)net>, Magnus Hagander <magnus(at)hagander(dot)net>, David Steele <david(at)pgmasters(dot)net>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Postgres restart in the middle of exclusive backup and the presence of backup_label file
Date: 2022-03-29 02:09:48
Message-ID: 20220329020948.GA311150@nathanxps13
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Mar 28, 2022 at 04:30:27PM -0400, Stephen Frost wrote:
>> - By default, <function>pg_start_backup</function> can take a long time to finish.
>> + By default, <function>pg_backup_start</function> can take a long time to finish.
>> This is because it performs a checkpoint, and the I/O
>> required for the checkpoint will be spread out over a significant
>> period of time, by default half your inter-checkpoint interval
>
> Hrmpf. Not this patch's fault, but the default isn't 'half your
> inter-checkpoint interval' anymore (checkpoint_completion_target was
> changed to 0.9 ... let's not discuss who did that and missed fixing
> this). Overall though, maybe we should reword this to avoid having to
> remember to change it again if we ever change the completion target
> again? Also it seems to imply that pg_backup_start is going to run its
> own independent checkpoint or something, which isn't the typical case.
> I generally explain what is going on here like this:
>
> By default, <function>pg_backup_start</function> will wait for the next
> checkpoint to complete (see ref: checkpoint_timeout ... maybe also
> wal-configuration.html).

done

>> - The <function>pg_stop_backup</function> will return one row with three
>> + The <function>pg_backup_stop</function> will return one row with three
>> values. The second of these fields should be written to a file named
>> <filename>backup_label</filename> in the root directory of the backup. The
>> third field should be written to a file named
>
> I get that we have <function> and </function>, but those are just
> formatting and don't show up in the actual text, and so it ends up
> being:
>
> The pg_backup_stop will return
>
> Which doesn't sound quite right to me. I'd say we either drop 'The' or
> add 'function' after. I realize that this patch is mostly doing a
> s/pg_stop_backup/pg_backup_stop/, but, still.

done

>> - Finishes performing an exclusive or non-exclusive on-line backup.
>> - The <parameter>exclusive</parameter> parameter must match the
>> - previous <function>pg_start_backup</function> call.
>> - In an exclusive backup, <function>pg_stop_backup</function> removes
>> - the backup label file and, if it exists, the tablespace map file
>> - created by <function>pg_start_backup</function>. In a non-exclusive
>> - backup, the desired contents of these files are returned as part of
>> + Finishes performing an on-line backup. The desired contents of the
>> + backup label file and the tablespace map file are returned as part of
>> the result of the function, and should be written to files in the
>> backup area (not in the data directory).
>> </para>
>
> Given that this is a crucial part, which the exclusive mode has wrong,
> I'd be a bit more forceful about it, eg:
>
> The contents of the backup label file and the tablespace map file must
> be stored as part of the backup and must NOT be stored in the live data
> directory (doing so will cause PostgreSQL to fail to restart in the
> event of a crash).

done

>> The result of the function is a single record.
>> The <parameter>lsn</parameter> column holds the backup's ending
>> write-ahead log location (which again can be ignored). The second and
>> - third columns are <literal>NULL</literal> when ending an exclusive
>> - backup; after a non-exclusive backup they hold the desired contents of
>> + third columns hold the desired contents of
>> the label and tablespace map files.
>> </para>
>> <para>
>
> I dislike saying 'desired' here as if it's something that we're nicely
> asking but maybe isn't a big deal, and just saying 'label' isn't good
> since we call it 'backup label' elsewhere and we should try hard to be
> consistent and clear. How about:
>
> The second column returns the contents of the backup label file, the
> third column returns the contents of the tablespace map file. These
> must be stored as part of the backup and are required as part of the
> restore process.

done

>> {
>> print_msg(_("WARNING: online backup mode is active\n"
>> - "Shutdown will not complete until pg_stop_backup() is called.\n\n"));
>> + "Shutdown will not complete until pg_backup_stop() is called.\n\n"));
>> }
>>
>> print_msg(_("waiting for server to shut down..."));
>> @@ -1145,7 +1145,7 @@ do_restart(void)
>> get_control_dbstate() != DB_IN_ARCHIVE_RECOVERY)
>> {
>> print_msg(_("WARNING: online backup mode is active\n"
>> - "Shutdown will not complete until pg_stop_backup() is called.\n\n"));
>> + "Shutdown will not complete until pg_backup_stop() is called.\n\n"));
>> }
>>
>> print_msg(_("waiting for server to shut down..."));
>
> This... can't actually happen anymore, right? Shouldn't we just pull
> all of this out?

done

> On a once-over of the rest of the code, I definitely like how much we're
> able to simplify things in this area and remove various hacks in things
> like pg_basebackup and pg_rewind where we previously had to worry about
> backup_label and tablespace_map files being in a live data directory.
> I'm planning to spend more time on this and hopefully be able to get it
> in for v15.

Great! Much appreciated.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Attachment Content-Type Size
v8-0001-remove-exclusive-backup-mode.patch text/x-diff 88.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2022-03-29 02:11:52 Re: Window Function "Run Conditions"
Previous Message Tom Lane 2022-03-29 02:07:44 Re: MDAM techniques and Index Skip Scan patch