Re: What's the dfifference between pg_start_backup+copy+pg_stop_backup+WAL vs. pg_start_backup+pg_stop_backup+copy+WAL?

From: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
To: Thorsten Schöning <tschoening(at)am-soft(dot)de>, pgsql-admin(at)lists(dot)postgresql(dot)org
Subject: Re: What's the dfifference between pg_start_backup+copy+pg_stop_backup+WAL vs. pg_start_backup+pg_stop_backup+copy+WAL?
Date: 2021-07-23 10:41:43
Message-ID: 399c734c5b9ffe10f68bcb79a69bea000d8d0c21.camel@cybertec.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

On Fri, 2021-07-23 at 11:56 +0200, Thorsten Schöning wrote:
> So a process of "pg_start_backup", directly followed by a
> "pg_stop_backup" with assuring that the WALs during both calls are
> preserved, should result in a valid backup, shouldn't it?
>
> An afterwards copied data directory is crash safe again up to the
> point "pg_start_backup" has been called and can recover using the WALs
> up to the point "pg_stop_backup" has been called. Regardless how long
> copying the data directory takes and which changes Postgres applies.
> Those changes are simply discarded during recovery, aren't they?
>
> So one might lose changes during the backup process by not following
> the order of the docs, but again, I might not care, like with file
> system snapshots.
>
> Or what's the difference when looking at consistency/recovery compared
> to some point in time only between FS level snapshots and using
> pg_start|stop_backup+WALs in different order?
>
> From my understanding, pg_start_backup+pg_stop_backup+FS-snapshot
> should even be the most correct approach, as long as the created WALs
> are contained in the snapshot. Other thoughts?

No, this is all wrong.

You need to copy/snapshot the files *between* pg_start_backup and
pg_stop_backup.

To make that clear, think of two tables with a foreign key relationship:
A and B. Now you perform a "backup" like you envision.
Now the files for A and B get backed up at different times, let's
say that A gets backed up first. Now a client inserts one row in A
and one in B that reference each other. The row are inserted after A
is backed up, but before B is backed up.

Now the backed up files are inconsistent - there is a row in B with
no matching row in A. The only way to cope with that is to replay
WAL to a point in time *after* the last data file was backed up.
But with your proposed backup method, you have no way to determine
that point in time, so this is not safe at all.

This is exactly the purpose of pg_stop_backup: it does not only
archive the WAL segment that completes the backup, but it also adds
a BACKUP_END record to the WAL stream. That ensures that recovery
cannot stop too early: any attempt to end recovery before reaching
BACKUP_END will cause a fatal error. As soon as BACKUP_END is
processed, the startup process knows that the database is now
consistent (and logs a message to that extent). You can contine
recovering to any later point in time, but you need not
(recovery_target = immediate).

With your proposed method, you run the danger of stopping recovery too
early, with the consequence of ending up with a corrupted database.

Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com

In response to

Responses

Browse pgsql-admin by date

  From Date Subject
Next Message Thorsten Schöning 2021-07-23 14:12:02 Re: What's the dfifference between pg_start_backup+copy+pg_stop_backup+WAL vs. pg_start_backup+pg_stop_backup+copy+WAL?
Previous Message Thorsten Schöning 2021-07-23 09:56:38 What's the dfifference between pg_start_backup+copy+pg_stop_backup+WAL vs. pg_start_backup+pg_stop_backup+copy+WAL?