Re: Do results of pg_start_backup work without WAL segments created during backup?

From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Thorsten Schöning <tschoening(at)am-soft(dot)de>
Cc: pgsql-admin <pgsql-admin(at)postgresql(dot)org>
Subject: Re: Do results of pg_start_backup work without WAL segments created during backup?
Date: 2019-07-08 10:02:31
Message-ID: CABUevEynTpkvkdC3vLAF4o6_j5DbRO=FKOcH2zQ2vsyFNv2JLg@mail.gmail.com
Lists: pgsql-admin

On Mon, Jul 8, 2019 at 11:57 AM Thorsten Schöning <tschoening(at)am-soft(dot)de>
wrote:

> Hi all,
>
> we are reviewing our current backup process based on the low level
> pg_start_backup and pg_stop_backup using the exclusive approach. I
> wonder how important the WAL-archives created during backup really are
> in terms of if they are necessary to get Postgres up and working at
> all.
>
> The docs mention that after pg_start_backup has been issued, files of
> the data directory of Postgres can be copied however one likes. The
> important point seems to be that pg_start_backup does checkpointing,
> so that all data until the start of the backup gets written to disk.
> Afterwards, additional writes can happen to any file at any given time
> and changes are recorded using the WAL like normal.
>
> What happens WITHOUT the WAL-archives created during the backup when a
> cluster needs to be restored?
>

Then you have a corrupt and unusable backup.

The pg_start_backup/pg_stop_backup method for backups can *only* be used
together with a working log archive.
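
As a rough illustration of what "a working log archive" requires at minimum, here is a small Python sketch. The helper name `archiving_problems` and the example archive path are hypothetical; the settings dict is assumed to be filled in by hand from `SHOW` output. Passing this check only means archiving is configured, not that the archived files are actually complete or restorable.

```python
def archiving_problems(settings):
    """Return reasons why WAL archiving cannot be working.

    `settings` maps PostgreSQL setting names to their string values.
    Hypothetical helper for illustration only.
    """
    problems = []
    # With wal_level = minimal, WAL does not carry enough data for archiving.
    if settings.get("wal_level", "minimal") == "minimal":
        problems.append("wal_level must be higher than 'minimal'")
    # archive_mode has to be enabled for completed segments to be archived.
    if settings.get("archive_mode", "off") not in ("on", "always"):
        problems.append("archive_mode must be 'on' or 'always'")
    # An empty archive_command means completed segments are never copied away.
    if not settings.get("archive_command", "").strip():
        problems.append("archive_command is empty")
    return problems


if __name__ == "__main__":
    example = {
        "wal_level": "replica",
        "archive_mode": "on",
        "archive_command": "cp %p /mnt/archive/%f",  # assumed example path
    }
    print(archiving_problems(example))
```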

> From my understanding, the cluster restores as normal, but only up
> until the point when pg_start_backup executed. Without additionally
> shipping the WAL-archives later, one would simply lose the data
> created after pg_start_backup has been called. But once the data
>

You would not just lose the data; your cluster will be corrupt.

> OR are the copied files so fundamentally broken that Postgres is not
> able to operate at all without the WAL-archives during backup?
>

This would be the case.

> Wouldn't make much sense to me, because Postgres needs to operate
> properly already to replay the WAL-archives. It needs to know from
> which checkpoint to start, which is available after using
> pg_start_backup. From my understanding, there's no info created by
> pg_start_backup about additionally necessary WAL-archives blocking
> bringing up the cluster successfully if not present. If none are
> available, nothing gets replayed, but things still work.
>

No, without the WAL generated between start and stop backup, the cluster
will be incomplete and corrupt. In your description you are, for example, not
accounting for activity that happens *during* the checkpoint.

The information about which WAL segments are necessary is generated during
pg_stop_backup, not pg_start_backup.
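
To make that concrete, here is a minimal Python sketch of the arithmetic that maps the LSNs returned by pg_start_backup and pg_stop_backup to the WAL segment files a restore would need. It assumes the default 16 MB segment size and a single timeline; the function names and the example LSNs are made up for illustration, and the exact boundary handling in the server may differ.

```python
WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # assumption: default 16 MB segments


def parse_lsn(text):
    """Convert an LSN like '0/2000028' into a 64-bit integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def wal_file_name(timeline, segno, seg_size=WAL_SEGMENT_SIZE):
    """Build the 24-hex-digit WAL file name for a segment number."""
    segments_per_id = 0x100000000 // seg_size
    return "%08X%08X%08X" % (
        timeline,
        segno // segments_per_id,
        segno % segments_per_id,
    )


def required_wal_files(start_lsn, stop_lsn, timeline=1,
                       seg_size=WAL_SEGMENT_SIZE):
    """List every segment from the one holding start_lsn up to the one
    holding the last byte written before stop_lsn."""
    first = parse_lsn(start_lsn) // seg_size
    last = max(first, (parse_lsn(stop_lsn) - 1) // seg_size)
    return [wal_file_name(timeline, n, seg_size)
            for n in range(first, last + 1)]


if __name__ == "__main__":
    # Hypothetical LSNs, as if returned by pg_start_backup / pg_stop_backup.
    for name in required_wal_files("0/FF000028", "1/1000100"):
        print(name)
```

The stop LSN is only known once pg_stop_backup has returned, which is why the list cannot be computed at pg_start_backup time.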

There is a reason these functions are labeled "low level APIs". They are
designed to work together with other parts, like the log archiving, not to
be a complete solution on their own. There are other tools available that
provide all the plumbing, such as pg_basebackup, pgBackRest and Barman.
If there are any doubts whatsoever on how they interact in your
environment, you should *really* be looking at one of those higher level
tools.
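
For example, a minimal Python sketch that assembles a pg_basebackup command line which streams the required WAL alongside the data files, so the resulting backup does not depend on a separate archive to become consistent. The function name and target directory are hypothetical; `-D`, `-X stream`, `-P`, `-h` and `-U` are standard pg_basebackup options. The sketch only builds and prints the command, it does not run it.

```python
import shlex


def basebackup_command(target_dir, host=None, user=None):
    """Return an argument list for pg_basebackup (hypothetical helper)."""
    cmd = [
        "pg_basebackup",
        "-D", target_dir,   # where the base backup is written
        "-X", "stream",     # stream the WAL generated during the backup
        "-P",               # show progress
    ]
    if host:
        cmd += ["-h", host]
    if user:
        cmd += ["-U", user]
    return cmd


if __name__ == "__main__":
    # Assumed example path; adjust to the local environment.
    args = basebackup_command("/backups/base")
    print(" ".join(shlex.quote(a) for a in args))
```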

/Magnus
