Re: Add notes to pg_combinebackup docs

From: David Steele <david(at)pgmasters(dot)net>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Martín Marqués <martin(dot)marques(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Add notes to pg_combinebackup docs
Date: 2024-04-11 00:01:00
Message-ID: 05fb32c9-18d8-4f72-9af3-f41576c33119@pgmasters.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 4/9/24 19:44, Tomas Vondra wrote:
>
> On 4/9/24 09:59, Martín Marqués wrote:
>> Hello,
>>
>> While doing some work/research on the new incremental backup feature
>> some limitations were not listed in the docs. Mainly the fact that
>> pg_combienbackup works with plain format and not tar.
>>
>
> Right. The docs mostly imply this by talking about output directory and
> backup directories, but making it more explicit would not hurt.
>
> FWIW it'd be great if we could make incremental backups work with tar
> format in the future too. People probably don't want to keep around the
> expanded data directory or extract everything before combining the
> backups is not very convenient. Reading and writing the tar would make
> this simpler.

I have a hard time seeing this feature as being very useful, especially
for large databases, until pg_combinebackup works on tar (and compressed
tar). Right now restoring an incremental requires at least twice the
space of the original cluster, which is going to take a lot of users by
surprise.

I know you have made some improvements here for COW filesystems, but my
experience is that Postgres is generally not run on such filesystems,
though that is changing a bit.

>> Around the same time, Tomas Vondra tested incremental backups with a
>> cluster where he enabled checksums after taking the previous full
>> backup. After combining the backups the synthetic backup had pages
>> with checksums and other pages without checksums which ended in
>> checksum errors.
>
> I'm not sure just documenting this limitation is sufficient. We can't
> make the incremental backups work in this case (it's as if someone
> messes with cluster without writing stuff into WAL), but I think we
> should do better than silently producing (seemingly) corrupted backups.
>
> I say seemingly, because the backup is actually fine, the only problem
> is it has checksums enabled in the controlfile, but the pages from the
> full backup (and the early incremental backups) have no checksums.
>
> What we could do is detect this in pg_combinebackup, and either just
> disable checksums with a warning and hint to maybe enable them again. Or
> maybe just print that the user needs to disable them.
>
> I was thinking maybe we could detect this while taking the backups, and
> force taking a full backup if checksums got enabled since the last
> backup. But we can't do that because we only have the manifest from the
> last backup, and the manifest does not include info about checksums.

I'd say making a new full backup is the right thing to do in this case.
It should be easy enough to store the checksum state of the cluster in
the manifest.

Regards,
-David

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Kincaid 2024-04-11 00:23:19 Re: post-freeze damage control
Previous Message David Steele 2024-04-10 23:36:11 Re: post-freeze damage control