Re: Add recovery to pg_control and remove backup_label

From: David Steele <david(at)pgmasters(dot)net>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Add recovery to pg_control and remove backup_label
Date: 2023-11-21 18:48:59
Message-ID: 5cd0edb1-57eb-4814-8236-4f2ccc65f369@pgmasters.net
Lists: pgsql-hackers

On 11/21/23 13:59, Andres Freund wrote:
>
> On 2023-11-21 13:41:15 -0400, David Steele wrote:
>> On 11/20/23 16:41, Andres Freund wrote:
>>>
>>> On 2023-11-20 15:56:19 -0400, David Steele wrote:
>>>> I understand this is an option -- but does it need to be? What is the
>>>> benefit of excluding the manifest?
>>>
>>> It's not free to create the manifest, particularly if checksums are enabled.
>>
>> It's virtually free, even with the basic CRCs.
>
> Huh?

<snip>

> I'd not call 7.06->4.77 or 6.76->4.77 "virtually free".

OK, but how does that look with compression -- to a remote location?
Uncompressed backup to local storage doesn't seem very realistic. With
gzip compression we measure SHA1 checksums at about 5% of total CPU.
Obviously that goes up with zstd or lz4, but parallelism helps offset
that cost, at least in clock time.

I can't overstate how valuable checksums are in finding corruption,
especially in long-lived backups.
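
To make that concrete, here is a minimal sketch of the kind of check a
manifest allows. It is not the pg_basebackup manifest format or any backup
tool's actual implementation: the function names and the simple
path -> {size, sha1} layout are assumptions made up for illustration. It
flags files that are missing, changed in size, corrupted in place, or
present on disk but absent from the manifest.

```python
import hashlib
from pathlib import Path


def sha1_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files need not fit in memory."""
    digest = hashlib.sha1()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(backup_dir: Path) -> dict:
    """Record size and SHA1 for every file under backup_dir (hypothetical layout)."""
    manifest = {}
    for path in sorted(backup_dir.rglob("*")):
        if path.is_file():
            rel = path.relative_to(backup_dir).as_posix()
            manifest[rel] = {"size": path.stat().st_size, "sha1": sha1_of(path)}
    return manifest


def verify_backup(backup_dir: Path, manifest: dict) -> list:
    """Return a list of human-readable problems; an empty list means the backup matches."""
    problems = []
    for rel, entry in manifest.items():
        path = backup_dir / rel
        if not path.is_file():
            problems.append(f"missing: {rel}")
        elif path.stat().st_size != entry["size"]:
            problems.append(f"size mismatch: {rel}")
        elif sha1_of(path) != entry["sha1"]:
            # Same size but different content: silent corruption.
            problems.append(f"checksum mismatch: {rel}")

    # Files on disk that the manifest does not know about.
    on_disk = {
        p.relative_to(backup_dir).as_posix()
        for p in backup_dir.rglob("*")
        if p.is_file()
    }
    for rel in sorted(on_disk - set(manifest)):
        problems.append(f"unexpected: {rel}")
    return problems
```

Without the checksum the same-size corruption case goes unnoticed, and
without the manifest the missing-file case does.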

>> Anyway, would you really want a backup without a manifest? How would you
>> know something is missing? In particular, for page incremental how do you
>> know something is new (but not WAL logged) if there is no manifest? Is the
>> plan to just recopy anything not WAL logged with each incremental?
>
> Shrug. If you just want to create a new standby by copying the primary, I
> don't think creating and then validating the manifest buys you much. Long term
> backups are a different story, particularly if data files are stored
> individually, rather than in a single checksummed file.

Fine, but you are probably not using page incremental if just using
pg_basebackup to create a standby. With page incremental, at least one
of the backups will already exist, which argues for a manifest.

>>> Also, for external backups, there's no manifest...
>>
>> There certainly is a manifest for many external backup solutions. Not having
>> a manifest is just running with scissors, backup-wise.
>
> You mean that you have an external solution gin up a backup manifest? I fail
> to see how that's relevant here?

Just saying that for external backups there *is* often a manifest and it
is a good thing to have.

Regards,
-David
