Re: pg_combinebackup does not detect missing files

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: David Steele <david(at)pgmasters(dot)net>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_combinebackup does not detect missing files
Date: 2024-04-19 15:47:07
Message-ID: CA+Tgmoa6McCGpaA66dbrGA6onGUytc5Jds_3Y=y7tMaPy1p9mQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Apr 18, 2024 at 7:36 PM David Steele <david(at)pgmasters(dot)net> wrote:
> Sure -- pg_combinebackup would only need to verify the data that it
> uses. I'm not suggesting that it should do an exhaustive verify of every
> single backup in the chain. Though I can see how it sounded that way
> since with pg_verifybackup that would pretty much be your only choice.
>
> The beauty of doing verification in pg_combinebackup is that it can do a
> lot less than running pg_verifybackup against every backup but still get
> a valid result. All we care about is that the output is correct -- if
> there is corruption in an unused part of an earlier backup
> pg_combinebackup doesn't need to care about that.
>
> As far as I can see, pg_combinebackup already checks most of the boxes.
> The only thing I know that it can't do is detect missing files and that
> doesn't seem like too big a thing to handle.

Hmm, that's an interesting perspective. I've always been very
skeptical of doing verification only around missing files and not
anything else. I figured that wouldn't be particularly meaningful, and
that's pretty much the only kind of validation that's even
theoretically possible without a bunch of extra overhead, since we
compute checksums on entire files rather than, say, individual blocks.
And you could really only do it for the final backup in the chain,
because you should end up accessing all of those files, but the same
is not true for the predecessor backups. So it's a very weak form of
verification.

But I looked into it and I think you're correct that, if you restrict
the scope in the way that you suggest, we can do it without much
additional code, or much additional run-time. The cost is basically
that, instead of only looking for a backup_manifest entry when we
think we can reuse its checksum, we need to do a lookup for every
single file in the final input directory. Then, after processing all
such files, we need to iterate over the hash table one more time and
see what files were never touched. That seems like an acceptably low
cost to me. So, here's a patch.

I do think there's some chance that this will encourage people to
believe that pg_combinebackup is better at finding problems than it
really is or ever will be, and I also question whether it's right to
keep changing stuff after feature freeze. But I have a feeling most
people here are going to think this is worth including in 17. Let's
see what others say.

--
Robert Haas
EDB: http://www.enterprisedb.com

Attachment Content-Type Size
v1-0001-pg_combinebackup-Detect-missing-files-when-possib.patch application/octet-stream 11.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2024-04-19 16:18:33 Re: Support a wildcard in backtrace_functions
Previous Message Japin Li 2024-04-19 15:46:10 Support event trigger for logoff