From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | David Steele <david(at)pgmasters(dot)net> |
Cc: | Pg Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: pg_combinebackup does not detect missing files |
Date: | 2024-08-02 13:37:26 |
Message-ID: | CA+TgmoZTfFxuF07_ZPvN4Or0H2r-Du8zahiggi=izeNiJjajGQ@mail.gmail.com |
Lists: | pgsql-hackers |
On Fri, Apr 19, 2024 at 11:47 AM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> Hmm, that's an interesting perspective. I've always been very
> skeptical of doing verification only around missing files and not
> anything else. I figured that wouldn't be particularly meaningful, and
> that's pretty much the only kind of validation that's even
> theoretically possible without a bunch of extra overhead, since we
> compute checksums on entire files rather than, say, individual blocks.
> And you could really only do it for the final backup in the chain,
> because you should end up accessing all of those files, but the same
> is not true for the predecessor backups. So it's a very weak form of
> verification.
>
> But I looked into it and I think you're correct that, if you restrict
> the scope in the way that you suggest, we can do it without much
> additional code, or much additional run-time. The cost is basically
> that, instead of only looking for a backup_manifest entry when we
> think we can reuse its checksum, we need to do a lookup for every
> single file in the final input directory. Then, after processing all
> such files, we need to iterate over the hash table one more time and
> see what files were never touched. That seems like an acceptably low
> cost to me. So, here's a patch.
>
> I do think there's some chance that this will encourage people to
> believe that pg_combinebackup is better at finding problems than it
> really is or ever will be, and I also question whether it's right to
> keep changing stuff after feature freeze. But I have a feeling most
> people here are going to think this is worth including in 17. Let's
> see what others say.

There was no hue and cry to include this in v17 and I think that ship
has sailed at this point, but we could still choose to include this as
an enhancement for v18 if people want it. I think David's probably in
favor of that (but I'm not 100% sure) and I have mixed feelings about
it (explained above) so what I'd really like is some other opinions on
whether this idea is good, bad, or indifferent.

Here is a rebased version of the patch. No other changes since v1.
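
For anyone skimming the thread, the check described in the quoted text above can be sketched roughly as follows. This is only an illustration in Python, not the patch's C code: the function name `find_missing_files` and the idea of passing the manifest as a plain list of relative paths are assumptions made for the example, and it ignores details such as how backup_manifest is parsed or how incremental files are named.

```python
import os


def find_missing_files(manifest_paths, final_backup_dir):
    """Return manifest entries that have no matching file on disk.

    manifest_paths is a hypothetical stand-in for the parsed
    backup_manifest of the final backup in the chain: an iterable of
    relative paths using '/' as the separator.
    """
    # Hash table keyed by path; the value records whether the file was seen.
    seen = {path: False for path in manifest_paths}

    # Pass 1: look up every file found in the final input directory.
    for root, _dirs, files in os.walk(final_backup_dir):
        for name in files:
            rel = os.path.relpath(os.path.join(root, name), final_backup_dir)
            rel = rel.replace(os.sep, "/")
            if rel in seen:
                seen[rel] = True

    # Pass 2: iterate over the hash table once more; any entry that was
    # never touched is listed in the manifest but absent from the directory.
    return sorted(path for path, touched in seen.items() if not touched)
```

Calling `find_missing_files(["PG_VERSION", "base/1/1234"], "/path/to/final/backup")` (the directory path is a placeholder) returns the manifest entries with no file on disk. As noted in the quoted text, this only works for the final backup in the chain, and it says nothing about whether the files that are present have correct contents.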
--
Robert Haas
EDB: http://www.enterprisedb.com
Attachment | Content-Type | Size |
---|---|---|
v2-0001-pg_combinebackup-Detect-missing-files-when-possib.patch | application/octet-stream | 11.4 KB |