From: | Michael Paquier <michael(at)paquier(dot)xyz> |
---|---|
To: | Stephen Frost <sfrost(at)snowman(dot)net> |
Cc: | Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>, David Steele <david(at)pgmasters(dot)net>, Magnus Hagander <magnus(at)hagander(dot)net>, Michael Banck <michael(dot)banck(at)credativ(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Asif Rehman <asifr(dot)rehman(at)gmail(dot)com>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Online verification of checksums |
Date: | 2020-11-24 01:28:06 |
Message-ID: | 20201124012806.GB3046@paquier.xyz |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Mon, Nov 23, 2020 at 05:28:52PM -0500, Stephen Frost wrote:
> * Anastasia Lubennikova (a(dot)lubennikova(at)postgrespro(dot)ru) wrote:
>> Yes and this is a tricky part. Until you have explained it in your latest
>> message, I wasn't sure how we can distinct concurrent update from a page
>> header corruption. Now I agree that if page LSN updated and increased
>> between rereads, it is safe enough to conclude that we have some concurrent
>> load.
>
> Even in this case, it's almost free to compare the LSN to the starting
> backup LSN, and to the current LSN position, and make sure it's
> somewhere between the two. While that doesn't entirely eliminite the
> possibility that the page happened to get corrupted *and* return a
> different result on subsequent reads *and* that it was corrupted in such
> a way that the LSN ended up falling between the starting backup LSN and
> the current LSN, it's certainly reducing the chances of a false negative
> a fair bit.
FWIW, I am not much a fan of designs that are not bullet-proof by
design. This reduces the odds of problems, sure, still it does not
discard the possibility of incorrect results, confusing users as well
as people looking at such reports.
>> To sum up, I agree with your proposal to reread the page and rely on
>> ascending LSNs. Can you submit a patch?
>
> Probably would make sense to give Michael an opportunity to comment and
> get his thoughts on this, and for him to update the patch if he agrees.
I think that a LSN check would be a safe thing to do iff pd_checksum
is already checked first to make sure that the page contents are fine
to use. Still, what's the point in doing a LSN check anyway if we
know that the checksum is valid? Then on a retry if the first attempt
failed you also need the guarantee that there is zero concurrent I/O
activity while a page is rechecked (no need to do that unless the
initial page check doing a checksum match failed). So the retry needs
to do some s_b interactions, but then comes the much trickier point of
concurrent smgrwrite() calls bypassing the shared buffers.
> As it relates to pgbackrest, we're currently contemplating having a
> higher level loop which, upon detecting any page with an invalid
> checksum, continues to scan to the end of that file and perform the
> compression, encryption, et al, but then loops back after we've
> completed that file and skips through the file again, re-reading those
> pages which didn't have a valid checksum the first time to see if their
> LSN has changed and is within the range of the backup. This will
> certainly give more opportunity for the kernel to 'catch up', if needed,
> and give us an updated page without a random 100ms delay, and will also
> make it easier for us to, eventually, check and make sure the page was
> in the WAL that was been produced as part of the backup, to give us a
> complete guarantee that the contents of this page don't matter and that
> the failed checksum isn't a sign of latent storage corruption.
That would reduce the likelyhood of facing torn pages, still you
cannot fully discard the problem either as a same page may get changed
again once you loop over, no? And what if a corruption has updated
pd_lsn on-disk? Unlikely so, still possible.
--
Michael
From | Date | Subject | |
---|---|---|---|
Next Message | James Coleman | 2020-11-24 01:29:16 | Re: enable_incremental_sort changes query behavior |
Previous Message | Michael Paquier | 2020-11-24 01:10:43 | Re: Online verification of checksums |