From: | Michael Banck <michael(dot)banck(at)credativ(dot)de> |
---|---|
To: | Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru> |
Cc: | PostgreSQL Development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: [patch] Fix checksum verification in base backups for zero page headers |
Date: | 2020-09-22 14:26:31 |
Message-ID: | 4b1c16216dcb60e1dad96142d25fd06fb77485bf.camel@credativ.de |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
Am Mittwoch, den 02.09.2020, 16:50 +0300 schrieb Anastasia Lubennikova:
> On 01.09.2020 13:22, Michael Banck wrote:
> > as a continuation of [1], I've split-off the zero page header case from
> > the last patch, as this one seems less contentious.
> > [1] https://commitfest.postgresql.org/28/2308/
>
> I've looked through the previous discussion. As far as I got it, most of
> the controversy was about online checksums improvements.
>
> The warning about pd_upper inconsistency that you've added is a good
> addition. The patch is a bit messy, though, because a huge code block
> was shifted.
>
> Will it be different, if you just leave
> "if (!PageIsNew(page) && PageGetLSN(page) < startptr)"
> block as it was, and add
> "else if (PageIsNew(page) && !PageIsZero(page))" ?
Thanks, that does indeed look better as a patch and I think it's fine
as-is for the code as well, I've attached a v2.
> While on it, I also have a few questions about the code around:
All fair points, but I think those should be handled in another patch,
if any.
> 1) Maybe we can use other header sanity checks from PageIsVerified() as
> well? Or even the function itself.
Yeah, I'll keep that in mind.
> 2) > /* Reset loop to validate the block again */
> How many times do we try to reread the block? Is one reread enough?
Not sure whether more rereads would help, but I think the general
direction was to replace this with something better anyway I think (see
below).
> Speaking of which, 'reread_cnt' name looks confusing to me. I would
> expect that this variable contains a number of attempts, not the number
> of bytes read.
Well, there are other "cnt" variables that keep the number of bytes read
in that file, so it does not seem to be out of place for me. A comment
might not hurt though.
On second glance, it should maybe also be of size_t and not int type, as
is the other cnt variable?
> If everyone agrees, that for basebackup purpose it's enough to rely on a
> single reread, I'm ok with it.
> Another approach is to read the page directly from shared buffers to
> ensure that the page is fine. This way is more complicated, but should
> be almost bullet-proof. Using it we can also check pages with lsn >=
> startptr.
Right, that's what Andres also advocated for initially I think, and what
should be done going forward.
> 3) Judging by warning messages, we count checksum failures per file, not
> per page, and don't report after a fifth failure. Why so? Is it a
> common case that so many pages of one file are corrupted?
I think this was a request during the original review, on the basis that
if we have more than a few checksum errors, it's likely there is
something fundamentally broken and it does not make sense to spam the
log with potentially millions of broken checksums.
Cheers,
Michael
--
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax: +49 2166 9901-100
Email: michael(dot)banck(at)credativ(dot)de
credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer
Unser Umgang mit personenbezogenen Daten unterliegt
folgenden Bestimmungen: https://www.credativ.de/datenschutz
Attachment | Content-Type | Size |
---|---|---|
0001-Fix-checksum-verification-in-base-backups-for-zero-p.V2.patch | text/x-patch | 6.3 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Michael Banck | 2020-09-22 14:30:19 | Re: [patch] Fix checksum verification in base backups for zero page headers |
Previous Message | Tom Lane | 2020-09-22 14:18:40 | Re: pgindent vs dtrace on macos |