From: | Michael Banck <michael(dot)banck(at)credativ(dot)de> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de> |
Cc: | Stephen Frost <sfrost(at)snowman(dot)net>, Michael Paquier <michael(at)paquier(dot)xyz>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> |
Subject: | Re: [Patch] Base backups and random or zero pageheaders |
Date: | 2019-04-30 13:07:43 |
Message-ID: | 1556629663.25111.4.camel@credativ.de |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
Am Mittwoch, den 27.03.2019, 11:37 +0100 schrieb Michael Banck:
> Am Dienstag, den 26.03.2019, 19:23 +0100 schrieb Michael Banck:
> > Am Dienstag, den 26.03.2019, 10:30 -0700 schrieb Andres Freund:
> > > On 2019-03-26 18:22:55 +0100, Michael Banck wrote:
> > > > /*
> > > > - * Only check pages which have not been modified since the
> > > > - * start of the base backup. Otherwise, they might have been
> > > > - * written only halfway and the checksum would not be valid.
> > > > - * However, replaying WAL would reinstate the correct page in
> > > > - * this case. We also skip completely new pages, since they
> > > > - * don't have a checksum yet.
> > > > + * We skip completely new pages after checking they are
> > > > + * all-zero, since they don't have a checksum yet.
> > > > */
> > > > - if (!PageIsNew(page) && PageGetLSN(page) < startptr)
> > > > + if (PageIsNew(page))
> > > > {
> > > > - checksum = pg_checksum_page((char *) page, blkno + segmentno * RELSEG_SIZE);
> > > > - phdr = (PageHeader) page;
> > > > - if (phdr->pd_checksum != checksum)
> > > > + all_zeroes = true;
> > > > + pagebytes = (size_t *) page;
> > > > + for (int i = 0; i < (BLCKSZ / sizeof(size_t)); i++)
> > >
> > > Can we please abstract the zeroeness check into a separate function to
> > > be used both by PageIsVerified() and this?
> >
> > Ok, done so as PageIsZero further down in bufpage.c.
>
> It turns out that pg_checksums (current master and back branches, not
> just the online version) needs this treatment as well as it won't catch
> zeroed-out pageheader corruption, see attached patch to its TAP tests
> which trigger it (I also added a random data check similar to
> pg_basebackup as well which is not a problem for the current codebase).
>
> Any suggestion on how to handle this? Should I duplicate the
> PageIsZero() code in pg_checksums? Should I move PageIsZero into
> something like bufpage_impl.h for use by external programs, similar to
> pg_checksum_page()?
>
> I've done the latter as a POC in the second attached patch.
This is still an open item for the back branches I guess, i.e. zero page
header for pg_verify_checksums and additionally random page header for
pg_basebackup's base backup.
Do you plan to work on the patch you have outlined, what would I need to
change in the patches I submitted or is another approach warranted
entirely? Should I add my patches to the next commitfest in order to
track them?
Michael
--
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax: +49 2166 9901-100
Email: michael(dot)banck(at)credativ(dot)de
credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer
Unser Umgang mit personenbezogenen Daten unterliegt
folgenden Bestimmungen: https://www.credativ.de/datenschutz
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2019-04-30 13:43:16 | Re: ERROR: failed to add item to the index page |
Previous Message | Fabien COELHO | 2019-04-30 10:32:13 | Re: performance regression when filling in a table |