From: | Peter Geoghegan <pg(at)bowt(dot)ie> |
---|---|
To: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Cc: | Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com> |
Subject: | Re: amcheck's verify_heapam(), and HOT chain verification |
Date: | 2021-11-06 22:09:15 |
Message-ID: | CAH2-Wzm=LtRDtgvB506GWveEtj-iZ0nUwyRMZUfCD=tq_pQ1ug@mail.gmail.com |
Lists: | pgsql-hackers |
On Fri, Nov 5, 2021 at 7:51 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> Here are some specific checks I have in mind:
One more for the list:
* Validate PageIsAllVisible() for each page.
In other words, pg_visibility should be merged with verify_heapam.c
(or at least pg_visibility's pg_check_frozen() and pg_check_visible()
functions should be moved, merged, or whatever). This would mean that
verify_heapam() would directly check if the page-level PD_ALL_VISIBLE
flag contradicts either the tuple headers of tuples with storage on
the page, the presence (or absence) of LP_DEAD stub line pointers on
the page, or the corresponding visibility map bit (e.g.,
VISIBILITYMAP_ALL_VISIBLE) for the page.
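
To make the three cross-checks concrete, here is a rough sketch against a toy model of a heap page. It is not verify_heapam() code and does not use PostgreSQL's real page or visibility map APIs: ModelItem, ModelPage, CheckResult, and check_all_visible() are all hypothetical names, and the visible_to_all field stands in for whatever tuple-header visibility test a real implementation would run. The sketch also assumes that a set visibility map bit with a clear page-level flag is a contradiction, while a set page-level flag with a clear map bit is tolerated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for line pointer states. */
typedef enum { ITEM_UNUSED, ITEM_NORMAL, ITEM_REDIRECT, ITEM_DEAD } ItemState;

typedef struct {
    ItemState state;
    bool      visible_to_all;  /* models the outcome of a tuple-header check */
} ModelItem;

typedef struct {
    bool             pd_all_visible;  /* models the page-level PD_ALL_VISIBLE flag */
    bool             vm_all_visible;  /* models the VISIBILITYMAP_ALL_VISIBLE bit */
    size_t           nitems;
    const ModelItem *items;
} ModelPage;

typedef enum {
    CHECK_OK = 0,
    CHECK_VM_SET_PAGE_CLEAR,   /* map bit set, page flag clear */
    CHECK_DEAD_ITEM,           /* dead stub on an "all visible" page */
    CHECK_TUPLE_NOT_VISIBLE    /* tuple header contradicts the page flag */
} CheckResult;

/* Report the first contradiction found, or CHECK_OK if there is none. */
CheckResult
check_all_visible(const ModelPage *page)
{
    assert(page != NULL);

    /* Assumption: the map bit must never be set unless the page flag is. */
    if (page->vm_all_visible && !page->pd_all_visible)
        return CHECK_VM_SET_PAGE_CLEAR;

    /* A page that makes no all-visible claim has nothing to contradict. */
    if (!page->pd_all_visible)
        return CHECK_OK;

    for (size_t i = 0; i < page->nitems; i++)
    {
        const ModelItem *item = &page->items[i];

        if (item->state == ITEM_DEAD)
            return CHECK_DEAD_ITEM;
        if (item->state == ITEM_NORMAL && !item->visible_to_all)
            return CHECK_TUPLE_NOT_VISIBLE;
    }
    return CHECK_OK;
}
```

In this model, a page with pd_all_visible set and a single ITEM_DEAD entry yields CHECK_DEAD_ITEM regardless of the map bit. A real check would report each problem it finds rather than stopping at the first one.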
There is value in teaching verify_heapam() about any possible problem,
including with the visibility map, but it's certainly less valuable
than the HOT chain verification stuff -- and probably trickier to get
right. I'm mentioning it now to be exhaustive, but it's less of a
priority for me personally.
I am quite willing to help out with all this, if you're interested.
One more thing about HOT chain validation:
I can give you another example bug of the kind I'd expect
verify_heapam() to catch only with full HOT chain validation. This one
is a vintage MultiXact bug that has the same basic HOT chain
corruption, looks-like-index-corruption-but-isn't quality as the more
memorable freeze-the-dead bug (this one was fixed by commit 6bfa88ac):
In general I think that reviewing historic examples of pernicious
corruption bugs is a valuable exercise when designing tools like
amcheck. Maybe even revert the fix during testing, to be sure it would
have been caught had the final tool been available. History doesn't
repeat itself, but it does rhyme.
--
Peter Geoghegan