From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Kamigishi Rei <iijima(dot)yun(at)koumakan(dot)jp>, David Rowley <dgrowley(at)gmail(dot)com>, Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: BUG #17245: Index corruption involving deduplicated entries
Date: 2021-10-29 01:43:54
Message-ID: CAH2-WznnFrNd62LHW5WwazzvXrwJ5mc_Ddou49Jd2Bm_7oW_oQ@mail.gmail.com
Lists: pgsql-bugs
On Thu, Oct 28, 2021 at 6:19 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> It's not the cause of this problem, but I did find a minor issue: the retry
> path in lazy_scan_prune() loses track of the deleted tuple count when
> retrying.
Matthias van de Meent (now my coworker) pointed this out several
months back. I don't see any reason to prefer remembering it to not
remembering it. In any case it's not too important, since the retry
behavior is inherently very rare. But we can change it around later,
if you prefer.
> The retry codepath also made me wonder if there could be problems if we do
> FreezeMultiXactId() multiple times due to retry. I think we can end up
> creating multiple multixactids for the same tuple (if the members change,
> which is likely in the retry path). But that should be fine, I think.
That's not possible, because we fully decide what we're going to do
with the page as soon as we break out of the for() loop. The closest
thing to calling FreezeMultiXactId() that can happen before that point
(before we "commit" to a certain course of action for the page) are
calls made to heap_prepare_freeze_tuple(), from inside the loop.
heap_prepare_freeze_tuple() doesn't actually modify anything -- it
just decides what to do later on, in the "nfrozen > 0" critical
section back in lazy_scan_prune().
> Shrug. It doesn't seem that hard to believe that repeatedly trying to prune
> the same page could unearth some bugs. E.g. via the heap_prune_record_unused()
> path in heap_prune_chain().
I wasn't thinking of lazy_scan_prune() before, when you brought up
index vacuuming -- I was thinking of lazy_vacuum(). But FWIW I'm not
sure what you mean here.
I believe that the goto retry logic inside lazy_scan_prune() can
restart and expect a clean slate. Sure, we're pruning again, but is
that appreciably different to a concurrent opportunistic prune that
could happen at almost any time during the VACUUM, anyway? (Well,
maybe it is in that we get the accounting for things like VACUUM
VERBOSE very slightly wrong, but even that's debatable.)
> Hm. I assume somebody checked and verified that old_snapshot_threshold is not
> in use? Seems unlikely, but wrongly entering that heap_prune_record_unused()
> path could certainly cause issues like we're observing.
We should inquire about settings used, on general principle.
--
Peter Geoghegan