Re: BUG #17245: Index corruption involving deduplicated entries

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Kamigishi Rei <iijima(dot)yun(at)koumakan(dot)jp>, David Rowley <dgrowley(at)gmail(dot)com>, Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: BUG #17245: Index corruption involving deduplicated entries
Date: 2021-10-28 23:04:44
Message-ID: CAH2-Wz=k-Gh3ANkpxvK9+XJqAQVEySPu08w2KYfnUKwHPqFrgw@mail.gmail.com
Lists: pgsql-bugs

On Thu, Oct 28, 2021 at 3:48 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> That wouldn't protect against e.g. a logic bug in ZFS.

> Not saying that that is the most likely explanation, just something worth
> checking.

True. It's too early to rule that out. Though note that a full
pg_amcheck of the database mostly didn't complain about anything -- it
was just a handful of indexes, associated with just 2 tables. And this
is mediawiki, which has lots of tables. None of the new heapam
verification functionality found any problems (as with the older
index-matches-table heapallindexed stuff).

> Didn't 14 change the logic when index vacuums are done? That could cause
> previously existing issues to manifest with a higher likelihood.

I don't follow. The new logic that skips index vacuuming kicks in 1)
in an anti-wraparound vacuum emergency, and 2) when there are very few
LP_DEAD line pointers in the heap. We can rule 1 out, I think, because
the XIDs we see are in the low millions, and our starting point was a
database that was upgraded via a dump and reload.

The second criterion for skipping index vacuuming (the "less than 2% of
heap pages have any LP_DEAD items" thing) might well have been hit on
these tables -- it is after all very common. But I don't see how that
could matter. We're never going to get to a code path inside
vacuumlazy.c that sets LP_DEAD items from VACUUM's dead_tuples array
to LP_UNUSED (how could we reach such a code path without also index
vacuuming, given the way things are set up inside lazy_vacuum()?).
We're always going to have the opportunity to do index vacuuming with
any left-behind LP_DEAD line pointers in the next VACUUM -- right
after the later VACUUM successfully returns from
lazy_vacuum_all_indexes().
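
To make the argument above concrete, here is a toy model of the decision as I've described it. It is a sketch, not the vacuumlazy.c code: the names HeapScanResult and lazy_vacuum_model are made up for illustration, the 0.02 constant just mirrors the "less than 2% of heap pages" rule, and any other conditions the real code checks are left out. The only point is the invariant: LP_DEAD items are set LP_UNUSED only in a VACUUM that also did index vacuuming.

```python
from dataclasses import dataclass

# Mirrors the "less than 2% of heap pages have any LP_DEAD items" rule.
BYPASS_THRESHOLD_PAGES = 0.02


@dataclass
class HeapScanResult:
    """Hypothetical summary of what VACUUM's heap scan found."""
    rel_pages: int            # total heap pages in the table
    lpdead_item_pages: int    # heap pages with at least one LP_DEAD item
    failsafe: bool = False    # anti-wraparound emergency was triggered


def lazy_vacuum_model(scan: HeapScanResult) -> dict:
    """Decide whether index vacuuming runs and what happens to LP_DEAD items."""
    # Case 2: very few heap pages have LP_DEAD items, so skip index vacuuming.
    bypass = scan.lpdead_item_pages < scan.rel_pages * BYPASS_THRESHOLD_PAGES
    # Case 1: the wraparound emergency also skips index vacuuming.
    do_index_vacuuming = not (scan.failsafe or bypass)
    # Invariant: heap LP_DEAD items become LP_UNUSED only after index
    # vacuuming. Otherwise they stay LP_DEAD for the next VACUUM to handle.
    return {
        "index_vacuuming": do_index_vacuuming,
        "lp_dead_set_unused": do_index_vacuuming,
    }
```

In this model there is no input for which lp_dead_set_unused is true while index_vacuuming is false, which is the property I'm relying on.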

--
Peter Geoghegan
