From: | Peter Geoghegan <pg(at)bowt(dot)ie> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de> |
Cc: | Kamigishi Rei <iijima(dot)yun(at)koumakan(dot)jp>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com> |
Subject: | Re: BUG #17245: Index corruption involving deduplicated entries |
Date: | 2021-10-30 18:46:22 |
Message-ID: | CAH2-WzkN5aESSLfK7-yrYgsXxYUi__VzG4XpZFwXm98LUtoWuQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-bugs |
On Fri, Oct 29, 2021 at 9:12 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> It looks like a v14 issue. I can't reproduce it in 13, and I think a
> change in
>
> commit b4af70cb210393c9c8f41643acf6b213e21178e7
> Author: Peter Geoghegan <pg(at)bowt(dot)ie>
> Date: 2021-04-05 13:21:44 -0700
>
> Simplify state managed by VACUUM.
>
> which changed the logic for which relations are done in the leader:
> without realizing that the "shared_indstats == NULL || " piece is
> important to handle parallel-safe but too-small indexes correctly.
This was a simple blunder on my part. I didn't intend to change
anything about the behavior of parallel VACUUM, and so this commit was
just about the only one of mine from Postgres 14 that I didn't
consider as a possibility, even once. Mea culpa.
Attached is a draft patch that fixes the problem.
Also attached is a second patch. This adds assertions to
heap_index_delete_tuples() to catch cases where a heap TID in an index
points to an LP_UNUSED item in the heap (which is what this bug looked
like, mostly). It also checks for certain more or less equivalent
inconsistencies: the case where a heap TID in an index points to a
line pointer that's past the end of the heap page's line pointer
array, and the case where a heap TID in an index points directly to a
heap-only tuple.
Putting these new assertions in heap_index_delete_tuples() makes a lot
of sense. The interlocking rules between an index and a heap relation
are very simple here. They're also uniform across all supported index
access methods (of those that support index tuple deletion, currently
just nbtree, hash, and GiST). This is one of the very few code paths
where there is a buffer lock (not just a pin) on the index page held
throughout checking the heap; there is clearly no question of spurious
assertion failures due to a concurrent vacuum. Note that the index
deletion stuff I worked on in Postgres 14 batches-up TIDs in various
ways, and so in practice heap_index_delete_tuples() will tend to pass
over many of the TIDs from the index page, in many important cases
(not so much with GiST and hash right now, but I expect that to change
in the future).
I want to once again thank Kamigishi Rei for her tireless work on
this. If she didn't go to the trouble of providing us with so much
help, then it would have probably taken significantly longer to figure
out what was going on here. Also want to thank Andres.
--
Peter Geoghegan
Attachment | Content-Type | Size |
---|---|---|
v1-0002-Add-index-points-to-LP_UNUSED-item-assertions.patch | application/octet-stream | 1.6 KB |
v1-0001-Fix-parallel-index-vacuuming-bug.patch | application/octet-stream | 1.3 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Noah Misch | 2021-10-30 19:19:19 | Re: CREATE INDEX CONCURRENTLY does not index prepared xact's data |
Previous Message | Juan José Santamaría Flecha | 2021-10-30 18:31:05 | Re: BUG #17254: Crash with 0xC0000409 in pg_stat_statements when pg_stat_tmp\pgss_query_texts.stat exceeded 2GB. |