Re: IRe: BUG #16792: silent corruption of GIN index resulting in SELECTs returning non-matching rows

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Pawel Kudzia <kudzia(at)gmail(dot)com>
Cc: Peter Geoghegan <pg(at)bowt(dot)ie>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: IRe: BUG #16792: silent corruption of GIN index resulting in SELECTs returning non-matching rows
Date: 2021-07-20 11:55:58
Message-ID: 14dfcfab-36df-a92d-a42c-01065acf0036@iki.fi
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On 18/07/2021 11:27, Pawel Kudzia wrote:
> I'm sending those over directly to your mail.

Thanks! I looked through those logs, and I'm now not seeing anything
wrong with the pages involved. The query first scans the pending list,
and doesn't find any matches there. It then descends the entry tree,
finds pointer to a posting tree for id 1373, then scans the posting
tree. For some reason, the TIDs of those offending rows, (4002784,1) and
(4002869,14) are in the posting tree for key '1373', even though the
heap tuples don't contain that key. Both of those TIDs are in the same
TID list, on page 255179:

> postgres=# select * from (select blknum, ((gin_leafpage_items(content)).*) from ginblocks where blknum=255179) x where tids::text like '%(4002869,14)%';
> blknum | first_tid | nbytes |
> tids
>
>
> --------+--------------+--------+-------------------------------------------------------------------------------------------------------------------------------------------------------------
> ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> -------------------------------------------------------------------------------------------
> 255179 | (4002720,16) | 87 | {"(4002720,16)","(4002721,3)","(4002723,5)","(4002739,11)","(4002743,14)","(4002744,2)","(4002784,1)","(4002784,10)","(4002790,46)","(4002790,47)","(4002792
> ,22)","(4002792,23)","(4002792,24)","(4002796,22)","(4002798,2)","(4002800,7)","(4002802,27)","(4002802,28)","(4002803,22)","(4002805,28)","(4002816,13)","(4002817,19)","(4002820,2)","(40028
> 25,17)","(4002854,7)","(4002855,13)","(4002855,14)","(4002869,14)","(4006217,3)","(4006217,11)","(4006307,11)","(4006330,4)","(4006330,8)","(4006331,6)","(4006332,9)","(4006332,10)","(400633
> 3,9)","(4006335,2)","(4006338,1)","(4006342,7)","(4006343,5)","(4006407,2)","(4008912,2)"}
> (1 row)
I've used pg_filedump and pageinspect on all of the pages on the path to
this entry, but I don't see anything wrong with the index.

So how did it end up like that? I'm afraid I have no clue :-(. Peter's
hunch that it's some bad interaction between deleted pages and the
pending list sounds plausible, but looking at the code, I don't see any
concrete bugs like that.

We have one more lead that we haven't fully explored: amcheck reported
this error:

ERROR: index "entity_attribute_name_ids_gin": tid exceeds parent's high
key in postingTree leaf on block 321746

Give that the GIN amheck patch is still very experimental, that could
well be a bug in amcheck instead of real corruption. Also, that error
was *not* in the posting tree for key '1373'. I hacked amcheck on my
laptop to run only on the posting tree or '1373', and it didn't find any
errors there.

But let's check it out. I modified 'amcheck' to print more information.
I also changed it so that it doesn't stop at the first error, but keeps
going, and reports all issues as warnings.

Pawel: Could you try the attached version of amcheck? Please apply the
attached patch (you'll have to un-apply the previous version or start
from a clean source tree), run "SELECT
gin_index_parent_check('entity_attribute_name_ids_gin');" again, and
send over the result? It should print a bunch of INFOs and WARNINGs to
the client's stderr, so a good way to capture that is something like this:

psql -c "select
gin_index_parent_check('entity_attribute_name_ids_gin');" 2> amcheck-log

Let's see if that reveals something interesting. If not, what can we do
next?

1. Let's fix the issues with the triconsistent functions that we
discussed earlier. They don't explain what we're seeing here, but should
be cleaned up nevertheless.

2. Let's add more sanity checks wherever we read a GIN page, to check
that it's of the expected kind. Again we have no theory on what sequence
of events could cause this, but having more sanity checks seems like a
good idea in any case.

3. Run the test case again and wait for the issue to re-appear. This
time, let's try to capture the WAL, so that we can trace back the
modifications to the pages that lead to the situation.

Any other ideas?

- Heikki

Attachment Content-Type Size
v3-0001-Amcheck-for-GIN-13stable.patch text/x-patch 37.1 KB

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Pawel Kudzia 2021-07-20 12:22:35 Re: IRe: BUG #16792: silent corruption of GIN index resulting in SELECTs returning non-matching rows
Previous Message Kyotaro Horiguchi 2021-07-20 04:18:20 Re: BUG #17103: WAL segments are not removed after exceeding max_slot_wal_keep_size