Re: IRe: BUG #16792: silent corruption of GIN index resulting in SELECTs returning non-matching rows

From: Pawel Kudzia <kudzia(at)gmail(dot)com>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc: Peter Geoghegan <pg(at)bowt(dot)ie>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: IRe: BUG #16792: silent corruption of GIN index resulting in SELECTs returning non-matching rows
Date: 2021-07-18 08:27:32
Message-ID: CAJYBUS_O8ZYvKgF8A+gwR3=ja0SZGmBBee9gVWBpHYrULMOHVQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On Sat, Jul 17, 2021 at 10:51 PM Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
>
> On 16/07/2021 10:06, Pawel Kudzia wrote:
> >
> >
> > On Thu, Jul 15, 2021 at 8:49 PM Heikki Linnakangas <hlinnaka(at)iki(dot)fi
> > <mailto:hlinnaka(at)iki(dot)fi>> wrote:
> >
> > thank you for the hint. i've messed up patching. now i can get one step
> > further:
> >
> > data=# CREATE EXTENSION amcheck;
> > CREATE EXTENSION
> > data=# gin_index_parent_check('entity_attribute_name_ids_gin');
> > ERROR: syntax error at or near "gin_index_parent_check"
> > LINE 1: gin_index_parent_check('entity_attribute_name_ids_gin');
> > ^
>
> Ah, sorry, you need to call it with SELECT, like:
>
> SELECT gin_index_parent_check('entity_attribute_name_ids_gin');
>

That worked. It gave me gin_index_parent_check.txt, attached.

> > i've also applied trace-gin-readbuffer-2.patch
>
> Darn, I missed one function that's used to read a page when descending
> the GIN tree. That explains the seemingly nonsensical accesses in the
> log - the trace left out some crucial accesses.
>
> Attached is a new version of that debugging patch. Please repeat the
> same steps as before with this:
>
> 1. Apply the patch (remove previous patch first if necessary)
>
> 2. Compile Postgres, "make install"
>
> 3. Clear the log, start postgres
>
> 4. Run the query again:
>
> SELECT ctid, entity_id FROM entity WHERE
> ( attribute_name_ids && '{1737}' ) AND NOT ( (attribute_name_ids||0)
> && '{1737}') LIMIT 10;
>

for clarity - i'm also running "set enable_seqscan=off" before that
select to ensure that GIN is used.

SELECT gave me

ctid | entity_id
--------------+-----------
(4002784,1) | 38048120
(4002869,14) | 95333744
(2 rows)

> 5. Stop the server.
>
> 6. Extract the content of the accessed index blocks:
>
> cat <path to postgres log> | perl -ne '/.*read gin blk (\d+)/ && print
> "$1\n" ' | sort -n |uniq > /tmp/blocknums
>
> psql data -c "create temp table blocknums(blknum int); copy blocknums
> from '/tmp/blocknums'; copy (select blknum,
> get_raw_page('entity_attribute_name_ids_gin', blknum) from blocknums) to
> '/tmp/block-contents';"
>
> 7. Send over /tmp/blocknums, /tmp/block-contents and the postgres log again.
>

I'm sending those over directly to your mail.

> Thank you for your patience!
>

Thanks a lot for the investigation!

--
regards,
Pawel Kudzia

Attachment Content-Type Size
gin_index_parent_check.txt text/plain 12.9 KB

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message PG Bug reporting form 2021-07-18 21:00:01 BUG #17113: Assert failed on calling a function fixed after an extension reload
Previous Message Andrey Borodin 2021-07-18 05:17:28 Re: CREATE INDEX CONCURRENTLY does not index prepared xact's data