Re: Amcheck verification of GiST and GIN

From: Kirill Reshke <reshkekirill(at)gmail(dot)com>
To: "Andrey M(dot) Borodin" <x4mmm(at)yandex-team(dot)ru>
Cc: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Alexander Lakhin <exclusion(at)gmail(dot)com>, Andrey Borodin <amborodin86(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>, Jose Arthur Benetasso Villanova <jose(dot)arthur(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Nikolay Samokhvalov <samokhvalov(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Amcheck verification of GiST and GIN
Date: 2024-11-26 11:45:03
Message-ID: CALdSSPgFW1y3rC7ouPH4dDu+=AfVxo1JUsDg6Ar94J4xzEPRVA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, 26 Nov 2024 at 12:22, Andrey M. Borodin <x4mmm(at)yandex-team(dot)ru> wrote:
>
>
>
> > On 26 Nov 2024, at 11:50, Kirill Reshke <reshkekirill(at)gmail(dot)com> wrote:
> >
> > I did mechanical patch rebase & beautification.
>
> Many thanks! Addressing Tomas' feedback was still one of top items on my todo list. And I'm more than happy that someone advance this patchset.
>
> > Notice my first patch, i did small refactoring as a separate contribution.
>
> It looks like these days such patches are committed via separate threads. Change looks good to me.
>
> >
> > === review from Tomas fixups
> >
> > 1) 0001-Refactor-amcheck-to-extract-common..
> >
> > This change was not correct (if stmt now need parenthesis )
> >
> >> - if (allequalimage && !_bt_allequalimage(indrel, false))
> >> - {
> >> - bool has_interval_ops = false;
> >> -
> >> - for (int i = 0; i < IndexRelationGetNumberOfKeyAttributes(indrel); i++)
> >> - if (indrel->rd_opfamily[i] == INTERVAL_BTREE_FAM_OID)
> >> - has_interval_ops = true;
> >> - ereport(ERROR,
> >> + for (int i = 0; i < IndexRelationGetNumberOfKeyAttributes(indrel); i++)
> >> + if (indrel->rd_opfamily[i] == INTERVAL_BTREE_FAM_OID)
> >> + has_interval_ops = true;
> >> + ereport(ERROR,
> >
> > Applied all tomas review comments.
> +1. All changes were correct, but there were some questions from Tomas.
>
> > The index relation AM mismatch
> > error message now looks like this:
> > ```
> > db1=# select bt_index_check('users_search_idx');
> > ERROR: expected "btree" index as targets for verification
> > DETAIL: Relation "users_search_idx" is a gin index.
> > ```
>
> Great!
>
> > I included Tomas to review-by section of this patch (in the commit message).
> > I also changed the commit message for this patch.
> >
> > 2) 0002-Add-gist_index_check-function-to-verify-G.patch
> > Did apply Tomas review comments. I left GIST's version of
> > PageGetItemIdCareful unchanged. Maybe we should have a common check in
> > verify_common.c, as Tomas was arguing for, but I'm not doing anything
> > for now, because I don't really understand its purpose. All other
> > review comments are addressed (i hope), if i'm not missing anything.
> >
> > I also included my fix for the memory leak mentioned by Tomas.
>
> The fix looks correct to me.
>
> > === problems with gin_index_check
> >
> > 1)
> > ```
> > reshke(at)ygp-jammy:~/postgres/contrib/amcheck$ ../../pgbin/bin/psql db1
> > psql (18devel)
> > Type "help" for help.
> >
> > db1=# select gin_index_check('users_search_idx');
> > ERROR: index "users_search_idx" has wrong tuple order, block 35868, offset 33
> > ```
>
> Ughm... are you sure it's not induced by some collation issues? Did you create the index on same VM where test was performed?
>

I have an input for this. I generated a big table with 2 md5 fields
and user binary search to find the exact border where gin_index_check
stops working.

```

db1=# create table users_3084 (like users INCLUDING INDEXES);
CREATE TABLE
db1=# insert into users_3084 select * from users order by first_name limit 3084;
INSERT 0 3084
db1=# select gin_index_check('users_3084_first_name_last_name_idx');
gin_index_check
-----------------

(1 row)
.....
db1=# create table users_3085 (like users INCLUDING INDEXES);
CREATE TABLE
db1=# insert into users_3085 select * from users order by first_name limit 3085;
INSERT 0 3085
db1=# select gin_index_check('users_3085_first_name_last_name_idx');
ERROR: index "users_3085_first_name_last_name_idx" has wrong tuple
order, block 589, offset 45
```

DDL:

CREATE TABLE public.users_3085 (
first_name text,
last_name text
);
CREATE INDEX users_3085_first_name_idx ON public.users_3085 USING hash
(first_name);
CREATE INDEX users_3085_first_name_last_name_idx ON public.users_3085
USING gin (first_name public.gin_trgm_ops, last_name
public.gin_trgm_ops);

PFA contents of users_3085

> >
> > For some reason gin_index_check fails on my index. I am 99% there is
> > no corruption in it. Will try to investigate.
> >
> > 2) this is already discovered by Tomas, but I add my input here:
> >
> >
> > psql session:
> > ```
> > db1=# set log_min_messages to debug5;
> > SET
> > db1=# select gin_index_check('users_search_idx');
> >
> > ```
> >
> >
> > gdb session:
> > ```
> > (gdb) bt
> > #0 __pthread_kill_implementation (no_tid=0, signo=6,
> > threadid=140601454760896) at ./nptl/pthread_kill.c:44
> > #1 __pthread_kill_internal (signo=6, threadid=140601454760896) at
> > ./nptl/pthread_kill.c:78
> > #2 __GI___pthread_kill (threadid=140601454760896,
> > signo=signo(at)entry=6) at ./nptl/pthread_kill.c:89
> > #3 0x00007fe055af0476 in __GI_raise (sig=sig(at)entry=6) at
> > ../sysdeps/posix/raise.c:26
> > #4 0x00007fe055ad67f3 in __GI_abort () at ./stdlib/abort.c:79
> > #5 0x000055ea82af4ef0 in ExceptionalCondition
> > (conditionName=conditionName(at)entry=0x7fe04a87aa35
> > "ItemPointerIsValid(pointer)",
> > fileName=fileName(at)entry=0x7fe04a87a928
> > "../../src/include/storage/itemptr.h",
> > lineNumber=lineNumber(at)entry=126) at assert.c:66
> > #6 0x00007fe04a871372 in ItemPointerGetOffsetNumber
> > (pointer=<optimized out>) at ../../src/include/storage/itemptr.h:126
> > #7 ItemPointerGetOffsetNumber (pointer=<optimized out>) at
> > ../../src/include/storage/itemptr.h:124
> > #8 gin_check_posting_tree_parent_keys_consistency
> > (posting_tree_root=<optimized out>, rel=<optimized out>) at
> > verify_gin.c:296
> > #9 gin_check_parent_keys_consistency (rel=rel(at)entry=0x7fe04a8aa328,
> > heaprel=heaprel(at)entry=0x7fe04a8a9db8,
> > callback_state=callback_state(at)entry=0x0,
> > readonly=readonly(at)entry=false) at verify_gin.c:597
> > #10 0x00007fe04a87098d in amcheck_lock_relation_and_check
> > (indrelid=16488, am_id=am_id(at)entry=2742,
> > check=check(at)entry=0x7fe04a870a80 <gin_check_parent_keys_consistency>,
> > lockmode=lockmode(at)entry=1,
> > state=state(at)entry=0x0) at verify_common.c:132
> > #11 0x00007fe04a871e34 in gin_index_check (fcinfo=<optimized out>) at
> > verify_gin.c:81
> > #12 0x000055ea827cc275 in ExecInterpExpr (state=0x55ea84903390,
> > econtext=0x55ea84903138, isnull=<optimized out>) at
> > execExprInterp.c:770
> > #13 0x000055ea82804fdc in ExecEvalExprSwitchContext
> > (isNull=0x7ffeba7fdd37, econtext=0x55ea84903138, state=0x55ea84903390)
> > at ../../../src/include/executor/executor.h:367
> > #14 ExecProject (projInfo=0x55ea84903388) at
> > ../../../src/include/executor/executor.h:401
> > #15 ExecResult (pstate=<optimized out>) at nodeResult.c:135
> > #16 0x000055ea827d007a in ExecProcNode (node=0x55ea84903028) at
> > ../../../src/include/executor/executor.h:278
> > #17 ExecutePlan (execute_once=<optimized out>, dest=0x55ea84901940,
> > direction=<optimized out>, numberTuples=0, sendTuples=<optimized out>,
> > operation=CMD_SELECT, use_parallel_mode=<optimized out>,
> > planstate=0x55ea84903028, estate=0x55ea84902e00) at execMain.c:1655
> > #18 standard_ExecutorRun (queryDesc=0x55ea8485c1a0,
> > direction=<optimized out>, count=0, execute_once=<optimized out>) at
> > execMain.c:362
> > #19 0x000055ea829ad6df in PortalRunSelect (portal=0x55ea848b1810,
> > forward=<optimized out>, count=0, dest=<optimized out>) at
> > pquery.c:924
> > #20 0x000055ea829aedc1 in PortalRun
> > (portal=portal(at)entry=0x55ea848b1810,
> > count=count(at)entry=9223372036854775807,
> > isTopLevel=isTopLevel(at)entry=true, run_once=run_once(at)entry=true,
> > dest=dest(at)entry=0x55ea84901940,
> > altdest=altdest(at)entry=0x55ea84901940, qc=0x7ffeba7fdfd0) at
> > pquery.c:768
> > #21 0x000055ea829aab47 in exec_simple_query
> > (query_string=0x55ea84831250 "select
> > gin_index_check('users_search_idx');") at postgres.c:1283
> > #22 0x000055ea829ac777 in PostgresMain (dbname=<optimized out>,
> > username=<optimized out>) at postgres.c:4798
> > #23 0x000055ea829a6a33 in BackendMain (startup_data=<optimized out>,
> > startup_data_len=<optimized out>) at backend_startup.c:107
> > #24 0x000055ea8290122f in postmaster_child_launch
> > (child_type=<optimized out>, child_slot=1,
> > startup_data=startup_data(at)entry=0x7ffeba7fe48c "",
> > startup_data_len=startup_data_len(at)entry=4,
> > client_sock=client_sock(at)entry=0x7ffeba7fe490) at launch_backend.c:274
> > #25 0x000055ea82904c3f in BackendStartup (client_sock=0x7ffeba7fe490)
> > at postmaster.c:3377
> > #26 ServerLoop () at postmaster.c:1663
> > #27 0x000055ea8290656b in PostmasterMain (argc=argc(at)entry=3,
> > argv=argv(at)entry=0x55ea8482ab10) at postmaster.c:1361
> > #28 0x000055ea825ecc0a in main (argc=3, argv=0x55ea8482ab10) at main.c:196
> > (gdb)
> > ```
> >
> > We also need to change the default version of the extension to 1.5.
> > I'm not sure which patch of this series should do that.
>
> +1
>
> >
> > ====
> > Overall I think 0001 & 0002 are ready as-is. 0003 is maybe ok. Other
> > patches need more review rounds.
>
> Yeah, I agree with this analysis.

I now don't: im leaning towards the option to squash 0001-0002 into one patch.

>
> Best regards, Andrey Borodin.

--
Best regards,
Kirill Reshke

Attachment Content-Type Size
users_3085 application/octet-stream 198.8 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2024-11-26 12:18:07 Re: SQL:2011 application time
Previous Message Alvaro Herrera 2024-11-26 11:40:39 Re: [PATCH] Missing Assert in the code