| From: | Kirill Reshke <reshkekirill(at)gmail(dot)com> | 
|---|---|
| To: | "Andrey M(dot) Borodin" <x4mmm(at)yandex-team(dot)ru> | 
| Cc: | Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Alexander Lakhin <exclusion(at)gmail(dot)com>, Andrey Borodin <amborodin86(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>, Jose Arthur Benetasso Villanova <jose(dot)arthur(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Nikolay Samokhvalov <samokhvalov(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> | 
| Subject: | Re: Amcheck verification of GiST and GIN | 
| Date: | 2024-11-26 11:45:03 | 
| Message-ID: | CALdSSPgFW1y3rC7ouPH4dDu+=AfVxo1JUsDg6Ar94J4xzEPRVA@mail.gmail.com | 
| Views: | Whole Thread | Raw Message | Download mbox | Resend email | 
| Thread: | |
| Lists: | pgsql-hackers | 
On Tue, 26 Nov 2024 at 12:22, Andrey M. Borodin <x4mmm(at)yandex-team(dot)ru> wrote:
>
>
>
> > On 26 Nov 2024, at 11:50, Kirill Reshke <reshkekirill(at)gmail(dot)com> wrote:
> >
> > I did mechanical patch rebase & beautification.
>
> Many thanks! Addressing Tomas' feedback was still one of top items on my todo list. And I'm more than happy that someone advance this patchset.
>
> > Notice my first patch, i did small refactoring as a separate contribution.
>
> It looks like these days such patches are committed via separate threads. Change looks good to me.
>
> >
> > === review from Tomas fixups
> >
> > 1) 0001-Refactor-amcheck-to-extract-common..
> >
> > This change was not correct (if stmt now need parenthesis )
> >
> >> - if (allequalimage && !_bt_allequalimage(indrel, false))
> >> - {
> >> - bool has_interval_ops = false;
> >> -
> >> - for (int i = 0; i < IndexRelationGetNumberOfKeyAttributes(indrel); i++)
> >> - if (indrel->rd_opfamily[i] == INTERVAL_BTREE_FAM_OID)
> >> - has_interval_ops = true;
> >> - ereport(ERROR,
> >> + for (int i = 0; i < IndexRelationGetNumberOfKeyAttributes(indrel); i++)
> >> + if (indrel->rd_opfamily[i] == INTERVAL_BTREE_FAM_OID)
> >> + has_interval_ops = true;
> >> + ereport(ERROR,
> >
> > Applied all tomas review comments.
> +1. All changes were correct, but there were some questions from Tomas.
>
> > The index relation AM mismatch
> > error message now looks like this:
> > ```
> > db1=# select bt_index_check('users_search_idx');
> > ERROR:  expected "btree" index as targets for verification
> > DETAIL:  Relation "users_search_idx" is a gin index.
> > ```
>
> Great!
>
> > I included Tomas to review-by section of this patch (in the commit message).
> > I also changed the commit message for this patch.
> >
> > 2) 0002-Add-gist_index_check-function-to-verify-G.patch
> > Did apply Tomas review comments. I left GIST's version of
> > PageGetItemIdCareful unchanged. Maybe we should have a common check in
> > verify_common.c, as Tomas was arguing for, but I'm not doing anything
> > for now, because I don't really understand its purpose. All other
> > review comments are addressed (i hope), if i'm not missing anything.
> >
> > I also included my fix for the memory leak mentioned by Tomas.
>
> The fix looks correct to me.
>
> > === problems with gin_index_check
> >
> > 1)
> > ```
> > reshke(at)ygp-jammy:~/postgres/contrib/amcheck$ ../../pgbin/bin/psql db1
> > psql (18devel)
> > Type "help" for help.
> >
> > db1=# select gin_index_check('users_search_idx');
> > ERROR:  index "users_search_idx" has wrong tuple order, block 35868, offset 33
> > ```
>
> Ughm... are you sure it's not induced by some collation issues? Did you create the index on same VM where test was performed?
>
I have an input for this. I generated a big table with 2 md5 fields
and user binary search to find the exact border where gin_index_check
stops working.
```
db1=# create table users_3084 (like users INCLUDING INDEXES);
CREATE TABLE
db1=# insert into users_3084 select * from users order by first_name limit 3084;
INSERT 0 3084
db1=# select gin_index_check('users_3084_first_name_last_name_idx');
 gin_index_check
-----------------
(1 row)
.....
db1=# create table users_3085 (like users INCLUDING INDEXES);
CREATE TABLE
db1=# insert into users_3085 select * from users order by first_name limit 3085;
INSERT 0 3085
db1=# select gin_index_check('users_3085_first_name_last_name_idx');
ERROR:  index "users_3085_first_name_last_name_idx" has wrong tuple
order, block 589, offset 45
```
DDL:
CREATE TABLE public.users_3085 (
    first_name text,
    last_name text
);
CREATE INDEX users_3085_first_name_idx ON public.users_3085 USING hash
(first_name);
CREATE INDEX users_3085_first_name_last_name_idx ON public.users_3085
USING gin (first_name public.gin_trgm_ops, last_name
public.gin_trgm_ops);
PFA contents of users_3085
> >
> > For some reason gin_index_check fails on my index. I am 99% there is
> > no corruption in it. Will try to investigate.
> >
> > 2) this is already discovered by Tomas, but I add my input here:
> >
> >
> > psql session:
> > ```
> > db1=# set log_min_messages to debug5;
> > SET
> > db1=# select gin_index_check('users_search_idx');
> >
> > ```
> >
> >
> > gdb session:
> > ```
> > (gdb) bt
> > #0  __pthread_kill_implementation (no_tid=0, signo=6,
> > threadid=140601454760896) at ./nptl/pthread_kill.c:44
> > #1  __pthread_kill_internal (signo=6, threadid=140601454760896) at
> > ./nptl/pthread_kill.c:78
> > #2  __GI___pthread_kill (threadid=140601454760896,
> > signo=signo(at)entry=6) at ./nptl/pthread_kill.c:89
> > #3  0x00007fe055af0476 in __GI_raise (sig=sig(at)entry=6) at
> > ../sysdeps/posix/raise.c:26
> > #4  0x00007fe055ad67f3 in __GI_abort () at ./stdlib/abort.c:79
> > #5  0x000055ea82af4ef0 in ExceptionalCondition
> > (conditionName=conditionName(at)entry=0x7fe04a87aa35
> > "ItemPointerIsValid(pointer)",
> >    fileName=fileName(at)entry=0x7fe04a87a928
> > "../../src/include/storage/itemptr.h",
> > lineNumber=lineNumber(at)entry=126) at assert.c:66
> > #6  0x00007fe04a871372 in ItemPointerGetOffsetNumber
> > (pointer=<optimized out>) at ../../src/include/storage/itemptr.h:126
> > #7  ItemPointerGetOffsetNumber (pointer=<optimized out>) at
> > ../../src/include/storage/itemptr.h:124
> > #8  gin_check_posting_tree_parent_keys_consistency
> > (posting_tree_root=<optimized out>, rel=<optimized out>) at
> > verify_gin.c:296
> > #9  gin_check_parent_keys_consistency (rel=rel(at)entry=0x7fe04a8aa328,
> > heaprel=heaprel(at)entry=0x7fe04a8a9db8,
> > callback_state=callback_state(at)entry=0x0,
> > readonly=readonly(at)entry=false) at verify_gin.c:597
> > #10 0x00007fe04a87098d in amcheck_lock_relation_and_check
> > (indrelid=16488, am_id=am_id(at)entry=2742,
> > check=check(at)entry=0x7fe04a870a80 <gin_check_parent_keys_consistency>,
> > lockmode=lockmode(at)entry=1,
> >    state=state(at)entry=0x0) at verify_common.c:132
> > #11 0x00007fe04a871e34 in gin_index_check (fcinfo=<optimized out>) at
> > verify_gin.c:81
> > #12 0x000055ea827cc275 in ExecInterpExpr (state=0x55ea84903390,
> > econtext=0x55ea84903138, isnull=<optimized out>) at
> > execExprInterp.c:770
> > #13 0x000055ea82804fdc in ExecEvalExprSwitchContext
> > (isNull=0x7ffeba7fdd37, econtext=0x55ea84903138, state=0x55ea84903390)
> > at ../../../src/include/executor/executor.h:367
> > #14 ExecProject (projInfo=0x55ea84903388) at
> > ../../../src/include/executor/executor.h:401
> > #15 ExecResult (pstate=<optimized out>) at nodeResult.c:135
> > #16 0x000055ea827d007a in ExecProcNode (node=0x55ea84903028) at
> > ../../../src/include/executor/executor.h:278
> > #17 ExecutePlan (execute_once=<optimized out>, dest=0x55ea84901940,
> > direction=<optimized out>, numberTuples=0, sendTuples=<optimized out>,
> > operation=CMD_SELECT, use_parallel_mode=<optimized out>,
> >    planstate=0x55ea84903028, estate=0x55ea84902e00) at execMain.c:1655
> > #18 standard_ExecutorRun (queryDesc=0x55ea8485c1a0,
> > direction=<optimized out>, count=0, execute_once=<optimized out>) at
> > execMain.c:362
> > #19 0x000055ea829ad6df in PortalRunSelect (portal=0x55ea848b1810,
> > forward=<optimized out>, count=0, dest=<optimized out>) at
> > pquery.c:924
> > #20 0x000055ea829aedc1 in PortalRun
> > (portal=portal(at)entry=0x55ea848b1810,
> > count=count(at)entry=9223372036854775807,
> > isTopLevel=isTopLevel(at)entry=true, run_once=run_once(at)entry=true,
> >    dest=dest(at)entry=0x55ea84901940,
> > altdest=altdest(at)entry=0x55ea84901940, qc=0x7ffeba7fdfd0) at
> > pquery.c:768
> > #21 0x000055ea829aab47 in exec_simple_query
> > (query_string=0x55ea84831250 "select
> > gin_index_check('users_search_idx');") at postgres.c:1283
> > #22 0x000055ea829ac777 in PostgresMain (dbname=<optimized out>,
> > username=<optimized out>) at postgres.c:4798
> > #23 0x000055ea829a6a33 in BackendMain (startup_data=<optimized out>,
> > startup_data_len=<optimized out>) at backend_startup.c:107
> > #24 0x000055ea8290122f in postmaster_child_launch
> > (child_type=<optimized out>, child_slot=1,
> > startup_data=startup_data(at)entry=0x7ffeba7fe48c "",
> > startup_data_len=startup_data_len(at)entry=4,
> >    client_sock=client_sock(at)entry=0x7ffeba7fe490) at launch_backend.c:274
> > #25 0x000055ea82904c3f in BackendStartup (client_sock=0x7ffeba7fe490)
> > at postmaster.c:3377
> > #26 ServerLoop () at postmaster.c:1663
> > #27 0x000055ea8290656b in PostmasterMain (argc=argc(at)entry=3,
> > argv=argv(at)entry=0x55ea8482ab10) at postmaster.c:1361
> > #28 0x000055ea825ecc0a in main (argc=3, argv=0x55ea8482ab10) at main.c:196
> > (gdb)
> > ```
> >
> > We also need to change the default version of the extension to 1.5.
> > I'm not sure which patch of this series should do that.
>
> +1
>
> >
> > ====
> > Overall I think 0001 & 0002 are ready as-is. 0003 is maybe ok.  Other
> > patches need more review rounds.
>
> Yeah, I agree with this analysis.
I now don't: im leaning towards the option to squash 0001-0002 into one patch.
>
> Best regards, Andrey Borodin.
-- 
Best regards,
Kirill Reshke
| Attachment | Content-Type | Size | 
|---|---|---|
| users_3085 | application/octet-stream | 198.8 KB | 
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Peter Eisentraut | 2024-11-26 12:18:07 | Re: SQL:2011 application time | 
| Previous Message | Alvaro Herrera | 2024-11-26 11:40:39 | Re: [PATCH] Missing Assert in the code |