Re: Amcheck verification of GiST and GIN

From: Kirill Reshke <reshkekirill(at)gmail(dot)com>
To: "Andrey M(dot) Borodin" <x4mmm(at)yandex-team(dot)ru>
Cc: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Alexander Lakhin <exclusion(at)gmail(dot)com>, Andrey Borodin <amborodin86(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>, Jose Arthur Benetasso Villanova <jose(dot)arthur(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Nikolay Samokhvalov <samokhvalov(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Amcheck verification of GiST and GIN
Date: 2024-11-26 06:50:20
Message-ID: CALdSSPhFcpyEHzS-8F+j2+v=RCp8zaKd9FHPULq2xSV0t4n4wg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, 15 Jul 2024 at 00:00, Andrey M. Borodin <x4mmm(at)yandex-team(dot)ru> wrote:
>
> Hi Tomas!
>
> Thank you so much for your interest in the patchset.
>
> > On 10 Jul 2024, at 19:01, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
> >
> > I realized amcheck GIN/GiST support would be useful for testing my
> > patches adding parallel builds for these index types, so I decided to
> > take a look at this and do an initial review today.
>
> Great! Thank you!
>
> > Attached is a patch series with a extra commits to keep the review
> > comments and patches adjusting the formatting by pgindent (the patch
> > seems far enough for this).
>
> I was hoping to address your review comments this weekend, but unfortunately I could not. I'll do this ASAP, but at least I decided to post answers on questions.
>
> >
> > Let me quickly go through the review comments:
> >
> > 1) Not sure I like 'amcheck.c' very much, I'd probably go with something
> > like 'verify_common.c' to match naming of the other files. But it's just
> > nitpicking and I can live with it.
>
> Any name works for me. We have tens of files ending with "common.c", so I think that's a good way to go.
>
> > 2) amcheck_lock_relation_and_check seems to be the most important
> > function, yet there's no comment explaining what it does :-(
>
> Makes sense.
>
> > 3) amcheck_lock_relation_and_check still has a TODO to add the correct
> > name of the AM
>
> Yes, I've discovered it during rebase and added TODO.
>
> > 4) Do we actually need amcheck_index_mainfork_expected as a separate
> > function, or could it be a part of index_checkable?
>
> It was separate function before refactoring...
>
> > 5) The comment for heaptuplespresent says "debug counter" but that does
> > not really explain what it's for. (I see verify_nbtree has the same
> > comment, but maybe let's improve that.)
>
> It's there for a DEBUG1 message
> ereport(DEBUG1,
> (errmsg_internal("finished verifying presence of " INT64_FORMAT " tuples from table \"%s\" with bitset %.2f%% set",
> But the message is gone for GiST. Perhaps, let's restore this message?
>
> >
> > 6) I'd suggest moving the GISTSTATE + blocknum fields to the beginning
> > of GistCheckState, it seems more natural to start with "generic" fields.
>
> Makes sense.
>
> > 7) I'd adjust the gist_check_parent_keys_consistency comment a bit, to
> > explain what the function does first, and only then explain how.
>
> Makes sense.
>
> > 8) We seem to be copying PageGetItemIdCareful() around, right? And the
> > copy in _gist.c still references nbtree - I guess that's not right.
>
> Version differ in two aspects:
> 1. Size of opaque data may be different. But we can pass it as a parameter.
> 2. GIN's line pointer verification is slightly more strict.
>
> >
> > 9) Why is the GIN function called gin_index_parent_check() and not
> > simply gin_index_check() as for the other AMs?
>
> AFAIR function should be called _parent_ if it takes ShareLock. gin_index_parent_check() does not, so I think we should rename it.
>
> > 10) The debug in gin_check_posting_tree_parent_keys_consistency triggers
> > assert when running with client_min_messages='debug5', it seems to be
> > accessing bogus item pointers.
> >
> > 11) Why does it add pg_amcheck support only for GiST and not GIN?
>
> GiST part is by far more polished. When we were discussing current implementation with Peter G, we decided that we could finish work on GiST, and then proceed to GIN. Main concern is about GIN's locking model.
>
>
>
>
> > On 12 Jul 2024, at 15:16, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
> >
> > On 7/10/24 18:01, Tomas Vondra wrote:
> >> ...
> >>
> >> That's all for now. I'll add this to the stress-testing tests of my
> >> index build patches, and if that triggers more issues I'll report those.
> >>
> >
> > As mentioned a couple days ago, I started using this patch to validate
> > the patches adding parallel builds to GIN and GiST indexes - I scripts
> > to stress-test the builds, and I added the new amcheck functions as
> > another validation step.
> >
> > For GIN indexes it didn't find anything new (in either this or my
> > patches), aside from the assert crash I already reported.
> >
> > But for GiST it turned out to be very valuable - it did actually find an
> > issue in my patches, or rather confirm my hypothesis that the way the
> > patch generates fake LSN may not be quite right.
> >
> > In particular, I've observed these two issues:
> >
> > ERROR: heap tuple (13315,38) from table "planet_osm_roads" lacks
> > matching index tuple within index "roads_7_1_idx"
> >
> > ERROR: index "roads_7_7_idx" has inconsistent records on page 23723
> > offset 113
> >
> > And those consistency issues are real - I've managed to reproduce issues
> > with incorrect query results (by comparing the results to an index built
> > without parallelism).
> >
> > So that's nice - it shows the value of this patch, and I like it.
>
> That's great!
>
> > One thing I've been wondering about is that currently amcheck (in
> > general, not just these new GIN/GiST functions) errors out on the first
> > issue, because it does ereport(ERROR). Which is good enough to decide if
> > there is some corruption, but a bit inconvenient if you need to assess
> > how much corruption is there. For example when investigating the issue
> > in my patch it would have been great to know if there's just one broken
> > page, or if there are dozens/hundreds/thousands of them.
> >
> > I'd imagine we could have a flag which says whether to fail on the first
> > issue, or keep looking at future pages. Essentially, whether to do
> > ereport(ERROR) or ereport(WARNING). But maybe that's a dead-end, and
> > once we find the first issue it's futile to inspect the rest of the
> > index, because it can be garbage. Not sure. In any case, it's not up to
> > this patch to invent that.
>
> The thing is amcheck tries hard to to do a core dump. It's still possible to crash it with garbage. But if we continue check after encountering first corruption - increase in SegFaults is inevitable.
>
>
> Thank you! I hope I can get back to code ASAP.
>
>
> Best regards, Andrey Borodin.
>
Hi!
I did mechanical patch rebase & beautification.

Notice my first patch, i did small refactoring as a separate contribution.

=== review from Tomas fixups

1) 0001-Refactor-amcheck-to-extract-common..

This change was not correct (if stmt now need parenthesis )

> - if (allequalimage && !_bt_allequalimage(indrel, false))
> - {
> - bool has_interval_ops = false;
> -
> - for (int i = 0; i < IndexRelationGetNumberOfKeyAttributes(indrel); i++)
> - if (indrel->rd_opfamily[i] == INTERVAL_BTREE_FAM_OID)
> - has_interval_ops = true;
> - ereport(ERROR,
> + for (int i = 0; i < IndexRelationGetNumberOfKeyAttributes(indrel); i++)
> + if (indrel->rd_opfamily[i] == INTERVAL_BTREE_FAM_OID)
> + has_interval_ops = true;
> + ereport(ERROR,

Applied all tomas review comments. The index relation AM mismatch
error message now looks like this:
```
db1=# select bt_index_check('users_search_idx');
ERROR: expected "btree" index as targets for verification
DETAIL: Relation "users_search_idx" is a gin index.
```
I included Tomas to review-by section of this patch (in the commit message).
I also changed the commit message for this patch.

2) 0002-Add-gist_index_check-function-to-verify-G.patch
Did apply Tomas review comments. I left GIST's version of
PageGetItemIdCareful unchanged. Maybe we should have a common check in
verify_common.c, as Tomas was arguing for, but I'm not doing anything
for now, because I don't really understand its purpose. All other
review comments are addressed (i hope), if i'm not missing anything.

I also included my fix for the memory leak mentioned by Tomas.

3) 003-Add gin_index_check() to verify GIN index
The only change is gin_index_parent_check() -> gin_index_check()

4) Applying: Add GiST support to pg_amcheck

Simply rebased & run pgident.

==== tests

make check runs with success

=== problems with gin_index_check

1)
```
reshke(at)ygp-jammy:~/postgres/contrib/amcheck$ ../../pgbin/bin/psql db1
psql (18devel)
Type "help" for help.

db1=# select gin_index_check('users_search_idx');
ERROR: index "users_search_idx" has wrong tuple order, block 35868, offset 33
```

For some reason gin_index_check fails on my index. I am 99% there is
no corruption in it. Will try to investigate.

2) this is already discovered by Tomas, but I add my input here:

psql session:
```
db1=# set log_min_messages to debug5;
SET
db1=# select gin_index_check('users_search_idx');

```

gdb session:
```
(gdb) bt
#0 __pthread_kill_implementation (no_tid=0, signo=6,
threadid=140601454760896) at ./nptl/pthread_kill.c:44
#1 __pthread_kill_internal (signo=6, threadid=140601454760896) at
./nptl/pthread_kill.c:78
#2 __GI___pthread_kill (threadid=140601454760896,
signo=signo(at)entry=6) at ./nptl/pthread_kill.c:89
#3 0x00007fe055af0476 in __GI_raise (sig=sig(at)entry=6) at
../sysdeps/posix/raise.c:26
#4 0x00007fe055ad67f3 in __GI_abort () at ./stdlib/abort.c:79
#5 0x000055ea82af4ef0 in ExceptionalCondition
(conditionName=conditionName(at)entry=0x7fe04a87aa35
"ItemPointerIsValid(pointer)",
fileName=fileName(at)entry=0x7fe04a87a928
"../../src/include/storage/itemptr.h",
lineNumber=lineNumber(at)entry=126) at assert.c:66
#6 0x00007fe04a871372 in ItemPointerGetOffsetNumber
(pointer=<optimized out>) at ../../src/include/storage/itemptr.h:126
#7 ItemPointerGetOffsetNumber (pointer=<optimized out>) at
../../src/include/storage/itemptr.h:124
#8 gin_check_posting_tree_parent_keys_consistency
(posting_tree_root=<optimized out>, rel=<optimized out>) at
verify_gin.c:296
#9 gin_check_parent_keys_consistency (rel=rel(at)entry=0x7fe04a8aa328,
heaprel=heaprel(at)entry=0x7fe04a8a9db8,
callback_state=callback_state(at)entry=0x0,
readonly=readonly(at)entry=false) at verify_gin.c:597
#10 0x00007fe04a87098d in amcheck_lock_relation_and_check
(indrelid=16488, am_id=am_id(at)entry=2742,
check=check(at)entry=0x7fe04a870a80 <gin_check_parent_keys_consistency>,
lockmode=lockmode(at)entry=1,
state=state(at)entry=0x0) at verify_common.c:132
#11 0x00007fe04a871e34 in gin_index_check (fcinfo=<optimized out>) at
verify_gin.c:81
#12 0x000055ea827cc275 in ExecInterpExpr (state=0x55ea84903390,
econtext=0x55ea84903138, isnull=<optimized out>) at
execExprInterp.c:770
#13 0x000055ea82804fdc in ExecEvalExprSwitchContext
(isNull=0x7ffeba7fdd37, econtext=0x55ea84903138, state=0x55ea84903390)
at ../../../src/include/executor/executor.h:367
#14 ExecProject (projInfo=0x55ea84903388) at
../../../src/include/executor/executor.h:401
#15 ExecResult (pstate=<optimized out>) at nodeResult.c:135
#16 0x000055ea827d007a in ExecProcNode (node=0x55ea84903028) at
../../../src/include/executor/executor.h:278
#17 ExecutePlan (execute_once=<optimized out>, dest=0x55ea84901940,
direction=<optimized out>, numberTuples=0, sendTuples=<optimized out>,
operation=CMD_SELECT, use_parallel_mode=<optimized out>,
planstate=0x55ea84903028, estate=0x55ea84902e00) at execMain.c:1655
#18 standard_ExecutorRun (queryDesc=0x55ea8485c1a0,
direction=<optimized out>, count=0, execute_once=<optimized out>) at
execMain.c:362
#19 0x000055ea829ad6df in PortalRunSelect (portal=0x55ea848b1810,
forward=<optimized out>, count=0, dest=<optimized out>) at
pquery.c:924
#20 0x000055ea829aedc1 in PortalRun
(portal=portal(at)entry=0x55ea848b1810,
count=count(at)entry=9223372036854775807,
isTopLevel=isTopLevel(at)entry=true, run_once=run_once(at)entry=true,
dest=dest(at)entry=0x55ea84901940,
altdest=altdest(at)entry=0x55ea84901940, qc=0x7ffeba7fdfd0) at
pquery.c:768
#21 0x000055ea829aab47 in exec_simple_query
(query_string=0x55ea84831250 "select
gin_index_check('users_search_idx');") at postgres.c:1283
#22 0x000055ea829ac777 in PostgresMain (dbname=<optimized out>,
username=<optimized out>) at postgres.c:4798
#23 0x000055ea829a6a33 in BackendMain (startup_data=<optimized out>,
startup_data_len=<optimized out>) at backend_startup.c:107
#24 0x000055ea8290122f in postmaster_child_launch
(child_type=<optimized out>, child_slot=1,
startup_data=startup_data(at)entry=0x7ffeba7fe48c "",
startup_data_len=startup_data_len(at)entry=4,
client_sock=client_sock(at)entry=0x7ffeba7fe490) at launch_backend.c:274
#25 0x000055ea82904c3f in BackendStartup (client_sock=0x7ffeba7fe490)
at postmaster.c:3377
#26 ServerLoop () at postmaster.c:1663
#27 0x000055ea8290656b in PostmasterMain (argc=argc(at)entry=3,
argv=argv(at)entry=0x55ea8482ab10) at postmaster.c:1361
#28 0x000055ea825ecc0a in main (argc=3, argv=0x55ea8482ab10) at main.c:196
(gdb)
```

We also need to change the default version of the extension to 1.5.
I'm not sure which patch of this series should do that.

====
Overall I think 0001 & 0002 are ready as-is. 0003 is maybe ok. Other
patches need more review rounds.

--
Best regards,
Kirill Reshke

Attachment Content-Type Size
v29-0004-Add-gin_index_check-to-verify-GIN-index.patch application/octet-stream 32.6 KB
v29-0001-A-tiny-nitpicky-tweak-to-beautify-the-Amcheck-in.patch application/octet-stream 948 bytes
v29-0005-Add-GiST-support-to-pg_amcheck.patch application/octet-stream 35.5 KB
v29-0003-Add-gist_index_check-function-to-verify-GiST-ind.patch application/octet-stream 34.1 KB
v29-0002-Refactor-amcheck-internals-to-isolate-common-loc.patch application/octet-stream 21.4 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Masahiko Sawada 2024-11-26 07:10:50 Re: Make COPY format extendable: Extract COPY TO format implementations
Previous Message Amit Kapila 2024-11-26 06:27:11 Re: Improve the error message for logical replication of regular column to generated column.