Re: Amcheck verification of GiST and GIN

From: Kirill Reshke <reshkekirill(at)gmail(dot)com>
To: "Andrey M(dot) Borodin" <x4mmm(at)yandex-team(dot)ru>
Cc: Alexander Lakhin <exclusion(at)gmail(dot)com>, Andrey Borodin <amborodin86(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>, Jose Arthur Benetasso Villanova <jose(dot)arthur(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Nikolay Samokhvalov <samokhvalov(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Tomas Vondra <tomas(at)vondra(dot)me>
Subject: Re: Amcheck verification of GiST and GIN
Date: 2024-12-03 15:04:08
Message-ID: CALdSSPgu9uAhVYojQ0yjG=q5MaqmiSLUJPhz+-u7cA6K6Mc9UA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, 2 Dec 2024 at 12:57, Kirill Reshke <reshkekirill(at)gmail(dot)com> wrote:
> This change was not correct at all.
> PFA v32.
>
> Repro:
> ```
> db1=# create table ttt(t text);
> CREATE TABLE
> db1=# create index on ttt using gin(t gin_trgm_ops);
> CREATE INDEX
> db1=# insert into ttt select md5(random()::text) from generate_series(1,50000);
> INSERT 0 50000
> db1=# set client_min_messages to debug5;
> DEBUG: CommitTransaction(1) name: unnamed; blockState: STARTED;
> state: INPROGRESS, xid/subid/cid: 0/1/0
> SET
> db1=# select gin_index_check('ttt_t_idx');
> DEBUG: StartTransaction(1) name: unnamed; blockState: DEFAULT; state:
> INPROGRESS, xid/subid/cid: 0/1/0
> DEBUG: processing entry tree page at blk 1, maxoff: 2
> DEBUG: processing entry tree page at blk 941, maxoff: 229
> ERROR: index "ttt_t_idx" has wrong tuple order on entry tree page,
> block 941, offset 229
> db1=#
> ```
> I have only observed failures on the last tuple of the entry page. All
> other known issues that were on v31 are now fixed (I hope).

PFA v33.

Main change from v32 is a meson-related fix, and also bug fix that I
reported in a previous message.

I did find this in README.
```
GIN packs keys and downlinks into tuples in a different way.

(P_0, K_1), (P_1, K_2), ... , (P_n, K_{n+1})

P_i is grouped with K_{i+1}. -Inf key is not needed.

There are couple of additional notes regarding K_{n+1} key.
1) In entry tree rightmost page, a key coupled with P_n doesn't really matter.
Highkey is assumed to be infinity.
2) In posting tree, a key coupled with P_n always doesn't matter. Highkey for
non-rightmost pages is stored separately and accessed via
GinDataPageGetRightBound().
```
I indeed only observe gin_index_check failures only on the In entry
tree rightmost (non-leaf & non-root) page, on the last tuple
(Highkey). So fix was not to check is:

```
- /* (apparently) first block is metadata, skip
order check */
- if (i != FirstOffsetNumber && stack->blkno !=
(BlockNumber) 1)
+ /*
+ * First block is metadata, skip order check.
Also, never check
+ * for high key on rightmost page, as this key
is not really
+ * stored explicitly.
+ */
+ if (i != FirstOffsetNumber && stack->blkno !=
GIN_ROOT_BLKNO &&
+ !(i == maxoff && rightlink ==
InvalidBlockNumber))
{
```

I also did a small wording correction for this part of README (this is v33-0006)

--
Best regards,
Kirill Reshke

Attachment Content-Type Size
v33-0003-Add-gist_index_check-function-to-verify-GiST-ind.patch application/octet-stream 34.1 KB
v33-0001-A-tiny-nitpicky-tweak-to-beautify-the-Amcheck-in.patch application/octet-stream 948 bytes
v33-0002-Refactor-amcheck-internals-to-isolate-common-loc.patch application/octet-stream 21.4 KB
v33-0004-Add-gin_index_check-to-verify-GIN-index.patch application/octet-stream 33.3 KB
v33-0005-Add-GiST-support-to-pg_amcheck.patch application/octet-stream 35.6 KB
v33-0006-Fix-wording-in-GIN-README.patch application/octet-stream 1.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2024-12-03 15:07:07 Re: shared-memory based stats collector - v70
Previous Message Alvaro Herrera 2024-12-03 15:01:41 Re: Drop back the redundant "Lock" suffix from LWLock wait event names