Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>
Cc: Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Noah Misch <noah(at)leadboat(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Aleksander Alekseev <aleksander(at)timescale(dot)com>, Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Maxim Orlov <orlovmg(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Greg Stark <stark(at)mit(dot)edu>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, David Steele <david(at)pgmasters(dot)net>, Maxim Orlov <m(dot)orlov(at)postgrespro(dot)ru>, lubennikovaav(at)gmail(dot)com
Subject: Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.
Date: 2024-05-17 20:00:55
Message-ID: CAH2-Wz=-XLP3NUFnJwtBXF-ZWESoqpJ3z+n29V7LsCLgYgVHKA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, May 17, 2024 at 3:42 PM Mark Dilger
<mark(dot)dilger(at)enterprisedb(dot)com> wrote:
> On further review, the test was not anticipating the error message "high key invariant violated for index". That wasn't seen in calls to bt_index_parent_check(), but appears as one of the errors from bt_index_check(). I am rerunning the test now....

Many different parts of the B-Tree code will fight against allowing
duplicates of the same value to span multiple leaf pages -- this is
especially true for unique indexes. For example, nbtsplitloc.c has a
variety of strategies that will prevent choosing a split point that
necessitates including a distinguishing heap TID in the new high key.
In other words, nbtsplitloc.c is very aggressive about picking a split
point between (rather than within) groups of duplicates.

Of course it's still *possible* for a unique index to have multiple
leaf pages containing the same individual value. The regression tests
do have coverage for certain relevant code paths (e.g., there is
coverage for code paths only hit when _bt_check_unique has to go to
the page to the right). This is only the case because I went out of my
way to make sure of it, by adding tests that allow a huge number of
version duplicates to accumulate within a unique index. (The "move
right" _bt_check_unique branches had zero test coverage for a year or
two.)

Just how important it is that amcheck covers cases where the version
duplicates span multiple leaf pages is of course debatable -- it's
always better to be more thorough, when practical. But it's certainly
something that needs to be assessed based on the merits.

--
Peter Geoghegan

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2024-05-17 20:03:09 Re: broken tables on hot standby after migration on PostgreSQL 16 (3x times last month)
Previous Message Robert Haas 2024-05-17 19:59:21 Re: commitfest.postgresql.org is no longer fit for purpose