Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.

From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com>
Cc: Peter Geoghegan <pg(at)bowt(dot)ie>, Noah Misch <noah(at)leadboat(dot)com>, Aleksander Alekseev <aleksander(at)timescale(dot)com>, Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Maxim Orlov <orlovmg(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Greg Stark <stark(at)mit(dot)edu>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>, David Steele <david(at)pgmasters(dot)net>, Maxim Orlov <m(dot)orlov(at)postgrespro(dot)ru>, lubennikovaav(at)gmail(dot)com, Hamid Akhtar <hamid(dot)akhtar(at)percona(dot)com>
Subject: Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.
Date: 2024-04-24 09:57:24
Message-ID: CAPpHfdsVbB9ToriaB1UHuOKwjKxiZmTFQcEF=juzzC_nby31uA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Apr 17, 2024 at 6:41 PM Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com> wrote:
>> I did notice (I meant to point out) that I have concerns about this
>> part of the new uniqueness check code:
>> "
>> if (P_IGNORE(topaque) || !P_ISLEAF(topaque))
>> break;
>> "
>>
>> My concern here is with the !P_ISLEAF(topaque) test -- it shouldn't be
>> required
>
> I agree. But I didn't see the need to check uniqueness constraints violations in internal pages. Furthermore, it doesn't mean only a violation of constraint, but a major index corruption. I agree that checking and reporting this type of corruption separately is a possible thing.

I think we could just throw an error in case of an unexpected internal
page. It doesn't seem reasonable to continue the check with this type
of corruption detected. If the tree linkage is corrupted we may enter
an endless loop or something.

>> Separately, I dislike the way the target block changes within
>> bt_target_page_check(). The general idea behind verify_nbtree.c's
>> target block is that every block becomes the target exactly once, in a
>> clearly defined place. All corruption (in the index structure itself)
>> is formally considered to be a problem with that particular target
>> block. I want to be able to clearly distinguish between the target and
>> target's right sibling here, to explain my concerns, but they're kinda
>> both the target, so that's a lot harder than it should be. (Admittedly
>> directly blaming the target block has always been a little bit
>> arbitrary, at least in certain cases, but even there it provides
>> structure that makes things much easier to describe unambiguously.)
>
> The possible way to load the target block only once is to get rid of the cross-page uniqueness violation check. I introduced it to catch more possible cases of uniqueness violations. Though they are expected to be extremely rare, and anyway the algorithm doesn't get any warranty, just does its best to catch what is possible. I don't object to this change.

I think we could probably just avoid setting state->target during
cross-page check. Just save that into a local variable and pass as an
argument where needed.

Skipping the visibility checks for "only one tuple for scan key" case
looks like very valuable optimization [1].

I also think we should wrap lVis_* variables into struct. That would
make the way we pass them to functions more elegant.

Links.
1. https://www.postgresql.org/message-id/20240325020323.fd.nmisch%40google.com

------
Regards,
Alexander Korotkov

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Alexander Korotkov 2024-04-24 09:58:11 Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.
Previous Message Peter Eisentraut 2024-04-24 09:51:20 Re: Q: Escapes in jsonpath Idents