Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.

From: Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com>
To: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Cc: Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Noah Misch <noah(at)leadboat(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Aleksander Alekseev <aleksander(at)timescale(dot)com>, Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Maxim Orlov <orlovmg(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Greg Stark <stark(at)mit(dot)edu>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, David Steele <david(at)pgmasters(dot)net>, Peter Geoghegan <pg(at)bowt(dot)ie>, Maxim Orlov <m(dot)orlov(at)postgrespro(dot)ru>, lubennikovaav(at)gmail(dot)com
Subject: Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.
Date: 2024-05-17 11:09:04
Message-ID: CALT9ZEEixajWOQjPj+arYPUwGeTg2WOe-gup3yjPcW5HY6MqUA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi, Alexander!

On Fri, 17 May 2024 at 14:11, Alexander Korotkov <aekorotkov(at)gmail(dot)com>
wrote:

> On Mon, May 13, 2024 at 4:42 AM Alexander Korotkov <aekorotkov(at)gmail(dot)com>
> wrote:
> > On Mon, May 13, 2024 at 12:23 AM Alexander Korotkov
> > <aekorotkov(at)gmail(dot)com> wrote:
> > > On Sat, May 11, 2024 at 4:13 AM Mark Dilger
> > > <mark(dot)dilger(at)enterprisedb(dot)com> wrote:
> > > > > On May 10, 2024, at 12:05 PM, Alexander Korotkov <
> aekorotkov(at)gmail(dot)com> wrote:
> > > > > The only bt_target_page_check() caller is
> > > > > bt_check_level_from_leftmost(), which overrides state->target in
> the
> > > > > next iteration anyway. I think the patch is just refactoring to
> > > > > eliminate the confusion pointer by Peter Geoghegan upthread.
> > > >
> > > > I find your argument unconvincing.
> > > >
> > > > After bt_target_page_check() returns at line 919, and before
> bt_check_level_from_leftmost() overrides state->target in the next
> iteration, bt_check_level_from_leftmost() conditionally fetches an item
> from the page referenced by state->target. See line 963.
> > > >
> > > > I'm left with four possibilities:
> > > >
> > > >
> > > > 1) bt_target_page_check() never gets to the code that uses
> "rightpage" rather than "state->target" in the same iteration where
> bt_check_level_from_leftmost() conditionally fetches an item from
> state->target, so the change you're making doesn't matter.
> > > >
> > > > 2) The code prior to v2-0003 was wrong, having changed
> state->target in an inappropriate way, causing the wrong thing to happen at
> what is now line 963. The patch fixes the bug, because state->target no
> longer gets overwritten where you are now using "rightpage" for the value.
> > > >
> > > > 3) The code used to work, having set up state->target correctly in
> the place where you are now using "rightpage", but v2-0003 has broken that.
> > > >
> > > > 4) It's been broken all along and your patch just changes from
> wrong to wrong.
> > > >
> > > >
> > > > If you believe (1) is true, then I'm complaining that you are
> relying far to much on action at a distance, and that you are not
> documenting it. Even with documentation of this interrelationship, I'd be
> unhappy with how brittle the code is. I cannot easily discern that the two
> don't ever happen in the same iteration, and I'm not at all convinced one
> way or the other. I tried to set up some Asserts about that, but none of
> the test cases actually reach the new code, so adding Asserts doesn't help
> to investigate the question.
> > > >
> > > > If (2) is true, then I'm complaining that the commit message doesn't
> mention the fact that this is a bug fix. Bug fixes should be clearly
> documented as such, otherwise future work might assume the commit can be
> reverted with only stylistic consequences.
> > > >
> > > > If (3) is true, then I'm complaining that the patch is flat busted.
> > > >
> > > > If (4) is true, then maybe we should revert the entire feature, or
> have a discussion of mitigation efforts that are needed.
> > > >
> > > > Regardless of which of 1..4 you pick, I think it could all do with
> more regression test coverage.
> > > >
> > > >
> > > > For reference, I said something similar earlier today in another
> email to this thread:
> > > >
> > > > This patch introduces a change that stores a new page into variable
> "rightpage" rather than overwriting "state->target", which the old
> implementation most certainly did. That means that after returning from
> bt_target_page_check() into the calling function
> bt_check_level_from_leftmost() the value in state->target is not what it
> would have been prior to this patch. Now, that'd be irrelevant if nobody
> goes on to consult that value, but just 44 lines further down in
> bt_check_level_from_leftmost() state->target is clearly used. So the
> behavior at that point is changing between the old and new versions of the
> code, and I think I'm within reason to ask if it was wrong before the
> patch, wrong after the patch, or something else? Is this a bug being
> introduced, being fixed, or ... ?
> > >
> > > Thank you for your analysis. I'm inclined to believe in 2, but not
> > > yet completely sure. It's really pity that our tests don't cover
> > > this. I'm investigating this area.
> >
> > It seems that I got to the bottom of this. Changing
> > BtreeCheckState.target for a cross-page unique constraint check is
> > wrong, but that happens only for leaf pages. After that
> > BtreeCheckState.target is only used for setting the low key. The low
> > key is only used for non-leaf pages. So, that didn't lead to any
> > visible bug. I've revised the commit message to reflect this.
> >
> > So, the picture for the patches is the following now.
> > 0001 – optimization, but rather simple and giving huge effect
> > 0002 – refactoring
> > 0003 – fix for the bug
> > 0004 – better error reporting
>
> I think the thread contains enough motivation on why 0002, 0003 and
> 0004 are material for post-FF. They are fixes and refactoring for
> new-in-v17 feature. I'm going to push them if no objections.
>
> Regarding 0001, I'd like to ask Tom and Mark if they find convincing
> that given that optimization is small, simple and giving huge effect,
> it could be pushed post-FF? Otherwise, this could wait for v18.
>

In my view, patches 0002-0004 are worth pushing.
0001 is ready in my view. But I see no problem pushing it into v18
regarding that this optimization could be not eligible for post-FF. I don't
know the criteria for this just let's be safe about it.

Regards,
Pavel Borisov

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2024-05-17 11:11:05 Re: commitfest.postgresql.org is no longer fit for purpose
Previous Message Pavel Borisov 2024-05-17 11:02:40 Re: Add SPLIT PARTITION/MERGE PARTITIONS commands