From: | Noah Misch <noah(at)leadboat(dot)com> |
---|---|
To: | Peter Geoghegan <pg(at)bowt(dot)ie> |
Cc: | Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Aleksander Alekseev <aleksander(at)timescale(dot)com>, Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Maxim Orlov <orlovmg(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Greg Stark <stark(at)mit(dot)edu>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>, David Steele <david(at)pgmasters(dot)net>, Maxim Orlov <m(dot)orlov(at)postgrespro(dot)ru>, lubennikovaav(at)gmail(dot)com, Hamid Akhtar <hamid(dot)akhtar(at)percona(dot)com> |
Subject: | Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index. |
Date: | 2024-03-25 18:24:43 |
Message-ID: | 20240325182443.51@rfd.leadboat.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Mon, Mar 25, 2024 at 12:03:10PM -0400, Peter Geoghegan wrote:
> On Sun, Mar 24, 2024 at 10:03 PM Noah Misch <noah(at)leadboat(dot)com> wrote:
> Separately, I now see that the committed patch just reuses the code
> that has long been used to check that things are in the correct order
> across page boundaries: this is the bt_right_page_check_scankey check,
> which existed in the very earliest versions of amcheck. So while I
> agree that we could just keep the original scan key (from the last
> item on every leaf page), and then make the check at the start of the
> next page instead (as opposed to making it at the end of the previous
> leaf page, which is how it works now), it's not obvious that that
> would be a good trade-off, all things considered.
>
> It might still be a little better that way around, overall, but you're
> not just talking about changing the recently committed checkunique
> patch (I think). You're also talking about restructuring the long
> established bt_right_page_check_scankey check (otherwise, what's the
> point?). I'm not categorically opposed to that, but it's not as if
I wasn't thinking about changing the pre-v17 bt_right_page_check_scankey()
code. I got interested in this area when I saw the interaction of the new
"first key on the next page" logic with bt_right_page_check_scankey(). The
patch made bt_right_page_check_scankey() pass back rightfirstoffset. The new
code then does palloc_btree_page() and PageGetItem() with that offset, which
bt_right_page_check_scankey() had already done. That smelled like a misplaced
distribution of responsibility. For a time, I suspected the new code should
move down into bt_right_page_check_scankey(). Then I transitioned to thinking
checkunique didn't need new code for the page boundary.
> it'll allow you to throw out a bunch of code -- AFAICT that proposal
> doesn't have that clear advantage going for it. The race condition
> that is described at great length in bt_right_page_check_scankey isn't
> ever going to be a problem for the recently committed checkunique
> patch (as you more or less pointed out yourself), but obviously it is
> still a concern for the cross-page order check.
>
> In summary, the old bt_right_page_check_scankey check is strictly
> concerned with the consistency of a physical data structure (the index
> itself), whereas the new checkunique check makes sure that the logical
> content of the database is consistent (the index, the heap, and all
> associated transaction status metadata have to be consistent). That
> means that the concerns that are described at length in
> bt_right_page_check_scankey (nor anything like those concerns) don't
> apply to the new checkunique check. We agree on all that, I think. But
> it's less clear that that presents us with an opportunity to simplify
> this patch.
See above for why I anticipated a simplification opportunity with respect to
new-in-v17 code. Still, it may not pan out.
> > Adding checkunique raised runtime from 58s to 276s, because it checks
Side note: my last email incorrectly described that as "raises runtime by
476%". It should have said "by 376%" or "by a factor of 4.76".
> > visibility for every heap tuple. It could do the heap fetch and visibility
> > check lazily, when the index yields two heap TIDs for one scan key. That
> > should give zero visibility checks for this particular test case, and it
> > doesn't add visibility checks to bloated-table cases.
> It seems like the implication of everything that you said about
> refactoring/moving the check was that doing so would enable this
> optimization (at least an implementation along the lines of your
> pseudo code). If that was what you intended, then it's not obvious to
> me why it is relevant. What, if anything, does it have to do with
> making the new checkunique visibility checks happen lazily?
Their connection is just being the two big-picture topics I found in
post-commit review. Decisions about the cross-page check are indeed separable
from decisions about lazy vs. eager visibility checks.
Thanks,
nm
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2024-03-25 18:26:55 | Re: Possibility to disable `ALTER SYSTEM` |
Previous Message | Robert Haas | 2024-03-25 18:13:21 | Re: Possibility to disable `ALTER SYSTEM` |