Re: Why doesn't GiST VACUUM require a super-exclusive lock, like nbtree VACUUM?

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Cc: hlinnaka(at)iki(dot)fi, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, Tomas Vondra <tomas(at)vondra(dot)me>
Subject: Re: Why doesn't GiST VACUUM require a super-exclusive lock, like nbtree VACUUM?
Date: 2024-12-03 01:18:14
Message-ID: CAH2-Wz=jjiNL9FCh8C1L-GUH15f4WFTWub2x+_NucngcDDcHKw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Nov 3, 2021 at 7:33 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> But what about index-only scans, which GiST also supports? I think
> that the rules are different there, even though index-only scans use
> an MVCC snapshot.

(Reviving this old thread after 3 years)

I was reminded of this old thread during today's discussion of a
tangentially related TID-recycle-safety bug that affects bitmap index
scans that use the visibility map as an optimization [1]. Turns out I
was right to be concerned.

This GiST bug is causally unrelated to that other bug, so I thought it
would be helpful to move discussion of the GiST bug to this old
thread.

Attached is a refined version of a test case I posted earlier on [2],
decisively proving that GiST index-only scans are in fact subtly
broken. Right now it fails, showing a wrong answer to a query. The
patch adds an isolationtest test case to btree_gist, based on a test
case of Andres'.

Offhand, I think that the only way to fix this is to bring GiST in
line with nbtree: we ought to teach GiST VACUUM to start acquiring
cleanup locks (previously known as super-exclusive locks), and then
teach GiST index-only scans to hold onto a leaf page buffer pin as the
visibility map (or the heap proper) is accessed for the TIDs to be
returned from the leaf page. Arguably, GiST isn't obeying the current
contract for amgettuple index AMs at all right now (whether or not it
violates the contract as written is open to interpretation, I suppose,
but either way the current behavior is wrong).

We probably shouldn't hold onto a buffer pin during plain GiST
index-only scans, though -- plain GiST index scans *aren't* broken,
and so we should change as little as possible there. More concretely,
we should probably copy more nbtree scan related code into GiST to
deal with all this: we could copy nbtree's _bt_drop_lock_and_maybe_pin
into GiST to fix this bug, while avoiding changing the performance
characteristics of GiST plain index scans. This will also entail
adding a new buffer field to GiST's GISTScanOpaqueData struct --
something similar to nbtree's BTScanOpaqueData.currPos.buf field
(it'll go next to the current GISTScanOpaqueData.curBlkno field, just
like the nbtree equivalent goes next to its own currPage blkno field).

Long term, code like nbtree's _bt_drop_lock_and_maybe_pin should be
generalized and removed from every individual index AM, nbtree
included -- I think that the concepts generalize to every index AM
that supports amgettuple (the race condition in question has
essentially nothing to do with individual index AM requirements). I've
discussed this kind of approach with Tomas Vondra (CC'd) recently.
That's not going to be possible within the scope of a backpatchable
fix, though.

[1] https://postgr.es/m/873c33c5-ef9e-41f6-80b2-2f5e11869f1c@garret.ru
[2] https://postgr.es/m/CAH2-Wzm6gBqc99iEKO6540ynwpjOqWESt5yjg-bHbt0hc8DPsw%40mail.gmail.com
--
Peter Geoghegan

Attachment Content-Type Size
v2-0001-isolationtester-showing-broken-index-only-scans-w.patch application/octet-stream 4.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2024-12-03 01:22:34 Re: Incorrect result of bitmap heap scan.
Previous Message Michael Paquier 2024-12-03 01:16:35 Re: code contributions for 2024, WIP version