Re: Why doesn't GiST VACUUM require a super-exclusive lock, like nbtree VACUUM?

From: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org, hlinnaka(at)iki(dot)fi, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, Tomas Vondra <tomas(at)vondra(dot)me>
Subject: Re: Why doesn't GiST VACUUM require a super-exclusive lock, like nbtree VACUUM?
Date: 2025-01-04 01:00:34
Message-ID: CAEze2Wg1kbpo_Q1=9X68JRsgfkyPCk4T0QN+qKz10+FVzCAoGA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, 3 Dec 2024 at 17:21, Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
>
> On Mon, Dec 2, 2024 at 8:18 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> > Attached is a refined version of a test case I posted earlier on [2],
> > decisively proving that GiST index-only scans are in fact subtly
> > broken. Right now it fails, showing a wrong answer to a query. The
> > patch adds an isolationtest test case to btree_gist, based on a test
> > case of Andres'.
>
> I can confirm that the same bug affects SP-GiST. I modified the
> original failing GiST isolation test to make it use SP-GiST instead,
> proving what I already strongly suspected.
>
> I have no reason to believe that there are any similar problems in
> core index AMs other than GiST and SP-GiST, though. Let's go through
> them all now: nbtree already does everything correctly, and all
> remaining core index AMs don't support index-only scans *and* don't
> support scans that don't just use an MVCC snapshot.

I think I have a fix for GiST which can be found attached in patch 0002.

As to how this works: the patch tracks (for IOS) the pages for which
there are some entries yet to be returned by gistgettuple(), and keeps
a pin on those pages using a new AMscan tracking mechanism that
utilizes buffer refcounts. Even if it might not a very elegant
solution and IMV still has rough edges, it works, and fixes the issue
with incorrect results from the GiST index.

One side effect of this change to keep pins in GiST-IOS, is that that
this could realistically keep pins on a huge portion of the index,
thus exhausting shared buffers and increasing prevalence of "no
unpinned buffers"-related errors.

I haven't looked much at SP-GiST yet, so I don't have anything for the
VACUUM+IOS bug there.

0001 is a slight modification of your v2-0001, a version which now
(critically) doesn't expect VACUUM to run to completion before
s1_fetch_all starts; this is important for 0002 as that causes vacuum
to block and wait for the cursor to return more tuples, which the
isolation tester doesn't (can't?) detect. With only 0001, the new test
fails with incorrect results, with 0002 applied the test succeeds.

I'm looking forward to any feedback.

Kind regards,

Matthias van de Meent

Attachment Content-Type Size
v3-0001-isolationtester-showing-broken-index-only-scans-w.patch application/octet-stream 7.1 KB
v3-0002-RFC-Extend-buffer-pin-lifetime-for-GIST-IOS.patch application/octet-stream 10.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2025-01-04 01:01:12 Re: Fwd: Re: A new look at old NFS readdir() problems?
Previous Message Masahiko Sawada 2025-01-04 00:32:34 Re: POC: enable logical decoding when wal_level = 'replica' without a server restart