From: | Peter Geoghegan <pg(at)bowt(dot)ie> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de> |
Cc: | Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Konstantin Knizhnik <knizhnik(at)garret(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Alexander Kuzmenkov <akuzmenkov(at)timescale(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Subject: | Re: Incorrect result of bitmap heap scan. |
Date: | 2024-12-02 17:02:12 |
Message-ID: | CAH2-WznGntS+Q_5Eye4bupdsqO0SqU5as=1-Ku9x+vKPg9pvJw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Mon, Dec 2, 2024 at 11:32 AM Andres Freund <andres(at)anarazel(dot)de> wrote:
> On 2024-12-02 16:08:02 +0100, Matthias van de Meent wrote:
> > Concurrency timeline:
> >
> > Session 1. amgetbitmap() gets snapshot of index contents, containing
> > references to dead tuples on heap P1.
>
> IIUC, an unstanted important aspect here is that these dead tuples are *not*
> visible to S1. Otherwise the VACUUM in the next step could not remove the dead
> tuples.
I would state the same thing slightly differently, FWIW: I would say
that the assumption that has been violated is that a TID is a stable
identifier for a given index tuple/logical row/row version (stable for
the duration of the scan).
This emphasis/definition seems slightly more useful, because it makes
it clear why this is hard to hit: you have to be fairly unlucky for a
dead-to-everyone TID to be returned by your scan (no LP_DEAD bit can
be set) and set in the scan's bitmap, only to later be concurrently
set LP_UNUSED in the heap -- all without VACUUM randomly being
prevented from setting the same page all-visible due to some unrelated
not-all-visible heap item making it unsafe.
> Ugh :/
>
> Pretty depressing that we've only found this now, ~6 years after it's been
> released. We're lacking some tooling to find this stuff in a more automated
> fashion.
FWIW I have suspicions about similar problems with index-only scans
that run in hot standby, and about all GiST index-only scans:
https://postgr.es/m/CAH2-Wz=PqOziyRSrnN5jAtfXWXY7-BJcHz9S355LH8Dt=5qxWQ@mail.gmail.com
In general there seems to be a lack of rigor in this area. I'm hoping
that Tomas Vondra's work can tighten things up here in passing.
> It seems pretty much impossible to fix that with the current interaction
> between nodeBitmap* and indexam. I wonder if we could, at least in some cases,
> and likely with some heuristics / limits, address this by performing some
> visibility checks during the bitmap build. I'm bringing it up here because I
> suspect that some of the necessary changes would overlap with what's needed to
> repair bitmap index-only scans.
This seems like it plays into some of the stuff I've discussed with
Tomas, about caching visibility information in local state, as a means
to avoiding holding onto pins in the index AM. For things like
mark/restore support.
> > This is quite non-trivial, however, as we don't have much in place regarding
> > per-relation shared structures which we could put such a value into, nor a
> > good place to signal changes of the value to bitmap heap-scanning backends.
>
> It doesn't have to be driven of table state, it could be state in
> indexes. Most (all?) already have a metapage.
FWIW GiST doesn't have a metapage (a historical oversight).
> Unfortunately I don't see a better path either.
>
> I think it'd be good if we added a test that shows the failure mechanism so
> that we don't re-introduce this in the future. Evidently this failure isn't
> immediately obvious...
Maybe a good use for injection points?
--
Peter Geoghegan
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2024-12-02 17:02:39 | Re: Incorrect result of bitmap heap scan. |
Previous Message | Ilia Evdokimov | 2024-12-02 17:00:23 | Re: Showing applied extended statistics in explain Part 2 |