Re: Incorrect result of bitmap heap scan.

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Konstantin Knizhnik <knizhnik(at)garret(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Incorrect result of bitmap heap scan.
Date: 2024-12-02 18:39:43
Message-ID: CAH2-WzmiOd9-tvqRV-8oyXNkvaiAK_=LYT-gX8GkinZqeXEneQ@mail.gmail.com
Lists: pgsql-hackers

On Mon, Dec 2, 2024 at 12:11 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > Freezing a page, and setting a page all-visible are orthogonal.
>
> Sorry, sloppy wording on my part.

Freezing doesn't affect the contents of the visibility map in any way
that seems relevant. The executor only cares about the all-visible bit
(and never the all-frozen bit), and the rules around when and how
VACUUM sets the all-visible bit (and how everybody else unsets the
all-visible bit) haven't changed in forever. So I just can't see it.

I guess it's natural to suspect more recent work -- commit 7c70996e is
about 6 years old. But the race condition that I suspect is at play
here is very narrow.

It's pretty unlikely that there'll be a dead-to-all TID returned to a
scan (not just dead to our MVCC snapshot, dead to everybody's) that is
subsequently concurrently removed from the index, and then set
LP_UNUSED in the heap. It's probably impossible unless the table is
small -- VACUUM just isn't going to be fast enough to get to the leaf
page after the bitmap index scan, and still get to the heap before the
corresponding bitmap heap scan (which uses the VM as an optimization)
can do the relevant visibility checks. While it could happen with a
large table and a slow bitmap scan, the chances of VACUUM being
precisely aligned with the bitmap scan, in just the wrong way, seem
remote in the extreme. Finally, none of this will happen if some other
factor hinders VACUUM from setting the relevant heap page all-visible.

AFAICT this is only a problem because of the involvement of the VM,
specifically -- an MVCC snapshot *is* generally sufficient to make
bitmap index scans safe from the dangers of concurrent TID recycling,
as explained in "62.4. Index Locking Considerations". That only ceases
to be true when the visibility map becomes involved (the VM lacks the
granular visibility information required to make all this safe). This
is essentially the same VM race issue that nbtree's
_bt_drop_lock_and_maybe_pin protects against during conventional
index-only scans.
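
To make the suspected interleaving concrete, here is a toy simulation
in Python. It is only a sketch under stated assumptions, not PostgreSQL
code: the Table and HeapPage classes, the function names, and the
single-page layout are all hypothetical, and VACUUM is reduced to three
steps (remove dead-to-all TIDs from the index, set them LP_UNUSED in
the heap, set the page all-visible). It contrasts a bitmap heap scan
that always checks the heap with one that trusts the all-visible bit.

```python
from dataclasses import dataclass

# Hypothetical line pointer states for this toy model.
LIVE, DEAD_TO_ALL, UNUSED = "live", "dead_to_all", "unused"


@dataclass
class HeapPage:
    items: dict                 # offset -> line pointer state
    all_visible: bool = False   # stands in for the page's VM bit


@dataclass
class Table:
    heap: dict                  # block number -> HeapPage
    index: set                  # set of (block, offset) TIDs


def bitmap_index_scan(table):
    """Collect matching TIDs; nothing keeps VACUUM away afterwards."""
    return sorted(table.index)


def vacuum(table):
    """Simplified VACUUM: index cleanup, then heap, then VM bit."""
    for blkno, page in table.heap.items():
        dead = [off for off, st in page.items.items() if st == DEAD_TO_ALL]
        for off in dead:
            table.index.discard((blkno, off))   # TID removed from index
        for off in dead:
            page.items[off] = UNUSED            # line pointer now reusable
        # Only live or unused items remain, so the page is all-visible.
        page.all_visible = all(
            st in (LIVE, UNUSED) for st in page.items.values()
        )


def bitmap_heap_scan(table, tids, use_vm):
    """Count visible rows for the TIDs gathered earlier."""
    count = 0
    for blkno, off in tids:
        page = table.heap[blkno]
        if use_vm and page.all_visible:
            count += 1          # trusts the VM bit, skips the heap fetch
        elif page.items[off] == LIVE:
            count += 1          # heap check rejects the stale TID
    return count


def run_race(use_vm):
    table = Table(
        heap={0: HeapPage(items={1: LIVE, 2: DEAD_TO_ALL})},
        index={(0, 1), (0, 2)},
    )
    tids = bitmap_index_scan(table)   # bitmap includes the dead TID
    vacuum(table)                     # VACUUM runs in the window
    return bitmap_heap_scan(table, tids, use_vm)
```

In this model only one row is ever live. run_race(use_vm=False)
returns 1 because the heap check sees LP_UNUSED for the stale TID,
while run_race(use_vm=True) returns 2 because the page-level bit
cannot say that one TID in the bitmap no longer points at anything.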

--
Peter Geoghegan
