Re: BitmapHeapScan streaming read user and prelim refactoring

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Melanie Plageman <melanieplageman(at)gmail(dot)com>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>
Subject: Re: BitmapHeapScan streaming read user and prelim refactoring
Date: 2024-04-06 14:40:01
Message-ID: 6367e86a-7269-4eae-9a35-5a3864854911@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 4/6/24 02:51, Tomas Vondra wrote:
>
> * The one question I'm somewhat unsure about is why Tom chose to use the
> "wrong" recheck flag in the 2017 commit, when the correct recheck flag
> is readily available. Surely that had a reason, right? But I can't think
> of one ...
>

I've been wondering about this a bit more, so I decided to experiment
and try to construct a case for which the current code prefetches the
wrong blocks, and the patch fixes that. But I haven't been very
successful so far :-(

My understanding was that the current code should do the wrong thing if
I alternate all-visible and not-all-visible pages. This understanding is
not correct, as I learned, because the thing that needs to change is the
recheck flag, not visibility :-( I'm still posting what I tried, perhaps
you will have an idea how to alter it to demonstrate the incorrect
behavior with current master.

The test was very simple:

create table t (a int, b int) with (fillfactor=10);
insert into t select mod((i/22),2), (i/22)
from generate_series(0,1000) s(i);
create index on t (a);

which creates a table with 46 pages, 22 rows per page, column "a"
alternates between 0/1 on pages, column "b" increments on each page (so
"b" identifies page).

and then

delete from t where mod(b,8) = 0;

which deletes tuples on pages 0, 8, 16, 24, 32, 40, so these pages will
need to be prefetched as not-all-visible by this query

explain analyze select count(1) from t where a = 0

when forced to do bitmap heap scan. The other even-numbered pages remain
all-visible. I added a bit of logging into BitmapPrefetch(), but even
with master I get this:

LOG: prefetching block 8 0 current block 6 0
LOG: prefetching block 16 0 current block 14 0
LOG: prefetching block 24 0 current block 22 0
LOG: prefetching block 32 0 current block 30 0
LOG: prefetching block 40 0 current block 38 0

So it prefetches the correct pages (the other value is the recheck flag
for that block from the iterator result).

Turns out (and I realize the comment about the assumption actually
states that, I just failed to understand it) the thing that would have
to differ for the blocks is the recheck flag.

But that can't actually happen because that's set by the AM/opclass and
the built-in ones do essentially this:

.../hash.c: scan->xs_recheck = true;
.../nbtree.c: scan->xs_recheck = false;

gist opclasses (e.g. btree_gist):

/* All cases served by this function are exact */
*recheck = false;

spgist opclasses (e.g. geo_spgist):

/* All tests are exact. */
out->recheck = false;

If there's an opclass that alters the recheck flag, it's well hidden and
I missed it.

Anyway, after this exercise and learning more about the recheck flag, I
think I agree the assumption is unnecessary. It's pretty harmless
because none of the built-in opclasses alters the recheck flag, but the
correct recheck flag is readily available. I'm still a bit puzzled why
the 2017 commit even relied on this assumption, though.

regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2024-04-06 14:57:51 Re: BitmapHeapScan streaming read user and prelim refactoring
Previous Message jian he 2024-04-06 14:34:11 Re: remaining sql/json patches