Re: Adding skip scan (including MDAM style range skip scan) to nbtree

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Masahiro Ikeda <ikedamsh(at)oss(dot)nttdata(dot)com>
Cc: Tomas Vondra <tomas(at)vondra(dot)me>, Masahiro(dot)Ikeda(at)nttdata(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org, Masao(dot)Fujii(at)nttdata(dot)com
Subject: Re: Adding skip scan (including MDAM style range skip scan) to nbtree
Date: 2024-11-25 22:32:34
Message-ID: CAH2-Wzmd4TT-F6RSEjyXGNMoUEYUMdhit50onCEw02ud=Tb_Gw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Nov 25, 2024 at 4:07 AM Masahiro Ikeda <ikedamsh(at)oss(dot)nttdata(dot)com> wrote:
> The approach looks good to me. In fact, the significant regressions I
> reported have disappeared in my environment. And the test_for_v17.sql
> shows that the performance of the master and the master with your
> patches
> is almost same.

That's great.

I'm also a bit worried about regressing queries that don't even
involve skip arrays -- particularly very simple queries. I'm thinking
of things like "pgbench -S" style SELECT queries. Adding almost any
new code to a hot code path such as _bt_check_compare comes with a
risk of such regressions. (Up until now it has been convenient to
pretend that "skipscan_prefix_cols=0" is 100% representative of
master, but obviously it is less than 100% representative -- since
"skipscan_prefix_cols=0" still has enlarged object code size in places
like _bt_check_compare.)

Attached is v18, which makes _bt_check_compare copy all relevant scan
key fields into local variables. This gives the compiler more freedom
due to not having to worry about aliasing.

I believe that this change to _bt_check_compare fixes some regressions
with simple queries. This includes a "pgbench -S" variant that I
previously used during the Postgres 17 SAOP array work -- a script
that showed a favorable case for that 17 work, involving "SELECT aid
FROM pgbench_accounts WHERE aid IN (....)", with hundreds of
contiguous array elements. More importantly, it avoids regressions
with standard "pgbench -S" SELECT queries.

It takes a long time to validate changes such as these -- the
regressions are within the range of noise, so it is difficult to tell
the difference between signal and noise. I probably went further with
some of the new changes to _bt_check_compare than really made sense.
For example, I probably added likely() or unlikely() annotations that
don't really help. This is something that I will continue to refine.

> One thing I am concerned about is that it reduces the cases where the
> optimization of _bt_checkkeys_look_ahead() works effectively, as the
> approach
> skips the skip immediately on the first occurrence per page.

I noticed that with the recent v17 revision of the patch, my original
MDAM paper "sales_mdam_paper" test case (the complicated query in the
introductory email of this thread) was about 2x slower. That's just
not okay, obviously. But the issue was relatively easy to fix: it was
fixed by making _bt_readpage not apply the "skipskip" optimization
when on the first page for the current primitive index scan -- we
already do this with the "precheck" optimization, so it's natural to
do it with the "skipskip" optimization as well.

The "sales_mdam_paper" test case involves thousands of primitive index
scans that each access only one leaf page. But each leaf page returns
2 non-adjoining tuples, with quite a few non-matching tuples "in
between" the matching tuples. There is one matching tuple for "store =
200", and another for "store = 250" -- and there's non-matching stores
201 - 249 between these two, which we want _bt_checkkeys_look_ahead to
skip over. This is exactly the kind of case where the
_bt_checkkeys_look_ahead() optimization is expected to help.

> -- performance
> SET skipscan_prefix_cols=32;
> EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT, SETTINGS, WAL, VERBOSE) SELECT *
> FROM t WHERE id2 = 360; -- cache
> EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT, SETTINGS, WAL, VERBOSE) SELECT *
> FROM t WHERE id2 = 360;
> SET skipscan_prefix_cols=0;
> EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT, SETTINGS, WAL, VERBOSE) SELECT *
> FROM t WHERE id2 = 360;
>
> Again, the above results are provided for reference, as I believe that
> many users prioritize stability and I'd like to take your new approach.

Adversarial cases specifically designed to "make the patch look bad"
are definitely useful review feedback. Ideally, the patch will be 100%
free of regressions -- no matter how unlikely (or even silly) they may
seem. I always prefer to not have to rely on anybody's opinion of what
is likely or unlikely. :-)

A quick test seems to show that this particular regression is more or
less fixed by v18. As you said, the _bt_checkkeys_look_ahead stuff is
the issue here (same with the MDAM sales query). You should confirm
that the issue has actually been fixed, though.

Thanks
--
Peter Geoghegan

Attachment Content-Type Size
v18-0001-Show-index-search-count-in-EXPLAIN-ANALYZE.patch application/octet-stream 52.6 KB
v18-0003-Add-skipskip-nbtree-skip-scan-optimization.patch application/octet-stream 24.3 KB
v18-0002-Add-skip-scan-to-nbtree.patch application/octet-stream 174.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Shayon Mukherjee 2024-11-25 22:34:38 Re: Proposal to Enable/Disable Index using ALTER INDEX (with patch)
Previous Message Ilia Evdokimov 2024-11-25 22:15:25 Re: Sample rate added to pg_stat_statements