pgsql: Normalize nbtree truncated high key array behavior.

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: pgsql-committers(at)lists(dot)postgresql(dot)org
Subject: pgsql: Normalize nbtree truncated high key array behavior.
Date: 2024-10-16 16:18:16
Message-ID: E1t16j2-0016fr-Ix@gemulon.postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-committers

Normalize nbtree truncated high key array behavior.

Commit 5bf748b8 taught nbtree ScalarArrayOp index scans to decide when
and how to start the next primitive index scan based on physical index
characteristics. This included rules for deciding whether to start a
new primitive index scan (or whether to move onto the right sibling leaf
page instead) that specifically consider truncated lower-order columns
(-inf columns) from leaf page high keys.

These omitted columns were treated as satisfying the scan's required
scan keys, though only for scan keys marked required in the current scan
direction (forward). Scan keys that didn't get this behavior (those
marked required in the backwards direction only) usually didn't give the
scan reasonable cause to reposition itself to a later leaf page (via
another descent of the index in _bt_first), but _bt_advance_array_keys
would nevertheless always give up by forcing another call to _bt_first.

_bt_advance_array_keys was unwilling to allow the scan to continue onto
the next leaf page, to reconsider whether we really should start another
primitive scan based on the details of the sibling page's tuples. This
didn't match its behavior with similar cases involving keys required in
the current scan direction (forward), which seems unprincipled. It led
to an excessive number of primitive scans/index descents for queries
with a higher-order = array scan key (with dense, contiguous values)
mixed with a lower-order required > or >= scan key.

Bring > and >= strategy scan keys in line with other required scan key
types: treat truncated -inf scan keys as having satisfied scan keys
required in either scan direction (forwards and backwards alike) during
array advancement. That way affected scans can continue to the right
sibling leaf page. Advancement must now schedule an explicit recheck of
the right sibling page's high key in cases involving > or >= scan keys.
The recheck gives the scan a way to back out and start another primitive
index scan (we can't just rely on _bt_checkkeys with > or >= scan keys).

This work can be considered a stand alone optimization on top of the
work from commit 5bf748b8. But it was written in preparation for an
upcoming patch that will add skip scan to nbtree. In practice scans
that use "skip arrays" will tend to be much more sensitive to any
implementation deficiencies in this area.

Author: Peter Geoghegan <pg(at)bowt(dot)ie>
Reviewed-By: Tomas Vondra <tomas(at)vondra(dot)me>
Discussion: https://postgr.es/m/CAH2-Wz=9A_UtM7HzUThSkQ+BcrQsQZuNhWOvQWK06PRkEp=SKQ@mail.gmail.com

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/79fa7b3b1a449680d9e51624834fbb4b32208659

Modified Files
--------------
src/backend/access/nbtree/nbtree.c | 4 ++
src/backend/access/nbtree/nbtsearch.c | 25 +++++++
src/backend/access/nbtree/nbtutils.c | 119 ++++++++++++++++++----------------
src/include/access/nbtree.h | 3 +
4 files changed, 95 insertions(+), 56 deletions(-)

Browse pgsql-committers by date

  From Date Subject
Next Message Tom Lane 2024-10-16 16:25:09 pgsql: ecpg: fix some minor mishandling of bad input in preprocessor.
Previous Message Amit Langote 2024-10-16 11:41:34 Re: pgsql: Fix typo in comment of transformJsonAggConstructor()