More reliable nbtree detection of unsatisfiable RowCompare quals involving a leading NULL key/element

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: More reliable nbtree detection of unsatisfiable RowCompare quals involving a leading NULL key/element
Date: 2024-12-23 18:02:59
Message-ID: CAH2-WzmySVXst2hFrOATC-zw1Byg1XC-jYUS314=mzuqsNwk+Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Currently, nbtree has inconsistent detection of unsatisfiable
RowCompare quals that involve a leading NULL key. I'll show a test
case that illustrates where nbtree doesn't behave as expected right
now, resulting in an unnecessary full index scan.

Test case setup
===============

pg(at)regression:5432=# create table rowcompare_test as select i as
first, i as second, i as third from generate_series(1,10000) i;
SELECT 10000
pg(at)regression:5432=# create index on rowcompare_test (first, second, third);
CREATE INDEX

Query where nbtree detects an unsatisfiable qual, as expected
=============================================================

The following query is recognized as having an unsatisfiable qual, as
evidenced by the fact that we see nothing about any buffer hits in
explain analyze output (though you might have to repeat the query to
account for syscache effects):

pg(at)regression:5432=# explain (analyze, timing off, costs off) select *
from rowcompare_test where first = 1 and (second, third) > (NULL,0);
┌─────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ QUERY PLAN

├─────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ Index Only Scan using rowcompare_test_first_second_third_idx on
rowcompare_test (actual rows=0 loops=1) │
│ Index Cond: ((first = 1) AND (ROW(second, third) >
ROW(NULL::integer, 0))) │
│ Heap Fetches: 0

│ Planning Time: 0.034 ms

│ Execution Time: 0.013 ms

└─────────────────────────────────────────────────────────────────────────────────────────────────────────┘
(5 rows)

Query where nbtree fails to detect an unsatisfiable qual
========================================================

However, this very similar query *won't* have its unsatisfiable qual
detected by nbtree, so we get a useless full index scan instead (we
see "Buffers: shared hit=40" now):

pg(at)regression:5432=# explain (analyze, timing off, costs off) select *
from rowcompare_test where (second, third) > (NULL,0);
┌─────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ QUERY PLAN

├─────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ Index Only Scan using rowcompare_test_first_second_third_idx on
rowcompare_test (actual rows=0 loops=1) │
│ Index Cond: (ROW(second, third) > ROW(NULL::integer, 0))

│ Heap Fetches: 0

│ Buffers: shared hit=40

│ Planning Time: 0.048 ms

│ Execution Time: 0.321 ms

└─────────────────────────────────────────────────────────────────────────────────────────────────────────┘
(6 rows)

Patch
=====

The underlying issue is that nbtree currently detects this case in
_bt_first, which is an ugly special case -- every other unsatisfiable
qual gets detected earlier, in _bt_preprocess_keys (this is an
important responsibility of nbtree preprocessing). Handling the issue
in _bt_first has an undesirable downside, that my test case
highlights: _bt_first can only detect an unsatisfiable NULL RowCompare
qual at the point where it builds an insertion scan key using
RowCompare member input scan keys. That works out in the first test
query, but doesn't work out in the second test query.

There is no good reason for this inconsistency -- it is intrinsically
impossible for a row comparison with a row whose first element is NULL
to ever return anything other than NULL, regardless of any other
factor (though note that this isn't true of later elements that happen
to be NULL, only the first element, since only the first element is
guaranteed to be compared [1]).

Attached patch fixes the problem by moving detection of RowCompare
unsatisfiable-due-to-NULL cases into _bt_fix_scankey_strategy.
_bt_fix_scankey_strategy is called by _bt_preprocess_keys for every
scan key. And so with the patch applied, both cases will have their
unsatisfiable quals detected, allowing nbtree to consistently avoid
these useless full index scans. (My main motivation is removing the
annoying, distracting special case from _bt_first, though.)

[1] https://www.postgresql.org/docs/devel/functions-comparisons.html#ROW-WISE-COMPARISON

--
Peter Geoghegan

Attachment Content-Type Size
v1-0001-nbtree-Detect-more-unsatisfiable-RowCompare-NULL-.patch application/octet-stream 2.2 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jan Wieck 2024-12-23 18:06:49 Re: transaction lost when delete clog file after normal shutdown
Previous Message Melanie Plageman 2024-12-23 17:58:09 Re: Eagerly scan all-visible pages to amortize aggressive vacuum