Re: Re: bt Scankey in another contradictory case

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: "bigbro_wq(at)hotmail(dot)com" <bigbro_wq(at)hotmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, TomLane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Re: bt Scankey in another contradictory case
Date: 2024-10-07 15:44:26
Message-ID: CAH2-Wzno9NJSJdROx5immswvEiCSb+GH7oOSMtrn4r3DQ+HMRg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Sep 1, 2024 at 11:44 AM bigbro_wq(at)hotmail(dot)com
<bigbro_wq(at)hotmail(dot)com> wrote:
> I have reanalysed the code of function _bt_first. I notice that using a multi-attribute index
> if we can't identify the starting boundaries and the following attributes markded not required ,
> that means we need start at first or last page in the index to examine every tuple to satisfy the
> qual or not, in the meantime the scan will be stopped while the first attribute evaluated failed.

I'm not sure of what it is that you're trying to draw attention to.

The behavior of _bt_first doesn't seem relevant to this patch to me.
For one thing, _bt_first doesn't actually care about whether or not a
scan key is marked required -- _bt_first independently applies its own
similar rules to determine which scan keys can be used in the
insertion scan key used to find an initial position on the leaf level.
More importantly, whether a scan key is marked required doesn't seem
relevant to this patch (that is relevant to _bt_preprocess_keys, but
doesn't seem relevant to the changes that you propose to make to
_bt_preprocess_keys).

> For instance:
> create table c_s( x int, y int);
> insert into c_s select generate_series(1, 20000), generate_series(1, 20000);
> create index my_c_s_idx on c_s using btree(x,y);
> explain (analyze, buffers) select * from c_s where x >4000 and y >10 and y <10 order by x desc;

Your patch (which I haven't tested for myself) is based on the
observation that "y > 10 and y < 10" is a contradictory condition.
This is true regardless of any other factor, such as whether we're
able to mark the "y" scan key required.

All that _bt_first has to do is return when it notices "so->qual_ok"
was set to false within _bt_preprocess_keys. It doesn't matter if
there is an "x > 4000", and it doesn't matter if you use a composite
index like this that completely omits a condition on "x".

> What's more, if i change the number 4000 to 1000.
> -----------------------------------------------------------------------------------------------------
> Sort (cost=441.01..441.01 rows=1 width=8) (actual time=2.974..2.975 rows=0 loops=1)
> Sort Key: x DESC
> Sort Method: quicksort Memory: 25kB
> Buffers: shared hit=89
> -> Seq Scan on c_s (cost=0.00..441.00 rows=1 width=8) (actual time=2.971..2.971 rows=0 loops=1)
> Filter: ((x > 1000) AND (y > 10) AND (y < 10))
> Rows Removed by Filter: 20000
> Buffers: shared hit=89
> Planning:
> Buffers: shared hit=2
> Planning Time: 0.113 ms
> Execution Time: 2.990 ms
> (12 rows)
>
> The planner choosed the Seq Scan, and the executor have done the unnecessary jobs 20000 times.

It's true that a sequential scan won't ever apply the logic from
_bt_preprocess_keys, nor any similar logic. A sequential scan with
contradictory quals will therefore not be detected in cases where it
would have been detected had we used an index scan with the same
quals. This inconsistency doesn't make much sense; it's just an
implementation deficiency. It's historical.

We've talked about moving the current _bt_preprocess_keys logic to
plan time before. See Tom's remarks at the end of this email:

https://www.postgresql.org/message-id/2587523.1647982549@sss.pgh.pa.us

I think that it would be possible to move _bt_preprocess_keys to plan
time, and to generalize it to work with sequential scans, but that is
a much larger project. It seems out of scope. I think that you should
just focus on the immediate problem of not detecting contradictory
quals that happen to involve (> or >=) and (< or <=) operators. It's a
completely independent problem, and one that is well scoped.

--
Peter Geoghegan

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2024-10-07 15:52:27 Re: Proposal to Enable/Disable Index using ALTER INDEX
Previous Message Robert Haas 2024-10-07 15:36:07 Re: On disable_cost