BUG #18442: Unnecessary Sort operator in indexScan Plan

From: PG Bug reporting form <noreply(at)postgresql(dot)org>
To: pgsql-bugs(at)lists(dot)postgresql(dot)org
Cc: hu_yajun(at)qq(dot)com
Subject: BUG #18442: Unnecessary Sort operator in indexScan Plan
Date: 2024-04-19 03:10:24
Message-ID: 18442-aca4c9134416990e@postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

The following bug has been logged on the website:

Bug reference: 18442
Logged by: yajun Hu
Email address: hu_yajun(at)qq(dot)com
PostgreSQL version: 14.11
Operating system: CentOS7 with kernel version 5.10
Description:

I have reproduced this problem in REL_14_11 and the latest master branch
(84db9a0eb10dd1dbee6db509c0e427fa237177dc).
The steps to reproduce are as follows.
1. ./configure --enable-debug --enable-depend --enable-cassert CFLAGS=-O0
2. make -j; make install -j; initdb -D ./primary; pg_ctl -D ../primary -l
logfile start
3. run SQL:
```
create table t( a int, b int);
insert into t select null,i from generate_series(1,100)i;
insert into t select i,i from generate_series(1,100000)i;
analyze t;
create index on t(a,b);
postgres=# explain select * from t where a is null order by b; -- need
sort
QUERY PLAN
--------------------------------------------------------------------------------
Sort (cost=9.54..9.80 rows=103 width=8)
Sort Key: b
-> Index Only Scan using t_a_b_idx on t (cost=0.29..6.10 rows=103
width=8)
Index Cond: (a IS NULL)
(4 rows)

postgres=# explain select * from t where a is null order by a, b; -- no need
sort
QUERY PLAN
--------------------------------------------------------------------------
Index Only Scan using t_a_b_idx on t (cost=0.29..6.10 rows=103 width=8)
Index Cond: (a IS NULL)
(2 rows)

postgres=# explain select * from t where a = 1 order by b; -- no need sort
QUERY PLAN
------------------------------------------------------------------------
Index Only Scan using t_a_b_idx on t (cost=0.29..4.31 rows=1 width=8)
Index Cond: (a = 1)
(2 rows)
```

In my understanding, in the first SELECT, because a is always NULL, the
scanned
data access by IndexOnlyScan is sorted according to b, which means that the
upper
Sort operator is unnecessary overhead.The second and third SELECT are both
as
expected.

I tried to analyze the code and found that the EquivalenceClass of column a
and NULL
was missing, which caused build_index_pathkeys to return NIL. No pathkeys
makes the
optimizer decide that the upper layer needed Sort to ensure that the data
was in order.
I roughly know that it may be because NullTest in the check_mergejoinable
function is
not OpExpr. Is it possible here to generate special EquivalenceClass for
column a and
NULL to solve this problem?

I’m looking forward to someone answering my confusion, thank you very much!

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Shlok Kyal 2024-04-19 11:51:08 Re: BUG #18433: Logical replication timeout
Previous Message Jerry Sievert 2024-04-18 17:37:27 Re: CVE-2024-28849