Quick Links

Redundant Unique plan node for table with a unique index

From:	Damir Belyalov <dam(dot)bel07(at)gmail(dot)com>
To:	PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, a(dot)lepikhov(at)postgrespro(dot)ru
Subject:	Redundant Unique plan node for table with a unique index
Date:	2023-09-13 13:22:00
Message-ID:	CALH1LgurirYhGweNz+rn=YBcw4+S4x=Nxg=LKwoPvt3U2+og5Q@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

Hello!

There is a table with a unique index on it and we have a query that
searching DISTINCT values on this table on columns of unique index. Example:

create table a (n int);
insert into a (n) select x from generate_series(1, 140000) as g(x);
create unique index on a (n);
explain select distinct n from a;
QUERY PLAN

------------------------------------------------------------------------------------
Unique (cost=0.42..6478.42 rows=140000 width=4)
-> Index Only Scan using a_n_idx on a (cost=0.42..6128.42 rows=140000
width=4)
(2 rows)

We can see that Unique node is redundant for this case. So I implemented a
simple patch that removes Unique node from the plan.
After patch:

explain select distinct n from a;
QUERY PLAN
---------------------------------------------------------
Seq Scan on a (cost=0.00..2020.00 rows=140000 width=4)
(1 row)

The patch is rather simple and doesn't consider queries with joins. The
criteria when Unique node is should be removed is a case when a set of Vars
in DISTINCT clause contains unique index columns from the same table.
Another example:
CREATE TABLE a (n int, m int);
CRETE UNIQUE INDEX ON a (n);
SELECT DISTINCT (n,m) FROM a;
The Unique node should be deleted because n is contained in (n,m).

The patch doesn't consider these cases:
1. DISTINCT ON [EXPR]
Because this case can need grouping.
2. Subqueries.
Because this case can need grouping:
CREATE TABLE a (n int);
CREA UNIQUE INDEX ON a (n);
SELECT DISTINCT g FROM (SELECT * FROM a) as g;
3. Joins, because it demands complication of code.
Example:
SELECT DISTINCT a.n1 JOIN b where a.n1 = b.n1;
where a.n1 and b.n1 should be unique indexes and join qual should be
on this index columns.
or
a have a unique index on n1 and b is "unique for a" on join qual.

I am wondering if there are opportunities for further development of this
patch, in particular for JOIN cases.
For several levels of JOINs we should understand which set columns is
unique for the every joinrel in query. In general terms I identified two
cases when joinrel "saves" unique index from table: when tables are joined
by unique index columns and when one table has unique index and it is
"unique_for" (has one common tuple) another table.

Regards,
Damir Belyalov
Postgres Professional

Attachment	Content-Type	Size
0001-Redundant-Unique-Node.patch	text/x-patch	2.8 KB

Responses

Re: Redundant Unique plan node for table with a unique index at 2023-09-13 13:28:18 from Daniel Gustafsson
Re: Redundant Unique plan node for table with a unique index at 2023-09-13 23:39:05 from David Rowley

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Daniel Gustafsson	2023-09-13 13:28:18	Re: Redundant Unique plan node for table with a unique index
Previous Message	Ashutosh Bapat	2023-09-13 13:18:24	Re: logical decoding and replication of sequences, take 2