From: | Kim Rose Carlsen <krc(at)hiper(dot)dk> |
---|---|
To: | "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org> |
Subject: | Postgres wont remove useless joins, when the UNIQUE index is partial |
Date: | 2019-01-10 12:30:49 |
Message-ID: | AM6PR05MB549130A7CAB9D09F986A99A0C7840@AM6PR05MB5491.eurprd05.prod.outlook.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Hi
remove_useless_join does not prove uniqueness if the unique index is partial, and therefore wont remove the join if no columns are referenced (see example in bottom).
I have been trying to look around the source code and from what I have identified the problem seems to be that "check_index_predicates(..)" happens after "remove_useless_join(..)", and therefore cannot see that the unique index is actually covered by the join condition.
From analyzejoins.c:612, rel_supports_distinctness(..)
if (ind->unique && ind->immediate &&
(ind->indpred == NIL || ind->predOK))
return true;
But the problem is ind->predOK is calculated in check_index_predicates(..) but this happens later so ind->predOK is always false when checked here.
I have tried to add check_index_predicates(..) to rel_supports_distinctness(..) and this produces the expected plan, but I have no idea of the implication of doing check_index_predicates(..) earlier.
This is my first time looking at the postgres source code, so I know attached "patch" is not the solution, but any pointers on where to go from here would be appreciated.
Example:
CREATE TABLE a (
id INTEGER PRIMARY KEY,
sub_id INTEGER NOT NULL,
deleted_at TIMESTAMP
);
CREATE UNIQUE INDEX ON a (sub_id) WHERE (deleted_at IS NULL);
ANALYZE a;
EXPLAIN SELECT 1 FROM a AS a LEFT JOIN a AS b ON a.id = b.sub_id AND b.deleted_at IS NULL;
Expected plan:
QUERY PLAN
-----------------------------------------------------
Seq Scan on a (cost=0.00..28.50 rows=1850 width=4)
Actual plan:
QUERY PLAN
---------------------------------------------------------------------------------------
Hash Left Join (cost=14.76..48.13 rows=1850 width=4)
Hash Cond: (a.id = b.sub_id)
-> Seq Scan on a (cost=0.00..28.50 rows=1850 width=4)
-> Hash (cost=14.65..14.65 rows=9 width=4)
-> Bitmap Heap Scan on a b (cost=4.13..14.65 rows=9 width=4)
Recheck Cond: (deleted_at IS NULL)
-> Bitmap Index Scan on a_sub_id_idx (cost=0.00..4.13 rows=9 width=0)
(7 rows)
mvh
Kim Carlsen
Hiper A/S
M: 71 99 42 00
Attachment | Content-Type | Size |
---|---|---|
diff.patch | text/x-patch | 627 bytes |
From | Date | Subject | |
---|---|---|---|
Next Message | Tony Shelver | 2019-01-10 12:41:37 | Re: Pulling data from Postgres DB table for every 5 seconds. |
Previous Message | Arun Menon | 2019-01-10 12:19:23 | Lost synchronization with server: got message type"0" , length 879046704 |