Exists pull-up application with JoinExpr

From: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
To: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Exists pull-up application with JoinExpr
Date: 2024-12-24 04:44:38
Message-ID: 0b1f670d-b39d-4966-bf32-f0d502ebc564@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi, hackers!

I found one pull-up that works if the inner join condition is written
through the where condition,

|create temp table ta (id int primary key, val int); insert into ta
values(1,1); insert into ta values(2,2); ||insert into ta values(3,3);|

|create temp table tb (id int primary key, aval int); insert into tb
values(4,1); insert into tb values(5,1); insert into tb values(1,2);
create temp table tc (id int primary key, aid int); insert into tc
values(6,1); insert into tc values(7,2);|

|EXPLAIN (ANALYZE, COSTS OFF, SUMMARY OFF, TIMING OFF) SELECT * FROM ta
WHERE EXISTS (SELECT * FROM tb, tc WHERE ta.id = tb.id);|
QUERY PLAN
-------------------------------------------------------------------------
Nested Loop Semi Join (actual rows=1 loops=1)
Buffers: local hit=6
-> Seq Scan on ta (actual rows=3 loops=1)
Buffers: local hit=1
-> Nested Loop (actual rows=0 loops=3)
Buffers: local hit=5
-> Index Only Scan using tb_pkey on tb (actual rows=0 loops=3)
Index Cond: (id = ta.id)
Heap Fetches: 1
Buffers: local hit=4
-> Seq Scan on tc (actual rows=1 loops=1)
Buffers: local hit=1
Planning:
Buffers: shared hit=67 read=12
(14 rows)

but it doesn't work if it is written through the outside condition.

|alena(at)postgres=# EXPLAIN (ANALYZE, COSTS OFF, SUMMARY OFF, TIMING OFF)
SELECT * FROM ta WHERE EXISTS (SELECT * FROM tb JOIN tc ON ta.id =
tb.id); QUERY PLAN
------------------------------------------------------ Seq Scan on ta
(actual rows=1 loops=1) Filter: EXISTS(SubPlan 1) Rows Removed by
Filter: 2 Buffers: local hit=5 SubPlan 1 -> Nested Loop (actual rows=0
loops=3) Buffers: local hit=4 -> Seq Scan on tb (actual rows=0 loops=3)
Filter: (ta.id = id) Rows Removed by Filter: 3 Buffers: local hit=3 ->
Seq Scan on tc (actual rows=1 loops=1) Buffers: local hit=1 Planning:
Buffers: shared hit=16 read=9 (15 rows) |

|I have written a patch to add this functionality and now it gives an
query plan: |

|alena(at)postgres=# EXPLAIN (ANALYZE, COSTS OFF, SUMMARY OFF, TIMING OFF)
 SELECT *
   FROM ta
  WHERE EXISTS (SELECT *
                  FROM tb JOIN tc
                  ON ta.id = tb.id);
                     QUERY PLAN
-------------------------------------------------------------------------
 Nested Loop Semi Join (actual rows=1 loops=1)
   Buffers: local hit=6
   ->  Seq Scan on ta (actual rows=3 loops=1)
         Buffers: local hit=1
   ->  Nested Loop (actual rows=0 loops=3)
         Buffers: local hit=5
         ->  Index Only Scan using tb_pkey on tb (actual rows=0 loops=3)
               Index Cond: (id = ta.id)
               Heap Fetches: 1
               Buffers: local hit=4
         ->  Seq Scan on tc (actual rows=1 loops=1)
               Buffers: local hit=1
(12 rows)|

tb and tc form a Cartesian product, but in the case of the intersection
condition with tuples from the table ta (ta.id = tb.id). So, according
to the join condition, tb intersects only with 1, and only it gets into
the result, but at the same time they appear twice - this is because of
the Cartesian product of tb with tc

|*How it works:*
|

I rewrote the code a bit so that it considers not only the quals in
jointree->quals, but also those in join expression
(subselect->jointree->fromlist). If they satisfy the conditions for
using pull up, I add them to the list of clauses and form a "Bool"
expression from them, joined by an "AND" operation.

--

Regards, Alena Rybakina Postgres Professional

Attachment Content-Type Size
0001-Add-EXISTS-pull-up-if-subquery-join-expressions.patch text/x-patch 12.1 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2024-12-24 05:01:15 Re: stored procedures vs pg_stat_statements
Previous Message Merlin Moncure 2024-12-24 04:06:58 Re: stored procedures vs pg_stat_statements