Re: Exists pull-up application with JoinExpr

From: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
To: Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Ranier Vilela <ranier(dot)vf(at)gmail(dot)com>
Subject: Re: Exists pull-up application with JoinExpr
Date: 2025-01-13 17:19:42
Message-ID: 6078f7e1-4b29-4b31-bd28-b84149e404e4@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi! I have solved it.

On 30.12.2024 11:24, Alena Rybakina wrote:
>
> Hi! Thank you for your interest to this subject!
>
> On 27.12.2024 15:53, Ilia Evdokimov wrote:
>> Hi Alena,
>>
>> Thank you for your work on subqueries with JOIN.
>>
>> Have you considered the scenario where in subquery includes a qual
>> like (tc.aid = 1)? When I tried executing those queries I receive
>> different results. In my opinion, to prevent this, we should add
>> filters for such quals within the loop 'foreach (lc, all_clauses)'
>>
>> EXPLAIN (ANALYZE, COSTS OFF, SUMMARY OFF, TIMING OFF) SELECT * FROM ta
>> WHERE EXISTS (SELECT * FROM tb, tc WHERE ta.id = tb.id AND tc.aid = 1);
>>                               QUERY PLAN
>> ----------------------------------------------------------------------
>>  Hash Join (actual rows=1 loops=1)
>>    Hash Cond: (ta.id = tb.id)
>>    Buffers: local hit=3
>>    ->  Seq Scan on ta (actual rows=3 loops=1)
>>          Buffers: local hit=1
>>    ->  Hash (actual rows=3 loops=1)
>>          Buckets: 4096  Batches: 1  Memory Usage: 33kB
>>          Buffers: local hit=2
>>          ->  HashAggregate (actual rows=3 loops=1)
>>                Group Key: tb.id
>>                Batches: 1  Memory Usage: 121kB
>>                Buffers: local hit=2
>>                ->  Nested Loop (actual rows=3 loops=1)
>>                      Buffers: local hit=2
>>                      ->  Seq Scan on tb (actual rows=3 loops=1)
>>                            Buffers: local hit=1
>>                      ->  Materialize (actual rows=1 loops=3)
>>                            Storage: Memory  Maximum Storage: 17kB
>>                            Buffers: local hit=1
>>                            ->  Seq Scan on tc (actual rows=1 loops=1)
>>                                  Filter: (aid = 1)
>>                                  Rows Removed by Filter: 1
>>                                  Buffers: local hit=1
>> (23 rows)
>>
>> ============================
>>
>> EXPLAIN (ANALYZE, COSTS OFF, SUMMARY OFF, TIMING OFF)
>> SELECT * FROM ta WHERE EXISTS (SELECT * FROM tb JOIN tc ON ta.id =
>> tb.id WHERE tc.aid = 1);
>>                                 QUERY PLAN
>> ---------------------------------------------------------------------------
>>
>>  Seq Scan on ta (actual rows=1 loops=1)
>>    Filter: EXISTS(SubPlan 1)
>>    Rows Removed by Filter: 2
>>    Buffers: local hit=6
>>    SubPlan 1
>>      ->  Nested Loop (actual rows=0 loops=3)
>>            Buffers: local hit=5
>>            ->  Index Only Scan using tb_pkey on tb (actual rows=0
>> loops=3)
>>                  Index Cond: (id = ta.id)
>>                  Heap Fetches: 1
>>                  Buffers: local hit=4
>>            ->  Seq Scan on tc (actual rows=1 loops=1)
>>                  Filter: (aid = 1)
>>                  Buffers: local hit=1
>> (14 rows)
>>
>>
> You are right, at the moment the code is not processed if there is a
> constant qual in the subquery (like t1.x1=1 in the example below) and
> this problem is not only related to the current patch.
>
> For example you can get such a query plan if you complete this request
> to the master:
>
> create table t (xint);
> create table t1 (x1int);
> create table t2 (x2int);
> EXPLAIN (ANALYZE, COSTS OFF, SUMMARY OFF, TIMING OFF)
> SELECT 1
> FROM t
> WHERE EXISTS (SELECT 1
> FROM t1
> where t1.x1 = 1);
> QUERY PLAN
> ------------------------------------------------
> Result (actual rows=0 loops=1)
> One-Time Filter: (InitPlan 1).col1
> InitPlan 1
> -> Seq Scan on t1 (actual rows=0 loops=1)
> Filter: (x1 = 1)
> -> Seq Scan on t (never executed)
> (6 rows)
>
> It's all because of the check in this function - this qual has
> levelsoup = 0, not 1 (see (!contain_vars_of_level(whereClause, 1)),
> but I already found out that by changing this, the logic of correction
> there is required a little more complicated. At the moment, I'm
> working to add this processing to the patch.
>
> Thanks for the case!
>
The logic is the same, but extended to constants. I added a few more
tests that not only cover this case, but also NOT EXISTS, which will be
converted to ANTI JOIN.

--
Regards,
Alena Rybakina
Postgres Professional

Attachment Content-Type Size
v2-0001-Teach-the-planner-to-convert-EXISTS-and-NOT-EXISTS-s.patch text/x-patch 18.7 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2025-01-13 17:29:05 Re: Adjusting hash join memory limit to handle batch explosion
Previous Message Melanie Plageman 2025-01-13 16:32:04 Re: Adjusting hash join memory limit to handle batch explosion