From: | wenhui qiu <qiuwenhuifx(at)gmail(dot)com> |
---|---|
To: | Tender Wang <tndrwang(at)gmail(dot)com> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Richard Guo <guofenglinux(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Anti join confusion |
Date: | 2025-02-24 09:12:40 |
Message-ID: | CAGjGUAKCgRzFgu-6bDATzAmeFDVjq_WbT9mbud36vXdUxJ+Jpg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi Tender Wang
Thank you very much for the analysis.Do we have plans for NOT IN
subquery pull up?
Thanks
On Mon, Feb 24, 2025 at 4:00 PM Tender Wang <tndrwang(at)gmail(dot)com> wrote:
>
>
> wenhui qiu <qiuwenhuifx(at)gmail(dot)com> 于2025年2月24日周一 15:10写道:
>
>> HI Tom , Tender Wang
>> But I saw the path test case are support ,
>> [image: image.png]
>>
>
> Hi wenhui,
>
> In the above picture, it is NOT EXISTS sub-query, which can be pulled
> up, but NOT IN sub-query, which can not now.
> As Tom said:
> "NOT IN is not convertible to an anti-join: the semantics are wrong
> because of the way it treats nulls in the sub-select's output."
>
>
> and my test case set the id is primary key also SubPlan
>> test=# explain analyze SELECT T1.id,T1.K1 FROM join1 t1 WHERE T1.id NOT
>> IN (SELECT T2.id FROM join2 t2 where T2.ID = 1000 );
>> QUERY PLAN
>>
>> ----------------------------------------------------------------------------------------------------------------------------------
>> Seq Scan on join1 t1 (cost=8.45..37446.94 rows=610100 width=8) (actual
>> time=0.071..209.875 rows=1220199 loops=1)
>> Filter: (NOT (ANY (id = (hashed SubPlan 1).col1)))
>> Rows Removed by Filter: 1
>> Buffers: shared hit=22190
>> SubPlan 1
>> -> Index Only Scan using join2_pkey on join2 t2 (cost=0.42..8.44
>> rows=1 width=4) (actual time=0.027..0.029 rows=1 loops=1)
>> Index Cond: (id = 1000)
>> Heap Fetches: 0
>> Buffers: shared hit=4
>> Planning Time: 0.245 ms
>> Execution Time: 247.906 ms
>> (11 rows)
>>
>> test=# explain SELECT T1.id,T1.K1 FROM join1 t1 WHERE T1.id NOT IN
>> (SELECT T2.id FROM join2 t2 where T2.ID > 10000 );
>> QUERY PLAN
>>
>> -------------------------------------------------------------------------------------------------------------
>> Gather (cost=1000.42..8414945643.30 rows=610100 width=8)
>> Workers Planned: 2
>> -> Parallel Seq Scan on join1 t1 (cost=0.42..8414883633.30
>> rows=254208 width=8)
>> Filter: (NOT (ANY (id = (SubPlan 1).col1)))
>> SubPlan 1
>> -> Materialize (cost=0.42..31028.21 rows=829768 width=4)
>> -> Index Only Scan using join2_pkey on join2 t2
>> (cost=0.42..23637.37 rows=829768 width=4)
>> Index Cond: (id > 10000)
>> (8 rows)
>>
>> test=# explain SELECT T1.id,T1.K1 FROM join1 t1 WHERE T1.id NOT IN
>> (SELECT T2.id FROM join2 t2 where T2.ID < 10000 );
>> QUERY PLAN
>>
>> ----------------------------------------------------------------------------------------------
>> Seq Scan on join1 t1 (cost=327.05..37765.54 rows=610100 width=8)
>> Filter: (NOT (ANY (id = (hashed SubPlan 1).col1)))
>> SubPlan 1
>> -> Index Only Scan using join2_pkey on join2 t2 (cost=0.42..301.22
>> rows=10331 width=4)
>> Index Cond: (id < 10000)
>> (5 rows)
>>
>> test=#
>>
>>
> --
> Thanks,
> Tender Wang
>
From | Date | Subject | |
---|---|---|---|
Next Message | jian he | 2025-02-24 09:17:45 | Re: Non-text mode for pg_dumpall |
Previous Message | Bertrand Drouvot | 2025-02-24 09:07:39 | Re: per backend WAL statistics |