From: | Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru> |
---|---|
To: | Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, vignesh C <vignesh21(at)gmail(dot)com>, Richard Guo <guofenglinux(at)gmail(dot)com>, Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru> |
Subject: | Re: A new strategy for pull-up correlated ANY_SUBLINK |
Date: | 2023-10-13 08:39:41 |
Message-ID: | 00de47dd-7a38-4225-a47e-65a8ba6de8d3@yandex.ru |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 13.10.2023 10:04, Andy Fan wrote:
>
> It seems to me that the expressions "=" and "IN" are equivalent
> here due to the fact that the aggregated subquery returns only one
> value, and the result with the "IN" operation can be considered as
> the intersection of elements on the left and right. In this query,
> we have some kind of set on the left, among which there will be
> found or not only one element on the right.
>
>
> Yes, they are equivalent at the final result, but there are some
> differences at the execution level. the '=' case will be transformed
> to a Subplan whose subPlanType is EXPR_SUBLINK, so if there
> is more than 1 rows is returned in the subplan, error will be raised.
>
> select * from tenk1 where
> ten = (select ten from tenk1 i where i.two = tenk1.two );
>
> ERROR: more than one row returned by a subquery used as an expression
>
> However the IN case would not.
> select * from tenk1 where
> ten = (select ten from tenk1 i where i.two = tenk1.two ) is OK.
>
> I think the test case you added is not related to this feature. the
> difference is there even without the patch. so I kept the code
> you changed, but not for the test case.
Yes, I understand and agree with you that we should delete the last
queries, except to one.
The query below have a different result compared to master, and it is
correct.
Without your patch:
explain (costs off)
+SELECT * FROM tenk1 A LEFT JOIN tenk2 B
ON B.hundred in (SELECT min(c.hundred) FROM tenk2 C WHERE c.odd = b.odd);
QUERY PLAN
-----------------------------------------------------------------------------
Nested Loop Left Join
-> Seq Scan on tenk1 a
-> Materialize
-> Seq Scan on tenk2 b
Filter: (SubPlan 2)
SubPlan 2
-> Result
InitPlan 1 (returns $1)
-> Limit
-> Index Scan using tenk2_hundred on
tenk2 c
Index Cond: (hundred IS NOT NULL)
Filter: (odd = b.odd)
(12 rows)
After your patch:
postgres=# explain (costs off)
SELECT * FROM tenk1 A LEFT JOIN tenk2 B
ON B.hundred in (SELECT min(c.hundred) FROM tenk2 C WHERE c.odd = b.odd);
QUERY PLAN
--------------------------------------------------------------
Nested Loop Left Join
-> Seq Scan on tenk1 a
-> Materialize
-> Nested Loop
-> Seq Scan on tenk2 b
*-> Subquery Scan on "ANY_subquery"
Filter: (b.hundred = "ANY_subquery".min)*
-> Aggregate
-> Seq Scan on tenk2 c
Filter: (odd = b.odd)
(10 rows)
>
>> I took the liberty of adding this to your patch and added
>> myself as reviewer, if you don't mind.
>>
>> Sure, the patch after your modification looks better than the
>> original.
>> I'm not sure how the test case around "because of got one row" is
>> relevant to the current changes. After we reach to some agreement
>> on the above discussion, I think v4 is good for committer to review!
>
> Thank you!) I am ready to discuss it.
>
> Actually I meant to discuss the "Unfortunately, I found a request..",
> looks
> we have reached an agreement there:)
>
Yes, we have)
--
Regards,
Alena Rybakina
Attachment | Content-Type | Size |
---|---|---|
pull-up.diff | text/x-patch | 2.1 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | a.rybakina | 2023-10-13 08:56:37 | Re: Removing unneeded self joins |
Previous Message | shveta malik | 2023-10-13 08:35:34 | Re: Synchronizing slots from primary to standby |