Quick Links

Re: A new strategy for pull-up correlated ANY_SUBLINK

From:	Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru>
To:	Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com>
Cc:	pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, vignesh C <vignesh21(at)gmail(dot)com>, Richard Guo <guofenglinux(at)gmail(dot)com>, Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>
Subject:	Re: A new strategy for pull-up correlated ANY_SUBLINK
Date:	2023-10-13 08:39:41
Message-ID:	00de47dd-7a38-4225-a47e-65a8ba6de8d3@yandex.ru
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On 13.10.2023 10:04, Andy Fan wrote:
>
> It seems to me that the expressions "=" and "IN" are equivalent
> here due to the fact that the aggregated subquery returns only one
> value, and the result with the "IN" operation can be considered as
> the intersection of elements on the left and right. In this query,
> we have some kind of set on the left, among which there will be
> found or not only one element on the right.
>
>
> Yes, they are equivalent at the final result, but there are some
> differences at the execution level. the '=' case will be transformed
> to a Subplan whose subPlanType is EXPR_SUBLINK, so if there
> is more than 1 rows is returned in the subplan, error will be raised.
>
> select * from tenk1 where
> ten = (select ten from tenk1 i where i.two = tenk1.two );
>
> ERROR: more than one row returned by a subquery used as an expression
>
> However the IN case would not.
> select * from tenk1 where
> ten = (select ten from tenk1 i where i.two = tenk1.two ) is OK.
>
> I think the test case you added is not related to this feature. the
> difference is there even without the patch. so I kept the code
> you changed, but not for the test case.
Yes, I understand and agree with you that we should delete the last
queries, except to one.

The query below have a different result compared to master, and it is
correct.

Without your patch:

explain (costs off)
+SELECT * FROM tenk1 A LEFT JOIN tenk2 B
ON B.hundred in (SELECT min(c.hundred) FROM tenk2 C WHERE c.odd = b.odd);
                                 QUERY PLAN
-----------------------------------------------------------------------------
Nested Loop Left Join
   -> Seq Scan on tenk1 a
   -> Materialize
         -> Seq Scan on tenk2 b
               Filter: (SubPlan 2)
               SubPlan 2
                 -> Result
                       InitPlan 1 (returns $1)
                         -> Limit
                               -> Index Scan using tenk2_hundred on
tenk2 c
                                     Index Cond: (hundred IS NOT NULL)
                                     Filter: (odd = b.odd)
(12 rows)

After your patch:

postgres=# explain (costs off)
SELECT * FROM tenk1 A LEFT JOIN tenk2 B
ON B.hundred in (SELECT min(c.hundred) FROM tenk2 C WHERE c.odd = b.odd);

                           QUERY PLAN
--------------------------------------------------------------
Nested Loop Left Join
    -> Seq Scan on tenk1 a
    -> Materialize
          -> Nested Loop
                -> Seq Scan on tenk2 b
*-> Subquery Scan on "ANY_subquery"
                      Filter: (b.hundred = "ANY_subquery".min)*
                      -> Aggregate
                            -> Seq Scan on tenk2 c
                                  Filter: (odd = b.odd)
(10 rows)

>
>> I took the liberty of adding this to your patch and added
>> myself as reviewer, if you don't mind.
>>
>> Sure, the patch after your modification looks better than the
>> original.
>> I'm not sure how the test case around "because of got one row" is
>> relevant to the current changes. After we reach to some agreement
>> on the above discussion, I think v4 is good for committer to review!
>
> Thank you!) I am ready to discuss it.
>
> Actually I meant to discuss the "Unfortunately, I found a request..",
> looks
> we have reached an agreement there:)
>
Yes, we have)

--
Regards,
Alena Rybakina

Attachment	Content-Type	Size
pull-up.diff	text/x-patch	2.1 KB

In response to

Re: A new strategy for pull-up correlated ANY_SUBLINK at 2023-10-13 07:04:45 from Andy Fan

Responses

Re: A new strategy for pull-up correlated ANY_SUBLINK at 2024-01-26 12:46:02 from vignesh C

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	a.rybakina	2023-10-13 08:56:37	Re: Removing unneeded self joins
Previous Message	shveta malik	2023-10-13 08:35:34	Re: Synchronizing slots from primary to standby