Re: Pulling up direct-correlated ANY_SUBLINK

From: Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Richard Guo <riguo(at)vmware(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Pulling up direct-correlated ANY_SUBLINK
Date: 2022-10-30 07:28:38
Message-ID: CAKU4AWpi9oztiomUQt4JCxXEr6EaQ2thY-7JYDm6c9he0A7oCA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi All:

On Tue, Sep 10, 2019 at 9:49 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Richard Guo <riguo(at)pivotal(dot)io> writes:
> > Currently we do not try to pull up sub-select of type ANY_SUBLINK if it
> > refers to any Vars of the parent query, as indicated in the code snippet
> > below:
> > if (contain_vars_of_level((Node *) subselect, 1))
> > return NULL;
> > Why do we have this check?
>
> Because the result would not be a join between two independent tables.
>

I think this situation is caused by we pull-up the ANY-sublink with 2
steps, the first step is to pull up the sublink as a subquery, and the
next step is to pull up the subquery if it is allowed. The benefits of
this method are obvious, pulling up the subquery has more requirements,
even if we can just finish the first step, we still get huge benefits.
However the bad stuff happens if varlevelsup = 1 involves, step 1 fails!

The solution here is to use the lateral join to overcome the two
independent tables, the issue of this solution includes:

1. LATERAL pretty much constrains things to use a nestloop like below,
but this reason is questioned since if we can pull-up the subquery, if so
the
constraint gone. [1]
2. It has something with unique-ify the inner path. [2] , but Richard
thought
it should be fixed but without an agreement for all people [3].
3. Richard [4] found it would fail to get a plan for some query. (the
error is
below per my testing)

> ERROR: failed to build any 3-way joins

So back to the root cause of this issue, IIUC, if varlevelsup = 1
involves,
can we just bypass the 2-steps method, just as what we do for EXISTS
sublinks? If so, we just need to convert the ANY-SUBLINK to EXIST-SUBLINK
under the case.

The attached is the one commit which includes the 2 methods discussed
here, controlled by different GUC separately, for easy testing. Per my
test,
Query 2 choosed the Unique Join with the IN-to-EXISTS method, but not
with the Lateral method, and query 3 raises error with the lateral method,
but not with the IN-to-EXISTS method.

[1]
https://www.postgresql.org/message-id/flat/60794.1568104308%40antos#365d5ec69fd605a8569a2674a33909a1
[2] https://www.postgresql.org/message-id/60794.1568104308%40antos
[3]
https://www.postgresql.org/message-id/CAN_9JTzqa-3RmHAw3wZv099Rk8xX480YdEvGy%2BJAdVw8dTnHRA%40mail.gmail.com
[4]
https://www.postgresql.org/message-id/CAMbWs49cvkF9akbomz_fCCKS%3DD5TY%3D4KGHEQcfHPZCXS1GVhkA%40mail.gmail.com

> > Can we try to pull up direct-correlated ANY SubLink with the help of
> > LATERAL?
>
> Perhaps. But what's the argument that you'd end up with a better
> plan? LATERAL pretty much constrains things to use a nestloop,
> so I'm not sure there's anything fundamentally different.
>
> regards, tom lane
>
>
--
Best Regards
Andy Fan

Attachment Content-Type Size
test.sql application/sql 520 bytes
v1-0001-2-methods-for-Pulling-up-direct-correlated-ANY_SU.patch application/x-patch 23.2 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Andy Fan 2022-10-30 09:16:50 Re: How to started with Contributions
Previous Message Thomas Munro 2022-10-30 06:10:41 Re: 16: Collation versioning and dependency helpers