Re: Memoize ANTI and SEMI JOIN inner

From: Richard Guo <guofenglinux(at)gmail(dot)com>
To: Andrei Lepikhov <lepihov(at)gmail(dot)com>
Cc: David Rowley <dgrowleyml(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Memoize ANTI and SEMI JOIN inner
Date: 2025-03-31 10:18:37
Message-ID: CAMbWs49G-w9UMEBeMVj_ZT6-W-3iv=1xN8QUy_gLSiGf_VXSUw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Mar 31, 2025 at 6:46 PM Andrei Lepikhov <lepihov(at)gmail(dot)com> wrote:
> On 3/31/25 11:03, Richard Guo wrote:
> > I reviewed this patch and I have some concerns about the following
> > code:
> >
> > if (extra->inner_unique &&
> > (inner_path->param_info == NULL ||
> > bms_num_members(inner_path->param_info->ppi_serials) <
> > list_length(extra->restrictlist)))
> > return NULL;
> >
> > I understand that this check is used to ensure that the entire join
> > condition is parameterized in the case of unique joins, so that we can
> > safely mark the cache entry as complete after reading the first tuple.
> > However, ppi_clauses includes join clauses available from all outer
> > rels, not just the current outer rel, while extra->restrictlist only
> > includes the restriction clauses for the current join. This means the
> > check could pass even if a restriction clause isn't parameterized, as
> > long as another join clause, which doesn't belong to the current join,
> > is included in ppi_clauses.

> Initially, I had the same concern. But if ppi_clauses contains a qual,
> it should refer to this join and, as a result, be in the
> extra->restrictlist, isn't it?

Hmm, I don't think so. As I mentioned upthread, ppi_clauses includes
join clauses available from all outer rels, not just the current one.
So a clause included in ppi_clauses is not necessarily included in
extra->restrictlist. As an example, consider

create table t (a int, b int);

explain (costs off)
select * from t t1 join t t2 join
lateral (select *, t1.a+t2.a as x from t t3 offset 0) t3
on t2.a = t3.a
on t1.b = t3.b;
QUERY PLAN
---------------------------------------------------------
Nested Loop
-> Seq Scan on t t2
-> Nested Loop
-> Seq Scan on t t1
-> Subquery Scan on t3
Filter: ((t2.a = t3.a) AND (t1.b = t3.b))
-> Seq Scan on t t3_1
(7 rows)

t3's ppi_clauses includes "t2.a = t3.a" and "t1.b = t3.b", while t1/t3
join's restrictlist only includes "t1.b = t3.b".

Thanks
Richard

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alexander Pyhalov 2025-03-31 10:31:45 Re: SQLFunctionCache and generic plans
Previous Message Jasper Smit 2025-03-31 10:01:46 Re: Assertion with aborted UPDATE in subtransaction