Re: An incorrect check in get_memoize_path

From: wenhui qiu <qiuwenhuifx(at)gmail(dot)com>
To: Richard Guo <guofenglinux(at)gmail(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: An incorrect check in get_memoize_path
Date: 2025-04-07 12:02:04
Message-ID: CAGjGUAL5cdFyrXnRO_DiRPeUuGiO9vd5riMRuGCVQjUPue0g6Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Richard
Thank you work on this

- * - * XXX this could be enabled if the remaining join quals were made
part of - * the inner scan's filter instead of the join filter. Maybe it's
worth - * considering doing that? */ - if (extra->inner_unique && -
(inner_path->param_info == NULL || -
bms_num_members(inner_path->param_info->ppi_serials) < -
list_length(extra->restrictlist))) - return NULL; + if
(extra->inner_unique) + { + Bitmapset *ppi_serials; + + if
(inner_path->param_info == NULL) + return NULL; + + ppi_serials =
inner_path->param_info->ppi_serials; + + foreach_node(RestrictInfo, rinfo,
extra->restrictlist) + { + if (!bms_is_member(rinfo->rinfo_serial,
ppi_serials)) + return NULL; + } + } The refined version introduces a more
precise validation by:
1. Explicitly verifying each restriction – Instead of relying on a count
comparison, it checks whether every restriction's serial number is included
in the inner path’s parameter info (ppi_serials).
2. Reducing false negatives – The optimizer now only discards a path if a
restriction is definitively unhandled, allowing more valid plans to be
considered.

On Mon, Apr 7, 2025 at 3:50 PM Richard Guo <guofenglinux(at)gmail(dot)com> wrote:

> While reviewing another patch [1], I noticed an incorrect check in
> get_memoize_path.
>
> if (extra->inner_unique &&
> (inner_path->param_info == NULL ||
> bms_num_members(inner_path->param_info->ppi_serials) <
> list_length(extra->restrictlist)))
> return NULL;
>
> This check is to ensure that, for unique joins, all the restriction
> clauses are parameterized to enable the use of Memoize nodes.
> However, ppi_clauses includes join clauses available from all outer
> rels, not just the current outer rel, while extra->restrictlist only
> includes the restriction clauses for the current join. This means the
> check could pass even if a restriction clause isn't parameterized, as
> long as another join clause, which doesn't belong to the current join,
> is included in ppi_clauses. It's not very hard to construct a query
> that exposes this issue.
>
> create table t (a int, b int);
>
> explain (costs off)
> select * from t t0 left join
> (t t1 left join t t2 on t1.a = t2.a
> join lateral
> (select distinct on (a, b) a, b, t0.a+t1.a+t2.a as x from t t3 offset
> 0) t3
> on t1.a = t3.a and coalesce(t2.b) = t3.b)
> on t0.b = t3.b;
> QUERY PLAN
> ---------------------------------------------------------
> Nested Loop Left Join
> -> Seq Scan on t t0
> -> Nested Loop
> Join Filter: (COALESCE(t2.b) = t3.b)
> -> Hash Left Join
> Hash Cond: (t1.a = t2.a)
> -> Seq Scan on t t1
> -> Hash
> -> Seq Scan on t t2
> -> Subquery Scan on t3
> Filter: ((t0.b = t3.b) AND (t1.a = t3.a))
> -> Unique
> -> Sort
> Sort Key: t3_1.a, t3_1.b
> -> Seq Scan on t t3_1
> (15 rows)
>
> Consider the join to t3. It is a unique join, and not all of its
> restriction clauses are parameterized. Despite this, the check still
> passes.
>
> Interestingly, although the check passes, no Memoize nodes are added
> on top of t3's paths because paraminfo_get_equal_hashops insists that
> all ppi_clauses must satisfy clause_sides_match_join (though I
> actually question whether this is necessary, but that's a separate
> issue). However, this does not justify the check being correct.
>
> Hence, I propose the attached patch for the fix.
>
> BTW, there is a XXX comment there saying that maybe we can make the
> remaining join quals part of the inner scan's filter instead of the
> join filter. I don't think this is possible in all cases. In the
> above query, 'coalesce(t2.b) = t3.b' cannot be made part of t3's scan
> filter, according to join_clause_is_movable_into. So I removed that
> comment in the patch while we're here.
>
> Any thoughts?
>
> [1] https://postgr.es/m/60bf8d26-7c7e-4915-b544-afdb9020011d@gmail.com
>
> Thanks
> Richard
>

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Jim Jones 2025-04-07 12:25:52 Re: [PATCH] Add CANONICAL option to xmlserialize
Previous Message Bernd Helmle 2025-04-07 11:48:49 Re: Modern SHA2- based password hashes for pgcrypto