Re: Memoize ANTI and SEMI JOIN inner

From: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
To: Andrei Lepikhov <lepihov(at)gmail(dot)com>
Cc: David Rowley <dgrowleyml(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Memoize ANTI and SEMI JOIN inner
Date: 2025-03-31 02:50:07
Message-ID: c168aaf6-e1c7-4ec7-87f0-63790905cbd9@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I realized that I uploaded my diff file with a small mistake - sorry
about that. I've corrected it with this message so your tests can pass
in the CI.

On 31.03.2025 05:33, Alena Rybakina wrote:
>
> Hi!
>
> On 21.03.2025 18:56, Andrei Lepikhov wrote:
>> On 20/3/2025 07:02, David Rowley wrote:
>>> On Thu, 20 Mar 2025 at 06:16, Andrei Lepikhov <lepihov(at)gmail(dot)com>
>>> wrote:
>>>> How can we be sure that semi or anti-join needs only one tuple? I
>>>> think
>>>> it would be enough to control the absence of join clauses and
>>>> filters in
>>>> the join. Unfortunately, we only have such a guarantee in the plan
>>>> creation stage (maybe even setrefs.c). So, it seems we need to
>>>> invent an
>>>> approach like AlternativeSubplan.
>>>
>>> I suggest looking at what 9e215378d did.  You might be able to also
>>> allow semi and anti-joins providing the cache keys cover the entire
>>> join condition. I think this might be safe as Nested Loop will only
>>> ask its inner subnode for the first match before skipping to the next
>>> outer row and with anti-join, there's no reason to look for additional
>>> rows after the first. Semi-join and unique joins do the same thing in
>>> nodeNestloop.c. To save doing additional checks at run-time, the code
>>> does:
>> Thank you for the clue! I almost took the wrong direction.
>> I have attached the new patch, which includes corrected comments for
>> better clarification of the changes, as well as some additional tests.
>> I now feel much more confident about this version since I have
>> resolved that concern.
>>
>
> I reviewed your patch and made a couple of suggestions.
>
> The first change is related to your comment (and the one before it). I
> fixed some grammar issues and simplified the wording to make it
> clearer and easier to understand.
>
> The second change involves adding an Assert when generating the
> Memoize path. Based on the existing comment and the surrounding logic
> (shown below),
> I believe it's worth asserting that both inner_unique and single_mode
> are not true at the same time — just as a safety check.
> /*
> * We may do this for SEMI or ANTI joins when they need only one tuple from
> * the inner side to produce the result. Following if condition checks that
> * rule.
> *
> * XXX Currently we don't attempt to mark SEMI/ANTI joins as inner_unique
> * = true. Should we? See add_paths_to_joinrel()
> */
> if(!extra->inner_unique&& (jointype== JOIN_SEMI||
> jointype== JOIN_ANTI))
> single_mode= true;

--
Regards,
Alena Rybakina
Postgres Professional

Attachment Content-Type Size
memoize.diff.no-cfbot text/plain 2.0 KB
v1-0001-Memoise-the-inner-of-SEMI-and-ANTI-join.patch text/x-patch 14.2 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2025-03-31 02:56:56 Re: Reducing memory consumed by RestrictInfo list translations in partitionwise join planning
Previous Message Alena Rybakina 2025-03-31 02:33:49 Re: Memoize ANTI and SEMI JOIN inner