Re: Memoize ANTI and SEMI JOIN inner

From: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
To: Andrei Lepikhov <lepihov(at)gmail(dot)com>
Cc: David Rowley <dgrowleyml(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Memoize ANTI and SEMI JOIN inner
Date: 2025-03-31 02:33:49
Message-ID: 26e6dd23-ad56-4e00-8a24-c84b508267aa@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi!

On 21.03.2025 18:56, Andrei Lepikhov wrote:
> On 20/3/2025 07:02, David Rowley wrote:
>> On Thu, 20 Mar 2025 at 06:16, Andrei Lepikhov <lepihov(at)gmail(dot)com> wrote:
>>> How can we be sure that semi or anti-join needs only one tuple? I think
>>> it would be enough to control the absence of join clauses and
>>> filters in
>>> the join. Unfortunately, we only have such a guarantee in the plan
>>> creation stage (maybe even setrefs.c). So, it seems we need to
>>> invent an
>>> approach like AlternativeSubplan.
>>
>> I suggest looking at what 9e215378d did.  You might be able to also
>> allow semi and anti-joins providing the cache keys cover the entire
>> join condition. I think this might be safe as Nested Loop will only
>> ask its inner subnode for the first match before skipping to the next
>> outer row and with anti-join, there's no reason to look for additional
>> rows after the first. Semi-join and unique joins do the same thing in
>> nodeNestloop.c. To save doing additional checks at run-time, the code
>> does:
> Thank you for the clue! I almost took the wrong direction.
> I have attached the new patch, which includes corrected comments for
> better clarification of the changes, as well as some additional tests.
> I now feel much more confident about this version since I have
> resolved that concern.
>

I reviewed your patch and made a couple of suggestions.

The first change is related to your comment (and the one before it). I
fixed some grammar issues and simplified the wording to make it clearer
and easier to understand.

The second change involves adding an Assert when generating the Memoize
path. Based on the existing comment and the surrounding logic (shown
below),
I believe it's worth asserting that both inner_unique and single_mode
are not true at the same time — just as a safety check.
/*
* We may do this for SEMI or ANTI joins when they need only one tuple from
* the inner side to produce the result. Following if condition checks that
* rule.
*
* XXX Currently we don't attempt to mark SEMI/ANTI joins as inner_unique
* = true. Should we? See add_paths_to_joinrel()
*/
if(!extra->inner_unique&& (jointype== JOIN_SEMI||
jointype== JOIN_ANTI))
single_mode= true;

--
Regards,
Alena Rybakina
Postgres Professional

Attachment Content-Type Size
memoize.diff text/x-patch 2.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alena Rybakina 2025-03-31 02:50:07 Re: Memoize ANTI and SEMI JOIN inner
Previous Message torikoshia 2025-03-31 00:23:38 Re: Proposal: Progressive explain