From: | Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru> |
---|---|
To: | Andrei Lepikhov <lepihov(at)gmail(dot)com> |
Cc: | David Rowley <dgrowleyml(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Memoize ANTI and SEMI JOIN inner |
Date: | 2025-03-31 02:50:07 |
Message-ID: | c168aaf6-e1c7-4ec7-87f0-63790905cbd9@postgrespro.ru |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
I realized that I uploaded my diff file with a small mistake - sorry
about that. I've corrected it with this message so your tests can pass
in the CI.
On 31.03.2025 05:33, Alena Rybakina wrote:
>
> Hi!
>
> On 21.03.2025 18:56, Andrei Lepikhov wrote:
>> On 20/3/2025 07:02, David Rowley wrote:
>>> On Thu, 20 Mar 2025 at 06:16, Andrei Lepikhov <lepihov(at)gmail(dot)com>
>>> wrote:
>>>> How can we be sure that semi or anti-join needs only one tuple? I
>>>> think
>>>> it would be enough to control the absence of join clauses and
>>>> filters in
>>>> the join. Unfortunately, we only have such a guarantee in the plan
>>>> creation stage (maybe even setrefs.c). So, it seems we need to
>>>> invent an
>>>> approach like AlternativeSubplan.
>>>
>>> I suggest looking at what 9e215378d did. You might be able to also
>>> allow semi and anti-joins providing the cache keys cover the entire
>>> join condition. I think this might be safe as Nested Loop will only
>>> ask its inner subnode for the first match before skipping to the next
>>> outer row and with anti-join, there's no reason to look for additional
>>> rows after the first. Semi-join and unique joins do the same thing in
>>> nodeNestloop.c. To save doing additional checks at run-time, the code
>>> does:
>> Thank you for the clue! I almost took the wrong direction.
>> I have attached the new patch, which includes corrected comments for
>> better clarification of the changes, as well as some additional tests.
>> I now feel much more confident about this version since I have
>> resolved that concern.
>>
>
> I reviewed your patch and made a couple of suggestions.
>
> The first change is related to your comment (and the one before it). I
> fixed some grammar issues and simplified the wording to make it
> clearer and easier to understand.
>
> The second change involves adding an Assert when generating the
> Memoize path. Based on the existing comment and the surrounding logic
> (shown below),
> I believe it's worth asserting that both inner_unique and single_mode
> are not true at the same time — just as a safety check.
> /*
> * We may do this for SEMI or ANTI joins when they need only one tuple from
> * the inner side to produce the result. Following if condition checks that
> * rule.
> *
> * XXX Currently we don't attempt to mark SEMI/ANTI joins as inner_unique
> * = true. Should we? See add_paths_to_joinrel()
> */
> if(!extra->inner_unique&& (jointype== JOIN_SEMI||
> jointype== JOIN_ANTI))
> single_mode= true;
--
Regards,
Alena Rybakina
Postgres Professional
Attachment | Content-Type | Size |
---|---|---|
memoize.diff.no-cfbot | text/plain | 2.0 KB |
v1-0001-Memoise-the-inner-of-SEMI-and-ANTI-join.patch | text/x-patch | 14.2 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | David Rowley | 2025-03-31 02:56:56 | Re: Reducing memory consumed by RestrictInfo list translations in partitionwise join planning |
Previous Message | Alena Rybakina | 2025-03-31 02:33:49 | Re: Memoize ANTI and SEMI JOIN inner |