Re: Using each rel as both outer and inner for JOIN_ANTI

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Richard Guo <guofenglinux(at)gmail(dot)com>
Cc: Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>, Pg Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Emre Hasegeli <emre(at)hasegeli(dot)com>
Subject: Re: Using each rel as both outer and inner for JOIN_ANTI
Date: 2022-07-30 16:07:38
Message-ID: 391492.1659197258@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Richard Guo <guofenglinux(at)gmail(dot)com> writes:
> [ v4-0001-Using-each-rel-as-both-outer-and-inner-for-anti-j.patch ]

I took a quick look through this. The executor changes are indeed
impressively short, but that's largely because you've paid zero
attention to updating obsoleted comments. For example, in
nodeHashjoin.c there are lots of references to right/full joins
that likely now need to cover right-anti. I'm not sure that the
empty-rel startup optimizations are correct for this case, either.

I don't have a warm feeling about the planner changes being correct:
it looks like what you mostly did was to treat JOIN_RIGHT_ANTI
identically to JOIN_ANTI everywhere, which is surely wrong.
As an example of this, optimizer/README mentions

Similarly, parameterized paths do not normally get preference in add_path
for having cheap startup cost; that's seldom of much value when on the
inside of a nestloop, so it seems not worth keeping extra paths solely for
that. An exception occurs for parameterized paths for the RHS relation of
a SEMI or ANTI join: in those cases, we can stop the inner scan after the
first match, so it's primarily startup not total cost that we care about.

For RIGHT_ANTI it'd become startup of the outer scan that counts, but
I don't think you've gotten that right here.

There are various references to JOIN_ANTI in planner peripheral code,
e.g. selfuncs.c, that probably need adjustment.

[ wanders away wondering if JOIN_RIGHT_SEMI should become a thing ... ]

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2022-07-30 16:51:35 Re: standby recovery fails (tablespace related) (tentative patch and discussion)
Previous Message Tom Lane 2022-07-30 14:40:34 Re: pg15b2: large objects lost on upgrade