Re: BUG #18627: Regression (15 -> 16) - Join removal not performed when join condition spans multiple tables

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: gourlaouen(dot)mikael(at)gmail(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #18627: Regression (15 -> 16) - Join removal not performed when join condition spans multiple tables
Date: 2024-09-22 16:56:00
Message-ID: 2899426.1727024160@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

David Rowley <dgrowleyml(at)gmail(dot)com> writes:
> On Mon, 23 Sept 2024 at 00:55, PG Bug reporting form
> <noreply(at)postgresql(dot)org> wrote:
> It looks like the first bad commit is acc5821e4d (Further fixes in
> qual nullingrel adjustment for outer join commutation)
> I doubt breaking this was an intended change of the nullable Var work.
> Tom can likely confirm.

Hmm ... I did not work through exactly why it broke at that particular
commit and not somewhen else, but what I conclude is that it's fairly
accidental that it ever worked at all. join_is_removable is rejecting
removal of the syntactically-lower join because it thinks that
i.version_id is still needed "above" the join, even though we
previously removed the upper join condition:

(gdb)
219 if (!bms_is_subset(innerrel->attr_needed[attroff], inputrelids))
(gdb) call bmsToString (inputrelids)
$5 = 0x2ecd968 "(b 1 2)"
(gdb) call bmsToString (joinrelids)
$6 = 0x2eac408 "(b 1 2 3)"
(gdb) call bmsToString (innerrel->attr_needed[attroff])
$7 = 0x2eac818 "(b 1 2 3)"

This happens because remove_rel_from_query only cleaned relid 4
(destination d) out of the attr_needed data; it didn't really
account for the fact that i.version_id is no longer needed anywhere
once that ON clause is gone. That is, really at this point
attr_needed[attroff] ought to be empty, and it isn't.

Pre-v16, the case managed to work because we didn't track outer
joins (here, relid 3 for the syntactically-lower join) in
attr_needed, and so this test succeeded anyway. That feels fairly
accidental though; it seems likely that there are related cases
that could be optimized and never have been.

The "clean and obviously correct" answer would be to recompute
all the attr_needed values from scratch after removing the upper
outer join. That seems impractically expensive. What I'm
thinking about is to test against joinrelids here instead of
inputrelids. As the adjacent comment says, that would get the
wrong answer for "pushed down" conditions, but we could revert
to the pre-v16 hack of checking relevant join conditions in
the loop further down, ie put back this kluge:

if (RINFO_IS_PUSHED_DOWN(restrictinfo, joinrelids))
{
/*
* If such a clause actually references the inner rel then join
* removal has to be disallowed. We have to check this despite
* the previous attr_needed checks because of the possibility of
* pushed-down clauses referencing the rel.
*/
if (bms_is_member(innerrelid, restrictinfo->clause_relids))
return false;
continue; /* else, ignore; not useful here */
}

That's ugly for sure, and it would only revert to the status
quo ante. But I'm not seeing a better way right now.

regards, tom lane

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Craig Milhiser 2024-09-23 00:52:20 Re: Reference to - BUG #18349: ERROR: invalid DSA memory alloc request size 1811939328, CONTEXT: parallel worker
Previous Message Tom Lane 2024-09-22 15:02:40 Re: Reference to - BUG #18349: ERROR: invalid DSA memory alloc request size 1811939328, CONTEXT: parallel worker