Re: BUG #18429: Inconsistent results on similar queries with join lateral

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Benoit Ryder <b(dot)ryder(at)ateme(dot)com>
Cc: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: BUG #18429: Inconsistent results on similar queries with join lateral
Date: 2024-04-13 22:58:39
Message-ID: 265068.1713049119@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

I wrote:
> It looks like the problem is that the old join_clause_is_movable
> logic is incorrectly deciding that the WHERE condition can be
> pushed down to the sub-select relation.

Nope, that's not accurate: I soon found that the
join_clause_is_movable functions aren't being invoked at all for the
troublesome clause. It turns out that that clause is generated by
generate_join_implied_equalities (after having been, rather uselessly,
decomposed into an EquivalenceClass), and the reason it gets into the
scan-level condition is that get_baserel_parampathinfo puts it there
without any movability checking. That happens because
get_baserel_parampathinfo believes this:

* Add in joinclauses generated by EquivalenceClasses, too. (These
* necessarily satisfy join_clause_is_movable_into.)

But this clause *doesn't* satisfy that function; in the back branches
it'll fail the test against nullable_relids. So I said to myself,
great, we can fix this by filtering the output clauses with
join_clause_is_movable_into. That takes care of the submitted bug
all right, but it turns out that it also causes some existing
regression test cases to give wrong answers. And that is because
of a different problem: some clauses generated by
generate_join_implied_equalities don't satisfy the "Clause must
physically reference at least one target rel" heuristic in
join_clause_is_movable_into. Specifically that fails if we have a
clause generated for a child appendrel member (ie, one arm of a
flattened UNION ALL construct) whose EC member expression is Var-free.

So this seems like a bit of a mess. We can fix the submitted bug with
the kluge of only testing the nullable_relids condition, as I've done
in the attached patch for v15. (As join_clause_is_movable_into says,
this condition is conservative and might sometimes reject a clause
that could be evaluated safely, but that's fine for
get_baserel_parampathinfo's purposes.) But I'm worried about the
physically-reference business, because this function isn't the only
one that assumes that all output from generate_join_implied_equalities
will satisfy join_clause_is_movable_into: there are other functions
that actually assert that. How come we've not seen assertion
failures, or reports of wrong query results similar to this one?
I can believe that the constant-EC-member situation is impossible
when considering a join as inner_rel, so get_joinrel_parampathinfo's
first usage is probably safe. But it sure looks like the
generate_join_implied_equalities_for_ecs usage is not. I wonder
if we can replace the test on clause_relids with something a bit
more reliable, or skip it when considering EC-derived clauses.

Anyway, if we go no further than this then we need the first
attached patch in <= v15, while in v16/HEAD we should at least
modify the comment to reflect reality, as I've done in the
second patch.

Thoughts? Can anybody devise a test case that triggers the
Asserts I mentioned?

regards, tom lane

Attachment Content-Type Size
fix-bug-18429-wip-v15.patch text/x-diff 4.0 KB
fix-incorrect-comment-wip-head.patch text/x-diff 895 bytes

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message PG Bug reporting form 2024-04-14 01:29:35 BUG #18432: Polymorphic, table-returning PL/pgSQL function fails with an erroneous "schema mismatch" error
Previous Message Noah Misch 2024-04-13 17:15:28 Re: FSM Corruption (was: Could not read block at end of the relation)