Re: BUG #18634: Wrong varnullingrels with merge ... when not matched by source

From: Richard Guo <guofenglinux(at)gmail(dot)com>
To: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, exclusion(at)gmail(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #18634: Wrong varnullingrels with merge ... when not matched by source
Date: 2024-09-30 02:16:00
Message-ID: CAMbWs4-8AnhctGsEuBny0T5E_ckpq0+s7E0wyHViSgvKSMMiUQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On Sun, Sep 29, 2024 at 3:49 PM Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> wrote:
> On Sat, 28 Sept 2024 at 01:40, Richard Guo <guofenglinux(at)gmail(dot)com> wrote:
> > I have the same concern. I think we should NOT mark the vars in
> > mergeJoinCondition as nullable, as mergeJoinCondition acts as join
> > quals rather than filter quals at that outer join. Instead, we should
> > mark them nullable when they are pulled out and ready to be added to
> > the targetlist, if they are really needed in the targetlist.
>
> Actually, the marking is done after building the join node, so it's
> only marking a copy of the join condition, for use above the join. The
> original condition inside the join node remains unmarked, so I think
> it's right.

Ah yes, you are right. I thought the join node we build for MERGE
would use the marked mergeJoinCondition as join quals, but that's not
the case.

> I spent some time trying to figure out why none of the existing tests
> hit this error, and I think the reason is that all the previous tests
> involved a plan where the ModifyTable node is directly on top of the
> join node, so the top targetlist was the join node's targetlist, and
> therefore wasn't marked. But in the example here, there is a one-time
> filter Result node between the ModifyTable node and the join node,
> which means the ModifyTable node pulls from the Result node, whose
> output is marked as nullable, because it's above the join. That makes
> the error somewhat rare, though maybe there are other cases that can
> lead to a plan node being inserted between the ModifyTable node and
> the join node.
>
> It feels a bit unsatisfactory that this wasn't detectable with a
> ModifyTable node directly on top of the join node, making the bug hard
> to spot, but I don't know whether it would be feasible to do anything
> about that.

For an outer join, any vars appearing in its targetlist (and qpqual)
should be marked nullable if they are from the nullable side, because
they are logically above the join. However, when we fix up the
targetlist and qpqual, we don't have enough info available to identify
the nullingrel bits added by the outer join. So we have to use
superset matches rather than exact matches.

This is why we don't hit this error in cases where the ModifyTable
node is directly on top of the join node, even though we fail to mark
the vars in targetlist correctly.

In Alexander's case, there is a Result node in between. When we fix
up the targetlist for the Result node, we perform exact matches for
the nullingrel bits. That's how this issue is revealed.

Yeah, it's a bit unsatisfying that we can only perform superset
matches rather than exact matches for the Vars in the targetlist and
qpqual of an outer join. Maybe we can record the ojrelid of the outer
join in Join node, and then match these Vars with input Vars'
nullingrels plus this ojrelid. But when the outer joins are
re-ordered according to identity 3, it becomes very tricky to figure
out what the correct ojrelid(s) we should use for the outer join.

Thanks
Richard

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message DBA 2024-09-30 02:32:36 what are the things that occupy the session memory.
Previous Message Takeshi Ideriha 2024-09-30 01:16:00 Re: BUG #18641: Logical decoding of two-phase commit fails with TOASTed default values