Re: BUG #18305: Unexpected error: "WindowFunc not found in subplan target lists" triggered by subqueries

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Richard Guo <guofenglinux(at)gmail(dot)com>
Cc: zuming(dot)jiang(at)inf(dot)ethz(dot)ch, David Rowley <dgrowleyml(at)gmail(dot)com>, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #18305: Unexpected error: "WindowFunc not found in subplan target lists" triggered by subqueries
Date: 2024-01-23 19:34:12
Message-ID: 4071133.1706038452@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Richard Guo <guofenglinux(at)gmail(dot)com> writes:
> I think the problem is that in find_window_run_conditions() we fail to
> check thoroughly whether the WindowFunc contains any potential subplan
> nodes. We do have the check:

> /* can't use it if there are subplans in the WindowFunc */
> if (contain_subplans((Node *) wfunc))
> return false;

> ... and the bug being reported here indicates that the current check is
> insufficient, because a Var node inside the WindowFunc could potentially
> belong to a subquery and reference a SubLink expression.

I dug through this, and my conclusion is that the entire "run
condition" optimization is seriously broken, because the tree
manipulations here are many bricks shy of a load. There are
two big problems:

1. The subquery hasn't yet been through any preprocessing. Thus,
we've not yet done subquery pullup within it, which is why the
contain_subplans call fails to detect a problem: the WindowFunc's arg
is just a Var. Later, pullup of the lower subquery will replace that
Var with a SubLink, which eventually gets replaced with a Param
referencing an initplan's output. However, run condition optimization
caused us to put a second copy of the WindowFunc into the
runCondition, and that copy gets its own copy of the SubLink, which
eventually results in a different Param, leading to the observed
failure. (BTW, it scares the heck out of me that we just link the
WindowFunc expr into wclause->runCondition without making a copy of
it. That could be safe if the planner never scribbles on its input,
but that ain't so.)

2. We are pushing the "otherexpr" from the upper-level query
(which *has* been through preprocessing) into the unprocessed
child query. This makes me acutely uncomfortable:
A. Almost certainly, this fails if otherexpr contains a subplan.
B. Since nothing is done about levelsup adjustment, it will
fail if anything in that tree includes a levelsup field.
Fortunately is_pseudo_constant_clause will reject Vars,
but those aren't the only things with levelsup. I wonder
if it's possible to get an uplevel Aggref into a subquery's
restrictlist, in particular.
C. Again, failure to make a copy of the expression tree seems
pretty dangerous.
D. We're relying on repeat preprocessing of the otherexpr to
do no harm. Maybe that's OK, but it's not great.
The first three things are not hard to fix, but they're not being
taken care of today.

We could work around point 1 by refusing to pull up subqueries that
contain sublinks in their targetlists, but that would be a pretty big
change (and, probably, a pessimization of some queries). I do not
consider run-condition optimization to justify that. Moreover
I'm not sure that sublinks are the only thing that could get
mutated to a different state in the runCondition than in the main
tree.

I think the only real way to prevent problems from point 1 is to stop
making a copy of the WindowFunc expr. We need some other way to refer
to the WindowFunc's value in the runCondition tree. Maybe a generated
Param would serve?

regards, tom lane

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Michael Paquier 2024-01-23 23:26:48 Re: MarkBufferDirty Assert held LW_EXCLUSIVE lock fail when ginFinishSplit
Previous Message Michael Zhilin 2024-01-23 19:24:11 Re: [BUG] false positive in bt_index_check in case of short 4B varlena datum