Re: BUG #18422: Assert in expandTupleDesc() fails on row mismatch with additional SRF

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Tender Wang <tndrwang(at)gmail(dot)com>
Cc: exclusion(at)gmail(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #18422: Assert in expandTupleDesc() fails on row mismatch with additional SRF
Date: 2024-04-09 19:34:23
Message-ID: 3861246.1712691263@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Tender Wang <tndrwang(at)gmail(dot)com> writes:
>> The following bug has been logged on the website:
>> Bug reference: 18422
>> Logged by: Alexander Lakhin
>> Email address: exclusion(at)gmail(dot)com
>> PostgreSQL version: 16.2
>> Operating system: Ubuntu 22.04
>> Description:
>>
>> The following query:
>> SELECT * FROM generate_series(1, 1),
>> COALESCE(row(1)) AS (a int, b int);
>> triggers an assertion failure:
>> TRAP: failed Assert("count <= tupdesc->natts"), File: "parse_relation.c",
>> Line: 3054, PID: 151361

> I copy the ereport code in tupledesc_match() into expandRTE() if
> funcolcount is larger
> than natts. The attached patch looks like a workaround fix.

I think that's mostly a kluge. It seems to me that the actual problem
is that 2ed8f9a01 missed some places that need fixing. The policy
that that commit intended to institute is "if a RangeTblFunction has a
coldeflist, then the function return type is certainly RECORD, and we
don't need to try to dig the column specifications out of the function
expression" (which dodges the problem that the function expression
might not produce what we're expecting). However, expandRTE didn't
get that memo, and would produce the wrong tupdesc in an example such
as this one. IMO, it ought to produce the query-specified columns in
all cases, even when the function expression doesn't match. Throwing
a premature error isn't better, especially if it only detects one
kind of mismatch.

I dug around looking at other callers of get_expr_result_type and
get_expr_result_tupdesc, and found a couple other places that aren't
actively buggy but can be made faster and more robust by adding
this same rule.

There is a remaining place where we're arguably misusing
get_expr_result_type, which is init_sexpr in execSRF.c. However,
that will just result in postponing the eventual error, because
we'll still cross-check the function result tupdesc against the
query-expected one later. So I think it's okay as-is; at least
it doesn't seem worth the trouble to refactor so that init_sexpr
would have access to the correct RangeTblFunction.

BTW, I wondered why the test cases we added for 2ed8f9a01 didn't
catch this, because they sure look superficially similar. The
reason seems to be that we fail because this query produces a join
plan in which we invoke build_physical_tlist for the FunctionScan,
and that's what's calling expandRTE and hitting the assertion.
The older test cases get simplified sufficiently that there's no
join but just a FunctionScan, and that plan node will be required
to produce the query-specified tlist not a physical tlist, so we
accidentally avoid reaching the assertion.

So I end with the attached.

regards, tom lane

Attachment Content-Type Size
v2-fix-bug-18422.patch text/x-diff 4.5 KB

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Alexander Lakhin 2024-04-09 20:00:00 Re: BUG #17257: (auto)vacuum hangs within lazy_scan_prune()
Previous Message Alexander Lakhin 2024-04-09 19:00:00 Re: BUG #17821: Assertion failed in heap_update() due to heap pruning