Re: BUG #18657: Using JSON_OBJECTAGG with volatile function leads to segfault

From: Amit Langote <amitlangote09(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Tender Wang <tndrwang(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, exclusion(at)gmail(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #18657: Using JSON_OBJECTAGG with volatile function leads to segfault
Date: 2024-10-17 08:15:49
Message-ID: CA+HiwqHpU4VARMpEih6O5xd+psvODEzutf8Rkx7m+sU8OPV-bA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On Thu, Oct 17, 2024 at 1:12 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> I wrote:
> > In other words, the proposed patch is dangerously wrong. I suspect at
> > this point that the true bug might be the opposite: somebody feeling
> > that they could dispense with updating raw_expr when they shouldn't.
>
> To expand on that: the proximate cause of the bug is that
> ExecBuildAggTrans is expecting that expression initialization found
> an Aggref for every aggtransno index. In the plan as emitted by
> the planner, there are two json_object_agg Aggref nodes, one in
> the JsonValueExpr's raw_expr and the other in the formatted_expr.
> preprocess_aggrefs() assigned aggno 0 and aggtransno 0 to the
> first one, aggno 1 and aggtransno 1 to the second. As already
> noted upthread, the core of the problem is that ExecInitExprRec's
> T_JsonValueExpr case is not recursing into raw_expr, so that
> only the second Aggref gets found and linked into the parent
> AggState, thus leaving a hole in the per-transno array.
>
> The proposed patch fixes that indirectly by lobotomizing
> expression_tree_walker so that preprocess_aggrefs doesn't
> find the raw_expr's Aggref either. However, the side effects
> of that will be spectacularly unpleasant, because there are
> approximately zero other callers of expression_tree_walker
> that will be pleased with it.
>
> In the short term, I suspect the only workable fix is to undo the
> optimization of having ExecInitExprRec not recurse into both raw_expr
> and formatted_expr.

This or actually I'm tempted to simply revert the whole thing
(b6e1157e7d3) as an ill-considered refactoring, because I am not able
to convince myself that calling ExecAggPlainTransByVal() twice, via
both raw_expr and formatted_expr, is always safe.

> In the longer run, I'd look hard at why the
> tree needs to have two copies of that subexpression at all.
> At least in this example, it appears that the only difference is
> a type coercion expression wrapped around the formatted_expr, and
> there are surely better ways to handle that.

Before commit b6e1157e, raw_expr and formatted_expr were evaluated
separately, but only raw_expr would contain the actual expression
(such as an Aggref in this case). The result of raw_expr was passed as
an argument to the formatting expression (like CoerceViaIO) using a
CaseValueExpr. That commit removed the CaseValueExpr indirection and
embedded the expression from raw_expr directly into formatted_expr,
making the separate runtime evaluation of raw_expr redundant.
However, this change was ill-advised given this report.

Thoughts on the attached?

--
Thanks, Amit Langote

Attachment Content-Type Size
v3-0001-Revert-Don-t-include-CaseTestExpr-in-JsonValueExp.patch application/octet-stream 10.9 KB

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Thomas Munro 2024-10-17 08:57:25 Re: Reference to - BUG #18349: ERROR: invalid DSA memory alloc request size 1811939328, CONTEXT: parallel worker
Previous Message Andrei Lepikhov 2024-10-17 08:12:45 Re: Reference to - BUG #18349: ERROR: invalid DSA memory alloc request size 1811939328, CONTEXT: parallel worker