Re: BUG #17113: Assert failed on calling a function fixed after an extension reload

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: exclusion(at)gmail(dot)com
Cc: pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #17113: Assert failed on calling a function fixed after an extension reload
Date: 2021-07-19 23:17:07
Message-ID: 333098.1626736627@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

I wrote:
> PG Bug reporting form <noreply(at)postgresql(dot)org> writes:
>> The following SQL snippet:
>> ...
>> leads to a failed assertion with the following stack:

> Hmm ... it seems fine in v14 and HEAD, but I do see the crash in v13.

So actually this is an uninitialized-variable problem, which makes
it mostly luck whether it fails or not. (Turning on
RANDOMIZE_ALLOCATED_MEMORY would make it much more reproducible,
though, and I imagine valgrind would complain as well.)

The core of the issue is that this coding pattern in exec_stmt_execsql
is unsafe:

/*
* On the first call for this statement generate the plan, and detect
* whether the statement is INSERT/UPDATE/DELETE
*/
if (expr->plan == NULL)
{
exec_prepare_plan(estate, expr, CURSOR_OPT_PARALLEL_OK, true);
// compute stmt->mod_stmt here
}

It's possible for exec_prepare_plan to throw an error after setting
expr->plan, in which case if we come here again then expr->plan is
already set so we'll never compute stmt->mod_stmt at all.

Since the above is a fairly tempting coding pattern, I tried to make
it safe by rearranging things so that exec_prepare_plan wouldn't set
expr->plan until the end. That idea crashed and burned though; the
plan refcount manipulations we do don't work correctly if we do it
like that, and trying to change that looks like it'd create fairly
substantial risks of leaking plan refcounts on error.

So I ended up with the attached, which just adds some warning comments
saying "don't do it like that" and then fixes the two places that did
do it like that (exec_stmt_call is broken too).

This requires changing struct PLpgSQL_stmt_execsql, which is something
of an ABI hazard for pldebugger and the like. I think we can make it
safe in the back branches by putting the new mod_stmt_set bool after
the other three bools, though that's rather an unintuitive ordering.

For me, the new test case fails in v10..HEAD but not 9.6; but I think
that's just random chance. The bug is really quite old.

Barring objections, I'll push this tomorrow.

(Memo to self: the back branches have more variants of the same issue,
so their patches are going to look different.)

regards, tom lane

Attachment Content-Type Size
bug-17113-fix.patch text/x-diff 6.5 KB

In response to

Browse pgsql-bugs by date

  From Date Subject
Next Message Kyotaro Horiguchi 2021-07-20 04:18:20 Re: BUG #17103: WAL segments are not removed after exceeding max_slot_wal_keep_size
Previous Message Alvaro Herrera 2021-07-19 21:24:48 Re: BUG #17103: WAL segments are not removed after exceeding max_slot_wal_keep_size