Re: BUG #15727: PANIC: cannot abort transaction 295144144, it was already committed

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: r(dot)zharkov(at)postgrespro(dot)ru, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #15727: PANIC: cannot abort transaction 295144144, it was already committed
Date: 2019-04-06 23:10:37
Message-ID: 20190406231037.gncrhso7e3lumijj@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Hi,

On 2019-04-06 10:10:25 -0700, Andres Freund wrote:
> I noticed that we say
> + ereport(ERROR,
> + (errcode(ERRCODE_TRIGGERED_DATA_CHANGE_VIOLATION),
> + errmsg("tuple to be updated was already modified by an operation triggered by the current command"),
>
> in the ExecDelete() case (that's not new). Which seems odd.

My inclination is to fix this in master, but not backpatch. I did in the
current version of the patch, because it made it easier to verify my
tests actually work.

One bigger question, around precisely that error, I have is that fixing
this bug led me to add tests for the codepath. What I noticed is that
the current version of the patch detects the above error even if we
first have to follow the update chain, whereas previously that case was
just blindly ignored. Which seems hard to defend to me?

So with the new code we're doing:

starting permutation: wx1 updwctefail c1 c2 read
step wx1: UPDATE accounts SET balance = balance - 200 WHERE accountid = 'checking' RETURNING balance;
balance

400
step updwctefail: WITH doup AS (UPDATE accounts SET balance = balance + 1100 WHERE accountid = 'checking' RETURNING *, update_checking(999)) UPDATE accounts a
step c1: COMMIT;
step updwctefail: <... completed>
accountid balance accountid balance update_checking

savings 1600 checking 1500 t
step c2: COMMIT;
step read: SELECT * FROM accounts ORDER BY accountid;
accountid balance

checking 1501
savings 1600

but previously we got:

starting permutation: wx1 updwctefail c1 c2 read
step wx1: UPDATE accounts SET balance = balance - 200 WHERE accountid = 'checking' RETURNING balance;
balance

400
step updwctefail: WITH doup AS (UPDATE accounts SET balance = balance + 1100 WHERE accountid = 'checking' RETURNING *, update_checking(999)) UPDATE accounts a
step c1: COMMIT;
step updwctefail: <... completed>
accountid balance accountid balance update_checking

savings 1600 checking 1500 t
step c2: COMMIT;
step read: SELECT * FROM accounts ORDER BY accountid;
accountid balance

checking 1501
savings 1600

In other words, the previous error check put there by
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=6868ed7491b7ea7f0af6133bb66566a2f5fe5a75
commit 6868ed7491b7ea7f0af6133bb66566a2f5fe5a75
Author: Kevin Grittner <kgrittn(at)postgresql(dot)org>
Date: 2012-10-26 14:55:36 -0500

Throw error if expiring tuple is again updated or deleted.

weren't that thorough. As long as the "original" row to be updated is
updated by another transaction (i.e. we start read committed ctid
chasing), the old EPQ fetch logic just didn't perform the check
described in the commit + comment:

/*
* The target tuple was already updated or deleted by the
* current command, or by a later command in the current
* transaction. The former case is possible in a join DELETE
* where multiple tuples join to the same target tuple. This
* is somewhat questionable, but Postgres has always allowed
* it: we just ignore additional deletion attempts.
*
* The latter case arises if the tuple is modified by a
* command in a BEFORE trigger, or perhaps by a command in a
* volatile function used in the query. In such situations we
* should not ignore the deletion, but it is equally unsafe to
* proceed. We don't want to discard the original DELETE
* while keeping the triggered actions based on its deletion;
* and it would be no better to allow the original DELETE
* while discarding updates that it triggered. The row update
* carries some information that might be important according
* to business rules; so throwing an error is the only safe
* course.
*
* If a trigger actually intends this type of interaction, it
* can re-execute the DELETE and then return NULL to cancel
* the outer delete.
*/

It seems mighty finnicky to fix this in < v12 (as the error would need
to happen in the guts of EvalPlanQualFetch() rather than in
ExecUpdate/Delete) - so I'm inclined to just fix it in master.

Thoughts?

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Andres Freund 2019-04-07 20:33:15 Re: BUG #15727: PANIC: cannot abort transaction 295144144, it was already committed
Previous Message Andres Freund 2019-04-06 17:27:08 Re: BUG #15727: PANIC: cannot abort transaction 295144144, it was already committed