| From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
|---|---|
| To: | Alvaro Herrera <alvherre(at)dcc(dot)uchile(dot)cl> |
| Cc: | pgsql-hackers(at)postgreSQL(dot)org |
| Subject: | Note about robustness of transaction-related data structures |
| Date: | 2004-07-15 21:33:56 |
| Message-ID: | 3072.1089927236@sss.pgh.pa.us |
| Lists: | pgsql-hackers |
While running a parallel regression test I saw a failure
"relation deleted while still in use"
out of the subtransactions test, after which the backend dumped core.
The coredump was in AtSubAbort_smgr, and it was failing because
upperPendingDeletes was NIL (which it should not have been inside a
subtransaction, of course).
The underlying problem doesn't seem very reproducible --- I tried
several times without seeing it again. But what I deduce from the core
dump is that we had an error during subtransaction cleanup, leading to
an attempt to re-abort an already partially aborted transaction, and
the system just could not cope. The problem is that the list-of-lists
data structure used in smgr.c is brittle: it won't survive two
executions of AtSubAbort_smgr at the same nesting level.
In my local copy I've removed the list-of-lists data structure and
reverted to a flat list of pending deletion requests, with the addition
of a transaction nesting level field in each entry. AtSubCommit_smgr
does this:
    int nestLevel = GetCurrentTransactionNestLevel();
    PendingRelDelete *pending;

    for (pending = pendingDeletes; pending != NULL; pending = pending->next)
    {
        if (pending->nestLevel >= nestLevel)
            pending->nestLevel = nestLevel - 1;
    }
while AtSubAbort_smgr acts only on entries whose nestLevel >= the current
transaction nesting level. This makes the data structure safe against
repeated execution.
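
To illustrate why the flat list tolerates a repeated abort, here is a minimal standalone sketch. It is not the smgr.c code: the entry layout, the `relid` field, and the helpers `add_pending`, `sub_commit`, `sub_abort`, and `count_pending` are hypothetical stand-ins, and the abort path simply discards matching entries rather than doing whatever the real code does with them. The nesting level is passed as a parameter instead of being read from GetCurrentTransactionNestLevel().

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified pending-delete entry (not the real struct). */
typedef struct PendingRelDelete
{
    int         relid;      /* placeholder for the relation identifier */
    int         nestLevel;  /* nesting level that owns this request */
    struct PendingRelDelete *next;
} PendingRelDelete;

/* Flat list of all pending requests, regardless of nesting level. */
static PendingRelDelete *pendingDeletes = NULL;

/* Record a request made at the given nesting level. */
static void
add_pending(int relid, int nestLevel)
{
    PendingRelDelete *p = malloc(sizeof(*p));

    if (p == NULL)
        abort();
    p->relid = relid;
    p->nestLevel = nestLevel;
    p->next = pendingDeletes;
    pendingDeletes = p;
}

/* Subcommit: hand entries at or above this level to the parent level. */
static void
sub_commit(int nestLevel)
{
    PendingRelDelete *p;

    for (p = pendingDeletes; p != NULL; p = p->next)
    {
        if (p->nestLevel >= nestLevel)
            p->nestLevel = nestLevel - 1;
    }
}

/*
 * Subabort: unlink and free entries at or above this level, returning how
 * many were removed.  A second call at the same level finds no matching
 * entries and is therefore a harmless no-op.
 */
static int
sub_abort(int nestLevel)
{
    PendingRelDelete **link = &pendingDeletes;
    int         removed = 0;

    while (*link != NULL)
    {
        PendingRelDelete *p = *link;

        if (p->nestLevel >= nestLevel)
        {
            *link = p->next;    /* unlink before freeing */
            free(p);
            removed++;
        }
        else
            link = &p->next;
    }
    return removed;
}

/* Count the entries currently on the list. */
static int
count_pending(void)
{
    const PendingRelDelete *p;
    int         n = 0;

    for (p = pendingDeletes; p != NULL; p = p->next)
        n++;
    return n;
}
```

Because each entry is unlinked before it is freed, the list stays well formed after every removal, so an abort that is retried at the same level just continues with whatever entries are left.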
I think we will have to make similar adjustments in the other places
where we've used list-of-lists data structures. We may need to look a
little closer at the manipulations of the transaction state stack in
xact.c, as well.
regards, tom lane