Re: Race condition between PREPARE TRANSACTION and COMMIT PREPARED (was Re: Problem with txid_snapshot_in/out() functionality)

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Jan Wieck <jan(at)wi3ck(dot)info>, PostgreSQL Development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Race condition between PREPARE TRANSACTION and COMMIT PREPARED (was Re: Problem with txid_snapshot_in/out() functionality)
Date: 2014-04-14 16:51:02
Message-ID: 27400.1397494262@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Heikki Linnakangas <hlinnakangas(at)vmware(dot)com> writes:
>> I think we'll need to transfer of the locks earlier, before the
>> transaction is marked as fully prepared. I'll take a closer look at this
>> tomorrow.

> Here's a patch to do that. It's very straightforward, I just moved the
> calls to transfer locks earlier, before ProcArrayClearTransaction.
> PostPrepare_MultiXact had a similar race - it also transfer state from
> the old PGPROC entry to the new, and needs to be done before allowing
> another backend to remove the new PGPROC entry. I changed the names of
> the functions to distinguish them from the other PostPrepare_* functions
> that now happen at a different time.

Why didn't you also move up PostPrepare_PredicateLocks? Seems like its
access to MySerializableXact is also racy.

> The patch is simple, but it's a bit scary to change the order of things
> like this.

Yeah. There are a lot of assumptions in there about the order of resource
release, in particular that it is safe to do certain things because we're
still holding locks.

I poked around a bit and noticed one theoretical problem sequence: if the
prepared xact drops some relation that we're still holding buffer pins on.
This shouldn't really happen (why are we still pinning some rel we think
we dropped?) but if it did, the commit would do DropRelFileNodeBuffers
which would end up busy-looping until we drop our pins (see
InvalidateBuffer, which thinks this must be an I/O wait situation).
So it would work, more or less, but it seems pretty fragile. I'm afraid
there are more assumptions like this one.

The whole thing feels like we are solving the wrong problem, anyway.
IIUC, the complaint arises because we are allowing COMMIT PREPARED
to occur before the source transaction has reported successful prepare
to its client. Surely that does not need to be a legal case? No
correctly-operating 2PC xact manager would do that.

I'd prefer to leave the prepare sequence alone and instead find a way
to reject COMMIT PREPARED until after the source transaction is safely
clear of the race conditions. The upthread idea of looking at vxid
instead of xid might help, except that I see we clear both of them
in ProcArrayClearTransaction. We'd need some state in PGPROC that
isn't cleared till later than that.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2014-04-14 16:53:57 Re: Signaling of waiting for a cleanup lock?
Previous Message Robert Haas 2014-04-14 16:48:48 Re: Minor improvements in create_foreign_table.sgml