From: | Jan Wieck <jan(at)wi3ck(dot)info> |
---|---|
To: | Andres Freund <andres(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, PostgreSQL Development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Problem with txid_snapshot_in/out() functionality |
Date: | 2014-04-12 18:10:13 |
Message-ID: | 53498185.6060404@wi3ck.info |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 04/12/14 10:03, Andres Freund wrote:
> On 2014-04-12 09:47:24 -0400, Tom Lane wrote:
>> Heikki Linnakangas <hlinnakangas(at)vmware(dot)com> writes:
>> > On 04/12/2014 12:07 AM, Jan Wieck wrote:
>> >> the Slony team has been getting seldom reports of a problem with the
>> >> txid_snapshot data type.
>> >> The symptom is that a txid_snapshot on output lists the same txid
>> >> multiple times in the xip list part of the external representation.
>>
>> > It's two-phase commit. When preparing a transaction, the state of the
>> > transaction is first transfered to a dummy PGXACT entry, and then the
>> > PGXACT entry of the backend is cleared. There is a transient state when
>> > both PGXACT entries have the same xid.
>>
>> Hm, yeah, but why is that intermediate state visible to anyone else?
>> Don't we have exclusive lock on the PGPROC array while we're doing this?
>
> It's done outside the remit of ProcArray lock :(. And documented to lead
> to duplicate xids in PGXACT.
> EndPrepare():
> /*
> * Mark the prepared transaction as valid. As soon as xact.c marks
> * MyPgXact as not running our XID (which it will do immediately after
> * this function returns), others can commit/rollback the xact.
> *
> * NB: a side effect of this is to make a dummy ProcArray entry for the
> * prepared XID. This must happen before we clear the XID from MyPgXact,
> * else there is a window where the XID is not running according to
> * TransactionIdIsInProgress, and onlookers would be entitled to assume
> * the xact crashed. Instead we have a window where the same XID appears
> * twice in ProcArray, which is OK.
> */
> MarkAsPrepared(gxact);
>
> It doesn't sound too hard to essentially move PrepareTransaction()'s
> ProcArrayClearTransaction() into MarkAsPrepared() and rejigger the
> locking to remove the intermediate state. But I think it'll lead to
> bigger changes than we'd be comfortable backpatching.
>
>> If we don't, aren't we letting other backends see non-self-consistent
>> state in regards to who holds which locks, for example?
>
> I think that actually works out ok, because the locks aren't owned by
> xids/xacts, but procs. Otherwise we'd be in deep trouble in
> CommitTransaction() as well where ProcArrayEndTransaction() clearing
> that state.
> After the whole xid transfer, there's PostPrepare_Locks() transferring
> the locks.
Since it doesn't seem to produce any side effects, I'd think that making
the snapshot unique within txid_current_snapshot() and filtering
duplicates on input should be sufficient and eligible for backpatching.
The attached patch adds a unique loop to the internal sort_snapshot()
function and skips duplicates on input. The git commit is here:
https://github.com/wieck/postgres/commit/a88a2b2c25b856478d7e2b012fc718106338fe00
Regards,
Jan
--
Jan Wieck
Senior Software Engineer
http://slony.info
Attachment | Content-Type | Size |
---|---|---|
txid_snapshot_filter_duplicate_xip.diff | text/plain | 3.6 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Marco Atzeri | 2014-04-12 18:54:57 | Re: test failure on latest source |
Previous Message | Andres Freund | 2014-04-12 17:48:00 | Re: test failure on latest source |