From: | "Joe Chang" <jchang(at)greenplum(dot)com> |
---|---|
To: | "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Heikki Linnakangas" <hlinnaka(at)iki(dot)fi> |
Cc: | "Alvaro Herrera" <alvherre(at)dcc(dot)uchile(dot)cl>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Two-phase commit issues |
Date: | 2005-05-18 22:43:20 |
Message-ID: | BEB11318.8D0%jchang@greenplum.com |
Lists: | pgsql-hackers |
Hi,
One thing I would suggest is to start the global transaction at BEGIN, not at
PREPARE. That is the way to be compliant with XA.
Thanks
Joe
On 5/18/05 2:15 PM, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> I've started to look seriously at Heikki's patch for two-phase commit.
> There are a few issues that probably deserve discussion:
>
> * The major missing issue that I've come across so far is that
> subtransaction and multixact state isn't preserved across a crash.
> Assuming that we want to store only top-level XIDs in the shared-memory
> list of prepared XIDs (which I think is important), it is essential that
> crash restart rebuild the pg_subxact status for prepared transactions.
> The subxacts of a prepared xact have to be seen as still running, and
> they won't be unless the subxact links are there. Since subxact.c is
> designed to wipe all its state on restart, we need to recreate those
> entries. Fortunately this doesn't seem hard: the state file for a
> prepared xact will include all of its subxact XIDs, and we can just
> do SubTransSetParent() on them while rereading the state file. (AFAICS
> it's sufficient to make each subxact link directly to the top XID, even
> if there was a more complex hierarchy originally.) Similarly, we've got
> to reconstruct MultiXactIds that any prepared xacts are members of, else
> row-level locks taken out by prepared xacts won't be enforced correctly.
> I think this can be handled if we add to the state files a list of all
> MultiXactIds that each prepared xact belongs to, and then during restart
> forcibly recreate those MultiXactIds. (They would only be rebuilt with
> prepared XIDs, not any ordinary XIDs that might originally have been
> members.) This seems to require some new code in multixact.c, but not
> anything fundamentally difficult --- Alvaro, do you see any likely
> problems in this stuff?
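The restart step described above can be modeled roughly as follows. This is only a toy sketch in Python, not PostgreSQL's C code: `PreparedStateFile`, `rebuild_subxact_links`, `top_level_xid` and `is_running` are hypothetical names invented for illustration, and a plain dict stands in for the subxact parent links that SubTransSetParent() would record.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Set


@dataclass
class PreparedStateFile:
    """Hypothetical stand-in for the on-disk state of one prepared xact."""
    top_xid: int
    subxact_xids: List[int] = field(default_factory=list)


def rebuild_subxact_links(state_files: Iterable[PreparedStateFile]) -> Dict[int, int]:
    """Recreate child -> parent links while rereading the state files.

    Each subxact is linked directly to the top-level XID, ignoring any
    deeper hierarchy that existed before the crash.
    """
    parent_of: Dict[int, int] = {}
    for state in state_files:
        for sub_xid in state.subxact_xids:
            parent_of[sub_xid] = state.top_xid
    return parent_of


def top_level_xid(xid: int, parent_of: Dict[int, int]) -> int:
    """Follow parent links until reaching an XID that has no parent."""
    while xid in parent_of:
        xid = parent_of[xid]
    return xid


def is_running(xid: int, prepared_top_xids: Set[int], parent_of: Dict[int, int]) -> bool:
    """An XID counts as running if its top-level XID is in the prepared list."""
    return top_level_xid(xid, parent_of) in prepared_top_xids
```

With only top-level XIDs kept in the prepared list, a subxact XID is seen as running only if the links have been rebuilt; with an empty link map the same lookup fails.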
>
> * The patch is designed to dump state files into WAL as well as onto
> disk. Why? Wouldn't it be better just to write and fsync the state
> file before reporting successful prepare? That would get rid of the
> need for checkpoint-time fsyncs.
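The alternative raised above, writing and fsyncing the state file before reporting a successful prepare, might look something like this. It is a minimal Python sketch under assumptions: `write_state_file_durably` and the payload format are hypothetical, and real code would also have to decide how to handle the containing directory and partial files left by a crash.

```python
import os


def write_state_file_durably(path: str, payload: bytes) -> None:
    """Write the payload and fsync it before returning.

    The caller would report a successful prepare only after this
    function returns without raising.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        view = memoryview(payload)
        # os.write may write fewer bytes than requested, so loop until done.
        while len(view) > 0:
            written = os.write(fd, view)
            view = view[written:]
        # Force the file contents to stable storage.
        os.fsync(fd)
    finally:
        os.close(fd)
```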
>
> * I'm inclined to think that the "gid" identifiers for prepared
> transactions ought to be SQL identifiers (names), not string literals.
> Was there a particular reason for making them strings?
>
> * What are we going to do with GUC variables? My feeling is that
> the only sane answer is that PREPARE is the same as COMMIT as far as
> local GUC variables go, and COMMIT/ROLLBACK PREPARED have no effect
> on GUC state. Otherwise it's really unclear what to do. Consider
>     SET myvar = foo;
>     BEGIN;
>     SET myvar = bar;
>     PREPARE gid;
>     SHOW myvar;       -- what do you see ... foo or bar?
>     SET myvar = baz;  -- is this even legal?
>     ROLLBACK PREPARED gid;
>     SHOW myvar;       -- now what do you see ... foo or baz?
> Since local GUC changes aren't going to be saved/restored across a
> crash anyway, I can't see a point in doing anything really complex.
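To make the proposed rule concrete, here is a toy Python model of it: PREPARE treats local GUC changes exactly as COMMIT would, and COMMIT/ROLLBACK PREPARED leave GUC state alone. `SessionGucModel` and its methods are hypothetical and only mimic the session-level behavior being discussed; they are not how the server implements GUC handling.

```python
class SessionGucModel:
    """Toy model of one session's GUC settings under the proposed rule."""

    def __init__(self):
        self.committed = {}   # values that survive outside a transaction
        self.pending = None   # in-transaction changes, or None if idle

    def begin(self):
        self.pending = {}

    def set(self, name, value):
        if self.pending is not None:
            self.pending[name] = value
        else:
            self.committed[name] = value

    def show(self, name):
        if self.pending is not None and name in self.pending:
            return self.pending[name]
        return self.committed[name]

    def commit(self):
        self.committed.update(self.pending)
        self.pending = None

    def rollback(self):
        self.pending = None

    def prepare(self, gid):
        # Proposed rule: for local GUC variables, PREPARE behaves like COMMIT.
        self.commit()

    def commit_prepared(self, gid):
        # Proposed rule: no effect on GUC state.
        pass

    def rollback_prepared(self, gid):
        # Proposed rule: no effect on GUC state.
        pass
```

Replaying the example under this model, the first SHOW returns bar, the SET to baz is an ordinary out-of-transaction SET, and the final SHOW returns baz.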
>
> * There are some fairly ugly cases associated with creation and deletion
> of temporary tables as well. I think we might want to just decree that
> you can't PREPARE a transaction that included creating or dropping a
> temp table. Does anyone have much of a problem with that?
>
> regards, tom lane