From: | Alvaro Herrera <alvherre(at)dcc(dot)uchile(dot)cl> |
---|---|
To: | Simon Riggs <simon(at)2ndquadrant(dot)com> |
Cc: | Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: XLog: how to log? |
Date: | 2004-05-11 21:26:00 |
Message-ID: | 20040511212600.GA5878@dcc.uchile.cl |
Lists: | pgsql-hackers |
On Tue, May 11, 2004 at 09:25:37PM +0100, Simon Riggs wrote:
> On Tue, 2004-05-11 at 16:33, Bruce Momjian wrote:
> > Tom Lane wrote:
> > > Alvaro Herrera <alvherre(at)dcc(dot)uchile(dot)cl> writes:
> > > > Hmm ... I think it should be forbidden to quote a subtrans Xid as
> > > > rollforward point. Not sure if that can be done though, or how to do
> > > > it.
> > >
> > > Seems like a nonissue, unless the XLOG trace makes a subtrans look the
> > > same as a main trans, which it'd not do would it?
>
> I agree that a subtrans xid should not be a valid rollforward point.
If you try to do that you'll fail, because there will be no XLog record
signalling the commit of a subtransaction. Subtransactions will be marked
committed as necessary as a byproduct of the main transaction committing.
> Currently, recovery loops until end of xlogs. There is no exit condition
> from the loop. There is not currently a timestamp on the xlogs -
> anywhere apart from the file date on each xlog.
Both xact commit and abort have timestamps in the XLog. I think valid
recovery points are transaction commit/abort, not transaction start.
> If we go searching for a particular Xid, there is no way to tell whether
> an Xid suggested by a user is too big or too small for use as a recovery
> target. We need to recover - it is the only way to tell; if we find an
> Xid that matches, we stop. If not, we keep going until end of logs, when
> we need to issue a "recovered fully - the Xid you gave was not valid",
> which may take some time and is also very clearly not what was wanted.
I think the user should first examine the logs with whatever tools are
provided, and use a timestamp or an Xid listed in the XLog. If they use
an Xid that's not listed, it's not our fault ...
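
To make the stop condition concrete, here is a rough sketch of a replay loop
that only considers commit/abort records as recovery points. Everything in it
(LogRec, RecoveryTarget, replay_until) is a made-up, simplified stand-in for
illustration, not the real XLog structures or the recovery code. It returns
the index of the record where replay would stop, or the record count if the
target was never seen, which is the "recovered fully" case described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical, simplified log record; NOT the real XLogRecord layout. */
typedef enum { REC_OTHER, REC_XACT_COMMIT, REC_XACT_ABORT } RecKind;

typedef struct {
    RecKind  kind;
    uint32_t xid;    /* top-level transaction id */
    time_t   xtime;  /* only meaningful on commit/abort records */
} LogRec;

typedef enum { TARGET_NONE, TARGET_XID, TARGET_TIME } TargetKind;

typedef struct {
    TargetKind kind;
    uint32_t   xid;        /* used when kind == TARGET_XID */
    time_t     stop_time;  /* used when kind == TARGET_TIME */
} RecoveryTarget;

/*
 * Walk the records in order and return the index of the commit/abort
 * record that matches the target.  Returns n if the end of the log is
 * reached without a match.  Whether the matching record itself gets
 * applied is left to the caller.
 */
size_t replay_until(const LogRec *recs, size_t n, const RecoveryTarget *target)
{
    assert(recs != NULL || n == 0);
    assert(target != NULL);

    for (size_t i = 0; i < n; i++) {
        const LogRec *r = &recs[i];

        /* (a real loop would redo the record here) */

        /* Only transaction end records are valid recovery points. */
        if (r->kind != REC_XACT_COMMIT && r->kind != REC_XACT_ABORT)
            continue;

        if (target->kind == TARGET_XID && r->xid == target->xid)
            return i;
        if (target->kind == TARGET_TIME && r->xtime >= target->stop_time)
            return i;
    }
    return n;  /* target never found: the whole log was replayed */
}
```

Under these assumptions a subtransaction Xid given as target simply never
matches, since it does not appear on any commit/abort record, and replay runs
to the end of the log.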
> b) later, a utility that allowed xlogs to be inspected to allow DBA to
> decide which is the correct Xid to recover to.
Why is this difficult? There are lots of subsys_desc() functions which
already return what's in each log record as a string. The tool could
initially just dump that ...
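
Roughly what I have in mind, as a sketch only: a table of per-subsystem
describe functions and a loop body that formats one line per record. The
names and signatures below (DescFn, desc_table, describe_record, the two
*_desc_sketch functions and the info codes) are invented for the example and
do not match the real subsys_desc() signatures or the resource manager table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical describe-callback type; the real desc functions differ. */
typedef void (*DescFn)(char *buf, size_t buflen, uint8_t info);

enum { RM_XACT = 0, RM_HEAP = 1, RM_COUNT };

/* Hypothetical, simplified log record header. */
typedef struct {
    uint8_t  rmid;  /* which subsystem wrote the record */
    uint8_t  info;  /* subsystem-specific operation code (made up here) */
    uint32_t xid;
} LogRec;

static void xact_desc_sketch(char *buf, size_t buflen, uint8_t info)
{
    snprintf(buf, buflen, "%s", info == 0 ? "commit" : "abort");
}

static void heap_desc_sketch(char *buf, size_t buflen, uint8_t info)
{
    snprintf(buf, buflen, "%s", info == 0 ? "insert" : "other heap op");
}

/* One describe function per subsystem, indexed by rmid. */
static const DescFn desc_table[RM_COUNT] = {
    xact_desc_sketch,
    heap_desc_sketch
};

/*
 * Format one record as "xid N: <description>" into out.
 * Returns the snprintf() result, i.e. the length of the full line.
 */
int describe_record(const LogRec *rec, char *out, size_t outlen)
{
    char desc[64];

    assert(rec != NULL && out != NULL);

    if (rec->rmid >= RM_COUNT)
        return snprintf(out, outlen, "xid %u: unknown rmgr %u",
                        (unsigned) rec->xid, (unsigned) rec->rmid);

    desc_table[rec->rmid](desc, sizeof desc, rec->info);
    return snprintf(out, outlen, "xid %u: %s", (unsigned) rec->xid, desc);
}
```

A first version of the tool would just call something like describe_record()
for every record in a log file and print the result, so the DBA can pick an
Xid or timestamp that really is in the log.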
> Therefore: action on me? - add a timestamp to EACH xlog record -
> something I had been shying away from.
You only need timestamps in xl_xact_commit and xl_xact_abort, which are
already there.
> On Tue, 2004-05-11 at 14:56, Alvaro Herrera wrote:
> > (Unrelated: note that after main transaction commit, a committed
> > subtransaction is indistinguishable from a committed main transaction --
> > and with the current idea of XLog I have, after recovering a transaction
> > tree from XLog there won't be any mark in pg_subtrans. So the system
> > will not be exactly as it was before but it won't matter.)
>
> I don't think we need a subtrans commit directly, since if the top-level
> commits after the subtrans has committed, then we're good.
If the subxact wrote a tuple, its Xid has to be in the pg_clog. Thus
we need to recover the pg_clog write.
> However, if a subtrans aborts, yet the top-level commits there will be
> data written to the database about an aborted transaction. We don't have
> Undo, so the subtrans clog must be updated to show that the subtrans
> aborted, otherwise we would read both the committed (top-level) and the
> uncommitted data (subtrans).
If the aborted subxact wrote a tuple, its Xid has to be in the pg_clog.
> Another way of putting it - if it was worth writing before a crash, it
> is worth recovering after a crash. Surely?
Right. What I was saying is that we don't need pg_subtrans info,
because that's only needed while the subtransaction is marked as
"subcommitted" but its parent hasn't committed or aborted yet. The
subcommitted status is changed to committed/aborted when the main
transaction commits or aborts; at recovery time, we already know if that
happened or not so we can mark it right away.
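
A sketch of what "mark it right away" could look like during replay. It
assumes, and this is only an assumption on my part, that the main
transaction's commit/abort record carries the list of its subtransaction
Xids; ClogSketch, XactEndRec and replay_xact_end are made-up names and the
fixed-size array is just a stand-in for pg_clog.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { XS_IN_PROGRESS = 0, XS_COMMITTED, XS_ABORTED } XidStatus;

#define CLOG_SLOTS 64

/* Toy stand-in for pg_clog: one status per Xid, no "subcommitted" state. */
typedef struct {
    XidStatus status[CLOG_SLOTS];
} ClogSketch;

/*
 * Hypothetical commit/abort record.  The children array is assumed to
 * list the subtransactions that share the fate of the main transaction.
 */
typedef struct {
    uint32_t        xid;
    int             committed;  /* nonzero = commit, 0 = abort */
    const uint32_t *children;
    size_t          nchildren;
} XactEndRec;

static void set_status(ClogSketch *clog, uint32_t xid, XidStatus st)
{
    assert(xid < CLOG_SLOTS);
    clog->status[xid] = st;
}

/*
 * Replay one commit/abort record: the final outcome is already known,
 * so every listed subtransaction goes straight to committed/aborted.
 * The main transaction is marked last.
 */
void replay_xact_end(ClogSketch *clog, const XactEndRec *rec)
{
    assert(clog != NULL && rec != NULL);

    XidStatus outcome = rec->committed ? XS_COMMITTED : XS_ABORTED;

    for (size_t i = 0; i < rec->nchildren; i++)
        set_status(clog, rec->children[i], outcome);
    set_status(clog, rec->xid, outcome);
}
```

In this sketch a subtransaction that aborted while its parent went on to
commit would have to be covered by an abort record of its own, so that its
tuples are not seen as committed.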
--
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"La vida es para el que se aventura"