From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | tarvip(at)gmail(dot)com |
Cc: | pgsql-bugs(at)postgresql(dot)org, Simon Riggs <simon(at)2ndquadrant(dot)com> |
Subject: | Re: BUG #7710: Xid epoch is not updated properly during checkpoint |
Date: | 2012-12-01 22:56:33 |
Message-ID: | 24367.1354402593@sss.pgh.pa.us |
Lists: | pgsql-bugs |
tarvip(at)gmail(dot)com writes:
> [ txid_current can show a bogus value near XID wraparound ]
> This happens only if wal_level=hot_standby.
I believe what is happening here is
(1) CreateCheckPoint sets up checkPoint.nextXid and
checkPoint.nextXidEpoch, near xlog.c line 7070 in HEAD. At this point,
nextXid is still a bit less than the wrap point.
(2) After performing the checkpoint, at line 7113, CreateCheckPoint
calls LogStandbySnapshot(), which "helpfully" updates checkPoint.nextXid
to the latest value, which by now has wrapped around. But it doesn't
fix checkPoint.nextXidEpoch, so the checkpoint that gets written out has
effectively lost the epoch bump that should have happened.
While we could add some more logic to try to correct the epoch value
in this scenario, I think it's a much better idea to just stop having
LogStandbySnapshot update the nextXid. That seems to me to be useless
complication. I also quite dislike the fact that we're effectively
redefining the checkpoint nextXid from being taken before the main
body of the checkpoint to being taken afterwards, but *only* in
XLogStandbyInfoActive mode. If that inconsistency isn't already causing
bugs (besides this one) today, it'll probably cause them in the future.
So barring objections, I'm going to remove LogStandbySnapshot's behavior
of returning the updated nextXid.
regards, tom lane