From: Noah Misch <noah(at)leadboat(dot)com>
To: Daniel Farina <daniel(at)heroku(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: txid failed epoch increment, again, aka 6291
Date: 2012-09-04 15:04:25
Message-ID: 20120904150425.GA28600@tornado.leadboat.com
Lists: pgsql-hackers
On Tue, Sep 04, 2012 at 02:07:30AM -0700, Daniel Farina wrote:
> It seems like this has reproduced once more. And once again, there
> doesn't appear to be any funny business in pg_control (but the structs
> are pasted here for your re-check), and there are successful, sensible
> updates to it. The primary is running 9.0.6.
What version is the standby running?
> However, we do have a new piece of data: there was a very brief period
> where txid_snapshot reported an xmin greater than 2^33, our next
> epoch boundary, by a few thousand transactions. That could be because
> the reporting function GetNextXidAndEpoch does its own epoch
> calculation: before the checkpoint it post-facto adjusts the epoch
> upward, and after a checkpoint that forgets to increment the epoch,
> that post-facto adjustment no longer applies.
That makes sense.
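For anyone following along, that adjustment looks roughly like this
(paraphrased from memory of the 9.0-era GetNextXidAndEpoch in xlog.c;
details may differ slightly across minor versions):

    void
    GetNextXidAndEpoch(TransactionId *xid, uint32 *epoch)
    {
        uint32          ckptXidEpoch;
        TransactionId   ckptXid;
        TransactionId   nextXid;

        /* Read the checkpoint-time values first, else race. */
        SpinLockAcquire(&XLogCtl->info_lck);
        ckptXidEpoch = XLogCtl->ckptXidEpoch;
        ckptXid = XLogCtl->ckptXid;
        SpinLockRelease(&XLogCtl->info_lck);

        nextXid = ReadNewTransactionId();

        /*
         * nextXid is logically later than ckptXid, so if it is
         * numerically smaller it must have wrapped into the next
         * epoch; compensate.
         */
        if (nextXid < ckptXid)
            ckptXidEpoch++;

        *xid = nextXid;
        *epoch = ckptXidEpoch;
    }

Once a buggy checkpoint advances ckptXid past the wraparound point
without bumping ckptXidEpoch, the "nextXid < ckptXid" test stops
firing and the reported epoch silently drops back by one, which
matches the brief window you saw.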
> I've been reviewing the epoch-increment logic in CreateCheckPoint on
> and off for a couple of days, but so far I haven't come up with a
> convincing failure mechanism. However, given that this bug has
> historically been more likely to surface than *not* surface on this
> system, perhaps we can try for a sometimes-reproducing test case. I'm
> still struggling for a hint of a solution, though, so toss your
> thoughts here.
The cause is not apparent to me, either, and simple tests here do not
reproduce the problem.
I am suspicious of xlog_redo() updating ControlFile->checkPointCopy without
acquiring ControlFileLock. If a restartpoint is finishing at about the same
time, ControlFile->checkPointCopy.nextXid might come from the just-read
checkpoint record while ControlFile->checkPointCopy.nextXidEpoch bears the
value from the older checkpoint just adopted as a restartpoint. The resulting
inconsistency would, however, vanish at the next ControlFile->checkPointCopy
update. This does not explain the symptoms you have reported, and it cannot
explain much of anything on a primary.
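To illustrate the window I have in mind, here is a sketch of the
suspected interleaving (paraphrasing the 9.0-era code paths from
memory; not verified against your build):

    Startup process, in xlog_redo(), replaying the newer checkpoint
    record with no ControlFileLock held:

        ControlFile->checkPointCopy.nextXidEpoch = checkPoint.nextXidEpoch;
        ControlFile->checkPointCopy.nextXid = checkPoint.nextXid;

    Bgwriter, in CreateRestartPoint(), installing the older checkpoint
    as a restartpoint at about the same time:

        LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
        ControlFile->checkPointCopy = lastCheckPoint;  /* older copy */
        UpdateControlFile();
        LWLockRelease(ControlFileLock);

If the restartpoint's struct copy lands between the two field stores
in xlog_redo(), pg_control briefly pairs the new nextXid with the old
nextXidEpoch.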
> (gdb) p *ControlFile
> $4 = {system_identifier = 5613733157253676693,
>   pg_control_version = 903, catalog_version_no = 201008051,
>   state = DB_IN_ARCHIVE_RECOVERY, time = 1346746898, checkPoint =
This capture, it seems, is from a standby.
>   {xlogid = 18751, xrecoff = 1072693824},
>   prevCheckPoint = {xlogid = 18751, xrecoff = 1072693824},
>   checkPointCopy = {redo = {xlogid = 18751, xrecoff = 602482536},
>     ThisTimeLineID = 6, nextXidEpoch = 1, nextXid = 9904084,
>     nextOid = 2047524, nextMulti = 1119, nextMultiOffset = 3115,
>     oldestXid = 4115479553, oldestXidDB = 1, time = 1346746496,
>     oldestActiveXid = 9558248},
You expected checkPointCopy = { ... nextXidEpoch = 2, ... }, correct? (Well,
nextXidEpoch = 3 if you count the previous missed increment.) Does pg_control
on the primary also bear epoch 1 where epoch 2 is expected?
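(For reference, the SQL-level txid is computed as (epoch << 32) | xid --
see convert_xid() in src/backend/utils/adt/txid.c -- so epoch 1 with
nextXid = 9904084 corresponds to txid 4304871380, while epoch 2 would
put values at or above 2^33 = 8589934592, consistent with the >2^33
xmin you briefly saw.)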
Your last restartpoint was exactly five minutes before your last checkpoint,
so there's no evidence of a dearth of safe restartpoint opportunities.
nm