Re: BUG #7710: Xid epoch is not updated properly during checkpoint

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: tarvip(at)gmail(dot)com, pgsql-bugs(at)postgresql(dot)org, Andres Freund <andres(at)2ndQuadrant(dot)com>
Subject: Re: BUG #7710: Xid epoch is not updated properly during checkpoint
Date: 2012-12-02 15:25:20
Message-ID: 11810.1354461920@sss.pgh.pa.us
Lists: pgsql-bugs

Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
> I've applied an absolutely minimal fix on this, which introduces no
> other changes that could cause unforeseen consequences.

This is not what we'd agreed to do, I thought.

Now that I've thought more about this bug, the existing coding is flat
out wrong, with or without correction of the epoch. As you yourself
just wrote in a comment, the checkpoint record logically belongs to the
"redo" point in the WAL stream, not to where it's physically located.
Having it carry a nextXid that belongs to the later point is simply
wrong. Having it carry different nextXids depending on wal_level is
even more wrong.

I can point right now to one misbehavior this causes: if you run a
point-in-time recovery with a stop point somewhere in the middle of the
checkpoint, you should end up with a nextXid corresponding to the stop
point. This hack in LogStandbySnapshot causes you to end up with a
much later nextXid, if you were running hot-standby.
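
To make the point-in-time recovery case concrete, here is a toy model in C.
It is only a sketch and not PostgreSQL source: every name in it
(ToyCheckpoint, toy_next_xid_at, and so on) is hypothetical, and it assumes
for illustration that one xid is consumed per 10 bytes of WAL and that xids
never wrap around. It compares a checkpoint whose nextXid is taken at the
redo point with one whose nextXid is taken where the record is physically
written.

```c
#include <stdint.h>

/* Toy model only; none of these names exist in PostgreSQL. */
typedef struct {
    uint64_t redo_lsn;  /* WAL position where replay of this checkpoint starts */
    uint32_t next_xid;  /* nextXid value stored in the checkpoint record */
} ToyCheckpoint;

/* Illustrative assumption: xids start at 100, one consumed per 10 WAL bytes. */
static uint32_t toy_next_xid_at(uint64_t lsn)
{
    return 100 + (uint32_t) (lsn / 10);
}

/* nextXid captured at the redo point, where the checkpoint logically belongs. */
static ToyCheckpoint toy_checkpoint_at_redo(uint64_t redo_lsn, uint64_t record_lsn)
{
    (void) record_lsn;  /* physical location is deliberately ignored */
    ToyCheckpoint c = { redo_lsn, toy_next_xid_at(redo_lsn) };
    return c;
}

/* nextXid captured at the later point where the record is physically written. */
static ToyCheckpoint toy_checkpoint_at_record(uint64_t redo_lsn, uint64_t record_lsn)
{
    ToyCheckpoint c = { redo_lsn, toy_next_xid_at(record_lsn) };
    return c;
}

/*
 * Replay from the checkpoint up to stop_lsn. Replay can only push nextXid
 * forward past the xids it sees, so the result is the larger of the
 * checkpoint's value and the value implied by the stop point.
 * (Plain integer comparison: wraparound is ignored in this sketch.)
 */
static uint32_t toy_recover_next_xid(ToyCheckpoint c, uint64_t stop_lsn)
{
    uint32_t seen = toy_next_xid_at(stop_lsn);
    return seen > c.next_xid ? seen : c.next_xid;
}
```

With a redo point at 1000, the checkpoint record written at 2000, and a
recovery stop point at 1500, the first variant recovers to the nextXid of the
stop point, while the second ends up with the nextXid of position 2000, which
is past anything the recovered cluster has actually replayed.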

> Others may wish to go further, overriding my patches, as they choose.

Okay, I will take the responsibility for changing this, but it needs to
change. This coding was ill-considered from the word go.

regards, tom lane
