From: | Simon Riggs <simon(at)2ndQuadrant(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | tarvip(at)gmail(dot)com, pgsql-bugs(at)postgresql(dot)org |
Subject: | Re: BUG #7710: Xid epoch is not updated properly during checkpoint |
Date: | 2012-12-02 12:51:57 |
Message-ID: | CA+U5nM+qw3b74FrkY7z2bK73Q6svtCHLiGb3ZrHwXwLRp_nuDQ@mail.gmail.com |
Lists: | pgsql-bugs |
On 1 December 2012 22:56, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> tarvip(at)gmail(dot)com writes:
>> [ txid_current can show a bogus value near XID wraparound ]
>> This happens only if wal_level=hot_standby.
>
> I believe what is happening here is
>
> (1) CreateCheckPoint sets up checkPoint.nextXid and
> checkPoint.nextXidEpoch, near xlog.c line 7070 in HEAD. At this point,
> nextXid is still a bit less than the wrap point.
>
> (2) After performing the checkpoint, at line 7113, CreateCheckPoint
> calls LogStandbySnapshot() which "helpfully" updates checkPoint.nextXid
> to the latest value. Which by now has wrapped around. But it doesn't
> fix checkPoint.nextXidEpoch, so the checkpoint that gets written out has
> effectively lost the epoch bump that should have happened.
>
> While we could add some more logic to try to correct the epoch value
> in this scenario, I think it's a much better idea to just stop having
> LogStandbySnapshot update the nextXid. That seems to me to be useless
> complication. I also quite dislike the fact that we're effectively
> redefining the checkpoint nextXid from being taken before the main
> body of the checkpoint to being taken afterwards, but *only* in
> XLogStandbyInfoActive mode. If that inconsistency isn't already causing
> bugs (besides this one) today, it'll probably cause them in the future.

I agree that the coding looks weird and agree it shouldn't be there.
The meaning of the checkpoint values should not differ because
wal_level has changed.

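For anyone following along, the sequence in (1) and (2) above can be sketched in a few lines of standalone C. This is only an illustration under stated assumptions, not the code in xlog.c: the struct and every `sketch_*` function name are hypothetical, the XID values are made up, and the epoch rule (bump the epoch when the new nextXid is numerically smaller than the previous checkpoint's nextXid) is my simplified reading of how the checkpoint decides a wrap has occurred.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the two checkpoint fields discussed. */
typedef struct
{
    uint32_t nextXid;       /* 32-bit XID counter value stored in the checkpoint */
    uint32_t nextXidEpoch;  /* number of wraparounds believed to have happened */
} SketchCheckPoint;

/*
 * Step (1): capture nextXid at the start of the checkpoint and derive the
 * epoch by comparing against the previous checkpoint (assumed rule).
 */
SketchCheckPoint
sketch_begin_checkpoint(SketchCheckPoint prev, uint32_t current_next_xid)
{
    SketchCheckPoint cp;

    cp.nextXid = current_next_xid;
    cp.nextXidEpoch = prev.nextXidEpoch;
    if (cp.nextXid < prev.nextXid)
        cp.nextXidEpoch++;      /* counter went backwards, so it wrapped */
    return cp;
}

/*
 * Step (2), the behaviour under discussion: nextXid is replaced by a later
 * value after the main body of the checkpoint, but the epoch is left alone.
 */
void
sketch_overwrite_next_xid(SketchCheckPoint *cp, uint32_t later_next_xid)
{
    cp->nextXid = later_next_xid;
}

/*
 * Run the scenario and return the epoch recorded by the following
 * checkpoint. Exactly one wraparound happens along the way.
 */
uint32_t
sketch_epoch_after_wrap(int overwrite_in_step_two)
{
    SketchCheckPoint prev = {0xFFFFFF00u, 0};   /* previous checkpoint */
    SketchCheckPoint cp = sketch_begin_checkpoint(prev, 0xFFFFFFF0u);
    SketchCheckPoint next;

    assert(cp.nextXidEpoch == 0);   /* nothing has wrapped yet at step (1) */

    if (overwrite_in_step_two)
        sketch_overwrite_next_xid(&cp, 5);  /* counter wrapped meanwhile */

    /* The following checkpoint compares against whatever was written out. */
    next = sketch_begin_checkpoint(cp, 1000);
    return next.nextXidEpoch;
}
```

In this sketch, `sketch_epoch_after_wrap(1)` leaves the epoch at 0 even though the counter wrapped once, because the following checkpoint compares 1000 against the already-wrapped value 5 and sees no decrease. `sketch_epoch_after_wrap(0)`, where nextXid keeps the value captured in step (1), yields an epoch of 1.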
> So barring objections, I'm going to remove LogStandbySnapshot's behavior
> of returning the updated nextXid.

Removing it may cause other bugs, but if so, those other bugs need to
be solved in the right way, not by having a too-far-forwards nextXid
on the checkpoint record. Having said that, I can't see any bugs that
would be caused by this.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services