From: | Heikki Linnakangas <hlinnakangas(at)vmware(dot)com> |
---|---|
To: | katsumata(dot)tomonari(at)po(dot)ntts(dot)co(dot)jp |
Cc: | pgsql-bugs(at)postgresql(dot)org |
Subject: | Re: BUG #8686: Standby could not restart. |
Date: | 2013-12-20 16:15:09 |
Message-ID: | 52B46D0D.2070505@vmware.com |
Lists: | pgsql-bugs |
On 12/19/2013 04:57 AM, katsumata(dot)tomonari(at)po(dot)ntts(dot)co(dot)jp wrote:
> At first, I doubted the recovery state reached "consistent" before redo
> starts.
> And then I checked pg_control and related WAL.
> The WAL sequence is like below.
>
>
> WAL--(a)--(b)--(c)--(d)--(e)-->
> ================================================
> (a) Latest checkpoint's REDO location
> 1/783B230
>
>
> (b) hot_update
> 1/7842010
>
>
> (c) truncate
> 1/8E7E5C8
>
>
> (d) Latest checkpoint location
> 1/8E7F0B0
>
>
> (e) Minimum recovery ending location
> 1/8E7F110
> ================================================
>
>
> From these things, I found it has happened with this scenario.
> ----------
> (1) standby starting
> (2) seeking checkpoint location 1/8E7F0B0 because backup_label is not
> absent
> (3) reachedConsistency is set to true at 1/8E7F110 in
> CheckRecoveryConsistent
> (4) redo start from 1/783B230
> (5) PANIC at 1/7842010 because reachedConsistency has set already and
> operating against a block which will be truncated at (c).
> ----------
>
> At step(2), EndRecPtr is set to 1/8E7F110(next to 1/8E7F0B0),
> so reachedConsistency is set to true at step(3).
Yep. Thanks for a good explanation.
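The premature flip in steps (2) and (3) can be reproduced with a toy model. This is a minimal sketch, not the PostgreSQL source: `Lsn`, `make_lsn`, `reached_consistency` and `consistent_before_redo` are hypothetical names, and the test is assumed to be a plain "minimum recovery point <= end of last record read" comparison, as the report describes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t Lsn;   /* toy stand-in for a WAL position (hypothetical) */

/* Build a position from the "hi/lo" hex notation used in the report. */
static Lsn make_lsn(uint32_t hi, uint32_t lo)
{
    return ((Lsn) hi << 32) | lo;
}

/* Toy consistency test (an assumption, not the real function): true once
 * the end of the last record read is at or past the minimum recovery point. */
static bool reached_consistency(Lsn min_recovery_point, Lsn end_rec_ptr)
{
    return min_recovery_point <= end_rec_ptr;
}

/* State after step (2): the checkpoint record (d) has been read, but redo
 * from (a) has not started. Returns what the toy flag would be set to. */
static bool consistent_before_redo(void)
{
    Lsn redo_start   = make_lsn(1, 0x783B230);  /* (a) REDO location */
    Lsn min_recovery = make_lsn(1, 0x8E7F110);  /* (e) minimum recovery point */
    Lsn end_rec_ptr  = make_lsn(1, 0x8E7F110);  /* end of checkpoint record (d) */

    /* Everything between (a) and (e) is still unreplayed at this point. */
    assert(redo_start < min_recovery);

    return reached_consistency(min_recovery, end_rec_ptr);
}
```

In this model `consistent_before_redo()` returns true even though the records at (b) and (c) have not been replayed, which matches the PANIC at step (5).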
> I think there is no need to increase EndRecPtr while seeking checkpoint
> location.
> I tried to revise it and this worked fine.
Hmm. There's this comment in StartupXLOG, after reading the checkpoint
record, but before reading the first record at REDO point:
> /*
>  * Initialize shared replayEndRecPtr, lastReplayedEndRecPtr, and
>  * recoveryLastXTime.
>  *
>  * This is slightly confusing if we're starting from an online
>  * checkpoint; we've just read and replayed the checkpoint record, but
>  * we're going to start replay from its redo pointer, which precedes
>  * the location of the checkpoint record itself. So even though the
>  * last record we've replayed is indeed ReadRecPtr, we haven't
>  * replayed all the preceding records yet. That's OK for the current
>  * use of these variables.
>  */
> SpinLockAcquire(&xlogctl->info_lck);
> xlogctl->replayEndRecPtr = ReadRecPtr;
> xlogctl->lastReplayedEndRecPtr = EndRecPtr;
> xlogctl->recoveryLastXTime = 0;
> xlogctl->currentChunkStartTime = 0;
> xlogctl->recoveryPause = false;
> SpinLockRelease(&xlogctl->info_lck);
I think we need to fix that confusion. Your patch does so by not
setting EndRecPtr yet. That fixes the bug, but it leaves those variables
in a slightly strange state: I'm not sure what EndRecPtr points to in
that case (0?), though I guess ReadRecPtr would still be set.

Perhaps we should instead reset replayEndRecPtr and lastReplayedEndRecPtr
to the REDO point here, rather than to ReadRecPtr/EndRecPtr.
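A rough sketch of that idea, as a toy model rather than a patch: `ReplayState` and all function names below are hypothetical, and the model assumes the consistency test reads the last-replayed pointer, which the real code may or may not do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t Lsn;   /* toy stand-in for a WAL position (hypothetical) */

typedef struct ReplayState
{
    Lsn replay_end;         /* models replayEndRecPtr */
    Lsn last_replayed_end;  /* models lastReplayedEndRecPtr */
} ReplayState;

static Lsn make_lsn(uint32_t hi, uint32_t lo)
{
    return ((Lsn) hi << 32) | lo;
}

/* Behaviour in the quoted code: the pointers describe the checkpoint
 * record that was just read. */
static void init_from_checkpoint_record(ReplayState *s, Lsn read_rec_ptr,
                                        Lsn end_rec_ptr)
{
    s->replay_end = read_rec_ptr;
    s->last_replayed_end = end_rec_ptr;
}

/* Suggested alternative: both pointers start at the REDO point. */
static void init_from_redo_point(ReplayState *s, Lsn redo)
{
    s->replay_end = redo;
    s->last_replayed_end = redo;
}

/* Toy replay step: records are replayed in order and move both pointers. */
static void replay_record(ReplayState *s, Lsn rec_start, Lsn rec_end)
{
    assert(rec_start >= s->last_replayed_end && rec_end > rec_start);
    s->replay_end = rec_end;
    s->last_replayed_end = rec_end;
}

/* Assumed consistency test, based on the last-replayed pointer. */
static bool reached_consistency(const ReplayState *s, Lsn min_recovery_point)
{
    return min_recovery_point <= s->last_replayed_end;
}
```

With the LSNs from the report, initializing from the checkpoint record makes the toy test pass immediately, while initializing from the REDO point keeps it false until a record ending at 1/8E7F110 has been replayed.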
- Heikki