From: | Fujii Masao <masao(dot)fujii(at)gmail(dot)com> |
---|---|
To: | Heikki Linnakangas <hlinnakangas(at)vmware(dot)com> |
Cc: | Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: [BUG?] lag of minRecoveryPont in archive recovery |
Date: | 2012-12-11 15:04:34 |
Message-ID: | CAHGQGwHB-ToOv1W52YRBX9+5vW2B5BvHad0XjfW6FfYS=5F-=Q@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Tue, Dec 11, 2012 at 6:14 PM, Heikki Linnakangas
<hlinnakangas(at)vmware(dot)com> wrote:
> On 11.12.2012 08:07, Kyotaro HORIGUCHI wrote:
>>>
>>> The cause is that CheckRecoveryConsistency() is called before rm_redo(),
>>> as Horiguchi-san suggested upthead. Imagine the case where
>>> minRecoveryPoint is set to the location of the XLOG_SMGR_TRUNCATE
>>> record. When restarting the server with that minRecoveryPoint,
>>> the followings would happen, and then PANIC occurs.
>>>
>>> 1. XLOG_SMGR_TRUNCATE record is read.
>>> 2. CheckRecoveryConsistency() is called, and database is marked as
>>> consistent since we've reached minRecoveryPoint (i.e., the location
>>> of XLOG_SMGR_TRUNCATE).
>>> 3. XLOG_SMGR_TRUNCATE record is replayed, and invalid page is
>>> found.
>>> 4. Since the database has already been marked as consistent, an invalid
>>> page leads to PANIC.
>>
>>
>> Exactly.
>>
>> In smgr_redo, EndRecPtr which is pointing the record next to
>> SMGR_TRUNCATE, is used as the new minRecoveryPoint. On the other
>> hand, during the second startup of the standby,
>> CheckRecoveryConsistency checks for consistency by
>> XLByteLE(minRecoveryPoint, EndRecPtr) which should be true at
>> just BEFORE the SMGR_TRUNCATE record is applied. So
>> reachedConsistency becomes true just before the SMGR_TRUNCATE
>> record will be applied. Bang!
>
>
> Ah, I see. I thought moving CheckRecoveryConsistency was just a nice-to-have
> thing, so that we'd notice that we're consistent earlier, so didn't pay
> attention to that part.
>
> I think the real issue here is that CheckRecoveryConsistency should not be
> comparing minRecoveryPoint against recoveryLastRecPtr, not EndRecPtr as it
> currently does. EndRecPtr points to the end of the last record *read*, while
> recoveryLastRecPtr points to the end of the last record *replayed*. The
> latter is what CheckRecoveryConsistency should be concerned with.
>
> BTW, it occurs to me that we have two variables in the shared struct that
> are almost the same thing: recoveryLastRecPtr and replayEndRecPtr. The
> former points to the end of the last record successfully replayed, while
> replayEndRecPtr points to the end of the last record successfully replayed,
> or begin replayed at the moment. I find that confusing, so I suggest that we
> rename recoveryLastRecPtr to make that more clear. Included in the attached
> patch.
The patch looks good. I confirmed that the PANIC error didn't happen,
with the patch.
xlog.c
> * Initialize shared replayEndRecPtr, recoveryLastRecPtr, and
s/recoveryLastRecPtr/lastReplayedEndRecPtr
Regards,
--
Fujii Masao
From | Date | Subject | |
---|---|---|---|
Next Message | Bruce Momjian | 2012-12-11 15:06:50 | Re: [WIP] pg_ping utility |
Previous Message | Andres Freund | 2012-12-11 14:47:10 | Re: pg_dump cosmetic problem while dumping/restoring rules |