Quick Links

Re: [BUG?] lag of minRecoveryPont in archive recovery

From:	Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To:	Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc:	Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: [BUG?] lag of minRecoveryPont in archive recovery
Date:	2012-12-11 15:04:34
Message-ID:	CAHGQGwHB-ToOv1W52YRBX9+5vW2B5BvHad0XjfW6FfYS=5F-=Q@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Tue, Dec 11, 2012 at 6:14 PM, Heikki Linnakangas
<hlinnakangas(at)vmware(dot)com> wrote:
> On 11.12.2012 08:07, Kyotaro HORIGUCHI wrote:
>>>
>>> The cause is that CheckRecoveryConsistency() is called before rm_redo(),
>>> as Horiguchi-san suggested upthead. Imagine the case where
>>> minRecoveryPoint is set to the location of the XLOG_SMGR_TRUNCATE
>>> record. When restarting the server with that minRecoveryPoint,
>>> the followings would happen, and then PANIC occurs.
>>>
>>> 1. XLOG_SMGR_TRUNCATE record is read.
>>> 2. CheckRecoveryConsistency() is called, and database is marked as
>>> consistent since we've reached minRecoveryPoint (i.e., the location
>>> of XLOG_SMGR_TRUNCATE).
>>> 3. XLOG_SMGR_TRUNCATE record is replayed, and invalid page is
>>> found.
>>> 4. Since the database has already been marked as consistent, an invalid
>>> page leads to PANIC.
>>
>>
>> Exactly.
>>
>> In smgr_redo, EndRecPtr which is pointing the record next to
>> SMGR_TRUNCATE, is used as the new minRecoveryPoint. On the other
>> hand, during the second startup of the standby,
>> CheckRecoveryConsistency checks for consistency by
>> XLByteLE(minRecoveryPoint, EndRecPtr) which should be true at
>> just BEFORE the SMGR_TRUNCATE record is applied. So
>> reachedConsistency becomes true just before the SMGR_TRUNCATE
>> record will be applied. Bang!
>
>
> Ah, I see. I thought moving CheckRecoveryConsistency was just a nice-to-have
> thing, so that we'd notice that we're consistent earlier, so didn't pay
> attention to that part.
>
> I think the real issue here is that CheckRecoveryConsistency should not be
> comparing minRecoveryPoint against recoveryLastRecPtr, not EndRecPtr as it
> currently does. EndRecPtr points to the end of the last record *read*, while
> recoveryLastRecPtr points to the end of the last record *replayed*. The
> latter is what CheckRecoveryConsistency should be concerned with.
>
> BTW, it occurs to me that we have two variables in the shared struct that
> are almost the same thing: recoveryLastRecPtr and replayEndRecPtr. The
> former points to the end of the last record successfully replayed, while
> replayEndRecPtr points to the end of the last record successfully replayed,
> or begin replayed at the moment. I find that confusing, so I suggest that we
> rename recoveryLastRecPtr to make that more clear. Included in the attached
> patch.

The patch looks good. I confirmed that the PANIC error didn't happen,
with the patch.

xlog.c
> * Initialize shared replayEndRecPtr, recoveryLastRecPtr, and

s/recoveryLastRecPtr/lastReplayedEndRecPtr

Regards,

--
Fujii Masao

In response to

Re: [BUG?] lag of minRecoveryPont in archive recovery at 2012-12-11 09:14:18 from Heikki Linnakangas

Responses

Re: [BUG?] lag of minRecoveryPont in archive recovery at 2012-12-11 16:37:02 from Heikki Linnakangas
Re: [BUG?] lag of minRecoveryPont in archive recovery at 2012-12-11 17:03:47 from Heikki Linnakangas

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Bruce Momjian	2012-12-11 15:06:50	Re: [WIP] pg_ping utility
Previous Message	Andres Freund	2012-12-11 14:47:10	Re: pg_dump cosmetic problem while dumping/restoring rules