Re: PANIC during crash recovery of a recently promoted standby

From:	Michael Paquier <michael(at)paquier(dot)xyz>
To:	Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
Cc:	pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: PANIC during crash recovery of a recently promoted standby
Date:	2018-05-11 03:38:34
Message-ID:	20180511033834.GF26879@paquier.xyz
Lists:	pgsql-hackers

On Thu, May 10, 2018 at 10:52:12AM +0530, Pavan Deolasee wrote:
> I propose that we should always clear the minRecoveryPoint after promotion
> to ensure that crash recovery always run to the end if a just-promoted
> standby crashes before completing its first regular checkpoint. A WIP patch
> is attached.

I have been playing with your patch and extended the test to also cover
cascading standbys; we could use that in the final patch. This is
definitely something to add to the recovery test suite, though the
sleep phases should be replaced by waits on replay and/or flush.

Still, that approach looks fragile to me. A restart point could be
running while the end-of-recovery record is inserted, so your patch
could set minRecoveryPoint to InvalidXLogRecPtr, and then the restart
point would happily update the control file's minRecoveryPoint again, to
lastCheckPointEndPtr, because it would see the former as older than
lastCheckPointEndPtr (let's not forget that InvalidXLogRecPtr is 0). So
you could still crash on invalid pages?
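
To make the ordering concrete, here is a toy model of that race. It is
only a sketch: the Toy* names and the LSN values are made up for
illustration, and the comparison is a simplification of what a restart
point does, not the actual xlog.c code. Only XLogRecPtr being a 64-bit
integer and InvalidXLogRecPtr being 0 are taken from the real tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t XLogRecPtr;            /* WAL position, a 64-bit integer */
#define InvalidXLogRecPtr ((XLogRecPtr) 0)

/* Toy stand-in for the control file; only the field discussed here. */
typedef struct
{
    XLogRecPtr  minRecoveryPoint;
} ToyControlFile;

/* Promotion path as proposed in the patch: clear minRecoveryPoint. */
void
toy_end_of_recovery(ToyControlFile *cf)
{
    cf->minRecoveryPoint = InvalidXLogRecPtr;
}

/*
 * A restart point finishing: advance minRecoveryPoint when it looks older
 * than the end of the last checkpoint record.  Returns true if the field
 * was overwritten.
 */
bool
toy_finish_restartpoint(ToyControlFile *cf, XLogRecPtr lastCheckPointEndPtr)
{
    assert(lastCheckPointEndPtr != InvalidXLogRecPtr);

    /* InvalidXLogRecPtr is 0, so a cleared value always compares as older. */
    if (cf->minRecoveryPoint < lastCheckPointEndPtr)
    {
        cf->minRecoveryPoint = lastCheckPointEndPtr;
        return true;
    }
    return false;
}

/* Problematic interleaving: promotion clears, then the restart point ends. */
XLogRecPtr
toy_race(void)
{
    ToyControlFile cf = {.minRecoveryPoint = 0x3000100};

    toy_end_of_recovery(&cf);               /* now InvalidXLogRecPtr */
    toy_finish_restartpoint(&cf, 0x3000028); /* overwrites the cleared value */
    return cf.minRecoveryPoint;
}
```

In this model toy_race() ends with minRecoveryPoint set back to the
restart point's value rather than InvalidXLogRecPtr, which is the state
the patch was trying to avoid.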

I need to think a bit more about that stuff, but one idea would be to
use a special state in the control file to mark it as ending recovery;
this way we could guard against race conditions with restart points.
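
A rough sketch of what I mean, again as a toy model and not a patch. The
GUARD_ENDING_RECOVERY state is hypothetical (nothing like it exists in
the control file today), as are all the other Guard*/guard_* names; the
point is only that the restart point checks the state before touching
minRecoveryPoint.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t XLogRecPtr;            /* WAL position, a 64-bit integer */
#define InvalidXLogRecPtr ((XLogRecPtr) 0)

/* Toy cluster states; GUARD_ENDING_RECOVERY is the hypothetical new one. */
typedef enum
{
    GUARD_IN_ARCHIVE_RECOVERY,
    GUARD_ENDING_RECOVERY,      /* hypothetical: promotion in progress */
    GUARD_IN_PRODUCTION
} GuardDBState;

typedef struct
{
    GuardDBState state;
    XLogRecPtr  minRecoveryPoint;
} GuardControlFile;

/* Promotion: mark the control file first, then clear minRecoveryPoint. */
void
guard_begin_end_of_recovery(GuardControlFile *cf)
{
    cf->state = GUARD_ENDING_RECOVERY;
    cf->minRecoveryPoint = InvalidXLogRecPtr;
}

/*
 * Restart point finishing: leave minRecoveryPoint alone unless we are
 * still plainly in archive recovery.  Returns true if it was updated.
 */
bool
guard_finish_restartpoint(GuardControlFile *cf, XLogRecPtr lastCheckPointEndPtr)
{
    if (cf->state != GUARD_IN_ARCHIVE_RECOVERY)
        return false;           /* promotion has started: do not touch it */

    if (cf->minRecoveryPoint < lastCheckPointEndPtr)
    {
        cf->minRecoveryPoint = lastCheckPointEndPtr;
        return true;
    }
    return false;
}
```

With this, a restart point that completes after
guard_begin_end_of_recovery() has run leaves the cleared value in place.
The real thing would of course also need proper locking around the
control file update, which this sketch ignores.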
--
Michael

In response to

PANIC during crash recovery of a recently promoted standby at 2018-05-10 05:22:12 from Pavan Deolasee

Responses

Re: PANIC during crash recovery of a recently promoted standby at 2018-05-11 15:09:58 from Alvaro Herrera
