From: | Nathan Bossart <nathandbossart(at)gmail(dot)com> |
---|---|
To: | Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> |
Cc: | pgsql-hackers(at)lists(dot)postgresql(dot)org, masao(dot)fujii(at)oss(dot)nttdata(dot)com |
Subject: | Re: Possible corruption by CreateRestartPoint at promotion |
Date: | 2022-04-26 18:33:49 |
Message-ID: | 20220426183349.GA3002960@nathanxps13 |
Lists: | pgsql-hackers |
On Wed, Mar 16, 2022 at 10:24:44AM +0900, Kyotaro Horiguchi wrote:
> While discussing on additional LSNs in checkpoint log message,
> Fujii-san pointed out [2] that there is a case where
> CreateRestartPoint leaves unrecoverable database when concurrent
> promotion happens. That corruption is "fixed" by the next checkpoint
> so it is not a severe corruption.
I suspect we'll start seeing this problem more often once end-of-recovery
checkpoints are removed [0]. Would you mind creating a commitfest entry
for this thread? I didn't see one.
> AFAICS since 9.5, no check(/restart)points run concurrently with
> restartpoint [3]. So I propose to remove the code path as attached.
Yeah, this "quick hack" has been around for some time (2de48a8), and I
believe much has changed since then, so something like what you're
proposing is probably the right thing to do.
> /* Also update the info_lck-protected copy */
> SpinLockAcquire(&XLogCtl->info_lck);
> - XLogCtl->RedoRecPtr = lastCheckPoint.redo;
> + XLogCtl->RedoRecPtr = RedoRecPtr;
> SpinLockRelease(&XLogCtl->info_lck);
>
> /*
> @@ -6984,7 +6987,10 @@ CreateRestartPoint(int flags)
> /* Update the process title */
> update_checkpoint_display(flags, true, false);
>
> - CheckPointGuts(lastCheckPoint.redo, flags);
> + CheckPointGuts(RedoRecPtr, flags);
I don't understand the purpose of these changes. Are these related to the
fix, or is this just tidying up?
[0] https://postgr.es/m/CA%2BTgmoY%2BSJLTjma4Hfn1sA7S6CZAgbihYd%3DKzO6srd7Ut%3DXVBQ%40mail.gmail.com
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com