From: | Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> |
---|---|
To: | soumyadeep2007(at)gmail(dot)com |
Cc: | pgsql-hackers(at)postgresql(dot)org, kevin(dot)yeap(at)vmware(dot)com, michael(at)paquier(dot)xyz, jyih(at)vmware(dot)com |
Subject: | Re: PITR promote bug: Checkpointer writes to older timeline |
Date: | 2021-03-03 06:47:42 |
Message-ID: | 20210303.154742.1003082376286316043.horikyota.ntt@gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
At Tue, 2 Mar 2021 17:56:03 -0800, Soumyadeep Chakraborty <soumyadeep2007(at)gmail(dot)com> wrote in
> Hello hackers,
>
> We came across an issue where the checkpointer writes to the older
> timeline while a promotion is ongoing after reaching the recovery point
> in a PITR, when there are prepared transactions before the recovery
> point. We came across this issue first in REL_12_STABLE and saw that it
> also exists in devel.
Good Catch! I can reproduce that.
> When there are prepared transactions in an older timeline, in the
> checkpointer, a call to CheckPointTwoPhase() and subsequently to
> XlogReadTwoPhaseData() and subsequently to read_local_xlog_page() leads
> to the following line:
>
> read_upto = GetXLogReplayRecPtr(&ThisTimeLineID);
>
> GetXLogReplayRecPtr() will change ThisTimeLineID to 1, in order to read
> the two phase WAL records in the older timeline. This variable will
> remain unchanged and the checkpointer ends up writing the checkpoint
> record into the older WAL segment (when XLogBeginInsert() is called
> within CreateCheckPoint(), the value is still 1). The value is not
> synchronized as even if RecoveryInProgress() is called,
> xlogctl->SharedRecoveryState is not RECOVERY_STATE_DONE
> (SharedRecoveryInProgress = true in older versions) as the startup
> process waits for the checkpointer inside RequestCheckpoint() (since
> recovery_target_action='promote' involves a non-fast promotion). Thus,
> InitXLOGAccess() is not called and the value of ThisTimeLineID is not
> updated before the checkpoint record write.
>
> Since 1148e22a82e, GetXLogReplayRecPtr() is called with ThisTimeLineID
> instead of a local variable, within read_local_xlog_page().
>
> PFA a small patch that fixes the problem by explicitly calling
> InitXLOGAccess() in CheckPointTwoPhase(), after the two phase state data
> is read, in order to update ThisTimeLineID to the latest timeline. It is
> okay to call InitXLOGAccess() as it is lightweight and would mostly be
> a no-op.
It is correct that read_local_xlog_page() changes ThisTimeLineID, but
InitXLOGAccess() is correctly called in CreateCheckPoint:
| /*
| * An end-of-recovery checkpoint is created before anyone is allowed to
| * write WAL. To allow us to write the checkpoint record, temporarily
| * enable XLogInsertAllowed. (This also ensures ThisTimeLineID is
| * initialized, which we need here and in AdvanceXLInsertBuffer.)
| */
| if (flags & CHECKPOINT_END_OF_RECOVERY)
| LocalSetXLogInsertAllowed();
It seems to e suficcient to recover ThisTimeLineID from the checkpoint
record to be written, as attached?
regareds.
--
Kyotaro Horiguchi
NTT Open Source Software Center
Attachment | Content-Type | Size |
---|---|---|
write_checkpoint_record_to_the_right_timeline.patch | text/x-patch | 620 bytes |
From | Date | Subject | |
---|---|---|---|
Next Message | Michael Paquier | 2021-03-03 06:57:47 | Re: buildfarm windows checks / tap tests on windows |
Previous Message | Dilip Kumar | 2021-03-03 06:37:53 | Re: [Patch] ALTER SYSTEM READ ONLY |