From: | Heikki Linnakangas <hlinnaka(at)iki(dot)fi> |
---|---|
To: | Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, soumyadeep2007(at)gmail(dot)com |
Cc: | pgsql-hackers(at)postgresql(dot)org, kevin(dot)yeap(at)vmware(dot)com, michael(at)paquier(dot)xyz, jyih(at)vmware(dot)com |
Subject: | Re: PITR promote bug: Checkpointer writes to older timeline |
Date: | 2021-03-03 08:46:42 |
Message-ID: | 21658731-e335-54f4-5b5a-107f8a169751@iki.fi |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 03/03/2021 08:47, Kyotaro Horiguchi wrote:
> At Tue, 2 Mar 2021 17:56:03 -0800, Soumyadeep Chakraborty <soumyadeep2007(at)gmail(dot)com> wrote in
>> When there are prepared transactions in an older timeline, in the
>> checkpointer, a call to CheckPointTwoPhase() and subsequently to
>> XlogReadTwoPhaseData() and subsequently to read_local_xlog_page() leads
>> to the following line:
>>
>> read_upto = GetXLogReplayRecPtr(&ThisTimeLineID);
>>
>> GetXLogReplayRecPtr() will change ThisTimeLineID to 1, in order to read
>> the two phase WAL records in the older timeline. This variable will
>> remain unchanged and the checkpointer ends up writing the checkpoint
>> record into the older WAL segment (when XLogBeginInsert() is called
>> within CreateCheckPoint(), the value is still 1). The value is not
>> synchronized as even if RecoveryInProgress() is called,
>> xlogctl->SharedRecoveryState is not RECOVERY_STATE_DONE
>> (SharedRecoveryInProgress = true in older versions) as the startup
>> process waits for the checkpointer inside RequestCheckpoint() (since
>> recovery_target_action='promote' involves a non-fast promotion). Thus,
>> InitXLOGAccess() is not called and the value of ThisTimeLineID is not
>> updated before the checkpoint record write.
>>
>> Since 1148e22a82e, GetXLogReplayRecPtr() is called with ThisTimeLineID
>> instead of a local variable, within read_local_xlog_page().
Confusing...
>> PFA a small patch that fixes the problem by explicitly calling
>> InitXLOGAccess() in CheckPointTwoPhase(), after the two phase state data
>> is read, in order to update ThisTimeLineID to the latest timeline. It is
>> okay to call InitXLOGAccess() as it is lightweight and would mostly be
>> a no-op.
>
> It is correct that read_local_xlog_page() changes ThisTimeLineID, but
> InitXLOGAccess() is correctly called in CreateCheckPoint:
>
> | /*
> | * An end-of-recovery checkpoint is created before anyone is allowed to
> | * write WAL. To allow us to write the checkpoint record, temporarily
> | * enable XLogInsertAllowed. (This also ensures ThisTimeLineID is
> | * initialized, which we need here and in AdvanceXLInsertBuffer.)
> | */
> | if (flags & CHECKPOINT_END_OF_RECOVERY)
> | LocalSetXLogInsertAllowed();
>
> It seems to e suficcient to recover ThisTimeLineID from the checkpoint
> record to be written, as attached?
I think it should be reset even earlier, inside XlogReadTwoPhaseData()
probably. With your patch, doesn't the LogStandbySnapshot() call just
above where you're ressetting ThisTimeLineID also write a WAL record
with incorrect timeline?
Even better, can we avoid setting ThisTimeLineID in
XlogReadTwoPhaseData() in the first place?
- Heikki
From | Date | Subject | |
---|---|---|---|
Next Message | David Rowley | 2021-03-03 08:52:00 | Re: Increase value of OUTER_VAR |
Previous Message | Magnus Hagander | 2021-03-03 08:44:15 | Re: We should stop telling users to "vacuum that database in single-user mode" |