From: | Dilip Kumar <dilipbalaut(at)gmail(dot)com> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Race condition in recovery? |
Date: | 2021-05-04 12:11:06 |
Message-ID: | CAFiTN-tO+OxiiNiM8oE=+10xhiMZkGrUZ-L1bn1SRChjzVnn7Q@mail.gmail.com |
Lists: | pgsql-hackers |
On Tue, Mar 2, 2021 at 3:14 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> =====
> ee994272ca50f70b53074f0febaec97e28f83c4e
> Author: Heikki Linnakangas <heikki(dot)linnakangas(at)iki(dot)fi> 2013-01-03 14:11:58
> Committer: Heikki Linnakangas <heikki(dot)linnakangas(at)iki(dot)fi> 2013-01-03 14:11:58
>
> Delay reading timeline history file until it's fetched from master.
>
> Streaming replication can fetch any missing timeline history files from the
> master, but recovery would read the timeline history file for the target
> timeline before reading the checkpoint record, and before walreceiver has
> had a chance to fetch it from the master. Delay reading it, and the sanity
> checks involving timeline history, until after reading the checkpoint
> record.
>
> There is at least one scenario where this makes a difference: if you take
> a base backup from a standby server right after a timeline switch, the
> WAL segment containing the initial checkpoint record will begin with an
> older timeline ID. Without the timeline history file, recovering that file
> will fail as the older timeline ID is not recognized to be an ancestor of
> the target timeline. If you try to recover from such a backup, using only
> streaming replication to fetch the WAL, this patch is required for that to
> work.
> =====
The above commit removed the initialization of expectedTLEs from
recoveryTargetTLI in StartupXLOG(), as shown in the following hunk from
that commit.
@@ -5279,49 +5299,6 @@ StartupXLOG(void)
*/
readRecoveryCommandFile();
- /* Now we can determine the list of expected TLIs */
- expectedTLEs = readTimeLineHistory(recoveryTargetTLI);
-
I think the fix for this problem is that, after reading and validating
the checkpoint record, we free the current value of expectedTLEs and
reinitialize it based on recoveryTargetTLI, as shown in the attached
patch.
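To illustrate the idea outside the server, here is a minimal standalone
sketch. It is not the attached patch and does not use PostgreSQL's real
List API: TLEntry, mock_read_timeline_history(), free_tl_list(),
tli_in_list() and reinit_expected_tles_after_checkpoint() are all
hypothetical names, and the mock assumes a purely linear timeline history
(N, N-1, ..., 1). Only expectedTLEs and recoveryTargetTLI are names taken
from the discussion above.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef uint32_t TimeLineID;

/* Hypothetical, simplified stand-in for a timeline history entry list. */
typedef struct TLEntry
{
    TimeLineID      tli;
    struct TLEntry *next;
} TLEntry;

/* Names from the discussion; here they are plain globals for illustration. */
TLEntry    *expectedTLEs = NULL;
TimeLineID  recoveryTargetTLI = 0;

/*
 * Hypothetical stand-in for readTimeLineHistory(): returns the list
 * target, target-1, ..., 1 (newest first), assuming a linear history.
 */
TLEntry *
mock_read_timeline_history(TimeLineID target)
{
    TLEntry    *head = NULL;
    TLEntry    *tail = NULL;

    for (TimeLineID t = target; t >= 1; t--)
    {
        TLEntry    *e = malloc(sizeof(*e));

        assert(e != NULL);
        e->tli = t;
        e->next = NULL;
        if (tail != NULL)
            tail->next = e;
        else
            head = e;
        tail = e;
    }
    return head;
}

/* Free every entry of a list built by mock_read_timeline_history(). */
void
free_tl_list(TLEntry *list)
{
    while (list != NULL)
    {
        TLEntry    *next = list->next;

        free(list);
        list = next;
    }
}

/* Return 1 if tli appears in the list, 0 otherwise. */
int
tli_in_list(const TLEntry *list, TimeLineID tli)
{
    for (; list != NULL; list = list->next)
    {
        if (list->tli == tli)
            return 1;
    }
    return 0;
}

/*
 * The proposed step: once the checkpoint record has been read and
 * validated, discard whatever expectedTLEs currently holds and rebuild
 * it from recoveryTargetTLI.
 */
void
reinit_expected_tles_after_checkpoint(void)
{
    free_tl_list(expectedTLEs);
    expectedTLEs = mock_read_timeline_history(recoveryTargetTLI);
}
```

In this sketch, if expectedTLEs was first built from an older timeline
(say 1) while recoveryTargetTLI is 2, then timeline 2 is missing from the
list until reinit_expected_tles_after_checkpoint() is called, after which
both timelines 1 and 2 are present.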
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
Attachment | Content-Type | Size |
---|---|---|
0001-After-reading-checkpoint-record-fix-expectedTLEs-to-.patch | text/x-patch | 1.5 KB |