From: | Heikki Linnakangas <hlinnakangas(at)vmware(dot)com> |
---|---|
To: | Andres Freund <andres(at)2ndquadrant(dot)com>, <pgsql-hackers(at)postgresql(dot)org> |
Cc: | <eshkinkot(at)gmail(dot)com> |
Subject: | Re: Too strict check when starting from a basebackup taken off a standby |
Date: | 2014-12-11 08:56:09 |
Message-ID: | 54895C29.4000104@vmware.com |
Lists: | pgsql-hackers |
On 12/11/2014 05:45 AM, Andres Freund wrote:
> A customer recently reported getting "backup_label contains data
> inconsistent with control file" after taking a basebackup from a standby
> and starting it with a typo in primary_conninfo.
>
> When starting postgres from a basebackup, StartupXLOG() has the following
> code to deal with backup labels:
>     if (haveBackupLabel)
>     {
>         ControlFile->backupStartPoint = checkPoint.redo;
>         ControlFile->backupEndRequired = backupEndRequired;
>
>         if (backupFromStandby)
>         {
>             if (dbstate_at_startup != DB_IN_ARCHIVE_RECOVERY)
>                 ereport(FATAL,
>                         (errmsg("backup_label contains data inconsistent with control file"),
>                          errhint("This means that the backup is corrupted and you will "
>                                  "have to use another backup for recovery.")));
>             ControlFile->backupEndPoint = ControlFile->minRecoveryPoint;
>         }
>     }
>
> While I'm not enthusiastic about the error message, that bit of code
> looks sane at first glance. We certainly expect the control file to
> indicate we're in recovery. Since we're unlinking the backup label
> shortly afterwards we'd normally not expect to hit that case after a
> shutdown in recovery.
Check.
> The problem is that after reading the backup label we also have to read
> the corresponding checkpoint from pg_xlog. If primary_conninfo and/or
> restore_command are misconfigured and can't restore files, that can only
> be fixed by shutting down the cluster and fixing up recovery.conf -
> which sets DB_SHUTDOWNED_IN_RECOVERY in the control file.
No, it doesn't. The state is set to DB_SHUTDOWNED_IN_RECOVERY in
CreateRestartPoint(). If you shut down the server before it has even
read the initial checkpoint record, it will not attempt to create a
restartpoint nor update the control file.
> The easiest solution seems to be to simply also allow that as a state in
> the above check. It might be nicer to not allow a ShutdownXLOG to modify
> the control file et al at that stage, but I think that'd end up being
> more invasive.
>
> A short search shows that that also looks like a credible explanation
> for #12128...
Yeah. I was not able to reproduce this, but I'm clearly missing
something, since both you and Sergey have seen this happening. Can you
write a script to reproduce?
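
For illustration only, here is a minimal sketch of the relaxation suggested in the quoted text: accepting DB_SHUTDOWNED_IN_RECOVERY in addition to DB_IN_ARCHIVE_RECOVERY when the backup label says the backup was taken from a standby. The enum below is a simplified stand-in that only borrows the member names of DBState, and the two helper functions are hypothetical names that do not exist in the PostgreSQL source. This is not a patch, just the shape of the condition.

```c
#include <stdbool.h>

/*
 * Simplified stand-in for the DBState enum; only the member names are
 * borrowed from PostgreSQL, this is not the real header.
 */
typedef enum DBState
{
    DB_STARTUP,
    DB_SHUTDOWNED,
    DB_SHUTDOWNED_IN_RECOVERY,
    DB_SHUTDOWNING,
    DB_IN_CRASH_RECOVERY,
    DB_IN_ARCHIVE_RECOVERY,
    DB_IN_PRODUCTION
} DBState;

/*
 * Hypothetical helper mirroring the check quoted above: a backup taken
 * from a standby is accepted only if the control file says we were in
 * archive recovery.
 */
bool
backup_state_ok_strict(DBState dbstate_at_startup)
{
    return dbstate_at_startup == DB_IN_ARCHIVE_RECOVERY;
}

/*
 * Hypothetical helper for the suggested relaxation: also accept a
 * control file left in the shut-down-in-recovery state.
 */
bool
backup_state_ok_relaxed(DBState dbstate_at_startup)
{
    return dbstate_at_startup == DB_IN_ARCHIVE_RECOVERY ||
           dbstate_at_startup == DB_SHUTDOWNED_IN_RECOVERY;
}
```

With the strict form, a control file in DB_SHUTDOWNED_IN_RECOVERY would reach the FATAL report; with the relaxed form it would not. Whether that state can actually be reached before the initial checkpoint record is read is exactly the open question here.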
- Heikki