On Tue, Jan 17, 2017 at 5:40 PM, Fujii Masao <fujii(at)postgresql(dot)org> wrote:
> Fix an assertion failure related to an exclusive backup.
>
> Previously multiple sessions could execute pg_start_backup() and
> pg_stop_backup() to start and stop an exclusive backup at the same time.
> This could trigger the assertion failure of
> "FailedAssertion("!(XLogCtl->Insert.exclusiveBackup)".
> This happened because, even while pg_start_backup() was starting
> an exclusive backup, other session could run pg_stop_backup()
> concurrently and mark the backup as not-in-progress unconditionally.
>
> This patch introduces ExclusiveBackupState indicating the state of
> an exclusive backup. This state is used to ensure that there is only
> one session running pg_start_backup() or pg_stop_backup() at
> the same time, to avoid the assertion failure.

Please note that this commit message is not completely accurate. This fix
does not only avoid triggering this assertion failure; it also makes
sure that no manual on-disk intervention is needed by the user to
remove a backup_label file after a failure of pg_stop_backup(). Before
this patch, the exclusive backup counter in XLogCtl got decremented
before backup_label was removed. If an error occurred after the
counter was decremented, the shared memory counter would have been at 0
with a backup_label file still on disk. Subsequent calls to
pg_start_backup() would have failed, and putting the system back into
a consistent state would have required an operator to remove the
backup_label file by hand. The heart of the logic here is in the
callback of pg_stop_backup(), which handles an error happening during
the deletion of the backup_label file.
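
To make the failure path concrete, here is a rough standalone sketch of
the idea. It is a toy, single-process model and not the actual xlog.c
code: the names (BackupState, toy_start_backup, toy_stop_backup, the
remove_fails flag) are hypothetical, a plain variable stands in for the
shared memory state, a boolean stands in for the backup_label file, and
all locking is omitted. It assumes the error callback puts the state
back to "in progress" when the file removal fails.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names; a simplified model of an exclusive backup state. */
typedef enum
{
    BACKUP_NONE,
    BACKUP_STARTING,
    BACKUP_IN_PROGRESS,
    BACKUP_STOPPING
} BackupState;

/* Stands in for the state kept in shared memory (no locking here). */
static BackupState backup_state = BACKUP_NONE;

/* Stands in for the backup_label file on disk. */
static bool backup_label_exists = false;

/* Toy analogue of pg_start_backup(): refuses unless no backup is active. */
bool
toy_start_backup(void)
{
    if (backup_state != BACKUP_NONE)
        return false;           /* another start/stop is already underway */

    backup_state = BACKUP_STARTING;
    backup_label_exists = true; /* "write" backup_label */
    backup_state = BACKUP_IN_PROGRESS;
    return true;
}

/*
 * Toy analogue of pg_stop_backup(). remove_fails simulates an error while
 * deleting backup_label.
 */
bool
toy_stop_backup(bool remove_fails)
{
    if (backup_state != BACKUP_IN_PROGRESS)
        return false;           /* nothing to stop, or someone else is busy */

    backup_state = BACKUP_STOPPING;

    if (remove_fails)
    {
        /*
         * Error callback: the file is still on disk, so go back to
         * "in progress" instead of leaving the state at "none".
         */
        assert(backup_state == BACKUP_STOPPING);
        backup_state = BACKUP_IN_PROGRESS;
        return false;
    }

    backup_label_exists = false;    /* "remove" backup_label */
    backup_state = BACKUP_NONE;
    return true;
}
```

With this shape, a failed stop leaves the in-memory state and the file
on disk in agreement, so the stop can simply be retried and no file has
to be removed by hand before the next start.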
--
Michael