From: | Michael Paquier <michael(dot)paquier(at)gmail(dot)com> |
---|---|
To: | Fujii Masao <masao(dot)fujii(at)gmail(dot)com> |
Cc: | Andreas Seltenreich <seltenreich(at)gmx(dot)de>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>, Magnus Hagander <magnus(at)hagander(dot)net> |
Subject: | Re: Re: [sqlsmith] FailedAssertion("!(XLogCtl->Insert.exclusiveBackup)", File: "xlog.c", Line: 10200) |
Date: | 2016-08-30 04:32:26 |
Message-ID: | CAB7nPqSq0YEnUfB1wcXdv75ZEKXuYf+4JpSozR2TVMUfQjOakQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Mon, Aug 29, 2016 at 11:16 PM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
> You seem to add another entry for this patch into CommitFest.
> Either of them needs to be removed.
> https://commitfest.postgresql.org/10/698/
Indeed. I just removed this one.
> This patch prevents pg_stop_backup from starting while pg_start_backup
> is running. OTOH, we also should prevent pg_start_backup from starting
> until "previous" pg_stop_backup has completed? What happens if
> pg_start_backup starts while pg_stop_backup is running?
> As far as I read the current code, ISTM that there is no need to do that,
> but it's better to do the double check.
I don't agree about doing that. The on-disk presence of the
backup_label file is what prevents pg_start_backup to run should
pg_stop_backup have already decremented its counters but not unlinked
yet the backup_label, so this insurance is really enough IMO. One good
reason to keep pg_stop_backup as-is in its failure handling is that if
for example it fails to remove the backup_label file, it is still
possible to do again a backup by just removing manually the existing
backup_label file *without* restarting the server. If you have an
in-memory state it would not be possible to fallback to this method
and you'd need a method to clean up the in-memory state.
Now, well, with the patch as well as HEAD, it could be possible that
the backup counters are decremented, but pg_stop_backup *fails* to
remove the backup_label file which would prevent any subsequent
pg_start_backup to run, because there is still a backup_label file, as
well as any other pg_stop_backup to run, because there is no backup
running per the in-memory state. We could improve the in-memory state
by:
- Having an extra exclusive backup status to mark an exclusive backup
as stopping, say EXCLUSIVE_BACKUP_STOPPING. Then mark the exclusive
backup status as such when do_pg_stop_backup starts.
- delaying counter decrementation in pg_stop_backup to the moment when
the backup_label file has been removed.
- Taking an extra time the exclusive WAL insert lock after
backup_label is removed, and decrement the counters.
- During the time the backup_label file is removed, we need an error
callback to switch back to BACKUP_RUNNING in case of error, similarly
to do_pg_start_backup.
Which is more or less the attached patch. Now, if pg_stop_backup
fails, this has the disadvantage to force a server restart to clean up
the in-memory state, instead of just removing manually the existing
backup_file.
For those reasons I still strongly recommend using v3, and not meddle
with the error handling of pg_stop_backup. Because, well, that's
actually not necessary, and this would just hurt the error handling of
exclusive backups.
--
Michael
Attachment | Content-Type | Size |
---|---|---|
base-backup-crash-v4.patch | application/x-patch | 11.2 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Fujii Masao | 2016-08-30 04:44:51 | Re: Re: [sqlsmith] FailedAssertion("!(XLogCtl->Insert.exclusiveBackup)", File: "xlog.c", Line: 10200) |
Previous Message | Kyotaro HORIGUCHI | 2016-08-30 04:21:31 | Comment on GatherPath.single_copy |