From: | David Steele <david(at)pgmasters(dot)net> |
---|---|
To: | pgsql-bugs(at)postgresql(dot)org |
Subject: | Backend crash on non-exclusive backup cancel |
Date: | 2017-02-28 01:33:34 |
Message-ID: | c86627c3-fcdd-9b67-5a03-e2f1113d1b14@pgmasters.net |
Lists: | pgsql-bugs pgsql-hackers |
I found this issue while working on a pg_stop_backup() patch. If a
non-exclusive pg_stop_backup() is cancelled and then attempted again the
backend will crash on assertion:
$ test/pg/bin/psql
psql (10devel)
Type "help" for help.
postgres=# select * from pg_start_backup('label', true, false);
pg_start_backup
-----------------
0/2000028
(1 row)
postgres=# select * from pg_stop_backup(false);
^CCancel request sent
ERROR: canceling statement due to user request
postgres=# select * from pg_stop_backup(false);
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
!> \q
From the server log:
2017-02-28 01:21:34.755 UTC STATEMENT: select * from pg_stop_backup(false);
TRAP: FailedAssertion("!(XLogCtl->Insert.nonExclusiveBackups > 0)",
File: "/postgres/src/backend/access/transam/xlog.c", Line: 10723)
This error was produced in master at 30df93f. Configure settings are
--enable-cassert --enable-tap-tests --with-openssl.
Disabling assertions "works", but there is still a problem. A backend
that keeps cancelling pg_stop_backup() without ever resetting the
exclusive flag in xlogfuncs.c can decrement the shared variable
XLogCtl->Insert.nonExclusiveBackups as many times as it wants. As far
as I can see the worst that will happen is that
XLogCtl->Insert.forcePageWrites won't get set back to false, but that's
still a bug.
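
To make the failure mode concrete, here is a simplified, self-contained C
model of it. This is only a sketch and not PostgreSQL code: every name in it
is hypothetical, the counter merely stands in for
XLogCtl->Insert.nonExclusiveBackups, and it assumes the counter is
decremented before the point where the cancel takes effect.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins, not PostgreSQL's real definitions. */
static int  model_shared_backups = 0;      /* models XLogCtl->Insert.nonExclusiveBackups */
static bool model_session_running = false; /* models the per-backend flag */

/* Start a non-exclusive backup in this (modelled) session. */
void model_start_backup(void)
{
    assert(!model_session_running);  /* one backup per session in this model */
    model_shared_backups++;
    model_session_running = true;
}

/*
 * Unguarded stop, modelling a build without assertions.
 * Returns true if the stop completed, false if it was "cancelled".
 * Assumption: the shared counter is decremented before the cancel is
 * noticed, and the session flag is only cleared on normal completion.
 */
bool model_stop_backup_unguarded(bool cancelled)
{
    model_shared_backups--;          /* nothing checks the session flag first */
    if (cancelled)
        return false;                /* flag stays set, counter already dropped */
    model_session_running = false;
    return true;
}
```

In this model, one start followed by two cancelled stops leaves the counter
at -1 while the session still believes a backup is running.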
This condition should throw "backup is not in progress" just as an
exclusive backup would, whether assertions are enabled or not.
I believe the solution is to move the exclusive flag to xlog.c and only
decrement XLogCtl->Insert.nonExclusiveBackups when exclusive is true,
otherwise return an error. Even then, it wouldn't be clear if the
backup had completed or not. I suppose any cancelled non-exclusive
pg_stop_backup() should be considered aborted whether a stop backup
record was written or not?
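
A rough sketch of the guard I have in mind, written as a standalone C model
rather than a patch. All names here are hypothetical (BackupModel,
StopResult, guarded_start_backup, guarded_stop_backup); the point is only
that the session state and the shared counter change together, so a second
stop attempt after a cancel reports an error instead of decrementing again.
It also assumes a cancelled stop is treated as an aborted backup.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical result codes for the model. */
typedef enum
{
    STOP_OK,               /* stop completed */
    STOP_CANCELLED,        /* stop was cancelled; backup considered aborted */
    STOP_NOT_IN_PROGRESS   /* models the "backup is not in progress" error */
} StopResult;

/* Hypothetical model state, not PostgreSQL's real structures. */
typedef struct
{
    int  shared_backups;   /* models XLogCtl->Insert.nonExclusiveBackups */
    bool session_running;  /* models the session flag, kept next to the counter */
} BackupModel;

void guarded_start_backup(BackupModel *m)
{
    m->shared_backups++;
    m->session_running = true;
}

StopResult guarded_stop_backup(BackupModel *m, bool cancelled)
{
    /* Refuse to touch the shared counter unless this session owns a backup. */
    if (!m->session_running)
        return STOP_NOT_IN_PROGRESS;

    assert(m->shared_backups > 0);   /* invariant now holds by construction */
    m->shared_backups--;
    m->session_running = false;      /* cleared together with the decrement */

    return cancelled ? STOP_CANCELLED : STOP_OK;
}
```

With this shape, a cancelled stop followed by a retry returns
STOP_NOT_IN_PROGRESS and leaves the counter at zero.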
If that makes sense I'm happy to work up a patch. This is definitely an
edge case and I seriously doubt it is causing any issues in the field.
--
-David
david(at)pgmasters(dot)net