WAL senders sending base backups not listening much to SIGTERM

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: PostgreSQL mailing lists <pgsql-bugs(at)postgresql(dot)org>
Subject: WAL senders sending base backups not listening much to SIGTERM
Date: 2016-09-27 02:05:24
Message-ID: CAB7nPqQokxXWEGZLOFEkeDdPWEikxVRb5g=NeAtEQxZhJ5p12Q@mail.gmail.com
Lists: pgsql-bugs

Hi all,

A couple of days ago I received a report that Postgres does not
shut down quickly even if the fast stop mode is used with pg_ctl.
Basically "pg_ctl stop -m fast -t 300" was trying to stop the server
but I saw the following process still alive:
vpostgr+ 6883 0.0 0.1 490780 14928 ? Ss 00:51 0:00
postgres: wal sender process replicator 192.168.111.152(39986) sending
backup "pg_basebackup base backup"
And this prevented the postmaster from stopping for 5 minutes, until
pg_ctl gave up at the end of the timeout.

I am aware of the fact that WAL senders are stopped last so that they
get the chance to stream WAL records at shutdown, per what InitWalSnd
does. But what I am also noticing is that in this case WAL senders
check walsender_ready_to_stop to determine whether they should do an
early exit or not, while WAL senders sending base backups do not check
or use it.
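
To illustrate the kind of check I have in mind, here is a minimal
standalone sketch, not actual PostgreSQL code. It assumes a flag set by
a SIGTERM handler, playing the role of walsender_ready_to_stop, that a
chunked send loop polls between chunks. The names stop_requested,
install_stop_handler and send_backup_chunks are hypothetical, and the
signal_at parameter exists only so the sketch can deliver the signal to
itself.

```c
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a flag such as walsender_ready_to_stop. */
static volatile sig_atomic_t stop_requested = 0;

/* The handler only sets the flag; the real work happens in the loop. */
static void handle_stop_signal(int signo)
{
    (void) signo;
    stop_requested = 1;
}

/* Install the SIGTERM handler. Returns 0 on success, -1 on failure. */
static int install_stop_handler(void)
{
    stop_requested = 0;
    return signal(SIGTERM, handle_stop_signal) == SIG_ERR ? -1 : 0;
}

/*
 * Simulated base backup loop (an assumption, not the real sender).
 * "Sends" up to total_chunks chunks and polls the flag before each one.
 * If signal_at is non-negative, SIGTERM is raised once that many chunks
 * have been sent, to mimic a shutdown request arriving mid-backup.
 * Returns the number of chunks sent; *aborted tells whether the loop
 * stopped early because of the flag.
 */
static size_t send_backup_chunks(size_t total_chunks, long signal_at,
                                 bool *aborted)
{
    size_t sent = 0;

    *aborted = false;
    while (sent < total_chunks)
    {
        if (stop_requested)
        {
            /* A real sender would clean up the backup state here. */
            *aborted = true;
            break;
        }

        /* A real sender would write one chunk to the client here. */
        sent++;

        if (signal_at >= 0 && sent == (size_t) signal_at)
            raise(SIGTERM);
    }
    return sent;
}
```

With such a check the loop stops at the next chunk boundary after the
signal arrives, instead of running until the whole backup has been
sent.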

I have not been able to reproduce this behavior manually with 9.4.9
(master seems a lot more responsive) and saw it only once in a test
lab, with a rather large base backup. This is a rather annoying
behavior, and I'd expect the WAL sender to leave as fast as it can,
and in the case of fast mode I'd expect the server to be left in a
clean state by using CancelBackup() at least.

Perhaps I am missing something? Thoughts?
--
Michael
