From: | Nick Cleaton <nick(at)cleaton(dot)net> |
---|---|
To: | pgsql-bugs(at)postgresql(dot)org |
Subject: | streaming replication master can fail to shut down |
Date: | 2016-03-09 16:17:46 |
Message-ID: | CAFgz3kus=rC_avEgBV=+hRK5HYJ8vXskJRh8yEAbahJGTzF2VQ@mail.gmail.com |
Lists: | pgsql-bugs |
I've found that the master server does not finish shutting down in
response to SIGINT to the postmaster when a streaming replica is
connected, if and only if all of the following hold:
* The master server is a specific new bit of kit, which has slightly
higher performance CPUs than the replica and is much more idle.
* The master is configured with archive_mode=on and a non-empty archive command.
* I do not have "strace -p ..." running on the master against the wal
sender servicing the slow replica.
While it's failing to shut down, about 10 megabytes per second of
network traffic is exchanged between master and replica, and the wal
receiver on the replica is at 100% CPU whereas the wal sender on the
master is around 70% to 80%. The wal receiver has about 4MiB of queued
tcp input, while the wal sender has only a few bytes. This appears to
continue indefinitely; I've waited up to 30 minutes.
If I "strace -p ... >/dev/null 2>&1" the wal sender during the blocked
shutdown, it seems to slow it down enough (wal sender's cpu usage
drops to about 50%) that after about 10 to 20 seconds the shutdown
completes cleanly and the replica logs "end of wal".
I've seen this only with one specific server acting as the master, so
I cannot absolutely rule out a hardware issue of some kind.
This was originally observed with 9.4.6, but I've now reproduced it
with 9.5.1 using the pgdg packages under Ubuntu:
postgresql-9.5 9.5.1-1.pgdg14.04+1
postgresql-client-9.5 9.5.1-1.pgdg14.04+1
postgresql-contrib-9.5 9.5.1-1.pgdg14.04+1
Under 9.5.1 I've observed this by simply installing the packages and
then following https://wiki.postgresql.org/wiki/Binary_Replication_Tutorial#Binary_Replication_in_7_Steps
with only the addition of archive_mode=on and
archive_command=/bin/true on the master.
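For reference, the two additions described above would look roughly like
this in postgresql.conf on the master. This is a minimal sketch showing
only the settings named in this report; the remaining replication
settings come from the tutorial and are not reproduced here.

```
# postgresql.conf on the master (sketch; only the two settings added
# on top of the tutorial configuration)
archive_mode = on               # changing this requires a server restart
archive_command = '/bin/true'   # no-op command that always reports success
```

With /bin/true as the archive command, every WAL segment is reported as
archived immediately and nothing is actually copied anywhere.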
CPUs on the master: dual Intel(R) Xeon(R) CPU E5-2697 v3 @ 2.60GHz,
hyperthreading off.
CPUs on the replica: dual Intel(R) Xeon(R) CPU E5-2630 v2 @ 2.60GHz,
hyperthreading off.