From: | Thom Brown <thom(at)linux(dot)com> |
---|---|
To: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Primary not sending to synchronous standby |
Date: | 2015-02-23 15:25:57 |
Message-ID: | CAA-aLv6RZCxYxCsPpxjoty6WVqtN1jAX0G1CZVtUNubRuCA_cw@mail.gmail.com |
Lists: | pgsql-hackers |
Hi,
I've noticed that if the primary is started and then a base backup is
immediately taken from it and started as a synchronous standby, it
doesn't replicate and the primary hangs indefinitely when trying to run any
WAL-generating statements. It only recovers when either the primary is
restarted (which has to use a fast shutdown, otherwise it also hangs
forever), or the standby is restarted.
Here's a way of reproducing it:
-------------------------------
mkdir -p -m 0700 primary standby1
initdb -N -k -D primary -E 'UTF8'
cat << PRIMARYCONFIG >> primary/postgresql.conf
shared_buffers = 8MB
logging_collector = on
log_line_prefix = '%m - %u - %d'
synchronous_standby_names = 'standby1'
max_connections = 8
wal_level = 'hot_standby'
port = 5530
max_wal_senders = 3
wal_keep_segments = 6
PRIMARYCONFIG
cat << PRIMARYHBA >> primary/pg_hba.conf
local replication rep_user trust
host replication rep_user 127.0.0.1/32 trust
host replication rep_user ::1/128 trust
PRIMARYHBA
pg_ctl start -D primary
psql -p 5530 -h localhost -c "SET SESSION synchronous_commit TO 'off'; CREATE USER rep_user REPLICATION;" -d postgres
pg_basebackup -x -D standby1 -h localhost -p 5530 -U rep_user
cat << STANDBYCONFIG >> standby1/postgresql.conf
port = 5531
hot_standby = on
STANDBYCONFIG
cat << STANDBYRECOVERY >> standby1/recovery.conf
standby_mode = 'on'
recovery_target_timeline = 'latest'
primary_conninfo = 'host=127.0.0.1 user=rep_user port=5530 application_name=standby1'
STANDBYRECOVERY
pg_ctl -D standby1 start
-------------------------------
Note that if you run the commands one by one, there isn't a problem. If
you run them as a script, the standby doesn't connect to the primary. There
aren't any errors reported by either the standby or the primary. The
primary's wal sender process reports the following:
wal sender process rep_user 127.0.0.1(45243) startup waiting for 0/3000158
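For anyone reproducing this, here is a minimal sketch of how the state on
the primary could be inspected while it is stuck. The two helper functions
are hypothetical names and not part of the script above; they assume the
primary is still listening on port 5530 and that read-only queries against
pg_stat_replication still return while WAL-generating statements hang.

```shell
# Hypothetical helper: list the standbys the primary knows about, with
# their replication state and sync state, using pg_stat_replication.
# The port argument defaults to 5530, the primary's port in this report.
show_replication_state() {
    psql -p "${1:-5530}" -h localhost -d postgres -At -F ' | ' \
        -c "SELECT application_name, state, sync_state FROM pg_stat_replication"
}

# Hypothetical helper: print the process titles of any WAL sender
# processes, which is where the "startup waiting for ..." text appears.
# The [w] in the pattern keeps grep from matching its own command line.
show_wal_senders() {
    ps -eo pid,args | grep '[w]al sender process' \
        || echo "no wal sender processes found"
}
```

Calling show_replication_state and then show_wal_senders right after the
standby has been started should show whether the primary considers
standby1 connected at all, and what the WAL sender is waiting for.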
Does anyone know why this would be happening, and whether it could be a
problem in other scenarios?
Thom