Quick Links

BUG #14039: Possible documentation bug regarding streaming replication slots

From:	sfrazer(at)lakeshoreint(dot)com
To:	pgsql-bugs(at)postgresql(dot)org
Subject:	BUG #14039: Possible documentation bug regarding streaming replication slots
Date:	2016-03-21 16:21:22
Message-ID:	20160321162122.2919.27632@wrigleys.postgresql.org
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-bugs

The following bug has been logged on the website:

Bug reference: 14039
Logged by: Scott Frazer
Email address: sfrazer(at)lakeshoreint(dot)com
PostgreSQL version: 9.5.1
Operating system: CentOS 7
Description:

When setting up replication to 3 servers I discovered a problem that
prevented autovacuum from running efficiently. The problem appears to be
with replication_slots.

>From scratch, I set up a master database (server_a) then created a replica
(server_b). I used the following commands on server_a to create the
replica:

psql -c "select pg_start_backup('replica_creation');"
rsync -cva --inplace --exclude=*pg_log* --exclude=*.conf
/var/lib/pgsql/9.5/data/ server_b:/var/lib/pgsql/9.5/data/
psql -c "select pg_stop_backup();"

On server_b I set standby_mode to on and gave it a primary_slot_name.
"select * from pg_replication_slots;" showed what I expected (namely
server_b active with an increasing xmin and restart_lsn value)

Next I began to setup server_c using the same method as above. When server_c
was connected and replicating running "select * from pg_replication_slots;"
on server_a showed the restart_lsn value incrementing, but the xmin value
was not.

I then ran "select * from pg_replication_slots;" on server_c and saw that it
had an entry for server_b marked as active="f" and with an xmin value that
matched the value listed for server_c on server_a. At this point I believe
server_c was holding transactions(?) for server_b and reporting this to the
upstream server_a

When I tried to drop the replication slot on server_c I was met with an
error which I failed to capture but I believe was "column does not exist"

To resolve the problem I removed the primary_slot_name value from
recovery.conf on server_c, restarted it, ran "select
pg_drop_replication_slot('server_b')" successfully and then added
primary_slot_name back to the recovery.conf and restarted the server again.

Did I do something wrong when setting up the replicas? Or should there be a
note in the documentation that if you already have replication slots active
on a master server the replica will copy those slots over?

Thanks,
Scott Frazer

Browse pgsql-bugs by date

	From	Date	Subject
Next Message	Tom Lane	2016-03-21 16:39:21	Re: BUG #14033: cross-compilation to ARM fails
Previous Message	Daniel Golle	2016-03-21 15:52:31	Re: BUG #14033: cross-compilation to ARM fails