BUG #14039: Possible documentation bug regarding streaming replication slots

From: sfrazer(at)lakeshoreint(dot)com
To: pgsql-bugs(at)postgresql(dot)org
Subject: BUG #14039: Possible documentation bug regarding streaming replication slots
Date: 2016-03-21 16:21:22
Message-ID: 20160321162122.2919.27632@wrigleys.postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

The following bug has been logged on the website:

Bug reference: 14039
Logged by: Scott Frazer
Email address: sfrazer(at)lakeshoreint(dot)com
PostgreSQL version: 9.5.1
Operating system: CentOS 7
Description:

When setting up replication to 3 servers I discovered a problem that
prevented autovacuum from running efficiently. The problem appears to be
with replication_slots.

>From scratch, I set up a master database (server_a) then created a replica
(server_b). I used the following commands on server_a to create the
replica:

psql -c "select pg_start_backup('replica_creation');"
rsync -cva --inplace --exclude=*pg_log* --exclude=*.conf
/var/lib/pgsql/9.5/data/ server_b:/var/lib/pgsql/9.5/data/
psql -c "select pg_stop_backup();"

On server_b I set standby_mode to on and gave it a primary_slot_name.
"select * from pg_replication_slots;" showed what I expected (namely
server_b active with an increasing xmin and restart_lsn value)

Next I began to setup server_c using the same method as above. When server_c
was connected and replicating running "select * from pg_replication_slots;"
on server_a showed the restart_lsn value incrementing, but the xmin value
was not.

I then ran "select * from pg_replication_slots;" on server_c and saw that it
had an entry for server_b marked as active="f" and with an xmin value that
matched the value listed for server_c on server_a. At this point I believe
server_c was holding transactions(?) for server_b and reporting this to the
upstream server_a

When I tried to drop the replication slot on server_c I was met with an
error which I failed to capture but I believe was "column does not exist"

To resolve the problem I removed the primary_slot_name value from
recovery.conf on server_c, restarted it, ran "select
pg_drop_replication_slot('server_b')" successfully and then added
primary_slot_name back to the recovery.conf and restarted the server again.

Did I do something wrong when setting up the replicas? Or should there be a
note in the documentation that if you already have replication slots active
on a master server the replica will copy those slots over?

Thanks,
Scott Frazer

Browse pgsql-bugs by date

  From Date Subject
Next Message Tom Lane 2016-03-21 16:39:21 Re: BUG #14033: cross-compilation to ARM fails
Previous Message Daniel Golle 2016-03-21 15:52:31 Re: BUG #14033: cross-compilation to ARM fails