From: | John Lumby <johnlumby(at)hotmail(dot)com> |
---|---|
To: | pgsql general <pgsql-general(at)postgresql(dot)org> |
Subject: | how to switch old replication Master to new Standby after promoting old Standby |
Date: | 2016-03-14 17:28:16 |
Message-ID: | COL131-W552E6E64C1C5202B7C3847A3880@phx.gbl |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
From: johnlumby(at)hotmail(dot)com
To: pgsql-general-owner(at)postgresql(dot)org
Subject: how to switch old replication Master to new Standby after promoting old Standby
Date: Mon, 14 Mar 2016 13:23:29 -0400
In a scenario involving replication where no failure occurs but I want to interchange Master <->Standby,
*and* want to avoid making another full base backup and rsyncing it across,
I have found it easy to promote old Standby to new Master with pg_ctl promote,
but very difficult to restart the old Primary as a new Standby *without* performing new base backup.
Assume current 9.5.1 and using streaming replication with a named replication slot if relevant
Second assumption - I am able to temporarily prevent any relational updates to the database before I start the switchover
e.g. the
default_transaction_read_only = on
setting together with being able to control what transactions do if anything
So I reach a point where both systems have postgresql running without any replication,
both have identical content in all databases (that I can control),
and I am willing to tolerate short restarts if need be and also to scp/rsync the contents of pg_xlog
and other small files but *not* the entire cluster directory or any database base directories.
What do I do next?
Here is what I have found seems to work but I am not sure it is robust:
1. shut down both new Master and intended-to-be-new-Standby
2. on intended-to-be-new-Standby, remove the entire content of pg_xlog and the global/pg_control
3. from new Master , tar + scp the entire content of pg_xlog and the global/pg_control to intended-to-be-new-Standby
4. create intended-to-be-new-Standby's recovery.conf,
specifying recovery_target_timeline = 'latest'
(but I think it works with this setting omitted)
5. start new Master
6. start new Standby and up it comes
in its postgres log(**) I see
LOG: 00000: database system was shut down at 2016-03-14 16:41:24 GMT
LOCATION: StartupXLOG, xlog.c:5936
LOG: 00000: entering standby mode
LOCATION: StartupXLOG, xlog.c:6016
LOG: 00000: consistent recovery state reached at C/480000D0
LOCATION: CheckRecoveryConsistency, xlog.c:7493
LOG: 00000: invalid record length at C/480000D0
LOCATION: ReadRecord, xlog.c:3960
LOG: 00000: started streaming WAL from primary at C/48000000 on timeline 3
LOCATION: WalReceiverMain, walreceiver.c:358
LOG: 00000: redo starts at C/480000D0
LOCATION: StartupXLOG, xlog.c:6700
Is the invalid record length msg anything to worry about?
But this method is purely empirical. Is it robust? Anyone have any better recommendations?
Cheers, John Lumby
(**Note this log was from a pre-release 9.5, 9.5alpha2
I don't have 9.5.1 to hand at present
From | Date | Subject | |
---|---|---|---|
Next Message | Shulgin, Oleksandr | 2016-03-14 17:54:11 | Re: how to switch old replication Master to new Standby after promoting old Standby |
Previous Message | Adrian Klaver | 2016-03-14 17:19:38 | Re: Unexpected result using floor() function |