From: | Heikki Linnakangas <hlinnaka(at)iki(dot)fi> |
---|---|
To: | depesz(at)depesz(dot)com |
Cc: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: [COMMITTERS] pgsql: Allow a streaming replication standby to follow a timeline switc |
Date: | 2012-12-17 12:01:20 |
Message-ID: | 50CF0990.8020506@iki.fi |
Lists: | pgsql-committers pgsql-general |
On 15.12.2012 17:06, hubert depesz lubaczewski wrote:
> I might be missing something, but what exactly does that commit give us?
>
> I mean - we were able, previously, to make slave switch to new master
> - as Phil Sorber described in here:
> http://philsorber.blogspot.com/2012/03/what-to-do-when-your-timeline-isnt.html
>
> After some talk on IRC, I understood that this patch will make it
> possible to switch to new master in plain SR replication, with no WAL
> archive (because if you have wal archive, you can use the method Phil
> described, which basically "just works").

Right, that's exactly the point of the patch. A WAL archive is no longer
necessary for failover.

> So I did setup three machines: master and two slaves.
> Master had 2 IPs - its own, and a floating one.
> Both slaves were connecting to the floating one, and recovery.conf
> looked like:
> ---------
> standby_mode = 'on'
> primary_conninfo = 'port=5920 user=replication host=172.28.173.253'
> trigger_file = '/tmp/finish.replication'
> recovery_target_timeline='latest'
> ---------
>
> After I verified that replication works to both slaves, I did failover one of
> the slaves, shut down master, and did ip takeover of floating ip to the slave
> that did takeover.

Hmm, is it possible that some WAL was generated in the old master, and
streamed to the standby, after the new master was already promoted? It's
important to kill the old master before promoting the new master.
Otherwise the timelines diverge, so that you have some WAL on the old
timeline that's not present in the new master, and some WAL in the new
master's timeline that's not present in the old master. In that
situation, if the standby has already replicated the WAL from the old
master, it can no longer start to follow the new master. I think that
would match the symptoms you're seeing.
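
To make the condition above concrete, here is a minimal Python sketch. It is not PostgreSQL code, and the function names are hypothetical. It assumes you already know two WAL locations in the usual `X/Y` hexadecimal notation: how far the standby has received WAL on the old timeline, and the point at which the new master's timeline forked off. The standby can only follow the new master if it has not gone past the fork point.

```python
def parse_lsn(text: str) -> int:
    """Convert a WAL location written as 'X/Y' (two hex numbers) to an integer."""
    high, low = text.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def can_follow_new_timeline(standby_lsn: str, switch_lsn: str) -> bool:
    """Illustrative check (hypothetical helper, not a PostgreSQL API).

    standby_lsn: last WAL location the standby received on the old timeline.
    switch_lsn:  location where the new master's timeline branched off.
    Returns True if the standby holds no WAL from beyond the fork point.
    """
    return parse_lsn(standby_lsn) <= parse_lsn(switch_lsn)


# Old master was stopped before promotion: the standby is at or behind the fork.
print(can_follow_new_timeline("0/3000000", "0/3000090"))
# Old master kept generating WAL after promotion: the standby has diverged.
print(can_follow_new_timeline("0/3000100", "0/3000090"))
```

The second call models the suspected failure: the standby has replicated WAL that exists only on the old timeline, so it cannot switch to the new one.
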

I wouldn't rule out a bug in the patch either, though. Amit found a
worrying number of bugs in his testing, and although we stamped out all
the known bugs, it wouldn't surprise me if there are more :-(

- Heikki