Re: Replication recovery?

From:	"Albe Laurenz" <laurenz(dot)albe(at)wien(dot)gv(dot)at>
To:	"John Mudd EXTERN" <johnbmudd(at)gmail(dot)com>, <pgsql-general(at)postgresql(dot)org>
Subject:	Re: Replication recovery?
Date:	2012-05-18 07:39:05
Message-ID:	D960CB61B694CF459DCFB4B0128514C207E6A708@exadv11.host.magwien.gv.at
Lists:	pgsql-general

John Mudd wrote:
> Sorry if this is a dumb question. Feel free to just point me to a doc.

Sure, here:
http://www.postgresql.org/docs/current/static/warm-standby-failover.html

> I've read a little about Postgres replication and the concept of a
> master and one or more slaves. If one db is down then you just switch
> to one that's still running. There's even additional software like
> pgpool to make the switch easy. But I want to know more about how to
> resume normal operating mode.
>
> For example, I take it that if the master is unavailable then you
> switch to a slave. The former slave becomes the current master. When
> the original "master" is ready to run and network accessible then do
> you bring it online in slave mode and it syncs automatically with the
> current master? At which time you're almost back to normal.

No, to quote:
"To return to normal operation, a standby server must be recreated,
either on the former primary system when it comes up, or on a third,
possibly new, system."

That means that you have to take a new base backup from the new primary
to set up the new standby. Using something like "rsync" for the backup
can speed up the process if the two copies have not yet diverged much.
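
A minimal sketch of that rebuild, assuming a 9.1-era server (where the
backup functions are called pg_start_backup and pg_stop_backup), a
stopped old primary, ssh access for rsync, and the same data directory
path on both machines. The function name plan_standby_rebuild, the
backup label, and the "postgres" superuser are illustrative assumptions.
The code only builds the command lines; it does not run them.

```python
def plan_standby_rebuild(primary_host, data_dir, label="standby_rebuild"):
    """Return the commands (as argv lists) to run on the machine that
    will become the new standby, in the order they should be run."""
    # psql invocation against the NEW primary; superuser name is assumed.
    psql = ["psql", "-h", primary_host, "-U", "postgres", "-c"]
    return [
        # 1. Tell the new primary that a base backup is starting.
        psql + ["SELECT pg_start_backup('%s')" % label],
        # 2. Copy its data directory over the old one. rsync transfers
        #    only files that differ; --delete removes stale local files.
        #    postmaster.pid belongs to the running server and is skipped.
        ["rsync", "-a", "--delete", "--exclude", "postmaster.pid",
         "%s:%s/" % (primary_host, data_dir), data_dir + "/"],
        # 3. Tell the new primary that the base backup is finished.
        psql + ["SELECT pg_stop_backup()"],
    ]


if __name__ == "__main__":
    # Host name and path are placeholders, not values from this thread.
    for argv in plan_standby_rebuild("newprimary.example", "/var/lib/pgsql/data"):
        print(" ".join(argv))
```

The copied directory still has to be configured as a standby before it
is started, and the new primary must retain (or archive) the WAL written
during the copy so that the standby can catch up.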

That is necessary because after failover the database has changed
(it is running on a new timeline).
Also, how would you handle the case where a transaction has already been
committed on the old master, but not yet replicated?
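
The second point can be illustrated with a toy model. This is not how
PostgreSQL tracks anything internally; the transaction names and the
helper unreplicated_commits are made up for illustration.

```python
def unreplicated_commits(old_master, new_primary):
    """Return commits that exist on the old master but are unknown to
    the new primary, preserving their original order."""
    known = set(new_primary)
    return [tx for tx in old_master if tx not in known]


# The old master committed tx3 just before it failed, but only tx1 and
# tx2 had reached the standby at that moment.
old_master = ["tx1", "tx2", "tx3"]
# After promotion the former standby accepts new work (tx4).
new_primary = ["tx1", "tx2", "tx4"]

orphaned = unreplicated_commits(old_master, new_primary)
print(orphaned)
```

Here the two histories share tx1 and tx2 and then disagree, so the old
master cannot simply resume as a standby: it holds a commit (tx3) that
the new primary never saw.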

> Once they
> are back in sync do people typically switch the roles back to the
> original designation of who's a slave and who's a master? It's not
> clear to me if the last step is necessary.

It's not necessary to switch back again, as long as both machines are
equivalent. Using comparable machines is a good idea anyway if you want
the standby to be able to take over. But if, for example, the standby
is at a remote site and you want your primary server to run locally,
then you would want to switch back.

> Well, that's assuming that the master comes back online with all the
> data it had when it went offline. If it comes back but all data was
> lost (a worst case scenario) then I assume I have to take the current
> master offline and use it to repopulate the recovering master from
> scratch, correct? But... if I have additional slaves then I could just
> take one of the current slaves offline, use it to rebuild the original
> master, and then bring both the slave and the reconstructed master
> (now also a slave) back online and both will sync with the current
> master.

It should work to create a slave by copying another slave, if that
is what you have in mind.
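
A copy of a standby's data directory carries that standby's
recovery.conf with it, so it is worth checking that the file points at
the current master. A minimal sketch of what it would contain on a
9.1-era server follows; the helper name render_recovery_conf, the host,
and the replication user are assumptions. (PostgreSQL 12 and later no
longer use recovery.conf, so this applies only to older releases such
as the ones current when this thread was written.)

```python
def render_recovery_conf(master_host, user, port=5432):
    """Return recovery.conf text for a streaming standby that follows
    the given master."""
    conninfo = "host=%s port=%d user=%s" % (master_host, port, user)
    lines = [
        # Keep the server in recovery, continuously applying WAL.
        "standby_mode = 'on'",
        # Where to stream WAL from: the current master.
        "primary_conninfo = '%s'" % conninfo,
        # Follow the newest timeline instead of the one in the backup.
        "recovery_target_timeline = 'latest'",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder host and user, not values from this thread.
    print(render_recovery_conf("currentmaster.example", "replicator"))
```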

Yours,
Laurenz Albe

In response to

Replication recovery? at 2012-05-17 19:10:27 from John Mudd
