Re: PG replication across DataCenters

From: Sameer Kumar <sameer(dot)kumar(at)ashnik(dot)com>
To: Bill Moran <wmoran(at)potentialtech(dot)com>
Cc: Thomas Harold <thomas-lists(at)nybeta(dot)com>, Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: PG replication across DataCenters
Date: 2013-12-24 06:39:42
Message-ID: CADp-Sm5=PZjiUSkXBNDYC6ZEJgcXi0d4KWg+CGGJiwV1cxGpgQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-docs pgsql-general

Though I will agree that slony is a nice and a great tool w.r.t.
replication (specifically selective replication). But I would dis-agree on
below points:

* Cascading replication chains (a really big deal when you want

multiple slaves in the secondary facility and don't want to hog

your bandwidth)

Really? which version of Postgres are we talking about? I think cascaded
replication facility is available since v9.2
http://www.postgresql.org/docs/9.2/static/warm-standby.html#CASCADING-REPLICATION

* Quick and easy movement of the master to any of the database in

the cluster without destroying replication.

Again, which version? Re-mastering is made simple in v9.3.

* Seeding of new slaves without interrupting existing nodes (assuming

your hardware has a little free capacity)

AFAIK, streaming replication does not cause any interruption while you add
a new node.

* Selective replication of tables, potentially in complex arrangements

where some tables are replicated to only to A and some only to B

and some to A and B, etc, etc.

Agree.

In general I do not like trigger based (replication) solutions for huge
clusters [this is my personal opinion and does not necessarily indicate my
employer's opinion ;-)] and for databases which has huge write volume
specifically if you do bulk insert/delete/update operations.
I think if it's slony or streaming replication will depend on below factors:

1) The change-set that you want to replicate contributes how much of your
total change set? e.g. on a per minute basis if it's 70% or above, I will
recommend you to go for streaming replication

2) Do you have too many tables to be added to replication set? lets say
above 70% of your total tables needs to be replication (unless rest 30%
have high write operations), then go for streaming replication

3) Do you too many bulk operations happening on the tables which needs to
be replicated

4) To some extent your choice will be influenced by the motivation behind
replication, DR, HA, reporting application (esp if you are particular about
replicating only selective tables for reports)

There are few easier ways of managing a slony cluster:

1)
http://shoaibmir.wordpress.com/2009/08/05/setting-up-slony-cluster-with-perltools/<http://shoaibmir.wordpress.com/2009/08/05/setting-up-slony-cluster-with-perltools/>

2) I think even pgadmin supports slony replication (not sure if its slony-I
or slony-II)

Regards
Sameer

In response to

Responses

Browse pgsql-docs by date

  From Date Subject
Next Message Bill Moran 2013-12-29 13:19:27 Re: PG replication across DataCenters
Previous Message Bruce Momjian 2013-12-20 18:58:00 mention of unlogged tables

Browse pgsql-general by date

  From Date Subject
Next Message Sameer Kumar 2013-12-24 07:55:47 Re: [GENERAL] Some good news ([info@arin.net: [arin-announce] ARIN Database Migration Completed])
Previous Message Adrian Klaver 2013-12-24 03:25:16 Re: to_json(now()) result doesn't have 'T' separator