replication choices

From: Ben <bench(at)silentmedia(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: replication choices
Date: 2007-01-25 20:17:52
Message-ID: Pine.LNX.4.64.0701251106380.8626@localhost.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi guys. I've inherited a system that I'm looking to add replication to.
It already has some custom replication code, but it's being nice to say
that code less than good. I'm hoping there's an existing project out there
that will work much better. Unfortunately, I'm not seeing anything that
obviously fits my needs, so maybe somebody here can suggest something.

I've got a single cluster in the datacenter and dozens of remote sites.
Many of these sites are on unreliable connections to the internet, and
while they're online more often then not, when their network will go down
isn't known, and even when it's up, the network isn't that fast.

A vast majority of activity occurs at these remote sites, with very little
at the datacenter cluster. That said, the datacenter cluster needs to keep
pretty good copies of most (but not all) of the data at each site.
Obviously the network unrealiability puts a limit on how up to date the
datacenter can be, but loosing data is considered Bad. So, for instance,
restoring the daily backup of each site at the datacenter is too
infrequent.

Each site will replicate to its own schema in the datacenter cluster, so I
don't *think* I need a multi-master solution.... but at the same time,
because data will be coming from multiple sites, simply replaying WAL
files at the datacenter won't work.

In addition, there will be some data changes made at the datacenter that
will need to replicate to all of the remote sites as soon as they're
online. It's ok if data being replicated from the datacenter ends up in a
different schema at the remote sites than the schema which holds the data
that will be replicated back to the datacenter.

My current best guess of what to do is create a global schema at every
database, a local schema at each site, and a schema for each site at the
datacenter. Then I can use Slony to replicate the global schema from the
datacenter to each site, and again use Slony to replicate the local schema
from each site to that site's schema in the datacenter. But I'm not too
familiar with Slony, and from what I understand, using Slony with bad
networks leads to bad problems. I'm also not sure that Slony supports
replicating from multiple sources to the same postgres install, even if
each replication process is writing to a different schema.

Are there any better options? Or is my Slony idea not so bad?

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Bruce Momjian 2007-01-25 20:19:34 Re: replication choices
Previous Message Furesz Peter 2007-01-25 19:47:28 loop plpgsql recordset variable