[Q] Cluster design for geographically separated dbs

From: "V S P" <toreason(at)fastmail(dot)fm>
To: pgsql-general(at)postgresql(dot)org
Subject: [Q] Cluster design for geographically separated dbs
Date: 2009-03-07 21:03:47
Message-ID: 1236459827.25013.1304185549@webmail.messagingengine.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hello,
I am designing a db to hold often changed user data.

I just wanted to write down what I am thinking and ask people on the
list to comment
if they have any experiences in that area.

My thought is to have
say west-cost and east-cost data center and each user will
go to either East Coast or West Coast

and then within each Coast, I would want to partition by Hash on a user
id.

I am reviewing the Skype paper on the subject

http://kaiv.wordpress.com/2007/07/27/postgresql-cluster-partitioning-with-plproxy-part-i/

And wanted to ask what would be the main challenges I am facing with --
from the experience of the users on this list.

Especially I am not sure how to for example manage 'overlapping unique
IDs' data.

First, say I have a user who is trying to register with the same ID as
somebody
else only in a different data center -- that means that I always have to
check
first in each datacenter if ID exists. Then based on his/her IP address
I decide what data center is closest (but IP addresses are often not a
good indication of geographical location of the user either, so I will
give them a 'manual' select option)

Then if I have say 'BIG' serial in my tables, but since there is more
than one database -- the 'big-serial' in one database can well overlap
it in another database.

So if I have any tables that must contain data from different databases
-- I have to add something else to the 'foreign' key -- besides the
reference to the big serial. And so on...

Right now - on paper, I am just having quite a few 'extra' fields in my
tables just to support 'UNiqueness' of the record across clusters.

I am not sure if I am doing it the right way (because then I also have
to at some point in time 'Defgrament' the IDs (as the data with
BIGserial keys can be deleted).

It looks to me that If I design things to take advantage of Skype's
plproxy -- I will be able to leverage, what appears to be, a relatively
easy way to get data between databases (for reports that span clusters).

thanks in advance for any comments,
Vlad
--
V S P
toreason(at)fastmail(dot)fm

--
http://www.fastmail.fm - Email service worth paying for. Try it for free

In response to

  • Re: VACUUM at 2009-03-07 19:40:50 from John R Pierce

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Willy-Bas Loos 2009-03-07 21:27:02 open up firewall from "anywhere" to postgres ports?
Previous Message John R Pierce 2009-03-07 19:40:50 Re: VACUUM