Re: Fault Tolerant Postgresql (two machines, two postmasters, one disk array)

From: Andrew Sullivan <ajs(at)crankycanuck(dot)ca>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Fault Tolerant Postgresql (two machines, two postmasters, one disk array)
Date: 2007-05-18 14:29:03
Message-ID: 20070518142903.GJ10921@phlogiston.dyndns.org
Lists: pgsql-general

On Thu, May 17, 2007 at 03:55:43PM -0500, Ron Johnson wrote:
> Aren't there PCI heartbeat cards that are independent of the load on
> the host machine?

Yes, there is more than one way to do this. My main point is to
emphasise that you have to pay attention to the details -- all of
them. It's especially important not to trust the vendor to get it
right, because even if they sell a database product themselves, they
may get it wrong. Some failure modes are nearly impossible to
emulate in the lab (how do you cause a brand new working board to
start flaking out as though it has some intermittent problem?). So
you have to make sure that the thing can't wreck your data _by
design_, and not just empirically. This means you have to understand
all the technical details of how the thing works in order to know
whether it is safe. I'm sure we've all seen, more than once, things
happen that the vendor assured us could not.

What this really comes down to is risk analysis. If you add a
complicated failover system to get to five nines, and it breaks, it
might actually make your uptime numbers worse, because it takes so
long to recover from breakage. (If failover doesn't work, do you
have to restore from dumps? How big is your data? Did this outage
just go from five minutes to four hours?) Also, if it is complicated
enough, your sysadmins have a whole new class of loaded foot-gun to
fire at 03:00. So whatever you do, don't let your management talk
themselves into specifying this on Thursday and deploying on Monday.
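
To put rough numbers on that trade-off, here is a back-of-envelope
sketch. Everything in it is an assumption for illustration: the
function names are made up, and the incident rate and failover success
probability are invented. Only the five-minute and four-hour outage
lengths echo the figures above.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def downtime_budget_minutes(availability):
    """Minutes of downtime per year permitted by an availability target."""
    return MINUTES_PER_YEAR * (1.0 - availability)


def expected_downtime_minutes(incidents_per_year, p_failover_works,
                              good_outage_min, bad_outage_min):
    """Expected yearly downtime when each incident either fails over
    cleanly (short outage) or the failover itself breaks (long outage)."""
    per_incident = (p_failover_works * good_outage_min
                    + (1.0 - p_failover_works) * bad_outage_min)
    return incidents_per_year * per_incident


# Hypothetical inputs: two incidents a year, failover works 90% of the time.
budget = downtime_budget_minutes(0.99999)            # the five-nines budget
plain = expected_downtime_minutes(2, 1.0, 5, 5)      # no failover, 5 min each
fancy = expected_downtime_minutes(2, 0.9, 1, 240)    # 1 min, or 4 h if it breaks
print(f"budget={budget:.2f} plain={plain:.1f} fancy={fancy:.1f}")
```

With those made-up inputs the five-nines budget is about 5.26 minutes
a year, the simple setup loses about 10 minutes, and the "highly
available" one loses about 49.8. The exact figures mean nothing; the
point is that the long tail of a broken failover dominates the sum.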

A

--
Andrew Sullivan | ajs(at)crankycanuck(dot)ca
Information security isn't a technological problem. It's an economics
problem.
--Bruce Schneier
