Re: Fault Tolerant Postgresql (two machines, two postmasters, one disk array)

From: Andrew Sullivan <ajs(at)crankycanuck(dot)ca>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Fault Tolerant Postgresql (two machines, two postmasters, one disk array)
Date: 2007-05-17 14:35:25
Message-ID: 20070517143525.GJ6907@phlogiston.dyndns.org
Lists: pgsql-general

On Mon, May 14, 2007 at 10:42:13AM -0500, John Gateley wrote:
> Thanks very much to all who responded, the replies were very helpful.

One thing I will mention, that seems not to have come out in a number
of the replies: the details _really really_ count when you set up
this sort of multi-machine hot failover arrangement.

The general idea is that you have two machines, and the "standby"
machine notices when the "hot" machine disappears, and then mounts
the disk on the standby and takes over for the (now failed) hot
machine.
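
To make the detection step concrete, here is a minimal sketch of a
heartbeat timeout check, assuming the standby simply records when it
last heard from the hot machine. The class name, method names, and the
5-second timeout are all hypothetical and not taken from any
particular clustering product.

```python
import time
from typing import Optional


class HeartbeatMonitor:
    """Hypothetical helper: tracks when the hot machine was last heard from."""

    def __init__(self, timeout_s: float, now: Optional[float] = None) -> None:
        self.timeout_s = timeout_s
        # Treat construction time as the first heartbeat.
        self.last_seen = time.monotonic() if now is None else now

    def beat(self, now: Optional[float] = None) -> None:
        """Record a heartbeat received from the hot machine."""
        self.last_seen = time.monotonic() if now is None else now

    def peer_looks_dead(self, now: Optional[float] = None) -> bool:
        """True when no heartbeat has arrived within the timeout."""
        current = time.monotonic() if now is None else now
        return current - self.last_seen > self.timeout_s


# Explicit timestamps are passed in so the example is deterministic.
monitor = HeartbeatMonitor(timeout_s=5.0, now=0.0)
monitor.beat(now=2.0)
print(monitor.peer_looks_dead(now=6.0))  # 4 s of silence: within the timeout
print(monitor.peer_looks_dead(now=8.0))  # 6 s of silence: past the timeout
```

Note that the check can only report that the peer _looks_ dead. A
missed heartbeat says nothing about whether the hot machine has
actually stopped or is merely too slow to answer.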

The problems come when you get a false detection of machine failure.
Consider a case, for instance, where the machine A gets overloaded,
goes into swap madness, or has a billion runaway processes that cause
it to stagger. In this case, A might not respond in time on the
heartbeat monitor, and then the standby machine B thinks A has
failed. But A doesn't know that, of course, because it is working as
hard as it can just to stay up. Now, if B mounts the disk and starts
the postmaster, but doesn't have a way to make _sure_ that A is
completely disconnected from the disk, then it's entirely possible A
will flush buffers out to the still-mounted data area. Poof!
Instant data corruption.
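
The ordering that avoids this can be sketched as follows: B refuses
to mount anything until it has positive confirmation that A is cut
off from the disk. This is only an illustration under assumptions.
The function take_over and the callables fence_primary, mount_disk,
and start_postmaster are hypothetical stand-ins for whatever
mechanism your hardware provides (a remote power switch, for
example), not a real API.

```python
from typing import Callable


class FencingFailed(RuntimeError):
    """Raised when the standby cannot prove the old primary is off the disk."""


def take_over(
    peer_looks_dead: bool,
    fence_primary: Callable[[], bool],
    mount_disk: Callable[[], None],
    start_postmaster: Callable[[], None],
) -> bool:
    """Take over on the standby only after the primary is confirmed fenced.

    All four arguments are hypothetical hooks supplied by the caller.
    Returns True if the standby took over, False if no failover was needed.
    """
    if not peer_looks_dead:
        return False  # heartbeat is fine; nothing to do

    # A missed heartbeat is not proof of death. Fence first, and only
    # proceed on positive confirmation that A can no longer write.
    if not fence_primary():
        raise FencingFailed("primary not confirmed off the disk; refusing to mount")

    mount_disk()
    start_postmaster()
    return True
```

The point of the design is that a failed or uncertain fence stops the
takeover entirely. Staying down until a human looks at it is a far
better outcome than two machines writing to the same data area.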

People often dismiss these sorts of scenarios as unlikely, because of
the timing issues involved. But you have to remember that, if you're
building this kind of high-availability system, you've already built
your individual servers to be very fault tolerant anyway. They have
loads of extra capacity, ECC memory, multiple redundant data paths,
RAID -- all the goodies. So you're talking about an already
unlikely failure scenario. If you're going to the effort to get an
"extra 9" of availability, then you have to think about not only how
to ensure you get that availability, but the consequences of failure.
In this case, the consequence of having two systems mount the same
data area is extremely serious, and you have to be _absolutely sure_
that A is dead and disconnected from the disk when B mounts that
disk. Anything else is just asking for your weekend to be ruined by
a data recovery.
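
For a sense of scale on what an "extra 9" buys, here is a small
worked calculation. It assumes a 365-day year and takes the jump from
99.9% to 99.99% as the example; the function name is hypothetical and
the figures are plain arithmetic, not measurements of any system.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def downtime_hours_per_year(availability: float) -> float:
    """Allowed downtime per year for a given availability fraction."""
    return (1.0 - availability) * HOURS_PER_YEAR


for availability in (0.999, 0.9999):
    hours = downtime_hours_per_year(availability)
    print(f"{availability:.2%} available: about {hours:.2f} hours down per year")
```

Going from three nines to four takes the yearly downtime budget from
roughly 8.76 hours to under an hour. One weekend spent on data
recovery after a double mount uses up far more than the extra nine
ever saved.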

A

--
Andrew Sullivan | ajs(at)crankycanuck(dot)ca
"The year's penultimate month" is not in truth a good way of saying
November.
--H.W. Fowler
