Re: High Availability with Postgres

From: Greg Smith <greg(at)2ndquadrant(dot)com>
To: John R Pierce <pierce(at)hogranch(dot)com>
Cc: PostgreSQL <pgsql-general(at)postgresql(dot)org>
Subject: Re: High Availability with Postgres
Date: 2010-06-22 03:08:32
Message-ID: 4C202930.9020807@2ndquadrant.com
Lists: pgsql-general

John R Pierce wrote:
> the commercial cluster software vendors insist on using dedicated
> connections for the heartbeat messages between the cluster members and
> insist on having fencing capabilities (for instance, disabling the
> fiber switch port of the formerly active server and enabling the port
> for the to-be-activated server). with linux-ha and heartbeat, you're
> on your own.

This is worth highlighting. As John points out, it's straightforward to
build a shared storage implementation using PostgreSQL and either one of
the commercial clustering systems or Linux-HA. And until PostgreSQL gets
fully synchronous replication, it's a viable alternative for "must not
lose a transaction" deployments when the storage used is much more
reliable than the nodes.

The hard part of shared storage failover is always solving the "shoot
the other node in the head" problem, to keep a down node from coming
back once it's no longer the active one. In order to do that well, you
really need to lock the now unavailable node out of the storage at the
hardware level--"fencing"--with disabling its storage port being one way
to handle that. Figure out how you're going to do that reliably in a way
that's integrated into a proper cluster manager, and there's no reason
you can't do this with PostgreSQL.

There's a description of the fencing options for Linux-HA at
http://www.clusterlabs.org/doc/crm_fencing.html ; the cheap way to solve
this problem is to have a UPS that cuts the power going to the shot
node. Once that's done, you can safely fail over the shared storage to
another system. At that point, you can probably even turn the power
back on, presuming that the now rebooted system will be able to regain
access to the storage during a fresh system start.
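
As a rough sketch of that ordering (this is not taken from Linux-HA or
any real cluster manager; the four callables are hypothetical
placeholders for whatever your UPS, storage, and init scripts actually
provide), the rule is simply that nothing touches the shared storage
until the power-off has been confirmed:

```python
from typing import Callable, List


class FencingError(RuntimeError):
    """Raised when the old primary cannot be confirmed powered off."""


def fail_over(
    power_off_old_primary: Callable[[], bool],
    take_over_storage: Callable[[], None],
    start_postgres: Callable[[], None],
    power_on_old_primary: Callable[[], None],
) -> List[str]:
    """Run a fence-first failover and return the steps that completed.

    All four arguments are hypothetical hooks supplied by the caller;
    power_off_old_primary must return True only when the power is
    known to be off.
    """
    steps: List[str] = []

    # Step 1: fence. If this is not confirmed, stop here rather than
    # risk two nodes writing to the same storage.
    if not power_off_old_primary():
        raise FencingError("fencing not confirmed; leaving storage alone")
    steps.append("fenced")

    # Step 2: only now is it safe to move the storage and start the
    # database on the new node.
    take_over_storage()
    steps.append("storage moved")
    start_postgres()
    steps.append("postgres started")

    # Step 3: restoring power to the old node comes last, after the
    # new node already owns the storage.
    power_on_old_primary()
    steps.append("old node powered on")
    return steps
```

The important design choice is that a failed or unconfirmed fence aborts
the whole procedure; an outage is preferable to two nodes both believing
they own the data directory.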

--
Greg Smith 2ndQuadrant US Baltimore, MD
PostgreSQL Training, Services and Support
greg(at)2ndQuadrant(dot)com www.2ndQuadrant.us
