From: | Benjamin Knoth <knoth(at)mpdl(dot)mpg(dot)de> |
---|---|
To: | <pgsql-admin(at)postgresql(dot)org> |
Subject: | HA-Fedora repository with HA-Postgresql |
Date: | 2011-07-27 08:41:28 |
Message-ID: | 4E2FCF38.4090802@mpdl.mpg.de |
Lists: | pgsql-admin |
Hi all,
I have a Fedora repository which communicates with PostgreSQL.
I would like to have a High Availability solution for this scenario.
My first idea was to use DRBD on two VMs to synchronize both partitions
and to use Pacemaker to control DRBD, the IP, Fedora and PostgreSQL.
This part is done and works.
But now I think it's not enough for an HA solution.
The problem is that Fedora first writes the data to the filesystem,
then it writes the link and the indexes to the DB.
For example, if the DRBD used by Fedora crashes, it is possible that the
operation cannot be finished. Then the data is written and the link
is written in the DB, but not the index of this object.
What should I do? If Fedora's DRBD crashes, Pacemaker will detect it, shut
down the services in order and start them on the other VM. But at this
point the operation is incomplete. The data is written to the filesystem and
the link too, but the index is not. At this moment I have an inconsistent
system.
The normal way to resolve this problem in Fedora is to reindex and
recache. But we need 4 days to recache and reindex all items, and
a downtime of 4 days is not possible for us.
My idea to solve this problem is to delete the last written file on the
Fedora filesystem at time x and to use the WAL in PostgreSQL to restore
the database to time x. After that I should have a consistent system again.
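
A minimal sketch of the filesystem half of this idea, only to make it concrete. It lists the files under a directory that were modified after time x, so they can be reviewed before anything is deleted. The function name `files_written_after` and the use of file modification times are assumptions for illustration, not something Fedora provides, and it is not verified that modification time is a reliable marker for what Fedora wrote last.

```python
import os
from datetime import datetime, timezone


def files_written_after(root, cutoff):
    """Return the sorted paths under root whose mtime is later than cutoff.

    root   -- directory to scan (hypothetical: the Fedora data directory)
    cutoff -- timezone-aware datetime, the "time x" from the text
    """
    cutoff_ts = cutoff.timestamp()
    newer = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Only report the file; deleting is left to a separate, reviewed step.
            if os.path.getmtime(path) > cutoff_ts:
                newer.append(path)
    return sorted(newer)


if __name__ == "__main__":
    # Example value only: time x taken as 2011-07-27 08:00 UTC.
    time_x = datetime(2011, 7, 27, 8, 0, tzinfo=timezone.utc)
    for path in files_written_after(".", time_x):
        print(path)
```

The database half would correspond to PostgreSQL point-in-time recovery, with `recovery_target_time` set to the same time x. That requires a base backup and archived WAL from before x.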
But is my idea realistic, or do I need pgpool-II, Slony, or only DRBD with
Pacemaker?
Does anyone have experience with a scenario like this?
Best regards
--
Benjamin Knoth