From: | Bruce Momjian <bruce(at)momjian(dot)us> |
---|---|
To: | Hannu Krosing <hannu(at)2ndQuadrant(dot)com> |
Cc: | MauMau <maumau307(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Kevin Grittner <kgrittn(at)ymail(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Simon Riggs <simon(at)2ndQuadrant(dot)com>, Rajeev rastogi <rajeev(dot)rastogi(at)huawei(dot)com>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Standalone synchronous master |
Date: | 2014-01-09 17:15:37 |
Message-ID: | 20140109171537.GA4873@momjian.us |
Lists: | pgsql-hackers |
On Thu, Jan 9, 2014 at 04:55:22PM +0100, Hannu Krosing wrote:
> On 01/09/2014 04:15 PM, MauMau wrote:
> > From: "Hannu Krosing" <hannu(at)2ndQuadrant(dot)com>
> >> On 01/09/2014 01:57 PM, MauMau wrote:
> >>> Let me ask a (probably) stupid question. How is the sync rep
> >>> different from RAID-1?
> >>>
> >>> When I first saw sync rep, I expected that it would provide the same
> >>> guarantees as RAID-1 in terms of durability (data is always mirrored
> >>> on two servers) and availability (if one server goes down, another
> >>> server continues full service).
> >> What you describe is most like A-sync rep.
> >>
> >> Sync rep makes sure that data is always replicated before confirming to
> >> writer.
> >
> > Really? RAID-1 is a-sync?
> Not exactly, as there is no "master" just controller writing to two
> equal disks.
>
> But having a "degraded" mode makes it
> more like async - it continues even with single disk and syncs later if
> and when the 2nd disk comes back.

I think RAID-1 is a very good comparison because it is successful
technology and has similar issues.

RAID-1 is like Postgres synchronous_standby_names mode in the sense that
the RAID-1 controller will not return success until writes have happened
on both mirrors, but it is unlike synchronous_standby_names in that it
will degrade and continue writes even when it can't write to both
mirrors. What is being discussed is to allow the RAID-1 behavior in
Postgres.
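
To make the comparison concrete, here is a minimal Python sketch of what a committing session would see in each case. It is only a model of the behavior described above, not Postgres code; the function name and the allow_degrade flag are made up for illustration.

```python
def commit_outcome(standby_acked: bool, allow_degrade: bool) -> str:
    """Model what a committing session sees (illustrative only)."""
    if standby_acked:
        # Both copies have the write, as with a healthy RAID-1 pair.
        return "committed on master and standby"
    if allow_degrade:
        # RAID-1-like behavior: continue on the surviving copy.
        return "committed on master only (degraded)"
    # Behavior described for synchronous_standby_names: keep waiting.
    return "waiting for standby"
```

The only difference between the two modes is the middle branch: whether a missing acknowledgement is allowed to turn into a local-only commit.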

One issue that came up in discussions is that writing a degrade notice
to a server log file is insufficient, because the log file is not
durable against server failures, meaning you don't know if a fail-over
to the slave lost commits. The degrade message has to be stored durably
against a server failure, e.g. sent to a pager, probably using a command
like we do for archive_command, and that command has to return success
before the server continues in degrade mode. I assume degraded RAID-1
controllers inform administrators in the same way.
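
A rough sketch of that rule in Python, assuming a shell command configured by the administrator in the style of archive_command. The notify_command parameter is hypothetical; no such Postgres setting exists as far as this thread goes.

```python
import subprocess


def may_enter_degrade_mode(notify_command: str) -> bool:
    """Run the administrator's notification command (hypothetical setting).

    Degrade mode is allowed only if the command exits with status 0,
    i.e. the notice was stored somewhere that survives a master failure.
    """
    result = subprocess.run(notify_command, shell=True)
    return result.returncode == 0
```

If the command fails, the server would have to keep waiting for the standby rather than continue, since nobody outside the master would know that commits are no longer mirrored.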

I think RAID-1 controllers operate successfully with this behavior
because they are seen as durable and authoritative in reporting the
status of mirrors, while with Postgres, there is no central authority
that can report the degrade status of the master and slaves.

Another concern with degrade mode is that once Postgres enters degrade
mode, how does it get back to synchronous_standby_names mode? We could
have each commit wait for the timeout before continuing, but that is
going to make degrade mode unusably slow. Would there be an admin
command? With a timeout to force degrade mode, a temporary network
outage could cause degrade mode, while our current behavior would
recover synchronous_standby_names mode once the network was repaired.
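
The two ways back to synchronous mode raised above can be sketched as a small state machine. This is only an illustration of the open question, not a proposal; the class, its method names, and the auto_resync switch are all hypothetical.

```python
class ReplicationState:
    """Toy model of sync/degraded transitions (all names hypothetical)."""

    def __init__(self, auto_resync: bool):
        self.auto_resync = auto_resync  # True: return to sync without an admin
        self.mode = "sync"
        self.standby_ready = True

    def standby_timed_out(self) -> None:
        # A timeout, e.g. from a temporary network outage, forces degrade mode.
        self.standby_ready = False
        self.mode = "degraded"

    def standby_caught_up(self) -> None:
        self.standby_ready = True
        if self.auto_resync:
            self.mode = "sync"

    def admin_resume_sync(self) -> bool:
        # An explicit admin command only works once the standby is back.
        if self.standby_ready:
            self.mode = "sync"
        return self.mode == "sync"
```

With auto_resync=False the server stays degraded after the network is repaired until someone issues the admin command, which is the case the last sentence above worries about.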
--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ Everyone has their own god. +