Re: Synchronous Standalone Master Redoux

From: Jose Ildefonso Camargo Tolosa <ildefonso(dot)camargo(at)gmail(dot)com>
To: Aidan Van Dyk <aidan(at)highrise(dot)ca>
Cc: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Synchronous Standalone Master Redoux
Date: 2012-07-13 01:26:51
Message-ID: CAETJ_S84UuyR=tT4ej-TxysTj4rTmAGtieRou+NeyysOnHLPeA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Jul 12, 2012 at 8:29 PM, Aidan Van Dyk <aidan(at)highrise(dot)ca> wrote:
> On Thu, Jul 12, 2012 at 8:27 PM, Jose Ildefonso Camargo Tolosa
>
>> Yeah, you need that with PostgreSQL, but no with DRBD, for example
>> (sorry, but DRBD is one of the flagships of HA things in the Linux
>> world). Also, I'm not convinced about the "2nd standby" thing... I
>> mean, just read this on the docs, which is a little alarming:
>>
>> "If primary restarts while commits are waiting for acknowledgement,
>> those waiting transactions will be marked fully committed once the
>> primary database recovers. There is no way to be certain that all
>> standbys have received all outstanding WAL data at time of the crash
>> of the primary. Some transactions may not show as committed on the
>> standby, even though they show as committed on the primary. The
>> guarantee we offer is that the application will not receive explicit
>> acknowledgement of the successful commit of a transaction until the
>> WAL data is known to be safely received by the standby."
>>
>> So... there is no *real* warranty here either... I don't know how I
>> skipped that paragraph before today.... I mean, this implies that it
>> is possible that a transaction could be marked as commited on the
>> master, but the app was not informed on that (and thus, could try to
>> send it again), and the transaction was NOT applied on the standby....
>> how can this happen? I mean, when the master comes back, shouldn't the
>> standby get the missing WAL pieces from the master and then apply the
>> transaction? The standby part is the one that I don't really get, on
>> the application side... well, there are several ways in which you can
>> miss the "commit confirmation": connection issues in the worst moment,
>> and the such, so, I guess it is not *so* serious, and the app should
>> have a way of checking its last transaction if it lost connectivity to
>> server before getting the transaction commited.
>
> But you already have that in a single server situation as well. There
> is a window between when the commit is "durable" (and visible to
> others, and will be committed after recovery of a crash), when the
> client doesn't yet know it's committed (and might never get the commit
> message due to server crash, network disconnect, client middle-tier
> crash, etc).
>
> So people are already susceptible to that, and defending against it, no? ;-)

Right. What I'm saying is that particular part on the docs:

"If primary restarts while commits are waiting for acknowledgement,
those waiting transactions will be marked fully committed once the
primary database recovers. "(....)"Some transactions may not show as
committed on the standby, even though they show as committed on the
primary."(...)

See? it sounds like, after the primary database recovers, the standby
will still not have the transaction committed, and as far as I thought
I knew, the standby should get that over the WAL stream from master
once it reconnects to it.

>
> And they are susceptible to that if they are on PostgreSQL, Oracle, MS
> SQL, DB2, etc.

Certainly. That's why I said:

(...)"The standby part is the one that I don't really get, on
the application side... well, there are several ways in which you can
miss the "commit confirmation": connection issues in the worst moment,
and the such, so, I guess it is not *so* serious, and the app should
have a way of checking its last transaction if it lost connectivity to
server before getting the transaction commited."

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Jose Ildefonso Camargo Tolosa 2012-07-13 01:36:26 Re: Synchronous Standalone Master Redoux
Previous Message Shigeru HANADA 2012-07-13 01:18:08 Re: pgsql_fdw in contrib