From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Sawada Masahiko <sawada(dot)mshk(at)gmail(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Beena Emerson <memissemerson(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>, Peter Eisentraut <peter_e(at)gmx(dot)net>
Subject: Re: Synch failover WAS: Support for N synchronous standby servers - take 2
Date: 2015-07-03 09:23:20
Message-ID: CAHGQGwGzv7BHUSYO692ifxXxYrzEkaamO6DfXSBieEGtro_QYw@mail.gmail.com
Lists: pgsql-hackers
On Fri, Jul 3, 2015 at 5:59 PM, Sawada Masahiko <sawada(dot)mshk(at)gmail(dot)com> wrote:
> On Fri, Jul 3, 2015 at 12:18 PM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
>> On Fri, Jul 3, 2015 at 6:54 AM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
>>> On 07/02/2015 12:44 PM, Andres Freund wrote:
>>>> On 2015-07-02 11:50:44 -0700, Josh Berkus wrote:
>>>>> So there's two parts to this:
>>>>>
>>>>> 1. I need to ensure that data is replicated to X places.
>>>>>
>>>>> 2. I need to *know* which places data was synchronously replicated to
>>>>> when the master goes down.
>>>>>
>>>>> My entire point is that (1) alone is useless unless you also have (2).
>>>>
>>>> I think there's a good set of usecases where that's really not the case.
>>>
>>> Please share! My plea for usecases was sincere. I can't think of any.
>>>
>>>>> And do note that I'm talking about information on the replica, not on
>>>>> the master, since in any failure situation we don't have the old
>>>>> master around to check.
>>>>
>>>> How would you, even theoretically, synchronize that knowledge to all the
>>>> replicas? Even when they're temporarily disconnected?
>>>
>>> You can't, which is why what we need is the replica's own record of
>>> when it thinks it was last synced: a sync timestamp and
>>> LSN from the last time the replica ack'd a sync commit back to the
>>> master successfully. Based on that information, I can make an informed
>>> decision, even if I'm down to one replica.
>>>
>>>>> ... because we would know definitively which servers were in sync. So
>>>>> maybe that's the use case we should be supporting?
>>>>
>>>> If you want automated failover you need a leader election amongst the
>>>> surviving nodes. The replay position is all they need to elect the node
>>>> that's furthest ahead, and that information exists today.
>>>
>>> I can do that already. If quorum synch commit doesn't help us minimize
>>> data loss any better than async replication or the current 1-redundant,
>>> why would we want it? If it does help us minimize data loss, how?
>>
>> In your example of "2" : { "local_replica", "london_server", "nyc_server" },
>> without something like quorum commit, only local_replica is synchronous
>> and the other two are async. In this case, if the local data center is
>> destroyed, you need to promote either london_server or nyc_server. But
>> since they are async, they might not have data that has already been
>> committed on the master. So data loss! Of course, as I said yesterday,
>> they might have all the data, and then no data loss happens at promotion.
>> But the point is that there is no guarantee of that.
>> OTOH, with quorum commit, we can guarantee that either london_server
>> or nyc_server has all the data that has been committed on the master.
>>
>> So I think that quorum commit is helpful for minimizing data loss.
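[Editor's note: the exact configuration syntax for quorum commit was still under discussion in this thread; a setting along the lines of Josh's example might hypothetically look like the following sketch, where commit waits for acks from any 2 of the 3 listed standbys.]

```
# hypothetical quorum setting (illustrative only, not a released syntax):
# a commit is acknowledged once 2 of the 3 named standbys have confirmed it
synchronous_standby_names = '2 (local_replica, london_server, nyc_server)'
```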
>>
>
> Yeah, quorum commit is helpful for minimizing data loss in comparison
> with today's replication.
> But in your case, how can we know which server we should use as
> the next master after the local data center goes down?
> If we choose the wrong one, we could still lose data.
Check the progress of each server, e.g., by using
pg_last_xlog_replay_location(), and choose the server that is
furthest ahead as the new master.
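[Editor's note: the comparison step above can be sketched as follows. This assumes you have already collected the pg_last_xlog_replay_location() output (an LSN of the form "X/Y" in hex; the function is named pg_last_wal_replay_lsn() in PostgreSQL 10+) from each candidate standby, e.g. via psql. The server names and LSN values are illustrative, not taken from the thread.]

```python
def lsn_to_int(lsn):
    """Convert a PostgreSQL LSN string 'X/Y' (both hex) to a comparable integer."""
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)

def furthest_ahead(candidates):
    """candidates: dict mapping server name -> LSN string.
    Returns the name of the server with the greatest replay location,
    i.e. the safest promotion target."""
    return max(candidates, key=lambda name: lsn_to_int(candidates[name]))

# Example: london_server has replayed slightly more WAL than nyc_server.
standbys = {
    "london_server": "16/B374D848",
    "nyc_server": "16/B374D010",
}
print(furthest_ahead(standbys))  # london_server
```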
Regards,
--
Fujii Masao