From: | Fujii Masao <masao(dot)fujii(at)gmail(dot)com> |
---|---|
To: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
Cc: | Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Noah Misch <noah(at)leadboat(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, Vik Fearing <vik(at)2ndquadrant(dot)fr>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Quorum commit for multiple synchronous replication. |
Date: | 2017-04-18 18:03:38 |
Message-ID: | CAHGQGwE95S5GM9UZh0F3ef2D3iEwJ59skh=EwW5HmDJPe2aXog@mail.gmail.com |
Lists: | pgsql-hackers |
On Tue, Apr 18, 2017 at 7:02 PM, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> On Tue, Apr 18, 2017 at 6:40 PM, Kyotaro HORIGUCHI
> <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
>> At Tue, 18 Apr 2017 14:58:50 +0900, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote in <CAD21AoBqSjUGx0LCDrjEDLB-yx2EvgLMdT8Nz4ZR_xpxrbMU+Q(at)mail(dot)gmail(dot)com>
>>> On Tue, Apr 18, 2017 at 3:04 AM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
>>> > On Wed, Apr 12, 2017 at 2:36 AM, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>>> >> On Thu, Apr 6, 2017 at 4:17 PM, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>>> >>> On Thu, Apr 6, 2017 at 10:51 AM, Noah Misch <noah(at)leadboat(dot)com> wrote:
>>> >>>> On Thu, Apr 06, 2017 at 12:48:56AM +0900, Fujii Masao wrote:
>>> >>>>> On Wed, Apr 5, 2017 at 3:45 PM, Noah Misch <noah(at)leadboat(dot)com> wrote:
>>> >>>>> > On Mon, Dec 19, 2016 at 09:49:58PM +0900, Fujii Masao wrote:
>>> >>>>> >> Regarding this feature, there are some loose ends. We should work on
>>> >>>>> >> and complete them before the release.
>>> >>>>> >>
>>> >>>>> >> (1)
>>> >>>>> >> Which synchronous replication method, priority or quorum, should be
>>> >>>>> >> chosen when neither FIRST nor ANY is specified in s_s_names? Right now,
>>> >>>>> >> a priority-based sync replication is chosen for keeping backward
>>> >>>>> >> compatibility. However some hackers argued to change this decision
>>> >>>>> >> so that a quorum commit is chosen because they think that most users
>>> >>>>> >> prefer a quorum.
>>> >>>>> >>
>>> >>>>> >> (2)
>>> >>>>> >> There will be still many source comments and documentations that
>>> >>>>> >> we need to update, for example, in high-availability.sgml. We need to
>>> >>>>> >> check and update them thoroughly.
>>> >>>>> >>
>>> >>>>> >> (3)
>>> >>>>> >> The priority value is assigned to each standby listed in s_s_names
>>> >>>>> >> even in quorum commit though those priority values are not used at all.
>>> >>>>> >> Users can see those priority values in pg_stat_replication.
>>> >>>>> >> Isn't this confusing? If yes, it might be better to always assign 1 as
>>> >>>>> >> the priority, for example.
>>> >>>>> >
>>> >>>>> > [Action required within three days. This is a generic notification.]
>>> >>>>> >
>>> >>>>> > The above-described topic is currently a PostgreSQL 10 open item. Fujii,
>>> >>>>> > since you committed the patch believed to have created it, you own this open
>>> >>>>> > item. If some other commit is more relevant or if this does not belong as a
>>> >>>>> > v10 open item, please let us know. Otherwise, please observe the policy on
>>> >>>>> > open item ownership[1] and send a status update within three calendar days of
>>> >>>>> > this message. Include a date for your subsequent status update. Testers may
>>> >>>>> > discover new open items at any time, and I want to plan to get them all fixed
>>> >>>>> > well in advance of shipping v10. Consequently, I will appreciate your efforts
>>> >>>>> > toward speedy resolution. Thanks.
>>> >>>>> >
>>> >>>>> > [1] https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com
>>> >>>>>
>>> >>>>> Thanks for the notice!
>>> >>>>>
>>> >>>>> Regarding the item (2), Sawada-san told me that he will work on it after
>>> >>>>> this CommitFest finishes. So we would receive the patch for the item from
>>> >>>>> him next week. If there will be no patch even after the end of next week
>>> >>>>> (i.e., April 14th), I will. Let's wait for Sawada-san's action at first.
>>> >>>>
>>> >>>> Sounds reasonable; I will look for your update on 14Apr or earlier.
>>> >>>>
>>> >>>>> The items (1) and (3) are not bugs. So I don't think that they need to be
>>> >>>>> resolved before the beta release. After the feature freeze, many users
>>> >>>>> will try and play with many new features including quorum-based syncrep.
>>> >>>>> Then if many of them complain about (1) and (3), we can change the code
>>> >>>>> at that timing. So we need more time that users can try the feature.
>>> >>>>
>>> >>>> I've moved (1) to a new section for things to revisit during beta. If someone
>>> >>>> feels strongly that the current behavior is Wrong and must change, speak up as
>>> >>>> soon as you reach that conclusion. Absent such arguments, the behavior won't
>>> >>>> change.
>>> >>>>
>>> >>>>> BTW, IMO (3) should be fixed so that pg_stat_replication reports NULL
>>> >>>>> as the priority if quorum-based sync rep is chosen. It's less confusing.
>>> >>>>
>>> >>>> Since you do want (3) to change, please own it like any other open item,
>>> >>>> including the mandatory status updates.
>>> >>>
>>> >>> I agree to report NULL as the priority. I'll send a patch for this as well.
>>> >>>
>>> >>> Regards,
>>> >>>
>>> >>
>>> >> Attached are two draft patches. The first makes
>>> >> pg_stat_replication.sync_priority report NULL under quorum-based sync
>>> >> replication. To avoid extra changes, I have not yet changed the code
>>> >> that sets the standby priority. The other improves the comments and
>>> >> documentation. If there is anything more we need to mention in the
>>> >> documentation, please give me feedback.
>>> >
>>> > Attached is the modified version of the doc improvement patch.
>>> > Barring any objection, I will commit this version.
>>>
>>> Thank you for updating the patch.
>>>
>>> >
>>> > + In term of performance there is difference between two synchronous
>>> > + replication method. Generally quorum-based synchronous replication
>>> > + tends to be higher performance than priority-based synchronous
>>> > + replication. Because in quorum-based synchronous replication, the
>>> > + transaction can resume as soon as received the specified number of
>>> > + acknowledgement from synchronous standby servers without distinction
>>> > + of standby servers. On the other hand in priority-based synchronous
>>> > + replication, the standby server that the primary server must wait for
>>> > + is fixed until a synchronous standby fails. Therefore, if a server on
>>> > + low-performance machine a has high priority and is chosen as a
>>> > + synchronous standby server it can reduce performance for database
>>> > + applications.
>>> >
>>> > This description looks misleading. A quorum-based sync rep is basically
>>> > more efficient when there are multiple standbys in s_s_names and you want
>>> > to replicate the transactions to some of them synchronously. I think that
>>> > this assumption should be documented explicitly. So I modified this
>>> > description. Please see the modified version in the attached patch.
>>>
>>> You're right. The modified version looks good to me, thanks.
>>
>> It looks better to me, too. But (even I'm not sure, of course)
>> the sentences seem to need improvement.
>>
>> | <para>
>> | Quorum-based synchronous replication is basically more
>> | efficient than priority-based one when you specify multiple
>> | standbys in <varname>synchronous_standby_names</> and want
>> | to synchronously replicate transactions to two or more of
>> | them. In the priority-based case, the replication master
>> | must wait for a reply from the slowest standby in the
>> | required number of standbys in priority order, which may
>> | be slower than the rest.
>
> I supposed that Fujii-san pointed out that quorum-based sync
> replication could be more efficient when we want to replicate the
> transaction to "part of" standbys listed in s_s_names.
Yes.

Anyway, I pushed the patch except for this paragraph.
Regarding this paragraph, a patch with a better description is welcome.
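
To illustrate the "part of the standbys" point, here is a rough sketch, not taken from the patch. It assumes three hypothetical standbys (s1, s2, s3) that are all connected, each with a fixed, made-up acknowledgement latency. With s_s_names = 'FIRST 2 (s1, s2, s3)' the primary waits for the two highest-priority standbys, while with 'ANY 2 (s1, s2, s3)' it waits for whichever two reply first. The function names are mine and only model the wait time.

```python
from typing import Dict, List


def priority_wait(latency: Dict[str, float], names: List[str], k: int) -> float:
    """Model of FIRST k (...): the sync standbys are the first k names in
    priority order, so the commit waits for the slowest of those k."""
    return max(latency[name] for name in names[:k])


def quorum_wait(latency: Dict[str, float], names: List[str], k: int) -> float:
    """Model of ANY k (...): the commit resumes once any k listed standbys
    have acknowledged, i.e. at the k-th smallest latency."""
    return sorted(latency[name] for name in names)[k - 1]


# Hypothetical standbys and latencies (ms); s1 has top priority but is slow.
names = ["s1", "s2", "s3"]
latency_ms = {"s1": 40.0, "s2": 5.0, "s3": 8.0}

print("FIRST 2:", priority_wait(latency_ms, names, 2))  # held back by s1
print("ANY 2:  ", quorum_wait(latency_ms, names, 2))    # s2 and s3 suffice
print("FIRST 3:", priority_wait(latency_ms, names, 3))
print("ANY 3:  ", quorum_wait(latency_ms, names, 3))    # same as FIRST 3
```

When the required number equals the number of listed standbys, both methods wait for the same servers, so the difference only appears when we replicate to a subset of them.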
Regards,
--
Fujii Masao