From: | Fujii Masao <masao(dot)fujii(at)gmail(dot)com> |
---|---|
To: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
Cc: | Thom Brown <thom(at)linux(dot)com>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Beena Emerson <memissemerson(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Support for N synchronous standby servers - take 2 |
Date: | 2016-01-28 11:05:28 |
Message-ID: | CAHGQGwHc838dvraSD2MynnpnVUpwrGWxRUmu7_oe8YaUSegPHQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Wed, Jan 20, 2016 at 2:35 PM, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> On Tue, Jan 19, 2016 at 1:52 AM, Thom Brown <thom(at)linux(dot)com> wrote:
>> On 3 January 2016 at 13:26, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>>> On Fri, Dec 25, 2015 at 7:21 AM, Thomas Munro
>>> <thomas(dot)munro(at)enterprisedb(dot)com> wrote:
>>>> On Fri, Dec 25, 2015 at 8:50 AM, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>>>>> On Wed, Dec 23, 2015 at 8:45 AM, Thomas Munro
>>>>> <thomas(dot)munro(at)enterprisedb(dot)com> wrote:
>>>>>> On Wed, Dec 23, 2015 at 3:50 PM, Thomas Munro
>>>>>> <thomas(dot)munro(at)enterprisedb(dot)com> wrote:
>>>>>>> If you got rid of SyncRepGetSyncStandbysOnePriority as suggested
>>>>>>> above, then this function could be renamed to SyncRepGetSyncStandbys.
>>>>>>> I think it would be a tiny bit nicer if it also took a Size n argument
>>>>>>> along with the output buffer pointer.
>>>>>
>>>>> Sorry, I could not get your point. SyncRepGetSyncStandbysPriority()
>>>>> function uses synchronous_standby_num which is global variable.
>>>>> But you mean that the number of synchronous standbys is given as
>>>>> function argument?
>>>>
>>>> Yeah, I was thinking of it as the output buffer size which I would be
>>>> inclined to make more explicit (I am still coming to terms with the
>>>> use of global variables in Postgres) but it doesn't matter, please
>>>> disregard that suggestion.
>>>>
>>>>>>> As for the body of that function (which I won't paste here), it
>>>>>>> contains an algorithm to find the top K elements in an array of N
>>>>>>> elements. It does that with a linear search through the top K seen so
>>>>>>> far for each value in the input array, so its worst case is O(KN)
>>>>>>> comparisons. Some of the sorting gurus on this list might have
>>>>>>> something to say about that but my take is that it seems fine for the
>>>>>>> tiny values of K and N that we're dealing with here, and it's nice
>>>>>>> that it doesn't need any space other than the output buffer, unlike
>>>>>>> some other top-K algorithms which would win for larger inputs.
>>>>>
>>>>> Yeah, it's improvement point.
>>>>> But I'm assumed that the number of synchronous replication is not
>>>>> large, so I use this algorithm as first version.
>>>>> And I think that its worst case is O(K(N-K)). Am I missing something?
>>>>
>>>> You're right, I was dropping that detail, in the tradition of the
>>>> hand-wavy school of big-O notation. (I suppose you could skip the
>>>> inner loop when the priority is lower than the current lowest
>>>> priority, giving a O(N) best case when the walsenders are perfectly
>>>> ordered by coincidence. Probably a bad idea or just not worth
>>>> worrying about.)
>>>
>>> Thank you for reviewing the patch.
>>> Yeah, I added the logic that skip the inner loop.
>>>
>>>>
>>>>> Attached latest version patch.
>>>>
>>>> +/*
>>>> + * Obtain currently synced LSN location: write and flush, using priority
>>>> - * In 9.1 we support only a single synchronous standby, chosen from a
>>>> - * priority list of synchronous_standby_names. Before it can become the
>>>> + * In 9.6 we support multiple synchronous standby, chosen from a priority
>>>>
>>>> s/standby/standbys/
>>>>
>>>> + * list of synchronous_standby_names. Before it can become the
>>>>
>>>> s/Before it can become the/Before any standby can become a/
>>>>
>>>> * synchronous standby it must have caught up with the primary; that may
>>>> * take some time. Once caught up, the current highest priority standby
>>>>
>>>> s/standby/standbys/
>>>>
>>>> * will release waiters from the queue.
>>>>
>>>> +bool
>>>> +SyncRepGetSyncLsnsPriority(XLogRecPtr *write_pos, XLogRecPtr *flush_pos)
>>>> +{
>>>> + int sync_standbys[synchronous_standby_num];
>>>>
>>>> I think this should be sync_standbys[SYNC_REP_MAX_SYNC_STANDBY_NUM].
>>>> (Variable sized arrays are a feature of C99 and PostgreSQL is written
>>>> in C89.)
>>>>
>>>> +/*
>>>> + * Populate a caller-supplied array which much have enough space for
>>>> + * synchronous_standby_num. Returns position of standbys currently
>>>> + * considered as synchronous, and its length.
>>>> + */
>>>> +int
>>>> +SyncRepGetSyncStandbys(int *sync_standbys)
>>>>
>>>> s/much/must/ (my bad, in previous email).
>>>>
>>>> + ereport(ERROR,
>>>> + (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
>>>> + errmsg("The number of synchronous standbys must be smaller than the
>>>> number of listed : %d",
>>>> + synchronous_standby_num)));
>>>>
>>>> How about "the number of synchronous standbys exceeds the length of
>>>> the standby list: %d"? Error messages usually start with lower case,
>>>> ':' is not usually preceded by a space.
>>>>
>>>> + ereport(ERROR,
>>>> + (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
>>>> + errmsg("The number of synchronous standbys must be between 1 and %d : %d",
>>>>
>>>> s/The/the/, s/ : /: /
>>>
>>> Fixed you mentioned.
>>>
>>> Attached latest v5 patch.
>>> Please review it.
>>
>> synchronous_standby_num doesn't appear to be a valid GUC name:
>>
>> LOG: unrecognized configuration parameter "synchronous_standby_num"
>> in file "/home/thom/Development/test/primary/postgresql.conf" line 244
>>
>> All I did was uncomment it and set it to a value.
>>
>
> Thank you for having a look it.
>
> Yeah, synchronous_standby_num should not exists in postgresql.conf.
> Please test for multiple sync replication with latest patch.
In synchronous_replication_method = 'priority' case, when I set
synchronous_standby_names to invalid value like 'hoge,foo' and
reloaded the configuration file, the server crashed with
the following error. This crash should not happen.
FATAL: invalid input syntax for integer: "hoge"
+ /*
+ * After read all synchronous replication configuration parameter, we apply
+ * settings according to replication method.
+ */
+ ProcessSynchronousReplicationConfig();
Why does the above function need to be called in ProcessConfigFile(), i.e.,
by every postgres processes? I was thinking that only walsender should
call that to check which walsender is synchronous according to the setting.
When synchronous_replication_method = '1-priority' and
synchronous_standby_names = '*', I started one synchronous standby.
Then, when I ran "SELECT * FROM pg_stat_replication", I got the
following WARNING message.
WARNING: detected write past chunk end in ExprContext 0x2acb3c0
I don't think that it's good design to specify the number of sync replicas
to wait for, in synchronous_standby_names. It's confusing for the users.
It's better to add separate parameter (synchronous_standby_num) for
specifying that number. Which increases the number of GUC parameters,
though.
Are we really planning to implement synchronous_replication_method=quorum
at the first version? If not, I'd like to remove s_r_method parameter
because it's meaningless. We can add it later when we implement "quorum".
Regards,
--
Fujii Masao
From | Date | Subject | |
---|---|---|---|
Next Message | Alvaro Herrera | 2016-01-28 11:09:51 | Re: Weighted Stats |
Previous Message | Bruce Momjian | 2016-01-28 11:03:02 | Re: Template for commit messages |