From: | Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp> |
---|---|
To: | Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, robertmhaas(at)gmail(dot)com |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Fix of doc for synchronous_standby_names. |
Date: | 2016-04-22 08:27:07 |
Message-ID: | 5719E05B.4030701@lab.ntt.co.jp |
Lists: | pgsql-hackers |
Horiguchi-san,
On 2016/04/22 14:21, Kyotaro HORIGUCHI wrote:
> I came to think that both of you are misunderstanding how
> synchronous standbys are chosen, so I'd like to clarify the
> behavior.
I certainly had a different (and/or wrong) idea in mind about how this
works. Thanks a lot for clarifying. I'm still a little confused, so
please see below.
> Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
>>>> But this particular sentence seems to be talking
>>>> about what's the case for any given slot.
>>>
>>> Right, that's my reading also.
>
> In SyncRepInitConfig, every walsender sets sync_standby_priority
> by itself. The priority value is the index of its
> application_name in s_s_names list (1 based).
>
> When a walsender receives feedback from a walreceiver, it may
> release backends waiting for certain LSN to be secured.
>
> First, SyncRepGetSyncStandbys collects active standbys. Then it
> loops from the highest priority value to pick up all of the
> active standbys for each priority value until all of the seats
> are occupied. Then SyncRepOldestSyncRepPtr calculates the oldest
> LSNs only among the standbys SyncRepGetSyncStandbys
> returned. Finally, it releases backends using the LSNs.
>
> In short, every 'slot' in s_s_names can correspond to two or
> more *synchronous* standbys.
>
> The resulting behavior is as follows.
>
>> I don't quite understand what the 'sync slot' means. If it
>> means a name in a replication set description, that is, 'nameN'
>> in the following setting of s_s_names.
>>
>> '2(name1, name2, name3)'
>>
>> There may be two or more duplicates even in the
>> single-sync-age. But only one synchronous standby was allowed so
>> any 'sync slot' may have at least one matching synchronous
>> standby in the single-sync-age. This is what I see in the
>> sentence. Is this wrong?
>>
>> Now we can have multiple synchronous standbys, so, for example,
>> if there are three standbys with the name 'name1', two of them are
>> chosen as synchronous. This is a new behavior in the multi-sync-age and
>> syncrep.c has been changed so as to do so.
>>
>> As a supplement, consider the following case.
>>
>> '5(name1, name2, name3)'
>>
>> and the following standbys
>>
>> (name1, name1, name2, name2, name3, name3)
>
> This was a bad example,
>
> (name1, name1, name2, name2, name2, name2, name3)
>
> For this case, the following are chosen as synchronous standbys.
>
> (name1, name1, name2, name2, name2)
>
> Three of the four name2s are chosen, but which of them is an
> implementation matter.
>
>> # However, 5 for three names causes a warning..
OK, I see. I tried to understand it (by carrying it out) using the
following example.
synchronous_standby_names: '3 (aa, aa, bb, bb, cc)'
Individual names get assigned the following priorities based on their
position in the list:
aa: 1, bb: 3, cc: 5
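To make the numbering concrete, here is a rough sketch of how I understand the priority assignment: the 1-based position of the first list entry that matches the standby's application_name. This is not the syncrep.c code; the function name standby_priority is made up for illustration, and plain string equality is a simplification.

```python
def standby_priority(names, application_name):
    """Return the 1-based position of the first matching list entry.

    Returns 0 when the name is not listed (i.e. not a sync candidate).
    Hypothetical helper for illustration only, not the server's code.
    """
    for position, name in enumerate(names, start=1):
        if name == application_name:
            return position
    return 0


names = ["aa", "aa", "bb", "bb", "cc"]
priorities = {n: standby_priority(names, n) for n in ("aa", "bb", "cc")}
print(priorities)  # {'aa': 1, 'bb': 3, 'cc': 5}
```

Because only the first match counts, the second 'aa' and the second 'bb' in the list never contribute a priority of their own.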
Then standbys connect one-by-one, say the following (host and
application_name):
host1: aa
host2: aa
host3: bb
host4: bb
host5: cc
syncrep.c will assign the following priorities (their sync status is also shown):
host1: aa: 1 (sync)
host2: aa: 1 (sync)
host3: bb: 3 (sync)
host4: bb: 3 (potential)
host5: cc: 5 (potential)
As also evident in pg_stat_replication:
SELECT application_name, sync_priority, sync_state
FROM pg_stat_replication ORDER BY 2;
 application_name | sync_priority | sync_state
------------------+---------------+------------
 aa               |             1 | sync
 aa               |             1 | sync
 bb               |             3 | sync
 bb               |             3 | potential
 cc               |             5 | potential
(5 rows)
Hm, I see (though I wondered if one of the 'aa's should have been assigned
priority 2 and likewise for 'bb').
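The result above can be reproduced with a small simulation. This is only a sketch of the behavior as I understand it, not the actual SyncRepGetSyncStandbys() logic: the function pick_sync_standbys is hypothetical, host names are assumed to be unique, and ties between standbys of equal priority are broken by connection order, which is an assumption (the server picks whichever it finds first).

```python
def pick_sync_standbys(num_sync, names, standbys):
    """Label each (host, application_name) pair as sync, potential, or async."""

    def priority(application_name):
        # 1-based position of the first matching name; 0 means "not listed".
        for position, name in enumerate(names, start=1):
            if name == application_name:
                return position
        return 0

    ranked = [(host, app, priority(app)) for host, app in standbys]
    # Only listed standbys compete for sync spots. sorted() is stable, so
    # standbys with equal priority keep their connection order (an assumption
    # of this sketch, standing in for "the first one found").
    candidates = sorted((r for r in ranked if r[2] > 0), key=lambda r: r[2])
    sync_hosts = {host for host, _, _ in candidates[:num_sync]}

    result = []
    for host, app, prio in ranked:
        if prio == 0:
            state = "async"
        elif host in sync_hosts:
            state = "sync"
        else:
            state = "potential"
        result.append((host, app, prio, state))
    return result


names = ["aa", "aa", "bb", "bb", "cc"]
standbys = [("host1", "aa"), ("host2", "aa"), ("host3", "bb"),
            ("host4", "bb"), ("host5", "cc")]
states = pick_sync_standbys(3, names, standbys)
for row in states:
    print(row)
```

With three sync spots, both 'aa' standbys and one 'bb' standby end up as sync, while the other 'bb' and 'cc' remain potential.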
Following is the documentation paragraph under scrutiny:
"""
The name of a standby server for this purpose is the
<varname>application_name</> setting of the standby, as set in the
<varname>primary_conninfo</> of the standby's WAL receiver. There is
no mechanism to enforce uniqueness. In case of duplicates one of the
matching standbys will be considered as higher priority, though
exactly which one is indeterminate.
"""
Consider the name 'aa'. It turns out that there are in fact two connected
standbys with that name, so it is duplicated. However, both got assigned
priority 1 (IOW, neither of them is considered higher priority than the
other, contrary to what the sentence above seems to suggest). Two of the
three sync spots are now occupied.
Then let's look at 'bb'. Here (maybe) it makes sense. Two standbys named
'bb' show up, but only one of the three sync spots is left. So, one of them
is counted as sync whereas the other becomes a potential sync. So, what is
indeterminate is which (standby) becomes which (sync/potential). It is not,
however, which one gets *higher priority*, as the sentence in question says
(both got the same priority, viz. 3).
I'm kind of confused. Should the sentence say something else altogether?
By the way, I see the following in the header comment of SyncRepGetSyncStandbys():
*
* If there are multiple standbys with the same priority,
* the first one found is selected preferentially.
* The caller must hold SyncRepLock.
*
Maybe the following is a possible rewording:
"""
The name of each standby server for this purpose is the
<varname>application_name</> setting of the standby, as set in the
<varname>primary_conninfo</> of the standby's WAL receiver. There is
no mechanism to enforce uniqueness. Multiple standbys with a given
application_name get assigned the same priority. However, if fewer
synchronous spots are left than there are such standbys, which of the
standbys with the same priority become synchronous is indeterminate.
"""
So, per Horiguchi-san's original complaint, the multiplicity of *something*
needed to be reflected after all.
Thoughts?
Thanks,
Amit