From: | Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> |
---|---|
To: | Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, pasim(at)vmware(dot)com |
Cc: | masahiko(dot)sawada(at)2ndquadrant(dot)com, robertmhaas(at)gmail(dot)com, ashwinstar(at)gmail(dot)com, andres(at)anarazel(dot)de, pgsql-hackers(at)postgresql(dot)org, simon(at)2ndquadrant(dot)com, xzhang(at)pivotal(dot)io |
Subject: | Re: SyncRepLock acquired exclusively in default configuration |
Date: | 2020-08-26 17:40:29 |
Message-ID: | 27190204-9e11-6a39-d63f-be4423a8c7a2@oss.nttdata.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 2020/08/20 11:29, Kyotaro Horiguchi wrote:
> At Wed, 19 Aug 2020 13:20:46 +0000, Asim Praveen <pasim(at)vmware(dot)com> wrote in
>>
>>
>>> On 12-Aug-2020, at 12:02 PM, Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com> wrote:
>>>
>>> On Wed, 12 Aug 2020 at 14:06, Asim Praveen <pasim(at)vmware(dot)com> wrote:
>>>>
>>>>
>>>>
>>>>> On 11-Aug-2020, at 8:57 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>>>>>
>>>>> I think this gets to the root of the issue. If we check the flag
>>>>> without a lock, we might see a slightly stale value. But, considering
>>>>> that there's no particular amount of time within which configuration
>>>>> changes are guaranteed to take effect, maybe that's OK. However, there
>>>>> is one potential gotcha here: if the walsender declares the standby to
>>>>> be synchronous, a user can see that, right? So maybe there's this
>>>>> problem: a user sees that the standby is synchronous and expects a
>>>>> transaction committing afterward to provoke a wait, but really it
>>>>> doesn't. Now the user is unhappy, feeling that the system didn't
>>>>> perform according to expectations.
>>>>
>>>> Yes, pg_stat_replication reports a standby in sync as soon as walsender updates priority of the standby to something other than 0.
>>>>
>>>> The potential gotcha referred above doesn’t seem too severe. What is the likelihood of someone setting synchronous_standby_names GUC with either “*” or a standby name and then immediately promoting that standby? If the standby is promoted before the checkpointer on master gets a chance to update sync_standbys_defined in shared memory, commits made during this interval on master may not make it to standby. Upon promotion, those commits may be lost.
>>>
>>> I think that if the standby is quite behind the primary and in case of
>>> the primary crashes, the likelihood of losing commits might get
>>> higher. The user can see the standby became synchronous standby via
>>> pg_stat_replication but commit completes without a wait because the
>>> checkpointer doesn't update sync_standbys_defined yet. If the primary
>>> crashes before standby catching up and the user does failover, the
>>> committed transaction will be lost, even though the user expects that
>>> transaction commit has been replicated to the standby synchronously.
>>> And this can happen even without the patch, right?
>>>
>>
>> It is correct that the issue is orthogonal to the patch upthread and I’ve marked
>> the commitfest entry as ready-for-committer.
Yes, thanks for the review!
> I find the name of SyncStandbysDefined macro is very confusing with
> the struct member sync_standbys_defined, but that might be another
> issue..
>
> - * Fast exit if user has not requested sync replication.
> + * Fast exit if user has not requested sync replication, or there are no
> + * sync replication standby names defined.
>
> This comment sounds like we just do that twice. The reason for the
> check is to avoid wasteful exclusive locks on SyncRepLock, or to form
> double-checked locking on the variable. I think we should explain that
> here.
I added the following comments based on the suggestion by Sawada-san upthread. Thought?
+ * Since this routine gets called every commit time, it's important to
+ * exit quickly if sync replication is not requested. So we check
+ * WalSndCtl->sync_standbys_define without the lock and exit
+ * immediately if it's false. If it's true, we check it again later
+ * while holding the lock, to avoid the race condition described
+ * in SyncRepUpdateSyncStandbysDefined().
Attached is the updated version of the patch. I didn't change how to
fix the issue. But I changed the check for fast exit so that it's called
before setting the "mode", to avoid a few cycle.
Regards,
--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION
Attachment | Content-Type | Size |
---|---|---|
syncreplock_v2.patch | text/plain | 1.3 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Jeff Janes | 2020-08-26 17:52:17 | Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions |
Previous Message | Alvaro Herrera | 2020-08-26 17:17:10 | Re: [POC]Enable tuple change partition caused by BEFORE TRIGGER |