On Thu, May 18, 2017 at 1:43 PM, Thomas Munro
<thomas(dot)munro(at)enterprisedb(dot)com> wrote:
> On Thu, May 11, 2017 at 1:48 PM, Michael Paquier
> <michael(dot)paquier(at)gmail(dot)com> wrote:
>> I had my eyes on the WAL sender code this morning, and I have noticed
>> that walsender.c is not completely consistent in how it does its PID
>> lookups. In two code paths, the PID value is checked without holding
>> the WAL sender spinlock (WalSndRqstFileReload and
>> pg_stat_get_wal_senders), which looks like a very bad idea, contrary
>> to what the new WalSndWaitStopping() does and what InitWalSenderSlot()
>> has been doing for ages.
>
> There is also code that accesses shared walsender state without
> spinlocks over in syncrep.c. I think that file could use a few words
> of explanation for why it's OK to access pid, state and flush without
> synchronisation.

Yes, those fields are read during the quorum and priority sync
evaluation. Except for sync_standby_priority, all the other variables
should be protected by the WAL sender's spinlock; walsender_private.h
is clear about that. So the current coding is inconsistent even there.
Attached is an updated patch.
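
To illustrate the pattern being discussed, here is a minimal standalone
sketch, not the attached patch. It assumes a simplified slot with only
pid, state and flush, and it uses a C11 atomic_flag as a stand-in for
the real spinlock. All names (DemoWalSnd, demo_walsnd_read and so on)
are hypothetical and exist only for this example. The idea is to copy
the shared fields into locals while holding the lock, release it, and
only then act on the copies.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a shared WAL sender slot. */
typedef struct DemoWalSnd
{
    atomic_flag mutex;   /* stand-in spinlock protecting the fields below */
    int         pid;     /* 0 means the slot is unused */
    int         state;
    uint64_t    flush;   /* stand-in for a flush position */
} DemoWalSnd;

/* Private copy of the fields, taken while the lock was held. */
typedef struct DemoWalSndSnapshot
{
    int         pid;
    int         state;
    uint64_t    flush;
} DemoWalSndSnapshot;

static void
demo_lock(atomic_flag *lock)
{
    /* Busy-wait until the flag was previously clear. */
    while (atomic_flag_test_and_set_explicit(lock, memory_order_acquire))
        ;
}

static void
demo_unlock(atomic_flag *lock)
{
    atomic_flag_clear_explicit(lock, memory_order_release);
}

static void
demo_walsnd_init(DemoWalSnd *walsnd)
{
    atomic_flag_clear(&walsnd->mutex);
    walsnd->pid = 0;
    walsnd->state = 0;
    walsnd->flush = 0;
}

/* Writer side: update all fields under the lock. */
static void
demo_walsnd_update(DemoWalSnd *walsnd, int pid, int state, uint64_t flush)
{
    demo_lock(&walsnd->mutex);
    walsnd->pid = pid;
    walsnd->state = state;
    walsnd->flush = flush;
    demo_unlock(&walsnd->mutex);
}

/* Reader side: copy the fields under the lock, then use only the copy. */
static DemoWalSndSnapshot
demo_walsnd_read(DemoWalSnd *walsnd)
{
    DemoWalSndSnapshot snap;

    demo_lock(&walsnd->mutex);
    snap.pid = walsnd->pid;
    snap.state = walsnd->state;
    snap.flush = walsnd->flush;
    demo_unlock(&walsnd->mutex);

    return snap;
}
```

A caller would then test snap.pid and the other copied fields rather
than dereferencing the shared slot again, so that all three values come
from the same consistent moment.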
--
Michael