Re: Race condition in SyncRepGetSyncStandbysPriority

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: masahiko(dot)sawada(at)2ndquadrant(dot)com, masao(dot)fujii(at)oss(dot)nttdata(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Race condition in SyncRepGetSyncStandbysPriority
Date: 2020-04-17 16:30:23
Message-ID: 8896.1587141023@sss.pgh.pa.us
Lists: pgsql-hackers

Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> writes:
> At Fri, 17 Apr 2020 17:03:11 +0900, Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com> wrote in
>> On Fri, 17 Apr 2020 at 14:58, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> wrote:
>> Just for confirmation: since the new approach doesn't change the fact
>> that walsenders reload the new config at their own convenient timing, it
>> can still happen that a walsender releases waiters according to the old
>> config, which defines a smaller number of sync standbys, while the walsenders are

> Right.

>> absorbing a change in the set of synchronous walsenders. In the worst
>> case, where the master crashes in the middle, we cannot be sure how
>> many sync servers the data has been replicated to. Is that right?

> A walsender can set a bogus value as its priority, or in a worse case the
> shared walsender information might belong to another walsender that was
> just launched. In any case SyncRepGetSyncStandbys can return a set
> of walsenders in descending priority order (in priority mode). What can
> happen in the worst case is that some transactions are released based on
> slightly wrong LSN information. Such inconsistency can also happen when
> the oldest sync standby in priority mode goes away and the sync LSN moves
> backwards, even if the walsender list is kept strictly consistent.
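The priority-mode selection the quoted text refers to can be sketched as
follows. This is a simplified model, not the actual SyncRepGetSyncStandbys
code; the function name, the flat arrays, and the selection loop are all
illustrative:

```c
#include <stddef.h>

/* Simplified model of priority-mode sync-standby selection: among the
 * active walsenders with a valid priority, pick the num_sync entries
 * with the lowest priority numbers (lower number = higher priority).
 * Fills chosen[] with their indexes, best first, and returns how many
 * were chosen (possibly fewer than num_sync). */
static int
pick_sync_standbys(const int *priority, const int *active, int nsenders,
                   int num_sync, int *chosen)
{
    int nchosen = 0;

    while (nchosen < num_sync)
    {
        int best = -1;

        for (int i = 0; i < nsenders; i++)
        {
            int already = 0;

            if (!active[i] || priority[i] <= 0)
                continue;

            /* skip senders we already picked in an earlier round */
            for (int j = 0; j < nchosen; j++)
                if (chosen[j] == i)
                    already = 1;
            if (already)
                continue;

            if (best < 0 || priority[i] < priority[best])
                best = i;
        }
        if (best < 0)
            break;              /* fewer than num_sync candidates exist */
        chosen[nchosen++] = best;
    }
    return nchosen;
}
```

In the real server this selection runs against shared memory that other
walsenders may be rewriting, which is where the races discussed in this
thread come from.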

I don't really see a problem here. It's true that transactions might
be released based on either the old or the new value of num_sync,
depending on whether the particular walsender executing the release
logic has noticed the SIGHUP yet. But if a transaction was released,
then there were at least num_sync confirmed transmissions of data
to someplace, so it's not like you've got no redundancy at all.

The only thing that seems slightly odd is that there could in principle
be some transactions released on the basis of the new num_sync, and
then slightly later some transactions released on the basis of the old
num_sync. But I don't think it's really going to be possible to avoid
that, given that the GUC update is propagated in an asynchronous
fashion.
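The redundancy argument above can be made concrete with a small model.
This is illustrative only; would_release and its flat array of confirmed
flush positions are stand-ins, not the actual walsender release code:

```c
#include <stdbool.h>

/* Model of the waiter-release test a walsender applies: under *its own*
 * current view of num_sync, has the waiter's commit LSN been confirmed
 * flushed by at least num_sync sync standbys?  Whenever this returns
 * true, the data demonstrably exists on num_sync standbys -- whether
 * num_sync happens to be the old or the new GUC value at that moment. */
static bool
would_release(const long *confirmed_flush, int nstandbys,
              int num_sync, long waiter_lsn)
{
    int nconfirmed = 0;

    for (int i = 0; i < nstandbys; i++)
        if (confirmed_flush[i] >= waiter_lsn)
            nconfirmed++;
    return nconfirmed >= num_sync;
}
```

So a walsender acting on a stale num_sync can only have required a
different amount of redundancy, never zero redundancy.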

I spent a few moments wondering if we could avoid such cases by having
SyncRepReleaseWaiters check for GUC updates after it's acquired
SyncRepLock. But that wouldn't really guarantee much, since the
postmaster can't deliver SIGHUP to all the walsenders simultaneously.
I think the main practical effect would be to allow some possibly-slow
processing to happen while holding SyncRepLock, which surely isn't a
great idea.

BTW, it might be worth documenting in this thread that my proposed
patch intentionally doesn't move SyncRepReleaseWaiters' acquisition
of SyncRepLock. With the patch, SyncRepGetSyncRecPtr does not require
SyncRepLock so one could consider acquiring that lock only while updating
walsndctl and releasing waiters. My concern about that is that then
it'd be possible for a later round of waiter-releasing to happen on the
basis of slightly older SyncRepGetSyncRecPtr results, if a walsender that
had done SyncRepGetSyncRecPtr first were only able to acquire the lock
second. Perhaps that would be okay, but I'm not sure, so I left it
alone.
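The hazard described here -- stale SyncRepGetSyncRecPtr results being
applied after fresher ones by a walsender that acquired the lock second --
can be illustrated with a toy update function. The names and the single
shared long are assumptions for the sketch; the real code updates
walsndctl while holding SyncRepLock:

```c
/* Toy model of the shared sync position a walsender updates after
 * acquiring the lock.  If walsender A computes lsn=200, walsender B
 * then computes a staler lsn=150, but B acquires the lock after A,
 * applying B's result unconditionally would move the advertised LSN
 * backwards.  A monotonic guard makes out-of-order lock acquisition
 * harmless for the shared position itself. */
static long advertised_lsn = 0;

static void
apply_sync_recptr(long computed_lsn)
{
    if (computed_lsn > advertised_lsn)  /* guard against regression */
        advertised_lsn = computed_lsn;
}
```

Note the guard only protects the shared position; it does not stop the
late walsender from releasing its own waiters on the basis of the older
computation, which is exactly the uncertainty left alone above.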

regards, tom lane
