pgsql: Fix race with synchronous_standby_names at startup

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: pgsql-committers(at)lists(dot)postgresql(dot)org
Subject: pgsql: Fix race with synchronous_standby_names at startup
Date: 2025-04-11 01:04:07
Message-ID: E1u32oR-003fwI-2q@gemulon.postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-committers

Fix race with synchronous_standby_names at startup

synchronous_standby_names cannot be reloaded safely by backends, and the
checkpointer is in charge of updating a state in shared memory if the
GUC is enabled in WalSndCtl, to let the backends know if they should
wait or not for a given LSN. This provides a strict control on the
timing of the waiting queues if the GUC is enabled or disabled, then
reloaded. The checkpointer is also in charge of waking up the backends
that could be waiting for a LSN when the GUC is disabled.

This logic had a race condition at startup, where it would be possible
for backends to not wait for a LSN even if synchronous_standby_names is
enabled. This would cause visibility issues with transactions that we
should be waiting for but they were not. The problem lasts until the
checkpointer does its initial update of the shared memory state when it
loads synchronous_standby_names.

In order to take care of this problem, the shared memory state in
WalSndCtl is extended to detect if it has been initialized by the
checkpointer, and not only check if synchronous_standby_names is
defined. In WalSndCtlData, sync_standbys_defined is renamed to
sync_standbys_status, a bits8 able to know about two states:
- If the shared memory state has been initialized. This flag is set by
the checkpointer at startup once, and never removed.
- If synchronous_standby_names is known as defined in the shared memory
state. This is the same as the previous sync_standbys_defined in
WalSndCtl.

This method gives a way for backends to decide what they should do until
the shared memory area is initialized, and they now ultimately fall back
to a check on the GUC value in this case, which is the best thing that
can be done.

Fortunately, SyncRepUpdateSyncStandbysDefined() is called immediately by
the checkpointer when this process starts, so the window is very narrow.
It is possible to enlarge the problematic window by making the
checkpointer wait at the beginning of SyncRepUpdateSyncStandbysDefined()
with a hardcoded sleep for example, and doing so has showed that a 2PC
visibility test is indeed failing. On machines slow enough, this bug
would cause spurious failures.

In 17~, we have looked at the possibility of adding an injection point
to have a reproducible test, but as the problematic window happens at
early startup, we would need to invent a way to make an injection point
optionally persistent across restarts when attached, something that
would be fine for this case as it would involve the checkpointer. This
issue is quite old, and can be reproduced on all the stable branches.

Author: Melnikov Maksim <m(dot)melnikov(at)postgrespro(dot)ru>
Co-authored-by: Michael Paquier <michael(at)paquier(dot)xyz>
Discussion: https://postgr.es/m/163fcbec-900b-4b07-beaa-d2ead8634bec@postgrespro.ru
Backpatch-through: 13

Branch
------
REL_17_STABLE

Details
-------
https://git.postgresql.org/pg/commitdiff/3339847ccde019e6e691c9d4caec4736c06277bf

Modified Files
--------------
src/backend/replication/syncrep.c | 97 ++++++++++++++++++++++++-----
src/backend/replication/walsender.c | 6 +-
src/include/replication/walsender_private.h | 23 +++++--
3 files changed, 104 insertions(+), 22 deletions(-)

Browse pgsql-committers by date

  From Date Subject
Next Message Peter Eisentraut 2025-04-11 07:07:29 pgsql: Add missing PGDLLIMPORT markings
Previous Message David Rowley 2025-04-10 23:36:43 pgsql: Add code comment explaining ins_since_vacuum and aborted inserts