From: | Noah Misch <noah(at)leadboat(dot)com> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org |
Subject: | SyncRepGetSyncStandbysPriority() vs. SIGHUP |
Date: | 2020-02-06 07:45:52 |
Message-ID: | 20200206074552.GB3326097@rfd.leadboat.com |
Lists: | pgsql-hackers |

Buildfarm runs have triggered the assertion at the end of
SyncRepGetSyncStandbysPriority():
sysname │ snapshot │ branch │ bfurl
──────────┼─────────────────────┼───────────────┼──────────────────────────────────────────────────────────────────────────────────────────────
hoverfly │ 2019-11-22 12:15:08 │ HEAD │ http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=hoverfly&dt=2019-11-22%2012%3A15%3A08
hoverfly │ 2019-11-07 17:19:12 │ REL9_6_STABLE │ http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=hoverfly&dt=2019-11-07%2017%3A19%3A12
nightjar │ 2019-08-13 23:04:41 │ REL_10_STABLE │ http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=nightjar&dt=2019-08-13%2023%3A04%3A41
skink │ 2018-11-28 21:03:35 │ HEAD │ http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2018-11-28%2021%3A03%3A35
On my development system, this delay injection reproduces the failure:
--- a/src/backend/replication/syncrep.c
+++ b/src/backend/replication/syncrep.c
@@ -399,6 +399,8 @@ SyncRepInitConfig(void)
 {
     int priority;
+    pg_usleep(100 * 1000);

SyncRepInitConfig() is the function responsible for updating, after SIGHUP,
the sync_standby_priority values that SyncRepGetSyncStandbysPriority()
consults. The assertion holds if each walsender's sync_standby_priority (in
shared memory) accounts for the latest synchronous_standby_names GUC value.
That ceases to hold for brief moments after a SIGHUP that changes the
synchronous_standby_names GUC value.

I think the way to fix this is to nominate one process to update all
sync_standby_priority values after SIGHUP. That process should acquire
SyncRepLock once per ProcessConfigFile(), not once per walsender. If
walsender startup occurs at roughly the same time as a SIGHUP, the new
walsender should avoid computing sync_standby_priority based on a GUC value
different from the one used for the older walsenders.
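
To make the proposal concrete, here is a minimal, single-threaded C sketch of the idea. It is a toy model, not PostgreSQL code: every name in it (ModelConfig, ModelShared, model_update_one(), model_update_all(), model_add_sender(), model_consistent()) is hypothetical, priorities are simply 1-based positions in a name list, and the place where a real server would take SyncRepLock is only marked by a comment. It contrasts a per-walsender refresh, which leaves a window where stored priorities match neither the old nor the new setting, with a single nominated updater that rewrites every slot in one pass and records which setting it used so that a newly started walsender can compute its priority from the same value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MODEL_MAX_SENDERS 8

/* Hypothetical stand-in for a parsed synchronous_standby_names list. */
typedef struct
{
    const char *const *names;
    size_t      count;
} ModelConfig;

/* Hypothetical stand-in for one walsender's shared-memory slot. */
typedef struct
{
    const char *app_name;
    int         priority;   /* 0 = not listed, else 1-based position */
} ModelWalSender;

typedef struct
{
    ModelWalSender senders[MODEL_MAX_SENDERS];
    size_t      nsenders;
    ModelConfig applied;    /* setting every stored priority was derived from */
} ModelShared;

/* Priority of app_name under cfg: its 1-based position, or 0 if absent. */
static int
model_priority(const char *app_name, ModelConfig cfg)
{
    for (size_t i = 0; i < cfg.count; i++)
        if (strcmp(app_name, cfg.names[i]) == 0)
            return (int) (i + 1);
    return 0;
}

/* Per-walsender refresh: only one slot changes, the others stay stale. */
static void
model_update_one(ModelShared *sh, size_t idx, ModelConfig cfg)
{
    assert(idx < sh->nsenders);
    sh->senders[idx].priority = model_priority(sh->senders[idx].app_name, cfg);
}

/* Nominated updater: rewrite every slot in one critical section. */
static void
model_update_all(ModelShared *sh, ModelConfig cfg)
{
    /* Assumption: a real implementation would hold SyncRepLock here, once. */
    for (size_t i = 0; i < sh->nsenders; i++)
        sh->senders[i].priority = model_priority(sh->senders[i].app_name, cfg);
    sh->applied = cfg;
}

/* New walsender: derive priority from the setting already applied to others. */
static void
model_add_sender(ModelShared *sh, const char *app_name)
{
    ModelWalSender *ws;

    assert(sh->nsenders < MODEL_MAX_SENDERS);
    ws = &sh->senders[sh->nsenders++];
    ws->app_name = app_name;
    ws->priority = model_priority(app_name, sh->applied);
}

/* True if every stored priority is what cfg would assign. */
static bool
model_consistent(const ModelShared *sh, ModelConfig cfg)
{
    for (size_t i = 0; i < sh->nsenders; i++)
        if (sh->senders[i].priority != model_priority(sh->senders[i].app_name, cfg))
            return false;
    return true;
}
```

In this model, swapping the order of two listed standbys and then calling model_update_one() for just one of them leaves model_consistent() false for both the old and the new list, which is the analogue of the brief window described above. Calling model_update_all() instead never exposes such a state to a reader that takes the same lock.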

Would anyone like to fix this? I could add it to my queue, but it would wait
a year or more.

Thanks,
nm