delay starting WAL receiver

From:	Nathan Bossart <nathandbossart(at)gmail(dot)com>
To:	pgsql-hackers(at)postgresql(dot)org
Subject:	delay starting WAL receiver
Date:	2023-01-11 01:08:36
Message-ID:	20230111010836.GA1550875@nathanxps13
Lists:	pgsql-hackers

I discussed this a bit in a different thread [0], but I thought it deserved
its own thread.

After setting wal_retrieve_retry_interval to 1ms in the tests, I noticed
that the recovery tests consistently take much longer. Upon further
inspection, it looks like a similar race condition to the one described in
e5d494d's commit message. With some added debug logs, I see that all of
the callers of MaybeStartWalReceiver() complete before SIGCHLD is
processed, so ServerLoop() waits for a minute before starting the WAL
receiver.

The attached patch fixes this by adjusting DetermineSleepTime() to limit
the sleep to at most 100ms when WalReceiverRequested is set, similar to how
the sleep is limited when background workers must be restarted.
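
As a rough illustration of the idea (not the attached patch itself), the sleep-time clamp could be sketched as below. This is a standalone model: the names wal_receiver_requested, worker_restart_pending, SHORT_SLEEP_MS and determine_sleep_ms() are hypothetical stand-ins, not the real postmaster definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for postmaster state; not the real variables. */
static bool wal_receiver_requested = false;	/* models WalReceiverRequested */
static bool worker_restart_pending = false;	/* models "bgworker must be restarted" */

/* Assumed upper bound on the sleep when something is waiting to be started. */
#define SHORT_SLEEP_MS 100

/*
 * Simplified model of choosing how long the server loop may sleep.
 * default_sleep_ms is the sleep used when nothing is pending (for
 * example 60000 for the one-minute wait mentioned above).
 */
static int
determine_sleep_ms(int default_sleep_ms)
{
	int			sleep_ms = default_sleep_ms;

	assert(default_sleep_ms >= 0);

	/*
	 * If a WAL receiver start was requested (or a worker restart is
	 * pending), do not sleep longer than SHORT_SLEEP_MS, so that a
	 * request that raced with SIGCHLD processing is retried promptly.
	 */
	if ((wal_receiver_requested || worker_restart_pending) &&
		sleep_ms > SHORT_SLEEP_MS)
		sleep_ms = SHORT_SLEEP_MS;

	return sleep_ms;
}
```

In this model, a pending request caps a one-minute sleep at 100ms, while a sleep that is already shorter than 100ms is left unchanged.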

[0] https://postgr.es/m/20221215224721.GA694065%40nathanxps13

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Attachment	Content-Type	Size
v1-0001-handle-race-condition-when-restarting-wal-receive.patch	text/x-diff	2.5 KB

Responses

Re: delay starting WAL receiver at 2023-01-11 04:20:38 from Thomas Munro
