From: | Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> |
---|---|
To: | Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Fujii Masao <fujii(at)postgresql(dot)org>, pgsql-committers(at)lists(dot)postgresql(dot)org |
Subject: | Re: pgsql: Get rid of the dedicated latch for signaling the startup process |
Date: | 2020-11-04 12:03:03 |
Message-ID: | d1260696-30a2-f386-633d-dafd6bc46bd9@oss.nttdata.com |
Lists: | pgsql-committers |
On 2020/11/04 20:20, Fujii Masao wrote:
>
>
> On 2020/11/04 19:27, Heikki Linnakangas wrote:
>> On 04/11/2020 09:44, Fujii Masao wrote:
>>> Get rid of the dedicated latch for signaling the startup process.
>>>
>>> This commit gets rid of the dedicated latch for signaling the startup
>>> process in favor of using its procLatch, since that comports better
>>> with possible generic signal handlers using that latch.
>>>
>>> Commit 1e53fe0e70 changed background processes so that they use standard
>>> SIGHUP handler. Like that, this commit also makes the startup process use
>>> standard SIGHUP handler to simplify the code.
>>
>> This seems to have caused buildfarm member 'elver' to segfault. I have a hunch that setting recoveryWakeupLatch to NULL, when WakeupRecovery() doesn't check for NULL, is not OK. It's surprising that we're only seeing this on 'elver', though.
>
> Thanks for the report! The latch is reset to NULL after ShutdownWalRcv(), so I thought that no process would set the latch after the reset (i.e., that walreceiver had already exited). But this is not true. There seems to be a window between ShutdownWalRcv() and the actual exit of walreceiver. If a signal is sent during that window, a segmentation fault occurs. This would explain why the segfault happened on some platforms but not on others.
>
> I'm thinking of removing the following code to fix this issue. Thoughts?
>
> /*
>  * We don't need the latch anymore. It's not strictly necessary to reset
>  * it to NULL, but let's do it for the sake of tidiness.
>  */
> if (ArchiveRecoveryRequested)
>     XLogCtl->recoveryWakeupLatch = NULL;
Alternatively, ISTM that WakeupRecovery() should set the latch only if
it has not yet been reset to NULL.
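
To make the second option concrete, here is a minimal, self-contained sketch of a NULL-guarded wakeup. It is not the actual PostgreSQL code: the Latch and XLogCtlData types are reduced stand-ins, SetLatch() is a simplified stub, and WakeupRecoveryGuarded() is a hypothetical name chosen for illustration. It runs in a single process and ignores the locking and memory-ordering questions that shared memory would raise.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for PostgreSQL's Latch; only the "set" flag matters here. */
typedef struct Latch
{
    bool        is_set;
} Latch;

/* Stand-in for XLogCtlData, reduced to the one field under discussion. */
typedef struct XLogCtlData
{
    Latch      *recoveryWakeupLatch;    /* NULL once recovery drops it */
} XLogCtlData;

static XLogCtlData XLogCtlStub = {NULL};
static XLogCtlData *XLogCtl = &XLogCtlStub;

/*
 * Simplified stub: it dereferences its argument, so passing NULL is the
 * kind of access that would crash. The assert makes that explicit.
 */
void
SetLatch(Latch *latch)
{
    assert(latch != NULL);
    latch->is_set = true;
}

/*
 * Hypothetical guarded variant of WakeupRecovery(): read the pointer once
 * into a local and set the latch only if it has not been reset to NULL.
 * Returns true if the latch was set, false if there was nothing to wake.
 */
bool
WakeupRecoveryGuarded(void)
{
    Latch      *latch = XLogCtl->recoveryWakeupLatch;

    if (latch == NULL)
        return false;

    SetLatch(latch);
    return true;
}
```

With this guard, a wakeup that arrives after the pointer has been reset becomes a no-op instead of dereferencing NULL, so the reset in the snippet above could stay in place.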
Regards,
--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION