From: | Masahiro Ikeda <ikedamsh(at)oss(dot)nttdata(dot)com> |
---|---|
To: | Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> |
Cc: | kuroda(dot)hayato(at)fujitsu(dot)com, Zhihong Yu <zyu(at)yugabyte(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: make the stats collector shutdown without writing the statsfiles if the immediate shutdown is requested. |
Date: | 2021-03-23 05:54:34 |
Message-ID: | e5e9f874-28f8-2a05-225d-e9ba1ef5fabd@oss.nttdata.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 2021/03/23 11:40, Fujii Masao wrote:
>
>
> On 2021/03/23 9:05, Masahiro Ikeda wrote:
>> Yes. I attached the v5 patch based on v3 patch.
>> I renamed SignalHandlerForUnsafeExit() and fixed the following comment.
>
> Thanks for updating the patch!
>
> When the startup process exits because of recovery_target_action=shutdown,
> reaper() calls TerminateChildren(SIGTERM). This function sends SIGTERM to
> the stats collector. Currently the stats collector ignores SIGTERM, but with
> the patch it exits normally. This change of behavior might be problematic.
>
> That is, TerminateChildren(SIGTERM) sends SIGTERM to various processes.
> But currently the stats collector and checkpointer don't exit even when
> SIGTERM arrives because they ignore SIGTERM. After several processes
> other than the stats collector and checkpointer exit by SIGTERM,
> PostmasterStateMachine() and reaper() make checkpointer exit and then
> the stats collector exit. The shutdown terminates the processes in this order.
>
> On the other hand, with the patch, the stats collector exits by SIGTERM
> before checkpointer exits. This is not normal order of processes to exit in
> shutdown.
>
> To address this issue, one idea is to use SIGUSR2 for normal exit of the stats
> collector, instead of SIGTERM. If we do this, TerminateChildren(SIGTERM)
> cannot terminate the stats collector. Thought?
>
> If we adopt this idea, the detail comment about why SIGUSR2 is used for that
> needs to be added.
Thanks for your comments. I agreed your concern and suggestion.
Additionally, we need to consider system shutdown cycle too as
CheckpointerMain()'s comment said.
```
* Note: we deliberately ignore SIGTERM, because during a standard Unix
* system shutdown cycle, init will SIGTERM all processes at once. We
* want to wait for the backends to exit, whereupon the postmaster will
* tell us it's okay to shut down (via SIGUSR2)
```
I changed the signal from SIGTERM to SIGUSR2 and add the comments why SIGUSR2
is used.
(v6-0001-pgstat_avoid_writing_on_sigquit.patch)
Regards,
--
Masahiro Ikeda
NTT DATA CORPORATION
Attachment | Content-Type | Size |
---|---|---|
v6-0001-pgstat_avoid_writing_on_sigquit.patch | text/x-patch | 8.9 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Amit Kapila | 2021-03-23 06:09:23 | Re: Replication slot stats misgivings |
Previous Message | Fujii Masao | 2021-03-23 05:49:08 | Re: PostmasterIsAlive() in recovery (non-USE_POST_MASTER_DEATH_SIGNAL builds) |