Re: make the stats collector shutdown without writing the statsfiles if the immediate shutdown is requested.

From: Masahiro Ikeda <ikedamsh(at)oss(dot)nttdata(dot)com>
To: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
Cc: kuroda(dot)hayato(at)fujitsu(dot)com, Zhihong Yu <zyu(at)yugabyte(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: make the stats collector shutdown without writing the statsfiles if the immediate shutdown is requested.
Date: 2021-03-23 05:54:34
Message-ID: e5e9f874-28f8-2a05-225d-e9ba1ef5fabd@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2021/03/23 11:40, Fujii Masao wrote:
>
>
> On 2021/03/23 9:05, Masahiro Ikeda wrote:
>> Yes. I attached the v5 patch based on v3 patch.
>> I renamed SignalHandlerForUnsafeExit() and fixed the following comment.
>
> Thanks for updating the patch!
>
> When the startup process exits because of recovery_target_action=shutdown,
> reaper() calls TerminateChildren(SIGTERM). This function sends SIGTERM to
> the stats collector. Currently the stats collector ignores SIGTERM, but with
> the patch it exits normally. This change of behavior might be problematic.
>
> That is, TerminateChildren(SIGTERM) sends SIGTERM to various processes.
> But currently the stats collector and checkpointer don't exit even when
> SIGTERM arrives because they ignore SIGTERM. After several processes
> other than the stats collector and checkpointer exit by SIGTERM,
> PostmasterStateMachine() and reaper() make checkpointer exit and then
> the stats collector exit. The shutdown terminates the processes in this order.
>
> On the other hand, with the patch, the stats collector exits by SIGTERM
> before checkpointer exits. This is not normal order of processes to exit in
> shutdown.
>
> To address this issue, one idea is to use SIGUSR2 for normal exit of the stats
> collector, instead of SIGTERM. If we do this, TerminateChildren(SIGTERM)
> cannot terminate the stats collector. Thought?
>
> If we adopt this idea, the detail comment about why SIGUSR2 is used for that
> needs to be added.

Thanks for your comments. I agreed your concern and suggestion.
Additionally, we need to consider system shutdown cycle too as
CheckpointerMain()'s comment said.

```
* Note: we deliberately ignore SIGTERM, because during a standard Unix
* system shutdown cycle, init will SIGTERM all processes at once. We
* want to wait for the backends to exit, whereupon the postmaster will
* tell us it's okay to shut down (via SIGUSR2)
```

I changed the signal from SIGTERM to SIGUSR2 and add the comments why SIGUSR2
is used.
(v6-0001-pgstat_avoid_writing_on_sigquit.patch)

Regards,

--
Masahiro Ikeda
NTT DATA CORPORATION

Attachment Content-Type Size
v6-0001-pgstat_avoid_writing_on_sigquit.patch text/x-patch 8.9 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2021-03-23 06:09:23 Re: Replication slot stats misgivings
Previous Message Fujii Masao 2021-03-23 05:49:08 Re: PostmasterIsAlive() in recovery (non-USE_POST_MASTER_DEATH_SIGNAL builds)