Re: Use standard SIGHUP and SIGTERM handlers in autoprewarm module

From: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
To: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Craig Ringer <craig(dot)ringer(at)enterprisedb(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robert(dot)haas(at)enterprisedb(dot)com>
Subject: Re: Use standard SIGHUP and SIGTERM handlers in autoprewarm module
Date: 2020-11-18 12:22:50
Message-ID: 1ae40b1b-2f42-4510-1514-40bb7d7255ab@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2020/11/17 21:18, Bharath Rupireddy wrote:
> Thanks Craig!
>
> On Fri, Oct 23, 2020 at 9:37 AM Craig Ringer
> <craig(dot)ringer(at)enterprisedb(dot)com> wrote:
>>
>> src/test/modules/test_shm_mq/worker.c appears to do the right thing the wrong way - it has its own custom handler instead of using die() or SignalHandlerForShutdownRequest().
>>
>
> handle_sigterm() and die() are similar except that die() has extra
> handling(below) for exiting immediately when waiting for input/command
> from the client.
> /*
> * If we're in single user mode, we want to quit immediately - we can't
> * rely on latches as they wouldn't work when stdin/stdout is a file.
> * Rather ugly, but it's unlikely to be worthwhile to invest much more
> * effort just for the benefit of single user mode.
> */
> if (DoingCommandRead && whereToSendOutput != DestRemote)
> ProcessInterrupts();
>
> Having this extra handling is correct for normal backends as they can
> connect directly to clients for reading input commands, the above if
> condition may become true and ProcessInterrupts() may be called to
> handle the SIGTERM immediately.
>
> Note that DoingCommandRead can never be true in bg workers as it's
> being set to true only in normal backend PostgresMain(). If we use
> die()(instead of custom SIGTERM handlers such as handle_sigterm()) in
> a bg worker/non-backend process, there are no chances that the above
> part of the code gets hit and the interrupts are processed
> immediately. And also here are the bg worker process that use die()
> instead of their own custom handlers: parallel workers, autoprewarm
> worker, autovacuum worker, logical replication launcher and apply
> worker, wal sender process
>
> I think we can also use die() instead of handle_sigterm() in
> test_shm_mq/worker.c.
>
> I attached a patch for this change.

Thanks for the patch! It looks good to me.

BTW, I tried to find the past discussion about why die() was not used,
but I could not yet.

>
>>
>> In contrast src/test/modules/worker_spi/worker_spi.c looks plain wrong. Especially since it's quoted as an example of how to do things right. It won't respond to SIGTERM at all while it's executing a query from its queue, no matter how long that query takes or whether it blocks. It can inhibit even postmaster shutdown as a result.
>>
>
> Postmaster sends SIGTERM to all children(backends and bgworkers) for
> both smart shutdown(wait for children to end their work, then shut
> down.) and fast shutdown(rollback active transactions and shutdown
> when they are gone.) For immediate shutdown SIGQUIT is sent to
> children.(see pmdie()).
>
> For each bg worker(so is for worker_spi.c),
> SignalHandlerForCrashExit() is set as SIGQUIT handler in
> InitPostmasterChild(), which does nothing but exits immediately. We
> can not use quickdie() here because as a bg worker we don't have
> to/can not send anything to client. We are good with
> SignalHandlerForCrashExit() as with all other bg workers.
>
> Yes, if having worker_spi_sigterm/SignalHandlerForShutdownRequest,
> sometimes(as explained above) the worker_spi worker may not respond to
> SIGTERM. I think we should be having die() as SIGTERM handler here (as
> with normal backend and parallel workers).
>
> Attaching another patch that has replaced custom SIGTERM handler
> worker_spi_sigterm() with die() and custom SIGHUP handler
> worker_spi_sighup() with standard SignalHandlerForConfigReload().

This patch also looks good to me.

>
>>
>> I was going to lob off a quick patch to fix this by making both use quickdie() for SIGQUIT and die() for SIGTERM, but after reading for a bit I'm no longer sure what the right choice even is. I'd welcome some opinions.
>>
>
> We can not use quickdie() here because as a bg worker we don't have
> to/can not send anything to client. We are good with
> SignalHandlerForCrashExit() as with all other bg workers.
>
> Thoughts?

I think you're right.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2020-11-18 12:31:22 Re: Devel docs on website reloading
Previous Message a.pervushina 2020-11-18 12:05:00 Re: [HACKERS] make async slave to wait for lsn to be replayed