From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Fwd: background worker shudown (was Re: [HACKERS] Why does logical replication launcher exit with exit code 1?) |
Date: | 2018-10-09 18:28:54 |
Message-ID: | CA+TgmobwExL4kNj_eXJxPah_tVQ31N0cYDbUN0FFm6uaY_+X=w@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Thu, Oct 4, 2018 at 7:37 PM Thomas Munro
<thomas(dot)munro(at)enterprisedb(dot)com> wrote:
> I still think the current situation is non-ideal. I don't have a
> strong view on whether some or all system-wide processes should say
> hello and goodbye explicitly in the log, but I do think we need a way
> to make that not look like an error condition, and ideally without
> special cases for known built-in processes.
>
> I looked into this a bit today, while debugging an extension that runs
> a background worker. Here are some assorted approaches to shutdown:
>
> 0. The default SIGTERM handler for bgworkers is bgworker_die(), which
> generates a FATAL ereport "terminating background worker \"%s\" due to
> administrator command", directly in the signal handler (hmm, see also
> recent discussions about the legality of similar code in quickdie()).
Yeah, I think that code is bad news, for the same reasons discussed
with regard to quickdie().
> 1. Some processes install their own custom SIGTERM handler that sets
> a flag, and use that to break out of their main loop and exit quietly.
> Example: autovacuum.c, or the open-source pg_cron extension, and
> probably many other things that just want a nice quiet shutdown.
>
> 2. Some processes install the standard SIGTERM handler die(), and
> then use CHECK_FOR_INTERRUPTS() to break out of their main loop. By
> default this looks like "FATAL: terminating connection due to
> administrator command". This approach is effectively required if the
> main loop will reach other code that has a CHECK_FOR_INTERRUPTS() wait
> loop. For example, a "launcher"-type process that calls
> WaitForBackgroundWorkerStartup() could hang forever if it used
> approach 1 (ask me how I know).
My experience with background workers has been that approach #1 does
not really work. I mean, if you aren't doing anything complicated it
might be OK, if for example you are executing SQL queries, and you try
to do #1, then your SQL queries don't respond to interrupts. And that
sucks. So I have generally adopted approach #2.
To put that another way, nearly everything can reach
CHECK_FOR_INTERRUPTS(), so saying that this is required whenever that
in the case isn't much different from saying that it is required, full
stop.
> 3. Some system processes generally use approach 2, but have a special
> case in ProcessInterrupts() to suppress or alter the usual FATAL
> message or behaviour. This includes the logical replication launcher.
Maybe we could replace this by a general-purpose hook. So instead of
the various tests for process types that are there now, we would just
have
if (procdie_hook != NULL)
(*procdie_hook)();
And that hook can do whatever you like (but probably including dying).
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
From | Date | Subject | |
---|---|---|---|
Next Message | Andrew Dunstan | 2018-10-09 18:32:29 | Re: pread() and pwrite() |
Previous Message | Robert Haas | 2018-10-09 18:27:44 | Re: executor relation handling |