Re: Use of signal-unsafe functions from signal handlers

From: Julien Rouhaud <rjuju123(at)gmail(dot)com>
To: Mats Kindahl <mats(at)timescale(dot)com>
Cc: pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: Use of signal-unsafe functions from signal handlers
Date: 2022-05-24 10:14:18
Message-ID: 20220524101418.7pp65powa362vzln@jrouhaud
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Hi,

On Tue, May 24, 2022 at 11:42:35AM +0200, Mats Kindahl wrote:
>
> Typically, signal-unsafe functions should not be called from signal
> handlers. In particular, calling malloc() directly or indirectly can cause
> deadlocks, making PostgreSQL unresponsive to signals.
>
> Unless I am missing something, bgworker_die uses ereport, which indirectly
> calls printf-like functions, which are not signal-safe since they use
> malloc(). In rare cases, this can lead to deadlocks with stacks that look
> like this (from https://github.com/timescale/timescaledb/issues/4200):
>
> #0 0x00007f0e4d1040eb in __lll_lock_wait_private () from
> target:/lib/x86_64-linux-gnu/libc.so.6
> [...]
> #3 malloc (size=53)
> [...]
> #7 0x000055b9212235b1 in errmsg ()
> #8 0x00007f0e27bf27a8 in handle_sigterm (postgres_signal_arg=15) at
> /build/timescaledb/src/bgw/scheduler.c:841
> #9 <signal handler called>
> [...]
> #13 free (ptr=<optimized out>)
> #14 0x00007f0e4db12cb4 in OPENSSL_LH_free () from
> target:/lib/x86_64-linux-gnu/libcrypto.so.1.1
> [...]

As far as I can see the problem comes from your handle_sigterm:

static void handle_sigterm(SIGNAL_ARGS)
{
/*
* do not use a level >= ERROR because we don't want to exit here but
* rather only during CHECK_FOR_INTERRUPTS
*/
ereport(LOG,
(errcode(ERRCODE_ADMIN_SHUTDOWN),
errmsg("terminating TimescaleDB job scheduler due to administrator command")));
die(postgres_signal_arg);
}

As mentioned in the topmost comment of elog.c, there's an escape hatch for
out of memory situations, to make sure that a small enough message can be
displayed without trying to allocate memory. But this is only for ERROR or
higher levels. Using an ereport(LOG, ...) level in a sigterm handler indeed
isn't safe.

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Juan José Santamaría Flecha 2022-05-24 11:00:55 Re: BUG #17492: error MSB4126: The specified solution configuration "Release|arm64" is invalid
Previous Message Andrey Borodin 2022-05-24 10:08:33 Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY