Re: [BUG] Re-entering malloc problem when use --enable-nls build postgresql

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: 158306855 <anderson2013(at)qq(dot)com>, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: [BUG] Re-entering malloc problem when use --enable-nls build postgresql
Date: 2018-05-08 05:57:43
Message-ID: 20180508055743.2dsdxnp4jdb5auwy@alap3.anarazel.de
Lists: pgsql-bugs

On 2018-05-08 01:32:33 -0400, Tom Lane wrote:
> "158306855" <anderson2013(at)qq(dot)com> writes:
> > I found that compiling postgresql with --enable-nls may introduce a problem:
>
> > 1. When postgresql is built with --enable-nls (Native Language Support), it uses the dgettext function to translate messages.
> > 2. quickdie uses dgettext to translate its message; dgettext uses malloc (in the __dcigettext function).
> > 3. When pg_ctl -m fast is used to shut down postgresql, the backend process uses the quickdie function to shut down the database.
> > 4. If the backend process is inside malloc and already holds its lock when the quickdie signal arrives, the process deadlocks.

I don't think it's realistic to treat this as something we'll
necessarily backpatch. Any fix is going to be too complicated.

> I can't get excited about this. quickdie's attempt to report that it's
> killing the process is necessarily a "best effort" undertaking, because
> we cannot be sure that the process is in a good state. In this situation,
> it isn't. --enable-nls might make the odds of that a bit worse, but we
> could get such a failure regardless.

As before, I disagree with this. There are plenty of ways to reach
quickdie, several where we are quite sure about the state. -m immediate
is a thing, and there are plenty of situations where it's an entirely
reasonable choice.

> There are not any better alternatives. We can't just set a flag in the
> signal handler and hope that control will someday reach a place that
> notices the flag. We could exit without attempting to report anything,
> but nobody would find that user-friendly. So we try to report, in the
> full understanding that sometimes it won't work.

It'd be fairly unproblematic to write an untranslated message out. There
we can make sure to either only use plain syscalls or use memory from
the preallocated context. I think it'd be ok not to translate in
that situation.
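
To illustrate the "plain syscalls" variant, here is a minimal sketch, not
PostgreSQL code. The names write_all_signal_safe and report_untranslated and
the message text are hypothetical, made up for this example. It assumes the
message is a fixed string in static storage, so nothing is allocated and
gettext is never called; write() is on the POSIX list of async-signal-safe
functions.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical, untranslated message kept in static storage. */
static const char quickdie_msg[] =
    "WARNING: terminating connection due to immediate shutdown request\n";

/*
 * Hypothetical helper: push len bytes to fd using only write(2).
 * No malloc, no stdio, no gettext. Retries on EINTR and short writes,
 * and restores errno so the interrupted code does not see it change.
 * Returns 0 on success, -1 on failure.
 */
static int
write_all_signal_safe(int fd, const char *buf, size_t len)
{
    int saved_errno = errno;

    while (len > 0)
    {
        ssize_t n = write(fd, buf, len);

        if (n < 0)
        {
            if (errno == EINTR)
                continue;       /* interrupted before writing; retry */
            errno = saved_errno;
            return -1;
        }
        buf += n;
        len -= (size_t) n;
    }
    errno = saved_errno;
    return 0;
}

/* What a signal handler could call instead of a translated report. */
static int
report_untranslated(int fd)
{
    /* sizeof - 1 skips the trailing NUL and avoids calling strlen(). */
    return write_all_signal_safe(fd, quickdie_msg, sizeof(quickdie_msg) - 1);
}
```

A handler would call something like report_untranslated(STDERR_FILENO). The
preallocated-context variant mentioned above is not shown here.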

We should also use _exit or the like when exiting quickly; calling exit
handlers from a signal handler is a bad, bad idea.
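
To make the exit vs. _exit difference concrete, a small standalone sketch
(again not PostgreSQL code; exit_callback and callback_ran_in_child are
hypothetical names for this example). A forked child registers an atexit()
callback and then terminates one way or the other; the parent checks through
a pipe whether the callback ran.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int marker_fd = -1;

/* Stand-in for an exit callback: writes one byte so the parent sees it ran. */
static void
exit_callback(void)
{
    ssize_t rc = write(marker_fd, "x", 1);

    (void) rc;
}

/*
 * Fork a child that registers exit_callback and then terminates with
 * _exit() (use_underscore_exit != 0) or exit() (use_underscore_exit == 0).
 * Returns 1 if the callback ran in the child, 0 if it did not, -1 on error.
 */
static int
callback_ran_in_child(int use_underscore_exit)
{
    int     fds[2];
    int     status;
    char    c;
    ssize_t n;
    pid_t   pid;

    if (pipe(fds) != 0)
        return -1;

    pid = fork();
    if (pid < 0)
    {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0)
    {
        /* Child: keep only the write end and register the callback. */
        close(fds[0]);
        marker_fd = fds[1];
        atexit(exit_callback);
        if (use_underscore_exit)
            _exit(2);           /* terminates without running atexit callbacks */
        exit(2);                /* runs atexit callbacks first */
    }

    /* Parent: read returns 1 if the marker byte arrived, 0 on EOF. */
    close(fds[1]);
    n = read(fds[0], &c, 1);
    close(fds[0]);
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return (n == 1) ? 1 : 0;
}
```

With exit() the callback runs; with _exit() it is skipped, which is the
behavior wanted when leaving from a signal handler, since the callbacks may
themselves touch state that the interrupted code was in the middle of
changing.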

Greetings,

Andres Freund
