| From: | Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> |
|---|---|
| To: | Nikhil Sontakke <nikhil(dot)sontakke(at)enterprisedb(dot)com> |
| Cc: | pgsql-hackers(at)postgresql(dot)org |
| Subject: | Re: PG signal handler and non-reentrant malloc/free calls |
| Date: | 2011-02-28 12:27:08 |
| Message-ID: | 4D6B949C.5050903@enterprisedb.com |
| Lists: | pgsql-hackers |
On 28.02.2011 14:04, Nikhil Sontakke wrote:
> I believe we have a case where not holding off interrupts while doing a
> malloc() can cause a deadlock due to system or libc level locking. In this
> case, a pg_ctl stop in fast mode was resorted to and that caused a backend
> to handle the interrupt when it was inside the malloc call. Now as part of
> the abort processing, in the subtransaction cleanup code path, this same
> backend tried to clear memory contexts, leading to an eventual free() call.
> The free() call tried to take the same lock which was already held by
> malloc() earlier, resulting in a deadlock!
Our signal handlers shouldn't try to do anything that complicated.
die(), which handles the SIGTERM sent to backends by a fast shutdown,
doesn't do abort processing itself; it just sets a global variable.
The exception is when ImmediateInterruptOK is set, in which case the
interrupt is processed right from the handler, but that flag is only
set around a few blocking system calls where it is safe to do so.
(Checks...) Actually, md5_crypt_verify() looks suspicious: it sets
"ImmediateInterruptOK = true" and then calls palloc() and pfree().
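To illustrate the "handler only sets a global variable" pattern, here is a minimal sketch in plain C. It is not PostgreSQL's actual code: shutdown_pending, handle_sigterm(), install_handler() and check_for_shutdown() are hypothetical names made up for this example, and free() stands in for the memory context cleanup done during abort processing.

```c
#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical flag; set by the handler, read by normal code. */
static volatile sig_atomic_t shutdown_pending = 0;

/*
 * Signal handler: does nothing except record the request.
 * It never calls malloc()/free(), so it cannot deadlock on a
 * libc lock held by the code it interrupted.
 */
static void handle_sigterm(int signo)
{
    (void) signo;
    shutdown_pending = 1;
}

/* Install the handler for SIGTERM using the ISO C signal() call. */
static void install_handler(void)
{
    signal(SIGTERM, handle_sigterm);
}

/*
 * Called from normal code at a point known to be safe, i.e. not
 * in the middle of malloc() or free(). Performs the deferred
 * cleanup and returns true if a shutdown request was pending.
 */
static bool check_for_shutdown(void **buf)
{
    assert(buf != NULL);

    if (!shutdown_pending)
        return false;

    shutdown_pending = 0;
    free(*buf);             /* safe here: we are outside the handler */
    *buf = NULL;
    return true;
}
```

A caller would run install_handler() once and then call check_for_shutdown() between units of work. The hazard discussed in this thread arises when the cleanup is instead run directly from the handler while the interrupted code may be inside the allocator.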
> Will try to get the call stack if needed.
Yes, please.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com