Re: BUG #9464: PANIC with 'failed to re-find shared lock object'

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Peter LaDow <petela(at)gocougs(dot)wsu(dot)edu>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #9464: PANIC with 'failed to re-find shared lock object'
Date: 2014-03-07 04:18:16
Message-ID: 339.1394165896@sss.pgh.pa.us
Lists: pgsql-bugs

Peter LaDow <petela(at)gocougs(dot)wsu(dot)edu> writes:
> On Thu, Mar 6, 2014 at 7:32 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> on_exit_reset() in the first-level child would likely be a good idea.

> Thanks for the tip! I tried it, and things are happy again.

Hah, thanks for confirming the diagnosis.

>> See atexit_callback in src/backend/storage/ipc/ipc.c: your first-level
>> child is killing all the parent backend's shared-memory state when it
>> does exit(). This is a safety feature we added at some point in the

> Ah! Well, I guess I could use _exit() as well...
> Any preference between the two?

I don't think _exit() is a terribly good idea. Consider the possibility
that some third-party library loaded into the backend has also established
an atexit callback, and unlike what we did, that code does need to get
control in a subprocess exit.
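
To illustrate the concern with _exit(), here is a minimal standalone sketch (plain POSIX C, not PostgreSQL code). The names library_fd, library_callback, and library_callback_ran are hypothetical and stand in for a third-party library's atexit handler. For brevity the handler is registered inside the forked process itself rather than inherited from a parent backend. The helper reports whether the handler ran when the process ended via exit() or via _exit().

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for state owned by a third-party library. */
static int library_fd = -1;

/* Hypothetical library atexit handler: leaves one byte in the pipe. */
static void
library_callback(void)
{
    ssize_t n = write(library_fd, "L", 1);

    (void) n;
}

/*
 * Fork a process that registers library_callback and then terminates with
 * either exit() or _exit().  Returns 1 if the handler ran, 0 if it did not,
 * and -1 on setup failure.
 */
int
library_callback_ran(int use_underscore_exit)
{
    int     fds[2];
    pid_t   pid;
    char    c;
    int     ran;

    if (pipe(fds) != 0)
        return -1;

    pid = fork();
    if (pid < 0)
    {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0)
    {
        close(fds[0]);
        library_fd = fds[1];
        atexit(library_callback);

        if (use_underscore_exit)
            _exit(0);           /* skips every atexit handler */
        exit(0);                /* runs the registered handlers */
    }

    close(fds[1]);
    waitpid(pid, NULL, 0);
    ran = (read(fds[0], &c, 1) == 1);   /* EOF means the handler never ran */
    close(fds[0]);
    return ran;
}
```

Calling library_callback_ran(0) should return 1 and library_callback_ran(1) should return 0: _exit() bypasses the handler entirely, which is the hazard for any library that relies on it.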

I'm thinking we failed to consider this situation, and really the right
thing is for atexit_callback() to defend itself against the case.
One possibility is to do this:

static void
atexit_callback(void)
{
    if (getpid() == MyProcPid)
    {
        /* Clean up everything that must be cleaned up */
        /* ... too bad we don't know the real exit code ... */
        proc_exit_prepare(-1);
    }
}

but that's pretty off-the-cuff, so I'm not sure if it still has
failure modes.
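
The idea behind the getpid() check can be tried outside PostgreSQL with a small POSIX sketch. Everything here is an assumption for illustration: owner_pid plays the role of MyProcPid, cleanup_callback stands in for atexit_callback, and a byte written to a pipe stands in for the shared-memory cleanup. The helper count_cleanup_runs forks a mock "backend" that registers the callback, forks a child that calls exit(), and then exits itself.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static pid_t owner_pid;         /* stand-in for MyProcPid */
static int   report_fd = -1;    /* pipe end used to record each cleanup */
static int   use_guard = 1;     /* 0 reproduces the unguarded behavior */

/* Stand-in for atexit_callback: records one byte per cleanup performed. */
static void
cleanup_callback(void)
{
    ssize_t n;

    if (use_guard && getpid() != owner_pid)
        return;                 /* forked child: not our state to clean up */

    n = write(report_fd, "x", 1);
    (void) n;
}

/*
 * Returns how many times the cleanup ran across the mock backend and its
 * forked child, or -1 on setup failure.
 */
int
count_cleanup_runs(int guard)
{
    int     fds[2];
    pid_t   backend;
    char    c;
    int     runs = 0;

    if (pipe(fds) != 0)
        return -1;

    backend = fork();
    if (backend < 0)
    {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (backend == 0)
    {
        pid_t   child;

        close(fds[0]);
        owner_pid = getpid();
        report_fd = fds[1];
        use_guard = guard;
        atexit(cleanup_callback);

        child = fork();
        if (child == 0)
            exit(0);            /* inherits the atexit registration */
        if (child > 0)
            waitpid(child, NULL, 0);
        exit(0);
    }

    close(fds[1]);
    waitpid(backend, NULL, 0);
    while (read(fds[0], &c, 1) == 1)
        runs++;
    close(fds[0]);
    return runs;
}
```

With the guard enabled, count_cleanup_runs(1) should return 1, because only the process that registered the callback performs the cleanup. With the guard disabled, count_cleanup_runs(0) should return 2, since the child's exit() also triggers it, which mirrors the failure reported in this thread.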

Could you try patching atexit_callback as above, and verify that
it does what you want (with or without the on_exit_reset() in your
extension)?

regards, tom lane
