Re: server crash with "process 22821 releasing ProcSignal slot 32, but it contains 0"

From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: server crash with "process 22821 releasing ProcSignal slot 32, but it contains 0"
Date: 2012-06-26 14:19:34
Message-ID: CAHyXU0xHCMy8eWsakxjEzD4K9yQ-VwnW8WPvxCFv4LxeZj+FOA@mail.gmail.com
Lists: pgsql-bugs

On Mon, Jun 25, 2012 at 10:03 AM, Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:
> On Mon, Jun 25, 2012 at 9:57 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Merlin Moncure <mmoncure(at)gmail(dot)com> writes:
>>> 2012-06-25 09:08:08 CDT [postgres(at)ysanalysis_hes]: LOG:  could not
>>> send data to client: Broken pipe
>>> 2012-06-25 09:08:10 CDT [postgres(at)ysanalysis_hes]: LOG:  unexpected
>>> EOF on client connection
>>> 2012-06-25 09:08:10 CDT [postgres(at)ysanalysis_hes]: LOG:  process 22821
>>> releasing ProcSignal slot 32, but it contains 0
>>> 2012-06-25 09:08:10 CDT [postgres(at)ysanalysis_hes]: LOG:  failed to
>>> find proc 0x7f48617e2ab0 in ProcArray
>>> [and a bit later]
>>> 2012-06-25 09:08:24 CDT [postgres(at)ysanalysis_hes]: FATAL:  latch already owned
>>
>> I think what we're looking at here is a screw-up in the process shutdown
>> sequence.  Perhaps caused by bad recovery from an attempt to send an
>> error message to the already-disconnected client; but that's just
>> speculation, and it's hard to see how to get more info without a core
>> dump.
>>
>> I wonder whether we shouldn't promote some or all of these three error
>> cases to PANIC, as they certainly suggest shared-memory corruption.
>> And if it did panic, we could hope to get a core dump for debugging
>> purposes.
>
> Ok, I'll look into reproducing the crash conditions.  Unfortunately
> this is a critical server and it crashed during a time sensitive
> process. I can schedule a maintenance window though but it will have
> to wait a bit.
>
> merlin

I have some good news: this was reproduced, and I believe it to be
operator invoked:

2012-06-26 09:12:19 CDT [postgres(at)ysanalysis_hes]: ERROR: index
"idx_lease_expiremonth2" does not exist
2012-06-26 09:12:19 CDT [postgres(at)ysanalysis_hes]: STATEMENT: DROP
INDEX idx_Lease_ExpireMonth2;
2012-06-26 09:15:10 CDT [rms(at)ysanalysis]: LOG: unexpected EOF on
client connection
2012-06-26 09:15:10 CDT [rms(at)ysanalysis]: LOG: process 10340
releasing ProcSignal slot 5, but it contains 0
2012-06-26 09:15:10 CDT [rms(at)ysanalysis]: LOG: failed to find proc
0x7f48617e6310 in ProcArray
2012-06-26 09:16:48 CDT [rms(at)ysanalysis]: FATAL: latch already owned
2012-06-26 09:16:48 CDT [(at)]: LOG: server process (PID 10928) exited
with exit code 1
2012-06-26 09:16:48 CDT [(at)]: LOG: terminating any other active server processes
2012-06-26 09:16:48 CDT [postgres(at)postgres]: WARNING: terminating
connection because of crash of another server process

...investigating...

merlin
