From: | Merlin Moncure <mmoncure(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-bugs <pgsql-bugs(at)postgresql(dot)org> |
Subject: | Re: server crash with "process 22821 releasing ProcSignal slot 32, but it contains 0" |
Date: | 2012-08-09 21:26:03 |
Message-ID: | CAHyXU0xqeQFZ_qcVyUBubJB+tQ3rAb7g0yp+m1YuLTWGHtu70w@mail.gmail.com |
Lists: | pgsql-bugs |
On Tue, Jun 26, 2012 at 12:09 PM, Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:
> On Tue, Jun 26, 2012 at 12:02 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Merlin Moncure <mmoncure(at)gmail(dot)com> writes:
>>> I suspect (but haven't had time to prove and may not for several days
>>> -- unfortunately going on vacation momentarily) that this might be
>>> caused by pl/sh.
>>
>> Hm. The reported symptoms might be explainable if something had caused
>> multiple threads to become active within the backend process --- then
>> it would be plausible for it to try to do proc_exit cleanup twice.
>> Which would explain the first two errors, though I'm not sure how that
>> leads to failing to disown the process latch, as the third error
>> suggests must have happened. But I don't know enough about pl/sh to
>> know if it could cause threading activation.
>>
>>> In particular, we have a routine that was
>>> inadvertently applied to the database with windows cr/lf instead of
>>> the normal linux newline.
>>
>> This doesn't seem real promising as an explanation ...
>
> right -- just a suspicion. maybe the relevant point was that it
> immediately failed. operator invoking the busted routine (which I had
> to fix) and the crash were highly correlated, although it does not
> always crash. yesterday was very heavy load and today not so much.
Following up on this: it is pl/sh, and it is a newline issue. One of the
developers is using a tool (pgAdmin, I think) that is sticking \r
characters at the end of every line, which throws off pl/sh's
shebang parsing. The issuing query gets an error along the lines of
'could not exec', and the server goes belly up if there is significant
concurrent load when that query is issued. This is an out-of-date pl/sh,
so I'm going to upgrade it and try to reproduce. If I still can, I'll
supply a test case.
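
To illustrate the failure mode, here is a minimal Python sketch of how a trailing \r can end up in the interpreter path taken from a shebang line. This is not pl/sh's actual parser (which is written in C); parse_shebang and normalize_newlines are hypothetical helpers, and the assumption is simply that the parser splits on \n and trims only spaces and tabs.

```python
def parse_shebang(source: str) -> str:
    """Return the interpreter path from a script's first line.

    Hypothetical helper: it assumes a naive parser that splits on '\n'
    and trims only spaces and tabs, so a stray '\r' survives.
    """
    first_line = source.split("\n", 1)[0]
    if not first_line.startswith("#!"):
        raise ValueError("no shebang line")
    return first_line[2:].strip(" \t").split(" ", 1)[0]


def normalize_newlines(source: str) -> str:
    """Convert Windows CR/LF line endings to plain LF (hypothetical helper)."""
    return source.replace("\r\n", "\n")


unix_src = "#!/bin/sh\necho hello\n"
dos_src = unix_src.replace("\n", "\r\n")  # what a CR/LF-writing tool would store

print(repr(parse_shebang(unix_src)))                     # '/bin/sh'
print(repr(parse_shebang(dos_src)))                      # '/bin/sh\r'
print(repr(parse_shebang(normalize_newlines(dos_src))))  # '/bin/sh'
```

With the CR/LF source the interpreter path comes out as '/bin/sh\r', which names no real file, so an exec of it would be expected to fail. Normalizing the line endings before the function body is stored avoids that.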
merlin