From: | Herouth Maoz <herouth(at)unicell(dot)co(dot)il> |
---|---|
To: | Craig Ringer <craig(at)postnewspapers(dot)com(dot)au> |
Cc: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: stopping processes, preventing connections |
Date: | 2010-03-17 13:07:54 |
Message-ID: | 0F60ED4C-4FB6-48C7-907B-AFC4A8748062@unicell.co.il |
Lists: | pgsql-general |
On Mar 17, 2010, at 14:56 , Craig Ringer wrote:
> On 17/03/2010 8:43 PM, Herouth Maoz wrote:
>>
>> On Mar 17, 2010, at 13:34 , Craig Ringer wrote:
>>
>>> On 17/03/2010 6:32 PM, Herouth Maoz wrote:
>>>>
>>>> On Mar 3, 2010, at 18:01 , Josh Kupershmidt wrote:
>>>>
>>>>> Though next time you see a query which doesn't respond to
>>>>> pg_cancel_backend(), try gathering information about the query and
>>>>> what the backend is doing; either you're doing something unusual (e.g.
>>>>> an app is restarting the query automatically after getting canceled)
>>>>> or perhaps you've stumbled on a bug in Postgres.
>>>>
>>>> Hi. A long time has passed since you made that suggestion, but today we
>>>> stumbled again on a query that wouldn't be canceled. Not only does it
>>>> not respond to pg_cancel_backend(), it also doesn't respond to kill
>>>> -SIGTERM.
>>>
>>> Interesting. If you attach gdb to the backend and run "backtrace", what's the output?
>>
>> (gdb) backtrace
>> #0 0x8dfcb410 in ?? ()
>> #1 0xbff10a28 in ?? ()
>> #2 0x083b1bf4 in ?? ()
>> #3 0xbff10a00 in ?? ()
>> #4 0x8db98361 in send () from /lib/tls/i686/cmov/libc.so.6
>> #5 0x08195d54 in secure_write ()
>> #6 0x0819dc7e in pq_setkeepalivesidle ()
>> #7 0x0819ddd5 in pq_flush ()
>> #8 0x0819de3d in pq_putmessage ()
>> #9 0x0819fa63 in pq_endmessage ()
>> #10 0x08086dcb in printtup_create_DR ()
>> #11 0x08178dc4 in ExecutorRun ()
>> #12 0x08222326 in PostgresMain ()
>> #13 0x082232c0 in PortalRun ()
>> #14 0x0821e27d in pg_parse_query ()
>> #15 0x08220056 in PostgresMain ()
>> #16 0x081ef77f in ClosePostmasterPorts ()
>> #17 0x081f0731 in PostmasterMain ()
>> #18 0x081a0484 in main ()
>
> OK, so it seems to be stuck sending data down a socket. The fact that strace isn't reporting any new system calls suggests the backend is just blocked on that send() call and isn't doing any work.
>
> Is there any chance the client has disconnected/disappeared?
Yes, certainly. In fact, I mentioned in the past that the product we use for our reports is an application built on top of Crystal Reports. When told to cancel a report, or when a report times out, it does not tell Crystal to cancel its queries properly; it simply kills Crystal's processes on the Windows machine, which leaves us with orphan backends. It's stupid, but it's not under our control. Most of the time, though, the backends do respond to cancel requests.
Aren't socket writes supposed to have timeouts of some sort? Stupid policies notwithstanding, processes on the client side can disappear for any number of reasons (bugs, power failures, whatever), and I would assume this is not supposed to cause a backend to hang.
Is there anything I can do about it?
Herouth
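
On the timeout question above: a blocking send() has no deadline of its own unless the application sets one, so a writer whose peer never drains the socket simply waits. The sketch below is a minimal local illustration in Python, not PostgreSQL's code. It assumes a socketpair() standing in for the client connection, and the function name fill_until_blocked is hypothetical. It writes to a peer that never reads and shows that the write only gives up because a timeout was set explicitly.

```python
import socket


def fill_until_blocked(timeout_s=0.2, chunk_size=65536):
    """Write to a peer that never reads; return (bytes_accepted, timed_out).

    Hypothetical helper for illustration only. The connected pair stands in
    for a backend/client connection whose client has stopped reading.
    """
    writer, silent_peer = socket.socketpair()
    try:
        # Opt-in deadline. With the default (no timeout), the send() call
        # below would block indefinitely once the kernel buffers are full.
        writer.settimeout(timeout_s)
        chunk = b"x" * chunk_size
        sent = 0
        try:
            while True:
                # send() may accept only part of the chunk; count what it took.
                sent += writer.send(chunk)
        except socket.timeout:
            # No buffer space became available within timeout_s seconds.
            return sent, True
    finally:
        writer.close()
        silent_peer.close()


if __name__ == "__main__":
    sent, timed_out = fill_until_blocked()
    print(f"kernel accepted {sent} bytes before send() stalled; timed_out={timed_out}")
```

The number of bytes accepted before the stall depends on the operating system's socket buffer sizes, so it varies between machines. The point is only that the early writes succeed and the stall comes later, which matches a backend that is partway through returning rows when the client disappears.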