Re: BUG #15036: Un-killable queries Hanging in BgWorkerShutdown

From: David Kohn <djk447(at)gmail(dot)com>
To: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Cc: pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #15036: Un-killable queries Hanging in BgWorkerShutdown
Date: 2018-01-30 03:33:17
Message-ID: CAJhMaBhW9x9ER4btTUXyubGhuF9F8ca162sU5exkaPnL+cCvOQ@mail.gmail.com
Lists: pgsql-bugs

Responses interleaved with yours.
On Mon, Jan 29, 2018 at 9:07 PM Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
wrote:

> Hi David,
>
> Thanks for the report! Based on the mention of BtreePage, this sounds
> like the following bug:
>
> https://www.postgresql.org/message-id/flat/CAEepm%3D2xZUcOGP9V0O_G0%3D2P2wwXwPrkF%3DupWTCJSisUxMnuSg%40mail.gmail.com
>
> The fix for that will be in 10.2 (current target date: February 8th).
> The workaround in the meantime would be to disable parallelism, at
> least for the queries doing parallel index scans if you can identify
> them.

That sounds great. I hope that patch will fix it, though I'm not quite sure
it will. Some of the hung queries have workers in the BtreePage state, but
at least as many have only workers in the MessageQueuePutMessage state.
Would you expect the patch to fix those as well, or could it be something
different?
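
For reference, here is a minimal sketch of how the wait states could be
tallied to compare the two groups. It assumes the PostgreSQL 10 column names
in pg_stat_activity; the helper name tally_wait_events and the sample rows
are hypothetical and are not taken from the affected server.

```python
from collections import Counter

# Query one might run to fetch the raw rows (PostgreSQL 10 column names).
# It is only shown here as a string; nothing in this sketch connects to a server.
WAIT_EVENT_QUERY = """
SELECT pid, wait_event
FROM pg_stat_activity
WHERE wait_event IS NOT NULL
"""


def tally_wait_events(rows):
    """Count backends per wait_event; rows are (pid, wait_event) pairs."""
    return Counter(event for _pid, event in rows if event is not None)


# Hypothetical sample rows, for illustration only.
sample = [
    (101, "BtreePage"),
    (102, "MessageQueuePutMessage"),
    (103, "MessageQueuePutMessage"),
    (104, None),  # a backend that is not waiting on anything
]
counts = tally_wait_events(sample)
```

Grouping the same rows by leader query would show which hung queries have
only MessageQueuePutMessage workers, but that needs a way to tie workers to
their leader, which this sketch leaves out.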

>
> However, I'm not entirely sure why you're not able to cancel these
> backends politely with pg_cancel_backend(). For example, the
> BtreePage waiter should be in ConditionVariableSleep() and should be
> interrupted by such a signal and error out in CHECK_FOR_INTERRUPTS().

All of them are definitely un-killable by anything other than a kill -9, as
far as I've found so far. I have a feeling it has something to do with:
https://jobs.zalando.com/tech/blog/hack-to-terminate-tcp-conn-postgres/?gh_src=4n3gxh1
but I'm not 100% sure, as I didn't set the TCP settings low enough to make
catching a packet all that reasonable. I'm happy to investigate further; I
just don't quite know what that should entail. If there is anything you
think would be helpful, please let me know.
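
To be explicit about what "politely" would normally mean before falling back
to kill -9, here is a small sketch of the usual escalation order. The
function name escalation_statements and the pid are made up for
illustration; pg_cancel_backend() and pg_terminate_backend() are the
standard server functions, which send SIGINT and SIGTERM respectively.

```python
def escalation_statements(pid):
    """Return the SQL to try, gentlest first, before resorting to kill -9."""
    pid = int(pid)  # refuse anything that is not a plain integer pid
    return [
        # Sends SIGINT: asks the backend to cancel its current query.
        f"SELECT pg_cancel_backend({pid});",
        # Sends SIGTERM: asks the backend to end the whole session.
        f"SELECT pg_terminate_backend({pid});",
    ]


steps = escalation_statements(12345)  # 12345 is a made-up pid
```

Both of these rely on the backend reaching an interrupt check, which is
consistent with a stuck process ignoring them.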

Thanks for the help!
D
