Re: BUG #15036: Un-killable queries Hanging in BgWorkerShutdown

From: David Kohn <djk447(at)gmail(dot)com>
To: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Cc: pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #15036: Un-killable queries Hanging in BgWorkerShutdown
Date: 2018-02-01 22:54:22
Message-ID: CAJhMaBjocndZwmSDWNsTFVurBRn7tJ5Fv1rk4+Zq-JmKrhDX8Q@mail.gmail.com
Lists: pgsql-bugs

On Mon, Jan 29, 2018 at 11:08 PM Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
wrote:

>
> the real question is: why on earth aren't the wait loops responding to
> SIGINT and SIGTERM? I wonder if there might be something funky about
> parallel query + statement timeouts.

Agreed. Seems like a backtrace wouldn't help much. I saw the other thread
<https://www.postgresql.org/message-id/flat/CAKcux6kfXvOgz5WwE7Pc%2BpW%2BOpW-%2Bnvcu9ybJF%2Bjvq%2BnA87J%2Bg%40mail(dot)gmail(dot)com#CAKcux6kfXvOgz5WwE7Pc+pW+OpW-+nvcu9ybJF+jvq+nA87J+g(at)mail(dot)gmail(dot)com>
with similar cancellation issues. A couple of notes that might help:
1) I also have a lateral select inside of a view there. It seems doubtful
that the lateral has anything to do with it, but in case it does, I
thought I'd pass that along.
2) Are there any settings that could potentially help with this? For
instance, this isn't on a replica, so max_standby_archive_delay doesn't
apply, but is there anything similar that could (potentially) cancel a
query more forcefully here? As you noted, we've already set a statement
timeout, and the query isn't responding to that, but it does get cancelled
when another (hung) process is SIGKILL-ed. When that happens the db goes
into recovery mode, so is the query being sent SIGKILL at that point as
well? Or is it some other signal that is a little less invasive? Probably
not, but thought I'd ask.

best,
D
