From: | Noah Misch <noah(at)leadboat(dot)com> |
---|---|
To: | Francesco Degrassi <francesco(dot)degrassi(at)optionfactory(dot)net> |
Cc: | pgsql-bugs(at)lists(dot)postgresql(dot)org |
Subject: | Re: Leader backend hang on IPC/ParallelFinish when LWLock held at parallel query start |
Date: | 2024-09-18 03:01:59 |
Message-ID: | 20240918030159.2a.nmisch@google.com |
Lists: | pgsql-bugs |
On Mon, Sep 16, 2024 at 09:35:13PM +0200, Francesco Degrassi wrote:
> The problem appears to manifest when a backend is holding an LWLock and
> starting a query, and the planner chooses a parallel plan for the
> latter.
Thanks for the detailed report and for the fix.
> Potential fixes
> ---------------
>
> As an experiment, we modified the planner code to consider the state of
> `InterruptHoldoffCount` when determining the value of
> `glob->parallelOK`: if `InterruptHoldoffCount` > 0, then `parallelOK`
> is set to false.
>
> This ensures a sequential plan is executed if interrupts are being held
> on the leader backend, and the query completes normally.
>
> The patch is attached as `no_parallel_on_interrupts_held.patch`.
Looks good. An alternative would be something like the leader periodically
waking up to call HandleParallelMessages() outside of ProcessInterrupts(). I
like your patch better, though. Parallel query is a lot of infrastructure to
be running while immune to statement_timeout, pg_cancel_backend(), etc. I
opted to check INTERRUPTS_CAN_BE_PROCESSED() rather than InterruptHoldoffCount
alone, since QueryCancelHoldoffCount != 0 doesn't cause the hang but still
qualifies as a good reason to stay out of parallel query. Pushed that way:
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=ac04aa8
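To illustrate the difference between the two checks, here is a rough standalone model. It is not the committed code. The counters are named after the backend's globals but are redefined locally, the three-counter condition is how I understand INTERRUPTS_CAN_BE_PROCESSED() in miscadmin.h, and `parallel_ok_holdoff_only()` and `parallel_ok_can_process()` are hypothetical names invented for this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Local stand-ins for the backend's interrupt counters (not the real globals). */
int InterruptHoldoffCount = 0;
int QueryCancelHoldoffCount = 0;
int CritSectionCount = 0;

/* Models HOLD_INTERRUPTS(), which LWLockAcquire() performs internally. */
void hold_interrupts(void)
{
	InterruptHoldoffCount++;
}

/* Models RESUME_INTERRUPTS(), which LWLockRelease() performs internally. */
void resume_interrupts(void)
{
	assert(InterruptHoldoffCount > 0);
	InterruptHoldoffCount--;
}

/* Assumed to mirror the condition tested by INTERRUPTS_CAN_BE_PROCESSED(). */
bool interrupts_can_be_processed(void)
{
	return InterruptHoldoffCount == 0 &&
		CritSectionCount == 0 &&
		QueryCancelHoldoffCount == 0;
}

/* Hypothetical: the rule from the submitted patch. */
bool parallel_ok_holdoff_only(void)
{
	return InterruptHoldoffCount == 0;
}

/* Hypothetical: the broader rule described in this reply. */
bool parallel_ok_can_process(void)
{
	return interrupts_can_be_processed();
}
```

While an LWLock is held (modeled by `hold_interrupts()`), both rules refuse a parallel plan. With only QueryCancelHoldoffCount raised, the first rule still allows one and the second refuses it.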
> Related issues
> ==============
>
> - Query stuck with wait event IPC / ParallelFinish
> -
> https://www.postgresql.org/message-id/0f64b4c7fc200890f2055ce4d6650e9c2191fac2.camel@j-davis.com
This one didn't reproduce for me. Like your test, it involves custom code
running inside an opclass. I'm comfortable assuming it's the same problem.
> - BUG #18586: Process (and transaction) is stuck in IPC when the DB
> is under high load
> -
> https://www.postgresql.org/message-id/flat/18586-03e1535b1b34db81%40postgresql.org
Here, I'm not seeing enough detail to judge if it's the same. That's okay.