Re: pgsql: Don't enter parallel mode when holding interrupts.

From: Noah Misch <noah(at)leadboat(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pgsql: Don't enter parallel mode when holding interrupts.
Date: 2024-09-20 18:39:31
Message-ID: 20240920183931.f0.nmisch@google.com
Lists: pgsql-committers pgsql-hackers

On Thu, Sep 19, 2024 at 09:25:05AM -0400, Robert Haas wrote:
> On Wed, Sep 18, 2024 at 3:27 AM Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at> wrote:
> > On Wed, 2024-09-18 at 02:58 +0000, Noah Misch wrote:
> > > Don't enter parallel mode when holding interrupts.
> > >
> > > Doing so caused the leader to hang in wait_event=ParallelFinish, which
> > > required an immediate shutdown to resolve. Back-patch to v12 (all
> > > supported versions).
> > >
> > > Francesco Degrassi
> > >
> > > Discussion: https://postgr.es/m/CAC-SaSzHUKT=vZJ8MPxYdC_URPfax+yoA1hKTcF4ROz_Q6z0_Q@mail.gmail.com
> >
> > Does that warrant mention on this page?
> > https://www.postgresql.org/docs/current/when-can-parallel-query-be-used.html
>
> IMHO, no. This seems too low-level and too odd to mention.

Agreed. If I were documenting it, I would document it with the material for
writing opclasses. It's probably too esoteric to document even there.

> TBH, I'm kind of surprised to learn that it's possible to start
> executing a query while holding an LWLock. I see Tom is expressing
> some doubts on the original thread, too. I wonder if we should instead
> be erroring out if an LWLock is held at the start of query execution
> -- or even earlier, like when we try to call a plpgsql function while
> holding one. Leaving parallel query aside, what would prevent us from
> attempting to reacquire the exact same LWLock that we already hold and
> self-deadlocking? Or attempting to acquire some other LWLock and
> deadlocking that way? I don't really feel like this is a parallel
> query problem. I don't think we should be trying to run any
> user-defined code while holding an LWLock, unless that code is written
> in C (or C++, Rust, etc.). Trying to run procedural code at that point
> doesn't seem reasonable.

Nothing prevents those lwlock deadlocks. If you think it's worth breaking the
things folks use today (see original thread) in order to prevent that, please
do share that on the original thread. I'm fine either way. I think given
infinite resources across both postgresql.org and all extension maintainers, I
would do what you're thinking in v18, while in back branches I would change
"erroring out" to "warn when assertions are enabled". I also think it's a
low-priority bug, given the only known ways to reach it are C code or a custom
opclass. Since resources aren't infinite, I'm inclined toward one of (a) stop
here or (b) all branches "warn when assertions are enabled" and maybe block
the plancache route discussed on the original thread.
