Re: Parallel query vs smart shutdown and Postmaster death

From:	Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To:	pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Parallel query vs smart shutdown and Postmaster death
Date:	2019-02-26 22:43:55
Message-ID:	CA+hUKG+MF0G7f8UKvTWiGs4iFng5bA_jL8RT4X2WdhP+oE8gkg@mail.gmail.com
Lists:	pgsql-hackers

On Mon, Feb 25, 2019 at 2:13 PM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
> 1. In a nearby thread, I misdiagnosed a problem reported[1] by Justin
> Pryzby (though my misdiagnosis is probably still a thing to be fixed;
> see next). I think I just spotted the real problem he saw: if you
> execute a parallel query after a smart shutdown has been initiated,
> you wait forever in gather_readnext()! Maybe parallel workers can't
> be launched in this state, but we lack code to detect this case? I
> haven't dug into the exact mechanism or figured out what to do about
> it yet, and I'm tied up with something else for a bit, but I will come
> back to this later if nobody beats me to it.

Given smart shutdown's stated goal, namely that it "lets existing
sessions end their work normally", my questions are:

1. Why does pmdie()'s SIGTERM case terminate parallel workers
immediately? That aborts running parallel queries, so they
don't get to end their work normally.
2. Why are new parallel workers not allowed to be started while in
this state? That hangs future parallel queries forever, so they don't
get to end their work normally.
3. Suppose we fix the above cases; should we do it for parallel
workers only (somehow), or for all bgworkers? It's hard to say since
I don't know what all bgworkers do.

In the meantime, perhaps we should teach the postmaster to report this
case as a failure to launch in back-branches, so that at least
parallel queries don't hang forever? Here's an initial sketch of a
patch like that: it gives you "ERROR: parallel worker failed to
initialize" and "HINT: More details may be available in the server
log." if you try to run a parallel query. The HINT is right: the
server logs say that a smart shutdown is in progress. If that seems a
bit hostile, consider that any parallel queries that were running at
the moment the smart shutdown was requested have already been ordered
to quit; why should new queries started after that get a better deal?
Then perhaps we could do some more involved surgery on master that
achieves smart shutdown's stated goal here, and lets parallel queries
actually run? Better ideas welcome.
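
To make the back-branch idea concrete, here is a toy model of the decision
being proposed. It is not the attached patch and does not use the real
postmaster data structures; every name in it (SketchPmState, SketchWorker,
sketch_maybe_start_worker, sketch_leader_should_error) is hypothetical. The
only point it illustrates is that a worker request arriving during smart
shutdown is marked as failed rather than left pending, so the leader has
something to notice instead of waiting forever.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the postmaster's state machine. */
typedef enum
{
    SKETCH_PM_RUN,            /* normal operation */
    SKETCH_PM_SMART_SHUTDOWN  /* smart shutdown requested, sessions draining */
} SketchPmState;

/* Hypothetical status of one requested background worker. */
typedef enum
{
    SKETCH_WORKER_PENDING,        /* registered, not yet started */
    SKETCH_WORKER_STARTED,        /* launched normally */
    SKETCH_WORKER_LAUNCH_FAILED   /* launch refused and reported */
} SketchWorkerStatus;

typedef struct
{
    SketchWorkerStatus status;
} SketchWorker;

/*
 * Decide the fate of a pending worker request.  In normal operation the
 * worker starts.  During smart shutdown the request is marked as a launch
 * failure instead of staying pending indefinitely.
 */
static SketchWorkerStatus
sketch_maybe_start_worker(SketchPmState state, SketchWorker *worker)
{
    assert(worker != NULL);

    /* Only pending requests are affected; others keep their status. */
    if (worker->status != SKETCH_WORKER_PENDING)
        return worker->status;

    if (state == SKETCH_PM_RUN)
        worker->status = SKETCH_WORKER_STARTED;
    else
        worker->status = SKETCH_WORKER_LAUNCH_FAILED;

    return worker->status;
}

/*
 * Leader side: true when the leader should raise an error (in the spirit of
 * "parallel worker failed to initialize") rather than keep waiting.
 */
static bool
sketch_leader_should_error(const SketchWorker *worker)
{
    assert(worker != NULL);
    return worker->status == SKETCH_WORKER_LAUNCH_FAILED;
}
```

In this model, the hang described above corresponds to a request that stays
in the pending state forever; the proposed reporting replaces that with an
explicit failure the leader can turn into an ERROR.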

--
Thomas Munro
https://enterprisedb.com

Attachment	Content-Type	Size
0001-Report-bgworker-launch-failure-during-smart-shutdown.patch	application/octet-stream	2.5 KB

In response to

Parallel query vs smart shutdown and Postmaster death at 2019-02-25 01:13:11 from Thomas Munro

Responses

Re: Parallel query vs smart shutdown and Postmaster death at 2019-02-27 15:38:46 from Robert Haas
Re: Parallel query vs smart shutdown and Postmaster death at 2019-03-17 04:53:35 from Arseny Sher