From: | Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Unportable implementation of background worker start |
Date: | 2017-04-21 15:19:41 |
Message-ID: | 20170421151941.45njrtwykn5dd476@alvherre.pgsql |
Lists: | pgsql-hackers |
Tom Lane wrote:
> After sleeping and thinking more, I've realized that the
> slow-bgworker-start issue actually exists on *every* platform, it's just
> harder to hit when select() is interruptible. But consider the case
> where multiple bgworker-start requests arrive while ServerLoop is
> actively executing (perhaps because a connection request just came in).
> The postmaster has signals blocked, so nothing happens for the moment.
> When we go around the loop and reach
>
> PG_SETMASK(&UnBlockSig);
>
> the pending SIGUSR1 is delivered, and sigusr1_handler reads all the
> bgworker start requests, and services just one of them. Then control
> returns and proceeds to
>
> selres = select(nSockets, &rmask, NULL, NULL, &timeout);
>
> But now there's no interrupt pending. So the remaining start requests
> do not get serviced until (a) some other postmaster interrupt arrives,
> or (b) the one-minute timeout elapses. They could be waiting awhile.
>
> Bottom line is that any request for more than one bgworker at a time
> faces a non-negligible risk of suffering serious latency.
Interesting. It's hard to hit, for sure.
> I'm coming back to the idea that at least in the back branches, the
> thing to do is allow maybe_start_bgworker to start multiple workers.
>
> Is there any actual evidence for the claim that that might have
> bad side effects?
Well, I ran tests with tens of thousands of sample workers, and the
neglect of other things (such as connection requests) was visible, but
that's probably not a scenario many servers run into often today. I
don't strongly object to the idea of removing the "return" in older
branches, since it's evidently a problem. However, as bgworkers start
to be used more, I think we should definitely have some protection. In
a system with a large number of workers available for parallel queries,
it seems possible for a high-velocity server to get stuck in the loop
for some time. (I haven't actually verified this, though. My
experiments were with the early kind, static bgworkers.)
--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services