Re: dynamic background workers

From: Markus Wanner <markus(at)bluegap(dot)ch>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: dynamic background workers
Date: 2013-06-20 14:59:33
Message-ID: 51C318D5.7010903@bluegap.ch
Lists: pgsql-hackers

On 06/20/2013 04:41 PM, Robert Haas wrote:
> The constant factor is also very small. Generally, I would expect
> num_worker_processes <~ # CPUs

That assumption might hold for parallel querying, yes. In case of
Postgres-R, it doesn't. In the worst case, i.e. with a 100% write load,
a cluster of n nodes, each with m backends performing transactions, all
of them replicated to all other (n-1) nodes, you end up with ((n-1) * m)
bgworkers. Which is pretty likely to be way above the # CPUs on any
single node.

I can imagine other extensions, or integral features like autonomous
transactions, that may well want many more bgworkers, too.

> and scanning a 32, 64, or even 128
> element array is not a terribly time-consuming operation.

I'd extend that to say scanning an array with a few thousand elements is
not terribly time-consuming, either. IMO the simplicity is worth it,
ATM. It's all relative to your definition of ... eh ... "terribly".

.oO( ... premature optimization ... all evil ... )

> We might
> need to re-think this when systems with 4096 processors become
> commonplace, but considering how many other things would also need to
> be fixed to work well in that universe, I'm not too concerned about it
> just yet.

Agreed.

> One thing I think we probably want to explore in the future, for both
> worker backends and regular backends, is pre-forking. We could avoid
> some of the latency associated with starting up a new backend or
> opening a new connection in that way. However, there are quite a few
> details to be thought through there, so I'm not eager to pursue that
> just yet. Once we have enough infrastructure to implement meaningful
> parallelism, we can benchmark it and find out where the bottlenecks
> are, and which solutions actually help most.

Do you mean pre-forking and connecting to a specific database? Or really
just the forking?

> I do think we need a mechanism to allow the backend that requested the
> bgworker to know whether or not the bgworker got started, and whether
> it unexpectedly died. Once we get to the point of calling user code
> within the bgworker process, it can use any number of existing
> mechanisms to make sure that it won't die without notifying the
> backend that started it (short of a PANIC, in which case it won't
> matter anyway). But we need a way to report failures that happen
> before that point. I have some ideas about that, but decided to leave
> them for future passes. The remit of this patch is just to make it
> possible to dynamically register bgworkers. Allowing a bgworker to be
> "tied" to the session that requested it via some sort of feedback loop
> is a separate project - which I intend to tackle before CF2, assuming
> this gets committed (and so far nobody is objecting to that).

Okay, sounds good. Given my background, I considered that a solved
problem. Thanks for pointing it out.

>> Sounds like the postmaster is writing to shared memory. Not sure why
>> I've been trying so hard to avoid that, though. After all, it can hardly
>> hurt itself *writing* to shared memory.
>
> I think there's ample room for paranoia about postmaster interaction
> with shared memory, but all it's doing is setting a flag, which is no
> different from what CheckPostmasterSignal() already does.

Sounds good to me.

Regards

Markus Wanner
