Re: Hmmm... why does CPU-intensive pl/pgsql code parallelise so badly when queries parallelise fine? Anyone else seen this?

From: Craig James <cjames(at)emolecules(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Merlin Moncure <mmoncure(at)gmail(dot)com>, "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, "Graeme B(dot) Bell" <graeme(dot)bell(at)nibio(dot)no>, postgres performance list <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Hmmm... why does CPU-intensive pl/pgsql code parallelise so badly when queries parallelise fine? Anyone else seen this?
Date: 2015-07-08 22:38:24
Message-ID: CAFwQ8rcmN5APWKS8v9JBqvQ8_x2by276LFBZxtci-32BnuMGUg@mail.gmail.com
Lists: pgsql-performance

On Wed, Jul 8, 2015 at 1:27 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:

> On 2015-07-08 13:46:53 -0500, Merlin Moncure wrote:
> > On Wed, Jul 8, 2015 at 12:48 PM, Craig James <cjames(at)emolecules(dot)com> wrote:
> > > On Tue, Jul 7, 2015 at 10:31 PM, Joshua D. Drake <jd(at)commandprompt(dot)com>
> > >> Using Apache Fast-CGI, you are going to fork a process for each instance
> > >> of the function being executed and that in turn will use all CPUs up to the
> > >> max available resource.
> > >>
> > >> With PostgreSQL, that isn't going to happen unless you are running (at
> > >> least) 8 functions across 8 connections.
> > >
> > >
> > > Well, right, which is why I mentioned "even with dozens of clients."
> > > Shouldn't that scale to at least all of the CPUs in use if the function is
> > > CPU intensive (which it is)?
> >
> > only in the absence of inter-process locking and cache line bouncing.
>
> And additionally memory bandwidth (shared between everything, even in
> the NUMA case), cross-socket/bus bandwidth (absolutely performance
> critical in multi-socket configurations), cache capacity (shared between
> cores, and sometimes even sockets!).
>

From my admittedly naive point of view, it's hard to see why any of this
matters. I have functions that do purely CPU-intensive mathematical
calculations ... you could imagine something like is_prime(N) that
determines if N is a prime number. I have eight clients that connect to
eight backends. Each client issues an SQL command like, "select
is_prime(N)" where N is a simple number.
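
To make the scenario concrete, here is a minimal sketch in Python. It is an
illustration only, not the actual function discussed in this thread (which is
not shown here). The names is_prime and run_workers, the trial-division
method, and the workload size are all assumptions. Each worker process stands
in for one backend doing pure CPU work with no shared state, so comparing the
1-worker and 8-worker timings gives a rough idea of how such code scales on a
given machine.

```python
import time
from multiprocessing import Pool


def is_prime(n: int) -> bool:
    """Trial division: pure CPU work, no I/O and no shared state."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True


def run_workers(numbers, workers):
    """Spread the checks over `workers` processes; return (results, seconds)."""
    start = time.perf_counter()
    with Pool(processes=workers) as pool:
        results = pool.map(is_prime, numbers)
    return results, time.perf_counter() - start


if __name__ == "__main__":
    # Hypothetical workload: the same check repeated many times.
    # 2**31 - 1 is a known prime, so every call runs the full loop.
    numbers = [2**31 - 1] * 800
    for workers in (1, 8):
        _, elapsed = run_workers(numbers, workers)
        print(f"{workers} worker(s): {elapsed:.2f}s")
```

If the 8-worker run is not close to eight times faster on an 8-core machine,
the shared resources mentioned above (memory bandwidth, caches) or process
start-up overhead are the first places to look.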

Are you saying that in order to calculate is_prime(N), all of that stuff
(inter-process locking, memory bandwidth, bus bandwidth, cache capacity,
etc.) is even relevant? And if so, how is it that Postgres is so different
from an Apache fast-CGI program that runs the exact same is_prime(N)
calculation?

Just curious ... as I said, I've already implemented a different solution.

Craig
