From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | andres(at)anarazel(dot)de (Andres Freund) |
Cc: | Craig James <cjames(at)emolecules(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, "Graeme B(dot) Bell" <graeme(dot)bell(at)nibio(dot)no>, postgres performance list <pgsql-performance(at)postgresql(dot)org> |
Subject: | Re: Hmmm... why does CPU-intensive pl/pgsql code parallelise so badly when queries parallelise fine? Anyone else seen this? |
Date: | 2015-07-09 03:38:38 |
Message-ID: | 15702.1436413118@sss.pgh.pa.us |
Lists: | pgsql-performance |
andres(at)anarazel(dot)de (Andres Freund) writes:
> On 2015-07-08 15:38:24 -0700, Craig James wrote:
>> From my admittedly naive point of view, it's hard to see why any of this
>> matters. I have functions that do purely CPU-intensive mathematical
>> calculations ... you could imagine something like is_prime(N) that
>> determines if N is a prime number. I have eight clients that connect to
>> eight backends. Each client issues an SQL command like, "select
>> is_prime(N)" where N is a simple number.
> I mostly replied to Merlin's general point (additionally in the context of
> plpgsql).
> But I have a hard time seeing that postgres would be the bottleneck for an
> is_prime() function (or something with similar characteristics) that's
> written in C where the average runtime is more than, say, a couple
> thousand cycles. I'd like to see a profile of that.
But that was not the case that Graeme was complaining about. He's talking
about simple-arithmetic-and-looping written in plpgsql, in a volatile
function that is going to take a new snapshot for every statement, even if
that's only "n := n+1". So it's going to spend a substantial fraction of
its runtime banging on the ProcArray, and that doesn't scale. If you
write your is_prime function purely in plpgsql, and don't bother to mark
it nonvolatile, *it will not scale*. It'll be slow even in single-thread
terms, but it'll be particularly bad if you're saturating a multicore
machine with it.
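To make the point concrete, here is a minimal sketch of what such a function could look like when it is marked nonvolatile. The function name, signature, and trial-division body are illustrative assumptions, not code from this thread. The only part that matters for the argument is the IMMUTABLE marker: per the PostgreSQL documentation, STABLE and IMMUTABLE functions use the snapshot of the calling query, while VOLATILE functions (the default) obtain a fresh one for each statement they execute.

```sql
-- Hypothetical example: name and body are illustrative only.
CREATE OR REPLACE FUNCTION is_prime(n bigint)
RETURNS boolean
LANGUAGE plpgsql
IMMUTABLE  -- result depends only on n; omitting this leaves the default, VOLATILE
AS $$
DECLARE
    i bigint := 3;
BEGIN
    IF n < 2 THEN
        RETURN false;
    ELSIF n < 4 THEN
        RETURN true;            -- n is 2 or 3
    ELSIF n % 2 = 0 THEN
        RETURN false;
    END IF;

    -- Trial division by odd candidates; "i <= n / i" is equivalent to
    -- "i * i <= n" for positive integers but cannot overflow bigint.
    WHILE i <= n / i LOOP
        IF n % i = 0 THEN
            RETURN false;
        END IF;
        i := i + 2;             -- the kind of trivial statement discussed above
    END LOOP;

    RETURN true;
END;
$$;

-- Each client would then issue something like:
SELECT is_prime(1000003);
```

An existing function can be relabelled without rewriting it, for example with ALTER FUNCTION is_prime(bigint) IMMUTABLE; this is only correct if the function really has no side effects and reads no table data.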
One of my Salesforce colleagues has been looking into ways that we could
decide to skip the per-statement snapshot acquisition even in volatile
functions, if we could be sure that a particular statement isn't going to
do anything that would need a snapshot. Now, IMO that doesn't really do
much for properly written plpgsql; but there's an awful lot of bad plpgsql
code out there, and it can make a huge difference for that.
regards, tom lane