Re: Hmmm... why does CPU-intensive pl/pgsql code parallelise so badly when queries parallelise fine? Anyone else seen this?

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Craig James <cjames(at)emolecules(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, "Graeme B(dot) Bell" <graeme(dot)bell(at)nibio(dot)no>, postgres performance list <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Hmmm... why does CPU-intensive pl/pgsql code parallelise so badly when queries parallelise fine? Anyone else seen this?
Date: 2015-07-09 09:00:08
Message-ID: 20150709090008.GV10242@alap3.anarazel.de
Lists: pgsql-performance

On 2015-07-08 23:38:38 -0400, Tom Lane wrote:
> andres(at)anarazel(dot)de (Andres Freund) writes:
> > On 2015-07-08 15:38:24 -0700, Craig James wrote:
> >> From my admittedly naive point of view, it's hard to see why any of this
> >> matters. I have functions that do purely CPU-intensive mathematical
> >> calculations ... you could imagine something like is_prime(N) that
> >> determines if N is a prime number. I have eight clients that connect to
> >> eight backends. Each client issues an SQL command like, "select
> >> is_prime(N)" where N is a simple number.
>
> > I mostly replied to Merlin's general point (additionally in the context of
> > plpgsql).
>
> > But I have a hard time seeing that postgres would be the bottleneck for an
> > is_prime() function (or something with similar characteristics) that's
> > written in C where the average runtime is more than, say, a couple
> > thousand cycles. I'd like to see a profile of that.
>
> But that was not the case that Graeme was complaining about.

No, Craig was complaining about that case...

> One of my Salesforce colleagues has been looking into ways that we could
> decide to skip the per-statement snapshot acquisition even in volatile
> functions, if we could be sure that a particular statement isn't going to
> do anything that would need a snapshot.

Yea, I actually commented about that on IRC as well.

I was thinking about actually continuing to get a snapshot, but marking it
as 'complete on usage', i.e. calling GetSnapshotData() only when the
snapshot is actually used to decide about visibility. We probably can't do
that in the top-level visibility case because it'll probably have noticeable
semantic effects, but ISTM it should be doable for the case of volatile
functions using SPI.
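
To illustrate the 'complete on usage' idea, here is a minimal sketch in Python rather than PostgreSQL's C. It is not how the backend implements snapshots: `SnapshotSource`, `LazySnapshot`, and `xid_visible` are hypothetical names, `take()` merely stands in for the expensive GetSnapshotData() call, and the visibility rule is heavily simplified. The only point is that a statement which never checks visibility never pays for the snapshot.

```python
class SnapshotSource:
    """Hypothetical stand-in for the expensive GetSnapshotData() call."""

    def __init__(self):
        self.calls = 0  # how many times the expensive path actually ran

    def take(self):
        self.calls += 1
        # Simplified snapshot: (xmax, set of in-progress transaction ids).
        return 100, frozenset({97})


class LazySnapshot:
    """A snapshot handle that is only filled in on first use."""

    def __init__(self, take_snapshot):
        self._take_snapshot = take_snapshot
        self._data = None  # not materialized yet

    def xid_visible(self, xid):
        # First visibility check triggers the real snapshot acquisition.
        if self._data is None:
            self._data = self._take_snapshot()
        xmax, in_progress = self._data
        # Simplified rule: visible if older than xmax and not in progress.
        return xid < xmax and xid not in in_progress


source = SnapshotSource()
snap = LazySnapshot(source.take)   # "acquired", but nothing computed yet
calls_before_use = source.calls    # still zero: pure computation is free
visible = snap.xid_visible(42)     # first use materializes the snapshot
calls_after_use = source.calls
```

A statement like "select is_prime(N)" would, under this scheme, hold a `LazySnapshot` without ever touching `xid_visible`, so the contended acquisition is skipped entirely.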
