From: | David Rowley <david(dot)rowley(at)2ndquadrant(dot)com> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Parallel Aggregate costs don't consider combine/serial/deserial funcs |
Date: | 2016-04-12 21:38:21 |
Message-ID: | CAKJS1f-GqG=oD8bEDsrYL3YPWUjmaef71dfC4t7poAZj_HRyBg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 13 April 2016 at 08:52, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Sun, Apr 10, 2016 at 8:47 AM, David Rowley
> <david(dot)rowley(at)2ndquadrant(dot)com> wrote:
>> I realised a few days ago that the parallel aggregate code does not
>> cost for the combine, serialisation and deserialisation functions at
>> all.
>
> Oops.
>
>> I've attached a patch which fixes this.
>
> I've committed this patch. I wonder if it's going to produce compiler
> warnings for some people, complaining about possible use of an
> uninitialized variable. That would kind of suck. I don't much mind
> having to insert a dummy assignment to shut the compiler up; a smarter
> compiler will just throw it out anyway. I'm less enthused about a
> dummy MemSet. The compiler is less likely to be able to get rid of
> that, and it's more expensive if it doesn't. But let's see what
> happens.
Thanks for committing.
I wondered that too, so checked a couple of compilers and got no
warnings, but the buildfarm should let us know. The other option would
be to palloc() them, and have them set to NULL initially... that's not
very nice either... Another option would be to protect the final
parallel path generation with if (grouped_rel->partial_pathlist &&
grouped_rel->consider_parallel). I'd imagine any compiler smart enough
to work out that uninitialised is not possible would also be able to
remove the check for grouped_rel->consider_parallel, but *shrug*, I
don't often look at the assembly that compilers generate, so I might
be giving them too much credit.
>> One small point which I was a little unsure of in the attached is,
>> should the "if (aggref->aggdirectargs)" part of
>> count_agg_clauses_walker() be within the "if
>> (!context->combineStates)". I simply couldn't decide. We currently
>> have no aggregates which this affects anyway, per; select * from
>> pg_aggregate where aggcombinefn <> 0 and aggkind <> 'n'; so for now
>> I've left it outwith.
>
> The direct arguments would be evaluated in the worker, but not in the
> leader, right? Or am I confused?
That seems right, but I just can't think of how its possible to
parallelise these aggregates anyway.
>> Another thing I thought of is that it's not too nice that I have to
>> pass 3 bools to count_agg_clauses() in order to tell it what to do. I
>> was tempted to invent some bitmask flags for this, then modify
>> create_agg_path() to use the same flags, but I thought I'd better not
>> cause too much churn with this patch.
>
> I'm kinda tempted to say this should be using an enum. I note that
> serialStates has a subtly different meaning here than in some other
> places where you have used the same term.
hmm, I'm not sure how it's subtly different. Do you mean the
preference towards costing the finalfn when finalizeAggs is true, and
ignoring the serialfn in this case? nodeAgg.c should do the same,
although it'll deserialize in such a case. We can never finalize and
serialize in the same node.
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2016-04-12 21:40:25 | Re: pg_upgrade error regarding hstore operator |
Previous Message | Robert Haas | 2016-04-12 21:36:07 | Re: Performance degradation in commit 6150a1b0 |