Re: Optimizing aggregates

From: Andres Freund <andres(at)anarazel(dot)de>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Optimizing aggregates
Date: 2016-08-31 15:51:53
Message-ID: 20160831155153.5tnt4z5lzjhsjzlw@alap3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2016-08-31 17:47:18 +0300, Heikki Linnakangas wrote:
> # ........ ........ .......... ................. ........................................
> #
> 25.70% 0.00% postmaster [unknown] [k] 0000000000000000
> 14.23% 13.75% postmaster postgres [.] ExecProject

> ExecProject stands out. I find that pretty surprising.
>
> We're using ExecProject to extract the arguments from the input tuples, to
> pass to the aggregate transition functions. It looks like that's a pretty
> expensive way of doing it, for a typical aggregate that takes only one
> argument.
>
> We actually used to call ExecEvalExpr() directly for each argument, but that
> was changed by the patch that added support for ordered set aggregates. It
> looks like that was a bad idea, from a performance point of view.

I complained about that as well:
http://archives.postgresql.org/message-id/20160519175727.ymv2y5tye4qgcmqx%40alap3.anarazel.de

> I propose that we go back to calling ExecEvalExpr() directly, for
> non-ordered aggregates, per the attached patch. That makes that example
> query about 10% faster on my laptop, which is in line with the fact that
> ExecProject() accounted for about 13% of the CPU time.

My approach is a bit different.

First, I've combined the projections for all the aggregates, ordered-set
or not, into one projection. That got rid of a fair amount of overhead
when you have multiple aggregates. I've attached a WIP version of that
patch; it is probably out of date.
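
To illustrate why a single combined projection helps with multiple
aggregates, here is a toy sketch in C. It is not the attached patch and
not PostgreSQL code: every type and function name (ToyTuple, ToyAgg,
toy_project, and so on) is made up for illustration, and the counter only
stands in for the fixed per-call cost of a projection.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: every name below is hypothetical, not a PostgreSQL API. */
typedef struct ToyTuple { const double *cols; } ToyTuple;
typedef double (*ToyArgExpr)(const ToyTuple *in);

typedef struct ToyAgg {
    ToyArgExpr arg;   /* computes this aggregate's single argument */
    double     state; /* transition state; a running sum in this sketch */
} ToyAgg;

#define TOY_MAX_AGGS 16

/* Counts how often the fixed per-call projection cost is paid. */
static unsigned long toy_projection_calls = 0;

/* One "projection": evaluate n expressions against a row into out[]. */
static void toy_project(const ToyArgExpr *exprs, size_t n,
                        const ToyTuple *in, double *out)
{
    toy_projection_calls++;
    for (size_t i = 0; i < n; i++)
        out[i] = exprs[i](in);
}

/* Variant A: a separate projection for each aggregate, for every row. */
static void toy_advance_per_aggregate(ToyAgg *aggs, size_t naggs,
                                      const ToyTuple *in)
{
    for (size_t i = 0; i < naggs; i++) {
        double arg;
        toy_project(&aggs[i].arg, 1, in, &arg);
        aggs[i].state += arg;
    }
}

/*
 * Variant B: one projection per row. combined[] holds the argument
 * expressions of all aggregates and is assumed to be built once at setup.
 */
static void toy_advance_combined(ToyAgg *aggs, const ToyArgExpr *combined,
                                 size_t naggs, const ToyTuple *in)
{
    double args[TOY_MAX_AGGS];

    assert(naggs <= TOY_MAX_AGGS);
    toy_project(combined, naggs, in, args);
    for (size_t i = 0; i < naggs; i++)
        aggs[i].state += args[i];
}

/* Example argument expressions: plain column references. */
static double toy_col0(const ToyTuple *in) { return in->cols[0]; }
static double toy_col1(const ToyTuple *in) { return in->cols[1]; }
```

With two aggregates, variant A calls toy_project twice per input row and
variant B calls it once; both leave the same transition states behind.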

Secondly, I'm working on overhauling expression evaluation to be
faster. Even without the ExecProject overhead, the computations very
quickly become the bottleneck. As part of that work I pretty much merged
ExecProject and ExecEvalExpr into one: they're really not that
different, and the distinction serves no purpose except to increase the
number of function calls. That merge is precisely the reason I'm working
on getting rid of targetlist SRFs. A proof of concept of that is
attached to
http://archives.postgresql.org/message-id/20160714011850.bd5zhu35szle3n3c%40alap3.anarazel.de
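
As a rough picture of why the two entry points are "not that different",
the toy C sketch below treats evaluating one expression as the one-element
case of evaluating a list of expressions into output slots. This is only
an assumption-laden illustration: the names (ToyRow, ToyExpr,
toy_eval_list, toy_eval_one) are invented here and do not reflect the
proof-of-concept patch or the real executor data structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy types; not PostgreSQL's ExprState or TupleTableSlot. */
typedef struct ToyRow { const int *cols; } ToyRow;
typedef int (*ToyExpr)(const ToyRow *row);

/* Single routine: evaluate n expressions against one row into out[]. */
static void toy_eval_list(const ToyExpr *exprs, size_t n,
                          const ToyRow *row, int *out)
{
    assert(n == 0 || (exprs != NULL && out != NULL));
    for (size_t i = 0; i < n; i++)
        out[i] = exprs[i](row);
}

/* Evaluating a single expression is just the n == 1 case. */
static int toy_eval_one(ToyExpr expr, const ToyRow *row)
{
    int result;

    toy_eval_list(&expr, 1, row, &result);
    return result;
}

/* Two sample expressions over the row's columns. */
static int toy_first(const ToyRow *row) { return row->cols[0]; }
static int toy_sum01(const ToyRow *row) { return row->cols[0] + row->cols[1]; }
```

A "projection" here is toy_eval_list over a whole target list, while a
single-argument aggregate would go through toy_eval_one; both share the
same loop.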

Greetings,

Andres Freund

Attachment Content-Type Size
0001-WIP-Only-perform-one-projection-in-aggregation.patch text/x-patch 7.3 KB
