Re: Performance options for CPU bound multi-SUM query

From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: Matt <bsg075(at)gmail(dot)com>
Cc: pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: Performance options for CPU bound multi-SUM query
Date: 2016-01-25 09:44:06
Message-ID: CAKJS1f-prjBR+vA=tWWBgBZC6UWN-0GvG2xWHuqWsE2GVm6RPg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 25 January 2016 at 15:45, Matt <bsg075(at)gmail(dot)com> wrote:
> I have a warehousing case where data is bucketed by a key of an hourly
> timestamp and 3 other columns. In addition there are 32 numeric columns. The
> tables are partitioned on regular date ranges, and aggregated to the lowest
> resolution usable.
>
> The principal use case is to select over a range of dates or hours, filter
> by the other key columns, and SUM() all 32 of the other columns. The
> execution plan shows the primary key index limits row selection efficiently,
> but the query appears CPU bound in performing all of those 32 SUM()
> aggregates.
>

SUM(numeric) also has to work quite a bit harder than an an aggregate
like sum(float8) too since the addition in numeric is implemented in
software.
It depends on the use case, but for some cases the float4 or float8
types might be an option and it will offer much faster aggregation.
There is also https://github.com/2ndQuadrant/fixeddecimal which may be
of some use if you need fixed precision up to a predefined scale. We
found that using fixeddecimal instead of numeric for the TPC-H
benchmark improved performance of query 1 significantly.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Matija Lesar 2016-01-25 09:44:33 Re: TABLESAMPLE usage
Previous Message Albe Laurenz 2016-01-25 08:55:41 Re: A motion