Re: Aggregate query on large tables

From: Bill Moran <wmoran(at)potentialtech(dot)com>
To: Job <Job(at)colliniconsulting(dot)it>
Cc: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Aggregate query on large tables
Date: 2017-04-09 15:26:17
Message-ID: 20170409112617.c834de6ae8cf4a76f4799d67@potentialtech.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Sun, 9 Apr 2017 17:05:56 +0200
Job <Job(at)colliniconsulting(dot)it> wrote:

> Hi,
>
> i have a table with about 400 millions of rows and i need to build some aggregate function for reporting.
> I noticed that query performances are slowing down, even though indexes are present.
>
> Query is simple (i make an example, my table is in italian language):
> select a,sum(b) from table where a=x and c=y group by a
>
> a is a varchar
> b is an integer
>
> x and y are two field i use for filter results.
>
> I tried to create different indexes to try to speed up performance:
> index1 (a)
> index2 (c)
> index3 (a,c)
>
> I noticed, with query planner, that the mulfi-field index is not used.
> Postgresql 9.6.1 still use scan without indexes.
>
> I obtain significant improvements only if i create a materialized view with aggregated data.

It helps a lot if you provide EXPLAIN output for questions like this. Also,
definitions of what you consider "fast" and "slow" are helpful, as everyone
seems to have a different opinion on what those words mean.

However, my guess is that your WHERE condition isn't significantly restrictive
to make use of the index worth the time. If you'll be fetching a significant
percentage of the rows anyway, using the index would actually slow things
down.

You _might_ get better performance if you create an index on (a,c,b) which
would allow the query to run without ever needing to access the actual
table; but I'm just speculating.

In my experience, queries like these rarely benefit from filter indexes,
because most of the time involved is in the grouping and aggregate processing,
and the index does nothing to help with that. But, again, without EXPLAIN
output I'm only speculating.

--
Bill Moran <wmoran(at)potentialtech(dot)com>

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Adrian Klaver 2017-04-09 15:36:01 Re: Aggregate query on large tables
Previous Message Job 2017-04-09 15:05:56 Aggregate query on large tables