Re: SELECT AND AGG huge tables

From: Ondrej Ivanič <ondrej(dot)ivanic(at)gmail(dot)com>
To: houmanb <houman(at)gmx(dot)at>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: SELECT AND AGG huge tables
Date: 2012-10-15 21:54:47
Message-ID: CAM6mie+tHS+j8oXE7_DjqOOzZaatoHMwHkUSPzVM2LNKq=b7qQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi,

On 16 October 2012 07:59, houmanb <houman(at)gmx(dot)at> wrote:
> Dear all,
> We have a DB containing transactional data.
> There are about *50* to *100 x 10^6* rows in one *huge* table.
> We are using postgres 9.1.6 on linux with a *SSD card on PCIex* providing us
> a constant seeking time.

How many columns? What's the average row size?

> Is there any way to speed up select statements like this:
>
> SELECT
> SUM(T.x),
> SUM(T.y),
> SUM(T.z),
> AVG(T.a),
> AVG(T.b)
> FROM T
> GROUP BY
> T.c
> WHERE
> T.creation_date=$SOME_DATE;
>
> There is an Index on T.c. But would it help to partition the table by T.c?
> It should be mentioned, that T.c is actually a foreign key to a Table
> containing a
> tiny number of rows (15 rows representing different companies).
> my postgres.conf is actually the default one, despite the fact that we
> increased the value for work_mem=128MB

Partitioning by T.c is not going to help. You should partition by
T.creation_date. The question is if all queries have T.creation_date
in where clause. Moreover, you need to choose partition size base on
query range so majority of queries can operate on one or two
partitions.

You can try vertical partitioning ie. split table based on column usage:
- group by frequency of use
- group by number of NULLs (null_frac in pg_stats)

Having "SSD card on PCIex" joining tables should be the problem.

In my case table has > 200 columns and monthly partitions (> 30 mil
rows on average) and aggregation queries performed better than 200sec.

--
Ondrej Ivanic
(ondrej(dot)ivanic(at)gmail(dot)com)
(http://www.linkedin.com/in/ondrejivanic)

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Bob Lunney 2012-10-15 23:44:59 Re: SELECT AND AGG huge tables
Previous Message Matheus de Oliveira 2012-10-15 21:42:02 Re: SELECT AND AGG huge tables