From: Andy Colson <andy(at)squeakycode(dot)net>
To: Job <Job(at)colliniconsulting(dot)it>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Queries on very big table
Date: 2017-01-02 15:43:59
Message-ID: f2306236-878e-2cd8-790c-7f7250f35f44@squeakycode.net
Lists: pgsql-general
On 01/02/2017 05:23 AM, Job wrote:
> Hello guys and very good new year to everybody!
>
> We are now running some queries and statistics on a very big table (about 180 million records).
> The table is partitioned by day (about ~3 GB of data per partition/day).
> We use PostgreSQL 9.6.1.
>
> I am experiencing quite significant slowdowns on queries.
> I manually ran a VACUUM FULL and a REINDEX on every partition in order to reclaim free space and reorder records.
>
> I have a BRIN index on the timestamp column and a btree index on another field.
>
> Starting with a simple query: explain analyze select count(domain) from webtraffic_archive:
>
>
> Other, more complex queries are slower.
>
> How can I improve this?
> The record count can grow to 1,000 million (one billion).
> Do I need a third-party tool for big data?
>
> THANK YOU!
> /F
>
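For reference, the setup Job describes might look roughly like this on 9.6, which uses inheritance-based partitioning (declarative partitioning arrived in PostgreSQL 10). The table and column names here are assumptions, since the original schema and EXPLAIN output were not posted:

```sql
-- Hypothetical parent table matching the description (names assumed)
CREATE TABLE webtraffic_archive (
    ts     timestamptz NOT NULL,
    domain text
);

-- One child partition per day, constrained by a CHECK so
-- constraint_exclusion can skip irrelevant days
CREATE TABLE webtraffic_archive_2017_01_02 (
    CHECK (ts >= '2017-01-02' AND ts < '2017-01-03')
) INHERITS (webtraffic_archive);

-- Per partition: BRIN on the timestamp, btree on the other field
CREATE INDEX ON webtraffic_archive_2017_01_02 USING brin (ts);
CREATE INDEX ON webtraffic_archive_2017_01_02 (domain);
```

Note that `count(domain)` over the whole parent table still has to visit every partition, so no index arrangement will make that fast on its own.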
I do a very similar thing (I log all my web stats to PG), but querying millions of rows is always going to be slow. I use a summary table. Actually, several.

My detail table is like yours, but every 5 minutes I query out the last hour and summarize it into a by_hour table. Every night I query out the last 24 hours and summarize them into a by_day table. The detail and by_hour tables never hold more than 24 hours' worth of data; by_day goes back many years.

My stats pages all query the by_hour and by_day tables, and it's very fast.
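A minimal sketch of that kind of rollup, assuming a detail table `webtraffic_archive(ts, domain)` (all table and column names here are assumptions, not Andy's actual schema). The `ON CONFLICT` upsert is available from 9.5 onward, so it works on the 9.6.1 install mentioned above:

```sql
-- Hourly summary table (names assumed; adapt to your real columns)
CREATE TABLE webtraffic_by_hour (
    hour   timestamptz NOT NULL,
    domain text        NOT NULL,
    hits   bigint      NOT NULL,
    PRIMARY KEY (hour, domain)
);

-- Run every 5 minutes (e.g. from cron): re-summarize the current
-- hour from the detail table, overwriting the previous partial count
INSERT INTO webtraffic_by_hour (hour, domain, hits)
SELECT date_trunc('hour', ts), domain, count(*)
FROM   webtraffic_archive
WHERE  ts >= date_trunc('hour', now())
GROUP  BY 1, 2
ON CONFLICT (hour, domain) DO UPDATE SET hits = EXCLUDED.hits;
```

Stats queries then hit the small summary table instead of scanning millions of detail rows; a nightly job can aggregate by_hour into a by_day table the same way.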
-Andy