From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Hrishikesh (हृषीकेश मेहेंदळे) <hashinclude(at)gmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Performance issues with large amounts of time-series data
Date: 2009-08-26 18:01:43
Message-ID: 18070.1251309703@sss.pgh.pa.us
Lists: pgsql-performance
Hrishikesh (हृषीकेश मेहेंदळे) <hashinclude(at)gmail(dot)com> writes:
> In my timing tests, the performance of PG is quite a lot worse than the
> equivalent BerkeleyDB implementation.
Are you actually comparing apples to apples? I don't recall that BDB
has any built-in aggregation functionality. It looks to me like you've
moved some work out of the client into the database.
> 1. Is there anything I can do to speed up performance for the queries?
Do the data columns have to be bigint, or would int be enough to hold
the expected range? SUM(bigint) is a *lot* slower than SUM(int),
because the former has to use "numeric" arithmetic whereas the latter
can sum in bigint. If you want to keep the data on-disk as bigint,
but you know the particular values being summed here are not that
big, you could cast in the query (SUM(data_1::int) etc).
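A minimal sketch of what that cast might look like; the table name, time column,
and data_2 are hypothetical stand-ins, not taken from your post:

    -- SUM(int) accumulates in bigint, avoiding the slower numeric arithmetic
    SELECT SUM(data_1::int), SUM(data_2::int)     -- data_2 is illustrative
    FROM stats_300_p01                            -- hypothetical partition name
    WHERE ts >= '2009-08-25' AND ts < '2009-08-26';   -- hypothetical time column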
I'm also wondering if you've done something to force indexscans to be
used. If I'm interpreting things correctly, some of these scans are
traversing all/most of a partition and would be better off as seqscans.
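One way to check (again with stand-in table and column names) is to look at the
plan the planner actually chooses and whether seqscans were disabled:

    SHOW enable_seqscan;              -- should normally be 'on'
    EXPLAIN ANALYZE
    SELECT SUM(data_1::int)
    FROM stats_300_p01                -- hypothetical partition name
    WHERE ts >= '2009-08-25';         -- hypothetical time column
    -- look for "Seq Scan" vs "Index Scan" on the partition in the output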
> shared_buffers = 128MB
This is really quite lame for the size of machine and database you've
got. Consider knocking it up to 1GB or so.
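For instance, in postgresql.conf (the exact value is just an illustration):

    shared_buffers = 1GB              # was 128MB; requires a server restart

You can confirm the running value afterwards with SHOW shared_buffers;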
regards, tom lane