From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Hrishikesh (हृषीकेश मेहेंदळे) <hashinclude(at)gmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Performance issues with large amounts of time-series data
Date: 2009-08-26 18:01:43
Message-ID: 18070.1251309703@sss.pgh.pa.us
Lists: pgsql-performance
Hrishikesh (हृषीकेश मेहेंदळे) <hashinclude(at)gmail(dot)com> writes:
> In my timing tests, the performance of PG is quite a lot worse than the
> equivalent BerkeleyDB implementation.
Are you actually comparing apples to apples? I don't recall that BDB
has any built-in aggregation functionality. It looks to me like you've
moved some work out of the client into the database.
> 1. Is there anything I can do to speed up performance for the queries?
Do the data columns have to be bigint, or would int be enough to hold
the expected range? SUM(bigint) is a *lot* slower than SUM(int),
because the former has to use "numeric" arithmetic whereas the latter
can sum in bigint. If you want to keep the data on-disk as bigint,
but you know the particular values being summed here are not that
big, you could cast in the query (SUM(data_1::int) etc).
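A minimal sketch of what that cast might look like; the table name, time column,
and data_2 are hypothetical stand-ins, not taken from your post:

    -- SUM(int) accumulates in bigint, avoiding the slower numeric arithmetic
    SELECT SUM(data_1::int), SUM(data_2::int)     -- data_2 is illustrative
    FROM stats_300_p01                            -- hypothetical partition name
    WHERE ts >= '2009-08-25' AND ts < '2009-08-26';   -- hypothetical time column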
I'm also wondering if you've done something to force indexscans to be
used. If I'm interpreting things correctly, some of these scans are
traversing all/most of a partition and would be better off as seqscans.
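One way to check (again with stand-in table and column names) is to look at the
plan the planner actually chooses and whether seqscans were disabled:

    SHOW enable_seqscan;              -- should normally be 'on'
    EXPLAIN ANALYZE
    SELECT SUM(data_1::int)
    FROM stats_300_p01                -- hypothetical partition name
    WHERE ts >= '2009-08-25';         -- hypothetical time column
    -- look for "Seq Scan" vs "Index Scan" on the partition in the output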
> shared_buffers = 128MB
This is really quite lame for the size of machine and database you've
got. Consider knocking it up to 1GB or so.
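For instance, in postgresql.conf (the exact value is just an illustration):

    shared_buffers = 1GB              # was 128MB; requires a server restart

You can confirm the running value afterwards with SHOW shared_buffers;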
regards, tom lane