| From: | Robert Burgholzer <rburghol(at)vt(dot)edu> |
|---|---|
| To: | pgsql-performance(at)postgresql(dot)org |
| Subject: | Optimizing Time Series Access |
| Date: | 2014-04-08 21:20:19 |
| Message-ID: | CACT-NGJAMP0KULGMf3B8+0YqeywD1i34R7Okzb74Gbzn7_oU-A@mail.gmail.com |
| Lists: | pgsql-performance |
I am looking for advice on dealing with large tables of environmental model
data, and for alternatives to my current optimization approaches.
Basically, I have about 1 billion records stored in a table which I access
in groups of roughly 23 million at a time, which means I have somewhere in
the neighborhood of 400-500 sets of 23 million points.
The 23 million rows that I pull at a time are keyed on 3 different columns,
it's all indexed, and retrieval takes, say, 2-3 minutes (my hardware is so-so).
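For concreteness, the setup looks roughly like this (table and column names
here are made up, the real schema differs, but the shape is the same):

-- Hypothetical schema: one composite index covering the three key columns
-- that together identify a group of ~23 million rows.
CREATE INDEX idx_model_results_group
    ON model_results (scenario_id, run_id, location_id);

-- A typical retrieval of one group:
SELECT *
  FROM model_results
 WHERE scenario_id = 42
   AND run_id = 7
   AND location_id = 1001;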
So, my thought is to use some kind of caching, and I wonder if I can get
advice - here are my thoughts on options, and I would love to hear others:
* Use cached tables for this - since my # of actual data groups is small,
why not just retrieve them once, then keep them around in a specially named
table (I do this with some other stuff, using a 30 day cache expiration) -
a rough sketch of what I mean is below the list
* Use some sort of stored procedure? I don't even know if such a thing
really exists in PG or how it works.
* Use table partitioning? (also sketched below)
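For the cached-table idea, this is roughly what I mean (names are
hypothetical, and maybe a materialized view is the more standard way to do
the same thing):

-- Cache one group's rows in its own small table the first time it is
-- requested; later requests hit that table instead of the 1B-row table.
CREATE TABLE cache_scenario42_run7_loc1001 AS
SELECT * FROM model_results
 WHERE scenario_id = 42 AND run_id = 7 AND location_id = 1001;

-- Built-in alternative: a materialized view that can be refreshed when my
-- 30-day expiration comes around.
CREATE MATERIALIZED VIEW mv_scenario42_run7_loc1001 AS
SELECT * FROM model_results
 WHERE scenario_id = 42 AND run_id = 7 AND location_id = 1001;

REFRESH MATERIALIZED VIEW mv_scenario42_run7_loc1001;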
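And for partitioning, my understanding is that in PG this is done with table
inheritance plus CHECK constraints, roughly like this (again hypothetical
names; please correct me if I have this wrong):

-- Parent table stays empty; each child holds one group (or a range of
-- groups). With constraint_exclusion the planner skips children whose
-- CHECK constraint cannot match the query.
CREATE TABLE model_results_parent (
    scenario_id integer,
    run_id      integer,
    location_id integer,
    ts          timestamp,
    value       double precision
);

CREATE TABLE model_results_s42 (
    CHECK (scenario_id = 42)
) INHERITS (model_results_parent);

SET constraint_exclusion = partition;
SELECT * FROM model_results_parent WHERE scenario_id = 42;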
Thanks,
/r/b
--
Robert W. Burgholzer
'Making the simple complicated is commonplace; making the complicated
simple, awesomely simple, that's creativity.' - Charles Mingus
Athletics: http://athleticalgorithm.wordpress.com/
Science: http://robertwb.wordpress.com/
Wine: http://reesvineyard.wordpress.com/