Re: optimizing daily data storage in Pg

From: Andy Colson <andy(at)squeakycode(dot)net>
To: P Kishor <punk(dot)kish(at)gmail(dot)com>, PostgreSQL <pgsql-general(at)postgresql(dot)org>
Subject: Re: optimizing daily data storage in Pg
Date: 2010-07-22 21:56:24
Message-ID: 4C48BE88.80203@squeakycode.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 7/22/2010 9:41 AM, P Kishor wrote:
> I have been struggling with this for a while now, have even gone down
> a few paths but struck out, so I turn now to the community for ideas.
> First, the problem: Store six daily variables for ~ 25 years for cells
> in a grid.
>
> * Number of vars = 6
> * Number of cells ~ 13 million
> * Number of days ~ 9125 (25 * 365)
>
> Optimize the store for two different kinds of queries:
>
> Query one: Retrieve the value of a single var for all or a portion of
> the cells for a single day. This is analogous to an image where every
> pixel is the value of a single var.

> SELECT<var> FROM d WHERE yr = ? AND yday = ?;
> SELECT<var> FROM d WHERE yr = ? AND yday = ? AND cell_id IN
(?,?,?...);

>
> Query two: Retrieve values for all the days or a duration of days for
> a single var for a single cell. This is like grabbing a column out of
> a table in which each row holds all the vars for a single day.
> SELECT<var> FROM d WHERE cell_id = ?;
> SELECT<var> FROM d WHERE cell_id IN (?,?,?...);

First, I must admit to not reading your entire email.

Second, Query 1 should be fast, regardless of how you layout the tables.

Third, Query 2 will return 13M rows? I dont think it matters how you
layout the tables, returning 13M rows is always going to be slow.

-Andy

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message shakahshakah@gmail.com 2010-07-22 22:14:20 Re: psql problem
Previous Message Edmundo Robles L. 2010-07-22 21:24:46 Question about SCO openserver and postgres...