From: | "Pavel Stehule" <pavel(dot)stehule(at)gmail(dot)com> |
---|---|
To: | "Andreas Strasser" <kontakt(at)andreas-strasser(dot)com> |
Cc: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: Design Question (Time Series Data) |
Date: | 2007-10-04 09:20:19 |
Message-ID: | 162867790710040220j3297053bo1b41f12af000273f@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
2007/10/4, Andreas Strasser <kontakt(at)andreas-strasser(dot)com>:
> Hello,
>
> i'm currently designing an application that will retrieve economic data
> (mainly time series)from different sources and distribute it to clients.
> It is supposed to manage around 20.000 different series with differing
> numbers of observations (some have only a few dozen observations, others
> several thousand) and i'm now faced with the decision where and how to
> store the data.
>
> So far, i've come up with 3 possible solutions
>
> 1) Storing the observations in one big table with fields for the series,
> position within the series and the value (float)
> 2) Storing the observations in an array (either in the same table as the
> series or in an extra data-table)
> 3) Storing the observations in CSV-files on the hard disk and only
> putting a reference to it in the database
>
I did good experience with 2 variant. PostgreSQL needs 24bytes for
head of every row, so isn't too much efective store one field to one
row. You can simply do transformation between array and table now.
Pavel
> I expect that around 50 series will be updated daily - which would mean
> that for solution nr. 1 around 50.000 rows would be deleted and appended
> (again) every day.
>
> I personally prefer solution 1, because it is the easiest to implement
> (i need to make different calculations and be able to transform the data
> easily), but i'm concerned about perfomance and overhead. It effectively
> triples the space needed (over solutions nr. 2) and will result in huge
> index files.
>
> Are there any other storage methods which are better suited for this
> kind of data? How can i avoid trouble resulting from the daily updates
> (high number of deleted rows)? Which method would you prefer?
>
> Thanks in advance!
>
> Andreas
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: explain analyze is your friend
>
From | Date | Subject | |
---|---|---|---|
Next Message | Jorge Godoy | 2007-10-04 09:34:01 | Re: Design Question (Time Series Data) |
Previous Message | Albe Laurenz | 2007-10-04 08:37:59 | Re: About selected data dump |