| From: | Louis-David Mitterrand <vindex+lists-pgsql-performance(at)apartia(dot)org> |
|---|---|
| To: | pgsql-performance(at)postgresql(dot)org |
| Subject: | Re: best db schema for time series data? |
| Date: | 2010-11-19 09:46:24 |
| Message-ID: | 20101119094624.GA27168@apartia.fr |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-performance |
On Tue, Nov 16, 2010 at 11:35:24AM -0500, Chris Browne wrote:
> vindex+lists-pgsql-performance(at)apartia(dot)org (Louis-David Mitterrand)
> writes:
> > I have to collect lots of prices from web sites and keep track of their
> > changes. What is the best option?
> >
> > 1) one 'price' row per price change:
> >
> > create table price (
> > id_price primary key,
> > id_product integer references product,
> > price integer
> > );
> >
> > 2) a single 'price' row containing all the changes:
> >
> > create table price (
> > id_price primary key,
> > id_product integer references product,
> > price integer[] -- prices are 'pushed' on this array as they change
> > );
> >
> > Which is bound to give the best performance, knowing I will often need
> > to access the latest and next-to-latest prices?
>
> I'd definitely bias towards #1, but with a bit of a change...
>
> create table product (
> id_product serial primary key
> );
>
> create table price (
> id_product integer references product,
> as_at timestamptz default now(),
> primary key (id_product, as_at),
> price integer
> );
Hi Chris,
So an "id_price serial" on the price table is not necessary in your
opinion? I am using "order by id_price limit X" or "max(id_price)" to
get at the most recent prices.
> The query to get the last 5 prices for a product should be
> splendidly efficient:
>
> select price, as_at from price
> where id_product = 17
> order by as_at desc limit 5;
>
> (That'll use the PK index perfectly nicely.)
>
> If you needed higher performance, for "latest price," then I'd add a
> secondary table, and use triggers to copy latest price into place:
>
> create table latest_prices (
> id_product integer primary key references product,
> price integer
> );
I did the same thing with a 'price_dispatch' trigger and partitioned
tables (inheritance). It's definitely needed when the price database
grow into the millions.
Thanks,
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Louis-David Mitterrand | 2010-11-19 09:50:21 | Re: best db schema for time series data? |
| Previous Message | Kevin Grittner | 2010-11-18 21:00:07 | Re: Query Performance SQL Server vs. Postgresql |