From: | Matthew Engelbert <mje1975(at)yahoo(dot)com> |
---|---|
To: | pgsql-novice(at)postgresql(dot)org |
Subject: | Re: Indexing Strategy |
Date: | 2004-12-11 00:57:02 |
Message-ID: | 20041211005702.75029.qmail@web60302.mail.yahoo.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-novice pgsql-sql |
The select query will be like this:
select * from quote where sybmol = 'MSFT' and extract(doy from time) = 340
order by time
I need to select all the records for a particular symbol, for a particular day,
ordered by time. I might need to select the records in the same way, but for a
shorter time interval, like some hour during the day. I'm using Postgresql
version 7.4.
Other info I forgot to mention.... this data is historical, and so I won't be
doing lots of inserts, updates or deletes - I'll just load a big chunk of data
once in a while, and rebuild the index. I realize that I can also add a lookup
table for symbol, as there are only about 150 unique values. Time values are
about 99% unique. Yes, I'm sure I need to tune the db better. I'm new to
Postgresql. Thanks for the help.
--- Josh Berkus <josh(at)agliodbs(dot)com> wrote:
> Matthew,
>
> > I'm having trouble correctly indexing a large table. The table will be
> > receiving about 30 million rows/day, and has the following definition:
> >
> > CREATE TABLE quote (
> > symbol varchar (5),
> > price numeric (7,4),
> > time timestamp
> > );
> >
> > I need to configure the indices so that I can minimize the time to request
> > all the records for a particular symbol, for a particular day, ordered by
> > time.
>
> Please post the query you're using. How are you selecting the day? What
> version of PostgreSQL are you using?
>
> > I've tried clustering a single index based on symbol, then time, but this
> > operation takes much too long. I think this is because this operation
> > would have O(N^2) complexity(?).
>
> Nope, just because it's a very large table and you're probably swapping.
> Try
> increasing sort_mem dramatically, like half your RAM.
>
> > I think my next best option would be to
> > cluster on an index based on symbol, then have another index on time.
>
> Not that either. You need a multicolumn index. When you answer the above
> quesitons, I'll explain it.
>
> > Does any one know of some resources on the web that discusses indexing
> > strategies ? Any help would be appreciated? Thanks.
>
> Not that covers your question. I'm writing a book which covers this right
> now, but it won't be published until January 2006.
>
> --
> --Josh
>
> Josh Berkus
> Aglio Database Solutions
> San Francisco
>
From | Date | Subject | |
---|---|---|---|
Next Message | Michael Fuhr | 2004-12-11 01:55:30 | Re: Get comment |
Previous Message | Michael Fuhr | 2004-12-11 00:55:42 | Re: Version 8 & column order |
From | Date | Subject | |
---|---|---|---|
Next Message | Stephan Szabo | 2004-12-11 02:28:38 | Re: Query is not using index when it should |
Previous Message | Stephan Szabo | 2004-12-11 00:48:24 | Re: Cast NULL into Timestamp? |