From: | Stephan Szabo <sszabo(at)megazone(dot)bigpanda(dot)com> |
---|---|
To: | Aditya <aditya(at)grot(dot)org> |
Cc: | Josh Berkus <josh(at)agliodbs(dot)com>, <sfpug(at)postgresql(dot)org> |
Subject: | Re: optimizing selects on time-series data in Pg |
Date: | 2003-08-01 18:07:41 |
Message-ID: | 20030801110243.C55372-100000@megazone.bigpanda.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | sfpug |
On Fri, 1 Aug 2003, Aditya wrote:
> On Fri, Aug 01, 2003 at 09:58:24AM -0700, Josh Berkus wrote:
> > Can you post your "slow queries"?
>
> Here's what I did:
>
> - I did a vacuum
>
> - did a select count(*) for a few different virtualhosts to see how many rows
> were being dealt with
>
> - run explain analyze on each of those select count(*) and noted the
> difference, ie. the planner decided that a select on a virtual host with many
> more rows was better served by sequential scan than index scan
>
> The typical query is something like show me all the rows for a virtualhost in
> the last two days with the latest first, ie.
>
> select
> *
> from zp_log
> where
> timestamp > (now() - interval '2 days')
> and virtualhost = 'www.bloki.com';
>
> - dropped the virtualhost index and recreated it, verified the behaviour was
> the same as above
>
> - forced the planner not to use sequential scan and verified that the index
> scan was in fact more costly then, here's a somewhat edited transcript (only
> for brevity):
Hmm, does a two column index on timestamp,virtualhost (or possibly the
other way around) help at all when forcing an index scan? I wouldn't
guess that it'd help enough, but it might be worth trying.
From | Date | Subject | |
---|---|---|---|
Next Message | Aditya | 2003-08-01 18:25:07 | Re: optimizing selects on time-series data in Pg |
Previous Message | Josh Berkus | 2003-08-01 18:05:22 | Re: optimizing selects on time-series data in Pg |