From: | Aditya <aditya(at)grot(dot)org> |
---|---|
To: | sfpug(at)postgresql(dot)org |
Subject: | optimizing selects on times-eries data in Pg |
Date: | 2003-08-01 00:52:32 |
Message-ID: | 20030801005232.GA21352@mighty.grot.org |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | sfpug |
does anyone have pointers to optimizing selects on a table with logfile data?
it's really a time-series -- webserver access logs. I keep data for about a
week and there are around a million rows and selects are done on virtualhost
basis (and there is an index on virtualhost) however as the number of rows
increases, a sequential scan is less costly than an index scan (at least
according to the output of explain analyze)...
I've searched/googled and I don't see any "hints" on how to make this work
much better -- I might have to be content on storing 2 days worth of data at a
time to make the selects work in a reasonable amount of time...
Hints, pointers welcome,
Thanks,
Adi
The table looks like and rows are inserted in real-time (I delete all rows
older than 7 days nightly followed by a full vaccuum):
zp1139=> \d zp_log
Table "zp_log"
Column | Type | Modifiers
-------------+--------------------------+-----------
priority | character varying(15) |
category | character varying(30) |
thread | character varying(30) |
message | character varying(255) |
timestamp | timestamp with time zone |
remoteip | text | not null
remotehost | text |
referer | text |
status | integer | not null
port | integer | not null
virtualhost | text | not null
query | text |
bytes | integer | not null
useragent | text |
Indexes: zp_log_timestamp,
zp_log_vhost
zp1139=> \d zp_log_timestamp
Index "zp_log_timestamp"
Column | Type
-----------+--------------------------
timestamp | timestamp with time zone
btree
zp1139=> \d zp_log_vhost
Index "zp_log_vhost"
Column | Type
-------------+------
virtualhost | text
btree
From | Date | Subject | |
---|---|---|---|
Next Message | Josh Berkus | 2003-08-01 16:58:24 | Re: optimizing selects on times-eries data in Pg |
Previous Message | LN Cisneros | 2003-07-30 19:36:32 | Re: Next Meeting -- Buca di Beppo Wed. Aug. 6 |