From: | "Scott Morrison" <smorrison(at)navtechinc(dot)com> |
---|---|
To: | <greg(at)turnstep(dot)com>, <pgsql-novice(at)postgresql(dot)org> |
Subject: | Re: |
Date: | 2003-02-10 22:46:43 |
Message-ID: | DHEKKBMAPBCGPOFEBIFFCEKECBAA.smorrison@navtechinc.com |
Lists: pgsql-novice
Well I ran the queries through some more tests and the original query is
still faster than any of the suggested queries.
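For anyone who wants to reproduce the timings, the table I tested against looks roughly like the sketch below. The id and date columns are the real ones used in the queries; the value column is just a placeholder for the actual payload columns, and the row counts are taken from the EXPLAIN output further down.

CREATE TABLE sample (
    id    integer,   -- roughly 100,000 distinct ids show up in the result
    date  date,      -- typically a couple of rows per id with different dates
    value integer    -- placeholder, not the real column name
);
-- The test table holds roughly 180,000 rows, of which about 160,000
-- have date <= '2003-02-07'.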
Query 1 (Original):
ANALYZE sample;
CREATE INDEX sample_id ON sample(id);
CREATE INDEX sample_date ON sample(date);
ANALYZE sample;
EXPLAIN ANALYZE
SELECT * FROM sample a WHERE (id,date) IN
(SELECT id,max(date) FROM sample
WHERE id=a.id AND date<='2003-02-07'
GROUP BY id);
DROP INDEX sample_id;
DROP INDEX sample_date;
Query 2 (Greg):
ANALYZE sample;
CREATE INDEX sample_id ON sample(id);
CREATE INDEX sample_date ON sample(date);
ANALYZE sample;
EXPLAIN ANALYZE
SELECT * FROM sample a WHERE date<='2003-02-07' AND oid=
(SELECT oid FROM sample WHERE id=a.id AND
date<='2003-02-07' ORDER BY date DESC LIMIT 1);
DROP INDEX sample_id;
DROP INDEX sample_date;
As for Greg's query, I tried it without an index on id and it took a very long
time (I cancelled it after about 5 minutes, which is why the id index is
included above). I also took out the ORDER BY id, because it slowed the query
down by another 3 seconds or so and I don't need the ordering for what I'm
pulling out.
Below are the EXPLAIN ANALYZE outputs for both queries.
Query 1 Analysis (Average time 21779.98ms):
Seq Scan on sample a  (cost=0.00..13702.91 rows=3126 width=12) (actual time=58.67..21770.83 rows=99999 loops=1)
  SubPlan
    ->  Aggregate  (cost=0.00..3.74 rows=1 width=8) (actual time=0.11..0.11 rows=1 loops=180566)
          ->  Group  (cost=0.00..3.74 rows=1 width=8) (actual time=0.06..0.10 rows=2 loops=180566)
                ->  Index Scan using sample_id on sample  (cost=0.00..3.73 rows=1 width=8) (actual time=0.05..0.08 rows=2 loops=180566)
Query 2 Analysis (Average time 28358.92ms):
Index Scan using sample_date on sample a  (cost=0.00..24294.52 rows=1 width=12) (actual time=0.42..27831.20 rows=99999 loops=1)
  SubPlan
    ->  Limit  (cost=3.84..3.84 rows=1 width=8) (actual time=0.15..0.16 rows=1 loops=160446)
          ->  Sort  (cost=3.84..3.84 rows=1 width=8) (actual time=0.15..0.15 rows=2 loops=160446)
                ->  Index Scan using sample_id on sample  (cost=0.00..3.83 rows=1 width=8) (actual time=0.05..0.07 rows=2 loops=160446)
So in conclusion, it doesn't look like I'll be able to get the query any
faster than my original query. I was really hoping for something that would
run in 5-10% of the time, but I guess that's not possible.
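One variation I have not timed on this table is PostgreSQL's DISTINCT ON idiom,
which expresses the same "latest row per id up to a cutoff date" idea. I'm only
sketching it here in case someone wants to benchmark it against the queries
above; whether it beats the correlated subquery will depend on the indexes and
the planner.

SELECT DISTINCT ON (id) *
FROM sample
WHERE date <= '2003-02-07'
ORDER BY id, date DESC;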
Thanks for the suggestions.
-- Scott
-----Original Message-----
From: pgsql-novice-owner(at)postgresql(dot)org
[mailto:pgsql-novice-owner(at)postgresql(dot)org] On Behalf Of greg(at)turnstep(dot)com
Sent: Monday, February 10, 2003 2:50 PM
To: pgsql-novice(at)postgresql(dot)org
Subject: Re: [NOVICE]
>>> Wouldn't there need to be an index on OID for your query to be fast?
>>>
>> No, the single index on the "date" field is enough. See the EXPLAIN
>> output from my first post in the original thread:
>>
> However the original poster suggested there would ultimately be 200,000
> or more rows in the table.
Yes, and my EXPLAIN is on a table with 200,000 rows in it. Hence the fact
that it took over 5 seconds - which is still very fast compared to the
original query.
--
Greg Sabino Mullane greg(at)turnstep(dot)com
PGP Key: 0x14964AC8 200302101447