From: "Dan Langille" <dan(at)langille(dot)org>
To: pgsql-sql(at)postgresql(dot)org
Subject: using LIMIT only on primary table
Date: 2002-07-27 21:26:02
Message-ID: 3D42D7AA.27447.3EE190A0@localhost
Lists: pgsql-sql
This is an extension of the problem solved by
http://archives.postgresql.org/pgsql-sql/2002-03/msg00020.php, but
with a slightly different complication.

I want to get the last 100 port commits from the database. Commits
are stored in commit_log, and commit_log_ports relates commits to
ports. A given commit may affect more than one port (i.e. there is a
1-N relationship between commit_log and commit_log_ports).
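For reference, the relevant pieces of the schema look roughly like
this (column types from memory, other columns omitted; the index and
constraint names match what shows up in the plans below):

CREATE TABLE commit_log (
    id          serial PRIMARY KEY,   -- commit_log_pkey
    commit_date timestamp
    -- ... other columns ...
);

CREATE TABLE commit_log_ports (
    commit_log_id integer REFERENCES commit_log (id)
    -- ... port columns ...
);

CREATE INDEX commit_log_ports_cli ON commit_log_ports (commit_log_id);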
This gives me the last 100 commits:
SELECT commit_log.*
FROM commit_log
ORDER BY commit_date DESC, id
LIMIT 100;
So a starting point for the last 100 port commits is:
explain analyze
SELECT distinct commit_log.*
FROM commit_log_ports, commit_log
WHERE commit_log.id = commit_log_ports.commit_log_id
ORDER BY commit_log.commit_date DESC, commit_log_ports.commit_log_id
LIMIT 100;
But it has a very high cost:
Limit  (cost=11275.92..11283.42 rows=100 width=55) (actual time=5769.07..5771.92 rows=100 loops=1)
  ->  Unique  (cost=11275.92..11643.73 rows=4904 width=55) (actual time=5769.05..5770.93 rows=101 loops=1)
        ->  Sort  (cost=11275.92..11275.92 rows=49042 width=55) (actual time=5769.04..5769.68 rows=112 loops=1)
              ->  Hash Join  (cost=3478.15..6387.22 rows=49042 width=55) (actual time=1263.69..4319.53 rows=49042 loops=1)
                    ->  Seq Scan on commit_log_ports  (cost=0.00..825.42 rows=49042 width=4) (actual time=0.12..346.32 rows=49042 loops=1)
                    ->  Hash  (cost=1749.51..1749.51 rows=42951 width=51) (actual time=1074.15..1074.15 rows=0 loops=1)
                          ->  Seq Scan on commit_log  (cost=0.00..1749.51 rows=42951 width=51) (actual time=0.14..396.99 rows=42953 loops=1)
Total runtime: 6158.10 msec
I'd like to get that time down. I suspect the cost is high because
the entire table is joined before the LIMIT is applied. When I add
"commit_log.id = 1" to the WHERE clause, the query plan changes to:
Limit  (cost=30.42..30.43 rows=1 width=55) (actual time=0.96..0.99 rows=1 loops=1)
  ->  Unique  (cost=30.42..30.43 rows=1 width=55) (actual time=0.95..0.97 rows=1 loops=1)
        ->  Sort  (cost=30.42..30.42 rows=1 width=55) (actual time=0.94..0.94 rows=1 loops=1)
              ->  Nested Loop  (cost=0.00..30.41 rows=1 width=55) (actual time=0.61..0.64 rows=1 loops=1)
                    ->  Index Scan using commit_log_pkey on commit_log  (cost=0.00..5.93 rows=1 width=51) (actual time=0.31..0.32 rows=1 loops=1)
                    ->  Index Scan using commit_log_ports_cli on commit_log_ports  (cost=0.00..24.41 rows=6 width=4) (actual time=0.25..0.27 rows=1 loops=1)
I think that confirms that my indexes and PK/FK are set correctly.
I could do something like:
SELECT distinct commit_log_id
FROM commit_log_ports
ORDER BY commit_log_id desc
LIMIT 100;
But that doesn't take into account the fact that commits can be added
out of order and are not necessarily added in commit_date order.
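One shape I haven't tried yet: replace the join + DISTINCT with an
EXISTS subquery, so the outer query stays on commit_log alone and the
LIMIT might apply before any join work. A sketch, untested against my
data:

SELECT commit_log.*
FROM commit_log
WHERE EXISTS (SELECT 1
              FROM commit_log_ports
              WHERE commit_log_ports.commit_log_id = commit_log.id)
ORDER BY commit_date DESC, id
LIMIT 100;

Whether the planner actually stops early here rather than checking
every commit_log row, I don't know.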
Any clues please?
thanks
--
Dan Langille
And yes, I'm looking for a computer job:
http://www.freebsddiary.org/dan_langille.php