From: | Stuart Grimshaw <nospam(at)smgsystems(dot)co(dot)uk> |
---|---|
To: | pgsql-sql(at)postgresql(dot)org |
Subject: | Why would this slow the query down so much? |
Date: | 2001-10-10 12:35:27 |
Message-ID: | xgXw7.2205$uU2.85101@wards |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-sql |
I have 3 tables that I am trying to join together:
------------------------------------------
Table "caturljoin"
Attribute | Type |
-----------+---------+
category | integer |
url | integer |
Index: caturljoin_url
caturljoin_cat
Table "stories"
Attribute | Type |
-------------+------------------------+
urn | integer |
headline | character varying |
author | character varying |
source | integer |
story | text |
added | date |
description | character varying |
displayall | smallint |
fullurl | character varying(255) |
publish | smallint |
error | integer |
sourceurl | character varying(255) |
sourcename | character varying(100) |
rank | smallint |
Indices: stories_added,
stories_source,
stories_unique_story,
stories_urn_key
Table "urllist"
Attribute | Type |
--------------+------------------------+
urn | integer |
url | character varying(255) |
friendlyname | character varying(30) |
homepage | character varying(255) |
method | smallint |
script | character varying(20) |
params | character varying(500) |
collect | smallint |
section | smallint |
index_script | character varying |
regexp | character varying(100) |
baseurl | character varying(75) |
Index: urllist_urn
------------------------------------------
With the following SQL:
------------------------------------------
SELECT a.category, b.headline, b.added, c.friendlyname
FROM caturljoin as a
INNER JOIN stories as b ON (a.url = b.source)
INNER JOIN urllist as c ON (a.url = d.urn)
WHERE a.category = 93 ORDER BY b.added DESC LIMIT 1;
------------------------------------------
The results of explain for the above are:
------------------------------------------
psql:scratch.sql:5: NOTICE: QUERY PLAN:
Limit (cost=1587.30..1587.30 rows=1 width=44)
-> Sort (cost=1587.30..1587.30 rows=1 width=44)
-> Merge Join (cost=249.89..1587.29 rows=1 width=44)
-> Sort (cost=249.89..249.89 rows=409 width=28)
-> Nested Loop (cost=0.00..232.15 rows=409 width=28)
-> Index Scan using caturljoin_cat on caturljoin
a (cost=0.00..5.09 rows=7 width=8)
-> Index Scan using stories_source on stories b
(cost=0.00..34.41 rows=29 width=20)
-> Index Scan using urllist_urn on urllist c
(cost=0.00..1323.69 rows=505 width=16)
EXPLAIN
------------------------------------------
and as you might be able to guess the query takes an age to complete.
If I remove the table urllist from the query, I get a much better response:
------------------------------------------
psql:scratch.sql:4: NOTICE: QUERY PLAN:
Limit (cost=0.00..207.74 rows=1 width=28)
-> Nested Loop (cost=0.00..84945.18 rows=409 width=28)
-> Index Scan Backward using stories_added on stories b
(cost=0.00..2310.04 rows=16149 width=20)
-> Index Scan using caturljoin_url on caturljoin a
(cost=0.00..5.10 rows=1 width=8)
EXPLAIN
------------------------------------------
Currently the tables contain the following rows of data:
------------------------------------------
caturljoin: 653 rows
urllist: 505 rows
stories: 21554 rows
------------------------------------------
Can anyone tell me why the inclusion of urllist would slow it down so much,
and what can I do to improve the speed of the query?
From | Date | Subject | |
---|---|---|---|
Next Message | Gerardo Perosio | 2001-10-10 13:06:01 | SELECT FOR UPDATE CLAUSE |
Previous Message | Allan Engelhardt | 2001-10-09 23:16:42 | Re: ROUND function ?? |