From: | Brian Herlihy <btherl(at)yahoo(dot)com(dot)au> |
---|---|
To: | Postgresql Performance <pgsql-performance(at)postgresql(dot)org> |
Subject: | An unwanted seqscan |
Date: | 2007-02-14 08:40:13 |
Message-ID: | 379816.98018.qm@web52311.mail.yahoo.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
Hi,
I am having trouble understanding why a seqscan is chosen for this query.
In practice the seqscan is very expensive, whereas the nested loop is usually quite fast, even with several hundred rows returned from meta_keywords_url.
The server is running version 8.1.3, and both tables were analyzed recently. meta_keywords contains around 25% dead rows, meta_keywords_url contains no dead rows.
I have included the query written both as a subquery and as a join.
Thanks for any assistance!
Brian
live=> explain select * from meta_keywords where url_id in (select url_id from meta_keywords_url where host = 'postgresql.org');
QUERY PLAN
---------------------------------------------------------------------------------------------------------
Hash IN Join (cost=1755.79..545380.52 rows=9442 width=29)
Hash Cond: ("outer".url_id = "inner".url_id)
-> Seq Scan on meta_keywords (cost=0.00..507976.54 rows=7110754 width=29)
-> Hash (cost=1754.35..1754.35 rows=576 width=4)
-> Bitmap Heap Scan on meta_keywords_url (cost=11.02..1754.35 rows=576 width=4)
Recheck Cond: ((host)::text = 'postgresql.org'::text)
-> Bitmap Index Scan on meta_keywords_url_host_path (cost=0.00..11.02 rows=576 width=0)
Index Cond: ((host)::text = 'postgresql.org'::text)
(8 rows)
live=> set enable_seqscan=off;
SET
live=> explain select * from meta_keywords where url_id in (select url_id from meta_keywords_url where host = 'postgresql.org');
QUERY PLAN
---------------------------------------------------------------------------------------------------------
Nested Loop (cost=1755.79..3161748.83 rows=9442 width=29)
-> HashAggregate (cost=1755.79..1761.55 rows=576 width=4)
-> Bitmap Heap Scan on meta_keywords_url (cost=11.02..1754.35 rows=576 width=4)
Recheck Cond: ((host)::text = 'postgresql.org'::text)
-> Bitmap Index Scan on meta_keywords_url_host_path (cost=0.00..11.02 rows=576 width=0)
Index Cond: ((host)::text = 'postgresql.org'::text)
-> Index Scan using meta_keywords_url_id on meta_keywords (cost=0.00..5453.28 rows=2625 width=29)
Index Cond: (meta_keywords.url_id = "outer".url_id)
(8 rows)
live=> explain select * from meta_keywords join meta_keywords_url using (url_id) where host = 'postgresql.org'; QUERY PLAN
---------------------------------------------------------------------------------------------------------
Hash Join (cost=1758.52..543685.43 rows=9297 width=107)
Hash Cond: ("outer".url_id = "inner".url_id)
-> Seq Scan on meta_keywords (cost=0.00..506859.29 rows=6994929 width=28)
-> Hash (cost=1757.08..1757.08 rows=577 width=83)
-> Bitmap Heap Scan on meta_keywords_url (cost=11.02..1757.08 rows=577 width=83)
Recheck Cond: ((host)::text = 'postgresql.org'::text)
-> Bitmap Index Scan on meta_keywords_url_host_path (cost=0.00..11.02 rows=577 width=0)
Index Cond: ((host)::text = 'postgresql.org'::text)
(8 rows)
live=> set enable_seqscan=off;
SET
live=> explain select * from meta_keywords join meta_keywords_url using (url_id) where host = 'postgresql.org';
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
Nested Loop (cost=0.00..3348211.21 rows=9297 width=107)
-> Index Scan using meta_keywords_url_host_path on meta_keywords_url (cost=0.00..2230.24 rows=577 width=83)
Index Cond: ((host)::text = 'postgresql.org'::text)
-> Index Scan using meta_keywords_url_id on meta_keywords (cost=0.00..5765.81 rows=2649 width=28)
Index Cond: (meta_keywords.url_id = "outer".url_id)
(5 rows)
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2007-02-14 08:53:54 | Re: An unwanted seqscan |
Previous Message | Tom Lane | 2007-02-14 07:15:44 | Re: cube operations slower than geo_distance() on production server |