Seqscan problem

From: Vlad Arkhipov <arhipov(at)dc(dot)baikal(dot)ru>
To: pgsql-performance(at)postgresql(dot)org
Subject: Seqscan problem
Date: 2008-05-06 11:14:03
Message-ID: 48203D7B.2000809@dc.baikal.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

I've just discovered a problem with quite simple query. It's really
confusing me.
Postgresql 8.3.1, random_page_cost=1.1. All tables were analyzed before
query.

EXPLAIN ANALYZE
SELECT i.c, d.r
FROM i
JOIN d ON d.cr = i.c
WHERE i.dd between '2007-08-01' and '2007-08-30'

Hash Join (cost=2505.42..75200.16 rows=98275 width=16) (actual
time=2728.959..23118.632 rows=93159 loops=1)
Hash Cond: (d.c = i.c)
-> Seq Scan on d d (cost=0.00..61778.75 rows=5081098 width=16)
(actual time=0.075..8859.807 rows=5081098 loops=1)
-> Hash (cost=2226.85..2226.85 rows=89862 width=8) (actual
time=416.526..416.526 rows=89473 loops=1)
-> Index Scan using i_dd on i (cost=0.00..2226.85 rows=89862
width=8) (actual time=0.078..237.504 rows=89473 loops=1)
Index Cond: ((dd >= '2007-08-01'::date) AND (dd <=
'2007-08-30'::date))
Total runtime: 23246.640 ms

EXPLAIN ANALYZE
SELECT i.*, d.r
FROM i
JOIN d ON d.c = i.c
WHERE i.dd between '2007-08-01' and '2007-08-30'

Nested Loop (cost=0.00..114081.69 rows=98275 width=416) (actual
time=0.114..1711.256 rows=93159 loops=1)
-> Index Scan using i_dd on i (cost=0.00..2226.85 rows=89862
width=408) (actual time=0.075..207.574 rows=89473 loops=1)
Index Cond: ((dd >= '2007-08-01'::date) AND (dd <=
'2007-08-30'::date))
-> Index Scan using d_uniq on d (cost=0.00..1.24 rows=2 width=16)
(actual time=0.007..0.009 rows=1 loops=89473)
Index Cond: (d.c = i.c)
Total runtime: 1839.228 ms

And this never happened with LEFT JOIN.

EXPLAIN ANALYZE
SELECT i.c, d.r
FROM i
LEFT JOIN d ON d.cr = i.c
WHERE i.dd between '2007-08-01' and '2007-08-30'

Nested Loop Left Join (cost=0.00..114081.69 rows=98275 width=16)
(actual time=0.111..1592.225 rows=93159 loops=1)
-> Index Scan using i_dd on i (cost=0.00..2226.85 rows=89862
width=8) (actual time=0.072..210.421 rows=89473 loops=1)
Index Cond: ((dd >= '2007-08-01'::date) AND (dd <=
'2007-08-30'::date))
-> Index Scan using d_uniq on d (cost=0.00..1.24 rows=2 width=16)
(actual time=0.007..0.009 rows=1 loops=89473)
Index Cond: (d.c = i.c)
"Total runtime: 1720.185 ms"

d_uniq is unique index on d(r, ...).

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Frank van Vugt 2008-05-06 11:55:52 Re: plan difference between set-returning function with ROWS within IN() and a plain join
Previous Message PFC 2008-05-06 09:53:17 Re: plan difference between set-returning function with ROWS within IN() and a plain join