From: | Kevin Kempter <kevink(at)consistentstate(dot)com> |
---|---|
To: | pgsql-performance(at)postgresql(dot)org |
Subject: | Query tuning |
Date: | 2009-08-19 16:28:41 |
Message-ID: | 200908191028.42105.kevink@consistentstate.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
Hi all;
we've been fighting this query for a few days now. we bumped up the statistict
target for the a.id , c.url_hits_id and the b.id columns below to 250 and ran
an analyze on the relevant tables. we killed it after 8hrs.
Note the url_hits table has > 1.4billion rows
Any suggestions?
$ psql -ef expl.sql pwreport
explain
select
a.id,
ident_id,
time,
customer_name,
extract('day' from timezone(e.name, to_timestamp(a.time))) as day,
category_id
from
pwreport.url_hits a left outer join
pwreport.url_hits_category_jt c on (a.id = c.url_hits_id),
pwreport.ident b,
pwreport.timezone e
where
a.ident_id = b.id
and b.timezone_id = e.id
and time >= extract ('epoch' from timestamp '2009-08-12')
and time < extract ('epoch' from timestamp '2009-08-13' )
and direction = 'REQUEST'
;
QUERY
PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Merge Right Join (cost=47528508.61..180424544.59 rows=10409251 width=53)
Merge Cond: (c.url_hits_id = a.id)
-> Index Scan using mt_url_hits_category_jt_url_hits_id_index on
url_hits_category_jt c (cost=0.00..122162596.63 rows=4189283233 width=8)
-> Sort (cost=47528508.61..47536931.63 rows=3369210 width=49)
Sort Key: a.id
-> Hash Join (cost=2565.00..47163219.21 rows=3369210 width=49)
Hash Cond: (b.timezone_id = e.id)
-> Hash Join (cost=2553.49..47116881.07 rows=3369210
width=37)
Hash Cond: (a.ident_id = b.id)
-> Seq Scan on url_hits a (cost=0.00..47051154.89
rows=3369210 width=12)
Filter: ((direction =
'REQUEST'::proxy_direction_enum) AND (("time")::double precision >=
1250035200::double precision) AND (("time")::double precision <
1250121600::double precision))
-> Hash (cost=2020.44..2020.44 rows=42644 width=29)
-> Seq Scan on ident b (cost=0.00..2020.44
rows=42644 width=29)
-> Hash (cost=6.78..6.78 rows=378 width=20)
-> Seq Scan on timezone e (cost=0.00..6.78 rows=378
width=20)
(15 rows)
From | Date | Subject | |
---|---|---|---|
Next Message | Grzegorz Jaśkiewicz | 2009-08-19 16:38:08 | Re: Query tuning |
Previous Message | Scott Marlowe | 2009-08-19 16:19:14 | Re: PG 8.3 and server load |