From: | Scott Carey <scott(at)richrelevance(dot)com> |
---|---|
To: | Kevin Kempter <kevink(at)consistentstate(dot)com>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org> |
Subject: | Re: Query tuning |
Date: | 2009-08-19 17:17:26 |
Message-ID: | C6B181B6.F2DD%scott@richrelevance.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
On 8/19/09 9:28 AM, "Kevin Kempter" <kevink(at)consistentstate(dot)com> wrote:
> Hi all;
>
> we've been fighting this query for a few days now. we bumped up the statistict
> target for the a.id , c.url_hits_id and the b.id columns below to 250 and ran
> an analyze on the relevant tables. we killed it after 8hrs.
>
> Note the url_hits table has > 1.4billion rows
>
> Any suggestions?
>
Have you tried setting work_mem higher for just this query?
The big estimated cost is the sequential scan on url_hits. But in reality,
if the estimates are off the sort and index scan at the end might be your
bottleneck. Larger work_mem might make it choose another plan there.
But if the true cost is the sequential scan on url_hits, then only an index
there will help.
>
>
> $ psql -ef expl.sql pwreport
> explain
> select
> a.id,
> ident_id,
> time,
> customer_name,
> extract('day' from timezone(e.name, to_timestamp(a.time))) as day,
> category_id
> from
> pwreport.url_hits a left outer join
> pwreport.url_hits_category_jt c on (a.id = c.url_hits_id),
> pwreport.ident b,
> pwreport.timezone e
> where
> a.ident_id = b.id
> and b.timezone_id = e.id
> and time >= extract ('epoch' from timestamp '2009-08-12')
> and time < extract ('epoch' from timestamp '2009-08-13' )
> and direction = 'REQUEST'
> ;
>
> QUERY
> PLAN
> ------------------------------------------------------------------------------
> ------------------------------------------------------------------------------
> --------------------------------------------------------
> Merge Right Join (cost=47528508.61..180424544.59 rows=10409251 width=53)
> Merge Cond: (c.url_hits_id = a.id)
> -> Index Scan using mt_url_hits_category_jt_url_hits_id_index on
> url_hits_category_jt c (cost=0.00..122162596.63 rows=4189283233 width=8)
> -> Sort (cost=47528508.61..47536931.63 rows=3369210 width=49)
> Sort Key: a.id
> -> Hash Join (cost=2565.00..47163219.21 rows=3369210 width=49)
> Hash Cond: (b.timezone_id = e.id)
> -> Hash Join (cost=2553.49..47116881.07 rows=3369210
> width=37)
> Hash Cond: (a.ident_id = b.id)
> -> Seq Scan on url_hits a (cost=0.00..47051154.89
> rows=3369210 width=12)
> Filter: ((direction =
> 'REQUEST'::proxy_direction_enum) AND (("time")::double precision >=
> 1250035200::double precision) AND (("time")::double precision <
> 1250121600::double precision))
> -> Hash (cost=2020.44..2020.44 rows=42644 width=29)
> -> Seq Scan on ident b (cost=0.00..2020.44
> rows=42644 width=29)
> -> Hash (cost=6.78..6.78 rows=378 width=20)
> -> Seq Scan on timezone e (cost=0.00..6.78 rows=378
> width=20)
> (15 rows)
>
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance
>
From | Date | Subject | |
---|---|---|---|
Next Message | Nikolas Everett | 2009-08-19 17:31:30 | Re: Query tuning |
Previous Message | Grzegorz Jaśkiewicz | 2009-08-19 16:38:08 | Re: Query tuning |