From: Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>
To: Ow Mun Heng <Ow(dot)Mun(dot)Heng(at)wdc(dot)com>
Cc: pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: Evil Nested Loops
Date: 2009-06-03 07:28:14
Message-ID: dcc563d10906030028j3111de02id57a82162929d358@mail.gmail.com
Lists: pgsql-general
On Wed, Jun 3, 2009 at 12:32 AM, Ow Mun Heng <Ow(dot)Mun(dot)Heng(at)wdc(dot)com> wrote:
> What can I do about this plan?
>
>
>
> HashAggregate (cost=8035443.21..8035445.17 rows=157 width=24)
> -> Nested Loop (cost=37680.95..7890528.72 rows=28982898 width=24) <<<<< suspect
> Join Filter: ((a.test_run_start_date_time >= date.start_time) AND (a.test_run_start_date_time <= date.end_time))
> -> Bitmap Heap Scan on d_trh_pbert a (cost=37677.22..1369372.99 rows=1661440 width=24)
> Recheck Cond: ((test_run_start_date_time >= '2009-05-08 07:00:00'::timestamp without time zone) AND (test_run_start_date_time <= '2009-05-15 06:59:59'::timestamp without time zone))
> -> Bitmap Index Scan on idx_d_trh_pbert_sdate (cost=0.00..37261.86 rows=1661440 width=0)
> Index Cond: ((test_run_start_date_time >= '2009-05-08 07:00:00'::timestamp without time zone) AND (test_run_start_date_time <= '2009-05-15 06:59:59'::timestamp without time zone))
> -> Materialize (cost=3.73..5.30 rows=157 width=24)
> -> Seq Scan on lookup_ww_date2 date (cost=0.00..3.57 rows=157 width=24)
>
>
> The query
> select week_id,count(serial_number)
> from d_trh a
> inner join lookup_ww_date2 date
> on ( a.test_run_start_date_time between start_time and end_time)
> where a.test_run_start_date_time between '5/8/2009 7:00:00 AM' and '5/15/2009 6:59:59 AM'
> group by
> week_id
>
> the lookup_ww_date looks like this
>
> ( week_id bigint
> start_time timestamp
> end_time timestamp
> )
>
> eg: 200949|5/8/2009 7:00:00am|5/15/2009 6:59:59AM
>
> The whole aim of this exercise is to look at which WW the particular date falls into.
OK, looking at your query and the plan, what's happening is roughly
this: 157 rows times 1,661,440 rows (a cross product) is 260M or so,
and the join filter then throws away everything but the matching
pairs. An EXPLAIN ANALYZE might shed more light, but given the high
cost of the nested loop in this query, I'm guessing the only thing you
can do is throw more work_mem at it. But the design is fundamentally
flawed, I think.
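For the work_mem route, something along these lines (a sketch only; the 256MB figure is an illustrative guess, not a recommendation, so tune it to your machine and re-check the plan):

```sql
-- Raise work_mem for this session only, then re-run with EXPLAIN ANALYZE
-- to see whether the plan or the actual runtime changes.
SET work_mem = '256MB';  -- illustrative value, adjust to available RAM

EXPLAIN ANALYZE
SELECT week_id, count(serial_number)
FROM d_trh a
INNER JOIN lookup_ww_date2 date
   ON (a.test_run_start_date_time BETWEEN start_time AND end_time)
WHERE a.test_run_start_date_time
      BETWEEN '2009-05-08 07:00:00' AND '2009-05-15 06:59:59'
GROUP BY week_id;
```

SET applies only to the current session, so this won't disturb other connections while you experiment.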
If you're always working with dates, maybe joining on
date_trunc('day', test_run_start_date) = date_trunc('day', startdate)
with an index on both expressions would work?
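A sketch of that rewrite, using the column names from the original query (untested; note it assumes the lookup table carries one row per day, whereas lookup_ww_date2 as shown holds one row per week, so the equality join only matches timestamps landing on a week's start day):

```sql
-- Expression indexes on both sides of the proposed equality join.
-- date_trunc('day', ...) on timestamp without time zone is immutable,
-- so it can back an index.
CREATE INDEX idx_d_trh_day
    ON d_trh (date_trunc('day', test_run_start_date_time));
CREATE INDEX idx_ww_day
    ON lookup_ww_date2 (date_trunc('day', start_time));

SELECT week_id, count(serial_number)
FROM d_trh a
INNER JOIN lookup_ww_date2 d
   ON date_trunc('day', a.test_run_start_date_time)
    = date_trunc('day', d.start_time)
WHERE a.test_run_start_date_time
      BETWEEN '2009-05-08 07:00:00' AND '2009-05-15 06:59:59'
GROUP BY week_id;
```

An equality join like this lets the planner consider a hash or merge join instead of the nested loop that the BETWEEN range condition forces.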