From: | Jeff Janes <jeff(dot)janes(at)gmail(dot)com> |
---|---|
To: | Matheus de Oliveira <matioli(dot)matheus(at)gmail(dot)com> |
Cc: | "Huang, Suya" <Suya(dot)Huang(at)au(dot)experian(dot)com>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org> |
Subject: | Re: how to change the provoke table in hash join |
Date: | 2014-09-11 18:09:26 |
Message-ID: | CAMkU=1z3j2DAzkQ4tSDYNkZR0HX1Z8Z9_u_fMpJ9tg3myq5gCw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
On Thu, Sep 11, 2014 at 7:09 AM, Matheus de Oliveira <
matioli(dot)matheus(at)gmail(dot)com> wrote:
>
> On Wed, Sep 10, 2014 at 10:05 PM, Huang, Suya <Suya(dot)Huang(at)au(dot)experian(dot)com>
> wrote:
>
>> --plan 1, 10 seconds were spent on sequential scan on term_weekly table.
>>
>>
>>
>> dev=# explain analyze select distinct cs_id from lookup_weekly n inner
>> join term_weekly s on s.b_id=n.b_id and s.date=n.date where term in
>> ('cat'::text);
>>
>>
>>
>>
>>
>>
>> QUERY PLAN
>>
>>
>> -----------------------------------------------------------------------------------------------------------------------------------------------------------------
>>
>> HashAggregate (cost=2100211.06..2100211.11 rows=5 width=4) (actual
>> time=27095.470..27095.487 rows=138 loops=1)
>> ...
>>
>>
>>
>> --plan 2, only 1 second spent on index scan of term_weekly table,
>> however, as it selects the big table to do the hashing, it takes 22 seconds
>> for the hash to complete. The advantage get from index has been totally
>> lost because of this join order.
>>
>>
>>
>>
>> QUERY PLAN
>>
>>
>> -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>>
>> HashAggregate (cost=1429795.17..1429795.22 rows=5 width=4) (actual
>> time=22991.289..22991.307 rows=138 loops=1)
>> ...
>
>
> Am I reading something wrong here? I haven't looked all the plan, but the
> second is faster (overall), so why do you think you need a hint or change
> what the planner choose? For me looks like using the index is the best for
> this situation. Could you try running this multiple times and taking the
> min/max/avg time of both?
>
The difference in time could be a caching effect, not a reproducible
difference.
The 2nd plan uses 3GB of memory, and there might be better uses for that
memory.
Currently memory is un-costed, other than "cliff costing" once you thinks
it will exceed work_mem, which I think is a problem. Just because I will
let you use 4GB of memory if you will really benefit from it, doesn't mean
you should use 4GB gratuitously.
Suya, what happens if you lower work_mem setting? Does it revert to the
plan you want?
Cheers,
Jeff
From | Date | Subject | |
---|---|---|---|
Next Message | Huang, Suya | 2014-09-12 02:26:04 | weird execution plan |
Previous Message | Matheus de Oliveira | 2014-09-11 14:09:52 | Re: how to change the provoke table in hash join |