Re: 9.2.4 vs 9.3.0 query planning (sort merge join vs hash join)

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Tim Kane <tim(dot)kane(at)gmail(dot)com>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: 9.2.4 vs 9.3.0 query planning (sort merge join vs hash join)
Date: 2014-05-12 16:02:12
Message-ID: CAMkU=1yEEP6Z3-wTXQLsMF9z_7Os57R9VSpqwTDWMb3g5Na5dw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Mon, May 12, 2014 at 3:45 AM, Tim Kane <tim(dot)kane(at)gmail(dot)com> wrote:

> Hi all,
>
> Is there a propensity for 9.2.4 to prefer a sort-merge-join, in place of a
> hash join?
>
> I’m fairly sure the answer is yes, but I also want to be sure I’m
> interpreting the explain output correctly.
>
> I’m comparing behaviour between two systems, which for all intents and
> purposes are identical save for the version of postgres.
> There appears to be nothing wrong with the row estimates given in the
> explain plan on either machine, however actual performance is significantly
> impaired on the 9.2.4 setup due to the preference for the use of a
> sort-merge join, compared to a hash-join on 9.3.0
>
>
>
> Snippet from 9.2.4
> -> Merge Left Join (cost=19598754.29..19602284.00
> rows=469996 width=434) (actual time=6369152.750..6386029.191 rows=6866896
> loops=1)
> Buffers: shared hit=489837 read=1724585
> * -> Sort (cost=19598650.62..19599825.61
> rows=469996 width=120) (actual time=6369151.307..6373591.881 rows=6866896
> loops=1) (A)*
> * Sort Method: quicksort Memory:
> 1162266kB*
> Buffers: shared hit=489765 read=1724585
> -> Hash Left Join
> (cost=429808.90..19554371.62 rows=469996 width=120) (actual
> time=37306.534..6353455.046 rows=6866896 loops=1) * (B)*
> Rows Removed by Filter: 20862464
> Buffers: shared hit=489765
> read=1724585
>
> Snippet from 9.3.0
>
> -> Hash Left Join (cost=617050.43..20948535.43
> rows=566893 width=434) (actual time=51816.864..934723.548 rows=6866896
> loops=1)
> Buffers: shared hit=1732 read=2010920 written=1
> -> Hash Left Join
> (cost=616993.23..20870529.73 rows=566893 width=120) (actual
> time=51796.882..923196.579 rows=6866896 loops=1)
> Rows Removed by Filter: 20862464
> Buffers: shared hit=1732 read=2010877
> written=1
>
>
> As you can see, the estimates are similar enough between them, but 9.2.4
> want’s to run sort-merge plan (A) – and the resulting execution time blows
> out hugely.
> Intersetingly, it actually looks like it is the hash join immediately
> preceding the quick sort that isn’t performing well (B). Though I suspect
> this is just how an explain plan reads - is this ultimately because the
> sort node is unable to retrieve tuples from the child node quickly enough?
>

It looks to me like a caching issue. The two Hash Left Joins seem to be
identical (although occurring on different levels of the plan) but one
takes much more time. But it is hard to know without seeing what if
feeding into those hash joins.

Cheers,

Jeff

>
>
> Setting *enable_mergejoin = 0 *appears to solve this, but I think an
> upgrade to 9.3.4 is going to win over.
>
> Cheers,
>
> Tim
>
>
>
>
>
>
>
>
>
>
>
>

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Matheus de Oliveira 2014-05-12 16:38:00 Re: Check memory consumption of postgresql query
Previous Message Tim Kane 2014-05-12 12:22:51 Re: 9.2.4 vs 9.3.0 query planning (sort merge join vs hash join)