Re: Re: Query planner using hash join when merge join seems orders of magnitude faster

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Branden Visser <mrvisser(at)gmail(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Re: Query planner using hash join when merge join seems orders of magnitude faster
Date: 2016-08-01 22:56:49
Message-ID: 20318.1470092209@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Branden Visser <mrvisser(at)gmail(dot)com> writes:
> I just wanted to update that I've found evidence that fixing the
> planner row estimation may not actually influence it to use the more
> performant merge join instead of hash join. I have found instances
> where the row estimation is *overestimated* by a magnitude of 4x
> (estimates 2.4m rows) and still chooses hash join over merge join,
> where merge join is much faster (45s v.s. 12s).

I wonder why the merge join is faster exactly. It doesn't usually have a
huge benefit unless the inputs are presorted already. The one case I can
think of where it can win quite a lot is if the range of merge keys in one
input is such that we can skip reading most of the other input. (Extreme
example: one input has keys 1..10, but the other input has keys 1..10000.
We only need to read the first 1% of the second input, assuming there's an
index on its key column so that we don't have to read the whole thing
anyway to sort it.)

The planner is aware of that effect, but I wonder if it's misestimating it
for some reason. Anyway it would be worth looking closely at your EXPLAIN
ANALYZE results to determine whether early-stop is happening or not. It'd
manifest as one join input node showing an actual number of rows returned
that's less than you'd expect.

regards, tom lane

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Venkata Balaji N 2016-08-02 05:30:47 Re: How to best archetect Multi-Tenant SaaS application using Postgres
Previous Message Jeff Janes 2016-08-01 21:21:19 Re: Force pg_hba.conf user with LDAP