Re: HashJoin order, hash the large or small table? Postgres likes to hash the big one, why?

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Scott Carey <scott(at)richrelevance(dot)com>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, "pgsql-performance(at)postgresql(dot)org Performance" <pgsql-performance(at)postgresql(dot)org>
Subject:	Re: HashJoin order, hash the large or small table? Postgres likes to hash the big one, why?
Date:	2011-04-13 17:35:31
Message-ID:	8747.1302716131@sss.pgh.pa.us
Lists:	pgsql-performance

Scott Carey <scott(at)richrelevance(dot)com> writes:
>> On Oct 27, 2010, at 12:56 PM, Tom Lane wrote:
>> Because a poorly distributed inner relation leads to long hash chains.
>> In the very worst case, all the keys are on the same hash chain and it
>> degenerates to a nested-loop join.

> A pathological skew case (all relations with the same key), should be
> _cheaper_ to probe.

I think you're missing the point, which is that all the hash work is
just pure overhead in such a case (and it is most definitely not
zero-cost overhead). You might as well just do a nestloop join.
Hashing is only beneficial to the extent that it allows a smaller subset
of the inner relation to be compared to each outer-relation tuple.
So I think biasing against skew-distributed inner relations is entirely
appropriate.
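
To illustrate the point about hash chains, here is a toy sketch in Python. It is not PostgreSQL's executor or its cost model; the function names, the bucket count, and the row counts are all assumptions made up for the illustration. It counts key comparisons for a chained hash join and for a plain nested loop, first with distinct inner keys and then with every inner and outer row carrying the same key.

```python
def hash_join_comparisons(inner, outer, nbuckets=1024):
    """Toy chained hash join: build on `inner`, probe with `outer`.

    Returns (key comparisons performed, matching pairs found).
    """
    # Build phase: every inner key is hashed and appended to its bucket's chain.
    buckets = [[] for _ in range(nbuckets)]
    for key in inner:
        buckets[hash(key) % nbuckets].append(key)

    comparisons = 0
    matches = 0
    # Probe phase: each outer key is compared only against its own chain.
    for key in outer:
        for candidate in buckets[hash(key) % nbuckets]:
            comparisons += 1
            if candidate == key:
                matches += 1
    return comparisons, matches


def nested_loop_comparisons(inner, outer):
    """Plain nested loop: every outer key is compared with every inner key."""
    comparisons = 0
    matches = 0
    for outer_key in outer:
        for inner_key in inner:
            comparisons += 1
            if inner_key == outer_key:
                matches += 1
    return comparisons, matches


n = 200
distinct = list(range(n))   # well-distributed keys: one entry per chain
skewed = [7] * n            # pathological skew: all rows share one key

uniform_hash = hash_join_comparisons(distinct, distinct)
uniform_loop = nested_loop_comparisons(distinct, distinct)
skewed_hash = hash_join_comparisons(skewed, skewed)
skewed_loop = nested_loop_comparisons(skewed, skewed)

print("distinct keys: hash", uniform_hash, "nestloop", uniform_loop)
print("single key:    hash", skewed_hash, "nestloop", skewed_loop)
```

With distinct keys the hash join does far fewer comparisons than the nested loop. With a single shared key both do the same number of comparisons, and the hash join has additionally paid for hashing every row on both sides, which is the overhead described above.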

regards, tom lane

In response to

Re: HashJoin order, hash the large or small table? Postgres likes to hash the big one, why? at 2011-04-13 17:22:40 from Scott Carey

Responses

Re: HashJoin order, hash the large or small table? Postgres likes to hash the big one, why? at 2011-04-13 18:27:21 from Scott Carey
