Re: HashJoin w/option to unique-ify inner rel

From: "Lawrence, Ramon" <ramon(dot)lawrence(at)ubc(dot)ca>
To: "Robert Haas" <robertmhaas(at)gmail(dot)com>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Greg Stark" <stark(at)enterprisedb(dot)com>, <pgsql-hackers(at)postgresql(dot)org>, "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
Subject: Re: HashJoin w/option to unique-ify inner rel
Date: 2009-04-16 22:21:54
Message-ID: 6EEA43D22289484890D119821101B1DF05190E43@exchange20.mercury.ad.ubc.ca
Lists: pgsql-hackers

> Upon further review, it appears that a big part of this problem is
> that cost_hashjoin() doesn't understand that it needs to cost
> semi-joins differently from inner or left joins. The bogus logic
> looks to be right here:
>
>     startup_cost += hash_qual_cost.startup;
>     run_cost += hash_qual_cost.per_tuple *
>         outer_path_rows * clamp_row_est(inner_path_rows *
>                                         innerbucketsize) * 0.5;
>
> Of course, when the join type is JOIN_SEMI, we're going to stop
> looking after we find the first match, so this estimate is really far
> off.

The 8.3 version of cost_hashjoin() had a line like this:

joininfactor = join_in_selectivity(&path->jpath, root);

and a cost function like this:

run_cost += hash_qual_cost.per_tuple *
    outer_path_rows * clamp_row_est(inner_path_rows *
                                    innerbucketsize) * joininfactor * 0.5;

This compensated for IN joins being able to stop scanning a bucket once
a match is found. You might consider doing something similar for
semi-joins. Having experimented with a lot of this code recently, I
think there is some potential for improvement in the bucket-size
estimates, etc., but it is a non-trivial problem.
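
For concreteness, here is a small standalone C sketch of how such a
factor scales the hash-qual run cost term quoted above. This is not
planner source; the input values and the 0.1 factor are made-up
examples, and clamp_row_est() is approximated inline.

#include <stdio.h>

/*
 * Rough stand-in for the hash-qual run cost term discussed above:
 *     per_tuple * outer_rows * clamp(inner_rows * bucketsize) * factor * 0.5
 * where factor plays the role of the 8.3 joininfactor: 1.0 models an
 * inner join scanning whole buckets, smaller values model a semi-join
 * that stops at the first match.
 */
static double
hash_qual_run_cost(double per_tuple, double outer_rows,
                   double inner_rows, double bucketsize, double factor)
{
    double  bucket_rows = inner_rows * bucketsize;

    if (bucket_rows < 1.0)
        bucket_rows = 1.0;      /* crude substitute for clamp_row_est() */

    return per_tuple * outer_rows * bucket_rows * factor * 0.5;
}

int
main(void)
{
    /* Example numbers only, loosely TPC-H sized. */
    double  per_tuple = 0.0025;
    double  outer_rows = 150000.0;      /* e.g. customer */
    double  inner_rows = 1500000.0;     /* e.g. orders */
    double  bucketsize = 0.001;

    printf("factor 1.0: %.1f\n",
           hash_qual_run_cost(per_tuple, outer_rows, inner_rows, bucketsize, 1.0));
    printf("factor 0.1: %.1f\n",
           hash_qual_run_cost(per_tuple, outer_rows, inner_rows, bucketsize, 0.1));
    return 0;
}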

I tested a similar query on the TPCH 100M1Z data set with version 8.3:

select * from customer where c_custkey in (select o_custkey from orders)

and found that the hash aggregate plan was marginally faster. If you
turn off aggregation, the planner selects an IN hash join that is about
5% slower, and its cost estimate is not too far off. So it would
definitely be possible to modify the cost function appropriately.

> >>> It's tempting to have Hash cheat and just peek at the node beneath
> >>> it to see if it's a HashAggregate, in which case it could call a
> >>> special method to request the whole hash. But it would have to
> >>> know that it's just a plain uniquify and not implementing a GROUP BY.

If HashAggregate is faster, then the question is whether you can do
better by avoiding building the hash structure twice. I haven't
considered all the possibilities, but the situation you used as an
example, an IN query, seems workable. Instead of translating it into a
hash aggregate/hash/hash join plan, it may be possible to create a
special hash join node that unique-ifies its inner relation.

The benefits are that the planner knows about it (instead of the
execution plan being changed underneath it), the hash join costs can be
estimated more accurately, and the hash table only has to be built
once. A challenge that must be dealt with is handling the multi-batch
case; it appears that hash aggregate does not currently handle this,
but I may be mistaken.
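
As a rough illustration of the single-construction idea, here is a toy
standalone C sketch (not executor code; the fixed-size chained table
and integer keys are stand-ins for real hash tuples and the executor's
hash machinery) in which duplicate inner keys are simply dropped while
the hash table is built:

#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>

#define NBUCKETS 1024

typedef struct Entry
{
    int            key;
    struct Entry  *next;
} Entry;

static Entry *buckets[NBUCKETS];

/*
 * Insert key unless an equal key is already in its bucket.  Returns
 * true if the key was inserted, false if it was a duplicate -- the
 * same per-tuple decision a unique-ifying hash join build would make.
 */
static bool
insert_unique(int key)
{
    unsigned int bucket = (unsigned int) key % NBUCKETS;
    Entry       *e;

    for (e = buckets[bucket]; e != NULL; e = e->next)
    {
        if (e->key == key)
            return false;       /* duplicate: drop it during the build */
    }

    e = malloc(sizeof(Entry));
    e->key = key;
    e->next = buckets[bucket];
    buckets[bucket] = e;
    return true;
}

int
main(void)
{
    /* Inner-side join keys with duplicates, e.g. o_custkey values. */
    int     inner[] = {7, 3, 7, 9, 3, 3, 12};
    size_t  n = sizeof(inner) / sizeof(inner[0]);
    size_t  i;
    int     kept = 0;

    for (i = 0; i < n; i++)
    {
        if (insert_unique(inner[i]))
            kept++;
    }

    printf("kept %d of %zu inner keys\n", kept, n);
    return 0;
}

In the multi-batch case the same duplicate check would presumably have
to be repeated per batch as tuples are reloaded from the batch files,
which is where the challenge mentioned above comes in.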

--
Ramon Lawrence
