Quick Links

Re: [HACKERS] How to change order sort of table in HashJoin

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Man Trieu <man(dot)trieu(at)gmail(dot)com>
Cc:	pgsql-hackers(at)postgresql(dot)org, pgsql-general(at)postgresql(dot)org
Subject:	Re: [HACKERS] How to change order sort of table in HashJoin
Date:	2016-11-19 16:18:33
Message-ID:	21633.1479572313@sss.pgh.pa.us
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-general pgsql-hackers

Man Trieu <man(dot)trieu(at)gmail(dot)com> writes:
> As in the example below, i think the plan which hash table is created on
> testtbl2 (the fewer tuples) should be choosen.

The planner usually prefers to hash on the table that has a flatter
MCV histogram, since a hash table with many key collisions will be
inefficient. You might find it illuminating to read the comments around
estimate_hash_bucketsize().

In general, given a hashtable that fits in memory and light bucket
loading, a hash join is more or less O(M) + O(N); it doesn't matter
so much whether the larger table is on the inside. It does matter if
the table gets big enough to force batching of the join, but that's
not happening in your example (at least not the first one; it's unclear
to me why it did happen in the second one). The key thing that will
drive the choice, then, is avoiding a skewed bucket distribution that
causes lots of comparisons for common values.

regards, tom lane

In response to

How to change order sort of table in HashJoin at 2016-11-19 05:46:47 from Man Trieu

Responses

Re: How to change order sort of table in HashJoin at 2016-11-20 10:21:34 from Man

Browse pgsql-general by date

	From	Date	Subject
Next Message	Tom Lane	2016-11-19 16:25:30	Re: Database migration to RDS issues permissions
Previous Message	Fran ...	2016-11-19 15:21:31	Re: Database migration to RDS issues permissions

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Joshua D. Drake	2016-11-19 16:31:33	Re: Mail thread references in commits
Previous Message	Karl O. Pinc	2016-11-19 15:22:20	Re: Patch to implement pg_current_logfile() function