Re: Number of buckets in a hash join

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc:	PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Number of buckets in a hash join
Date:	2013-01-28 14:30:33
Message-ID:	CA+U5nM+WJaQh3HBEVi-4n+rxu-NUqn0RGZ-gjy9S-zYMvhaWHw@mail.gmail.com
Lists:	pgsql-hackers

On 28 January 2013 10:47, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com> wrote:

> There's also some overhead from empty
> buckets when scanning the hash table

Seems like we should measure that overhead. That way we can plot the
cost against the number of tuples per bucket, which sounds like it has
a minimum at 1.0, but that doesn't mean it's symmetrical about that
point. We can then see where the optimal setting should be.
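
As a rough illustration of the kind of measurement meant here, the sketch below simulates a chained hash table under uniform hashing and reports, for several tuples-per-bucket targets, the fraction of empty buckets and the average chain length walked by a lookup. The names `simulate` and `toy_cost`, and the two cost weights, are assumptions made for this example only; they are not PostgreSQL code or measured values, and real numbers would have to come from benchmarking the executor.

```python
import random


def simulate(n_tuples, tuples_per_bucket, seed=0):
    """Hash n_tuples uniformly into a table sized for the target load.

    Returns (n_buckets, empty_fraction, avg_probe_len).
    """
    rng = random.Random(seed)
    n_buckets = max(1, round(n_tuples / tuples_per_bucket))
    counts = [0] * n_buckets
    for _ in range(n_tuples):
        counts[rng.randrange(n_buckets)] += 1

    empty_fraction = counts.count(0) / n_buckets
    # Finding a stored tuple in a chain of length c takes (c + 1) / 2
    # comparisons on average; sum over all tuples, then average.
    avg_probe_len = sum(c * (c + 1) / 2 for c in counts) / n_tuples
    return n_buckets, empty_fraction, avg_probe_len


def toy_cost(n_tuples, tuples_per_bucket,
             bucket_visit_cost=1.0, compare_cost=1.0):
    """Per-tuple cost under ASSUMED weights (placeholders, not measured)."""
    n_buckets, _, avg_probe_len = simulate(n_tuples, tuples_per_bucket)
    # A full scan touches every bucket header, empty or not.
    scan_cost = bucket_visit_cost * n_buckets / n_tuples
    probe_cost = compare_cost * avg_probe_len
    return scan_cost + probe_cost


if __name__ == "__main__":
    for load in (0.25, 0.5, 1.0, 2.0, 4.0):
        _, empty, probe = simulate(100_000, load)
        cost = toy_cost(100_000, load)
        print(f"load={load:4.2f} empty={empty:.3f} "
              f"probe={probe:.3f} cost={cost:.3f}")
```

In this toy model the scan term grows as 1/load while the probe term grows roughly as 1 + load/2, so the curve is not symmetric about its minimum. Where the minimum falls depends entirely on the two weights, which is the part that needs measuring.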

Having said that, the hash bucket estimate is based on ndistinct, which
we know is frequently underestimated, so it would be useful to err on
the low side.
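
To make the arithmetic concrete, here is a small sketch of one reading of this point: if the bucket count is derived from an estimated ndistinct and the true value is larger, the load actually seen per bucket is higher than the target by the same factor. The function `effective_load` and the numbers used are hypothetical and only illustrate the ratio; they do not reproduce how the planner sizes the table.

```python
def effective_load(target_per_bucket, estimated_distinct, actual_distinct):
    """Actual entries per bucket when buckets are sized from an estimate."""
    # Buckets are chosen so that the ESTIMATE hits the target load.
    n_buckets = estimated_distinct / target_per_bucket
    # The real data then spreads over that fixed number of buckets.
    return actual_distinct / n_buckets


if __name__ == "__main__":
    # Hypothetical case: ndistinct underestimated by a factor of four.
    for target in (1.0, 0.5):
        load = effective_load(target, 1_000, 4_000)
        print(f"target={target} actual load={load}")
```

Under this reading, aiming for a lower target than the measured optimum leaves headroom for the underestimate.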

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Number of buckets in a hash join at 2013-01-28 10:47:58 from Heikki Linnakangas
