Re: A better way than tweaking NTUP_PER_BUCKET

From:	Stephen Frost <sfrost(at)snowman(dot)net>
To:	Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc:	Stephen Frost <sfrost(at)snowman(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: A better way than tweaking NTUP_PER_BUCKET
Date:	2013-06-23 13:41:15
Message-ID:	CAOuzzgq0H2-CcMUy-xZwY0D-6od5JMdH8nvfR=mO_KJJveotKA@mail.gmail.com
Lists:	pgsql-hackers

On Sunday, June 23, 2013, Simon Riggs wrote:

> On 23 June 2013 03:16, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
>
> > Will think on it more.
>
> Some other thoughts related to this...
>
> * Why are we building a special kind of hash table? Why don't we just
> use the hash table code that we use in every other place in the backend?
> If that code is so bad why do we use it everywhere else? That is
> extensible, so we could try just using that. (Has anyone actually
> tried?)

I've not looked at the hash table in the rest of the backend.

> * We're not thinking about cache locality and set correspondence
> either. If the join is expected to hardly ever match, then we should
> be using a bitmap as a bloom filter rather than assuming that a very
> large hash table is easily accessible.

That's what I was suggesting earlier, though I don't think it's technically
a Bloom filter; doesn't that require multiple hash functions? I don't think
we want to require every data type to provide multiple hash functions.
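
To make the distinction concrete, here is a minimal Python sketch of the single-hash bitmap idea: each inner-side key sets one bit, and an outer-side key whose bit is clear can skip the hash table probe entirely. This is often described as a Bloom filter with k = 1, so it needs only one hash function per data type. The names `BitmapFilter` and `filtered_probe` are hypothetical, `hashlib.blake2b` merely stands in for a per-datatype hash function, and none of this is PostgreSQL code.

```python
import hashlib


class BitmapFilter:
    """Single-hash bitmap prefilter (hypothetical sketch, not PostgreSQL code)."""

    def __init__(self, nbits):
        self.nbits = nbits
        self.bits = bytearray((nbits + 7) // 8)

    def _slot(self, key):
        # Assumption: blake2b stands in for the data type's one hash function.
        digest = hashlib.blake2b(repr(key).encode(), digest_size=8).digest()
        return int.from_bytes(digest, "little") % self.nbits

    def add(self, key):
        slot = self._slot(key)
        self.bits[slot >> 3] |= 1 << (slot & 7)

    def might_contain(self, key):
        # False means "definitely absent"; True means "possibly present".
        slot = self._slot(key)
        return bool(self.bits[slot >> 3] & (1 << (slot & 7)))


def filtered_probe(inner_keys, outer_keys, nbits=1 << 16):
    """Build a filter and a hash table on the inner side, then probe with the outer side."""
    filt = BitmapFilter(nbits)
    table = {}
    for key in inner_keys:
        filt.add(key)
        table.setdefault(key, []).append(key)

    matches = []
    skipped = 0  # outer keys rejected by the bitmap without touching the table
    for key in outer_keys:
        if not filt.might_contain(key):
            skipped += 1
            continue
        if key in table:
            matches.append(key)
    return matches, skipped
```

Calling `filtered_probe(range(100), range(50, 1050))` returns the 50 matching keys plus a count of outer keys that never reached the hash table. The bitmap can report false positives but never false negatives, so the join result is unchanged; only the number of table probes drops.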

> * The skew hash table will be hit frequently and would show good L2
> cache usage. I think I'll try adding the skew table always to see if
> that improves the speed of the hash join.
>

The skew table is just for common values though... To be honest, I have
some doubts about that structure really being a terribly good approach for
anything which is completely in memory.

Thanks,

Stephen

In response to

Re: A better way than tweaking NTUP_PER_BUCKET at 2013-06-23 09:45:09 from Simon Riggs
