Re: Hash Joins vs. Bloom Filters / take 2

From: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
To: Jim Finnerty <jfinnert(at)amazon(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Hash Joins vs. Bloom Filters / take 2
Date: 2018-11-01 21:06:47
Message-ID: CAEepm=00SzM+AznZLj2n9jK3x5V1xvWL5SkkfVBo0+y8cR7ZSA@mail.gmail.com
Lists: pgsql-hackers

On Fri, Nov 2, 2018 at 9:23 AM Jim Finnerty <jfinnert(at)amazon(dot)com> wrote:
> I'm very interested in this patch, and particularly in possible
> extensions to push the Bloom filter down on the probe side of the join. I
> made a few small edits to the patch to enable it to compile on PG11, and can
> send it to you if you're interested.

Hi Jim,

Would you compute the hash for the outer tuples in the scan, and then
again in the Hash Join when probing, or would you want to (somehow)
attach the hash to emitted tuples for later reuse by the higher node?
Someone pointed out to me off-list that a popular RDBMS emanating from
the bicycle capital of the North-West pushes down Bloom filters to
scans, but only when the key is a non-nullable integer; I wonder if
that is because they hash in both places, but consider that OK only
when it's really cheap to do so. (Along the same lines, if we could
attach extra data to tuples, I wonder if it would make sense to
transmit sort support information to higher nodes, so that (for
example) GatherMerge could use it to avoid full key comparison when
dealing with subplans that already did a sort and computed integers
for fast inequality checks.)
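
To make the two options in the question above concrete, here is a rough sketch in Python rather than executor C. Everything in it is an assumption made for illustration: hash_key, BloomFilter, filtered_scan, probe and run are hypothetical names, the hash is a stand-in and not PostgreSQL's, and nothing here reflects how the patch itself is written. It only counts how many times the outer key gets hashed when the join rehashes versus when the scan attaches the hash to the emitted tuple.

```python
import hashlib


def hash_key(key):
    # Stand-in 64-bit hash; hypothetical, not PostgreSQL's hash function.
    digest = hashlib.blake2b(repr(key).encode(), digest_size=8).digest()
    return int.from_bytes(digest, "little")


class BloomFilter:
    def __init__(self, nbits=1024, nhashes=3):
        self.nbits = nbits
        self.nhashes = nhashes
        self.bits = bytearray((nbits + 7) // 8)

    def _positions(self, h):
        # Double hashing: derive nhashes bit positions from one 64-bit hash.
        h1 = h & 0xFFFFFFFF
        h2 = (h >> 32) | 1
        return [(h1 + i * h2) % self.nbits for i in range(self.nhashes)]

    def add(self, h):
        for pos in self._positions(h):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, h):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(h))


def build(inner_keys):
    # Build side: one hash per inner tuple feeds both the table and the filter.
    table, bloom = {}, BloomFilter()
    for key in inner_keys:
        h = hash_key(key)
        bloom.add(h)
        table.setdefault(h, []).append(key)
    return table, bloom


def filtered_scan(outer_keys, bloom, stats):
    # Scan with the filter pushed down; emits (key, hash) for survivors.
    for key in outer_keys:
        h = hash_key(key)
        stats["scan_hashes"] += 1
        if bloom.might_contain(h):
            yield key, h


def probe(scan, table, stats, reuse_hash):
    # Hash join probe: either reuse the attached hash or compute it again.
    matches = []
    for key, attached in scan:
        if reuse_hash:
            h = attached
        else:
            h = hash_key(key)
            stats["join_hashes"] += 1
        # Recheck the key itself, since equal hashes do not imply equal keys.
        matches.extend(k for k in table.get(h, []) if k == key)
    return matches


def run(reuse_hash, inner_keys=(1, 2, 3), outer_keys=range(100)):
    stats = {"scan_hashes": 0, "join_hashes": 0}
    table, bloom = build(inner_keys)
    scan = filtered_scan(outer_keys, bloom, stats)
    return probe(scan, table, stats, reuse_hash), stats
```

Calling run(reuse_hash=False) hashes every outer key in the scan and then hashes each surviving key a second time in the join, while run(reuse_hash=True) leaves the join's counter at zero. The second hash is only paid for tuples that pass the filter, so how much it matters depends on the filter's selectivity and on how expensive the hash function is for the key type.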

> It is currently in the list of patches for the current commitfest, but
> based on your previous post I'm not sure if you're planning to get back to
> this patch just now. If you plan to resume work on it, I'll sign up as a
> reviewer.

I'm also signed up to review.

--
Thomas Munro
http://www.enterprisedb.com
