Re: Hash Joins vs. Bloom Filters / take 2

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Hash Joins vs. Bloom Filters / take 2
Date: 2018-11-01 21:16:31
Message-ID: 25aa5442-ed41-5a3f-4a45-0f659693a229@2ndquadrant.com
Lists: pgsql-hackers

On 11/01/2018 10:06 PM, Thomas Munro wrote:
> On Fri, Nov 2, 2018 at 9:23 AM Jim Finnerty <jfinnert(at)amazon(dot)com> wrote:
>> I'm very interested in this patch, and particularly in possible
>> extensions to push the Bloom filter down on the probe side of the join. I
>> made a few small edits to the patch to enable it to compile on PG11, and can
>> send it to you if you're interested.
>
> Hi Jim,
>
> Would you compute the hash for the outer tuples in the scan, and then
> again in the Hash Join when probing, or would you want to (somehow)
> attach the hash to emitted tuples for later reuse by the higher node?
> Someone pointed out to me off-list that a popular RDBMS emanating from
> the bicycle capital of the North-West pushes down Bloom filters to
> scans, but only when the key is a non-nullable integer; I wonder if
> that is because they hash in both places, but consider that OK only
> when it's really cheap to do so. (Along the same lines, if we could
> attach extra data to tuples, I wonder if it would make sense to
> transmit sort support information to higher nodes, so that (for
> example) GatherMerge could use it to avoid full key comparison when
> dealing with subplans that already did a sort and computed integers
> for fast inequality checks.)
>
>> It is currently in the list of patches for the current commitfest, but
>> based on your previous post I'm not sure if you're planning to get back to
>> this patch just now. If you plan to resume work on it, I'll sign up as a
>> reviewer.
>
> I'm also signed up to review.
>

I haven't really planned to work on this anytime soon, unfortunately,
which is why I proposed to mark it as RwF (Returned with Feedback) at
the end of the last CF. I already have a couple of other patches there,
and (more importantly) I don't have a very clear idea how to implement
the filter pushdown.
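
For readers following the thread, here is a rough sketch of the pushdown
idea discussed above: the build side fills a Bloom filter, the scan on
the probe side consults it, and the scan attaches the hash it already
computed to each emitted tuple so the join does not have to hash the key
a second time. This is not code from the patch and not PostgreSQL code.
All names (hash_key, BloomFilter, filtered_scan, hash_join) are
hypothetical, the filter size and hash count are arbitrary, and the join
is reduced to single integer keys.

```python
import hashlib
from typing import Dict, Iterable, Iterator, List, Tuple


def hash_key(key: int) -> int:
    """Hypothetical 64-bit hash of a join key (stand-in for a real hash function)."""
    digest = hashlib.blake2b(str(key).encode(), digest_size=8).digest()
    return int.from_bytes(digest, "little")


class BloomFilter:
    """Tiny Bloom filter keyed by a precomputed 64-bit hash (sizes are arbitrary)."""

    def __init__(self, nbits: int = 1 << 16, nhashes: int = 3) -> None:
        self.nbits = nbits
        self.nhashes = nhashes
        self.bits = bytearray(nbits // 8)

    def _positions(self, h: int) -> Iterator[int]:
        # Derive several bit positions from one hash value (double hashing).
        h1 = h & 0xFFFFFFFF
        h2 = (h >> 32) | 1
        for i in range(self.nhashes):
            yield (h1 + i * h2) % self.nbits

    def add(self, h: int) -> None:
        for p in self._positions(h):
            self.bits[p >> 3] |= 1 << (p & 7)

    def might_contain(self, h: int) -> bool:
        # False means "definitely absent"; True may be a false positive.
        return all(self.bits[p >> 3] & (1 << (p & 7)) for p in self._positions(h))


def build_side(inner_keys: Iterable[int]) -> Tuple[Dict[int, List[int]], BloomFilter]:
    """Build the hash table and the Bloom filter in one pass over the inner side."""
    table: Dict[int, List[int]] = {}
    bloom = BloomFilter()
    for key in inner_keys:
        h = hash_key(key)
        bloom.add(h)
        table.setdefault(h, []).append(key)
    return table, bloom


def filtered_scan(outer_keys: Iterable[int], bloom: BloomFilter) -> Iterator[Tuple[int, int]]:
    """Scan with the pushed-down filter; emits (key, hash) so the join can reuse the hash."""
    for key in outer_keys:
        h = hash_key(key)
        if bloom.might_contain(h):
            yield key, h


def hash_join(inner_keys: Iterable[int], outer_keys: Iterable[int]) -> List[Tuple[int, int]]:
    table, bloom = build_side(inner_keys)
    result: List[Tuple[int, int]] = []
    for key, h in filtered_scan(outer_keys, bloom):
        # Probe with the hash carried up from the scan, then recheck key equality.
        for inner in table.get(h, []):
            if inner == key:
                result.append((key, inner))
    return result
```

The alternative raised in the quoted message, hashing once in the scan
and again in the join, would correspond to filtered_scan yielding only
the key and hash_join calling hash_key a second time; that is simpler
but pays for the hash twice, which may only be acceptable for cheap keys
such as integers.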

That being said, I still think it'd be an interesting improvement, and if
someone wants to take over I'm available as a reviewer / tester ...

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
