From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Claudio Freire <klaussfreire(at)gmail(dot)com>
Cc: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Hash Joins vs. Bloom Filters / take 2
Date: 2018-02-21 00:19:50
Message-ID: CAH2-WzkqXk9qk0a5aJFyQpV5TFrPe3XCf2pZ+Swu0kWkQ1JAfQ@mail.gmail.com
Lists: pgsql-hackers
On Tue, Feb 20, 2018 at 3:48 PM, Claudio Freire <klaussfreire(at)gmail(dot)com> wrote:
>> Do we need to eliminate 99% of all hash join probes (that find nothing
>> to join on) to make this Bloom filter optimization worthwhile?
>> Personally, I doubt it.
>
> Even for 90% it's about 4.6 bits per element.
4.6 bits is vastly less than typical hash join hash table entry sizes,
which often comprise several attributes. Even the skinniest possible
hash table elements have HJTUPLE_OVERHEAD() overhead, as well as
MinimalTuple overhead, to say nothing of all the serialization and
synchronization overhead that you could also have.
Whatever precise per-element overhead you come up with for your Bloom
filter, the *ratio* of those two things is what makes using a Bloom
filter potentially very effective. It could end up being something
like 1:100, which is a huge win for a highly selective hash join that
still has a fairly large hash table (that cannot fit in L3
cache/work_mem).
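
For concreteness, here's a back-of-the-envelope sketch of that ratio
(my own throwaway C, with an assumed 60 byte per-entry footprint rather
than real HJTUPLE_OVERHEAD()/MinimalTuple numbers):

#include <math.h>
#include <stdio.h>

int
main(void)
{
    double  fp_rate = 0.1;          /* let ~10% of non-matches through */
    double  entry_bytes = 60.0;     /* assumed hash table entry footprint */
    double  bits_per_elem;

    /* optimal Bloom filter sizing: m/n = -ln(p) / (ln 2)^2 */
    bits_per_elem = -log(fp_rate) / (log(2.0) * log(2.0));

    printf("bits per element: %.1f\n", bits_per_elem);
    printf("filter : hash table size = 1:%.0f\n",
           entry_bytes * 8.0 / bits_per_elem);
    return 0;
}

That works out to about 4.8 bits per element, which is in the same
ballpark as the ~4.6 you cite, and a ratio of roughly 1:100 for the
assumed 60 byte entry.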
> What I'm saying is that you can't assume you can fit the whole thing
> in work_mem. If it could be said that any set worth a hash join will
> fit, you can build a work_mem-sized bloom filter and just accept that
> if you exceed its capacity its filtering efficiency will drop
> benignly. But IMO that can't be taken for granted, so at some point
> you should just give up, the overhead of querying a low-quality BF
> won't be worth it.
It's probably true that we will need to bail out of using a Bloom
filter, and it is certainly true that the optimization won't always
work out. Still, once your Bloom filter proves useless, chances are
good that the plan isn't a very good one, with or without that extra
optimization. Even if the plan is truly the fastest possible plan, and
even if you avoid wasting extra effort on the Bloom filter, it will
still be very slow.
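
To put numbers on the "benign degradation" idea, here's another rough
sketch (assumed filter size and hash function count, not anything the
patch actually does) of how the false positive rate decays once the
build side overshoots what the filter was sized for:

#include <math.h>
#include <stdio.h>

/* expected false positive rate for m bits, k hash functions, n elements */
static double
bloom_fp_rate(double m, int k, double n)
{
    return pow(1.0 - exp(-(double) k * n / m), k);
}

int
main(void)
{
    double  m = 8.0 * 1024 * 1024 * 8;  /* 8MB filter, in bits */
    int     k = 3;
    double  overshoot;

    for (overshoot = 1.0; overshoot <= 8.0; overshoot *= 2.0)
    {
        double  n = (m / 4.8) * overshoot;  /* sized for m/4.8 elements */
        double  p = bloom_fp_rate(m, k, n);

        printf("%.0fx expected build size: fp rate %.2f%s\n",
               overshoot, p,
               p > 0.5 ? "  -> probably give up on the filter" : "");
    }
    return 0;
}

At 1x it behaves as sized (~10%), it degrades fairly gracefully at 2x,
and by 4x-8x the filter rejects so little that it isn't worth
consulting any more -- which is roughly the "give up at some point"
behavior you describe.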
Join selectivity is what really matters with this optimization. Of
course the optimization might not work out, but you can say that about
almost anything that uses hashing -- that's why safety-critical
real-time systems (e.g. avionics systems) often avoid using hash
tables entirely. Bloom filters are fairly resilient to somewhat inaccurate
estimates, but nothing will save you from terrible estimates. I think
that it's worth risking a small, predictable regression when the join
turns out not to be selective, in order to get a potentially very
large improvement in performance when the planner estimates join
selectivity reasonably accurately (and it's a selective hash join that
won't simply have a small hash table to begin with).
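
FWIW, the way I think about the risk/reward is a simple break-even
condition. These are made-up cost units, purely for illustration, not
planner costs:

#include <stdio.h>

int
main(void)
{
    /* made-up cost units, purely illustrative */
    double  probe_cost = 20.0;      /* cost of a wasted hash table probe */
    double  filter_cost = 1.0;      /* cost of one Bloom filter lookup */
    double  fp_rate = 0.1;
    double  nonmatch_fracs[] = {0.9, 0.02};     /* selective vs. not */
    int     i;

    for (i = 0; i < 2; i++)
    {
        double  saved = nonmatch_fracs[i] * (1.0 - fp_rate) * probe_cost;

        printf("%.0f%% of probes find no match: save %.1f, pay %.1f per tuple\n",
               nonmatch_fracs[i] * 100.0, saved, filter_cost);
    }
    return 0;
}

In the selective case the filter pays for itself many times over; in
the non-selective case you eat a small, bounded per-tuple cost, which
is exactly the kind of regression I'm saying is worth risking.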
--
Peter Geoghegan