| From: | Atri Sharma <atri(dot)jiit(at)gmail(dot)com> |
|---|---|
| To: | Greg Stark <stark(at)mit(dot)edu> |
| Cc: | Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Ants Aasma <ants(at)cybertec(dot)at>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org> |
| Subject: | Re: Bloom Filter lookup for hash joins |
| Date: | 2013-07-23 15:32:58 |
| Message-ID: | CAOeZVifPZtjQBuFM6L1Ks29hyn1b5bZFwmSjcv=4w1U-+W465A@mail.gmail.com |
| Lists: | pgsql-hackers |
On Tue, Jul 23, 2013 at 7:32 PM, Greg Stark <stark(at)mit(dot)edu> wrote:
> This exact idea was discussed a while back. I think it was even implemented.
>
> The problem Tom raised at the time is that the memory used by the bloom
> filter implies a smaller or less efficient hash table. It's difficult to
> determine whether you're coming out ahead or behind.
>
> I think it should be possible to figure this out though. Bloom filters have
> well understood math for the error rate given the size and number of hash
> functions (and please read up on it and implement the optimal combination
> for the target error rate, not just a WAG) and it should be possible to
> determine the resulting advantage.
>
> Measuring the cost of the memory usage is harder but some empirical results
> should give some idea. I expect the result to be wildly uneven one way or
> the other so hopefully it doesn't matter if it's not perfect. If it's close
> then it's probably not worth doing anyway.
>
> I would suggest looking up the archives of the previous discussion. You might
> find the patch still usable. IIRC there have been no major changes to the hash
> join code.
>
Right, I will definitely have a look at the thread. Thanks for the info!
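
For reference, the "well understood math" mentioned above is the standard Bloom filter sizing result: for n inserted keys and a target false-positive rate p, the filter needs about m = -n ln(p) / (ln 2)^2 bits, and the best number of hash functions is k = (m / n) ln 2. Below is a minimal sketch of that calculation. The function names `bloom_parameters` and `expected_fpr` are hypothetical, chosen only for illustration; they are not taken from the earlier patch or from the PostgreSQL source.

```python
import math


def bloom_parameters(n_items, target_fpr):
    """Return (bits, hashes) for n_items keys and a target false-positive rate.

    Uses the textbook formulas m = -n * ln(p) / (ln 2)^2 and k = (m / n) * ln 2.
    Illustrative helper only; not part of any PostgreSQL API.
    """
    if n_items <= 0:
        raise ValueError("n_items must be positive")
    if not 0.0 < target_fpr < 1.0:
        raise ValueError("target_fpr must be strictly between 0 and 1")
    bits = math.ceil(-n_items * math.log(target_fpr) / (math.log(2) ** 2))
    # k must be a whole number of hash functions, so round to the nearest one.
    hashes = max(1, round(bits / n_items * math.log(2)))
    return bits, hashes


def expected_fpr(n_items, bits, hashes):
    """Approximate false-positive rate: (1 - e^(-k*n/m))^k."""
    return (1.0 - math.exp(-hashes * n_items / bits)) ** hashes


if __name__ == "__main__":
    n = 1_000_000
    bits, hashes = bloom_parameters(n, 0.01)
    print("bits:", bits, "bytes:", bits // 8, "hashes:", hashes)
    print("expected false-positive rate:", expected_fpr(n, bits, hashes))
```

The byte count this prints is the memory that would otherwise be available to the hash table itself, which is the trade-off Tom raised. Whether the filter pays for that memory still has to be measured empirically.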
--
Regards,
Atri
l'apprenant