From: | Tomas Vondra <tv(at)fuzzy(dot)cz> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: tweaking NTUP_PER_BUCKET |
Date: | 2014-07-03 18:50:34 |
Message-ID: | 53B5A5FA.4050705@fuzzy.cz |
Lists: | pgsql-hackers |
Hi Stephen,
On 3.7.2014 20:10, Stephen Frost wrote:
> Tomas,
>
> * Tomas Vondra (tv(at)fuzzy(dot)cz) wrote:
>> However it's likely there are queries where this may not be the case,
>> i.e. where rebuilding the hash table is not worth it. Let me know if you
>> can construct such query (I wasn't).
>
> Thanks for working on this! I've been thinking on this for a while
> and this seems like it may be a good approach. Have you considered a
> bloom filter over the buckets..? Also, I'd suggest you check the
I know you've experimented with it, but I haven't looked into that yet.
> archives from about this time last year for test cases that I was
> using which showed cases where hashing the larger table was a better
> choice- those same cases may also show regression here (or at least
> would be something good to test).
Good idea, I'll look at the test cases - thanks.
> Have you tried to work out what a 'worst case' regression for this
> change would look like? Also, how does the planning around this
> change? Are we more likely now to hash the smaller table (I'd guess
> 'yes' just based on the reduction in NTUP_PER_BUCKET, but did you
> make any changes due to the rehashing cost?)?
The case I was thinking about is an underestimated cardinality of the
inner table combined with a small outer table. That'd lead to a large
hash table and very few lookups (thus making the rehash inefficient),
i.e. something like this:
  Hash Join
     Seq Scan on small_table (rows=100) (actual rows=100)
     Hash
        Seq Scan on bad_estimate (rows=100) (actual rows=1000000000)
           Filter: ((a < 100) AND (b < 100))
But I wasn't able to reproduce this reasonably, because in practice
that'd lead to a nested loop or something like that (which is a planning
issue, impossible to fix in hashjoin code).
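To make the trade-off concrete, here is a rough back-of-the-envelope sketch. It is not code from the patch or from nodeHash.c. The function names, the cost unit (one tuple visit) and the assumption that every lookup walks a full average-length bucket chain are all made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical cost model, NOT PostgreSQL code. One cost unit is one
 * tuple visited. Assumes tuples are spread evenly over the buckets and
 * each lookup walks a whole chain of average length.
 */
static double
probe_cost(double ntuples, double nbuckets, double nlookups)
{
    assert(nbuckets > 0);
    return nlookups * (ntuples / nbuckets);
}

/*
 * Rebuilding the bucket array visits every tuple once to relink it, and
 * afterwards the lookups run against the larger bucket count. The rebuild
 * only pays off if that total is lower than probing the undersized table.
 */
static bool
rehash_pays_off(double ntuples, double old_nbuckets,
                double new_nbuckets, double nlookups)
{
    double keep = probe_cost(ntuples, old_nbuckets, nlookups);
    double rebuild = ntuples + probe_cost(ntuples, new_nbuckets, nlookups);

    return rebuild < keep;
}
```

With the plan above, and assuming (purely for illustration) that the table was sized at 1024 buckets from the bad estimate, rehash_pays_off(1e9, 1024, 1e9, 100) returns false under this model: 100 lookups are too few to amortize relinking 10^9 tuples. With 10^9 lookups instead of 100 it returns true.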
Tomas