Re: tweaking NTUP_PER_BUCKET

From: Tomas Vondra <tv(at)fuzzy(dot)cz>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: tweaking NTUP_PER_BUCKET
Date: 2014-07-19 18:15:28
Message-ID: 53CAB5C0.6000400@fuzzy.cz
Lists: pgsql-hackers

On 14.7.2014 06:29, Stephen Frost wrote:
> Tomas,
>
> * Tomas Vondra (tv(at)fuzzy(dot)cz) wrote:
>> On 6.7.2014 17:57, Stephen Frost wrote:
>>> * Tomas Vondra (tv(at)fuzzy(dot)cz) wrote:
>>>> I can't find the thread / test cases in the archives. I've found this
>>>> thread in hackers:
>>>>
>>>> http://www.postgresql.org/message-id/CAOeZVif-R-iLF966wEipk5By-KhzVLOqpWqurpaK3p5fYw-Rdw@mail.gmail.com
>>>>
>>>> Can you point me to the right one, please?
>>>
>>> This:
>>>
>>> http://www.postgresql.org/message-id/20130328235627.GV4361@tamriel.snowman.net
>>
>> Sadly, the link to the SQL does not work :-(
>>
>> http://snowman.net/~sfrost/test_case.sql => 404
>>
>>> and I think there may have been other test cases on that thread
>>> somewhere... I certainly recall having at least a couple of them that I
>>> was playing with.
>>>
>>> I can probably find them around here somewhere too, if necessary.
>>
>> If you could look for them, that'd be great.
>
> cc'ing me directly helps me notice these...
>
> I've added back test_case.sql and test_case2.sql. Please check them out
> and let me know- they're real-world cases we've run into.

Hi Stephen,

I've reviewed the two test cases mentioned here, and sadly there's
nothing this patch can 'fix'. The problem lies in the planning stage,
which decides to hash the large table - that can't be fixed in the
executor.

I did three runs - with master, with the 'dense allocation' patch
applied, and with 'dense allocation + ntup patch' applied.

As the estimates are pretty exact here, the ntup patch effectively only
sets NTUP_PER_BUCKET=1 (and does not do any dynamic resizing).

For the first testcase, it behaves like this:

####### master (no patches) ########

                                QUERY PLAN
-------------------------------------------------------------------------
 Hash Join  (cost=115030.10..116671.32 rows=41176 width=24) (actual time=980.363..1015.623 rows=13731 loops=1)
   Hash Cond: (small_table.id_short = big_table.id_short)
   ->  Seq Scan on small_table  (cost=0.00..714.76 rows=41176 width=24) (actual time=0.015..2.563 rows=41176 loops=1)
   ->  Hash  (cost=61626.71..61626.71 rows=4272271 width=4) (actual time=979.399..979.399 rows=4272271 loops=1)
         Buckets: 524288  Batches: 1  Memory Usage: 150198kB
         ->  Seq Scan on big_table  (cost=0.00..61626.71 rows=4272271 width=4) (actual time=0.026..281.151 rows=4272271 loops=1)
 Planning time: 0.196 ms
 Execution time: 1034.931 ms
(8 rows)

# without explain analyze

Time: 841,198 ms
Time: 825,000 ms

####### master + dense allocation ########

                                QUERY PLAN
-------------------------------------------------------------------------
 Hash Join  (cost=115030.10..116671.32 rows=41176 width=24) (actual time=778.116..815.254 rows=13731 loops=1)
   Hash Cond: (small_table.id_short = big_table.id_short)
   ->  Seq Scan on small_table  (cost=0.00..714.76 rows=41176 width=24) (actual time=0.006..2.423 rows=41176 loops=1)
   ->  Hash  (cost=61626.71..61626.71 rows=4272271 width=4) (actual time=775.803..775.803 rows=4272271 loops=1)
         Buckets: 524288  Batches: 1  Memory Usage: 150198kB
         ->  Seq Scan on big_table  (cost=0.00..61626.71 rows=4272271 width=4) (actual time=0.005..227.274 rows=4272271 loops=1)
 Planning time: 0.061 ms
 Execution time: 826.346 ms
(8 rows)

# without explain analyze

Time: 758,393 ms
Time: 734,329 ms

####### master + dense allocation + ntup=1 patch ########

                                QUERY PLAN
-------------------------------------------------------------------------
 Hash Join  (cost=115030.10..116465.44 rows=41176 width=24) (actual time=1020.288..1033.539 rows=13731 loops=1)
   Hash Cond: (small_table.id_short = big_table.id_short)
   ->  Seq Scan on small_table  (cost=0.00..714.76 rows=41176 width=24) (actual time=0.015..2.214 rows=41176 loops=1)
   ->  Hash  (cost=61626.71..61626.71 rows=4272271 width=4) (actual time=953.890..953.890 rows=4272271 loops=1)
         Buckets: 8388608  Batches: 1  Memory Usage: 215734kB
         ->  Seq Scan on big_table  (cost=0.00..61626.71 rows=4272271 width=4) (actual time=0.014..241.223 rows=4272271 loops=1)
 Planning time: 0.193 ms
 Execution time: 1055.351 ms
(8 rows)

# without explain analyze

Time: 836,050 ms
Time: 822,818 ms

#########################################################

For the second test case, the times are like this:

# master
Time: 2326,579 ms
Time: 2320,232 ms

# master + dense allocation
Time: 2207,254 ms
Time: 2251,691 ms

# master + dense allocation + ntup=1
Time: 2374,278 ms
Time: 2377,409 ms

So while the dense allocation alone shaves off ~5%, the time with both
patches applied is slightly higher (again, by ~5%). After spending some
time profiling the code, I believe this is due to a worse cache hit
ratio when accessing the buckets. With the lower NTUP_PER_BUCKET, the
second test case now uses ~130MB for buckets, while originally it used
only ~16MB. That means a slightly lower cache hit ratio when accessing
hashtable->buckets (e.g. in ExecHashTableInsert).

The resulting hash table should be much faster for lookups; however,
the main issue illustrated by the test cases is that we're hashing the
large table and scanning the small one, which means a relatively small
number of lookups. That pretty much prevents us from benefiting from
the better lookup performance.

So, I think we need to fix the planning problem here, if possible. I
might try doing that myself, but if someone with better knowledge of
the planner could take a look, that'd be great.

regards
Tomas
