Re: hash_search_with_hash_value is high in "perf top" on a replica

From: Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: hash_search_with_hash_value is high in "perf top" on a replica
Date: 2025-01-31 14:38:10
Message-ID: d353f0de-ac5a-4bfa-b1e9-69953bb00522@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

31.01.2025 17:25, Álvaro Herrera пишет:
> On 2025-Jan-31, Dmitry Koterov wrote:
>
>> PG "startup recovering" eats up a lot of CPU (like 65 %user and 30 %sys),
>> which is a little surprising (what is it doing with all those CPU cycles?
>> it looked like WAL replay should be more IO bound than CPU bound?).
>>
>> Running "perf top -p <pid>", it shows this:
>>
>> Samples: 1M of event 'cycles:P', 4000 Hz, Event count (approx.):
>> 18178814660 lost: 0/0 drop: 0/0
>> Overhead Shared Object Symbol
>> 16.63% postgres [.] hash_search_with_hash_value
>
> Yeah, I noticed that this function was showing high in some profiles a
> couple of days ago as well. Looking now at hashfn.c (hash_bytes_uint32
> there is the function involved in the buffer mapping hash table), the
> comments state that we're updated to Bob Jenkins code from 2006, but
> there's a version in his website that (if I read correctly) is twice as
> fast as what we're using now:
> http://burtleburtle.net/bob/hash/spooky.html
>
> Apparently this code in our repo is mostly unchanged since commit
> 1f559b7d3aa4, in 2007.
>
> He mentions that on Intel chips, Google's CityHash is faster; but they
> in turn claim that the difference is small on Intel chips and that
> Jenkins' hash is better on AMD chips.
> https://github.com/google/cityhash
>
> Anyway if you wanted to try your luck at improving things, here's your
> chance.
>

`hash_search_with_hash_value` uses already calculated hash value, so its
performance doesn't depend on performance of hash function
(hash_bytes_uint32 or any other).

I believe, it is memory bound, since dynahash does at least three
indirection jumps before it event reaches first node in linked list

segp = HTAB->dir[segment_num];
*bucketptr = segp[segment_ndx];

And then iterates through linked list with each iteration is being (for
large hash table) both cache and TLB (without huge pages) miss.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Jelte Fennema-Nio 2025-01-31 14:39:53 Re: New process of getting changes into the commitfest app
Previous Message Jean-Christophe Arnu 2025-01-31 14:33:08 Re: FileFallocate misbehaving on XFS