Re: hash_search_with_hash_value is high in "perf top" on a replica

From: Dmitry Koterov <dmitry(dot)koterov(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: hash_search_with_hash_value is high in "perf top" on a replica
Date: 2025-02-01 11:46:33
Message-ID: CA+CZih4Enpksib2xC8uD2KafOqX=W8-NvC=9qYQDnLsHnQc+PA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> It'd be interesting to see what the paths towards
hash_search_with_hash_value
are.

One of the popular paths is on the screenshot. They are all more or less
the same when recovery_prefetch=on (and when it's off, the replica behaves
worse, more replication lag).

That COPY command - it's the initial sync stage after CREATE SUBSCRIPTION
basically (with streaming=off, default). Large table (hundreds of
gigabytes).

[image: image.png]

It's also interesting, how gradually the disk writes and IOPS grow on the
replica during that large table COPY (dark blue line on the screenshot
below). And in the end, when "EXPLAIN SELECT 1 FROM mytable" showed ~100%
of the rows (i.e. almost everything is copied), it got stuck or ~5 minutes
with a spike on both reads and writes (the dark blue charts are all about
the replica). I.e. that COPY load, it is not spread uniformly: in the end,
it causes more load (I also saw that on the "startup recovering" process
CPU metrics).

[image: CleanShot 2025-02-01 at 03(dot)42(dot)16(at)2x(dot)png]

On Fri, Jan 31, 2025 at 6:43 AM Andres Freund <andres(at)anarazel(dot)de> wrote:

> Hi,
>
> On 2025-01-31 03:30:35 -0800, Dmitry Koterov wrote:
> > Debugging some replication lag on a replica when the master node
> > experiences heavy writes.
> >
> > PG "startup recovering" eats up a lot of CPU (like 65 %user and 30 %sys),
> > which is a little surprising (what is it doing with all those CPU cycles?
> > it looked like WAL replay should be more IO bound than CPU bound?).
> >
> > Running "perf top -p <pid>", it shows this:
> >
> > Samples: 1M of event 'cycles:P', 4000 Hz, Event count (approx.):
> > 18178814660 lost: 0/0 drop: 0/0
> > Overhead Shared Object Symbol
> > 16.63% postgres [.] hash_search_with_hash_value
>
> It'd be interesting to see what the paths towards
> hash_search_with_hash_value
> are.
>
> You said it's a COPY workloads, which surprises me a bit, because that
> should
> normally be a bit less sensitive to it. Perhaps you have triggers or such
> that
> prevent use of the multi-insert path?
>
>
>
> > 5.38% postgres [.] __aarch64_ldset4_sync
> > 4.42% postgres [.] __aarch64_cas4_acq_rel
>
> These two suggest that it might be worth compiling with an -march CPU that
> provides native atomics (everything above armv8.1-a, I think).
>
>
> > Maybe it's a red herring though, but it looks pretty suspicious.
>
> It's unfortunately not too surprising - our buffer mapping table is a
> pretty
> big bottleneck. Both because a hash table is just not a good fit for the
> buffer mapping table due to the lack of locality and because dynahash is
> really poor hash table implementation.
>
>
>
> Greetings,
>
> Andres Freund
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Álvaro Herrera 2025-02-01 12:20:17 Re: Proposal to CREATE FOREIGN TABLE LIKE
Previous Message Jelte Fennema-Nio 2025-02-01 11:03:24 Re: Commitfest app release on Feb 17 with many improvements