From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com> |
Cc: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, "Drouvot, Bertrand" <bertranddrouvot(dot)pg(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org, Jeff Davis <pgsql(at)j-davis(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com> |
Subject: | Re: walsender performance regression due to logical decoding on standby changes |
Date: | 2023-05-17 20:55:56 |
Message-ID: | 20230517205556.dyw5c6cxcg6ij44f@awork3.anarazel.de |
Lists: | pgsql-hackers |
Hi,
On 2023-05-17 12:53:15 -0700, Andres Freund wrote:
> I'll try to come up with a benchmark without the issues I pointed out in
> https://postgr.es/m/20230517194331.ficfy5brpfq5lrmz%40awork3.anarazel.de
Here we go:
setup:
create primary
SELECT pg_create_physical_replication_slot('reserve', true);
create standby using pg_basebackup
create WAL:
psql -c 'CREATE TABLE testtable_logged(other_data int default 1);' && \
c=16; PGOPTIONS='-c synchronous_commit=off' /path-to-pgbench --random-seed=0 -n -c$c -j$c -t1000 -P1 -f <( echo "INSERT INTO testtable_logged SELECT generate_series(1, 1000)" ) && \
psql -c "SELECT pg_create_restore_point('end');"
benchmark:
rm -rf /tmp/test && \
cp -ar /srv/dev/pgdev-dev-standby /tmp/test && \
cp -ar /srv/dev/pgdev-dev/pg_wal/* /tmp/test/pg_wal/ && \
sync && \
/usr/bin/time -f '%es' /path-to-postgres -D /tmp/test -c recovery_target_action=shutdown -c recovery_target_name=end -c shared_buffers=1GB -c fsync=off
That way I can measure how long it takes to replay exactly the same WAL, and
also take profiles of exactly the same work, without influencing time results.
I copy the primary's WAL files into the standby's pg_wal to ensure that
walreceiver (standby) / walsender (primary) performance doesn't make the
result variability higher.
                        max_wal_senders=10    max_wal_senders=100
e101dfac3a5 reverted    7.01s                 7.02s
093e5c57d50 / HEAD      8.25s                 19.91s
bharat-v3               7.14s                 7.13s
So indeed, bharat-v3 largely fixes the issue.
The regression of v3 compared to e101dfac3a5 reverted seems pretty constant at
~0.982x (baseline time / v3 time, e.g. 7.01s / 7.14s), independent of the
concrete max_wal_senders value. Which makes sense, the work is constant.
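For reference, the ratio quoted above can be reproduced from the table with a few lines of C. This is only a sketch of the arithmetic: the helper names relative_speed() and print_v3_ratios() are made up for illustration, and the ratio is assumed to be baseline time divided by patched time, so a value below 1.0 means the patched build is slower.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Relative speed of a patched build versus the baseline, computed as
 * baseline time divided by patched time (both in seconds).
 * A result below 1.0 means the patched build is slower.
 * Hypothetical helper, not part of PostgreSQL.
 */
double
relative_speed(double baseline_s, double patched_s)
{
    assert(baseline_s > 0.0 && patched_s > 0.0);
    return baseline_s / patched_s;
}

/* Print the ratios for the two walsender limits measured above. */
void
print_v3_ratios(void)
{
    /* "e101dfac3a5 reverted" versus "bharat-v3", limit of 10 walsenders */
    printf("10 walsenders:  %.3fx\n", relative_speed(7.01, 7.14));
    /* same comparison with a limit of 100 walsenders */
    printf("100 walsenders: %.3fx\n", relative_speed(7.02, 7.13));
}
```

Calling print_v3_ratios() from a main() prints one ratio per column; both come out at roughly 0.98x, which is what "pretty constant" refers to.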
To make it more extreme, I also tested a workload that is basically free to replay:
c=16; /srv/dev/build/m-opt/src/bin/pgbench/pgbench --random-seed=0 -n -c$c -j$c -t5000 -P1 -f <( echo "SELECT pg_logical_emit_message(false, 'c', 'a') FROM generate_series(1, 1000)" ) && psql -c "SELECT pg_create_restore_point('end');"
                        max_wal_senders=10    max_wal_senders=100
e101dfac3a5 reverted    1.70s                 1.70s
093e5c57d50 / HEAD      3.00s                 14.56s
bharat-v3               1.88s                 1.88s
In this extreme workload we still regress by ~0.904x.
I'm not sure how much it's worth worrying about that - this is a quite
unrealistic testcase.
FWIW, if I just make WalSndWakeup() do nothing, I still see a very small, but
reproducible, overhead: 1.72s - that's just the cost of the additional
external function call.
If I add a no-waiters fastpath using proclist_is_empty() to
ConditionVariableBroadcast(), I get 1.77s. So the majority of the remaining
slowdown indeed comes from the spinlock acquisition in
ConditionVariableBroadcast().
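To illustrate the idea of a no-waiters fastpath, here is a toy model in C. It is not the PostgreSQL code and not the patch I tested: ToyCondVar, toy_lock(), toy_unlock() and toy_broadcast() are hypothetical names, the "spinlock" is just a counter, and the wait list is reduced to an integer. The only thing it is meant to show is that checking for an empty wait list first lets the broadcast return without touching the lock. Whether an unlocked check is safe in the real code depends on memory-ordering considerations that this sketch does not model.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a condition variable: wait list plus lock. */
typedef struct ToyCondVar
{
    int nwaiters;           /* models the list of waiting processes */
    int lock_acquisitions;  /* counts how often the "spinlock" was taken */
} ToyCondVar;

/* Models taking the spinlock; it only counts, no real locking happens. */
void
toy_lock(ToyCondVar *cv)
{
    cv->lock_acquisitions++;
}

/* Models releasing the spinlock; nothing to do in this toy model. */
void
toy_unlock(ToyCondVar *cv)
{
    (void) cv;
}

/* Wake all waiters and return how many were woken. */
int
toy_broadcast(ToyCondVar *cv, bool use_fastpath)
{
    int woken;

    /* Fast path: nobody is waiting, so skip the lock entirely. */
    if (use_fastpath && cv->nwaiters == 0)
        return 0;

    /* Slow path: take the lock, empty the wait list, release the lock. */
    toy_lock(cv);
    woken = cv->nwaiters;
    cv->nwaiters = 0;
    toy_unlock(cv);
    return woken;
}
```

With no waiters, toy_broadcast(&cv, true) leaves lock_acquisitions untouched, while toy_broadcast(&cv, false) increments it on every call. That per-call lock acquisition is the cost the fastpath avoids in the common case during replay, where nothing is waiting.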
Greetings,
Andres Freund