Re: Conflict detection for update_deleted in logical replication

From: Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Conflict detection for update_deleted in logical replication
Date: 2025-01-13 12:21:11
Message-ID: CABdArM4WTx_UZSerdT_-aVhz=ePLcXSrqXcA0VfxzAxBEV7CGw@mail.gmail.com

Here are the performance test results and analysis with the recent patches.

Test Setup:
- Created Pub and Sub nodes with logical replication and the below configurations.
autovacuum_naptime = '30s'
shared_buffers = '30GB'
max_wal_size = 20GB
min_wal_size = 10GB
track_commit_timestamp = on (only on Sub node).
- Pub and Sub had different pgbench tables with initial data of scale=100.
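
For reference, the node configuration and data load were done roughly
as sketched below (a simplified sketch; the exact steps are in the
attached v21_setup.sh, and the database name here is a placeholder):

  # On both Pub and Sub (values as listed above):
  psql -d postgres -c "ALTER SYSTEM SET autovacuum_naptime = '30s';"
  psql -d postgres -c "ALTER SYSTEM SET shared_buffers = '30GB';"
  psql -d postgres -c "ALTER SYSTEM SET max_wal_size = '20GB';"
  psql -d postgres -c "ALTER SYSTEM SET min_wal_size = '10GB';"
  # On Sub only:
  psql -d postgres -c "ALTER SYSTEM SET track_commit_timestamp = on;"
  # Restart both nodes so the settings take effect, then initialize
  # separate pgbench data (scale=100) on each node:
  pgbench -i -s 100 postgres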

-------------------------------
Case-0: Collected data on pgHead
-------------------------------
- Ran pgbench(read-write) on both the publisher and the subscriber
with 30 clients for a duration of 15 minutes, collecting data over 3
runs.
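
The per-node workload command was of the following shape, run on the
Pub and the Sub in parallel (a sketch; the exact invocation is in the
attached scripts, and the thread count is an assumption). The same
shape of run was used for the later cases, with the client count
varied where noted:

  # Read-write pgbench, 30 clients, 15 minutes:
  pgbench -c 30 -j 30 -T 900 postgres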

Results:
Run# pub_TPS sub_TPS
1 30551.63471 29476.81709
2 30112.31203 28933.75013
3 29599.40383 28379.4977
Median 30112.31203 28933.75013

-------------------------------
Case-1: Long-run (15-minute) tests with retain_conflict_info=ON
-------------------------------
- Code: pgHead + v19 patches.
- At Sub set autovacuum=false.
- Ran pgbench(read-write) on both the publisher and the subscriber
with 30 clients for a duration of 15 minutes, collecting data over 3
runs.

Results:
Run# pub_TPS sub_TPS
1 30326.57637 4890.410972
2 30412.85115 4787.192754
3 30860.13879 4864.549117
Median 30412.85115 4864.549117
regression 1% -83%

- The 15-minute pgbench run showed a higher reduction in the Sub's
TPS; as the test run time increased, the TPS at the Sub node dropped
further.

-------------------------------
Case-2: Re-ran Case-1 with autovacuum enabled and running every 30 seconds.
-------------------------------
- Code: pgHead + v19 patches.
- At Sub set autovacuum=true.
- Also measured how frequently slot.xmin and the worker's
oldest_nonremovable_xid were updated (see the polling sketch below).
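
The slot.xmin update count was collected roughly as below, by polling
pg_replication_slots on the subscriber during the run (a sketch; the
slot name 'pg_conflict_detection' is an assumption based on the patch
set, counting the worker's oldest_nonremovable_xid updates needs
patch-side instrumentation and is not shown, and the actual
measurement for the numbers below is in the attached v21_measure.sh):

  # Count how many times the conflict-detection slot's xmin advances:
  prev=$(psql -At -d postgres -c \
      "SELECT xmin FROM pg_replication_slots WHERE slot_name = 'pg_conflict_detection';")
  updates=0
  while true; do
    sleep 1
    cur=$(psql -At -d postgres -c \
        "SELECT xmin FROM pg_replication_slots WHERE slot_name = 'pg_conflict_detection';")
    if [ "$cur" != "$prev" ]; then
      updates=$((updates + 1))
      prev="$cur"
      echo "slot.xmin advanced to $cur (update #$updates)"
    fi
  done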

Results:
Run#        pub_TPS      sub_TPS      #slot.xmin_updates  #worker's_oldest_nonremovable_xid_updates
1           31080.30944  4573.547293  0                   1
regression  3%           -84%

- Autovacuum did not help in improving the Sub's TPS.
- The slot's xmin was not advanced.
~~~~

Observations and RCA for the TPS reduction in the above tests:
- The launcher was not able to advance slot.xmin during the 15-minute
pgbench run, leading to increased dead tuple accumulation on the
subscriber node.
- The launcher failed to advance slot.xmin because the apply worker
could not set its oldest_nonremovable_xid early and frequently enough,
due to the following two reasons:
1) For large pgbench tables (scale=100), the tablesync takes time
to complete, forcing the apply worker to wait before updating its
oldest_nonremovable_xid.
    2) With 30 clients generating operations at a pace that a single
apply worker cannot match, the worker fails to catch up with the
rapidly increasing remote_lsn and lags behind the Publisher's LSN
throughout the 15-minute run (a rough way to observe this lag is
sketched below).
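
For what it is worth, the lag described in reason 2 can be observed
roughly as below using only standard catalog views (a sketch; the
database name is a placeholder):

  # On the publisher: how far the apply worker's confirmed position
  # trails the current WAL position, per walsender (in bytes):
  psql -d postgres -c \
      "SELECT application_name,
              pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS lag_bytes
       FROM pg_stat_replication;"

  # On the subscriber: the positions reported by the apply worker:
  psql -d postgres -c \
      "SELECT subname, received_lsn, latest_end_lsn FROM pg_stat_subscription;"

A steadily growing lag_bytes over the run matches the behaviour
described above.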

Considering the above reasons, for better performance measurements, I
collected data with table sync turned off and with a varying number of
clients on the publisher node. The test below used the v21 patch set,
which also includes the improvement patches (006 and 007) for more
frequent slot.xmin updates.
-------------------------------
Case-3: Created the subscription with the option "copy_data=false",
so there is no tablesync in the picture.
-------------------------------
Test setup:
- Code: pgHead + v21 patches.
- Created Pub and Sub nodes with logical replication and the below configurations.
autovacuum_naptime = '30s'
shared_buffers = '30GB'
max_wal_size = 20GB
min_wal_size = 10GB
track_commit_timestamp = on (only on Sub node).

- The Pub and Sub had different pgbench tables with initial data of scale=100.
- Ran pgbench(read-write) on both the pub and the sub for a duration
of 15 minutes, using 30 clients on the Subscriber while varying the
number of clients on the Publisher.
- In addition to TPS, the frequency of slot.xmin and the worker's
oldest_nonremovable_xid updates was also measured.
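
The subscription for this case was created without the initial sync,
roughly as below (a sketch; publication/subscription names and the
connection string are placeholders, and retain_conflict_info is
assumed to be specified as a subscription option, as provided by this
patch set):

  # On the publisher:
  psql -d postgres -c "CREATE PUBLICATION pub FOR ALL TABLES;"

  # On the subscriber; copy_data = false skips tablesync entirely:
  psql -d postgres -c \
      "CREATE SUBSCRIPTION sub
         CONNECTION 'host=pub_host dbname=postgres'
         PUBLICATION pub
         WITH (copy_data = false, retain_conflict_info = on);"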

Observations:
- As the number of clients on the publisher increased, the
publisher's TPS improved, but the subscriber's TPS dropped
significantly.
- The frequency of slot.xmin updates also declined with more clients
on the publisher, indicating that the apply worker updated its
oldest_nonremovable_xid less frequently as the read-write operations
on the publisher increased.

Results:
#Pub-clients  pubTPS       pubTPS_increment  subTPS       subTPS_reduction  #slot.xmin_updates  #worker's_oldest_nonremovable_xid_updates
1             1364.487898  0                 35000.06738  0                 6976                6977
2             2706.100445  98%               32297.81408  -8%               5838                5839
4             5079.522778  272%              8581.034791  -75%              268                 269
30            31308.18524  2195%             5324.328696  -85%              4                   5

Note: In the above result table,
    - "pubTPS_increment" represents the % improvement in the Pub's
TPS compared to its TPS in the initial run with #Pub-clients=1, and
    - "subTPS_reduction" indicates the % decrease in the Sub's TPS
compared to its TPS in the initial run with #Pub-clients=1.
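
For example, with #Pub-clients=2: pubTPS_increment = (2706.100445 -
1364.487898) / 1364.487898 ≈ +98%, and subTPS_reduction = (32297.81408
- 35000.06738) / 35000.06738 ≈ -8%.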
~~~~

Conclusion:
There is some improvement in the slot.xmin update frequency with
table sync off and the additional patches that update the slot's xmin
more aggressively.
However, the key point is that with a large number of clients
generating write operations, the apply worker lags by a large margin,
so the slot's xmin stops being updated as the test run time increases.
This is also visible in Case-3: with only 1 client on the publisher,
there is no degradation on the subscriber, and as the number of
clients increases, the degradation also increases.

Based on this test analysis, I can say that we need some way/option to
invalidate slots that lag by more than a threshold margin, as
mentioned at [1]. This should solve the performance degradation and
bloat problem.
~~~~

(Attached are the test scripts used for the above tests.)

[1] https://www.postgresql.org/message-id/CAA4eK1Jyo4odkVsnSeAWPh8Wgpw12EbS9q8s_eN14LtcFNXCSA%40mail.gmail.com

--
Thanks,
Nisha

Attachment Content-Type Size
v21_setup.sh text/x-sh 2.8 KB
v21_measure.sh text/x-sh 1.1 KB
