Re: Conflict detection for update_deleted in logical replication

From: Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Conflict detection for update_deleted in logical replication
Date: 2025-01-06 05:26:09
Message-ID: CABdArM4iPwbHqUumLGw7HMBefFa718AJOWJv0avufOtR03vFVw@mail.gmail.com
Lists: pgsql-hackers

On Wed, Jan 1, 2025 at 11:06 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Thu, Dec 19, 2024 at 4:34 PM Zhijie Hou (Fujitsu)
> <houzj(dot)fnst(at)fujitsu(dot)com> wrote:
> >
> > On Sunday, December 15, 2024 9:39 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > >
> >
> > >
> > > 5. The apply worker needs to at least twice get the publisher status message to
> > > advance oldest_nonremovable_xid once. It then uses the remote_lsn of the last
> > > such message to ensure that it has been applied locally. Such a remote_lsn
> > > could be a much later value than required leading to delay in advancing
> > > oldest_nonremovable_xid. How about if while first time processing the
> > > publisher_status message on walsender, we get the
> > > latest_transaction_in_commit by having a function
> > > GetLatestTransactionIdInCommit() instead of
> > > GetOldestTransactionIdInCommit() and then simply wait till that proc has
> > > written commit WAL (aka wait till it clears DELAY_CHKPT_IN_COMMIT)?
> > > Then get the latest LSN wrote and send that to apply worker waiting for the
> > > publisher_status message. If this is feasible then we should be able to
> > > advance oldest_nonremovable_xid with just one publisher_status message.
> > > Won't that be an improvement over current? If so, we can even further try to
> > > improve it by just using commit_LSN of the transaction returned by
> > > GetLatestTransactionIdInCommit(). One idea is that we can try to use
> > > MyProc->waitLSN which we are using in synchronous replication for our
> > > purpose. See SyncRepWaitForLSN.
> >
> > I will do more performance tests on this and address if it improves
> > the performance.
> >
>
> Did you check this idea? Again, thinking about this, I see a downside
> to the new proposal. In the new proposal, the walsender needs to
> somehow wait for the transactions in the commit which essentially
> means that it may lead delay in decoding and sending the decoded WAL.
> But it is still worth checking the impact of such a change, if nothing
> else, we can add a short comment in the code to suggest such an
> improvement is not worthwhile.
>
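
For readers following along, the waiting behavior being measured can be sketched as a toy simulation. This is plain Python with illustrative names only; the real top-up patch operates on PGPROC flags (DELAY_CHKPT_IN_COMMIT) in C inside the walsender, so treat every name below as an assumption, not the patch's actual API:

```python
import threading
import time

# Hypothetical, simplified model of the tested behavior: on a
# publisher-status request, the walsender finds the transactions currently
# in their commit critical section (analogous to procs with
# DELAY_CHKPT_IN_COMMIT set) and waits until each has written its commit
# WAL record before replying to the apply worker.

class Proc:
    def __init__(self):
        self.delay_chkpt_in_commit = True    # in commit critical section
        self.commit_done = threading.Event()

def backend_commit(proc):
    time.sleep(0.05)                         # pretend to write the commit record
    proc.delay_chkpt_in_commit = False       # commit record is on disk
    proc.commit_done.set()

def walsender_wait_for_in_commit_txns(procs):
    # Analogue of waiting for the procs found via a hypothetical
    # GetLatestTransactionIdInCommit() to clear DELAY_CHKPT_IN_COMMIT.
    for proc in procs:
        if proc.delay_chkpt_in_commit:
            proc.commit_done.wait()

procs = [Proc() for _ in range(3)]
threads = [threading.Thread(target=backend_commit, args=(p,)) for p in procs]
for t in threads:
    t.start()

walsender_wait_for_in_commit_txns(procs)     # blocks ~50 ms in this toy model
assert all(not p.delay_chkpt_in_commit for p in procs)
for t in threads:
    t.join()
```

The sketch also shows why the approach can hurt throughput: the walsender is stalled for the duration of the slowest in-flight commit instead of continuing to decode and send WAL.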

Here are the performance test results for the proposed idea where the
WAL sender waits for transactions during commit:

The results indicate that the change did not deliver the expected
performance improvements. On the contrary:
- The subscriber's TPS showed no improvement; instead, it dropped by
~58%, consistent with the previous observations shared at [1].
- The publisher's TPS was also impacted, showing a drop of 7-8%.

Below are the test details:

Test setup:
- Applied the v17 patch-set atop pgHead, and then the top-up patch that
makes the walsender wait on the publisher side. The top-up patch is
attached.
- Created a Pub-Sub setup with configurations -

autovacuum = false
shared_buffers = '30GB'
max_wal_size = 20GB
min_wal_size = 10GB
track_commit_timestamp = on (only on sub node)

- To avoid the update/delete_missing conflicts, distinct initial
pgbench tables were created on the publisher and subscriber nodes. On
the publisher side, the tables were renamed: pgbench_XXX ->
pgbench_pub_XXX. Initial data was inserted with scale=100 on both
nodes.
(The scripts used to run the tests - "measure.sh" and "setup.sh" are attached)

Test Run:
- Ran pgbench on both pub and sub simultaneously, on different tables.

Observations:
- No improvement in the subscriber's TPS; it dropped by ~58%.
- The publisher's TPS also dropped by ~8%.

Results:

Run#       | Pub's TPS   | Sub's TPS
1          | 30279.48756 | 13397.7329
2          | 29634.12687 | 13316.93142
3          | 29350.39389 | 13264.91436
4          | 29700.10173 | 13394.91969
5          | 30121.19058 | 13380.86673
Median     | 29700.10173 | 13380.86673
Regression | -8%         | -58%
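
The Median rows above can be reproduced as follows. The per-run TPS numbers come from the table; the no-patch baselines are assumed values chosen only to illustrate how the regression percentages are derived, since the baseline runs are not included in this message:

```python
from statistics import median

# Per-run TPS values copied from the results table.
pub_tps = [30279.48756, 29634.12687, 29350.39389, 29700.10173, 30121.19058]
sub_tps = [13397.7329, 13316.93142, 13264.91436, 13394.91969, 13380.86673]

pub_median = median(pub_tps)   # 29700.10173, matching the table
sub_median = median(sub_tps)   # 13380.86673, matching the table

def regression(with_patch, baseline):
    # Negative result => TPS loss relative to the run without the patch.
    return (with_patch - baseline) / baseline * 100

# Hypothetical no-patch baselines (NOT from this message) that would yield
# roughly the reported -8% and -58% figures:
pub_baseline = 32300.0
sub_baseline = 31900.0

print(round(regression(pub_median, pub_baseline)))   # -8
print(round(regression(sub_median, sub_baseline)))   # -58
```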

Perf analysis for Sub and walsender:

a) The subscriber-side backend process spends a large share of its time
in heap tuple scans, indicating that dead-tuple accumulation is the
cause.
```
....
--68.69%--ExecModifyTable
--60.92%--ExecScan
--60.75%--IndexNext
--60.55%--index_getnext_slot
--58.12%--index_fetch_heap
--57.83%--heapam_index_fetch_tuple
--40.11%--heap_hot_search_buffer
....
```

b) The walsender profile shows 2-3% more time spent waiting overall. The
walsender profiles with and without the top-up patch are attached.
~~~~

[1] https://www.postgresql.org/message-id/CABdArM4OEwmh_31dQ8_F__VmHwk2ag_M%3DYDD4H%2ByYQBG%2BbHGzg%40mail.gmail.com

--
Thanks,
Nisha

Attachment Content-Type Size
walsender_profile_with_topup_patch.txt text/plain 1.2 MB
0001-wait-for-committing-txns-to-finish-in-walsender.patch application/x-patch 6.4 KB
walsender_profile_without_topup_patch.txt text/plain 1.2 MB
measure.sh application/x-sh 1016 bytes
setup.sh application/x-sh 2.0 KB
