RE: Conflict detection for update_deleted in logical replication

From: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: RE: Conflict detection for update_deleted in logical replication
Date: 2025-01-10 11:05:58
Message-ID: OS0PR01MB571614DD226864AF8B245B04941C2@OS0PR01MB5716.jpnprd01.prod.outlook.com
Lists: pgsql-hackers

On Friday, January 10, 2025 8:43 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:

Hi,

>
> On Wed, Jan 8, 2025 at 7:26 PM Zhijie Hou (Fujitsu)
> <houzj(dot)fnst(at)fujitsu(dot)com> wrote:
> >
> > On Thursday, January 9, 2025 9:48 AM Masahiko Sawada
> > <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
> > Hi,
> >
> > >
> > > On Wed, Jan 8, 2025 at 3:00 AM Zhijie Hou (Fujitsu)
> > > <houzj(dot)fnst(at)fujitsu(dot)com> wrote:
> > > >
> > > > On Wednesday, January 8, 2025 6:33 PM Masahiko Sawada
> > > > <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > > >
> > > > Hi,
> > > >
> > > > > On Wed, Jan 8, 2025 at 1:53 AM Amit Kapila
> > > > > <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > > > > > On Wed, Jan 8, 2025 at 3:02 PM Masahiko Sawada
> > > > > > <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > > > > > >
> > > > > > > On Thu, Dec 19, 2024 at 11:11 PM Nisha Moond
> > > > > > > <nisha(dot)moond412(at)gmail(dot)com> wrote:
> > > > > > > >
> > > > > > > > [3] Test with pgbench run on both publisher and subscriber.
> > > > > > > >
> > > > > > > > Test setup:
> > > > > > > > - Tests performed on pgHead + v16 patches
> > > > > > > > - Created a pub-sub replication system.
> > > > > > > > - Parameters for both instances were:
> > > > > > > >
> > > > > > > >   shared_buffers = 30GB
> > > > > > > >   min_wal_size = 10GB
> > > > > > > >   max_wal_size = 20GB
> > > > > > > >   autovacuum = false
> > > > > > >
> > > > > > > Since you disabled autovacuum on the subscriber, dead tuples
> > > > > > > created by non-hot updates accumulate anyway regardless of the
> > > > > > > detect_update_deleted setting, is that right?
> > > > > > >
> > > > > >
> > > > > > I think the hot-pruning mechanism during the update operation will
> > > > > > remove dead tuples even when autovacuum is disabled.
> > > > >
> > > > > True, but why was autovacuum disabled? It seems that
> > > > > case1-2_setup.sh doesn't specify fillfactor, which makes hot updates
> > > > > less likely to happen.
> > > >
> > > > IIUC, we disable autovacuum as a general practice in read-write tests
> > > > for stable TPS numbers.
> > >
> > > Okay. TBH I'm not sure what we can say with these results. At a glance,
> > > in a typical bi-directional-like setup, we can interpret these results
> > > as meaning that if users turn retain_conflict_info on, TPS drops by 50%.
> > > But I'm not sure this 50% dip is the worst case users could possibly
> > > face. It could be better in practice thanks to autovacuum, or it could
> > > go even worse due to further bloat if we run the test longer.
> > > Suppose that users saw a 50% performance dip due to dead tuple retention
> > > for update_deleted detection, is there any way for them to improve the
> > > situation? For example, trying to advance slot.xmin more frequently
> > > might help to reduce dead tuple accumulation. I think it would be good
> > > if we could have a way to balance the publisher performance against the
> > > subscriber performance.
> >
> > AFAICS, most of the time in each xid advancement is spent on waiting for
> > the target remote_lsn to be applied and flushed, so increasing the
> > frequency would not help. This is supported by testcase 4 shared by
> > Nisha[1]: in that test, we do not request a remote_lsn but simply wait
> > for the commit_ts of the incoming transaction to exceed the
> > candidate_xid_time, and the regression is still the same.
>
> True, but I think that what matters is not only asking the publisher for
> its status more frequently, but also having the apply worker frequently
> try to advance the RetainConflictInfoPhase and the launcher frequently
> try to advance the slot.xmin.

I agree.
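To make the dynamic we are discussing easier to follow, here is a rough sketch in Python. It is purely illustrative and not the patch's actual code; the names (ApplyWorkerState, oldest_nonremovable_xid, target_remote_lsn, etc.) are simplified stand-ins. It shows why attempt frequency alone does not guarantee advancement: each attempt is a no-op until the subscriber has applied and flushed up to the requested remote LSN.

```python
# Illustrative sketch (not the actual patch code) of why an advancement
# attempt by the apply worker can be a no-op: the non-removable XID horizon
# may only move forward once the requested remote LSN has been fully
# applied and flushed locally. All names are simplified stand-ins.

from dataclasses import dataclass

@dataclass
class ApplyWorkerState:
    oldest_nonremovable_xid: int  # XID horizon the worker still protects
    candidate_xid: int            # newer horizon we would like to advance to
    target_remote_lsn: int        # publisher LSN we must catch up to first
    last_flushed_lsn: int         # how far the subscriber has applied/flushed

    def try_advance(self) -> bool:
        """Advance the horizon only once the target remote LSN is applied
        and flushed locally; otherwise the attempt does nothing. This
        waiting is where most of the time is spent."""
        if self.last_flushed_lsn >= self.target_remote_lsn:
            self.oldest_nonremovable_xid = self.candidate_xid
            return True
        return False

worker = ApplyWorkerState(oldest_nonremovable_xid=100, candidate_xid=250,
                          target_remote_lsn=5000, last_flushed_lsn=4000)

print(worker.try_advance())            # False: still behind the target LSN
worker.last_flushed_lsn = 5200         # apply/flush catches up
print(worker.try_advance())            # True: horizon can now advance
print(worker.oldest_nonremovable_xid)  # 250
```

In this simplified picture, retrying try_advance() more often changes nothing until last_flushed_lsn catches up, which matches the observation that most of the time is spent waiting rather than attempting.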

>
> > I think it indicates that we indeed need to wait
> > for this amount of time before applying all the transactions that have an
> > earlier commit timestamp. IOW, the performance impact on the subscriber
> > side is reasonable behavior if we want to detect the update_deleted
> > conflict reliably.
>
> It's reasonable behavior for this approach, but it might not be a
> reasonable outcome for users if they could be affected by such a
> performance dip with no way to avoid it.
>
> To closely look at what is happening in the apply worker and the
> launcher, I did a quick test with the same setup, running
> pgbench with 30 clients against each of the publisher and subscriber (on
> different pgbench tables so conflicts don't happen on the subscriber),
> and I recorded how often the worker and the launcher tried to update
> the worker's xmin and the slot's xmin, respectively. During the
> 120-second test I observed that the apply worker advanced its
> oldest_nonremovable_xid 10 times in 43 attempts and the launcher
> advanced the slot's xmin 5 times in 20 attempts, which seems rather
> infrequent. And there seems to be no way for users to increase these
> frequencies. Actually, these XID advancements happened only early in
> the test; in the later part there was almost no attempt to advance
> XIDs (I described the reason below). Therefore, after the 120-second
> test, the slot's xmin was 2366291 XIDs behind (TPS on the publisher
> and subscriber were 15728 and 18052, respectively).

Thanks for testing! It appears that the advancement frequency observed in your
test is higher than what we've seen locally. Could you please share the scripts
you used, and possibly the machine configuration? That will help us investigate
the differences from the data you've shared.
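As a quick sanity check on the reported figures (this arithmetic is my own, not from the thread, and assumes each pgbench write transaction consumes one XID on the node it runs on):

```python
# Back-of-the-envelope check of the figures reported above. With slot.xmin
# advancing only early in the run, the lag should be on the order of the
# XIDs consumed during the test. One XID per write transaction per node is
# an assumption, not a measurement.

test_secs = 120
pub_tps = 15728          # publisher TPS reported above
sub_tps = 18052          # subscriber TPS reported above
reported_lag = 2366291   # slot.xmin lag in XIDs reported above

pub_xids = pub_tps * test_secs                 # publisher-side writes alone
total_xids = (pub_tps + sub_tps) * test_secs   # both nodes combined

print(pub_xids)                                # 1887360
print(pub_xids < reported_lag < total_xids)    # True: lag falls in that range
```

So the reported lag exceeds the publisher-side XID consumption for the whole run, consistent with the slot's xmin having stopped advancing early in the test.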

> I think there are 3 things we need to deal with:

Thanks for the suggestions. We will analyze them and share some top-up patches
for the suggested changes later.

Best Regards,
Hou zj
