RE: Conflict detection for update_deleted in logical replication

From: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: RE: Conflict detection for update_deleted in logical replication
Date: 2024-12-11 09:02:19
Message-ID: OS0PR01MB57164C2D26F2E7D489832D26943E2@OS0PR01MB5716.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wednesday, December 11, 2024 3:34 PM Zhijie Hou (Fujitsu) <houzj(dot)fnst(at)fujitsu(dot)com> wrote:
>
> On Wednesday, December 11, 2024 1:06 PM Amit Kapila
> <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > On Fri, Dec 6, 2024 at 1:28 PM Zhijie Hou (Fujitsu)
> > <houzj(dot)fnst(at)fujitsu(dot)com>
> > wrote:
> > >
> > > On Thursday, December 5, 2024 6:00 PM Amit Kapila
> > <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > > >
> > > >
> > > > A few more comments:
> > > > 1.
> > > > +static void
> > > > +wait_for_local_flush(RetainConflictInfoData *data)
> > > > {
> > > > ...
> > > > +
> > > > + data->phase = RCI_GET_CANDIDATE_XID;
> > > > +
> > > > + maybe_advance_nonremovable_xid(data);
> > > > +}
> > > >
> > > > Isn't it better to reset all the fields of data before the next
> > > > round of GET_CANDIDATE_XID phase? If we do that then we don't need
> > > > to reset
> > > > data->remote_lsn = InvalidXLogRecPtr; and data->last_phase_at =
> > > > InvalidFullTransactionId; individually in
> > > > request_publisher_status() and
> > > > get_candidate_xid() respectively. Also, it looks clean and logical
> > > > to me unless I am missing something.
> > >
> > > The remote_lsn was used to determine whether a status is received,
> > > so was reset each time in request_publisher_status. To make it more
> > > straightforward, I added a new function parameter 'status_received',
> > > which would be set to true when calling
> > > maybe_advance_nonremovable_xid() on receving the status. After this
> > change, there is no need to reset the remote_lsn.
> > >
> >
> > As part of the above comment, I had asked for three things (a) avoid
> > setting
> > data->remote_lsn = InvalidXLogRecPtr; in request_publisher_status();
> > data->(b)
> > avoid setting data->last_phase_at =InvalidFullTransactionId; in
> > get_candidate_xid(); (c) reset data in
> > wait_for_local_flush() after wait is over. You only did (a) in the
> > patch and didn't mention anything about (b) or (c). Is that intentional? If so,
> what is the reason?
>
> I think I misunderstood the intention, so will address in next version.
>
> >
> > *
> > +static bool
> > +can_advance_nonremovable_xid(RetainConflictInfoData *data) {
> > +
> >
> > Isn't it better to make this an inline function as it contains just one check?
>
> Agreed. Will address in next version.
>
> >
> > *
> > + /*
> > + * The non-removable transaction ID for a subscription is centrally
> > + * managed by the main apply worker.
> > + */
> > + if (!am_leader_apply_worker())
> >
> > I have tried to improve this comment in the attached.
>
> Thanks, will check and merge the next version.

Attach the V16 patch set which addressed above comments.

There is a new 0002 patch where I tried to dynamically adjust the interval for
advancing the transaction ID. Instead of always waiting for
wal_receiver_status_interval, we can start with a short interval and increase
it if there is no activity (no xid assigned on subscriber), but not beyond
wal_receiver_status_interval.

The intention is to more effectively advance xid to avoid retaining too much
dead tuples. My colleague will soon share detailed performance data and
analysis related to this enhancement.

Best Regards,
Hou zj

Attachment Content-Type Size
v16-0006-Add-a-tap-test-to-verify-the-new-slot-xmin-mecha.patch application/octet-stream 7.5 KB
v16-0001-Maintain-the-oldest-non-removeable-tranasction-I.patch application/octet-stream 34.7 KB
v16-0002-Dynamically-adjust-xid-advancement-interval.patch application/octet-stream 4.8 KB
v16-0003-Maintain-the-replication-slot-in-logical-launche.patch application/octet-stream 11.9 KB
v16-0004-Support-the-conflict-detection-for-update_delete.patch application/octet-stream 21.5 KB
v16-0005-Add-a-detect_update_deleted-option-to-subscripti.patch application/octet-stream 73.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrei Lepikhov 2024-12-11 09:02:32 Re: Removing unneeded self joins
Previous Message RECHTÉ Marc 2024-12-11 08:59:55 Re: Logical replication timeout