RE: Conflict detection for update_deleted in logical replication

From: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: RE: Conflict detection for update_deleted in logical replication
Date: 2025-01-18 03:45:13
Message-ID: OS0PR01MB571604762E4DB7357430D16B94E52@OS0PR01MB5716.jpnprd01.prod.outlook.com
Lists: pgsql-hackers

On Thursday, January 16, 2025 6:02 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Wed, Jan 15, 2025 at 2:20 PM Zhijie Hou (Fujitsu)
> <houzj(dot)fnst(at)fujitsu(dot)com> wrote:
> >
> > In the latest version, we implemented a simpler approach that allows
> > the apply worker to directly advance the oldest_nonremovable_xid if
> > the waiting time exceeds the newly introduced option's limit. I've
> > named this option "max_conflict_retention_duration," as it aligns
> > better with the conflict detection concept and the "retain_conflict_info"
> > option.
> >
> > During the last phase (RCI_WAIT_FOR_LOCAL_FLUSH), the apply worker
> > evaluates how much time it has spent waiting. If this duration exceeds
> > the max_conflict_retention_duration, the worker directly advances the
> > oldest_nonremovable_xid and logs a message indicating the forced
> > advancement of the non-removable transaction ID.
> >
> > This approach is a bit like a time-based option that was discussed before.
> > Compared to the slot invalidation approach, this approach is simpler
> > because we can avoid adding 1) new slot invalidation type due to apply
> > lag, 2) new field lag_behind in shared memory (MyLogicalRepWorker) to
> > indicate when the lag exceeds the limit, and 3) additional logic in
> > the launcher to handle each worker's lag status.
> >
> > In the slot invalidation approach, the user would be able to confirm
> > whether conflict detection is still reliable by checking if the slot in
> > pg_replication_slots is invalidated or not, while in the simpler approach
> > mentioned, the user could only confirm that by checking the LOGs.
> >
>
> The user needs to check the LOGs corresponding to all subscriptions on the
> node. I see the simplicity of the approach you used, but the
> slot_invalidation idea still sounds better to me on the grounds that it will
> be convenient for users/DBAs to know when to rely on the update_missing type
> conflict: if there is a valid and active slot with the name
> 'pg_conflict_detection' (or whatever name we decide to give), then users can
> rely on the detected conflict. Sawada-San, and others, do you have any
> preference on this matter?

I think invalidating the slot is OK, and we could also let the apply worker
recover automatically, as suggested in [1].

Here is the V24 patch set. I modified the 0004 patch to implement the slot
invalidation part. Since automatic recovery could be an optimization and its
discussion is still in progress, I didn't implement that part.
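
To make the timeout decision discussed above concrete, here is a minimal
standalone C sketch. It is not code from the patch set: every identifier in it
(ConflictRetentionState, check_retention_timeout, and so on) is hypothetical,
and treating a limit of 0 as "no limit" is an assumption made only for this
illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical outcome of one timeout check; not a type from the patch. */
typedef enum
{
    RETENTION_KEEP_WAITING,     /* still within the configured limit */
    RETENTION_INVALIDATE_SLOT   /* limit exceeded: stop retaining conflict info */
} RetentionAction;

/* Hypothetical per-worker state for the final wait phase. */
typedef struct
{
    int64_t wait_start_ms;      /* when the worker began waiting for local flush */
    bool    slot_invalidated;   /* set once the retention limit was exceeded */
} ConflictRetentionState;

/*
 * Decide what to do after waiting until now_ms.
 * Assumption: max_duration_ms == 0 means "no limit".
 */
static RetentionAction
check_retention_timeout(const ConflictRetentionState *state,
                        int64_t now_ms, int64_t max_duration_ms)
{
    assert(state != NULL);

    if (max_duration_ms > 0 &&
        now_ms - state->wait_start_ms > max_duration_ms)
        return RETENTION_INVALIDATE_SLOT;

    return RETENTION_KEEP_WAITING;
}

/* Record the decision so that it stays visible after the check. */
static void
apply_retention_action(ConflictRetentionState *state, RetentionAction action)
{
    assert(state != NULL);

    if (action == RETENTION_INVALIDATE_SLOT)
        state->slot_invalidated = true;
}

/* What a user would effectively learn from looking at the slot status. */
static bool
conflict_info_reliable(const ConflictRetentionState *state)
{
    assert(state != NULL);
    return !state->slot_invalidated;
}
```

The point of the sketch is only that the invalidation variant leaves a
persistent flag that can be queried later, whereas directly advancing the xid
leaves nothing behind except a log message.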

> Do we want to prohibit the combination of copy_data = true and
> retain_conflict_info = true? I understand that with the new parameter
> 'max_conflict_retention_duration', for large copies the slot would be
> invalidated anyway, but I don't want to give users more ways to see this slot
> invalidated right at the beginning. Similarly, during ALTER SUBSCRIPTION, if
> the initial sync is in progress, we can disallow enabling
> retain_conflict_info. Later, if there is a real demand for such a
> combination, we can always enable it.

I didn't restrict this case, but I will analyze it further to see if it can be improved.

[1] https://www.postgresql.org/message-id/CAA4eK1J0W_ebXSSJXb7uM67%3DKyzrUG1sVz-kCd-zT16A7Obq%3Dg%40mail.gmail.com

Best Regards,
Hou zj

Attachment                                                       Content-Type              Size
v24-0006-Support-the-conflict-detection-for-update_delete.patch  application/octet-stream  25.7 KB
v24-0001-Maintain-the-oldest-non-removeable-tranasction-I.patch  application/octet-stream  40.7 KB
v24-0002-Maintain-the-replication-slot-in-logical-launche.patch  application/octet-stream  21.9 KB
v24-0003-Add-a-retain_conflict_info-option-to-subscriptio.patch  application/octet-stream  79.7 KB
v24-0004-add-a-max_conflict_retention_duration-subscripti.patch  application/octet-stream  82.3 KB
v24-0005-Add-a-tap-test-to-verify-the-management-of-the-n.patch  application/octet-stream  6.6 KB
