Re: Conflict detection for update_deleted in logical replication

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Conflict detection for update_deleted in logical replication
Date: 2025-01-16 10:01:55
Message-ID: CAA4eK1J1ZRAdGfMARSu7bVv5DSuNcGm-_DY_XDXegugj_uE2mg@mail.gmail.com
Lists: pgsql-hackers

On Wed, Jan 15, 2025 at 2:20 PM Zhijie Hou (Fujitsu)
<houzj(dot)fnst(at)fujitsu(dot)com> wrote:
>
> In the latest version, we implemented a simpler approach that allows the apply
> worker to directly advance the oldest_nonremovable_xid if the waiting time
> exceeds the newly introduced option's limit. I've named this option
> "max_conflict_retention_duration," as it aligns better with the conflict
> detection concept and the "retain_conflict_info" option.
>
> During the last phase (RCI_WAIT_FOR_LOCAL_FLUSH), the apply worker evaluates
> how much time it has spent waiting. If this duration exceeds the
> max_conflict_retention_duration, the worker directly advances the
> oldest_nonremovable_xid and logs a message indicating the forced advancement of
> the non-removable transaction ID.
>
> This approach is a bit like the time-based option that was discussed before.
> Compared to the slot invalidation approach, this approach is simpler because we
> can avoid adding 1) new slot invalidation type due to apply lag, 2) new field
> lag_behind in shared memory (MyLogicalRepWorker) to indicate when the lag
> exceeds the limit, and 3) additional logic in the launcher to handle each
> worker's lag status.
>
> With slot invalidation, the user would be able to confirm the current state by
> checking whether the slot in pg_replication_slots is invalidated, while with the
> simpler approach, the user could only confirm that by checking the LOGs.
>
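
To make the quoted time-based behaviour concrete, here is a rough sketch of the check as I read it. This is only an illustration, not the patch's code: the ApplyWorkerState class, its fields, and the millisecond units are hypothetical, and xid wraparound is ignored.

```python
from dataclasses import dataclass


@dataclass
class ApplyWorkerState:
    # Hypothetical stand-in for the apply worker's retention state.
    oldest_nonremovable_xid: int
    candidate_xid: int          # xid the worker would advance to (assumed)
    wait_started_at_ms: int     # when the RCI_WAIT_FOR_LOCAL_FLUSH wait began


def maybe_force_advance(state, now_ms, max_conflict_retention_duration_ms):
    """Advance oldest_nonremovable_xid if the wait exceeded the limit.

    Returns True when the advancement was forced, so the caller can log it.
    """
    waited_ms = now_ms - state.wait_started_at_ms
    if waited_ms > max_conflict_retention_duration_ms:
        state.oldest_nonremovable_xid = state.candidate_xid
        return True
    return False
```

The only trace of a forced advancement here is the return value that the caller would log, which is the visibility concern discussed below.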

The user needs to check the LOGs corresponding to all subscriptions on
the node. I see the simplicity of the approach you used, but the
slot invalidation idea still sounds better to me because it makes it
convenient for users/DBAs to know when to rely on the update_missing
type conflict: if there is a valid and active slot with the name
'pg_conflict_detection' (or whatever name we decide to give), then
users can rely on the detected conflict. Sawada-San, and others, do
you have any preference on this matter?
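
To illustrate the kind of check a user/DBA could make under the slot invalidation approach, here is a minimal sketch. It assumes the slot rows have already been fetched; the dictionary keys (slot_name, active, invalidated) are simplified stand-ins rather than the actual view columns, and the slot name is not yet decided.

```python
# The slot name is still open in this discussion; treat it as a placeholder.
CONFLICT_SLOT_NAME = "pg_conflict_detection"


def can_rely_on_update_missing(slots):
    """Return True if a valid and active conflict-detection slot exists.

    `slots` is an iterable of dicts with hypothetical keys:
    slot_name (str), active (bool), invalidated (bool).
    """
    for slot in slots:
        if slot["slot_name"] == CONFLICT_SLOT_NAME:
            return bool(slot["active"]) and not bool(slot["invalidated"])
    # No such slot at all: detected conflicts cannot be relied upon.
    return False
```

A single check like this would replace scanning the LOGs of every subscription on the node.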

Do we want to prohibit the combination of copy_data=true and
retain_conflict_info=true? I understand that with the new parameter
'max_conflict_retention_duration', the slot would be invalidated anyway
for large copies, but I don't want to give users more ways to see this
slot invalidated right at the beginning. Similarly, during ALTER
SUBSCRIPTION, if the initial sync is in progress, we can disallow
enabling retain_conflict_info. Later, if there is a real demand for
such a combination, we can always enable it.
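
The restriction proposed above could look roughly like the following. This is a sketch of the rule only, not of how the subscription commands would implement it; the function names and error messages are made up for illustration.

```python
def check_create_subscription(copy_data, retain_conflict_info):
    """Reject CREATE SUBSCRIPTION with both options set (proposed rule)."""
    if copy_data and retain_conflict_info:
        raise ValueError(
            "retain_conflict_info cannot be enabled together with copy_data"
        )


def check_alter_subscription(enable_retain_conflict_info, initial_sync_in_progress):
    """Reject enabling retain_conflict_info while the initial sync is running."""
    if enable_retain_conflict_info and initial_sync_in_progress:
        raise ValueError(
            "retain_conflict_info cannot be enabled while initial sync is in progress"
        )
```

Both checks are deliberately strict; relaxing them later would be a compatible change if there is demand for the combination.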

--
With Regards,
Amit Kapila.
