Re: Conflict detection for update_deleted in logical replication

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>
Cc: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Conflict detection for update_deleted in logical replication
Date: 2024-10-15 11:33:14
Message-ID: CAA4eK1+fahe4DaeDkFM-vZDgZT+Ch6b8t8FJp2VSGi5UD1svNQ@mail.gmail.com
Lists: pgsql-hackers

On Mon, Oct 14, 2024 at 9:09 AM Zhijie Hou (Fujitsu)
<houzj(dot)fnst(at)fujitsu(dot)com> wrote:
>
> While reviewing the patch, I noticed that the current design could not work in
> a non-bidirectional cluster (publisher -> subscriber) when the publisher is
> also a physical standby. (Logical decoding on a physical standby was supported
> recently, so a physical standby can act as a logical publisher.)
>
> The cluster looks like:
>
> physical primary -> physical standby (also publisher) -> logical subscriber (detect_update_deleted)
>
> The issue arises because the physical standby (acting as the publisher) might
> lag behind its primary. As a result, the logical walsender on the standby might
> not be able to get the latest WAL position when requested by the logical
> subscriber. We can only get the WAL replay position, but there may be more WAL
> still being replicated from the primary, and that WAL could carry older commit
> timestamps. (Note that transactions on the primary and the standby have the
> same commit timestamp.)
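>
> To illustrate the gap, on the standby the replay position can trail what
> has already been received (and further WAL may still exist only on the
> primary), so it is not a safe "latest" position to report. Both of these
> are existing core functions:
>
> -- Run on the physical standby (publisher)
> SELECT pg_last_wal_receive_lsn() AS received,
>        pg_last_wal_replay_lsn()  AS replayed;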
>
> So, the logical walsender might send an outdated WAL position as feedback.
> This, in turn, can cause the replication slot on the subscriber to advance
> prematurely, leading to the unwanted removal of dead tuples. As a result, the
> apply worker may fail to correctly detect update_deleted conflicts.
>
> We thought of a few options to fix this:
>
> 1) Add a Time-Based Subscription Option:
>
> We could add a new time-based option for subscriptions, such as
> retain_dead_tuples = '5s'. In the logical launcher, after obtaining the
> candidate XID, the launcher will wait for the specified time before advancing
> the slot.xmin. This ensures that deleted tuples are retained for at least the
> duration defined by this new option.
>
> This approach is designed to simulate the functionality of the GUC
> (vacuum_committs_age), but with a simpler implementation that does not impact
> vacuum performance. We can maintain both this time-based method and the current
> automatic method. If a user does not specify the time-based option, we will
> continue using the existing approach to retain dead tuples until all concurrent
> transactions from the remote node have been applied.
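>
> For example, the user-facing syntax could look something like this (a
> sketch only; retain_dead_tuples is the proposed option, while the
> detect_update_deleted option and the connection string are illustrative):
>
> CREATE SUBSCRIPTION sub
>     CONNECTION 'host=standby dbname=postgres'
>     PUBLICATION pub
>     WITH (detect_update_deleted = true, retain_dead_tuples = '5s');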
>
> 2) Modification to the Logical Walsender
>
> For the logical walsender running on a physical standby, we can open an
> additional connection to the physical primary to obtain the latest WAL
> position. This position will then be sent as feedback to the logical
> subscriber.
>
> A potential concern is that this requires the walsender to use the walreceiver
> API, which may seem a bit unnatural. It also starts an additional walsender
> process on the primary, since the logical walsender on the physical standby
> needs to communicate with that walsender to fetch the WAL position.
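>
> To make the idea concrete, the position we want is essentially what this
> ad-hoc query fetches (it uses the dblink extension purely for
> illustration; the actual patch would use the walreceiver API, and the
> connection string is made up):
>
> -- Run on the physical standby (publisher)
> SELECT lsn
> FROM dblink('host=primary dbname=postgres',
>             'SELECT pg_current_wal_lsn()') AS t(lsn pg_lsn);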
>

This idea is worth considering, but I think it may not be a good
approach if the physical standby is cascading. We would need to restrict
update_deleted conflict detection if the standby is cascading, right?

The other approach is to send current_timestamp from the subscriber and
somehow check whether the physical standby has applied WAL up to the
commit LSN corresponding to that commit_ts; if so, it can send that WAL
position to the subscriber, otherwise it waits for the WAL to be
applied. If we do this, then we don't need to add a restriction for
cascaded physical standbys. I think the subscriber anyway needs to wait
for such an LSN to be applied on the standby before advancing the xmin,
even if we get it from the primary. This is because the subscriber can
only apply and flush the WAL once it has been applied on the standby. Am
I missing something?
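
To sketch the check on the standby side (assuming track_commit_timestamp
is enabled; the timestamp literal and the surrounding wait loop are
illustrative only), the existing pg_last_committed_xact() function
exposes the commit timestamp of the most recently replayed transaction:

-- Run on the physical standby (publisher); returns true once the
-- standby has replayed a commit at or past the subscriber's timestamp
SELECT t."timestamp" >= '2024-10-15 11:33:14+00'::timestamptz AS caught_up
FROM pg_last_committed_xact() AS t;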

This approach has the disadvantage that we are relying on the clocks of
both nodes being synced, which we anyway need for conflict resolution as
discussed in the thread [1]. We also need to consider the commit
timestamp and LSN inversion issue discussed in another thread [2] if we
want to pursue this approach, because we may miss an LSN that has a
prior timestamp.

> 3) Documentation of Restrictions
>
> As an alternative, we could simply document the restriction that detecting
> update_deleted conflicts is not supported if the publisher is also acting
> as a physical standby.
>

If we don't want to go for something along the lines of the approach
mentioned in (2), then I think we can do a combination of (1) and (3),
where we error out if the user has not provided retain_dead_tuples and
the publisher is a physical standby.
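
Under that combination, creating the subscription could fail along these
lines (the option names follow the earlier sketch, and the error wording
is made up for illustration):

CREATE SUBSCRIPTION sub
    CONNECTION 'host=standby dbname=postgres'
    PUBLICATION pub
    WITH (detect_update_deleted = true);
ERROR:  cannot detect update_deleted conflicts when the publisher is a physical standby
HINT:  Specify retain_dead_tuples to retain dead tuples for a fixed duration.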

[1] - https://www.postgresql.org/message-id/CABdArM4%3D152B9PoyF4kggQ4LniYtjBCdUjL%3DqBwD-jcogP2BPQ%40mail.gmail.com
[2] - https://www.postgresql.org/message-id/CAJpy0uBxEJnabEp3JS%3Dn9X19Vx2ZK3k5AR7N0h-cSMtOwYV3fA%40mail.gmail.com

--
With Regards,
Amit Kapila.
