Re: Conflict detection for update_deleted in logical replication

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Conflict detection for update_deleted in logical replication
Date: 2024-09-20 09:46:26
Message-ID: CAA4eK1Ly_Y9DKJHzbFSK=7JmfaUj-P9jJifO4ZTC7qPEiy8x3w@mail.gmail.com
Lists: pgsql-hackers

On Fri, Sep 20, 2024 at 8:25 AM Zhijie Hou (Fujitsu)
<houzj(dot)fnst(at)fujitsu(dot)com> wrote:
>
> Apart from the vacuum_defer_cleanup_age idea.
>

I think you meant to say vacuum_committs_age idea.

> we’ve given more thought to our
> approach for retaining dead tuples and have come up with another idea that can
> reliably detect conflicts without requiring users to choose a suitable value
> for vacuum_committs_age. This new idea could also reduce the performance
> impact. Thanks a lot to Amit for the off-list discussion.
>
> The core of the new idea is that dead tuples are only useful for detecting
> conflicts when applying *concurrent* transactions from remote nodes. Any
> UPDATE from a remote node that arrives after the dead tuples have been
> removed should carry a later timestamp, so it is reasonable to detect an
> update_missing scenario and convert the UPDATE to an INSERT when applying it.
>
> To achieve the above, we can create an additional replication slot on the
> subscriber side, maintained by the apply worker. This slot is used to retain
> the dead tuples. The apply worker will advance the slot.xmin only after
> confirming that all the concurrent transactions on the publisher have been
> applied locally.
>
> The process of advancing the slot.xmin could be:
>
> 1) The apply worker calls GetRunningTransactionData() to get the
> 'oldestRunningXid' and considers this as 'candidate_xmin'.
> 2) The apply worker sends a new message to the walsender to request the
> latest WAL flush position (GetFlushRecPtr) on the publisher, and saves it as
> 'candidate_remote_wal_lsn'. Here we could introduce a new feedback message or
> extend the existing keepalive message (e.g., extend the requestReply bit in
> the keepalive message to add a 'request_wal_position' value).
> 3) The apply worker continues to apply changes. After applying all the WAL up
> to 'candidate_remote_wal_lsn', it can then advance the slot.xmin to
> 'candidate_xmin'.
>
> This approach ensures that dead tuples are not removed until all concurrent
> transactions have been applied. It can be effective for both bidirectional and
> non-bidirectional replication cases.
>
> We could introduce a boolean subscription option (retain_dead_tuples) to
> control whether this feature is enabled. Each subscription intending to detect
> update_deleted conflicts should set retain_dead_tuples to true.
>
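
Just to spell out the cycle in steps 1) to 3) above, a minimal sketch of
how the apply worker could run it is below (request_publisher_flush_lsn(),
wait_for_apply_to_reach(), and advance_slot_xmin() are hypothetical
placeholders for the new feedback message and the slot update, not
existing APIs):

#include "postgres.h"
#include "access/xlogdefs.h"
#include "storage/lwlock.h"
#include "storage/procarray.h"
#include "storage/standby.h"

/* One xmin-advancement cycle in the apply worker (sketch). */
static void
maybe_advance_conflict_slot_xmin(void)
{
    RunningTransactions running;
    TransactionId candidate_xmin;
    XLogRecPtr candidate_remote_wal_lsn;

    /* 1) Take the oldest running xid as candidate_xmin. */
    running = GetRunningTransactionData();
    candidate_xmin = running->oldestRunningXid;

    /*
     * GetRunningTransactionData() returns with ProcArrayLock and
     * XidGenLock held; the caller must release them.
     */
    LWLockRelease(ProcArrayLock);
    LWLockRelease(XidGenLock);

    /*
     * 2) Ask the walsender for the publisher's current WAL flush
     * position, e.g. via a new feedback/keepalive variant
     * (hypothetical helper).
     */
    candidate_remote_wal_lsn = request_publisher_flush_lsn();

    /*
     * 3) Keep applying changes; once everything up to
     * candidate_remote_wal_lsn has been applied locally, it is safe
     * to bump slot.xmin (both helpers are hypothetical).
     */
    wait_for_apply_to_reach(candidate_remote_wal_lsn);
    advance_slot_xmin(candidate_xmin);
}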

As each apply worker needs a separate slot to retain deleted rows, the
number of required slots will increase. The other possibility is to
maintain one slot via the launcher or some other central process that
traverses all subscriptions and remembers the ones marked with
retain_dead_tuples (let's call this list retain_sub_list). Then, using
the running transactions data (GetRunningTransactionData()), get the
oldest running xact, get the remote flush location from the other node
(the publisher), and store those as candidate values (candidate_xmin
and candidate_remote_wal_lsn) in the slot. We can probably reuse the
slot's existing candidate variables. Next, we can check the remote
flush locations recorded by all the origins corresponding to
retain_sub_list, and if all of them are ahead of
candidate_remote_wal_lsn, we can update the slot's xmin to
candidate_xmin.
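
Something like the following is roughly what I have in mind for the
launcher-side check (a sketch only; advance_slot_xmin() is a hypothetical
helper for the actual slot update, and the origin lookup relies on the
"pg_<suboid>" naming of subscription origins):

#include "postgres.h"
#include "access/xlogdefs.h"
#include "catalog/pg_subscription.h"
#include "nodes/pg_list.h"
#include "replication/origin.h"

/*
 * Advance the shared slot's xmin once every origin belonging to
 * retain_sub_list has applied past candidate_remote_wal_lsn.
 */
static void
maybe_advance_shared_slot(List *retain_sub_list,
                          TransactionId candidate_xmin,
                          XLogRecPtr candidate_remote_wal_lsn)
{
    ListCell   *lc;

    foreach(lc, retain_sub_list)
    {
        Subscription *sub = (Subscription *) lfirst(lc);
        char        originname[NAMEDATALEN];
        RepOriginId originid;
        XLogRecPtr  remote_flush;

        /* Subscription origins are named "pg_<suboid>". */
        snprintf(originname, sizeof(originname), "pg_%u", sub->oid);
        originid = replorigin_by_name(originname, true);
        if (originid == InvalidRepOriginId)
            return;             /* origin not created yet, retry later */

        remote_flush = replorigin_get_progress(originid, false);
        if (remote_flush < candidate_remote_wal_lsn)
            return;             /* this subscription has not caught up */
    }

    /*
     * All origins are past the candidate remote flush position, so the
     * slot's xmin can be moved up to candidate_xmin.
     */
    advance_slot_xmin(candidate_xmin);
}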

I think in the above idea we can add an optimization to combine the
requests for the remote WAL LSN from different subscriptions pointing
to the same node, so as to avoid sending multiple requests to that
node. I am not sure if using pg_subscription.subconninfo is sufficient
for this; if not, we can probably leave out this optimization.
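
If subconninfo does turn out to be usable as the key, the grouping itself
could be as simple as keeping one representative subscription per distinct
connection string before sending the requests, e.g.:

#include "postgres.h"
#include "catalog/pg_subscription.h"
#include "nodes/pg_list.h"

/*
 * Pick one representative subscription per distinct conninfo so that
 * only one remote-flush-LSN request is sent per publisher node.
 */
static List *
dedupe_by_conninfo(List *retain_sub_list)
{
    List       *reps = NIL;
    ListCell   *lc;

    foreach(lc, retain_sub_list)
    {
        Subscription *sub = (Subscription *) lfirst(lc);
        ListCell   *lc2;
        bool        found = false;

        foreach(lc2, reps)
        {
            Subscription *rep = (Subscription *) lfirst(lc2);

            if (strcmp(rep->conninfo, sub->conninfo) == 0)
            {
                found = true;
                break;
            }
        }

        if (!found)
            reps = lappend(reps, sub);
    }

    return reps;
}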

If this idea is feasible, it would reduce the number of slots required
to retain the deleted rows, but the launcher needs to get the remote
WAL location corresponding to each publisher node. There are two ways
to achieve that: (a) the launcher requests one of the apply workers
corresponding to subscriptions pointing to the same publisher node to
get this information; (b) the launcher launches another worker to get
the remote WAL flush location.
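
For (a), the handshake could be as small as a per-worker request flag plus
a latch wakeup; in the sketch below, request_remote_flush is a hypothetical
addition to LogicalRepWorker's shared state, and relmutex is reused only
for brevity:

#include "postgres.h"
#include "replication/worker_internal.h"
#include "storage/latch.h"
#include "storage/proc.h"
#include "storage/spin.h"

/*
 * Launcher-side sketch of option (a): ask one apply worker per
 * publisher node to fetch and report the remote WAL flush position.
 */
static void
ask_worker_for_remote_flush(LogicalRepWorker *worker)
{
    SpinLockAcquire(&worker->relmutex);
    worker->request_remote_flush = true;    /* hypothetical field */
    SpinLockRelease(&worker->relmutex);

    /* Wake the worker so it notices the request promptly. */
    if (worker->proc)
        SetLatch(&worker->proc->procLatch);
}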

--
With Regards,
Amit Kapila.
