RE: Fix slot synchronization with two_phase decoding enabled

From: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>
To: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: RE: Fix slot synchronization with two_phase decoding enabled
Date: 2025-04-09 05:14:42
Message-ID: OS3PR01MB5718261201D7E842EC28326794B42@OS3PR01MB5718.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Apr 3, 2025 at 3:16 PM Zhijie Hou (Fujitsu) wrote:
>
> On Thu, Apr 3, 2025 at 1:38 PM Masahiko Sawada wrote:
>
> >
> > On Wed, Apr 2, 2025 at 7:58 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
> > wrote:
> > >
> > > On Thu, Apr 3, 2025 at 7:50 AM Zhijie Hou (Fujitsu)
> > > <houzj(dot)fnst(at)fujitsu(dot)com> wrote:
> > > >
> > > > On Thu, Apr 3, 2025 at 3:30 AM Masahiko Sawada wrote:
> > > >
> > > > >
> > > > > On Wed, Apr 2, 2025 at 6:33 AM Zhijie Hou (Fujitsu)
> > > > > <houzj(dot)fnst(at)fujitsu(dot)com> wrote:
> > > > >
> > > > > Thank you for the explanation! I agree that the issue happens in
> > > > > these
> > cases.
> > > > >
> > > > > As another idea, I wonder if we could somehow defer to make the
> > > > > synced slot as 'sync-ready' until we can ensure that the slot
> > > > > doesn't have any transactions that are prepared before the point
> > > > > of enabling two_phase. For example, when the slotsync worker
> > > > > fetches the remote slot, it remembers the confirmed_flush_lsn
> > > > > (say
> > > > > LSN-1) if the local slot's two_phase becomes true or the local
> > > > > slot is newly created with enabling two_phase, and then it makes
> > > > > the slot 'sync-ready' once it confirmed that the slot's
> > > > > restart_lsn passed
> > LSN-1. Does it work?
> > > >
> > > > Thanks for the idea!
> > > >
> > > > We considered a similar approach in [1] to confirm there is no
> > > > prepared transactions before two_phase_at, but the issue is that
> > > > when the two_phase flag is switched from 'false' to 'true' (as in
> > > > the case with (copy_data=true, failover=true, two_phase=true)). In
> > > > this case, the slot may have already been marked as sync-ready
> > > > before the two_phase flag is enabled, as slotsync is unaware of
> > > > potential
> > future changes to the two_phase flag.
> > >
> > > This can happen because when copy_data is true, tablesync can take a
> > > long time to complete the sync and in the meantime, slot without a
> > > two_phase flag would have been synced to standby. Such a slot would
> > > be marked as sync-ready even if we follow the calculation proposed
> > > by Sawada-san. Note that we enable two_phase once all the tables are
> > > in ready state (See run_apply_worker() and comments atop worker.c
> > > (TWO_PHASE TRANSACTIONS)).
> >
> > Right. It doesn't make sense to make the slot not-sync-ready and then
> > back to sync-ready.
> >
> > While I agree with the approach for HEAD and it seems difficult to
> > find a solution, I'm concerned that disallowing to use both failover
> > and two_phase in a minor release would affect users much. Users who
> > are already using that combination might end up needing to re-think
> > their system architecture. So I'm trying to narrow down use cases
> > where we're going to prohibit or to find workarounds.

We analyzed more for the backbranch fix, and here is a summary of different
approaches that could be used for PG17.

----------
Approach 1
----------

This is the original approach implemented in V4 patch.

In this approach, we entirely disallow enabling both failover and the two-phase
feature together for a replication slot or subscription.

pros:

This restriction is simple to implement and easy for users to comprehend.

cons:

Users would be unable to use the two-phase feature in conjunction with
failover.

Following the upgrade to a new release with this fix, existing subscriptions
that have both failover and two-phase enabled would require manual re-creation,
which is time-consuming.

----------
Approach 2
----------

Instead of disallowing the use of two-phase and failover together, a more
flexible strategy could be only restrict failover for slots with two-phase
enabled when there's a possibility of existing prepared transactions before the
two_phase_at that are not yet replicated. During slot creation with two-phase
and failover, we could check for any decoded prepared transactions when
determining the decoding start point (DecodingContextFindStartpoint). For
subsequent attempts to alter failover to true, we ensure that two_phase_at is
less than restart_lsn, indicating that all prepared transactions have been
committed and replicated, thus the bug would not happen.

pros:

This method minimizes restrictions for users. Especially during slot creation
with (two_phase=on, failover=on), as it’s uncommon for transactions to prepare
during consistent snapshot creation, the restriction becomes almost
unnoticeable.

Users are not always forced to re-create subscriptions post-upgrade.

cons:

The logic involved for (restart_lsn > two_phase_at) might be less intuitive for
users.

After upgrading, it's recommended that users verify whether two_phase_at for a
source slot is less than restart_lsn of the synced slot. If it isn't, users
should be careful when using the synced slot on a standby. This might be
complex to understand.

----------
Approach 3
----------

This approach is similar to Approach 2 but simplifies some aspects by avoiding
checks for prepared transactions during slot creation. It disallows enabling
both failover and twophase during slot creation, but permits altering failover
to true once ensured that no prepared transaction exists (restart_lsn >
two_phase_at).

pros:

Like Approach 2, this reduces user restrictions compared to completely
disallowing failover and two-phase usage together.

Users are not always forced to re-create subscriptions post-upgrade.

cons:

User still could not enable failover and twophase when creating a slot.

Similar to approach 1, the check (restart_lsn > two_phase_at) might be less
intuitive for users.

After upgrading, it's recommended that users verify whether two_phase_at for a
source slot is less than restart_lsn of the synced slot. If it isn't, users
should be careful when using the synced slot on a standby. This might be
complex to understand.

Best Regards,
Hou zj

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message jian he 2025-04-09 05:16:20 Re: Support NOT VALID / VALIDATE constraint options for named NOT NULL constraints
Previous Message Peter Smith 2025-04-09 05:11:51 Re: Enhance 'pg_createsubscriber' to retrieve databases automatically when no database is provided.