RE: Fix slot synchronization with two_phase decoding enabled

From: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: RE: Fix slot synchronization with two_phase decoding enabled
Date: 2025-03-31 11:34:15
Message-ID: OS0PR01MB57161D9BB5409F229564957994AD2@OS0PR01MB5716.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Mar 27, 2025 at 2:29 PM Amit Kapila wrote:

>
> On Tue, Mar 25, 2025 at 12:1 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
> wrote:
> >
> > On Tue, Mar 25, 2025 at 11:05 AM Zhijie Hou (Fujitsu)
> > <houzj(dot)fnst(at)fujitsu(dot)com> wrote:
> > >
> > > Hi,
> > >
> > > When testing the slot synchronization with logical replication slots that
> > > enabled two_phase decoding, I found that transactions prepared before
> two-phase
> > > decoding is enabled may fail to replicate to the subscriber after being
> > > committed on a promoted standby following a failover.
> > >
> > > To reproduce this issue, please follow these steps (also detailed in the
> > > attached TAP test, v1-0001):
> > >
> > > 1. sub: create a subscription with (two_phase = false)
> > > 2. primary (pub): prepare a txn A.
> > > 3. sub: alter subscription set (two_phase = true) and wait for the logical
> slot to
> > > be synced to standby.
> > > 4. primary (pub): stop primary, promote the standby and let the subscriber
> use
> > > the promoted standby as publisher.
> > > 5. promoted standby (pub): COMMIT PREPARED A;
> > > 6. sub: the apply worker will report the following ERROR because it didn't
> > > receive the PREPARE.
> > > ERROR: prepared transaction with identifier "pg_gid_16387_752"
> does not exist
> > >
> > > I think the root cause of this issue is that the two_phase_at field of the
> > > slot, which indicates the LSN from which two-phase decoding is enabled
> (used to
> > > prevent duplicate data transmission for prepared transactions), is not
> > > synchronized to the standby server.
> > >
> > > In step 3, transaction A is not immediately replicated because it occurred
> > > before enabling two-phase decoding. Thus, the prepared transaction
> should only
> > > be replicated after decoding the final COMMIT PREPARED, as referenced
> in
> > > ReorderBufferFinishPrepared(). However, due to the invalid two_phase_at
> on the
> > > standby, the prepared transaction fails to send at that time.
> > >
> > > This problem arises after the support for altering the two-phase option
> > > (1462aad).
> > >
>
> I suspect that this can happen in PG17 as well, but I need to think
> more about it to make a reproducible test case.

After further analysis, I was able to reproduce the same issue [1] in
PG 17.

However, since the proposed fix requires catalog changes and the issue is not a
security risk significant enough to justify changing the catalog in back
branches, we cannot back-patch the same solution. Following off-list
discussions with Amit and Kuroda-san, we are considering disallowing enabling
failover and two-phase decoding together for a replication slot, as suggested
in attachment 0002.

Another idea considered is to prevent the slot that enables two-phase decoding
from being synced to standby. IOW, this means displaying the failover field as
false in the view, if there is any possibility that transactions prepared
before the two_phase_at position exist (e.g., if restart_lsn is less than
two_phase_at). However, implementing this change would require additional
explanations to users for this new behavior, which seems tricky.

>
> In the meantime, I have a few minor comments on the proposed patches:
> 1.
> ##################################################
> # Promote the standby1 to primary. Confirm that:
> # a) the slot 'lsub1_slot' and 'snap_test_slot' are retained on the new primary
> # b) logical replication for regress_mysub1 is resumed successfully
> after failover
> # c) changes can be consumed from the synced slot 'snap_test_slot'
> ##################################################
> -$standby1->start;
> $primary->wait_for_replay_catchup($standby1);
>
> # Capture the time before the standby is promoted
> @@ -885,6 +940,15 @@ $standby1->wait_for_catchup('regress_mysub1');
> is($subscriber1->safe_psql('postgres', q{SELECT count(*) FROM tab_int;}),
> "20", 'data replicated from the new primary');
>
> +# Commit the prepared transaction
> +$standby1->safe_psql('postgres',
> + "COMMIT PREPARED 'test_twophase_slotsync';");
> +$standby1->wait_for_catchup('regress_mysub1');
> +
> +# Confirm that the prepared transaction is replicated to the subscriber
> +is($subscriber1->safe_psql('postgres', q{SELECT count(*) FROM tab_int;}),
> + "21", 'prepared data replicated from the new primary');
>
> The commentary above this test should include information about
> verifying the replication of previously prepared transactions after
> promotion. Also, it would be better if confirm the commit prepared
> before confirming the new Insert is replicated after promotion.
>
> 2.
> @@ -249,6 +250,7 @@ update_local_synced_slot(RemoteSlot *remote_slot,
> Oid remote_dbid,
> SpinLockAcquire(&slot->mutex);
> slot->data.restart_lsn = remote_slot->restart_lsn;
> slot->data.confirmed_flush = remote_slot->confirmed_lsn;
> + slot->data.two_phase_at = remote_slot->two_phase_at;
>
> Why do we need to update the two_phase_at here when the patch does it
> later in this function when local and remote values don't match?

Thanks for the comments, they have been addressed in V2.

[1]
- pub: created a slot 'sub' with two_phase=false, then prepared a transaction
- pub: after some activity, advanced the confirmed_flush_lsn of 'sub', so it is
greater than prepared txn lsn.
- sub: create subscription with (slot_name='sub', create_slot=false, failover =
true, two_phase=true, copy_data=false); two_phase_at will be set to the same
as confirmed_flush_lsn which is greater than the prepared transaction.
- stop the primary and promote the standby.
- commit the prepared transaction on standby, the following error will be
reported on subscriber:

LOG: logical replication apply worker for subscription "sub2" has started
ERROR: prepared transaction with identifier "pg_gid_16398_764" does not exist.

Best Regards,
Hou zj

Attachment Content-Type Size
v2-0001-Fix-slot-synchronization-with-two_phase-decoding-.patch application/octet-stream 14.1 KB
v2-0001-PG17-Disallow-enabling-failover-for-a-replication-slot.patch.txt text/plain 7.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ashutosh Bapat 2025-03-31 11:37:15 Re: Test to dump and restore objects left behind by regression
Previous Message Yugo Nagata 2025-03-31 11:22:35 Re: Prevent internal error at concurrent CREATE OR REPLACE FUNCTION