From: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
---|---|
To: | "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Fix slot synchronization with two_phase decoding enabled |
Date: | 2025-03-25 06:44:53 |
Message-ID: | CAA4eK1+Row5XWDbOCTgd4_s=eaqXAL7iXDFQkAinuJFqOTt46A@mail.gmail.com |
Lists: | pgsql-hackers |
On Tue, Mar 25, 2025 at 11:05 AM Zhijie Hou (Fujitsu)
<houzj(dot)fnst(at)fujitsu(dot)com> wrote:
>
> Hi,
>
> When testing slot synchronization with logical replication slots that have
> two_phase decoding enabled, I found that transactions prepared before two-phase
> decoding was enabled may fail to replicate to the subscriber after being
> committed on a promoted standby following a failover.
>
> To reproduce this issue, please follow these steps (also detailed in the
> attached TAP test, v1-0001):
>
> 1. sub: create a subscription with (two_phase = false)
> 2. primary (pub): prepare a txn A.
> 3. sub: alter subscription set (two_phase = true) and wait for the logical slot to
> be synced to standby.
> 4. primary (pub): stop primary, promote the standby and let the subscriber use
> the promoted standby as publisher.
> 5. promoted standby (pub): COMMIT PREPARED A;
> 6. sub: the apply worker will report the following ERROR because it didn't
> receive the PREPARE.
> ERROR: prepared transaction with identifier "pg_gid_16387_752" does not exist
>
> I think the root cause of this issue is that the two_phase_at field of the
> slot, which indicates the LSN from which two-phase decoding is enabled (used to
> prevent duplicate data transmission for prepared transactions), is not
> synchronized to the standby server.
>
> In step 3, transaction A is not immediately replicated because it was prepared
> before two-phase decoding was enabled. Thus, the prepared transaction should
> only be replicated when the final COMMIT PREPARED is decoded, as referenced in
> ReorderBufferFinishPrepared(). However, because two_phase_at is invalid on the
> standby, the prepared transaction is not sent at that time.
>
> This problem has existed since support for altering the two_phase option was
> added (1462aad).
>
Thanks for the report and patch. I'll look into it.
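
For anyone following the thread, the reasoning quoted above can be illustrated
with a toy model. This is only a sketch, not PostgreSQL code: it assumes the
check made while decoding COMMIT PREPARED boils down to comparing the
transaction's prepare LSN against the slot's two_phase_at, and the function
name, the integer LSNs, and INVALID_LSN = 0 are all made up for illustration.

```python
# Toy model of the two_phase_at check described in the quoted report.
# Assumption: LSNs are plain integers and an unset two_phase_at is 0.
INVALID_LSN = 0


def should_send_prepare_at_commit(prepare_lsn: int, two_phase_at: int) -> bool:
    """Return True if decoding COMMIT PREPARED should first send the PREPARE.

    Hypothetical helper: a transaction prepared before two-phase decoding
    was enabled (prepare_lsn < two_phase_at) was skipped at PREPARE time,
    so it has to be sent when its COMMIT PREPARED is decoded.
    """
    return prepare_lsn < two_phase_at


# Transaction A is prepared (step 2) before two_phase is enabled (step 3).
prepare_lsn_a = 100          # illustrative value
primary_two_phase_at = 200   # set on the primary when two_phase was enabled
standby_two_phase_at = INVALID_LSN  # not synchronized to the standby

sent_by_primary = should_send_prepare_at_commit(prepare_lsn_a, primary_two_phase_at)
sent_by_standby = should_send_prepare_at_commit(prepare_lsn_a, standby_two_phase_at)
```

Under those assumptions, the slot on the primary would send the PREPARE for
transaction A together with its COMMIT PREPARED, while the synced slot on the
promoted standby would not, which is consistent with the error in step 6.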
--
With Regards,
Amit Kapila.