RE: Fix slot synchronization with two_phase decoding enabled

From: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>
Subject: RE: Fix slot synchronization with two_phase decoding enabled
Date: 2025-04-22 04:00:01
Message-ID: OS3PR01MB5718ED3F0123737C7380BBAD94BB2@OS3PR01MB5718.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Apr 22, 2025 at 11:23 AM Amit Kapila wrote:

>
> On Fri, Apr 18, 2025 at 9:58 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > On Thu, Apr 17, 2025 at 6:14 PM Zhijie Hou (Fujitsu)
> > <houzj(dot)fnst(at)fujitsu(dot)com> wrote:
> > >
> > > -----
> > > Fix
> > > -----
> > >
> > > I think we should keep the confirmed_flush even if the previous
> > > synced restart_lsn/catalog_xmin is newer. Attachments include a patch
> for the same.
> > >
> >
> > This will fix the case we are facing but adds a new rule for slot
> > synchronization. Can we think of a simpler way to fix this by avoiding
> > updating other slot fields (like two_phase, two_phase_at) if
> > restart_lsn or catalog_xmin of the local slot is ahead of the remote
> > slot?
> >
>
> Thinking more about this problem, it seems to me that if the catalog_xmin of
> synced slot is allowed to be ahead than the remote_slot when there is still an
> open (prepared transaction), it could cause data loss. I mean that after the
> promotion, some of the required catalog rows could be removed, and decoding
> corresponding changes (changes from tables affected by DDL) could give
> unexpected results. Those would be protected on primary/publisher because
> the catalog_xmin on it was still accurate and behind. If this theory turns out to
> be true, then this is a drawback/bug of the existing fast_forward mode code.

I agree that could be a problem.

Upon further analysis, I find that the core problem is we do not build a base
snapshot when decoding changes during fast forward mode, preventing reference
to the minimum transaction ID that remains visible in the snapshot when
determining the candidate for catalog_xmin. As a result, catalog_xmin was
directly advanced to the oldest running transaction ID found in the latest
running_xacts record.

In code-level, SnapBuildProcessChange -> ReorderBufferSetBaseSnapshot could not
be reached during fast forward decoding, resulting in
rb->txns_by_base_snapshot_lsn being NULL. When advancing catalog_xmin, the
system attempts to refer to rb->txns_by_base_snapshot_lsn via
ReorderBufferGetOldestXmin(). However, since rb->txns_by_base_snapshot_lsn is
NULL, it defaults to directly using running->oldestRunningXid as the candidate
for catalog_xmin. For reference, see the implementation details in
SnapBuildProcessRunningXacts.

I think this is a general issue in fast forward decoding, which not only affect
slotsync. I can reproduce the issue using slot_advance SQL API as well. See the
attachment for a patch that includes a test to prove that the catalog data that
are still required would be removed after premature catalog_xmin advancement
during fast forward decoding. If you test that patch on HEAD, you would find
that the output missed a column due to vacuum removal.

I will start a new thread to report and fix this general.

Best Regards,
Hou zj

Attachment Content-Type Size
0001-Add-a-test.patch application/octet-stream 3.3 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Hayato Kuroda (Fujitsu) 2025-04-22 04:02:21 RE: doc patch: clarify the naming rule for injection_points
Previous Message Peter Smith 2025-04-22 03:52:41 Re: DOCS: Make the Server Application docs synopses more consistent