From: | "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com> |
---|---|
To: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
Cc: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com> |
Subject: | RE: Fix slot synchronization with two_phase decoding enabled |
Date: | 2025-04-22 04:00:01 |
Message-ID: | OS3PR01MB5718ED3F0123737C7380BBAD94BB2@OS3PR01MB5718.jpnprd01.prod.outlook.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Tue, Apr 22, 2025 at 11:23 AM Amit Kapila wrote:
>
> On Fri, Apr 18, 2025 at 9:58 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > On Thu, Apr 17, 2025 at 6:14 PM Zhijie Hou (Fujitsu)
> > <houzj(dot)fnst(at)fujitsu(dot)com> wrote:
> > >
> > > -----
> > > Fix
> > > -----
> > >
> > > I think we should keep the confirmed_flush even if the previous
> > > synced restart_lsn/catalog_xmin is newer. Attachments include a patch
> for the same.
> > >
> >
> > This will fix the case we are facing but adds a new rule for slot
> > synchronization. Can we think of a simpler way to fix this by avoiding
> > updating other slot fields (like two_phase, two_phase_at) if
> > restart_lsn or catalog_xmin of the local slot is ahead of the remote
> > slot?
> >
>
> Thinking more about this problem, it seems to me that if the catalog_xmin of
> synced slot is allowed to be ahead than the remote_slot when there is still an
> open (prepared transaction), it could cause data loss. I mean that after the
> promotion, some of the required catalog rows could be removed, and decoding
> corresponding changes (changes from tables affected by DDL) could give
> unexpected results. Those would be protected on primary/publisher because
> the catalog_xmin on it was still accurate and behind. If this theory turns out to
> be true, then this is a drawback/bug of the existing fast_forward mode code.
I agree that could be a problem.
Upon further analysis, I find that the core problem is we do not build a base
snapshot when decoding changes during fast forward mode, preventing reference
to the minimum transaction ID that remains visible in the snapshot when
determining the candidate for catalog_xmin. As a result, catalog_xmin was
directly advanced to the oldest running transaction ID found in the latest
running_xacts record.
In code-level, SnapBuildProcessChange -> ReorderBufferSetBaseSnapshot could not
be reached during fast forward decoding, resulting in
rb->txns_by_base_snapshot_lsn being NULL. When advancing catalog_xmin, the
system attempts to refer to rb->txns_by_base_snapshot_lsn via
ReorderBufferGetOldestXmin(). However, since rb->txns_by_base_snapshot_lsn is
NULL, it defaults to directly using running->oldestRunningXid as the candidate
for catalog_xmin. For reference, see the implementation details in
SnapBuildProcessRunningXacts.
I think this is a general issue in fast forward decoding, which not only affect
slotsync. I can reproduce the issue using slot_advance SQL API as well. See the
attachment for a patch that includes a test to prove that the catalog data that
are still required would be removed after premature catalog_xmin advancement
during fast forward decoding. If you test that patch on HEAD, you would find
that the output missed a column due to vacuum removal.
I will start a new thread to report and fix this general.
Best Regards,
Hou zj
Attachment | Content-Type | Size |
---|---|---|
0001-Add-a-test.patch | application/octet-stream | 3.3 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Hayato Kuroda (Fujitsu) | 2025-04-22 04:02:21 | RE: doc patch: clarify the naming rule for injection_points |
Previous Message | Peter Smith | 2025-04-22 03:52:41 | Re: DOCS: Make the Server Application docs synopses more consistent |