Re: Conflict detection for update_deleted in logical replication

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>
Cc: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Conflict detection for update_deleted in logical replication
Date: 2025-01-07 07:05:06
Message-ID: CAD21AoB6RaXZAQLjg_iDuXPPMBmXx4JGnpAS6jo7v+8WSe6D3A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Jan 6, 2025 at 10:40 PM Zhijie Hou (Fujitsu)
<houzj(dot)fnst(at)fujitsu(dot)com> wrote:
>
> On Tuesday, January 7, 2025 2:00 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> Hi,
>
> >
> > On Mon, Jan 6, 2025 at 3:22 AM Zhijie Hou (Fujitsu) <houzj(dot)fnst(at)fujitsu(dot)com>
> > wrote:
> > >
> > > On Friday, January 3, 2025 2:36 PM Masahiko Sawada
> > <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > >
> > > >
> > > > I have one comment on the 0001 patch:
> > >
> > > Thanks for the comments!
> > >
> > > >
> > > > + /*
> > > > + * The changes made by this and later transactions are still
> > > > non-removable
> > > > + * to allow for the detection of update_deleted conflicts
> > > > + when
> > > > applying
> > > > + * changes in this logical replication worker.
> > > > + *
> > > > + * Note that this info cannot directly protect dead tuples from
> > being
> > > > + * prematurely frozen or removed. The logical replication launcher
> > > > + * asynchronously collects this info to determine whether to
> > > > + advance
> > > > the
> > > > + * xmin value of the replication slot.
> > > > + *
> > > > + * Therefore, FullTransactionId that includes both the
> > > > transaction ID and
> > > > + * its epoch is used here instead of a single Transaction ID. This is
> > > > + * critical because without considering the epoch, the transaction
> > ID
> > > > + * alone may appear as if it is in the future due to transaction ID
> > > > + * wraparound.
> > > > + */
> > > > + FullTransactionId oldest_nonremovable_xid;
> > > >
> > > > The last paragraph of the comment mentions that we need to use
> > > > FullTransactionId to properly compare XIDs even after the XID
> > > > wraparound happens. But once we set the oldest-nonremovable-xid it
> > > > prevents XIDs from being wraparound, no? I mean that workers'
> > > > oldest-nonremovable-xid values and slot's non-removal-xid (i.e., its
> > > > xmin) are never away from more than 2^31 XIDs.
> > >
> > > I think the issue is that the launcher may create the replication slot
> > > after the apply worker has already set the 'oldest_nonremovable_xid'
> > > because the launcher are doing that asynchronously. So, Before the
> > > slot is created, there's a window where transaction IDs might wrap
> > > around. If initially the apply worker has computed a candidate_xid
> > > (755) and the xid wraparound before the launcher creates the slot,
> > > causing the new current xid to be (740), then the old
> > > candidate_xid(755) looks like a xid in the future, and the launcher
> > > could advance the xmin to 755 which cause the dead tuples to be removed
> > prematurely.
> > > (We are trying to reproduce this to ensure that it's a real issue and
> > > will share after finishing)
> >
> > The slot's first xmin is calculated by
> > GetOldestSafeDecodingTransactionId(false). The initial computed
> > cancidate_xid could be newer than this xid?
>
> I think the issue occurs when the slot is created after an XID wraparound. As a
> result, GetOldestSafeDecodingTransactionId() returns the current XID
> (after wraparound), which appears older than the computed candidate_xid (e.g.,
> oldest_nonremovable_xid). Nisha has shared detailed steps to reproduce the
> issue in [1]. What do you think ?

I agree that the scenario Nisha shared could happen with the current
patch. On the other hand, I think that if slot's initial xmin is
always newer than or equal to the initial computed non-removable-xid
(i.e., the oldest of workers' oldest_nonremovable_xid values), we can
always use slot's first xmin. And I think it might be true while I'm
concerned the fact that worker's oldest_nonremoable_xid and the slot's
initial xmin is calculated differently (GetOldestActiveTransactionId()
and GetOldestSafeDecodingTransactionId(), respectively). That way,
subsequent comparisons between slot's xmin and computed candidate_xid
won't need to take care of the epoch. IOW, the worker's
non-removable-xid values effectively are not used until the slot is
created.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2025-01-07 07:34:39 Re: Use Python "Limited API" in PL/Python
Previous Message Tatsuo Ishii 2025-01-07 06:57:57 Re: Proposal: add new API to stringinfo