Re: Potential data loss due to race condition during logical replication slot creation

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, "Callahan, Drew" <callaan(at)amazon(dot)com>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: Potential data loss due to race condition during logical replication slot creation
Date: 2024-07-10 01:15:37
Message-ID: CAD21AoBij9Pg7-mjUzvj2kLSXoK7BNGqmPfAeAvYj+-TJZho+A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On Tue, Jul 9, 2024 at 6:27 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Tue, Jul 9, 2024 at 11:00 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
> > On Mon, Jul 8, 2024 at 11:02 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > >
> > > On Mon, Jul 8, 2024 at 6:01 PM Hayato Kuroda (Fujitsu)
> > > <kuroda(dot)hayato(at)fujitsu(dot)com> wrote:
> > > >
> > > > Dear Sawada-san,
> > > >
> > > > Thanks for creating the patch!
> > > >
> > > > > I've attached updated patches for HEAD and pg17 for now (I will create
> > > > > the patch for other backbranches).
> > > > >
> > > > > In the patches, I used a different approach in between HEAD and
> > > > > backbranches. On HEAD, we store a flag indicating whether or not we
> > > > > should skip snapshot restores into the SnapBuild struct and set it
> > > > > only while finding the start point. Therefore we have to bump
> > > > > SNAPBUILD_VERSION. On backbranches, I used the approach we discussed
> > > > > above; store the flag in LogicalDecodingContext and set it when
> > > > > creating the LogicalDecodingContext for a new logical slot. A possible
> > > > > downside of the approach taken for backbranches is that we implicitly
> > > > > require for users to use the same LogicalDecodingContext for both
> > > > > initializing the context for a new slot and finding its start point.
> > > > > IIUC it was not strictly required. This restriction would not be a
> > > > > problem at least in the core, but I'm not sure if there are no
> > > > > external extensions that create a logical slot in that way. This is
> > > > > the main reason why I used a different approach on HEAD and
> > > > > backbranches. Therefore, if it does not matter, I think we can use the
> > > > > same approach on all branches, which is better in terms of
> > > > > maintainability.
> > > >
> > > > I want to confirm your point. You meant that you wanted to unify appraoches,
> > > > especially you wanted to store the flag in LogicalDecodingContext, rigth?
> > >
> > > Yes. Ideally I'd like to use the same approach in all branches
> > > regardless of how to fix it for better maintainability.
> > >
>
> The difference is minor so using slightly different approaches should
> be okay. In the ideal case, I agree that using the same approach makes
> sense but for future extendability, it is better to keep this new
> variable in SnapBuild at least in HEAD and PG17.
>
> >
> > I've attached the new version patches for all branches.
> >
>
> Few comments:
> 1.
> @@ -650,6 +650,9 @@ DecodingContextFindStartpoint(LogicalDecodingContext *ctx)
> {
> ReplicationSlot *slot = ctx->slot;
>
> + /* Let snapshot builder start to find the start point */
> + SnapBuildSetFindStartPoint(ctx->snapshot_builder, true);
> +
> /* Initialize from where to start reading WAL. */
> XLogBeginRead(ctx->reader, slot->data.restart_lsn);
>
> @@ -683,6 +686,9 @@ DecodingContextFindStartpoint(LogicalDecodingContext *ctx)
> if (slot->data.two_phase)
> slot->data.two_phase_at = ctx->reader->EndRecPtr;
> SpinLockRelease(&slot->mutex);
> +
> + /* Complete to find the start point */
> + SnapBuildSetFindStartPoint(ctx->snapshot_builder, false);
>
> I wonder why you didn't choose to set this variable in
> AllocateSnapshotBuilder()? If we do so, then we may not need set/reset
> in DecodingContextFindStartpoint(). The one advantage of using
> set/reset for the minimal time as done here is that we can avoid any
> impact of this new variable but I still feel setting it in
> AllocateSnapshotBuilder() seems more natural.

Initially what I thought was; if we set the flag in
AllocateSnapshotBuilder(), we need to pass true through like
CreateInitDecodingContext() -> StartupDecodingContext() ->
AllocateSnapshotBuilder(), meaning that the only
LogicalDecodingContext created via CreateInitDecodingContext() can be
used in DecodingContextFindStartpoint(). IOW if we use the
LogicalDecodingContext created via CreateDecodingContext() (i.e.,
setting the flag as false) in DecodingContextFindStartpoint() we would
end up having the same problem. IIUC we haven't had such a usage
restriction before. But thinking on this approach further, probably
the same is true for initial_xmin_horizon. The field is set only when
the LogicalDecodingContext is created via CreateInitDecodingContext(),
and is used only while finding the start point. So I'm fine with
setting the flag in AllocateSnapshotBuilder(). If we go this approach,
I think we should check if the flag is set before finding the start
point.

>
> 2.
> Since in this case
> + * the restart LSN could be in the middle of transactions we need to
> + * find the start point where we won't see the data for partial
> + * transactions.
>
> There is a connecting word missing between *transactions* and *we*.
> Can we use a slightly different wording like: "Can't use this method
> while finding the start point for decoding changes as the restart LSN
> would be an arbitrary LSN but we need to find the start point to
> extract changes where we won't see the data for partial
> transactions."?

Looks good. Will fix.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Masahiko Sawada 2024-07-10 02:21:48 Re: Potential data loss due to race condition during logical replication slot creation
Previous Message Michael Paquier 2024-07-09 23:12:29 Re: pg_rewind fails on Windows where tablespaces are used