Quick Links

Re: Potential data loss due to race condition during logical replication slot creation

From:	Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To:	Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc:	"Callahan, Drew" <callaan(at)amazon(dot)com>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject:	Re: Potential data loss due to race condition during logical replication slot creation
Date:	2024-07-05 14:52:52
Message-ID:	CAD21AoCS7Xfswo0v_jxZqJ97ndq=59s4Z2GVY8YjQZU-t6198w@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-bugs

On Wed, Jun 26, 2024 at 6:17 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Tue, Jun 25, 2024 at 1:56 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
> > On Tue, Jun 25, 2024 at 1:24 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > >
> > > On Mon, Jun 24, 2024 at 10:32 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > > >
> > > > On Mon, Jun 24, 2024 at 12:54 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > > > >
> > > > > On Fri, Jun 21, 2024 at 12:16 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > > > > > >
> > > > > > > The approach (a) has a downside, it will lead to tracking more
> > > > > > > transactions (non-catalog) than required without any benefit for the
> > > > > > > user. Considering that is true, I wouldn't prefer that approach.
> > > > > >
> > > > > > Yes, it will lead to tracking non-catalog-change transactions as well.
> > > > > > If there are many subtransactions, the overhead could be noticeable.
> > > > > > But it happens only once when creating a slot.
> > > > > >
> > > > >
> > > > > True, but it doesn't seem advisable to add such an overhead even
> > > > > during create time without any concrete reason.
> > > > >
> > > > > > Another variant of (a) is that we skip snapshot restores if the
> > > > > > initial_xmin_hirizon is a valid transaction id. The
> > > > > > initia_xmin_horizon is always set to a valida transaction id when
> > > > > > initializing the decoding context, e.g. during
> > > > > > CreateInitDecodingContext(). That way, we don't need to track
> > > > > > non-catalog-change transctions. A downside is that this approach
> > > > > > assumes that DecodingContextFindStartpoint() is called with the
> > > > > > decoding context created by CreateInitDecodingContxt(), which is true
> > > > > > in the core codes, but might not be true in third party extensions.
> > > > > >
> > > > >
> > > > > I think it is better to be explicit in this case rather than relying
> > > > > on initia_xmin_horizon. So, we can store in_create/create_in_progress
> > > > > flag in the Snapbuild in HEAD and store it in LogicalDecodingContext
> > > > > in back branches.
> > > >
> > > > I think we cannot access the flag in LogicalDecodingContext from
> > > > snapbuild.c at least in backbranches. I've discussed adding such a
> > > > flag in snapbuild.c as a global variable, but I'm slightly hesitant to
> > > > add a global variable besides InitialRunningXacts.
> > > >
> > >
> > > I agree that adding a global variable is not advisable. Can we pass
> > > the flag stored in LogicalDecodingContext to snapbuild.c?
> >
> > Ah, I found a good path: snapbuild->reorder->private_data (storing a
> > pointer to a LogicalDecodingContext). This assumes private_data always
> > stores a pointer to a LogicalDecodingContext but I think that's find
> > at least for backbranches.
> >
>
> +1. This approach looks good to me.
>

I've attached updated patches for HEAD and pg17 for now (I will create
the patch for other backbranches).

In the patches, I used a different approach in between HEAD and
backbranches. On HEAD, we store a flag indicating whether or not we
should skip snapshot restores into the SnapBuild struct and set it
only while finding the start point. Therefore we have to bump
SNAPBUILD_VERSION. On backbranches, I used the approach we discussed
above; store the flag in LogicalDecodingContext and set it when
creating the LogicalDecodingContext for a new logical slot. A possible
downside of the approach taken for backbranches is that we implicitly
require for users to use the same LogicalDecodingContext for both
initializing the context for a new slot and finding its start point.
IIUC it was not strictly required. This restriction would not be a
problem at least in the core, but I'm not sure if there are no
external extensions that create a logical slot in that way. This is
the main reason why I used a different approach on HEAD and
backbranches. Therefore, if it does not matter, I think we can use the
same approach on all branches, which is better in terms of
maintainability.

Also, I excluded the test case for the problem that Kuroda-san
reported[1] since the reported problem occurred due to the same cause
of the problem originally reported on this thread. The bug that
decodes only partial transactions could lead to various symptoms. IIUC
these test cases test the same behavior.

Regards,

[1] https://www.postgresql.org/message-id/TYCPR01MB1207719C811F580A8774C79B7F52A2%40TYCPR01MB12077.jpnprd01.prod.outlook.com

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachment	Content-Type	Size
HEAD_v1-0001-Fix-possibility-of-logical-decoding-partial-trans.patch	application/octet-stream	12.1 KB
rel17_v1-0001-Fix-possibility-of-logical-decoding-partial-trans.patch	application/octet-stream	11.7 KB

In response to

Re: Potential data loss due to race condition during logical replication slot creation at 2024-06-26 09:17:03 from Amit Kapila

Responses

Re: Potential data loss due to race condition during logical replication slot creation at 2024-07-07 23:56:42 from Michael Paquier
RE: Potential data loss due to race condition during logical replication slot creation at 2024-07-08 09:01:47 from Hayato Kuroda (Fujitsu)

Browse pgsql-bugs by date

	From	Date	Subject
Next Message	Tom Lane	2024-07-05 15:07:35	Re: Omitting relpages for toast table access not expected
Previous Message	Stefan Litsche	2024-07-05 14:41:41	Omitting relpages for toast table access not expected