From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: "Callahan, Drew" <callaan(at)amazon(dot)com>
Cc: pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: Potential data loss due to race condition during logical replication slot creation
Date: 2024-02-19 09:12:23
Message-ID: CAD21AoBJcRiVBT5_ys13wA9c-8zbbqQJsMiVELX3t+HREwJwbA@mail.gmail.com
Lists: pgsql-bugs
On Thu, Feb 8, 2024 at 2:54 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Sat, Feb 3, 2024 at 12:12 PM Callahan, Drew <callaan(at)amazon(dot)com> wrote:
> >
> > > I think we can prepare an isolation test of the above steps and
> > > include it in contrib/test_decoding/specs.
> >
> > Thanks for the suggestion. I included an isolation test for the
> > exact repro and a couple of other tests for fully before and fully
> > after. These two tests don't add a whole lot of value since this is
> > already heavily tested elsewhere, so I'm good with removing them.
> > However, I thought it made sense to include them for completeness,
> > given the test name.
> >
> > > Another simple way to fix this is to always set need_full_snapshot
> > > to true during slot creation. But it would then make the snapshot
> > > builder include non-catalog-change transactions too while finding
> > > the initial start point, which is not necessary.
> >
> > Since we're not accumulating changes anyway and we never distribute
> > snapshots while inconsistent, it wouldn't cause any memory pressure in
> > the ReorderBuffer. This seems pretty contained and wouldn't require the
> > use of a global variable and memory context callbacks, which is a bit
> > nicer from a readability standpoint.
>
> Sorry I missed this comment.
>
> I think it's a good point that this fix doesn't break on-disk
> compatibility while not affecting existing users much, especially for
> back-patching. If we want to choose this approach for back branches,
> we need to carefully consider the side effects.
>
> > It's a little inefficient with memory and CPU, especially in the presence of
> > long running transactions, but likely would not be noticeable to most users.
>
> This matches my analysis.
>
> Here is the summary of several proposals we've discussed:
>
> a) Have CreateInitDecodingContext() always pass need_full_snapshot =
> true to AllocateSnapshotBuilder().
>
> This is the simplest approach and doesn't break on-disk
> SnapBuildOnDisk compatibility. A downside is that it's a little
> inefficient with memory and CPU since it includes non-catalog-change
> transactions too. Also, the need_full_snapshot argument passed to
> CreateInitDecodingContext() would no longer have any meaning.
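For reference, idea (a) would essentially be a one-line change at the
point where CreateInitDecodingContext() sets up the decoding context. A
minimal sketch (the trailing arguments are elided and differ across
branches, so this is not a literal patch):

    /*
     * Sketch of idea (a), CreateInitDecodingContext() in logical.c:
     * ignore the caller's need_full_snapshot and always build a full
     * snapshot while finding the start point.
     */
    ctx = StartupDecodingContext(NIL, restart_lsn, xmin_horizon,
                                 true /* was: need_full_snapshot */,
                                 ...);    /* remaining arguments unchanged */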
>
> b) Have snapbuild.c be able to handle multiple SnapBuildOnDisk versions.
>
> This bumps SNAPBUILD_VERSION and therefore breaks on-disk
> SnapBuildOnDisk compatibility, but snapbuild.c could be taught to
> handle multiple versions. However, since different branches already
> use different versions of SnapBuildOnDisk, it would probably be hard
> to come up with a SnapBuildOnDisk version that is consistent across
> all branches.
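For (b), SnapBuildRestore() would have to dispatch on the version read
from the file header instead of insisting on an exact match. A rough
sketch, where SnapBuildOnDiskV1 is a hypothetical copy of the previous
struct layout kept only for conversion:

    if (ondisk.version == SNAPBUILD_VERSION)
    {
        /* current layout: read the rest of the file as today */
    }
    else if (ondisk.version == SNAPBUILD_VERSION - 1)
    {
        /* read the old layout into a SnapBuildOnDiskV1 and convert it */
    }
    else
        ereport(ERROR,
                (errcode(ERRCODE_DATA_CORRUPTED),
                 errmsg("snapbuild state file \"%s\" has unsupported version %u",
                        path, ondisk.version)));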
>
> c) Add a global variable, say in_create, to snapbuild.c
>
> This would also require adding a setter function for in_create. There
> are several ideas about where to set/reset the flag. One idea is to
> reset the flag in AllocateSnapshotBuilder() and set it somewhere
> before starting to find the start point, for example at the beginning
> of DecodingContextFindStartpoint(). It probably wouldn't require a
> memory context callback to make sure the flag is cleared. This idea
> doesn't have a downside from the users' and extensions' perspective,
> but I'm slightly hesitant to add a global variable.
>
> What do you think? and any other ideas?
I've drafted idea (c) for discussion (for master and v16 for now).
I also liked idea (a), but I'm a bit concerned about its future impact.
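The rough shape of (c) in snapbuild.c is something like the following;
the names below are only illustrative, and the attached patches are
what actually matters:

    static bool in_create = false;      /* are we creating a slot? */

    /*
     * Hypothetical setter, called e.g. at the beginning of
     * DecodingContextFindStartpoint(); AllocateSnapshotBuilder() would
     * reset the flag so state can't leak from a failed creation.
     */
    void
    SnapBuildSetInCreate(bool flag)
    {
        in_create = flag;
    }

    /*
     * ... and roughly where SnapBuildCommitTxn() currently checks
     * building_full_snapshot, also honor the flag:
     */
    if (builder->building_full_snapshot || in_create)
        needs_timetravel = true;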
>
> As a separate topic, I think that this back-patching complexity comes
> from the fact that we're serializing the whole SnapBuild struct to
> disk even though we don't need to serialize all of its fields. In
> order to make future back-patching easier, it might be worth
> considering decoupling the fields that need to be serialized from the
> SnapBuild struct.
If we implement this idea, we won't need to bump SNAPBUILD_VERSION for HEAD.
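Concretely, something along these lines might work: a dedicated,
explicitly versioned on-disk struct that SnapBuildSerialize() and
SnapBuildRestore() copy to and from field by field, so that adding a
field to SnapBuild no longer changes the on-disk layout by itself. All
names in this sketch are hypothetical:

    typedef struct SnapBuildStateOnDisk
    {
        SnapBuildState state;
        TransactionId  xmin;
        TransactionId  xmax;
        XLogRecPtr     start_decoding_at;
        XLogRecPtr     two_phase_at;

        /*
         * The committed / catalog-change xid arrays would keep following
         * the fixed-size part, as they do today.
         */
    } SnapBuildStateOnDisk;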
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
Attachments:
  master_fix_snapbuild_bug_v3.patch (application/octet-stream, 5.3 KB)
  v16_fix_snapbuild_bug_v3.patch (application/octet-stream, 5.3 KB)