Re: Potential data loss due to race condition during logical replication slot creation

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, "Callahan, Drew" <callaan(at)amazon(dot)com>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: Potential data loss due to race condition during logical replication slot creation
Date: 2024-03-19 02:33:06
Message-ID: CAA4eK1JSZnRJmHatGdC1LFiFB=VT2js32rDtpV5_Seoa0nbJpw@mail.gmail.com
Lists: pgsql-bugs

On Tue, Mar 19, 2024 at 7:46 AM Hayato Kuroda (Fujitsu)
<kuroda(dot)hayato(at)fujitsu(dot)com> wrote:
>
> I think that approach was the most conservative one, since it does not
> require changing the version of the snapshot. However, I understand that you
> want to consider an optimized solution for HEAD first.
>

Right, let's see if we can find a solution other than always skipping
snapshot restoration during slot creation, even if it is only for
HEAD.

> > See SnapBuildCommitTxn(). Can we avoid this problem if we
> > had a list of all running xacts when we serialize the snapshot, by
> > not decoding any xact whose xid lies in that list? If so, one idea to
> > achieve this could be that we maintain the highest_running_xid while
> > serializing the snapshot and then during restore, if that
> > highest_running_xid is <= builder->initial_xmin_horizon, we
> > ignore restoring the snapshot. We already have a few such cases handled
> > in SnapBuildRestore().
>
> Based on the idea, I made a prototype. It passes the tests added by others
> and by me. What do others think?
>

Won't it be possible to achieve the same thing if we just save
(serialize) the highest xid among all running xacts?
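
For illustration only, here is a minimal standalone sketch of the check as
worded in the quoted proposal: record the highest xid among the running
xacts at serialization time, and at restore time compare it against
initial_xmin_horizon. It is not PostgreSQL code. The struct names, the
helper names and the TransactionId typedef are hypothetical stand-ins, and
the wraparound-aware comparison is only assumed to be appropriate here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for PostgreSQL's TransactionId; not taken from the real headers. */
typedef uint32_t TransactionId;

/* Hypothetical: the one extra field that would be saved with the snapshot. */
typedef struct SerializedSnapshotSketch
{
    TransactionId highest_running_xid;  /* highest xid running at serialization */
} SerializedSnapshotSketch;

/* Hypothetical: the one builder field the restore-time check looks at. */
typedef struct BuilderSketch
{
    TransactionId initial_xmin_horizon;
} BuilderSketch;

/* Modulo-2^32 "a <= b" comparison, assumed suitable for normal xids. */
static bool
xid_precedes_or_equals(TransactionId a, TransactionId b)
{
    return (int32_t) (a - b) <= 0;
}

/* Reduce the running-xacts list to the single value to be serialized. */
static TransactionId
highest_running_xid(const TransactionId *xids, int nxids)
{
    TransactionId highest;

    assert(nxids > 0);          /* sketch: caller passes at least one xid */
    highest = xids[0];
    for (int i = 1; i < nxids; i++)
    {
        if (xid_precedes_or_equals(highest, xids[i]))
            highest = xids[i];
    }
    return highest;
}

/*
 * Restore-time check, using the condition exactly as stated in the quoted
 * proposal: ignore the serialized snapshot when
 * highest_running_xid <= initial_xmin_horizon.
 */
static bool
should_skip_restore(const SerializedSnapshotSketch *snap,
                    const BuilderSketch *builder)
{
    return xid_precedes_or_equals(snap->highest_running_xid,
                                  builder->initial_xmin_horizon);
}
```

The point of the sketch is that only one xid has to be stored rather than
the whole running-xacts list. Whether the real check belongs in
SnapBuildRestore() in this form is exactly the open question here.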

--
With Regards,
Amit Kapila.
