From: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
---|---|
To: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
Cc: | Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com>, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Restrict copying of invalidated replication slots |
Date: | 2025-02-28 20:58:14 |
Message-ID: | CAD21AoCYzmDPqVWDnADdb4T3+he3h+qe59NK3vd+zZN-MXvAxQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Thu, Feb 27, 2025 at 7:26 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Fri, Feb 28, 2025 at 5:10 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
> > On Thu, Feb 27, 2025 at 12:52 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > >
> > > On Thu, Feb 27, 2025 at 10:47 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > > >
> > > > On Tue, Feb 25, 2025 at 7:33 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > > > >
> > > > > AFAICU, InvalidateObsoleteReplicationSlots() is not serialized with
> > > > > this operation. So, isn't it possible that the source slot exists at
> > > > > the later position in ReplicationSlotCtl->replication_slots and the
> > > > > loop traversing slots is already ahead from the point where the newly
> > > > > copied slot is created?
> > > >
> > > > Good point. I think it's possible.
> > > >
> > > > > If so, the newly created slot won't be marked
> > > > > as invalid whereas the source slot will be marked as invalid. I agree
> > > > > that even in such a case, at a later point, the newly created slot
> > > > > will also be marked as invalid.
> > > >
> > > > The wal_status of the newly created slot would immediately become
> > > > 'lost' and the next checkpoint will invalidate it. Do we need to do
> > > > something to deal with this case?
> > > >
> > >
> > > + /* Check if source slot became invalidated during the copy operation */
> > > + if (second_slot_contents.data.invalidated != RS_INVAL_NONE)
> > > + ereport(ERROR,
> > > + errmsg("cannot copy replication slot \"%s\"",
> > > + NameStr(*src_name)),
> > > + errdetail("The source replication slot was invalidated during the
> > > copy operation."));
> > >
> > > How about adding a similar test as above for MyReplicationSlot? That
> > > should suffice the need because we have already acquired the new slot
> > > by this time and invalidation should signal this process before
> > > marking the new slot as invalid.
> >
> > IIUC in the scenario you mentioned, the loop traversing slots already
> > passed the position of newly created slot in
> > ReplicationSlotCtl->replication_slots array, so
> > MyReplicationSlot->data.invalidated is still RS_INVAL_NONE, no?
> >
>
> Right, I don't have a simpler solution for this apart from making it
> somehow serialize this operation with
> InvalidateObsoleteReplicationSlots(). I don't think we need to go that
> far to handle the scenario being discussed.
Agreed.
> It is better to add a
> comment for this race and why it won't harm.
+1
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
From | Date | Subject | |
---|---|---|---|
Next Message | Fabrízio de Royes Mello | 2025-02-28 21:05:59 | Re: Space missing from EXPLAIN output |
Previous Message | Nathan Bossart | 2025-02-28 20:56:41 | Re: Statistics Import and Export |