RE: persist logical slots to disk during shutdown checkpoint

From: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: vignesh C <vignesh21(at)gmail(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Sawada Masahiko <sawada(dot)mshk(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>
Subject: RE: persist logical slots to disk during shutdown checkpoint
Date: 2023-09-05 11:34:14
Message-ID: OS0PR01MB57167ADBAA5F87D78FD6B87394E8A@OS0PR01MB5716.jpnprd01.prod.outlook.com
Lists: pgsql-hackers

On Tuesday, September 5, 2023 4:15 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:

Hi,

>
> On Tue, Sep 5, 2023 at 10:12 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
> wrote:
> >
> > On Tue, Sep 5, 2023 at 7:54 AM Zhijie Hou (Fujitsu)
> > <houzj(dot)fnst(at)fujitsu(dot)com> wrote:
> > >
> > > > On Monday, September 4, 2023 6:15 PM vignesh C <vignesh21(at)gmail(dot)com> wrote:
> > > > >
> > > > > On Mon, 4 Sept 2023 at 15:20, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > > > > >
> > > > > > On Fri, Sep 1, 2023 at 10:50 AM vignesh C <vignesh21(at)gmail(dot)com> wrote:
> > > > > > >
> > > > > > > On Fri, 1 Sept 2023 at 10:06, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > > > > > > I think we should also ensure that slots are not invalidated
> > > > > > > (slot.data.invalidated != RS_INVAL_NONE) before marking them
> > > > > > > dirty because we don't allow decoding from such slots, so we
> > > > > > > shouldn't include those.
> > > > > >
> > > > > > Added this check.
> > > > > >
> > > > > > Apart from this I have also fixed the following issues that
> > > > > > were agreed on: a) Setting slots to dirty in
> > > > > > CheckPointReplicationSlots instead of setting it in
> > > > > > SaveSlotToPath
> > > > > >
> > > > >
> > > > > > + if (is_shutdown && SlotIsLogical(s))
> > > > > > + {
> > > > > > +     SpinLockAcquire(&s->mutex);
> > > > > > +     if (s->data.invalidated == RS_INVAL_NONE &&
> > > > > > +         s->data.confirmed_flush != s->last_saved_confirmed_flush)
> > > > > > +         s->dirty = true;
> > > > >
> > > > > I think it is better to use ReplicationSlotMarkDirty() as that
> > > > > would be consistent with all other usages.
> > > >
> > > > ReplicationSlotMarkDirty works only on MyReplicationSlot whereas
> > > > CheckpointReplicationSlots loops through all the slots and marks
> > > > the appropriate slot as dirty, we might have to change
> > > > > ReplicationSlotMarkDirty to take the slot as input parameter and all
> > > > > callers should pass MyReplicationSlot.
> > >
> > > Personally, I feel if we want to centralize the code of marking
> > > dirty into a function, we can introduce a new static function
> > > MarkSlotDirty(slot) to mark passed slot dirty and let
> > > ReplicationSlotMarkDirty and CheckpointReplicationSlots call it. Like:
> > >
> > > > void
> > > > ReplicationSlotMarkDirty(void)
> > > > {
> > > >     MarkSlotDirty(MyReplicationSlot);
> > > > }
> > > >
> > > > +static void
> > > > +MarkSlotDirty(ReplicationSlot *slot)
> > > > +{
> > > > +    Assert(slot != NULL);
> > > > +
> > > > +    SpinLockAcquire(&slot->mutex);
> > > > +    slot->just_dirtied = true;
> > > > +    slot->dirty = true;
> > > > +    SpinLockRelease(&slot->mutex);
> > > > +}
> > >
> > > > This is somewhat similar to the relation between
> > > > ReplicationSlotSave (serialize my backend's replication slot) and
> > > > SaveSlotToPath (save the passed slot).
> > >
> > > > Another thing is we have already taken spin lock to access
> > > > last_confirmed_flush_lsn from CheckpointReplicationSlots, we could
> > > > set dirty flag here itself, else we will have to release the lock
> > > > and call ReplicationSlotMarkDirty which will take lock again.
> > >
> > > Yes, this is unavoidable, but maybe it's not a big problem as we
> > > only do it at shutdown.
> > >
> >
> > True but still it doesn't look elegant. I also thought about having a
> > probably inline function that marks both just_dirty and dirty fields.
> > However, that requires us to assert that the caller has already
> > acquired a spinlock. I see a macro SpinLockFree() that might help but
> > it didn't seem to be used anywhere in the code so not sure if we can
> > rely on it.
>
> Can't we just have code like this? I mean we will have to make
> ReplicationSlotMarkDirty take slot as an argument or have another version
> which takes slot as an argument and that would be called by us as well as by
> ReplicationSlotMarkDirty(). I mean why do we need these checks
> (s->data.invalidated == RS_INVAL_NONE &&
> s->data.confirmed_flush != s->last_saved_confirmed_flush) under the
> mutex? Walsender is shutdown so confirmed flush LSN can not move
> concurrently and slot can not be invalidated as well because that is done by
> checkpointer and we are in checkpointer?

I agree with your analysis that the lock may be unnecessary for now and the
code will work, but I personally feel we'd better take the spinlock.

Firstly, given our discussion about possibly extending this to persist slots
during online checkpoints in the future, we would need the lock then anyway, so
taking it now avoids forgetting to add it later. The lock also won't cause any
performance or concurrency issues.

Additionally, if we don't take the lock, we rely on the assumption that the
walsender will exit before the shutdown checkpoint. Currently, that's true for
the logical walsender, but a physical walsender can exit later than the
checkpointer. So, I am slightly worried that if we change the logical
walsender's exit timing in the future, the assumption may not hold.

Besides, for non-built-in logical replication, if someone creates their own
walsender or other process to send the changes and that process doesn't exit
before the shutdown checkpoint, it may also be a problem. I don't have existing
examples of such extensions, but I feel taking the lock would make the code
more robust.
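
As a rough illustration of the locked variant, here is a standalone model, not
the patch itself. ModelSlot, model_lock/model_unlock, model_slot_init and
mark_unsaved_logical_slots_dirty are made-up names, and a C11 atomic_flag
stands in for the slot's spinlock. The point is only that the invalidation
check, the confirmed_flush comparison, and the flag updates all sit in one
critical section, so the result stays correct even if some other process could
still touch the slot:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* All names below are hypothetical stand-ins, not PostgreSQL definitions. */
typedef uint64_t ModelLsn;

typedef enum { MODEL_INVAL_NONE = 0, MODEL_INVAL_OTHER } ModelInvalCause;

typedef struct ModelSlot
{
    atomic_flag     lock;           /* stands in for the slot's spinlock */
    bool            is_logical;
    ModelInvalCause invalidated;
    ModelLsn        confirmed_flush;
    ModelLsn        last_saved_confirmed_flush;
    bool            just_dirtied;
    bool            dirty;
} ModelSlot;

static void
model_lock(atomic_flag *lock)
{
    /* Busy-wait until the flag was previously clear. */
    while (atomic_flag_test_and_set_explicit(lock, memory_order_acquire))
        ;
}

static void
model_unlock(atomic_flag *lock)
{
    atomic_flag_clear_explicit(lock, memory_order_release);
}

static void
model_slot_init(ModelSlot *slot, bool is_logical, ModelInvalCause invalidated,
                ModelLsn confirmed_flush, ModelLsn last_saved)
{
    atomic_flag_clear(&slot->lock);
    slot->is_logical = is_logical;
    slot->invalidated = invalidated;
    slot->confirmed_flush = confirmed_flush;
    slot->last_saved_confirmed_flush = last_saved;
    slot->just_dirtied = false;
    slot->dirty = false;
}

/*
 * At a shutdown checkpoint, mark every valid logical slot dirty if its
 * confirmed_flush has moved since it was last saved. The checks and the
 * flag updates happen under the same lock. Returns the number of slots
 * marked dirty.
 */
static int
mark_unsaved_logical_slots_dirty(ModelSlot *slots, size_t nslots,
                                 bool is_shutdown)
{
    int         marked = 0;

    assert(slots != NULL || nslots == 0);

    for (size_t i = 0; i < nslots; i++)
    {
        ModelSlot  *s = &slots[i];

        if (!is_shutdown || !s->is_logical)
            continue;

        model_lock(&s->lock);
        if (s->invalidated == MODEL_INVAL_NONE &&
            s->confirmed_flush != s->last_saved_confirmed_flush)
        {
            s->just_dirtied = true;
            s->dirty = true;
            marked++;
        }
        model_unlock(&s->lock);
    }

    return marked;
}
```

In this model the lock is taken once per slot, so there is no
release-and-reacquire as there would be when calling a separate marking
function after the check.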

Best Regards,
Hou zj
