Re: persist logical slots to disk during shutdown checkpoint

From: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: vignesh C <vignesh21(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Sawada Masahiko <sawada(dot)mshk(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: persist logical slots to disk during shutdown checkpoint
Date: 2023-08-31 06:55:17
Message-ID: CAExHW5uXq_CU80XJtmWbPJinRjJx54kbQJ9DT=UFySKXpjVwrw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Aug 31, 2023 at 12:10 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> > +
> > + /*
> > + * We won't ensure that the slot is persisted after the confirmed_flush
> > + * LSN is updated as that could lead to frequent writes. However, we need
> > + * to ensure that we do persist the slots at the time of shutdown whose
> > + * confirmed_flush LSN is changed since we last saved the slot to disk.
> > + * This will help in avoiding retreat of the confirmed_flush LSN after
> > + * restart. This variable is used to track the last saved confirmed_flush
> > + * LSN value.
> > + */
> >
> > This comment makes more sense in SaveSlotToPath() than here. We may decide to
> > use last_saved_confirmed_flush for something else in future.
> >
>
> I have kept it here because it contains some information that is not
> specific to SaveSlotToPath. So, it seems easier to follow the whole
> theory if we keep it at the central place in the structure and then
> add the reference wherever required but I am fine if you and others
> feel strongly about moving this to SaveSlotToPath().

Saving slot to disk happens only in SaveSlotToPath, so except the last
sentence rest of the comment makes sense in SaveSlotToPath().

> >
> > This function assumes that the subscriber will receive and confirm WAL upto
> > checkpoint location and publisher's WAL sender will update it in the slot.
> > Where is the code to ensure that? Does the WAL sender process wait for
> > checkpoint
> > LSN to be confirmed when shutting down?
> >
>
> Note, that we need to compare if all the WAL before the
> shutdown_checkpoint WAL record is sent. Before the clean shutdown, we
> do ensure that all the pending WAL is confirmed back. See the use of
> WalSndDone() in WalSndLoop().

Ok. Thanks for pointing that out to me.

>
> > I
> > think we should shut down subscriber, restart publisher and then make this
> > check based on the contents of the replication slot instead of server log.
> > Shutting down subscriber will ensure that the subscriber won't send any new
> > confirmed flush location to the publisher after restart.
> >
>
> But if we shutdown the subscriber before the publisher there is no
> guarantee that the publisher has sent all outstanding logs up to the
> shutdown checkpoint record (i.e., the latest record). Such a guarantee
> can only be there if we do a clean shutdown of the publisher before
> the subscriber.

So the sequence is shutdown publisher node, shutdown subscriber node,
start publisher node and carry out the checks.

>
> > All the places which call ReplicationSlotSave() mark the slot as dirty. All
> > the places where SaveSlotToPath() is called, the slot is marked dirty except
> > when calling from CheckPointReplicationSlots(). So I am wondering whether we
> > should be marking the slot dirty in CheckPointReplicationSlots() and avoid
> > passing down is_shutdown flag to SaveSlotToPath().
> >
>
> I feel that will add another spinlock acquire/release pair without
> much benefit. Sure, it may not be performance-sensitive but still
> adding another pair of lock/release doesn't seem like a better idea.

We call ReplicatioinSlotMarkDirty() followed by ReplicationSlotSave()
at all the places, even those which are more frequent than this. So I
think it's better to stick to that protocol rather than adding a new
flag.

--
Best Wishes,
Ashutosh Bapat

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ashutosh Bapat 2023-08-31 07:07:12 Re: Statistics Import and Export
Previous Message Corey Huinker 2023-08-31 06:47:31 Statistics Import and Export