From: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
---|---|
To: | Dilip Kumar <dilipbalaut(at)gmail(dot)com> |
Cc: | vignesh C <vignesh21(at)gmail(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Sawada Masahiko <sawada(dot)mshk(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de> |
Subject: | Re: persist logical slots to disk during shutdown checkpoint |
Date: | 2023-09-05 04:28:13 |
Message-ID: | CAA4eK1LS2-qOq2w5AZe5nwmcFVOh4qb5DUfSAL_OR3oFW6GRSg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Tue, Sep 5, 2023 at 9:10 AM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
>
> On Fri, Sep 1, 2023 at 12:16 PM vignesh C <vignesh21(at)gmail(dot)com> wrote:
> >
> > On Fri, 1 Sept 2023 at 10:06, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > >
> > > On Thu, Aug 31, 2023 at 6:12 PM Ashutosh Bapat
> > > <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> wrote:
> > > >
> > > > On Thu, Aug 31, 2023 at 2:52 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > > > >
> > > > > All but one. Normally, the idea of marking dirty is to indicate that
> > > > > we will actually write/flush the contents at a later point (except
> > > > > when required for correctness) as even indicated in the comments atop
> > > > > ReplicatioinSlotMarkDirty(). However, I see your point that we use
> > > > > that protocol at all the current places including CreateSlotOnDisk().
> > > > > So, we can probably do it here as well.
> > > >
> > > > yes
> > > >
> > >
> > > I think we should also ensure that slots are not invalidated
> > > (slot.data.invalidated != RS_INVAL_NONE) before marking them dirty
> > > because we don't allow decoding from such slots, so we shouldn't
> > > include those.
> >
> > Added this check.
> >
> > Apart from this I have also fixed the following issues that were
> > agreed on: a) Setting slots to dirty in CheckPointReplicationSlots
> > instead of setting it in SaveSlotToPath b) The comments were moved
> > from ReplicationSlot and moved to CheckPointReplicationSlots c) Tests
> > will be run in autovacuum = off d) updating last_saved_confirmed_flush
> > based on cp.slotdata.confirmed_flush rather than
> > slot->data.confirmed_flush.
> > I have also added the commitfest entry for this at [1].
>
> The overall idea looks fine to me
>
> +
> + /*
> + * We won't ensure that the slot is persisted after the
> + * confirmed_flush LSN is updated as that could lead to frequent
> + * writes. However, we need to ensure that we do persist the slots at
> + * the time of shutdown whose confirmed_flush LSN is changed since we
> + * last saved the slot to disk. This will help in avoiding retreat of
> + * the confirmed_flush LSN after restart.
> + */
> + if (is_shutdown && SlotIsLogical(s))
> + {
> + SpinLockAcquire(&s->mutex);
> + if (s->data.invalidated == RS_INVAL_NONE &&
> + s->data.confirmed_flush != s->last_saved_confirmed_flush)
> + s->dirty = true;
> + SpinLockRelease(&s->mutex);
> + }
>
> The comments don't mention anything about why we are just flushing at
> the shutdown checkpoint. I mean the checkpoint is not that frequent
> and we already perform a lot of I/O during checkpoints so isn't it
> wise to flush during every checkpoint. We may argue that there is no
> extra advantage of that as we are not optimizing for crash recovery
> but OTOH there is no reason for not doing so for other checkpoints or
> we are worried about the concurrency with parallel walsender running
> during non shutdown checkpoint if so then better we explain that as
> well? If it is already discussed in the thread and we have a
> conclusion on this then maybe we can mention this in comments?
>
The point is that at the time of non-shutdown checkpoints, it is not
clear that there is an extra advantage but we will definitely add
extra I/O for this. Because at other times, we will already be saving
the slot from time to time as the replication makes progress. And, we
also need to flush such slots during shutdown for correctness for some
use cases like upgrades. We can probably add something like: "At other
times, the walsender keeps saving the slot from time to time as the
replication progresses, so there is no clear advantage of flushing
additional slots at the time of checkpoint". Will that work for you?
Having said that, I am not opposed to doing it for non-shutdown
checkpoints if one makes a separate case for it.
--
With Regards,
Amit Kapila.
From | Date | Subject | |
---|---|---|---|
Next Message | Dilip Kumar | 2023-09-05 04:39:19 | Re: Impact of checkpointer during pg_upgrade |
Previous Message | Amit Kapila | 2023-09-05 04:08:37 | Re: Impact of checkpointer during pg_upgrade |