Re: persist logical slots to disk during shutdown checkpoint

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
Cc: Julien Rouhaud <rjuju123(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Sawada Masahiko <sawada(dot)mshk(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: persist logical slots to disk during shutdown checkpoint
Date: 2023-08-22 10:12:40
Message-ID: CAA4eK1KOoS043uXj48q0h8JQeNW9TmKGP7Zz+Yh02wDRFF8aBA@mail.gmail.com
Lists: pgsql-hackers

On Tue, Aug 22, 2023 at 2:56 PM Ashutosh Bapat
<ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> wrote:
>
> On Tue, Aug 22, 2023 at 9:48 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > >
> > > Another idea is to record the confirm_flush_lsn at the time of
> > > persisting the slot. We can use it in two different ways: 1. mark a
> > > slot dirty and persist it if the confirm_flush_lsn recorded at the
> > > last persist is too far from the slot's current confirm_flush_lsn;
> > > 2. at the shutdown checkpoint, persist all slots for which these
> > > two confirm_flush_lsns differ.
> > >
> >
> > I think using it in the second (2) way sounds advantageous compared
> > to storing another dirty flag, because it requires us to update
> > last_persisted_confirm_flush_lsn only while writing the slot info.
> > OTOH, a dirty_for_shutdown_checkpoint flag would have to be updated
> > each time we update confirm_flush_lsn under the spinlock, in
> > multiple places. But I don't see the need to do what you proposed in
> > (1), as its use case is very minor: it may sometimes help us avoid
> > decoding after crash recovery.
>
> Once we have last_persisted_confirm_flush_lsn, (1) is just an
> optimization on top of it. With that we take the opportunity to
> persist a confirmed_flush_lsn that is much farther than the currently
> persisted value, thus improving the chances of updating restart_lsn
> and catalog_xmin faster after a WAL sender restart. We need to keep
> that in mind when implementing (2). The problem is that if we don't
> implement (1) right now, we might just forget to make that small
> incremental change in the future. My preference is: 1. do both (1)
> and (2) together; 2. do (2) first and then (1) as a separate commit;
> 3. just implement (2) if we don't have time for the first two
> options.
>

I prefer (2) or (3). Either way, it is better to do that
optimization (persisting confirm_flush_lsn at a regular interval) as a
separate patch, since we need to test and prove its value separately.

--
With Regards,
Amit Kapila.
