Re: Introduce XID age and inactive timeout based replication slot invalidation

From: vignesh C <vignesh21(at)gmail(dot)com>
To: Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>
Cc: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Introduce XID age and inactive timeout based replication slot invalidation
Date: 2024-12-04 10:26:53
Message-ID: CALDaNm0mTWwg0z4v-sorq08S2CdZmL2s+rh4nHpWeJaBQ2F+mg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, 4 Dec 2024 at 15:01, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com> wrote:
>
> On Tue, Dec 3, 2024 at 1:09 PM Hayato Kuroda (Fujitsu)
> <kuroda(dot)hayato(at)fujitsu(dot)com> wrote:
> >
> > Dear Nisha,
> >
> > Thanks for updating the patch!
> >
> > > Fixed. It is reasonable to align with other timeout parameters by
> > > using milliseconds as the unit.
> >
> > It looks you just replaced to GUC_UNIT_MS, but the documentation and
> > postgresql.conf.sample has not been changed yet. They should follow codes.
> > Anyway, here are other comments, mostly cosmetic.
> >
>
> Here is v53 patch-set addressing all the comments in [1] and [2].

Currently, replication slots are invalidated based on the
replication_slot_inactive_timeout only during a checkpoint. This means
that if the checkpoint_timeout is set to a higher value than the
replication_slot_inactive_timeout, slot invalidation will occur only
when the checkpoint is triggered. Identifying the invalidation slots
might be slightly delayed in this case. As an alternative, users can
forcefully invalidate inactive slots that have exceeded the
replication_slot_inactive_timeout by forcing a checkpoint. I was
thinking we could suggest this in the documentation.

+ <para>
+ Slot invalidation due to inactive timeout occurs during checkpoint.
+ The duration of slot inactivity is calculated using the slot's
+ <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>inactive_since</structfield>
+ value.
+ </para>
+

We could accurately invalidate the slots using the checkpointer
process by calculating the invalidation time based on the active_since
timestamp and the replication_slot_inactive_timeout, and then set the
checkpointer's main wait-latch accordingly for triggering the next
checkpoint. Ideally, a different process handling this task would be
better, but there is currently no dedicated daemon capable of
identifying and managing slots across streaming replication, logical
replication, and other slots used by plugins. Additionally,
overloading the checkpointer with this responsibility may not be
ideal. As an alternative, we could document about this delay in
identifying and mention that it could be triggered by forceful manual
checkpoint.

Regards,
Vignesh

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kirill Reshke 2024-12-04 10:31:47 Re: Pass ParseState as down to utility functions.
Previous Message Dagfinn Ilmari Mannsåker 2024-12-04 10:25:39 Re: Contradictory behavior of array_agg(distinct) aggregate.