Re: Introduce XID age and inactive timeout based replication slot invalidation

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>
Cc: Peter Smith <smithpb2250(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Introduce XID age and inactive timeout based replication slot invalidation
Date: 2024-12-20 09:42:06
Message-ID: CAA4eK1JO+3Q+wkyjhMK+h_5MveTLFY7q2KRbrg1ojtO6dLCaMw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Dec 16, 2024 at 4:10 PM Nisha Moond <nisha(dot)moond412(at)gmail(dot)com> wrote:
>
> Here is the v56 patch set with the above comments incorporated.
>

Review comments:
===============
1.
+ {
+ {"idle_replication_slot_timeout", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Sets the duration a replication slot can remain idle before "
+ "it is invalidated."),
+ NULL,
+ GUC_UNIT_MS
+ },
+ &idle_replication_slot_timeout_ms,

I think users are going to keep idele_slot timeout at least in hours.
So, millisecond seems the wrong choice to me. I suggest to keep the
units in minutes. I understand that writing a test would be
challenging as spending a minute or more on one test is not advisable.
But I don't see any test testing the other GUCs that are in minutes
(wal_summary_keep_time and log_rotation_age). The default value should
be one day.

2.
+ /*
+ * An error is raised if error_if_invalid is true and the slot is found to
+ * be invalid.
+ */
+ if (error_if_invalid && s->data.invalidated != RS_INVAL_NONE)
+ {
+ StringInfoData err_detail;
+
+ initStringInfo(&err_detail);
+
+ switch (s->data.invalidated)
+ {
+ case RS_INVAL_WAL_REMOVED:
+ appendStringInfo(&err_detail, _("This slot has been invalidated
because the required WAL has been removed."));
+ break;
+
+ case RS_INVAL_HORIZON:
+ appendStringInfo(&err_detail, _("This slot has been invalidated
because the required rows have been removed."));
+ break;
+
+ case RS_INVAL_WAL_LEVEL:
+ /* translator: %s is a GUC variable name */
+ appendStringInfo(&err_detail, _("This slot has been invalidated
because \"%s\" is insufficient for slot."),
+ "wal_level");
+ break;
+
+ case RS_INVAL_NONE:
+ pg_unreachable();
+ }
+
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("can no longer get changes from replication slot \"%s\"",
+ NameStr(s->data.name)),
+ errdetail_internal("%s", err_detail.data));
+ }
+

This should be moved to a separate function.

3.
+static inline bool
+IsSlotIdleTimeoutPossible(ReplicationSlot *s)

Would it be better to name this function as CanInvalidateIdleSlot()?
The current name doesn't seem to match with similar other
functionalities.

--
With Regards,
Amit Kapila.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Bertrand Drouvot 2024-12-20 09:57:19 Re: per backend I/O statistics
Previous Message Michael Paquier 2024-12-20 09:24:37 Re: per backend I/O statistics