Re: Introduce XID age and inactive timeout based replication slot invalidation

From: vignesh C <vignesh21(at)gmail(dot)com>
To: Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Introduce XID age and inactive timeout based replication slot invalidation
Date: 2024-12-27 03:52:18
Message-ID: CALDaNm14QrW5j6su+EAqjwnHbiwXJwO+yk73_=7yvc5TVY-43g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, 24 Dec 2024 at 17:07, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com> wrote:
>
> On Fri, Dec 20, 2024 at 3:12 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>>
>> On Mon, Dec 16, 2024 at 4:10 PM Nisha Moond <nisha(dot)moond412(at)gmail(dot)com> wrote:
>> >
>> > Here is the v56 patch set with the above comments incorporated.
>> >
>>
>> Review comments:
>> ===============
>> 1.
>> + {
>> + {"idle_replication_slot_timeout", PGC_SIGHUP, REPLICATION_SENDING,
>> + gettext_noop("Sets the duration a replication slot can remain idle before "
>> + "it is invalidated."),
>> + NULL,
>> + GUC_UNIT_MS
>> + },
>> + &idle_replication_slot_timeout_ms,
>>
>> I think users are going to keep idele_slot timeout at least in hours.
>> So, millisecond seems the wrong choice to me. I suggest to keep the
>> units in minutes. I understand that writing a test would be
>> challenging as spending a minute or more on one test is not advisable.
>> But I don't see any test testing the other GUCs that are in minutes
>> (wal_summary_keep_time and log_rotation_age). The default value should
>> be one day.
>
>
> +1
> - Changed the GUC unit to "minute".
>
> Regarding the tests, we have two potential options:
> 1) Introduce an additional "debug_xx" GUC parameter with units of seconds or milliseconds, only for testing purposes.
> 2) Skip writing tests for this, similar to other GUCs with units in minutes.
>
> IMO, adding an additional GUC just for testing may not be worthwhile. It's reasonable to proceed without the test.
>
> Thoughts?
>
> The attached v57 patch-set addresses all the comments. I have kept the test case in the patch for now, it takes 2-3 minutes to complete.

Few comments:
1) We have disabled the similar configuration max_slot_wal_keep_size
by setting to -1, as this GUC also is in similar lines, should we
disable this and let the user configure it?
+/*
+ * Invalidate replication slots that have remained idle longer than this
+ * duration; '0' disables it.
+ */
+int idle_replication_slot_timeout_min =
HOURS_PER_DAY * MINS_PER_HOUR;
+

2) I felt this behavior is an existing behavior, so this can also be
moved to 0001 patch:
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index a586156614..199d7248ee 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2566,7 +2566,8 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
</para>
<para>
The time when the slot became inactive. <literal>NULL</literal> if the
- slot is currently being streamed.
+ slot is currently being streamed. If the slot becomes invalidated,
+ this value will remain unchanged until server shutdown.

3) Can we change the comment below to "We don't allow the value of
idle_replication_slot_timeout other than 0 during the binary upgrade.
See start_postmaster() in pg_upgrade for more details.":
+ * The idle_replication_slot_timeout must be disabled (set to 0)
+ * during the binary upgrade.
+ */
+bool
+check_idle_replication_slot_timeout(int *newval, void **extra,
GucSource source)

Regards,
Vignesh

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2024-12-27 04:00:37 Re: PoC: history of recent vacuum/checkpoint runs (using new hooks)
Previous Message Michael Paquier 2024-12-27 03:32:25 WAL-logging facility for pgstats kinds