Re: Introduce XID age and inactive timeout based replication slot invalidation

From: vignesh C <vignesh21(at)gmail(dot)com>
To: Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Introduce XID age and inactive timeout based replication slot invalidation
Date: 2025-02-01 06:11:53
Message-ID: CALDaNm0FS+FqQk2dadiJFCMM_MhKROMsJUb=b8wtRH6isScQsQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, 31 Jan 2025 at 17:50, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com> wrote:
>
> Thanks for the review! I have incorporated the above comments. The
> test in patch-002 has been optimized as suggested and now completes in
> less than a second.
> Please find the attached v66 patch set. The base patch(v65-001) is
> committed now, so I have rebased the patches.

Few comments:
1)We should set inactive_since only if the slot can be invalidated:
+ /* For testing timeout slot
invalidation */
+ if
(IS_INJECTION_POINT_ATTACHED("slot-time-out-inval"))
+ s->inactive_since = 1;
+

2) Instead of "alter system set" and reload, let's do this in
$node->append_conf itself:
+# Set timeout GUC so that the next checkpoint will invalidate inactive slots
+$node->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET idle_replication_slot_timeout TO '1min';
+]);
+$node->reload;

3) No need to trigger checkpoint twice, we can move it outside so that
just a single checkpoint will invalidate both the slots:
+sub trigger_slot_invalidation
+{
+ my ($node, $slot, $offset) = @_;
+ my $node_name = $node->name;
+ my $invalidated = 0;
+
+ # Run a checkpoint
+ $node->safe_psql('postgres', "CHECKPOINT");

4) I fel this trigger_slot_invalidation is not required after removing
the checkpoint from the function, let's move the waiting for
"invalidating obsolete replication slot" also to
wait_for_slot_invalidation function:
+ # The slot's invalidation should be logged
+ $node->wait_for_log(qr/invalidating obsolete replication slot
\"$slot\"/,
+ $offset);
+
+ # Check that the invalidation reason is 'idle_timeout'
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = '$slot' AND
+ invalidation_reason = 'idle_timeout';
+ ])

5) Can we move the subroutine to the beginning, I noticed in other
places we have kept it before the tests like in 027_nosuperuser and
040_createsubscriber:
+# Wait for slot to first become idle and then get invalidated
+sub wait_for_slot_invalidation
+{
+ my ($node, $slot, $offset) = @_;
+ my $node_name = $node->name;
+
+ trigger_slot_invalidation($node, $slot, $offset);
+
+ # Check that an invalidated slot cannot be acquired
+ my ($result, $stdout, $stderr);
+ ($result, $stdout, $stderr) = $node->psql(
+ 'postgres', qq[
+ SELECT pg_replication_slot_advance('$slot', '0/1');
+ ]);

6) Since idle_replication_slot_timeout is related more closely with
max_slot_wal_keep_size, let's keep it along with it.
diff --git a/src/backend/utils/misc/postgresql.conf.sample
b/src/backend/utils/misc/postgresql.conf.sample
index 079efa1baa..0ed9eb057e 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -329,6 +329,7 @@
#wal_sender_timeout = 60s # in milliseconds; 0 disables
#track_commit_timestamp = off # collect timestamp of transaction commit
# (change requires restart)
+#idle_replication_slot_timeout = 1d # in minutes; 0 disables

If you accept the comments, you can merge the changes from the attached patch.

Regards,
Vignesh

Attachment Content-Type Size
Vignesh_review_comment_fix.patch text/x-patch 6.4 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Zhang Mingli 2025-02-01 06:55:12 Proposal to CREATE FOREIGN TABLE LIKE
Previous Message Amit Kapila 2025-02-01 05:07:07 Re: Conflict detection for update_deleted in logical replication