From: | vignesh C <vignesh21(at)gmail(dot)com> |
---|---|
To: | Nisha Moond <nisha(dot)moond412(at)gmail(dot)com> |
Cc: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Introduce XID age and inactive timeout based replication slot invalidation |
Date: | 2025-01-29 07:14:26 |
Message-ID: | CALDaNm2dAJB=fJ2X7EMb7meNTjMyL-+-xA93JL_jPkGF4=RUYw@mail.gmail.com |
Lists: | pgsql-hackers |
On Tue, 28 Jan 2025 at 17:28, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com> wrote:
>
> Please find the attached v64 patches. The changes in this version
> w.r.t. older patch v63 are as -
> - The changes from the v63-0001 patch have been moved to a separate thread [1].
> - The v63-0002 patch has been split into two parts in v64:
> 1) 001 patch: Implements the main feature - inactive timeout-based
> slot invalidation.
> 2) 002 patch: Separates the TAP test "044_invalidate_inactive_slots"
> as suggested above.
Currently the test takes around 220 seconds for me. We could make the
following changes to bring it down to around 70 to 80 seconds:
1) Set idle_replication_slot_timeout to 70 seconds
+# Avoid unpredictability
+$primary->append_conf(
+ 'postgresql.conf', qq{
+checkpoint_timeout = 1h
+});
+$primary->start;
2) I think just 1 extra second is enough here, unless you anticipate
random failures; the test passes for me:
+# Give enough time for inactive_since to exceed the timeout
+sleep($idle_timeout_1min * 60 + 10);
3) Since we will be setting it to 70 seconds as in 1), changing the
configuration and reloading is not required:
+# Set timeout GUC so that the next checkpoint will invalidate inactive slots
+$primary->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET idle_replication_slot_timeout TO '${idle_timeout_1min}min';
+]);
+$primary->reload;
4) Here you can add a comment noting that 60s have already elapsed and
the slot will get invalidated within another 10 seconds, and pass a
timeout of 10s to wait_for_slot_invalidation:
+# Wait for logical failover slot to become inactive on the primary. Note that
+# nobody has acquired the slot yet, so it must get invalidated due to
+# idle timeout.
+wait_for_slot_invalidation($primary, 'sync_slot1', $logstart,
+ $idle_timeout_1min);
5) We could set up a second streaming replication cluster, maybe with
primary2 and standby2 nodes, and stop standby2 immediately, at the same
time as the first streaming replication cluster is set up, so that both
slots go idle together:
+# Make the standby slot on the primary inactive and check for invalidation
+$standby1->stop;
+wait_for_slot_invalidation($primary, 'sb_slot1', $logstart,
+ $idle_timeout_1min);
6) We can rename primary to primary1, or standby1 to standby, to keep
the names consistent:
+# Create standby slot on the primary
+$primary->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name := 'sb_slot1', immediately_reserve := true);
+]);
+
+# Create standby
+my $standby1 = PostgreSQL::Test::Cluster->new('standby1');
+$standby1->init_from_backup($primary, $backup_name, has_streaming => 1);
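
To illustrate 1) and 3): a rough sketch of the settings that could be appended to postgresql.conf before the primary is started, so that no later ALTER SYSTEM and reload is needed. This assumes idle_replication_slot_timeout accepts a value with second granularity, which is an assumption and has not been checked against the v64 patch:

```
# Settings appended before $primary->start (sketch only)
checkpoint_timeout = 1h
# Assumption: the GUC can be set in seconds rather than whole minutes
idle_replication_slot_timeout = 70s
```

With the timeout set up front, the later sleep only needs to cover the timeout plus a small margin.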
Regards,
Vignesh