Re: Introduce XID age and inactive timeout based replication slot invalidation

From: Peter Smith <smithpb2250(at)gmail(dot)com>
To: Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>
Cc: vignesh C <vignesh21(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Introduce XID age and inactive timeout based replication slot invalidation
Date: 2025-02-05 07:28:09
Message-ID: CAHut+PtCpOnifF9wnhJ=jo7KLmtT=MikuYnM9GGPTVA80rq7OA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Nisha,

Some review comments for the patch v69-0002.

======
src/backend/replication/slot.c

1.
+#ifdef USE_INJECTION_POINTS
+
+ /*
+ * To test idle timeout slot invalidation, if the
+ * slot-time-out-inval injection point is attached,
+ * immediately invalidate the slot.
+ */
+ if (IS_INJECTION_POINT_ATTACHED("slot-time-out-inval"))
+ {
+ invalidation_cause = cause;
+ inactive_since = s->inactive_since = now;
+ break;
+ }
+#endif

1a.
I didn't understand the reason for the assignment ' = now' here. This
is not happening in the normal code path so why do you need to do this
in this test code path? It works for me without doing this.

~

1b.
For testing, I think we should try to keep the injection code
differences minimal -- e.g. share the same (normal build) code as much
as possible. For example, I suggest refactoring like below. Well, it
works for me.

/*
* Check if the slot needs to be invalidated due to
* idle_replication_slot_timeout GUC.
*
* To test idle timeout slot invalidation, if the
* "slot-time-out-inval" injection point is attached,
* immediately invalidate the slot.
*/
if (
#ifdef USE_INJECTION_POINTS
IS_INJECTION_POINT_ATTACHED("slot-time-out-inval") ||
#endif
TimestampDifferenceExceedsSeconds(s->inactive_since, now,
idle_replication_slot_timeout_mins * SECS_PER_MINUTE))
{
invalidation_cause = cause;
inactive_since = s->inactive_since;
}

~

1c.
Can we call the injection point "timeout" instead of "time-out"?

======
.../t/044_invalidate_inactive_slots.pl

2.
+if ($ENV{enable_injection_points} ne 'yes')
+{
+ plan skip_all => 'Injection points not supported by this build';
+}

At first, I had no idea how to build for this test. It would be good
to include a link to the injection build instructions in a comment
somewhere near here.

~~~

3.
+# Wait for slot to first become idle and then get invalidated
+sub wait_for_slot_invalidation
+{
+ my ($node, $slot, $offset) = @_;
+ my $node_name = $node->name;
+

Might be better to call the variable $slot_name instead of $slot.

Also then it will be consistent with $node_name

~~~

4.
+# Check if the extension injection_points is available, as it may be
+# possible that this script is run with installcheck, where the module
+# would not be installed by default.

I misread this comment at first -- maybe it is clearer to reverse the wording?

/extension injection_points/’injection_points’ extension/

~~~

5.
+# Run a checkpoint which will invalidate the slots
+$node->safe_psql('postgres', "CHECKPOINT");

The explanation seems a bit terse -- I think the comment should
elaborate a bit more to explain that CHECKPOINT is just where the idle
slot timeout is checked, but since the test is using injection point
and the injection code enforces immediate idle timeout THAT is why it
will invalidate the slots...

~~~

6.
+# Wait for slots to become inactive. Note that nobody has acquired the slot
+# yet, so it must get invalidated due to idle timeout.

IIUC this comment means:

SUGGESTION
Note that since nobody has acquired the slot yet, then if it has been
invalidated that can only be due to the idle timeout mechanism.

======
Kind Regards,
Peter Smith.
Fujitsu Australia

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Sutou Kouhei 2025-02-05 07:31:56 Re: Make COPY format extendable: Extract COPY TO format implementations
Previous Message Alexandra Wang 2025-02-05 07:20:30 Re: SQL:2023 JSON simplified accessor support