From: | Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com> |
---|---|
To: | Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com> |
Cc: | shveta malik <shveta(dot)malik(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Introduce XID age and inactive timeout based replication slot invalidation |
Date: | 2024-03-26 17:52:12 |
Message-ID: | ZgMLTBD3xJIP9W93@ip-10-97-1-34.eu-west-3.compute.internal |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
On Tue, Mar 26, 2024 at 09:59:23PM +0530, Bharath Rupireddy wrote:
> On Tue, Mar 26, 2024 at 4:35 PM Bharath Rupireddy
> <bharath(dot)rupireddyforpostgres(at)gmail(dot)com> wrote:
> >
> > If we just sync inactive_since value for synced slots while in
> > recovery from the primary, so be it. Why do we need to update it to
> > the current time when the slot is being created? We don't expose slot
> > creation time, no? Aren't we fine if we just sync the value from
> > primary and document that fact? After the promotion, we can reset it
> > to the current time so that it gets its own time.
>
> I'm attaching v24 patches. It implements the above idea proposed
> upthread for synced slots. I've now separated
> s/last_inactive_time/inactive_since and synced slots behaviour. Please
> have a look.
Thanks!
==== v24-0001
It's now pure mechanical changes and it looks good to me.
==== v24-0002
1 ===
This commit does two things:
1) Updates inactive_since for sync slots with the value
received from the primary's slot.
Tested it and it does that.
2 ===
2) Ensures the value is set to current timestamp during the
shutdown of slot sync machinery to help correctly interpret the
time if the standby gets promoted without a restart.
Tested it and it does that.
3 ===
+/*
+ * Reset the synced slots info such as inactive_since after shutting
+ * down the slot sync machinery.
+ */
+static void
+update_synced_slots_inactive_time(void)
Looks like the comment "reset" is not matching the name of the function and
what it does.
4 ===
+ /*
+ * We get the current time beforehand and only once to avoid
+ * system calls overhead while holding the lock.
+ */
+ if (now == 0)
+ now = GetCurrentTimestamp();
Also +1 of having GetCurrentTimestamp() just called one time within the loop.
5 ===
- if (!(RecoveryInProgress() && slot->data.synced))
+ if (!(InRecovery && slot->data.synced))
slot->inactive_since = GetCurrentTimestamp();
else
slot->inactive_since = 0;
Not related to this change but more the way RestoreSlotFromDisk() behaves here:
For a sync slot on standby it will be set to zero and then later will be
synchronized with the one coming from the primary. I think that's fine to have
it to zero for this window of time.
Now, if the standby is down and one sets sync_replication_slots to off,
then inactive_since will be set to zero on the standby at startup and not
synchronized (unless one triggers a manual sync). I also think that's fine but
it might be worth to document this behavior (that after a standby startup
inactive_since is zero until the next sync...).
6 ===
+ print "HI $slot_name $name $inactive_since $slot_creation_time\n";
garbage?
7 ===
+# Capture and validate inactive_since of a given slot.
+sub capture_and_validate_slot_inactive_since
+{
+ my ($node, $slot_name, $slot_creation_time) = @_;
+ my $name = $node->name;
We know have capture_and_validate_slot_inactive_since at 2 places:
040_standby_failover_slots_sync.pl and 019_replslot_limit.pl.
Worth to create a sub in Cluster.pm?
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
From | Date | Subject | |
---|---|---|---|
Next Message | Peter Geoghegan | 2024-03-26 18:01:58 | Re: Optimizing nbtree ScalarArrayOp execution, allowing multi-column ordered scans, skip scan |
Previous Message | Andrey M. Borodin | 2024-03-26 17:26:14 | Re: UUID v7 |