Re: Introduce XID age and inactive timeout based replication slot invalidation

From: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>
To: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
Cc: shveta malik <shveta(dot)malik(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Introduce XID age and inactive timeout based replication slot invalidation
Date: 2024-04-03 06:50:19
Message-ID: Zgz8K/dD/Ptd+7p9@ip-10-97-1-34.eu-west-3.compute.internal
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On Wed, Apr 03, 2024 at 11:17:41AM +0530, Bharath Rupireddy wrote:
> On Wed, Apr 3, 2024 at 8:38 AM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
> >
> > > > Or a simple solution is that the slotsync worker updates
> > > > inactive_since as it does for non-synced slots, and disables
> > > > timeout-based slot invalidation for synced slots.
> >
> > I like this idea better, it takes care of such a case too when the
> > user is relying on sync-function rather than worker and does not want
> > to get the slots invalidated in between 2 sync function calls.
>
> Please find the attached v31 patches implementing the above idea:

Thanks!

Some comments related to v31-0001:

=== testing the behavior

T1 ===

> - synced slots get their on inactive_since just like any other slot

It behaves as described.

T2 ===

> - synced slots inactive_since is set to current timestamp after the
> standby gets promoted to help inactive_since interpret correctly just
> like any other slot.

It behaves as described.

CR1 ===

+ <structfield>inactive_since</structfield> value will get updated
+ after every synchronization

indicates the last synchronization time? (I think that after every synchronization
could lead to confusion).

CR2 ===

+ /*
+ * Set the time since the slot has become inactive after shutting
+ * down slot sync machinery. This helps correctly interpret the
+ * time if the standby gets promoted without a restart.
+ */

It looks to me that this comment is not at the right place because there is
nothing after the comment that indicates that we shutdown the "slot sync machinery".

Maybe a better place is before the function definition and mention that this is
currently called when we shutdown the "slot sync machinery"?

CR3 ===

+ * We get the current time beforehand and only once to avoid
+ * system calls overhead while holding the lock.

s/avoid system calls overhead while holding the lock/avoid system calls while holding the spinlock/?

CR4 ===

+ * Set the time since the slot has become inactive. We get the current
+ * time beforehand to avoid system call overhead while holding the lock

Same.

CR5 ===

+ # Check that the captured time is sane
+ if (defined $reference_time)
+ {

s/Check that the captured time is sane/Check that the inactive_since is sane/?

Sorry if some of those comments could have been done while I did review v29-0001.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2024-04-03 06:58:41 Re: [HACKERS] make async slave to wait for lsn to be replayed
Previous Message Michał Kłeczek 2024-04-03 06:43:17 Is it safe to cache data by GiST consistent function