Re: Introduce XID age and inactive timeout based replication slot invalidation

From: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>
To: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Introduce XID age and inactive timeout based replication slot invalidation
Date: 2024-03-22 09:45:29
Message-ID: Zf1TOYzC9fniRft1@ip-10-97-1-34.eu-west-3.compute.internal
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On Fri, Mar 22, 2024 at 01:45:01PM +0530, Bharath Rupireddy wrote:
> On Fri, Mar 22, 2024 at 12:39 PM Bertrand Drouvot
> <bertranddrouvot(dot)pg(at)gmail(dot)com> wrote:
> >
> > > > Please find the v14-0001 patch for now.
> >
> > Thanks!
> >
> > > LGTM. Let's wait for Bertrand to see if he has more comments on 0001
> > > and then I'll push it.
> >
> > LGTM too.
>
>
> Please see the attached v14 patch set. No change in the attached
> v14-0001 from the previous patch.

Looking at v14-0002:

1 ===

@@ -691,6 +699,13 @@ ReplicationSlotRelease(void)
ConditionVariableBroadcast(&slot->active_cv);
}

+ if (slot->data.persistency == RS_PERSISTENT)
+ {
+ SpinLockAcquire(&slot->mutex);
+ slot->last_inactive_at = GetCurrentTimestamp();
+ SpinLockRelease(&slot->mutex);
+ }

I'm not sure we should do system calls while we're holding a spinlock.
Assign a variable before?

2 ===

Also, what about moving this here?

"
if (slot->data.persistency == RS_PERSISTENT)
{
/*
* Mark persistent slot inactive. We're not freeing it, just
* disconnecting, but wake up others that may be waiting for it.
*/
SpinLockAcquire(&slot->mutex);
slot->active_pid = 0;
SpinLockRelease(&slot->mutex);
ConditionVariableBroadcast(&slot->active_cv);
}
"

That would avoid testing twice "slot->data.persistency == RS_PERSISTENT".

3 ===

@@ -2341,6 +2356,7 @@ RestoreSlotFromDisk(const char *name)

slot->in_use = true;
slot->active_pid = 0;
+ slot->last_inactive_at = 0;

I think we should put GetCurrentTimestamp() here. It's done in v14-0006 but I
think it's better to do it in 0002 (and not taking care of inactive_timeout).

4 ===

Track last_inactive_at in pg_replication_slots

doc/src/sgml/system-views.sgml | 11 +++++++++++
src/backend/catalog/system_views.sql | 3 ++-
src/backend/replication/slot.c | 16 ++++++++++++++++
src/backend/replication/slotfuncs.c | 7 ++++++-
src/include/catalog/pg_proc.dat | 6 +++---
src/include/replication/slot.h | 3 +++
src/test/regress/expected/rules.out | 5 +++--
7 files changed, 44 insertions(+), 7 deletions(-)

Worth to add some tests too (or we postpone them in future commits because we're
confident enough they will follow soon)?

5 ===

Most of the fields that reflect a time (not duration) in the system views are
xxxx_time, so I'm wondering if instead of "last_inactive_at" we should use
something like "last_inactive_time"?

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bertrand Drouvot 2024-03-22 09:53:13 Re: Introduce XID age and inactive timeout based replication slot invalidation
Previous Message walther 2024-03-22 09:41:39 Re: Regression tests fail with musl libc because libpq.so can't be loaded