Re: Add missing ConditionVariableCancelSleep() in slot.c

From: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
To: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Add missing ConditionVariableCancelSleep() in slot.c
Date: 2024-04-13 11:07:14
Message-ID: 202404131107.x7lrxfrigttp@alvherre.pgsql
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2024-Apr-13, Bharath Rupireddy wrote:

> It looks like there's missing ConditionVariableCancelSleep() in
> InvalidatePossiblyObsoleteSlot() after waiting for the replication
> slot to be released by another process. Although prepare to sleep
> cancels the sleep if the previous one wasn't canceled, the calling
> process still remains in the CV's wait queue for the last cycle after
> the replication slot holding process releases the slot. I'm wondering
> why there's no issue detected if the calling process isn't removed
> from the CV's wait queue. It may be due to the cancel sleep call in
> the sigsetjmp() path for almost every process. IMO, it's a clean
> practice to cancel the sleep immediately after the sleep ends like
> elsewhere.

Hmm, but shouldn't we cancel the sleep after we have completed sleeping
altogether, that is, until we've determined that we're no longer to
sleep waiting for this slot? That would suggest to put the call to
cancel sleep after the for(;;) loop is complete, rather than immediately
after sleeping. No?

diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index cebf44bb0f..59e9bef642 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -1756,6 +1756,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
}
}

+ ConditionVariableCancelSleep();
+
Assert(released_lock == !LWLockHeldByMe(ReplicationSlotControlLock));

return released_lock;

However, I noticed that ConditionVariablePrepareToSleep() does a
CancelSleep upon being called ... so what I suggest would not have any
effect whatsoever, because the sleep would be cancelled next time
through the loop anyway. But shouldn't we also modify PrepareToSleep to
exit without doing anything if our current sleep CV is the same one
being newly installed?

diff --git a/src/backend/storage/lmgr/condition_variable.c b/src/backend/storage/lmgr/condition_variable.c
index 112a518bae..65811ff989 100644
--- a/src/backend/storage/lmgr/condition_variable.c
+++ b/src/backend/storage/lmgr/condition_variable.c
@@ -57,6 +57,14 @@ ConditionVariablePrepareToSleep(ConditionVariable *cv)
{
int pgprocno = MyProcNumber;

+ /*
+ * If we're preparing to sleep on the same CV we were already going to
+ * sleep on, we don't actually need to do anything. This may seem like
+ * supporting sloppy coding, which is what it actually does, so ¯\_(ツ)_/¯
+ */
+ if (cv_sleep_target == cv)
+ return;
+
/*
* If some other sleep is already prepared, cancel it; this is necessary
* because we have just one static variable tracking the prepared sleep,

Alternatively, maybe we need to not prepare-to-sleep in
InvalidatePossiblyObsoleteSlot() if we have already prepared to sleep in
a previous iteration through the loop (and of course, don't cancel the
sleep until we're out of the loop).

--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
Tom: There seems to be something broken here.
Teodor: I'm in sackcloth and ashes... Fixed.
http://postgr.es/m/482D1632.8010507@sigaev.ru

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robins Tharakan 2024-04-13 13:02:52 Re: Why is parula failing?
Previous Message Tomas Vondra 2024-04-13 11:02:03 Re: post-freeze damage control