| From: | Thomas Munro <thomas(dot)munro(at)gmail(dot)com> | 
|---|---|
| To: | Shawn Debnath <sdn(at)amazon(dot)com> | 
| Cc: | Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> | 
| Subject: | Re: Introduce timeout capability for ConditionVariableSleep | 
| Date: | 2019-07-13 03:02:25 | 
| Message-ID: | CA+hUKGJxFAEg5A0B55JXOdRetYbP0Ng3vFL2An6mL0i0yAYJKA@mail.gmail.com | 
| Views: | Whole Thread | Raw Message | Download mbox | Resend email | 
| Thread: | |
| Lists: | pgsql-hackers | 
On Fri, Jul 12, 2019 at 6:08 PM Shawn Debnath <sdn(at)amazon(dot)com> wrote:
> On Tue, Jul 09, 2019 at 11:03:18PM +1200, Thomas Munro wrote:
> > On Sun, Jul 7, 2019 at 3:09 PM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
> > > +               /* Timed out */
> > > +               if (rc == 0)
> > > +                       return true;
> >
> > Here's a version without that bit, because I don't think we need it.
>
> This works. Agree that letting it fall through covers the first gap.
Pushed, like that (with the now unused rc variable also removed).
Thanks for the patch!
> > > That still leaves the danger that the CV can be signalled some time
> > > after ConditionVariableTimedSleep() returns.  So now I'm wondering if
> > > ConditionVariableCancelSleep() should signal the CV if it discovers
> > > that this process is not in the proclist, on the basis that that must
> > > indicate that we've been signalled even though we're not interested in
> > > the message anymore, and yet some other process else might be
> > > interested, and that might have been the only signal that is ever
> > > going to be delivered by ConditionVariableSignal().
> >
> > And a separate patch to do that.  Thoughts?
>
> I like it. This covers the gap all the way till cancel is invoked and it
> manipulates the list to remove itself or realizes that it needs to
> forward the signal to some other process.
I pushed this too.  It's a separate commit, because I think there is
at least a theoretical argument that it should be back-patched.  I'm
not going to do that today though, because I doubt anyone is relying
on ConditionVariableSignal() working that reliably yet, and it's
really with timeouts that it becomes a likely problem.
I thought about this edge case because I have long wanted to propose a
pair of functions that provide a simplified payloadless blocking
alternative to NOTIFY, that would allow for just the right number of
waiting sessions to wake up to handle SKIP LOCKED-style job queues.
Otherwise you sometimes get thundering herds of wakeups fighting over
crumbs.  That made me think about the case where a worker session
decides to time out and shut down due to being idle for too long, but
eats a wakeup on its way out.  Another question that comes up in that
use case is whether CV wakeup queues should be LIFO or FIFO.  I think
the answer is LIFO, to support class worker pool designs that
stabilise at the right size using a simple idle timeout rule.  They're
currently FIFO (proclist_pop_head_node() to wake up, but
proclist_push_tail() to sleep).  I understand why Robert didn't care
about that last time I mentioned it: all our uses of CVs today are
"broadcast" wakeups.  But a productised version of the "poke" hack I
showed earlier that supports poking just one waiter would care about
the thing this patch fixed, and also the wakeup queue order.
-- 
Thomas Munro
https://enterprisedb.com
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Thomas Munro | 2019-07-13 04:54:06 | Re: Tab completion for CREATE TYPE | 
| Previous Message | Tomas Vondra | 2019-07-13 01:58:05 | Re: Check-out mutable functions in check constraints |