From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com> |
Cc: | "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: condition variables |
Date: | 2016-08-15 17:05:01 |
Message-ID: | CA+TgmobhMRTJz8bSGxTnkEX87DhxbZZghjxF7ESO5E930Z2b1g@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Mon, Aug 15, 2016 at 1:58 AM, Thomas Munro
<thomas(dot)munro(at)enterprisedb(dot)com> wrote:
> On Sun, Aug 14, 2016 at 9:04 AM, Thomas Munro
> <thomas(dot)munro(at)enterprisedb(dot)com> wrote:
>> On Fri, Aug 12, 2016 at 9:47 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>>> [condition-variable-v1.patch]
>>
>> Don't you need to set proc->cvSleeping = false in ConditionVariableSignal?
>
> I poked at this a bit... OK, a lot... and have some feedback:
>
> 1. As above, we need to clear cvSleeping before setting the latch.
Right, OK.
> 2. The following schedule corrupts the waitlist by trying to delete
> something from it that isn't in it:
>
> P1: ConditionVariablePrepareToSleep: push self onto waitlist
> P2: ConditionVariableSignal: pop P1 from waitlist
> P1: <check user condition, decide to break without ever sleeping>
> P3: ConditionVariablePrepareToSleep: push self onto waitlist
> P1: ConditionVariableCancelSleep: delete self from waitlist (!)
>
> Without P3 coming along you probably wouldn't notice because the
> waitlist will be happily cleared and look valid, but P3's entry gets
> lost and then it hangs forever.
>
> One solution is to teach ConditionVariableCancelSleep to check if
> we're still actually there first. That can be done in O(1) time by
> clearing nodes' next and prev pointers when deleting, so that we can
> rely on that in a new proclist_contains() function. See attached.
How about instead using cvSleeping to test this? Suppose we make a
rule that cvSleeping can be changed from false to true only by the
process whose PGPROC it is, and thus no lock is needed, but changing
it from true to false always requires the spinlock.
> 3. The following schedule corrupts the waitlist by trying to insert
> something into it that is already in it:
>
> P1: ConditionVariablePrepareToSleep: push self onto waitlist
> P1: <check user condition, decide to sleep>
> P1: ConditionVariableSleep
> P1: ConditionVariablePrepareToSleep: push self onto waitlist (!)
>
> Nodes before and after self's pre-existing position can be forgotten
> when self's node is pushed to the front of the list. That can be
> fixed by making ConditionVariablePrepareToSleep also check if we're
> already in the list.
OK.
> 4. The waitlist is handled LIFO-ly. Would it be better for the first
> guy to start waiting to be woken up first, like we do for locks? The
> Pthreads equivalent says that it depends on "scheduling policy". I
> don't know if it matters much, just an observation.
I don't know whether this matters. It's possible that FIFO is a
better policy; I don't really care.
> 5. The new proclist function you added is the first to work in terms
> of PGPROC* rather than procno. Maybe the whole interface should work
> with either PGPROC pointers or procnos? No strong view.
Hmm, maybe so. But wouldn't any caller translate to a PGPROC * straight off?
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
From | Date | Subject | |
---|---|---|---|
Next Message | Shay Rojansky | 2016-08-15 17:08:06 | Re: Slowness of extended protocol |
Previous Message | David Steele | 2016-08-15 16:50:53 | Re: patch proposal |