From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | pgsql-committers(at)postgresql(dot)org |
Subject: | pgsql: Rewrite ConditionVariableBroadcast() to avoid live-lock. |
Date: | 2018-01-06 00:21:40 |
Message-ID: | E1eXcF2-0003W9-Ia@gemulon.postgresql.org |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-committers |
Rewrite ConditionVariableBroadcast() to avoid live-lock.
The original implementation of ConditionVariableBroadcast was, per its
self-description, "the dumbest way possible". Thomas Munro found out
it was a bit too dumb. An awakened process may immediately re-queue
itself, if the specific condition it's waiting for is not yet satisfied.
If this happens before ConditionVariableBroadcast is able to see the wait
queue as empty, then ConditionVariableBroadcast will re-awaken the same
process, repeating the cycle. Given unlucky timing this back-and-forth
can repeat indefinitely; loops lasting thousands of seconds have been
seen in testing.
To fix, add our own process to the end of the wait queue to serve as a
sentinel, and exit the broadcast loop once our process is not there
anymore. There are various special considerations described in the
comments, the principal disadvantage being that wakers can no longer
be sure whether they awakened a real waiter or just a sentinel. But in
practice nobody pays attention to the result of ConditionVariableSignal
or ConditionVariableBroadcast anyway, so that problem seems hypothetical.
Back-patch to v10 where condition_variable.c was introduced.
Tom Lane and Thomas Munro
Discussion: https://postgr.es/m/CAEepm=0NWKehYw7NDoUSf8juuKOPRnCyY3vuaSvhrEWsOTAa3w@mail.gmail.com
Branch
------
REL_10_STABLE
Details
-------
https://git.postgresql.org/pg/commitdiff/1c77e990833a72039a71ca3f813ff6d05a4d09b9
Modified Files
--------------
src/backend/storage/lmgr/condition_variable.c | 82 +++++++++++++++++++++++++--
1 file changed, 77 insertions(+), 5 deletions(-)
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2018-01-06 00:42:57 | pgsql: Reorder steps in ConditionVariablePrepareToSleep for more safety |
Previous Message | Michael Paquier | 2018-01-06 00:10:51 | Re: pgsql: Implement channel binding tls-server-end-point for SCRAM |