Re: Is the unfair lwlock behavior intended?

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, Ants Aasma <ants(dot)aasma(at)eesti(dot)ee>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, "Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Is the unfair lwlock behavior intended?
Date: 2016-05-25 18:09:43
Message-ID: CA+TgmoYuHXs5+wwgZqLbRvKwsLrf809z_o+gS6BKZP_SW0sxsw@mail.gmail.com
Lists: pgsql-hackers

On Tue, May 24, 2016 at 6:50 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> Now we potentially could mark individual lwlocks as being fair
> locks. But which ones would those be? Certainly not ProcArrayLock, it's
> way too heavily contended.

I think this is looking at the problem from the wrong angle. The OP's
complaint is pretty fair: a 30-second wait for ProcArrayLock is
horrendous, and if that's actually something that is happening with
any significant regularity on well-configured systems, we need to fix
it somehow. Whether FIFO queueing on LWLocks is the right way to fix
it is a separate question, and I agree that the answer is probably
"no" ... although it does strike me that Amit's work on group XID
clearing might make that a lot more palatable, since you won't
normally have more than one exclusive waiter in the queue at a time.
But regardless of that, I don't think we can just say "oh, well,
sometimes the system becomes totally unresponsive for more than 30
seconds, but we don't care". We have to care about that.

What I think we need to understand better from Tsunakawa-san is (1)
whether this behavior still occurs in 9.6 and (2) what sort of test
case is required to produce it. Depending on how extreme that test
case is, we can decide how much of a problem we think this is. And we
can test potential solutions to see whether and to what degree they
address it, as well as how they affect throughput.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
