Re: Move PinBuffer and UnpinBuffer to atomics

From: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: YUriy Zhuravlev <u(dot)zhuravlev(at)postgrespro(dot)ru>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Move PinBuffer and UnpinBuffer to atomics
Date: 2015-12-08 10:26:38
Message-ID: CAPpHfdsytkTFMy3N-zfSo+kAuUx=u-7JG6q2bYB6Fpuw2cD5DQ@mail.gmail.com
Lists: pgsql-hackers

On Tue, Dec 8, 2015 at 1:04 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:

> On 2015-12-08 12:53:49 +0300, Alexander Korotkov wrote:
> > This is why atomic increment *could be* cheaper than a loop over CAS,
> > and it is worth experimenting. Another idea is that we can put
> > arbitrary logic between lwarx and stwcx. Thus, we can implement
> > PinBuffer using a single loop of lwarx and stwcx, which could be
> > better than a loop of CAS.
>
> You can't really put that much between an ll/sc - the hardware is only
> able to track a very limited number of cacheline references.
>

I have some doubts about this, but I didn't find the place where it's
explicitly documented. In the case of LWLockAttemptLock there is not very
much between lwarx and stwcx: 4 instructions, while CAS has 2 instructions.
Could you please share a link to the docs, if any?
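
To make the comparison concrete, here is a standalone sketch of the two
approaches (plain C11 atomics rather than the pg_atomic_* wrappers; the
constant and function names are invented for illustration). On POWER both
variants typically compile down to an lwarx/stwcx loop, but the fetch-add
variant keeps only the addition between the reservation and the
store-conditional:

#include <stdatomic.h>
#include <stdint.h>

#define BUF_REFCOUNT_ONE 1      /* assumed: refcount occupies the low bits */

/* CAS-loop style: recompute the desired value and retry until it sticks. */
static void
pin_with_cas_loop(_Atomic uint32_t *state)
{
    uint32_t old = atomic_load(state);

    for (;;)
    {
        uint32_t desired = old + BUF_REFCOUNT_ONE;

        if (atomic_compare_exchange_weak(state, &old, desired))
            break;              /* on failure 'old' is refreshed; retry */
    }
}

/* Increment style: one unconditional atomic add, no retry loop in C code. */
static void
pin_with_fetch_add(_Atomic uint32_t *state)
{
    (void) atomic_fetch_add(state, BUF_REFCOUNT_ONE);
}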

> > 3) lwlock-increment.patch – LWLockAttemptLock changes state using an
> > atomic increment operation instead of a loop of CAS. This patch does it
> > for LWLockAttemptLock like pinunpin-increment.patch does for PinBuffer.
> > Actually, this patch is not directly related to the buffer manager.
> > However, it's nice to test a loop of CAS vs. atomic increment in
> > different places.
>
> Yea, that's a worthwhile improvement. Actually it's how the first
> versions of the lwlock patches worked - unfortunately I couldn't see big
> differences on hardware I had available at the time.
>
> There's some more trickiness required than what you have in your patch
> (afaics at least). The problem is that when you 'optimistically'
> increment by LW_VAL_SHARED and notice that there actually was another
> locker, you possibly, until you've 'fixed' the state, are blocking new
> exclusive lockers from acquiring the locks. So you additionally need to
> do special handling in these cases, and check the queue more.
>

Agreed. This patch needs to be carefully verified. The current experiments
just show that it is a promising direction for improvement. I'll come up
with a better version of this patch.
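
To make sure I understand the case you describe, here is a simplified model
of the optimistic shared-lock fast path and the fixup it needs (again plain
C11 atomics; the constants only mirror the spirit of LW_VAL_SHARED and
LW_VAL_EXCLUSIVE, not the real layout in lwlock.c):

#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define VAL_EXCLUSIVE   ((uint32_t) 1 << 24)    /* assumed flag bit */
#define VAL_SHARED      ((uint32_t) 1)          /* assumed: one shared holder */

/*
 * Take the lock in shared mode with a single optimistic atomic add.
 * Returns true on success.
 */
static bool
shared_lock_optimistic(_Atomic uint32_t *state)
{
    uint32_t old = atomic_fetch_add(state, VAL_SHARED);

    if ((old & VAL_EXCLUSIVE) == 0)
        return true;            /* no exclusive holder: we hold the lock */

    /*
     * An exclusive lock is held.  Until we take our increment back out,
     * the state looks share-locked and can make a concurrent exclusive
     * attempt fail spuriously -- that is the extra case to handle.
     */
    atomic_fetch_sub(state, VAL_SHARED);

    /*
     * After the fixup the caller must also re-examine the wait queue
     * rather than simply sleeping, since waiters may have acted on the
     * transient state; elided here.
     */
    return false;
}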

Also, after testing on large machines I have another observation to share.
Currently, LWLock doesn't guarantee that an exclusive lock will ever be
acquired (even assuming each shared lock is held for a finite time). That is
because when no exclusive lock is held, new shared lockers aren't queued and
the LWLock state is changed directly. Thus, a process trying to acquire an
exclusive lock has to wait for a gap in the shared locks, but with high
concurrency on shared locks such a gap may occur very rarely, or never.

We did see this in practice on a big Intel machine. pgbench -S takes shared
ProcArrayLock very frequently. Once a certain number of connections is
reached, new connections hang trying to acquire exclusive ProcArrayLock. I
think we could work around this problem. For instance, once an exclusive
lock waiter has waited past some timeout, it could set a special bit which
prevents others from taking new shared locks.
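
Just to make that idea concrete, a hypothetical sketch of such a mechanism
could look like the following (none of these names or bits exist in
lwlock.c today; this is only an illustration of the scheme):

#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define VAL_SHARED          ((uint32_t) 1)
#define FLAG_EXCL_STARVING  ((uint32_t) 1 << 30)    /* assumed flag bit */

/* Called by an exclusive waiter once its patience timeout expires. */
static void
mark_exclusive_starving(_Atomic uint32_t *state)
{
    atomic_fetch_or(state, FLAG_EXCL_STARVING);
}

/*
 * Shared fast path: back off to the slow path (wait queue) while an
 * exclusive waiter is starving.  The check/add race is ignored here; a
 * real implementation would have to recheck after the add, as in the
 * previous sketch.
 */
static bool
shared_lock_fast_path(_Atomic uint32_t *state)
{
    if (atomic_load(state) & FLAG_EXCL_STARVING)
        return false;

    atomic_fetch_add(state, VAL_SHARED);
    return true;
}

/* Cleared by the exclusive locker once it finally acquires the lock. */
static void
clear_exclusive_starving(_Atomic uint32_t *state)
{
    atomic_fetch_and(state, ~FLAG_EXCL_STARVING);
}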

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
