Re: better atomics - v0.5

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Geoghegan <pg(at)heroku(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Ants Aasma <ants(at)cybertec(dot)at>
Subject: Re: better atomics - v0.5
Date: 2014-06-30 16:48:48
Message-ID: 20140630164848.GQ26930@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2014-06-30 12:15:06 -0400, Robert Haas wrote:
> More broadly, it doesn't seem consistent. I think that other projects
> also sometimes write code that acquires a spinlock while holding
> another spinlock, and we don't do that; in fact, we've elevated that
> to a principle, regardless of whether it performs well in practice.

It's actually more than a principle: It's now required for correctness,
because otherwise the semaphore based spinlock implementation will
deadlock otherwise.

> In some cases, the CPU instruction that we issue to acquire that
> spinlock might be the exact same instruction we'd issue to manipulate
> an atomic variable. I don't see how it can be right to say that a
> spinlock-protected critical section is allowed to manipulate an atomic
> variable with cmpxchg, but not allowed to acquire another spinlock
> with cmpxchg. Either acquiring the second spinlock is OK, in which
> case our current coding rule is wrong, or acquiring the atomic
> variable is not OK, either. I don't see how we can have that both
> ways.

The no nested spinlock thing used to be a guideline. One nobody had big
problems with because it didn't prohibit solutions to problems. Since
guidelines are more useful when simple it's "no nested
spinlocks!".

Also those two cmpxchg's aren't the same:

The CAS for spinlock acquiration will only succeed if the lock isn't
acquired. If the lock holder sleeps you'll have to wait for it to wake
up.

In contrast to that CASs for lockfree (or lock-reduced) algorithms won't
be blocked if the last manipulation was done by a backend that's now
sleeping. It'll possibly loop once, get the new value, and retry the
CAS. Yes, it can still take couple of iterations under heavy
concurrency, but you don't have the problem that the lockholder goes to
sleep.

Now, you can argue that the spinlock based fallbacks make that
difference moot, but I think that's just the price for an emulation
fringe platforms will have to pay. They aren't platforms used under
heavy concurrency.

> >> What I'm basically afraid of is that this will work fine in many cases
> >> but have really ugly failure modes. That's pretty much what happens
> >> with spinlocks already - the overhead is insignificant at low levels
> >> of contention but as the spinlock starts to become contended the CPUs
> >> all fight over the cache line and performance goes to pot. ISTM that
> >> making the spinlock critical section significantly longer by putting
> >> atomic ops in there could appear to win in some cases while being very
> >> bad in others.
> >
> > Well, I'm not saying it's something I suggest doing all the time. But if
> > using an atomic op in the slow path allows you to remove the spinlock
> > from 99% of the cases I don't see it having a significant impact.
> > In most scenarios (where atomics aren't emulated, i.e. platforms we
> > expect to used in heavily concurrent cases) the spinlock and the atomic
> > will be on the same cacheline making stalls much less likely.
>
> And this again is my point: why can't we make the same argument about
> two spinlocks situated on the same cache line?

Because it's not relevant? Where and why do we want to acquire two
spinlocks that are on the same cacheline?
As far as I know there's never been an actual need for nested
spinlocks. So the guideline can be as restrictive as is it is because
the price is low and there's no benefit in making it more complex.

I think this line of discussion is pretty backwards. The reason I (and
seemingly Heikki) want to use atomics is that it can reduce contention
significantly. That contention is today too a good part based on
spinlocks. Your argument is basically that increasing spinlock
contention by doing things that can take more than a fixed cycles can
increase contention. But the changes we've talked about only make sense
if they significantly reduce spinlock contention in the first place - a
potential for a couple iterations while holding better not be relevant
for proposed patches.

I don't think a blanket rule makes sense here.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2014-06-30 16:54:10 Re: better atomics - v0.5
Previous Message Robert Haas 2014-06-30 16:46:29 Re: Spinlocks and compiler/memory barriers