Re: spinlocks on powerpc

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Jeremy Harris <jgh(at)wizmail(dot)org>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: spinlocks on powerpc
Date: 2012-01-03 20:18:41
Message-ID: CA+TgmoYJhfyRR28ASt-FgB7ubDrdWmZ+m722XcDYdv4mEwUEBQ@mail.gmail.com
Lists: pgsql-hackers

On Tue, Jan 3, 2012 at 3:05 PM, Jeremy Harris <jgh(at)wizmail(dot)org> wrote:
> On 2012-01-03 04:44, Robert Haas wrote:
>> On read-only workloads, you get spinlock contention, because everyone
>> who wants a snapshot has to take the LWLock mutex to increment the
>> shared lock count and again (just a moment later) to decrement it.
>
> Does the LWLock protect anything but the shared lock count?  If not
> then the usually quickest manipulation is along the lines of:
>
> loop: lwarx   r5,0,r3   # load and reserve
>       add     r0,r4,r5  # increment word
>       stwcx.  r0,0,r3   # store new value if still reserved
>       bne-    loop      # loop if lost reservation
>
> (per IBM's software ref manual,
>  https://www-01.ibm.com/chips/techlib/techlib.nsf/techdocs/852569B20050FF778525699600719DF2
> )
>
> The same sort of thing generally holds on other instruction-sets also.

Sure, but the actual critical section is not that simple. You might
look at the code for LWLockAcquire() if you're curious.
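
To make that concrete, here's a much-simplified sketch -- emphatically
not the real code, and SketchProc is just an invented stand-in for a
PGPROC -- of why a shared acquire can't be a lone atomic add: under the
mutex we have to check for an exclusive holder, and on conflict link
ourselves onto the wait queue, which a lwarx/stwcx. loop alone has no
way to do:

#include <stdbool.h>
#include "storage/spin.h"       /* slock_t, SpinLockAcquire/Release */

typedef struct SketchProc
{
    struct SketchProc *next;    /* wait-queue link */
    /* ... semaphore to sleep on, etc. ... */
} SketchProc;

typedef struct SketchLWLock
{
    slock_t     mutex;          /* protects everything below */
    bool        exclusive;      /* held in exclusive mode? */
    int         shared_count;   /* number of shared holders */
    SketchProc *head;           /* head of wait queue */
} SketchLWLock;

static bool
sketch_acquire_shared(SketchLWLock *lock, SketchProc *self)
{
    bool        acquired;

    SpinLockAcquire(&lock->mutex);
    if (!lock->exclusive)
    {
        lock->shared_count++;   /* fast path: join the shared holders */
        acquired = true;
    }
    else
    {
        /* conflict: queue ourselves; caller then sleeps on semaphore */
        self->next = lock->head;
        lock->head = self;
        acquired = false;
    }
    SpinLockRelease(&lock->mutex);
    return acquired;
}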

> Also, heavy-contention locks should be placed in cache lines away from other
> data (to avoid thrashing the data cache lines when processors are fighting
> over the lock cache lines).

Yep. That's a possible problem, and it's been discussed before, but I
don't think we have any firm evidence of how much of a problem it
actually is, or how much padding would help. The heavily contended
LWLocks are mostly non-consecutive, except perhaps for the buffer
mapping locks.
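
If someone wanted to test the padding theory, the sketch would be
something like this -- the 128-byte line size is an assumption about
POWER (x86 would want 64), and SketchLWLock is the invented struct from
the sketch above, not anything in the tree:

#define SKETCH_CACHE_LINE_SIZE 128     /* assumed; POWER7 line size */

typedef union SketchPaddedLWLock
{
    SketchLWLock lock;
    char         pad[SKETCH_CACHE_LINE_SIZE];
} SketchPaddedLWLock;

/* Allocate the array aligned, so each lock owns a whole line: */
static SketchPaddedLWLock sketch_locks[64]
    __attribute__((aligned(SKETCH_CACHE_LINE_SIZE)));

A benchmark comparing padded and unpadded arrays under heavy
shared-acquire traffic would tell us whether false sharing is actually
costing us anything.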

It's been suggested to me that we should replace our existing LWLock
implementation with a CAS-based one that crams all the relevant state
into a single 8-byte word. The pointer to the head
of the wait queue, for example, could be stored as an offset into the
allProcs array rather than a pointer value, which would allow it to be
stored in 24 bits rather than 8 bytes. But there's not quite enough
bit space to make it work without making compromises -- most likely,
reducing the theoretical upper limit on MaxBackends from 2^24 to, say,
2^22. Even if we were willing to do that, the performance benefits of
using atomics here are so far unproven... which doesn't mean they
don't exist, but someone's going to have to do some work to show that
they do.
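
Just to make the bit-packing idea concrete, a purely speculative layout
-- field widths illustrative only, and using C11 atomics for brevity
rather than anything we actually have in the tree -- might look like:

#include <stdint.h>
#include <stdbool.h>
#include <stdatomic.h>

/* Bits  0..23: wait-queue head, as an index into allProcs
 * Bits 24..47: shared-holder count
 * Bit  48:     exclusive flag
 * A real layout would have to shave bits somewhere, e.g. a 22-bit
 * procno, which is where the MaxBackends compromise comes from. */
#define SKETCH_NO_WAITER 0xFFFFFF       /* sentinel: empty wait queue */
#define WAIT_HEAD(w)    ((uint32_t) ((w) & 0xFFFFFF))
#define EXCLUSIVE(w)    (((w) >> 48) & 1)

typedef struct SketchAtomicLWLock
{
    _Atomic uint64_t word;      /* initialized with head = NO_WAITER */
} SketchAtomicLWLock;

static bool
sketch_try_shared(SketchAtomicLWLock *lock)
{
    uint64_t    old = atomic_load(&lock->word);

    /* Fast path only: no exclusive holder and nobody queued.
     * Anything else falls back to a queueing slow path. */
    while (!EXCLUSIVE(old) && WAIT_HEAD(old) == SKETCH_NO_WAITER)
    {
        uint64_t    new = old + ((uint64_t) 1 << 24);   /* bump count */

        if (atomic_compare_exchange_weak(&lock->word, &old, new))
            return true;
        /* CAS failed; 'old' was reloaded with the current value */
    }
    return false;
}

The attraction is that the uncontended path becomes one CAS instead of
a spinlock acquire/release pair; the hard part is making the slow path
correct and fast, which is exactly the unproven bit.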

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
