From: | Mikko Tiihonen <mikko(dot)tiihonen(at)nitorcreations(dot)com> |
---|---|
To: | Merlin Moncure <mmoncure(at)gmail(dot)com> |
Cc: | Peter Geoghegan <peter(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Use gcc built-in atomic inc/dec in lock.c |
Date: | 2012-12-14 16:29:58 |
Message-ID: | 50CB5406.9090606@nitorcreations.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 12/14/2012 05:55 PM, Merlin Moncure wrote:
> On Fri, Dec 14, 2012 at 9:33 AM, Mikko Tiihonen
> <mikko(dot)tiihonen(at)nitorcreations(dot)com> wrote:
>> On 12/13/2012 12:19 AM, Peter Geoghegan wrote:
>>>
>>> On 12 December 2012 22:11, Mikko Tiihonen
>>> <mikko(dot)tiihonen(at)nitorcreations(dot)com> wrote:
>>>>
>>>> noticed a "XXX: It might be worth considering using an atomic
>>>> fetch-and-add
>>>> instruction here, on architectures where that is supported." in lock.c
>>>>
>>>> Here is my first try at using it.
>>>
>>>
>>> That's interesting, but I have to wonder if there is any evidence that
>>> this *is* actually helpful to performance.
>>
>>
>> One of my open questions listed in the original email was request for help
>> on
>> creating a test case that exercise the code path enough so that it any
>> improvements can be measured.
>>
>> But apart from performance I think there are two other aspects to consider:
>> 1) Code clarity: I think the lock.c code is easier to understand after the
>> patch
>> 2) Future possibilities: having the atomic_inc/dec generally available
>> allows
>> other performance critical parts of postgres take advantage of them in
>> the
>> future
>
> This was actually attempted a little while back; a spinlock was
> replaced with a few atomic increment and decrement calls for managing
> the refcount and other things on the freelist. It helped or hurt
> depending on contention but the net effect was negative. On
> reflection I think that was because that the assembly 'lock'
> instructions are really expensive relative to the others: so it's not
> safe to assume that say 2-3 gcc primitive increment calls are cheaper
> that a spinlock.
The spinlock uses one 'lock' instruction when taken, and no lock
instructions when released.
Thus I think replacing one spinlock protected add/sub with atomic 'lock'
add/sub not perform worse.
But if you replace "mutex-lock,add,add,unlock" with "atomic add, atomic
add" you already have more hw level synchronization and thus risk lower
performance if they are on separate cache lines. This of course limits
the use cases of the atomic operations.
-Mikko
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2012-12-14 16:33:13 | Re: Assert for frontend programs? |
Previous Message | Heikki Linnakangas | 2012-12-14 16:20:03 | Re: Assert for frontend programs? |