From: | Andres Freund <andres(at)2ndquadrant(dot)com> |
---|---|
To: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
Cc: | Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Geoghegan <pg(at)heroku(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Ants Aasma <ants(at)cybertec(dot)at> |
Subject: | Re: better atomics - v0.5 |
Date: | 2014-07-13 19:17:38 |
Message-ID: | 20140713191738.GI1136@alap3.anarazel.de |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi Amit,
On 2014-07-08 11:51:14 +0530, Amit Kapila wrote:
> On Thu, Jul 3, 2014 at 10:56 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
> wrote:
> > On Tue, Jul 1, 2014 at 4:10 PM, Andres Freund <andres(at)2ndquadrant(dot)com>
> wrote:
>
> Further review of patch:
> 1.
> /*
> * pg_atomic_test_and_set_flag - TAS()
> *
> * Acquire/read barrier semantics.
> */
> STATIC_IF_INLINE_DECLARE bool
> pg_atomic_test_set_flag(volatile pg_atomic_flag *ptr);
>
> a. I think Acquire and read barrier semantics aren't equal.
Right. It was mean as Acquire (thus including read barrier) semantics. I
guess I'll better update README.barrier to have definitions of all barriers.
> b. I think it adheres to Acquire semantics for platforms where
> that semantics
> are supported else it defaults to full memory ordering.
> Example :
> #define TAS(lock) (InterlockedCompareExchange(lock, 1, 0))
>
> Is it right to declare that generic version of function adheres to
> Acquire semantics.
Implementing stronger semantics than required should always be
fine. There's simply no sane way to work with the variety of platforms
we support in any other way.
>
> 2.
> bool
> pg_atomic_test_set_flag_impl(volatile pg_atomic_flag *ptr)
> {
> return TAS((slock_t *) &ptr->sema);
> #define TAS(lock) (InterlockedCompareExchange(lock, 1, 0))
Where's that from? I can't recall adding a line of code like that?
> a. In above usage (ptr->sema), sema itself is not declared as volatile,
> so would it be right to use it in this way for API
> (InterlockedCompareExchange)
> expecting volatile.
Upgrading a pointer to volatile is defined, so I don't see the problem?
> 3.
> /*
> * pg_atomic_unlocked_test_flag - TAS()
> *
> * No barrier semantics.
> */
> STATIC_IF_INLINE_DECLARE bool
> pg_atomic_unlocked_test_flag(volatile pg_atomic_flag *ptr);
>
> a. There is no setting in this, so using TAS in function comments
> seems to be wrong.
Good point.
> b. BTW, I am not able see this function in C11 specs.
So? It's needed for a sensible implementation of spinlocks ontop of
atomics.
> 4.
> #if !defined(PG_HAS_ATOMIC_TEST_SET_FLAG) &&
> defined(PG_HAVE_ATOMIC_EXCHANGE_U32)
> ..
> #define PG_HAS_ATOMIC_UNLOCKED_TEST_FLAG
> static inline bool
> pg_atomic_unlocked_test_flag_impl(volatile pg_atomic_flag *ptr)
> {
> return pg_atomic_read_u32_impl(ptr) == 1;
> }
>
>
> #elif !defined(PG_HAS_ATOMIC_TEST_SET_FLAG) &&
> defined(PG_HAVE_ATOMIC_COMPARE_EXCHANGE_U32)
> ..
> #define PG_HAS_ATOMIC_UNLOCKED_TEST_FLAG
> static inline bool
> pg_atomic_unlocked_test_flag_impl(volatile pg_atomic_flag *ptr)
> {
> return (bool) pg_atomic_read_u32_impl(ptr);
> }
>
> Is there any reason to keep these two implementations different?
No, will change.
> 5.
> atomics-generic-gcc.h
> #ifndef PG_HAS_ATOMIC_UNLOCKED_TEST_FLAG
> #define PG_HAS_ATOMIC_UNLOCKED_TEST_FLAG
> static inline bool
> pg_atomic_unlocked_test_flag_impl(volatile pg_atomic_flag *ptr)
> {
> return ptr->value == 0;
> }
> #endif
>
> Don't we need to compare it with 1 instead of 0?
Why? It returns true if the lock is free, makes sense imo? Will add
comments to atomics.h
> 6.
> atomics-fallback.h
> pg_atomic_unlocked_test_flag_impl(volatile pg_atomic_flag *ptr)
> {
> /*
> * Can't do this in the semaphore based implementation - always return
> * true. Do this in the header so compilers can optimize the test away.
> */
> return true;
> }
>
> Why we can't implement this in semaphore based implementation?
Because semaphores don't provide the API for it?
> 7.
> /*
> * pg_atomic_clear_flag - release lock set by TAS()
> *
> * Release/write barrier semantics.
> */
> STATIC_IF_INLINE_DECLARE void
> pg_atomic_clear_flag(volatile pg_atomic_flag *ptr);
>
> a. Are Release and write barriers equivalent?
Same answer as above. Meant as "Release (thus including write barrier) semantics"
> b. IIUC then current code doesn't have release semantics for unlock/clear.
If you're referring to s_lock.h with 'current code', yes it should have:
* On platforms with weak memory ordering, the TAS(), TAS_SPIN(), and
* S_UNLOCK() macros must further include hardware-level memory fence
* instructions to prevent similar re-ordering at the hardware level.
* TAS() and TAS_SPIN() must guarantee that loads and stores issued after
* the macro are not executed until the lock has been obtained. Conversely,
* S_UNLOCK() must guarantee that loads and stores issued before the macro
* have been executed before the lock is released.
> We are planing to introduce it in this patch, also this doesn't follow the
> specs which says that clear should have memory_order_seq_cst ordering
> semantics.
Making it guarantee full memory barrier semantics is a major performance
loss on x86-64, so I don't want to do that.
Also there is atomic_flag_test_and_set_explicit()...
> As per its current usage in patch (S_UNLOCK), it seems reasonable
> to have *release* semantics for this API, however I think if we use it for
> generic purpose then it might not be right to define it with Release
> semantics.
I can't really see a user where it's not what you want. And if there is
- well, it can make the semantics stronger if it needs that.
> 8.
> #define PG_HAS_ATOMIC_CLEAR_FLAG
> static inline void
> pg_atomic_clear_flag_impl(volatile pg_atomic_flag *ptr)
> {
> /* XXX: release semantics suffice? */
> pg_memory_barrier_impl();
> pg_atomic_write_u32_impl(ptr, 0);
> }
>
> Are we sure that having memory barrier before clearing flag will
> achieve *Release* semantics; as per my understanding the definition
> of Release sematics is as below and above doesn't seem to follow the
> same.
Yes, a memory barrier guarantees a release semantics. It guarantees
more, but that's not a problem.
> 9.
> static inline uint32
> pg_atomic_exchange_u32_impl(volatile pg_atomic_uint32 *ptr, uint32 xchg_)
> {
> uint32 old;
> while (true)
> {
> old = pg_atomic_read_u32_impl(ptr);
> if (pg_atomic_compare_exchange_u32_impl(ptr, &old, xchg_))
> break;
> }
> return old;
> }
>
> What is the reason to implement exchange as compare-and-exchange, at least
> for
> Windows I know corresponding function (InterlockedExchange) exists.
> I could not see any other implementation of same.
> I think this at least deserves comment explaining why we have done
> the implementation this way.
Because Robert and Tom insist that we shouldn't add more operations
employing hardware features. I think that's ok for now, we can always
add more capabilities later.
> 10.
> STATIC_IF_INLINE_DECLARE uint32
> pg_atomic_fetch_add_u32(volatile pg_atomic_uint32 *ptr, int32 add_);
> STATIC_IF_INLINE_DECLARE uint32
> pg_atomic_fetch_sub_u32(volatile pg_atomic_uint32 *ptr, int32 sub_);
>
> The function name and input value seems to be inconsistent.
> The function name contains *_u32*, however the input value is *int32*.
A u32 is implemented by adding/subtracting a signed int. There's some
platforms why that's the only API provided by intrinsics and it's good
enough for pretty much everything.
> 11.
> Both pg_atomic_fetch_and_u32_impl() and pg_atomic_fetch_or_u32_impl() are
> implemented using compare_exchange routines, don't we have any native API's
> which we can use for implementation.
Same reason as for 9).
> 12.
> I am not able to see API's pg_atomic_add_fetch_u32(),
> pg_atomic_sub_fetch_u32()
> in C11 atomic ops specs, do you need it for something specific?
Yes, it's already used in the lwlocks code and it's provided by several
other atomics libraries. I don't really see a reason not to provide it,
especially as some platforms could implement it more efficiently.
> 14.
> * pg_memory_barrier - prevent the CPU from reordering memory access
> *
> * A memory barrier must act as a compiler barrier,
>
> Is it ensured in all cases that pg_memory_barrier implies compiler barrier
> as well?
Yes.
> Example for msvc case, the specs doesn't seem to guarntee it.
I think it actually indirectly guarantees it, but I'm on a plane and
can't check :)
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
From | Date | Subject | |
---|---|---|---|
Next Message | Andres Freund | 2014-07-13 19:20:23 | Re: better atomics - v0.5 |
Previous Message | Tomas Vondra | 2014-07-13 18:59:49 | Re: tweaking NTUP_PER_BUCKET |