Re: better atomics - v0.5

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Geoghegan <pg(at)heroku(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Ants Aasma <ants(at)cybertec(dot)at>
Subject: Re: better atomics - v0.5
Date: 2014-06-30 05:34:53
Message-ID: CAA4eK1JqqqZuqvKzfUB-XX381xxfGYHSCVmtYA307GKf5X1uyQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Jun 29, 2014 at 2:54 PM, Andres Freund <andres(at)2ndquadrant(dot)com>
wrote:
> On 2014-06-29 11:53:28 +0530, Amit Kapila wrote:
> > On Sat, Jun 28, 2014 at 1:48 PM, Andres Freund <andres(at)2ndquadrant(dot)com>
> > I think it is better to have some discussion about it. Apart from this,
> > today I noticed atomic read/write doesn't use any barrier semantics
> > and just rely on volatile.
>
> Yes, intentionally so. It's often important to get/set the current value
> without the cost of emitting a memory barrier. It just has to be a
> recent value and it actually has to come from memory.

I agree with you that we don't want to incur the cost of memory barrier
for get/set, however it should not be at the cost of correctness.

> And that's actually
> enforced by the current definition - I think?

Yeah, this is the only point which I am trying to ensure, thats why I sent
you few links in previous mail where I had got some suspicion that
just doing get/set with volatile might lead to correctness issue in some
cases.

Some another Note, I found today while reading more on it which suggests
that my previous observation is right:

"Within a thread of execution, accesses (reads and writes) to all volatile
objects are guaranteed to not be reordered relative to each other, but this
order is not guaranteed to be observed by another thread, since volatile
access does not establish inter-thread synchronization."
http://en.cppreference.com/w/c/atomic/memory_order

> > > b) It's only supported from vista onwards. Afaik we still support XP.
> > #ifndef pg_memory_barrier_impl
> > #define pg_memory_barrier_impl() MemoryBarrier()
> > #endif
> >
> > The MemoryBarrier() macro used also supports only on vista onwards,
> > so I think for reasons similar to using MemoryBarrier() can apply for
> > this as well.
>
> I think that's odd because barrier.h has been using MemoryBarrier()
> without a version test for a long time now. I guess everyone uses a new
> enough visual studio.

Yeap or might be the people who even are not using new enough version
doesn't have a use case that can hit a problem due to MemoryBarrier()

> Those intrinsics aren't actually OS but just
> compiler dependent.
>
> Otherwise we could just define it as:
>
> FORCEINLINE
> VOID
> MemoryBarrier (
> VOID
> )
> {
> LONG Barrier;
> __asm {
> xchg Barrier, eax
> }
> }

Yes, I think it will be better, how about defining this way just for
the cases where MemoryBarrier macro is not supported.

> > > c) It doesn't make any difference on x86 ;)
> >
> > What about processors like Itanium which care about acquire/release
> > memory barrier semantics?
>
> Well, it still will be correct? I don't think it makes much sense to
> focus overly much on itanium here with the price of making anything more
> complicated for others.

Completely agreed that we should not add or change anything which
makes patch complicated or can effect the performance, however
the reason for focusing is just to ensure that we should not break
any existing use case for Itanium w.r.t performance or otherwise.

Another way is that we can give ignore or just give lower priority for
verfiying if the patch has anything wrong for Itanium processors.

> > If I am reading C11's spec for this API correctly, then it says as
below:
> > "Atomically compares the value pointed to by obj with the value pointed
> > to by expected, and if those are equal, replaces the former with desired
> > (performs read-modify-write operation). Otherwise, loads the actual
value
> > pointed to by obj into *expected (performs load operation)."
> >
> > So it essentialy means that it loads actual value in expected only if
> > operation is not successful.
>
> Yes. But in the case it's successfull it will already contain the right
> value.

The point was just that we are not completely following C11 atomic
specification, so it might be okay to tweak a bit and make things
simpler and save LOAD instruction. However as there is no correctness
issue here and you think that current implementation is good and
can serve more cases, we can keep as it is and move forward.

> > > > 4.
> > ..
> > > > There is a Caution notice in microsoft site indicating
> > > > _ReadWriteBarrier/MemoryBarrier are deprected.
> > >
> > > It seemed to be the most widely available API, and it's what barrier.h
> > > already used.
> > > Do you have a different suggestion?
> >
> > I am trying to think based on suggestion given by Microsoft, but
> > not completely clear how to accomplish the same at this moment.
>
> Well, they refer to C11 stuff. But I think it'll be a while before it
> makes sense to use a fallback based on that.

In this case, I have a question for you.

Un-patched usage in barrier.h is as follows:
..
#elif defined(__ia64__) || defined(__ia64)
#define pg_compiler_barrier() _Asm_sched_fence()
#define pg_memory_barrier() _Asm_mf()

#elif defined(WIN32_ONLY_COMPILER)
/* Should work on both MSVC and Borland. */
#include <intrin.h>
#pragma intrinsic(_ReadWriteBarrier)
#define pg_compiler_barrier() _ReadWriteBarrier()
#define pg_memory_barrier() MemoryBarrier()
#endif

If I understand correctly the current define mechanism in barrier.h,
it will have different definition for Itanium processors even for windows.

However the patch defines as below:

#if defined(HAVE_GCC__ATOMIC_INT32_CAS) || defined(HAVE_GCC__SYNC_INT32_CAS)
# include "storage/atomics-generic-gcc.h"
#elif defined(WIN32_ONLY_COMPILER)
# include "storage/atomics-generic-msvc.h"

What I can understand from above is that defines in
storage/atomics-generic-msvc.h, will override any previous defines
for compiler/memory barriers and _ReadWriteBarrier()/MemoryBarrier()
will be considered for Windows always.

> > > > 6.
> > > > pg_atomic_compare_exchange_u32()
> > > >
> > > > It is better to have comments above this and all other related
> > functions.
> > Okay, generally code has such comments on top of function
> > definition rather than with declaration.
>
> I don't want to have it on the _impl functions because they're
> duplicated for the individual platforms and will just become out of
> date...

Sure, I was also not asking for _impl functions. What I was asking
in this point was to have comments on top of definition of
pg_atomic_compare_exchange_u32() in atomics.h
In particular, on top of below and similar functions, rather than
at the place where they are declared.

STATIC_IF_INLINE bool

pg_atomic_compare_exchange_u32(volatile pg_atomic_uint32 *ptr,
uint32 *expected, uint32 newval)
{
CHECK_POINTER_ALIGNMENT(ptr, 4);
CHECK_POINTER_ALIGNMENT(expected, 4);

return pg_atomic_compare_exchange_u32_impl(ptr, expected, newval);
}

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Rajeev rastogi 2014-06-30 06:17:13 Re: psql: show only failed queries
Previous Message Dilip kumar 2014-06-30 05:19:10 Re: pg_xlogdump --stats