Re: Atomics hardware support table & supported architectures

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Atomics hardware support table & supported architectures
Date: 2014-06-18 15:15:15
Message-ID: CA+TgmoaPPQeucxBEnGj4rC3xYkY9_5LSnO88hHFfLtogbweyyA@mail.gmail.com
Lists: pgsql-hackers

On Tue, Jun 17, 2014 at 1:55 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> But the concern is more whether 1 byte can actually be written
> without also writing neighbouring values. I.e. there's hardware out
> there that'll implement a 1-byte store as reading 4 bytes, changing one
> of the bytes in a register, and then writing the 4 bytes out again. Which
> would mean that a neighbouring write will possibly cause a wrong value
> to be written...

Ah, OK. I've heard about that before, but had forgotten.
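
To make the hazard concrete, here is a minimal single-threaded sketch that models such hardware and replays one bad interleaving by hand. Everything in it (the Word type, the emu_* helpers, lost_update_demo) is hypothetical and invented for illustration; it is not PostgreSQL code and does not describe any particular CPU.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: memory that can only be loaded/stored 4 bytes at a time. */
typedef struct
{
    uint32_t word;
} Word;

/* Step 1 of an emulated byte store: load the whole containing word. */
static uint32_t
emu_load_word(const Word *w)
{
    return w->word;
}

/* Step 2: replace byte number idx (0..3) in the register copy. */
static uint32_t
emu_patch_byte(uint32_t reg, unsigned idx, uint8_t val)
{
    unsigned shift;

    assert(idx < 4);
    shift = idx * 8;
    return (reg & ~((uint32_t) 0xFF << shift)) | ((uint32_t) val << shift);
}

/* Step 3: write the whole word back, including the three untouched bytes. */
static void
emu_store_word(Word *w, uint32_t reg)
{
    w->word = reg;
}

static uint8_t
emu_get_byte(const Word *w, unsigned idx)
{
    assert(idx < 4);
    return (uint8_t) (w->word >> (idx * 8));
}

/*
 * Replay an interleaving of two writers that touch *different* bytes.
 * Returns the final value of byte 0, which writer A set to 0xAA.
 */
static uint8_t
lost_update_demo(void)
{
    Word     w = {0};
    uint32_t a = emu_load_word(&w);     /* writer A loads the word */
    uint32_t b = emu_load_word(&w);     /* writer B loads it before A stores */

    a = emu_patch_byte(a, 0, 0xAA);     /* A wants byte 0 = 0xAA */
    b = emu_patch_byte(b, 1, 0xBB);     /* B wants byte 1 = 0xBB */

    emu_store_word(&w, a);              /* A writes back */
    emu_store_word(&w, b);              /* B writes back a stale byte 0 */

    return emu_get_byte(&w, 0);
}
```

In this model lost_update_demo() returns 0 rather than 0xAA: writer B's write-back carries the old contents of byte 0 and silently undoes writer A's store, even though the two writers never touched the same byte.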

> What happens is that gcc will do a syscall triggering the kernel to turn
> off scheduling; perform the math and store the result; turn scheduling on
> again. That way there cannot be another operation influencing the
> calculation/store. Imagine if you have hardware that, internally, only
> does stores in 4 byte units. Even if it's a single CPU machine, which
> most of those are, the kernel could schedule a separate process after
> the first 4 bytes have been written. Oops. The kernel has ways to prevent
> that, userspace doesn't...

Interesting. "Turn off scheduling" sounds like a pretty dangerous syscall.

>> > Does somebody want other columns in there?
>>
>> I think the main question at the developer meeting was how far we want
>> to go with supporting primitives like atomic add, atomic and, atomic
>> or, etc. So I think we should add columns for those.
>
> Well, once CAS is available, atomic add etc is all trivially
> implementable - without further hardware support. It might be more
> efficient to use the native instruction (e.g. xadd can be much better
> than a cmpxchg loop because there's no retries), but that's just
> optimization that won't matter unless you have a fair bit of
> concurrency.
>
> There are currently fallbacks like:
> #ifndef PG_HAS_ATOMIC_FETCH_ADD_U32
> #define PG_HAS_ATOMIC_FETCH_ADD_U32
> STATIC_IF_INLINE uint32
> pg_atomic_fetch_add_u32_impl(volatile pg_atomic_uint32 *ptr, uint32 add_)
> {
>     uint32 old;
>     while (true)
>     {
>         old = pg_atomic_read_u32_impl(ptr);
>         if (pg_atomic_compare_exchange_u32_impl(ptr, &old, old + add_))
>             break;
>     }
>     return old;
> }

I understand, but the performance characteristics are quite different.
My understanding from the developer meeting was that we'd be OK with
having, say, three levels of support for atomic ops: all ops
supported, only TAS, none. Or maybe four: all ops, CAS + TAS, only
TAS, none. But I think there was resistance (in which I participate)
to the idea of, say, having platform 1 with "add" but not "and" and
"or", platform 2 with "and" and "or" but not "add", platform 3 with
both, platform 4 with neither, etc. Then it becomes too hard for
developers to predict whether something that is a win on their
platform will be a loss on some other platform.
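
To illustrate the "CAS + TAS" level versus the "all ops" level, here is a rough sketch in standard C11 <stdatomic.h> rather than the proposed pg_atomic_* API. The function names fetch_and_via_cas and fetch_add_native are made up for this example. The first builds atomic AND from nothing but compare-and-swap, the second asks the compiler for a native fetch-add. Both give the same answer; they differ in how they behave under contention, which is the part a developer cannot easily predict across platforms.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical "CAS only" platform: atomic AND emulated with a
 * compare-and-swap loop. Under contention the loop may retry many times.
 */
static uint32_t
fetch_and_via_cas(_Atomic uint32_t *ptr, uint32_t mask)
{
    uint32_t old;

    assert(ptr != NULL);
    old = atomic_load(ptr);

    /* On failure the call stores the current value into 'old', so just retry. */
    while (!atomic_compare_exchange_weak(ptr, &old, old & mask))
        ;

    return old;                 /* value before the AND was applied */
}

/*
 * Hypothetical "all ops" platform: the compiler is free to emit a single
 * native instruction (for example xadd on x86), with no retry loop.
 */
static uint32_t
fetch_add_native(_Atomic uint32_t *ptr, uint32_t add)
{
    assert(ptr != NULL);
    return atomic_fetch_add(ptr, add);
}
```

With a small, fixed number of support levels, code tuned against one of these two shapes at least behaves the same way on every platform in that level.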

>> > 3) sparcv8: Last released model 1997.
>>
>> I seem to recall hearing about this in a customer situation relatively
>> recently, so there may be a few of these still kicking around out
>> there.
>
> Really? As I wrote in a reply, Solaris 10 (released 2005) dropped
> support for it. Dropping support for a platform that was desupported
> 10 years ago by its manufacturer doesn't sound bad imo...

We definitely have at least one customer using Solaris 9. I don't
know their architecture for certain, but they did recently install a
new version of PostgreSQL.

>> > 4) i386: Support dropped from windows 98 (yes, really), linux, openbsd
>> > (yes, really), netbsd (yes, really). No code changes needed.
>>
>> Wow, OK. In that case, yeah, let's dump it. But let's make sure we
>> adequately document that someplace in the code comments, along with
>> the reasons, because not everyone may realize how dead it is.
>
> I'm generally wondering how to better document the supported os/platform
> combinations. E.g. it's not apparent that we only support certain
> platforms on a rather limited set of compilers...
>
> Maybe a table with columns like: platform, platform version,
> supported-OSs, supported-compilers?

Sounds worth a try.

>> > 6) armv-v5
>>
>> I think this is also a bit less dead than the other ones; Red Hat's
>> Bugzilla shows people filing bugs for platform-specific problems
>> as recently as January of 2013:
>>
>> https://bugzilla.redhat.com/show_bug.cgi?id=892378
>
> Closed as WONTFIX :P.
>
> Joking aside, I think there are still usecases for arm-v5 - but it's
> embedded stuff without a real OS and such. Nothing you'd install PG
> on. There's distributions that are dropping ARMv6 support already... My
> biggest problem is that it's not even documented whether v5 has atomic
> 4-byte stores - while it's documented for v6.

I think in doubtful cases we might as well keep the support in. If
you've got the fallback to non-atomics, keeping the other code around
doesn't hurt much, and might make it easier for someone who is
interested in one of those platforms. It's fine and good to kill
things that are totally dead, but I think it's better for a user of
some obscure platform to find that it doesn't *quite* work than that
we've deliberately broken it. But maybe I am being too conservative.

>> > Note that this is *not* a requirement for the atomics abstraction - it
>> > now has a fallback to spinlocks if atomics aren't available.
>>
>> That seems great. Hopefully with a configure option to disable
>> atomics so that it's easy to test the fallback.
>
> It's a #define right now. Do you think we really need a configure
> option?

Well, we've got one for --disable-spinlocks, so it seems like it would
be a good idea for symmetry. More than that, I actually really hate
things that don't have a configure option, like WAL_DEBUG, because you
have to change a checked-in file, which shows up as a diff, and if
you're not careful you check it in, and if you are careful it still
gets blown away every time you git reset --hard, which I do a lot. I
think the fact that both Heikki and I on separate occasions have made
commits enabling WAL_DEBUG shows pretty clearly the weaknesses of that
method of doing business.
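
As a sketch of what I mean, with invented names throughout: DEMO_DISABLE_ATOMICS stands in for whatever symbol a configure switch would define, and demo_atomic_u32 is not the real pg_atomic_uint32. The fallback path uses only test-and-set, which is all a spinlock needs. The point is that the symbol arrives from the build (e.g. -DDEMO_DISABLE_ATOMICS on the compiler command line), so no checked-in file has to be edited to exercise the fallback.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical atomic type with a build-time selectable implementation. */
typedef struct
{
#ifdef DEMO_DISABLE_ATOMICS
    atomic_flag lock;           /* test-and-set spinlock guarding 'value' */
    uint32_t    value;
#else
    _Atomic uint32_t value;     /* native atomics */
#endif
} demo_atomic_u32;

static void
demo_init(demo_atomic_u32 *a, uint32_t v)
{
    assert(a != NULL);
#ifdef DEMO_DISABLE_ATOMICS
    atomic_flag_clear(&a->lock);
    a->value = v;
#else
    atomic_init(&a->value, v);
#endif
}

/* Adds 'add' to the value and returns what was there before. */
static uint32_t
demo_fetch_add(demo_atomic_u32 *a, uint32_t add)
{
    assert(a != NULL);
#ifdef DEMO_DISABLE_ATOMICS
    uint32_t old;

    while (atomic_flag_test_and_set(&a->lock))
        ;                       /* spin until the lock is free */
    old = a->value;
    a->value = old + add;
    atomic_flag_clear(&a->lock);
    return old;
#else
    return atomic_fetch_add(&a->value, add);
#endif
}
```

Built either way, callers see the same results, so the same regression tests can be run against both paths just by changing a build flag.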

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
