From: | Thomas Munro <thomas(dot)munro(at)gmail(dot)com> |
---|---|
To: | Heikki Linnakangas <hlinnaka(at)iki(dot)fi> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Noah Misch <noah(at)leadboat(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: Remove last traces of HPPA support |
Date: | 2024-07-31 10:32:19 |
Message-ID: | CA+hUKG+ezNC_Y9acaRPPv2qK0quB_LOqBLJ8-CfEUN-je8i5WA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Wed, Jul 31, 2024 at 8:47 PM Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
> On 31/07/2024 08:52, Thomas Munro wrote:
> The old __i386__ implementation of TAS() said:
>
> > * When this was last tested, we didn't have separate TAS() and TAS_SPIN()
> > * macros. Nowadays it probably would be better to do a non-locking test
> > * in TAS_SPIN() but not in TAS(), like on x86_64, but no-one's done the
> > * testing to verify that. Without some empirical evidence, better to
> > * leave it alone.
>
> It seems that you did what the comment suggested. That seems fine. For
> sake of completeness, if someone has an i386 machine lying around, it
> would be nice to verify that. Or an official CPU manufacturer's
> implementation guide, or references to other implementations or something.
Hmm, the last "real" 32 bit CPU is from ~20 years ago. Now the only
32 bit x86 systems we should nominally care about are modern CPUs that
can also run 32 bit instruction; is there a reason to think they'd
behave differently at this level? Looking at the current Intel
optimisation guide's discussion of spinlock implementation at page
2-34 of [1], it doesn't distinguish between 32 and 64, and it has that
double-check thing.
> > - * On platforms with weak memory ordering, the TAS(), TAS_SPIN(), and
> > - * S_UNLOCK() macros must further include hardware-level memory fence
> > - * instructions to prevent similar re-ordering at the hardware level.
> > - * TAS() and TAS_SPIN() must guarantee that loads and stores issued after
> > - * the macro are not executed until the lock has been obtained. Conversely,
> > - * S_UNLOCK() must guarantee that loads and stores issued before the macro
> > - * have been executed before the lock is released.
>
> That old comment means that both SpinLockAcquire() and SpinLockRelease()
> acted as full memory barriers, and looking at the implementations, that
> was indeed so. With the new implementation, SpinLockAcquire() will have
> "acquire semantics" and SpinLockRelease will have "release semantics".
> That's very sensible, and I don't believe it will break anything, but
> it's a change in semantics nevertheless.
Yeah. It's interesting that our pg_atomic_clear_flag(f) is like
standard atomic_flag_clear_explicit(f, memory_order_release), not like
atomic_flag_clear(f) which is short for atomic_flag_clear_explicit(f,
memory_order_seq_cst). Example spinlock code I've seen written in
modern C or C++ therefore uses the _explicit variants, so it can get
acquire/release, which is what people usually want from a lock-like
thing. What's a good way to test the performance in PostgreSQL? In a
naive loop that just test-and-sets and clears a flag a billion times
in a loop and does nothing else, I see 20-40% performance increase
depending on architecture when comparing _seq_cst with
_acquire/_release. You're right that this semantic change deserves
explicit highlighting, in comments somewhere... I wonder if we have
anywhere that is counting on the stronger barrier...
From | Date | Subject | |
---|---|---|---|
Next Message | Amit Kapila | 2024-07-31 10:53:14 | Re: Conflict detection and logging in logical replication |
Previous Message | Andrew Dunstan | 2024-07-31 09:47:40 | Re: can we mark upper/lower/textlike functions leakproof? |