From: | Heikki Linnakangas <hlinnaka(at)iki(dot)fi> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)postgresql(dot)org |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Subject: | Re: Suboptimal spinlock code due to volatile |
Date: | 2024-07-30 07:45:54 |
Message-ID: | b30e197b-f04d-4971-b2e3-f5f56a740f81@iki.fi |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 29/07/2024 22:59, Andres Freund wrote:
> After being confused for a while, the explanation is fairly simple: We use
> volatile and dereference the address:
>
> static __inline__ int
> tas(volatile slock_t *lock)
> {
> slock_t _res = 1;
>
> __asm__ __volatile__(
> " lock \n"
> " xchgb %0,%1 \n"
> : "+q"(_res), "+m"(*lock)
> : /* no inputs */
> : "memory", "cc");
> return (int) _res;
> }
>
> (note the (*lock) and the volatile in the signature).
>
> I think it'd be just as defensible to not emit a separate load here, despite
> the volatile, and indeed clang doesn't emit a separate load. But it also does
> seem defensible to take translate the code very literally, as gcc does.
>
>
> If I remove the volatile from the signature or cast it away, gcc indeed
> generates the offset version:
> 4230: f0 86 82 c0 01 00 00 lock xchg %al,0x1c0(%rdx)
Good catch. Seems safe to just remove the volatile.
> A second, even smaller, issue with the code is that we use "lock xchgb"
> despite xchg having implied lock approximately forever ([2]). That makes the code
> slightly wider than necessary (the lock prefix is one byte).
>
>
> I doubt there's a lot of situations where these end up having a meaningful
> performance impact, but it still seems suboptimal. I may be seeing a *small*
> gain in a workload inserting lots of tiny records, but it's hard to be sure if
> it's above the noise floor.
>
>
> I'm wondering in how many places our fairly broad use of volatiles causes
> more substantially worse code being generated.
Aside from performance, I find "volatile" difficult to reason about. I
feel more comfortable with atomics and memory barriers.
--
Heikki Linnakangas
Neon (https://neon.tech)
From | Date | Subject | |
---|---|---|---|
Next Message | Richard Guo | 2024-07-30 07:46:55 | Re: Short-circuit sort_inner_and_outer if there are no mergejoin clauses |
Previous Message | Richard Guo | 2024-07-30 07:42:30 | Re: Trivial revise for the check of parameterized partial paths |