Re: Suboptimal spinlock code due to volatile

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)postgresql(dot)org
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Suboptimal spinlock code due to volatile
Date: 2024-07-30 07:45:54
Message-ID: b30e197b-f04d-4971-b2e3-f5f56a740f81@iki.fi
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 29/07/2024 22:59, Andres Freund wrote:
> After being confused for a while, the explanation is fairly simple: We use
> volatile and dereference the address:
>
> static __inline__ int
> tas(volatile slock_t *lock)
> {
> slock_t _res = 1;
>
> __asm__ __volatile__(
> " lock \n"
> " xchgb %0,%1 \n"
> : "+q"(_res), "+m"(*lock)
> : /* no inputs */
> : "memory", "cc");
> return (int) _res;
> }
>
> (note the (*lock) and the volatile in the signature).
>
> I think it'd be just as defensible to not emit a separate load here, despite
> the volatile, and indeed clang doesn't emit a separate load. But it also does
> seem defensible to take translate the code very literally, as gcc does.
>
>
> If I remove the volatile from the signature or cast it away, gcc indeed
> generates the offset version:
> 4230: f0 86 82 c0 01 00 00 lock xchg %al,0x1c0(%rdx)

Good catch. Seems safe to just remove the volatile.

> A second, even smaller, issue with the code is that we use "lock xchgb"
> despite xchg having implied lock approximately forever ([2]). That makes the code
> slightly wider than necessary (the lock prefix is one byte).
>
>
> I doubt there's a lot of situations where these end up having a meaningful
> performance impact, but it still seems suboptimal. I may be seeing a *small*
> gain in a workload inserting lots of tiny records, but it's hard to be sure if
> it's above the noise floor.
>
>
> I'm wondering in how many places our fairly broad use of volatiles causes
> more substantially worse code being generated.

Aside from performance, I find "volatile" difficult to reason about. I
feel more comfortable with atomics and memory barriers.

--
Heikki Linnakangas
Neon (https://neon.tech)

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Richard Guo 2024-07-30 07:46:55 Re: Short-circuit sort_inner_and_outer if there are no mergejoin clauses
Previous Message Richard Guo 2024-07-30 07:42:30 Re: Trivial revise for the check of parameterized partial paths