Re: Atomic operations within spinlocks

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Atomic operations within spinlocks
Date: 2020-06-03 20:45:42
Message-ID: 20200603204542.reuladbmgmjqzoi6@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2020-06-03 14:19:45 -0400, Tom Lane wrote:
> In connection with the nearby thread about spinlock coding rule
> violations, I noticed that we have several places that are doing
> things like this:
>
> SpinLockAcquire(&WalRcv->mutex);
> ...
> written_lsn = pg_atomic_read_u64(&WalRcv->writtenUpto);
> ...
> SpinLockRelease(&WalRcv->mutex);
>
> That's from pg_stat_get_wal_receiver(); there is similar code in
> freelist.c's ClockSweepTick() and StrategySyncStart().
>
> This seems to me to be very bad code. In the first place, on machines
> without the appropriate type of atomic operation, atomics.c is going
> to be using a spinlock to emulate atomicity, which means this code
> tries to take a spinlock while holding another one. That's not okay,
> either from the standpoint of performance or error-safety.

I'm honestly not particularly concerned about either of those in
general:

- WRT performance: Which platforms where we care about performance can't
do a tear-free read of a 64bit integer, and thus needs a spinlock for
this? And while the cases in freelist.c aren't just reads, they are
really cold paths (clock wrapping around).
- WRT error safety: What could happen here? The atomics codepaths is
no-fail code? And nothing should ever nest inside the atomic-emulation
spinlocks. What am I missing?

I think straight out prohibiting use of atomics inside a spinlock will
lead to more complicated and slower code, rather than than improving
anything. I'm to blame for the ClockSweepTick() case, and I don't really
see how we could avoid the atomic-while-spinlocked without relying on
64bit atomics, which are emulated on more platforms, and without having
a more complicated retry loop.

> In the second place, this coding seems to me to indicate serious
> confusion about which lock is protecting what. In the above example,
> if writtenUpto is only accessed through atomic operations then it
> seems like we could just move the pg_atomic_read_u64 out of the
> spinlock section; or if the spinlock is adequate protection then we
> could just do a normal fetch. If we actually need both locks then
> this needs significant re-thinking, IMO.

Yea, it doesn't seem necessary at all that writtenUpto is read with the
spinlock held. That's very recent:

commit 2c8dd05d6cbc86b7ad21cfd7010e041bb4c3950b
Author: Michael Paquier <michael(at)paquier(dot)xyz>
Date: 2020-05-17 09:22:07 +0900

Make pg_stat_wal_receiver consistent with the WAL receiver's shmem info

and I assume just was caused by mechanical replacement, rather than
intentionally doing so while holding the spinlock. As far as I can tell
none of the other writtenUpto accesses are under the spinlock.

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Mikhail Titov 2020-06-03 21:31:44 [PATCH] Leading minus for negative time interval in ISO 8601
Previous Message Bruce Momjian 2020-06-03 20:34:20 Re: what can go in root.crt ?