Re: pgsql: Fix double-release of spinlock

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)iki(dot)fi>, pgsql-committers(at)lists(dot)postgresql(dot)org
Subject: Re: pgsql: Fix double-release of spinlock
Date: 2024-07-29 17:46:09
Message-ID: 20240729174609.jbfei3sokdobeeeo@awork3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-committers pgsql-hackers

On 2024-07-29 12:45:19 -0400, Tom Lane wrote:
> Andres Freund <andres(at)anarazel(dot)de> writes:
> > On 2024-07-29 12:33:13 -0400, Tom Lane wrote:
> >> I dunno, is that the only extra check that the --disable-spinlocks
> >> implementation is providing?
>
> > I think it also provides the (valuable!) check that spinlocks were actually
> > initialized. But that again seems like something we'd be better off adding
> > more general infrastructure for - nobody runs --disable-spinlocks locally, we
> > shouldn't need to run this on the buildfarm to find problems like this.
>
> Hmm, but how?

I think there's a few ways:

> One of the things we gave up by nuking HPPA support
> was that that platform's representation of an initialized, free
> spinlock was not all-zeroes, so that it'd catch this type of problem.
> I think all the remaining platforms do use zeroes, so it's hard to
> see how anything short of valgrind would be likely to catch it.

1) There's nothing forcing us to use 0/1 for most of the spinlock
implementations. E.g. for x86-64 we could use 0 for uninitialized, 1 for free
and 2 for locked.

2) We could also change the layout of slock_t in assert enabled builds, adding
a dedicated 'initialized' field when assertions are enabled. But that might be
annoying from an ABI POV?

1) seems preferrable, so I gave it a quick try. Seems to work. There's a
*slight* difference in the instruction sequence:

old:
41f6: f0 86 10 lock xchg %dl,(%rax)
41f9: 84 d2 test %dl,%dl
41fb: 75 1b jne 4218 <GetRecoveryState+0x38>

new:
4216: f0 86 10 lock xchg %dl,(%rax)
4219: 80 fa 02 cmp $0x2,%dl
421c: 74 22 je 4240 <GetRecoveryState+0x40>

I.e. the version using 2 as the locked state uses a three byte instruction vs
a two byte instruction before.

*If* we are worried about this, we could

a) Change the representation only for assert enabled builds, but that'd have
ABI issues again.

b) Instead define the spinlock to have 1 as the unlocked state and 0 as the
locked state. That makes it a bit harder to understand that initialization
is missing, compared to a dedicated state, as the first use of the spinlock
just blocks.

To make 1) b) easier to understand it might be worth changing the slock_t
typedef to be something like

typedef struct slock_t
{
char is_free;
} slock_t;

which also might help catch some cases of type confusion - the char typedef is
too forgiving imo.

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-committers by date

  From Date Subject
Next Message Andres Freund 2024-07-29 17:48:53 Re: Detect double-release of spinlock
Previous Message Heikki Linnakangas 2024-07-29 17:38:41 pgsql: Remove dead generators for cyrillic encoding conversion tables

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2024-07-29 17:48:53 Re: Detect double-release of spinlock
Previous Message Heikki Linnakangas 2024-07-29 17:41:06 Re: Remove dead code generation tools in src/backend/utils/mb/