Re: Issue with the PRNG used by Postgres

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Parag Paul <parag(dot)paul(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Issue with the PRNG used by Postgres
Date: 2024-04-10 17:32:42
Message-ID: 20240410173242.p2ukq5dcyav6huya@awork3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2024-04-10 13:03:05 -0400, Tom Lane wrote:
> After thinking about this some more, it is fairly clear that that *is*
> a mistake that can cause a thundering-herd problem.

> Assume we have two or more backends waiting in perform_spin_delay, and for
> whatever reason the scheduler wakes them up simultaneously.

That's not really possible, at least not repeatably. Multiple processes
obviously can't be scheduled concurrently on one CPU, and scheduling something
on another core entails either interrupting that CPU with an inter-processor
interrupt or that other CPU scheduling on its own, without coordination.

That obviously isn't a reason to not fix the delay logic in lwlock.c.

Looks like the wrong logic was introduced by me in

commit 008608b9d51061b1f598c197477b3dc7be9c4a64
Author: Andres Freund <andres(at)anarazel(dot)de>
Date: 2016-04-10 20:12:32 -0700

    Avoid the use of a separate spinlock to protect a LWLock's wait queue.

Likely because I was trying to avoid the overhead of init_local_spin_delay(),
without duplicating the few lines to acquire the "spinlock".

> So I think we need something like the attached.

LGTM.

I think it might be worth breaking LWLockWaitListLock() into two pieces, a
fastpath to be inlined into a caller, and a slowpath, but that's separate work
from a bugfix.
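
To make the idea concrete, here is a rough sketch of such a split. It is not
the lwlock.c code: DemoLock, DEMO_FLAG_LOCKED and the demo_* functions are
made-up names, C11 atomics stand in for pg_atomic_*, and a plain counter
stands in for SpinDelayStatus and perform_spin_delay().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical lock bit and lock type, for illustration only. */
#define DEMO_FLAG_LOCKED ((uint32_t) 1 << 28)

typedef struct DemoLock
{
    _Atomic uint32_t state;
} DemoLock;

/*
 * Slow path, deliberately left out of line. The wait counter (a stand-in
 * for the spin delay state) is set up once and lives for the whole
 * acquisition, across all retries. Returns how often we had to wait.
 */
static unsigned
demo_wait_list_lock_slow(DemoLock *lock)
{
    unsigned    waits = 0;

    for (;;)
    {
        /* wait using plain reads until the bit looks clear */
        while (atomic_load(&lock->state) & DEMO_FLAG_LOCKED)
            waits++;            /* real code would delay/back off here */

        /* retry; somebody else may have grabbed the lock in between */
        if (!(atomic_fetch_or(&lock->state, DEMO_FLAG_LOCKED) & DEMO_FLAG_LOCKED))
            return waits;
    }
}

/* Fast path: a single atomic op, small enough to inline into callers. */
static inline void
demo_wait_list_lock(DemoLock *lock)
{
    if (!(atomic_fetch_or(&lock->state, DEMO_FLAG_LOCKED) & DEMO_FLAG_LOCKED))
        return;                 /* got the lock without contention */
    (void) demo_wait_list_lock_slow(lock);
}

static inline void
demo_wait_list_unlock(DemoLock *lock)
{
    uint32_t    old = atomic_fetch_and(&lock->state, ~DEMO_FLAG_LOCKED);

    assert(old & DEMO_FLAG_LOCKED);     /* must have been held */
    (void) old;
}
```

The uncontended caller only pays for one atomic fetch-or; everything related
to waiting is confined to the out-of-line function.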

I looked around and the other uses of init_local_spin_delay() look correct
from this angle. However LockBufHdr() is more expensive than it needs to be,
because it always initializes SpinDelayStatus. IIRC I've seen that show up in
profiles before, but never got around to writing a nice-enough patch. But
that's also something separate from a bugfix.
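
As an illustration of the kind of change I have in mind (a sketch only, not a
proposed patch): the delay state can be set up lazily, the first time an
acquisition attempt fails. DemoDelay, DEMO_HDR_LOCKED, demo_delay_inits and
the demo_* functions are hypothetical stand-ins, not the bufmgr.c or s_lock.h
definitions.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical header lock bit, for illustration only. */
#define DEMO_HDR_LOCKED ((uint32_t) 1 << 22)

/* Stand-in for SpinDelayStatus. */
typedef struct DemoDelay
{
    int         spins;
} DemoDelay;

/* Counts initializations so the saving is observable; illustration only. */
static int  demo_delay_inits = 0;

static void
demo_delay_init(DemoDelay *d)
{
    d->spins = 0;
    demo_delay_inits++;
}

static void
demo_delay_wait(DemoDelay *d)
{
    d->spins++;                 /* real code would spin/sleep with backoff */
}

/*
 * Set the lock bit and return the resulting state. The delay state is only
 * initialized once the first attempt has failed, so the uncontended case
 * never touches it.
 */
static uint32_t
demo_lock_hdr(_Atomic uint32_t *state)
{
    DemoDelay   delay;
    bool        delay_ready = false;
    uint32_t    old;

    for (;;)
    {
        old = atomic_fetch_or(state, DEMO_HDR_LOCKED);
        if (!(old & DEMO_HDR_LOCKED))
            break;              /* got the lock */

        if (!delay_ready)
        {
            demo_delay_init(&delay);
            delay_ready = true;
        }
        demo_delay_wait(&delay);
    }
    return old | DEMO_HDR_LOCKED;
}
```

Whether the extra branch is actually cheaper than the unconditional
initialization would need to be measured.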

Greetings,

Andres Freund
