Re: Issue with the PRNG used by Postgres

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Parag Paul <parag(dot)paul(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Issue with the PRNG used by Postgres
Date: 2024-04-10 19:08:21
Message-ID: 20240410190821.yhquanxyhpqtkett@awork3.anarazel.de

Hi,

On 2024-04-10 14:02:20 -0400, Tom Lane wrote:
> On third thought ... while I still think this is a misuse of
> perform_spin_delay and we should change it, I'm not sure it'll do
> anything to address Parag's problem, because IIUC he's seeing actual
> "stuck spinlock" reports. That implies that the inner loop of
> LWLockWaitListLock slept NUM_DELAYS times without ever seeing
> LW_FLAG_LOCKED clear. What I'm suggesting would change the triggering
> condition to "NUM_DELAYS sleeps without acquiring the lock", which is
> strictly more likely to happen, so it's not going to help him. It's
> certainly still well out in we-shouldn't-get-there territory, though.
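
To make the two triggering conditions concrete, here is a small self-contained
model (in the spirit of LWLockWaitListLock, but not the actual lwlock.c code:
the lock word, spin_delay() and the panic path are stand-ins, and the
spin-before-sleep behaviour of perform_spin_delay is omitted):

#include <stdatomic.h>
#include <stdio.h>
#include <stdlib.h>

#define FLAG_LOCKED 1u          /* stand-in for LW_FLAG_LOCKED */
#define NUM_DELAYS  1000        /* same constant name as s_lock.c */

static atomic_uint lock_state;

/* stand-in for perform_spin_delay(); every call counts as one "sleep" */
static void
spin_delay(int *ndelays)
{
    if (++(*ndelays) > NUM_DELAYS)
    {
        fprintf(stderr, "stuck spinlock detected\n");
        abort();
    }
}

/*
 * Current rule: the sleep counter lives in the inner loop, so the panic
 * fires only after NUM_DELAYS sleeps without ever seeing the flag clear.
 */
static void
wait_list_lock_current(void)
{
    for (;;)
    {
        unsigned    old = atomic_fetch_or(&lock_state, FLAG_LOCKED);
        int         ndelays = 0;

        if (!(old & FLAG_LOCKED))
            return;             /* acquired */

        while (atomic_load(&lock_state) & FLAG_LOCKED)
            spin_delay(&ndelays);   /* counter resets on every retry */
    }
}

/*
 * Proposed rule: the counter persists across retries, so the panic fires
 * after NUM_DELAYS sleeps without acquiring the lock -- a strictly more
 * likely event.
 */
static void
wait_list_lock_proposed(void)
{
    int         ndelays = 0;

    for (;;)
    {
        unsigned    old = atomic_fetch_or(&lock_state, FLAG_LOCKED);

        if (!(old & FLAG_LOCKED))
            return;             /* acquired */

        while (atomic_load(&lock_state) & FLAG_LOCKED)
            spin_delay(&ndelays);   /* counter survives retries */
    }
}

int
main(void)
{
    wait_list_lock_current();
    atomic_store(&lock_state, 0);
    wait_list_lock_proposed();
    printf("both variants acquired the (uncontended) lock\n");
    return 0;
}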

I think it could exacerbate the issue. Parag reported ~7k connections on a
128 core machine. The buffer replacement logic in < 16 tries to lock the old
and new lock partitions at once. That can lead to quite bad "chains" of
dependent lwlocks, occasionally putting all the pressure on a single lwlock.
With 7k waiters on a single spinlock, a higher frequency of wakeups makes it
much more likely that the process holding the spinlock will be put to sleep.

This is greatly exacerbated by the issue fixed in a4adc31f690: once the
waitqueue is long, the spinlock will be held for an extended amount of time.
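
To put a rough number on that effect, here is a toy model (not the pre-fix
PostgreSQL code; all names are invented, and it assumes the expensive part
was a linear scan of the wait queue while the wait-list lock is held):

#include <stdio.h>
#include <stdlib.h>

typedef struct Waiter
{
    int            procno;
    struct Waiter *next;
} Waiter;

static Waiter *wait_list;           /* head of the wait queue */
static volatile int list_locked;    /* stand-in for LW_FLAG_LOCKED */

/* remove one waiter; returns how many entries were scanned under the lock */
static int
dequeue_self(int procno)
{
    int         scanned = 0;

    list_locked = 1;                /* LWLockWaitListLock() stand-in */
    for (Waiter **p = &wait_list; *p != NULL; p = &(*p)->next)
    {
        scanned++;
        if ((*p)->procno == procno)
        {
            Waiter     *victim = *p;

            *p = victim->next;
            free(victim);
            break;
        }
    }
    list_locked = 0;                /* LWLockWaitListUnlock() stand-in */
    return scanned;
}

int
main(void)
{
    int         nwaiters = 7000;    /* roughly the reported connection count */

    /* build the queue; waiter 0 ends up at the tail */
    for (int i = 0; i < nwaiters; i++)
    {
        Waiter     *w = malloc(sizeof(Waiter));

        w->procno = i;
        w->next = wait_list;
        wait_list = w;
    }

    /* worst case: the whole queue is walked while the lock is held */
    printf("entries scanned under the lock: %d\n", dequeue_self(0));
    return 0;
}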

Greetings,

Andres Freund
