Re: Issue with the PRNG used by Postgres

From: Andres Freund <andres(at)anarazel(dot)de>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Parag Paul <parag(dot)paul(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Issue with the PRNG used by Postgres
Date: 2024-04-16 15:54:01
Message-ID: 20240416155401.spixsrw3qt4y6mjx@awork3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2024-04-15 10:54:16 -0400, Robert Haas wrote:
> On Fri, Apr 12, 2024 at 3:33 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > Here's a patch implementing this approach. I confirmed that before we trigger
> > the stuck spinlock logic very quickly and after we don't. However, if most
> > sleeps are interrupted, it can delay the stuck spinlock detection a good
> > bit. But that seems much better than triggering it too quickly.
>
> +1 for doing something about this. I'm not sure if it goes far enough,
> but it definitely seems much better than doing nothing.

One thing I started to be worried about is whether a patch ought to prevent
the timeout used by perform_spin_delay() from increasing when
interrupted. Otherwise a few signals can trigger quite long waits.

But as a I can't quite see a way to make this accurate in the backbranches, I
suspect something like what I posted is still a good first version.

> Given your findings, I'm honestly kind of surprised that I haven't seen
> problems of this type more frequently.

Same. I did a bunch of searches for the error, but found surprisingly
little.

I think in practice most spinlocks just aren't contended enough to reach
perform_spin_delay(). And we have improved some on that over time.

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2024-04-16 15:58:01 Re: Table AM Interface Enhancements
Previous Message Andres Freund 2024-04-16 15:35:01 Re: Stack overflow issue