Re: backends stuck in "startup"

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: andres(at)anarazel(dot)de (Andres Freund)
Cc: Justin Pryzby <pryzby(at)telsasoft(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: backends stuck in "startup"
Date: 2017-11-22 00:02:01
Message-ID: 7806.1511308921@sss.pgh.pa.us
Lists: pgsql-general

andres(at)anarazel(dot)de (Andres Freund) writes:
> On 2017-11-21 18:50:05 -0500, Tom Lane wrote:
>> (If Justin saw that while still on 9.6, then it'd be worth looking
>> closer.)

> Right. I took this to be referring to something before the current
> migration, but I might have overinterpreted things. There've been
> various forks/ports of pg around that had hand-coded replacements with
> futex usage, and there were definitely buggy versions going around a few
> years back.

Poking around in the archives reminded me of this thread:
https://www.postgresql.org/message-id/flat/14947.1475690465@sss.pgh.pa.us
which describes symptoms uncomfortably close to what Justin is showing.

I remember speculating that the SysV-sema implementation, because it'd
always enter the kernel, would provide some memory barrier behavior
that POSIX-sema code based on futexes might miss when taking the no-wait
path. I'd figured that any real problems of that sort would show up
pretty quickly, but that could've been overoptimistic. Maybe we need
to take a closer look at where LWLocks devolve to blocking on the process
semaphore and see if there are any implicit assumptions about barriers there.
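
To make the concern concrete, here is a minimal sketch of a user-space
semaphore no-wait path written with C11 atomics. It is an illustration only:
the sketch_sema type and its functions are hypothetical names, not
PostgreSQL's PGSemaphore code and not any libc's sem_wait, and the real
futex sleep/wake path is left out. The point is that when the count is
already positive the caller never enters the kernel, so any ordering
guarantee has to come from the atomic operation itself.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space semaphore, for illustration only. */
typedef struct
{
    atomic_int  count;          /* number of available units */
} sketch_sema;

static void
sketch_sema_init(sketch_sema *s, int initial)
{
    assert(initial >= 0);
    atomic_init(&s->count, initial);
}

/*
 * No-wait path: take one unit if available, without any system call.
 *
 * The successful compare-and-swap uses acquire ordering, so reads made
 * after a successful return cannot be reordered before the decrement.
 * If this were memory_order_relaxed instead, the fast path would supply
 * no barrier at all; a SysV semop() call, by contrast, always traps
 * into the kernel.
 */
static bool
sketch_sema_trywait(sketch_sema *s)
{
    int         old = atomic_load_explicit(&s->count, memory_order_relaxed);

    while (old > 0)
    {
        /* On failure, old is reloaded with the current count. */
        if (atomic_compare_exchange_weak_explicit(&s->count, &old, old - 1,
                                                  memory_order_acquire,
                                                  memory_order_relaxed))
            return true;
    }
    return false;               /* a real implementation would sleep here */
}

/*
 * Post: release ordering publishes the poster's earlier writes to
 * whichever thread later acquires the unit.  The futex wake-up of any
 * sleeper is omitted from this sketch.
 */
static void
sketch_sema_post(sketch_sema *s)
{
    atomic_fetch_add_explicit(&s->count, 1, memory_order_release);
}
```

Whether any futex-based implementation in the field actually lacks such
ordering on its fast path is exactly the open question; the sketch only
shows where a caller that implicitly relied on a kernel entry would be
exposed.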

regards, tom lane
