From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | andres(at)anarazel(dot)de (Andres Freund) |
Cc: | Justin Pryzby <pryzby(at)telsasoft(dot)com>, pgsql-general(at)postgresql(dot)org |
Subject: | Re: backends stuck in "startup" |
Date: | 2017-11-22 00:02:01 |
Message-ID: | 7806.1511308921@sss.pgh.pa.us |
Lists: | pgsql-general |
andres(at)anarazel(dot)de (Andres Freund) writes:
> On 2017-11-21 18:50:05 -0500, Tom Lane wrote:
>> (If Justin saw that while still on 9.6, then it'd be worth looking
>> closer.)
> Right. I took this to be referring to something before the current
> migration, but I might have overinterpreted things. There've been
> various forks/ports of pg around that had hand-coded replacements with
> futex usage, and there were definitely buggy versions going around a few
> years back.
Poking around in the archives reminded me of this thread:
https://www.postgresql.org/message-id/flat/14947(dot)1475690465(at)sss(dot)pgh(dot)pa(dot)us
which describes symptoms uncomfortably close to what Justin is showing.
I remember speculating that the SysV-sema implementation, because it'd
always enter the kernel, would provide some memory barrier behavior
that POSIX-sema code based on futexes might miss when taking the no-wait
path. I'd figured that any real problems of that sort would show up
pretty quickly, but that could've been overoptimistic. Maybe we need
to take a closer look at where LWLocks devolve to blocking on the process
semaphore and see if there are any implicit assumptions about barriers there.
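
To make the concern concrete, here is a minimal sketch of a semaphore whose
no-wait path never enters the kernel. It is not PostgreSQL's PGSemaphore code
and not glibc's sem_t; toy_sem, toy_sem_trywait, toy_sem_post and
demo_handoff are hypothetical names invented for illustration. The point is
only that, on such a path, any ordering guarantee has to come from the atomic
operation itself (acquire on the successful decrement, release on the post),
whereas a SysV-style implementation would get it as a side effect of the
system call. Whether any real implementation actually omits these barriers is
the speculation above, not something this sketch establishes.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical toy semaphore, for illustration only. */
typedef struct
{
    atomic_int count;
} toy_sem;

static void
toy_sem_init(toy_sem *s, int initial)
{
    assert(initial >= 0);
    atomic_init(&s->count, initial);
}

/*
 * No-wait path: claim one unit entirely in user space.
 * The acquire ordering on the successful compare-exchange is what makes
 * the poster's earlier writes visible to us; if it were relaxed, nothing
 * on this path would provide that guarantee.
 */
static bool
toy_sem_trywait(toy_sem *s)
{
    int c = atomic_load_explicit(&s->count, memory_order_relaxed);

    while (c > 0)
    {
        /* On failure, c is reloaded with the current value and we retry. */
        if (atomic_compare_exchange_weak_explicit(&s->count, &c, c - 1,
                                                  memory_order_acquire,
                                                  memory_order_relaxed))
            return true;
    }
    /* A real implementation would block in the kernel at this point. */
    return false;
}

/* Release ordering publishes everything written before the post. */
static void
toy_sem_post(toy_sem *s)
{
    atomic_fetch_add_explicit(&s->count, 1, memory_order_release);
    /* A real implementation would also wake a sleeping waiter here. */
}

static int shared_payload;      /* plain data handed over via the semaphore */

/* Single-threaded walk through the hand-off protocol. */
static int
demo_handoff(void)
{
    toy_sem s;

    toy_sem_init(&s, 0);
    if (toy_sem_trywait(&s))
        return -1;              /* nothing posted yet, must not succeed */

    shared_payload = 42;        /* write that the post must publish */
    toy_sem_post(&s);

    if (!toy_sem_trywait(&s))
        return -2;              /* one unit is available now */
    return shared_payload;
}
```

The demo runs in a single thread, so it only exercises the protocol; it cannot
demonstrate a missing barrier, which would need concurrent threads on
weakly ordered hardware to show up.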
regards, tom lane