From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: anole: assorted stability problems |
Date: | 2015-06-29 02:07:56 |
Message-ID: | CA+TgmoaaeRv=1120hQdTjF++Sd4G2zMA-U2-UKzJMD1vMF+CWg@mail.gmail.com |
Lists: | pgsql-hackers |

On Sun, Jun 28, 2015 at 9:17 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> That sucks. It was easy to see that the old fallback barrier
>> implementation wasn't re-entrant, but this one should be. And now
>> that I look at it again, doesn't the failure message indicate that's
>> not the problem anyway?
>
>> ! PANIC: stuck spinlock (c00000000d6f4140) detected at lwlock.c:816
>> ! PANIC: stuck spinlock (c00000000d72f6e0) detected at lwlock.c:770
>
> I was assuming that a leaky memory barrier was allowing the spinlock
> state to become inconsistent, or at least to be perceived as inconsistent.
> But I'm not too clear on how the barrier changes you and Andres have been
> making have affected the spinlock code.

For the most part, they haven't. Andres did a bunch of work to add
atomics support, and overhauled the barrier implementation that I
committed to 9.2 along the way. But that had minimal impact on
s_lock.h.

What we did do that touched s_lock.h was attempt to ensure that
SpinLockAcquire() and SpinLockRelease() function as compiler barriers,
so that it should no longer be necessary to litter the code with
"volatile" in every function that uses those. It is possible that
this could be broken on HP-UX. If _Asm_sched_fence() doesn't
constrain the compiler appropriately, that could explain the problems
we're seeing here. But we're not the only ones using that incantation,
so I'm left scratching my head.
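
To make the compiler-barrier point concrete, here is a minimal sketch of the idea. It is not the actual s_lock.h code: every toy_* name is made up for illustration, and it assumes GCC or Clang, using an empty asm statement with a "memory" clobber rather than the HP compiler's _Asm_sched_fence(). The point is only that once acquire and release constrain the compiler themselves, the caller can touch the protected data through a plain, non-volatile pointer.

```c
/*
 * Illustrative sketch only: NOT the contents of s_lock.h.
 * All toy_* names are hypothetical. Assumes GCC or Clang.
 */
#include <assert.h>

typedef int toy_slock_t;

/*
 * Compiler barrier: emits no instruction, but the "memory" clobber tells
 * the compiler it may not move loads or stores across this point, nor keep
 * memory values cached in registers across it.
 */
#define toy_compiler_barrier() __asm__ __volatile__("" ::: "memory")

static void
toy_spin_acquire(toy_slock_t *lock)
{
    /* The builtin returns the previous value; spin until it was 0 (free). */
    while (__sync_lock_test_and_set(lock, 1) != 0)
        ;
    toy_compiler_barrier();
}

static void
toy_spin_release(toy_slock_t *lock)
{
    toy_compiler_barrier();
    __sync_lock_release(lock);      /* stores 0 with release semantics */
}

struct toy_shared
{
    toy_slock_t mutex;
    int         counter;
};

/*
 * Note the plain pointer: no "volatile" qualifier is needed here, because
 * acquire and release already stop the compiler from hoisting or sinking
 * the counter access out of the critical section.
 */
int
toy_increment(struct toy_shared *s, int times)
{
    int         i;

    for (i = 0; i < times; i++)
    {
        toy_spin_acquire(&s->mutex);
        s->counter++;
        toy_spin_release(&s->mutex);
    }
    assert(s->mutex == 0);          /* lock must be free on exit */
    return s->counter;
}
```

If the barrier in a scheme like this did not actually constrain the compiler, the access to the counter could in principle be moved outside the locked region, which is the kind of breakage being speculated about for HP-UX above.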
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company