From: | Jeff Janes <jeff(dot)janes(at)gmail(dot)com> |
---|---|
To: | Heikki <hlinnaka(at)iki(dot)fi> |
Cc: | Andres Freund <andres(at)anarazel(dot)de>, Peter Geoghegan <pg(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: LWLock deadlock and gdb advice |
Date: | 2015-07-19 18:49:14 |
Message-ID: | CAMkU=1x265eWfhXanB1ZXphAVuTanSYwuL-iaaqz9h2u5g8A=Q@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Wed, Jul 15, 2015 at 8:44 AM, Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
> On 06/30/2015 11:24 PM, Andres Freund wrote:
>
>> On 2015-06-30 22:19:02 +0300, Heikki Linnakangas wrote:
>>
>>> Hm. Right. A recheck of the value after the queuing should be sufficient
>>>> to fix? That's how we deal with the exact same scenarios for the normal
>>>> lock state, so that shouldn't be very hard to add.
>>>>
>>>
>>> Yeah. It's probably more efficient to not release the spinlock between
>>> checking the value and LWLockQueueSelf() though.
>>>
>>
>> Right now LWLockQueueSelf() takes spinlocks etc itself, and is used that
>> way in a bunch of callsites... So that'd be harder. Additionally I'm
>> planning to get rid of the spinlocks around queuing (they show up as
>> bottlenecks in contended workloads), so it seems more future proof not
>> to assume that either way. On top of that I think we should, when
>> available (or using the same type of fallback as for 32bit?), use 64 bit
>> atomics for the var anyway.
>>
>> I'll take a stab at fixing this tomorrow.
>>>
>>
>> Thanks! Do you mean both or "just" the second issue?
>>
>
> Both. Here's the patch.
>
> Previously, LWLockAcquireWithVar set the variable associated with the lock
> atomically with acquiring it. Before the lwlock-scalability changes, that
> was straightforward because you held the spinlock anyway, but it's a lot
> harder/expensive now. So I changed the way acquiring a lock with a variable
> works. There is now a separate flag, LW_FLAG_VAR_SET, which indicates that
> the current lock holder has updated the variable. The LWLockAcquireWithVar
> function is gone - you now just use LWLockAcquire(), which always clears
> the LW_FLAG_VAR_SET flag, and you can call LWLockUpdateVar() after that if
> you want to set the variable immediately. LWLockWaitForVar() always waits
> if the flag is not set, i.e. it will not return regardless of the
> variable's value, if the current lock-holder has not updated it yet.
>
> This passes make check, but I haven't done any testing beyond that. Does
> this look sane to you?
After applying this patch to commit fdf28853ae6a397497b79f, it has survived
testing long enough to convince that this fixes the problem.
Cheers,
Jeff
From | Date | Subject | |
---|---|---|---|
Next Message | Pavel Stehule | 2015-07-19 18:54:16 | Re: pg_dump quietly ignore missing tables - is it bug? |
Previous Message | Noah Misch | 2015-07-19 17:18:54 | Re: security labels on databases are bad for dump & restore |