From: | knizhnik <knizhnik(at)garret(dot)ru> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Memory ordering issue in LWLockRelease, WakeupWaiters, WALInsertSlotRelease |
Date: | 2014-02-14 16:23:32 |
Message-ID: | 52FE4304.4060309@garret.ru |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 02/14/2014 07:51 PM, Andres Freund wrote:
> On 2014-02-14 15:03:16 +0100, Florian Pflug wrote:
>> On Feb14, 2014, at 14:07 , Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
>>> On 2014-02-14 13:52:45 +0100, Florian Pflug wrote:
>>>>> I agree we should do that, but imo not in the backbranches. Anything
>>>>> more than than the minimal fix in that code should be avoided in the
>>>>> stable branches, this stuff is friggin performance sensitive, and the
>>>>> spinlock already is a *major* contention point in many workloads.
>>>>
>>>> No argument there. But unless I missed something, there actually is a bug
>>>> there, and using volatile won't fix it. A barrier would, but what about the
>>>> back branches that don't have barrier.h?
>>>
>>> Yea, but I don't see a better alternative. Realistically the likelihood
>>> of a problem without the compiler reordering stuff is miniscule on any
>>> relevant platform that's supported by the !barrier.h branches. Newer
>>> ARMs are the only realistic suspect, and the support for in older
>>> releases wasn't so good...
>>
>> Isn't POWER/PowerPC potentially affected by this?
>
> I wouldn't consider it a major architecture... And I am not sure how
> much out of order a CPU has to be to be affected by this, the read side
> uses spinlocks, which in most of the spinlock implementations implies a
> full memory barrier which in many cache coherency designs will cause
> other CPUs to flush writes. And I think the control dependency in the
> PGSemaphoreUnlock() loop will actually cause a flush on ppc...
PGSemaphoreUnlock() should really introduce memory barrier, but the problem can happen before PGSemaphoreUnlock() is called.
So presence of PGSemaphoreUnlock() just reduces probability of race condition on non-TSO architectures (PPC, ARM, IA64,...) but doesn't completely eliminate it.
>
>> Also, there is still the alternative of not resetting lwWaitLink in LWLockRelease.
>> I can understand why you oppose that - it's certainly nicer to not have stray
>> pointer lying around. But since (as least as far as we know)
>>
>> a) resetting lwWaitLink is not strictly necessary
>
> I don't want to rely on that in the backbranches, that's an entirely new
> assumption. I think anything more than minimal changes are hard to
> justify for a theoretical issue that hasn't been observed in the field.
>
> I think the biggest danger here is that writes are reordered by the
> compiler, that we definitely need to protect against. Making a variable
> volatile or introducing a memory barrier is pretty simple to analyze.
>
>> b) resetting lwWaitLink introduces a race condition (however unlikely)
>>
>> we'll trade correctness for cleanliness if we continue to reset lwWaitLink
>> without protecting against the race. That's a bit of a weird trade-off to make.
>
> It's not just cleanliness, it's being able to actually debug crashes.
Frankly speaking I do not understand why elimination of resetting of lwWaitLink was considered to be bad idea.
Resetting this pointer to NULL will not help much to analyze crash dumps, because right now it is possible that lwWaitLink==NULL but process in waiting for a lock and linked in the list
(if it is the last element of the list). So the fact that lwWaitLink==NULL actually gives not so much useful information.
>
> Greetings,
>
> Andres Freund
>
From | Date | Subject | |
---|---|---|---|
Next Message | Andres Freund | 2014-02-14 16:28:28 | Re: Memory ordering issue in LWLockRelease, WakeupWaiters, WALInsertSlotRelease |
Previous Message | Andres Freund | 2014-02-14 16:14:59 | Re: HBA files w/include support? |