From: | Florian Pflug <fgp(at)phlo(dot)org> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: sinval synchronization considered harmful |
Date: | 2011-07-21 22:22:09 |
Message-ID: | 79FCA27A-B912-48B4-90A4-562FEFB1EE75@phlo.org |
Lists: | pgsql-hackers |
On Jul 21, 2011, at 21:15, Robert Haas wrote:
> On Thu, Jul 21, 2011 at 2:50 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>>> ... On these machines, you need to issue an explicit memory barrier
>>> instruction at each sequence point, or just acquire and release a
>>> spinlock.
>>
>> Right, and the reason that a spinlock fixes it is that we have memory
>> barrier instructions built into the spinlock code sequences on machines
>> where it matters.
>>
>> To get to the point where we could do the sort of optimization Robert
>> is talking about, someone will have to build suitable primitives for
>> all the platforms we support. In the cases where we use gcc ASM in
>> s_lock.h, it shouldn't be too hard to pull out the barrier
>> instruction(s) ... but on platforms where we rely on OS-supplied
>> functions, some research is going to be needed.
>
> Yeah, although falling back to SpinLockAcquire() and SpinLockRelease()
> on a backend-private slock_t should work anywhere that PostgreSQL
> works at all[1]. That will probably be slower than a memory fence
> instruction and certainly slower than a compiler barrier, but the
> point is that - right now - we're doing it the slow way everywhere.
As I discovered while playing with various lockless algorithms to
improve our LWLocks, spin locks aren't actually a replacement for
a (full) barrier.
Lock acquisition only really needs to guarantee that loads and stores
which come after the acquisition operation in program order (i.e., in
the instruction stream) aren't globally visible before that operation
completes. This kind of barrier behaviour is often fittingly called
"acquire barrier".
Similarly, a lock release operation only needs to guarantee that loads
and stores which occur before that operation in program order are
globally visible before the release operation completes. This, again,
is fittingly called "release barrier".
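To make the two definitions concrete, here is a minimal C11 sketch of a lock
that provides exactly these one-way guarantees and nothing more. The
demo_spinlock type and the demo_spin_* functions are hypothetical names used
only for illustration; this is not the s_lock.h code, and real
implementations may well be stronger than this.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical demo lock, NOT PostgreSQL's slock_t. */
typedef struct
{
    atomic_flag held;
} demo_spinlock;

/* Initialise with: demo_spinlock lock = { ATOMIC_FLAG_INIT }; */

/* One attempt to take the lock; returns true if we got it. */
static inline bool
demo_spin_try_acquire(demo_spinlock *lock)
{
    /*
     * memory_order_acquire: loads and stores that follow in program order
     * may not become visible before this operation. Earlier accesses are
     * NOT prevented from sinking below it.
     */
    return !atomic_flag_test_and_set_explicit(&lock->held,
                                              memory_order_acquire);
}

/* Busy-wait until the lock is ours. */
static inline void
demo_spin_acquire(demo_spinlock *lock)
{
    while (!demo_spin_try_acquire(lock))
        ;                       /* spin */
}

static inline void
demo_spin_release(demo_spinlock *lock)
{
    /*
     * memory_order_release: loads and stores that precede this in program
     * order are visible before the lock is seen as free. Later accesses
     * are NOT prevented from rising above it.
     */
    atomic_flag_clear_explicit(&lock->held, memory_order_release);
}
```

Each operation constrains reordering in one direction only, which is why an
acquire/release pair does not add up to a full barrier.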
Now assume the following code fragment:
global1 = 1;
SpinLockAcquire();
SpinLockRelease();
global2 = 1;
If SpinLockAcquire() has "acquire barrier" semantics and SpinLockRelease()
has "release barrier" semantics, then it's possible for the store to global1
to be delayed until after SpinLockAcquire(), and similarly for the store
to global2 to be executed before SpinLockRelease() completes. In other
words, what can happen is
SpinLockAcquire();
global1 = 1;
global2 = 1;
SpinLockRelease();
But once that can happen, there's no reason that it couldn't also be
SpinLockAcquire();
global2 = 1;
global1 = 1;
SpinLockRelease();
I didn't check if any of our spin lock implementations is actually affected
by this, but it doesn't seem wise to rely on them being full barriers, even
if it may be true today.
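
For comparison, here is a sketch of the same two stores ordered by an explicit
full fence instead of a lock/unlock pair. The thread notes that suitable
barrier primitives still have to be built for PostgreSQL, so this uses
standard C11 atomics instead. Assumptions: global1 and global2 are declared
as atomic_int here, and publish_in_order() and observe_is_consistent() are
made-up helper names.

```c
#include <stdatomic.h>

/* Stand-ins for the two globals in the fragment above (zero-initialised). */
static atomic_int global1;
static atomic_int global2;

/* Writer: the fence keeps the store to global1 ahead of the one to global2. */
static inline void
publish_in_order(void)
{
    atomic_store_explicit(&global1, 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);  /* full barrier */
    atomic_store_explicit(&global2, 1, memory_order_relaxed);
}

/*
 * Reader: returns 0 only if it sees global2 set while global1 is still
 * unset, i.e. if the two stores were observed out of order.
 */
static inline int
observe_is_consistent(void)
{
    int seen2 = atomic_load_explicit(&global2, memory_order_relaxed);

    atomic_thread_fence(memory_order_seq_cst);  /* pairs with the writer */

    int seen1 = atomic_load_explicit(&global1, memory_order_relaxed);

    return !(seen2 == 1 && seen1 == 0);
}
```

With the fences in place, a reader that sees global2 == 1 must also see
global1 == 1. That is the guarantee the empty acquire/release pair in the
fragment does not provide.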
best regards,
Florian Pflug