From: | Merlin Moncure <mmoncure(at)gmail(dot)com> |
---|---|
To: | Andres Freund <andres(at)2ndquadrant(dot)com> |
Cc: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com> |
Subject: | Re: Wait free LW_SHARED acquisition - v0.9 |
Date: | 2014-10-14 18:36:57 |
Message-ID: | CAHyXU0zeAkZyyoAgBePnA8L469gFV6+-XYX+GGsQ=CaegZL9qw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Tue, Oct 14, 2014 at 8:58 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> On 2014-10-14 08:40:49 -0500, Merlin Moncure wrote:
>> On Fri, Oct 10, 2014 at 11:00 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
>> > Which is nearly trivial now that atomics are in. Check out the attached
>> > WIP patch which eliminates the spinlock from StrategyGetBuffer() unless
>> > there's buffers on the freelist.
>>
>> Is this safe?
>>
>> + victim = pg_atomic_fetch_add_u32(&StrategyControl->nextVictimBuffer, 1);
>>
>> - if (++StrategyControl->nextVictimBuffer >= NBuffers)
>> + buf = &BufferDescriptors[victim % NBuffers];
>> +
>> + if (victim % NBuffers == 0)
>> <snip>
>>
>> I don't think there's any guarantee you could keep nextVictimBuffer
>> from wandering off the end.
>
> Not sure what you mean? It'll cycle around at 2^32. The code doesn't try
> to avoid nextVictimBuffer from going above NBuffers. To avoid overrunning
> &BufferDescriptors I'm doing % NBuffers.
>
> Yes, that'll have the disadvantage of the first buffers being slightly
> more likely to be hit, but for that to be relevant you'd need
> unrealistically large amounts of shared_buffers.
Right -- my mistake. That's clever. I think that should work well.
> I think that's pretty much orthogonal. I *do* think it makes sense to
> increment nextVictimBuffer in bigger steps. But the above doesn't
> prohibit doing so - and it'll still be be much better without the
> spinlock. Same number of atomics, but no potential of spinning and no
> potential of being put to sleep while holding the spinlock.
>
> This callsite has a comparatively large likelihood of being put to sleep
> while holding a spinlock because it can run for a very long time doing
> nothing but spinlocking.
A while back, I submitted a minor tweak to the clock sweep so that,
instead of spinlocking every single buffer header as it swept it just
did a single TAS as a kind of a trylock and punted to the next buffer
if the test failed on the principle there's not good reason to hang
around. You only spin if you passed the first test; that should
reduce the likelihood of actual spinning to approximately zero. I
still maintain there's no reason not to do that (I couldn't show a
benefit but that was because mapping list locking was masking any
clock sweep contention at that time).
merlin
From | Date | Subject | |
---|---|---|---|
Next Message | Simon Riggs | 2014-10-14 19:09:50 | Re: pgaudit - an auditing extension for PostgreSQL |
Previous Message | Alvaro Herrera | 2014-10-14 18:21:20 | Re: pg_dump refactor patch to remove global variables |