From: Marko Kreen <marko(at)l-t(dot)ee>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Spinlocks, yet again: analysis and proposed patches
Date: 2005-09-13 07:31:19
Message-ID: 20050913073119.GA29529@l-t.ee
Lists: pgsql-hackers

On Sun, Sep 11, 2005 at 05:59:49PM -0400, Tom Lane wrote:
> The second reason that the futex patch is helping is that when
> a spinlock delay does occur, it allows the delaying process to be
> awoken almost immediately, rather than delaying 10 msec or more
> as the existing code does. However, given that we are only expecting
> the spinlock to be held for a couple dozen instructions, using the
> kernel futex mechanism is huge overkill --- the in-kernel overhead
> to manage the futex state is almost certainly several orders of
> magnitude more than the delay we actually want.
Why do you think so? AFAIK in the uncontended case there is no
kernel access at all, only an atomic inc/dec. In the contended case
you want a task switch anyway, so the cost of managing the futex
state should not matter. Also, this mechanism is specifically
optimized for inter-process locking; I don't think you can get a
more efficient mechanism out of the side effects of generic
syscalls.

If you don't want Linux-specific locking in the core code, then
that is another matter...
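
To make the cost model concrete: a futex-based lock enters the
kernel only when it actually has to park or wake a waiter. Here is
a minimal sketch along the lines of Drepper's "Futexes Are Tricky"
(the names futex_lock/futex_unlock and the GCC __atomic builtins
are my own illustration, not the actual patch code):

    #define _GNU_SOURCE
    #include <stdint.h>
    #include <unistd.h>
    #include <sys/syscall.h>
    #include <linux/futex.h>

    /* 0 = unlocked, 1 = locked, 2 = locked with (possible) waiters */
    typedef volatile int32_t futex_slock_t;

    static long
    sys_futex(futex_slock_t *addr, int op, int val)
    {
        return syscall(SYS_futex, addr, op, val, NULL, NULL, 0);
    }

    static void
    futex_lock(futex_slock_t *lock)
    {
        int32_t expected = 0;

        /* Fast path: the uncontended case is one atomic CAS, no syscall. */
        if (__atomic_compare_exchange_n(lock, &expected, 1, 0,
                                        __ATOMIC_ACQUIRE, __ATOMIC_RELAXED))
            return;

        /* Slow path: advertise contention, then sleep in the kernel
         * until the holder wakes us.  FUTEX_WAIT rechecks that the
         * word is still 2 before sleeping, so no wakeup is lost. */
        while (__atomic_exchange_n(lock, 2, __ATOMIC_ACQUIRE) != 0)
            sys_futex(lock, FUTEX_WAIT, 2);
    }

    static void
    futex_unlock(futex_slock_t *lock)
    {
        /* Wake one waiter only if someone may actually be sleeping. */
        if (__atomic_exchange_n(lock, 0, __ATOMIC_RELEASE) == 2)
            sys_futex(lock, FUTEX_WAKE, 1);
    }

On the fast path that is a single locked instruction, so the
uncontended futex costs no more than a plain spinlock.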
> 1. Use sched_yield() if available: it does just what we want,
> ie, yield the processor without forcing any useless time delay
> before we can be rescheduled. This doesn't exist everywhere
> but it exists in recent Linuxen, so I tried it. It made for a
> beautiful improvement in my test case numbers: CPU utilization
> went to 100% and the context swap rate to almost nil. Unfortunately,
> I also saw fairly frequent "stuck spinlock" panics when running
> more queries than there were processors --- this despite increasing
> NUM_DELAYS to 10000 in s_lock.c. So I don't trust sched_yield
> anymore. Whatever it's doing in Linux 2.6 isn't what you'd expect.
> (I speculate that it's set up to only yield the processor to other
> processes already affiliated to that processor. In any case, it
> is definitely capable of getting through 10000 yields without
> running the guy who's holding the spinlock.)
This is the intended behaviour of sched_yield:
http://lwn.net/Articles/31462/
http://marc.theaimsgroup.com/?l=linux-kernel&m=112432727428224&w=2
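
For reference, the failing loop has roughly this shape (a simplified
sketch using the s_lock.c parameter names; TAS() here is a GCC-builtin
stand-in for the platform-specific test-and-set):

    #include <sched.h>
    #include <stdio.h>
    #include <stdlib.h>

    #define SPINS_PER_DELAY 100
    #define NUM_DELAYS      10000   /* the increased value from the test */

    /* stand-in for the platform-specific TAS() in s_lock.h */
    #define TAS(lock)   __atomic_exchange_n((lock), 1, __ATOMIC_ACQUIRE)

    static void
    spin_lock_sketch(volatile int *lock)
    {
        int spins = 0;
        int delays = 0;

        while (TAS(lock))
        {
            if (++spins >= SPINS_PER_DELAY)
            {
                if (++delays >= NUM_DELAYS)
                {
                    fprintf(stderr, "stuck spinlock detected\n");
                    abort();    /* the "stuck spinlock" panic */
                }
                /* As the links above explain, on 2.6 this considers
                 * only the current CPU's runqueue (and drops the caller
                 * to the expired array), so a holder queued on another
                 * CPU can stay unscheduled through all NUM_DELAYS
                 * yields. */
                sched_yield();
                spins = 0;
            }
        }
    }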
--
marko