> We're seeing an average of 30,000 context-switches a sec. This problem
> was much worse w/8.0 and got bearable with 8.1 but slowly resurfaced.
Is this from LWLock or spinlock contention? strace'ing a few backends
could tell the difference: compare how many select(0,...) calls you see
(spinlock backoff sleeps) against how many semop() calls (LWLock waits).
Also, how do both compare to the amount of real work (such as
read/write calls)?
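
If counting by eye gets tedious, something along these lines could tally a saved trace. It is only a rough sketch: it assumes plain strace text output (one syscall per line, optionally prefixed with a pid), the function name tally_strace is made up for illustration, and the bucket labels reflect my reading that select(0,...) is a spinlock delay while semop() is an LWLock wait or wakeup.

```python
import re
from collections import Counter

# Optional "[pid N] " or "N " prefix, then the syscall name and its arguments.
# Assumption: no timestamp options (-t, -r) were passed to strace.
_SYSCALL = re.compile(r"^(?:\[pid\s+\d+\]\s+|\d+\s+)?(\w+)\((.*)$")


def tally_strace(lines):
    """Bucket strace output lines into spinlock sleeps, semops, I/O and other."""
    counts = Counter()
    for line in lines:
        m = _SYSCALL.match(line.strip())
        if not m:
            continue  # signal notices, exit messages, blank lines
        name, args = m.group(1), m.group(2)
        if name == "select" and args.startswith("0,"):
            # select() with no file descriptors is just a sleep
            counts["select0"] += 1
        elif name == "semop":
            counts["semop"] += 1
        elif name in ("read", "write"):
            counts["io"] += 1
        else:
            counts["other"] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python tally.py backend.trace
    with open(sys.argv[1]) as f:
        for bucket, n in sorted(tally_strace(f).items()):
            print(bucket, n)
```

The trace itself could come from something like "strace -p <backend pid> -o backend.trace" left running for a few seconds. A select0 count that dwarfs semop would point at spinlocks, the reverse at LWLocks, and either one dwarfing io suggests the backend is mostly waiting rather than working.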
Do you have any long-running transactions, and if so does shutting
them down help? There's been some discussion about thrashing of the
pg_subtrans buffers being a problem, and that's mainly a function of
the age of the oldest open transaction.

regards, tom lane