From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | josh(at)agliodbs(dot)com |
Cc: | pgsql-performance(at)postgresql(dot)org |
Subject: | Re: Wierd context-switching issue on Xeon |
Date: | 2003-11-25 23:40:47 |
Message-ID: | 9470.1069803647@sss.pgh.pa.us |
Lists: | pgsql-performance |
Josh Berkus <josh(at)agliodbs(dot)com> writes:
> We're seeing some odd issues with hyperthreading-capable Xeons, whether or not
> hyperthreading is enabled. Basically, when a small number of really-heavy
> duty queries hit the system and push all of the CPUs to more than 70% used
> (about 1/2 user & 1/2 kernel), the system goes to 100,000+ context switches
> per second and performance degrades.
Strictly a WAG ... but what this sounds like to me is disastrously bad
behavior of the spinlock code under heavy contention. We thought we'd
fixed the spinlock code for SMP machines a while ago, but maybe
hyperthreading opens some new vistas for misbehavior ...
> I know that there's other Xeon users on this list ... has anyone else seen
> anything like that? The machines are Dells running Red Hat 7.3.
What Postgres version? Is it easy for you to try 7.4? If we were
really lucky, the random-backoff algorithm added late in 7.4 development
would cure this.
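
For readers unfamiliar with the idea, here is a minimal sketch of a spinlock that spins briefly and then sleeps for a randomized, growing interval. It is not the s_lock.c code: the names (demo_spinlock, demo_lock, demo_next_delay) and the constants are hypothetical and chosen only for illustration, and it assumes a C11 compiler with <stdatomic.h>. The point is that randomizing the sleep keeps contending processes from waking in lockstep and colliding on the lock again.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>
#include <sys/select.h>

/* Hypothetical tuning constants for this sketch; not PostgreSQL's values. */
#define SPINS_PER_DELAY 100
#define MIN_DELAY_USEC  1000L
#define MAX_DELAY_USEC  1000000L

typedef struct {
    atomic_flag held;           /* set while some thread owns the lock */
} demo_spinlock;

/* Sleep by calling select() with no descriptors and only a timeout. */
void demo_sleep_usec(long usec)
{
    struct timeval tv;

    assert(usec >= 0);
    tv.tv_sec = usec / 1000000L;
    tv.tv_usec = usec % 1000000L;
    (void) select(0, NULL, NULL, NULL, &tv);
}

/*
 * Grow the delay by a random amount between 0 and the current delay,
 * so the result lies in [cur, 2*cur]. Wrap back to the minimum once
 * the maximum is exceeded.
 */
long demo_next_delay(long cur_usec)
{
    long next = cur_usec + rand() % (cur_usec + 1);

    if (next > MAX_DELAY_USEC)
        next = MIN_DELAY_USEC;
    return next;
}

/* Acquire the lock; returns how many times the caller had to sleep. */
int demo_lock(demo_spinlock *lock)
{
    int  spins = 0;
    int  sleeps = 0;
    long delay = MIN_DELAY_USEC;

    while (atomic_flag_test_and_set_explicit(&lock->held,
                                             memory_order_acquire)) {
        if (++spins >= SPINS_PER_DELAY) {
            /* Spun long enough: give up the CPU for a while. */
            demo_sleep_usec(delay);
            delay = demo_next_delay(delay);
            spins = 0;
            sleeps++;
        }
    }
    return sleeps;
}

void demo_unlock(demo_spinlock *lock)
{
    atomic_flag_clear_explicit(&lock->held, memory_order_release);
}
```

A lock would be declared as `demo_spinlock l = { ATOMIC_FLAG_INIT };`. Note that every pass through the sleep branch is a kernel call that gives up the CPU, which is why heavy contention on a lock like this could plausibly show up as a context-switch storm.
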
If you can't try 7.4, or want to gather more data first, it would be
good to try to confirm or disprove the theory that the context switches
are coming from spinlock delays. If they are, they'd be coming from the
select() calls in s_lock() in s_lock.c. Can you strace or something to
see what kernel calls the context switches occur on?
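
To make the connection between select() delays and context switches concrete, here is a small sketch that sleeps via select() a given number of times and reports how many voluntary context switches the process accumulated meanwhile. The function name count_voluntary_switches is made up for this example, and it assumes a platform (such as a reasonably modern Linux) where getrusage() fills in ru_nvcsw. It is only a way to see the effect in isolation; it does not replace tracing the actual backends.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/resource.h>
#include <sys/select.h>
#include <sys/time.h>

/*
 * Sleep n_sleeps times for usec microseconds each, using select() with
 * no descriptors, and return the number of voluntary context switches
 * charged to this process during the loop. Returns -1 if getrusage()
 * fails.
 */
long count_voluntary_switches(int n_sleeps, long usec)
{
    struct rusage before;
    struct rusage after;
    int i;

    assert(n_sleeps >= 0);
    assert(usec >= 0 && usec < 1000000L);

    if (getrusage(RUSAGE_SELF, &before) != 0)
        return -1;

    for (i = 0; i < n_sleeps; i++) {
        struct timeval tv;

        /* select() may modify the timeout, so reset it every time. */
        tv.tv_sec = 0;
        tv.tv_usec = usec;
        (void) select(0, NULL, NULL, NULL, &tv);
    }

    if (getrusage(RUSAGE_SELF, &after) != 0)
        return -1;

    return after.ru_nvcsw - before.ru_nvcsw;
}
```

On a system that counts them, calling `count_voluntary_switches(1000, 1000)` should typically report a number close to the number of sleeps, since each timed select() blocks and hands the CPU back to the scheduler. If the backends are sleeping in s_lock() at a high rate, the same mechanism would account for the switch counts being reported.
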
Another line of thought is that RH 7.3 is a long ways back, and it
wasn't so very long ago that Linux still had lots of SMP bugs. Maybe
what you really need is a kernel update?
regards, tom lane