From: | Greg Smith <gsmith(at)gregsmith(dot)com> |
---|---|
To: | Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov> |
Cc: | pgsql-performance(at)postgresql(dot)org |
Subject: | Re: Strange behavior: pgbench and new Linux kernels |
Date: | 2009-04-04 16:07:58 |
Message-ID: | alpine.GSO.2.01.0904041147540.27286@westnet.com |
Lists: | pgsql-performance |
On Tue, 31 Mar 2009, Kevin Grittner wrote:
>>>> On Thu, Apr 17, 2008 at 7:26 PM, Greg Smith wrote:
>
>> On this benchmark 2.6.25 is the worst kernel yet:
>
> I don't remember seeing a follow-up on this issue from last year.
> Are there still any particular kernels to avoid based on this?
I just discovered something really fascinating here. The problem is
strictly limited to when you're connecting via Unix-domain sockets; use
TCP/IP instead, and it goes away.
To refresh everyone's memory, I reported this problem to the LKML at
http://lkml.org/lkml/2008/5/21/292 and got some patches and some kernel
tweaks for the scheduler, but never a clear resolution of the cause, which
kept anybody from getting too excited about merging anything. Test results
comparing the various tweaks, on the hardware I'm still using now, are at
http://lkml.org/lkml/2008/5/26/288
For example, here's kernel 2.6.25 running pgbench with 50 clients on a
Q6600 processor, demonstrating poor performance--I'd get >20K TPS here
with a pre-CFS kernel:
$ pgbench -S -t 4000 -c 50 -n pgbench
transaction type: SELECT only
scaling factor: 10
query mode: simple
number of clients: 50
number of transactions per client: 4000
number of transactions actually processed: 200000/200000
tps = 8288.047442 (including connections establishing)
tps = 8319.702195 (excluding connections establishing)
If I now execute exactly the same test, but using localhost, performance
returns to normal:
$ pgbench -S -t 4000 -c 50 -n -h localhost pgbench
transaction type: SELECT only
scaling factor: 10
query mode: simple
number of clients: 50
number of transactions per client: 4000
number of transactions actually processed: 200000/200000
tps = 17575.277771 (including connections establishing)
tps = 17724.651090 (excluding connections establishing)
That's 100% repeatable; I ran each test several times each way.
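For anyone wanting to repeat the comparison, here is a minimal sketch of two shell helpers for pulling the tps figure out of pgbench output and comparing two runs. The names extract_tps and tps_ratio are hypothetical, not part of pgbench, and the parsing assumes the output format shown in the two runs above.

```shell
#!/bin/sh
# Hypothetical helpers; they assume the pgbench output format shown above.

# extract_tps: read pgbench output on stdin and print the tps figure from
# the "excluding connections establishing" line.
extract_tps() {
    awk '/^tps = / && /excluding connections establishing/ { print $3 }'
}

# tps_ratio A B: print B / A to two decimals, i.e. how many times faster
# run B was than run A.
tps_ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", b / a }'
}
```

Piping either run above through it, e.g. "pgbench -S -t 4000 -c 50 -n pgbench | extract_tps", would print just the number. Feeding the two figures reported above to "tps_ratio 8319.702195 17724.651090" prints 2.13, so the TCP/IP run was a bit over twice as fast as the Unix-domain socket run.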
So the new summary of what I've found is that if:
1) You're running Linux 2.6.23 or greater (confirmed up through 2.6.26)
2) You connect over a Unix-domain socket
3) Your client count is relatively high (>8 clients/core)
You can expect your pgbench results to tank. Switch to connecting over
TCP/IP to localhost, and everything is fine; it's not quite as fast as the
pre-CFS kernels in some cases, though in others it's faster.
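The three conditions above can be written down as a rough check. This is only a sketch: kernel_at_least and check_conditions are hypothetical names, the transport ("unix" or "tcp") and the core count are supplied by the caller, and the version test treats anything at or above 2.6.23 as matching even though the behavior was only confirmed through 2.6.26.

```shell
#!/bin/sh
# Hypothetical helpers encoding the three conditions listed above.

# kernel_at_least VERSION MIN: succeed if VERSION >= MIN, comparing the
# first three numeric fields (so "2.6.25-14-generic" compares as 2.6.25).
kernel_at_least() {
    awk -v v="$1" -v m="$2" 'BEGIN {
        split(v, a, "[^0-9]+"); split(m, b, "[^0-9]+")
        for (i = 1; i <= 3; i++) {
            if (a[i] + 0 > b[i] + 0) exit 0
            if (a[i] + 0 < b[i] + 0) exit 1
        }
        exit 0
    }'
}

# check_conditions KERNEL TRANSPORT CLIENTS CORES
# Prints "likely affected" only when all three conditions hold:
# kernel >= 2.6.23, Unix-domain socket, and more than 8 clients per core.
check_conditions() {
    if kernel_at_least "$1" 2.6.23 && [ "$2" = unix ] \
        && [ "$3" -gt $(( $4 * 8 )) ]; then
        echo "likely affected"
    else
        echo "not matched"
    fi
}
```

On a live system something like 'check_conditions "$(uname -r)" unix 50 "$(nproc)"' would fill in the kernel and CPU count, with the caveat that nproc reports logical processors, which may not equal physical cores.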
I haven't gotten to testing kernels newer than 2.6.26 yet; when I saw a
17K TPS result during one of my tests on 2.6.25, I screeched to a halt to
isolate this instead.
--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD