From: Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>
To: Bob Dusek <redusek(at)gmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: performance config help
Date: 2010-01-11 18:19:34
Message-ID: dcc563d11001111019p11aa8ca4p850213df48e6ad39@mail.gmail.com
Lists: pgsql-performance
On Mon, Jan 11, 2010 at 10:49 AM, Bob Dusek <redusek(at)gmail(dot)com> wrote:
>> Depends, is that the first iteration of output? if so, ignore it and
>> show me the second and further on. Same for vmstat... In fact let
>> them run for a minute or two and attach the results... OTOH, if that
>> is the second or later set of output, then you're definitely not IO
>> bound, and I don't see why the CPUs are not being better utilized.
>>
> I was probably not clear... the output I pasted was from the third iteration
> of output from iostat. And, the vmstat output I pasted was from the sixth
> iteration of output.
Yeah, you're definitely CPU/Memory bound it seems.
> We can take some measurements at 40 concurrent requests and see where we
> stand.
We'll probably not see much difference, if you're waiting on memory.
> So, we should probably try cranking our random_page_cost value down. When
> we dump our db with "pg_dump --format=t", it's about 15 MB. We should be
> able to keep the thing in memory.
Yeah, I doubt that changing it will make a huge difference given how
small your db is.
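For reference, random_page_cost lives in postgresql.conf (or can be set per-session with SET). A hedged sketch with an illustrative value only; as noted above, with a 15 MB database that fits entirely in cache, this is unlikely to move the needle much:

```
# postgresql.conf -- illustrative value, not a recommendation
# Default is 4.0; lowering it tells the planner random reads are
# nearly as cheap as sequential ones, which is true for cached data.
random_page_cost = 1.5
```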
>> There are several common bottlenecks you can try to tune away from.
>> IO doesn't look like a problem for you. Neither does CPU load. So,
>> then we're left with context switching time and memory to CPU
>> bandwidth. If your CPUs are basically becoming data pumps then the
>> speed of your FSB becomes VERY critical, and some older Intel mobos
>> didn't have a lot of CPU to Memory bandwidth and adding CPUs made it
>> worse, not better. More modern Intel chipsets have much faster CPU to
>> Memory BW, since they're using the same kind of fabric switching that
>> AMD uses on highly parallel machines.
>
> Each CPU is 2.13 GHz, with 8MB Cache, and the FSB is 1066 MHz. Does that
> bus speed seem slow?
When 16 cores are all sharing the same bus (which a lot of older
designs do) then yes. I'm not that familiar with the chipset you're
running, but I don't think that series CPU has an integrated memory
controller. Does it break the memory into separate chunks that
different cpus can access without stepping on each other's toes?
Later Intels and all AMDs since the Opteron have built-in memory
controllers. That means going from 2 to 4 CPUs in an AMD server
doubled your memory bandwidth, while going from 2 to 4 CPUs on older
Intel designs left it the same, so each CPU got half as much
bandwidth as it had before when there were 2.
> It's hard to go to the money tree and say "we're only using about half of
> your CPUs, but you need to get better ones."
Well, if the problem is that you've got a chipset that can't utilize
all your CPUs because of memory bw starvation, it's your only fix.
You should set up some streaming read / write memory tests you can
run singly, then on 2, 4, 8, and 16 cores, and see how fast memory
access is as you add more threads. I'm betting you'll hit peak
throughput well before 16 cores are working on the problem.
>> If your limit is your hardware, then the only solution is a faster
>> machine. It may well be that a machine with dual fast Nehalem
>> (2.4GHz+) quad core CPUs will be faster. Or 4 or 8 AMD CPUs with
>> their faster fabric.
>
> It sounds like we could spend less money on memory and more on faster hard
> drives and faster CPUs.
I'm pretty sure you could live with slower hard drives here, and
possibly with fsync left on as well. It looks like it's all CPU <->
memory bandwidth. But I'm just guessing.
> But, man, that's a tough sell. This box is a giant, relative to anything
> else we've worked with.
Yeah, I understand. We're looking at having to upgrade our dual cpu /
quad core AMD 2.1GHz machine to 4 hex core cpus this summer, possibly
dodecacore cpus even.
So, I took a break from writing and searched for some more info on the
74xx series CPUs, and from reading lots of articles, including this
one:
http://www.anandtech.com/cpuchipsets/intel/showdoc.aspx?i=3414
It seems apparent that the 74xx series is a great CPU, as long as
you're not memory bound.