From: Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>
To: Steven Crandell <steven(dot)crandell(at)gmail(dot)com>
Cc: Craig James <cjames(at)emolecules(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: hardware upgrade, performance degrade?
Date: 2013-03-02 06:51:21
Message-ID: CAOR=d=1qQN5r1LracmbJcahemq9sR5gc9QfTUuAQ9U2+7cEj7Q@mail.gmail.com
Lists: pgsql-performance
On Fri, Mar 1, 2013 at 9:49 AM, Steven Crandell
<steven(dot)crandell(at)gmail(dot)com> wrote:
> We saw the same performance problems when this new hardware was running CentOS
> 6.3 with a 2.6.32-279.19.1.el6.x86_64 kernel, and when it was matched to the
> OS/kernel of the old hardware, which was CentOS 5.8 with a 2.6.18-308.11.1.el5
> kernel.
>
> Yes, the new hardware was thoroughly tested with bonnie before being put into
> service and has been tested since. We are unable to find any interesting
> differences in our bonnie test comparisons between the old and new
> hardware. pgbench was not used prior to our discovery of the problem but
> has been used extensively since. FWIW, this server ran a zabbix database
> (much lower load requirements) for a month without any problems prior to
> taking over as our primary production DB server.
>
> After quite a bit of trial and error we were able to find a pgbench test (2x
> 300 concurrent client sessions doing selects along with 1x 50-concurrent-user
> session doing the standard pgbench query rotation) that showed the new
> hardware underperforming compared to the old hardware, to the tune of
> about a 1000 TPS difference (2300 vs. 1300) for the 50-concurrent-user
> pgbench run and about 1000 fewer TPS for each of the select-only runs
> (~24000 vs. ~23000). Less demanding tests were handled equally well by
> both old and new servers. More demanding tests would tip both old and new
> over with very similar efficacy.
>
> Hopefully that fleshes things out a bit more.
> Please let me know if I can provide additional information.
OK, I'd recommend testing with various numbers of clients and seeing
what kind of shape you get from the curve when you plot it, i.e. does
it fall off really hard at some client count? If the old server
degrades more gracefully under very heavy load, it may be that you're
just admitting too many connections on the new one and not hitting
its sweet spot.
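For example, a rough sketch of a client-count sweep (database name, client
counts and durations here are just placeholders, adjust to your setup):

    # placeholder values -- tune clients/threads/duration to your environment
    for c in 8 16 32 64 128 256 512; do
        pgbench -c $c -j 4 -T 60 -S yourdb   # -S = select-only workload
    done

Then plot clients vs. TPS for both the old and new box and compare where
each curve peaks and how hard it drops off past the peak.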
FWIW, the newest Intel 10-core Xeons and their cousins just barely
keep up with or beat the 8- and 12-core AMD Opterons from 3 years ago in
most of my testing. They look great on paper, but under heavy load
they are lucky to keep up most of the time.
There's also the possibility that even though you've turned off zone
reclaim, your new hardware is still running in a NUMA mode that
makes internode communication much more expensive, and that's costing
you performance. This may especially be true with 1TB of memory, since
it's probably both running at a lower speed AND paying higher internode
access costs. Use the numactl command (I think that's it) to see what
the internode costs are, and compare them to the old hardware. If the
internode comm costs are really high, see if you can turn off NUMA in
the BIOS and whether that helps.
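From memory, something like this will show the node distance table (exact
flags may differ by numactl version):

    numactl --hardware   # prints node sizes and the "node distances" matrix

Noticeably bigger off-diagonal numbers than on the old box would point to
expensive internode access. While you're at it, it's worth double-checking
that zone reclaim really is off:

    cat /proc/sys/vm/zone_reclaim_mode   # should print 0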
Of course, check the usual suspects, e.g. that your battery-backed cache
is really working in write-back mode, not write-through.
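If it happens to be an LSI/MegaRAID controller, something along these lines
should show it (exact binary name and flags depend on the controller and
CLI version):

    # assumes the MegaCli utility; other controllers have their own tools
    MegaCli64 -LDInfo -Lall -aALL | grep -i 'cache policy'

You want to see WriteBack there rather than WriteThrough.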
Good luck. Acceptance testing can really suck when newer, supposedly
faster hardware is in fact slower.