Re: Scalability in postgres

From:	Scott Carey <scott(at)richrelevance(dot)com>
To:	Fabrix <fabrixio1(at)gmail(dot)com>
Cc:	Greg Smith <gsmith(at)gregsmith(dot)com>, Flavio Henrique Araque Gurgel <flavio(at)4linux(dot)com(dot)br>, pgsql-performance <pgsql-performance(at)postgresql(dot)org>
Subject:	Re: Scalability in postgres
Date:	2009-06-01 19:19:44
Message-ID:	C64977E0.70EC%scott@richrelevance.com
Lists:	pgsql-performance

On 5/31/09 9:37 AM, "Fabrix" <fabrixio1(at)gmail(dot)com> wrote:

>
>
> 2009/5/29 Scott Carey <scott(at)richrelevance(dot)com>
>>
>> On 5/28/09 6:54 PM, "Greg Smith" <gsmith(at)gregsmith(dot)com> wrote:
>>
>>> 2) You have very new hardware and a very old kernel. Once you've done the
>>> above, if you're still not happy with performance, at that point you
>>> should consider using a newer one. It's fairly simple to build a Linux
>>> kernel using the same basic kernel parameters as the stock RedHat one.
>>> 2.6.28 is six months old now, is up to 2.6.28.10, and has gotten a lot
>>> more testing than most kernels due to it being the Ubuntu 9.04 default.
>>> I'd suggest you try out that version.
>>
>>
>> Comparing RedHat's 2.6.18 kernel, heavily patched and with fixes backported,
>> to the original 2.6.18 is really hard. Yes, much of it is old, but a lot of
>> stuff has been backported.
>> I have no idea if things related to this case have been backported. Virtual
>> memory management is complex and only bug fixes would likely go in however.
>> But RedHat 5.3 for example put all the new features for Intel's latest
>> processor in the release (which may not even be in 2.6.28!).
>>
>> There are operations/IT people who won't touch Ubuntu etc. with a ten foot pole
>> yet for production. That may be irrational, but such paranoia exists. The
>> latest postgres release is generally a hell of a lot safer than the latest
>> linux kernel, and people get paranoid about their DB.
>>
>> If you told someone who has to wake up at 3AM by page if the system has an
>> error that "oh, we patched our own kernel build into the RedHat OS" they
>> might not be ok with that.
>>
>> It's a good test to see if this problem is fixed in the kernel. I've seen
>> CentOS 5.2 go completely nuts with system CPU time and context switches with
>> kswapd many times before. I haven't put the system under the same stress
>> with 5.3 yet however.
>
> One of the servers is an Intel Xeon X7350 2.93GHz with RH 5.3 and kernel
> 2.6.18-128.el5,
> and the performance is bad there too, so I don't think the problem is the kernel.
>
> The two servers that I tested (HP-785 Opteron and IBM x3950 M2 Xeon) have NUMA
> architecture, and I thought the problem was caused by NUMA.
>
> http://archives.postgresql.org/pgsql-admin/2008-11/msg00157.php
>
> I'm trying another server, an HP blade bl 680 with Xeon E7450 (4 CPU x 6
> cores = 24 cores) without NUMA architecture, but CPU usage still climbs there.
>
>  procs ------------memory------------- ---swap-- ----io----- --system--- -----cpu------
>   r  b   swpd     free   buff    cache   si   so    bi    bo   in     cs us sy id wa st
>   1  0      0 46949972 116908 17032964    0    0    15    31    2      2  1  0 98  0  0
>   2  0      0 46945880 116916 17033068    0    0    72   140 2059   3140  1  1 97  0  0
> 329  0      0 46953260 116932 17033208    0    0    24   612 1435 194237 44  3 53  0  0
> 546  0      0 46952912 116940 17033208    0    0     4   136 1090 327047 96  4  0  0  0
> 562  0      0 46951052 116940 17033224    0    0     0     0 1095 323034 95  4  0  0  0
> 514  0      0 46949200 116952 17033212    0    0     0   224 1088 330178 96  3  1  0  0
> 234  0      0 46948456 116952 17033212    0    0     0     0 1106 315359 91  5  4  0  0
>   4  0      0 46958376 116968 17033272    0    0    16   396 1379 223499 47  3 49  0  0
>   1  1      0 46941644 116976 17033224    0    0   152  1140 2662   5540  4  2 93  1  0
>   1  0      0 46943196 116984 17033248    0    0   104   604 2307   3992  4  2 94  0  0
>   1  1      0 46931544 116996 17033568    0    0   104  4304 2318   3585  1  1 97  1  0
>   0  0      0 46943572 117004 17033568    0    0    32   204 2007   2986  1  1 98  0  0
>
>
> Now I don't think the problem is NUMA.
>
>
> The development team will fix the application and then I will test again.
>
> I believe that once the application closes the connection the problem could be
> solved, and then 16 cores in a server would do the work instead of 32 or 24.

Hidden in the above data is that the context switch craziness tracks user
CPU time, not system CPU time: while cs sits above 300,000, us is at 91-96%
and sy never goes over 5%. So this is not likely related to the kswapd
context switch stuff, which is associated with high system CPU use.

It's probably lock contention in Postgres.
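
A rough way to check this against the vmstat rows quoted above is to split the
samples by context switch rate and compare average user and system CPU in each
group. The sketch below is only an illustration: the numbers are copied from
the quoted output, while the 100,000 cs cutoff and the names SAMPLES, CS_SPIKE
and summarize are assumptions made up for this example.

```python
from statistics import mean

# (cs, us, sy) copied from the vmstat rows quoted above, in order.
SAMPLES = [
    (2, 1, 0),
    (3140, 1, 1),
    (194237, 44, 3),
    (327047, 96, 4),
    (323034, 95, 4),
    (330178, 96, 3),
    (315359, 91, 5),
    (223499, 47, 3),
    (5540, 4, 2),
    (3992, 4, 2),
    (3585, 1, 1),
    (2986, 1, 1),
]

# Assumed cutoff: anything at or above this many context switches per
# interval counts as part of the spike. Chosen by eye, not a standard value.
CS_SPIKE = 100_000


def summarize(samples, cutoff=CS_SPIKE):
    """Return row count and mean user/system CPU for spike and calm samples."""
    groups = {
        "spike": [s for s in samples if s[0] >= cutoff],
        "calm": [s for s in samples if s[0] < cutoff],
    }
    return {
        name: {
            "rows": len(rows),
            "us": mean(r[1] for r in rows),
            "sy": mean(r[2] for r in rows),
        }
        for name, rows in groups.items()
    }


summary = summarize(SAMPLES)
for name, stats in summary.items():
    print(f"{name}: rows={stats['rows']} us={stats['us']:.1f}% sy={stats['sy']:.1f}%")
```

With these rows, user CPU averages roughly 78% during the spike against under
4% system CPU, which is the pattern described above: the time is being burned
in user space, not in the kernel.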

>
>
> Regards...
>
> --Fabrix
>
>
>

In response to

Re: Scalability in postgres at 2009-05-31 16:37:33 from Fabrix
