Re: Fwd: Help!Why CPU Usage and LoadAverage Jump up Suddenly

From: 吕晓旭 <lxxstcno1(at)gmail(dot)com>
To: John R Pierce <pierce(at)hogranch(dot)com>
Cc: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Fwd: Help!Why CPU Usage and LoadAverage Jump up Suddenly
Date: 2013-12-06 19:04:07
Message-ID: 525E58BD-F4DE-4A8B-90D1-2C34C91277BB@gmail.com
Lists: pgsql-general

Several reasons drove us to build our own RPM. The most important is that we want to install PostgreSQL into a directory of our own choosing.
Another reason is setting CFLAGS (adding -mavx, for example, as highlighted in my previous message),
because without this compiler flag response time suffers on the Dell machine.
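
Since -mavx only makes sense on a CPU that reports the avx feature flag, a quick check on each host can confirm it. The following is a minimal Python sketch, assuming Linux; the helper names (cpu_flags, supports_avx) and the sample text are hypothetical, and on a real machine the input would be the contents of /proc/cpuinfo.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of feature flags from /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            # Flags are whitespace-separated; the first CPU entry is enough.
            return set(value.split())
    return set()


def supports_avx(cpuinfo_text):
    """True if the exact flag 'avx' is present (avx2 alone does not count)."""
    return "avx" in cpu_flags(cpuinfo_text)


# Hypothetical sample text; real usage would be:
#   supports_avx(open("/proc/cpuinfo").read())
SAMPLE_WITH_AVX = "processor\t: 0\nflags\t\t: fpu sse sse2 avx\n"
SAMPLE_WITHOUT_AVX = "processor\t: 0\nflags\t\t: fpu sse sse2\n"
```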

Judging by the system monitoring charts in Cacti, I don't think I/O is heavy, looking at either "cpu io wait" or "Reads/Writes - sda".
No monitored item shows strange behavior in the I/O system, and there is no difference between low and high concurrency,
or between the two environments.

I did find something with top: at low concurrency the CPU usage of each postgres process is not very high; the highest is about 50%.
But as concurrency gradually increases, some postgres processes reach 100% CPU and stay there for a while. During these high-concurrency periods runq-sz (from sar -q) is very long, sometimes with 50 processes waiting for CPU time.
So could something like a spinlock be holding the CPU?
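
The run-queue observation can be cross-checked without sar. The following is a minimal Python sketch, assuming Linux: the fourth field of /proc/loadavg is the number of currently runnable scheduling entities over the total, which is related to sar's runq-sz but not identical to it. The function names, the sample line, and the count of 24 hardware threads (taken from the hwconfig listing in this message) are illustrative assumptions.

```python
def parse_loadavg(text):
    """Parse one line in /proc/loadavg format into a dict."""
    fields = text.split()
    load1, load5, load15 = (float(x) for x in fields[:3])
    # Fourth field looks like "50/812": runnable entities / total entities.
    runnable, total = (int(x) for x in fields[3].split("/"))
    return {"load1": load1, "load5": load5, "load15": load15,
            "runnable": runnable, "total": total}


def runnable_per_cpu(text, n_cpus):
    """Runnable entities per hardware thread; well above 1 suggests CPU contention."""
    return parse_loadavg(text)["runnable"] / n_cpus


# Hypothetical sample; real usage would read open("/proc/loadavg").read()
SAMPLE = "52.10 30.45 12.00 50/812 23456"
RATIO = runnable_per_cpu(SAMPLE, n_cpus=24)
```

Sampling this once per second while concurrency ramps up would show whether the runnable count climbs together with the per-process CPU usage seen in top.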

Some configuration settings on both systems:
shared_buffers = 8192MB
work_mem = 256MB
maintenance_work_mem = 160MB
full_page_writes = off
wal_buffers = 10MB
wal_keep_segments = 150
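
Because work_mem applies to each sort or hash operation rather than to the server as a whole, memory use can grow with concurrency. The following is a minimal Python sketch of a rough upper-bound estimate, assuming one such operation per backend; the function names and the figure of 50 concurrent operations are hypothetical, and the result is only a sanity check, not a measurement.

```python
import re

# PostgreSQL memory units are 1024-based.
UNITS = {"kB": 1024, "MB": 1024**2, "GB": 1024**3, "TB": 1024**4}


def parse_size(value):
    """Convert a setting such as '8192MB' to bytes."""
    match = re.fullmatch(r"\s*(\d+)\s*(kB|MB|GB|TB)\s*", value)
    if match is None:
        raise ValueError(f"unrecognized size: {value!r}")
    return int(match.group(1)) * UNITS[match.group(2)]


def rough_memory_bound(shared_buffers, work_mem, concurrent_ops):
    """shared_buffers plus one work_mem allocation per concurrent operation."""
    return parse_size(shared_buffers) + concurrent_ops * parse_size(work_mem)


# Values from the settings above; 50 concurrent operations is an assumption.
BOUND_BYTES = rough_memory_bound("8192MB", "256MB", concurrent_ops=50)
BOUND_GIB = BOUND_BYTES / 1024**3
```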

The hardware configurations are listed below (both environments use RAID 10):
* the one that performs well:
$ sudo hwconfig
Summary: HP DL360 G7, 1 x Xeon E5645 2.40GHz, 94.4GB / 96GB 1333MHz DDR3
System: HP ProLiant DL360 G7
Processors: 1 (of 2) x Xeon E5645 2.40GHz 133MHz FSB (HT enabled, 6 cores, 24 threads)
Memory: 94.4GB / 96GB 1333MHz DDR3 == 6 x 16GB, 12 x empty
Disk: cciss/c0d0 (cciss0): 1.2TB (48%) RAID-10 == 4 x HP-EG0600FBLSH
Disk-Control: cciss0: Hewlett-Packard Company Smart Array G6 controllers, FW 5.14, Cache on 256MB/768MB (R/W)
Chipset: Intel 82801JIB (ICH10)
Network: eth0 (bnx2): Broadcom NetXtreme II BCM5709 Gigabit, e4:11:5b:ed:12:1c, 1000Mb/s <full-duplex>
Network: eth1 (bnx2): Broadcom NetXtreme II BCM5709 Gigabit, e4:11:5b:ed:12:1e, no carrier
Network: eth2 (bnx2): Broadcom NetXtreme II BCM5709 Gigabit, e4:11:5b:ed:12:58, no carrier
Network: eth3 (bnx2): Broadcom NetXtreme II BCM5709 Gigabit, e4:11:5b:ed:12:5a, no carrier
OS: CentOS 5.6 (Final), Linux 2.6.18-238.19.1.el5 x86_64, 64-bit
BIOS: HP P68 12/02/2012
Hostname: l-interdb3.f.cn1

* the one that performs badly:
$ sudo hwconfig
hwconfig: warning: could not run MegaCli
Summary: Dell R620, 1 x Xeon E5-2630 0 2.30GHz, 62.9GB / 64GB 1600MHz DDR3
System: Dell PowerEdge R620 (Dell 0D2D5F)
Processors: 1 (of 2) x Xeon E5-2630 0 2.30GHz 7200MHz FSB (HT enabled, 6 cores, 24 threads)
Memory: 62.9GB / 64GB 1600MHz DDR3 == 8 x 8GB, 16 x empty
Disk: sda (scsi6): 1.2TB (29%) JBOD == 1 x DELL-PERC-H710P
Disk-Control: ahci0: Intel Patsburg 6-Port SATA AHCI Controller
Disk-Control: megaraid_sas0: LSI Logic / Symbios Logic MegaRAID SAS 2208 [Thunderbolt]
Network: em1 (tg3): Broadcom NetXtreme BCM5720 Gigabit PCIe, e0:db:55:1f:9b:d8, 1000Mb/s <full-duplex>
Network: em2 (tg3): Broadcom NetXtreme BCM5720 Gigabit PCIe, e0:db:55:1f:9b:d9, no carrier
Network: em3 (tg3): Broadcom NetXtreme BCM5720 Gigabit PCIe, e0:db:55:1f:9b:da, no carrier
Network: em4 (tg3): Broadcom NetXtreme BCM5720 Gigabit PCIe, e0:db:55:1f:9b:db, no carrier
OS: CentOS 6.2 (Final), Linux 3.2.34-1.el6.x86_64 x86_64, 64-bit
BIOS: Dell 1.4.8 10/25/2012
Hostname: l-interdb11.f.cn1

2013/12/6 John R Pierce <pierce(at)hogranch(dot)com>
>> On 12/5/2013 12:46 AM, 吕晓旭 wrote:
>> We have found a weird problem on our production PostgreSQL system, and I don't know how to resolve it.
>> We deployed PostgreSQL 9.2.4 in two system environments, and their performance is completely different. One of them is perfect; the other lets me down: CPU usage and load average jump up suddenly as concurrency rises smoothly, and at the same time average response time becomes unacceptable.
>
> I'm curious why you built your own postgres instead of using the yum.postgresql.com repository versions?
>
> and I second the suggestion, IO performance is likely a major factor here. also, you don't give your postgresql.conf tuning settings, file systems configurations, hardware storage configurations, etc.
>
> --
> john r pierce 37N 122W
> somewhere on the middle of the left coast
