From: | Merlin Moncure <mmoncure(at)gmail(dot)com> |
---|---|
To: | Shiva Raman <raman(dot)shivag(at)gmail(dot)com> |
Cc: | pgsql-performance(at)postgresql(dot)org |
Subject: | Re: High CPU load on Postgres Server during Peak times!!!! |
Date: | 2009-09-22 14:18:11 |
Message-ID: | b42b73150909220718y3dd8f0e4n3971596a0f6ef813@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
On Tue, Sep 22, 2009 at 9:54 AM, Shiva Raman <raman(dot)shivag(at)gmail(dot)com> wrote:
> Dear all
>
> I am having a problem of high cpu loads in my postgres server during peak
> time. Following are the
> details of my setup (details as per the postgres wiki) .
>
> * PostgreSQL version
> o Run "select pg_version();" in psql or PgAdmin III and provide the
> full, exact output.
>
>
> clusternode2:~ # rpm -qa | grep postgres
> postgresql-devel-8.1.9-1.2
> postgresql-8.1.9-1.2
> postgresql-docs-8.1.9-1.2
> postgresql-server-8.1.9-1.2
> postgresql-libs-64bit-8.1.9-1.2
> postgresql-libs-8.1.9-1.2
> postgresql-jdbc-8.1-12.2
> postgresql-contrib-8.1.9-1.2
>
>
> * A description of what you are trying to achieve and what results you
> expect.
>
> To keep the CPU Load below 10 , Now during peak times the load is nearing to
> 40
> At that time , it is not possible to access the data.
>
> * The EXACT text of the query you ran, if any
>
>
> * The EXACT output of that query if it's short enough to be reasonable to
> post
> o If you think the output is wrong, what you think should've been
> produced instead
>
> * The EXACT error message you get, if there is one
>
> As of now , i am unable to locate the exact query, the load shoots up
> abnormally during
> peak time is the main problem .
>
>
> * What program you're using to connect to PostgreSQL
>
> Jakarta Tomcat - Struts with JSP
>
>
> * What version of the ODBC/JDBC driver you're using, if any
>
> postgresql-jdbc-8.1-12.2
>
> * What you were doing when the error happened / how to cause the error.
> Describe in as much detail as possible, step by step, including command
> lines, SQL output, etc.
>
> When certain tables with more than 3 lakh items are concurrently accessed by
> more than 300
> users, the CPU load shoots up .
>
> * Is there anything remotely unusual in the PostgreSQL server logs?
> o On Windows these are in your data directory. On a default
> PostgreSQL install that'll be in C:\Program Files\PostgreSQL\8.4\data\pg_log
> (assuming you're using 8.4)
>
> The log file /var/log/postgresql has no data .
>
> o On Linux this depends a bit on distro, but you'll usually find
> them in /var/log/postgresql/.
> * Operating system and version
> o Linux users:
> + Linux distro and version
> + Kernel details (run "uname -a" on the terminal)
>
> SLES 10 SP3
> clusternode2:~ # uname -a
> Linux clusternode2 2.6.16.46-0.12-ppc64 #1 SMP Thu May 17 14:00:09 UTC 2007
> ppc64 ppc64 ppc64 GNU/Linux
>
>
>
> * What kind of hardware you have.
> o CPU manufacturer and model, eg "AMD Athlon X2" or "Intel Core 2
> Duo"
> o Amount and size of RAM installed, eg "2GB RAM"
>
> High Availability Cluster with two IBM P Series Server and one DS4700
> Storage
>
> IBM P series P52A with 2-core 2.1 Ghz POWER5+ Processor Card , 36 MB L3
> Cache ,16 GB of RAM,
> 73.4 GB 10,000 RPM Ultra320 SCSI Drive for Operating System .
>
>
>
> o Storage details (important for performance and corruption
> questions)
> + Do you use a RAID controller? If so, what type of
> controller? eg "3Ware Escalade 8500-8"
> # Does it have a battery backed cache module?
> # Is write-back caching enabled?
> + Do you use software RAID? If so, what software and what
> version? eg "Linux software RAID (md) 2.6.18-5-686 SMP mod_unload 686
> REGPARM gcc-4.1".
> # In the case of Linux software RAID you can get the
> details from the "modinfo md_mod" command
> + Is your PostgreSQL database on a SAN?
> # Who made it, what kind, etc? Provide what details you
> can.
> + How many hard disks are connected to the system and what
> types are they? You need to say more than just "6 disks". At least give
> maker, rotational speed and interface type, eg "6 15,000rpm Seagate SAS
> disks".
> + How are your disks arranged for storage? Are you using
> RAID? If so, what RAID level(s)? What PostgreSQL data is on what disks /
> disk sets? What file system(s) are in use?
> # eg: "Two disks in RAID 1, with all PostgreSQL data
> and programs stored on one ext3 file system."
> # eg: "4 disks in RAID 5 holding the pg data directory
> on an ext3 file system. 2 disks in RAID 1 holding pg_clog, pg_xlog, the
> temporary tablespace, and the sort scratch space, also on ext3.".
> # eg: "Default Windows install of PostgreSQL"
> + In case of corruption data reports:
> # Have you had any unexpected power loss lately?
> # Have you run a file system check? (chkdsk / fsck)
> # Are there any error messages in the system logs?
> (unix/linux: "dmesg", "/var/log/syslog" ; Windows: Event Viewer in Control
> Panel -> Administrative Tools )
>
>
> IBM SAN DS4700 Storage with Fibre Channel HDD (73.4 GB * 10)
> Two Partitions - 73.4 GB * 3 RAID 5 - 134 GB storage partitions (One holding
> Jakarata tomcat
> application server and other holding Postgresql Database) .
> Four Hard disk RAID 5 with ext3 file systems hold the pgdata on SAN .
> Hard disk rotational speed is 73 GB 15K IBM 2 GB Fibre channel
>
> No power loss, filesystem check also fine, No errors on /var/log/syslog
>
> Following is the output of TOP command during offpeak time.
>
>
> top - 18:36:56 up 77 days, 20:33, 1 user, load average: 12.99, 9.22, 10.37
> Tasks: 142 total, 12 running, 130 sleeping, 0 stopped, 0 zombie
> Cpu(s): 46.1%us, 1.9%sy, 0.0%ni, 6.1%id, 3.0%wa, 0.0%hi, 0.1%si,
> 42.9%st
> Mem: 16133676k total, 13657396k used, 2476280k free, 450908k buffers
> Swap: 14466492k total, 124k used, 14466368k free, 11590056k cached
>
> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
>
> 22458 postgres 19 0 2473m 477m 445m R 40 3.0 0:15.49 postmaster
>
> 22451 postgres 15 0 2442m 447m 437m S 33 2.8 0:30.44 postmaster
>
> 22464 postgres 17 0 2443m 397m 383m R 28 2.5 0:13.78 postmaster
>
> 22484 postgres 16 0 2448m 431m 412m S 20 2.7 0:02.73 postmaster
>
> 22465 postgres 17 0 2440m 461m 449m R 15 2.9 0:03.52 postmaster
>
> 22452 postgres 16 0 2450m 727m 706m R 13 4.6 0:23.46 postmaster
>
> 22476 postgres 16 0 2437m 413m 405m S 13 2.6 0:06.11 postmaster
>
> 22485 postgres 16 0 2439m 230m 222m R 7 1.5 0:05.72 postmaster
>
> 22481 postgres 15 0 2436m 175m 169m S 7 1.1 0:04.44 postmaster
>
> 22435 postgres 17 0 2438m 371m 361m R 6 2.4 1:17.92 postmaster
>
> 22440 postgres 17 0 2445m 497m 483m R 5 3.2 1:44.50 postmaster
>
> 22486 postgres 17 0 2432m 84m 81m R 4 0.5 0:00.76 postmaster
>
> 3 root 34 19 0 0 0 R 0 0.0 1:47.50 ksoftirqd/0
>
> 4726 root 15 0 29540 8776 3428 S 0 0.1 140:02.98 X
>
> 24950 root 15 0 0 0 0 S 0 0.0 0:30.96 pdflush
>
> 1 root 16 0 812 316 280 S 0 0.0 0:13.29 init
>
> 2 root RT 0 0 0 0 S 0 0.0 0:01.46 migration/0
>
> 4 root RT 0 0 0 0 S 0 0.0 0:00.78 migration/1
>
> 5 root 34 19 0 0 0 S 0 0.0 1:36.79 ksoftirqd/1
>
> 6 root RT 0 0 0 0 S 0 0.0 0:01.46 migration/2
>
> 7 root 34 19 0 0 0 R 0 0.0 1:49.83 ksoftirqd/2
>
> 8 root RT 0 0 0 0 S 0 0.0 0:00.79 migration/3
>
> 9 root 34 19 0 0 0 S 0 0.0 1:38.18 ksoftirqd/3
>
> 10 root 10 -5 0 0 0 S 0 0.0 1:02.11 events/0
>
> 11 root 10 -5 0 0 0 S 0 0.0 1:03.27 events/1
>
> 12 root 10 -5 0 0 0 S 0 0.0 1:01.76 events/2
>
> 13 root 10 -5 0 0 0 S 0 0.0 1:02.29 events/3
>
> 14 root 10 -5 0 0 0 S 0 0.0 0:00.01 khelper
>
> 1016 root 10 -5 0 0 0 S 0 0.0 0:00.00 kthread
>
> 1054 root 10 -5 0 0 0 S 0 0.0 0:03.08 kblockd/0
>
> 1055 root 10 -5 0 0 0 S 0 0.0 0:02.83 kblockd/1
>
> 1056 root 10 -5 0 0 0 S 0 0.0 0:03.19 kblockd/2
>
>
>
>
> The CPU Load shoots upto 40 during peak time.
>
> Following is my postgresql.conf (without comments)
>
> hba_file = '/var/lib/pgsql/data/pg_hba.conf'
> listen_addresses = '*'
> port = 5432
> max_connections = 1800
> shared_buffers = 300000
> max_fsm_relations = 1000
> effective_cache_size = 200000
> log_destination = 'stderr'
> redirect_stderr = on
> log_rotation_age = 0
> log_rotation_size = 10240
> silent_mode = onlog_line_prefix = '%t %d %u '
> autovacuum = on
> datestyle = 'iso, dmy'
> lc_messages = 'en_US.UTF-8'
> lc_monetary = 'en_US.UTF-8'
> lc_numeric = 'en_US.UTF-8'
> lc_time = 'en_US.UTF-8'
>
> User Access
> Total Number of Users is 500
> Maximum number of Concurrent users will be 500 during peak time
> Off Peak time the maximum number of concurrent user will be around 150 to
> 200.
>
>
> Please let me know your suggestions to improve the performance.
The very first step is to determine if you are cpu bound or i/o bound.
You need to monitor top or vmstat during high load period and report
the results here. Is the DS4700 direct attached? Sometimes using a
SAN can throw the iowait numbers off a bit. I bet you are simply
underpowered in I/O department.
merlin
From | Date | Subject | |
---|---|---|---|
Next Message | Andy Colson | 2009-09-22 14:19:46 | Re: High CPU load on Postgres Server during Peak times!!!! |
Previous Message | Stef Telford | 2009-09-22 14:05:30 | Hunting Unused Indexes .. is it this simple ? |