Re: heavy load-high cpu utilization

From: Greg Smith <greg(at)2ndQuadrant(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: heavy load-high cpu utilization
Date: 2011-07-26 22:46:34
Message-ID: 4E2F43CA.2070507@2ndQuadrant.com
Lists: pgsql-general

On 07/26/2011 01:47 PM, Filippos wrote:
> we have a dedicated server (8.4.4, redhat) with 24 cpus and 36 GB of RAM. i
> would say that the traffic in the server is huge and the cpu utilization is
> pretty high too (avg ~ 75% except during the nights when it is much lower).
> i am trying to tune the server a little bit to handle this problem. the
> incoming data in the database are about 30-40 GB /day.
>

Well, the first question is where the CPU usage is coming from. There
are two basic schools of thought here:

1) Use real-time monitoring utilities like "top -c" and see what is
gobbling up time. It's possible to miss what's happening this way, but if
you're at 75% for a large chunk of the day, that's not likely; see the
query sketch just after this list for one way to tie a busy process back
to its query.

2) Set log_min_duration_statement and the other logging parameters, then
analyze the resulting log files to see where the CPU time is going; a
sample configuration follows below.
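
For (1), a minimal sketch of tying those two views together; the column
names here (procpid, current_query) are the 8.4 ones:

    $ top -c
    (note the PID of any postgres backend pinned at high CPU)

    -- then match that PID against what the server says it is running:
    SELECT procpid, usename, now() - query_start AS runtime, current_query
      FROM pg_stat_activity
     WHERE current_query <> '<IDLE>'
     ORDER BY runtime DESC;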
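
For (2), a sample of the relevant postgresql.conf settings; the 250ms
threshold is just a starting guess, adjust it to your workload:

    log_min_duration_statement = 250   # in ms; log statements slower than this
    log_line_prefix = '%t [%p]: '      # timestamp and PID, needed by log analyzers
    log_checkpoints = on               # record checkpoint timing as well

A log analyzer such as pgFouine can then summarize which statements are
eating the time.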

You seem to be focused on the background writer and its checkpoint
process right now. That cannot be the source of this much CPU usage; at
most it could fully occupy one of your 24 cores. You should still set
wal_buffers to a reasonable value, but your problem is not in that area.
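
A hedged example of that one fix; 16MB is the commonly recommended
ceiling here rather than a value derived from your workload:

    # postgresql.conf -- changing this requires a server restart
    wal_buffers = 16MB    # 8.4 default is 64kB; little benefit beyond ~16MB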

Importing 30-40 GB/day is extremely difficult to do in PostgreSQL. My
guess is that most of the server time is spent running the data import
process itself--even COPY, the most efficient way to get data in, is
very CPU intensive. A second layer of problems that can increase CPU
usage comes from autovacuum, which will take up a lot of resources more
or less continuously given this volume of activity. And
there's always the possibility that the queries you're running against
the data are just taking a long time to execute.
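
To make that concrete, a sketch with hypothetical table and file names,
plus one setting that makes autovacuum's share of the work visible in
the logs:

    -- COPY is the cheapest path in, but parsing and inserting still cost CPU
    COPY daily_import FROM '/data/incoming/batch.csv' WITH CSV;

    # postgresql.conf -- log every autovacuum/autoanalyze run
    log_autovacuum_min_duration = 0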

Another problem you'll eventually hit in this scenario is that
you'll need to have your tables partitioned in order to prune old data
out efficiently. Presumably you can't keep up with that rate for very
long before you have to start dropping older data, and that's really
hard to do efficiently unless you've used partitions.
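
On 8.4 that means inheritance-based partitioning; a minimal sketch with
hypothetical table names, showing why dropping old data gets cheap:

    -- one child table per day, attached to a parent by inheritance
    CREATE TABLE events (event_time timestamptz NOT NULL, payload text);
    CREATE TABLE events_20110726 (
        CHECK (event_time >= '2011-07-26' AND event_time < '2011-07-27')
    ) INHERITS (events);

    -- pruning a day of data is then a quick metadata operation, not a
    -- huge DELETE followed by vacuum work:
    DROP TABLE events_20110719;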

P.S. You should upgrade to PostgreSQL 8.4.8 as soon as possible. There
is a bug in autovacuum that's been resolved as of 8.4.6 that you are
very likely to run into:
http://www.postgresql.org/docs/8.4/static/release-8-4-6.html

P.P.S. The pgsql-performance list would be a more appropriate place to
have this discussion. Some of the people who provide good input on
topics like this over there don't read pgsql-general; there are too many
messages on this list for them.

> are there any suggestions what i can do to tune better the server? i can
> provide any information you find relevant for the configuration of the
> server, the OS, the storage etc
>

There's a chapter on each of those topics in my PostgreSQL performance
book, and I'm not aware of any other resource that takes all of them on
usefully. If you're trying to keep up with this volume of data, buying a
copy should repay itself in time savings, letting you look something up
rather than figure it out from scratch, about once every week.

--
Greg Smith 2ndQuadrant US greg(at)2ndQuadrant(dot)com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us
