From: | Brian Fehrle <brianf(at)consistentstate(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-general General <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: Server hitting 100% CPU usage, system comes to a crawl. |
Date: | 2011-11-01 17:56:00 |
Message-ID: | 4EB032B0.1070906@consistentstate.com |
Lists: | pgsql-general |
Update on this:
We did a switchover to another machine with the same hardware, however
this system was running on some older parameters we had set in the
postgresql.conf file.
So we went from max_connections = 400 to 200, and from work_mem = 160MB
to 200MB. So far, this other system seems to be running OK.
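For a rough sense of what the two configurations allow, here is a minimal sketch of the worst-case arithmetic. It assumes every allowed connection runs exactly one sort or hash at the same moment; work_mem is a per-operation limit, so a complex query can use several multiples of it, and the ops_per_query parameter is only an illustrative knob, not a PostgreSQL setting. The function and variable names are made up for this example.

```python
# Rough upper bound on memory that work_mem alone could claim.
# Assumption: every allowed connection runs ops_per_query sorts/hashes at once.

def work_mem_ceiling_gb(max_connections, work_mem_mb, ops_per_query=1):
    """Return the worst-case work_mem total in GB (1 GB = 1024 MB)."""
    return max_connections * work_mem_mb * ops_per_query / 1024.0

# Settings on the machine that was spiking to 100% CPU.
before_switch = work_mem_ceiling_gb(400, 160)
# Older settings on the machine we switched over to.
after_switch = work_mem_ceiling_gb(200, 200)

print("before switchover: %.1f GB" % before_switch)  # 62.5 GB
print("after switchover:  %.1f GB" % after_switch)   # 39.1 GB
```

Both ceilings fit inside the 128 GB of RAM on these machines, and neither includes shared_buffers or per-backend overhead, so this is only a back-of-the-envelope check and says nothing by itself about the CPU spike.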
Other than the obvious fact that each connection uses a certain amount of
memory, is there something else to watch for when increasing
max_connections to numbers like 400? When the system jumped to 100% CPU
usage, the number of backends connected to the cluster peaked at 250 but
was generally in the 175 range, so well below the 400 max_connections we
allowed. Could the max_connections setting itself still be the culprit?
I'll be watching the cluster as we run on the new configuration (with
only 200 max_connections).
- Brian F
On 10/27/2011 03:22 PM, Brian Fehrle wrote:
> On 10/27/2011 02:50 PM, Tom Lane wrote:
>> Brian Fehrle <brianf(at)consistentstate(dot)com> writes:
>>> Hi all, need some help/clues on tracking down a performance issue.
>>> PostgreSQL version: 8.3.11
>>> I've got a system that has 32 cores and 128 gigs of ram. We have
>>> connection pooling set up, with about 100 - 200 persistent connections
>>> open to the database. Our applications then use these connections to
>>> query the database constantly, but when a connection isn't currently
>>> executing a query, it's <IDLE>. On average, at any given time, there are
>>> 3 - 6 connections that are actually executing a query, while the rest
>>> are <IDLE>.
>>> About once a day, queries that normally take just a few seconds slow
>>> way down, and start to pile up, to the point where instead of just having
>>> 3-6 queries running at any given time, we get 100 - 200. The whole
>>> system comes to a crawl, and looking at top, the CPU usage is 99%.
>> This is jumping to a conclusion based on insufficient data, but what you
>> describe sounds a bit like the sinval queue contention problems that we
>> fixed in 8.4. Some prior reports of that:
>> http://archives.postgresql.org/pgsql-performance/2008-01/msg00001.php
>> http://archives.postgresql.org/pgsql-performance/2010-06/msg00452.php
>>
>> If your symptoms match those, the best fix would be to update to 8.4.x
>> or later, but a stopgap solution would be to cut down on the number of
>> idle backends.
>>
>> regards, tom lane
> That sounds somewhat close to the issue I am seeing. The main
> differences are that my spike lasts much longer than a few
> minutes, and can only be resolved by restarting the cluster. Also,
> the second link shows top output where most of the CPU time is in
> 'user', rather than in 'sys' like mine.
>
> Is there anything more I can look at to get more info on this 'sinval
> queue contention problem'?
>
> Also, having my cpu usage high in 'sys' rather than 'us', could that
> be a red flag? Or is that normal?
>
> - Brian F