From: | Robert Treat <rob(at)xzilla(dot)net> |
---|---|
To: | Tomas Vondra <tv(at)fuzzy(dot)cz> |
Cc: | "Gaëtan Allart" <gaetan(at)nexylan(dot)com>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: General performance/load issue |
Date: | 2011-11-24 15:39:58 |
Message-ID: | CABV9wwOyDFwXzuPjvyTOGUDmnmoMT-orzx0+oYiKRhEG7SZ+0Q@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
On Thu, Nov 24, 2011 at 9:09 AM, Tomas Vondra <tv(at)fuzzy(dot)cz> wrote:
> On 24 Listopad 2011, 14:51, Gaëtan Allart wrote:
>> Hello everyone,
>>
>> I'm having some troubles with a Postgresql server.
>> We're using PG has a database backend for a very big website (lots of data
>> and much traffic).
>>
>> The issue : server suddenly (1H after restart) becomes slow (queries not
>> responding), load rises (>20 instead of 1), iowait rises (20 to 70%)
>>
>> Version : 9.0.5
>> Server : Dual Xeon X5650 (24 cores total)
>> Memory : 48 GB
>> Disks : SSD
>>
>>
>> Top when overloaded :
>
> Top is not the most useful tool here, I guess. Use "iotop" (will show you
> which processes are doing the I/O) and tools like vmstat / iostat.
>
>> Postgresql.conf :
>>
>> max_connections = 50
>> shared_buffers = 12G
>> temp_buffers = 40MB
>> work_mem = 128MB
>> maintenance_work_mem = 256MB
>> max_files_per_process = 8192
>> checkpoint_segments = 256
>> checkpoint_timeout = 30min
>> checkpoint_completion_target = 0.9
>
> Fine. Let's see the options that look suspicious.
>
I think you missed some suspicious settings... I'd recommend setting
shared buffers to 8gb, and I'd likely reduce checkpoint segements to
30 and set the checkpoint timeout back to 5 minutes. Everything about
the way this server is configured (including those vm settings) is
pushing it towards delaying the WAL/Buffer/Checkpoint as long as
possible, which matches with the idea of good performance initial
followed by a period of poor performance and heavy i/o.
On a side note, I'd guess your work_mem is probably too high. 50
(connections) x 128 (mb work mem) x 2 (sorts per query) = 12GB RAM,
which is 25% of total ram on the box. That doesn't necessarily mean
game over, but it seem like it wouldn't be that hard to get thrashing
being set up that way. YMMV.
Robert Treat
conjecture: xzilla.net
consulting: omniti.com
From | Date | Subject | |
---|---|---|---|
Next Message | Phoenix Kiula | 2011-11-24 15:49:36 | Re: Incremental backup with RSYNC or something? |
Previous Message | Pavel Stehule | 2011-11-24 15:37:19 | Re: Reassign value of IN parameter in 9.1.1 |