From: | "Tomas Vondra" <tv(at)fuzzy(dot)cz> |
---|---|
To: | "Gaëtan Allart" <gaetan(at)nexylan(dot)com> |
Cc: | "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: General performance/load issue |
Date: | 2011-11-24 14:09:34 |
Message-ID: | ccc3da91f69e1407b12b529fb263812b.squirrel@sq.gransy.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
On 24 Listopad 2011, 14:51, Gaëtan Allart wrote:
> Hello everyone,
>
> I'm having some troubles with a Postgresql server.
> We're using PG has a database backend for a very big website (lots of data
> and much traffic).
>
> The issue : server suddenly (1H after restart) becomes slow (queries not
> responding), load rises (>20 instead of 1), iowait rises (20 to 70%)
>
> Version : 9.0.5
> Server : Dual Xeon X5650 (24 cores total)
> Memory : 48 GB
> Disks : SSD
>
>
> Top when overloaded :
Top is not the most useful tool here, I guess. Use "iotop" (will show you
which processes are doing the I/O) and tools like vmstat / iostat.
> Postgresql.conf :
>
> max_connections = 50
> shared_buffers = 12G
> temp_buffers = 40MB
> work_mem = 128MB
> maintenance_work_mem = 256MB
> max_files_per_process = 8192
> checkpoint_segments = 256
> checkpoint_timeout = 30min
> checkpoint_completion_target = 0.9
Fine. Let's see the options that look suspicious.
> effective_cache_size = 12GB
Why have you set it like this? According to the "free" output you've
posted the cache has about 38G, so why just 12G here? There are possible
reasons, but I don't think this is the case.
> fsync = off
A really bad idea. I guess your data are worthless to you, right?
> seq_page_cost = 2.0
> random_page_cost = 2.0
Eh? First of all, what really matters is the relative value of those two
values, and it's good habit to leave seq_page_cost = 1.0 and change just
the other values.
Plus the random I/O is not as cheap as sequential I/O even on SSD drives,
so I't recommend something like this:
seq_page_cost = 1.0
random_page_cost = 2.0 (or maybe 1.5)
Anyway this needs to be tested properly - watch the performance and tune
if needed.
> Did I do anything wrong? Any idea?
Not sure. My guess is you're getting bitten by a checkpoint. We need to
know a few more details.
1) What is dirty_background_ratio / dirty_ratio (see /proc/sys/vm/ directory)
2) enable log_checkpoints in postgresql.conf and see how it correlates to
the bad performance
3) check which processes are responsible for the I/O (use iotop)
Tomas
From | Date | Subject | |
---|---|---|---|
Next Message | Alban Hertroys | 2011-11-24 14:12:13 | Re: Reassign value of IN parameter in 9.1.1 |
Previous Message | Gavin Casey | 2011-11-24 13:52:16 | Reassign value of IN parameter in 9.1.1 |