From: | "Tomas Vondra" <tv(at)fuzzy(dot)cz> |
---|---|
To: | "Cody Caughlan" <toolbag(at)gmail(dot)com> |
Cc: | pgsql-performance(at)postgresql(dot)org |
Subject: | Re: Slow queries / commits, mis-configuration or hardware issues? |
Date: | 2011-11-14 19:22:52 |
Message-ID: | 87e450fd65045c328f41e1b99a4029d5.squirrel@sq.gransy.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
On 14 Listopad 2011, 19:16, Cody Caughlan wrote:
> shared_buffers = 3584MB
> wal_buffers = 16MB
> checkpoint_segments = 32
> max_wal_senders = 10
> checkpoint_completion_target = 0.9
> wal_keep_segments = 1024
> maintenance_work_mem = 256MB
> work_mem = 88MB
> shared_buffers = 3584MB
> effective_cache_size = 10GB
Seems reasonable, although I'd bump up the checkpoint_timeout (the 5m is
usually too low).
> The PGDATA dir is a RAID10 on 4 local ("ephemeral" in EC2 speak)
> drives. I ran some dd tests and received the following output:
>
> --- WRITING ---
> root(at)sql03:/data# time sh -c "dd if=/dev/zero of=/data/tmp/bigfile
> bs=8k count=4000000 && sync"
> 4000000+0 records in
> 4000000+0 records out
> 32768000000 bytes (33 GB) copied, 670.663 s, 48.9 MB/s
>
> real 11m52.199s
> user 0m2.720s
> sys 0m45.330s
This measures sequential write performance (and the same holds for the
read test). We need to know the random I/O performance too - use bonnie++
or similar tool.
Based on the AWS benchmarks I've seen so far, I'd expect about 90 MB/s for
sequential read/writes, and about twice that performance for a 4-drive
RAID10. So while the reads (211 MB/s) seem perfectly OK, the writes
(50MB/s) are rather slow. Have you measured this on an idle system, or
when the db was running?
See for example this:
[1] http://victortrac.com/EC2_Ephemeral_Disks_vs_EBS_Volumes
[2]
http://www.gabrielweinberg.com/blog/2011/05/raid0-ephemeral-storage-on-aws-ec2.html
> I have enabled log_checkpoints and here is a recent sample from the log:
> 2011-11-14 17:39:15 UTC pid:3965 (-0) LOG: checkpoint starting: time
> 2011-11-14 17:43:49 UTC pid:3965 (-0) LOG: checkpoint complete: wrote
> 16462 buffers (3.6%); 0 transaction log file(s) added, 0 removed, 9
> recycled; write=269.978 s, sync=4.106 s, total=274.117 s; sync
> files=82, longest=2.943 s, average=0.050 s
Nothing special here - this just says that the checkpoints were timed and
finished on time (the default checkpoint timeout is 5 minutes, with
completion target 0.9 the expected checkpoint time is about 270s). Not a
checkpoint issue, probably.
> I've been collecting random samples from pg_stat_bgwriter:
> https://gist.github.com/4faec2ca9a79ede281e1
Although it's a bit difficult to interpret this (collect the data in
regular intervals - e.g. every hour - and post the differences, please),
but it seems reasonable.
> So given all this information (if you need more just let me know), is
> there something fundamentally wrong or mis-configured? Do I have an
> I/O issue?
Probably - the discrepancy between read/write performance is a bit
suspicious.
Try to watch the I/O performance when this happens, i.e. run "iostat -x"
and watch the output (especially %util, r_await, w_await) and post several
lines of the output.
Tomas
From | Date | Subject | |
---|---|---|---|
Next Message | Shaun Thomas | 2011-11-14 21:33:35 | Re: Using incrond for archiving |
Previous Message | Cody Caughlan | 2011-11-14 18:16:46 | Slow queries / commits, mis-configuration or hardware issues? |