Quick Links

Re: Slow queries / commits, mis-configuration or hardware issues?

From:	"Tomas Vondra" <tv(at)fuzzy(dot)cz>
To:	"Cody Caughlan" <toolbag(at)gmail(dot)com>
Cc:	pgsql-performance(at)postgresql(dot)org
Subject:	Re: Slow queries / commits, mis-configuration or hardware issues?
Date:	2011-11-14 19:22:52
Message-ID:	87e450fd65045c328f41e1b99a4029d5.squirrel@sq.gransy.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-performance

On 14 Listopad 2011, 19:16, Cody Caughlan wrote:
> shared_buffers = 3584MB
> wal_buffers = 16MB
> checkpoint_segments = 32
> max_wal_senders = 10
> checkpoint_completion_target = 0.9
> wal_keep_segments = 1024
> maintenance_work_mem = 256MB
> work_mem = 88MB
> shared_buffers = 3584MB
> effective_cache_size = 10GB

Seems reasonable, although I'd bump up the checkpoint_timeout (the 5m is
usually too low).

> The PGDATA dir is a RAID10 on 4 local ("ephemeral" in EC2 speak)
> drives. I ran some dd tests and received the following output:
>
> --- WRITING ---
> root(at)sql03:/data# time sh -c "dd if=/dev/zero of=/data/tmp/bigfile
> bs=8k count=4000000 && sync"
> 4000000+0 records in
> 4000000+0 records out
> 32768000000 bytes (33 GB) copied, 670.663 s, 48.9 MB/s
>
> real 11m52.199s
> user 0m2.720s
> sys 0m45.330s

This measures sequential write performance (and the same holds for the
read test). We need to know the random I/O performance too - use bonnie++
or similar tool.

Based on the AWS benchmarks I've seen so far, I'd expect about 90 MB/s for
sequential read/writes, and about twice that performance for a 4-drive
RAID10. So while the reads (211 MB/s) seem perfectly OK, the writes
(50MB/s) are rather slow. Have you measured this on an idle system, or
when the db was running?

See for example this:

[1] http://victortrac.com/EC2_Ephemeral_Disks_vs_EBS_Volumes
[2]
http://www.gabrielweinberg.com/blog/2011/05/raid0-ephemeral-storage-on-aws-ec2.html

> I have enabled log_checkpoints and here is a recent sample from the log:
> 2011-11-14 17:39:15 UTC pid:3965 (-0) LOG: checkpoint starting: time
> 2011-11-14 17:43:49 UTC pid:3965 (-0) LOG: checkpoint complete: wrote
> 16462 buffers (3.6%); 0 transaction log file(s) added, 0 removed, 9
> recycled; write=269.978 s, sync=4.106 s, total=274.117 s; sync
> files=82, longest=2.943 s, average=0.050 s

Nothing special here - this just says that the checkpoints were timed and
finished on time (the default checkpoint timeout is 5 minutes, with
completion target 0.9 the expected checkpoint time is about 270s). Not a
checkpoint issue, probably.

> I've been collecting random samples from pg_stat_bgwriter:
> https://gist.github.com/4faec2ca9a79ede281e1

Although it's a bit difficult to interpret this (collect the data in
regular intervals - e.g. every hour - and post the differences, please),
but it seems reasonable.

> So given all this information (if you need more just let me know), is
> there something fundamentally wrong or mis-configured? Do I have an
> I/O issue?

Probably - the discrepancy between read/write performance is a bit
suspicious.

Try to watch the I/O performance when this happens, i.e. run "iostat -x"
and watch the output (especially %util, r_await, w_await) and post several
lines of the output.

Tomas

In response to

Slow queries / commits, mis-configuration or hardware issues? at 2011-11-14 18:16:46 from Cody Caughlan

Responses

Re: Slow queries / commits, mis-configuration or hardware issues? at 2011-11-14 21:58:52 from Cody Caughlan

Browse pgsql-performance by date

	From	Date	Subject
Next Message	Shaun Thomas	2011-11-14 21:33:35	Re: Using incrond for archiving
Previous Message	Cody Caughlan	2011-11-14 18:16:46	Slow queries / commits, mis-configuration or hardware issues?