From: Scott Carey <scott(at)richrelevance(dot)com>
To: Mark Kirkwood <markir(at)paradise(dot)net(dot)nz>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Raid 10 chunksize
Date: 2009-03-25 01:48:36
Message-ID: C5EEDB84.3B34%scott@richrelevance.com
Lists: pgsql-performance
On 3/24/09 6:09 PM, "Mark Kirkwood" <markir(at)paradise(dot)net(dot)nz> wrote:
> I'm trying to pin down some performance issues with a machine where I
> work, we are seeing (read only) query response times blow out by an
> order of magnitude or more at busy times. Initially we blamed
> autovacuum, but after a tweak of the cost_delay it is *not* the
> problem. Then I looked at checkpoints... and altho there was some
> correlation with them and the query response - I'm thinking that the
> raid chunksize may well be the issue.
>
> Fortunately there is an identical DR box, so I could do a little
> testing. Details follow:
>
> Sun 4140 2x quad-core opteron 2356 16G RAM, 6x 15K 140G SAS
> Debian Lenny
> Pg 8.3.6
>
> The disk is laid out using software (md) raid:
>
> 4 drives raid 10 *4K* chunksize with database files (ext3 ordered, noatime)
> 2 drives raid 1 with database transaction logs (ext3 ordered, noatime)
>
>
> Top looks like:
>
> Cpu(s): 2.5%us, 1.9%sy, 0.0%ni, 71.9%id, 23.4%wa, 0.2%hi, 0.2%si,
> 0.0%st
> Mem: 16474084k total, 15750384k used, 723700k free, 1654320k buffers
> Swap: 2104440k total, 944k used, 2103496k free, 13552720k cached
>
> It looks to me like we are maxing out the raid 10 array, and I suspect
> the chunksize (4K) is the culprit. However as this is a pest to change
> (!) I'd like some opinions on whether I'm jumping to conclusions. I'd
> also appreciate comments about what chunksize to use (I've tended to use
> 256K in the past, but what are folks preferring these days?)
>
> regards
>
> Mark
>
>
md tends to work well with 1MB chunk sizes on RAID 1 or 10. Unlike a hardware
raid card, md won't read the whole 1MB chunk or cache much on a small random
read, so smaller chunks don't help random i/o. If you go that route, make sure
any partitions built on top of md are 1MB aligned.
Random I/O on files smaller than 1MB would be affected -- but that's not a
problem on a 16GB RAM server running a database that won't fit in RAM.
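A minimal sketch of that setup, assuming hypothetical device names (sda-sdd,
md0) -- adjust for your system. `mdadm --chunk` takes kilobytes, so 1024 means
a 1MB chunk:

```shell
# Create a 4-disk RAID 10 array with a 1MB (1024K) chunk size.
# Device names here are examples only.
mdadm --create /dev/md0 --level=10 --raid-devices=4 --chunk=1024 \
    /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1

# Verify the chunk size the array actually ended up with:
mdadm --detail /dev/md0 | grep -i chunk

# If you partition on top of md, start the first partition at 1MiB
# so filesystem blocks stay aligned with the array's chunks:
parted -s /dev/md0 mklabel gpt mkpart primary 1MiB 100%
```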
Your xlogs are occasionally close to max utilization too -- which is suspicious
at only 10MB/sec. There is no reason for them to be on ext3: the transaction
log syncs its own writes, so file system journaling buys you nothing there.
Ext2 will lower sync times and reduce i/o utilization.
I also tend to use xfs if sequential access is important at all (obviously
not so in pg_bench). ext3 is slightly safer in a power failure with unsynced
data, but Postgres has that covered with its own journal anyway, so those
differences are irrelevant.
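For the 2-disk RAID 1 log array, the switch to ext2 might look like the sketch
below -- device name, mount point, and the 8.3 data directory path are all
assumptions, and the cluster must be stopped before moving pg_xlog:

```shell
# Format the RAID 1 mirror (example device) as ext2 -- no journal,
# so each fsync from the WAL writer costs less than on ext3:
mkfs.ext2 /dev/md1

# Mount with noatime, matching the options used on the data array:
mkdir -p /mnt/pg_xlog
mount -o noatime /dev/md1 /mnt/pg_xlog

# Relocate the transaction log (Postgres stopped; paths are examples):
# mv /var/lib/postgresql/8.3/main/pg_xlog/* /mnt/pg_xlog/
# rmdir /var/lib/postgresql/8.3/main/pg_xlog
# ln -s /mnt/pg_xlog /var/lib/postgresql/8.3/main/pg_xlog
```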
Next message: David Rees, 2009-03-25 02:04:42, "Re: Raid 10 chunksize"
Previous message: Mark Kirkwood, 2009-03-25 01:09:07, "Raid 10 chunksize"