Re: How to improve db performance with $7K?

From: Alex Turner <armtuk(at)gmail(dot)com>
To: Jacques Caron <jc(at)directinfos(dot)com>
Cc: Greg Stark <gsstark(at)mit(dot)edu>, William Yu <wyu(at)talisys(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: How to improve db performance with $7K?
Date: 2005-04-18 19:26:31
Message-ID: 33c6269f050418122650848e37@mail.gmail.com
Lists: pgsql-performance

On 4/18/05, Jacques Caron <jc(at)directinfos(dot)com> wrote:
> Hi,
>
> At 20:21 18/04/2005, Alex Turner wrote:
> >So I wonder if one could take this stripe size thing further and say
> >that a larger stripe size is more likely to result in requests getting
> >served parallelized across disks which would lead to increased
> >performance?
>
> Actually, it would be pretty much the opposite. The smaller the stripe
> size, the more evenly distributed data is, and the more disks can be used
> to serve requests. If your stripe size is too large, many random accesses
> within one single file (whose size is smaller than the stripe size/number
> of disks) may all end up on the same disk, rather than being split across
> multiple disks (the extreme case being stripe size = total size of all
> disks, which means concatenation). If all accesses had the same cost (i.e.
> no seek time, only transfer time), the ideal would be to have a stripe size
> equal to the number of disks.
>
[snip]

Ahh yes - but the critical distinction is this:
The smaller the stripe size, the more disks will be used to serve _a_
request - which is bad for OLTP, because there the cost is mostly seek
and you want fewer disks per request so that you can serve more
requests per second. If more than one disk has to seek to serve a
single request, every extra disk that seeks is a disk that cannot
serve a second request at the same time.

To have more throughput in MB/sec, you want a smaller stripe size so
that you have more disks serving a single request, allowing you to
multiply by the number of effective drives to get total bandwidth.
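
To put rough numbers on the two cases above, here is a minimal sketch
in Python. It assumes a plain striped (RAID 0 style) layout where
"stripe size" means the chunk written to one disk before moving to the
next. The function names, the 4-disk array and the 50 MB/sec per-drive
rate are all made up for illustration, and the model ignores caching,
parity and controller behaviour.

```python
def disks_touched(offset, length, chunk_size, num_disks):
    """Number of distinct disks a request touches on a simple striped layout."""
    first_chunk = offset // chunk_size
    last_chunk = (offset + length - 1) // chunk_size
    # A request spanning more chunks than there are disks wraps around.
    return min(last_chunk - first_chunk + 1, num_disks)


def concurrent_requests(num_disks, disks_per_request):
    """Rough upper bound on requests that can seek at once without sharing a disk."""
    return num_disks // disks_per_request


def streaming_mb_per_sec(disks_per_request, per_disk_mb_per_sec):
    """Idealized bandwidth of one request: per-disk rate times disks working on it."""
    return disks_per_request * per_disk_mb_per_sec


NUM_DISKS = 4             # hypothetical array
PER_DISK_MB_PER_SEC = 50  # hypothetical per-drive transfer rate
REQUEST = 8 * 1024        # one 8 KB read, aligned at offset 0

for chunk_kb in (2, 8, 64):
    n = disks_touched(0, REQUEST, chunk_kb * 1024, NUM_DISKS)
    print(f"{chunk_kb:>2} KB stripe: {n} disk(s) per request, "
          f"{concurrent_requests(NUM_DISKS, n)} request(s) in parallel, "
          f"{streaming_mb_per_sec(n, PER_DISK_MB_PER_SEC)} MB/sec peak per request")
```

With a 2 KB stripe the single 8 KB read ties up all four disks, so only
one such request can be in flight; with an 8 KB or larger stripe it
stays on one disk and four requests can seek independently. The
bandwidth column moves the other way, which is the trade-off in
question.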

Because OLTP is made up of small reads and writes to a small number of
different files, I would guess that you want those files split up
across your RAID, but not so much that a single small read or write
operation would traverse more than one disk. That would imply that
your optimal stripe size is somewhere on the right side of the bell
curve that represents the size distribution of your database's reads
and writes. If on average the dbwriter never flushes less than 1MB
to disk at a time, then I guess your best stripe size would be 1MB,
but that seems very large to me.
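
One way to make "the right side of the bell curve" concrete, as a
sketch only: take a sample of observed I/O request sizes, pick a high
percentile, and round up to a power of two. The function name, the
90th-percentile cutoff and the sample data below are assumptions for
illustration, not measurements from any real system.

```python
def suggest_stripe_size(request_sizes, percentile=90):
    """Smallest power-of-two size that covers `percentile` percent of requests."""
    if not request_sizes:
        raise ValueError("need at least one observed request size")
    ordered = sorted(request_sizes)
    # Nearest-rank percentile, done in integer arithmetic.
    rank = max(1, (percentile * len(ordered) + 99) // 100)
    target = ordered[rank - 1]
    # Round up to the next power of two (a target that already is one is kept).
    return 1 << (target - 1).bit_length()


# Hypothetical workload: mostly 8 KB pages, some 16 KB, a few 64 KB flushes.
sample = [8192] * 70 + [16384] * 20 + [65536] * 10
print(suggest_stripe_size(sample), "bytes")
```

Any request no larger than the suggested size fits within a single
stripe when it is aligned, so most small reads and writes would stay on
one disk while the rare large flush still spreads across several.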

So I think therefore that I may be contending the exact opposite of
what you are postulating!

Alex Turner
netEconomist
