Re: RAID arrays and performance

From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "Matthew" <matthew(at)flymine(dot)org>
Cc: "Mark Mielke" <mark(at)mark(dot)mielke(dot)cc>, <pgsql-performance(at)postgresql(dot)org>
Subject: Re: RAID arrays and performance
Date: 2008-01-29 14:55:39
Message-ID: 87bq74vfys.fsf@oxford.xeocode.com
Lists: pgsql-performance

"Matthew" <matthew(at)flymine(dot)org> writes:

> On Tue, 4 Dec 2007, Gregory Stark wrote:
>> FWIW I posted some numbers from a synthetic case to pgsql-hackers
>>
>> http://archives.postgresql.org/pgsql-hackers/2007-12/msg00088.php
>...
> This was with 8192 random requests of size 8192 bytes from an 80GB test file.
> Unsorted requests ranged from 1.8 MB/s with no prefetching to 28MB/s with lots
> of prefetching. Sorted requests went from 2.4MB/s to 38MB/s. That's almost
> exactly 16x improvement for both, and this is top of the line hardware.

Neat. The curves look very similar to mine. I also like that with your
hardware the benefit maxes out at pretty much exactly where I had
mathematically predicted it would ((stripe size)^2 / 2).

> So, this is FYI, and also an added encouragement to implement fadvise
> prefetching in some form or another. How's that going by the way?

I have a patch which implements it for the low-hanging fruit of bitmap index
scans. It does so using an extra trip through the buffer manager, which is the
least invasive approach but not necessarily the best.

Heikki was pushing me to look at changing the buffer manager to support
asynchronous buffer reads. In that scheme you would issue a ReadBufferAsync,
which would give you back a pinned buffer. The scan would have to hold onto
that buffer and make a new buffer manager call to complete the i/o when it
actually needed the buffer. The ReadBufferAsync call would put the buffer in
some form of i/o-in-progress state.

On systems with only posix_fadvise, ReadBufferAsync would issue posix_fadvise
and ReadBufferFinish would issue a regular read. On systems with an effective
libaio, ReadBufferAsync could issue the aio operation and ReadBufferFinish
could then do an aio_return.
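
To make the posix_fadvise variant concrete, here is a rough sketch of the
two-step pattern outside of PostgreSQL. Everything in it is hypothetical and
for illustration only: the PrefetchBuffer struct, the read_buffer_async and
read_buffer_finish functions, and the 8192-byte BLOCK_SIZE are stand-ins, not
the real buffer manager API, and there is no pinning or locking here. Only
posix_fadvise and pread are real calls.

```c
#define _XOPEN_SOURCE 600       /* expose posix_fadvise() and pread() */
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define BLOCK_SIZE 8192         /* assumed block size, matching the 8192-byte requests above */

/* Hypothetical stand-in for a pinned buffer handle. */
typedef struct {
    int   fd;                   /* file the block lives in */
    off_t offset;               /* byte offset of the block */
    int   io_in_progress;       /* 1 between the async call and the finish call */
    char  data[BLOCK_SIZE];     /* block contents, valid after the finish call */
} PrefetchBuffer;

/*
 * Step 1: remember which block is wanted and hint the kernel to start
 * reading it. No data is copied yet. Returns 0 on success or an error
 * number (posix_fadvise reports errors through its return value).
 */
int read_buffer_async(PrefetchBuffer *buf, int fd, long blockno)
{
    buf->fd = fd;
    buf->offset = (off_t) blockno * BLOCK_SIZE;
    buf->io_in_progress = 1;
    memset(buf->data, 0, BLOCK_SIZE);
    return posix_fadvise(fd, buf->offset, BLOCK_SIZE, POSIX_FADV_WILLNEED);
}

/*
 * Step 2: the caller now needs the data, so issue a regular read. If the
 * hint was honoured the block should already be in the OS cache.
 * Returns the number of bytes read, or -1 on error.
 */
ssize_t read_buffer_finish(PrefetchBuffer *buf)
{
    ssize_t n;

    assert(buf->io_in_progress);
    n = pread(buf->fd, buf->data, BLOCK_SIZE, buf->offset);
    buf->io_in_progress = 0;
    return n;
}
```

A scan would call read_buffer_async for a batch of upcoming blocks, keep the
handles around, and call read_buffer_finish on each one only when it gets to
that block. The handle already carries the file and offset, so the second
step has nothing to look up, which is the saving described below.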

The pros of such an approach would be less locking and cpu overhead in the
buffer manager since the second trip would have the buffer handle handy and
just have to issue the read.

The cons of such an approach would be the extra shared buffers occupied by
buffers which aren't really needed yet, and the additional complexity in the
buffer manager from the new i/o-in-progress state. (I'm not sure, but I don't
think we get to reuse the existing i/o in progress state.)

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL training!
