Re: RAID for the DB filesystem

From: Greg Spiegelberg <gspiegelberg(at)gmail(dot)com>
To: "[ADMIN]" <pgsql-admin(at)postgresql(dot)org>
Subject: Re: RAID for the DB filesystem
Date: 2009-08-04 13:11:42
Message-ID: 22723570908040611kc21c3dev8e43da5497ac26a7@mail.gmail.com
Lists: pgsql-admin

On Tue, Aug 4, 2009 at 2:17 AM, Brian Modra <epailty(at)googlemail(dot)com> wrote:

> 2009/8/3 Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>
> >
> > On Mon, Aug 3, 2009 at 10:15 AM, Brian Modra<brian(at)zwartberg(dot)com> wrote:
> >
> > Is there a valid reason you're NOT considering RAID-1 here? I hope
> > RAID-0 is a typo.
>
> It was an error. I wanted mirroring. But... on second thoughts, is
> there really a good reason for using a second set of disks for the OS?
> Once the database is running, it's surely not going to be using the OS
> disk much, so why not just make a big RAID 10 array and use that for
> both OS and DB... partition it as usual I mean - boot, root. Should I
> use another disk for swap... for that matter, do I need swap at all...
> RAM will be at least 16GB?
>

Initially, I would agree with you that placing the OS and database on the
same RAID config sounds logical, but once you go through some "disasters"
you'll realize it's a case of putting all your eggs in one basket.

With the OS, and presumably backup software, on its own RAID config you can
recover the database, assuming you lost it due to hardware failure, without
having to recover the OS. This is a nice thing especially if you're remote
from your servers, like I am, and do not have the luxury of being able to
pop a CD in the server's drive to load the OS again. That's just one case
and not a database-admin one but I'm sure there are others.

>
> > Then I question the expertise of your experts. RAID5 is not fine.
> > It's slow, more prone to loss due to drive loss, and generally not a
> > good choice for databases.
> >
> > I would gladly have more SATA drives in a RAID-10 than fewer SAS
> > drives in a RAID-5.
> >
> > if someone is worried about "wasting" disk space tell them to worry
> > about something else, like losing data.
>
>
On the performance argument, I wholeheartedly agree that RAID-5 is not where
it's at. Sequential I/O is on par with other RAID types, but when it comes
to random I/O it's one of the worst of the bunch, if not the worst.

From a recoverability angle, losing a disk in a RAID-5 isn't the end of the
world, but your world will spin much, much slower than it did while it's
recalculating all those parity blocks, and while doing so you're at risk of
data loss if a second drive goes.
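
To make the parity rebuild concrete, here is a minimal sketch of the XOR
arithmetic behind RAID-5. It is an illustration only, not how any particular
controller is implemented; the block contents and the xor_blocks helper are
made up for the example. It shows why a rebuild has to read every surviving
disk, and why there is nothing left to rebuild from if a second disk goes.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# One stripe across three hypothetical data disks, plus its parity block.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Disk 1 fails: its block is the XOR of ALL surviving blocks and the parity,
# so the rebuild must read every remaining disk in the group.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
assert rebuilt == data[1]

# If disk 2 had failed as well, the survivors would only yield the XOR of the
# two missing blocks, which is not enough to recover either one.
both_missing = xor_blocks([data[0], parity])
assert both_missing == xor_blocks([data[1], data[2]])
```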

There are units out there that allow for mirrored RAID-5, i.e. RAID-5+1, to
protect from multiple disk failures; however, at that point RAID-10 is the
route to go.

There are units that 'format' the RAID group only where the disk has been
allocated. In other words, if you have a 4 disk RAID-6 and 25% of it has
been allocated to LUN(s) then the controller will have the parity calculated
for only that 25% in use. That makes recovery quicker in underallocated
situations, but there is still a window with a RAID-5 recovery where a second
disk failure kills the whole operation. RAID-6, however, is better in this
case because it takes a third disk failure before data loss, but you had
better have a second spare waiting in the wings.

I don't believe RAID-10s are perfect either. If your RAID-10 is really 2
RAID-0s mirrored, i.e. RAID-0+1, and you have 2 disk failures, one in each
RAID-0, then that's a go-to-tape situation. If your RAID-10 is really
multiple mirrors striped, i.e. a true RAID-10 or RAID-1+0, you're still
susceptible to data loss, but only if you lose both sides of a single
mirror. Not as likely but still possible.
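
As a rough illustration of "not as likely", the sketch below counts which
two-disk failure combinations are fatal under each layout. It assumes exactly
two disks fail and that every pair of disks is equally likely to be the
failing pair, which real hardware will not obey exactly; the disk numbering
and the fatal_fraction helper are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def fatal_fraction(n_disks, layout):
    """Fraction of two-disk failure pairs that lose data for a given layout."""
    half = n_disks // 2
    pairs = list(combinations(range(n_disks), 2))
    if layout == "0+1":
        # Disks 0..half-1 are stripe set A, the rest are stripe set B.
        # Fatal when the two failures land in different stripe sets.
        fatal = [p for p in pairs if (p[0] < half) != (p[1] < half)]
    elif layout == "1+0":
        # Disks (0,1), (2,3), ... are mirror pairs.
        # Fatal only when both failures hit the same mirror pair.
        fatal = [p for p in pairs if p[0] // 2 == p[1] // 2]
    else:
        raise ValueError("layout must be '0+1' or '1+0'")
    return Fraction(len(fatal), len(pairs))

print(fatal_fraction(8, "0+1"))  # 4/7
print(fatal_fraction(8, "1+0"))  # 1/7
```

Under those assumptions, an 8-disk RAID-0+1 loses data in 16 of the 28
possible two-disk failures, while an 8-disk RAID-1+0 loses data in only 4.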

Recovery in either RAID-10 setup is simpler than in the parity RAIDs in that
a disk only has to be copied and no parity calculated. There is still a
window where a second disk failure could result in data loss.

I believe that regardless of your selection you must, must, must look at
things as a 3 year solution, 5 years at the most. As those disks spin and
age, the likelihood of multiple failures increases. You may not have to
replace a single disk in the first 3 years, but should you lose power and
those drives spin down and cool, the odds of one or more not spinning back
up are pretty good. Trust me, I've experienced that many times. Plan to
replace aging units.

In the end, people will do what people will do and most likely the largest
factor won't be performance, protection or recoverability but instead it
will be money. If you're lucky, money isn't an issue.

Greg
