Re: db recovery after raid5 failure

From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Balkrishna Sharma" <b_ki(at)hotmail(dot)com>, <pgsql-admin(at)postgresql(dot)org>,<qcor(at)vp(dot)pl>
Subject: Re: db recovery after raid5 failure
Date: 2010-06-22 14:46:46
Message-ID: 4C20868602000025000327D4@gw.wicourts.gov
Lists: pgsql-admin

Balkrishna Sharma <b_ki(at)hotmail(dot)com> wrote:

> If the database is not extremely huge, makes you wonder what does
> a RAID actually give us.

Well, RAID5 gives you a situation where a second drive must fail
before recovery from the first failure is complete for you to lose
the array, versus being instantly dead on a single-drive failure.
RAID6 requires three drives to fail in close succession (assuming a
hot spare which initiates recovery on failure). RAID10 requires that
both drives of a mirrored pair fail. We have about 100 database
servers, and probably average about two drive failures a month;
having any down time from them is rare because of RAID (and that's
with us primarily using RAID5).
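
As a rough illustration of those failure counts (a sketch, not
anything we run; the function name, the drive numbering, and the
assumption that RAID10 mirrors drives 0+1, 2+3, and so on are all
made up for the example), this checks whether an array still holds
its data after a given set of drives is lost at the same time:

```python
def array_survives(level, failed, n_drives):
    """Return True if the array still holds its data after every drive
    in `failed` is lost at once (before any rebuild has completed)."""
    failed = set(failed)
    if level == "raid5":
        # Single parity: one lost drive can be reconstructed.
        return len(failed) <= 1
    if level == "raid6":
        # Dual parity: two lost drives can be reconstructed.
        return len(failed) <= 2
    if level == "raid10":
        # Assumed layout: drives (0,1), (2,3), ... are mirror pairs.
        # Data is lost only when both halves of some pair are gone.
        pairs = [(i, i + 1) for i in range(0, n_drives, 2)]
        return not any(a in failed and b in failed for a, b in pairs)
    raise ValueError("unknown RAID level: %r" % (level,))
```

Under that assumed layout, a four-drive RAID10 survives losing
drives 0 and 2 (different pairs) but not drives 0 and 1 (the same
pair).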

> A robust near-realtime replication setup (say PITR + cloud)
> may be good enough against once in a few years of disk
> failure. At least you don't add another point of failure that you
> (your database/OS) can't do anything about.

You've totally lost me there. "The cloud" still uses similar
techniques, just out of your sight and control. If you assume that
whoever is running it can do it better than you can, that's one
thing; just don't assume it's magic. The machines in my shop are
what I *can* do something about. Management here insists on near-
real-time backup using at least two completely independent
techniques to multiple machines in multiple buildings, with
continuous testing that all backups actually restore. If we were to
float data off into a cloud somewhere, I can guarantee we wouldn't
count on it without an alternative. As a place to put "one more
copy" it might make sense, as long as it had strong encryption.
(Again, you've lost all control over who has what access once you
send it into the cloud.)

-Kevin
