Re: [PERFORM] Arguments Pro/Contra Software Raid

From: Lincoln Yeoh <lyeoh(at)pop(dot)jaring(dot)my>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Scott Ribe <scott_ribe(at)killerbytes(dot)com>
Cc: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>, Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>, PostgreSQL General <pgsql-general(at)postgresql(dot)org>
Subject: Re: [PERFORM] Arguments Pro/Contra Software Raid
Date: 2006-05-14 08:31:00
Message-ID: 5.2.1.1.1.20060514162127.027ea4f8@localhost
Lists: pgsql-general pgsql-performance

At 11:53 AM 5/12/2006 -0400, Tom Lane wrote:

>Scott Ribe <scott_ribe(at)killerbytes(dot)com> writes:
> >> My damn powerbook drive recently failed with very little warning
>
> > It seems to me that S.M.A.R.T. reporting is a crock of shit. I've had ATA
> > drives report everything OK while clearly in the final throes of death,
> > just minutes before total failure.
>
>FWIW, I replaced a powerbook's drive about two weeks ago myself, and its
>SMART reporting didn't show a darn thing wrong either. Fortunately, the
>drive started acting noticeably weird (long pauses seemingly trying to
>recalibrate itself) while still working well enough that I was able to
>get everything copied off it. I didn't wait for it to fail completely ;-)

Strange. With long pauses you'd usually see things like "crc" errors in
the system logs, and the SMART monitoring tools would report something.

I guess a lot of it depends on the drive model and manufacturer.

SMART reporting is better than nothing, and it's actually not too bad. What
matters is whether the manufacturer implements it in a useful way.

I wouldn't trust the drive's or the manufacturer's judgement on when
failure is imminent. The drive usually gathers statistics, and these are
typically readable with SMART monitoring/reporting software, so you should
check those stats and decide for yourself when failure is imminent.

For example: I'd suggest regarding any non-cable-related CRC errors or
seek failures as "drive replacement time", even if the drive or
manufacturer thinks you need tons in a row before "failure imminent".
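
A rough sketch of that rule in Python, for illustration only. The counter
names ("crc_errors", "seek_errors") and the cable_related_crc argument are
made up for this example; real SMART attribute names and the meaning of
their raw values differ between drive models and manufacturers, so you
would have to map them yourself from whatever your monitoring tool reports.

```python
def replacement_reasons(counters, cable_related_crc=0):
    """Return a list of reasons to replace the drive (empty if none).

    counters: dict of hypothetical counter name -> error count, filled in
    by hand or by your own parsing of the SMART tool's output.
    cable_related_crc: CRC errors you have already traced to a bad cable.
    """
    reasons = []

    # Any CRC error not explained by cabling counts, even a single one.
    crc = counters.get("crc_errors", 0) - cable_related_crc
    if crc > 0:
        reasons.append(f"{crc} non-cable CRC error(s)")

    # Same for seek failures: do not wait for "tons in a row".
    seek = counters.get("seek_errors", 0)
    if seek > 0:
        reasons.append(f"{seek} seek failure(s)")

    return reasons


if __name__ == "__main__":
    print(replacement_reasons({"crc_errors": 3, "seek_errors": 0},
                              cable_related_crc=3))
    print(replacement_reasons({"crc_errors": 1, "seek_errors": 2}))
```

The point is only that the threshold is yours (here: zero), not the
drive's.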

I recommend "blacklisting" drives which don't notice anything before it is
too late, e.g. drives that start taking a long time to read a block yet
report no differences in the SMART stats.
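
The blacklist test can be sketched the same way: take a snapshot of the
stats before and after you notice trouble and see whether the drive
registered anything. This is a hypothetical helper, not part of any SMART
tool; the snapshot dicts and the saw_slow_reads flag are whatever you
record yourself.

```python
def should_blacklist(stats_before, stats_after, saw_slow_reads):
    """True if the drive visibly misbehaved but its stats did not change.

    stats_before / stats_after: dicts of counter name -> value, captured
    by you before and after the problem was observed (names are yours).
    saw_slow_reads: whether you observed reads taking unusually long.
    """
    return bool(saw_slow_reads) and stats_before == stats_after


if __name__ == "__main__":
    before = {"crc_errors": 0, "seek_errors": 0}
    # Drive stalled on reads, yet reports exactly the same numbers.
    print(should_blacklist(before, dict(before), saw_slow_reads=True))
```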
