Re: [PERFORM] Arguments Pro/Contra Software Raid

From:	Lincoln Yeoh <lyeoh(at)pop(dot)jaring(dot)my>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Scott Ribe <scott_ribe(at)killerbytes(dot)com>
Cc:	"Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>, Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>, PostgreSQL General <pgsql-general(at)postgresql(dot)org>
Subject:	Re: [PERFORM] Arguments Pro/Contra Software Raid
Date:	2006-05-14 08:31:00
Message-ID:	5.2.1.1.1.20060514162127.027ea4f8@localhost
Lists:	pgsql-general pgsql-performance

At 11:53 AM 5/12/2006 -0400, Tom Lane wrote:

>Scott Ribe <scott_ribe(at)killerbytes(dot)com> writes:
> >> My damn powerbook drive recently failed with very little warning
>
> > It seems to me that S.M.A.R.T. reporting is a crock of shit. I've had ATA
> > drives report everything OK while clearly in the final throes of death, just
> > minutes before total failure.
>
>FWIW, I replaced a powerbook's drive about two weeks ago myself, and its
>SMART reporting didn't show a darn thing wrong either. Fortunately, the
>drive started acting noticeably weird (long pauses seemingly trying to
>recalibrate itself) while still working well enough that I was able to
>get everything copied off it. I didn't wait for it to fail completely ;-)

Strange. With long pauses, usually you'd see stuff like "crc" errors in the
logs, and you'd get some info from the SMART monitoring stuff.

I guess a lot of it depends on the drive model and manufacturer.

SMART reporting is better than nothing, and it's actually not too bad. It
just depends on whether manufacturers implement it in useful ways.

I wouldn't trust the drive's or manufacturer's judgement on when failure is
imminent. The drive usually gathers statistics, and these are typically
readable with SMART monitoring/reporting software, so you should check
those stats and decide for yourself when failure is imminent.

For example: I'd suggest treating any non-cable-related CRC errors or
seek failures as "drive replacement time" - even if the drive or
manufacturer thinks you need to have tons in a row for "failure imminent".
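
As a rough sketch of "check those stats and decide for yourself", the Python below parses an attribute table in the layout printed by `smartctl -A` (from smartmontools) and flags watched counters with a nonzero raw value. The watched attribute names, the "any nonzero raw value" rule, and the sample table are all assumptions for illustration. Raw values are vendor-specific, and a script like this cannot tell cable-related CRC errors from other kinds, so treat the result as a prompt to look closer rather than a verdict.

```python
# Sketch only. The watched names and the "nonzero means suspicious" rule are
# illustrative assumptions; raw-value meaning varies by drive vendor.
WATCHED = {"UDMA_CRC_Error_Count", "Seek_Error_Rate"}


def parse_attributes(text):
    """Return {attribute_name: raw_value} from `smartctl -A`-style table text."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns;
        # the header row ("ID# ATTRIBUTE_NAME ...") is skipped by this check.
        if len(fields) >= 10 and fields[0].isdigit():
            raw = fields[9]
            if raw.isdigit():
                attrs[fields[1]] = int(raw)
    return attrs


def suspicious(attrs, watched=WATCHED):
    """Return the watched attribute names whose raw value is above zero."""
    return sorted(name for name in watched if attrs.get(name, 0) > 0)


# Made-up sample in the smartctl table layout, NOT output from a real drive.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   100   100   030    Pre-fail  Always       -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       3
"""

if __name__ == "__main__":
    print(suspicious(parse_attributes(SAMPLE)))
```

In practice the text would come from running `smartctl -A` against the device in question; the threshold for "replace it" stays your call, not the drive's.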

I recommend "blacklisting" drives which don't notice anything before it is
too late, e.g. drives that start taking a long time to read a block yet
report no differences in the SMART stats.
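
One way to make that blacklisting test concrete is to compare two snapshots of a drive's SMART counters taken before and after the slow reads were noticed. This is a minimal sketch: the snapshot format (a dict of attribute name to raw value) and both function names are hypothetical, and how you detect "slow reads" is left to you.

```python
def changed_attributes(before, after):
    """Names whose raw value differs between two {name: raw_value} snapshots."""
    names = before.keys() | after.keys()
    return sorted(n for n in names if before.get(n) != after.get(n))


def blacklist_candidate(before, after, slow_reads_seen):
    """True when slow reads were observed yet no SMART counter moved."""
    return bool(slow_reads_seen) and not changed_attributes(before, after)
```

A drive model that keeps returning True here is one whose SMART data gave no warning, which is the case described above.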

In response to

Re: [PERFORM] Arguments Pro/Contra Software Raid at 2006-05-12 15:53:56 from Tom Lane
