Re: Plug-pull testing worked, diskchecker.pl failed

From: Greg Smith <greg(at)2ndQuadrant(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Plug-pull testing worked, diskchecker.pl failed
Date: 2012-10-27 05:26:33
Message-ID: 508B7089.6080107@2ndQuadrant.com
Lists: pgsql-general

On 10/24/12 4:04 PM, Chris Angelico wrote:

> Is this a useful and plausible testing methodology? It's definitely
> showed up some failures. On a hard-disk, all is well as long as the
> write-back cache is disabled; on the SSDs, I can't make them reliable.

On Linux systems, you can tell when Postgres is busy writing data out
during a checkpoint because the "Dirty:" amount in /proc/meminfo will be
dropping rapidly. At most other times, that number goes up. You can
increase the odds of finding database-level corruption during a
pull-the-plug test by yanking the plug during that most sensitive
moment. Combine a reasonable write-heavy test like the one you've
devised with that "optimization", and systems that don't write reliably
will usually corrupt within a few tries.
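
As a rough illustration of watching for that moment, here is a minimal
Python sketch that polls /proc/meminfo and reports when "Dirty:" falls
sharply between samples. The function names, the 0.5 second interval and
the 1024 kB threshold are assumptions chosen for illustration, not
anything from the test described above. A falling "Dirty:" value
reflects any kernel writeback, not only a Postgres checkpoint, so treat
the output as a hint about when to pull the plug.

```python
import re
import time


def parse_dirty_kb(meminfo_text):
    """Return the 'Dirty:' value in kB from /proc/meminfo text, or None."""
    match = re.search(r"^Dirty:\s+(\d+)\s*kB", meminfo_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def is_dropping(previous_kb, current_kb, min_drop_kb=1024):
    """True when dirty memory fell by at least min_drop_kb between samples.

    The 1024 kB default is an arbitrary illustrative threshold.
    """
    return previous_kb - current_kb >= min_drop_kb


def watch(interval_s=0.5, min_drop_kb=1024, path="/proc/meminfo"):
    """Print a line each time Dirty drops sharply; stop with Ctrl-C."""
    previous = None
    while True:
        with open(path) as f:
            current = parse_dirty_kb(f.read())
        if (previous is not None and current is not None
                and is_dropping(previous, current, min_drop_kb)):
            print(f"Dirty fell {previous} -> {current} kB: writeback in progress")
        previous = current
        time.sleep(interval_s)
```

Calling watch() in a second terminal while the write-heavy test runs
would print a line whenever dirty memory is being flushed; the threshold
likely needs tuning to the workload.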

In general, though, diskchecker.pl is the more sensitive test. If it
fails, storage is unreliable for PostgreSQL, period. It's good that
you've followed up by confirming the real database corruption implied by
that is also visible. That's usually not needed, though. If
diskchecker says the drive is bad, you're done--don't put a database on
it. Doing the database-level tests is more for catching the cases
diskchecker misses: where it says the drive is OK, but perhaps there is
a filesystem problem it doesn't test for that makes storage unreliable.

What SSD are you using? The Intel 320 and 710 series models are the
only SATA-connected drives still on the market that I know of that pass a
serious test. The other good models are direct PCI-E storage units,
like the FusionIO drives.

--
Greg Smith 2ndQuadrant US greg(at)2ndQuadrant(dot)com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com
