From: | Scott Whitney <scott(at)journyx(dot)com> |
---|---|
To: | jayknowsunix(at)gmail(dot)com |
Cc: | Craig James <cjames(at)emolecules(dot)com>, pgsql-admin(at)postgresql(dot)org |
Subject: | Re: Stuck LSI 9650SE-12 RAID Controller |
Date: | 2014-08-05 16:14:20 |
Message-ID: | 1611759767.760183.1407255260652.JavaMail.zimbra@mail.int.journyx.com |
Lists: | pgsql-admin |
Unfortunately, yes, I have seen similar situations on Adaptec, IBM ServeRAID and PERC cards.
I would replace that card, personally, with a new one. Likely the card itself is going flaky.
Usually when I have seen this, swapping the card for a like card and importing the RAID config from the drives resolves it, unless the card went REAL bad and actually damaged the RAID itself (which I have also seen).
----- Original Message -----
> Certainly doesn't sound like a PostgreSQL issue. Is there any sort of
> advanced diagnostics for the raid controller? I certainly would want to
> thrash it top to bottom before I trusted it enough to put it back in
> service.
> --
> Jay
> Sent from my iPad
> > On Aug 5, 2014, at 12:00 PM, Craig James <cjames(at)emolecules(dot)com> wrote:
> >
> > Has anyone seen anything like this?
> >
> > Our LSI 9650SE-12 RAID Controller dropped the main Postgres disk offline
> > ... it just disappeared as though the disk wasn't there. It was an 8-disk
> > RAID10 unit. The other unit (RAID1 for Linux & pg_xlog) was still
> > functional.
> >
> > Using tw_cli, it showed the array as "DEGRADED" and claimed to be verifying
> > it. One disk in the array was "DEGRADED". There was no /dev entry for the
> > device; Linux couldn't see it at all.
> >
> > There were two hot spares, but it didn't use them. Worse, there was nothing
> > I could do to make it do anything. Every command reported "Failed" and no
> > further explanation. Booting into the RAID BIOS gave the same problem: if
> > I selected "rebuild" or "verify", it said "You must select an array..."
> > even though I had selected the array. It was as though the array didn't
> > exist, yet it was shown.
> >
> > I shut off the computer, unplugged the BBU from the RAID card and plugged
> > it back in, unplugged and reinserted all the SATA cables, and then
> > restarted. Exact same symptoms.
> >
> > I finally gave up trying to recover the database (we had a backup server).
> > The RAID controller let me delete and recreate the degraded array, and now
> > everything seems fine. I can rebuild the Postgres database on the new
> > unit. But I've lost a HUGE amount of trust in the LSI 9650-SE RAID
> > controller card.
> >
> > Thanks,
> > Craig
> >