Re: RAID 1 - drive failed - very slow queries even after drive replaced

From: Merrick <merrick(at)gmail(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: RAID 1 - drive failed - very slow queries even after drive replaced
Date: 2011-03-23 17:51:31
Message-ID: f04a26dd-fa2f-4f03-b897-f6778543b9e1@i35g2000prd.googlegroups.com
Lists: pgsql-general

Thank you Merlin, I had my suspicions about the hardware as well.

The backup server is blazing fast; it is definitely "time to ramble on..."

On Mar 23, 7:11 am, mmonc(dot)(dot)(dot)(at)gmail(dot)com (Merlin Moncure) wrote:
> On Wed, Mar 23, 2011 at 3:33 AM, Merrick <merr(dot)(dot)(dot)(at)gmail(dot)com> wrote:
> > Hi,
>
> > I am looking for some advice on where to troubleshoot after 1 drive in
> > a RAID 1 failed.
>
> > Thank you.
>
> > I am running v 7.41, I am currently importing the data to another
> > physical server running 8.4 and will test with that once I can. In the
> > meantime here is relevant info:
>
> > Backups used to take 25 minutes and now take 110 minutes. Before
> > replacing the drive, it became clear the backup was not going to finish,
> > since in 120 minutes it had only finished 200 MB of 2.8 GB.
>
> > Before replacing the drive:
> > -----------------------------------
> > We noticed all of the queries were slow, many taking over 100 seconds.
> > After we replaced the drives we noticed the queries are running 40
> > seconds or more and most are 8 seconds or more where the same query
> > used to take only 1 second. We have replaced a drive in this RAID 1
> > before and nothing like this happened. The schema was not touched for
> > at least 1 week prior to this.
>
> > Since replacing the drive I have:
> > -------------------------------------------
> > Restored from a backup a few hours before the queries became very
> > slow.
> > Reindex all tables
> > Vacuum all tables
> > Analyze all tables
>
> > Here is what I get with iostat:
>
> > iostat -k /dev/sda2
> > Linux 2.6.26-2-686-bigmem (db1)
> > avg-cpu:  %user   %nice %system %iowait  %steal   %idle
> >           19.61    0.00    8.34    1.60    0.00   70.45
>
> probably the replacement drive is bunk, or some esoteric hw problem is
> tripping you up.  some iostat numbers while you are having the problem
> would be more telling.  the solution is obvious -- in terms of this
> server, it's time to ramble on...
>
> merlin

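For anyone following the thread, here is a rough sketch of one way to capture the iostat numbers Merlin asks for while the slow queries are actually running. The function name `sample_disk`, the 5-second interval, and the 12-sample count are my own placeholders, not something from the thread. It assumes iostat from the sysstat package, and the /proc/mdstat check only applies if the array is Linux software RAID (md), which the thread does not say.

```shell
# Hypothetical helper: sample extended disk statistics for one device.
# Usage: sample_disk [device] [interval_seconds] [count]
sample_disk() {
    device="${1:-/dev/sda2}"   # partition shown in the original post
    interval="${2:-5}"         # seconds between samples (arbitrary default)
    count="${3:-12}"           # number of samples (arbitrary default)

    if ! command -v iostat >/dev/null 2>&1; then
        echo "iostat not found (it ships with the sysstat package)" >&2
        return 1
    fi

    # -x: extended per-device statistics, -k: report in kilobytes per second.
    # The first report is an average since boot; later ones cover each interval.
    iostat -x -k "$device" "$interval" "$count"

    # Assumption: software RAID (md). Hardware controllers need their vendor tool.
    if [ -r /proc/mdstat ]; then
        cat /proc/mdstat
    fi
}
```

Running `sample_disk /dev/sda2 5 12` during a slow backup or query would give about a minute of per-interval numbers, which should say more than a single since-boot average.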