Re: Suspected Postgres Datacorruption

From: Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>
To: Sumeet Jauhar <sumeet(dot)jauhar(at)gmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Suspected Postgres Datacorruption
Date: 2011-08-04 21:21:57
Message-ID: CAOR=d=3XeyRaf+rvzaB+9UeMQRYzj-0B9cgp_vo5BXD-SKMnuQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin pgsql-performance

On Wed, Aug 3, 2011 at 1:35 AM, Sumeet Jauhar <sumeet(dot)jauhar(at)gmail(dot)com> wrote:
>
>
> Our application is running on Postgres 7.4.X . I agree that this is a very
> old version of Postgres and we should have upgraded . The issue that we
> faced is that

Wow, that is a very old version. It has been out of maintenance for a
long time. If there are data eating bugs in it they aren't gonna get
fixed.

> 1 . There was a system crash due to a hardware failure .
>
> 2 . When the system came up , we tried to insert a few records into the
> database . However at this point in time we saw that Postgres was taking a
> lot of CPU & memory .
>
> Around 42% CPU consumption . This was a cause of concern .
>
> 3 . We re-indexed the database and it helped reduce the cpu & memory
> consumption .
>
> My question is
>
> A ) Isn’t Postgres database resilient enough to handle hardware system
> failure ? or it sometime results in a corrupt index for its tables ? I read
> on the Postgres site that hardware failure can cause corrupt indexes .
> Besides this are there any other scenario which may result in such
> corruption .

Depends on the hardware failure. If your RAID controller starts
writing garbage to the drive array, how exactly should postgresql fix
that? OTOH, if you just have a big boom and the power supply goes
out, most the time you're fine. Of course, if the drive subsystem is
lying about fsyncs, then postgresql can't guarantee your data anyway.
So, it really depends on your hardware. Standard test to make sure
your hardware is ok is to install postgresql, start a lot of
transactions at once, and walk around back and pull the power plug.
If it comes back up a half dozen times without errors you're probably
ok, but hey, there could still be a corner case out there too. Bonus
points if you initiate checkpoint that'll take a few minutes before
you pull the plug, increasing the chance you'll find problems.

With 7.4 there's a real likelihood that there are data loss bugs in
there that have never been fixed and never will be.

> B) If there has been improvement / enhancements done by Postgres regarding
> the way it handles corrupt indexes can you please pass me more information
> about the bug Id or some documentation on it ? Our application does not do
> any REINDEXING . I am in a dilemma if we should seriously incorporate it in
> our application .

Of course, there's been lots of improvements since 7.4 But being a
database when it encounters errors it tries not to guess too much
about what you want. IS a reindex the right thing to do? Maybe,
maybe not. That's the job of the DBA to figure out. Regular
reindexing is not needed and if your particular machine does need it
you need to figure out why and change it so that it's not needed. If
indexes are getting corrupted, chances are so are tables and you'll
notice too late.

> I ideally want to push to a higher version of Postgres . If I can prove that
> there will be significant performance benefits and that crashes won’t occur
> then I will be able to present a strong case .

Hehe. It would be hard to NOT get significant performance
improvements moving from 7.4 to 9.0. Heck our load on our production
servers went from 12 to 3 or so when we went from 8.1 to 8.3. Saved
us a ton on what we would have had to spend to keep 8.1 happy.
Install a test version of 9.0 on a laptop, point your test servers at
it, and watch it outrun your production database for 90% of everything
you do.

We run 8.3 and 8.4 in production and they are literally light years
ahead of 7.4 in terms of stability, performance, and capabilities.
Plus when you find a problem in one of them, it gets fixed, fast.
They're still supported. Just that would be enough to justify an
upgrade for me.

In response to

Responses

Browse pgsql-admin by date

  From Date Subject
Next Message Nicholson, Brad (Toronto, ON, CA) 2011-08-04 21:47:09 Re: Suspected Postgres Datacorruption
Previous Message Kevin Grittner 2011-08-04 15:21:32 Re: my table is empty, pg_relation_size(name) return 38 MB

Browse pgsql-performance by date

  From Date Subject
Next Message Kevin Grittner 2011-08-04 21:36:28 Re: Postgres 8.4 memory related parameters
Previous Message Kevin Grittner 2011-08-04 21:16:56 Re: Tsearch2 - bad performance with concatenated ts-vectors