Re: Suspected Postgres Datacorruption

From: Sumeet Jauhar <sumeet(dot)jauhar(at)gmail(dot)com>
To: Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>, bnicholson(at)hp(dot)com
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Suspected Postgres Datacorruption
Date: 2011-08-05 02:33:31
Message-ID: CAKN1XtroO6PnP6YjB4MLrYmLMKSZS+WuxMvmS9zdSCqdE06=wA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin pgsql-performance

Yes the very fact that we are using a very very old version of Postgres is
certainly causing alot of problems .

On Fri, Aug 5, 2011 at 2:51 AM, Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>wrote:

> On Wed, Aug 3, 2011 at 1:35 AM, Sumeet Jauhar <sumeet(dot)jauhar(at)gmail(dot)com>
> wrote:
> >
> >
> > Our application is running on Postgres 7.4.X . I agree that this is a
> very
> > old version of Postgres and we should have upgraded . The issue that we
> > faced is that
>
> Wow, that is a very old version. It has been out of maintenance for a
> long time. If there are data eating bugs in it they aren't gonna get
> fixed.
>
> [ Sumeet ] i plan to propose an upgrade soon . This data corruption issue
seems to be the best push / driver for me to go ahead and implement it .

> > 1 . There was a system crash due to a hardware failure .
> >
> > 2 . When the system came up , we tried to insert a few records into the
> > database . However at this point in time we saw that Postgres was taking
> a
> > lot of CPU & memory .
> >
> > Around 42% CPU consumption . This was a cause of concern .
> >
> > 3 . We re-indexed the database and it helped reduce the cpu & memory
> > consumption .
> >
> > My question is
> >
> > A ) Isn’t Postgres database resilient enough to handle hardware system
> > failure ? or it sometime results in a corrupt index for its tables ? I
> read
> > on the Postgres site that hardware failure can cause corrupt indexes .
> > Besides this are there any other scenario which may result in such
> > corruption .
>
> Depends on the hardware failure. If your RAID controller starts
> writing garbage to the drive array, how exactly should postgresql fix
> that? OTOH, if you just have a big boom and the power supply goes
> out, most the time you're fine. Of course, if the drive subsystem is
> lying about fsyncs, then postgresql can't guarantee your data anyway.
> So, it really depends on your hardware. Standard test to make sure
> your hardware is ok is to install postgresql, start a lot of
> transactions at once, and walk around back and pull the power plug.
> If it comes back up a half dozen times without errors you're probably
> ok, but hey, there could still be a corner case out there too. Bonus
> points if you initiate checkpoint that'll take a few minutes before
> you pull the plug, increasing the chance you'll find problems.
>
> With 7.4 there's a real likelihood that there are data loss bugs in
> there that have never been fixed and never will be.
>

[ Sumeet ] The scenario that you have pointed out . ie to go back and
unplug the powersupply while there are database operations going on seems a
good test case . I will do that and see what possibly happens . A faulty
RAID on the system is bound to cause problems . I agree . It will manifest
itself in someway .

>
> > B) If there has been improvement / enhancements done by Postgres
> regarding
> > the way it handles corrupt indexes can you please pass me more
> information
> > about the bug Id or some documentation on it ? Our application does not
> do
> > any REINDEXING . I am in a dilemma if we should seriously incorporate it
> in
> > our application .
>
> Of course, there's been lots of improvements since 7.4 But being a
> database when it encounters errors it tries not to guess too much
> about what you want. IS a reindex the right thing to do? Maybe,
> maybe not. That's the job of the DBA to figure out. Regular
> reindexing is not needed and if your particular machine does need it
> you need to figure out why and change it so that it's not needed. If
> indexes are getting corrupted, chances are so are tables and you'll
> notice too late.
>
> [ Sumee ] Thanks . i was of the opinion that re-indexing could be
incorporated as a precautionary thing everytime the system crashes . However
the hard part is to do it only when the system crashes . and the harder part
is to know that the system has actually crashed and its not a simple reboot
.
DBA should help me . WIll do that .

> > I ideally want to push to a higher version of Postgres . If I can prove
> that
> > there will be significant performance benefits and that crashes won’t
> occur
> > then I will be able to present a strong case .
>
> Hehe. It would be hard to NOT get significant performance
> improvements moving from 7.4 to 9.0. Heck our load on our production
> servers went from 12 to 3 or so when we went from 8.1 to 8.3. Saved
> us a ton on what we would have had to spend to keep 8.1 happy.
> Install a test version of 9.0 on a laptop, point your test servers at
> it, and watch it outrun your production database for 90% of everything
> you do.
>
> We run 8.3 and 8.4 in production and they are literally light years
> ahead of 7.4 in terms of stability, performance, and capabilities.
> Plus when you find a problem in one of them, it gets fixed, fast.
> They're still supported. Just that would be enough to justify an
> upgrade for me.
>

[ Sumeet ] ok so i agree we need to move ahead and shift to a higher
version . But how do we decide that . Which one would you say is the
stablest version of Postgres [ still supported version ] out in the market
below beacuse Brad here says his 8.1 version did have performance impacts
. Brad - How had you decide on the version . Was it the latest version
available at that point in time or there was someother reason ? I am also
pretty sure that upgrading 2 times would not have been easy .

In response to

Responses

Browse pgsql-admin by date

  From Date Subject
Next Message Scott Marlowe 2011-08-05 02:40:05 Re: Suspected Postgres Datacorruption
Previous Message Nicholson, Brad (Toronto, ON, CA) 2011-08-04 22:00:42 Re: Suspected Postgres Datacorruption

Browse pgsql-performance by date

  From Date Subject
Next Message Scott Marlowe 2011-08-05 02:40:05 Re: Suspected Postgres Datacorruption
Previous Message Steve Crawford 2011-08-04 23:47:19 Re: table size is bigger than expected