Re: (Again) Datacorruption using 7.4.2 on XFS/raid1

From: Ian Barwick <barwick(at)gmail(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: (Again) Datacorruption using 7.4.2 on XFS/raid1
Date: 2004-07-12 20:38:43
Message-ID: 1d581afe040712133843b5d325@mail.gmail.com
Lists: pgsql-general

On Mon, 12 Jul 2004 20:31:15 +0200, Florian G. Pflug <fgp(at)phlo(dot)org> wrote:
> Hi
>
> We have again experienced data-corruption using 7.4.2 on an XFS Filesystem
> on top of a software-raid (md) raid-1.
>
> After a server crash last night (It was a rather strange crash - The machine
> was still pingable, but no login was possible, and postgres and apache
> didn't respond to requests any more) we hard-reset the machine. It came up
> again nicely, but a few hours later the following errors occurred when trying
> to access certain tables. (Those tables are updated heavily - each day about
> 2 million tuples are inserted, and the old versions of those tuples
> deleted).
>
> ERROR: could not access status of transaction 34048
> DETAIL: could not open file "/var/lib/postgres/data/pg_clog/0000": No such
> file or directory

You don't say what kind of disks you are using. Sounds very much like
hardware problems though.

I had a PostgreSQL installation on a pair of IDE disks with software
RAID1 / Ext3 die very nastily with similar error messages. Turned out
that one of the disks was very defective and the RAID wasn't handling
it.

On the other hand - after copying the files from the good disk,
PostgreSQL started with barely a complaint and I couldn't detect any
corruption.
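
For anyone wanting to check whether md has already dropped a disk from
an array, here is a rough sketch that parses the text of /proc/mdstat.
The function name and the sample text are made up for illustration and
are not taken from either machine discussed here. It only catches the
case where md has marked a member as missing; a disk that silently
returns bad data, as mine apparently did, would not show up this way.

```python
import re


def degraded_arrays(mdstat_text):
    """Return names of md arrays that report at least one missing member.

    Hypothetical helper: expects text in the usual /proc/mdstat layout,
    where an array line ("md0 : active raid1 ...") is followed by a
    status line ending in something like "[2/1] [U_]".
    """
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"^(md\d+)\s*:", line)
        if header:
            current = header.group(1)
            continue
        # '_' in the final bracket marks a member that is missing or failed.
        status = re.search(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]", line)
        if status and current and "_" in status.group(3):
            degraded.append(current)
    return degraded


# Illustrative sample only, not output from a real machine.
SAMPLE = """Personalities : [raid1]
md0 : active raid1 sdb1[1] sda1[0]
      104320 blocks [2/2] [UU]
md1 : active raid1 sda2[0]
      39961472 blocks [2/1] [U_]
unused devices: <none>
"""

if __name__ == "__main__":
    print(degraded_arrays(SAMPLE))
```

On a Linux box the same function could be fed the contents of
/proc/mdstat instead of the sample string.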

Ian Barwick
