Re: Serious Crash last Friday

From: "Henrik Steffen" <steffen(at)city-map(dot)de>
To: "Martijn van Oosterhout" <kleptog(at)svana(dot)org>
Cc: <pgsql-general(at)postgresql(dot)org>
Subject: Re: Serious Crash last Friday
Date: 2002-06-17 08:39:21
Message-ID: 039601c215da$7b06eb20$7100a8c0@topconcepts.net
Lists: pgsql-general


Hello,

I tried pgfsck on my corrupted employee table from Friday. It printed about
85 lines complaining about
"Tuple incorrect length (parsed data=xxxxxx, length=xxx)"

The table had 184 rows, so 85 of them were corrupt??

Then I ran pgfsck on today's employee table (after the new initdb etc.),
which also has 184 rows, and I got 814 (!!) lines complaining about
"Tuple incorrect length ...". How can this be???
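
To compare the two runs, the complaint lines can be tallied with a small script. This is only a sketch: it assumes the pgfsck output was saved as text and that the xxxxxx/xxx placeholders above stand for decimal integers. The function name and the sample lines are made up for illustration. It counts the complaints and the distinct (parsed data, length) pairs, which shows whether the same values repeat, but it does not explain why they occur.

```python
import re
from collections import Counter

# Matches the message quoted above; the numeric format is an assumption.
PATTERN = re.compile(
    r"Tuple incorrect length \(parsed data=(\d+), length=(\d+)\)"
)

def summarize_complaints(lines):
    """Return (total complaints, Counter of (parsed_data, length) pairs)."""
    pairs = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            pairs[(int(match.group(1)), int(match.group(2)))] += 1
    return sum(pairs.values()), pairs

# Hypothetical sample lines, not real pgfsck output.
sample = [
    "Tuple incorrect length (parsed data=120, length=96)",
    "Tuple incorrect length (parsed data=120, length=96)",
    "some unrelated line",
    "Tuple incorrect length (parsed data=88, length=64)",
]
total, pairs = summarize_complaints(sample)
print(total, len(pairs))
```

With a real run, the sample list would be replaced by the lines of the saved output file.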

Kind regards

Henrik Steffen
Managing Director

top concepts Internetmarketing GmbH
Am Steinkamp 7 - D-21684 Stade - Germany
--------------------------------------------------------
http://www.topconcepts.com Tel. +49 4141 991230
mail: steffen(at)topconcepts(dot)com Fax. +49 4141 991233
--------------------------------------------------------
24h-Support Hotline: +49 1908 34697 (EUR 1.86/Min,topc)
--------------------------------------------------------
System partners wanted: http://www.franchise.city-map.de
--------------------------------------------------------
Commercial register: AG Stade HRB 5811 - VAT ID: DE 213645563
--------------------------------------------------------

----- Original Message -----
From: "Martijn van Oosterhout" <kleptog(at)svana(dot)org>
To: "Henrik Steffen" <steffen(at)city-map(dot)de>
Cc: <pgsql-general(at)postgresql(dot)org>
Sent: Monday, June 17, 2002 9:43 AM
Subject: Re: [GENERAL] Serious Crash last Friday

> On Mon, Jun 17, 2002 at 08:43:37AM +0200, Henrik Steffen wrote:
> >
> > Hello all,
> >
> > on Friday we experienced a very very worrying crash of our postgresql
> > server.
>
> Sounds like the CTIDs are out of whack or something. If you're really
> desperate you can try the program here, it may be able to dump something.
> http://svana.org/kleptog/pgsql/pgfsck.html
>
> > Well, the crash was indicated as follows: One of my employees complained
> > that she couldn't work anymore (via webinterface). The error-message was
> > due to an error in the employee-table. This particular table has a unique
> > row for employee-numbers. Suddenly there were 11 entries for the same
> > employee. Even my name was included twice, and another employee still
> > working on friday afternoon was also included 3 times. Note: This was a
> > table with a UNIQUE KEY - this shouldn't be possible IMHO.
>
> What DB version is this? Could it be XID wraparound?
>
> > Taking a closer look, I found additional tables, with non-unique values in
> > UNIQUE columns.
> >
> > When trying to delete unique values by using the OIDs, I found out, that
> > even the OIDs were the same!!!! Taking a yet closer look, I found out by
> > querying pg_tables that there were duplicates of some tables. Then there
> > was the message: "Backend message type 0x44 arrived while idle"
>
> Try the CTIDs, they will be unique.
>
> > I was running VACUUM and VACUUM FULL a hundred times - but it failed to
> > repair these errors. It didn't even succeed in running VACUUM on all
> > tables: VACUUM complained something about "UNIQUE" (I didn't write down
> > the exact error message though).
>
> Please post the message exactly as printed out.
>
> > Then I tried to DUMP as much as I could, then I stopped the database,
> > moved the db-folder to a different location, did a new initdb and
> > restored the whole system. Unfortunately there was one table I couldn't
> > dump at all and I had to use the 15 hours old backup copy.
> >
> > But, please correct me if I am wrong, this should never actually happen,
> > shouldn't it?
>
> Never, that's why it would be helpful to know what went wrong.
>
> > Anyone had any of these problems before? I will see if this happens
> > again - and if it does I will have to think about using a different
> > backend-server. I don't have to explain to you, that a database server
> > that corrupts data, is completely useless.
>
> HTH,
> --
> Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> > There are 10 kinds of people in the world, those that can do binary
> > arithmetic and those that can't.
>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: if posting/reading through Usenet, please send an appropriate
> subscribe-nomail command to majordomo(at)postgresql(dot)org so that your
> message can get through to the mailing list cleanly
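
The CTID suggestion in the quoted reply rests on the fact that a row's physical identifier differs even when every user column and the OID are identical. The sketch below illustrates the idea of removing duplicates by physical row identifier. It is a stand-in, not the PostgreSQL procedure: it uses SQLite's rowid in place of PostgreSQL's ctid so that it runs without a server, and the table layout, column names, and function name are hypothetical. On PostgreSQL the corresponding queries would select and filter on the ctid system column instead.

```python
import sqlite3

def drop_duplicates_by_rowid(conn, table, key):
    """Keep the lowest-rowid row for each key value and delete the rest.

    SQLite's rowid stands in here for PostgreSQL's ctid (an assumption
    made only so the example runs without a server).
    """
    cur = conn.execute(
        f"DELETE FROM {table} WHERE rowid NOT IN "
        f"(SELECT MIN(rowid) FROM {table} GROUP BY {key})"
    )
    return cur.rowcount

# Hypothetical employee table with fully identical duplicate rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (emp_no INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?)",
    [(1, "A"), (1, "A"), (1, "A"), (2, "B")],
)

deleted = drop_duplicates_by_rowid(conn, "employee", "emp_no")
remaining = conn.execute(
    "SELECT emp_no, name FROM employee ORDER BY emp_no"
).fetchall()
print(deleted, remaining)
```

Table and column names are interpolated directly into the SQL here, so this form is only suitable for trusted, hard-coded identifiers.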
