Quick Links

Re: Lost rows/data corruption?

From:	"Andrew Hall" <temp02(at)bluereef(dot)com(dot)au>
To:	"Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	<pgsql-general(at)postgresql(dot)org>
Subject:	Re: Lost rows/data corruption?
Date:	2005-02-15 04:05:50
Message-ID:	008001c51313$a2d2d680$5001010a@bluereef.local
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-general

Tom,

Here is the data you requested. It took little while to gather it as this
kind of corruption doesn't happen all the time.

The first sign that we know something is wrong is our application freezing
when communicating with the DB.

If we then issue a vacuum, here's the output:

sonar=# VACUUM FULL;
WARNING: index "user_session_pkey" contains 85613 row versions, but table
contains 85513 row versions
HINT: Rebuild the index with REINDEX.
WARNING: index "user_session_pkey" contains 85613 row versions, but table
contains 85513 row versions
HINT: Rebuild the index with REINDEX.
VACUUM

If we the try a reindex, here's the output:

sonar=# REINDEX TABLE user_session;
ERROR: could not create unique index
DETAIL: Table contains duplicated values.

There are duplicates and massive amounts of erroneous data in most columns
in the table. From here, no amount of row deletion seems to clear the issue,
the only remedy is a truncate on the table and repopulate. In fact if we try
even a select we get this:

WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back the
current transaction and exit, because another server process exited
abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and
repeat your command.
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.

Here is the schema for the table:

sonar=# \d user_session;
Table "public.user_session"
Column | Type |
Modifiers

I'm most concerned that a primary constraint is being actively violated, and
I don't understand how this is possible.

Any help appreciated.
----- Original Message -----
From: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Andrew Hall" <temp02(at)bluereef(dot)com(dot)au>
Cc: <pgsql-general(at)postgresql(dot)org>
Sent: Friday, February 04, 2005 10:12 AM
Subject: Re: [GENERAL] Lost rows/data corruption?

> "Andrew Hall" <temp02(at)bluereef(dot)com(dot)au> writes:
>> We have a long running DB application using PG7.4.6. We do a VACUUM FULL
>> every night and a normal 'maintenance' VACUUM every hour. We do nothing
>> with
>> any indexes. Every now and then we get errors from the database whereby
>> an
>> update will fail on a table saying that there is duplicate violation on a
>> primary keyed row. Theoretically this is impossible as the constraint
>> should
>> prevent any duplication, ever. When the next vacuum is run, we see an
>> error
>> saying that there are more rows than the database can account for (or
>> words
>> to that effect) and we should either REINDEX or TRUNCATE the table.
>
> Could we see the exact log trace, rather than handwaving? Also the
> schemas of the table(s)/index(es) involved might be useful.
>
> regards, tom lane

In response to

Re: Lost rows/data corruption? at 2005-02-03 23:12:12 from Tom Lane

Responses

Re: Lost rows/data corruption? at 2005-02-15 04:25:41 from Tom Lane

Browse pgsql-general by date

	From	Date	Subject
Next Message	Tom Lane	2005-02-15 04:25:41	Re: Lost rows/data corruption?
Previous Message	Bruce Momjian	2005-02-15 03:30:35	Re: Bug in COPY from CSV?