Re: unable to repair table: missing chunk number

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alex Krohn <alex(at)gossamer-threads(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: unable to repair table: missing chunk number
Date: 2002-04-19 20:32:22
Message-ID: 3766.1019248342@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Alex Krohn <alex(at)gossamer-threads(dot)com> writes:
>> If not, you'll have to identify exactly which tuple references the
>> trashed toast value and get rid of it.

> How do you correlate a tuple to a toast value? What is a toast value?
> (Sorry new, to postgres). =)

A TOAST value is a value of a particular field in a particular row
that's been pushed out-of-line because it's too big. (Bigger than
a kilobyte or so, probably, though I'm guessing since I don't know
how many columns in your table.) What you have to do is identify
which row and which field that is. Even if the toasted value is,
um, toast, you should be able to extract everything else.

> When I dump the table to disk I get records in sequence by primary key
> from 1 to 115847 before it crashes. So I then tried to SELECT * INTO
> newtable WHERE primary_key > 115847.

There could be some buffering going on (records read from database but
not yet emitted to file), but you seem to have the right idea.

> I tried increasing the value to see
> where it broke, but even with primary_key > 130000 (out of 135,000),
> it's always bad.

Do you always get the same toast value number mentioned? There could be
more than one trashed value.

A brute-force way to narrow things down would be to write a little
program that tries to retrieve each row individually by primary key,
starting at 115848 since you know the rows before that are okay.

> I'm not sure. I wasn't running VACUUM regularly which may have been the
> cause. The program was running smoothly for about two weeks, and then
> the application started failing for one or two users with that error in
> the SQL log.

That's disturbing; short of a serious failure (disk crash, for instance)
I don't know of anything that would cause this.

One thing that would be interesting to try is to investigate the TOAST
table directly. To do this, get your problem table's OID from pg_class:
select oid from pg_class where relname = 'your-table-name';
The TOAST table's name is then "pg_toast_OID", for instance
pg_toast_139777 if you find the table OID is 139777. If you inspect
this table you'll find it's got a structure like

regression=# \d pg_toast_139777
TOAST table "pg_toast_139777"
Column | Type
------------+---------
chunk_id | oid
chunk_seq | integer
chunk_data | bytea

The chunk_id is that magic number 12851102 that your error message is
complaining about. What would be interesting to know is what is
actually in the toast table for that ID. You might try something like
select chunk_seq, length(chunk_data) from pg_toast_139777
where chunk_id = 12851102 order by chunk_seq;

(You probably need to be database superuser to look at a toast table
directly like this.)

regards, tom lane

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Dave 2002-04-19 20:41:34 postgres startup script, long delay when auth = password
Previous Message Alex Krohn 2002-04-19 20:08:42 Re: unable to repair table: missing chunk number