From: | Michael Fuhr <mike(at)fuhr(dot)org> |
---|---|
To: | Noel Faux <noel(dot)faux(at)med(dot)monash(dot)edu(dot)au> |
Cc: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: Data corruption zero a file - help!! |
Date: | 2006-03-09 06:15:46 |
Message-ID: | 20060309061546.GA60778@winnie.fuhr.org |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
On Thu, Mar 09, 2006 at 03:57:46PM +1100, Noel Faux wrote:
> Given that this seems problem has occurred a number of times for a
> number I've written a small step by step procedure to address this
> issue. Is there any other comments you which to add. I was thinking
> that this should be added to the FAQ / troubleshooting in the docs.
>
> How to repair corrupted data due to "ERROR: invalid page header in block
> X of relation "Y":
The word "repair" might be misleading. The operation repairs the
table in a sense, but as the following caution points out it does
so by completing the destruction that something else began.
> CAUTION this will permanently remove the data defined in the bad block
>
> 1. To identify which file(s) the relation is in:
> 1. <THE SELECT STATEMENTS TO COLLECT THIS DATA> I can't
> remember how I did it, but will keep hunting for my notes :)
Hint: pg_class.relfilenode
http://www.postgresql.org/docs/8.1/interactive/catalog-pg-class.html
> 2. so the file(s) are $PGDATA/<databaseDIR>/<relFile>.* The *
> is any number which is defined below
You might want to show how to determine <databaseDIR>. The actual
location might not be under $PGDATA -- 8.0 and later have tablespaces
and earlier versions support alternative locations, so instructions
should account for that. Also, relations smaller than 1G won't have
any .N files.
http://www.postgresql.org/docs/8.1/interactive/manage-ag-tablespaces.html
http://www.postgresql.org/docs/8.1/interactive/storage.html
http://www.postgresql.org/docs/7.4/interactive/manage-ag-alternate-locs.html
> 2. To calculate the * value:
> 1. SELECT <block> / 131072 AS filenum, <block> % 131072 AS blocknum;
> filenum | blocknum
> -----------+----------------
> <filenum> | <blocknum>
> 1. 131072 comes from "each database file is 1G, or 131072
> * 8k blocks"
The 1G figure obviously applies only to tables that require that
much space. If <filenum> comes back zero then you'd use the file
without any .N suffix. If the bad block is less than 131072 (or
however many other-than-8k blocks fit in 1G) then you needn't bother
with the calculation.
> 3. Now you need to re-zero this block using the following command:
> 1. dd bs=8k seek=<blocknum> conv=notrunc count=1 if=/dev/zero
> of=$PGDATA/base/<databaseDIR>/<relFile>.<filenum>
I'd recommend testing the command on a throwaway file before working
with real data -- "measure twice, cut once" as it were. To gain
confidence in what you're doing you could create a test table,
populate it with data, corrupt its data file, then zero its bad
blocks until you can select all of the remaining data. Playing
around in a production database is probably a bad idea; a safer way
would be to initdb a test cluster and run a separate postmaster
(listening on a different port if you're on the same machine as the
real database).
It's probably best to shut down the postmaster while you're mucking
around with the data files.
> 1. Before you do this it is best to backup the block:
> "dd bs=8k skip=<blocknum> count=1 if=/path/file | hd"
This command doesn't back up the block, it pipes the block into a
command that on some systems will display a hex and ASCII dump of
the data (some systems will require a command other than hd). You
could back up the block by redirecting the dd output to a file
instead of piping it into another command.
Incidentally, I was looking at your web site and your project
might make an interesting case study for the PostgreSQL web site
(Community -> In The Real World -> Case studies).
http://www.postgresql.org/about/casestudies/
Some users and potential users might be interested in reading about
how you're using PostgreSQL with a 100G+ database. Post a message
to pgsql-www if you'd be interested in providing a write-up.
--
Michael Fuhr
From | Date | Subject | |
---|---|---|---|
Next Message | levi godinez | 2006-03-09 07:39:24 | double-quoted field names in pgadmin |
Previous Message | Paul Newman | 2006-03-09 05:57:04 | Re: Triggers and Multiple Schemas. |