Re: Trying to recover a corrupted database

From: Scott Whitney <scott(at)journyx(dot)com>
To: John Scalia <jayknowsunix(at)gmail(dot)com>
Cc: pgsql-admin(at)postgresql(dot)org
Subject: Re: Trying to recover a corrupted database
Date: 2014-07-17 19:16:35
Message-ID: 984033755.512124.1405624595840.JavaMail.zimbra@mail.int.journyx.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

a) How many databases do you have?
b) Have you tried to pg_dump the database(s) to see if that succeeds?

If you _can_ pg_dump (which you might or might not be able to do), you could re-init the cluster and restore.

----- Original Message -----

> Hi all,

> You may have seen my post from yesterday about our production database
> getting corrupted. Well, this morning we brought the system down to single
> user and ran an fsck which did
> report some drive errors. We repeated until no additional errors were
> reported. Then, we brought the system back to multi-user status and ran a
> successful pg_basebackup on the
> broken database. Since then we restarted the database and a ps -ef result
> looks like:

> /usr/pgsql-9.2/bin/postmaster -D /opt/datacenter -o -c
> zero_damaged_pages=true -i -N 384 -p 5431

> After the Db started up, we ran a VACUUM FULL ANALYZE which ran for about 3
> hours, but the database is still showing the same type of errors in its log:
> invalid page header in
> block 29718... etc. What disturbed me a little, is that I don't think the
> zero_damaged_pages got applied. Checking the pg_settings table, we got:

> select name, setting, boot_val, reset_val from pg_settings where name =
> 'zero_damaged_pages';
> name | setting | boot_val | reset_val
> ---------------------------------------------------------------------------------
> zero_damaged_pages | on | off | on

> Now, my colleague ran this after he tried running some operations again after
> I told him how to set zero_damaged_pages again. He swears that that it was
> on when the first VACUUM
> FULL ANALYZE was run, but I'm not as sure. Plus, I don't understand why the
> boot_val shows as off. In any event, as we're still getting log errors like
> before, I don't really know
> what to try next other than rerunning the VACUUM FULL again. Help?
> --
> Jay

> --
> Sent via pgsql-admin mailing list (pgsql-admin(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-admin

In response to

Responses

Browse pgsql-admin by date

  From Date Subject
Next Message John Scalia 2014-07-17 19:24:47 Re: Trying to recover a corrupted database
Previous Message John Scalia 2014-07-17 19:10:16 Trying to recover a corrupted database