Re: Recovery of corrupted database

From: Bear Giles <bgiles(at)coyotesong(dot)com>
To: Rikardo Tinauer <rikardo(dot)tinauer(at)eba(dot)si>
Cc: pgsql-admin <pgsql-admin(at)postgresql(dot)org>
Subject: Re: Recovery of corrupted database
Date: 2017-07-19 16:01:41
Message-ID: CALBNtw5t-ox78fv3uU4hT++_Mt9udxzBkzDe6wR6fM7EBam_Tg@mail.gmail.com
Lists: pgsql-admin

Take this with a huge grain of salt, but you can't safely back up the
database files from a running server with a plain file copy - too much is
cached in memory. You must shut it down first. That's one reason the usual
advice is to explicitly exclude the database directory from backups and
perform backups using pg_dump and pg_restore.
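
For what it's worth, here is a minimal sketch of what I mean by a
pg_dump / pg_restore backup, written as a small Python wrapper. The
database and file names are placeholders I made up, and it assumes the
PostgreSQL client tools are on the PATH; treat it as an illustration, not
a tested backup procedure.

```python
import subprocess


def dump_command(dbname, outfile):
    # pg_dump in custom archive format, which pg_restore can read back
    return ["pg_dump", "--format=custom", "--file=" + outfile, dbname]


def restore_command(dbname, infile):
    # pg_restore loads the archive into an already existing database
    return ["pg_restore", "--dbname=" + dbname, infile]


def run(cmd):
    # check=True raises CalledProcessError if the tool exits non-zero
    subprocess.run(cmd, check=True)


# Hypothetical usage (names are placeholders):
#   run(dump_command("mydb", "mydb.dump"))
#   run(restore_command("mydb_restored", "mydb.dump"))
```

The point is only that the dump is taken through the server, so it does not
depend on the state of the files on disk.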

There's a big difference between restoring from backups and recovering from
a crash. The latter is abrupt - everything is caught in a moment in time.
Backups take time, so files will be backed up at slightly different times.
That means transactions may be caught midway, e.g., between writing to the
data file and updating the log. That makes it much harder to recover.
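
A toy model of the timing problem, in Python. This is not how PostgreSQL
stores anything; the "log" and "data" lists are stand-ins I invented to
show how copying two files at different moments can produce a pair that
never existed together on the server.

```python
def simulate_backup():
    # Toy "server" state: both files list the transactions they contain
    log, data = [1], [1]

    backup_log = list(log)    # the backup copies the log first

    # A transaction is written while the backup is still running
    log.append(2)
    data.append(2)

    backup_data = list(data)  # the data file is copied a moment later
    return backup_log, backup_data


def is_consistent(backup_log, backup_data):
    # In this toy model the copies are consistent only if they agree
    return backup_log == backup_data


backup_log, backup_data = simulate_backup()
print(backup_log, backup_data, is_consistent(backup_log, backup_data))
```

A crash, by contrast, would freeze both lists at the same moment.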

The encryption probably had the same effect as backing up a running
database. It didn't capture anything cached in memory and it wasn't a clean
snapshot at a single instant. The results aren't a valid image.

This seems especially likely since you're seeing a problem in block 0. I
don't know about PostgreSQL but it's common to store the index to other
information in the first block and it's always kept in memory with only
periodic flushes to disk.
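
If you want to look at that block yourself, here is a rough Python sketch
that reads the 24-byte page header described in the PostgreSQL "Database
Page Layout" documentation and applies a few sanity checks. It assumes the
default 8192-byte block size and a little-endian machine, and the checks
are only loosely modeled on what the server does (the server also accepts
an all-zero page and can verify checksums, which this does not). Run it
against a copy of the file, never the original.

```python
import struct

PAGE_SIZE = 8192  # assumption: default block size

# pd_lsn (two uint32), pd_checksum, pd_flags, pd_lower, pd_upper,
# pd_special, pd_pagesize_version (uint16 each), pd_prune_xid (uint32).
# Assumption: little-endian layout, 24 bytes in total.
HEADER = struct.Struct("<IIHHHHHHI")


def parse_page_header(page):
    (xlogid, xrecoff, checksum, flags, lower, upper,
     special, size_version, prune_xid) = HEADER.unpack_from(page)
    return {
        "lsn": (xlogid << 32) | xrecoff,
        "checksum": checksum,
        "flags": flags,
        "lower": lower,
        "upper": upper,
        "special": special,
        "page_size": size_version & 0xFF00,
        "layout_version": size_version & 0x00FF,
        "prune_xid": prune_xid,
    }


def header_looks_sane(page):
    # Rough check: stored page size matches, and the offsets are ordered
    h = parse_page_header(page)
    return (h["page_size"] == PAGE_SIZE
            and HEADER.size <= h["lower"] <= h["upper"]
            <= h["special"] <= PAGE_SIZE)


# Hypothetical usage on a copy of the relation file from the error message:
#   with open("copy_of_datadir/global/12369", "rb") as f:
#       print(parse_page_header(f.read(PAGE_SIZE)))
```

If the header fields come back as obvious garbage, that would fit the idea
that the decrypted file is not what the server originally wrote.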

I'm sure others will be able to give you better advice on what, if
anything, you can do. I'm not very hopeful though - I hope they did regular
and frequent backups.

(Aside: would this be a use-case for a replicated read-only server? Or
would the primary database crash in a way that would also corrupt the
secondary server?)

On Wed, Jul 19, 2017 at 7:27 AM, Rikardo Tinauer <rikardo(dot)tinauer(at)eba(dot)si>
wrote:

> Our client got a virus on the database server that encrypted the database.
> They paid and got the files decrypted, but the database wasn’t able to start.
>
> We got the following error:
> 2017-07-19 10:31:11 CEST LOG: database system was shut down in recovery
> at 2017-07-19 10:29:00 CEST
> 2017-07-19 10:31:11 CEST LOG: invalid magic number B549 in log segment
> 000000010000002E00000015, offset 0
> 2017-07-19 10:31:11 CEST LOG: invalid primary checkpoint record
> 2017-07-19 10:31:11 CEST LOG: invalid magic number B549 in log segment
> 000000010000002E00000015, offset 0
> 2017-07-19 10:31:11 CEST LOG: invalid secondary checkpoint record
> 2017-07-19 10:31:11 CEST PANIC: could not locate a valid checkpoint record
> 2017-07-19 10:31:11 CEST LOG: shutdown at recovery target
> 2017-07-19 10:31:11 CEST LOG: shutting down
> 2017-07-19 10:31:11 CEST LOG: database system is shut down
>
>
> We got past that by executing the “pg_resetxlog -f DATADIR” command.
>
> Now the database starts but we cannot connect to it; we get the following
> error:
> 2017-07-19 14:11:05 CEST ERROR: invalid page in block 0 of relation
> global/12369
>
> Can anyone help?
>
> Regards,
> Rikardo Tinauer
> ----------------------------------------------------------
> EBA, d.o.o., Ljubljana
> Litostrojska 40, SI-1000 Ljubljana
> Tel: +386 (0)590 98 890
> Email: rikardo(dot)tinauer(at)eba(dot)si
> Web: http://www.ebadms.com
> -----------------------------------------------------------
>
>
