Re: Issue on restore / recover

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: flumbador(at)virgilio(dot)it
Cc: pgsql-admin(at)lists(dot)postgresql(dot)org
Subject: Re: Issue on restore / recover
Date: 2018-01-02 15:16:25
Message-ID: 20180102151625.GX2416@tamriel.snowman.net
Lists: pgsql-admin

Greetings,

* flumbador(at)virgilio(dot)it (flumbador(at)virgilio(dot)it) wrote:
> I have restored and recovered a PostgreSQL 9.4.9 instance from a hot backup. The backup is a filesystem copy taken while PostgreSQL is in backup mode (that is, between start and stop backup).
>
> During the restore, 3 files were missing; these three files belong to a table with a high transaction workload, and during the backup many transactions certainly modified this table and those missing files. What surprises me is that even though the files were missing, the recovery phase ended successfully. I expected an error (for example, file not found) to be raised when PostgreSQL tried to apply the WAL entries related to this table and those files. After the recovery I found that these three files had been created during recovery, but when I try to query the table I get the error:
>
> db4=# select count(*) from pgbench_accounts ;
>
> ERROR: could not read block 1999996 in file "pg_tblspc/16471/PG_9.4_201409291/16474/16593.15": read only 0 of 8192 bytes
>
> This error confirms that my database is corrupted! The question is: why doesn't PostgreSQL throw any errors during the recovery phase? I think it is better to know immediately that we have restored our database from a corrupted backup rather than discover the issue, maybe a long time later, when a query is executed on the corrupted table.

How are you handling WAL archiving? Based on the symptoms, my first
guess is that you aren't putting the backup_label file in place (or
you're removing it) before PG starts doing WAL replay. That makes PG
think it's just doing crash recovery, so it does *some* WAL replay but
not *all* that it needs to do (it needs to start farther back, where
the backup started), which can certainly result in a corrupt restore.
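
As a rough illustration of that check, here is a minimal sketch of a
pre-start sanity test on a restored data directory. It only assumes that
a backup taken between start and stop backup leaves a backup_label file
in the data directory, and that the file carries a "START WAL LOCATION"
line naming the WAL segment where replay must begin. The function name
backup_start_wal and the example path are hypothetical, chosen for
illustration; this is not part of PostgreSQL or of any backup tool.

```python
import re
from pathlib import Path
from typing import Optional

# First line of backup_label, for example:
# START WAL LOCATION: 0/2000028 (file 000000010000000000000002)
START_WAL_RE = re.compile(
    r"^START WAL LOCATION: (\S+) \(file ([0-9A-F]{24})\)"
)


def backup_start_wal(data_dir: str) -> Optional[str]:
    """Return the WAL segment named in backup_label, or None if absent.

    None means the restored directory has no backup_label, so the server
    would treat startup as plain crash recovery rather than replaying
    from the point where the backup started.
    """
    label = Path(data_dir) / "backup_label"
    if not label.is_file():
        return None
    for line in label.read_text().splitlines():
        match = START_WAL_RE.match(line)
        if match:
            return match.group(2)
    # The file exists but does not look like a backup_label at all.
    raise ValueError(f"no START WAL LOCATION line found in {label}")
```

A restore script could call backup_start_wal("/path/to/restored/data")
before starting the server and refuse to continue if it returns None, or
if the returned segment is not available from the WAL archive.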

I'd strongly suggest you look at the various backup tools which exist
for PostgreSQL and know how to properly do backups and restores instead
of trying to roll your own. Personally, I'd suggest pgBackRest but
there are other tools out there such as barman and WAL-E/G that you
might consider.

Thanks!

Stephen
