Re: Postgres point-in-time recovery failure

From: Cheryl Grant <cheryl(dot)grant(at)aapt(dot)com(dot)au>
To: Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>
Cc: "pgsql-admin(at)postgresql(dot)org" <pgsql-admin(at)postgresql(dot)org>
Subject: Re: Postgres point-in-time recovery failure
Date: 2013-02-26 23:18:30
Message-ID: CAHQT=kkMda6_UTx1wCX3FjCsM3FVOYqouyqtyh8xO3qTan=+nw@mail.gmail.com
Lists: pgsql-admin

I'm running pg_basebackup with -x to create the instance, so some of the
WAL files are already in the pg_xlog directory after the backup. Recovery
always falls over with the same error on the first log file. I've tried
this numerous times with different backups, and the result is the same.

I've used the same method to create a hot standby, which works, but only
because streaming replication is getting the data across. That won't
help in a disaster recovery situation.

My backup command for the primary's WAL files is a script. Here are the
contents of the script:

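# Copy every file in pg_xlog that is not yet in $LOGPATH, then ship it
# to the remote host.
ls -1 "$PGDATA/pg_xlog" | while read -r f; do
    # Skip directories such as archive_status.
    if [ -f "$PGDATA/pg_xlog/$f" ]; then
        # Only copy files that have not been copied before.
        if [ ! -f "$LOGPATH/$f" ]; then
            echo "$PGDATA/pg_xlog/$f" >> "$LOGFILE"
            cp "$PGDATA/pg_xlog/$f" "$LOGPATH"
            status=$?
            echo "status=$status" >> "$LOGFILE"
            # The remote copy runs in the background and is not checked.
            scp "$LOGPATH/$f" "$SCPHOST:$LOGPATH" &
        fi
    fi
done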
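
For comparison, here is a minimal sketch of an archive script that copies
only the single segment the server hands over, rather than everything in
pg_xlog. It assumes it would be called from archive_command with %p and %f;
archive_wal is a hypothetical helper name and the destination directory is
an assumption, not my current setup.

```shell
# Sketch only: archive exactly one WAL segment passed in by the server.
# Possible use in postgresql.conf (path is hypothetical):
#   archive_command = '/path/to/archive_wal.sh %p %f'
archive_wal() {
    src=$1        # %p: path of the segment to archive
    name=$2       # %f: file name of the segment
    destdir=$3    # archive directory, e.g. $LOGPATH

    # Never overwrite a segment that is already in the archive.
    if [ -e "$destdir/$name" ]; then
        echo "archive_wal: $destdir/$name already exists" >&2
        return 1
    fi

    # Copy under a temporary name, then rename, so a reader of the
    # archive never picks up a half-written segment.
    cp "$src" "$destdir/$name.tmp" || return 1
    mv "$destdir/$name.tmp" "$destdir/$name" || return 1
}
```

A wrapper script would call archive_wal "$1" "$2" "$LOGPATH" and exit with
its return status, so the server only treats the segment as archived when
the copy succeeded.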

On 26 February 2013 19:38, Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at> wrote:

> Cheryl Grant wrote:
> > Hi, I'm trying to test restoration of a database using point-in-time
> > recovery. I'm taking a backup of the database using pg_basebackup:
> > pg_basebackup -D /postgres/data -Fp -l RestorePostgres -U reco -w -h radmast01 -p 5432
> > Then attempting to recover the backup on a second server using the
> > following recovery.conf settings:
> > restore_command = 'cp /apps/postgres/backup/WAL/%f %p'
> > recovery_target_time = '2013-02-26 12:53:00'
> > recovery_target_inclusive=true
> > Every time I start the recovery I get the following error in the log file
> > and the instance crashes:
> > 2844LOG: database system was interrupted; last known up at 2013-02-26 12:46:56 EST
> > 2844LOG: creating missing WAL directory "pg_xlog/archive_status"
> > 2844LOG: starting point-in-time recovery to 2013-02-26 12:53:00+11
> > 2844LOG: restored log file "000000010000017D00000056" from archive
> > 2844LOG: unexpected pageaddr 17D/2E000000 in log file 381, segment 86, offset 0
> > 2844LOG: invalid checkpoint record
> > 2844FATAL: could not locate required checkpoint record
> > 2844HINT: If you are not restoring from a backup, try removing the file "/apps/postgres/data/backup_label".
> > 2825LOG: startup process (PID 2844) exited with exit code 1
> > 2825LOG: aborting startup due to startup process failure
>
> That indicates that the WAL file 000000010000017D00000056 is
> broken. Are you sure that it is from the PostgreSQL server
> you backed up? How did you archive the WAL files?
>
> Yours,
> Laurenz Albe
>

--

*Cheryl Grant*

Senior Development DBA

T: 02 9009 3050 (extn 65050)

M: 0404 083 591

E: cheryl(dot)grant(at)aapt(dot)com(dot)au

W: aapt.com.au <http://www.aapt.com.au/>

Level 20, 680 George St
Sydney NSW 2000

This communication, including any attachments, is confidential. If you are not the intended
recipient, you should not read it - please contact me immediately, destroy it, and do not
copy or use any part of this communication or disclose anything about it.
