From: | "Mark Steben" <msteben(at)autorevenue(dot)com> |
---|---|
To: | "'Lee Azzarello'" <lee(at)dropio(dot)com>, <pgsql-admin(at)postgresql(dot)org> |
Subject: | Re: recovery question |
Date: | 2009-02-25 19:31:07 |
Message-ID: | 928EDD41859E4D02B913C86A3E6FD79E@dei26g028534 |
Lists: | pgsql-admin |
Hi Lee, just got your reply.
Every segment comes over compressed (gzip), so every segment is a different
size in the compressed folder. But we decompress each one into another folder
(gzip -dc), and they always decompress to the standard 16 MB size by the time
we copy them back into xlog.
So 0000000100000C28000000B1 came into xlog as 16777216 bytes, just like the
others.
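A size check of the kind described above can be sketched as follows. This is a minimal sketch, not the script actually in use on these machines: the function name `is_full_segment` and the variable `WAL_SEGMENT_BYTES` are made up for illustration, and the only fact taken from this thread is the 16777216-byte segment size.

```shell
# Hypothetical helper, not part of copy.sh: succeed only if the given file
# is exactly one full WAL segment long.
WAL_SEGMENT_BYTES=16777216   # 16 * 1024 * 1024, the size quoted in this thread

is_full_segment() {
    # A missing file is reported with a distinct status (2).
    [ -f "$1" ] || return 2
    # "wc -c < file" prints only the byte count; tr strips the padding
    # that some wc implementations add around the number.
    size=$(wc -c < "$1" | tr -d ' ')
    [ "$size" -eq "$WAL_SEGMENT_BYTES" ]
}
```

It could be called as `is_full_segment /some/dir/0000000100000C28000000B1 && echo full`, where the directory is whatever folder holds the decompressed segment.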
Thanks for the response.
Mark Steben│Database Administrator│
@utoRevenue-R- "Join the Revenue-tion"
95 Ashley Ave. West Springfield, MA., 01089
413-243-4800 x1512 (Phone) │ 413-732-1824 (Fax)
@utoRevenue is a registered trademark and a division of Dominion Enterprises
-----Original Message-----
From: Lee Azzarello [mailto:lee(at)dropio(dot)com]
Sent: Wednesday, February 25, 2009 10:40 AM
To: pgsql-admin(at)postgresql(dot)org
Subject: Re: recovery question
Is 0000000100000C28000000B1 the same size as the other segments?
-lee
2009/2/25 Mark Steben <msteben(at)autorevenue(dot)com>:
> Hi listers,
>
> Here is my problem. I am running a PITR restore on a machine remote from my
> production machine.
>
> I'm shipping logs over there, compressed, then uncompressing them and
> copying them to pg_xlog.
>
> Everything works fine until a network outage creates a gap in my logs.
>
> The recovery terminates at log "0000000100000C28000000B1" and brings the
> database up because it can't find "0000000100000C28000000B2".
>
> Log "0000000100000C28000000B3" is copied over, but I wish to restart
> recovery at B2.
>
> So I scp B2 over from my primary machine, from a folder that I created for
> just such an occasion.
>
> Now I rename recovery.done to recovery.conf (copied here for your
> convenience):
>
> restore_command = 'sh /usr/local/postgresql-8.2.5/bin/copy.sh %f %p 2>>/tmp/recovery.log'
>
> (and copy.sh:)
>
> REQ_FILE=$1
> DEST=$2
> LF="${REQ_FILE}.lock"
> SUFFIX=${REQ_FILE##*.}
> ###############################################################
> ## check if file is transaction log or informational file
> ## if transaction log, cat from archlog and uncompress into unzipped folder
> ## if informational simply copy into unzipped folder (it came over
> ## uncompressed)
> ###############################################################
> if [ "${SUFFIX}" != 'history' ] && [ "${SUFFIX}" != 'backup' ]; then
>     cat "/logs/var/backups/archlog/${REQ_FILE}" | gzip -dc > "/logs/var/backups/unzipped/${REQ_FILE}"
>     # save gzip's exit status now; every later command overwrites $?
>     RC=$?
>     if [ "${RC}" = "0" ]; then
>         echo 'successful uncompress of ' "/logs/var/backups/unzipped/${REQ_FILE}" >> /tmp/restore.mavmail.log
>     else
>         echo 'unsuccessful uncompress of ' "/logs/var/backups/unzipped/${REQ_FILE}" >> /tmp/restore.mavmail.log
>         echo 'the return code is ' "${RC}" >> /tmp/restore.mavmail.log
>     fi
> else
>     cp "/logs/var/backups/archlog/${REQ_FILE}" "/logs/var/backups/unzipped/${REQ_FILE}"
> fi
> ###############################################################
> ## check for size. If not a full size (16777216) trans log, the copy from
> ## cobra is still in progress. Don't copy this file. Stop recovery here.
> ###############################################################
> SIZE=$(ls -gG1 "/logs/var/backups/unzipped/${REQ_FILE}" | awk '{ print $3 }')
> echo "The size of the log to be restored is " "${SIZE}" >> /tmp/restore.mavmail.log
> if [ "${SUFFIX}" != 'history' ] && [ "${SUFFIX}" != 'backup' ]; then
>     if [ "${SIZE}" != '16777216' ]; then
>         echo 'partially written log - not restored - finishing recovery' >> /tmp/restore.mavmail.log
>         exit 0
>     fi
> fi
>
> /usr/bin/lockfile "${LF}"
> ################################################################
> ## copy either full sized trans log or informational file
> ## into pg_xlog data cluster.
> ################################################################
> cp "/logs/var/backups/unzipped/${REQ_FILE}" "${DEST}"
> rm -f "${LF}"
> rm "/logs/var/backups/unzipped/${REQ_FILE}"
>
> (END)
>
> Now when I try to restart, hoping to begin recovery with the B2 log, I get
> an invalid checkpoint error:
>
> : LOG: starting archive recovery
> Feb 25 10:08:10 ar-db3 postgres[32538]: [3-1] @: LOG: restore_command = "sh /usr/local/postgresql-8.2.5/bin/copy.sh %f %p 2>>/tmp/recovery.log"
> Feb 25 10:08:11 ar-db3 postgres[32538]: [4-1] @: LOG: restored log file "0000000100000C28000000B1" from archive
> Feb 25 10:08:11 ar-db3 postgres[32538]: [5-1] @: LOG: invalid record length at C28/B1FFECA4
> Feb 25 10:08:11 ar-db3 postgres[32538]: [6-1] @: LOG: invalid primary checkpoint record
> Feb 25 10:08:12 ar-db3 postgres[32538]: [7-1] @: LOG: restored log file "0000000100000C28000000B1" from archive
> Feb 25 10:08:12 ar-db3 postgres[32538]: [8-1] @: LOG: invalid record length at C28/B1FFEC5C
> Feb 25 10:08:12 ar-db3 postgres[32538]: [9-1] @: LOG: invalid secondary checkpoint record
> Feb 25 10:08:12 ar-db3 postgres[32538]: [10-1] @: PANIC: could not locate a valid checkpoint record
> Feb 25 10:08:12 ar-db3 postgres[32537]: [1-1] @: LOG: startup process (PID 32538) was terminated by signal 6
> Feb 25 10:08:12 ar-db3 postgres[32537]: [2-1] @: LOG: aborting startup due to startup process failure
>
> I remove the recovery.conf file, successfully start the database, and issue
> a checkpoint. I try the restore again and get the same error.
>
> So, is there a way that I can force the recovery to begin at B2, or am I
> dead in the water and need to bring in another full file copy and start
> from scratch?
>
> Thanks for your time.
>
> Mark Steben│Database Administrator│
> @utoRevenue-(R)- "Join the Revenue-tion"
> 95 Ashley Ave. West Springfield, MA., 01089
> 413-243-4800 x1512 (Phone) │ 413-732-1824 (Fax)
> @utoRevenue is a registered trademark and a division of Dominion Enterprises
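
The gap described in the quoted message (B1 and B3 present, B2 missing) could in principle be caught before recovery is started. The following is only a sketch under stated assumptions: the function name `find_wal_gaps` is hypothetical, segment files are assumed to have 24-character uppercase hex names like the ones in this thread, and only neighbours that share the first 16 hex digits are compared, so a rollover of those leading digits is not checked at all.

```shell
# Hypothetical helper: print the names of WAL segments that are missing
# between consecutive segment files found in a directory.
find_wal_gaps() {
    dir=$1
    prev=""
    # Keep only plain 24-hex-digit names; .history and .backup files are skipped.
    ls "$dir" | grep -E '^[0-9A-F]{24}$' | LC_ALL=C sort | while read -r cur; do
        # Compare only neighbours with the same leading 16 hex digits.
        if [ -n "$prev" ] && [ "${prev%????????}" = "${cur%????????}" ]; then
            n=$(( 0x${prev#????????????????} + 1 ))      # segment after prev
            last=$(( 0x${cur#????????????????} ))        # segment number of cur
            while [ "$n" -lt "$last" ]; do
                printf '%s%08X\n' "${cur%????????}" "$n"
                n=$(( n + 1 ))
            done
        fi
        prev=$cur
    done
}
```

Run against a directory holding only the B1 and B3 segments, this prints the B2 name; an empty result means no gap was found among the files it compared.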