From: | domehead100 <domehead100(at)gmail(dot)com> |
---|---|
To: | pgsql-admin(at)postgresql(dot)org |
Subject: | base backup/restore + streaming replication => weirdness |
Date: | 2013-02-22 22:11:03 |
Message-ID: | 1361571063448-5746342.post@n5.nabble.com |
Lists: | pgsql-admin |
I have a smallish Postgres 9.0 database with Primary and Standby instances.
These instances are set up with streaming replication from the Primary to
the Standby. The primary archives WAL files to a shared directory that is
accessible from the Standby. The Standby runs as a hot standby and receives
WAL records from the Primary over TCP.
We had an issue this week where the shared directory where WAL files were
being archived (/pgsql_wal) ran out of space.
To restart replication, I performed a base backup on Primary (tar $PGDATA to
/pgsql_wal) and then performed a base restore (untar) on Standby.
After this, the Standby is staying in recovery mode (recovery.conf never
gets changed to recovery.done), and my check_replication.sh script shows
strange results. The sequence number for the Primary (first item below) is
totally different from either the received or applied sequence numbers on
the Standby.
Primary:
pg_current_xlog_location
--------------------------
1E/D5C40A40 <= this looks strange
(1 row)
Standby, last received:
pg_last_xlog_receive_location
-------------------------------
E/BF68BD08
(1 row)
Standby, last applied:
pg_last_xlog_replay_location
------------------------------
E/BF68BD08
(1 row)
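For comparing the locations above, note that both parts of an X/Y value are hexadecimal, so 1E and E are log ids 30 and 14. The following is a minimal sketch for turning each location into a single byte count so they can be subtracted. The helper name lsn_to_bytes is hypothetical, and the 0xFF000000 multiplier is an assumption about the pre-9.3 WAL layout (255 usable 16 MB segments per log id); it is not taken from check_replication.sh.

```shell
#!/bin/sh
# Sketch: turn an "X/Y" WAL location into one integer so two locations can be subtracted.
# Assumption: pre-9.3 layout, where each log id spans 0xFF000000 bytes.
lsn_to_bytes() {
    log_id=${1%%/*}   # hex digits before the slash
    offset=${1##*/}   # hex digits after the slash
    echo $(( 0x$log_id * 0xFF000000 + 0x$offset ))
}

# Values copied from the output above.
primary="1E/D5C40A40"
standby="E/BF68BD08"
echo "lag in bytes: $(( $(lsn_to_bytes "$primary") - $(lsn_to_bytes "$standby") ))"
```

A large positive result would mean the Standby's reported location is well behind the Primary's, whatever the table contents suggest.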
I can connect to the Standby, and a select query seems to indicate that the
databases are in sync (they return the same value for max(<primary_key>) on
a table that is constantly receiving inserts).
One concern is that my tar command apparently did not exclude the files in
$PGDATA/pg_xlog, so those got untarred on the Standby. Could that be a
problem?
Here's my basebackup.sh:
#! /bin/sh
# Base Backup script for streaming replication
BACKUP_FILE=/pgsql_wal/backup/pg_base_backup.tgz
psql -c "SELECT pg_start_backup('$BACKUP_FILE', true)" postgres
rm -rf $BACKUP_FILE
nice -n 10 tar czvpf $BACKUP_FILE --exclude={"$PGDATA/pg_xlog/*"} $PGDATA
psql -c "SELECT pg_stop_backup()" postgres
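A possible reason the exclude had no effect: /bin/sh may not perform brace expansion at all, and bash does not expand a brace group with a single element, so tar would receive the braces as literal characters in the pattern. The sketch below passes a plain quoted pattern instead. It assumes GNU tar, and base_tar is a hypothetical helper rather than part of the script above.

```shell
#!/bin/sh
# Sketch only: archive a data directory while skipping the contents of pg_xlog.
# Assumes GNU tar; base_tar is a hypothetical helper name.
base_tar() {
    data_dir=$1   # absolute path of the data directory, e.g. "$PGDATA"
    out_file=$2   # archive to create
    # Plain quoted pattern, no braces, so sh and bash pass it to tar unchanged.
    # The pg_xlog directory entry itself is still archived; only its contents are skipped.
    tar czpf "$out_file" --exclude="$data_dir/pg_xlog/*" "$data_dir"
}
```

It would be called between pg_start_backup and pg_stop_backup as base_tar "$PGDATA" "$BACKUP_FILE".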
And here's my baserestore.sh:
#! /bin/sh
# Base Recovery script for streaming replication (run on Standby)
# Run as postgres user
# Postgres should be stopped
DATE=`date +%Y_%m_%d`
CONF_BACKUP_DIR=/tmp/pgsql_conf_backup_$DATE
BASE_BACKUP_FILE=/pgsql_wal/backup/pg_base_backup.tgz
#backup config files
mkdir $CONF_BACKUP_DIR
cp $PGDATA/*.conf $CONF_BACKUP_DIR
cp $PGDATA/recovery.done $CONF_BACKUP_DIR
#blow away existing data directory
rm -rf $PGDATA
#untar base backup file
cd /
tar xzvf $BASE_BACKUP_FILE
#copy configs back
cp $CONF_BACKUP_DIR/*.conf $PGDATA
cp $CONF_BACKUP_DIR/recovery.done $PGDATA/recovery.conf
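Regarding the WAL files that came along in the tarball: the PostgreSQL recovery instructions describe removing any files found in pg_xlog after restoring a base backup, since they are probably obsolete. A minimal sketch of such a step, to run after the untar and before starting the Standby, might look like this. The helper name clear_restored_xlog is hypothetical and not part of the script above.

```shell
#!/bin/sh
# Sketch only: remove WAL files that were restored from the base backup tarball.
# clear_restored_xlog is a hypothetical helper name.
clear_restored_xlog() {
    data_dir=$1
    # Refuse to run when the path is empty or has no pg_xlog directory.
    [ -n "$data_dir" ] && [ -d "$data_dir/pg_xlog" ] || return 1
    # Delete regular files at any depth; pg_xlog and its subdirectories are kept.
    find "$data_dir/pg_xlog" -type f -exec rm -f {} +
}
```

In the restore script it would be invoked as clear_restored_xlog "$PGDATA" while Postgres is still stopped.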