From: | Michael Blake <postgresql(at)akunno(dot)net> |
---|---|
To: | pgsql-general(at)postgresql(dot)org |
Subject: | Problem with hot standby |
Date: | 2010-12-14 22:12:42 |
Message-ID: | AANLkTi=Svt92zSdhKTC1H2dFXiqzTfj4E-1NMe0bDD06@mail.gmail.com |
Lists: | pgsql-general |
I'm trying to set up a master/slave pair of servers. This initially
worked fine, but recently it started failing with the following error:
==============
LOG: database system was interrupted; last known up at [time]
LOG: could not open file "pg_xlog/00000001000000000000002B" (log file 0, segment 43): No such file or directory
LOG: invalid checkpoint record
PANIC: could not locate required checkpoint record
HINT: If you are not restoring from a backup, try removing the file "/var/lib/postgresql/9.0/main/backup_label".
LOG: startup process (PID 31489) was terminated by signal 6: Aborted
LOG: aborting startup due to startup process failure
==============
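For reference, the file name in the error and the "log file 0, segment 43" in the same message refer to the same thing. The sketch below decodes the name, assuming the usual 24-hex-digit layout of timeline, log file, and segment (8 digits each). `decode_wal_name` is a hypothetical helper written for this illustration, not a PostgreSQL tool.

```shell
#!/bin/sh
# decode_wal_name is a hypothetical helper, not part of PostgreSQL.
# Assumes a 24-hex-digit name: timeline (8) + log file (8) + segment (8).
decode_wal_name() {
    name=$1
    tli=$(printf '%s' "$name" | cut -c1-8)
    log=$(printf '%s' "$name" | cut -c9-16)
    seg=$(printf '%s' "$name" | cut -c17-24)
    # Arithmetic expansion converts each hex field to decimal.
    echo "timeline=$((0x$tli)) log=$((0x$log)) segment=$((0x$seg))"
}

decode_wal_name 00000001000000000000002B
```

For the file above this prints `timeline=1 log=0 segment=43`, matching the log line.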
This is an Ubuntu 10.04 machine, with all Debian default
configurations barring the following changes:
[Primary: postgresql.conf]
wal_level = hot_standby
max_wal_senders = 1
archive_mode = on
archive_command = 'cp -i %p /var/lib/postgresql/export/9.0/main/%f </dev/null' # Unix
log_statement = 'all'
[Secondary: postgresql.conf]
hot_standby = on
[Secondary: recovery.conf]
standby_mode = 'on'
primary_conninfo = 'host=10.168.60.41 port=5432 user=replication_sys password=XXXXXXXXX'
restore_command = 'cp /var/lib/postgresql/archive/9.0/main/%f "%p"'
#restore_command = '/usr/lib/postgresql/9.0/bin/pg_standby -c -d -s 2 -t /var/log/pgpool/trigger/trigger_file1 /var/lib/postgresql/archive/9.0/main %p >> /var/log/postgresql/postgresql-9.0-standby.log.1 1>&2'
#restore_command = '/usr/lib/postgresql/9.0/bin/pg_standby /var/lib/postgresql/archive/9.0/main %f %p %r'
#archive_cleanup_command = 'pg_archivecleanup /var/lib/postgresql/archive/9.0/main %r'
#archive_command = 'cp %p /var/lib/postgresql/archive/9.0/main/%f'
The 'archive directory' mentioned above is an NFS mount of the primary
server's /var/lib/postgresql/export/9.0/main directory.
The mount is working fine, and in the archive directory on the
recovery server I can see the file named in the error above
(00000001000000000000002B).
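Seeing the file in a directory listing does not by itself show that the server's user can read it or that it was copied completely. A minimal sketch of such a check follows. `check_segment` is a hypothetical helper, not a PostgreSQL tool, and the default of 16777216 bytes assumes the standard 16 MB WAL segment size.

```shell
#!/bin/sh
# check_segment is a hypothetical helper, not part of PostgreSQL.
# Usage: check_segment PATH [EXPECTED_SIZE_IN_BYTES]
check_segment() {
    seg_path=$1
    expected=${2:-16777216}   # assumed default WAL segment size (16 MB)
    if [ ! -e "$seg_path" ]; then
        echo "missing"
        return 1
    fi
    if [ ! -r "$seg_path" ]; then
        echo "unreadable"
        return 1
    fi
    # wc -c may pad its output; arithmetic expansion normalises it.
    actual=$(( $(wc -c < "$seg_path") ))
    if [ "$actual" -ne "$expected" ]; then
        echo "wrong size: $actual"
        return 1
    fi
    echo "ok"
}
```

It would be run as the same user the server runs as, for example `check_segment /var/lib/postgresql/archive/9.0/main/00000001000000000000002B`.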
The script I use to bring a server up to date after failure is as
follows, run as the postgresql user:
================
#!/bin/sh
SERVER=10.168.60.41
VERSION="9.0"
CLUSTER="main"
DEST_CLUSTER="/var/lib/postgresql/$VERSION/$CLUSTER"
ARCHIVE_CLUSTER="/var/lib/postgresql/archive/$VERSION/$CLUSTER"

# stop the local (standby) server before replacing its data directory
/etc/init.d/postgresql stop

# start the backup on the master
echo "SELECT pg_start_backup('backup');" | psql --host "$SERVER" --username replication_sys template1

# discard the local WAL directory; it is recreated empty below
rm -rf "$DEST_CLUSTER/pg_xlog"

# Don't need to ignore postgresql.conf etc as they are in
# /etc/postgresql as per debian standard install
rsync -C -a -c --delete -e ssh \
    --exclude pg_log --exclude pg_xlog \
    --exclude postmaster.pid --exclude postmaster.opts \
    "$SERVER:$DEST_CLUSTER/*" "$DEST_CLUSTER/"

mkdir -p "$DEST_CLUSTER/pg_xlog/archive_status"
chmod -R 700 "$DEST_CLUSTER/pg_xlog"

# stop the backup on the master
echo "SELECT pg_stop_backup();" | psql --host "$SERVER" --username replication_sys template1

/etc/init.d/postgresql start
================
So I believe I'm doing it right, but I can't work out why the
pg_xlog error is happening.