Some problem with warm standby server

From:	Nico Sabbi <nsabbi(at)officinedigitali(dot)it>
To:	pgsql-general(at)postgresql(dot)org
Subject:	Some problem with warm standby server
Date:	2007-04-27 10:31:26
Message-ID:	4631D0FE.4010106@officinedigitali.it
Lists:	pgsql-general

Hi,
I have some questions about the settings and the access procedure of
warm standby servers:
- Can autovacuum be safely enabled on the replicator?
- I'm using pg_standby (from CVS), which is generally working as
  expected (logs are copied with scp). Today I wanted to temporarily
  stop the replication to verify some data and restart it later on, so
  I touched the trigger file, waited for the log to report "database
  ready", and verified that the databases were actually up to date.
  All was fine, then I ran:

rm -f pg_xlog/* pg_xlog/archive_status/*
mv recovery.done recovery.conf (the permissions were right)
/etc/init.d/postgresql stop ; /etc/init.d/postgresql start
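
The same sequence written as a small shell function, in case it helps to reproduce the problem. This is only a sketch: the function name `restart_standby` and its two arguments (the standby's data directory and the init script path) are hypothetical and do not come from the setup above. It describes what was done, which is the sequence that ended in the PANIC shown further down, so it is not meant as a recommended procedure.

```shell
#!/bin/sh
# Sketch of the restart sequence from the message, parameterised so it can
# be tried against a scratch directory. Names are hypothetical.
restart_standby() {
    pgdata=$1        # data directory of the standby (assumed argument)
    init_script=$2   # e.g. /etc/init.d/postgresql; empty skips the restart

    # recovery.done only exists once the server has finished recovery,
    # so refuse to continue without it.
    [ -f "$pgdata/recovery.done" ] || {
        echo "no recovery.done in $pgdata" >&2
        return 1
    }

    # Remove the WAL segments and archive status files (files only,
    # directories are left in place).
    find "$pgdata/pg_xlog" -type f -exec rm -f {} +

    # Put the recovery configuration back so the next start re-enters recovery.
    mv "$pgdata/recovery.done" "$pgdata/recovery.conf"

    # Restart the server, as in the original commands.
    if [ -n "$init_script" ]; then
        "$init_script" stop
        "$init_script" start
    fi
}
```

Called as `restart_standby /path/to/data /etc/init.d/postgresql`, where the data directory path is a placeholder.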

the replication seemed to start:
-------------------------------------------------------
LOG: database system was shut down at 2007-04-27 12:16:13 CEST
LOG: starting archive recovery
LOG: restore_command = "/usr/local/bin/pg_standby -s 5 -w 0 -t /usr/local/postgres_replica/trigger /usr/local/postgres_replica/log/ %f %p"
cp: cannot stat `/usr/local/postgres_replica/log//00000001.history': No such file or directory
cp: cannot stat `/usr/local/postgres_replica/log//00000001.history': No such file or directory
cp: cannot stat `/usr/local/postgres_replica/log//00000001.history': No such file or directory

Then I updated the master with a batch of inserts, but after a while the
slave stopped with these messages:

LOG: restored log file "000000010000000000000021" from archive
LOG: record with zero length at 0/21000048
LOG: invalid primary checkpoint record
LOG: restored log file "000000010000000000000020" from archive
LOG: restored log file "000000010000000000000021" from archive
LOG: invalid resource manager ID in secondary checkpoint record
PANIC: could not locate a valid checkpoint record
LOG: startup process (PID 19619) was terminated by signal 6
LOG: aborting startup due to startup process failure

What did I do wrong? Is there a different procedure to follow to restart
replication once it has been stopped?
Thanks,
Nico

Responses

Re: Some problem with warm standby server at 2007-04-28 10:20:45 from Simon Riggs
