From: | Rural Hunter <ruralhunter(at)gmail(dot)com> |
---|---|
To: | Khusro Jaleel <mailing-lists(at)kerneljack(dot)com> |
Cc: | pgsql-admin(at)postgresql(dot)org |
Subject: | Re: Issues with log-shipping replication |
Date: | 2011-12-16 00:55:33 |
Message-ID: | 4EEA9705.6090806@gmail.com |
Lists: | pgsql-admin |
Well, can PostgreSQL expand shell variables in postgresql.conf?
Anyway, you should check the log on your master. If it cannot archive the WAL,
there will be errors reported in it.
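
I am not sure about the first point myself. As far as I know postgresql.conf
itself does not expand anything, but archive_command is handed to a shell, so
$STANDBYNODE and $PGARCHIVE come from whatever environment the server process
was started with. Your .bash_profile only helps if the server was started from
a shell that read it. Here is a rough sketch of both checks. The helper names
expand_archive_command and archive_failures are made up for illustration, and
the grep pattern assumes the server logs "archive command failed" when the
command returns non-zero:

```shell
#!/bin/bash

# Hypothetical helper: print an archive_command template the way a shell
# would expand it, using ONLY the VAR=value pairs given after the template.
# %p and %f are left alone; PostgreSQL substitutes those itself.
# The template is passed to a shell, so only feed it your own config line.
expand_archive_command() {
    local template=$1
    shift
    # env -i starts from an empty environment, like a server that never
    # sourced .bash_profile.
    env -i "$@" /bin/sh -c "echo $template"
}

# Hypothetical helper: show archive failures recorded in the server logs.
# $1 is the log directory (assumed here: 9.1/data/pg_log on the master).
# -A1 also prints the line after each match, which should name the command.
archive_failures() {
    local log_dir=$1
    grep -h -A1 'archive command failed' "$log_dir"/*.log
}

# Example calls (not run automatically):
#   expand_archive_command 'scp %p $STANDBYNODE:$PGARCHIVE/%f'
#     prints: scp %p :/%f
#   expand_archive_command 'scp %p $STANDBYNODE:$PGARCHIVE/%f' \
#       STANDBYNODE=172.16.1.56 PGARCHIVE=/var/lib/pgsql/pgarchive
#     prints: scp %p 172.16.1.56:/var/lib/pgsql/pgarchive/%f
#   archive_failures /var/lib/pgsql/9.1/data/pg_log
```

If the first form is what the server really runs, scp has no host to copy to
and the failure should show up in the master log.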
On 2011-12-16 2:02:03, Khusro Jaleel wrote:
> Hello, I'm trying out a simple example from the Postgresql 9
> Administration Cookbook about File-based log shipping replication, but
> I can't get it to work between 2 CentOS 5.7 VMs running Postgresql
> 9.1. Here are my settings:
>
> .bash_profile on master:
> export PGARCHIVE=/var/lib/pgsql/pgarchive
> export STANDBYNODE=172.16.1.56
>
> .bash_profile on slave:
> export PGARCHIVE=/var/lib/pgsql/pgarchive
>
> postgresql.conf on master:
> ===================================================================
> # - Settings -
>
> wal_level = archive # minimal, archive, or hot_standby
> # (change requires restart)
> #fsync = on # turns forced synchronization on or off
> #synchronous_commit = on # synchronization level; on, off, or local
> #wal_sync_method = fsync # the default is the first option
> # supported by the operating system:
> # open_datasync
> # fdatasync (default on Linux)
> # fsync
> # fsync_writethrough
> # open_sync
> #full_page_writes = on # recover from partial page writes
> #wal_buffers = -1 # min 32kB, -1 sets based on shared_buffers
> # (change requires restart)
> #wal_writer_delay = 200ms # 1-10000 milliseconds
>
> #commit_delay = 0 # range 0-100000, in microseconds
> #commit_siblings = 5 # range 1-1000
>
> # - Checkpoints -
>
> #checkpoint_segments = 3 # in logfile segments, min 1, 16MB each
> #checkpoint_timeout = 5min # range 30s-1h
> #checkpoint_completion_target = 0.5 # checkpoint target duration, 0.0 - 1.0
> #checkpoint_warning = 30s # 0 disables
>
> # - Archiving -
>
> archive_mode = on # allows archiving to be done
> # (change requires restart)
> archive_command = 'scp %p $STANDBYNODE:$PGARCHIVE/%f'
> archive_timeout = 30 # force a logfile segment switch after this
> # number of seconds; 0 disables
> ===================================================================
>
> recovery.conf on slave:
> ===================================================================
> standby_mode = 'on'
> restore_command = 'cp $PGARCHIVE/%f %p'
> archive_cleanup_command = '/usr/pgsql-9.1/bin/pg_archivecleanup $PGARCHIVE %r'
> trigger_file = '/tmp/postgresql.trigger.5432'
> ===================================================================
>
> do_backup.sh script to make an initial copy of master onto slave:
> ===================================================================
> #!/bin/bash
>
> psql -c "select pg_start_backup('base backup for log shipping')"
> rsync -cva --inplace --exclude=*pg_xlog* ${PGDATA}/ $STANDBYNODE:$PGDATA
> psql -c "select pg_stop_backup(), current_timestamp"
> ===================================================================
>
> The procedure I am using to test this is:
>
> 1. Clear out the data dir on the slave, put only the recovery.conf
> file there
> 2. Run the 'do_backup.sh' script to copy the master data dir to the
> slave, excluding pg_xlog
> 3. Create the pg_xlog dir on the slave
> 4. Make sure the $PGARCHIVE dir exists on the slave. Note that this is
> completely outside the normal 'data' dir
> 5. Start up the master. As soon as I do this, I see some archive files
> appear in the $PGARCHIVE dir on the slave. This means the 'scp'
> 'archive_command' I am using IS working.
> 6. Now start up the slave, which prints the following to the log in
> 9.1/data/pg_log:
>
> ========================================================================
> LOG: startup process (PID 4771) exited with exit code 1
> LOG: aborting startup due to startup process failure
> LOG: database system was interrupted; last known up at 2011-12-08 19:45:10 UTC
> LOG: creating missing WAL directory "pg_xlog/archive_status"
> LOG: entering standby mode
> LOG: restored log file "000000010000000000000027" from archive
> LOG: redo starts at 0/27000078
> LOG: consistent recovery state reached at 0/28000000
> cp: cannot stat `/var/lib/pgsql/pgarchive/000000010000000000000028': No such file or directory
> cp: cannot stat `/var/lib/pgsql/pgarchive/000000010000000000000028': No such file or directory
> ========================================================================
>
> Looking at the master, the file '000000010000000000000028' does exist
> in 9.1/data/pg_xlog, but for some reason the master is NOT copying it
> over to the slave. The slave has all the files prior to this, but not
> anything after.
>
> Anybody got any ideas why?
>
>
>
>
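
To see what the master itself still considers unarchived, you can look in
pg_xlog/archive_status on the master. As far as I know a segment gets a .ready
file there once it is complete and waiting to be archived, and that file is
renamed to .done after archive_command succeeds. A segment that is still being
written has neither, so it can exist in pg_xlog without having been shipped
yet. A small sketch, where pending_wal_segments is a made-up helper name and
the data directory is passed as an argument:

```shell
#!/bin/bash

# Hypothetical helper: list WAL segments that are complete but not yet
# archived, judged by the .ready marker files in pg_xlog/archive_status.
# $1 is the data directory of the master (for example "$PGDATA").
pending_wal_segments() {
    local status_dir=$1/pg_xlog/archive_status
    local f
    for f in "$status_dir"/*.ready; do
        # When nothing matches, the glob stays literal; skip that case.
        [ -e "$f" ] || continue
        # Strip the directory and the .ready suffix to get the segment name.
        basename "$f" .ready
    done
}

# Example call (not run automatically):
#   pending_wal_segments "$PGDATA"
```

If 000000010000000000000028 is not listed and has no .done file either, it may
simply not be finished yet. If it is listed and stays listed, the master log
should say why archive_command keeps failing for it.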