PostgreSQL 9.0.3 streaming replication failure

From: Ray Stell <stellr(at)cns(dot)vt(dot)edu>
To: pgsql-admin(at)postgresql(dot)org
Subject: PostgreSQL 9.0.3 streaming replication failure
Date: 2011-02-09 13:55:48
Message-ID: 20110209135548.GA27471@cns.vt.edu
Lists: pgsql-admin

I set up this streaming replication pair of systems a few days ago, and
everything seemed pretty happy: changes were being replicated. I set
it up without WAL archiving turned on. The backup log caught my eye this
morning, and I now find that the standby reports having activated itself.

I don't see anything in the postmaster logs about this event.

The backup log caught my eye:
-------------------------------
pg_start_backup
-----------------
1/E0000020
(1 row)

NOTICE: WAL archiving is not enabled; you must ensure that all required WAL segments are copied through other means to complete the backup
pg_stop_backup
----------------
1/E00000C4
(1 row)

on original prod:
-------------------------
$ pg_controldata /var/database/pgsql/tws
WARNING: Calculated CRC checksum does not match value stored in file.
Either the file is corrupt, or it has a different layout than this program
is expecting. The results below are untrustworthy.

pg_control version number: 903
Catalog version number: 201008051
Database system identifier: 5569584977953909783
Database cluster state: unrecognized status code
pg_control last modified: Wed 09 Feb 2011 08:47:29 AM EST
Latest checkpoint location: 0/1
Prior checkpoint location: E201229C/1
Latest checkpoint's REDO location: E201221C/1
Latest checkpoint's TimeLineID: 3791725164
Latest checkpoint's NextXID: 1/0
Latest checkpoint's NextOID: 670
Latest checkpoint's NextMultiXactId: 24579
Latest checkpoint's NextMultiOffset: 1
Time of latest checkpoint: Wed 31 Dec 1969 07:00:00 PM EST
Minimum recovery ending location: 28E/1
Maximum data alignment: 1297259249
Database block size: 0
Blocks per segment of large relation: 0
WAL block size: 0
Bytes per WAL segment: 0
Maximum length of identifiers: 2
Maximum columns in an index: 100
Maximum size of a TOAST chunk: 0
Date/time type storage: 64-bit integers
Maximum length of locale name: 4
LC_COLLATE:
LC_CTYPE:

I find the standby says it activated itself:
-----------------------------------------------
pg_controldata /var/database/pgsql/tws
WARNING: Calculated CRC checksum does not match value stored in file.
Either the file is corrupt, or it has a different layout than this program
is expecting. The results below are untrustworthy.

pg_control version number: 903
Catalog version number: 201008051
Database system identifier: 5569584977953909783
Database cluster state: in production
pg_control last modified: Wed 09 Feb 2011 08:47:18 AM EST
Latest checkpoint location: 0/1
Prior checkpoint location: E201221C/1
Latest checkpoint's REDO location: E1000150/1
Latest checkpoint's TimeLineID: 3791725036
Latest checkpoint's NextXID: 1/0
Latest checkpoint's NextOID: 670
Latest checkpoint's NextMultiXactId: 24579
Latest checkpoint's NextMultiOffset: 1
Time of latest checkpoint: Wed 31 Dec 1969 07:00:00 PM EST
Minimum recovery ending location: 28E/1
Maximum data alignment: 1297258949
Database block size: 1
Blocks per segment of large relation: 3791725036
WAL block size: 0
Bytes per WAL segment: 0
Maximum length of identifiers: 2
Maximum columns in an index: 100
Maximum size of a TOAST chunk: 0
Date/time type storage: 64-bit integers
Maximum length of locale name: 4
LC_COLLATE:
LC_CTYPE:

the process table on old prod:
-------------------------------
$ ps -ef | grep 30083
500 8950 6052 0 08:50 pts/2 00:00:00 grep 30083
500 30083 1 0 Feb07 ? 00:00:01 /usr/local/pgsql903/bin/postgres -D /var/database/pgsql/tws
500 30084 30083 0 Feb07 ? 00:00:00 postgres: logger process
500 30086 30083 0 Feb07 ? 00:00:02 postgres: writer process
500 30087 30083 0 Feb07 ? 00:00:00 postgres: wal writer process
500 30088 30083 0 Feb07 ? 00:00:00 postgres: autovacuum launcher process
500 30089 30083 0 Feb07 ? 00:00:06 postgres: stats collector process
500 30092 30083 0 Feb07 ? 00:00:07 postgres: wal sender process repuser 198.82.169.39(7908) streaming 1/E20122EC

the processes on the old standby look normal:
----------------------------------------------
$ ps -ef | grep 16990
500 3548 16990 0 Feb07 ? 00:00:13 postgres: wal receiver process streaming 1/E20122EC
500 15685 13427 0 08:51 pts/2 00:00:00 grep 16990
500 16990 1 0 Feb04 ? 00:00:00 /usr/local/pgsql903/bin/postgres -D /var/database/pgsql/tws
500 16991 16990 0 Feb04 ? 00:00:00 postgres: logger process
500 16992 16990 0 Feb04 ? 00:00:00 postgres: startup process recovering 0000000100000001000000E2
500 16993 16990 0 Feb04 ? 00:00:00 postgres: writer process
500 16994 16990 0 Feb04 ? 00:00:00 postgres: stats collector process
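
As a cross-check on the two process listings: the segment named by the
startup process (0000000100000001000000E2) is the one that contains the
streaming location 1/E20122EC, so the standby appears to be replaying where
the primary is sending. A minimal Python sketch of that mapping follows. It
assumes the 9.0-era file naming (timeline, log id, and segment number, each
as 8 hex digits), the default 16 MB segment size, and timeline 1. The
function name wal_segment_name is made up for illustration.

```python
# Sketch only: map a 9.0-style WAL location ("logid/offset" in hex) to the
# name of the WAL segment file that contains it.
# Assumptions: default 16 MB segments, timeline 1 unless told otherwise.
WAL_SEGMENT_SIZE = 16 * 1024 * 1024


def wal_segment_name(location: str, timeline: int = 1) -> str:
    """Return the segment file name for a location such as '1/E20122EC'."""
    log_id_hex, offset_hex = location.split("/")
    log_id = int(log_id_hex, 16)
    # The segment number is the byte offset within the log file divided by
    # the segment size.
    segment = int(offset_hex, 16) // WAL_SEGMENT_SIZE
    return f"{timeline:08X}{log_id:08X}{segment:08X}"


if __name__ == "__main__":
    # Location reported by both the wal sender and the wal receiver.
    print(wal_segment_name("1/E20122EC"))
```

If the file name printed for the sender's location differs from the one the
startup process is recovering, the standby would be lagging by at least one
segment.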

can't do ddl on the "old" standby, the one that says it is "in production"
----------------------------------------------------------------------------
tws=# create table t (x int);
ERROR: cannot execute CREATE TABLE in a read-only transaction

ddl works on the old prod, the one that says it is "unrecognized status code"
