db crash, streaming rep slave will not start

From: CS DBA <cs_dba(at)consistentstate(dot)com>
To: "pgsql-admin(at)postgresql(dot)org" <pgsql-admin(at)postgresql(dot)org>
Cc: cs_dba(at)consistentstate(dot)com
Subject: db crash, streaming rep slave will not start
Date: 2013-07-30 23:19:48
Message-ID: 51F84A14.4000801@consistentstate.com
Lists: pgsql-admin

Hi All;

A client's master database crashed. They tried to start up the streaming
replication slave, and it refuses to start.

See the log details below. Thanks in advance for any help.

Master log:

2013-07-30 16:23:01 MDT PANIC: corrupted page pointers: lower = 0, upper
= 0, special = 0

2013-07-30 16:23:02 MDT LOG: server process (PID 17539) was terminated
by signal 6: Abort trap

2013-07-30 16:23:02 MDT LOG: terminating any other active server processes

2013-07-30 16:23:02 MDT [local]WARNING: terminating connection because
of crash of another server process

2013-07-30 16:23:02 MDT [local]DETAIL: The postmaster has commanded this
server process to roll back the current transaction and exit, because
another server process exited abnormally and possibly corrupted shared
memory.

2013-07-30 16:23:02 MDT [local]HINT: In a moment you should be able to
reconnect to the database and repeat your command.

2013-07-30 16:23:02 MDT [local]WARNING: terminating connection because
of crash of another server process

2013-07-30 16:23:02 MDT [local]DETAIL: The postmaster has commanded this
server process to roll back the current transaction and exit, because
another server process exited abnormally and possibly corrupted shared
memory.

2013-07-30 16:23:02 MDT [local]HINT: In a moment you should be able to
reconnect to the database and repeat your command.

2013-07-30 16:23:02 MDT [local]WARNING: terminating connection because
of crash of another server process

2013-07-30 16:23:02 MDT [local]DETAIL: The postmaster has commanded this
server process to roll back the current transaction and exit, because
another server process exited abnormally and possibly corrupted shared
memory.

2013-07-30 16:23:04 MDT [local]FATAL: the database system is in recovery
mode

2013-07-30 16:23:04 MDT LOG: archiver process (PID 1826) exited with
exit code 1

2013-07-30 16:23:04 MDT 192.168.131.2FATAL: the database system is in
recovery mode

2013-07-30 16:23:04 MDT LOG: all server processes terminated; reinitializing

2013-07-30 16:23:04 MDT LOG: database system was interrupted; last known
up at 2013-07-30 16:21:33 MDT

2013-07-30 16:23:04 MDT LOG: database system was not properly shut down;
automatic recovery in progress

2013-07-30 16:23:04 MDT LOG: consistent recovery state reached at
1179D/8B7E7EF8

2013-07-30 16:23:04 MDT LOG: redo starts at 1179A/C1001EA8

2013-07-30 16:26:48 MDT LOG: record with zero length at 1179D/AC2591A8

2013-07-30 16:26:48 MDT LOG: redo done at 1179D/AC259168

2013-07-30 16:26:48 MDT LOG: last completed transaction was at log time
2013-07-30 16:23:02.11493-06

2013-07-30 16:26:48 MDT WARNING: page 476 of relation
base/603188/199093492 did not exist

2013-07-30 16:26:48 MDT WARNING: page 493 of relation
base/603188/199093492 did not exist

2013-07-30 16:26:48 MDT WARNING: page 1023 of relation
base/603188/199093492 did not exist

2013-07-30 16:26:48 MDT WARNING: page 708 of relation
base/603188/199093492 did not exist

2013-07-30 16:26:48 MDT WARNING: page 1075 of relation
base/603188/199093492 did not exist

2013-07-30 16:26:48 MDT WARNING: page 590 of relation
base/603188/199093492 did not exist

2013-07-30 16:26:48 MDT WARNING: page 832 of relation
base/603188/199093492 did not exist

2013-07-30 16:26:48 MDT WARNING: page 1742 of relation
base/603188/199093492 did not exist

2013-07-30 16:26:48 MDT WARNING: page 238 of relation
base/603188/199093492 did not exist

2013-07-30 16:26:48 MDT WARNING: page 334 of relation
base/603188/199093492 did not exist

2013-07-30 16:26:48 MDT WARNING: page 1131 of relation
base/603188/199093492 did not exist

2013-07-30 16:26:48 MDT WARNING: page 434 of relation
base/603188/199093492 did not exist

2013-07-30 16:26:48 MDT WARNING: page 772 of relation
base/603188/199093492 did not exist

2013-07-30 16:26:48 MDT WARNING: page 259 of relation
base/603188/199093492 did not exist

2013-07-30 16:26:48 MDT WARNING: page 498 of relation
base/603188/199093492 did not exist

2013-07-30 16:26:48 MDT WARNING: page 948 of relation
base/603188/199093492 did not exist

2013-07-30 16:26:48 MDT WARNING: page 1743 of relation
base/603188/199093492 did not exist

2013-07-30 16:26:48 MDT WARNING: page 96 of relation
base/603188/199093492 did not exist

2013-07-30 16:26:48 MDT WARNING: page 559 of relation
base/603188/199093492 did not exist

2013-07-30 16:26:48 MDT PANIC: WAL contains references to invalid pages

2013-07-30 16:26:48 MDT LOG: startup process (PID 17546) was terminated
by signal 6: Abort trap

2013-07-30 16:26:48 MDT LOG: aborting startup due to startup process failure

Slave log:

2013-07-30 16:41:48 MDT FATAL: could not connect to the primary server:
could not connect to server: Operation timed out

Is the server running on host "192.168.131.1" and accepting

TCP/IP connections on port 5432?

2013-07-30 16:42:59 MDT LOG: trigger file found: /pgdata/data/failover

2013-07-30 16:42:59 MDT FATAL: terminating walreceiver process due to
administrator command

2013-07-30 16:42:59 MDT LOG: redo done at 1179D/AC11DF00

2013-07-30 16:42:59 MDT LOG: last completed transaction was at log time
2013-07-30 16:23:01.986951-06

2013-07-30 16:42:59 MDT LOG: selected new timeline ID: 2

2013-07-30 16:42:59 MDT LOG: archive recovery complete

2013-07-30 16:42:59 MDT WARNING: page 98 of relation
base/603188/4268050827 did not exist

2013-07-30 16:42:59 MDT WARNING: page 476 of relation
base/603188/199093492 did not exist

2013-07-30 16:42:59 MDT WARNING: page 571 of relation
base/603188/2775093183 did not exist

2013-07-30 16:42:59 MDT WARNING: page 202 of relation
base/603188/4268050827 did not exist

2013-07-30 16:42:59 MDT WARNING: page 202 of relation
base/603188/3435025974 did not exist

2013-07-30 16:42:59 MDT WARNING: page 493 of relation
base/603188/199093492 did not exist

2013-07-30 16:42:59 MDT WARNING: page 1023 of relation
base/603188/199093492 did not exist

2013-07-30 16:42:59 MDT WARNING: page 163 of relation
base/603188/3476677873 did not exist

2013-07-30 16:42:59 MDT WARNING: page 15 of relation
base/603188/3435025974 did not exist

2013-07-30 16:42:59 MDT PANIC: WAL contains references to invalid pages

2013-07-30 16:43:00 MDT LOG: startup process (PID 41056) was terminated
by signal 6: Abort trap

2013-07-30 16:43:00 MDT LOG: terminating any other active server processes

2013-07-30 16:43:00 MDT 10.254.254.23WARNING: terminating connection
because of crash of another server process

2013-07-30 16:43:00 MDT 10.254.254.23DETAIL: The postmaster has
commanded this server process to roll back the current transaction and
exit, because another server process exited abnormally and possibly
corrupted shared memory.
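
To make the two logs easier to compare, here is a minimal sketch (Python, not part of the logs above) that groups the "page N of relation ... did not exist" warnings by relation file. The function name missing_pages and the sample excerpt are illustrative assumptions; the sketch only reads pasted log text and does not touch the database.

```python
import re
from collections import defaultdict

# Matches the startup-process warning once wrapped lines have been rejoined.
PAGE_WARNING = re.compile(
    r"WARNING:\s+page (\d+) of relation (\S+) did not exist"
)

def missing_pages(log_text):
    """Return {relation path: sorted page numbers} for a pasted log excerpt.

    missing_pages is a hypothetical helper name, not a PostgreSQL tool.
    """
    # Collapse all whitespace so messages wrapped by the mail client still match.
    flat = " ".join(log_text.split())
    pages = defaultdict(set)
    for page, relation in PAGE_WARNING.findall(flat):
        pages[relation].add(int(page))
    return {rel: sorted(nums) for rel, nums in pages.items()}

# Three entries copied from the slave log, including the line wrapping.
sample = """\
2013-07-30 16:42:59 MDT WARNING: page 98 of relation
base/603188/4268050827 did not exist

2013-07-30 16:42:59 MDT WARNING: page 476 of relation
base/603188/199093492 did not exist

2013-07-30 16:42:59 MDT WARNING: page 202 of relation
base/603188/4268050827 did not exist
"""

for relation, nums in sorted(missing_pages(sample).items()):
    print(relation, nums)
```

Running it over the full master and slave excerpts should show which relation files appear in both logs (base/603188/199093492, for example) and which appear only on the slave.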
