Detect streaming replication failure

From: Lists <lists(at)benjamindsmith(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Detect streaming replication failure
Date: 2014-07-17 22:50:12
Message-ID: 53C85324.5070206@benjamindsmith.com
Lists: pgsql-general

For reference:
https://wiki.postgresql.org/wiki/Streaming_Replication

Assume a master -> slave streaming replication configuration on
PostgreSQL 9.2. Assume that the master has been chugging away, but the
slave PG service has been offline for a while, and the WAL on the master
has moved on far enough (with the older segments removed) that the slave
cannot catch up.

When I start the slave PG instance, PG launches and "runs" but doesn't
update. It also doesn't seem to raise any visible errors. The only
outward sign I can see that anything is wrong is that
pg_last_xlog_replay_location() doesn't advance. I can look in
/var/lib/pgsql/9.2/data/pg_log/postgresql-Thu.csv and see errors there,
e.g.:

2014-07-17 22:38:23.851 UTC,,,21310,,53c8505f.533e,2,,2014-07-17
22:38:23 UTC,,0,FATAL,XX000,"could not receive data from WAL stream:
FATAL: requested WAL segment 000000070000000500000071 has already been
removed
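
For illustration, the log-parsing approach could look roughly like the
Python sketch below. It only searches log text for the message shown
above; the function name find_removed_segments is made up for this
example, and it assumes the message wording matches the 9.2 log line
quoted here.

```python
import re

# Matches the error text from the log excerpt above and captures the
# 24-hex-digit WAL segment name. The wording is taken from that excerpt.
WAL_REMOVED = re.compile(
    r"requested WAL segment ([0-9A-F]{24}) has already been removed"
)


def find_removed_segments(log_text):
    """Return the WAL segment names reported as already removed.

    Hypothetical helper: an empty list means the message was not found,
    which does not by itself prove that replication is healthy.
    """
    return WAL_REMOVED.findall(log_text)


if __name__ == "__main__":
    # Path as given in the message above; adjust for your installation.
    path = "/var/lib/pgsql/9.2/data/pg_log/postgresql-Thu.csv"
    with open(path, encoding="utf-8", errors="replace") as fh:
        segments = find_removed_segments(fh.read())
    if segments:
        print("slave cannot catch up; missing segments:", sorted(set(segments)))
```

This only catches the one failure mode shown here, so it is a
complement to a lag check rather than a replacement for one.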

Is that the only way to detect this condition? I guess I'm looking for
something like

select * from pg_is_replicating_ok();
1

on the slave. At the moment, it appears that I can either parse the log
file, or check whether pg_last_xact_replay_timestamp() is more than some
acceptable number of minutes in the past.
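
As a rough sketch of the timestamp approach, the check below compares
the value returned by pg_last_xact_replay_timestamp() on the slave
against a threshold. The function name replay_is_stale, the 5-minute
threshold, and the choice to treat NULL as stale are assumptions for
this example; fetching the value from the database is left out.

```python
from datetime import datetime, timedelta, timezone

# Query to run on the slave; it returns NULL if no transaction has been
# replayed during recovery.
REPLAY_QUERY = "SELECT pg_last_xact_replay_timestamp()"


def replay_is_stale(last_replay, now, threshold):
    """Return True if the last replayed commit is older than threshold.

    last_replay: timezone-aware datetime from REPLAY_QUERY, or None for NULL.
    Assumption for this sketch: None is reported as stale, because the
    slave's progress cannot be confirmed.
    """
    if last_replay is None:
        return True
    return (now - last_replay) > threshold


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    # Illustrative value; in practice this comes from the slave.
    last_replay = now - timedelta(minutes=30)
    print(replay_is_stale(last_replay, now, timedelta(minutes=5)))
```

One caveat: the replay timestamp only advances when the master commits
transactions, so on a quiet master this check can report a stale slave
even though replication is fine.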

http://www.postgresql.org/docs/9.2/static/functions-admin.html

Thanks,

Ben
