From: | Kris Bushover <kris(at)spiceworks(dot)com> |
---|---|
To: | "pgsql-admin(at)postgresql(dot)org" <pgsql-admin(at)postgresql(dot)org> |
Subject: | Re: Fwd: Monitoring Replication on Master/Slave Postgres(9.1) |
Date: | 2012-11-28 19:29:48 |
Message-ID: | 128EC501EB29514B87952686265BB9A50AB58B@ORD2MBX02G.mex05.mlsrvr.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-admin |
The Bucardo check_postgres module contains a hot_standby_delay check function which will calculate the delta between the xlog position of the master with the slave(s).
http://bucardo.org/check_postgres/check_postgres.pl.html#hot_standby_delay
-----Original Message-----
From: pgsql-admin-owner(at)postgresql(dot)org [mailto:pgsql-admin-owner(at)postgresql(dot)org] On Behalf Of Steve Crawford
Sent: Wednesday, November 28, 2012 1:02 PM
To: pgsql-admin(at)postgresql(dot)org
Subject: Re: [ADMIN] Fwd: Monitoring Replication on Master/Slave Postgres(9.1)
On 11/28/2012 10:21 AM, Shams Khan wrote:
> ...how do we ensure my replication is working fine?...
>
Below is the core of one of my bash-script tools. It could use some tweaking (comments welcome) but works well. The script is run every minute by cron on master and standby servers. It auto-determines whether the server is currently a master or standby so the same script can be deployed to all servers.
If a master-server, it updates a one-record test table with a current timestamp to ensure there is activity on the master.
If a standby-server, it determines the lag based both on the age of
pg_last_xact_replay_timestamp() and on the age of the record in the test table then returns the worst of the two.
The delay value is set in $standby_delay which is a value in seconds.
It's up to you to decide what constitutes an issue that requires attention (but remember that 60-seconds does not necessarily indicate a problem on an idle server). My first-level alert triggers at 130-seconds and I have never hit that much of a delay.
#!/bin/bash
#
# Check PostgreSQL sync-status
#
# Requires table "sync_status" with column "sync_time" of type timestamp with time zone
# We need a temp file
tempquery="$(mktemp /tmp/monitor_db_synchronizationXXXXXXXXXX)"
# If master, update sync_status timestamp and return 0. If standby,
check both age
# of log-replay location and of timestamp in sync_status table and set
$standby_delay
# to the greater of the two (in seconds)
#
standby_delay=$(
psql -q --tuples-only --no-align 2>/dev/null <<EOS
\o ${tempquery}
select
case when setting='on' then
'
with logdelay as
(
select
case when
pg_last_xlog_receive_location()=pg_last_xlog_replay_location() then 0::int
else
(extract(epoch from now())-extract(epoch from
pg_last_xact_replay_timestamp()))::int
end as replicadelay
union
select
(extract(epoch from now())-extract(epoch from sync_time))::int
as replicadelay
from
sync_status
)
select
max(replicadelay)
from
logdelay
;
'
else
'
begin;
delete from sync_status;
insert into sync_status (sync_time) values (now()) returning 0::int as
replicadelay;
commit;
'
end
from pg_settings where name='transaction_read_only';
\o
\i ${tempquery}
EOS
)
# Cleanup temp file
test -f "${tempquery}" && rm "${tempquery}"
# Do some alert based on the number of seconds of lag between master and
standby here
Cheers,
Steve
--
Sent via pgsql-admin mailing list (pgsql-admin(at)postgresql(dot)org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin
From | Date | Subject | |
---|---|---|---|
Next Message | Scott Marlowe | 2012-11-28 21:16:51 | Re: DBA user in Postgres |
Previous Message | Shams Khan | 2012-11-28 19:24:57 | Re: Fwd: Monitoring Replication on Master/Slave Postgres(9.1) |