I'm using this script to check my replication lag on my streaming
replication pairs with Nagios:
https://gist.github.com/jacobian/743942
It generally works fine, but will occasionally return a negative lag value
(-37kb for example) which of course causes it to throw an alarm, but is
total nonsense. I've been working on the assumption that it is some sort of
bug in the script, but in taking a quick look at it nothing jumps out at me.
Is there something in Postgres itself that could cause this to happen once
in awhile? Is it something to be concerned about? Is there a better way to
monitor this state?
Thanks!
QH