From: | Josh Kupershmidt <schmiddy(at)gmail(dot)com> |
---|---|
To: | Greg Sabino Mullane <greg(at)turnstep(dot)com> |
Cc: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: monitoring warm standby lag in 8.4? |
Date: | 2010-12-10 19:13:06 |
Message-ID: | AANLkTimHVJrrUYsJBXBtvgdo9P75BhQEd9iYJpMb55-B@mail.gmail.com |
Lists: | pgsql-general |
On Fri, Dec 10, 2010 at 11:27 AM, Greg Sabino Mullane <greg(at)turnstep(dot)com> wrote:
> Correct. But since we cannot connect to a database in recovery mode,
> there are very few options to determine how far 'behind' it is. The
> pg_controldata is what the check_postgres program uses. This offers a
> rough check which is usually sufficient unless you have a very
> inactive database or need very fine grained checking.
>
> A better system would perhaps connect to both ends and examine which
> specific WALs were being shipped and which one was last played, but
> there are no tools I know of that do that. I suspect the reason for
> this is that the pg_controldata check is "good enough". Certainly,
> that's what we are using for many clients via check_postgres, and
> it's been very good at detecting when the replica has problems. Good
> enough that I've never worried about writing a different method,
> anyway. :)
Thanks for the reply.
One simple check I added to my monitoring script, which isn't in the script here:
http://www.kennygorman.com/wordpress/?p=249
(or in check_postgres.pl, from a quick look at check_checkpoint()),
is a verification that the standby is actually in 'in archive
recovery' mode, as reported on the 'Database cluster state:' line of
the pg_controldata output.
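
For illustration, here is a minimal sketch of that kind of check in Python. It is not the code from my script or from check_postgres.pl; the function names are made up for this example, and it assumes pg_controldata is on the PATH and prints its labels in English (hence LC_ALL=C).

```python
import os
import re
import subprocess

# Matches the 'Database cluster state:' line of pg_controldata output.
STATE_RE = re.compile(
    r"^Database cluster state:[ \t]*(.*?)[ \t]*$", re.MULTILINE
)


def cluster_state(controldata_output):
    """Return the cluster state string, or None if the line is missing."""
    match = STATE_RE.search(controldata_output)
    return match.group(1) if match else None


def is_in_archive_recovery(controldata_output):
    """True only when the standby reports 'in archive recovery'."""
    return cluster_state(controldata_output) == "in archive recovery"


def read_controldata(data_dir, binary="pg_controldata"):
    """Run pg_controldata against data_dir and return its stdout.

    Hypothetical helper: the binary name and data directory are
    assumptions about the local install.
    """
    env = dict(os.environ, LC_ALL="C")  # keep the labels untranslated
    result = subprocess.run(
        [binary, data_dir],
        capture_output=True, text=True, check=True, env=env,
    )
    return result.stdout
```

A monitoring script could call is_in_archive_recovery(read_controldata(datadir)) and raise an alert whenever it returns False, for example if the standby has been brought up as a primary by accident.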
I was mulling over some ways to add in a reasonable check that the
standby was keeping up with the WAL stream. Comparing WAL file names
on master vs. standby would probably work, but I was also thinking
that a simple directory-size check on the standby's WAL archive
directory would show whether we were receiving WAL files faster than
we could process them.
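
A rough sketch of the directory-size idea, again in Python and only as an illustration. It assumes that segments which have already been replayed get removed from the standby's archive directory (for example by pg_standby's cleanup), so whatever is left is the backlog. The function names and the threshold of 10 segments are arbitrary choices for the example.

```python
import os
import re

# WAL segment file names are 24 uppercase hex digits; this skips
# .backup and .history files that may sit in the same directory.
WAL_SEGMENT_RE = re.compile(r"[0-9A-F]{24}")


def archive_backlog(archive_dir):
    """Return (segment_count, total_bytes) for WAL segments in archive_dir."""
    count = 0
    total_bytes = 0
    with os.scandir(archive_dir) as entries:
        for entry in entries:
            if entry.is_file() and WAL_SEGMENT_RE.fullmatch(entry.name):
                count += 1
                total_bytes += entry.stat().st_size
    return count, total_bytes


def backlog_too_large(archive_dir, max_segments=10):
    """True when more than max_segments WAL files are waiting.

    max_segments is a made-up default; tune it to the write volume.
    """
    count, _ = archive_backlog(archive_dir)
    return count > max_segments
```

Counting segments rather than bytes keeps the threshold independent of the segment size; the byte total is returned as well in case a size-based alert is preferred.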
Josh