From: | "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov> |
---|---|
To: | "Jeff Davis" <pgsql(at)j-davis(dot)com> |
Cc: | "Simon Riggs" <simon(at)2ndquadrant(dot)com>, "Brian Wipf" <brian(at)clickspace(dot)com>, <pgsql-general(at)postgresql(dot)org>, <pgsql-hackers(at)postgresql(dot)org>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Subject: | Re: [HACKERS] WAL archiving idle database |
Date: | 2007-10-29 14:56:37 |
Message-ID: | 4725AE55.EE98.0025.0@wicourts.gov |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general pgsql-hackers |
>>> On Fri, Oct 26, 2007 at 6:39 PM, in message
<1193441976(dot)7624(dot)128(dot)camel(at)dogma(dot)ljc(dot)laika(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>
wrote:
> On Fri, 2007-10-26 at 18:06 -0500, Kevin Grittner wrote:
>> Hmmm... We would actually prefer to get the WAL file at the
>> specified interval. We have software to ensure that the warm
>> standby instances are not getting stale, and that's pretty simple
>> with the current behavior.
>
> Another thought: when you say it's "pretty simple", what do you do now?
> My monitoring scripts for this particular situation employ some pretty
> ugly code.
Here's our script:
#! /bin/bash
if [ "$1" == "" ] ; then
savepwd=$PWD
cd /var/pgsql/data/county/
find * -maxdepth 0 -type d | xargs -idirname $0 dirname
cd $savepwd
exit 0
fi
for countyName ; do
echo County: $countyName
/usr/local/pgsql/bin/pg_controldata /var/pgsql/data/county/$countyName/data | grep -E '(Database cluster state|pg_control last modified)'
/etc/init.d/postgresql-${countyName}-cc status
grep basebackup /var/pgsql/data/county/$countyName/data/basebackup-of-this-instance
echo ''
done
Here's an example of running it (although the opcenter usually runs
it without a parameter, to get all counties):
opcenter(at)PGBACKUP:~> sudo pgstatus.sh iowa
County: iowa
Database cluster state: in archive recovery
pg_control last modified: Mon 29 Oct 2007 09:03:16 AM CDT
pg_ctl: server is running (PID: 15902)
/usr/local/pgsql-8.2.4/bin/postgres -D /var/pgsql/data/county/iowa/data
basebackup cc-2007-10-26_190001
This gets parsed by a script in our monitor (python, I think) and
winds up feeding a status display. It's probably a bit crude, but
it has worked well for us, with very little time required to get it
going. This thread has made me aware that it is dependent on the
checkpoint frequency as well as the archive frequency. Our
checkpoint_timeout setting is 30min and our archive_timeout is the
default (one hour). The monitor shows red if the cluster state
isn't "in archvie recovery" or pg_ctl doesn't report "server is
running" or the last modified is older than 75 minutes.
We are OK with a one hour archive interval because we have a
separate application-oriented transaction stream (independent of
the database product) coming back real-time, which we can replay
to "top off" a database backup.
-Kevin
From | Date | Subject | |
---|---|---|---|
Next Message | Kevin Grittner | 2007-10-29 15:54:53 | Re: [HACKERS] WAL archiving idle database |
Previous Message | Pit M. | 2007-10-29 14:44:05 | Problems with PostgreSQL DBI-Link / DBD-ODBC |
From | Date | Subject | |
---|---|---|---|
Next Message | Kevin Grittner | 2007-10-29 15:54:53 | Re: [HACKERS] WAL archiving idle database |
Previous Message | Andrew Dunstan | 2007-10-29 14:37:29 | Re: [HACKERS] grep command |