Dennis Thrysøe <dth(at)geysirit(dot)dk> wrote:
> 1) The master keeps writing WAL files even though I'm quite sure
> nothing is happening. This seems like a large waste of diskspace?
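What is your setting for archive_timeout? It forces a switch to a
new WAL file after that many seconds, so a full-size file can be
archived on that schedule even when very little is happening. You
could increase the value, although that means that after a failure
the standby might not be as up to date.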
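To put a rough number on the disk usage, here is a small Python
sketch. It assumes the default 16 MB segment size and the worst case
of one forced switch per archive_timeout interval; the function name
is made up for illustration, and an idle server may well switch less
often than this.

```python
WAL_SEGMENT_BYTES = 16 * 1024 * 1024  # default WAL segment size (assumed)
SECONDS_PER_DAY = 24 * 60 * 60

def max_archived_bytes_per_day(archive_timeout_s: int) -> int:
    """Upper bound on uncompressed WAL archived per day.

    Assumes exactly one full-size segment per archive_timeout interval.
    """
    if archive_timeout_s <= 0:
        raise ValueError("archive_timeout must be positive for this estimate")
    segments_per_day = SECONDS_PER_DAY // archive_timeout_s
    return segments_per_day * WAL_SEGMENT_BYTES

# Compare a few candidate settings (seconds).
for timeout in (60, 300, 3600):
    gib = max_archived_bytes_per_day(timeout) / 2**30
    print(f"archive_timeout={timeout}s: up to {gib:.1f} GiB/day")
```
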
Another option is to use pg_clearxlogtail with gzip or use
pglesslog. I haven't used pglesslog, but piping an empty WAL file
through pg_clearxlogtail and gzip reduces it to about 16 kB (rather
than 16 MB).
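The 16 kB figure is about what you would expect from gzip on a file
that is almost entirely zeros. A quick Python sketch to check it,
approximating a cleared WAL file as 16 MB of zero bytes (a real file
from pg_clearxlogtail still has some data at the start, so treat the
result as a ballpark):

```python
import gzip

WAL_SEGMENT_BYTES = 16 * 1024 * 1024  # default WAL segment size (assumed)

def compressed_size(data: bytes) -> int:
    """Return the size in bytes of data after gzip compression."""
    return len(gzip.compress(data))

# Stand-in for a cleared segment: nothing but zero bytes.
zeroed_segment = bytes(WAL_SEGMENT_BYTES)
size = compressed_size(zeroed_segment)
print(f"{WAL_SEGMENT_BYTES} bytes -> {size} bytes after gzip")
```
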
> 2) Sometimes my slave does not read and delete WAL files when in
> recovery mode. This will eventually fill up the disk.
Sorry I can't help with that one -- we use our own scripts rather
than pg_standby. Anyone else recognize this issue?
> pg_controldata gives me:
>
> Minimum recovery ending location: 0/0
>
> What does that mean?
I think that only has meaning when the cluster is in archive
recovery status. What does pg_controldata say the "Database cluster
state" is when you see this?
> Is there any good ways of troubleshooting the behaviour, and
> finding out precisely what state the slave is in, etc.?
We use pg_controldata and check "Database cluster state" to ensure
that our warm standbys are still "in archive recovery". We also
check that the age of "Time of latest checkpoint" is not much beyond
our archive_timeout setting, which tells us that the data is indeed
still flowing.
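A minimal Python sketch of that kind of check follows. It is not our
actual script. The function names and the max_age threshold are made
up for illustration, and the timestamp format is an assumption:
pg_controldata prints the checkpoint time in a locale-dependent
format, so time_format may need adjusting on your system.

```python
import subprocess
from datetime import datetime, timedelta

def parse_controldata(text):
    """Turn 'Label:   value' lines from pg_controldata into a dict."""
    fields = {}
    for line in text.splitlines():
        label, sep, value = line.partition(":")
        if sep:
            fields[label.strip()] = value.strip()
    return fields

def standby_problems(fields, now, max_age,
                     time_format="%a %b %d %H:%M:%S %Y"):
    """Return a list of problems; an empty list means the standby looks OK.

    time_format is an assumption (C-locale style timestamp).
    """
    problems = []
    state = fields.get("Database cluster state")
    if state != "in archive recovery":
        problems.append(f"unexpected cluster state: {state!r}")
    raw = fields.get("Time of latest checkpoint")
    try:
        checkpoint = datetime.strptime(raw or "", time_format)
    except ValueError:
        problems.append(f"cannot parse checkpoint time: {raw!r}")
    else:
        if now - checkpoint > max_age:
            problems.append(f"latest checkpoint is too old: {raw}")
    return problems

def check_data_dir(data_dir, max_age):
    """Run pg_controldata on data_dir and check its output."""
    result = subprocess.run(["pg_controldata", data_dir],
                            capture_output=True, text=True, check=True)
    fields = parse_controldata(result.stdout)
    return standby_problems(fields, datetime.now(), max_age)
```

You would call check_data_dir() from cron with the standby's data
directory and a max_age somewhat larger than archive_timeout, and
raise an alert whenever the returned list is not empty.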
I believe that 9.0 will be adding some nicer ways to check such
things.
-Kevin