Re: recovering from "too many failures" wal error

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: CS DBA <cs_dba(at)consistentstate(dot)com>
Cc: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: recovering from "too many failures" wal error
Date: 2014-12-01 18:08:09
Message-ID: 20141201180809.GB2456@alap3.anarazel.de
Lists: pgsql-general

On 2014-11-29 14:37:56 -0700, CS DBA wrote:
> All;
>
> We have a PostgreSQL 9.2 cluster set up to do continuous WAL archiving. We
> were archiving to a mount point that went offline. As a result the db could
> not archive the WAL files, and we ended up with many errors in the logs
> indicating the file could not be archived:
>
> WARNING: transaction log file "0000000100000FB100000050" could not be
> archived: too many failures
>
> So we caught the issue before the file system filled up, fixed the mount
> point and I see WAL files being added to the target WAL archive directory.
> However the pg_xlog directory does not seem to be shrinking; there are
> currently 27,546 files in the pg_xlog directory and that number has not
> changed in some time (since we fixed the mount point).
>
> I assume the db will at some point remove the backed up files in the pg_xlog
> dir, is this true? or do I need to intervene?

The archiver will retry archiving either once a timeout has passed
(60s?) or as soon as a new file is ready to be archived. So there should
be no need to intervene after fixing archiving. Are files being archived
again? Specifically the ones that previously failed?

WAL files are only removed around checkpoints - you could force one by
manually issuing a 'CHECKPOINT;' statement.
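
One way to answer the "are files being archived again?" question is to
look at the marker files in pg_xlog/archive_status: a segment has a
".ready" marker while it is waiting to be archived and a ".done" marker
once it has been archived. Below is a rough sketch in Python that counts
both. The function name and the example path are hypothetical, not
anything PostgreSQL ships; adjust the path to your own data directory.

```python
import os
import sys
from collections import Counter


def archive_status_counts(status_dir):
    """Count marker files by archive state in an archive_status directory.

    Files ending in .ready are still waiting for the archiver; files
    ending in .done have been archived. Anything else is ignored.
    """
    counts = Counter(ready=0, done=0)
    for name in os.listdir(status_dir):
        ext = os.path.splitext(name)[1]
        if ext == ".ready":
            counts["ready"] += 1
        elif ext == ".done":
            counts["done"] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical usage:
    #   python archive_status.py /path/to/data/pg_xlog/archive_status
    result = archive_status_counts(sys.argv[1])
    print("waiting to be archived:", result["ready"])
    print("already archived:", result["done"])
```

If the ".ready" count drops between runs, the archiver is catching up.
The pg_xlog file count itself should only go down after a later
checkpoint has removed or recycled the archived segments.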

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services