Pg_xlog increase due to postgres crash (disk full)

From: Cliff de Carteret <cliffdecarteret(at)googlemail(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Pg_xlog increase due to postgres crash (disk full)
Date: 2013-01-22 11:12:54
Message-ID: CAC+bnxkjsPHs6np68mVOfsVW8MApEa731w3ZjV6pWXHbSJqTCg@mail.gmail.com
Lists: pgsql-general

My database crashed a couple of days ago during an upgrade several seconds
after committing a large transaction to the database. Eventually we found
out that this was due to the disk being full as the transaction had created
several gigs of data. A day or so later the disk was full again and
PostgreSQL crashed because the pg_xlog directory had taken up all of the
disk space. I have cleaned up the drive to free some extra space, which
allows PostgreSQL to start again, but the xlogs are still increasing. I have
two errors in my pg_log:

"WARNING: transaction log file "00000001000000A800000078" could not be
archived: too many failures" and

"LOG: archive command failed with exit code 1
DETAIL: The failed archive command was: test ! -f
/opt/postgres/remote_pgsql/wal_archive/00000001000000A800000078 && cp
pg_xlog/00000001000000A800000078
/opt/postgres/remote_pgsql/wal_archive/00000001000000A800000078"
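
To narrow down which half of that command returns exit code 1, something like the following could be run by hand as the postgres user. This is only a rough sketch: the function name diagnose_archive and its output labels are made up for illustration, and it only checks the obvious preconditions ("test ! -f" fails when the target file already exists in the archive, "cp" fails when the source is unreadable or the archive directory is not writable). A cp that fails for another reason, such as the archive volume being full, would not be caught here.

```shell
#!/bin/sh
# Sketch: report which precondition of
#   test ! -f DEST && cp SRC DEST
# is not met. Hypothetical helper, not part of PostgreSQL.
# Usage: diagnose_archive SEGMENT XLOG_DIR ARCHIVE_DIR
diagnose_archive() {
    seg=$1
    src=$2/$seg
    dest=$3/$seg

    if [ -f "$dest" ]; then
        # "test ! -f" exits 1 when the target already exists, for
        # example a partial copy left behind when the disk filled up.
        echo "exists-in-archive"
        return 1
    fi
    if [ ! -d "$3" ] || [ ! -w "$3" ]; then
        # cp cannot create the target file here.
        echo "archive-not-writable"
        return 1
    fi
    if [ ! -r "$src" ]; then
        # cp cannot read the WAL segment.
        echo "segment-not-readable"
        return 1
    fi
    echo "ok"
    return 0
}
```

With the paths from the log above, and run from the data directory, the call would be: diagnose_archive 00000001000000A800000078 pg_xlog /opt/postgres/remote_pgsql/wal_archive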

Postgres version 9.0.3 conf:

wal_level = hot_standby
archive_mode = true
archive_command = 'test ! -f /opt/postgres/remote_pgsql/wal_archive/%f && cp %p /opt/postgres/remote_pgsql/wal_archive/%f' # command to use to archive a logfile segment
archive_timeout = 1800
max_wal_senders = 1
max_standby_archive_delay = 900s
max_standby_streaming_delay = 900s
default_statistics_target = 50 # pgtune wizard 2010-11-18
maintenance_work_mem = 480MB # pgtune wizard 2010-11-18
constraint_exclusion = on # pgtune wizard 2010-11-18
checkpoint_completion_target = 0.9 # pgtune wizard 2010-11-18
effective_cache_size = 5632MB # pgtune wizard 2010-11-18
work_mem = 48MB # pgtune wizard 2010-11-18
wal_buffers = 8MB # pgtune wizard 2010-11-18
checkpoint_segments = 16 # pgtune wizard 2010-11-18
shared_buffers = 1920MB # pgtune wizard 2010-11-18
max_connections = 80 # pgtune wizard 2010-11-18

I've tried stopping postgres and then deleting the
00000001000000A800000078.ready file and starting postgres but this appears
to be recreated instantly and the error is still in the log file.
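
To see how many segments are queued behind the failing one, the .ready files can be listed. A minimal sketch, assuming the status files live in pg_xlog/archive_status under the data directory (the standard location in 9.0); the function name pending_segments is made up for illustration:

```shell
#!/bin/sh
# Sketch: print the names of WAL segments still waiting to be archived,
# i.e. those that have a .ready status file. Hypothetical helper.
# Usage: pending_segments STATUS_DIR
pending_segments() {
    status_dir=$1
    for f in "$status_dir"/*.ready; do
        # If nothing matches, the glob stays literal; skip it.
        [ -e "$f" ] || continue
        basename "$f" .ready
    done
}
```

Run from the data directory as: pending_segments pg_xlog/archive_status | wc -l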

I've read about the pg_resetxlog command, but having to pg_dump our db,
which holds a large amount of blobs, and then restore it again is highly
problematic, as pg_restore has struggled with the restore.

Will setting zero_damaged_pages (true) work in 9.0.1 and would this resolve
the issue?

Would creating an empty file and replacing the offending xlog with it work?
Would the file need to be a specific size?

Any ideas?
