From: | Klaus Ita <klaus(at)worstofall(dot)com> |
---|---|
To: | pgsql-general(at)postgresql(dot)org |
Subject: | Fwd: corrupted files |
Date: | 2013-07-29 22:37:27 |
Message-ID: | CAK9oVJwKKBPb7ejm9wF8NXFU3hbA=xVkYBqrAVasYKmqLbcGZQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-bugs pgsql-general |
Sorry for cross-posting, i read that pg-bug was not the right place for
this email
Hi list!
depressed me gets error messages like these:
2013-07-29 20:57:09 UTC <xaxos_mailer%xaxos_de> ERROR: could not access
status of transaction 8393477
2013-07-29 20:57:09 UTC <xaxos_mailer%xaxos_de> DETAIL: Could not open
file "pg_clog/0008": No such file or directory.
combined with the error output of queries that do not work.
I looked in pg_clog and correct, 0008 is missing.
On this linux machine on (3.2.0-4-amd64 #1 SMP Debian 3.2.46-1 x86_64
GNU/Linux) I am using xfs on raid1 on a megacli raid controller with 16
disks, no battery, this is why write through is enabled, no cacheing.
I quite extensively created indices in transactions and removed those
within these transactions to do fast deletes (foreign key constraints)
before i got the error???
Now it might be that the memory on the server is corrupt? dunno, but i
think it's the only 'cheap' part in the whole game.
* tried to get one of the warm standby's up but one complains about not
being the same pg cluster as the 'wal files'. the other hot standby won't
start for some locale reason.
(it's not that I did not have backups ;) ).
the cluster is 'working', i get the error around 1/sec but the other
clients seem fine, so it's really only a few tables that are corrupted. I
cannot really take down the machine as it's quite a busy few million
queries a day cluster.
before the current error, i got some error that XXXXX.1 was missing which
was (luckily) an index file that i could recreate via 'reindex', but i fear
we're now at a table / transaction corruption which i cannot just 'rewrite'.
I would not at all mind just discarding all those transactions that have
accumulated in pg_clog
postgres(at)pgmaster:~/9.1/main/pg_clog$ ls -alrt | wc -l
180
Is there any way, even with data loss to get rid of those transactions and
just let the cluster behave again? It's serving some web-apps for users so
some minor data loss will not be the issue.
quite desperate...
postgres(at)[local]:5432 [postgres] # select version();
version
----------------------------------------------------------------------------------------------
PostgreSQL 9.1.9 on x86_64-unknown-linux-gnu, compiled by gcc (Debian
4.7.2-5) 4.7.2, 64-bit
(1 row)
Customized options:
#------------------------------------------------------------------------------
# CUSTOMIZED OPTIONS
#------------------------------------------------------------------------------
#custom_variable_classes = '' # list of custom variable class
names
listen_addresses = '*' # what IP address(es) to listen on;
max_connections = 320 # (change requires restart)
timezone = 'Etc/UTC'
shared_buffers = 2GB # min 128kB
maintenance_work_mem = 250MB
checkpoint_completion_target = 0.9
effective_cache_size = 20GB
effective_io_concurrency = 6 # 1-1000. 0 disables prefetching
archive_mode = on
wal_level = 'hot_standby' #
http://www.postgresql.org/docs/9.1/static/runtime-config-wal.html#GUC-WAL-LEVEL
archive_command = '/opt/postgres_archive_command.pl --file_path=%p
--file_name=%f --work_dir=/var/tmp/ --destination_hosts=
va-pg-backups(at)dx(dot)ipv6(dot)ex(dot)net
--destination_sftp_hosts=u671(at)ipv6(dot)u71(dot)y--destination_hosts=
va-pg-backups(at)y7(dot)ipv6(dot)ex(dot)net'
max_wal_senders = 3 # max number of walsender processes
wal_keep_segments = 50 # in logfile segments, 16MB each; 0 disables
thx in advance,
klaus
From | Date | Subject | |
---|---|---|---|
Next Message | raghu ram | 2013-07-30 06:23:25 | Re: Fwd: corrupted files |
Previous Message | Klaus Ita | 2013-07-29 21:49:31 | Re: corrupted files |
From | Date | Subject | |
---|---|---|---|
Next Message | Amit Langote | 2013-07-30 00:29:22 | Re: [HACKERS] maintenance_work_mem and CREATE INDEX time |
Previous Message | Klaus Ita | 2013-07-29 21:49:31 | Re: corrupted files |