From: | Nigel Metheringham <nigel(dot)metheringham(at)dev(dot)intechnology(dot)co(dot)uk> |
---|---|
To: | pgsql-admin(at)postgresql(dot)org |
Subject: | Loss of table structure on 7.3.19 |
Date: | 2009-09-17 12:42:19 |
Message-ID: | BD3938DF-A126-45DA-BCE0-E5D226F092D6@dev.intechnology.co.uk |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-admin |
Hi,
I have a database thats been running in production use since 2006 on a
Centos 4.7 (originally an earlier 4 release, updated incrementally).
The pg version is somewhat ancient as we have stuck with the system
postgres - currently postgresql-7.4.19-1.el4_6.1.
Yesterday it all fell apart with all queries/updates into it having
issues. A check showed that many of the tables had lost their
definitions - for example the task_log table now consisted on a single
timestamp field rather than the selection of fields that would
normally be there.
Fortunately we have a recent backup and could restore the db with the
loss of only a couple of hours of data.
So now the post-mortum.
Suspiciously, in the pg_clog directory there were 2049 files (0000 to
0800) - the time the failure happened coincided with the modification
time on file 07FF. I see in the documentation (http://www.postgresql.org/docs/7.3/static/routine-vacuuming.html
) warnings regarding XID rollover at 1 billion - since it appears clog
datastructures are 2 bits per XID then 2048 files of 256KB would be
1,048,576 which is close enough to 1 billion for me, and the effects
are close to those described....
However we do have a regular vacuuming process - every day each table
is VACUUM ANALYZE-ed (as well as an index rebuild).
I've experimented this morning, and nothing I can do with the VACUUM
command (FULL, against a table or the whole db etc) appears to make
the number of CLOG files reduce. If I understand their function
correctly there is no reason for them to hang around and multiply
unless there are long-running open transactions (which there are not)
- and surely if these were just open transaction related then a
database shutdown/restart ought to scrub them.
So the questions are:-
- Is this plethora of (active) CLOG files likely to be reason for
the problem?
- Or am I barking up entirely the wrong tree?
- How can I avoid this in the future?
- Is there any other information I should be digging out (I don't
have much
in the way of postgres logs)?
I've also got a Centos 5.x box with a similar installation but
postgres version 8.1.11 on it. This also appears to have an
increasing number of pg_clog files, and a full vacuum is not stopping
this...
Cheers
Nigel.
--
[ Nigel Metheringham Nigel(dot)Metheringham(at)InTechnology(dot)com ]
[ - Comments in this message are my own and not ITO opinion/policy - ]
From | Date | Subject | |
---|---|---|---|
Next Message | Andre Rothe | 2009-09-17 13:36:46 | Wrong column default values |
Previous Message | brahma tiwari | 2009-09-17 10:04:43 | Re: PostgreSQL 8.3 |