Quick Links

Loss of table structure on 7.3.19

From:	Nigel Metheringham <nigel(dot)metheringham(at)dev(dot)intechnology(dot)co(dot)uk>
To:	pgsql-admin(at)postgresql(dot)org
Subject:	Loss of table structure on 7.3.19
Date:	2009-09-17 12:42:19
Message-ID:	BD3938DF-A126-45DA-BCE0-E5D226F092D6@dev.intechnology.co.uk
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-admin

Hi,

I have a database thats been running in production use since 2006 on a
Centos 4.7 (originally an earlier 4 release, updated incrementally).
The pg version is somewhat ancient as we have stuck with the system
postgres - currently postgresql-7.4.19-1.el4_6.1.

Yesterday it all fell apart with all queries/updates into it having
issues. A check showed that many of the tables had lost their
definitions - for example the task_log table now consisted on a single
timestamp field rather than the selection of fields that would
normally be there.

Fortunately we have a recent backup and could restore the db with the
loss of only a couple of hours of data.

So now the post-mortum.

Suspiciously, in the pg_clog directory there were 2049 files (0000 to
0800) - the time the failure happened coincided with the modification
time on file 07FF. I see in the documentation (http://www.postgresql.org/docs/7.3/static/routine-vacuuming.html
) warnings regarding XID rollover at 1 billion - since it appears clog
datastructures are 2 bits per XID then 2048 files of 256KB would be
1,048,576 which is close enough to 1 billion for me, and the effects
are close to those described....

However we do have a regular vacuuming process - every day each table
is VACUUM ANALYZE-ed (as well as an index rebuild).

I've experimented this morning, and nothing I can do with the VACUUM
command (FULL, against a table or the whole db etc) appears to make
the number of CLOG files reduce. If I understand their function
correctly there is no reason for them to hang around and multiply
unless there are long-running open transactions (which there are not)
- and surely if these were just open transaction related then a
database shutdown/restart ought to scrub them.

So the questions are:-
- Is this plethora of (active) CLOG files likely to be reason for
the problem?
- Or am I barking up entirely the wrong tree?
- How can I avoid this in the future?
- Is there any other information I should be digging out (I don't
have much
in the way of postgres logs)?

I've also got a Centos 5.x box with a similar installation but
postgres version 8.1.11 on it. This also appears to have an
increasing number of pg_clog files, and a full vacuum is not stopping
this...

Cheers
Nigel.

--
[ Nigel Metheringham Nigel(dot)Metheringham(at)InTechnology(dot)com ]
[ - Comments in this message are my own and not ITO opinion/policy - ]

Responses

Re: Loss of table structure on 7.3.19 at 2009-09-17 13:57:08 from Tom Lane

Browse pgsql-admin by date

	From	Date	Subject
Next Message	Andre Rothe	2009-09-17 13:36:46	Wrong column default values
Previous Message	brahma tiwari	2009-09-17 10:04:43	Re: PostgreSQL 8.3