From: | "Pete St(dot) Onge" <pete(at)seul(dot)org> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Hosed PostGreSQL Installation |
Date: | 2002-09-21 05:54:55 |
Message-ID: | 20020921015454.U31893@moria.seul.org |
Lists: | pgsql-hackers |
As a result of some disk errors on another drive, an admin in our group
brought down the server hosting our pgsql databases with a kill -KILL
after having gone to runlevel 1 and finding the postmaster process still
running. No surprise, our installation was hosed in the process.
After I spent about an hour on #postgresql working through the issue with
klamath (many thanks!), it was suggested that I send the info to this list.
Currently, PostgreSQL will no longer start and gives this error:
bash-2.05$ /usr/bin/pg_ctl -D $PGDATA -p /usr/bin/postmaster start
postmaster successfully started
bash-2.05$ DEBUG: database system shutdown was interrupted at 2002-09-19 22:59:54 EDT
DEBUG: open(logfile 0 seg 0) failed: No such file or directory
DEBUG: Invalid primary checkPoint record
DEBUG: open(logfile 0 seg 0) failed: No such file or directory
DEBUG: Invalid secondary checkPoint record
FATAL 2: Unable to locate a valid CheckPoint record
/usr/bin/postmaster: Startup proc 11735 exited with status 512 - abort
Our setup is vanilla Red Hat 7.2, having pretty much all of the
postgresql-*-7.1.3-2 packages installed. Klamath asked if I had disabled
fsync in postgresql.conf, and the only non-default (read: non-commented)
setting in the file is: `tcpip_socket = true`
Klamath suggested that I run pg_controldata:
bash-2.05$ ./pg_controldata
pg_control version number: 71
Catalog version number: 200101061
Database state: SHUTDOWNING
pg_control last modified: Thu Sep 19 22:59:54 2002
Current log file id: 0
Next log file segment: 1
Latest checkpoint location: 0/1739A0
Prior checkpoint location: 0/1718F0
Latest checkpoint's REDO location: 0/1739A0
Latest checkpoint's UNDO location: 0/0
Latest checkpoint's StartUpID: 21
Latest checkpoint's NextXID: 615
Latest checkpoint's NextOID: 18720
Time of latest checkpoint: Thu Sep 19 22:49:42 2002
Database block size: 8192
Blocks per segment of large relation: 131072
LC_COLLATE: en_US
LC_CTYPE: en_US
If I look into the pg_xlog directory, I see this:
sh-2.05$ cd pg_xlog/
bash-2.05$ ls -l
total 32808
-rw------- 1 postgres postgres 16777216 Sep 20 23:13 0000000000000002
-rw------- 1 postgres postgres 16777216 Sep 19 22:09 000000020000007E
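As a rough cross-check of the listing above against the pg_controldata output, the following Python sketch works out which segment file each checkpoint location should live in. It rests on two assumptions not stated in this message: that segment files in this release are named with the log file id followed by the segment number, each as 8 uppercase hex digits, and that a segment is 16777216 bytes (which at least matches the file sizes shown by ls). The helper name segment_for_location is made up for illustration.

```python
# Sketch only: map a checkpoint location from pg_controldata to the
# pg_xlog segment file that should contain it.
# ASSUMPTION: file name = log id and segment number, 8 hex digits each.
# ASSUMPTION: segment size = 16777216 bytes (matches the ls -l sizes).

SEGMENT_SIZE = 16777216


def segment_for_location(location):
    """Return (log_id, segment, file_name) for a location such as '0/1739A0'."""
    log_id_hex, offset_hex = location.split("/")
    log_id = int(log_id_hex, 16)
    offset = int(offset_hex, 16)
    segment = offset // SEGMENT_SIZE  # which 16 MB slice holds this offset
    file_name = "{:08X}{:08X}".format(log_id, segment)
    return log_id, segment, file_name


# File names copied from the ls -l output in pg_xlog.
present = {"0000000000000002", "000000020000007E"}

# Checkpoint locations copied from the pg_controldata output.
for label, location in [("latest", "0/1739A0"), ("prior", "0/1718F0")]:
    log_id, segment, file_name = segment_for_location(location)
    status = "present" if file_name in present else "missing"
    print("{} checkpoint {}: logfile {} seg {} -> {} ({})".format(
        label, location, log_id, segment, file_name, status))
```

Under those assumptions, both checkpoint locations fall in logfile 0 seg 0, which is the segment the startup messages fail to open, and neither of the two files in pg_xlog carries that name.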
There is one caveat. The installation resides on a partition of its own:
/dev/hda3 17259308 6531140 9851424 40% /var/lib/pgsql/data
fsck did not report errors for this partition at boot time after the
forced shutdown, however.
This installation serves a university research project, and although
most of the code / schemas are in development (and should be in CVS by
rights), I can't confirm that all projects have indeed done that. So any
advice, ideas or suggestions on how the data and / or schemas can be
recovered would be greatly appreciated.
Many thanks!
-- pete
P.S.: I've been using pgsql for about four years now, and it played a
big role during my grad work. In fact, the availability of pgsql was one
of the reasons why I was able to complete and graduate. Many thanks for
such a great database!
--
Pete St. Onge
Research Associate, Computational Biologist, UNIX Admin
Banting and Best Institute of Medical Research
Program in Bioinformatics and Proteomics
University of Toronto
http://www.utoronto.ca/emililab/ pete(at)seul(dot)org