From: | Mitchell Laks <mlaks(at)verizon(dot)net> |
---|---|
To: | "Pgsql-Admin (E-mail)" <pgsql-admin(at)postgresql(dot)org> |
Subject: | Severe Badness On My Server: psql: FATAL: the database system is starting up |
Date: | 2005-03-13 16:12:00 |
Message-ID: | 200503131112.01120.mlaks@verizon.net |
Lists: | pgsql-admin |
Dear Gurus:
My server and I have had a very bad weekend, starting Friday afternoon.
I am running Debian Sarge, PostgreSQL 7.4.6, with Linux kernel 2.6.8.
I am running a PostgreSQL-backed application on a remote server. The system
has a system drive, on which the PostgreSQL database runs, and a RAID 1
drive on which the application stores data.
Well, the RAID 1 failed (or is failing, or is trying its hardest to fail; it
is not clear yet...). This should not have affected the PostgreSQL database,
as it is safely on a separate drive.
However, when I logged onto the system, I found that I could not shut down
PostgreSQL. I logged in as postgres and ran pg_ctl stop; it printed .......
and then failed to stop (presumably because hanging client applications were
still logged on to the database).
So then I killed all the application clients (kill -9 on each of them) and
tried pg_ctl stop again, and it still would not stop.
So I looked at ps aux, and the client applications appeared to be in D
state:
wustl 18232 0.0 0.2 4872 1920 ? D Mar11 0:00 /usr/local/ctn/bi
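For reference, a D in the STAT column of ps means uninterruptible sleep,
typically a process blocked inside the kernel on I/O, which is why kill -9
appears to have no effect on it. A minimal sketch for picking such processes
out of ps aux output follows. The function name filter_d_state is made up for
illustration, and the sketch assumes the usual ps aux column layout, where
STAT is the eighth field.

```shell
# Keep the header line plus every process whose STAT field starts with "D"
# (uninterruptible sleep). Reads `ps aux`-style output on standard input.
# Assumption: STAT is the 8th whitespace-separated column, as in `ps aux`.
filter_d_state() {
    awk 'NR == 1 || $8 ~ /^D/'
}

# Typical use on a live system:
#   ps aux | filter_d_state
```

Reading from standard input rather than calling ps directly keeps the filter
usable on saved output as well.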
I then tried to reboot the system remotely, logging in as root and running
shutdown -r now and even shutdown -h now. Interestingly enough, the system
refused to shut down (I have never, ever seen this)!
I was floored! Well, what to do? I decided to sleep on it.
Well, I logged in again on Saturday night and the system was still hanging in
this bizarre state. I now saw queued shutdown requests in ps aux, and nothing
was happening fast.
I thought. I read a little. I tried pg_ctl stop -m fast. It did nothing. I
prayed. I tried pg_dump LTA_IDB >lta_idb.dump to dump the database in
question. It didn't do anything.
I was desperate, so I decided to try desperate measures and pulled out the
big gun: pg_ctl stop -m i.
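The three shutdown modes accepted by pg_ctl stop -m differ in how hard they
push, which may help in reading the behaviour described here. The sketch
below only prints a description of each mode and does not touch the server.
describe_stop_mode is a hypothetical helper name, and the signal mapping
reflects the PostgreSQL documentation as I understand it.

```shell
# Print a one-line description of a pg_ctl shutdown mode.
# pg_ctl accepts the full mode name or its first letter (s, f, i).
describe_stop_mode() {
    case "$1" in
        s|smart)
            echo "smart: wait for all clients to disconnect (SIGTERM)" ;;
        f|fast)
            echo "fast: roll back open transactions and disconnect clients (SIGINT)" ;;
        i|immediate)
            echo "immediate: abort without a clean shutdown; recovery runs on next start (SIGQUIT)" ;;
        *)
            echo "unknown mode: $1" >&2
            return 1 ;;
    esac
}

# Example:
#   describe_stop_mode i
```

Because an immediate shutdown skips the clean shutdown, the next start has to
run recovery before connections are accepted. That would be consistent with a
"starting up" message after a restart, although on a machine with stuck I/O
the recovery itself might not finish either.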
OK, so it stopped. Then I said, let me try to dump the database, and so I ran
pg_ctl start. It started:
postgres(at)A1:~$ pg_ctl status
pg_ctl: postmaster is running (PID: 21195)
Command line was:
/usr/lib/postgresql/bin/postmaster
Then I tried to dump the database and got a message along the lines of
FATAL: the database system is starting up. I waited a while and then tried
again. Same message. I then tried psql LTA_IDB as the user of the database
and got the same FATAL message about the database starting up.
Then I tried psql LTA_IDB again and got the same FATAL message.
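Rather than retrying by hand, the wait can be scripted. This is a minimal
sketch of a plain retry loop. retry_until_ok is a hypothetical helper, not a
PostgreSQL tool, and the attempt count and delay in the example are arbitrary.

```shell
# Run a command repeatedly until it succeeds or the attempts run out.
# $1 = maximum attempts, $2 = seconds to sleep between attempts,
# remaining arguments = the command to run.
retry_until_ok() {
    max=$1
    delay=$2
    shift 2
    attempt=1
    while [ "$attempt" -le "$max" ]; do
        if "$@"; then
            return 0
        fi
        # Sleep only if another attempt is coming.
        if [ "$attempt" -lt "$max" ]; then
            sleep "$delay"
        fi
        attempt=$((attempt + 1))
    done
    return 1
}

# Example: try a trivial query every 10 seconds, up to 30 times.
#   retry_until_ok 30 10 psql -c 'SELECT 1' LTA_IDB
```

If the loop runs out of attempts, the server is probably stuck rather than
merely slow to start.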
I waited. Then I ran pg_ctl stop (I don't know why I did it. Perversity, I
think.)
It then printed
................ and something about being unable to stop.
Then I ran
postgres(at)A1:~$ pg_dump LTA_IDB>lta_idb.dump
2005-03-13 10:56:33 [21481] LOG: connection received: host=[local] port=
2005-03-13 10:56:33 [21481] FATAL: the database system is shutting down
pg_dump: [archiver (db)] connection to database "LTA_IDB" failed: FATAL: the
dn
Now I checked pg_ctl status again:
postgres(at)A1:~$ pg_ctl status
pg_ctl: postmaster is running (PID: 21195)
Command line was:
/usr/lib/postgresql/bin/postmaster
OK I feel like I am in the twilight zone.
Next I did as root
cd /var/log
ls postg*
A1:/var/log# ls post*
postgres.log postgres.log.2.gz postgres.log.5.gz postgres.log.8.gz
postgres.log.1 postgres.log.3.gz postgres.log.6.gz postgres.log.9.gz
postgres.log.10.gz postgres.log.4.gz postgres.log.7.gz
A1:/var/log# less postgres.log
postgres.log: No such file or directory
WHAT????????
df -h
/dev/sda2 9.2G 2.8G 6.0G 32% /
tmpfs 443M 0 443M 0% /dev/shm
/dev/sda1 89M 11M 74M 13% /boot
/dev/sda3 7.4G 273M 6.7G 4% /home
/dev/sda8 11G 33M 9.9G 1% /mirror
/dev/sda7 449M 8.1M 417M 2% /tmp
/dev/sda6 7.4G 4.7G 2.4G 67% /var
/dev/md0 230G 139G 80G 64% /home/big0
I am in the twilight zone. My sanity is suspect. Any ideas on what to do next?
Pull the plug????
Mitchell