From: | andy rost <Andy(dot)Rost(at)noaa(dot)gov> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: Unable to restart postgres - database system was |
Date: | 2006-12-05 20:14:03 |
Message-ID: | 4575D30B.4050400@noaa.gov |
Lists: | pgsql-general |
We perform daily PITR backups of the database. Part of this process is
to delete old archived WALs between backups (no need to keep archived
transaction logs that are older than the most recent full backup, or is
there?). Since we had no indication of a problem, and since the server
continued to run, the backup process ran on schedule. The problem, in
our case, is that when we restarted the server it asked for transaction
logs that preceded our most recent backup. And since they've been
deleted from our WAL archive directory ....
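The pruning rule described above could be sketched roughly as follows, in Python. This is only an illustration under stated assumptions: the archive directory also contains the `.backup` history files that the server archives when a base backup finishes (named after the WAL segment in which the backup started), and segment names on a single timeline sort in WAL order. The name `prunable_segments` is hypothetical. The function only reports candidates and deletes nothing; the situation in this thread suggests a server can ask for older segments in some failure cases, so the result is best treated as a list to review.

```python
import re

# Archived WAL segment: 24 hex digits (timeline + log + segment).
SEGMENT_RE = re.compile(r"^[0-9A-F]{24}$")
# Backup history file: <start segment>.<offset>.backup
BACKUP_RE = re.compile(r"^([0-9A-F]{24})\.[0-9A-F]{8}\.backup$")


def prunable_segments(archive_names):
    """Return segment names older than the newest base backup's start segment.

    Assumption: only segments on the same timeline as the newest backup
    are considered; everything else (other timelines, .history files,
    .backup files) is left alone.
    """
    starts = [m.group(1) for m in map(BACKUP_RE.match, archive_names) if m]
    if not starts:
        return []  # no backup history file found: report nothing as prunable
    newest = max(starts)   # equal-length hex names compare in WAL order
    timeline = newest[:8]  # first 8 hex digits are the timeline ID
    return sorted(
        name for name in archive_names
        if SEGMENT_RE.match(name) and name[:8] == timeline and name < newest
    )
```

A caller would pass in something like `os.listdir(archive_dir)` and review the returned names before removing any of them.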
I'm curious about a couple of things. Why didn't the logs reflect the
problem that it noticed when it tried to restart on 2006-12-04? (What I
mean by that is, Postgres thought the server had been interrupted on
2006-12-02 16:45, yet the logs for that date and time don't show that
anything unusual happened.) I suppose that it's possible that the nature
of the problem prevented Postgres from logging the problem.
Secondly, how did Postgres know at the restart that a) a problem had
occurred sometime in the past and b) a specific set of transaction logs
was required to get back up again? I'd like to incorporate that logic in
a) our server monitoring scripts, to look for problems that might not
make it to the Postgres logs, and b) our PITR backup process, to
constrain it (i.e., stop it from running until the problem can be
resolved - as near as I can tell our current backup is compromised).
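One possible starting point for such a monitoring check, sketched in Python: the `pg_controldata` utility prints the contents of pg_control, including a "Database cluster state" line and the latest checkpoint's REDO location. The sketch below only parses that text output. It assumes the output is a series of `label: value` lines, and the helper names `parse_controldata` and `looks_cleanly_shut_down` are hypothetical. It is not a description of how the server itself makes its decision at startup.

```python
def parse_controldata(text):
    """Turn pg_controldata's 'label: value' lines into a dict."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only, so timestamps in the value survive.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def looks_cleanly_shut_down(text):
    """True if the control file reports the cluster state as 'shut down'."""
    return parse_controldata(text).get("Database cluster state") == "shut down"
```

A monitoring or backup script could capture the output of `pg_controldata` run against the data directory, feed it to these helpers, and refuse to prune the WAL archive when a stopped server does not report "shut down" or when the "pg_control last modified" value looks unexpectedly old.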
But, back to your questions. Nothing was done to the database other than
a shutdown and restart. The age of the pg_control file coincided with
the time we stopped Postgres.
Tom Lane wrote:
> andy rost <Andy(dot)Rost(at)noaa(dot)gov> writes:
>> We stopped postgres using kill -TERM. When we tried to restart the
>> engine, it would not recover.
>
> Since you're apparently using archiving, you could pull the missing xlog
> files back from the archive, no? Either manually, or automatically by
> installing a recovery.conf file. I am kinda wondering what happened
> here though. Did you do anything to the database between the shutdown
> and the attempted restart? It looks like you have a pg_control file
> that is quite a bit older than it should be.
>
> regards, tom lane
--
--------------------------------------------------------------------------------
Andrew Rost
National Operational Hydrologic Remote Sensing Center (NOHRSC)
National Weather Service, NOAA
1735 Lake Dr. West, Chanhassen, MN 55317-8582
Voice: (952)361-6610 x 234
Fax: (952)361-6634
andy(dot)rost(at)noaa(dot)gov
http://www.nohrsc.noaa.gov
--------------------------------------------------------------------------------