From: | "Dhaval Shah" <dhaval(dot)shah(dot)m(at)gmail(dot)com> |
---|---|
To: | pgsql-general(at)postgresql(dot)org |
Subject: | Errors during recovery of a postgres. Need some help understanding them... |
Date: | 2007-04-10 01:23:07 |
Message-ID: | 565237760704091823v1f5527d3w74f93c1c7fd3040e@mail.gmail.com |
Lists: | pgsql-general |
Here is the situation:
I have a standby postgres which is fed a WAL file every 2 minutes.
Whenever it is fed a WAL file, it logs the following:
---
LOG: restored log file "000000010000000000000070" from archive
pg_restore::copyWALFile: Moving
/opt/data/mirror/000000010000000000000071 to pg_xlog/RECOVERYXLOG
LOG: restored log file "000000010000000000000071" from archive
pg_restore::copyWALFile: Moving
/opt/data/mirror/000000010000000000000072 to pg_xlog/RECOVERYXLOG
LOG: restored log file "000000010000000000000072" from archive
...
...
pg_restore::copyWALFile: Moving
/opt/data/mirror/000000010000000000000082 to pg_xlog/RECOVERYXLOG
LOG: restored log file "000000010000000000000082" from archive
---
I assume the above shows a happy postgres in recovery mode. The
"copyWALFile" lines are my own messages in the serverlog.
After a while, the primary goes down and I am no longer able to pull
any WAL files from it. So I tell the standby that I do not have any
more WAL files to give.
---
LOG: could not open file "pg_xlog/000000010000000000000083" (log file
0, segment 131): No such file or directory
LOG: redo done at 0/8200D280
Main: Triggering recovery
PANIC: could not open file "pg_xlog/000000010000000000000082" (log
file 0, segment 130): No such file or directory
---
The issue above is that I do not have the "001...0083" file, so I
return a "file not found". Further, when postgres asks me for
"001...0082", I do not have that either, since in the intervening
minutes I have moved that file out of /opt/data/mirror to the
/opt/data/tape directory for long-term tape storage. So how do I make
my standby postgres happy?
Having run into that situation, the standby also spits out the following:
---
LOG: could not open file "pg_xlog/000000010000000000000082" (log file
0, segment 130): No such file or directory
LOG: invalid primary checkpoint record
LOG: could not open file "pg_xlog/000000010000000000000080" (log file
0, segment 128): No such file or directory
LOG: invalid secondary checkpoint record
---
What is happening is that postgres is looking back in time for
the "0001...0082" and "0001...0080" files.
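To line the file names up with the "log file 0, segment N" numbers in
the messages above, here is a small sketch in Python. It assumes the
usual layout of a WAL segment name, three 8-digit hex fields for
timeline, log file and segment. The function name parse_wal_name is
made up for illustration and is not part of postgres.

```python
def parse_wal_name(name):
    """Split a 24-hex-digit WAL segment file name into (timeline, log, segment)."""
    if len(name) != 24:
        raise ValueError("expected 24 hex digits, got %r" % name)
    timeline = int(name[0:8], 16)    # first 8 hex digits
    log = int(name[8:16], 16)        # middle 8 hex digits
    segment = int(name[16:24], 16)   # last 8 hex digits
    return timeline, log, segment


if __name__ == "__main__":
    # The three files named in the error messages above.
    for name in ("000000010000000000000080",
                 "000000010000000000000082",
                 "000000010000000000000083"):
        print(name, parse_wal_name(name))
```

With this reading, ...0080, ...0082 and ...0083 come out as segments
128, 130 and 131 of log file 0 on timeline 1, which matches the numbers
in the log lines.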
The question I have is: how far back in time does it look? I need to
know so that I am careful about when I move a WAL file out to tape.
Further, if I know how far back I have to keep my WAL files, then I can
devise an effective strategy for removing older files. That is, given
that I generate a WAL file every 2 minutes, do I keep 10 files or
120 files?
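To make the retention idea concrete, here is a rough sketch in Python
of the kind of cleanup I have in mind. It is not what my script does
today, and the names archive_old_segments and keep_last are made up.
It assumes all files are on one timeline, so that sorting the names
gives oldest first. At one file every 2 minutes, keep_last=10 covers
about 20 minutes and keep_last=120 about 4 hours. Which value is safe
is exactly what I do not know.

```python
import os
import shutil
import string


def is_wal_segment(name):
    """True if name looks like a 24-hex-digit WAL segment file name."""
    return len(name) == 24 and all(c in string.hexdigits for c in name)


def archive_old_segments(mirror_dir, tape_dir, keep_last):
    """Move all but the newest keep_last WAL segments to tape_dir.

    Returns the list of file names that were moved, oldest first.
    """
    # Fixed-width hex names on one timeline sort oldest-first.
    names = sorted(n for n in os.listdir(mirror_dir) if is_wal_segment(n))
    # names[:-0] would be empty, so handle keep_last == 0 explicitly.
    to_move = names[:-keep_last] if keep_last > 0 else names
    for n in to_move:
        shutil.move(os.path.join(mirror_dir, n), os.path.join(tape_dir, n))
    return to_move
```

For example, archive_old_segments("/opt/data/mirror", "/opt/data/tape", 10)
would leave the 10 newest segments in the mirror directory and move the
rest to the tape directory.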
Any insight on this will help.
Regards
Dhaval