Postgres 9.1 - getting a continous archiving database to accept connections taking too long

From: James Sebastian <james(dot)sebastian(at)gmail(dot)com>
To: pgsql-admin(at)postgresql(dot)org
Subject: Postgres 9.1 - getting a continous archiving database to accept connections taking too long
Date: 2015-08-06 10:44:14
Message-ID: CA+ehAmEhvBN7T==F_wqA8Q0QGrXE-Mn7BsEa2EQMs8Z42wnOQg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

I am very new to postgresql and have a continous archiving server to
recover the database in case of a hardware failure or pg crash. Following
http://www.postgresql.org/docs/9.1/static/continuous-archiving.html, I copy
the WAL archives to another machine and apply them regularly. Recently due
to pg server crash at my main server I tried to restore the database by
pushing a trigger (an empty file, expecting smart failover) and the logs
are given below.

prodrestore_error.log
======================

WAL file not present yet. Checking for trigger file...
trigger file found: smart failover
Trigger file: /app/recovery-prod/trigger/pgsql.trigger.5432
Waiting for WAL file: 0000000100000001000000CC
WAL file path:
/backup/prod/db_backup/archive/0000000100000001000000CC
Restoring to: pg_xlog/RECOVERYXLOG
Sleep interval: 60 seconds
Max wait interval: 0 forever
Command for restore: cp
"/backup/prod/db_backup/archive/0000000100000001000000CC"
"pg_xlog/RECOVERYXLOG"
Keep archive history: 000000000000000000000000 and later
trigger file found: smart failover
running restore: OK

Trigger file: /app/recovery-prod/trigger/pgsql.trigger.5432
Waiting for WAL file: 00000002.history
WAL file path: /backup/prod/db_backup/archive/00000002.history
Restoring to: pg_xlog/RECOVERYHISTORY
Sleep interval: 60 seconds
Max wait interval: 0 forever
Command for restore: cp "/backup/prod/db_backup/archive/00000002.history"
"pg_xlog/RECOVERYHISTORY"
Keep archive history: 000000000000000000000000 and later
running restore: cp: cannot stat
`/backup/prod/db_backup/archive/00000002.history': No such file or directory
cp: cannot stat `/backup/prod/db_backup/archive/00000002.history': No such
file or directory
cp: cannot stat `/backup/prod/db_backup/archive/00000002.history': No such
file or directory
cp: cannot stat `/backup/prod/db_backup/archive/00000002.history': No such
file or directory
not restored
history file not found
Trigger file: /app/recovery-prod/trigger/pgsql.trigger.5432
Waiting for WAL file: 00000001.history
WAL file path: /backup/prod/db_backup/archive/00000001.history
Restoring to: pg_xlog/RECOVERYHISTORY
Sleep interval: 60 seconds
Max wait interval: 0 forever
Command for restore: cp "/backup/prod/db_backup/archive/00000001.history"
"pg_xlog/RECOVERYHISTORY"
Keep archive history: 000000000000000000000000 and later
running restore: cp: cannot stat
`/backup/prod/db_backup/archive/00000001.history': No such file or directory
cp: cannot stat `/backup/prod/db_backup/archive/00000001.history': No such
file or directory
cp: cannot stat `/backup/prod/db_backup/archive/00000001.history': No such
file or directory
cp: cannot stat `/backup/prod/db_backup/archive/00000001.history': No such
file or directory
not restored
history file not found

postgresql-9.1-main.log
========================

2015-08-03 08:08:44 IST::@:[5542]:LOG: restored log file
"0000000100000001000000CC" from archive
2015-08-03 08:11:44 IST::@:[5542]:LOG: could not open file
"pg_xlog/0000000100000001000000CD" (log file 1, segment 205): No such file
or directory
2015-08-03 08:11:44 IST::@:[5542]:LOG: redo done at 1/CC001E10
2015-08-03 08:11:44 IST::@:[5542]:LOG: last completed transaction was at
log time 2015-08-03 07:59:58.121908+05:30
2015-08-03 08:11:44 IST::@:[5542]:LOG: restored log file
"0000000100000001000000CC" from archive
2015-08-03 08:17:44 IST::@:[5542]:LOG: selected new timeline ID: 2
2015-08-03 08:23:45 IST::@:[5542]:LOG: archive recovery complete
2015-08-03 08:23:51 IST::@:[6088]:LOG: autovacuum launcher started
2015-08-03 08:23:51 IST::@:[5541]:LOG: database system is ready to accept
connections

The whole process of looking for two history files 00000002.history and
00000001.history, 4 times and finally start the database took approximately
12 minutes.

1. How I can make this as fast to look for history files and if not present
proceed quickly to start database in read/write mode ? or not to look for
history files at all.

2. After some reading, I tested by keeping keyword 'fast' in the trigger
file.

I could see below in logs.

"WAL file not present yet. Checking for trigger file...
trigger file found: *fast failover*
Trigger file: /app/recovery-prod/trigger/pgsql.trigger.5432
Waiting for WAL file: 0000000100000001000000CF
WAL file path:
/backup/prod/db_backup/archive/0000000100000001000000CF
Restoring to: pg_xlog/RECOVERYXLOG
Sleep interval: 60 seconds
Max wait interval: 0 forever
Command for restore: cp
"/backup/prod/db_backup/archive/0000000100000001000000CF"
"pg_xlog/RECOVERYXLOG"
Keep archive history: 000000000000000000000000 and later
trigger file found:* smart failover*
running restore: OK

and the restore took same time to check the history files.
In this case, I needed almost 14-15 minutes as it looked for
00000003.history aprt from 00000002.history and 00000001.history, even
though I restarted the whole process by replacing the base backup afresh.

Am I doing something wrong here or is this was a fast failover expected to
work?

3. Each iteration, I did to see the behaviour by stopping the current
postgres server, deleting the previous base backup and restoring afresh
added search for one more non existing history file like 00000004.history
and 00000005.history, further slowing the process.

As I restore from the base backup each time, then push the trigger file
after applying all WAL archives after the base backup, I would like to
either ignore looking for history files altogether or control how many
history files are tried. Also like to speed up starting the database in
read write mode, so that I can point my application to this database and
continue the business. What can I do for that ?

Thanks for all the help.
Regards,
James

flatulencehttp://
www.postgresql.org/docs/9.1/static/continuous-archiving.html

Responses

Browse pgsql-admin by date

  From Date Subject
Next Message Glyn Astill 2015-08-06 10:52:59 Re: How to take backup of a table on a different schema in PostgreSQL
Previous Message Geo P.C. 2015-08-06 10:20:14 How to take backup of a table on a different schema in PostgreSQL