PostgreSQL 14.8 - server fails to start even though all transaction logs with base backup are restored

From: Meera Nair <mnair(at)commvault(dot)com>
To: "pgsql-general(at)lists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org>
Cc: Punit Pranesh Koujalgi <pkoujalgi(at)commvault(dot)com>
Subject: PostgreSQL 14.8 - server fails to start even though all transaction logs with base backup are restored
Date: 2023-08-08 07:39:48
Message-ID: BY3PR19MB5219A12E18AC5DDFA5F4F063BA0DA@BY3PR19MB5219.namprd19.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi Team,

We are following https://www.postgresql.org/docs/14/continuous-archiving.html#BACKUP-LOWLEVEL-BASE-BACKUP
Making An Exclusive Low-Level Backup.

After restoring, we see the server fails to start with below error:

PostgreSQL server log:
2023-08-04 16:47:47.227 IST [40582] LOG: starting PostgreSQL 14.8 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44), 64-bit
2023-08-04 16:47:47.228 IST [40582] LOG: listening on IPv4 address "0.0.0.0", port 5414
2023-08-04 16:47:47.228 IST [40582] LOG: listening on IPv6 address "::", port 5414
2023-08-04 16:47:47.231 IST [40582] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5414"
2023-08-04 16:47:47.236 IST [40582] LOG: listening on Unix socket "/tmp/.s.PGSQL.5414"
2023-08-04 16:47:47.243 IST [40584] LOG: database system was interrupted; last known up at 2023-08-04 14:48:29 IST
2023-08-04 16:47:49.254 IST [40584] LOG: starting archive recovery
cp: cannot stat '/var/lib/pgsql/14/wal/00000004.history': No such file or directory
2023-08-04 16:47:49.275 IST [40584] LOG: restored log file "000000040000000200000041" from archive
2023-08-04 16:47:49.290 IST [40584] LOG: redo starts at 2/41000028
2023-08-04 16:47:49.310 IST [40584] LOG: restored log file "000000040000000200000042" from archive
2023-08-04 16:47:49.344 IST [40584] LOG: restored log file "000000040000000200000043" from archive
2023-08-04 16:47:49.368 IST [40584] LOG: restored log file "000000040000000200000044" from archive
2023-08-04 16:47:49.395 IST [40584] LOG: restored log file "000000040000000200000045" from archive
2023-08-04 16:47:49.423 IST [40584] LOG: restored log file "000000040000000200000046" from archive
2023-08-04 16:47:49.460 IST [40584] LOG: restored log file "000000040000000200000047" from archive
2023-08-04 16:47:50.649 IST [40584] LOG: restored log file "000000040000000200000048" from archive
2023-08-04 16:47:51.219 IST [40584] LOG: restored log file "000000040000000200000049" from archive
cp: cannot stat '/var/lib/pgsql/14/wal/00000004000000020000004A': No such file or directory
2023-08-04 16:47:51.603 IST [40584] LOG: redo done at 2/49013500 system usage: CPU: user: 0.00 s, system: 0.19 s, elapsed: 2.31 s
2023-08-04 16:47:51.603 IST [40584] LOG: last completed transaction was at log time 2023-08-04 14:58:51.021414+05:30
2023-08-04 16:47:51.628 IST [40584] LOG: restored log file "000000040000000200000049" from archive
2023-08-04 16:47:51.973 IST [40584] FATAL: WAL ends before end of online backup
2023-08-04 16:47:51.973 IST [40584] HINT: Online backup started with pg_start_backup() must be ended with pg_stop_backup(), and all WAL up to that point must be available at recovery.
2023-08-04 16:47:51.975 IST [40582] LOG: startup process (PID 40584) exited with exit code 1
2023-08-04 16:47:51.975 IST [40582] LOG: terminating any other active server processes
2023-08-04 16:47:51.975 IST [40582] LOG: shutting down due to startup process failure
2023-08-04 16:47:51.977 IST [40582] LOG: database system is shut down

Backup_label from data directory:

[root(at)gkstandby2 data]# cat backup_label
START WAL LOCATION: 2/41000028 (file 000000040000000200000041)
CHECKPOINT LOCATION: 2/41000060
BACKUP METHOD: pg_start_backup
BACKUP FROM: primary
START TIME: 2023-08-04 14:48:29 IST
LABEL: pgida_backup_5414_108625_1691140709
START TIMELINE: 4
[root(at)gkstandby2 data]#

Archive log directory:

[root(at)gkstandby2 wal]# ls
00000002000000020000003B 00000002000000020000003E.00000028.backup 000000040000000200000040.00000028.backup 000000040000000200000044 000000040000000200000048
00000002000000020000003C 00000002.history 000000040000000200000041 000000040000000200000045 000000040000000200000049
00000002000000020000003C.00000028.backup 0000000300000005000000E4 000000040000000200000042 000000040000000200000045.00005AE0.backup
00000002000000020000003D 00000004000000020000003F 000000040000000200000043 000000040000000200000046
00000002000000020000003E 000000040000000200000040 000000040000000200000043.00000028.backup 000000040000000200000047

[root(at)gkstandby2 wal]# /usr/pgsql-14/bin/pg_waldump 000000040000000200000046 | grep BACKUP
rmgr: XLOG len (rec/tot): 34/ 34, tx: 0, lsn: 2/46000110, prev 2/46000098, desc: BACKUP_END 2/45005AE0
[root(at)gkstandby2 wal]#

We are restoring all the transaction logs required for base backup to be consistent.
BACKUP_END is present in the 000000040000000200000046 segment. Why did recovery fail to find this?

Regards,
Meera

Responses

Browse pgsql-general by date

  From Date Subject
Next Message KK CHN 2023-08-08 09:33:48 Re: My 1st TABLESPACE
Previous Message H 2023-08-08 00:59:29 Re: Dropping all tables in a database