"invalid record length" after restoring pg_basebackup

From: Dennis Jacobfeuerborn <dennisml(at)conversis(dot)de>
To: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: "invalid record length" after restoring pg_basebackup
Date: 2020-11-13 12:52:10
Message-ID: c729cdd1-73c3-73f8-9f40-dd0fc84efb29@conversis.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi,
I've run into a strange issue after restoring a backup that I created
using pg_basebackup on a standby instance. The command I use to create
the backup is this:

pg_basebackup -v --write-recovery-conf -h$BACKUP_HOST -p5432
-U$BACKUP_USER --format tar --wal-method stream --compress=2 -D
"$BACKUP_DIR"

This backup runs fine and produces a "base.tar.gz" and "pg_wal.tar.gz"
file. The server version is 11.7.

To restore the backup on another system I unpack the "base.tar.gz" file
into the data directory and "pg_wal.tar.gz" into the "pg_wal"
sub-directory. I then comment out the "primary_conninfo" directive in
the "recovery.conf" file since I don't want to replicate from the
primary on this system I just want to restore the state from the backup.

All of this works fine and the logs report that the db reaches a
consistent recovery state but as last entry it reports an "invalid
record length":

2020-11-13 12:25:34.266 UTC [1] LOG: listening on IPv4 address
"0.0.0.0", port 5432
2020-11-13 12:25:34.266 UTC [1] LOG: listening on IPv6 address "::",
port 5432
2020-11-13 12:25:34.269 UTC [1] LOG: listening on Unix socket
"/var/run/postgresql/.s.PGSQL.5432"
2020-11-13 12:25:34.437 UTC [20] LOG: database system was interrupted
while in recovery at log time 2020-11-03 14:36:00 UTC
2020-11-13 12:25:34.437 UTC [20] HINT: If this has occurred more than
once some data might be corrupted and you might need to choose an
earlier recovery target.
2020-11-13 12:25:35.359 UTC [20] WARNING: recovery command file
"recovery.conf" specified neither primary_conninfo nor restore_command
2020-11-13 12:25:35.359 UTC [20] HINT: The database server will
regularly poll the pg_wal subdirectory to check for files placed there.
2020-11-13 12:25:35.359 UTC [20] LOG: entering standby mode
2020-11-13 12:25:35.372 UTC [20] LOG: redo starts at 932/9D123E70
2020-11-13 12:26:45.435 UTC [20] LOG: consistent recovery state reached
at 933/AA63F600
2020-11-13 12:26:45.435 UTC [1] LOG: database system is ready to accept
read only connections
2020-11-13 12:26:45.444 UTC [20] LOG: invalid record length at
933/AA6DA660: wanted 24, got 0

Looking at the WAL data the last file is "0000000100000933000000AA" and
there is a file "archive_status/0000000100000933000000A9.done" but no
corresponding ".done" file for the "...AA" WAL file.

Running pg_waldump on the "...AA" file shows the following at the end:

$ pg_waldump 0000000100000933000000AA
...
rmgr: Heap len (rec/tot): 54/ 54, tx: 1743220, lsn:
933/AA6DA510, prev 933/AA6D8F08, desc: LOCK off 12: xid 1743220: flags 0
LOCK_ONLY KEYSHR_LOCK , blkref #0: rel 1663/13117/874105 blk 0
rmgr: Heap len (rec/tot): 81/ 81, tx: 1743220, lsn:
933/AA6DA548, prev 933/AA6DA510, desc: INSERT off 19, blkref #0: rel
1663/13117/868296 blk 0
rmgr: Btree len (rec/tot): 80/ 80, tx: 1743220, lsn:
933/AA6DA5A0, prev 933/AA6DA548, desc: INSERT_LEAF off 24, blkref #0:
rel 1663/13117/868299 blk 1
rmgr: Btree len (rec/tot): 72/ 72, tx: 1743220, lsn:
933/AA6DA5F0, prev 933/AA6DA5A0, desc: INSERT_LEAF off 23, blkref #0:
rel 1663/13117/915254 blk 2
rmgr: Transaction len (rec/tot): 34/ 34, tx: 1743220, lsn:
933/AA6DA638, prev 933/AA6DA5F0, desc: COMMIT 2020-11-03 14:49:29.205562 UTC
pg_waldump: FATAL: error in WAL record at 933/AA6DA638: invalid record
length at 933/AA6DA660: wanted 24, got 0

It looks like the WAL in the backup is inconsistent at the end but I'd
expect pg_basebackup to create a clean backup. Is this something I
should expect or is this a problem with pg_basebackup?

Regards,
Dennis

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Radoslav Nedyalkov 2020-11-13 13:24:14 conflict with recovery when delay is gone
Previous Message Magnus Hagander 2020-11-13 10:13:13 Re: Failed Login Attempts in PostgreSQL