Streaming replica failure

From: Aleš Zelený <zeleny(dot)ales(at)gmail(dot)com>
To: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Streaming replica failure
Date: 2021-04-27 10:01:22
Message-ID: CAODqTUYjAdr-qHkWEgDWhtYf0Bkz__so8zxVfqU9bT44HzaOBw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hello,

we are using PostgreSQL 12.4 on CentOS 7. The hot standby failed:

2021-04-24 09:19:27 CEST [20956]: [747-1] user=,db=,host=,app= LOG:
recovery restart point at 3D8C/352B4CE8
2021-04-24 09:19:27 CEST [20956]: [748-1] user=,db=,host=,app= DETAIL:
Last completed transaction was at log time 2021-04-24 09:19:27.221313+02.
2021-04-24 09:20:57 CEST [20956]: [749-1] user=,db=,host=,app= LOG:
restartpoint starting: time
2021-04-24 09:24:27 CEST [20956]: [750-1] user=,db=,host=,app= LOG:
restartpoint complete: wrote 171233 buffers (13.4%); 0 WAL file(s) added,
68 removed, 0 recycled; write
=209.663 s, sync=0.012 s, total=209.963 s; sync files=283, longest=0.001 s,
average=0.000 s; distance=684762 kB, estimate=684762 kB
2021-04-24 09:24:27 CEST [20956]: [751-1] user=,db=,host=,app= LOG:
recovery restart point at 3D8C/5EF6B858
2021-04-24 09:24:27 CEST [20956]: [752-1] user=,db=,host=,app= DETAIL:
Last completed transaction was at log time 2021-04-24 09:24:27.288115+02.
2021-04-24 09:25:15 CEST [20955]: [11-1] user=,db=,host=,app= WARNING:
page 366603 of relation base/20955/10143636 is uninitialized
2021-04-24 09:25:15 CEST [20955]: [12-1] user=,db=,host=,app= CONTEXT: WAL
redo at 3D8C/D79F6500 for Heap2/VISIBLE: cutoff xid 3806260577 flags 0x01
2021-04-24 09:25:15 CEST [20955]: [13-1] user=,db=,host=,app= PANIC: WAL
contains references to invalid pages
2021-04-24 09:25:15 CEST [20955]: [14-1] user=,db=,host=,app= CONTEXT: WAL
redo at 3D8C/D79F6500 for Heap2/VISIBLE: cutoff xid 3806260577 flags 0x01
2021-04-24 09:25:16 CEST [20953]: [11-1] user=,db=,host=,app= LOG: startup
process (PID 20955) was terminated by signal 6: Neúspěšně ukončen (SIGABRT)
2021-04-24 09:25:16 CEST [20953]: [12-1] user=,db=,host=,app= LOG:
terminating any other active server processes

The relation base/20955/10143636 is a standard table.

Version details:
postgres=# select version();
version

---------------------------------------------------------------------------------------------------------
PostgreSQL 12.4 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5
20150623 (Red Hat 4.8.5-39), 64-bit
(1 row)

Settings:
postgres=# select name, setting, category from pg_settings where category =
'Write-Ahead Log / Settings';
name | setting | category
------------------------+-----------+----------------------------
commit_delay | 0 | Write-Ahead Log / Settings
commit_siblings | 5 | Write-Ahead Log / Settings
fsync | on | Write-Ahead Log / Settings
full_page_writes | on | Write-Ahead Log / Settings
synchronous_commit | on | Write-Ahead Log / Settings
wal_buffers | 2048 | Write-Ahead Log / Settings
wal_compression | off | Write-Ahead Log / Settings
wal_init_zero | on | Write-Ahead Log / Settings
wal_level | logical | Write-Ahead Log / Settings
wal_log_hints | off | Write-Ahead Log / Settings
wal_recycle | on | Write-Ahead Log / Settings
wal_sync_method | fdatasync | Write-Ahead Log / Settings
wal_writer_delay | 200 | Write-Ahead Log / Settings
wal_writer_flush_after | 128 | Write-Ahead Log / Settings
(14 rows)

pg_waldump output:
-bash-4.2$ /usr/pgsql-12/bin/pg_waldump 0000000100003D8C000000D7 2>&1 | tail
rmgr: Btree len (rec/tot): 64/ 64, tx: 3812802559, lsn:
3D8C/D7CF5A98, prev 3D8C/D7CF5A58, desc: INSERT_LEAF off 360, blkref #0:
rel 1663/20955/11280092 blk 19
509
rmgr: Transaction len (rec/tot): 46/ 46, tx: 3812802557, lsn:
3D8C/D7CF5AD8, prev 3D8C/D7CF5A98, desc: COMMIT 2021-04-24 09:25:16.160687
CEST
rmgr: Heap len (rec/tot): 159/ 159, tx: 3812802559, lsn:
3D8C/D7CF5B08, prev 3D8C/D7CF5AD8, desc: INSERT off 8 flags 0x08, blkref
#0: rel 1663/20955/11280066 bl
k 165603
rmgr: Btree len (rec/tot): 64/ 64, tx: 3812802559, lsn:
3D8C/D7CF5BA8, prev 3D8C/D7CF5B08, desc: INSERT_LEAF off 317, blkref #0:
rel 1663/20955/11280073 blk 15
340
rmgr: Btree len (rec/tot): 64/ 64, tx: 3812802559, lsn:
3D8C/D7CF5BE8, prev 3D8C/D7CF5BA8, desc: INSERT_LEAF off 130, blkref #0:
rel 1663/20955/11280091 blk 22
003
rmgr: Heap len (rec/tot): 80/ 80, tx: 3812802559, lsn:
3D8C/D7CF5C28, prev 3D8C/D7CF5BE8, desc: INSERT off 61 flags 0x08, blkref
#0: rel 1663/20955/11280082 b
lk 38917
rmgr: Btree len (rec/tot): 64/ 64, tx: 3812802559, lsn:
3D8C/D7CF5C78, prev 3D8C/D7CF5C28, desc: INSERT_LEAF off 73, blkref #0: rel
1663/20955/11280088 blk 421
25
rmgr: Btree len (rec/tot): 64/ 64, tx: 3812802559, lsn:
3D8C/D7CF5CB8, prev 3D8C/D7CF5C78, desc: INSERT_LEAF off 361, blkref #0:
rel 1663/20955/11280092 blk 19
509
rmgr: Transaction len (rec/tot): 46/ 46, tx: 3812802559, lsn:
3D8C/D7CF5CF8, prev 3D8C/D7CF5CB8, desc: COMMIT 2021-04-24 09:25:16.161036
CEST
pg_waldump: fatal: error in WAL record at 3D8C/D7CF5CF8: invalid record
length at 3D8C/D7CF5D28: wanted 24, got 0
-bash-4.2$

Emails like "BUG #13822: Slave terminated - WAL contains references to
invalid page" in the archive are quite old (2015).

While we can create a new replica, ios there something we can do to
diagnose the root cause of this failure?

Thanks for any hints or advice.

Kind regards Ales Zeleny

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Sanjay Minni 2021-04-27 10:32:49 Approach to creating users in Database
Previous Message Matteo Bonardi 2021-04-27 09:57:01 Re: Temporary files usage in explain