Replication is stuck

From: Murthy Nunna <mnunna(at)fnal(dot)gov>
To: "pgsql-admin(at)postgresql(dot)org" <pgsql-admin(at)postgresql(dot)org>
Subject: Replication is stuck
Date: 2024-06-23 12:01:08
Message-ID: DM8PR09MB66777F78ADC2157B11C90368B8CB2@DM8PR09MB6677.namprd09.prod.outlook.com
Lists: pgsql-admin

I am running PostgreSQL 14.4. I use WAL replication to a standby server that is kept 7 days behind the primary (recovery_min_apply_delay = 7d).

My replication is stuck. It looks like the standby is repeatedly restoring the same WAL file. The next WAL files are present in the archive.

I restarted the cluster, but that didn't fix the issue.

I would appreciate any help you can provide before I rebuild the standby, since I am trying to find the root cause. If 0000000100013D94000000FF is corrupted, how can we tell?

2024-06-23 06:54:57 CDT []LOG: restored log file "0000000100013D94000000FF" from archive
2024-06-23 06:55:02 CDT []LOG: restored log file "0000000100013D94000000FF" from archive
2024-06-23 06:55:07 CDT []LOG: restored log file "0000000100013D94000000FF" from archive
2024-06-23 06:55:12 CDT []LOG: restored log file "0000000100013D94000000FF" from archive
2024-06-23 06:55:17 CDT []LOG: restored log file "0000000100013D94000000FF" from archive
2024-06-23 06:55:22 CDT []LOG: restored log file "0000000100013D94000000FF" from archive
2024-06-23 06:55:27 CDT []LOG: restored log file "0000000100013D94000000FF" from archive
2024-06-23 06:55:32 CDT []LOG: restored log file "0000000100013D94000000FF" from archive
2024-06-23 06:55:37 CDT []LOG: restored log file "0000000100013D94000000FF" from archive
2024-06-23 06:55:42 CDT []LOG: restored log file "0000000100013D94000000FF" from archive

There are no missing WALs:

ls -ltr 0000000100013D95000000* |more
-rw------- 1 postgres postgres 16777216 Jun 14 19:39 0000000100013D9500000000
-rw------- 1 postgres postgres 16777216 Jun 14 19:39 0000000100013D9500000001
-rw------- 1 postgres postgres 16777216 Jun 14 19:39 0000000100013D9500000002
-rw------- 1 postgres postgres 16777216 Jun 14 19:39 0000000100013D9500000003
-rw------- 1 postgres postgres 16777216 Jun 14 19:40 0000000100013D9500000004
-rw------- 1 postgres postgres 16777216 Jun 14 19:40 0000000100013D9500000005
-rw------- 1 postgres postgres 16777216 Jun 14 19:40 0000000100013D9500000006
-rw------- 1 postgres postgres 16777216 Jun 14 19:40 0000000100013D9500000007
-rw------- 1 postgres postgres 16777216 Jun 14 19:40 0000000100013D9500000008
-rw------- 1 postgres postgres 16777216 Jun 14 19:40 0000000100013D9500000009
-rw------- 1 postgres postgres 16777216 Jun 14 19:41 0000000100013D950000000A
-rw------- 1 postgres postgres 16777216 Jun 14 19:41 0000000100013D950000000B
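
The "no missing WALs" claim can be checked mechanically rather than by eye. The following is a minimal sketch, assuming the default 16 MB wal_segment_size (consistent with the 16777216-byte files listed above) and a single timeline. The helper names next_wal_name and find_missing are hypothetical and not part of any PostgreSQL tool. With 16 MB segments, the last eight hex digits run from 00000000 to 000000FF before the middle eight digits increment, so the segment after 0000000100013D94000000FF is 0000000100013D9500000000.

```python
# Assumption: default 16 MB WAL segments, so 256 segments per 4 GB "log" number.
WAL_SEGMENT_SIZE = 16 * 1024 * 1024
SEGMENTS_PER_LOG = 0x100000000 // WAL_SEGMENT_SIZE  # 256


def next_wal_name(name: str) -> str:
    """Return the WAL segment name that follows `name` on the same timeline."""
    if len(name) != 24:
        raise ValueError(f"not a 24-character WAL segment name: {name!r}")
    timeline = name[:8]            # timeline ID, left unchanged
    log = int(name[8:16], 16)      # high part of the segment number
    seg = int(name[16:24], 16)     # low part, 0x00..0xFF with 16 MB segments
    seg += 1
    if seg == SEGMENTS_PER_LOG:    # roll over into the next "log" number
        log += 1
        seg = 0
    return f"{timeline}{log:08X}{seg:08X}"


def find_missing(names):
    """List segment names absent between the first and last of `names`.

    Assumes all names are on one timeline and use uppercase hex, so plain
    string ordering matches segment ordering.
    """
    ordered = sorted(names)
    missing = []
    for prev, cur in zip(ordered, ordered[1:]):
        expected = next_wal_name(prev)
        while expected < cur:
            missing.append(expected)
            expected = next_wal_name(expected)
    return missing


if __name__ == "__main__":
    stuck = "0000000100013D94000000FF"
    print(next_wal_name(stuck))
```

Feeding find_missing the file names from the archive directory, for example from os.listdir filtered to 24-character names, returns an empty list when the sequence is contiguous. This only confirms that the files exist in order; it says nothing about whether their contents are valid.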
