From: | Harinath Kanchu <hkanchu(at)apple(dot)com> |
---|---|
To: | pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | LOG: invalid record length at <LSN> : wanted 24, got 0 |
Date: | 2023-03-01 05:21:12 |
Message-ID: | 47509690-AC33-4C8D-8566-D1B9BF662B34@apple.com |
Lists: | pgsql-hackers |
Hello,
We are seeing an interesting standby behavior that happens once every 3-4 days.
The standby suddenly disconnects from the primary and logs the error “LOG: invalid record length at <LSN>: wanted 24, got 0”.
It then tries to restore the WAL file from the archive. Because write activity on the primary is low, the WAL file is switched and archived only after 1 hour.
So the standby gets stuck in a loop, switching WAL sources between STREAM and ARCHIVE without replicating from the primary.
This causes a write outage, because the standby is a synchronous standby.
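As a rough illustration of how the loop described above might be spotted from the standby's log, here is a minimal Python sketch. It assumes the log lines contain the message exactly as quoted, with the LSN written as two hexadecimal numbers separated by a slash. The function names and the `min_repeats` threshold are hypothetical. A single occurrence of this message can also be harmless (it can appear when replay simply reaches the end of the available WAL), so the sketch only flags the same LSN being reported repeatedly.

```python
import re

# Matches the log message quoted above. The LSN format (hex/hex) is an assumption
# based on how PostgreSQL normally prints LSNs.
INVALID_RECORD_RE = re.compile(
    r"invalid record length at ([0-9A-Fa-f]+/[0-9A-Fa-f]+): wanted (\d+), got (\d+)"
)


def find_invalid_records(log_lines):
    """Return a list of (lsn, wanted, got) tuples, one per matching log line."""
    hits = []
    for line in log_lines:
        match = INVALID_RECORD_RE.search(line)
        if match:
            hits.append((match.group(1), int(match.group(2)), int(match.group(3))))
    return hits


def is_stuck_at_same_lsn(hits, min_repeats=3):
    """Heuristic: the same LSN reported min_repeats or more times suggests a retry loop."""
    counts = {}
    for lsn, _wanted, _got in hits:
        counts[lsn] = counts.get(lsn, 0) + 1
    return any(count >= min_repeats for count in counts.values())
```

The threshold is a judgment call: it should be high enough that a normal end-of-WAL message followed by a successful reconnect does not trigger it.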
We set “wal_sync_method” to “fsync”, on the assumption that the WAL file was not being flushed correctly.
But the problem still occurs with “fsync” in place of “fdatasync”.
Restarting the standby sometimes fixes the problem, but detecting it automatically is difficult: the standby's postgres process keeps running normally, while the WAL receiver process repeatedly starts and stops as the WAL source switches.
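One possible way to detect this automatically is to sample the WAL receiver's PID at a fixed interval (for example from the `pid` column of the `pg_stat_wal_receiver` view on the standby, which has no row while no WAL receiver is running) and count how often a new process appears. The sketch below covers only the counting logic. The function names, the sampling approach, and the `max_restarts` threshold are assumptions for illustration, not something we have deployed.

```python
def count_walreceiver_restarts(pid_samples):
    """Count how many times a new WAL receiver process replaced an earlier one.

    pid_samples: PIDs sampled over time, with None where no WAL receiver was running.
    """
    restarts = 0
    last_pid = None
    for pid in pid_samples:
        if pid is None:
            continue  # receiver was down at this sample; wait for the next PID
        if last_pid is not None and pid != last_pid:
            restarts += 1  # a different PID means the receiver was restarted
        last_pid = pid
    return restarts


def is_flapping(pid_samples, max_restarts=2):
    """Heuristic: more than max_restarts restarts within the sampled window."""
    return count_walreceiver_restarts(pid_samples) > max_restarts
```

For example, the samples `[100, 100, None, 101, None, 102, 103]` contain three restarts and would be flagged with the default threshold, while a steady `[100, 100, 100]` would not. A restart that begins and ends between two samples is missed, so the polling interval needs to be shorter than the typical retry cycle.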
How can we fix this? Any suggestions would be appreciated.
Postgres Version: 13.6
OS: RHEL Linux
Thank you,
Best,
Harinath.