invalid record length at XX: wanted 24, got

From: Mariel Cherkassky <mariel(dot)cherkassky(at)gmail(dot)com>
To: pgsql-admin(at)lists(dot)postgresql(dot)org
Subject: invalid record length at XX: wanted 24, got
Date: 2019-08-20 06:43:52
Message-ID: CA+t6e1nJMKsA2_sUeUqRSNhds4utanCpKSqcH6C+U=M44CMF-A@mail.gmail.com
Lists: pgsql-admin

Hey,
I have 2 DB nodes (9.6) configured with streaming replication (+ repmgr).
Suddenly yesterday my secondary stopped syncing and I saw the following
error in the log:
invalid record length at X/YYYYY: wanted 24, got

In addition, since then, the secondary DB keeps restoring the same WAL file
(it seems stuck on restoring it).
I guess that the WAL file was missing some data or was corrupted, so I tried to
copy it from the primary, but it didn't help. In addition, I tried to start the
secondary in read-write mode, but it failed with the following error:
LOG: invalid primary checkpoint record
LOG: invalid secondary checkpoint record
PANIC: could not locate a valid checkpoint record
LOG: startup process (PID 17096) was terminated by signal 6: Aborted
LOG: aborting startup due to startup process failure
LOG: database system is shut down
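
To confirm which WAL segment the secondary is stuck on, the LSN from the
"invalid record length" message can be mapped to a segment file name. The
following is only a minimal sketch: it assumes the default 16 MB WAL segment
size and timeline 1 (neither is stated above), and the sample log line in the
usage note uses made-up values. The function names are hypothetical helpers,
not part of PostgreSQL.

```python
import re

# Assumption: default WAL segment size (16 MB); clusters built with a
# different size need a different value here.
WAL_SEGMENT_SIZE = 16 * 1024 * 1024

# Matches the error text, e.g. "invalid record length at 2/5A000028: wanted 24, got 0"
LOG_RE = re.compile(
    r"invalid record length at ([0-9A-Fa-f]+)/([0-9A-Fa-f]+): wanted (\d+), got (\d+)"
)


def parse_invalid_record(line):
    """Return (lsn, wanted, got) from a log line, or None if it does not match."""
    m = LOG_RE.search(line)
    if m is None:
        return None
    hi, lo, wanted, got = m.groups()
    lsn = (int(hi, 16) << 32) | int(lo, 16)  # LSN is printed as high/low 32-bit halves
    return lsn, int(wanted), int(got)


def wal_segment_name(lsn, timeline=1, seg_size=WAL_SEGMENT_SIZE):
    """Build the 24-hex-digit WAL file name that contains the given LSN."""
    segs_per_id = 0x100000000 // seg_size
    seg_no = lsn // seg_size
    return "%08X%08X%08X" % (timeline, seg_no // segs_per_id, seg_no % segs_per_id)
```

For example, parse_invalid_record("LOG: invalid record length at 2/5A000028: wanted 24, got 0")
yields the LSN, and wal_segment_name() on that LSN names the file to compare
between the primary and the secondary (for instance by checksum).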

My next idea is to use pg_resetxlog in order to start the secondary
successfully and then use pg_rewind to sync it again with the master. The
master is working perfectly and there aren't any issues on it. Right now,
I'm not interested in taking a base backup and recreating the secondary from
scratch.
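
Before trying pg_rewind, it may be worth checking its documented
preconditions on the secondary: the target must have been shut down cleanly,
and it must have either data checksums enabled or wal_log_hints = on. The
sketch below parses the text printed by pg_controldata and reports what looks
wrong. It is a rough illustration only: the helper names are hypothetical,
and treating every state that starts with "shut down" as acceptable is an
assumption that should be verified against the pg_rewind version in use.

```python
def parse_controldata(text):
    """Turn 'key: value' lines of pg_controldata output into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")  # split at the first colon only
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def rewind_problems(fields):
    """Return a list of reasons pg_rewind would likely refuse this data directory."""
    problems = []
    state = fields.get("Database cluster state", "")
    # Assumption: any "shut down ..." state counts as a clean shutdown.
    if not state.startswith("shut down"):
        problems.append("cluster state is %r, not a clean shutdown" % state)
    checksums_on = fields.get("Data page checksum version", "0") != "0"
    hints_on = fields.get("wal_log_hints setting", "off") == "on"
    if not (checksums_on or hints_on):
        problems.append("neither data checksums nor wal_log_hints is enabled")
    return problems
```

The input would be the output of running pg_controldata against the
secondary's data directory; an empty list means none of these particular
checks failed, not that pg_rewind is guaranteed to succeed.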

I would be happy to hear if you have any other ideas about why this might
have happened and how I can handle it.

Thanks
