Re: FATAL: could not send data to WAL stream: lost synchronization with server: got message type "0", length 892351284

From: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
To: Дмитрий <dsolik(at)mail(dot)ru>
Cc: pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: FATAL: could not send data to WAL stream: lost synchronization with server: got message type "0", length 892351284
Date: 2025-01-25 17:32:27
Message-ID: 108b4789-190e-4b1d-a49b-d15215074351@aklaver.com
Lists: pgsql-general

On 1/25/25 09:03, Дмитрий wrote:
> 1) What sort of replication?
> - Streaming replication
>
> 2) Where are the two servers located relative to each other?
> - The servers are located in different data centers.
>
> 3) Has there been any software upgrades/network changes recently?
> - I don't know any information about the software upgrades/network

It would be worth asking the people who do know.

From the log attached to your initial post:

2025-01-25 17:28:01.930 MSK [1196013] LOG: starting PostgreSQL 15.10 on
x86_64-pc-linux-gnu, compiled by gcc (GCC) 11.5.0 20240719 (Red Hat
11.5.0-2), 64-bit
2025-01-25 17:28:01.930 MSK [1196013] LOG: listening on IPv4 address
"0.0.0.0", port 5432
2025-01-25 17:28:01.931 MSK [1196013] LOG: listening on Unix socket
"/run/postgresql/.s.PGSQL.5432"
2025-01-25 17:28:01.932 MSK [1196013] LOG: listening on Unix socket
"/tmp/.s.PGSQL.5432"
2025-01-25 17:28:01.962 MSK [1196017] LOG: database system was shut
down in recovery at 2025-01-25 17:28:01 MSK
2025-01-25 17:28:01.962 MSK [1196017] LOG: entering standby mode

How was it shut down: on purpose, or because of a hardware/software issue?

Also, do you have the corresponding logs from the primary?

2025-01-25 17:28:12.192 MSK [1196017] LOG: consistent recovery state
reached at 1063C/D002DC68
2025-01-25 17:28:12.192 MSK [1196017] LOG: incorrect resource manager
data checksum in record at 1063C/D002DC68
2025-01-25 17:28:12.192 MSK [1196013] LOG: database system is ready to
accept read-only connections
2025-01-25 17:28:12.205 MSK [1196019] LOG: started streaming WAL from
primary at 1063C/D0000000 on timeline 61

The recovery ended and the streaming started.

I am not sure whether 'incorrect resource manager data checksum' is
significant.
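
For reference, the two LSNs in the lines above can be mapped to a WAL
segment file name. The sketch below is only an illustration: it assumes
the default 16 MB wal_segment_size (the log does not state it), and the
helper names parse_lsn and wal_file_name are made up for this example.
PostgreSQL's own pg_walfile_name() is the authoritative way to get this.

```python
# Assumption: default wal_segment_size of 16 MB; adjust if the cluster
# was initialized with a different value.
WAL_SEGMENT_SIZE = 16 * 1024 * 1024


def parse_lsn(lsn: str) -> int:
    """Convert an LSN written as 'HI/LO' (hex) into a 64-bit integer."""
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def wal_file_name(timeline: int, lsn: str,
                  seg_size: int = WAL_SEGMENT_SIZE) -> str:
    """Build the 24-hex-digit segment file name that contains this LSN."""
    segno = parse_lsn(lsn) // seg_size
    segs_per_xlogid = 0x100000000 // seg_size
    return "%08X%08X%08X" % (
        timeline,
        segno // segs_per_xlogid,
        segno % segs_per_xlogid,
    )


# Timeline 61 and both LSNs are taken from the log excerpt above.
print(wal_file_name(61, "1063C/D002DC68"))  # 0000003D0001063C000000D0
print(wal_file_name(61, "1063C/D0000000"))  # 0000003D0001063C000000D0
```

Under that assumption, both LSNs fall in the same segment, and
1063C/D0000000 is the first byte of it.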

2025-01-25 17:29:08.452 MSK [1196015] LOG: recovery restart point at
1063C/DBC7E1D8
2025-01-25 17:29:08.452 MSK [1196015] DETAIL: Last completed
transaction was at log time 2025-01-25 16:23:08.828548+03.
2025-01-25 17:29:24.553 MSK [1196015] LOG: restartpoint starting: wal
2025-01-25 17:29:24.553 MSK [1196015] DEBUG: performing replication
slot checkpoint
2025-01-25 17:29:27.651 MSK [1196019] FATAL: could not send data to WAL
stream: lost synchronization with server: got message type "0", length
892351284
2025-01-25 17:29:27.653 MSK [1196017] LOG: invalid magic number 3600 in
log segment 0000003D0001063D000000F4, offset 212992
2025-01-25 17:29:27.653 MSK [1196017] LOG: invalid magic number 3600 in
log segment 0000003D0001063D000000F4, offset 212992
2025-01-25 17:29:27.653 MSK [1196017] LOG: invalid magic number 3600 in
log segment 0000003D0001063D000000F4, offset 212992

This is where things fall apart. What confuses me is:

"could not send data to WAL stream: lost synchronization with server:
got message type "0", length 892351284"

If this is from the standby, why is it sending data to the stream?

Or is there cascading replication going on?
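
One thing that may help when reading that error: in the frontend/backend
protocol, each message starts with a one-byte type followed by a 4-byte
length in network byte order. The sketch below simply repacks the type
and length reported in the error into those five bytes to see what they
look like as raw data. The helper name header_as_bytes is made up for
this example, and the result is a hint, not a diagnosis.

```python
import struct


def header_as_bytes(msg_type: str, length: int) -> bytes:
    """Rebuild the 5 header bytes: 1-byte type + big-endian int32 length."""
    return msg_type.encode("ascii") + struct.pack("!i", length)


# Values copied from the FATAL line in the log.
raw = header_as_bytes("0", 892351284)
print(raw)  # b'05034'
```

The five bytes are all ASCII digits, which would be consistent with the
receiver having read text or mid-stream data at a point where it
expected a message header. It does not say where those bytes came from.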

2025-01-25 17:30:01.887 MSK [1196013] LOG: received fast shutdown request
2025-01-25 17:30:01.888 MSK [1196013] LOG: aborting any active transactions

Was that a manual intervention?

2025-01-25 17:30:02.157 MSK [1196015] LOG: shutting down
2025-01-25 17:30:02.181 MSK [1196013] LOG: database system is shut down
2025-01-25 17:30:02.182 MSK [1196014] DEBUG: logger shutting down

So the server went from startup to shutdown in ~2 minutes.

From your original post:

'Restarting PostgreSQL helps.'

Is that what is shown above, or have you restarted since then and the
server is now running?

--
Adrian Klaver
adrian(dot)klaver(at)aklaver(dot)com
