From: | Alexander Lakhin <exclusion(at)gmail(dot)com> |
---|---|
To: | pgsql-bugs(at)lists(dot)postgresql(dot)org, Thomas Munro <thomas(dot)munro(at)gmail(dot)com> |
Subject: | Re: BUG #17928: Standby fails to decode WAL on termination of primary |
Date: | 2023-05-11 18:00:00 |
Message-ID: | 72bd036d-4f2a-8d50-b56e-6b1e3b9ba0a9@gmail.com |
Lists: | pgsql-bugs |
11.05.2023 11:00, PG Bug reporting form wrote:
> The following bug has been logged on the website:
>
> Bug reference: 17928
> ...
> `git bisect` for this behavior blames 3f1ce9734 (where
> XLogDecodeNextRecord() -> XLogReadRecordAlloc() call was introduced).
>
> A reproducer for the anomaly to follow.
The TAP test that demonstrates the issue is attached. To catch the failure
faster, I place it in multiple directories src/test/recoveryX/t, add
minimal Makefiles, and run (on tmpfs):
for ((i=1;i<=10;i++)); do echo "iteration $i"; NO_TEMP_INSTALL=1 parallel --halt now,fail=1 -j7 --linebuffer --tag make -s check -C src/test/{} ::: recovery1 recovery2 recovery3 recovery4 recovery5 recovery6 recovery7 || break; done
iteration 1
recovery1 +++ tap check in src/test/recovery1 +++
recovery2 +++ tap check in src/test/recovery2 +++
recovery3 +++ tap check in src/test/recovery3 +++
recovery4 +++ tap check in src/test/recovery4 +++
recovery5 +++ tap check in src/test/recovery5 +++
recovery6 +++ tap check in src/test/recovery6 +++
recovery7 +++ tap check in src/test/recovery7 +++
...
recovery5 # Restarting primary instance (49)
recovery3 # Restarting primary instance (49)
recovery7 # Restarting primary instance (49)
recovery2 Bailout called. Further testing stopped: pg_ctl stop failed
recovery2 FAILED--Further testing stopped: pg_ctl stop failed
recovery2 make: *** [Makefile:6: check] Error 255
parallel: This job failed:
make -s check -C src/test/recovery2
tail src/test/recovery2/tmp_check/log/099_restart_with_stanby_standby.log
2023-05-11 20:19:22.247 MSK [2046385] DETAIL: End of WAL reached on timeline 1 at 3/64BDFF8.
2023-05-11 20:19:22.247 MSK [2046385] FATAL: could not send end-of-streaming message to primary: server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
no COPY in progress
2023-05-11 20:19:22.248 MSK [2037134] FATAL: invalid memory alloc request size 2021163525
2023-05-11 20:19:22.248 MSK [2037114] LOG: startup process (PID 2037134) exited with exit code 1
2023-05-11 20:19:22.248 MSK [2037114] LOG: terminating any other active server processes
2023-05-11 20:19:22.248 MSK [2037114] LOG: shutting down due to startup process failure
2023-05-11 20:19:22.249 MSK [2037114] LOG: database system is shut down
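For anyone who wants to repeat the multi-directory setup described above, a rough sketch of how the copies might be prepared follows. The function name make_recovery_copies and its arguments are hypothetical, and the Makefile body is only an assumption about what a "minimal Makefile" could look like (modelled on the layout of src/test/recovery/Makefile); it is not the Makefile actually used for the run shown here. Since the command above sets NO_TEMP_INSTALL=1, a temporary installation is assumed to exist already.

```shell
# Sketch only: copy one TAP test into src/test/recovery1..N, each with a
# minimal Makefile. The names and the Makefile contents are assumptions.
make_recovery_copies() (
    pg_src=$1       # root of the PostgreSQL source tree
    test_file=$2    # path to the TAP test to replicate
    count=${3:-7}   # number of copies; 7 matches the run above
    i=1
    while [ "$i" -le "$count" ]; do
        dir="$pg_src/src/test/recovery$i"
        mkdir -p "$dir/t"
        cp "$test_file" "$dir/t/"
        {
            printf 'subdir = src/test/recovery%s\n' "$i"
            printf 'top_builddir = ../../..\n'
            printf 'include $(top_builddir)/src/Makefile.global\n'
            printf '\n'
            printf 'check:\n'
            # The recipe line must begin with a tab character.
            printf '\t$(prove_check)\n'
        } > "$dir/Makefile"
        i=$((i + 1))
    done
)
```

It could be invoked from the top of the source tree as, for example, make_recovery_copies . 099_restart_with_stanby.pl 7, after which the loop above can be run as is.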
Best regards,
Alexander
Attachment | Content-Type | Size |
---|---|---|
099_restart_with_stanby.pl | application/x-perl | 1.7 KB |