From: | Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> |
---|---|
To: | pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Network failure may prevent promotion |
Date: | 2023-12-31 11:07:41 |
Message-ID: | 20231231.200741.1078989336605759878.horikyota.ntt@gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
(Apology for resubmitting due to poor subject of the previous mail)
---
Hello.
We've noticed that when walreceiver is waiting for a connection to
complete, standby does not immediately respond to promotion
requests. In PG14, upon receiving a promotion request, walreceiver
terminates instantly, but in PG16, it waits for connection
timeout. This behavior is attributed to commit 728f86fec65, where a
part of libpqrcv_connect was simply replaced with a call to
libpqsrc_connect_params. This behavior can be verified by simply
dropping packets from the standby to the primary.
By a simple thought, in walreceiver, libpqsrv_connect_internal could
just call ProcessWalRcvInterrupts() instead of CHECK_FOR_INTERRUPTS(),
but this approach is quite ugly. Since ProcessWalRcvInterrupts()
originally calls CHECK_FOR_INTERRUPTS() and there are no standalone
calls to CHECK_FOR_INTERRUPTS() within walreceiver, I think it might
be better to use ProcDiePending instead of ShutdownRequestPending. I
added a subset function of die() as the SIGTERM handler in walsender
in a crude patch attached.
What do you think about the issue, and the approach?
If there are no issues or objections with this method, I will continue
to refine this patch. For now, I plan to register it for the upcoming
commitfest.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
From | Date | Subject | |
---|---|---|---|
Next Message | Ivan Kush | 2023-12-31 14:15:33 | Re: Autonomous transactions 2023, WIP |
Previous Message | Kyotaro Horiguchi | 2023-12-31 11:02:18 | libpqsrv_connect_params should call ProcessWalRcvInterrupts |