From: | Francesco Canovai <francesco(dot)canovai(at)enterprisedb(dot)com> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Supporting TCP_SYNCNT in libpq |
Date: | 2025-03-13 08:37:37 |
Message-ID: | CAJNLuXDoMhyTvQT9d3WkxQmOuc-=PQfB0khGcmHDzibp1ti7zQ@mail.gmail.com |
Lists: | pgsql-hackers |
This patch introduces support for a `tcp_syn_count` parameter in
libpq, allowing control over the number of SYN retransmissions when
initiating a connection.
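For readers unfamiliar with the underlying socket option, here is a
minimal sketch of how a SYN retransmission limit can be applied to a
socket on Linux before `connect()` is called. The helper name
`set_syn_count` is hypothetical and is not taken from the attached
patch; only `setsockopt()` and `TCP_SYNCNT` are real interfaces.

```c
#include <errno.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/*
 * Hypothetical helper (not from the patch): limit the number of SYN
 * retransmissions for one socket.  Must be called before connect().
 * Returns 0 on success, -1 on failure with errno set.
 */
static int
set_syn_count(int sockfd, int syn_count)
{
#ifdef TCP_SYNCNT
	/* Per-socket override of net.ipv4.tcp_syn_retries (Linux only). */
	return setsockopt(sockfd, IPPROTO_TCP, TCP_SYNCNT,
					  &syn_count, sizeof(syn_count));
#else
	/* Platform has no TCP_SYNCNT; report it as unsupported. */
	(void) sockfd;
	(void) syn_count;
	errno = ENOPROTOOPT;
	return -1;
#endif
}
```

Because the option is set per socket, it only changes how this one
connection attempt behaves and leaves the system-wide sysctl untouched.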
The primary goal is to prevent the walreceiver from getting stuck
resending SYNs for an extended period in the event of network
disruptions. On Linux that period is governed by
`net.ipv4.tcp_syn_retries` (6 retries by default, which adds up to
about 127 seconds).
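The 127-second figure can be reproduced with a little arithmetic. The
sketch below assumes Linux's 1-second initial retransmission timeout,
doubling after every unanswered SYN, and ignores the kernel's upper
bound on the timeout, so it is only meaningful for small retry counts.
The function name `syn_wait_seconds` is made up for illustration.

```c
#include <assert.h>

/*
 * Approximate worst-case seconds before connect() gives up, assuming a
 * 1 s initial timeout that doubles after each unanswered SYN.
 * (Assumption: the kernel's cap on the timeout is not modelled.)
 */
static long
syn_wait_seconds(int syn_retries)
{
	long		total = 0;
	long		rto = 1;

	assert(syn_retries >= 0 && syn_retries < 31);

	/* One wait after the initial SYN, plus one after each retransmission. */
	for (int i = 0; i <= syn_retries; i++)
	{
		total += rto;
		rto *= 2;
	}
	return total;
}
```

With the default of 6 retries this gives 1 + 2 + 4 + 8 + 16 + 32 + 64 =
127 seconds; under the same assumptions, 2 retries would give up after
about 7 seconds.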
A specific scenario where this can occur is during a failover in Kubernetes:
* The primary node fails, a standby is promoted, and other standbys
attempt to reconnect to the service representing the new primary.
* The `primary_conninfo` of the standby points to a service, usually
managed via iptables rules.
* If the walreceiver's initial SYN is dropped due to outdated rules,
the connection may remain stranded until the system timeout is
reached.
* As a result, a second standby may reattach only after a couple of
minutes. In the case of synchronous replication, this can block
writes from the application.
In this scenario, `tcp_user_timeout` could close a connection that is
still retrying SYNs (the documentation doesn't suggest that it does,
but in practice it works). However, the parameter affects the entire
connection, not just its establishment.
`connect_timeout` doesn't work with `PQconnectPoll`, so it won't
keep the walreceiver from waiting out the full system timeout.
Thank you,
Francesco
Attachment | Content-Type | Size |
---|---|---|
0001-Support-TCP_SYNCNT-in-libpq.patch | application/x-patch | 4.4 KB |