libpq connect keepalive* parms: no effect!?

From: Bill Clay <william(dot)b(dot)clay(at)acm(dot)org>
To: pgsql-interfaces(at)postgresql(dot)org
Subject: libpq connect keepalive* parms: no effect!?
Date: 2015-03-02 22:31:39
Message-ID: 54F4E4CB.3040601@acm.org
Lists: pgsql-interfaces

I have searched fairly thoroughly but have been unable to find a way to
make a client application's session break promptly when the PostgreSQL
client-to-server transport fails.

I run a 7x24 PostgreSQL 9.1 "write-only" libpq client application
(solely INSERTs/COPYs running on Debian 7 "wheezy" OS) that communicates
with its PostgreSQL 9.0 DB server (Debian 6 "squeeze") via
less-than-perfect intercontinental TCP Internet/VPN transport. The
application has been running very reliably for over 4 years except for
communication breaks.

Unfortunately, in this environment, connectivity lapses ranging from a
minute or two to an hour or two are common. To minimize the risk of data
loss when session recovery is attempted only AFTER the client has queued
data, I want to promptly detect, and attempt recovery of, lost sessions
even when no transactions are pending. To this end, I have tried:

#define PGSQL_KEEPALIVE_QSECS "60"  /* idle and interval, in seconds */

char pstring[6];
snprintf(pstring, sizeof(pstring), "%i", cnt0->conf.pgsql_port);

PGconn *conn = PQconnectdbParams(
    /* keywords */
    (const char *[]) {"dbname", "host", "user", "password", "port",
                      "sslmode", "application_name", "connect_timeout",
                      "keepalives", "keepalives_idle",
                      "keepalives_interval", "keepalives_count", NULL},
    /* values */
    (const char *[]) {cnt0->conf.pgsql_db, cnt0->conf.pgsql_host,
                      cnt0->conf.pgsql_user, cnt0->conf.pgsql_password,
                      pstring, "disable", "motion", PGSQL_KEEPALIVE_QSECS,
                      "1", PGSQL_KEEPALIVE_QSECS, PGSQL_KEEPALIVE_QSECS,
                      "3", NULL},
    0 /* expand_dbname */);

As a baseline comparison, I establish a psql session with an all-default
environment, break the VPN link, and then attempt a simple query (select
count(*) from ...). The query and psql session fail after about a
17-minute wait. When testing the application over the intentionally
broken link -- even specifying the above connection parameters -- I get
approximately the same 17-minute timeout before a broken session is
signalled at the application
(PQconsumeInput(conn); if (PQstatus(conn) != CONNECTION_OK) ...).
This is a far cry from the maximum of 5 minutes I expected
(keepalives_idle 60 s + 3 probes at 60 s intervals = 240 s).

Based on postings elsewhere, I have also tried changing the relevant
Linux kernel defaults of:

/proc/sys/net/ipv4/tcp_keepalive_time=7200
/proc/sys/net/ipv4/tcp_keepalive_probes=9
/proc/sys/net/ipv4/tcp_keepalive_intvl=75

to:

/proc/sys/net/ipv4/tcp_keepalive_time=60
/proc/sys/net/ipv4/tcp_keepalive_probes=3
/proc/sys/net/ipv4/tcp_keepalive_intvl=15

... with no detectable effect; still a ca. 17-minute timeout. (Failure
of initial connection establishment IS indicated rapidly, in ca. 20
seconds, with or without any of the above measures, even with
connect_timeout=60.)

Any ideas how to achieve the keepalives as specified in
PQconnectdbParams when running on these platforms?

Thanks,
Bill Clay
