Re: Connection terminated but client didn't realise

From: David Wheeler <dwheeler(at)dgitsystems(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "pgsql-general(at)lists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org>, Orapan Sanghirunyaplute <orapans(at)dgitsystems(dot)com>, Ross Dougherty <rdougherty(at)dgitsystems(dot)com>
Subject: Re: Connection terminated but client didn't realise
Date: 2019-12-03 01:29:38
Message-ID: 05C27DA3-68CF-4F40-989E-D46ED0751239@dgitsystems.com
Lists: pgsql-general

> Is the application remote from the database server? My gut reaction to this type of report is "something timed out the network connection", but there would have to be a router or firewall or the like in between to make that a tenable explanation.
> If that is the issue, you should be able to fix it by making the server's TCP keepalive settings more aggressive.

Yes, the application server is separate from the database server, and the application is running within Docker, which I suspect adds some complexity too. I also suspected that something in the middle was closing the connection, but your email has clarified my thinking a bit.

TCP keepalive appears to be enabled on the application server and within Docker, and the client holds the allegedly dead connection for much longer (24h) than keepalive should take to kill it (<3h). So I think the next step is to identify the connection at the OS level with netstat and see what state it's in.
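
As a rough sanity check on the "<3h" figure: assuming the stock Linux keepalive sysctls (tcp_keepalive_time=7200, tcp_keepalive_intvl=75, tcp_keepalive_probes=9; the values actually in use on the application server are not stated here), the longest a silent dead peer should go unnoticed is the idle time plus one interval per failed probe. A minimal sketch of that arithmetic, with a hypothetical helper name:

```python
# Sketch only: the numbers below are the usual Linux defaults, which is an
# assumption; check /proc/sys/net/ipv4/tcp_keepalive_* on the real host.

def keepalive_detection_seconds(idle: int, interval: int, probes: int) -> int:
    """Upper bound, in seconds, before keepalive declares a silent peer dead.

    idle     -- seconds of inactivity before the first probe is sent
    interval -- seconds between unanswered probes
    probes   -- number of unanswered probes before the connection is dropped
    """
    return idle + interval * probes


# Assumed Linux defaults (hypothetical for this particular server).
LINUX_DEFAULTS = {"idle": 7200, "interval": 75, "probes": 9}

if __name__ == "__main__":
    secs = keepalive_detection_seconds(**LINUX_DEFAULTS)
    print(f"{secs} s = {secs / 3600:.2f} h")  # 7875 s = 2.19 h
```

Note that this bound only applies if the socket itself has SO_KEEPALIVE set; the sysctls configure the timers but do not turn keepalive on for every connection. On Linux, `netstat -o` (or `ss -o`) shows whether a keepalive timer is running for a given connection.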

Thanks for your help.

Regards,

David

On 2/12/19, 11:17 pm, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

David Wheeler <dwheeler(at)dgitsystems(dot)com> writes:
> We have a query that our system runs nightly to refresh materialised views. This takes some time to execute (~25 minutes) and then it will usually return the results to the application and everything is golden. However occasionally we see something like the below, where the query finishes, but the connection gets unexpectedly closed from Postgres’ perspective. From the application’s perspective the connection is still alive, and it sits there forever waiting for the result.

Is the application remote from the database server? My gut reaction
to this type of report is "something timed out the network connection",
but there would have to be a router or firewall or the like in between
to make that a tenable explanation.

If that is the issue, you should be able to fix it by making the
server's TCP keepalive settings more aggressive.

regards, tom lane
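
For illustration, PostgreSQL exposes the server-side timers as tcp_keepalives_idle, tcp_keepalives_interval and tcp_keepalives_count. The sketch below shows what "more aggressive" means at the socket level, using Python's standard socket module. The helper name and the values 60/10/6 are hypothetical, chosen only to show the shape of the change, and the TCP_KEEP* constants are platform-dependent, so the code skips any that are missing.

```python
import socket


def enable_aggressive_keepalive(sock, idle=60, interval=10, probes=6):
    """Turn on TCP keepalive for sock and tighten its timers where supported.

    The defaults (60 s idle, 10 s between probes, 6 probes) are illustrative
    assumptions, not recommended values.
    """
    # Keepalive is off unless it is requested on the socket itself.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    # Per-socket timer overrides; these constants do not exist on every OS.
    for name, value in (
        ("TCP_KEEPIDLE", idle),
        ("TCP_KEEPINTVL", interval),
        ("TCP_KEEPCNT", probes),
    ):
        opt = getattr(socket, name, None)
        if opt is not None:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)


if __name__ == "__main__":
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        enable_aggressive_keepalive(s)
        # Non-zero means keepalive is enabled on this socket.
        print(s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE))
```

With these example values a dead peer would be detected after roughly 60 + 10 * 6 = 120 seconds of silence, well inside the idle timeout of most firewalls, rather than after a couple of hours.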
