Re: BUG #5465: dblink TCP connection hangs blocking translation from being terminated

From: Robert Voinea <rvoinea(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #5465: dblink TCP connection hangs blocking translation from being terminated
Date: 2014-03-17 11:09:20
Message-ID: 6245413.JIiidGyzFj@shu
Lists: pgsql-bugs

Hi

On Friday 14 March 2014 09:45:22 Tom Lane wrote:
> Robert Voinea <rvoinea(at)gmail(dot)com> writes:
> > In our setup, we make extensive use of dblink.
> > Because some queries take a while to complete and the link runs over the
> > internet, the server process (the transaction that runs the dblink
> > queries) sometimes hangs when the link goes down, keeping locks on
> > several records plus some advisory locks and thus freezing most of the
> > database.
> >
> > What I have found is this bug, which is remarkably similar (if not
> > identical) to what we are experiencing.
> > http://postgresql.1045698.n5.nabble.com/BUG-5465-dblink-TCP-connection-hangs-blocking-translation-from-being-terminated-td2132419.html#a2132420
> That does not sound like a Postgres bug to me. What you are unhappy about
> is that the kernel isn't timing out a lost TCP connection more quickly.
> The default timeout is long (>1 hour probably), but that's required by
> Internet standards. The appropriate fix for this is to use aggressive
> keepalive parameters on the connection. You can set libpq's keepalive
> parameters in the connection string given to dblink.
>
> regards, tom lane

I seem to have missed those parameters... and the fact that you actually need
keep-alive on both client AND server, not only on the server.
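
For anyone finding this thread later, here is a minimal sketch of what the client side of that might look like. It builds a libpq connection string carrying the keepalive parameters (keepalives, keepalives_idle, keepalives_interval, keepalives_count), which can then be passed to dblink. The host, database, and user names and the 30/10/3 timings are made-up values for illustration, not recommendations, and the helper functions are hypothetical. The server-side counterparts are the tcp_keepalives_idle, tcp_keepalives_interval and tcp_keepalives_count settings.

```python
def quote_conninfo_value(value):
    """Quote a value for a libpq connection string.

    Values that are empty or contain whitespace, single quotes, or
    backslashes are wrapped in single quotes, with ' and \\ escaped
    by a backslash.
    """
    text = str(value)
    if text and not any(ch.isspace() or ch in "'\\" for ch in text):
        return text
    escaped = text.replace("\\", "\\\\").replace("'", "\\'")
    return f"'{escaped}'"


def build_conninfo(host, dbname, user, idle=30, interval=10, count=3):
    """Build a conninfo string with client-side TCP keepalives enabled.

    idle, interval and count are illustrative defaults (assumptions):
    probe after `idle` seconds of silence, repeat every `interval`
    seconds, give up after `count` unanswered probes.
    """
    params = {
        "host": host,
        "dbname": dbname,
        "user": user,
        "keepalives": 1,
        "keepalives_idle": idle,
        "keepalives_interval": interval,
        "keepalives_count": count,
    }
    return " ".join(f"{key}={quote_conninfo_value(val)}" for key, val in params.items())


# Hypothetical remote host and credentials, for illustration only.
conninfo = build_conninfo("remote.example.org", "appdb", "app")
print(conninfo)
```

The resulting string would be used as the second argument of dblink_connect('remote', '...'). With the values above, an idle connection to a vanished peer should be declared dead after very roughly idle + interval * count = 60 seconds, although the exact behaviour depends on the operating system's TCP stack.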

Thank you!

--
Robert Voinea
Software Engineer
+4 0740 467 262

Don't take life too seriously. You'll never get out of it alive.
(Elbert Hubbard)
