Re: Connection terminated by the server causes deadlock in jdbc client side connection

From: Leonardo Frittelli <leonardo(dot)frittelli(at)gmail(dot)com>
To: Dave Cramer <pg(at)fastcrypt(dot)com>
Cc: List <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: Connection terminated by the server causes deadlock in jdbc client side connection
Date: 2015-10-07 20:06:37
Message-ID: CACh06N3B4hDpLe_N=_5YVK82mYfcG=y0Db=zdVOVnjq5Or=CwA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-jdbc

Dave,

I've now replaced all protoConnection.close() by protoConnection.abort() in
all IO Exception handlers for QueryExecutorImpl (v2 and v3 just in case)
and that appears to have done the trick.

With exactly the same load test, I would before lock out all 45 connections
in my pool, but now they get released nicely. Once the test finishes, they
are all released, when before they would have remained deadlocked.

I really think this is a bug then. Would you kindly indicate further steps?

Thanks,

Leonardo

On 7 October 2015 at 16:27, Leonardo Frittelli <leonardo(dot)frittelli(at)gmail(dot)com
> wrote:

> I'm actually giving it a go now with abort(). If this works I'll let you
> know.
>
> On 7 October 2015 at 16:15, Dave Cramer <pg(at)fastcrypt(dot)com> wrote:
>
>> OK, let me see if I can change the code and give you a driver that will
>> print out and not call close.
>>
>> Dave
>>
>> Dave Cramer
>>
>> dave.cramer(at)credativ(dot)ca
>> http://www.credativ.ca
>>
>> On 7 October 2015 at 15:08, Leonardo Frittelli <
>> leonardo(dot)frittelli(at)gmail(dot)com> wrote:
>>
>>> Dave,
>>>
>>> Sadly ExecutorQueryImpl will only print out the exception _after_
>>> closing the connection :)
>>>
>>> catch (IOException e)
>>> {
>>> protoConnection.close();
>>> handler.handleError(new PSQLException(GT.tr("An I/O error
>>> occurred while sending to the backend."), PSQLState.CONNECTION_FAILURE, e));
>>> }
>>>
>>>
>>> Here is a longer StackTrace in case there is anything else of value. We
>>> are using BoneCP and Hibernate. The next line in the stack trace would be
>>> our application code.
>>>
>>> java.lang.Thread.State: RUNNABLE
>>> at java.net.SocketOutputStream.socketWrite0(Native Method)
>>> at java.net.SocketOutputStream.socketWrite(Unknown Source)
>>> at java.net.SocketOutputStream.write(Unknown Source)
>>> at java.io.BufferedOutputStream.flushBuffer(Unknown Source)
>>> at java.io.BufferedOutputStream.flush(Unknown Source)
>>> - locked <0x0000000700742e30> (a java.io.BufferedOutputStream)
>>> at org.postgresql.core.PGStream.flush(PGStream.java:518)
>>> at
>>> org.postgresql.core.v3.ProtocolConnectionImpl.close(ProtocolConnectionImpl.java:136)
>>> at
>>> org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:280)
>>> - locked <0x0000000700742fd0> (a
>>> org.postgresql.core.v3.QueryExecutorImpl)
>>> at
>>> org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:547)
>>> at
>>> org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:417)
>>> at
>>> org.postgresql.jdbc2.AbstractJdbc2Statement.executeQuery(AbstractJdbc2Statement.java:302)
>>> at
>>> com.jolbox.bonecp.PreparedStatementHandle.executeQuery(PreparedStatementHandle.java:174)
>>> at
>>> org.hibernate.engine.jdbc.internal.ResultSetReturnImpl.extract(ResultSetReturnImpl.java:56)
>>> at org.hibernate.loader.Loader.getResultSet(Loader.java:2040)
>>> at org.hibernate.loader.Loader.executeQueryStatement(Loader.java:1837)
>>> at org.hibernate.loader.Loader.executeQueryStatement(Loader.java:1816)
>>> at org.hibernate.loader.Loader.doQuery(Loader.java:900)
>>> at
>>> org.hibernate.loader.Loader.doQueryAndInitializeNonLazyCollections(Loader.java:342)
>>> at org.hibernate.loader.Loader.doList(Loader.java:2526)
>>> at org.hibernate.loader.Loader.doList(Loader.java:2512)
>>> at org.hibernate.loader.Loader.listIgnoreQueryCache(Loader.java:2342)
>>> at org.hibernate.loader.Loader.list(Loader.java:2337)
>>> at
>>> org.hibernate.loader.criteria.CriteriaLoader.list(CriteriaLoader.java:124)
>>> at org.hibernate.internal.SessionImpl.list(SessionImpl.java:1662)
>>> at org.hibernate.internal.CriteriaImpl.list(CriteriaImpl.java:374)
>>>
>>> Regards,
>>>
>>> Leonardo
>>>
>>> On 7 October 2015 at 16:01, Dave Cramer <pg(at)fastcrypt(dot)com> wrote:
>>>
>>>> Leonardo,
>>>>
>>>> At this point I'm wondering if we get an I/O error if it is feasible to
>>>> attempt to close it. I am leaning towards no.
>>>>
>>>> What is above this in the stack trace. Do you see any information about
>>>> the IOException error that caused it
>>>>
>>>> Dave Cramer
>>>>
>>>> dave.cramer(at)credativ(dot)ca
>>>> http://www.credativ.ca
>>>>
>>>> On 7 October 2015 at 14:43, Leonardo Frittelli <
>>>> leonardo(dot)frittelli(at)gmail(dot)com> wrote:
>>>>
>>>>> Dave,
>>>>>
>>>>> Thank you very much for your prompt response.
>>>>> I am using version postgresql-9.3-1101, but I have checked the latest
>>>>> available 9.4 sources and it appears to have the same code (the code
>>>>> example I pasted was from v2, but I see similar code in v3).
>>>>>
>>>>> Could it be that in this particular scenario we should use
>>>>> ProtocolConnectionImpl.abort() instead of close()?
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Leonardo
>>>>>
>>>>> On 7 October 2015 at 15:23, Dave Cramer <pg(at)fastcrypt(dot)com> wrote:
>>>>>
>>>>>> It could be a problem. What version are you using ?
>>>>>>
>>>>>> Dave Cramer
>>>>>>
>>>>>> dave.cramer(at)credativ(dot)ca
>>>>>> http://www.credativ.ca
>>>>>>
>>>>>> On 7 October 2015 at 12:33, Leonardo Frittelli <
>>>>>> leonardo(dot)frittelli(at)gmail(dot)com> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> We are experiencing very frequent deadlocks in pgsql jdbc
>>>>>>> connections. The scenario is a replicated database with hot_standby = on
>>>>>>>
>>>>>>> At times of high volumes of queries in both the primary and the
>>>>>>> replicated server, we sometimes get the following log indicating that the
>>>>>>> server has terminated the connection in the replicated database.
>>>>>>> FATAL: terminating connection due to conflict with recovery
>>>>>>> DETAIL: User query might have needed to see row versions that must
>>>>>>> be removed.
>>>>>>> HINT: In a moment you should be able to reconnect to the database
>>>>>>> and repeat your command.
>>>>>>>
>>>>>>> This is expected and we see no issue with that. What I did not
>>>>>>> expect, however, is that in the JDBC client side, the connection is
>>>>>>> deadlocked.
>>>>>>>
>>>>>>> java.lang.Thread.State: RUNNABLE
>>>>>>> at java.net.SocketOutputStream.socketWrite0(Native Method)
>>>>>>> at java.net.SocketOutputStream.socketWrite(Unknown Source)
>>>>>>> at java.net.SocketOutputStream.write(Unknown Source)
>>>>>>> at java.io.BufferedOutputStream.flushBuffer(Unknown Source)
>>>>>>> at java.io.BufferedOutputStream.flush(Unknown Source)
>>>>>>> - locked <0x0000000700742e30> (a java.io.BufferedOutputStream)
>>>>>>> at org.postgresql.core.PGStream.flush(PGStream.java:518)
>>>>>>> at
>>>>>>> org.postgresql.core.v3.ProtocolConnectionImpl.close(ProtocolConnectionImpl.java:136)
>>>>>>> at
>>>>>>> org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:280)
>>>>>>> - locked <0x0000000700742fd0> (a
>>>>>>> org.postgresql.core.v3.QueryExecutorImpl)
>>>>>>> at
>>>>>>> org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:547)
>>>>>>> at
>>>>>>> org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:417)
>>>>>>> at
>>>>>>> org.postgresql.jdbc2.AbstractJdbc2Statement.executeQuery(AbstractJdbc2Statement.java:302)
>>>>>>> ...
>>>>>>>
>>>>>>> Looking at the Postgres JDBC code, I notice that
>>>>>>> ProtocolConnectionImpl.close() (invoked by the exception handler in
>>>>>>> QueryExecutorImpl.execute) is trying to 'gracefully close' by sending an
>>>>>>> 'X' to the server before actually closing the socket.
>>>>>>>
>>>>>>> ...
>>>>>>> if (logger.logDebug()) logger.debug(" FE=> Terminate");
>>>>>>> pgStream.SendChar('X'); pgStream.flush(); pgStream.close(); ...
>>>>>>>
>>>>>>> Does this make sense in a scenario of a connection which has already
>>>>>>> been terminated by the server side?
>>>>>>>
>>>>>>> At times of high load, all connections in the pool get eventually
>>>>>>> locked with exactly the same stack trace.
>>>>>>>
>>>>>>> I'd appreciate any advice on how to handle this. Could this be a bug
>>>>>>> in the JDBC driver?
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Leonardo
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

In response to

Responses

Browse pgsql-jdbc by date

  From Date Subject
Next Message Dave Cramer 2015-10-07 21:17:52 Re: Connection terminated by the server causes deadlock in jdbc client side connection
Previous Message Leonardo Frittelli 2015-10-07 19:27:47 Re: Connection terminated by the server causes deadlock in jdbc client side connection