[Possible Bug] EnableSSL method has no socket timeout, thread hangs indefinitely

From: Ray Hu <rayhu92(at)gmail(dot)com>
To: pgsql-jdbc(at)postgresql(dot)org
Subject: [Possible Bug] EnableSSL method has no socket timeout, thread hangs indefinitely
Date: 2018-12-07 23:48:10
Message-ID: CAGEe+P9E8bypjDumdUwo-EUGJPFDeJQpXbzVynCHdkqFi_7MMA@mail.gmail.com
Lists: pgsql-jdbc

Hi all

While using the JDBC library, I was running some error-handling tests by
introducing network packet loss between my application and the database,
and I found that my application's thread count began rising. I took a
thread dump of my application and found many instances of the following
stack trace:

"PostgreSQL JDBC driver connection thread" #4645 daemon prio=5 os_prio=0
tid=0x967d7c00 nid=0x6c8e runnable [0x901ac000]
java.lang.Thread.State: RUNNABLE
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
at java.net.SocketInputStream.read(SocketInputStream.java:171)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at sun.security.ssl.InputRecord.readFully(InputRecord.java:465)
at sun.security.ssl.InputRecord.read(InputRecord.java:503)
at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:983)
- locked <0xd1aca150> (a java.lang.Object)
at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1385)
- locked <0xd1aca120> (a java.lang.Object)
at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1413)
at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1397)
at org.postgresql.ssl.jdbc4.AbstractJdbc4MakeSSL.convert(AbstractJdbc4MakeSSL.java:119)
at org.postgresql.core.v3.ConnectionFactoryImpl.enableSSL(ConnectionFactoryImpl.java:331)
at org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:125)
at org.postgresql.core.ConnectionFactory.openConnection(ConnectionFactory.java:66)
at org.postgresql.jdbc2.AbstractJdbc2Connection.<init>(AbstractJdbc2Connection.java:127)
at org.postgresql.jdbc3.AbstractJdbc3Connection.<init>(AbstractJdbc3Connection.java:29)
at org.postgresql.jdbc3g.AbstractJdbc3gConnection.<init>(AbstractJdbc3gConnection.java:21)
at org.postgresql.jdbc4.AbstractJdbc4Connection.<init>(AbstractJdbc4Connection.java:41)
at org.postgresql.jdbc4.Jdbc4Connection.<init>(Jdbc4Connection.java:24)
at org.postgresql.Driver.makeConnection(Driver.java:414)
at org.postgresql.Driver.access$100(Driver.java:47)
at org.postgresql.Driver$ConnectThread.run(Driver.java:325)
at java.lang.Thread.run(Thread.java:748)

These threads were permanently stuck in the above state and did not
recover even after the network recovered.

I checked the source code of the latest PostgreSQL JDBC library (42.2.5)
and found the following code in ConnectionFactoryImpl.java, at line 85:
  private PGStream tryConnect(String user, String database,
      Properties info, SocketFactory socketFactory, HostSpec hostSpec,
      SslMode sslMode)
      throws SQLException, IOException {
    int connectTimeout = PGProperty.CONNECT_TIMEOUT.getInt(info) * 1000;

    PGStream newStream = new PGStream(socketFactory, hostSpec, connectTimeout);

    // Construct and send an ssl startup packet if requested.
    newStream = enableSSL(newStream, sslMode, info, connectTimeout);

    // Set the socket timeout if the "socketTimeout" property has been set.
    int socketTimeout = PGProperty.SOCKET_TIMEOUT.getInt(info);
    if (socketTimeout > 0) {
      newStream.getSocket().setSoTimeout(socketTimeout * 1000);
    }

From what I see, the enableSSL method is called *before* the socket timeout
is set, so when java.net.SocketInputStream.socketRead0 is called during the
SSL handshake, the socket has no read timeout and will wait indefinitely
for its peer to reply. This matches the behavior I see in my stack trace.
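
To illustrate the mechanism in isolation, here is a minimal sketch using only
java.net. It is not pgjdbc code: the class name SoTimeoutDemo and the method
readTimesOut are hypothetical, and a local server that accepts a connection
but never writes stands in for a peer whose handshake reply is lost. With
SO_TIMEOUT set, the blocked read is aborted with SocketTimeoutException; with
the default of 0 it would block indefinitely, as in the thread dump above.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class SoTimeoutDemo {

    /**
     * Connects to a local server that accepts the connection but never sends
     * any data, then attempts a read with the given SO_TIMEOUT (must be > 0).
     * Returns true if the read was aborted by SocketTimeoutException.
     */
    static boolean readTimesOut(int soTimeoutMillis) throws IOException {
        // Port 0 lets the OS pick a free port; bind to loopback only.
        try (ServerSocket silentServer =
                 new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(silentServer.getInetAddress(),
                                        silentServer.getLocalPort());
             Socket serverSide = silentServer.accept()) {

            // Without this call SO_TIMEOUT stays 0, so read() never times out.
            client.setSoTimeout(soTimeoutMillis);

            InputStream in = client.getInputStream();
            try {
                in.read(); // the peer never writes, so this can only time out
                return false;
            } catch (SocketTimeoutException expected) {
                return true;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("read timed out: " + readTimesOut(200));
    }
}
```

The point of the sketch is only the ordering: the timeout has to be applied to
the socket before the first blocking read, which in tryConnect would mean
before enableSSL starts the handshake.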

Is this a bug?

Note: I'm aware that the stack trace I supplied comes from an older version
of the JDBC library. I've repeated the test with the latest library
version and the issue persisted. I can run the tests again and supply the
new stack trace if needed.

Thank you
