Re: Logical replication fails due to SocketException

From: Alex Maltinsky <alex(at)bgprotect(dot)com>
To: Dave Cramer <pg(at)fastcrypt(dot)com>
Cc: List <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: Logical replication fails due to SocketException
Date: 2019-05-21 21:20:03
Message-ID: CAH+ZVca_hLe5kYSbu1qDvKPszjfhNjX1XP5=_yE3diLJ72uUeA@mail.gmail.com
Lists: pgsql-jdbc

> Seems to be the common theme. Is there something about windows that drops
> connections ?

Yeah, I noticed the Windows part too. I'm not aware of anything
Windows-specific that resets active connections. Also, I tried tunneling
the postgres connection over SSH and got the same exact result while the
SSH connection itself was fine.

On Tue, 21 May 2019 at 22:55, Dave Cramer <pg(at)fastcrypt(dot)com> wrote:

>
>
> On Tue, 21 May 2019 at 15:49, Alex Maltinsky <alex(at)bgprotect(dot)com> wrote:
>
>> The server runs a dockerized Postgres on Ubuntu.
>>
>> The client is on Windows 10.
>>
>> On Tue, 21 May 2019 at 21:22 Dave Cramer <pg(at)fastcrypt(dot)com> wrote:
>>
>>> On Tue, 21 May 2019 at 09:58, Alex Maltinsky <alex(at)bgprotect(dot)com> wrote:
>>>
>>>> Hi All
>>>>
>>>> I ran into a problem with Postgres 11 and JDBC driver 42.2.5 which
>>>> resembles a problem that was posted here before (
>>>> https://www.postgresql-archive.org/postgresql-Logical-Replication-Stream-fails-with-Database-connection-failed-when-reading-from-copy-td6036639.html)
>>>> but unfortunately the solution was never posted here.
>>>>
>>>> I have a simple endless loop that follows the official replication
>>>> example (
>>>> https://jdbc.postgresql.org/documentation/head/replication.html) and
>>>> I keep getting socket exceptions like these after fetching approximately
>>>> 66K rows with remarkable consistency:
>>>>
>>>> Exception in thread "main" org.postgresql.util.PSQLException: Database connection failed when reading from copy
>>>>     at org.postgresql.core.v3.QueryExecutorImpl.readFromCopy(QueryExecutorImpl.java:1037)
>>>>     at org.postgresql.core.v3.CopyDualImpl.readFromCopy(CopyDualImpl.java:41)
>>>>     at org.postgresql.core.v3.replication.V3PGReplicationStream.receiveNextData(V3PGReplicationStream.java:155)
>>>>     at org.postgresql.core.v3.replication.V3PGReplicationStream.readInternal(V3PGReplicationStream.java:124)
>>>>     at org.postgresql.core.v3.replication.V3PGReplicationStream.readPending(V3PGReplicationStream.java:78)
>>>>     at com.example.main(ReplicationTest.java:48)
>>>> Caused by: java.net.SocketException: socket closed
>>>>     at java.net.SocketInputStream.socketRead0(Native Method)
>>>>     at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
>>>>     at java.net.SocketInputStream.read(SocketInputStream.java:171)
>>>>     at java.net.SocketInputStream.read(SocketInputStream.java:141)
>>>>     at org.postgresql.core.VisibleBufferedInputStream.readMore(VisibleBufferedInputStream.java:140)
>>>>     at org.postgresql.core.VisibleBufferedInputStream.ensureBytes(VisibleBufferedInputStream.java:109)
>>>>     at org.postgresql.core.VisibleBufferedInputStream.read(VisibleBufferedInputStream.java:191)
>>>>     at org.postgresql.core.PGStream.receive(PGStream.java:462)
>>>>     at org.postgresql.core.PGStream.receive(PGStream.java:446)
>>>>     at org.postgresql.core.v3.QueryExecutorImpl.processCopyResults(QueryExecutorImpl.java:1170)
>>>>     at org.postgresql.core.v3.QueryExecutorImpl.readFromCopy(QueryExecutorImpl.java:1035)
>>>>     ... 5 more
>>>>
>>>> The database log shows "LOG: could not send data to client: Connection
>>>> reset by peer"
>>>>
>>>> Wireshark shows that it was the client who suddenly sent a TCP RST to
>>>> the server and closed the connection.
>>>>
>>>> Parameters: `wal_sender_timeout` is set to 60 seconds, and I'm using a
>>>> status interval of 10 seconds and TCP_KEEP_ALIVE is enabled.
>>>>
>>>> The body of the Java loop looks like this:
>>>>
>>>> while (true) {
>>>>     ByteBuffer msg = stream.readPending();
>>>>     if (msg == null) {
>>>>         TimeUnit.MILLISECONDS.sleep(10L);
>>>>         continue;
>>>>     }
>>>>
>>>>     LogSequenceNumber lastReceiveLSN = stream.getLastReceiveLSN();
>>>>     System.out.println((i++) + " " + lastReceiveLSN);
>>>>
>>>>     stream.setAppliedLSN(lastReceiveLSN);
>>>>     stream.setFlushedLSN(lastReceiveLSN);
>>>> }
>>>>
>>>>
>>>> Curiously enough, if I change the loop to the code below, the problem
>>>> disappears:
>>>>
>>>> while (true) {
>>>>     ByteBuffer msg = stream.readPending();
>>>>     if (msg == null) {
>>>>         TimeUnit.MILLISECONDS.sleep(10L);
>>>>         continue;
>>>>     }
>>>>
>>>>     int offset = msg.arrayOffset();
>>>>     byte[] source = msg.array();
>>>>     int length = source.length - offset;
>>>>
>>>>     LogSequenceNumber lastReceiveLSN = stream.getLastReceiveLSN();
>>>>     System.out.println((i++) + " " + lastReceiveLSN + " "
>>>>             + new String(source, offset, length));
>>>>
>>>>     stream.setAppliedLSN(lastReceiveLSN);
>>>>     stream.setFlushedLSN(lastReceiveLSN);
>>>> }
>>>>
>>>>
>>>> Any help would be appreciated
>>>>
>>>> - Alex
>>>>
>>>>
>>> What OS are you using ?
>>>
>>>
>>> Dave Cramer
>>>
>>> davec(at)postgresintl(dot)com
>>> www.postgresintl.com
>>>
>>>
>>>
>
> Seems to be the common theme. Is there something about windows that drops
> connections ?
>
>
> Dave Cramer
>
> davec(at)postgresintl(dot)com
> www.postgresintl.com
>
>
>
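
For readers comparing the two loops quoted above: the only difference is that the second one turns the ByteBuffer payload into a String before acknowledging the LSN. Below is a minimal sketch of that decoding step using only java.nio, with no driver involved. The class name PayloadDecoder and the sample bytes are hypothetical, and the sketch assumes the payload is UTF-8 text (as a textual output plugin would produce). It bounds the read with position() and remaining() rather than with the backing array's length, which is a deliberate variation on the quoted code, not a claim about what the driver requires.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class PayloadDecoder {

    // Decodes the readable bytes of a buffer as UTF-8 text.
    // Assumption: the payload is text; binary plugin output would need
    // its own parser instead of a String conversion.
    public static String decode(ByteBuffer msg) {
        if (!msg.hasArray()) {
            // Direct or read-only buffers expose no backing array, so copy
            // the readable bytes out through a duplicate (leaves msg untouched).
            byte[] copy = new byte[msg.remaining()];
            msg.duplicate().get(copy);
            return new String(copy, StandardCharsets.UTF_8);
        }
        // arrayOffset() is where the buffer starts inside the backing array;
        // position() is how far into the buffer the readable data begins.
        int start = msg.arrayOffset() + msg.position();
        return new String(msg.array(), start, msg.remaining(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Hypothetical sample: a payload sitting in the middle of a larger array.
        byte[] backing = "xxBEGIN 42yy".getBytes(StandardCharsets.UTF_8);
        ByteBuffer msg = ByteBuffer.wrap(backing, 2, 8).slice();
        System.out.println(decode(msg)); // prints "BEGIN 42"
    }
}
```

With the sample buffer above, computing the length as source.length - offset would yield 10 bytes and include the trailing "yy". Whether the driver's buffers ever carry trailing bytes is not something this thread establishes, so treat the difference as a defensive choice only.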
