| From: | Alex Maltinsky <alex(at)bgprotect(dot)com> | 
|---|---|
| To: | Dave Cramer <pg(at)fastcrypt(dot)com> | 
| Cc: | List <pgsql-jdbc(at)postgresql(dot)org> | 
| Subject: | Re: Logical replication fails due to SocketException | 
| Date: | 2019-05-21 21:20:03 | 
| Message-ID: | CAH+ZVca_hLe5kYSbu1qDvKPszjfhNjX1XP5=_yE3diLJ72uUeA@mail.gmail.com | 
| Lists: | pgsql-jdbc | 
> Seems to be the common theme. Is there something about windows that drops
> connections ?

Yeah, I noticed the Windows part too. I'm not aware of anything
Windows-specific that resets active connections. Also, I tried tunneling
the postgres connection over SSH and got the same exact result while the
SSH connection itself was fine.

On Tue, 21 May 2019 at 22:55, Dave Cramer <pg(at)fastcrypt(dot)com> wrote:
>
>
> On Tue, 21 May 2019 at 15:49, Alex Maltinsky <alex(at)bgprotect(dot)com> wrote:
>
>> The server runs a dockerized Postgres on Ubuntu.
>>
>> The client is on Windows 10.
>>
>> On Tue, 21 May 2019 at 21:22 Dave Cramer <pg(at)fastcrypt(dot)com> wrote:
>>
>>> On Tue, 21 May 2019 at 09:58, Alex Maltinsky <alex(at)bgprotect(dot)com> wrote:
>>>
>>>> Hi All
>>>>
>>>> I ran into a problem with Postgres 11 and JDBC driver 42.2.5 which
>>>> resembles a problem that was posted here before (
>>>> https://www.postgresql-archive.org/postgresql-Logical-Replication-Stream-fails-with-Database-connection-failed-when-reading-from-copy-td6036639.html)
>>>> but unfortunately the solution was never posted here.
>>>>
>>>> I have a simple endless loop that follows the official replication
>>>> example (
>>>> https://jdbc.postgresql.org/documentation/head/replication.html) and
>>>> I keep getting socket exceptions like these after fetching approximately
>>>> 66K rows with remarkable consistency:
>>>>
>>>>     Exception in thread "main" org.postgresql.util.PSQLException: Database connection failed when reading from copy
>>>>     at org.postgresql.core.v3.QueryExecutorImpl.readFromCopy(QueryExecutorImpl.java:1037)
>>>>     at org.postgresql.core.v3.CopyDualImpl.readFromCopy(CopyDualImpl.java:41)
>>>>     at org.postgresql.core.v3.replication.V3PGReplicationStream.receiveNextData(V3PGReplicationStream.java:155)
>>>>     at org.postgresql.core.v3.replication.V3PGReplicationStream.readInternal(V3PGReplicationStream.java:124)
>>>>     at org.postgresql.core.v3.replication.V3PGReplicationStream.readPending(V3PGReplicationStream.java:78)
>>>>     at com.example.main(ReplicationTest.java:48)
>>>>     Caused by: java.net.SocketException: socket closed
>>>>     at java.net.SocketInputStream.socketRead0(Native Method)
>>>>     at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
>>>>     at java.net.SocketInputStream.read(SocketInputStream.java:171)
>>>>     at java.net.SocketInputStream.read(SocketInputStream.java:141)
>>>>     at org.postgresql.core.VisibleBufferedInputStream.readMore(VisibleBufferedInputStream.java:140)
>>>>     at org.postgresql.core.VisibleBufferedInputStream.ensureBytes(VisibleBufferedInputStream.java:109)
>>>>     at org.postgresql.core.VisibleBufferedInputStream.read(VisibleBufferedInputStream.java:191)
>>>>     at org.postgresql.core.PGStream.receive(PGStream.java:462)
>>>>     at org.postgresql.core.PGStream.receive(PGStream.java:446)
>>>>     at org.postgresql.core.v3.QueryExecutorImpl.processCopyResults(QueryExecutorImpl.java:1170)
>>>>     at org.postgresql.core.v3.QueryExecutorImpl.readFromCopy(QueryExecutorImpl.java:1035)
>>>>     ... 5 more
>>>>
>>>> The database log shows "LOG:  could not send data to client: Connection
>>>> reset by peer"
>>>>
>>>> Wireshark shows that it was the client who suddenly sent a TCP RST to
>>>> the server and closed the connection.
>>>>
>>>> Parameters: `wal_sender_timeout` is set to 60 seconds, I'm using a
>>>> status interval of 10 seconds, and TCP_KEEP_ALIVE is enabled.
>>>>
>>>> The body of the Java loop looks like this:
>>>>
>>>>         while (true) {
>>>>             ByteBuffer msg = stream.readPending();
>>>>             if (msg == null) {
>>>>                 TimeUnit.MILLISECONDS.sleep(10L);
>>>>                 continue;
>>>>             }
>>>>
>>>>             LogSequenceNumber lastReceiveLSN = stream.getLastReceiveLSN();
>>>>             System.out.println((i++) + " " + lastReceiveLSN);
>>>>
>>>>             stream.setAppliedLSN(lastReceiveLSN);
>>>>             stream.setFlushedLSN(lastReceiveLSN);
>>>>         }
>>>>
>>>>
>>>> Curiously enough, if I change the loop to the code below, the problem
>>>> disappears:
>>>>
>>>>         while (true) {
>>>>             ByteBuffer msg = stream.readPending();
>>>>             if (msg == null) {
>>>>                 TimeUnit.MILLISECONDS.sleep(10L);
>>>>                 continue;
>>>>             }
>>>>
>>>>             int offset = msg.arrayOffset();
>>>>             byte[] source = msg.array();
>>>>             int length = source.length - offset;
>>>>
>>>>             LogSequenceNumber lastReceiveLSN = stream.getLastReceiveLSN();
>>>>             System.out.println((i++) + " " + lastReceiveLSN + " "
>>>>                     + new String(source, offset, length));
>>>>
>>>>             stream.setAppliedLSN(lastReceiveLSN);
>>>>             stream.setFlushedLSN(lastReceiveLSN);
>>>>         }
>>>>
>>>>
>>>> Any help would be appreciated
>>>>
>>>> - Alex
>>>>
>>>>
>>> What OS are you using ?
>>>
>>>
>>> Dave Cramer
>>>
>>> davec(at)postgresintl(dot)com
>>> www.postgresintl.com
>>>
>>>
>>>
>
> Seems to be the common theme. Is there something about windows that drops
> connections ?
>
>
> Dave Cramer
>
> davec(at)postgresintl(dot)com
> www.postgresintl.com
>
>
>
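
For anyone comparing the two loops quoted above: the only difference is that the second one decodes each message's payload into a String before printing. Below is a minimal sketch of that decoding step in isolation, using only java.nio so it runs without a database. `PayloadDecoder` and `decode` are hypothetical names, not part of pgjdbc. Bounding the length by `remaining()` instead of `source.length - offset` is an assumption about what a more defensive variant might look like; it is not what the quoted code does, and it is not offered as an explanation for the socket resets.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper (not part of pgjdbc): turns a ByteBuffer payload into text.
final class PayloadDecoder {

    // Decodes the bytes between position and limit without moving the position.
    static String decode(ByteBuffer msg) {
        if (!msg.hasArray()) {
            // Direct or read-only buffers expose no backing array; copy via a duplicate.
            byte[] copy = new byte[msg.remaining()];
            msg.duplicate().get(copy);
            return new String(copy, StandardCharsets.UTF_8);
        }
        // arrayOffset() is where this buffer starts inside the backing array.
        int start = msg.arrayOffset() + msg.position();
        return new String(msg.array(), start, msg.remaining(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // A slice over a larger backing array: two bytes of padding on each side.
        byte[] backing = "xxBEGIN 42yy".getBytes(StandardCharsets.UTF_8);
        ByteBuffer msg = ByteBuffer.wrap(backing, 2, 8).slice();
        System.out.println(decode(msg)); // prints "BEGIN 42"
    }
}
```

With the sliced buffer in `main`, computing the length as `source.length - offset` would give 10 rather than 8 and pull in the trailing "yy", which is why the sketch uses `remaining()`. When the backing array ends exactly where the payload ends, both computations yield the same string.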