Fwd: postgresql Logical Replication Stream fails with “Database connection failed when reading from copy”

From: Jammie <shailesh(dot)jamloki(at)gmail(dot)com>
To: pgsql-jdbc(at)postgresql(dot)org
Subject: Fwd: postgresql Logical Replication Stream fails with “Database connection failed when reading from copy”
Date: 2018-09-05 05:35:01
Message-ID: CAFt1pcrKvqRN43BOLHiC-FpJYX-VSua55irVeugJEUTHw3ktTQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin pgsql-jdbc

Dave,
I am using Windows 10 and I do not have tcp_keep* settings set in my
regedit
location :
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters

So all these settings should be windows default values.

Regards
Shailesh
On Wed, 5 Sep 2018, 00:14 Dave Cramer, <davec(at)postgresintl(dot)com> wrote:

> What O/S are you using ? Can you see what the kernel tcp_keep* settings
> are ?
>
> On Tue, 4 Sep 2018 at 13:15, Jammie <shailesh(dot)jamloki(at)gmail(dot)com> wrote:
>
>> PGProperty.TCP_KEEP_ALIVE.set(connectionProperties, "true");
>>
>> this is how I am setting.
>>
>> On Tue, Sep 4, 2018 at 10:44 PM Dave Cramer <davec(at)postgresintl(dot)com>
>> wrote:
>>
>>> Are you specifying *tcpKeepAlive=true *in the connection string ?
>>>
>>> On Tue, 4 Sep 2018 at 12:27, Jammie <shailesh(dot)jamloki(at)gmail(dot)com> wrote:
>>>
>>>> The issue is still coming for me.
>>>>
>>>> One more thing to note here is my postgres database and Java program
>>>> that stream data is on the same box.
>>>>
>>>> Regards
>>>> Shailesh
>>>>
>>>> On Tue, Sep 4, 2018 at 9:26 PM Jammie <shailesh(dot)jamloki(at)gmail(dot)com>
>>>> wrote:
>>>>
>>>>> Ok, I'll check it out.
>>>>>
>>>>> On Tue, 4 Sep 2018, 20:59 Dave Cramer, <davec(at)postgresintl(dot)com> wrote:
>>>>>
>>>>>> Have a look at https://issues.jboss.org/browse/DBZ-218
>>>>>>
>>>>>> I think just setting "tcpKeepAlive=true" should work. Can you try
>>>>>> that and get back to me?
>>>>>>
>>>>>> On Tue, 4 Sep 2018 at 10:53, Jammie <shailesh(dot)jamloki(at)gmail(dot)com>
>>>>>> wrote:
>>>>>>
>>>>>>> One query is ,
>>>>>>> In my program I do not set
>>>>>>>
>>>>>>> stream.setAppliedPosition() or stream.setFlushLSN().
>>>>>>>
>>>>>>> So does that mean we will not be able to send the updates to server.
>>>>>>> and It is necessary to send this message periodically ?
>>>>>>>
>>>>>>> Regards
>>>>>>> Shailesh
>>>>>>>
>>>>>>> On Tue, Sep 4, 2018 at 8:05 PM Jammie <shailesh(dot)jamloki(at)gmail(dot)com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Ok. Thank you very much. !! Appreciate your time here .
>>>>>>>>
>>>>>>>> On Tue, Sep 4, 2018 at 8:03 PM Dave Cramer <davec(at)postgresintl(dot)com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Tue, 4 Sep 2018 at 10:32, Jammie <shailesh(dot)jamloki(at)gmail(dot)com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Most of the time I have set
>>>>>>>>>>
>>>>>>>>>> .withStatusInterval(20, TimeUnit.SECONDS)
>>>>>>>>>>
>>>>>>>>>> Just that to see If 0 can solve this issue I tried it once.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> No, that is worse.
>>>>>>>>>
>>>>>>>>> I'll build a jar for you to try.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Tue, Sep 4, 2018 at 7:59 PM Jammie <shailesh(dot)jamloki(at)gmail(dot)com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Sorry I have not done building of driver myself. Please help If
>>>>>>>>>>> you can do that for me or If you can share steps I can do that.
>>>>>>>>>>>
>>>>>>>>>>> Also would like to understand If we got the root cause fixed :
>>>>>>>>>>>
>>>>>>>>>>> Here is my further observation for this issue :
>>>>>>>>>>> 1) The issue comes at different point in time while stream is
>>>>>>>>>>> doing its job e.g. writing copy/reading copy/ending copy. I think
>>>>>>>>>>> underlying issue is Socket issue only.
>>>>>>>>>>> 2) Once this issue comes the stream object cant even be closed.
>>>>>>>>>>> even stream.close() throws the same error e.g. 'error reading copy`. Even
>>>>>>>>>>> SQL Connection object is unusable e.g. querying to database using that
>>>>>>>>>>> object hangs. However I see both stream.isClose() and connection.isClose()
>>>>>>>>>>> is false.
>>>>>>>>>>>
>>>>>>>>>>> Regards
>>>>>>>>>>> Shailesh
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Sep 4, 2018 at 7:52 PM Dave Cramer <
>>>>>>>>>>> davec(at)postgresintl(dot)com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hey,
>>>>>>>>>>>>
>>>>>>>>>>>> So here's what I think is happening. The keep alive timeout
>>>>>>>>>>>> exactly the same as the option in the slot.
>>>>>>>>>>>>
>>>>>>>>>>>> So in
>>>>>>>>>>>>
>>>>>>>>>>>> private boolean isTimeUpdate() {
>>>>>>>>>>>> /* a value of 0 disables automatic updates */
>>>>>>>>>>>> if ( updateInterval == 0 ) {
>>>>>>>>>>>> return false;
>>>>>>>>>>>> }
>>>>>>>>>>>> long diff = System.currentTimeMillis() - lastStatusUpdate;
>>>>>>>>>>>> return diff >= updateInterval;
>>>>>>>>>>>> }
>>>>>>>>>>>>
>>>>>>>>>>>> change this to be
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> return diff >= updateInterval-100;
>>>>>>>>>>>>
>>>>>>>>>>>> or something similar to get it to update before it times out on
>>>>>>>>>>>> the server
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Can you build the driver ? If not I'll send you a driver with
>>>>>>>>>>>> some mods to change
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, 4 Sep 2018 at 08:40, Dave Cramer <
>>>>>>>>>>>> davec(at)postgresintl(dot)com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, 3 Sep 2018 at 21:03, Jammie <
>>>>>>>>>>>>> shailesh(dot)jamloki(at)gmail(dot)com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi Dave,
>>>>>>>>>>>>>> thanks for the response !!
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 1) I have tried latest driver 42.2.5 as well the same
>>>>>>>>>>>>>> exception can be reproduced easily.
>>>>>>>>>>>>>> 2) I have tried postgres V10 version along with 9.6.4 same
>>>>>>>>>>>>>> issue could be reproduced.
>>>>>>>>>>>>>> The server side log says :
>>>>>>>>>>>>>> *2018-09-02 23:40:52 EDT LOG: could not receive data from
>>>>>>>>>>>>>> client: An existing connection was forcibly closed by the remote host.*
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> In one of the instance I had enabled JDBC trace and this is
>>>>>>>>>>>>>> what the last few lines are :
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> *Sep 03, 2018 3:37:52 AM org.postgresql.core.v3.*
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> *QueryExecutorImpl receiveRFQFINEST: <=BE
>>>>>>>>>>>>>> ReadyForQuery(I)Sep 03, 2018 3:37:52 AM org.postgresql.jdbc.*
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> *PgResultSet getIntFINEST: getInt columnIndex: 1Sep 03,
>>>>>>>>>>>>>> 2018 3:37:52 AM org.postgresql.jdbc.*
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> *PgConnection setAutoCommitFINE: setAutoCommit = falseSep
>>>>>>>>>>>>>> 03, 2018 3:37:53 AM org.postgresql.core.**QueryExecutorBase
>>>>>>>>>>>>>> close*
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I also see Socket reset error some times :
>>>>>>>>>>>>>> ---------------------
>>>>>>>>>>>>>> *Sep 03, 2018 3:29:42 AM org.postgresql.core.v3.*
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> *QueryExecutorImpl processCopyResultsFINEST: <=BE
>>>>>>>>>>>>>> CopyDataSep 03, 2018 3:29:42 AM org.postgresql.core.v3.*
>>>>>>>>>>>>>> *replication.*
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> *V3PGReplicationStream processKeepAliveMessageFINEST: <=BE
>>>>>>>>>>>>>> Keepalive(lastServerWal: 0/2DD0218, clock: 9/3/18 3:29 AM needReply:
>>>>>>>>>>>>>> false)Sep 03, 2018 3:30:40 AM org.postgresql.core.*
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> *QueryExecutorBase closeFINEST: FE=> TerminateSep 03, 2018
>>>>>>>>>>>>>> 3:30:40 AM org.postgresql.core.*
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> *QueryExecutorBase closeFINEST: Discarding IOException on
>>>>>>>>>>>>>> close:java.net.SocketException: Connection reset by peer: socket write
>>>>>>>>>>>>>> error at java.net.SocketOutputStream.*
>>>>>>>>>>>>>> *socketWrite0(Native Method) at
>>>>>>>>>>>>>> java.net.SocketOutputStream.**socketWrite(*
>>>>>>>>>>>>>> *SocketOutputStream.java:122) at
>>>>>>>>>>>>>> java.net.SocketOutputStream.**write(SocketOutputStream.java:*
>>>>>>>>>>>>>> *166) at java.io.BufferedOutputStream.**flushBuffer(*
>>>>>>>>>>>>>> *BufferedOutputStream.java:93) at
>>>>>>>>>>>>>> java.io.BufferedOutputStream.**flush(BufferedOutputStream.*
>>>>>>>>>>>>>> *java:151) at org.postgresql.core.PGStream.*
>>>>>>>>>>>>>> *flush(PGStream.java:514) at org.postgresql.core.*
>>>>>>>>>>>>>> *QueryExecutorBase.close(*
>>>>>>>>>>>>>> *QueryExecutorBase.java:136) at org.postgresql.jdbc.*
>>>>>>>>>>>>>> *PgConnection.close(*
>>>>>>>>>>>>>> *PgConnection.java:659) at com.datamirror.ts.scrapers.*
>>>>>>>>>>>>>> *postgresqlscraper.**PostgreSQLLogicalLogReaderImpl*
>>>>>>>>>>>>>> *.readLog(**PostgreSQLLogicalLogReaderImpl*
>>>>>>>>>>>>>> *.java:208) at com.datamirror.ts.scrapers.*
>>>>>>>>>>>>>> *postgresqlscraper.**PostgreSQLLogReader.readLog(*
>>>>>>>>>>>>>> *PostgreSQLLogReader.java:174) at
>>>>>>>>>>>>>> com.datamirror.ts.scrapers.**cdc.LogReader.execute(*
>>>>>>>>>>>>>> *LogReader.java:341) at com.datamirror.ts.scrapers.*
>>>>>>>>>>>>>> *cdc.LogReader$**LogReaderPipelineJob.execute(*
>>>>>>>>>>>>>> *LogReader.java:114) at com.datamirror.ts.engine.*
>>>>>>>>>>>>>> *component.PipelineThread.**runThread(PipelineThread.java:*
>>>>>>>>>>>>>> *217) at com.datamirror.ts.util.*
>>>>>>>>>>>>>> *TsThread.run(TsThread.java:*
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> *130)Sep 03, 2018 3:30:40 AM org.postgresql.Driver
>>>>>>>>>>>>>> connectFINEST: FE=> Terminate*
>>>>>>>>>>>>>> -------------------------
>>>>>>>>>>>>>> 3) Other than this I have tried to set PGProperty
>>>>>>>>>>>>>> TCP_KEEP_ALIVE to true and SOCKET_TIMEOUT as 0.
>>>>>>>>>>>>>> Also SetStatusInterval of the stream also to 0 but these did
>>>>>>>>>>>>>> not help the purpose.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The issue is very consistent and can be reproduced easily.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Regards
>>>>>>>>>>>>>> Shailesh
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Pretty sure I know how to fix this,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Give me a few hours..
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Tue, Sep 4, 2018 at 12:51 AM Dave Cramer <pg(at)fastcrypt(dot)com>
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Off the top of my head I don't have any advice. Can you try
>>>>>>>>>>>>>>> the latest driver though and see if it still exists.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Are there any messages on the server saying that it closed
>>>>>>>>>>>>>>> the connection ?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Dave Cramer
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> davec(at)postgresintl(dot)com
>>>>>>>>>>>>>>> www.postgresintl.com
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Mon, 3 Sep 2018 at 06:34, Jammie <
>>>>>>>>>>>>>>> shailesh(dot)jamloki(at)gmail(dot)com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi I am reading from logical replication stream
>>>>>>>>>>>>>>>> continuously in infinite while loop. and I have another program that is
>>>>>>>>>>>>>>>> continuously populating a table in the same database. I notice that after
>>>>>>>>>>>>>>>> some time (around 5-10 minutes). I always get the exception
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> org.postgresql.util.PSQLException: Database connection failed when reading from copy
>>>>>>>>>>>>>>>> at org.postgresql.core.v3.QueryExecutorImpl.readFromCopy(QueryExecutorImpl.java:1035)
>>>>>>>>>>>>>>>> at org.postgresql.core.v3.CopyDualImpl.readFromCopy(CopyDualImpl.java:41)
>>>>>>>>>>>>>>>> at org.postgresql.core.v3.replication.V3PGReplicationStream.receiveNextData(V3PGReplicationStream.java:155)
>>>>>>>>>>>>>>>> at org.postgresql.core.v3.replication.V3PGReplicationStream.readInternal(V3PGReplicationStream.java:124)
>>>>>>>>>>>>>>>> at org.postgresql.core.v3.replication.V3PGReplicationStream.read(V3PGReplicationStream.java:70)
>>>>>>>>>>>>>>>> at com.datamirror.ts.scrapers.postgresqlscraper.PGStreamReceiver.main(PGStreamReceiver.java:60)
>>>>>>>>>>>>>>>> Caused by: java.net.SocketException: Connection reset
>>>>>>>>>>>>>>>> at java.net.SocketInputStream.read(SocketInputStream.java:220)
>>>>>>>>>>>>>>>> at java.net.SocketInputStream.read(SocketInputStream.java:152)
>>>>>>>>>>>>>>>> at org.postgresql.core.VisibleBufferedInputStream.readMore(VisibleBufferedInputStream.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>

In response to

Responses

Browse pgsql-admin by date

  From Date Subject
Next Message pavan95 2018-09-05 09:30:48 Re: Heavy Logging in Subscriber side when configured Logical Replication in 10.4
Previous Message David G. Johnston 2018-09-04 14:30:40 Re: Heavy Logging in Subscriber side when configured Logical Replication in 10.4

Browse pgsql-jdbc by date

  From Date Subject
Next Message Dave Cramer 2018-09-05 10:50:42 Re: postgresql Logical Replication Stream fails with “Database connection failed when reading from copy”
Previous Message Dave Cramer 2018-09-04 14:22:14 Re: postgresql Logical Replication Stream fails with “Database connection failed when reading from copy”