Re: postgresql Logical Replication Stream fails with “Database connection failed when reading from copy”

From: Dave Cramer <pg(at)fastcrypt(dot)com>
To: Jammie Jamloki <shailesh(dot)jamloki(at)gmail(dot)com>
Cc: List <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: postgresql Logical Replication Stream fails with “Database connection failed when reading from copy”
Date: 2018-09-05 10:50:42
Message-ID: CADK3HH+yjHKobw4WoDOMmv9icM=F=28JTjT2e73kswNy7y0Lqw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin pgsql-jdbc

Shailesh,

Can you join https://gitter.im/debezium/dev to discuss this ?

Dave Cramer

davec(at)postgresintl(dot)com
www.postgresintl.com

On Wed, 5 Sep 2018 at 01:35, Jammie <shailesh(dot)jamloki(at)gmail(dot)com> wrote:

> Dave,
> I am using Windows 10 and I do not have tcp_keep* settings set in my
> regedit
> location :
> HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters
>
> So all these settings should be windows default values.
>
> Regards
> Shailesh
> On Wed, 5 Sep 2018, 00:14 Dave Cramer, <davec(at)postgresintl(dot)com> wrote:
>
>> What O/S are you using ? Can you see what the kernel tcp_keep* settings
>> are ?
>>
>> On Tue, 4 Sep 2018 at 13:15, Jammie <shailesh(dot)jamloki(at)gmail(dot)com> wrote:
>>
>>> PGProperty.TCP_KEEP_ALIVE.set(connectionProperties, "true");
>>>
>>> this is how I am setting.
>>>
>>> On Tue, Sep 4, 2018 at 10:44 PM Dave Cramer <davec(at)postgresintl(dot)com>
>>> wrote:
>>>
>>>> Are you specifying *tcpKeepAlive=true *in the connection string ?
>>>>
>>>> On Tue, 4 Sep 2018 at 12:27, Jammie <shailesh(dot)jamloki(at)gmail(dot)com> wrote:
>>>>
>>>>> The issue is still coming for me.
>>>>>
>>>>> One more thing to note here is my postgres database and Java program
>>>>> that stream data is on the same box.
>>>>>
>>>>> Regards
>>>>> Shailesh
>>>>>
>>>>> On Tue, Sep 4, 2018 at 9:26 PM Jammie <shailesh(dot)jamloki(at)gmail(dot)com>
>>>>> wrote:
>>>>>
>>>>>> Ok, I'll check it out.
>>>>>>
>>>>>> On Tue, 4 Sep 2018, 20:59 Dave Cramer, <davec(at)postgresintl(dot)com>
>>>>>> wrote:
>>>>>>
>>>>>>> Have a look at https://issues.jboss.org/browse/DBZ-218
>>>>>>>
>>>>>>> I think just setting "tcpKeepAlive=true" should work. Can you try
>>>>>>> that and get back to me?
>>>>>>>
>>>>>>> On Tue, 4 Sep 2018 at 10:53, Jammie <shailesh(dot)jamloki(at)gmail(dot)com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> One query is ,
>>>>>>>> In my program I do not set
>>>>>>>>
>>>>>>>> stream.setAppliedPosition() or stream.setFlushLSN().
>>>>>>>>
>>>>>>>> So does that mean we will not be able to send the updates to
>>>>>>>> server. and It is necessary to send this message periodically ?
>>>>>>>>
>>>>>>>> Regards
>>>>>>>> Shailesh
>>>>>>>>
>>>>>>>> On Tue, Sep 4, 2018 at 8:05 PM Jammie <shailesh(dot)jamloki(at)gmail(dot)com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Ok. Thank you very much. !! Appreciate your time here .
>>>>>>>>>
>>>>>>>>> On Tue, Sep 4, 2018 at 8:03 PM Dave Cramer <davec(at)postgresintl(dot)com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Tue, 4 Sep 2018 at 10:32, Jammie <shailesh(dot)jamloki(at)gmail(dot)com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Most of the time I have set
>>>>>>>>>>>
>>>>>>>>>>> .withStatusInterval(20, TimeUnit.SECONDS)
>>>>>>>>>>>
>>>>>>>>>>> Just that to see If 0 can solve this issue I tried it once.
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> No, that is worse.
>>>>>>>>>>
>>>>>>>>>> I'll build a jar for you to try.
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Sep 4, 2018 at 7:59 PM Jammie <
>>>>>>>>>>> shailesh(dot)jamloki(at)gmail(dot)com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Sorry I have not done building of driver myself. Please help If
>>>>>>>>>>>> you can do that for me or If you can share steps I can do that.
>>>>>>>>>>>>
>>>>>>>>>>>> Also would like to understand If we got the root cause fixed :
>>>>>>>>>>>>
>>>>>>>>>>>> Here is my further observation for this issue :
>>>>>>>>>>>> 1) The issue comes at different point in time while stream is
>>>>>>>>>>>> doing its job e.g. writing copy/reading copy/ending copy. I think
>>>>>>>>>>>> underlying issue is Socket issue only.
>>>>>>>>>>>> 2) Once this issue comes the stream object cant even be closed.
>>>>>>>>>>>> even stream.close() throws the same error e.g. 'error reading copy`. Even
>>>>>>>>>>>> SQL Connection object is unusable e.g. querying to database using that
>>>>>>>>>>>> object hangs. However I see both stream.isClose() and connection.isClose()
>>>>>>>>>>>> is false.
>>>>>>>>>>>>
>>>>>>>>>>>> Regards
>>>>>>>>>>>> Shailesh
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, Sep 4, 2018 at 7:52 PM Dave Cramer <
>>>>>>>>>>>> davec(at)postgresintl(dot)com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hey,
>>>>>>>>>>>>>
>>>>>>>>>>>>> So here's what I think is happening. The keep alive timeout
>>>>>>>>>>>>> exactly the same as the option in the slot.
>>>>>>>>>>>>>
>>>>>>>>>>>>> So in
>>>>>>>>>>>>>
>>>>>>>>>>>>> private boolean isTimeUpdate() {
>>>>>>>>>>>>> /* a value of 0 disables automatic updates */
>>>>>>>>>>>>> if ( updateInterval == 0 ) {
>>>>>>>>>>>>> return false;
>>>>>>>>>>>>> }
>>>>>>>>>>>>> long diff = System.currentTimeMillis() - lastStatusUpdate;
>>>>>>>>>>>>> return diff >= updateInterval;
>>>>>>>>>>>>> }
>>>>>>>>>>>>>
>>>>>>>>>>>>> change this to be
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> return diff >= updateInterval-100;
>>>>>>>>>>>>>
>>>>>>>>>>>>> or something similar to get it to update before it times out
>>>>>>>>>>>>> on the server
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Can you build the driver ? If not I'll send you a driver with
>>>>>>>>>>>>> some mods to change
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Tue, 4 Sep 2018 at 08:40, Dave Cramer <
>>>>>>>>>>>>> davec(at)postgresintl(dot)com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Mon, 3 Sep 2018 at 21:03, Jammie <
>>>>>>>>>>>>>> shailesh(dot)jamloki(at)gmail(dot)com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi Dave,
>>>>>>>>>>>>>>> thanks for the response !!
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 1) I have tried latest driver 42.2.5 as well the same
>>>>>>>>>>>>>>> exception can be reproduced easily.
>>>>>>>>>>>>>>> 2) I have tried postgres V10 version along with 9.6.4 same
>>>>>>>>>>>>>>> issue could be reproduced.
>>>>>>>>>>>>>>> The server side log says :
>>>>>>>>>>>>>>> *2018-09-02 23:40:52 EDT LOG: could not receive data from
>>>>>>>>>>>>>>> client: An existing connection was forcibly closed by the remote host.*
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> In one of the instance I had enabled JDBC trace and this is
>>>>>>>>>>>>>>> what the last few lines are :
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> *Sep 03, 2018 3:37:52 AM org.postgresql.core.v3.*
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> *QueryExecutorImpl receiveRFQFINEST: <=BE
>>>>>>>>>>>>>>> ReadyForQuery(I)Sep 03, 2018 3:37:52 AM org.postgresql.jdbc.*
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> *PgResultSet getIntFINEST: getInt columnIndex: 1Sep 03,
>>>>>>>>>>>>>>> 2018 3:37:52 AM org.postgresql.jdbc.*
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> *PgConnection setAutoCommitFINE: setAutoCommit = falseSep
>>>>>>>>>>>>>>> 03, 2018 3:37:53 AM org.postgresql.core.**QueryExecutorBase
>>>>>>>>>>>>>>> close*
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I also see Socket reset error some times :
>>>>>>>>>>>>>>> ---------------------
>>>>>>>>>>>>>>> *Sep 03, 2018 3:29:42 AM org.postgresql.core.v3.*
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> *QueryExecutorImpl processCopyResultsFINEST: <=BE
>>>>>>>>>>>>>>> CopyDataSep 03, 2018 3:29:42 AM org.postgresql.core.v3.*
>>>>>>>>>>>>>>> *replication.*
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> *V3PGReplicationStream processKeepAliveMessageFINEST: <=BE
>>>>>>>>>>>>>>> Keepalive(lastServerWal: 0/2DD0218, clock: 9/3/18 3:29 AM needReply:
>>>>>>>>>>>>>>> false)Sep 03, 2018 3:30:40 AM org.postgresql.core.*
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> *QueryExecutorBase closeFINEST: FE=> TerminateSep 03, 2018
>>>>>>>>>>>>>>> 3:30:40 AM org.postgresql.core.*
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> *QueryExecutorBase closeFINEST: Discarding IOException on
>>>>>>>>>>>>>>> close:java.net.SocketException: Connection reset by peer: socket write
>>>>>>>>>>>>>>> error at java.net.SocketOutputStream.*
>>>>>>>>>>>>>>> *socketWrite0(Native Method) at
>>>>>>>>>>>>>>> java.net.SocketOutputStream.**socketWrite(*
>>>>>>>>>>>>>>> *SocketOutputStream.java:122) at
>>>>>>>>>>>>>>> java.net.SocketOutputStream.*
>>>>>>>>>>>>>>> *write(SocketOutputStream.java:*
>>>>>>>>>>>>>>> *166) at java.io.BufferedOutputStream.**flushBuffer(*
>>>>>>>>>>>>>>> *BufferedOutputStream.java:93) at
>>>>>>>>>>>>>>> java.io.BufferedOutputStream.**flush(BufferedOutputStream.*
>>>>>>>>>>>>>>> *java:151) at org.postgresql.core.PGStream.*
>>>>>>>>>>>>>>> *flush(PGStream.java:514) at org.postgresql.core.*
>>>>>>>>>>>>>>> *QueryExecutorBase.close(*
>>>>>>>>>>>>>>> *QueryExecutorBase.java:136) at org.postgresql.jdbc.*
>>>>>>>>>>>>>>> *PgConnection.close(*
>>>>>>>>>>>>>>> *PgConnection.java:659) at com.datamirror.ts.scrapers.*
>>>>>>>>>>>>>>> *postgresqlscraper.**PostgreSQLLogicalLogReaderImpl*
>>>>>>>>>>>>>>> *.readLog(**PostgreSQLLogicalLogReaderImpl*
>>>>>>>>>>>>>>> *.java:208) at com.datamirror.ts.scrapers.*
>>>>>>>>>>>>>>> *postgresqlscraper.**PostgreSQLLogReader.readLog(*
>>>>>>>>>>>>>>> *PostgreSQLLogReader.java:174) at
>>>>>>>>>>>>>>> com.datamirror.ts.scrapers.**cdc.LogReader.execute(*
>>>>>>>>>>>>>>> *LogReader.java:341) at com.datamirror.ts.scrapers.*
>>>>>>>>>>>>>>> *cdc.LogReader$**LogReaderPipelineJob.execute(*
>>>>>>>>>>>>>>> *LogReader.java:114) at com.datamirror.ts.engine.*
>>>>>>>>>>>>>>> *component.PipelineThread.**runThread(PipelineThread.java:*
>>>>>>>>>>>>>>> *217) at com.datamirror.ts.util.*
>>>>>>>>>>>>>>> *TsThread.run(TsThread.java:*
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> *130)Sep 03, 2018 3:30:40 AM org.postgresql.Driver
>>>>>>>>>>>>>>> connectFINEST: FE=> Terminate*
>>>>>>>>>>>>>>> -------------------------
>>>>>>>>>>>>>>> 3) Other than this I have tried to set PGProperty
>>>>>>>>>>>>>>> TCP_KEEP_ALIVE to true and SOCKET_TIMEOUT as 0.
>>>>>>>>>>>>>>> Also SetStatusInterval of the stream also to 0 but these did
>>>>>>>>>>>>>>> not help the purpose.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The issue is very consistent and can be reproduced easily.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Regards
>>>>>>>>>>>>>>> Shailesh
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Pretty sure I know how to fix this,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Give me a few hours..
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Tue, Sep 4, 2018 at 12:51 AM Dave Cramer <
>>>>>>>>>>>>>>> pg(at)fastcrypt(dot)com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Off the top of my head I don't have any advice. Can you try
>>>>>>>>>>>>>>>> the latest driver though and see if it still exists.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Are there any messages on the server saying that it closed
>>>>>>>>>>>>>>>> the connection ?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Dave Cramer
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> davec(at)postgresintl(dot)com
>>>>>>>>>>>>>>>> www.postgresintl.com
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Mon, 3 Sep 2018 at 06:34, Jammie <
>>>>>>>>>>>>>>>> shailesh(dot)jamloki(at)gmail(dot)com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hi I am reading from logical replication stream
>>>>>>>>>>>>>>>>> continuously in infinite while loop. and I have another program that is
>>>>>>>>>>>>>>>>> continuously populating a table in the same database. I notice that after
>>>>>>>>>>>>>>>>> some time (around 5-10 minutes). I always get the exception
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> org.postgresql.util.PSQLException: Database connection failed when reading from copy
>>>>>>>>>>>>>>>>> at org.postgresql.core.v3.QueryExecutorImpl.readFromCopy(QueryExecutorImpl.java:1035)
>>>>>>>>>>>>>>>>> at org.postgresql.core.v3.CopyDualImpl.readFromCopy(CopyDualImpl.java:41)
>>>>>>>>>>>>>>>>> at org.postgresql.core.v3.replication.V3PGReplicationStream.receiveNextData(V3PGReplicationStream.java:155)
>>>>>>>>>>>>>>>>> at org.postgresql.core.v3.replication.V3PGReplicationStream.readInternal(V3PGReplicationStream.java:124)
>>>>>>>>>>>>>>>>> at org.postgresql.core.v3.replication.V3PGReplicationStream.read(V3PGReplicationStream.java:70)
>>>>>>>>>>>>>>>>> at com.datamirror.ts.scrapers.postgresqlscraper.PGStreamReceiver.main(PGStreamReceiver.java:60)
>>>>>>>>>>>>>>>>> Caused by: java.net.SocketException: Connection reset
>>>>>>>>>>>>>>>>> at java.net.SocketInputStream.read(SocketInputStream.java:220)
>>>>>>>>>>>>>>>>> at java.net.SocketInputStream.read(SocketInputStream.java:152)
>>>>>>>>>>>>>>>>> at org.postgresql.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>

In response to

Responses

Browse pgsql-admin by date

  From Date Subject
Next Message Jammie 2018-09-05 11:35:36 Re: postgresql Logical Replication Stream fails with “Database connection failed when reading from copy”
Previous Message pavan95 2018-09-05 09:42:11 Re: How to get alerted automatically whenever a table structure is changed between Publisher and Subscriber in Logical Replication?

Browse pgsql-jdbc by date

  From Date Subject
Next Message Dave Cramer 2018-09-05 11:24:42 re-skinning the jdbc.postgresql.org
Previous Message Jammie 2018-09-05 05:35:01 Fwd: postgresql Logical Replication Stream fails with “Database connection failed when reading from copy”