Re: Postgres restart during CopyManager.copyIn does not free connection, thread stuck on QueryExecutorImpl.waitOnLock

From: Alexis Meneses <alexis(dot)meneses(at)gmail(dot)com>
To: Brendan Reekie <breekie(at)sandvine(dot)com>
Cc: "pgsql-jdbc(at)postgresql(dot)org" <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: Postgres restart during CopyManager.copyIn does not free connection, thread stuck on QueryExecutorImpl.waitOnLock
Date: 2015-02-11 23:45:14
Message-ID: CANPkoZS7jNmPYyrPguw-RHJu1KzXAFtKh7teGNeWQZ_TQGro-A@mail.gmail.com
Lists: pgsql-jdbc

Hi

I think that a similar issue has been seen already (see thread
http://www.postgresql.org/message-id/flat/CADGbXSQ--8pJcSPkC7+tR6rsGrk7p=141Bp16VJiOR5mg_SQpQ(at)mail(dot)gmail(dot)com)
but it has not yet been fixed.

Would you have time to work on a patch and submit a pull request on the
GitHub project?

Thanks.

Alexis

2015-02-09 19:38 GMT+01:00 Brendan Reekie <breekie(at)sandvine(dot)com>:

> Hi,
>
> I’m currently using driver 9.3.1100-jdbc3.jar with a 9.3.5 server.
>
> If the connection to the database is lost because Postgres restarts while
> a CopyManager.copyIn() call is executing, the connection is never freed,
> and the stack trace shows that the thread is still waiting for the lock
> to be released:
>
> java.lang.Object.$$YJP$$wait(Native Method)
> java.lang.Object.wait(Object.java)
> java.lang.Object.wait(Object.java:503)
> org.postgresql.core.v3.QueryExecutorImpl.waitOnLock(QueryExecutorImpl.java:91)
> org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:228)
> org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:560)
> org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:403)
> org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:395)
>
> Debugging through the code, it looks like the issue might be in
> QueryExecutorImpl.cancelCopy(). When the method attempts to flush the
> pgStream, an IOException is thrown, so the code that would release the
> lock (processCopyResults) is never called. The connection remains open
> and the lock is never freed.
>
>     /**
>      * Finishes a copy operation and unlocks connection discarding any exchanged data.
>      * @param op the copy operation presumably currently holding lock on this connection
>      * @throws SQLException on any additional failure
>      */
>     public void cancelCopy(CopyOperationImpl op) throws SQLException {
>         if(!hasLock(op))
>             throw new PSQLException(GT.tr("Tried to cancel an inactive copy operation"), PSQLState.OBJECT_NOT_IN_STATE);
>
>         SQLException error = null;
>         int errors = 0;
>
>         try {
>             if(op instanceof CopyInImpl) {
>                 synchronized (this) {
>                     if (logger.logDebug()) {
>                         logger.debug("FE => CopyFail");
>                     }
>                     final byte[] msg = Utils.encodeUTF8("Copy cancel requested");
>                     pgStream.SendChar('f'); // CopyFail
>                     pgStream.SendInteger4(5 + msg.length);
>                     pgStream.Send(msg);
>                     pgStream.SendChar(0);
>                     pgStream.flush();
>                     do {
>                         try {
>                             processCopyResults(op, true); // discard rest of input
>                         } catch(SQLException se) { // expected error response to failing copy
>                             errors++;
>                             if( error != null ) {
>                                 SQLException e = se, next;
>                                 while( (next = e.getNextException()) != null )
>                                     e = next;
>                                 e.setNextException(error);
>                             }
>                             error = se;
>                         }
>                     } while(hasLock(op));
>                 }
>             } else if (op instanceof CopyOutImpl) {
>                 protoConnection.sendQueryCancel();
>             }
>
>         } catch(IOException ioe) {
>             throw new PSQLException(GT.tr("Database connection failed when canceling copy operation"), PSQLState.CONNECTION_FAILURE, ioe);
>         }
>
>         if (op instanceof CopyInImpl) {
>             if(errors < 1) {
>                 throw new PSQLException(GT.tr("Missing expected error response to copy cancel request"), PSQLState.COMMUNICATION_ERROR);
>             } else if(errors > 1) {
>                 throw new PSQLException(GT.tr("Got {0} error responses to single copy cancel request", String.valueOf(errors)), PSQLState.COMMUNICATION_ERROR, error);
>             }
>         }
>     }
>
> I’ve tried the latest driver, 9.4-1200, and observed the same behaviour. To
> reproduce it, I use a test program that streams data into copyIn, set a
> breakpoint, and restart the Postgres server while the copyIn is in progress.
>
> Has anyone seen this issue before? Is there a workaround for this
> scenario?
>
> Thanks in advance,
>
> Brendan
>
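
The failure mode described in the quoted message can be illustrated with a small, self-contained sketch. This is a toy model and not the pgjdbc code: the class CopyLockSketch, its methods, and the serverDown flag are all hypothetical names introduced here, and flushToServer only stands in for pgStream.flush() failing on a dead socket. It shows why an exception thrown before the unlock step leaves later callers waiting on the lock, and how releasing the lock in a finally block would avoid that. Unlike the driver's waitOnLock, the one below takes a timeout so the example terminates.

```java
import java.io.IOException;

/** Toy model of a connection-level lock; hypothetical, not the pgjdbc implementation. */
class CopyLockSketch {
    private Object lockedFor; // current lock holder, null when the lock is free

    synchronized void lock(Object holder) {
        lockedFor = holder;
    }

    synchronized boolean hasLock(Object holder) {
        return lockedFor == holder;
    }

    synchronized void unlock(Object holder) {
        if (lockedFor == holder) {
            lockedFor = null;
            notifyAll(); // wake any thread blocked in waitOnLock
        }
    }

    /** Waits until the lock is free or the timeout expires; returns true if it is free. */
    synchronized boolean waitOnLock(long timeoutMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (lockedFor != null) {
            long remaining = (deadline - System.nanoTime()) / 1_000_000L;
            if (remaining <= 0) {
                return false; // still locked when the timeout expired
            }
            try {
                wait(remaining);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }

    /** Stand-in for flushing the stream; fails when the server is gone. */
    private void flushToServer(boolean serverDown) throws IOException {
        if (serverDown) {
            throw new IOException("connection reset");
        }
    }

    /** Same shape as the reported problem: the unlock is skipped when the flush throws. */
    void cancelCopyLeaky(Object op, boolean serverDown) throws IOException {
        flushToServer(serverDown);
        unlock(op);
    }

    /** One possible remedy (an assumption, not the project's fix): always unlock. */
    void cancelCopySafe(Object op, boolean serverDown) throws IOException {
        try {
            flushToServer(serverDown);
        } finally {
            unlock(op);
        }
    }
}
```

With the leaky variant, a simulated server outage leaves hasLock(op) true and a later waitOnLock call times out. With the finally-based variant the IOException still propagates to the caller, but the lock is released first. In the real driver a broken connection would presumably also need to be closed; the sketch covers only the lock release.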
