Re: Infinite waitOnLock

From: Dave Cramer <pg(at)fastcrypt(dot)com>
To: Leonard Meyer <lmeyer(at)excilys(dot)com>
Cc: List <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: Infinite waitOnLock
Date: 2014-12-10 11:34:48
Message-ID: CADK3HHLKUFMKtmE44EWV=3wp_KL6G_O+VVQC5MDb_xjz3EAVcg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-jdbc

So sure enough we do not timeout. The question is how long to wait. I
haven't looked at this in detail, but copy could take a very long time
depending on what people are copying in/out

Before we go chasing this can you build a driver from master and try it?

Dave Cramer

dave.cramer(at)credativ(dot)ca
http://www.credativ.ca

On 10 December 2014 at 03:47, Leonard Meyer <lmeyer(at)excilys(dot)com> wrote:

> Hi,
>
> I was advised to come here for this instead of the pgsql-bugs ML. As I
> already explained it, I'll just copy paste the problem :
>
> We're doing an ETL Akka application using PostgreSQL 9.3.5 using
> driver 9.3-1102-jdbc41. We're currently testing our application resilience
> by shutting down the database in the middle of our processing. Basically we
> have 8 threads working and when shutting down, some threads just get stuck
> forever waiting on a lock.
> We think it's coming from a bug inside the driver because we would expect
> the lock acquisition to timeout at some point. Here's the stacktrace of a
> frozen thread :
>
> "xxxxxx-akka.actor.pinned-dispatcher-26" prio=10 tid=0x00007f3a1c007800
>> nid=0x3125 in Object.wait() [0x00007f3acaae8000]
>> java.lang.Thread.State: WAITING (on object monitor)
>> at java.lang.Object.wait(Native Method)
>> - waiting on <0x0000000744a14aa0> (a org.postgresql.core.v3.
>> QueryExecutorImpl)
>> at java.lang.Object.wait(Object.java:503)
>> at org.postgresql.core.v3.QueryExecutorImpl.waitOnLock(
>> QueryExecutorImpl.java:91)
>> at org.postgresql.core.v3.QueryExecutorImpl.execute(
>> QueryExecutorImpl.java:228)
>> - locked <0x0000000744a14aa0> (a org.postgresql.core.v3.
>> QueryExecutorImpl)
>> at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(
>> AbstractJdbc2Statement.java:561)
>> at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(
>> AbstractJdbc2Statement.java:405)
>> at org.postgresql.jdbc2.AbstractJdbc2Statement.executeQuery(
>> AbstractJdbc2Statement.java:285)
>> at org.postgresql.jdbc4.AbstractJdbc4Connection.isValid(
>> AbstractJdbc4Connection.java:130)
>> at org.postgresql.jdbc4.Jdbc4Connection.isValid(Jdbc4Connection.java:21)
>> at com.zaxxer.hikari.proxy.ConnectionProxy.isValid(
>> ConnectionProxy.java:357)
>> at xx.xxxx.actors.flux.mclu.McluProcessor$$anonfun$receive$1.applyOrElse(
>> McluProcessor.scala:65)
>> at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
>> at xx.xxxx.actors.flux.mclu.McluProcessor.aroundReceive(
>> McluProcessor.scala:30)
>> at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
>> at akka.actor.ActorCell.invoke(ActorCell.scala:487)
>> at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254)
>> at akka.dispatch.Mailbox.run(Mailbox.scala:221)
>> at java.util.concurrent.ThreadPoolExecutor.runWorker(
>> ThreadPoolExecutor.java:1145)
>> at java.util.concurrent.ThreadPoolExecutor$Worker.run(
>> ThreadPoolExecutor.java:615)
>> at java.lang.Thread.run(Thread.java:744)
>> Locked ownable synchronizers:
>> - <0x000000074482e6d0> (a java.util.concurrent.ThreadPoolExecutor$Worker)
>
>
> We've first dropped by the Akka group, thinking it was a problem with
> our code, but nothing came up. HikariCP's main developper showed up with an
> explanation though :
>
> It appears that you are using the PostgreSQL CopyManager, correct?
>> Looking at QueryExecutorImpl it appears that rollback() is trying to
>> obtain a lock that was not released by the CopyManager. I recommend using
>> the CopyManager.copyIn() method that returns a CopyIn object, rather
>> than using the convenience method that takes a reader. Use the
>> writeToCopy() to pump the data in, and be sure to catch SQLException.
>> If you get an SQLException, call cancelCopy() and retry or whatever your
>> recovery scenario is, otherwise call endCopy(). I would have expected
>> PostgreSQL to handle the severing of a Connection in the middle of a bulk
>> copy better, but that is probably a question for the PostgreSQL group.
>
>
> Tried it, no luck. Here the link for reference :
> https://groups.google.com/forum/#!topic/akka-user/Ehzioy3jVoU
>
> Here's what we can show you of our (scala) code :
>
>
>> val cpManager = connection.unwrap(classOf[PGConnection]).getCopyAPI
>> val stringBytes: Array[Byte] = batchStrings.map(_.toByte).toArray
>> val copy = cpManager.copyIn(s"COPY temp FROM STDIN WITH CSV")
>> try {
>> copy.writeToCopy(stringBytes, 0, stringBytes.size)
>> copy.endCopy()
>> } finally {
>> if (copy.isActive) {
>> copy.cancelCopy()
>> }
>> }
>
>
> Connection used is closed further down in a finally block.
>
> We're rather sure the lock is coming from here since we have another
> exactly similar processing but with regular inserts instead of this, and no
> issues so far. Thanks for any help.
>

In response to

Responses

Browse pgsql-jdbc by date

  From Date Subject
Next Message Dave Cramer 2014-12-10 11:44:03 Re: Incorrect return code for error on call of commit during 2PC
Previous Message Leonard Meyer 2014-12-10 08:47:09 Infinite waitOnLock