Infinite waitOnLock

From: Leonard Meyer <lmeyer(at)excilys(dot)com>
To: pgsql-bugs(at)postgresql(dot)org
Subject: Infinite waitOnLock
Date: 2014-12-04 14:44:28
Message-ID: CADGbXST3s1G29+5KJXaJpU-3x3wJ=KpUhdySFTe0mjdD=ZeYPQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Hi,

We're doing an ETL Akka application using PostgreSQL 9.3.5 using
driver 9.3-1102-jdbc41. We're currently testing our application resilience
by shutting down the database in the middle of our processing. Basically we
have 8 threads working and when shutting down, some threads just get stuck
forever waiting on a lock.
We think it's coming from a bug inside the driver because we would expect
the lock acquisition to timeout at some point. Here's the stacktrace of a
frozen thread :

"xxxxxx-akka.actor.pinned-dispatcher-26" prio=10 tid=0x00007f3a1c007800
> nid=0x3125 in Object.wait() [0x00007f3acaae8000]
> java.lang.Thread.State: WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> - waiting on <0x0000000744a14aa0> (a
> org.postgresql.core.v3.QueryExecutorImpl)
> at java.lang.Object.wait(Object.java:503)
> at
> org.postgresql.core.v3.QueryExecutorImpl.waitOnLock(QueryExecutorImpl.java:91)
> at
> org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:228)
> - locked <0x0000000744a14aa0> (a org.postgresql.core.v3.QueryExecutorImpl)
> at
> org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:561)
> at
> org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:405)
> at
> org.postgresql.jdbc2.AbstractJdbc2Statement.executeQuery(AbstractJdbc2Statement.java:285)
> at
> org.postgresql.jdbc4.AbstractJdbc4Connection.isValid(AbstractJdbc4Connection.java:130)
> at org.postgresql.jdbc4.Jdbc4Connection.isValid(Jdbc4Connection.java:21)
> at
> com.zaxxer.hikari.proxy.ConnectionProxy.isValid(ConnectionProxy.java:357)
> at
> xx.xxxx.actors.flux.mclu.McluProcessor$$anonfun$receive$1.applyOrElse(McluProcessor.scala:65)
> at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
>
> at xx.xxxx.actors.flux.mclu.McluProcessor.aroundReceive(McluProcessor.scala:30)
> at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
> at akka.actor.ActorCell.invoke(ActorCell.scala:487)
> at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254)
> at akka.dispatch.Mailbox.run(Mailbox.scala:221)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:744)
> Locked ownable synchronizers:
> - <0x000000074482e6d0> (a java.util.concurrent.ThreadPoolExecutor$Worker)

We've first dropped by the Akka group, thinking it was a problem with our
code, but nothing came up. HikariCP's main developper showed up with an
explanation though :

It appears that you are using the PostgreSQL CopyManager, correct? Looking
> at QueryExecutorImpl it appears that rollback() is trying to obtain a
> lock that was not released by the CopyManager. I recommend using the
> CopyManager.copyIn() method that returns a CopyIn object, rather than
> using the convenience method that takes a reader. Use the writeToCopy() to
> pump the data in, and be sure to catch SQLException. If you get an
> SQLException, call cancelCopy() and retry or whatever your recovery
> scenario is, otherwise call endCopy(). I would have expected PostgreSQL
> to handle the severing of a Connection in the middle of a bulk copy better,
> but that is probably a question for the PostgreSQL group.

Tried it, no luck. Here the link for reference :
https://groups.google.com/forum/#!topic/akka-user/Ehzioy3jVoU

Here's what we can show you of our (scala) code :

val cpManager = connection.unwrap(classOf[PGConnection]).getCopyAPI
> val stringBytes: Array[Byte] = batchStrings.map(_.toByte).toArray
> val copy = cpManager.copyIn(s"COPY temp FROM STDIN WITH CSV")
> try {
> copy.writeToCopy(stringBytes, 0, stringBytes.size)
> copy.endCopy()
> } finally {
> if (copy.isActive) {
> copy.cancelCopy()
> }
> }

Connection used is closed further down in a finally block.

We're rather sure the lock is coming from here since we have another
exactly similar processing but with regular inserts instead of this, and no
issues so far. Thanks for any help.

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Kevin Grittner 2014-12-04 15:30:54 Re: Infinite waitOnLock
Previous Message Matheus de Oliveira 2014-12-04 14:14:57 Re: BUG #12129: questions on exporting data from a database