Re: How do I track down a possible locking problem?

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Herouth Maoz <herouth(at)unicell(dot)co(dot)il>
Cc: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: How do I track down a possible locking problem?
Date: 2014-02-18 17:02:54
Message-ID: CAMkU=1yNP6EzSL2HtFejAtytLURfheW50WS-E8ZVkFXjc+0EvQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Mon, Feb 17, 2014 at 8:45 AM, Herouth Maoz <herouth(at)unicell(dot)co(dot)il> wrote:

> I have a production system using Postgresql 9.1.2.
>
> The system basically receives messages, puts them in a queue, and then
> several parallel modules, each in its own thread, read from that queue, and
> perform two inserts, then release the message to the next queue for
> non-database-related processing.
>
> Today, after we received complaints from a customer about delays, I
> noticed odd behavior in the system. About 2 percent of the messages were
> inserted into the tables more than an hour after they got into the system.
>

How do you know that?

>
> The queue never has more than 27,000 messages at the same time, and all
> together, the parallel modules process about 5000 or 6000 messages per
> minute. So basically, the delay for a single message should never be more
> than a few minutes. Even if one module gets stuck, another will grab the
> next message from the queue. I believe the only way for a message to be
> stuck for so long would be for it to be grabbed by a module, and then for
> the database write to be somehow delayed for a hour, although it's merely a
> simple insert performed with a prepared statement.
>
> The database in production is very busy with millions of writes per hour.
> Could there be a situation in which a particular connection gets "starved"
> while other connections are able to run queries without noticeable delay?
>

If there is a delay like that, it would almost certainly be due to database
locks that show up in pg_locks.

http://www.postgresql.org/docs/current/static/view-pg-locks.html
http://wiki.postgresql.org/wiki/Lock_Monitoring
http://wiki.postgresql.org/wiki/Lock_dependency_information

But, I doubt that that is your problem.

>
> How can I truck such locks down? Does anybody have any ideas other than
> starvation? The system lets me view statistics of how many messages were
> processed in each modules and the average latency. None of the four modules
> running has long average latency or low number of messages processes, so I
> don't think the issue is related to any particular thread in my (Java)
> system being slow or locked away by the others.
>

If the insert into PostgreSQL was freezing, wouldn't that time get
reflected in your latency monitoring?

It sounds to me like your application has a bug in its queue, where it
forgets about items on the queue for a while.

Cheers,

Jeff

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Purdon 2014-02-18 17:07:58 Is pgFoundry Down? (2/18/2014)
Previous Message Merlin Moncure 2014-02-18 16:47:58 Re: Is PostgreSQL 9.3 using indexes for pipelined top-N window function queries?