Quick Links

Re: How to debug a locked backend ?

From:	Csaba Nagy <nagy(at)ecircle-ag(dot)com>
To:	Richard Huxton <dev(at)archonet(dot)com>
Cc:	Postgres general mailing list <pgsql-general(at)postgresql(dot)org>
Subject:	Re: How to debug a locked backend ?
Date:	2005-11-18 16:00:06
Message-ID:	1132329606.10890.457.camel@coppola.muc.ecircle.de
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-general

> Well, I've had time to read your previous message too.
>
> The first time you seem to imply the machine slowed down across all
> processes - ssh etc. Was that the case this time?

OK, the slowdown/crash was a different problem, which might have been
caused by a "too many files open" problem combined with a huge load.
There was no evidence that the server actually crashed, but we needed to
reboot it because it became completely non-responsive. My current
problem is this:

http://archives.postgresql.org/pgsql-general/2005-11/msg00828.php

This current situation is different, the machine was almost idle, and
there was 1 UPDATE on a table running, a lot of inserts on the same
table waiting for a lock held by the transaction of the UPDATE, and a
VACUUM (started by autovacuum) running on the same table. There was no
dead-lock detected by the system, and I guess it should have detected it
if it was one, cause this whole thing was running for more than 3 hours
according to pg_stat_activity, and I set the deadlock timeout to 2
seconds.

So this happened 2 days ago, and today it happened again with exactly
the same picture: update, inserts, (auto)vacuum, on exactly the same
table as 2 days ago. That can't be a coincidence, there is something bad
either with our code or postgres. I do suspect that our code is doing
something not completely kosher, but it still should not lock in this
way.

Now according to pg_locks, the transaction of the update was holding an
exclusive lock on the table, which I can't explain, as we don't acquire
any table lock in our whole application, and the SQL of the update is
coming from our application, I could identify the process it's part of.
The only explicit locking we do in that process is via a SELECT ... FOR
UPDATE on that table, but that should place no exclusive lock on the
table, right ? So where is that lock coming from ?

And then, why was I not able to cancel the backend via kill or
pg_cancel_backend ?

> When you say "locked" do you mean it was waiting on locks, was using all
> the CPU, unresponsive or just taking the query a long time?
>
> To prepare for next time I'd:
> 1. Leave ssh logged-in, run screen to get three sessions
> 2. Leave "top" running in the first - that'll show you process
> activity/general load
> 3. Run "vmstat 10" in the second - that'll show you overall
> memory/swap/disk/cpu usage.
> 4. The third session is then free to work in, if neither of the first
> two show anything useful.

In that particular case I had a few terminals open, one of them actually
running top and others free. They were not responding either, so the
machine was completely bogged down. I do routinely have a few terminals
open, a psql one included. But that kind of lock-down did not happen
anymore, and we changed server in the meantime.

Cheers,
Csaba.

In response to

Re: How to debug a locked backend ? at 2005-11-18 15:29:59 from Richard Huxton

Responses

Re: How to debug a locked backend ? at 2005-11-18 18:10:31 from Andrew Sullivan
Re: How to debug a locked backend ? at 2005-11-18 19:37:05 from Martijn van Oosterhout

Browse pgsql-general by date

	From	Date	Subject
Next Message	A.j. Langereis	2005-11-18 16:15:20	PREPARE in bash scripts
Previous Message	Vivek Khera	2005-11-18 15:53:05	Re: Moving from MySQL to PostgreSQL with Ruby on Rails.