Lock pileup stuck processes

From: Josh Berkus <josh(at)agliodbs(dot)com>
To: pgsql-bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Lock pileup stuck processes
Date: 2016-04-13 22:59:44
Message-ID: 570ECF60.5040200@agliodbs.com
Lists: pgsql-bugs

Folks,

This is a "hard to reproduce" bug, so I am submitting it to this list to
accumulate evidence for eventual debugging, once there are enough
reports to figure something out. Since I've now seen this on two
different user applications, I think it reflects some kind of
persistent issue in either Postgres or the OS.

Summary: in some cases, "lock pileups" fail to resolve completely, and
one or more orphan backends are left in permanent lock-waiting state.

Versions observed: 9.2.14, 9.2.15, 9.3.5

Platforms: RHEL6, Fedora

Observations:

1. A long-running transaction grabs one or more row locks.

2. Various queries, especially SELECT FOR UPDATE queries, pile up behind
these row locks.

3. At peak, 30 or more backends are waiting for locks in a dependency
chain. System load is high.

4. Original transaction ends.

5. Over the following 10 minutes, most of the waiting backends complete
their work and release their locks.

6. One to three backends never come out of active/waiting state,
remaining that way indefinitely.
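
For anyone gathering evidence on a similar incident, one way to tell an
ordinary pileup from the stuck state in step 6 is to check whether each
waiting backend still has an identifiable lock holder. The following is
a minimal Python sketch and not part of the original report. It assumes
a snapshot of pg_locks has already been fetched as a list of
dictionaries keyed by pg_locks column names (how the rows are fetched is
left out). The names classify_waiters, lock_target, and SAMPLE_ROWS are
hypothetical, and the sample rows are made up for illustration. The
sketch ignores lock modes, so it treats any granted lock on the same
object as a possible blocker.

```python
from collections import defaultdict

# pg_locks columns that together identify the locked object.
TARGET_COLUMNS = (
    "locktype", "database", "relation", "page", "tuple",
    "virtualxid", "transactionid", "classid", "objid", "objsubid",
)


def lock_target(row):
    """Return a hashable key for the object a pg_locks row refers to."""
    return tuple(row.get(col) for col in TARGET_COLUMNS)


def classify_waiters(rows):
    """Split waiting backends into (blocked, orphaned).

    blocked maps a waiting pid to the sorted pids holding a granted lock
    on the same object. orphaned lists waiting pids for which no other
    backend holds a granted lock on that object.
    Assumption: lock modes are ignored, so this is only a rough filter.
    """
    holders = defaultdict(set)
    for row in rows:
        if row["granted"]:
            holders[lock_target(row)].add(row["pid"])

    blocked = {}
    orphaned = []
    for row in rows:
        if row["granted"]:
            continue
        others = holders.get(lock_target(row), set()) - {row["pid"]}
        if others:
            blocked[row["pid"]] = sorted(others)
        else:
            orphaned.append(row["pid"])
    return blocked, sorted(orphaned)


# Made-up snapshot: pid 101 waits on a transaction that pid 100 still
# holds, while pid 102 waits on a transaction nobody appears to hold.
SAMPLE_ROWS = [
    {"pid": 100, "locktype": "transactionid", "transactionid": 5001, "granted": True},
    {"pid": 101, "locktype": "transactionid", "transactionid": 5001, "granted": False},
    {"pid": 102, "locktype": "transactionid", "transactionid": 4990, "granted": False},
]

if __name__ == "__main__":
    blocked, orphaned = classify_waiters(SAMPLE_ROWS)
    print("blocked:", blocked)
    print("orphaned:", orphaned)
```

A backend that shows up as orphaned in repeated snapshots, long after
the original transaction has ended, would be a candidate for the stuck
state described above. Whether the real stuck backends look like this in
pg_locks is not something the observations so far establish.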

My attempts to reproduce this issue under synthetic circumstances have
not been successful. strace of the stuck backends shows no activity.

--
Josh Berkus
Red Hat OSAS
(any opinions are my own)
