[PATCH] LWLock self-deadlock detection

From: Craig Ringer <craig(dot)ringer(at)enterprisedb(dot)com>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: [PATCH] LWLock self-deadlock detection
Date: 2020-11-19 10:31:36
Message-ID: CAGRY4nyyYarrwfc72gp5uDyA-wR+Tf6f9YnPhqDv_rw7R5oEYA@mail.gmail.com
Lists: pgsql-hackers

Hi all

Here's a patch I wrote a while ago to detect and report when a
LWLockAcquire() call would self-deadlock because the caller already
holds the LWLock.

To avoid affecting hot-path performance, the check fires only once the
fast path has been missed: on the first iteration through the retry loops
in LWLockAcquire() and LWLockWaitForVar(), just before we sleep.
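
To illustrate the idea, here is a minimal standalone sketch. It is not the
patch itself and does not use PostgreSQL's real LWLock code; ToyLock,
toy_acquire(), toy_release() and TOY_MAX_RETRIES are hypothetical names made
up for this example. It assumes a single process that tracks the locks it
holds in a small array, and it runs the "do I already hold this?" test only
on the first pass through the retry loop, after the fast path has failed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define TOY_MAX_HELD    8   /* hypothetical cap on simultaneously held locks */
#define TOY_MAX_RETRIES 3   /* hypothetical: the toy gives up instead of sleeping forever */

typedef struct ToyLock
{
    const char *name;
    bool        locked;     /* true while some holder owns the lock */
} ToyLock;

typedef enum ToyResult
{
    TOY_ACQUIRED,
    TOY_SELF_DEADLOCK,
    TOY_GAVE_UP
} ToyResult;

/* Locks currently held by this (single, simulated) process. */
static ToyLock *toy_held[TOY_MAX_HELD];
static int      toy_num_held = 0;

/* Return true if this process already holds 'lock'. */
static bool
toy_held_by_me(const ToyLock *lock)
{
    for (int i = 0; i < toy_num_held; i++)
        if (toy_held[i] == lock)
            return true;
    return false;
}

static ToyResult
toy_acquire(ToyLock *lock)
{
    for (int attempt = 0; attempt < TOY_MAX_RETRIES; attempt++)
    {
        /* Fast path: the lock is free, so take it with no extra checks. */
        if (!lock->locked)
        {
            assert(toy_num_held < TOY_MAX_HELD);
            lock->locked = true;
            toy_held[toy_num_held++] = lock;
            return TOY_ACQUIRED;
        }

        /* Slow path: check once, just before we would go to sleep. */
        if (attempt == 0 && toy_held_by_me(lock))
        {
            fprintf(stderr, "self-deadlock detected on lock \"%s\"\n",
                    lock->name);
            return TOY_SELF_DEADLOCK;
        }

        /* A real lock would queue itself and sleep here until woken. */
    }
    return TOY_GAVE_UP;
}

/* Release 'lock' if this process holds it; otherwise do nothing. */
static void
toy_release(ToyLock *lock)
{
    for (int i = 0; i < toy_num_held; i++)
    {
        if (toy_held[i] == lock)
        {
            toy_held[i] = toy_held[--toy_num_held];
            lock->locked = false;
            return;
        }
    }
}
```

The successful acquire never calls toy_held_by_me(), so the scan of the
held-locks array costs nothing unless the caller was about to block anyway.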

I wrote an earlier version of this when I was chasing down some hairy
issues with background workers deadlocking on some exit paths: an
ereport(ERROR) or elog(ERROR) call raised while an LWLock was held would
cause a before_shmem_exit or on_shmem_exit cleanup function to deadlock
when it tried to acquire the same lock.

But it's an easy enough mistake to make and a seriously annoying one to
track down, so I figured I'd post it for consideration. Maybe someone else
will get some use out of it even if nobody likes the idea of merging it.

As written, the check runs only in --enable-cassert builds or when
LOCK_DEBUG is defined.
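
A rough sketch of that kind of compile-time gating, for illustration only.
USE_ASSERT_CHECKING is the macro that --enable-cassert defines, and
LOCK_DEBUG is named above, but whether the patch uses exactly this guard is
an assumption. TOY_SELF_DEADLOCK_CHECK and toy_should_report() are
hypothetical names invented for this example.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed guard: enable the check in assert-enabled or LOCK_DEBUG builds. */
#if defined(USE_ASSERT_CHECKING) || defined(LOCK_DEBUG)
#define TOY_SELF_DEADLOCK_CHECK 1
#else
#define TOY_SELF_DEADLOCK_CHECK 0
#endif

/*
 * Decide whether a self-deadlock should be reported.
 * 'iteration' is the zero-based pass through the retry loop.
 */
static bool
toy_should_report(bool already_held, int iteration)
{
    assert(iteration >= 0);
#if TOY_SELF_DEADLOCK_CHECK
    /* Debug builds: report only on the first pass through the loop. */
    return already_held && iteration == 0;
#else
    /* Production builds: the check compiles away entirely. */
    (void) already_held;
    (void) iteration;
    return false;
#endif
}
```

In a build with neither macro defined the function always returns false, so
the production code path carries no extra work.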

Attachment: 0001-LWLock-self-deadlock-detection.patch (text/x-patch, 5.1 KB)
