From: | Craig Ringer <craig(dot)ringer(at)enterprisedb(dot)com> |
---|---|
To: | Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: [PATCH] LWLock self-deadlock detection |
Date: | 2020-11-25 06:16:53 |
Message-ID: | CAGRY4nyb6iOhPMphZq=9sTUFep35f2JqF=G35mUoJ_3zdoPx1g@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Tue, Nov 24, 2020 at 10:11 PM Ashutosh Bapat <
ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> wrote:
> This looks useful. LWLockCheckSelfDeadlock() could use LWLockHeldByMe
> variant instead of copying that code with possibly a change in that
> function to return the required information.
>
Yes, possibly so. I was reluctant to mess with the rest of the code too
much.
I am also seeing a pattern
> Assert(!LWLockHeldByMe());
> LWLockAcquire()
>
> at some places. Should we change LWLockAcquire to do
> Assert(!LWLockHeldByMe()) always to detect such occurrences?
I'm inclined not to, at least not without benchmarking it, because that'd
do the check before we attempt the fast-path. cassert builds are still
supposed to perform decently and be suitable for day to day development and
I'd rather not risk a slowdown.
I'd prefer to make the lock self deadlock check run for production builds,
not just cassert builds. It'd print a normal LOG (with backtrace if
supported) then Assert(). So on an assert build we'd get a crash and core,
and on a non-assert build we'd carry on and self-deadlock anyway.
That's probably the safest thing to do. We can't expect to safely ERROR out
of the middle of an LWLockAcquire(), not without introducing a new and
really hard to test code path that'll also be surprising to callers. We
probably don't want to PANIC the whole server for non-assert builds since
it might enter a panic-loop. So it's probably better to self-deadlock. We
could HINT that a -m immediate shutdown will be necessary to recover though.
I don't think it makes sense to introduce much complexity for a feature
that's mainly there to help developers and to catch corner-case bugs.
(FWIW a variant of this patch has been in 2ndQPostgres for some time. It
helped catch an extension bug just the other day.)
From | Date | Subject | |
---|---|---|---|
Next Message | Pavel Stehule | 2020-11-25 06:19:56 | Re: [Proposal] Global temporary tables |
Previous Message | Amit Kapila | 2020-11-25 05:45:38 | Re: Remove cache_plan argument comments to ri_PlanCheck |