From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Postgres stucks in deadlock detection |
Date: | 2018-04-13 18:09:48 |
Message-ID: | 20180413180948.rj5e3bsxhilvdccr@alap3.anarazel.de |
Lists: | pgsql-hackers |
Hi,
On 2018-04-13 19:13:07 +0300, Konstantin Knizhnik wrote:
> On 13.04.2018 18:41, Andres Freund wrote:
> > On 2018-04-13 16:43:09 +0300, Konstantin Knizhnik wrote:
> > > Updated patch is attached.
> > > +    /*
> > > +     * Ensure that only one backend is checking for deadlock.
> > > +     * Otherwise under high load cascade of deadlock timeout expirations can cause stuck of Postgres.
> > > +     */
> > > +    if (!pg_atomic_test_set_flag(&ProcGlobal->activeDeadlockCheck))
> > > +    {
> > > +        enable_timeout_after(DEADLOCK_TIMEOUT, DeadlockTimeout);
> > > +        return;
> > > +    }
> > > +    inside_deadlock_check = true;
> > I can't see that ever being accepted. This means there's absolutely no
> > bound for deadlock checks happening even under light concurrency, even
> > if there's no contention for a large fraction of the time.
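To illustrate the concern, here is a minimal standalone sketch of the gating pattern in the quoted hunk. It uses C11 atomics rather than PostgreSQL's pg_atomic API, and every name in it (active_deadlock_check, try_begin_deadlock_check, end_deadlock_check, expirations_until_check) is hypothetical. A waiter that finds the flag taken just re-arms its timer, and nothing caps how many times that can repeat.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for ProcGlobal->activeDeadlockCheck. */
static atomic_flag active_deadlock_check = ATOMIC_FLAG_INIT;

/*
 * Try to become the single deadlock checker.  Returns true if this
 * caller acquired the flag, false if somebody else already holds it.
 * C11's atomic_flag_test_and_set returns the *previous* value, hence
 * the negation.
 */
static bool
try_begin_deadlock_check(void)
{
    return !atomic_flag_test_and_set(&active_deadlock_check);
}

/* Release the flag once the check is finished. */
static void
end_deadlock_check(void)
{
    atomic_flag_clear(&active_deadlock_check);
}

/*
 * Simulate one waiter whose timeout keeps expiring while another
 * checker holds the flag for the first busy_expirations expirations.
 * Returns how many expirations pass before the waiter gets to run its
 * own check.  The result grows with busy_expirations; the gate itself
 * imposes no upper bound.
 */
static int
expirations_until_check(int busy_expirations)
{
    int expirations = 0;

    /* Another backend is already inside the deadlock check. */
    atomic_flag_test_and_set(&active_deadlock_check);

    for (;;)
    {
        expirations++;

        /* The other checker finishes only after busy_expirations. */
        if (expirations > busy_expirations)
            end_deadlock_check();

        if (try_begin_deadlock_check())
            break;      /* we are the checker now */

        /* Lost the race: the patch would re-arm the timeout and return. */
    }

    end_deadlock_check();
    return expirations;
}
```

In this sketch, expirations_until_check(n) returns n + 1, so the delay before a given waiter checks is dictated entirely by how long other waiters keep winning the flag.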
>
> It may cause problems only if
> 1. There is a large number of active sessions
> 2. They perform deadlock-prone queries (so no attempts to avoid deadlocks at
> the application level)
> 3. The deadlock timeout is set very small (10 msec?)
That's just not true.
> Otherwise, either the probability that all backends again and again
> try to check for deadlocks concurrently is very small (and can be reduced
> even further by using a random timeout for subsequent deadlock checks), or
> the system cannot function normally in any case because a large number of
> clients fall into deadlock.
Operating systems batch wakeups.
> I completely agree that there are plenty of different approaches, but IMHO
> the currently used strategy is the worst one, because it can stall the system
> even if there are no deadlocks at all.
> I have always thought of a deadlock as a programmer's error rather than a
> normal situation. Maybe that is a wrong assumption.
It is.
> So before implementing some complicated solution to the problem (too slow
> deadlock detection), I think it is first necessary to understand
> whether there is such a problem at all and under which workload it can happen.
Sure. I'm not saying that you shouldn't experiment with a patch like the
one you sent. What I am saying is that that can't be the actual solution
that will be integrated.
Greetings,
Andres Freund