Re: RFC: replace pg_stat_activity.waiting with something more descriptive

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Ildus Kurbangaliev <i(dot)kurbangaliev(at)postgrespro(dot)ru>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, "andres(at)anarazel(dot)de" <andres(at)anarazel(dot)de>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Subject: Re: RFC: replace pg_stat_activity.waiting with something more descriptive
Date: 2015-09-14 12:03:55
Message-ID: CA+TgmoZ-8ZpoUM9BGtBUP1u4dUQhC-9EpEDLzyK0dG4pKMDUwQ@mail.gmail.com
Lists: pgsql-hackers

On Mon, Sep 14, 2015 at 5:32 AM, Alexander Korotkov
<aekorotkov(at)gmail(dot)com> wrote:
> In order to build the consensus we need the roadmap for waits monitoring.
> Would single byte in PgBackendStatus be the only way for tracking wait
> events? Could we have pluggable infrastructure in waits monitoring: for
> instance, hooks for wait event begin and end?

No, it's not the only way of doing it. I proposed doing it that way
because it's simple and cheap, but I'm not hell-bent on it. My basic
concern here is about the cost of this. I think that the most data we
can report without some kind of synchronization protocol is one 4-byte
integer. If we want to report anything more than that, we're going to
need something like the st_changecount protocol, or a lock, and that's
going to add very significantly - and in my view unacceptably - to the
cost. I care very much about having this facility be something that
we can use in lots of places, even extremely frequent operations like
buffer reads and contended lwlock acquisition.
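
To make the cost difference concrete, here is a rough sketch in C11
atomics. It is not PostgreSQL code: WaitSlot, report_wait_event(),
report_detail() and DETAIL_WORDS are hypothetical names invented for
illustration, and the changecount-style path only approximates the
st_changecount idea (single writer, reader retries). The point is just
that the one-word path is a single store, while anything wider needs
two counter updates plus ordering around the payload.

```c
#include <stdatomic.h>
#include <stdint.h>

#define DETAIL_WORDS 5          /* hypothetical size, roughly a block tag */

/* Hypothetical per-backend slot; NOT PostgreSQL's actual PGPROC or
 * PgBackendStatus layout. */
typedef struct WaitSlot
{
    _Atomic uint32_t wait_event;            /* cheap path: one 4-byte word */
    _Atomic uint32_t changecount;           /* odd while an update is in flight */
    _Atomic uint32_t detail[DETAIL_WORDS];  /* multi-word payload */
} WaitSlot;

static void
wait_slot_init(WaitSlot *s)
{
    atomic_init(&s->wait_event, 0);
    atomic_init(&s->changecount, 0);
    for (int i = 0; i < DETAIL_WORDS; i++)
        atomic_init(&s->detail[i], 0);
}

/* Cheap path: a single aligned store, so a reader never sees a torn value. */
static void
report_wait_event(WaitSlot *s, uint32_t event)
{
    atomic_store_explicit(&s->wait_event, event, memory_order_relaxed);
}

static uint32_t
read_wait_event(WaitSlot *s)
{
    return atomic_load_explicit(&s->wait_event, memory_order_relaxed);
}

/* Expensive path (single writer assumed): bump the counter to odd, write
 * the payload, then bump it to even again, with ordering on both sides. */
static void
report_detail(WaitSlot *s, const uint32_t *detail)
{
    uint32_t c = atomic_load_explicit(&s->changecount, memory_order_relaxed);

    atomic_store_explicit(&s->changecount, c + 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);
    for (int i = 0; i < DETAIL_WORDS; i++)
        atomic_store_explicit(&s->detail[i], detail[i], memory_order_relaxed);
    atomic_store_explicit(&s->changecount, c + 2, memory_order_release);
}

/* Reader retries until it sees the same even counter before and after. */
static void
read_detail(WaitSlot *s, uint32_t *out)
{
    for (;;)
    {
        uint32_t before = atomic_load_explicit(&s->changecount,
                                               memory_order_acquire);

        for (int i = 0; i < DETAIL_WORDS; i++)
            out[i] = atomic_load_explicit(&s->detail[i], memory_order_relaxed);
        atomic_thread_fence(memory_order_acquire);
        if (before == atomic_load_explicit(&s->changecount,
                                           memory_order_relaxed) &&
            (before & 1) == 0)
            return;
    }
}
```

Every call to report_detail() pays for the extra stores and fences
whether or not anyone is reading, which is the overhead I'm worried
about on hot paths.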

I think that there may be some *kinds of waits* for which it's
practical to report additional detail. For example, suppose that when
a heavyweight lock wait first happens, we just report the lock type
(relation, tuple, etc.) but then when the deadlock detector expires,
if we're still waiting, we report the entire lock tag. Well, that's
going to happen infrequently enough, and is expensive enough anyway,
that the cost doesn't matter. But if, every time we read a disk
block, we take a lock (or bump a changecount and do a write barrier),
dump the whole block tag in there, release the lock (or do another
write barrier and bump the changecount again), that sounds kind of
expensive to me. Maybe we can prove that it doesn't matter on any
workload, but I doubt it. We're fighting for every cycle in some of
these code paths, and there's good evidence that we're burning too
many of them compared to competing products already.
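
The heavyweight-lock idea above could look roughly like the following.
This is only a sketch under assumptions: LockTagSketch, LockWaitReport
and the lock_wait_*() functions are hypothetical, the tag layout is not
PostgreSQL's LOCKTAG, and cross-process synchronization is left out
entirely so that the two-stage policy is easy to see. The cheap word is
set when the wait starts; the full tag is copied only once the wait has
outlasted the deadlock timeout.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical lock kinds; 0 is reserved for "not waiting". */
typedef enum { LOCK_KIND_RELATION = 1, LOCK_KIND_TUPLE = 2 } LockKindSketch;

/* Hypothetical lock tag, NOT PostgreSQL's LOCKTAG. */
typedef struct LockTagSketch
{
    uint32_t db;
    uint32_t rel;
    uint32_t block;
    uint16_t offset;
    uint8_t  kind;              /* a LockKindSketch value */
} LockTagSketch;

typedef struct LockWaitReport
{
    uint32_t      wait_event;       /* cheap word, set on every wait */
    bool          detail_valid;     /* true once the full tag was copied */
    LockTagSketch detail;
    unsigned      detail_publishes; /* how often we paid the expensive path */
} LockWaitReport;

/* Wait starts: report only the lock type. */
static void
lock_wait_begin(LockWaitReport *r, const LockTagSketch *tag)
{
    r->wait_event = tag->kind;
    r->detail_valid = false;
}

/* Called when the deadlock timer fires (or on any later check): publish
 * the whole tag, but at most once per wait. */
static void
lock_wait_poll(LockWaitReport *r, const LockTagSketch *tag,
               uint32_t waited_ms, uint32_t deadlock_timeout_ms)
{
    if (!r->detail_valid && waited_ms >= deadlock_timeout_ms)
    {
        r->detail = *tag;
        r->detail_valid = true;
        r->detail_publishes++;
    }
}

/* Wait ends: clear both the cheap word and the detail flag. */
static void
lock_wait_end(LockWaitReport *r)
{
    r->wait_event = 0;
    r->detail_valid = false;
}
```

A short lock wait never touches the detail at all, so the common case
stays at one word written on entry and one on exit.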

I am not a big fan of hooks as a way of resolving disagreements about
the design. We may find that there are places where it's useful to
have hooks so that different extensions can do different things, and
that is fine. But we shouldn't use that as a way of punting the
difficult questions. There isn't enough common understanding here of
what we're all trying to get done and why we're trying to do it in
particular ways rather than in other ways to jump to the conclusion
that a hook is the right answer. I'd prefer to have a nice, built-in
system that everyone agrees represents a good set of trade-offs than
an extensible system.

I think it's reasonable to consider reporting this data in the PGPROC
using a 4-byte integer rather than reporting it through a single byte
in the backend status structure. I believe that addresses the
concerns about reporting from auxiliary processes, and it also allows
a little more data to be reported. For anything in excess of that, I
think we should think rather harder. Most likely, such additional
detail should be reported only for certain types of wait events, or on
a delay, or something like that, so that the core mechanism remains
really, really fast.
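
As one illustration of how much a 4-byte integer can carry, the word
could be split into a wait class and an event number. The particular
split below (8 bits of class, 24 bits of event) and the names
wait_info_pack(), wait_info_class() and wait_info_event() are
assumptions made up for this sketch, not a proposed or existing layout.

```c
#include <stdint.h>

#define WAIT_CLASS_SHIFT 24            /* assumed: class in the top byte */
#define WAIT_EVENT_MASK  0x00FFFFFFu   /* assumed: event in the low 24 bits */
#define WAIT_NONE        0u            /* all zeroes means "not waiting" */

/* Combine a class and an event number into one 4-byte word. */
static inline uint32_t
wait_info_pack(uint8_t wait_class, uint32_t event_id)
{
    return ((uint32_t) wait_class << WAIT_CLASS_SHIFT) |
           (event_id & WAIT_EVENT_MASK);
}

/* Extract the class (top byte). */
static inline uint8_t
wait_info_class(uint32_t info)
{
    return (uint8_t) (info >> WAIT_CLASS_SHIFT);
}

/* Extract the event number (low 24 bits). */
static inline uint32_t
wait_info_event(uint32_t info)
{
    return info & WAIT_EVENT_MASK;
}
```

Because the whole value still fits in one aligned word, it can be
written and read without any additional protocol, which is the property
I want the core mechanism to keep.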

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
