Re: [PATCH] pg_stat_activity: make slow/hanging authentication more visible

From: Noah Misch <noah(at)leadboat(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Jacob Champion <jacob(dot)champion(at)enterprisedb(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Euler Taveira <euler(dot)taveira(at)enterprisedb(dot)com>
Subject: Re: [PATCH] pg_stat_activity: make slow/hanging authentication more visible
Date: 2024-09-10 20:58:50
Message-ID: 20240910205850.5d.nmisch@google.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Sep 10, 2024 at 02:51:23PM -0400, Robert Haas wrote:
> On Tue, Sep 10, 2024 at 1:27 PM Noah Misch <noah(at)leadboat(dot)com> wrote:
> > On Tue, Sep 10, 2024 at 02:29:57PM +0900, Michael Paquier wrote:
> > > You are adding twelve event points with only 5
> > > new wait names. Couldn't it be better to have a one-one mapping
> > > instead, adding twelve entries in wait_event_names.txt?
> >
> > No, I think the patch's level of detail is better. One shouldn't expect the
> > two ldap_simple_bind_s() calls to have different-enough performance
> > characteristics to justify exposing that level of detail to the DBA.
> > ldap_search_s() and InitializeLDAPConnection() differ more, but the DBA mostly
> > just needs to know the scale of their LDAP responsiveness problem.
> >
> > (Someday, it might be good to expose the file:line and/or backtrace associated
> > with a wait, like we do for ereport(). As a way to satisfy rare needs for
> > more detail, I'd prefer that over giving every pgstat_report_wait_start() a
> > different name.)
>
> I think unique names are a good idea. If a user doesn't care about the
> difference between sdgjsA and sdjgsB, they can easily ignore the
> trailing suffix, and IME, people typically do that without really
> stopping to think about it. If on the other hand the two are lumped
> together as sdjgs and a user needs to distinguish them, they can't. So
> I see unique names as having much more upside than downside.

I agree a person can ignore the distinction, but that requires the person to
be consuming the raw event list. It's reasonable to tell your monitoring tool
to give you the top N wait events. Individual AuthnLdap* events may all miss
the cut even though their aggregate would have made the cut. Before you know
to teach that monitoring tool to group AuthnLdap* together, it won't show you
any of those names.

I felt commit c789f0f also chose sub-optimally in this respect, particularly
with the DblinkGetConnect/DblinkConnect pair. I didn't feel strongly enough
to complain at the time, but a rule of "each wait event appears in one
pgstat_report_wait_start()" would be a rule I don't want. One needs
familiarity with the dblink implementation internals to grasp the
DblinkGetConnect/DblinkConnect distinction, and a plausible refactor of dblink
would make those names cease to fit. I see this level of fine-grained naming
as making the event name a sort of stable proxy for FILE:LINE. I'd value
exposing such a proxy, all else being equal, but I don't think wait event
names like AuthLdapBindLdapbinddn/AuthLdapBindUser are the right way. Wait
event names should be more independent of today's code-level details.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message John H 2024-09-10 21:10:32 Re: Allow logical failover slots to wait on synchronous replication
Previous Message Andrew Dunstan 2024-09-10 20:51:28 Re: Jargon and acronyms on this mailing list