Re: [PATCH] pg_stat_activity: make slow/hanging authentication more visible

From: Jacob Champion <jacob(dot)champion(at)enterprisedb(dot)com>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Euler Taveira <euler(dot)taveira(at)enterprisedb(dot)com>
Subject: Re: [PATCH] pg_stat_activity: make slow/hanging authentication more visible
Date: 2024-08-29 20:44:01
Message-ID: CAOYmi+=1Y4HKATADR=WYOMhxCkRQpoxH2rY0t1An6A5su1Cw-w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Jun 30, 2024 at 10:48 AM Noah Misch <noah(at)leadboat(dot)com> wrote:v
> > Would anyone like me to be more aggressive, and create a pgstat entry
> > as soon as we have the opening transaction? Or... as soon as a
> > connection is made?
>
> All else being equal, I'd like backends to have one before taking any lmgr
> lock or snapshot.

v3-0003 pushes the pgstat creation as far back as I felt comfortable,
right after the PGPROC registration by InitProcessPhase2(). That
function does lock the ProcArray, but if it gets held forever due to
some bug, you won't be able to use pg_stat_activity to debug it
anyway. And with this ordering, pg_stat_get_activity() will be able to
retrieve the proc entry by PID without a race.

This approach ends up registering an early entry for more cases than
the original patchset. For example, autovacuum and other background
workers will now briefly get their own "authenticating" state, which
seems like it could potentially confuse people. Should I rename the
state, or am I overthinking it?

> You could release the xmin before calling PAM or LDAP. If you've copied all
> relevant catalog content to local memory, that's fine to do.

I played with the xmin problem a little bit, but I've shelved it for
now. There's probably a way to do that safely; I just don't understand
enough about the invariants to do it. For example, there's a comment
later on that says

* We established a catalog snapshot while reading pg_authid and/or
* pg_database;

and I'm a little nervous about invalidating the snapshot halfway
through that process. Even if PAM and LDAP don't rely on pg_authid or
other shared catalogs today, shouldn't they be allowed to in the
future, without being coupled to InitPostgres implementation order?
And I don't think we can move the pg_database checks before
authentication.

As for the other patches, I'll ping Andrew about 0001, and 0004
remains in its original WIP state. Anyone excited about that wait
event idea?

Thanks!
--Jacob

Attachment Content-Type Size
v3-0002-Test-Cluster-let-background_psql-work-asynchronou.patch application/octet-stream 3.1 KB
v3-0001-BackgroundPsql-handle-empty-query-results.patch application/octet-stream 2.6 KB
v3-0003-pgstat-report-in-earlier-with-STATE_AUTHENTICATIN.patch application/octet-stream 9.2 KB
v3-0004-WIP-report-external-auth-calls-as-wait-events.patch application/octet-stream 7.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Masahiko Sawada 2024-08-29 20:48:53 Re: pg_trgm comparison bug on cross-architecture replication due to different char implementation
Previous Message Guillaume Lelarge 2024-08-29 20:08:23 Add parallel columns for pg_stat_statements