Re: [PATCH] pg_stat_activity: make slow/hanging authentication more visible

From: Jacob Champion <jacob(dot)champion(at)enterprisedb(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Robert Haas <robertmhaas(at)gmail(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Euler Taveira <euler(dot)taveira(at)enterprisedb(dot)com>
Subject: Re: [PATCH] pg_stat_activity: make slow/hanging authentication more visible
Date: 2024-11-07 20:11:46
Message-ID: CAOYmi+neRXD5LKUthKDaGLfAzApoyvL4OJF-VffifdwyqTdxsQ@mail.gmail.com
Lists: pgsql-hackers

On Thu, Nov 7, 2024 at 11:41 AM Andres Freund <andres(at)anarazel(dot)de> wrote:
> I think the patch should not be merged as-is. It's just too ugly and fragile.

Understood; I'm trying to find a way forward, and I'm pointing out
that the alternatives I've tried seem to me to be _more_ fragile.

Are there any items in this list that you disagree with/are less
concerned about?

- the pre-auth step must always initialize the entire pgstat struct
- two-step initialization requires two PGSTAT_BEGIN_WRITE_ACTIVITY()
calls for _every_ backend
- two-step initialization requires us to couple against the order that
authentication information is being filled in (pre-auth, post-auth, or
both)

> I think it might make more sense to use pgstat_report_activity() or such here,
> rather than using wait events to describe something that's not a wait.

I'm not sure why you're saying these aren't waits. If pam_authenticate
is capable of hanging for hours and bringing down a production system,
is that not a "wait"?
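
For illustration only, here is a minimal sketch of the pattern under
discussion: bracket a call that can block with wait-event reporting, so
that an observer sees which step the backend is stuck in. This is not
code from the patch. Every demo_* name and the DEMO_WAIT_EVENT_AUTH_PAM
value are hypothetical stand-ins, assumed here so that the example
compiles outside the PostgreSQL tree; in the backend the corresponding
calls are pgstat_report_wait_start() and pgstat_report_wait_end().

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the backend's per-process wait-event slot.
 * Zero means "not waiting".
 */
static uint32_t current_wait_event = 0;

/* Made-up identifier for illustration; not a real PostgreSQL wait event. */
#define DEMO_WAIT_EVENT_AUTH_PAM 1u

/* Stand-in for pgstat_report_wait_start(): publish what we are waiting on. */
static void
demo_report_wait_start(uint32_t wait_event_info)
{
    assert(wait_event_info != 0);   /* zero is reserved for "not waiting" */
    current_wait_event = wait_event_info;
}

/* Stand-in for pgstat_report_wait_end(): clear the published wait event. */
static void
demo_report_wait_end(void)
{
    current_wait_event = 0;
}

/* Records what an outside observer would have seen during the call. */
static uint32_t observed_during_call = 0;

/*
 * Stand-in for a library call that may block, such as pam_authenticate().
 * It returns immediately here and only records the visible wait event.
 */
static int
demo_blocking_auth(void)
{
    observed_during_call = current_wait_event;
    return 0;
}

/* Bracket the potentially blocking call so that a hang is attributable. */
static int
demo_check_auth(void)
{
    int rc;

    demo_report_wait_start(DEMO_WAIT_EVENT_AUTH_PAM);
    rc = demo_blocking_auth();
    demo_report_wait_end();
    return rc;
}
```

If the bracketed call never returns, the wait event simply stays set,
which is the observability being asked for; nothing in this sketch makes
the call interruptible.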

> > I agree that would be amazing! I'm not about to tackle reliable
> > interrupts for all of the current blocking auth code for v18, though.
> > I'm just trying to make it observable when we do something that
> > blocks.
>
> Well, with that justification we could end up adding wait events for large
> swaths of code that might not actually ever wait.

If it were hypothetically useful to do so, would that be a problem?
I'm trying not to propose things that aren't actually useful.

--Jacob
