Re: [PATCH] pg_stat_activity: make slow/hanging authentication more visible

From: Jacob Champion <jacob(dot)champion(at)enterprisedb(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Euler Taveira <euler(dot)taveira(at)enterprisedb(dot)com>, Daniel Gustafsson <daniel(at)yesql(dot)se>
Subject: Re: [PATCH] pg_stat_activity: make slow/hanging authentication more visible
Date: 2025-03-06 22:25:07
Message-ID: CAOYmi+nbgAifBRHy6_dSFGcwtRk78YTbVTO=tb+1HwqM0gwbCw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Mar 5, 2025 at 8:08 PM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
> Honestly, I don't see a reason not to introduce that, like in the
> attached.

This new code races against the session timeout. I see this on timer expiration:

[14:19:55.224](0.000s) # issuing query 34 via background psql:
SELECT state FROM pg_stat_activity WHERE pid = ;
[14:19:55.228](0.004s) # pump_until: process terminated
unexpectedly when searching for "(?^:(^|\n)background_psql:
QUERY_SEPARATOR 34:\r?\n)" with stream: ""
process ended prematurely at
/home/jacob/src/postgres/src/test/perl/PostgreSQL/Test/Utils.pm line
439.

Which makes it seem like some sort of crash, IMO. I don't find that as
easily debuggable as the previous log message, which was

[14:21:33.104](0.001s) # issuing query 32 via background psql:
SELECT pid FROM pg_stat_activity
# WHERE state = 'starting' and wait_event = 'init-pre-auth';
IPC::Run: timeout on timer #1 at
/home/jacob/perl5/lib/perl5/IPC/Run.pm line 3007.

> + WHERE state = 'starting' and wait_event = 'init-pre-auth';});

Did you have thoughts on expanding the check to backend_type [1]?

> + # Give up. The output of the last attempt is logged by query(),
> + # so no need to do anything here.
> + return 0;

One of my primary complaints about the poll_query_until()
implementation is that "giving up" in this case means continuing to
run pieces of the test that have no business running after a timeout,
and increasing the log noise after a failure. I'm not sure how loudly
to complain in this particular case, since I know we use it
elsewhere...

Thanks!
--Jacob

[1] https://postgr.es/m/CAOYmi%2BnxNCQcTQE-tQ7Lwpe4cYc1u-yxwEe5kt2AVN%2BDXXVVbQ%40mail.gmail.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Nikhil Kumar Veldanda 2025-03-06 22:36:28 Re: ZStandard (with dictionaries) compression support for TOAST compression
Previous Message Julien Rouhaud 2025-03-06 22:12:52 Re: what's going on with lapwing?