Re: FW: query pg_stat_ssl hang 100%cpu

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: "James Pang (chaolpan)" <chaolpan(at)cisco(dot)com>
Cc: PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: FW: query pg_stat_ssl hang 100%cpu
Date: 2023-09-07 04:04:32
Message-ID: ZPlL0MVFmzKPPgSQ@paquier.xyz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs pgsql-performance

On Thu, Sep 07, 2023 at 01:35:00AM +0000, James Pang (chaolpan) wrote:
> PGv14.8, OS RHEL8, no SSL enabled in this database, we have a
> lot of client sessions who check it's ssl state by query, all
> other sessions got done very quickly, but only 1 session hang
> there in 100% cpu tens of hours, even pg_terminate_backend does
> not make it stopped either. It looks like abnormal.
>
> select ssl from pg_stat_ssl where pid=pg_backend_pid();

This is hard to act on without more details or even a reproducible and
self-contained test case. Even a java script based on the JDBC driver
would be OK for me, for example, if it helps digging into what you are
seeing.

> #0 ensure_record_cache_typmod_slot_exists (typmod=0) at typcache.c:1714
> #1 0x000000000091185b in assign_record_type_typmod (tupDesc=<optimized out>, tupDesc(at)entry=0x27bc738) at typcache.c:2001
> #2 0x000000000091df03 in internal_get_result_type (funcid=<optimized out>, call_expr=<optimized out>, rsinfo=<optimized out>,
> resultTypeId=<optimized out>, resultTupleDesc=0x7ffc9dff8cd0) at funcapi.c:393
> #3 0x000000000091e263 in get_expr_result_type (expr=expr(at)entry=0x2792798, resultTypeId=resultTypeId(at)entry=0x7ffc9dff8ccc,
> resultTupleDesc=resultTupleDesc(at)entry=0x7ffc9dff8cd0) at funcapi.c:230
> #4 0x00000000006a2fa5 in ExecInitFunctionScan (node=node(at)entry=0x273afa8, estate=estate(at)entry=0x269e948, eflags=eflags(at)entry=16) at nodeFunctionscan.c:370
> #5 0x000000000069084e in ExecInitNode (node=node(at)entry=0x273afa8, estate=estate(at)entry=0x269e948, eflags=eflags(at)entry=16) at execProcnode.c:255
> #6 0x000000000068a96d in InitPlan (eflags=16, queryDesc=0x273b2d8) at execMain.c:936
> #7 standard_ExecutorStart (queryDesc=0x273b2d8, eflags=16) at execMain.c:263
> #8 0x00007f67c2821d5d in pgss_ExecutorStart (queryDesc=0x273b2d8, eflags=<optimized out>) at pg_stat_statements.c:965
> #9 0x00000000007fc226 in PortalStart (portal=portal(at)entry=0x26848b8, params=params(at)entry=0x0, eflags=eflags(at)entry=0, snapshot=snapshot(at)entry=0x0)
> at pquery.c:514
> #10 0x00000000007fa27f in exec_bind_message (input_message=0x7ffc9dff90d0) at postgres.c:1995
> #11 PostgresMain (argc=argc(at)entry=1, argv=argv(at)entry=0x7ffc9dff9370, dbname=<optimized out>, username=<optimized out>) at postgres.c:4552
> #12 0x000000000077a4ea in BackendRun (port=<optimized out>, port=<optimized out>) at postmaster.c:4537
> #13 BackendStartup (port=<optimized out>) at postmaster.c:4259
> #14 ServerLoop () at postmaster.c:1745
> #15 0x000000000077b363 in PostmasterMain (argc=argc(at)entry=5, argv=argv(at)entry=0x256abc0) at postmaster.c:1417
> #16 0x00000000004fec63 in main (argc=5, argv=0x256abc0) at main.c:209

This stack is referring to a code path where we are checking that some
of the type-related data associated to a record is around, but this
does not say exactly where the loop happens, so... Are we looking on
a loop in the function execution itself from which the information of
pg_stat_ssl is retrieved (aka pg_stat_get_activity())? Or is the
type cache somewhat broken because of the extended query protocol?
That's not really possible to see any evidence based on the
information provided, though it provides a few hits that can help.
FWIW, I've not heard about an issue like that in the field.

The first thing I would do is update to 14.9, which is the latest
version of Postgres available for this major version.
--
Michael

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Kyotaro Horiguchi 2023-09-07 05:12:32 Re: BUG #18046: stats collection behaviour change is affecting the usability of information.
Previous Message James Pang (chaolpan) 2023-09-07 01:35:00 FW: query pg_stat_ssl hang 100%cpu

Browse pgsql-performance by date

  From Date Subject
Next Message James Pang (chaolpan) 2023-09-07 08:46:29 RE: FW: query pg_stat_ssl hang 100%cpu
Previous Message James Pang (chaolpan) 2023-09-07 01:35:00 FW: query pg_stat_ssl hang 100%cpu