Re: How to debug a connection that's "active" but hanging?

From: Vijaykumar Jain <vijaykumarjain(dot)github(at)gmail(dot)com>
To: Jurrie Overgoor <postgresql-mailinglist(at)jurr(dot)org>
Cc: pgsql-general <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: How to debug a connection that's "active" but hanging?
Date: 2021-07-10 08:26:43
Message-ID: CAM+6J97mNj4gFnC=wYpsy5L+igEE8gwifVhaoXm-BE8BYX2PPQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Sat, 10 Jul 2021 at 00:29, Jurrie Overgoor <
postgresql-mailinglist(at)jurr(dot)org> wrote:

> Hi everyone,
>
> We are in the process of upgrading from PostgreSQL 9.6 to 13. When our
> database gets created in our regression tests, we run some unit tests
> first. We see one of those tests hang.
>
> It seems the client is waiting on more data to arrive from the
> PostgreSQL server. A thread dump shows it waiting on the socket.
>
> On the server, I see the connection from the client. Looking at
> pg_stat_activity, I see it is in state 'active'. I have seen query_start
> be as old as eight hours ago. The documentation states that 'active'
> means a query is being executed. The query in question is:
>
> Looking at pg_locks, I only see locks with granted = true for the PID.
>

Is this reproducible, I mean this happens multiple times?
can you please run the below query in a separate session and share the
result, feel free to anonymize sensitive stuff.

SELECT db.datname, locktype, relation::regclass, mode, transactionid
AS tid, virtualtransaction AS vtid, pid, granted
FROM pg_catalog.pg_locks l
LEFT JOIN pg_catalog.pg_database db ON db.oid = l.database
WHERE NOT pid = pg_backend_pid();

> There is nothing in the logs as far as I can see. Configuration
> variables log_min_messages and log_min_error_statement are on 'debug1'.
> This is a snippet of the logs:
>
> 2021-07-09 20:35:16.374 CEST [30399] STATEMENT: START TRANSACTION
> 2021-07-09 20:35:18.703 CEST [30399] WARNING: there is already a
> transaction in progress
>
Are you having multiple BEGIN tx not matching commit/rollbacks ? not sure
related, but fixing this will help reduce noise.

>
> I am a bit out of ideas - does anyone have any tips where I should look
> to see what is causing the query to hang?
>

Although I am not an expert at this, I am exploring myself.
Can you collect backtrace for this pid?

Getting a stack trace of a running PostgreSQL backend on Linux/BSD -
PostgreSQL wiki
<https://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreSQL_backend_on_Linux/BSD#Linux_and_BSD>

You can try enabling core dumps, then run a core dump using `gcore` on the
hanging pid (if you do not want to attach to a live process)
and generate a backtrace on that pid using `gdb` and share the trace.
I think that might help.

--
Thanks,
Vijay
Mumbai, India

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Phil Endecott 2021-07-10 15:43:01 Re: Why can't I drop a tablespace?
Previous Message rob stone 2021-07-10 08:07:43 Re: optimization issue