From: | Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> |
---|---|
To: | Michael Harris <harmic(at)gmail(dot)com> |
Cc: | pgsql-general(at)lists(dot)postgresql(dot)org |
Subject: | Re: Undetected Deadlock |
Date: | 2022-01-25 23:11:39 |
Message-ID: | 202201252311.4jc4gpdzwwu7@alvherre.pgsql |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
On 2022-Jan-25, Michael Harris wrote:
> We've recently updated our application to PG 14.1, and in the test instance we
> have started to see some alarming undetected deadlocks.
This is indeed suspicious / worrisome / curious.
What version were you using previously?
I reformatted the result sets:
> An example of what we have seen is:
>
> locktype | database | relation | page | tuple | virtualxid | transactionid | classid | objid | objsubid | virtualtransaction | pid | mode | granted | fastpath | waitstart | relation
> ----------+----------+------------+------+-------+------------+---------------+---------+-------+----------+--------------------+---------+-----------------+---------+----------+-------------------------------+--------------------------------
> relation | 529986 | 1842228045 | | | | | | | | 165/1941408 | 2130531 | AccessShareLock | f | f | 2022-01-19 00:32:32.626152+01 | st.ctr_table_efr_oa
> (1 row)
>
> locktype | database | relation | page | tuple | virtualxid | transactionid | classid | objid | objsubid | virtualtransaction | pid | mode | granted | fastpath | waitstart | relation
> ----------+----------+------------+------+-------+------------+---------------+---------+-------+----------+--------------------+---------+---------------------+---------+----------+-----------+--------------------------------
> relation | 529986 | 1842228045 | | | | | | | | 75/2193719 | 2128603 | AccessExclusiveLock | t | f | | st.ctr_table_efr_oa
> (1 row)
>
> locktype | database | relation | page | tuple | virtualxid | transactionid | classid | objid | objsubid | virtualtransaction | pid | mode | granted | fastpath | waitstart | relation
> ----------+----------+------------+------+-------+------------+---------------+---------+-------+----------+--------------------+---------+---------------------+---------+----------+-------------------------------+-----------
> relation | 529986 | 1842231489 | | | | | | | | 75/2193719 | 2128603 | AccessExclusiveLock | f | f | 2022-01-19 00:32:32.924694+01 | st.tpd_oa
> (1 row)
>
> locktype | database | relation | page | tuple | virtualxid | transactionid | classid | objid | objsubid | virtualtransaction | pid | mode | granted | fastpath | waitstart | relation
> ---------------+----------+------------+------+-------+--------------+---------------+---------+-----------+----------+--------------------+---------+-----------------------+---------+----------+-------------------------------+-----------
> relation | 529986 | 1842231489 | | | | | | | | 165/1941408 | 2130531 | AccessShareLock | t | f | | st.tpd_oa
>
> So:
> pid 2130531 waits for an AccessShareLock on relation 1842228045, blocked by pid 2128603 which holds an AccessExclusiveLock
> pid 2128603 waits for an AccessExclusiveLock on relation 1842231489, blocked by pid 2130531 which holds an AccessShareLock
>
> The queries being executed by these backends are:
>
> pid | query_start | state_change | wait_event_type | wait_event | state | query
> ---------+-------------------------------+-------------------------------+-----------------+------------+--------+-------------------------------------------------------------------------
> 2128603 | 2022-01-19 00:32:32.924413+01 | 2022-01-19 00:32:32.924413+01 | Lock | relation | active | DROP TABLE st.tpd_oa_18929
> 2130531 | 2022-01-19 00:32:32.625706+01 | 2022-01-19 00:32:32.625708+01 | Lock | relation | active | DELETE FROM st.ctr_table_efr_oa_19010 WHERE ropid = 44788868
> (2 rows)
I know of no cases in which we fail to detect a deadlock. Perhaps you
have indeed hit a bug.
> Note that there were a lot of other processes also waiting on relation
> 1842231489 - could that be confusing the deadlock detection routine?
It shouldn't.
> I am also confused about the locks which are being taken out by the
> DELETE query.
Maybe the lock is already taken before the DELETE is run; do you have
any triggers, rules, constraints, or anything? If you have seen this
several times already, maybe a way to investigate deeper is an
exhaustive log capture of everything that these transactions do, from
the point they begin until they become blocked (log_statement=all).
Perhaps you need to involve other concurrent transactions in order to
cause the problem.
--
Álvaro Herrera 39°49'30"S 73°17'W — https://www.EnterpriseDB.com/
"Tiene valor aquel que admite que es un cobarde" (Fernandel)
From | Date | Subject | |
---|---|---|---|
Next Message | Jian He | 2022-01-26 06:05:41 | Re: Counting the number of repeated phrases in a column |
Previous Message | Ivan Panchenko | 2022-01-25 21:33:03 | Re: Counting the number of repeated phrases in a column |