PG9.1.2 -- 3 day old orphaned/non-responsive query -- pg_(cancel)(terminate)_backend has no effect

From: Reid Thompson <Reid(dot)Thompson(at)ateb(dot)com>
To: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Cc: Reid Thompson <Reid(dot)Thompson(at)ateb(dot)com>
Subject: PG9.1.2 -- 3 day old orphaned/non-responsive query -- pg_(cancel)(terminate)_backend has no effect
Date: 2012-01-13 17:02:12
Message-ID: 1326474130.17464.92.camel@raker2.ateb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Looking for some quidance or suggestions.

PostgreSQL 9.1.2 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.4.5 20110214 (Red Hat 4.4.5-6), 64-bit
$ uname -a
Linux db1.hw.ateb.com 2.6.32-131.21.1.el6.x86_64 #1 SMP Tue Nov 22 19:48:09 GMT 2011 x86_64 x86_64 x86_64 GNU/Linux

datid | datname | procpid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | xact_start | query_start | waiting |
current_query
------------+-----------+---------+------------+----------+------------------+--------------+-----------------+-------------+-------------------------------+-------------------------------+-------------------------------+---------+--
----------------------------------------------------------------------------
2034098332 | reporting | 30479 | 2033174751 | jbaucom | | 172.16.48.79 | | 1860 | 2012-01-10 10:55:03.784225-05 | 2012-01-10 10:56:28.726281-05 | 2012-01-10 10:56:28.738283-05 | f | s
elect * from xxxxdata_vw where clientid = 164 and filldate > '01-01-2012'
2034098332 | reporting | 32697 | 10 | postgres | | | | | 2012-01-12 19:02:06.911721-05 | 2012-01-12 19:02:06.941263-05 | 2012-01-12 19:02:06.941263-05 | f | a
utovacuum: VACUUM public.xxxxdata_201201
(2 rows)

There are 3 db's in this cluster. These above two activities are
essentially hitting the same underlying child table. xxxxdata_vw is on
xxxxdata_parent which has xxxxdata_YYYYMM children.

The select query has been apparently orphaned by it's originating
terminal or script or application. pg_/cancel/terminate/_backend both
return t, but the query does not cancel/terminate. sigterm via the OS
has no effect either.

I'm not sure whether the vacuum is active, and just slow, or if it's
spinning, or...? I've taken no action toward the vacuum process.

reporting=# select count(*) from xxxxdata_201201;
count
---------
9170468
(1 row)

There are approx 500K records inserted daily, so the table had approx
7.5M records when the query started.

$ cat /proc/30479/wchan
sk_stream_wait_memory

Thanks,
reid

Responses

Browse pgsql-general by date

  From Date Subject
Next Message debian nick 2012-01-13 17:10:58 Re: Postgresql allow connections without password [SOLVED]
Previous Message Pavel Stehule 2012-01-13 16:52:58 Re: ORAFCE -> UTL_FILE -> pul_line