From: | Mark Kirkwood <mark(dot)kirkwood(at)catalyst(dot)net(dot)nz> |
---|---|
To: | Greg Stark <gsstark(at)mit(dot)edu> |
Cc: | Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-bugs(at)postgresql(dot)org |
Subject: | Re: Hung Vacuum in 8.3 |
Date: | 2011-03-08 00:03:06 |
Message-ID: | 4D75723A.5060207@catalyst.net.nz |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-bugs |
On 08/03/11 12:55, Mark Kirkwood wrote:
> On 23/02/11 10:18, Mark Kirkwood wrote:
>> On 23/02/11 00:26, Greg Stark wrote:
>>>
>>> It's also possible there's a bug of course. If someone was using that
>>> buffer and somehow failed to notify the vacuum that they were done it
>>> would wait for a very long time (forever?). However if vacuum
>>> eventually continued when the query was canceled then it seems likely
>>> it was working as intended.
>>>
>>
>> Greg, thanks for clarifying this.
>>
>> Unfortunately this time around I canceled the vacuum and then the
>> query. However *next* time I'll get rid of the query 1st and see what
>> happens.
>>
>
> I have another one here:
>
> postgres=# select datname,procpid, query_start, current_query
> from pg_stat_activity
> where now() - query_start > '2 hours'::interval
> order by query_start;
> datname | procpid | query_start | current_query
> ---------+---------+-------------------------------+-----------------
> stuff | 23957 | 2011-03-08 06:02:02.659159+13 | VACUUM ANALYZE;
>
> postgres=# select datname,procpid, query_start, current_query
> from pg_stat_activity
> where now() - backend_start > '2 hours'::interval
> order by query_start;
> datname | procpid | query_start | current_query
> ---------+---------+-------------------------------+-----------------
> stuff | 23957 | 2011-03-08 06:02:02.659159+13 | VACUUM ANALYZE;
> :
> stuff | 30018 | 2011-03-08 11:57:03.214177+13 | <IDLE>
> stuff | 11390 | 2011-03-08 12:27:20.390122+13 | <IDLE>
> stuff | 10178 | 2011-03-08 12:28:59.328265+13 | <IDLE>
> stuff | 5100 | 2011-03-08 12:30:10.318204+13 | <IDLE>
> stuff | 10255 | 2011-03-08 12:32:21.520814+13 | <IDLE>
> stuff | 29993 | 2011-03-08 12:32:26.562835+13 | <IDLE>
> stuff | 10204 | 2011-03-08 12:40:57.059958+13 | <IDLE>
> stuff | 10206 | 2011-03-08 12:41:11.708924+13 | <IDLE>
> (9 rows)
>
>
> It looks to me like whatever query buffer pins the vacuum *was*
> waiting on should be well and truly over by now. The strace for the
> vacuum looks the same as before:
>
> $ strace -p 23957
> Process 23957 attached - interrupt to quit
> semop(39747613, 0x7fff53d3dec0, 1
>
>
> It certainly looks like the vacuum is not able to resume after setting
> itself to sleep for some reason.
>
>
>
Immediately after I sent this, I saw the vacuum unlock itself and
continue... so there *are* some queries still blocking it. I'm seeing
some highish (1 -2 minute) lock waits for a certain table:
postgres=# select datname,procpid, query_start, current_query
from pg_stat_activity where now() - query_start > '1 minute'
and procpid in (select distinct pid from pg_locks where not
granted) order by query_start;
datname | procpid | query_start | current_query
---------+---------+-------------------------------+------------------------------
stuff | 390 | 2011-03-08 12:57:01.752234+13 | UPDATE
content.twitter_queue
: SET error = $1
: WHERE
twitter_queue_id = $2
(30 or so)
which are probably not helping.
So false alarm, no bug here it would seem, sorry.
Mark
From | Date | Subject | |
---|---|---|---|
Next Message | YAMAMOTO Takashi | 2011-03-08 00:37:40 | BUG #5918: SummarizeOldestCommittedSxact assertion failure |
Previous Message | Mark Kirkwood | 2011-03-07 23:55:36 | Re: Hung Vacuum in 8.3 |