From: | Lonni J Friedman <netllama(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-general <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: autovacuum stuck on a table for 18+ hours, consuming lots of CPU time |
Date: | 2011-11-23 03:43:58 |
Message-ID: | CAP=oouF7P_zThUrcQPxVNYrhLByZ0T-TVdQa5iUKXN3BGyegoQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
On Tue, Nov 22, 2011 at 7:19 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Lonni J Friedman <netllama(at)gmail(dot)com> writes:
>> Thanks for your prompt reply. I was pretty sure that I was using the
>> default, but just to confirm, I just ran:
>> 'SHOW vacuum_cost_delay;'
>
> What about autovacuum_vacuum_cost_delay? The selects seem to be
> delaying for 32msec, which is not the default for anything.
20ms is what it returns. I grepped for '32' in postgresql.conf and
other than the '5432' port, nothing was returned.
I'm using the official postgres RPMs that shipped with
Fedora15-x86_64, so unless something weird got compiled in, I have no
idea where that 32ms is coming from.
>
>> Is there no way to safely kill off this autovacuum process that
>> doesn't involve shutting down & restarting the database?
>
> Sending it a SIGINT ought to be safe enough, though I don't think that
> is necessarily advisable, because the next autovacuum will probably take
> just as long. Killing this one will mainly mean you've wasted (much of)
> the work it did so far.
ok, then I guess I'll wait longer.
>
> Before getting hasty I'd suggest identifying what table (or index) it's
> working on --- lsof on the process to see what FD 95 is connected to
> would be the first step. I'm thinking it's an index since the seek
> addresses don't seem to be consecutive. And it might be worth watching
> the seek addresses for awhile to see if you can prove that it's looping
> --- if it is, that might be an indication of a corrupt index. If it
> isn't, but is just steadily working through the index, you'd be best
> advised to have patience.
I suspect you're right. I just ran strace against that PID again, and
now all the lseek & read FD's are referrring to a different number
(115), so that means its moved onto something new since I looked a few
hours ago?
Anyway, I think this is what you were referring to:
/proc/30188/fd/115 -> /var/lib/pgsql/data/base/64793/72633.10
How do I correlate that file to an actual database object?
thanks
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2011-11-23 03:49:56 | Re: autovacuum stuck on a table for 18+ hours, consuming lots of CPU time |
Previous Message | Ondrej Ivanič | 2011-11-23 03:22:41 | Re: autovacuum stuck on a table for 18+ hours, consuming lots of CPU time |