From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Mike Schanne <mschanne(at)kns(dot)com>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: autovacuum locking question
Date: 2019-12-06 17:47:56
Message-ID: CAMkU=1zGKub0OjwFYkbcck7-7YmNEbj1HRpzdOMHnAvFY_tppw@mail.gmail.com
Lists: pgsql-performance
On Fri, Dec 6, 2019 at 10:55 AM Mike Schanne <mschanne(at)kns(dot)com> wrote:
> The error is not actually showing up very often (I have 8 occurrences from
> 11/29 and none since then). So maybe I should not be concerned about it.
> I suspect we have an I/O bottleneck from other logs (i.e. long checkpoint
> sync times), so this error may be a symptom rather than the cause.
>
I think that at the point it is getting cancelled, it has done all the work
except truncating the empty pages and reporting the results (for example,
updating n_live_tup and n_dead_tup). If this happens every single time
(neither last_autovacuum nor last_vacuum ever advances), it will eventually
cause problems. So this is mostly a symptom, but not entirely. Simply
running a manual vacuum should fix the reporting problem. A manual vacuum
is not subject to this cancellation: it will detect that it is blocking
someone and gracefully bow out, meaning it will suspend the truncation but
still report its results as normal.
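A minimal sketch of that fix, assuming "my_table" stands in for the
affected table:

    -- A manual vacuum is not cancelled by lock waiters, so its statistics
    -- get reported even if the truncation step has to be suspended.
    VACUUM VERBOSE my_table;

    -- Verify the reporting: last_vacuum should now advance and
    -- n_dead_tup should have dropped.
    SELECT relname, last_vacuum, last_autovacuum, n_live_tup, n_dead_tup
    FROM pg_stat_user_tables
    WHERE relname = 'my_table';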
Reading the table backwards in order to truncate it might be contributing
to the I/O problems as well as being a victim of them. Upgrading to v10
might help with this: v10 implemented a prefetch in which the truncation
scan reads the table forward in 128kB chunks and then jumps backwards one
chunk at a time, rather than simply reading backwards 8kB at a time.
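As an aside, and only on releases newer than the ones discussed here: from
v12 onward the truncation step can be skipped entirely, per table or per
command, which sidesteps both the cancellation and the backward reads. A
sketch, again with my_table as a placeholder:

    -- Per-table storage option (PostgreSQL 12+): vacuum skips truncation.
    ALTER TABLE my_table SET (vacuum_truncate = false);

    -- Or per command (PostgreSQL 12+):
    VACUUM (TRUNCATE false) my_table;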
Cheers,
Jeff