Autovacuum degrades all other operations by keeping all buffers dirty?

From: David Pacheco <dap(at)joyent(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Autovacuum degrades all other operations by keeping all buffers dirty?
Date: 2018-08-31 21:53:12
Message-ID: CACukRjPNweo72NjLRjX1aCsBSm0TXH9Wmr311DDgZvCNm_xFcw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hello,

We've been struggling with some major performance issues related to
autovacuum.
(I know this is a common problem.) For a while, we believed this was
primarily
related to I/O contention, but recent observations from our monitoring make
me
wonder if there's a deeper issue here, so I'm looking for some better
understanding.

From reading the 9.6.3 source, it looks like the autovacuum process itself
is
single-threaded, and it reads pages essentially linearly from the relation
(possibly skipping some). When the autovacuum process needs to modify a
page,
it doesn't write it directly, but rather marks the buffer dirty. The page
will
be written later, either by the checkpointer (according to its
configuration, in
terms of time and WAL), the bgwriter (according to its configuration, in
terms
of write count and sleep time), or else some other backend process that
requires
a free buffer (if the count of non-dirty buffers reaches zero). Is this
accurate?

In our case, we found that when autovacuum runs, the number of dirty buffers
written by regular backends shoots from a handful to as much as 300 buffers
per
second. (We have 200 backends on most databases.) More specifically:
prior to
recent autovacuums starting, databases did under 1000 buffer allocations per
second. Most of those were done by the checkpointer. Individual backends
did
just a handful. After autovacuum started, buffer allocations rose to
between
2,000 and 8,000 per second. The checkpointer handled many of these, but so
did
individual backends (see above). The bgwriter peaked around 25 buffer
writes
per second. So it seems like the spike in buffers written by normal
backends
could explain the significant degradation that we see in average latency and
overall throughput (which can be as much as 50%).

It looks to me like the autovacuum process is effectively generating work
(in
the form of async writes) that's being distributed implicitly to the various
backend processes, creating latency for any other query that happens to
require
a buffer (including read-only queries). Maybe we can improve this by
tuning the
bgwriter. But if that's single-threaded, presumably there's an upper bound
to
how many buffer writes it can handle? Is there anything to guarantee that
the
bgwriter will keep up with the work that's being generated? That is, for
some
workloads, wouldn't it be possible that autovacuum could always generate
work
faster than the bgwriter can do it, and you'd always have some pretty major
degradation to all other queries?

I've drawn a lot of inferences here, and I'm sure there are some mistakes.
I'd
appreciate any clarification, correction, or confirmation!

Thanks in advance,
Dave

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Alvaro Herrera 2018-08-31 22:31:47 Re: Autovacuum degrades all other operations by keeping all buffers dirty?
Previous Message Dave Peticolas 2018-08-31 15:51:59 Re: locate DB corruption