Re: vacuum scheduling (was: index-only scans)

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Kääriäinen Anssi <anssi(dot)kaariainen(at)thl(dot)fi>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, "postgres(at)cybertec(dot)at" <postgres(at)cybertec(dot)at>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: vacuum scheduling (was: index-only scans)
Date: 2011-08-14 04:12:19
Message-ID: CA+TgmoZ4m_ycyC880TvouTdkT9NdjGJhc3DonTMbBp=aOV3Kog@mail.gmail.com
Lists: pgsql-hackers

On Sat, Aug 13, 2011 at 5:31 PM, Heikki Linnakangas
<heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
> Yeah, I'm not excited about making the planner and statistics more dynamic.
> Feedback loops and plan instability are not fun. I think we should rather be
> more aggressive in setting the visibility map bits.

As far as I can see, the only way we're really going to make any
headway setting visibility map bits more aggressively is to get
autovacuum to do it for us. I mean, we could consider doing it when
we perform a HOT cleanup; and we could also consider having the
background writer do it when it evicts buffers. But frankly, those
sorts of things are a drop in the bucket. They're just not going to
occur frequently enough to really make a dent. Maybe you could make
it work for the case where the working set fits in shared_buffers,
because then the background writer isn't doing anything anyway and so,
hey, why not set visibility map bits? But you're not going to get
much of a performance win out of index-only scans on that workload
anyway. If the working set does NOT fit in shared_buffers, you're
going to be evicting pages like crazy. I can't really see us deciding
to dirty all of those pages and write and flush XLOG for them instead
of just booting them out. Even if you have the background writer try
to do some of that, other backends that are just throwing things over
the fence to get foreground work done are going to run circles around
it, so it's going to take a long, long time to make much headway by
this route.

Assuming you believe the above analysis (which you may not, so feel
free to argue if you see something I'm missing), that leaves
autovacuum. There are many things to like about running autovacuum
more aggressively, and even on insert-only tables. Setting visibility
map bits isn't only useful for index-only scans: it also significantly
speeds up sequential scans that are not I/O-bound, because it allows
the per-tuple visibility checks on all-visible pages to be skipped. On the
flip side, setting visibility map bits (and hint bits) has to be
considered a lower priority than vacuuming tables that contain dead
tuples, because bloat is a far, far bigger problem than a few unset
visibility map bits. There are already situations where vacuum can't
keep up, especially with exciting values of vacuum_cost_delay, and I
don't think we can afford to stir any more work into that pot without
prioritizing it.
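
That prioritization could be sketched, very roughly, like this (Python
pseudocode for an ordering the autovacuum launcher might apply; the
work-item shape, the table names, and the overdue_seconds field are all
invented for illustration, not anything that exists in the server):

```python
# Hypothetical work items for an autovacuum launcher to rank.
# "needs_vacuum" means the table has crossed its dead-tuple trigger
# (bloat risk); "vm_only" means it only needs visibility map bits set.
def work_priority(item):
    """Sort key: bloat work strictly before visibility-map-only work;
    within each class, the most overdue table first."""
    kind_rank = 0 if item["kind"] == "needs_vacuum" else 1
    return (kind_rank, -item["overdue_seconds"])

queue = [
    {"name": "events",   "kind": "vm_only",      "overdue_seconds": 900},
    {"name": "orders",   "kind": "needs_vacuum", "overdue_seconds": 120},
    {"name": "sessions", "kind": "needs_vacuum", "overdue_seconds": 600},
]
queue.sort(key=work_priority)
# "sessions" (the most overdue bloat work) runs first; "events", which
# only needs visibility map bits, waits until all bloat work is done.
```

The point of the two-level key is exactly the one above: no amount of
unset visibility map bits should ever outrank a table that is bloating.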

We already have this parameter called vacuum_cost_delay that is
intensely problematic. If you set it too low, vacuum can consume
too much I/O bandwidth and you suffer a performance drop. On the
other hand, if you set it too high (which is easy to do), autovacuum
can't keep up and you get bloat. In the worst case, this sends the
system into a tailspin: as all the tables get bigger, it gets harder
and harder for vacuum to keep up. What would be really nice is to
make this parameter self-tuning, at least to some degree. We used to
run checkpoints at full speed, but we now have this nifty parameter
called checkpoint_completion_target that makes them run slowly, yet
not so slowly that they fail to finish in time. It would be nice
to do the same thing for autovacuum. If we can estimate that a table
will next need vacuuming in 37 minutes and that at the current
vacuum_cost_delay setting the currently-in-progress vacuum will finish
in 83 minutes (or 83 hours), we should realize that we have a problem
and vacuum faster. Similarly, if we notice that tables are coming due
for vacuuming faster than autovacuum workers are processing them, we
should try to speed up.
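
A feedback rule along those lines might look something like this (a
minimal sketch; the function name, the safety factor, and the
assumption that vacuum time scales roughly linearly with the delay are
all mine, not anything the server does today):

```python
def adjust_cost_delay(delay_ms, vacuum_remaining_s, next_vacuum_due_s,
                      safety_factor=0.9):
    """Hypothetical feedback rule: if the in-progress vacuum is
    projected to finish after the next table comes due, shrink
    vacuum_cost_delay proportionally so it finishes in time, much as
    checkpoint_completion_target paces checkpoints.  Assumes total
    vacuum time scales roughly linearly with the delay."""
    deadline_s = next_vacuum_due_s * safety_factor
    if vacuum_remaining_s <= deadline_s:
        return delay_ms  # keeping up; leave the delay alone
    # Overshooting the deadline: cut the delay by the overshoot ratio.
    return max(0.0, delay_ms * deadline_s / vacuum_remaining_s)

# The 37-minute / 83-minute scenario from the text: the current vacuum
# will take far too long at this delay, so the delay gets cut sharply.
faster = adjust_cost_delay(20.0, 83 * 60, 37 * 60)
```

This leaves the delay alone whenever vacuum is comfortably ahead of
schedule, so a high default delay only gets overridden under pressure.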

You can imagine trying to solve this problem by teaching autovacuum to
estimate, for every table in the cluster, the time remaining until the
next vacuum needs to start; and the time at which we will be able to
start the next few vacuums based on when we expect to have enough
autovacuum workers available. We could then detect whether or not
we're keeping up, and adjust accordingly. This would allow us, in
turn, to run vacuum with a higher cost delay by default, and let it
decide for itself when it needs to speed up. And if we want vacuum to
do lower-priority tasks, such as set visibility map bits, we could
teach it to do so only when it's keeping up.
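
For the per-table half of that estimate, the stock autovacuum trigger
(autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor *
reltuples, with defaults of 50 and 0.2) gives a natural target. A
back-of-the-envelope version, with an invented constant dead-tuple
accumulation rate as its input:

```python
def seconds_until_vacuum_due(dead_tuples, dead_tuples_per_s, reltuples,
                             vac_threshold=50, vac_scale_factor=0.2):
    """Hypothetical estimate of when a table next needs vacuuming:
    extrapolate the observed dead-tuple rate out to the standard
    autovacuum trigger point.  Illustration only; real update rates
    are much noisier than a constant."""
    trigger = vac_threshold + vac_scale_factor * reltuples
    if dead_tuples >= trigger:
        return 0.0  # already past the trigger
    if dead_tuples_per_s <= 0:
        return float("inf")  # nothing accruing; effectively never due
    return (trigger - dead_tuples) / dead_tuples_per_s

# A 100,000-row table accumulating 10 dead tuples/sec comes due when
# dead tuples reach 50 + 0.2 * 100000 = 20050.
eta = seconds_until_vacuum_due(0, 10, 100_000)
```

Comparing these per-table ETAs against when workers are expected to
free up is what would tell the launcher whether it is keeping up, and
therefore whether the lower-priority visibility-map work is affordable.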

This all seems like a lot of work, but at the moment I'm not seeing
another way to deliver on the suggested goal of "setting visibility
map bits more aggressively". Does anyone else?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
