Re: autovacuum not prioritising for-wraparound tables

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Christopher Browne <cbbrowne(at)gmail(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: autovacuum not prioritising for-wraparound tables
Date: 2013-01-25 17:35:25
Message-ID: 20130125173525.GB14926@awork2.anarazel.de
Lists: pgsql-hackers

On 2013-01-25 12:19:25 -0500, Robert Haas wrote:
> On Fri, Jan 25, 2013 at 11:51 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > The floor(log(size)) part seems like it will have rather arbitrary
> > behavioral shifts when a table grows just past a log boundary. Also,
> > I'm not exactly sure whether you're proposing smaller tables first or
> > bigger tables first, nor that either of those orderings is a good thing.
> >
> > I think sorting by just age(relfrozenxid) for for-wraparound tables, and
> > just the n_dead_tuples measurement for others, is probably reasonable
> > for now. If we find out that has bad behaviors then we can look at how
> > to fix them, but I don't think we have enough understanding yet of what
> > the bad behaviors might be.

> I think that to do this right, we need to consider not only the status
> quo but the trajectory. For example, suppose we have two tables to
> process, one of which needs a wraparound vacuum and the other one of
> which needs dead tuples removed. If the table needing the wraparound
> vacuum is small and just barely over the threshold, it isn't urgent;
> but if it's large and way over the threshold, it's quite urgent.
> Similarly, if the table which needs dead tuples removed is rarely
> updated, postponing vacuum is not a big deal, but if it's being
> updated like crazy, postponing vacuum is a big problem. Categorically
> putting autovacuum wraparound tables ahead of everything else seems
> simplistic, and thinking that more dead tuples is more urgent than
> fewer dead tuples seems *extremely* simplistic.

I don't think the first part is problematic. Which scenario do you have
in mind where that would really cause adverse behaviour? autovacuum
seldom does full-table vacuums on tables otherwise these days, so
tables get "old" in that sense pretty regularly and mostly uniformly.

I agree that the second criterion isn't worth very much and that we need
something better there.

> I ran across a real-world case where a user had a small table that had
> to be vacuumed every 15 seconds to prevent bloat. If we change the
> algorithm in a way that gives other things priority over that table,
> then that user could easily get hosed when they install a maintenance
> release containing this change.

I think if we backpatch this we should only prefer wraparound tables and
leave the rest unchanged.
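
To make that minimal variant concrete, here is a rough sketch. It is
Python purely for illustration, not the actual autovacuum code, and the
names Candidate, xid_age, for_wraparound and order_candidates are made up
for this example. Ordering the wraparound tables among themselves by
age(relfrozenxid) is an assumption borrowed from Tom's suggestion above;
everything else keeps whatever order it arrived in (e.g. pg_class order).

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical stand-in for what autovacuum knows about one table."""
    relname: str
    xid_age: int          # age(relfrozenxid)
    for_wraparound: bool  # assumed True once xid_age passes the freeze limit


def order_candidates(candidates):
    """Put for-wraparound tables first, oldest relfrozenxid first.

    Tables that do not need a wraparound vacuum keep their input order,
    so behaviour for them is unchanged.
    """
    wraparound = sorted(
        (c for c in candidates if c.for_wraparound),
        key=lambda c: c.xid_age,
        reverse=True,
    )
    rest = [c for c in candidates if not c.for_wraparound]
    return wraparound + rest
```

The point of splitting the list rather than sorting all of it is that the
non-wraparound tables are not reordered at all, which keeps the change as
small as possible.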

> Which is exactly why back-patching this is not a good idea, IMHO. We
> could easily run across a system where pg_class order happens to be
> better than anything else we come up with. Such changes are expected
> in new major versions, but not in maintenance releases.

I think a minimal version might be acceptable. It's a bug if the database
regularly shuts down and you need to write manual vacuuming scripts to
prevent it from happening.

I don't think the argument that the pg_class order might work better
than anything else holds that much truth - it's not like that's something
really stable.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
