Re: Berserk Autovacuum (let's save next Mandrill)

From: Andres Freund <andres(at)anarazel(dot)de>
To: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
Cc: David Rowley <dgrowleyml(at)gmail(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Darafei Komяpa Praliaskouski <me(at)komzpa(dot)net>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Michael Banck <mbanck(at)gmx(dot)net>
Subject: Re: Berserk Autovacuum (let's save next Mandrill)
Date: 2020-03-20 06:44:33
Message-ID: 20200320064433.uiq4bb7fvhffpr6k@alap3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2020-03-20 06:59:57 +0100, Laurenz Albe wrote:
> On Thu, 2020-03-19 at 15:17 -0700, Andres Freund wrote:
> > I am *VERY* doubtful that the attempt of using a large threshold, and a
> > tiny scale factor, is going to work out well. I'm not confident enough
> > in my gut feeling to full-throatedly object, but confident enough that
> > I'd immediately change it on any important database I operated.
> >
> > Independent of how large a constant you set the threshold to, for
> > databases with substantially bigger tables this will lead to [near]
> > constant vacuuming. As soon as you hit 1 billion rows - which isn't
> > actually that much - this is equivalent to setting
> > autovacuum_{vacuum,analyze}_scale_factor to 0.01. There's cases where
> > that can be a sensible setting, but I don't think anybody would suggest
> > it as a default.
>
> In that, you are assuming that the bigger a table is, the more data
> modifications it will get, so that making the scale factor the dominant
> element will work out better.

> My experience is that it is more likely for the change rate (inserts,
> I am less certain about updates and deletes) to be independent of the
> table size. (Too) many large databases are so large not because the
> data influx grows linearly over time, but because people don't want to
> get rid of old data (or would very much like to do so, but never planned
> for it).

I don't think a growing ingest rate into insert-only tables is exactly
rare. Maybe I've been too long in the Bay Area, though.
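To make the threshold-vs-scale-factor equivalence from the quoted
paragraph concrete, here's a quick sketch. The trigger formula matches
PostgreSQL's autovacuum condition (changed tuples > threshold +
scale_factor * reltuples); the concrete threshold value is illustrative,
not taken from the patch:

```python
def vacuum_trigger(reltuples, threshold, scale_factor):
    """Changed tuples needed before autovacuum kicks in for a table."""
    return threshold + scale_factor * reltuples

# A large constant threshold with scale_factor = 0 behaves, on a
# 1-billion-row table, exactly like a tiny scale factor:
big_table = 1_000_000_000
by_threshold = vacuum_trigger(big_table, 10_000_000, 0.0)  # 10M tuples
equivalent_sf = by_threshold / big_table                   # 0.01
```

That is, for any fixed threshold there's a table size past which it is
indistinguishable from a small scale factor, which is the point above.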

> This second scenario would be much better served by a high threshold and
> a low scale factor.

I don't think that's really true. As soon as there are any gin/gist
indexes, a single non-HOT dead tuple, or a btree index that has grown by
more than vacuum_cleanup_index_scale_factor, indexes are scanned as a
whole. See the email I just concurrently happened to write:
https://postgr.es/m/20200320062031.uwagypenawujwajx%40alap3.anarazel.de

Which means that often each additional vacuum causes IO that's
proportional to the *total* index size, *not* the table size
delta. So the difference in total IO basically is
O(increased_frequency * peak_table_size) in the worst case.
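A back-of-the-envelope sketch of that worst case, assuming every vacuum
ends up scanning the full index (the sizes and intervals below are
invented for illustration, not measurements):

```python
def total_index_io(rows_inserted, trigger_interval, index_size_gb):
    """Total GB of index reads if every vacuum scans the whole index."""
    vacuum_count = rows_inserted // trigger_interval
    return vacuum_count * index_size_gb

# Vacuuming a 50 GB index every 10M inserts vs. every 100M inserts,
# over 1 billion inserted rows:
frequent   = total_index_io(1_000_000_000, 10_000_000, 50)   # 5000 GB
infrequent = total_index_io(1_000_000_000, 100_000_000, 50)  # 500 GB
```

The cost scales with vacuum frequency times index size, not with the
number of rows inserted since the last vacuum.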

> > After thinking about it for a while, I think it's fundamentally flawed
> > to use large constant thresholds to avoid unnecessary vacuums. It's easy
> > to see cases where it's bad for common databases of today, but it'll be
> > much worse a few years down the line, when common table sizes have grown
> > by an order of magnitude or two. Nor do they address the difference between tables
> > of a certain size with e.g. 2kb wide rows, and a same sized table with
> > 28 byte wide rows. The point of constant thresholds imo can only be to
> > avoid unnecessary work at the *small* (even tiny) end, not the opposite.
> >
> >
> > I think there's too much "reinventing" autovacuum scheduling in a
> > "local" insert-only manner happening in this thread. And as far as I can
> > tell additionally only looking at a somewhat narrow slice of insert only
> > workloads.
>
> Perhaps. The traditional "high scale factor, low threshold" system
> is (in my perception) mostly based on the objective of cleaning up
> dead tuples. When autovacuum was introduced, index only scans were
> only a dream.
>
> With the objective of getting rid of dead tuples, having the scale factor
> be the dominant part makes sense: it is OK for bloat to be a certain
> percentage of the table size.
>

As far as I can tell this argument doesn't make sense in light of the
fact that many vacuums trigger whole-index scans, even if there are no
deleted tuples, as described above?

Even disregarding the index issue, I still don't think your argument is
very convincing. For one, as I mentioned in another recent email, 10
million rows in a narrow table is something entirely different than 10
million rows in a very wide table. scale_factor doesn't have that
problem to the same degree. Also, it's fairly obvious that this
argument doesn't hold in the general sense, otherwise we could just set
a threshold of, say, 10000.
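To put rough numbers on the row-width point (ignoring page headers,
alignment, and fill factor; the widths are the 28-byte and 2kb examples
from the quoted text):

```python
def heap_gb(rows, row_bytes):
    """Approximate heap size for a given row count and average width."""
    return rows * row_bytes / 1024**3

# The same 10-million-tuple trigger corresponds to very different
# amounts of heap, depending on row width:
narrow = heap_gb(10_000_000, 28)    # roughly a quarter of a GB
wide   = heap_gb(10_000_000, 2048)  # roughly 19 GB
```

A scale factor tracks both cases proportionally; a constant tuple-count
threshold cannot distinguish them.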

There's also the issue that frequent vacuums will often not be able to
mark most of the new data all-visible, due to concurrent
sessions. E.g. concurrent bulk loading sessions, analytics queries
actually looking at the data, replicas all can easily prevent data that
was just inserted from being marked 'all-visible' (not to speak of
frozen). That's not likely to be a problem in a purely OLTP system that
inserts only single rows per xact and has no long-lived readers (nor
replicas with hs_feedback = on), but outside of that...
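A toy model of that visibility constraint: simplifying heavily, a heap
page can only be marked all-visible once every tuple on it predates the
oldest xmin any session (or hot-standby feedback) still holds open. The
function and xid values below are illustrative, not PostgreSQL's actual
internals:

```python
def page_all_visible(tuple_xmins, oldest_xmin):
    """True if vacuum may set the page's all-visible bit."""
    return all(xmin < oldest_xmin for xmin in tuple_xmins)

# A long-running reader holding oldest_xmin = 100 keeps freshly
# inserted tuples (xmin 150, 151) from being marked all-visible;
# once the horizon advances past them, the page qualifies:
blocked = page_all_visible([150, 151], oldest_xmin=100)  # False
allowed = page_all_visible([150, 151], oldest_xmin=200)  # True
```

So frequent vacuums of a table under concurrent load can repeatedly do
the work without being able to set the bits that would make index-only
scans pay off.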

> Also, as you say, tables were much smaller then, and they will only
> become bigger in the future. But I find that to be an argument *for*
> making the threshold the dominant element: otherwise, you vacuum less
> and less often, and the individual runs become larger and larger.

Which mostly is ok, because there are significant costs that scale with
the table size. And in a lot (but far from all!) of cases the benefits
of vacuuming scale more with the overall table size than with the delta
of the size.

> Now that vacuum skips pages where it knows it has nothing to do,
> doesn't that take away much of the pain of vacuuming large tables
> where nothing much has changed?

Unfortunately not really.

Greetings,

Andres Freund
