Re: New GUC autovacuum_max_threshold ?

From: Frédéric Yhuel <frederic(dot)yhuel(at)dalibo(dot)com>
To: Nathan Bossart <nathandbossart(at)gmail(dot)com>, wenhui qiu <qiuwenhuifx(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, "Imseih (AWS), Sami" <simseih(at)amazon(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Joe Conway <mail(at)joeconway(dot)com>, Michael Banck <mbanck(at)gmx(dot)net>, Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: New GUC autovacuum_max_threshold ?
Date: 2024-11-13 10:03:25
Message-ID: 2ca7f9c7-5da9-46cb-9072-39b71bfab948@dalibo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 11/9/24 16:59, Nathan Bossart wrote:
> AFAICT the main advantage of these formulas is that you don't need another
> GUC, but they also makes the existing ones more difficult to configure.

I wouldn't say that's the main advantage. It doesn't seem very clean to
me to cap to a fixed value. Because you could take Robert's
demonstration with a bigger table, and come to the same conclusion:

Let's compare the current situation to the situation post-Nathan's-patch
with a cap of 100M. Consider a table 100 times larger than the one of
Robert's previous example, so pgbench scale factor 2_560_000, size on
disk 32TB.
Currently, that table will be vacuumed for bloat when the number of
dead tuples exceeds 20% of the table size, because that's the default
value of autovacuum_vacuum_scale_factor. The table has 256 billion
tuples, so that means that we're going to vacuum it when there are
more than 51 billion dead tuples. Post-patch, we will vacuum when we
have 100 million dead tuples. Suppose a uniform workload that slowly
updates rows in the table. If we were previously autovacuuming the
table once per day (1440 minutes) we're now going to try to vacuum it
almost every minute (1440 minutes / 512 = 168 seconds).

(compare with every 55 min with my formula)

Of course, this a theoretical example that is probably unrealistic. I
don't know, really. I don't know if Robert's example was realistic in
the first place.

In any case, we should do the tests that Robert suggested and/or come up
with a good mathematical model, because we are in the dark at the moment.

> Plus, there's no way to go back to the existing behavior.

I think we should indeed provide a retro-compatible behaviour (so maybe
another GUC after all).

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2024-11-13 10:11:37 Re: SQL:2011 application time
Previous Message Amit Kapila 2024-11-13 09:38:27 Re: logical replication: restart_lsn can go backwards (and more), seems broken since 9.4