Re: New GUC autovacuum_max_threshold ?

From: Joe Conway <mail(at)joeconway(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Michael Banck <mbanck(at)gmx(dot)net>, Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, Frédéric Yhuel <frederic(dot)yhuel(at)dalibo(dot)com>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, David Rowley <dgrowleyml(at)gmail(dot)com>
Subject: Re: New GUC autovacuum_max_threshold ?
Date: 2024-04-26 13:40:05
Message-ID: 7c91df80-ef1b-426a-a2b3-e11ffb31bcc7@joeconway.com
Lists: pgsql-hackers

On 4/26/24 09:31, Robert Haas wrote:
> On Fri, Apr 26, 2024 at 9:22 AM Joe Conway <mail(at)joeconway(dot)com> wrote:
>> Although I don't think 500000 is necessarily too small. In my view,
>> having autovac run very quickly, even if more frequently, provides an
>> overall better user experience.
>
> Can you elaborate on why you think that? I mean, to me, that's almost
> equivalent to removing autovacuum_vacuum_scale_factor entirely,
> because only for very small tables will that calculation produce a
> value lower than 500k.

If I understood Nathan's proposed calc, for small tables you would still
get (thresh + sf * numtuples). Once that number exceeds the new limit
parameter, then the latter would kick in. So small tables would retain
the current behavior and large enough tables would be clamped.
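
A minimal sketch of the clamped calculation described above, not the actual patch. It assumes the stock defaults for autovacuum_vacuum_threshold (50) and autovacuum_vacuum_scale_factor (0.2), and uses the 500000 figure from this thread as the cap; the function and constant names are made up for illustration.

```python
# Hypothetical illustration of the proposed clamp; names are not from any patch.
AUTOVACUUM_VACUUM_THRESHOLD = 50        # stock default for the existing GUC
AUTOVACUUM_VACUUM_SCALE_FACTOR = 0.2    # stock default for the existing GUC
AUTOVACUUM_MAX_THRESHOLD = 500_000      # cap value discussed in this thread


def vacuum_trigger_threshold(numtuples,
                             thresh=AUTOVACUUM_VACUUM_THRESHOLD,
                             sf=AUTOVACUUM_VACUUM_SCALE_FACTOR,
                             max_thresh=AUTOVACUUM_MAX_THRESHOLD):
    """Dead-row count at which autovacuum would be triggered for a table."""
    uncapped = thresh + sf * numtuples   # current behavior
    return min(uncapped, max_thresh)     # proposed clamp for large tables


if __name__ == "__main__":
    for n in (1_000, 1_000_000, 100_000_000):
        print(n, vacuum_trigger_threshold(n))
```

Under these assumed values the cap starts to apply at roughly 2.5 million rows, since that is where 50 + 0.2 * numtuples reaches 500000.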

> We might need to try to figure out some test cases here. My intuition
> is that this is going to vacuum large tables insanely aggressively.

It depends on the workload, to be sure. Just because a table is large, it
doesn't mean that dead rows are generated that fast.
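
To make the workload dependence concrete, here is a rough back-of-the-envelope sketch. The table size and dead-row rates are hypothetical, the thresholds assume the stock defaults (50 and 0.2) plus a 500000 cap, and it ignores autovacuum_naptime, worker availability, and cost-based delay, so it only estimates how often the trigger condition would be reached.

```python
# Rough estimate only; all workload numbers below are made up for illustration.
def seconds_between_triggers(threshold_rows, dead_rows_per_second):
    """Time for dead rows to accumulate up to the trigger threshold."""
    if dead_rows_per_second <= 0:
        return float("inf")              # no dead rows means no trigger
    return threshold_rows / dead_rows_per_second


numtuples = 1_000_000_000                # hypothetical large table
uncapped = 50 + 0.2 * numtuples          # current formula with stock defaults
capped = min(uncapped, 500_000)          # with the cap discussed in this thread

if __name__ == "__main__":
    for rate in (1, 100, 10_000):        # hypothetical dead rows per second
        print(f"rate={rate}/s "
              f"uncapped={seconds_between_triggers(uncapped, rate) / 3600:.1f}h "
              f"capped={seconds_between_triggers(capped, rate) / 3600:.1f}h")
```

With a low dead-row rate, even the capped threshold is reached only rarely, which is the point: table size alone does not determine how aggressive the vacuuming would be.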

Admittedly it has been quite a while since I looked at all this that
closely, but if A/V runs on some large busy table for a few milliseconds
once every few minutes, that is far less disruptive than A/V running for
tens of seconds once every few hours or for minutes once every few days
-- or whatever. The key thing to me is the "few milliseconds" runtime.
The short duration means that no one notices an impact, and the longer
duration almost guarantees that an impact will be felt.

--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
