Re: allow changing autovacuum_max_workers without restarting

From: Nathan Bossart <nathandbossart(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "Imseih (AWS), Sami" <simseih(at)amazon(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: allow changing autovacuum_max_workers without restarting
Date: 2024-04-19 20:29:31
Message-ID: 20240419202931.GA57652@nathanxps13
Lists: pgsql-hackers

On Fri, Apr 19, 2024 at 02:42:13PM -0400, Robert Haas wrote:
> I think this could help a bunch of users, but I'd still like to
> complain, not so much with the desire to kill this patch as with the
> desire to broaden the conversation.

I think I subconsciously hoped this would spark a bigger discussion...

> Now, before this patch, there is a fairly good reason for that, which
> is that we need to reserve shared memory resources for each autovacuum
> worker that might potentially run, and the system can't know how much
> shared memory you'd like to reserve for that purpose. But if that were
> the only problem, then this patch would probably just be proposing to
> crank up the default value of that parameter rather than introducing a
> second one. I bet Nathan isn't proposing that because his intuition is
> that it will work out badly, and I think he's right. I bet that
> cranking up the number of allowed workers will often result in running
> more workers than we really should. One possible negative consequence
> is that we'll end up with multiple processes fighting over the disk in
> a situation where they should just take turns. I suspect there are
> also ways that we can be harmed - in broadly similar fashion - by cost
> balancing.
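
For context: the shared-memory cost comes from the fact that every
potential autovacuum worker counts toward MaxBackends, which is used
to size various shared-memory structures at server start. From
InitializeMaxBackends() (the exact shape varies a bit across
branches):

    /* the extra unit accounts for the autovacuum launcher */
    MaxBackends = MaxConnections + autovacuum_max_workers + 1 +
        max_worker_processes + max_wal_senders;

which is why raising autovacuum_max_workers has required a restart.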

Even if we were content to raise the default value of
autovacuum_max_workers and tell folks to just adjust the cost settings,
there would still probably be many cases where adding even more workers
is necessary. If you have a zillion tables, turning cost-based
vacuuming off completely may be insufficient to keep up, at which point
your options become limited. It can also be difficult to tell whether
you might end up in that situation as your workload evolves over time.
In any case, it's not clear to me that raising the default value of
autovacuum_max_workers would do more good than harm. My sense is that
the default of 3 is sufficient for a lot of clusters, so there'd really
be little upside to changing it AFAICT. (I guess this proves your point
about my intuition.)
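
To make the "just raise the defaults" alternative concrete, it would
amount to something like this in postgresql.conf (a sketch; the values
are illustrative, not recommendations):

    autovacuum_max_workers = 10          # still requires a restart today
    autovacuum_vacuum_cost_delay = 0     # disable cost-based throttling
    #autovacuum_vacuum_cost_limit = 2000 # or keep the delay, raise the budget

And even then, with enough tables, a cluster can still fall behind.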

> So I feel like what this proposal reveals is that we know that our
> algorithm for ramping up the number of running workers doesn't really
> work. And maybe that's just a consequence of the general problem that
> we have no global information about how much vacuuming work there is
> to be done at any given time, and therefore we cannot take any kind of
> sensible guess about whether 1 more worker will help or hurt. Or,
> maybe there's some way to do better than what we do today without a
> big rewrite. I'm not sure. I don't think this patch should be burdened
> with solving the general problem here. But I do think the general
> problem is worth some discussion.

I certainly don't want to hold up $SUBJECT for a larger rewrite of
autovacuum scheduling, but I also don't want to shy away from one if
it's an idea whose time has come. I'm looking forward to hearing your
ideas in your pgconf.dev talk.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
