Re: Trigger more frequent autovacuums of heavy insert tables

From:	Nathan Bossart <nathandbossart(at)gmail(dot)com>
To:	Melanie Plageman <melanieplageman(at)gmail(dot)com>
Cc:	Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Peter Geoghegan <pg(at)bowt(dot)ie>, David Rowley <dgrowley(at)gmail(dot)com>
Subject:	Re: Trigger more frequent autovacuums of heavy insert tables
Date:	2025-02-07 20:38:43
Message-ID:	Z6ZvUx4N4svJmXjF@nathan
Lists:	pgsql-hackers

On Fri, Feb 07, 2025 at 02:21:07PM -0500, Melanie Plageman wrote:
> On Fri, Feb 7, 2025 at 12:37 PM Nathan Bossart <nathandbossart(at)gmail(dot)com> wrote:
>> My first reaction is to question whether
>> it makes sense to have two strategies for this sort of thing:
>> autovacuum_vacuum_max_threshold for updates/deletes and this for inserts.
>> Perhaps we don't want to more aggressively clean up bloat (except for the
>> very largest tables via the hard cap), but we do want to more aggressively
>> mark newly-inserted tuples frozen. I'm curious what you think.
>
> The goal with insert-only tables is to set the whole page frozen in
> the VM. So, the number of pages is more important than the total
> number of tuples inserted. Whereas, with updates/deletes, it seems
> like the total amount of garbage (# tuples) needing cleaning is more
> important.

I think this is a reasonable position. To be clear, I don't have a problem
with having different strategies, or even with swapping
autovacuum_vacuum_max_threshold with a similar change, if it's the right
thing to do. I just want to be able to articulate why they're different.
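
To make the contrast concrete, here is a rough sketch of the update/delete
side: a threshold on dead tuples that grows with table size until a hard cap
takes over. It assumes the documented form of the autovacuum threshold (base
plus scale factor times reltuples) and the stock defaults for
autovacuum_vacuum_threshold, autovacuum_vacuum_scale_factor, and
autovacuum_vacuum_max_threshold. The function name is hypothetical and this is
not code from the patch.

```python
def vacuum_threshold(reltuples, base=50, scale_factor=0.2,
                     max_threshold=100_000_000):
    """Dead-tuple count at which updates/deletes would trigger an autovacuum.

    Illustrative only: the defaults are assumed to match the stock settings.
    """
    threshold = base + scale_factor * reltuples
    # A negative cap is treated as "no cap".
    if max_threshold >= 0 and threshold > max_threshold:
        threshold = max_threshold
    return threshold


# A mid-sized table is governed by the scale factor ...
small = vacuum_threshold(1_000_000)
# ... while a very large one hits the hard cap instead.
huge = vacuum_threshold(10_000_000_000)
print(small, huge)
```

The cap only changes behavior for the very largest tables, which is the
"hard cap" case mentioned in the quoted text above.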

>> Wouldn't relallvisible be sufficient here? We'll skip all-visible pages
>> unless this is an anti-wraparound vacuum, at which point I would think the
>> insert threshold goes out the window.
>
> It's a great question. There are a couple reasons why I don't think so.
>
> I think this might lead to triggering vacuums too often for
> insert-mostly tables. For those tables, the pages that are not
> all-visible will largely be just those with data that is new since the
> last vacuum. And if we trigger vacuums based off of the % not
> all-visible, we might decrease the number of cases where we are able
> to vacuum inserted data and freeze it the first time it is vacuumed --
> thereby increasing the total amount of work.

Rephrasing to make sure I understand correctly: you're saying that using
all-frozen would trigger insert vacuums less frequently than using
all-visible, which would give us a better chance of freezing newly inserted
data the first time it is vacuumed? My suspicion is that the difference
would tend to be quite subtle in practice, but I have no concrete evidence
to back that up.
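
As a rough illustration of the comparison, the sketch below scales the insert
threshold by the fraction of pages not yet marked in the VM, once using an
all-visible page count and once using an all-frozen page count. It assumes the
proposal multiplies the scale-factor term by that fraction, and it assumes the
stock defaults for autovacuum_vacuum_insert_threshold and
autovacuum_vacuum_insert_scale_factor. The function name and the example page
counts are hypothetical, not measurements.

```python
def insert_threshold(reltuples, relpages, pages_set, base=1000,
                     scale_factor=0.2):
    """Inserted-tuple count at which an insert vacuum would be triggered.

    pages_set is the number of pages already marked in the visibility map:
    pass an all-visible count or an all-frozen count to compare the two.
    Illustrative only; the scaling rule is an assumption about the proposal.
    """
    fraction_unset = 1.0
    if relpages > 0 and pages_set > 0:
        # Clamp in case the page counters are slightly out of date.
        fraction_unset = 1.0 - min(pages_set, relpages) / relpages
    return base + scale_factor * reltuples * fraction_unset


# Hypothetical insert-mostly table: 1M pages, 100M tuples.
reltuples, relpages = 100_000_000, 1_000_000
via_visible = insert_threshold(reltuples, relpages, pages_set=900_000)
via_frozen = insert_threshold(reltuples, relpages, pages_set=600_000)
print(via_visible, via_frozen)
```

Because every all-frozen page is also all-visible, the all-frozen count can
never exceed the all-visible count, so under these assumptions the all-frozen
variant always yields the higher threshold and therefore the less frequent
insert vacuums.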

--
nathan

In response to

Re: Trigger more frequent autovacuums of heavy insert tables at 2025-02-07 19:21:07 from Melanie Plageman

Responses

Re: Trigger more frequent autovacuums of heavy insert tables at 2025-02-07 20:57:49 from Melanie Plageman
