From: | Nathan Bossart <nathandbossart(at)gmail(dot)com> |
---|---|
To: | Melanie Plageman <melanieplageman(at)gmail(dot)com> |
Cc: | Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Peter Geoghegan <pg(at)bowt(dot)ie>, David Rowley <dgrowley(at)gmail(dot)com> |
Subject: | Re: Trigger more frequent autovacuums of heavy insert tables |
Date: | 2025-02-07 20:38:43 |
Message-ID: | Z6ZvUx4N4svJmXjF@nathan |
Lists: | pgsql-hackers |
On Fri, Feb 07, 2025 at 02:21:07PM -0500, Melanie Plageman wrote:
> On Fri, Feb 7, 2025 at 12:37 PM Nathan Bossart <nathandbossart(at)gmail(dot)com> wrote:
>> My first reaction is to question whether
>> it makes sense to have two strategies for this sort of thing:
>> autovacuum_vacuum_max_threshold for updates/deletes and this for inserts.
>> Perhaps we don't want to more aggressively clean up bloat (except for the
>> very largest tables via the hard cap), but we do want to more aggressively
>> mark newly-inserted tuples frozen. I'm curious what you think.
>
> The goal with insert-only tables is to set the whole page frozen in
> the VM. So, the number of pages is more important than the total
> number of tuples inserted. Whereas, with updates/deletes, it seems
> like the total amount of garbage (# tuples) needing cleaning is more
> important.
I think this is a reasonable position. To be clear, I don't have a problem
with having different strategies, or even with swapping
autovacuum_vacuum_max_threshold with a similar change, if it's the right
thing to do. I just want to be able to articulate why they're different.
>> Wouldn't relallvisible be sufficient here? We'll skip all-visible pages
>> unless this is an anti-wraparound vacuum, at which point I would think the
>> insert threshold goes out the window.
>
> It's a great question. There are a couple reasons why I don't think so.
>
> I think this might lead to triggering vacuums too often for
> insert-mostly tables. For those tables, the pages that are not
> all-visible will largely be just those with data that is new since the
> last vacuum. And if we trigger vacuums based off of the % not
> all-visible, we might decrease the number of cases where we are able
> to vacuum inserted data and freeze it the first time it is vacuumed --
> thereby increasing the total amount of work.
Rephrasing to make sure I understand correctly: you're saying that using
all-frozen would trigger insert vacuums less frequently, which would give
us a better chance of freezing more data than we'd get from the more
frequent insert vacuums triggered via all-visible? My suspicion is that the
difference would tend to be quite subtle in practice, but I have no
concrete evidence to back that up.
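
To make the comparison concrete, here is a rough sketch of the two variants.
It assumes the insert threshold is scaled by the fraction of pages not yet
all-frozen (or not yet all-visible); that form, the function name
insert_threshold, and the table numbers are illustrative assumptions, not
taken from the patch. The 1000 and 0.2 values are the stock defaults for
autovacuum_vacuum_insert_threshold and autovacuum_vacuum_insert_scale_factor.

```python
def insert_threshold(base_thresh, scale_factor, reltuples, relpages, pages_done):
    """Insert-vacuum threshold scaled by the fraction of pages still to do.

    pages_done stands in for relallfrozen or relallvisible, depending on
    which variant is being modelled (an assumed formula, see lead-in).
    """
    if relpages <= 0:
        # No page count available: fall back to the unscaled threshold.
        frac_remaining = 1.0
    else:
        pages_done = min(pages_done, relpages)
        frac_remaining = (relpages - pages_done) / relpages
    return base_thresh + scale_factor * reltuples * frac_remaining


# Hypothetical insert-mostly table: 1,000,000 pages, 100 tuples per page.
relpages = 1_000_000
reltuples = 100_000_000
relallvisible = 900_000  # assumed: 90% of pages all-visible
relallfrozen = 800_000   # assumed: 80% of pages all-frozen

by_frozen = insert_threshold(1000, 0.2, reltuples, relpages, relallfrozen)
by_visible = insert_threshold(1000, 0.2, reltuples, relpages, relallvisible)

# All-frozen pages are a subset of all-visible pages, so the all-frozen
# variant never yields a lower threshold than the all-visible one.
print(f"threshold via all-frozen:  {by_frozen:,.0f} inserted tuples")
print(f"threshold via all-visible: {by_visible:,.0f} inserted tuples")
```

Under these assumed numbers the all-frozen variant waits for roughly twice
as many inserted tuples before triggering, which is the "less frequent"
behavior described above. How large the gap is on a real table depends on
how far relallfrozen lags behind relallvisible.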
--
nathan