Re: Trigger more frequent autovacuums of heavy insert tables

From: Melanie Plageman <melanieplageman(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Nathan Bossart <nathandbossart(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Peter Geoghegan <pg(at)bowt(dot)ie>, David Rowley <dgrowley(at)gmail(dot)com>, Corey Huinker <corey(dot)huinker(at)gmail(dot)com>
Subject: Re: Trigger more frequent autovacuums of heavy insert tables
Date: 2025-02-25 22:13:09
Message-ID: CAAKRu_bKcmQaY2=KgwRvar7nedFZ9bJvsGgBChsuUgTzRiv1NA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Feb 25, 2025 at 1:52 PM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> On Tue, Feb 25, 2025 at 11:03 AM Melanie Plageman
> <melanieplageman(at)gmail(dot)com> wrote:
> > This does however leave me with the question of how to handle the
> > original question of whether or not to cap the proposed relallfrozen
> > to the value of relallvisible when updating stats at the end of
> > vacuum. The current code in heap_vacuum_rel() caps relallvisible to
> > relpages, so capping relallfrozen to relallvisible would follow that
> > pattern. However, the other places relallvisible is updated do no such
> > capping (do_analyze_rel(), index_update_stats()). It doesn't seem like
> > there is a good reason to do it one place and not the others. So, I
> > suggest either removing all the caps and adding a WARNING or capping
> > the value in all places. Because users can now manually update these
> > values in pg_class, there wouldn't be a way to detect the difference
> > between a bogus relallfrozen value due to VM corruption or a bogus
> > value due to manual statistics intervention. This led me to think that
> > a WARNING and no cap would be more effective for heap_vacuum_rel().
>
> I mean, does it really make any difference one way or the other?
>
> Given that users could manually update the catalog, we have to be able
> to tolerate bad data in the catalogs without the world ending. If that
> code has to exist anyway, then it's not mandatory to cap. On the other
> hand, there's no great virtue in refusing to correct data that we know
> to be wrong. Unless there is some other consideration which makes one
> way better than the other, this feels like author's choice.

I realized that whether or not we add a WARNING is an independent
question from whether or not we cap these values. In these instances,
we happen to have just read the whole VM and so we can tell you if it
is broken in a particular way. If I want to write a patch to warn
users of visibility map corruption after calling
visibilitymap_count(), I could do that and it might be a good idea,
but it should probably be a separate commit anyway.

- Melanie

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Masahiko Sawada 2025-02-25 22:14:19 Re: Parallel heap vacuum
Previous Message Masahiko Sawada 2025-02-25 22:05:28 Re: Make COPY format extendable: Extract COPY TO format implementations