Re: pgsql: Don't consider newly inserted tuples in nbtree VACUUM.

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, pgsql-committers <pgsql-committers(at)lists(dot)postgresql(dot)org>
Subject: Re: pgsql: Don't consider newly inserted tuples in nbtree VACUUM.
Date: 2021-03-11 19:42:58
Message-ID: CAH2-WzkgfHcRWqunbe0UOoacOAMnCX=PtYOjAZymViHxjzSQ-A@mail.gmail.com
Lists: pgsql-committers

On Thu, Mar 11, 2021 at 11:25 AM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> I can understand that those two settings might interact in some way
> that is bad or unintended, but I feel like if I can't understand what
> exactly the bad interaction is after reading the commit message,
> that's probably a sign that the commit message isn't as clear as it
> could be.

You have to think about it from first principles, I suppose.

What was the point of vacuum_cleanup_index_scale_factor in the first
place, back when it was committed to Postgres 11? It was something
like "have btvacuumcleanup() make sure that pg_class.reltuples gets
updated when most full scans during nbtree VACUUM get skipped". But
that's fundamentally not the responsibility of the index AM. As a
counterpoint, hash indexes simply return NULL when there is a
cleanup-only VACUUM (i.e. no hashbulkdelete() call for the VACUUM
operation), without worrying about pg_class being updated for the
index entry on any particular timeline. Because why would an index AM
ever need to worry about that? Index AMs are not supposed to concern
themselves with that in the case where the index hasn't been modified,
per the amvacuumcleanup() sgml docs.
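
To make the hash AM behavior concrete, here is a minimal standalone sketch of the pattern, not the actual hash AM source. The struct and function names (SketchBulkDeleteResult, sketch_vacuumcleanup) are hypothetical stand-ins for the real index AM types, and the single modeled field is an assumption made for illustration:

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for the stats struct that an index AM hands back
 * to VACUUM.  Only one field is modeled here.
 */
typedef struct SketchBulkDeleteResult
{
	double		num_pages;		/* index size in pages, as last measured */
} SketchBulkDeleteResult;

/*
 * Sketch of a cleanup callback in the style described above.
 *
 * stats == NULL means that no bulk delete call happened during this
 * VACUUM, i.e. it is a cleanup-only VACUUM.  In that case the callback
 * returns NULL and makes no attempt to refresh any statistics.
 */
SketchBulkDeleteResult *
sketch_vacuumcleanup(SketchBulkDeleteResult *stats, double current_pages)
{
	if (stats == NULL)
		return NULL;			/* index not modified: nothing to report */

	/* A bulk delete ran earlier, so finish filling in the stats. */
	stats->num_pages = current_pages;
	return stats;
}
```

The point of the sketch is only that the NULL return is cheap: the index is never read in the cleanup-only case.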

This confusion over which component is responsible for maintaining
pg_class.reltuples makes vacuum_cleanup_index_scale_factor-driven
autovacuums uselessly perform full scans of all indexes on an
append-only table. In my judgement that was a bug that needed to be
fixed. Certainly, disabling a GUC in a stable release branch was an
unorthodox approach, but I concluded that it was the lesser evil here.
Scanning the indexes during a VACUUM driven by
vacuum_cleanup_index_scale_factor cannot be dismissed as the cost of
setting the VM bits for the end of the heap relation, because it has
no fixed relationship to what has changed -- it generally makes zero
sense to scan the indexes here.
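
As a rough illustration of why the trigger has no fixed relationship to index changes, the sketch below models the kind of check the GUC drives. It is a simplification under stated assumptions, not the nbtree code: the function name sketch_needs_cleanup_scan is hypothetical, and the exact form of the comparison (relative growth in heap tuples since the last cleanup, measured against the scale factor) is an assumption about how the real check behaves.

```c
#include <stdbool.h>

/*
 * Assumed model: request a full index scan during a cleanup-only VACUUM
 * once the heap has grown by at least scale_factor (as a fraction)
 * since the previous cleanup.  Nothing here looks at the index itself.
 */
bool
sketch_needs_cleanup_scan(double prev_heap_tuples,
						  double cur_heap_tuples,
						  double scale_factor)
{
	/* No usable baseline, or a non-positive setting: always scan. */
	if (scale_factor <= 0 || prev_heap_tuples <= 0)
		return true;

	return (cur_heap_tuples - prev_heap_tuples) / prev_heap_tuples
		>= scale_factor;
}
```

Under this model, an append-only table that grows from 1,000,000 to 1,200,000 tuples with a scale factor of 0.1 gets a full scan of every index, even though no index tuple was deleted.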

--
Peter Geoghegan
