Re: Why doesn't pgstat_report_analyze() focus on not-all-visible-page dead tuple counts, specifically?

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Why doesn't pgstat_report_analyze() focus on not-all-visible-page dead tuple counts, specifically?
Date: 2021-12-07 01:13:53
Message-ID: CAH2-Wzm20CUamNR7gUnXsGQvLmPtLqHxEj7WWXYeM=UceMw1cw@mail.gmail.com
Lists: pgsql-hackers

On Mon, Dec 6, 2021 at 2:37 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> On Mon, Dec 6, 2021 at 12:07 PM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> > So does this. If some of the table is now all-visible when it wasn't
> > before, it's certainly a good guess that the portions that still
> > aren't have about the same distribution of dead tuples that they did
> > before ... although the other direction is less clear: it seems
> > possible that newly not-all-visible pages have fewer dead tuples than
> > ones which have been not-all-visible for a while. But you have to make
> > some guess.
>
> To me, it seems natural to accept and even embrace the inherent
> uncertainty about the number of dead tuples.

> The number of dead tuples in the table is an inherently dynamic thing,
> which makes it totally dissimilar to the pg_statistics-based stats.
> And so a single snapshot of a point in time is inherently much less
> useful -- we ought to keep a few sets of old statistics within our new
> pgstat_report_analyze() -- maybe 3 or 5.

I just realized that I didn't really get around to explicitly
connecting this to your point about newly not-all-visible pages being
quite different from older ones that ANALYZE has seen -- which is
definitely an important consideration. I'll do so now:

Keeping some history makes the algorithm "less gullible" (a more
useful goal than making it "smarter", at least IMV). Suppose that our
starting point is 2 pieces of authoritative information, which are
current as of the instant we want to estimate the number of dead
tuples for VACUUM: 1. total relation size (relpages), and 2. current
not-all-visible-page count (the interesting/countable pages,
calculated by taking the complement of the visibilitymap_count()
value). Further suppose we store the same 2 pieces of information in
our ANALYZE stats, reported via pgstat_report_analyze(), alongside the
actual counts of dead tuples and live tuples found on not-all-visible
pages.

The algorithm avoids believing silly things about dead tuples by
considering the delta between each piece of information, particularly
the difference between "right now" and "the last time ANALYZE ran and
called pgstat_report_analyze()". For example, if item 1/relpages
increased by exactly the same number of blocks as item
2/not-all-visible pages (or close enough to it), that is recognized as
a pretty strong signal. The algorithm should consider the newly
not-all-visible pages as likely to have very few dead tuples. At the
same time, the algorithm should not change its beliefs about the
concentration of dead tuples in remaining, older not-all-visible
pages.
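The delta comparison described above can be sketched as a single
function. This is only a toy model of the idea (the function name and
signature are invented for illustration): growth in relpages that is
matched by growth in the not-all-visible count is treated as freshly
appended pages with essentially no dead tuples, while the remaining,
older not-all-visible pages keep the dead-tuple concentration that the
last ANALYZE actually observed.

```python
def estimate_dead_tuples(relpages_now: int, nav_now: int,
                         relpages_then: int, nav_then: int,
                         dead_per_nav_page_then: float) -> float:
    """Delta-based dead-tuple estimate (illustrative sketch only).

    'now' values are authoritative (relpages + visibility map);
    'then' values come from the last pgstat_report_analyze().
    """
    grown_pages = max(0, relpages_now - relpages_then)
    grown_nav = max(0, nav_now - nav_then)

    # The strong signal: relation growth matched (or nearly matched) by
    # not-all-visible growth. Treat that overlap as newly appended pages
    # that likely hold very few dead tuples.
    fresh = min(grown_pages, grown_nav)

    # Older not-all-visible pages: don't change our beliefs about their
    # dead-tuple concentration -- reuse the last observed density.
    older = max(0, nav_now - fresh)
    return older * dead_per_nav_page_then
```

For example, if the table grew from 100 to 150 pages while the
not-all-visible count grew from 20 to 70, all 50 new pages are treated
as fresh appends, and only the 20 older pages contribute to the
estimate.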

This kind of thing will still have problems, no doubt. But I'd much
rather err in the direction of over-counting dead tuples like this.
The impact of the problem on the workload/autovacuum is a big part of
the picture here.

Suppose we believe that not-all-visible pages have 20 LP_DEAD items on
average, and they turn out to only have 3 or 5. Theoretically we've
done the wrong thing by launching autovacuum workers sooner -- we
introduce bias. But we also have lower variance over time, which might
make it worth it. I also think that it might not really matter at all.
It's no great tragedy if we clean up and set pages all-visible in the
visibility map a little earlier on average. It might even be a
positive thing.

The way the user expresses the dead-tuple-wise threshold, via
autovacuum_vacuum_scale_factor, is already somewhat arbitrary -- it is
based on some pretty iffy assumptions. Even if we greatly overestimate
dead tuples with the new algorithm, we're only doing so under
circumstances that might have caused
autovacuum_vacuum_insert_scale_factor to launch an autovacuum worker
anyway. Just setting the visibility map bit has considerable value.
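For reference, the dead-tuple trigger under discussion has roughly
this shape (a simplified sketch; the defaults shown match the stock
GUC defaults, and the real code also handles stats-missing cases):

```python
def autovacuum_dead_tuple_trigger(dead_tuples: float, reltuples: float,
                                  base_thresh: float = 50,
                                  scale_factor: float = 0.2) -> bool:
    # Simplified form of the stock autovacuum trigger: launch a worker
    # once dead_tuples exceeds autovacuum_vacuum_threshold plus
    # autovacuum_vacuum_scale_factor times the estimated live tuples.
    return dead_tuples > base_thresh + scale_factor * reltuples
```

The point being: an overestimate of dead tuples only crosses this
threshold sooner, and autovacuum_vacuum_insert_scale_factor applies
the same shape of formula to tuples inserted since the last vacuum, so
many of these tables would have been vacuumed soon anyway.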

--
Peter Geoghegan
