Re: Count and log pages set all-frozen by vacuum

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Melanie Plageman <melanieplageman(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Nitin Jadhav <nitinjadhavpostgres(at)gmail(dot)com>, Alastair Turner <minion(at)decodable(dot)me>, Peter Geoghegan <pg(at)bowt(dot)ie>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: Count and log pages set all-frozen by vacuum
Date: 2024-12-11 19:18:15
Message-ID: CAD21AoADuzE+Mf=E642KS=ubodW_NeR4=rUv0DCy4p28kCfFng@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Dec 5, 2024 at 4:32 PM Melanie Plageman
<melanieplageman(at)gmail(dot)com> wrote:
>
> On Mon, Dec 2, 2024 at 9:28 AM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> >
> > On Fri, Nov 29, 2024 at 2:46 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > > Finally, in case where we have:
> > >
> > > visibility map: 10000 pages set all-visible, 10000 pages set all-frozen.
> > >
> > > We can understand that 10000 pages newly became all-frozen, but have
> > > no idea how many pages became all-visible but not all-frozen. It could
> > > be even 0. Users might want to know it to understand how (auto)vacuum
> > > and freezing are working well.
> >
> > I agree that this is possible, but I am not clear why the user should
> > care. In scenario #1, the same pages were set all-visible and also
> > all-frozen. In scenario #2, one set of pages were set all-visible and
> > a different set of pages were set all-frozen. Scenario #2 is a little
> > worse, for a couple of reasons. First, in scenario #2, more pages were
> > dirtied to achieve the same result. However, if the user is concerned
> > about the amount of I/O that vacuum is doing, they will more likely
> > look at the "buffer usage" and "WAL usage" lines in the VACUUM verbose
> > output rather than at the "visibility map" line. Second, in scenario
> > #2, we have deferred some work to the future and there is a risk that
> > the pages will remain all-visible but not all-frozen until the next
> > aggressive vacuum. I think it is possible someone could want to see
> > more detailed information for this reason.
> >
> > However, I expect that it will be unimportant in a practical sense of
> > Melanie's work in this area gets committed, because her goal is to set
> > things up so that we'll revisit all-visible but not all-frozen pages
> > sooner, so that the work doesn't build up so much prior to the next
> > aggressive vacuum. And I think that work will have its own logging
> > that will make it clear what it is doing, hence I don't foresee the
> > need for more detail here.
> >
> > All that said, if you really want this broken out into three
> > categories rather than two, I'm not overwhelmingly opposed to that. It
> > is always possible that more detail could be useful; it is just
> > difficult to decide where the likelihood is high enough to justify
> > adding more output.

I agree that it will be unimportant from Melanie's work in this area.
Also, I agree that if semi-aggressive vacuum has its own new logging
message about what it's done, this new message doesn't necessarily
need to be detailed. But looking at the proposed patches, there seems
to be no such new logging message so far. Showing three categories
makes sense to me independent of semi-aggressive vacuum work. If we
figure out it's better to merge some parts of this new message to
semi-aggressive vacuum's logs, we can adjust them later.

> Hmm. So at the risk of producing two loaves that are worse than none,
> I've decided I like the "everything" approach:
>
> visibility map: 5 pages newly set all-visible, of which 2 set
> all-frozen. 2 all-visible pages newly set all-frozen.
>
> With this I can clearly get any of the information I might want: the
> number of pages that changed state, the total number of pages set
> all-visible or all-frozen, and the total number of vmbits set. It
> makes the all-visible but not all-frozen debt clear. Without
> specifying how many of the pages it set all-frozen were all-visible,
> you don't really get that. I originally envisioned this log message as
> a way to know which vmbits were set. But it is kind of nice to know
> which pages changed state too.
>
> With three counters, the code and comment repetition is a bit trying,
> but I don't think it is quite enough to justify a "log_vmbits_set()"
> function.
>
> Here's an example to exercise the new log message:
>
> create table foo (a int, b int) with (autovacuum_enabled = false);
> insert into foo select generate_series(1,1000), 1;
> delete from foo where a > 500;
> vacuum (verbose) foo;
> -- visibility map: 5 pages newly set all-visible, of which 2 set
> all-frozen. 0 all-visible pages newly set all-frozen.
> insert into foo select generate_series(1,500), 1;
> vacuum (verbose, freeze) foo;
> --visibility map: 3 pages newly set all-visible, of which 3 set
> all-frozen. 2 all-visible pages newly set all-frozen.

The output makes sense to me. The patch mostly looks good to me. Here
is one minor comment:

if ((old_vmbits & VISIBILITYMAP_ALL_VISIBLE) == 0 &&
(old_vmbits & VISIBILITYMAP_ALL_FROZEN) == 0)

Maybe it would be better to rewrite it to:

if ((old_vmbits & (VISIBILITYMAP_ALL_VISIBLE | VISIBILITYMAP_ALL_FROZEN)) == 0)

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Masahiko Sawada 2024-12-11 19:20:58 Re: Unmark gen_random_uuid() function leakproof
Previous Message Peter Geoghegan 2024-12-11 19:13:23 Re: Adding skip scan (including MDAM style range skip scan) to nbtree