Re: Eagerly scan all-visible pages to amortize aggressive vacuum

From: Melanie Plageman <melanieplageman(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Robert Treat <rob(at)xzilla(dot)net>, Marcos Pegoraro <marcos(at)f10(dot)com(dot)br>, Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>, Andres Freund <andres(at)anarazel(dot)de>, Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Peter Geoghegan <pg(at)bowt(dot)ie>
Subject: Re: Eagerly scan all-visible pages to amortize aggressive vacuum
Date: 2025-01-27 17:44:51
Message-ID: CAAKRu_akptJf1BhCuMHktPb9_CckkPJXySs5v+JaKH4Ch67TLw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Jan 24, 2025 at 3:43 PM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> On Fri, Jan 24, 2025 at 3:02 PM Melanie Plageman
> <melanieplageman(at)gmail(dot)com> wrote:
> > I think you're right. I would go with a percentage. I don't see many
> > other GUCs that are percents. What would you call it? Perhaps
> > vacuum_eager_scan_fail_threshold? The % of the blocks in the table
> > vacuum may scan and fail to freeze.
> >
> > There is an issue I see with making it a percentage. The current
> > default vacuum_eager_scan_max_fails is 128 out of 4096. That means you
> > are allowed to scan about 3% of the blocks in the table even if you
> > fail to freeze every one. I don't think there are very many reasonable
> > values above 256, personally. So, it might be weird to add a
> > percentage GUC value with only very low acceptable values. Part of
> > this is that we are not making the success cap configurable, so that
> > means that you might have lots of extra I/O if you are both failing
> > and succeeding. Someone configuring this GUC might think this is
> > controlling the amount of extra I/O they are incurring.
>
> Why would it be unreasonable to set this value to 25% or even 100%? I
> grant that it doesn't sound like the most prudent possible value, but
> if I'm willing to fritter away my VACUUM resources to have the best
> possible chance of eagerly freezing stuff, isn't that up to me? I
> think setting this value to 100% would be WAY less damaging than
> work_mem='1TB' or autovacuum_naptime='7d', both of which are allowed.
> In fact, setting this to 100% could theoretically have no negative
> consequences at all, if it so happens that no freeze failures occur.
> Couldn't it even be a win, if freeze failures are common but
> minimizing the impact of aggressive vacuum is of overwhelming
> importance?
>
> Looking at ConfigureNamesReal[], some existing words we use to talk
> about real-valued GUCs include: cost, fraction, factor, bias,
> multiplier, target, rate. Of those, I like "fraction" and "rate" best
> here, because they imply a range of 0-1 rather than anything broader.
> vacuum_max_eager_freeze_failure_rate? Kinda long...

attached v11 uses a fraction with this name. It follows the
conventions and I find it descriptive.

Changing the configuration to a fraction does mean that quite a few of
the comments are different in this version. I think the description of
the guc in
config.sgml could use some review in particular. I tried to make it
clear that the percentage is not the maximum number of blocks that
could be eager scanned, because you may also eagerly scan blocks and
succeed.

Other note: I noticed AutoVacOpts that are floating point numbers (like
vacuum_cost_delay) are float8s but their associated GUCs (like
autovacuum_vacuum_cost_delay) are doubles. These are going to be equivalent,
but I wanted to note the inconsistency in case I was missing something.

> > """
> > <command>VACUUM</command> typically scans pages that have been
> > modified since the last vacuum. However, some all-visible but not
> > all-frozen pages are eagerly scanned to try and freeze them. Note that
> > the <structfield>relfrozenxid</structfield> can only be advanced when
> > every page of the table that might contain unfrozen XIDs is scanned.
> > """
>
> "to try and freeze them" seems a little awkward to me. The rest seems fine.
>
> "in the hope that we can freeze them"?
>
> That still doesn't seem great to me. Maybe it's worth expending more
> text here, not sure.

Given this and Treat's input from a different mail:

> Yeah, this feels like a net negative. As I see it, we're trying to
> connect three not-obviously related ideas for users who are trying to
> understand how this system works especially with regards to
> relfrozenxid advanced (based on the section of the docs we are in),
> loosely
> 1. you've probably heard vacuum mainly scans modified pages for cleanup
> 2. you might notice it also scans non-modified pages, because
> sometimes it wants to freeze stuff
> 3. you might think, if we are freezing stuff, why doesn't relfrozenxid advance?
>
> So that middle bit is trying to act as glue that pulls this all
> together. I thought the previous version was closer, with Haas's
> feedback I might go with something more like this:

I've mostly used his suggested wording in attached v11.

> > I rewrote the comment using some of your input and trying to clarify
> > my initial point about opportunistic vs guaranteed freezing.
> >
> > """
> > If FreezeLimit has not advanced past the relfrozenxid or
> > MultiXactCutoff has not advanced past relminmxid, we are unlikely to
> > be able to freeze anything new.
> > Tuples with XIDs older than OldestXmin or MXIDs older than OldestMxact
> > are technically freezable, but we won't freeze them unless some other
> > criteria for opportunistic freezing is met.
>
> It's not super clear to me what "some other criteria" means here, but
> maybe it's fine.

Given this thought and Treat's feedback below

> Yeah, I also think the "some other criteria" part seems poor, although
> tbh this all feels like a net negative to the previous wording or
> Haas's suggested change, like you're describing what the code below it
> does rather than what the intentions of the code are. For example, I
> like the "freeze horizon" language vs explicit FreezeLimit since the
> code could change even if the goal doesn't. But in lue of going back
> to that wording (modulo your 1 line explanation in your email), if I
> were to attempt to split the difference:
>
> We only want to
> enable eager scanning if we are likely to be able to freeze some of
> the pages in the relation, which is unlikely if FreezeLimit has not
> advanced past
> relfrozenxid or if MultiXactCutoff has not advanced passed
> relminmxid. Granted, there may be pages we didn't try to freeze
> before, or some previously blocking XID greater than FreezeLimit may
> have now ended (allowing for freezing), but as a heuristic we wait
> until the
> FreezeLimit advances to increase our chances of successful freezing.

I've updated the proposed text to mostly be in line with what he suggested.

> > ended, rendering additional tuples freezable.
> > As a heuristic, however, it makes sense to wait until the FreezeLimit
> > or MultiXactCutoff has advanced before eager scanning.
> > """
> >
> > > + ereport(INFO,
> > > + (errmsg("Vacuum successfully froze %u eager scanned blocks of
> > > \"%s.%s.%s\". Now disabling eager scanning.",
> > >
> > > I predict that if Tom sees this, you will get complaints about both
> > > the wording of the message, which pretty clearly does not conform to
> > > style guidelines for a primary error message, and also about the use
> > > of the INFO level. Allow me to suggest DEBUG1 or DEBUG2 and "disabling
> > > eager scanning after freezing %u eagerly scanned blocks".
> >
> > I've used your wording. Just for future reference, are the style
> > guidelines I was violating the capitalization and punctuation? Are
> > these documented somewhere? Also, what is a primary error message?
> > INFO level? Or ones that use ereport and are translated? I looked at
> > other messages and saw that they don't capitalize the first word or
> > use punctuation, so I assume that those were problems.
>
> Yes. Primary error messages, i.e. errmsg(), are not capitalized and
> punctuated, unlike errdetail() and errhint() messages, which are.
>
> See https://www.postgresql.org/docs/current/error-style-guide.html
>
> INFO level is used for VERY few things. I can't tell you off the top
> of my head when it's appropriate, but I think the answer is "almost
> never".

I have changed it to DEBUG2. I started it as INFO because the other vacuum
messages about whether or not the vacuum is an aggressive vacuum were at INFO
level.I don't know what a user might prefer. Treat said this downthread:

> Maybe, but one of the areas that INFO is used for is providing
> additional details in VACUUM VERBOSE output, and this seems like it
> would be pretty useful information to have if you are trying to
> discern changes in i/o rate during a vacuum, or trying to tune the
> failure rate setting, or several other related fields (including
> automated capture by tools like pganalyze), so I believe INFO is the
> right choice for this.

I'm happy to go either way. I don't want users mad about verbosity, but if they
think it's helpful, then that seems good.

- Melanie

Attachment Content-Type Size
v11-0001-Eagerly-scan-all-visible-pages-to-amortize-aggre.patch application/octet-stream 41.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Masahiko Sawada 2025-01-27 17:52:16 Re: Eagerly scan all-visible pages to amortize aggressive vacuum
Previous Message Masahiko Sawada 2025-01-27 17:30:51 Re: Skip collecting decoded changes of already-aborted transactions