Re: Eagerly scan all-visible pages to amortize aggressive vacuum

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Melanie Plageman <melanieplageman(at)gmail(dot)com>
Cc: Robert Treat <rob(at)xzilla(dot)net>, Marcos Pegoraro <marcos(at)f10(dot)com(dot)br>, Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>, Andres Freund <andres(at)anarazel(dot)de>, Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Peter Geoghegan <pg(at)bowt(dot)ie>
Subject: Re: Eagerly scan all-visible pages to amortize aggressive vacuum
Date: 2025-01-24 20:43:05
Message-ID: CA+Tgmoah-S4U1tGpVcf9P-t5sf1xu=4t+Tc4Hk0C2o07sHuPeg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Jan 24, 2025 at 3:02 PM Melanie Plageman
<melanieplageman(at)gmail(dot)com> wrote:
> This thought exercise made me realize something is wrong with my
> current patch, though. If you set the failure tolerance
> (vacuum_eager_scan_max_fails) to 0 right now, it disables eager
> scanning altogether. That might be unexpected. You would probably
> expect setting that to 0 to still allow eager scanning if it is only
> succeeding. That made me think that perhaps I should have a -1 value
> that disables eager scanning altogether and 0 just sets the failure
> tolerance to 0.

I would not expect that. I would expect setting a value of N to mean
"try until we have N failures". "Try until you have 0 failures" is
logically equivalent to "don't try".

> I think you're right. I would go with a percentage. I don't see many
> other GUCs that are percents. What would you call it? Perhaps
> vacuum_eager_scan_fail_threshold? The % of the blocks in the table
> vacuum may scan and fail to freeze.
>
> There is an issue I see with making it a percentage. The current
> default vacuum_eager_scan_max_fails is 128 out of 4096. That means you
> are allowed to scan about 3% of the blocks in the table even if you
> fail to freeze every one. I don't think there are very many reasonable
> values above 256, personally. So, it might be weird to add a
> percentage GUC value with only very low acceptable values. Part of
> this is that we are not making the success cap configurable, so that
> means that you might have lots of extra I/O if you are both failing
> and succeeding. Someone configuring this GUC might think this is
> controlling the amount of extra I/O they are incurring.

Why would it be unreasonable to set this value to 25% or even 100%? I
grant that it doesn't sound like the most prudent possible value, but
if I'm willing to fritter away my VACUUM resources to have the best
possible chance of eagerly freezing stuff, isn't that up to me? I
think setting this value to 100% would be WAY less damaging than
work_mem='1TB' or autovacuum_naptime='7d', both of which are allowed.
In fact, setting this to 100% could theoretically have no negative
consequences at all, if it so happens that no freeze failures occur.
Couldn't it even be a win, if freeze failures are common but
minimizing the impact of aggressive vacuum is of overwhelming
importance?

Looking at ConfigureNamesReal[], some existing words we use to talk
about real-valued GUCs include: cost, fraction, factor, bias,
multiplier, target, rate. Of those, I like "fraction" and "rate" best
here, because they imply a range of 0-1 rather than anything broader.
vacuum_max_eager_freeze_failure_rate? Kinda long...

> The while was meant to contrast the sentence before it about typically
> only scanning pages that have been modified since the previous vacuum.
> But, I see how that didn't work out for me. How about this:
>
> """
> <command>VACUUM</command> typically scans pages that have been
> modified since the last vacuum. However, some all-visible but not
> all-frozen pages are eagerly scanned to try and freeze them. Note that
> the <structfield>relfrozenxid</structfield> can only be advanced when
> every page of the table that might contain unfrozen XIDs is scanned.
> """

"to try and freeze them" seems a little awkward to me. The rest seems fine.

"in the hope that we can freeze them"?

That still doesn't seem great to me. Maybe it's worth expending more
text here, not sure.

> I rewrote the comment using some of your input and trying to clarify
> my initial point about opportunistic vs guaranteed freezing.
>
> """
> If FreezeLimit has not advanced past the relfrozenxid or
> MultiXactCutoff has not advanced past relminmxid, we are unlikely to
> be able to freeze anything new.
> Tuples with XIDs older than OldestXmin or MXIDs older than OldestMxact
> are technically freezable, but we won't freeze them unless some other
> criteria for opportunistic freezing is met.

It's not super clear to me what "some other criteria" means here, but
maybe it's fine.

> It is also possible than a transaction newer than the FreezeLimit has

than->that

> ended, rendering additional tuples freezable.
> As a heuristic, however, it makes sense to wait until the FreezeLimit
> or MultiXactCutoff has advanced before eager scanning.
> """
>
> > + ereport(INFO,
> > + (errmsg("Vacuum successfully froze %u eager scanned blocks of
> > \"%s.%s.%s\". Now disabling eager scanning.",
> >
> > I predict that if Tom sees this, you will get complaints about both
> > the wording of the message, which pretty clearly does not conform to
> > style guidelines for a primary error message, and also about the use
> > of the INFO level. Allow me to suggest DEBUG1 or DEBUG2 and "disabling
> > eager scanning after freezing %u eagerly scanned blocks".
>
> I've used your wording. Just for future reference, are the style
> guidelines I was violating the capitalization and punctuation? Are
> these documented somewhere? Also, what is a primary error message?
> INFO level? Or ones that use ereport and are translated? I looked at
> other messages and saw that they don't capitalize the first word or
> use punctuation, so I assume that those were problems.

Yes. Primary error messages, i.e. errmsg(), are not capitalized and
punctuated, unlike errdetail() and errhint() messages, which are.

See https://www.postgresql.org/docs/current/error-style-guide.html

INFO level is used for VERY few things. I can't tell you off the top
of my head when it's appropriate, but I think the answer is "almost
never".

--
Robert Haas
EDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2025-01-24 20:49:51 Re: BF member drongo doesn't like 035_standby_logical_decoding.pl
Previous Message Devulapalli, Raghuveer 2025-01-24 20:34:39 RE: Proposal for Updating CRC32C with AVX-512 Algorithm.