Re: Maybe we should reduce SKIP_PAGES_THRESHOLD a bit?

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Melanie Plageman <melanieplageman(at)gmail(dot)com>
Cc: Tomas Vondra <tomas(at)vondra(dot)me>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Maybe we should reduce SKIP_PAGES_THRESHOLD a bit?
Date: 2024-12-17 19:34:35
Message-ID: CAH2-WzkUyZ+HWmAbnee3rL6w9mnwW5aSmFkK+G5dA1n_uhwO_A@mail.gmail.com
Lists: pgsql-hackers

On Tue, Dec 17, 2024 at 11:44 AM Melanie Plageman
<melanieplageman(at)gmail(dot)com> wrote:
> I did look at the wiki page a bit. But one thing I didn't quite grasp
> is how you are proposing to measure the costs/benefits of scanning all
> all-visible pages. When you first mentioned this, I imagined you would
> use visibilitymap_count() at the beginning of the vacuum and consider
> scanning all the all-visible pages if there aren't many (when compared
> to the total number of pages needing scanning).

FWIW, the way that it actually worked in my abandoned patch set was
that VACUUM would always work off of a copy of the VM, taken at the
start -- a "VM snapshot" (I had an ambition that this VM snapshot
concept would eventually be used to make VACUUM suspendable, since it
was easy to serialize to disk in a temp file).

Working off a VM snapshot gave VACUUM a fixed idea about which pages
it needed to scan -- the cost model worked off of a count (of
all-visible vs all-frozen pages) gathered when the VM snapshot was
first established. There could never be any repeat VM accesses, since
VACUUM just worked off of the snapshot for everything VM related other
than setting VM bits.
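The scheme above can be modeled with a toy sketch (these names are
made up for illustration -- this is not PostgreSQL's actual
visibilitymap.c interface): the all-visible/all-frozen state is
copied once up front, the counts that drive the cost model come from
that copy, and the live VM is never read again except to set bits.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Toy "VM snapshot": private bitmap copies plus counts taken at snapshot time. */
typedef struct VMSnapshot
{
    size_t      nbytes;         /* bitmap size: one bit per heap page */
    uint8_t    *all_visible;    /* private copy of the all-visible bits */
    uint8_t    *all_frozen;     /* private copy of the all-frozen bits */
    uint64_t    nall_visible;   /* counts gathered when snapshot was taken */
    uint64_t    nall_frozen;
} VMSnapshot;

static uint64_t
popcount_bitmap(const uint8_t *bitmap, size_t nbytes)
{
    uint64_t    count = 0;

    for (size_t i = 0; i < nbytes; i++)
        for (uint8_t b = bitmap[i]; b != 0; b &= (uint8_t) (b - 1))
            count++;            /* clears the lowest set bit each iteration */
    return count;
}

/*
 * Copy the live VM bitmaps once, at the start of VACUUM.  Everything
 * VM-related afterwards (other than setting bits) reads the copy.
 */
static VMSnapshot *
vm_snapshot_take(const uint8_t *vm_visible, const uint8_t *vm_frozen,
                 size_t nbytes)
{
    VMSnapshot *snap = malloc(sizeof(VMSnapshot));

    snap->nbytes = nbytes;
    snap->all_visible = malloc(nbytes);
    snap->all_frozen = malloc(nbytes);
    memcpy(snap->all_visible, vm_visible, nbytes);
    memcpy(snap->all_frozen, vm_frozen, nbytes);
    snap->nall_visible = popcount_bitmap(snap->all_visible, nbytes);
    snap->nall_frozen = popcount_bitmap(snap->all_frozen, nbytes);
    return snap;
}
```

Because the counts are fixed at snapshot time, later changes to the
live VM cannot perturb the cost model's inputs mid-VACUUM.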

I'm not saying that you should do this exact thing. I'm just providing context.

> But then, I'm not sure
> I see how that lets you advance relfrozenxid more often. It seems like
> the all-visible pages you would scan this way would be younger and
> less likely to be required to freeze (per freeze limit), so you'd end
> up just uselessly scanning them.

I think that it would make sense to have a low "extra pages to scan"
threshold of perhaps 5% of rel_pages, so that we automatically scan
those pages regardless of the current value of age(relfrozenxid). It
could be helpful with advancing relminmxid, in particular.

There is a need to work out how that varies as age(relfrozenxid) gets
closer to the cutoff for anti-wraparound autovacuum. The way I did
that in the old patch set involved ramping up from 5% of rel_pages
once age(relfrozenxid) reached the half-way point (half-way to
requiring an anti-wraparound autovacuum). Maybe these heuristics
aren't the best, but they seem like roughly the right idea to me. A
more conservative version might keep the flat 5% threshold while
skipping the ramp-up behavior entirely (i.e. never increase the
threshold based on the current age(relfrozenxid)).
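The flat-threshold-plus-ramp idea might look roughly like this; every
name here is hypothetical (none of this exists in PostgreSQL), and
the linear ramp is just one plausible shape for the curve:

```c
#include <stdint.h>

/*
 * Hypothetical sketch: scan up to 5% of rel_pages' worth of extra
 * all-visible pages unconditionally, then ramp that share up linearly
 * once age(relfrozenxid) passes the halfway point on the way to the
 * anti-wraparound cutoff.
 */
static uint64_t
extra_scan_pages(uint64_t rel_pages, uint64_t relfrozenxid_age,
                 uint64_t antiwrap_cutoff)
{
    double      share = 0.05;   /* baseline: 5% of rel_pages */
    uint64_t    halfway = antiwrap_cutoff / 2;

    if (relfrozenxid_age >= antiwrap_cutoff)
        return rel_pages;       /* at the cutoff: scan all eligible pages */

    if (relfrozenxid_age > halfway)
    {
        /* ramp linearly from 5% at the halfway point to 100% at the cutoff */
        double frac = (double) (relfrozenxid_age - halfway) /
                      (double) (antiwrap_cutoff - halfway);

        share = 0.05 + frac * 0.95;
    }

    return (uint64_t) (share * (double) rel_pages);
}
```

The conservative variant mentioned above would simply drop the ramp
and always return the 5% figure.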

> > Why should we necessarily have to advance relfrozenxid exactly up to
> > FreezeLimit during every aggressive VACUUM? Surely the picture over
> > time and across multiple VACUUM operations is what matters most? At
> > the very least, we should have an independent XID cutoff for "must
> > advance relfrozenxid up to here, no matter what" -- we should just
> > reuse FreezeLimit to control that behavior. We might very well "try
> > quite hard to advance relfrozenxid to a value >= FreezeLimit" -- we
> > just don't have to do it no matter what the cost is. There is a huge
> > practical difference between "try quite hard" (e.g., retry the cleanup
> > lock acquisition 3 times, with a sleep between each) and "try
> > infinitely hard" (i.e., wait for a cleanup lock indefinitely).
>
> I got a bit confused here. Do you mean that because we call
> lazy_scan_noprune() and visit tuples this way, we can still advance
> the relfrozenxid to the oldest unfrozen xid value just based on what
> we see in lazy_scan_noprune() (i.e. even if we don't get the cleanup
> lock)?

What I meant is that the high-level rule that says that aggressive
VACUUM must advance relfrozenxid to a value >= FreezeLimit is fairly
arbitrary, and not very principled. It's an artefact of the way that
FreezeLimit used to work -- nothing more.

It does make sense to have a rule roughly like that, of course, but I
think that it should be much looser. VACUUM could be more conservative
about advancing relfrozenxid on average (i.e. it could advance
relfrozenxid to a value more recent than FreezeLimit, and advance
relfrozenxid in more individual VACUUM operations), while at the same
time not making an absolute iron-clad guarantee about how much
relfrozenxid has to be advanced within any given VACUUM. In short,
VACUUM would promise less but deliver more.

A part of that might be to teach lazy_scan_noprune to "try a little
harder" when it made sense to wait a little while (but not forever)
for a cleanup lock. Alternatively, as discussed with Robert today, it
might be possible to freeze (and "prune") without requiring a cleanup
lock in a way that was sufficient to always be able to advance
relfrozenxid without hindrance from cursors with conflicting buffer
pins and whatnot.
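The "try quite hard, not infinitely hard" idea -- retry the cleanup
lock acquisition a bounded number of times with a sleep in between,
rather than waiting indefinitely -- can be sketched like so. The
function names and the simulated lock are invented for illustration;
this is not PostgreSQL code:

```c
#include <stdbool.h>
#include <stddef.h>

#define CLEANUP_LOCK_RETRIES 3

/*
 * Attempt a conditional cleanup-lock acquisition up to
 * CLEANUP_LOCK_RETRIES times, sleeping between attempts.  Returning
 * false means giving up and falling back to the no-cleanup-lock path
 * (the lazy_scan_noprune-style processing), rather than blocking
 * behind a conflicting buffer pin forever.
 */
static bool
acquire_cleanup_lock_with_retries(bool (*try_lock)(void),
                                  void (*sleep_between)(void))
{
    for (int attempt = 1; attempt <= CLEANUP_LOCK_RETRIES; attempt++)
    {
        if (try_lock())
            return true;        /* got the cleanup lock: prune and freeze */
        if (attempt < CLEANUP_LOCK_RETRIES && sleep_between != NULL)
            sleep_between();    /* brief pause, then retry */
    }
    return false;               /* give up: take the noprune path instead */
}

/* Simulated conditional lock attempt: succeeds once a pin count drains. */
static int simulated_pin_count = 2;

static bool
try_cleanup_lock_simulated(void)
{
    if (simulated_pin_count > 0)
    {
        simulated_pin_count--;  /* a conflicting pin is still held */
        return false;
    }
    return true;
}
```

The freeze-without-cleanup-lock alternative discussed with Robert
would make even the give-up case harmless to relfrozenxid advancement.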

--
Peter Geoghegan
