Re: Maybe we should reduce SKIP_PAGES_THRESHOLD a bit?

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Melanie Plageman <melanieplageman(at)gmail(dot)com>
Cc: Tomas Vondra <tomas(at)vondra(dot)me>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Maybe we should reduce SKIP_PAGES_THRESHOLD a bit?
Date: 2024-12-16 17:31:47
Message-ID: CAH2-Wzky0C7v1i2TGkPQEaVUrZNhwdZn9nnOtNnzNceZkLjY3Q@mail.gmail.com
Lists: pgsql-hackers

On Mon, Dec 16, 2024 at 10:37 AM Melanie Plageman
<melanieplageman(at)gmail(dot)com> wrote:
> On a related note, the other day I noticed another negative effect
> caused in part by SKIP_PAGES_THRESHOLD. SKIP_PAGES_THRESHOLD interacts
> with the opportunistic freeze heuristic [1] causing lots of all-frozen
> pages to be scanned when checksums are enabled. You can easily end up
> with a table that has very fragmented ranges of frozen, all-visible,
> and modified pages. In this case, the opportunistic freeze heuristic
> bears most of the blame.

Bears most of the blame for what? Significantly reducing the total
amount of WAL written?

> However, we are not close to coming up with a
> replacement heuristic, so removing SKIP_PAGES_THRESHOLD would help.
> This wouldn't have affected your results, but it is worth considering
> more generally.

One of the reasons why we have SKIP_PAGES_THRESHOLD is that it makes
it more likely that non-aggressive VACUUMs will advance relfrozenxid.
Granted, it's probably not doing a particularly good job at that right
now. But any effort to replace it should account for that.

Accounting for that is possible by making VACUUM consider the cost of
scanning extra heap pages up-front. If the number of "extra heap pages
to be scanned" to advance relfrozenxid happens to not be very high (or
not so high *relative to the current age(relfrozenxid)*), then pay that
cost now,
in the current VACUUM operation. Even if age(relfrozenxid) is pretty
far from the threshold for aggressive mode, if the added cost of
advancing relfrozenxid is still not too high, why wouldn't we just do
it?
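
To make that concrete, here is a rough sketch of the kind of up-front
check I have in mind. Everything in it is hypothetical: the struct, the
function name, the 5% floor, and the linear budget are invented for
illustration and do not exist in vacuumlazy.c. The only point is that
the acceptable number of extra pages grows as age(relfrozenxid)
approaches the aggressive-mode cutoff (e.g. vacuum_freeze_table_age).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical inputs; none of these names exist in PostgreSQL. */
typedef struct FreezeAdvanceCost
{
    uint64_t extra_pages;      /* skippable pages we'd have to scan anyway */
    uint64_t rel_pages;        /* total heap pages in the table */
    uint32_t relfrozenxid_age; /* current age(relfrozenxid) */
    uint32_t aggressive_age;   /* age at which VACUUM turns aggressive */
} FreezeAdvanceCost;

/*
 * Decide whether this VACUUM should pay for the extra page scans needed
 * to advance relfrozenxid.  The budget formula is an assumption: 5% of
 * the table at age 0, rising linearly to 100% at the aggressive cutoff.
 */
static bool
should_advance_relfrozenxid(const FreezeAdvanceCost *c)
{
    double extra_frac;
    double age_frac;
    double budget;

    /* Nothing extra to scan: advancing relfrozenxid is free. */
    if (c->rel_pages == 0 || c->extra_pages == 0)
        return true;

    /* Already at or past the cutoff: we have to do it regardless. */
    if (c->aggressive_age == 0 ||
        c->relfrozenxid_age >= c->aggressive_age)
        return true;

    extra_frac = (double) c->extra_pages / (double) c->rel_pages;
    age_frac = (double) c->relfrozenxid_age / (double) c->aggressive_age;

    budget = 0.05 + 0.95 * age_frac;

    return extra_frac <= budget;
}
```

With these made-up numbers, a table where 1% of pages are "extra" gets
its relfrozenxid advanced even when it is very young, while a table
where 50% of pages are "extra" only qualifies once its age is a good
part of the way to the cutoff.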

I think that aggressive mode is a bad idea more generally. The
behavior around waiting for a cleanup lock (the second
aggressive-mode-influenced behavior) is also a lot more brittle than
it really needs to be, simply because we're not weighing costs and
benefits. There's a bunch of relevant information that could be
applied when deciding what to do (at the level of each individual heap
page that cannot be cleanup locked right away), but we make no effort
to apply that information -- we only care about the static choice of
aggressive vs. non-aggressive there.

--
Peter Geoghegan
