Re: Maybe we should reduce SKIP_PAGES_THRESHOLD a bit?

From: Melanie Plageman <melanieplageman(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Tomas Vondra <tomas(at)vondra(dot)me>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Maybe we should reduce SKIP_PAGES_THRESHOLD a bit?
Date: 2024-12-16 18:49:47
Message-ID: CAAKRu_ZsHEe4LBEk2ka_qLZ32f6TMMaaJooGiw0QoqjSPy9bgg@mail.gmail.com
Lists: pgsql-hackers

On Mon, Dec 16, 2024 at 12:32 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
>
> On Mon, Dec 16, 2024 at 10:37 AM Melanie Plageman
> <melanieplageman(at)gmail(dot)com> wrote:
> > On a related note, the other day I noticed another negative effect
> > caused in part by SKIP_PAGES_THRESHOLD. SKIP_PAGES_THRESHOLD interacts
> > with the opportunistic freeze heuristic [1] causing lots of all-frozen
> > pages to be scanned when checksums are enabled. You can easily end up
> > with a table that has very fragmented ranges of frozen, all-visible,
> > and modified pages. In this case, the opportunistic freeze heuristic
> > bears most of the blame.
>
> Bears most of the blame for what? Significantly reducing the total
> amount of WAL written?

No, I'm talking about the behavior of causing small pockets of
all-frozen pages which end up being smaller than SKIP_PAGES_THRESHOLD
and are then scanned (even though they are already frozen). What I
describe in that email I cited is that because we freeze
opportunistically when we have emitted or will emit an FPI, and
bgwriter writes out blocks in clocksweep order, we end up with random
pockets of pages getting frozen during/after a checkpoint. Then in the next
vacuum, we end up scanning those all-frozen pages again because the
ranges of frozen pages are smaller than SKIP_PAGES_THRESHOLD. This is
mostly going to happen for an insert-only workload. I'm not saying
freezing the pages is bad, I'm saying that causing these pockets of
frozen pages leads to scanning all-frozen pages on future vacuums.
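
To make the effect concrete, here is a toy model of the skipping rule
(not the actual vacuumlazy.c code). It assumes a run of skippable pages
is only skipped when it is at least SKIP_PAGES_THRESHOLD (32) blocks
long, and it ignores details such as the last page always being
scanned. The function name and the example layouts are made up for
illustration.

```python
SKIP_PAGES_THRESHOLD = 32  # value of the constant in vacuumlazy.c


def pages_scanned(skippable, threshold=SKIP_PAGES_THRESHOLD):
    """Count heap pages a simplified vacuum would read.

    skippable[i] is True when the visibility map says page i could be
    skipped (all-frozen, or all-visible in a non-aggressive vacuum).
    Hypothetical helper: a model of the rule, not PostgreSQL code.
    """
    scanned = 0
    i = 0
    n = len(skippable)
    while i < n:
        if not skippable[i]:
            # Modified page: always scanned.
            scanned += 1
            i += 1
            continue
        # Measure the run of consecutive skippable pages starting at i.
        j = i
        while j < n and skippable[j]:
            j += 1
        run = j - i
        # Runs shorter than the threshold are scanned anyway.
        if run < threshold:
            scanned += run
        i = j
    return scanned


# Pockets of 8 frozen pages separated by single modified pages.
fragmented = ([True] * 8 + [False]) * 100
# Same page counts, but the frozen pages form one contiguous range.
contiguous = [True] * 800 + [False] * 100

print(pages_scanned(fragmented))  # every page is read
print(pages_scanned(contiguous))  # only the modified pages are read
```

In this model both tables have 800 frozen and 100 modified pages, but
the fragmented layout reads all 900 pages while the contiguous one
reads only the 100 modified pages.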

> > However, we are not close to coming up with a
> > replacement heuristic, so removing SKIP_PAGES_THRESHOLD would help.
> > This wouldn't have affected your results, but it is worth considering
> > more generally.
>
> One of the reasons why we have SKIP_PAGES_THRESHOLD is that it makes
> it more likely that non-aggressive VACUUMs will advance relfrozenxid.
> Granted, it's probably not doing a particularly good job at that right
> now. But any effort to replace it should account for that.
>
> This is possible by making VACUUM consider the cost of scanning extra
> heap pages up-front. If the number of "extra heap pages to be scanned"
> to advance relfrozenxid happens to not be very high (or not so high
> *relative to the current age(relfrozenxid)*), then pay that cost now,
> in the current VACUUM operation. Even if age(relfrozenxid) is pretty
> far from the threshold for aggressive mode, if the added cost of
> advancing relfrozenxid is still not too high, why wouldn't we just do
> it?

That's an interesting idea. And it seems like a much more effective
way of getting some relfrozenxid advancement than hoping that the
pages you scan due to SKIP_PAGES_THRESHOLD end up being enough to have
scanned all unfrozen tuples.
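
A rough sketch of what such an up-front decision might look like,
purely as an illustration. Nothing like this exists in the tree: the
function name, the base_fraction knob, and the linear scaling with age
are all assumptions, and 150 million is used only as a stand-in for the
point at which a vacuum would become aggressive.

```python
def worth_advancing_relfrozenxid(extra_pages, rel_pages, xid_age,
                                 aggressive_age=150_000_000,
                                 base_fraction=0.05):
    """Decide whether to scan the extra pages needed to advance relfrozenxid.

    Hypothetical heuristic: the budget of extra pages we are willing to
    scan grows with age(relfrozenxid) relative to the aggressive-mode
    threshold. All parameters and defaults are assumptions.
    """
    if rel_pages == 0:
        return True
    # How close the table is to needing an aggressive vacuum, capped at 1.
    age_ratio = min(xid_age / aggressive_age, 1.0)
    # Always allow a small budget; allow the whole table at the threshold.
    budget_fraction = base_fraction + (1.0 - base_fraction) * age_ratio
    return extra_pages <= budget_fraction * rel_pages


# Young table, few extra pages: cheap, so do it now.
print(worth_advancing_relfrozenxid(10, 1000, xid_age=0))
# Young table, many extra pages: not worth it yet.
print(worth_advancing_relfrozenxid(400, 1000, xid_age=15_000_000))
# Same cost, but the table is halfway to the threshold: pay it now.
print(worth_advancing_relfrozenxid(400, 1000, xid_age=75_000_000))
```

The point is only that the cost of the extra pages is compared against
age(relfrozenxid) before the scan starts, rather than relying on
whatever pages SKIP_PAGES_THRESHOLD happens to pull in.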

> I think that aggressive mode is a bad idea more generally. The
> behavior around waiting for a cleanup lock (the second
> aggressive-mode-influenced behavior) is also a lot more brittle than
> it really needs to be, simply because we're not weighing costs and
> benefits. There's a bunch of relevant information that could be
> applied when deciding what to do (at the level of each individual heap
> page that cannot be cleanup locked right away), but we make no effort
> to apply that information -- we only care about the static choice of
> aggressive vs. non-aggressive there.

What kind of information? Could you say more?

Andres mentioned the other day that we could set pages all-visible in
the VM even if we don't get the cleanup lock (the lazy_scan_noprune()
case). That seems like a good idea.

- Melanie
