Re: Maybe we should reduce SKIP_PAGES_THRESHOLD a bit?

From: Melanie Plageman <melanieplageman(at)gmail(dot)com>
To: Tomas Vondra <tomas(at)vondra(dot)me>
Cc: Peter Geoghegan <pg(at)bowt(dot)ie>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Maybe we should reduce SKIP_PAGES_THRESHOLD a bit?
Date: 2024-12-17 19:49:46
Message-ID: CAAKRu_YD4-j5hufeVOSYh-aSydwMcX4k97AWJhHHYL9rbRyi5g@mail.gmail.com
Lists: pgsql-hackers

On Tue, Dec 17, 2024 at 1:46 PM Tomas Vondra <tomas(at)vondra(dot)me> wrote:
>
> On 12/17/24 18:06, Melanie Plageman wrote:
> > On Tue, Dec 17, 2024 at 9:11 AM Tomas Vondra <tomas(at)vondra(dot)me> wrote:
> >>
> >>
> >>
> >> On 12/16/24 19:49, Melanie Plageman wrote:
> >>
> >>> No, I'm talking about the behavior of causing small pockets of
> >>> all-frozen pages which end up being smaller than SKIP_PAGES_THRESHOLD
> >>> and are then scanned (even though they are already frozen). What I
> >>> describe in that email I cited is that because we freeze
> >>> opportunistically when we have or will emit an FPI, and bgwriter will
> >>> write out blocks in clocksweep order, we end up with random pockets of
> >>> pages getting frozen during/after a checkpoint. Then in the next
> >>> vacuum, we end up scanning those all-frozen pages again because the
> >>> ranges of frozen pages are smaller than SKIP_PAGES_THRESHOLD. This is
> >>> mostly going to happen for an insert-only workload. I'm not saying
> >>> freezing the pages is bad, I'm saying that causing these pockets of
> >>> frozen pages leads to scanning all-frozen pages on future vacuums.
> >>>
> >>
> >> Yeah, this interaction between the components is not great :-( But can
> >> we think of a way to reduce the fragmentation? What would need to change?
> >
> > Well, reducing SKIP_PAGES_THRESHOLD would help.
>
> How does SKIP_PAGES_THRESHOLD change the fragmentation? I think that's
> the side that's affected by the fragmentation, but it's really due to
> the eager freezing / bgwriter evictions, etc. If the threshold is set to
> 1 (i.e. to always skip), that just lowers the impact, but the relation
> is still as fragmented as before, no?

Yep, exactly. It doesn't help with the fragmentation; it just keeps us
from scanning all-frozen pages. The question is whether the
fragmentation matters on its own. I think it would be better if we
didn't have it -- if the pages that still need scanning formed one
contiguous run instead of being interleaved with small frozen pockets,
we could potentially do larger reads (most likely once vacuum uses the
read stream API).
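
For reference, the skip decision boils down to roughly this (a
simplified sketch loosely based on heap_vac_scan_next_block() in
vacuumlazy.c, not the verbatim code; the helper name is made up):

#include "postgres.h"
#include "storage/block.h"

#define SKIP_PAGES_THRESHOLD	((BlockNumber) 32)

/*
 * A run of skippable (all-visible) blocks is only actually skipped when
 * it is at least SKIP_PAGES_THRESHOLD blocks long, so small all-frozen
 * pockets keep getting read even though vacuum has nothing to do there.
 */
static bool
skip_this_run(BlockNumber next_block, BlockNumber next_unskippable_block)
{
	/* length of the all-visible run starting at next_block */
	BlockNumber run_len = next_unskippable_block - next_block;

	/*
	 * Lowering the threshold (or treating it as 1, i.e. always skip)
	 * avoids rescanning tiny frozen pockets, but does nothing about
	 * the fragmentation itself.
	 */
	return run_len >= SKIP_PAGES_THRESHOLD;
}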

> > And unfortunately we do
> > not know if the skippable pages are all-frozen without extra
> > visibilitymap_get_status() calls -- so we can't decide to avoid
> > scanning ranges of skippable pages because they are frozen.
> >
>
> I may be missing something, but doesn't find_next_unskippable_block()
> already get those bits? In fact, it even checks VISIBILITYMAP_ALL_FROZEN
> but only for aggressive vacuum. But even if that wasn't the case, isn't
> checking the VM likely much cheaper than vacuuming the heap page?

find_next_unskippable_block() has them, but back in
heap_vac_scan_next_block(), where we apply SKIP_PAGES_THRESHOLD to
decide whether or not to skip the range, we only know that the pages in
the range are all-visible (otherwise they wouldn't be skippable) -- we
no longer know which of them were all-frozen.
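
Roughly what I mean (a rough paraphrase of the
find_next_unskippable_block() loop, not the verbatim vacuumlazy.c code;
the helper name is made up):

#include "postgres.h"
#include "access/visibilitymap.h"
#include "utils/rel.h"

/*
 * The ALL_FROZEN bit is looked at per block, but only to decide whether
 * an aggressive vacuum may keep skipping; by the time the caller
 * applies SKIP_PAGES_THRESHOLD, all it remembers about the run is
 * "all-visible".
 */
static BlockNumber
next_unskippable_block_sketch(Relation rel, BlockNumber start,
							  BlockNumber nblocks, bool aggressive,
							  Buffer *vmbuffer)
{
	BlockNumber blkno;

	for (blkno = start; blkno < nblocks; blkno++)
	{
		uint8		mapbits = visibilitymap_get_status(rel, blkno, vmbuffer);

		if ((mapbits & VISIBILITYMAP_ALL_VISIBLE) == 0)
			break;				/* not even all-visible: must scan it */

		if (aggressive && (mapbits & VISIBILITYMAP_ALL_FROZEN) == 0)
			break;				/* aggressive vacuum can't skip this one */

		/* per-block frozen-ness is not retained past this point */
	}

	return blkno;
}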

> >> Maybe the freezing code could check how many of the nearby pages are
> >> frozen, and consider that together with the FPI write?
> >
> > That's an interesting idea. We wouldn't have any guaranteed info
> > because we only have a lock on the page we are considering freezing.
> > But we could keep track of the length of a run of pages we are
> > freezing and opportunistically freeze pages that don't require
> > freezing if they follow one or more pages requiring freezing.
>
> I don't think we need "guaranteed" information - a heuristic that's
> correct most of the time (say, >90%?) ought to be good enough. I mean,
> it has to be, because we'll never get a rule that's correct 100% of
> the time. So even just looking at a batch of pages in the VM should be
> enough, no?
>
> > But I don't know how much more this buys us than removing
> > SKIP_PAGES_THRESHOLD. Since it would "fix" the fragmentation, perhaps
> > it makes larger future vacuum reads possible. But I wonder how much
> > benefit it would be vs complexity.
> >
>
> I think that depends on which cost we're talking about. If we only talk
> about the efficiency of a single vacuum, then it probably does not help
> very much. I mean, if we assume the relation is already fragmented, then
> it seems to be cheaper to vacuum just the pages that need it (as if with
> SKIP_PAGES_THRESHOLD=1).
>
> But if we're talking about long-term benefits, in reducing the amount
> of freezing needed overall, maybe it'd be a win? I don't know.

Yeah, it just depends on whether the pages we freeze for this reason
are likely to stay frozen.

I think I misspoke in saying we want to freeze pages next to pages
requiring freezing. What we really want is to freeze pages next to
pages that are being opportunistically frozen -- those are the ones
creating the fragmentation. But then where do you draw the line? You
won't know you are creating lots of random holes until after you've
skipped opportunistically freezing some pages -- and by then it's too
late.
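
To make the "check nearby pages" idea concrete, it might look roughly
like this (entirely hypothetical sketch -- the constant and the helper
are made up, only visibilitymap_get_status() and
VISIBILITYMAP_ALL_FROZEN are real APIs):

#include "postgres.h"
#include "access/visibilitymap.h"
#include "utils/rel.h"

#define FREEZE_NEIGHBORHOOD 8	/* hypothetical batch size */

/*
 * Count how many of the preceding blocks are already all-frozen in the
 * VM before opportunistically freezing this one.  Note the limitation
 * discussed above: this only looks backwards at pages already frozen;
 * whether skipping the current page ends up leaving a small hole also
 * depends on what happens to the pages ahead of us.
 */
static int
count_frozen_neighbors(Relation rel, BlockNumber blkno, Buffer *vmbuffer)
{
	BlockNumber start = (blkno > FREEZE_NEIGHBORHOOD) ?
		blkno - FREEZE_NEIGHBORHOOD : 0;
	BlockNumber b;
	int			nfrozen = 0;

	for (b = start; b < blkno; b++)
	{
		if (visibilitymap_get_status(rel, b, vmbuffer) &
			VISIBILITYMAP_ALL_FROZEN)
			nfrozen++;
	}

	return nfrozen;
}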

- Melanie
