Re: Maybe we should reduce SKIP_PAGES_THRESHOLD a bit?

From: Tomas Vondra <tomas(at)vondra(dot)me>
To: Melanie Plageman <melanieplageman(at)gmail(dot)com>
Cc: Peter Geoghegan <pg(at)bowt(dot)ie>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Maybe we should reduce SKIP_PAGES_THRESHOLD a bit?
Date: 2024-12-17 18:46:39
Message-ID: 02af4880-fa6c-446c-bfdd-6d78d9903986@vondra.me
Lists: pgsql-hackers

On 12/17/24 18:06, Melanie Plageman wrote:
> On Tue, Dec 17, 2024 at 9:11 AM Tomas Vondra <tomas(at)vondra(dot)me> wrote:
>>
>>
>>
>> On 12/16/24 19:49, Melanie Plageman wrote:
>>
>>> No, I'm talking about the behavior of causing small pockets of
>>> all-frozen pages which end up being smaller than SKIP_PAGES_THRESHOLD
>>> and are then scanned (even though they are already frozen). What I
>>> describe in that email I cited is that because we freeze
>>> opportunistically when we have or will emit an FPI, and bgwriter will
>>> write out blocks in clocksweep order, we end up with random pockets of
>>> pages getting frozen during/after a checkpoint. Then in the next
>>> vacuum, we end up scanning those all-frozen pages again because the
>>> ranges of frozen pages are smaller than SKIP_PAGES_THRESHOLD. This is
>>> mostly going to happen for an insert-only workload. I'm not saying
>>> freezing the pages is bad, I'm saying that causing these pockets of
>>> frozen pages leads to scanning all-frozen pages on future vacuums.
>>>
>>
>> Yeah, this interaction between the components is not great :-( But can
>> we think of a way to reduce the fragmentation? What would need to change?
>
> Well reducing SKIP_PAGES_THRESHOLD would help.

How does SKIP_PAGES_THRESHOLD change the fragmentation? I think that's
the side that's affected by the fragmentation, but the fragmentation
itself is really due to the eager freezing / bgwriter evictions, etc. If
the threshold is set to 1 (i.e. always skip skippable pages), that just
lowers the impact, but the relation is still as fragmented as before, no?
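
To illustrate what I mean, here's a tiny standalone simulation (toy code
only, nothing from vacuumlazy.c, and the bitmap of skippable pages is
made up):

#include <stdio.h>

#define NPAGES 24

static void
simulate(const int *skippable, int threshold)
{
    int     scanned = 0, ranges = 0, prev_scanned = 0;

    for (int i = 0; i < NPAGES; i++)
    {
        int     scan;

        if (!skippable[i])
            scan = 1;
        else
        {
            /* length of the run of skippable pages containing page i */
            int     lo = i, hi = i;

            while (lo > 0 && skippable[lo - 1])
                lo--;
            while (hi < NPAGES - 1 && skippable[hi + 1])
                hi++;

            /* runs shorter than the threshold get scanned anyway */
            scan = (hi - lo + 1) < threshold;
        }

        if (scan)
        {
            scanned++;
            if (!prev_scanned)
                ranges++;   /* a new contiguous range of reads starts */
        }
        prev_scanned = scan;
    }

    printf("threshold=%2d: %2d pages scanned in %d contiguous ranges\n",
           threshold, scanned, ranges);
}

int
main(void)
{
    /* made-up fragmented VM: 1 = skippable (all-visible), 0 = needs scan */
    const int   skippable[NPAGES] = {1,1,1,0,1,1,0,0,1,1,1,1,
                                     0,1,0,1,1,1,0,1,1,0,1,0};

    simulate(skippable, 32);    /* current SKIP_PAGES_THRESHOLD */
    simulate(skippable, 1);     /* always skip skippable pages */

    return 0;
}

With threshold=32 this scans all 24 pages in one contiguous range, with
threshold=1 it scans only the 8 pages that actually need it, but in 7
separate ranges. So lowering the threshold means less work, but the I/O
pattern stays just as scattered as the relation itself.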

> And unfortunately we do
> not know if the skippable pages are all-frozen without extra
> visibilitymap_get_status() calls -- so we can't decide to avoid
> scanning ranges of skippable pages because they are frozen.
>

I may be missing something, but doesn't find_next_unskippable_block()
already get those bits? In fact, it even checks VISIBILITYMAP_ALL_FROZEN,
though only for aggressive vacuums. But even if that weren't the case,
isn't checking the VM likely much cheaper than vacuuming the heap page?
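
For example, something along these lines (a rough sketch, not the actual
vacuumlazy.c code, the helper name is made up) would tell us whether a
short skippable range is entirely frozen while touching only the VM:

static bool
range_is_all_frozen(Relation rel, BlockNumber start, BlockNumber end,
                    Buffer *vmbuf)
{
    for (BlockNumber blk = start; blk < end; blk++)
    {
        uint8   mapbits = visibilitymap_get_status(rel, blk, vmbuf);

        if ((mapbits & VISIBILITYMAP_ALL_FROZEN) == 0)
            return false;   /* found an all-visible page that's not frozen */
    }

    return true;
}

That's one VM lookup per page in the range (with the VM buffer cached in
vmbuf), which seems negligible compared to actually reading and
processing the heap pages.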

Also, I don't quite see why this information would help with reducing
the fragmentation. Can you explain?

>> I don't think bgwriter can help much - it's mostly oblivious to the
>> contents of the buffer, I don't think it could consider stuff like this
>> when deciding what to evict.
>
> Agreed.
>
>> Maybe the freezing code could check how many of the nearby pages are
>> frozen, and consider that together with the FPI write?
>
> That's an interesting idea. We wouldn't have any guaranteed info
> because we only have a lock on the page we are considering freezing.
> But we could keep track of the length of a run of pages we are
> freezing and opportunistically freeze pages that don't require
> freezing if they follow one or more pages requiring freezing.

I don't think we need "guaranteed" information - a heuristic that's
correct most of the time (say, >90%?) ought to be good enough. I mean,
it has to be, because we'll never get a rule that's correct 100% of the
time. So even just looking at a batch of pages in the VM should be
enough, no?
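
To be a bit more concrete, I was imagining something like this (a purely
hypothetical sketch, not existing code - the constant and the function
name are made up, and the actual freezing path would need more care):

#define FREEZE_NEIGHBORHOOD 8   /* how many preceding blocks to look at */

static bool
neighborhood_mostly_frozen(Relation rel, BlockNumber blkno, Buffer *vmbuf)
{
    BlockNumber start = (blkno > FREEZE_NEIGHBORHOOD) ?
        blkno - FREEZE_NEIGHBORHOOD : 0;
    int         nfrozen = 0;

    for (BlockNumber b = start; b < blkno; b++)
    {
        if (visibilitymap_get_status(rel, b, vmbuf) & VISIBILITYMAP_ALL_FROZEN)
            nfrozen++;
    }

    /*
     * The caller could use this either to avoid creating a new isolated
     * pocket of frozen pages, or to fill a small gap in an existing run.
     */
    return nfrozen >= FREEZE_NEIGHBORHOOD / 2;
}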

> But I don't know how much more this buys us than removing
> SKIP_PAGES_THRESHOLD. Since it would "fix" the fragmentation, perhaps
> it makes larger future vacuum reads possible. But I wonder how much
> benefit it would be vs complexity.
>

I think that depends on which cost we're talking about. If we only talk
about the efficiency of a single vacuum, then it probably does not help
very much. I mean, if we assume the relation is already fragmented, then
it seems to be cheaper to vacuum just the pages that need it (as if with
SKIP_PAGES_THRESHOLD=1).

But if we're talking about long-term benefits, in reducing the amount of
freezing needed overall, maybe it'd be a win? I don't know.

>>>>> However, we are not close to coming up with a
>>>>> replacement heuristic, so removing SKIP_PAGES_THRESHOLD would help.
>>>>> This wouldn't have affected your results, but it is worth considering
>>>>> more generally.
>>>>
>>>> One of the reasons why we have SKIP_PAGES_THRESHOLD is that it makes
>>>> it more likely that non-aggressive VACUUMs will advance relfrozenxid.
>>>> Granted, it's probably not doing a particularly good job at that right
>>>> now. But any effort to replace it should account for that.
>>>>
>>
>> I don't follow. How could non-aggressive VACUUM advance relfrozenxid,
>> ever? I mean, if it doesn't guarantee freezing all pages, how could it?
>
> It may, coincidentally, not skip any all-visible pages. Peter points
> out that this happens all the time for small tables, but wouldn't the
> overhead of an aggressive vacuum be barely noticeable for small
> tables? It seems like there is little cost to waiting.
>

Yeah, that's kinda my point / confusion. For it to help we would have to
not skip any pages, but for large tables that seems quite unlikely
(because it only takes one page getting skipped, and there are many
opportunities). And for small tables, I think it doesn't matter that
much, because even an aggressive vacuum is cheap.
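
Just to put a (completely made-up) number on it: if a large table has,
say, 100 separate runs of all-visible pages and each run has a 10% chance
of being long enough to get skipped, the chance that a non-aggressive
vacuum skips nothing at all is 0.9^100, i.e. about 0.003%. The exact
numbers don't matter, the point is that this drops off very quickly as
the table grows.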

>>>> This is possible by making VACUUM consider the cost of scanning extra
>>>> heap pages up-front. If the number of "extra heap pages to be scanned"
>>>> to advance relfrozenxid happens to not be very high (or not so high
>>>> *relative to the current age(relfrozenxid)*), then pay that cost now,
>>>> in the current VACUUM operation. Even if age(relfrozenxid) is pretty
>>>> far from the threshold for aggressive mode, if the added cost of
>>>> advancing relfrozenxid is still not too high, why wouldn't we just do
>>>> it?
>>>
>>> That's an interesting idea. And it seems like a much more effective
>>> way of getting some relfrozenxid advancement than hoping that the
>>> pages you scan due to SKIP_PAGES_THRESHOLD end up being enough to have
>>> scanned all unfrozen tuples.
>>>
>>
>> I agree it might be useful to formulate this as a "costing" problem, not
>> just in the context of a single vacuum, but for the overall maintenance
>> overhead - essentially accepting the vacuum gets slower, in exchange for
>> lower cost of maintenance later.
>
> Yes, that costing sounds like a big research and benchmarking project
> on its own.
>

True. I don't know if it makes sense to try to construct a detailed /
accurate cost model for this. I meant more of a general model showing
the rough relationship between the amount of work that has to happen in
different places.
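
Even something as crude as

    total_work ~ n_vacuums * pages_scanned_per_vacuum
               + pages_frozen_over_time
               + n_aggressive_vacuums * rel_pages

(all of these terms are made up, just to show the shape) might be enough
to reason about whether paying a bit more in each vacuum actually shrinks
the other terms.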

regards

--
Tomas Vondra
