From: Andres Freund <andres(at)anarazel(dot)de>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Melanie Plageman <melanieplageman(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Jeff Davis <pgsql(at)j-davis(dot)com>
Subject: Re: Eager page freeze criteria clarification
Date: 2023-09-26 15:11:00
Message-ID: 20230926151100.2afqsbsuhmqtgkpr@alap3.anarazel.de
Lists: pgsql-hackers
Hi,
On 2023-09-25 14:45:07 -0400, Robert Haas wrote:
> On Fri, Sep 8, 2023 at 12:07 AM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > > Downthread, I proposed using the RedoRecPtr of the latest checkpoint
> > > rather than the LSN of the previous vacuum. I still like that idea.
> >
> > Assuming that "downthread" references
> > https://postgr.es/m/CA%2BTgmoYb670VcDFbekjn2YQOKF9a7e-kBFoj2WJF1HtH7YPaWQ%40mail.gmail.com
> > could you sketch out the logic you're imagining a bit more?
>
> I'm not exactly sure what the question is here.
That I'd like you to expand on "using the RedoRecPtr of the latest checkpoint
rather than the LSN of the previous vacuum." - I can think of ways of doing so
that could end up with quite different behaviour...
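To make that concrete, here's a rough standalone sketch of two readings that would behave quite differently - XLogRecPtr is reduced to a plain integer, and both helpers are invented here, this is not a patch:

/*
 * Hypothetical sketch: two readings of "use the RedoRecPtr of the latest
 * checkpoint" as the freeze cutoff.
 */
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t XLogRecPtr;

/* Reading A: freeze any page not modified since the latest checkpoint began. */
static bool
older_than_latest_checkpoint(XLogRecPtr page_lsn, XLogRecPtr latest_redo)
{
    return page_lsn < latest_redo;
}

/*
 * Reading B: require the page to have survived several checkpoint cycles,
 * i.e. compare against the redo pointer from n checkpoints ago
 * (redo_history[0] = latest, redo_history[n] = n checkpoints back) - a much
 * more conservative cutoff than reading A.
 */
static bool
older_than_nth_checkpoint(XLogRecPtr page_lsn, const XLogRecPtr *redo_history,
                          int n)
{
    return page_lsn < redo_history[n];
}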
> > Perhaps we can mix both approaches. We can use the LSN and time of the last
> > vacuum to establish an LSN->time mapping that's reasonably accurate for a
> > relation. For infrequently vacuumed tables we can use the time between
> > checkpoints to establish a *more aggressive* cutoff for freezing than what a
> > percent-of-time-since-last-vacuum approach would provide. If e.g. a table gets
> > vacuumed every 100 hours and checkpoint timeout is 1 hour, no realistic
> > percent-of-time-since-last-vacuum setting will allow freezing, as all dirty
> > pages will be too new. To allow freezing a decent proportion of those, we
> > could allow freezing pages that lived longer than ~20% of the
> > time-between-recent-checkpoints.
>
> Yeah, I don't know if that's exactly the right idea, but I think it's
> in the direction that I was thinking about. I'd even be happy with
> 100% of the time-between-recent-checkpoints, maybe even 200% of
> time-between-recent-checkpoints. But I think there probably should be
> some threshold beyond which we say "look, this doesn't look like it
> gets touched that much, let's just freeze it so we don't have to come
> back to it again later."
As long as the most extreme cases are prevented, unnecessary freezing is imo
far less harmful than freezing too little.
I'm worried that using something as long as 100-200%
time-between-recent-checkpoints won't handle insert-mostly workloads well,
which IME are also workloads suffering quite badly under our current scheme -
and which are quite common.
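Here's a rough sketch of the kind of combined cutoff I have in mind - standalone C, with all names and the 20% factors just illustrative placeholders, not proposed GUCs or existing code:

/*
 * Sketch: derive an LSN cutoff from a) the time/LSN of the last vacuum and
 * b) the recent checkpoint interval.  Times are in seconds; assumes
 * now > last_vacuum_time.
 */
#include <stdint.h>

typedef uint64_t XLogRecPtr;

static XLogRecPtr
sketch_freeze_cutoff_lsn(XLogRecPtr current_lsn, double now,
                         XLogRecPtr last_vacuum_lsn, double last_vacuum_time,
                         double checkpoint_interval)
{
    /* LSN consumption rate observed since the previous vacuum of this table */
    double      lsn_per_sec = (double) (current_lsn - last_vacuum_lsn) /
        (now - last_vacuum_time);

    /* percent-of-time-since-last-vacuum threshold ... */
    double      vacuum_based_age = 0.20 * (now - last_vacuum_time);

    /*
     * ... capped by a percent-of-checkpoint-interval threshold, so that
     * rarely vacuumed (e.g. insert-mostly) tables still get a decent share
     * of their pages frozen.
     */
    double      checkpoint_based_age = 0.20 * checkpoint_interval;

    double      min_age = vacuum_based_age < checkpoint_based_age ?
        vacuum_based_age : checkpoint_based_age;

    /* freeze pages whose LSN is older than the returned cutoff */
    return current_lsn - (XLogRecPtr) (min_age * lsn_per_sec);
}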
> I think part of the calculus here should probably be that when the
> freeze threshold is long, the potential gains from making it even
> longer are not that much. If I change the freeze threshold on a table
> from 1 minute to 1 hour, I can potentially save uselessly freezing
> that page 59 times per hour, every hour, forever, if the page always
> gets modified right after I touch it. If I change the freeze threshold
> on a table from 1 hour to 1 day, I can only save 23 unnecessary
> freezes per day. Percentage-wise, the overhead of being wrong is the
> same in both cases: I can have as many extra freeze operations as I
> have page modifications, if I pick the worst possible times to freeze
> in every case. But in absolute terms, the savings in the second
> scenario are a lot less. I think if a user is accessing a table
> frequently, the overhead of jamming a useless freeze in between every
> table access is going to be a lot more noticeable than when the table
> is only accessed every once in a while. And I also think it's a lot
> less likely that we'll reliably get it wrong. Workloads that touch a
> page and then touch it again ~N seconds later can exist for all values
> of N, but I bet they're way more common for small values of N than
> large ones.
True. And with larger Ns it also becomes more likely that we'd need to freeze
the rows anyway. I've seen tables being hit with several anti-wraparound vacuums
a day, but not several anti-wraparound vacuums a minute...
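Spelling that worst case out - at most one useless freeze per threshold interval, so these numbers are only upper bounds:

/* back-of-the-envelope: wasted freezes per day for a few thresholds */
#include <stdio.h>

int
main(void)
{
    double      thresholds[] = {60, 3600, 86400};   /* 1 min, 1 hour, 1 day */

    for (int i = 0; i < 3; i++)
        printf("threshold %6.0f s -> at most %.0f wasted freezes/day\n",
               thresholds[i], 86400.0 / thresholds[i]);
    return 0;
}

i.e. going from 1 minute to 1 hour can save up to ~1400 wasted freezes a day, going from 1 hour to 1 day only ~23.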
> Is there also a need for a similar guard in the other direction? Let's
> say that autovacuum_naptime=15s and on some particular table it
> triggers every time. I've actually seen this on small queue tables. Do
> you think that, in such tables, we should freeze pages that haven't
> been modified in 15s?
I don't think it matters much - proportionally to the workload of rewriting
nearly all of the table every few seconds, the overhead of freezing a bunch of
already-dirty pages is negligible.
> > Hm, possibly stupid idea: What about using shared_buffers residency as a
> > factor? If vacuum had to read in a page to vacuum it, a) we would need read IO
> > to freeze it later, as we'll soon evict the page via the ringbuffer b)
> > non-residency indicates the page isn't constantly being modified?
>
> This doesn't seem completely stupid, but I fear it would behave
> dramatically differently on a workload a little smaller than s_b vs.
> one a little larger than s_b, and that doesn't seem good.
Hm. I'm not sure that that's a real problem. In the case of a workload bigger
than s_b, having to actually read the page again increases the cost of
freezing later, even if the workload is just a bit bigger than s_b.
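For clarity, the shape of what I was suggesting upthread is roughly the following - the helper is entirely hypothetical, nothing like it exists in the tree:

#include <stdbool.h>

/*
 * page_was_resident: whether vacuum found the page already in shared_buffers
 * page_old_enough:   whatever LSN/age based criterion we end up with
 */
static bool
should_freeze_eagerly(bool page_was_resident, bool page_old_enough)
{
    if (!page_was_resident)
        return true;            /* freezing later would need read IO, and the
                                 * page doesn't look like it's being modified
                                 * constantly */
    return page_old_enough;     /* resident: fall back to the age criterion */
}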
Greetings,
Andres Freund