Re: Should vacuum process config file reload more often

From: Melanie Plageman <melanieplageman(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Daniel Gustafsson <daniel(at)yesql(dot)se>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Subject: Re: Should vacuum process config file reload more often
Date: 2023-04-11 00:16:12
Message-ID: CAAKRu_ZaVV32vQD2zp00-4Zr0ncusckJspDPzAUGGYPFVuv4Og@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Apr 7, 2023 at 9:07 AM Melanie Plageman
<melanieplageman(at)gmail(dot)com> wrote:
>
> On Fri, Apr 7, 2023 at 2:53 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
> > On Fri, Apr 7, 2023 at 8:08 AM Daniel Gustafsson <daniel(at)yesql(dot)se> wrote:
> > >
> > > > On 7 Apr 2023, at 00:12, Melanie Plageman <melanieplageman(at)gmail(dot)com> wrote:
> > > >
> > > > On Thu, Apr 6, 2023 at 5:45 PM Daniel Gustafsson <daniel(at)yesql(dot)se> wrote:
> > > >>
> > > >>> On 6 Apr 2023, at 23:06, Melanie Plageman <melanieplageman(at)gmail(dot)com> wrote:
> > > >>
> > > >>> Autovacuum workers, at the end of VacuumUpdateCosts(), check if cost
> > > >>> limit or cost delay have been changed. If they have, they assert that
> > > >>> they don't already hold the AutovacuumLock, take it in shared mode, and
> > > >>> do the logging.
> > > >>
> > > >> Another idea would be to copy the values to local temp variables while holding
> > > >> the lock, and release the lock before calling elog() to avoid holding the lock
> > > >> over potential IO.
> > > >
> > > > Good idea. I've done this in attached v19.
> > > > Also I looked through the docs and everything still looks correct for
> > > > balancing algo.
> > >
> > > I had another read-through and test-through of this version, and have applied
> > > it with some minor changes to comments and whitespace. Thanks for the quick
> > > turnaround times on reviews in this thread!
> >
> > Cool!
> >
> > Regarding the commit 7d71d3dd08, I have one comment:
> >
> > + /* Only log updates to cost-related variables */
> > + if (vacuum_cost_delay == original_cost_delay &&
> > + vacuum_cost_limit == original_cost_limit)
> > + return;
> >
> > IIUC by default, we log not only before starting the vacuum but also
> > when changing cost-related variables. Which is good, I think, because
> > logging the initial values would also be helpful for investigation.
> > However, I think that we don't log the initial vacuum cost values
> > depending on the values. For example, if the
> > autovacuum_vacuum_cost_delay storage option is set to 0, we don't log
> > the initial values. I think that instead of comparing old and new
> > values, we can write the log only if
> > message_level_is_interesting(DEBUG2) is true. That way, we don't need
> > to acquire the lwlock unnecessarily. And the code looks cleaner to me.
> > I've attached the patch (use_message_level_is_interesting.patch)
>
> Thanks for coming up with the case you thought of with storage param for
> cost delay = 0. In that case we wouldn't print the message initially and
> we should fix that.
>
> I disagree, however, that we should condition it only on
> message_level_is_interesting().
>
> Actually, outside of printing initial values when the autovacuum worker
> first starts (before vacuuming all tables), I don't think we should log
> these values except when they are being updated. Autovacuum workers
> could vacuum tons of small tables and having this print out at least
> once per table (which I know is how it is on master) would be
> distracting. Also, you could be reloading the config to update some
> other GUCs and be oblivious to an ongoing autovacuum and get these
> messages printed out, which I would also find distracting.
>
> You will have to stare very hard at the logs to tell if your changes to
> vacuum cost delay and limit took effect when you reload config. I think
> with our changes to update the values more often, we should take the
> opportunity to make this logging more useful by making it happen only
> when the values are changed.
>
> I would be open to elevating the log level to DEBUG1 for logging only
> updates and, perhaps, having an option if you set log level to DEBUG2,
> for example, to always log these values in VacuumUpdateCosts().
>
> I'd even argue that, potentially, having the cost-delay related
> parameters printed at the beginning of vacuuming could be interesting to
> regular VACUUM as well (even though it doesn't benefit from config
> reload while in progress).
>
> To fix the issue you mentioned and ensure the logging is printed when
> autovacuum workers start up before vacuuming tables, we could either
> initialize vacuum_cost_delay and vacuum_cost_limit to something invalid
> that will always be different than what they are set to in
> VacuumUpdateCosts() (not sure if this poses a problem for VACUUM using
> these values since they are set to the defaults for VACUUM). Or, we
> could duplicate this logging message in do_autovacuum().
>
> Finally, one other point about message_level_is_interesting(). I liked
> the idea of using it a lot, since log level DEBUG2 will not be the
> common case. I thought of it but hesitated because all other users of
> message_level_is_interesting() are avoiding some memory allocation or
> string copying -- not avoiding take a lock. Making this conditioned on
> log level made me a bit uncomfortable. I can't think of a situation when
> it would be a problem, but it felt a bit off.
>
> > Also, while testing the autovacuum delay with relopt
> > autovacuum_vacuum_cost_delay = 0, I realized that even if we set
> > autovacuum_vacuum_cost_delay = 0 to a table, wi_dobalance is set to
> > true. wi_dobalance comes from the following expression:
> >
> > /*
> > * If any of the cost delay parameters has been set individually for
> > * this table, disable the balancing algorithm.
> > */
> > tab->at_dobalance =
> > !(avopts && (avopts->vacuum_cost_limit > 0 ||
> > avopts->vacuum_cost_delay > 0));
> >
> > The initial values of both avopts->vacuum_cost_limit and
> > avopts->vacuum_cost_delay are -1. I think we should use ">= 0" instead
> > of "> 0". Otherwise, we include the autovacuum worker working on a
> > table whose autovacuum_vacuum_cost_delay is 0 to the balancing
> > algorithm. Probably this behavior has existed also on back branches
> > but I haven't checked it yet.
>
> Thank you for catching this. Indeed this exists in master since
> 1021bd6a89b which was backpatched. I checked and it is true all the way
> back through REL_11_STABLE.
>
> Definitely seems worth fixing as it kind of defeats the purpose of the
> original commit. I wish I had noticed before!
>
> Your fix has:
> !(avopts && (avopts->vacuum_cost_limit >= 0 ||
> avopts->vacuum_cost_delay >= 0));
>
> And though delay is required to be >= 0
> avopts->vacuum_cost_delay >= 0
>
> Limit does not. It can just be > 0.
>
> postgres=# create table foo (a int) with (autovacuum_vacuum_cost_limit = 0);
> ERROR: value 0 out of bounds for option "autovacuum_vacuum_cost_limit"
> DETAIL: Valid values are between "1" and "10000".
>
> Though >= is also fine, the rest of the code in all versions always
> checks if limit > 0 and delay >= 0 since 0 is a valid value for delay
> and not for limit. Probably best we keep it consistent (though the whole
> thing is quite confusing).

I have created an open item for each of these issues on the wiki
(one for 16 and one under the section "affects stable branches").

- Melanie

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Melanie Plageman 2023-04-11 00:23:15 Re: Show various offset arrays for heap WAL records
Previous Message Jonathan S. Katz 2023-04-10 23:53:15 Re: When to drop src/tools/msvc support