From: | Greg Stark <stark(at)mit(dot)edu> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: autovacuum prioritization |
Date: | 2022-01-26 23:46:00 |
Message-ID: | CAM-w4HOq4uvzQxXt2d_PvF+2F=YR6Pr1gpxuQjLaN85xktk+qg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Thu, 20 Jan 2022 at 14:31, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> In my view, previous efforts in this area have been too simplistic.
>
One thing I've been wanting to do something about is I think
autovacuum needs to be a little cleverer about when *not* to vacuum a
table because it won't do any good.
I've seen a lot of cases where autovacuum kicks off a vacuum of a
table even though the globalxmin hasn't really advanced significantly
over the oldest frozen xid. When it's a large table this really hurts
because it could be hours or days before it finishes and at that point
there's quite a bit of bloat.
This isn't a common occurrence, it happens when the system is broken
in some way. Either there's an idle-in-transaction session or
something else keeping the global xmin held back.
What it does though is make things *much* worse and *much* harder for
a non-expert to hit on the right remediation. It's easy enough to tell
them to look for these idle-in-transaction sessions or set timeouts.
It's much harder to determine whether it's a good idea for them to go
and kill the vacuum that's been running for days. And it's not a great
thing for people to be getting in the habit of doing either.
I want to be able to stop telling people to kill vacuums kicked off by
autovacuum. I feel like it's a bad thing for someone to ever have to
do and I know some fraction of the time I'm telling them to do it
it'll have been a terrible thing to have done (but we'll never know
which times those were). Determining whether a running vacuum is
actually doing any good is pretty hard and on older versions probably
impossible.
I was thinking of just putting a check in before kicking off a vacuum
and if the globalxmin is a significant fraction of the distance to the
relfrozenxid then instead log a warning. Basically it means "we can't
keep the bloat below the threshold due to the idle transactions et al,
not because there's insufficient i/o bandwidth".
At the same time it would be nice if autovacuum could recognize when
the i/o bandwidth is insufficient. If it finishes a vacuum it could
recheck whether the table is eligible for vacuuming and log that it's
unable to keep up with the vacuuming requirements -- but right now
that would be a lie much of the time when it's not a lack of bandwidth
preventing it from keeping up.
--
greg
From | Date | Subject | |
---|---|---|---|
Next Message | Peter Geoghegan | 2022-01-26 23:54:27 | Re: autovacuum prioritization |
Previous Message | Andrew Dunstan | 2022-01-26 23:45:53 | Re: snapper and skink and fairywren (oh my!) |