From: | Andrei Zubkov <zubkov(at)moonset(dot)ru> |
---|---|
To: | Jim Nasby <jnasby(at)upgrade(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com> |
Cc: | Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, jian he <jian(dot)universality(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, a(dot)lepikhov(at)postgrespro(dot)ru |
Subject: | Re: Vacuum statistics |
Date: | 2024-10-29 12:40:09 |
Message-ID: | 6732acf8ce0f31025b535ae1a64568750924a887.camel@moonset.ru |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
Thanks for your attention to our patch!
On Mon, 2024-10-28 at 16:03 -0500, Jim Nasby wrote:
> > Yes, but as Masahiko-san pointed out, PgStat_TableCounts is almost
> > tripled in space. That a huge change from having no statistics on
> > vacuum to have it in much more detail than everything else we
> > currently have. I think the feasible way might be to introduce
> > some
> > most demanded statistics first then see how it goes.
>
> Looking at the stats I do think the WAL stats are probably not
> helpful. First, there’s nothing users can do to tune how much WAL is
> generated by vacuum. Second, this introduces the risk of users saying
> “Wow, vacuum is creating a lot of WAL! I’m going to turn it down!”,
> which is most likely to make matters worse. There’s already a lot of
> stuff that goes into WAL without any detailed logging; if we ever
> wanted to provide a comprehensive view of what data is in WAL that
> should be handled separately.
Yes, there is nothing we can directly do with WAL generated by vacuum,
but WAL generation is the part of vacuum work, and it will indirectly
affected by the changes of vacuum settings. So, WAL statistics is one
more dimension of vacuum workload. Also WAL stat is universal metric
which is measured cluster-wide and on the statement-level with
pg_stat_statements. Vacuum WAL counters will explain the part of
difference between those metrics. Besides vacuum WAL counters can be
used to locate abnormal vacuum behavior caused by a bug or the data
corruption. I think if the DBA is smart enough to look at vacuum WAL
generated stats and to understand what it means, the decision to
disable the autovacuum due to its WAL generation is unlikely.
Anyway I think some stats can be excluded to save some memory. The
first candidates are the system_time and user_time fields. Those are
very valuable, but are measured by the rusage stats, which won't be
available on all platforms. I think total_time and delay_time would be
sufficient.
The second is the interrupts field. It is needed for monitoring to know
do we have them or not, so tracking them on the database level will do
the trick. Interrupt is quite rare event, so once the monitoring system
will catch one the DBA can go to the server log for the details.
It seems there is another way. If the vacuum stats doesn't seems to be
mandatory in all systems, maybe we should add some hooks to the vacuum
so that vacuum statistics tracking can be done in an extension. I don't
think it is a good idea, because vacuum stats seems to me as mandatory
as the vacuum process itself.
> Is there a reason some fields are omitted
> from pg_stat_vacuum_database? While some stats are certainly more
> interesting at the per-relation level, I can’t really think of any
> that don’t make sense at the database level as well.
Some of the metrics are table-specific, some index-specific, so we
moved to the database level metrics more or less specific to the whole
database. Can you tell what stats you want to see at the database
level?
> Looking at the per table/index stats, I strongly dislike the use of
> the term “delete” - it is a recipe for confusion with row deletion..
> A much better term is “remove” or “removed”. I realize the term
> “delete” is used in places in vacuum logging, but IMO we should fix
> that as well instead of doubling-down on it.
Yes, this point was discussed in our team, and it seems confusing to me
too. We decided to name it as it is named in the code and to get
feedback from the community. Now we get one. Thank you. Now we should
discuss it and choose the best one. My personal choice is "removed".
> I realize “relname” is being used for consistency with
> pg_stat_all_(tables|indexes), but I’m not sure it makes sense to
> double-down on that. Especially in pg_stat_vacuum_indexes, where it’s
> not completely clear whether relname is referring to the table or the
> index. I’m also inclined to say that the name of the table should be
> included in pg_stat_vacuum_indexes.
Agreed. Table name is needed in the index view.
> For all the views the docs should clarify that total_blks_written
> means blocks written by vacuum, as opposed to the background Ywriter.
We have the "Number of database blocks written by vacuum operations
performed on this table" in the docs now. Do you mean we should
specifically note the vacuum process here?
> Similarly they should clarify the difference between
> rel_blks_(read|hit) and total_blks_(read|hit). In the case of
> pg_stat_vacuum_indexes it’d be better if rel_blks_(read|hit) were
> called index_blks_(read|hit). Although… if total_blks_* is actually
> the count across the table and all the indexes I don’t know that we
> even need that counter. I realize that not ever vacuum even looks at
> the indexes, but if we’re going to go into that level of detail then
> we would (at minimum) need to count the number of times a vacuum
> completely skipped scanning the indexes.
It is not clear to me enough. The stats described just as it is -
rel_blocks_* tracks blocks of the current heap, and total_* is for the
whole database blocks - not just tables and indexes, vacuum do some
work (quite a little) in the catalog and this work is counted here too.
Usually this stat won't be helpful, but maybe we can catch unusual
vacuum behavior using this stat.
> Having rev_all_(frozen|visible)_pages in the same view as vacuum
> stats will confuse users into thinking that vacuum is clearing the
> bits. Those fields really belong in pg_stat_all_tables.
Agreed.
> Sadly index_vacuum_count is may not useful at all at present. At
> minimum you’d need to know the number of times vacuum had run in
> total. I realize that’s in pg_stat_all_tables, but that doesn’t help
> if vacuum stats are tracked or reset separately.
I'm in doubt - is it really possible to reset the vacuum stats
independent of pg_stat_all_tables?
> At minimum the docs should mention them. They also need to clarify
> if index_vacuum_count is incremented per-index or per-pass (hopefully
> the later). Assuming it’s per-pass, a better name for the field would
> be index_vacuum_passes, index_passes, index_pass_count, or similar.
> But even with that we still need a counter for the number of vacuums
> where index processing was skipped.
Agreed, the "index_passes" looks good to me, and index processing skip
counter looks good.
> First, there’s still gaps in trying to track HOT; most notably a
> counter for how many updates would never be HOT eligible because they
> modify indexes. pg_stat_all_tables.n_tup_newpage_upd is really
> limited without that info.
Nice catch, I'll think about it. Those are not directly connected to
the vacuum workload but those are important.
> There should also be stats about unused line pointers - in degenerate
> cases the lp array can consume a significant portion of heap storage.
>
> Monitoring bloat would be a lot more accurate if vacuum reported
> total tuple length for each run along with the total number of tuples
> it looked at. Having that info would make it trivial to calculate
> average tuple size, which could then be applied to reltuples and
> relpages to calculate how much space would being lost to bloat.
Yes, bloat tracking is in our plans. Right now it is not clear enough
how to do it in the most reliable and convenient way.
> Autovacuum will self-terminate if it would block another process
> (unless it’s an aggressive vacuum) - that’s definitely something that
> should be tracked. Not just the number of times that happens, but
> also stats about how much work was lost because of this.
Agreed.
> Shrinking a relation (what vacuum calls truncation, which is very
> confusing with the truncate command) is a rather complex process that
> currently has no visibility.
In this patch table truncation can be seen in the "pages_removed" field
of "pg_stat_vacuum_tables" at least as the cumulative number of removed
pages. It is not clear enough, but it is visible.
> Tuning vacuum_freeze_min_age (and the MXID variant) is rather
> complicated. We maybe have enough stats on whether it could be set
> lower, but there’s no visibility on how the settings affect how often
> vacuum decides to be aggressive. At minimum, we should have stats on
> when vacuum is aggressive, especially since it significantly changes
> the behavior of autovac.
When you say "agressive" do you mean the number of times when the
vacuum was processing the table with the FREEZE intention? I think this
is needed too.
> I saw someone else already mentioned tuning vacuum memory usage, but
> I’ll mention it again. Even if the issues with index_vacuum_count are
> fixed that still only tells you if you have a problem; it doesn’t
> give you a great idea of how much more memory you need. The best you
> can do is assuming you need (number of passes - 1) * current memory.
Do you think such approach is insufficient? It seems we do not need
byte-to-byte accuracy here.
> Speaking of which… there should be stats on any time vacuum decided
> on it’s own to skip index processing due to wraparound proximity.
Maybe we should just count the number of times when the vacuum was
started to prevent wraparound?
Jim, thank you for such detailed review of our patch!
--
regards, Andrei Zubkov
Postgres Professional
From | Date | Subject | |
---|---|---|---|
Next Message | Ranier Vilela | 2024-10-29 12:45:02 | Re: Avoid possible useless call to makeNode function (src/backend/commands/view.c) |
Previous Message | David Rowley | 2024-10-29 12:25:23 | Re: Avoid possible useless call to makeNode function (src/backend/commands/view.c) |