Re: Vacuum statistics

From: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
To: Sami Imseih <samimseih(at)gmail(dot)com>
Cc: Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>, Jim Nasby <jnasby(at)upgrade(dot)com>, Andrei Zubkov <zubkov(at)moonset(dot)ru>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, jian he <jian(dot)universality(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, a(dot)lepikhov(at)postgrespro(dot)ru, Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Subject: Re: Vacuum statistics
Date: 2025-01-04 20:37:51
Message-ID: a55da89f-bbc9-444d-8a0b-1a354257f571@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi, thank you for your attention to this patch.

On 02.01.2025 23:12, Sami Imseih wrote:
> Hi,
>
> Thanks for the work you have done here. Exposing cumulative
> metrics at this level of detail for vacuum is surely useful to find
> vacuum bottlenecks and to determine the effectiveness of
> vacuum tuning.
Yes, we hope that this will help provide more detailed information about
the current efficiency of the vacuum and also suggest how to best
configure it for the relationship.
> I am yet to look very closely, but I think some additional columns that
> will be useful is the number of failsafe autovacuums occurred.
Do you mean when the autovacuum started to prevent workaround?
> Also
> the counter for number of index_cleanup skipped, truncate phase
> skipped and toast vacuuming skipped ( the latter will only be relevant
> for the main relation ).

I can add, but concerns have already been expressed about the large
amount of vacuum statistics and, as a consequence, this leads to the
allocation of additional memory (3 times).

Of course, now we are saved by the guc I added for statistics... I
understand that this information can better show the efficiency of the
vacuum, but how does it help in setting it up for heap relations?

regarding the skipped truncate phase, the statistics are already
collected in vacrel->nonempty_pages, it's easy to put them outside. I
think the current statistics only show the number of deleted tuples and
pages (both deleted and those visited by vacuum during tuple deletion),
so the opposite view won't hurt.

index_cleanup skipped can be obtained based on information from a small
number of vacuum buffer statistics and the number of pages of indexes
that belong to heap relations. I think you can notice the behavior
through current statistics: if the index's buffer values have increased
very slightly, then the vacuum does not go there probably because of the
impossibility of taking a clean-up lock on the index. The same
information can be obtained based on the number of missed_tuples in heap
relations. I wrote earlier how these values ​​​​are related.

toast vacuuming skipped to be honest I haven't found a place where
vacuum skips it in the code yet, so I can't say anything about them yet.

> I also wonder if if makes sense to break down timing by phase. I surely
> would like to know how much of my vacuum time was spent in index
> cleanup vs heap scan, etc.
At the moment, this information has already been added to the statistics
as a total time for heap relations and their indexes.
>
> A nit: I noticed in v14, the column is "schema". It should be "schemaname"
> for consistency.
Thank you, I'll fix it in the next version of the patch.
> Also, instead of pg_stat_vacuum_tables, what about pg_stat_vacuum?
>
> Now, I became aware of this discussion after starting a new thread
> to track total time spent in vacuum/analyze in pg_stat_all_tables [1].
> But this begs the question of what should be done with the current
> counters in pg_stat_all_tables? I see it mentioned above that (auto)vacuum_count
> should be added to this new view, but it's also already in pg_stat_all_tables.
> I don't think we should be duplicating the same columns across views.
>
> Alternatively, we can remove the vacuum related stats from pg_stat_all_tables,
> but that will break monitoring tools and will leave us with the (auto)analyze
> metrics alone in pg_stat_all_tables. This sounds very ugly.
>
> What do you think?
>
> Regards,
>
> Sami Imseih
> Amazon Web Services (AWS)
>
> [1]https://commitfest.postgresql.org/52/5485/

I don't think they interfere with my more detailed views of how the
vacuum works. I don't think there's anything worth removing.
>
> I think total_time should be removed from your current patch and added
> as is being suggested in [1]. This way high level metrics such as counts
> and total time spent remain in pg_stat_all_tables, while the new view
> you are proposing will contain more details. I don't think we will have
> consistency issues between the views because a reset using pg_stat_reset()
> will act on all the stats and pg_stat_reset_single_table_counters() will act on
> all the stats related to that table. There should be no way to reset the vacuum
> stats independently, AFAICT.

I think it is not quite correct to do so.

Firstly, the total time of vacuum operation does not give you a complete
idea of ​​when vacuum did not work delay time. I have seen many reports
where vacuum spends very little time on cleaning relations and most of
the time just sleeping.

Secondly, where to put the total time of vacuum for indexes and
databases? It would be incorrect not to take them into account at all.
What if we remove the total time from the heap statistics and add it to
pg_stat_tables and only leave the vacuum statistics total time of vacuum
operation of indexes and databases? It seems strange to me that they
will have to be viewed from different views.

I think it is necessary to look at the total time for tables into
perspective of how much time vacuum spent in total on processing
indexes, since indexes can be bloated, for example. I think it is better
to leave these statistics here.

--
Regards,
Alena Rybakina
Postgres Professional

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Paul Jungwirth 2025-01-04 21:39:26 Re: SQL:2011 application time
Previous Message Michel Pelletier 2025-01-04 20:34:40 Re: Using Expanded Objects other than Arrays from plpgsql