POC: track vacuum/analyze cumulative time per relation

From: Sami Imseih <samimseih(at)gmail(dot)com>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: POC: track vacuum/analyze cumulative time per relation
Date: 2025-01-02 18:24:06
Message-ID: CAA5RZ0uVOGBYmPEeGF2d1B_67tgNjKx_bKDuL+oUftuoz+=Y1g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

After a recent question regarding tracking vacuum start_time in
pg_stat_all_tables [1], it dawned on me that this view is missing
an important cumulative metric, which is how much time is spent
performing vacuum per table.

Currently, the only way a user can get this
information is if they enable autovacuum logging or track timing
for manual vacuums. Even then, if a user wants to trend
the time spent vacuuming over time, they must store the
timing data somewhere and perform the calculations.

Also, unless autovacuum logging is enabled for all a/v
operations, they could have gaps in their analysis.

Having the total (auto)vacuum elapsed time
along side the existing (auto)vaccum_count
allows a user to track the average time an
operating overtime and to find vacuum tuning
opportunities.

The same can also be said for (auto)analyze.

attached a patch ( without doc changes)
that adds 4 new columns:

total_autovacuum_time
total_vacuum_time
total_autoanalyze_time
total_analyze_time

Below is an example of output and how it
can be used to derive the average vacuum
operation time.

postgres=# select
relname,
autovacuum_count,
total_autovacuum_time,
total_autovacuum_time/NULLIF(autovacuum_count,0) average_autovac_time,
vacuum_count,
total_vacuum_time,
total_vacuum_time/NULLIF(vacuum_count,0) average_vac_time
from pg_catalog.pg_stat_all_tables
where relname = 'pgbench_history';
-[ RECORD 1 ]---------+-----------------
relname | pgbench_history
autovacuum_count | 3
total_autovacuum_time | 1689
average_autovac_time | 563
vacuum_count | 1
total_vacuum_time | 1
average_vac_time | 1

It should be noted that the timing is only tracked
when the vacuum or analyze operation completes,
as is the case with the other metrics.

Also, there is another discussion in-flight [2] regarding
tracking vacuum run history in a view, but this serves a
different purpose as this will provide all the metrics
that are other wise exposed in vacuum logging
via sql. This history will also be required to drop
entries using some criteria to keep the cache from
growing infinitely.

Feedback for the attached patch is appreciated!

Regards,

Sami Imseih
Amazon Web Services (AWS)

[1] https://www.postgresql.org/message-id/flat/CAGjGUAKQ4UBNdkjunH2qLsdUVG-3F9gCuG0Kb0hToo%2BuMmSteQ%40mail.gmail.com
[2] https://www.postgresql.org/message-id/flat/b68ab452-c41f-4d04-893f-eaab84f1855b%40vondra.me

Attachment Content-Type Size
POC-track-vacuum-analyze-cumulative-time-per-table.patch application/octet-stream 13.2 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Álvaro Herrera 2025-01-02 18:40:25 Re: Allow NOT VALID foreign key constraints on partitioned tables.
Previous Message Andrey M. Borodin 2025-01-02 18:12:37 Re: Switching XLog source from archive to streaming when primary available