From: | Sami Imseih <samimseih(at)gmail(dot)com> |
---|---|
To: | wenhui qiu <qiuwenhuifx(at)gmail(dot)com> |
Cc: | "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, nickyang <nickyang905(at)gmail(dot)com> |
Subject: | Re: add vacuum starttime columns |
Date: | 2024-12-31 15:53:54 |
Message-ID: | CAA5RZ0sRmy59N7Ms4MdzOk=V+Cn0pWn4wRvha5z=AbQvnefpXw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
> However, if the log_autovacuum_min_duration parameter is not configured (as I’ve encountered in many cases where this parameter is either not set or has an inappropriate value), we cannot determine the trigger time and duration of the operation.
log_autovacuum_min_duration logs when the vacuum completes
and not when it starts. It does provide the elapsed time in the log
as you say.
> Our idea is to directly query the pg_stat_all_tables view to obtain the start time of (auto)vacuum/(auto)analyze for a table. This would help us monitor the duration of vacuum/analyze and determine whether it is necessary to tune and speed up the vacuum/analyze process.
This is the crux of the question. I agree that pg_stat_all_tables is missing
information to allow someone to determine if the vacuums for a table
are becoming progressively longer to complete or to verify the vacuum
is faster after some tuning was performed.
The pg_stat_all_tables.vacuum(autovacuum)_count
does not help answer these questions and one must then enable
extra logging.
> but if there are many tables in the database, the vacuum entries in the logs might be quite numerous, making it difficult to analyze.
Agree
> However, this could still indicate whether a vacuum is in progress (if last_(auto)vacuum_start is more recent than last_(auto)vacuum, it means a vacuum is ongoing). While it is possible to monitor vacuum activity through the pg_stat_progress_vacuum view, this view itself does not record timestamps, so additional views might be needed, which would be less convenient than querying pg_stat_all_tables directly.
The progress view does a good job of providing the
phase of the vacuum which at least tells you if your
vacuum is not "stuck" on a single operation. It is not
really good at providing information to predict when
the vacuum will complete. It is very difficult to actually
make such a prediction.
IMO, I think this situation could be improved, but I am
not sure if a start_time is the best answer. Let's suppose
you set the start_time at the beginning of the vacuum,
then if the vacuum for some reason terminates early, then
you are left with some inconsistent data.
Thinking about this, it will be good to have cumulative statistics
for elapsed times of a all vacuums on a relation ( since last reset ).
This way, someone who is monitoring pg_stat_all_tables
can divide the total accumulated elapsed time by the vacuum counts.
This can give a good idea of how long vacuum is taking on a table
and with enough sampling over time, one can derive patterns
of vacuum slowing down, etc.
Of course, the time can only be accumulated when the vacuum
completes, which should be good enough.
Regards,
Sami Imseih
From | Date | Subject | |
---|---|---|---|
Next Message | Bernd Helmle | 2024-12-31 16:06:35 | Modern SHA2- based password hashes for pgcrypto |
Previous Message | Tom Lane | 2024-12-31 15:43:38 | Re: Backport of CVE-2024-10978 fix to older pgsql versions (11, 9.6, and 9.4) |