Re: add vacuum starttime columns

From: wenhui qiu <qiuwenhuifx(at)gmail(dot)com>
To: Sami Imseih <samimseih(at)gmail(dot)com>
Cc: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, nickyang <nickyang905(at)gmail(dot)com>
Subject: Re: add vacuum starttime columns
Date: 2024-12-31 07:32:59
Message-ID: CAGjGUA+osstu7Xqjxr_QPhU_GmmbDHtt9CK=ju2_=ZDJUGdxMQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Sami
Many people have encountered situations where autovacuum or auto analyze on
tables are not triggered in time, leading to suboptimal execution plans and
some performance issues. When analyzing such problems, we often need to
trace back to when autovacuum or auto analyze was triggered for the
corresponding table. However, if the log_autovacuum_min_duration parameter
is not configured (as I’ve encountered in many cases where this parameter
is either not set or has an inappropriate value), we cannot determine the
trigger time and duration of the operation.

Our idea is to directly query the pg_stat_all_tables view to obtain the
start time of (auto)vacuum/(auto)analyze for a table. This would help us
monitor the duration of vacuum/analyze and determine whether it is
necessary to tune and speed up the vacuum/analyze process.

Of course, to observe the duration of vacuum operations, we can configure
the log_autovacuum_min_durationparameter, but if there are many tables in
the database, the vacuum entries in the logs might be quite numerous,
making it difficult to analyze.

Additionally, we are currently considering whether it would be possible to
add a last_(auto)vacuum_start field to the pg_stat_all_tables view. For
tables where a vacuum operation is in progress, the last_(auto)vacuum field
may not be updated, and it may not be possible to estimate the vacuum
duration using just these two fields.

However, this could still indicate whether a vacuum is in progress (if
last_(auto)vacuum_start is more recent than last_(auto)vacuum, it means a
vacuum is ongoing). While it is possible to monitor vacuum activity through
the pg_stat_progress_vacuum view, this view itself does not record
timestamps, so additional views might be needed, which would be less
convenient than querying pg_stat_all_tables directly.

Therefore, we personally believe that adding such fields would be
beneficial for monitoring the execution of (auto)vacuum and (auto)analyze.

Thanks

On Tue, Dec 31, 2024 at 12:40 PM Sami Imseih <samimseih(at)gmail(dot)com> wrote:

> The last_(auto)vacuum is useful because it allows
> the user to monitor vacuum to ensure that vacuums
> are completing on a relation at expected intervals.
> I am not sure what value a start time will provide.
> Can you provide a reason for this?
>
> Regards,
>
> Sami Imseih
>
>
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bertrand Drouvot 2024-12-31 07:53:18 Re: Autovacuum giving up on tables after crash because of lack of stats
Previous Message Nisha Moond 2024-12-31 07:05:10 Re: Introduce XID age and inactive timeout based replication slot invalidation