Re: POC: track vacuum/analyze cumulative time per relation

From: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>
To: Sami Imseih <samimseih(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: POC: track vacuum/analyze cumulative time per relation
Date: 2025-01-13 15:55:50
Message-ID: Z4U3hkWuEgByhGgJ@ip-10-97-1-34.eu-west-3.compute.internal
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On Fri, Jan 10, 2025 at 04:26:05PM -0600, Sami Imseih wrote:
> {
> /* time is already in msec, just convert to double for presentation */
> PG_RETURN_FLOAT8((double)
> pgstat_fetch_stat_checkpointer()->write_time);
> }
>
> > +
> > + PgStat_Counter total_vacuum_time; /* user initiated vacuum */
> > + PgStat_Counter total_autovacuum_time; /* autovacuum initiated */
> > + PgStat_Counter total_analyze_time; /* user initiated vacuum */
> > + PgStat_Counter total_autoanalyze_time; /* autovacuum initiated */
> >
> > Those comments look weird to me.
> >
>
> Ok, Will remove these.
>
> I also updated the comments for the instrument code path to reflect the
> fact starttime is now set for all cases.Also, added documentation.
>
> See the attached v2.

Thanks for the patch update!

A few random comments:

=== 1

+/* pg_stat_get_vacuum_time */
+PG_STAT_GET_RELENTRY_FLOAT8(total_vacuum_time)
+
+/* pg_stat_get_autovacuum_time */
+PG_STAT_GET_RELENTRY_FLOAT8(total_autovacuum_time)
+
+/* pg_stat_get_analyze_time */
+PG_STAT_GET_RELENTRY_FLOAT8(total_analyze_time)
+
+/* pg_stat_get_autoanalyze_time */
+PG_STAT_GET_RELENTRY_FLOAT8(total_autoanalyze_time)
+

The comments do not reflect the function names ("total" is missing to give
pg_stat_get_total_vacuum_time() and such).

=== 2

+#define PG_STAT_GET_RELENTRY_FLOAT8(stat)
+Datum
+CppConcat(pg_stat_get_,stat)(PG_FUNCTION_ARGS)
+{
+ Oid relid = PG_GETARG_OID(0);
+ int64 result;
+ PgStat_StatTabEntry *tabentry;
+
+ if ((tabentry = pgstat_fetch_stat_tabentry(relid)) == NULL)
+ result = 0;
+ else
+ result = (float8) (tabentry->stat);
+
+ PG_RETURN_FLOAT8(result);
+}
+
/* pg_stat_get_analyze_count */
PG_STAT_GET_RELENTRY_INT64(analyze_count

I think it's better to define the macro just before its first usage (meaning
just after pg_stat_get_vacuum_count()): that would be consistent with the places
the other macros are defined.

=== 3

+ int64 result;

s/int64/double/?

=== 4

+ Total time this table has spent in manual vacuum
+ </para></entry>

Mention the unit?

=== 5

+ /*
+ * When verbose or autovacuum logging is used, initialize a resource usage
+ * snapshot and optionally track I/O timing.
+ */
if (instrument)
{

Out of curiosity, why this extra comment? To be somehow consistent with
do_analyze_rel()?

=== 6

@@ -343,6 +349,8 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
pgstat_progress_start_command(PROGRESS_COMMAND_VACUUM,
RelationGetRelid(rel));

+ starttime = GetCurrentTimestamp();

I wonder if it wouldn't make more sense to put the GetCurrentTimestamp() call
before the pgstat_progress_start_command() call. That would be aligned with the
"endtime" being after the pgstat_progress_end_command() and where it was before
the patch.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Nathan Bossart 2025-01-13 16:04:31 Re: pgbench error: (setshell) of script 0; execution of meta-command failed
Previous Message Chiranmoy.Bhattacharya@fujitsu.com 2025-01-13 15:48:49 Re: [PATCH] Hex-coding optimizations using SVE on ARM.