Re: POC: track vacuum/analyze cumulative time per relation

From: Sami Imseih <samimseih(at)gmail(dot)com>
To: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: POC: track vacuum/analyze cumulative time per relation
Date: 2025-01-10 22:26:05
Message-ID: CAA5RZ0tP6my7Mbr_syP7hRAiFANC_bG8SyoQF=-vfV9VawpV=g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Thanks for the review!

> === 1
>
> + endtime = GetCurrentTimestamp();
> pgstat_report_vacuum(RelationGetRelid(rel),
> rel->rd_rel->relisshared,
> Max(vacrel->new_live_tuples, 0),
> vacrel->recently_dead_tuples +
> - vacrel->missed_dead_tuples);
> + vacrel->missed_dead_tuples,
> + starttime,
> + endtime);
> pgstat_progress_end_command();
>
> if (instrument)
> {
> - TimestampTz endtime = GetCurrentTimestamp();
>
> What about keeping the endtime assignment after the pgstat_progress_end_command()
> call? I think that it makes more sense that way.

Yes, that can be done. I see a few more examples in
which we report stats after end_command, i.e.
AbortTransaction

> What about doing the elapsedtime computation prior the this call and passed
> it as a parameter (then remove the starttime one and keep the endtime as it
> is needed)?

Yes, makes sense. It will keep the changes to pgstat_report_ to a minimum.

> === 4
>
> +/* pg_stat_get_vacuum_time */
> +PG_STAT_GET_RELENTRY_INT64(total_vacuum_time)
> +
> +/* pg_stat_get_autovacuum_time */
> +PG_STAT_GET_RELENTRY_INT64(total_autovacuum_time)
> +
> +/* pg_stat_get_analyze_time */
> +PG_STAT_GET_RELENTRY_INT64(total_analyze_time)
> +
> +/* pg_stat_get_autoanalyze_time */
> +PG_STAT_GET_RELENTRY_INT64(total_autoanalyze_time)
>
> I wonder if it wouldn't be better to use FLOAT8 here (to match things like
> pg_stat_get_checkpointer_write_time(), pg_stat_get_checkpointer_sync_time() among
> others).
>
> === 5

I see the checkpointer write_time for example is converted
to double "for presentation". I am not sure why, expect for maybe
historical reasons. I will do the same for this new cumulative
time field for the sake of consistency. There is no macro for
relentry to return a stat as FLOAT8, so adding it as part of this
patch.

Datum
pg_stat_get_checkpointer_write_time(PG_FUNCTION_ARGS)
{
/* time is already in msec, just convert to double for presentation */
PG_RETURN_FLOAT8((double)
pgstat_fetch_stat_checkpointer()->write_time);
}

> +
> + PgStat_Counter total_vacuum_time; /* user initiated vacuum */
> + PgStat_Counter total_autovacuum_time; /* autovacuum initiated */
> + PgStat_Counter total_analyze_time; /* user initiated vacuum */
> + PgStat_Counter total_autoanalyze_time; /* autovacuum initiated */
>
> Those comments look weird to me.
>

Ok, Will remove these.

I also updated the comments for the instrument code path to reflect the
fact starttime is now set for all cases.Also, added documentation.

See the attached v2.

Regards,

Sami

Attachment Content-Type Size
v2-0001-Track-per-relation-cumulative-time-spent-in-vacuu.patch application/octet-stream 16.4 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Kane 2025-01-10 22:26:50 Restore support for USE_ASSERT_CHECKING in extensions only
Previous Message Melanie Plageman 2025-01-10 22:09:22 Re: Moving the vacuum GUCs' docs out of the Client Connection Defaults section