Re: PoC: history of recent vacuum/checkpoint runs (using new hooks)

From: wenhui qiu <qiuwenhuifx(at)gmail(dot)com>
To: Tomas Vondra <tomas(at)vondra(dot)me>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PoC: history of recent vacuum/checkpoint runs (using new hooks)
Date: 2024-12-23 06:35:04
Message-ID: CAGjGUAJSg3tJGxjeyU6T98ZYCXcAiGpq0eitAUQVaOa+nULDuA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Tomas
This is a great feature.
+ /*
+ * Define (or redefine) custom GUC variables.
+ */
+ DefineCustomIntVariable("stats_history.size",
+ "Sets the amount of memory available for past events.",
+ NULL,
+ &statsHistorySizeMB,
+ 1,
+ 1,
+ 128,
+ PGC_POSTMASTER,
+ GUC_UNIT_MB,
+ NULL,
+ NULL,
+ NULL);
+
RAM is in terabytes now, the statsHistorySize is 128MB ,I think can
increase to store more history record ?

Thanks

On Sun, Dec 22, 2024 at 4:28 AM Tomas Vondra <tomas(at)vondra(dot)me> wrote:

> Hi,
>
> Our runtime stats system is great, but it only keeps a snapshot of
> cumulative stats / snapshot. And while that works for user workloads, it
> may not be quite sufficient when analyzing maintenance operations like
> vacuum/checkpoint, etc.
>
> For those operations it's useful to have information about individual
> runs, not just the cumulative counters (or even deltas between regular
> snapshots). There's also the issue that we only keep a very limited
> subset of available information - just look at the info included in
> VACUUM VERBOSE or with log_checkpoints=on, and how little of that is
> available in pg_stats_. For vacuum we have the vacuum/analyze counts,
> and timestamp of the last operation, but that's it. VACUUM VERBOSE
> provides way more information, but we can only guess based on the
> regular pgstat counters.
>
> Yes, we can get this info written to server log using log_checkpoints
> and log_autovacuum_min_duration (AFAIK there's no way to ensure logging
> for manual VACUUM). But processing this information is not particularly
> convenient, as it requires parsing the log, the message format is
> suitable more for humans, etc. And it'd be very convenient to be able to
> query this information, just like the other pgstat catalogs.
>
> I wonder if we might/should do two things, to improve this:
>
> 1) Introduce hooks that allow doing some custom stuff with info about
> those actions, after logging it. The attached 0001 and 0002 patches do
> this for vacuum and checkpoint.
>
> 2) Allow keeping information about events. The 0003 patch does that in
> an extension, leveraging the new hooks, but it'd certainly possible to
> do in core too.
>
> I realize our current pgstat collector is keeping per-object stats, not
> per-event. We might add this to per-object stats (e.g. each table would
> have stats about vacuum runs), but that wouldn't work for checkpoints.
> There's also the question of memory consumption - I'm sure we don't want
> to keep infinite history of vacuum runs, for example.
>
> So the extension simply maintains a fixed-size circular queue, i.e. when
> it gets full it starts evicting oldest entries. 1MB is enough for
> storing ~4k VACUUM runs - I'm sure it can be made more compact.
>
> I don't think there's a better way to do this. I've considered if this
> might be done using emit_log_hook, but (a) that only helps when we end
> up logging the event (and I'd like to do this always), and (b) it'd
> require parsing the server log. So it's not much better than just doing
> that, I think ...
>
>
> Opinions?
>
> --
> Tomas Vondra
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2024-12-23 06:50:17 Re: transaction lost when delete clog file after normal shutdown
Previous Message 2024-12-23 06:26:44 Re: Re: Re: transaction lost when delete clog file after normal shutdown