Re: PoC: history of recent vacuum/checkpoint runs (using new hooks)

From: Tomas Vondra <tomas(at)vondra(dot)me>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: wenhui qiu <qiuwenhuifx(at)gmail(dot)com>, Robert Treat <rob(at)xzilla(dot)net>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PoC: history of recent vacuum/checkpoint runs (using new hooks)
Date: 2024-12-31 15:06:26
Message-ID: 7be212d2-6212-4878-b439-b141ecba2acb@vondra.me
Lists: pgsql-hackers

On 12/31/24 02:06, Michael Paquier wrote:
> On Sat, Dec 28, 2024 at 02:25:16AM +0100, Tomas Vondra wrote:
>> And the more I think about it the more I'm convinced we don't need to
>> keep the data about past runs in memory, a file should be enough (except
>> maybe for a small buffer). That would mean we don't need to worry about
>> dynamic shared memory, etc. I initially rejected this because it seemed
>> like a regression to how pgstat worked initially (sharing data through
>> files), but I don't think that's actually true - this data is different
>> (almost append-only), etc.
>
> Right, I was looking a bit at 0003 that introduces the extension. I
> am wondering if we are not repeating the errors of pgss by using a
> different file, and if we should not just use pgstats and its single
> file instead to store this data through an extension. You are right
> that as an append-only pattern using the dshash of pgstat does not fit
> well into this picture. How about the second type of stats kinds:
> the fixed-numbered stats kind? These allocate a fixed amount of
> shared memory, meaning that you could allocate N entries of history
> and just manage a queue of them, then do a memcpy() of the whole set
> if adding new history at the head of the queue, or just append new
> ones at the tail of the queue in shmem, memcpy() once the queue is
> full. The extension gets simpler:
> - No need to manage a new file, flush of the stats is controlled by
> pgstats itself.
> - The extension could register a fixed-numbered custom stats kind.

I'm not against leveraging some of the existing pgstat infrastructure,
but as I explained earlier I don't like the "fixed amount of shmem"
approach. It's either wasteful (on machines with few vacuum runs) or
difficult to configure to keep enough history.
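
To make the trade-off concrete, here is a minimal sketch of a fixed-size
history queue. It is illustrative only and not taken from the patch: the
names HISTORY_SLOTS, VacuumRun, HistoryQueue, history_push and history_get
are hypothetical, and it uses a plain ring buffer rather than the memcpy()
scheme described in the quoted text. Whatever HISTORY_SLOTS is set to, that
much memory is reserved even if few runs happen, and older runs are silently
overwritten once the queue is full.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HISTORY_SLOTS 4			/* hypothetical capacity, fixed up front */

/* Hypothetical record for one vacuum run; the fields are illustrative. */
typedef struct VacuumRun
{
	uint32_t	relid;
	int64_t		finished_at;
} VacuumRun;

typedef struct HistoryQueue
{
	VacuumRun	slots[HISTORY_SLOTS];	/* always allocated in full */
	size_t		next;			/* slot the next run is written to */
	size_t		count;			/* valid entries, capped at HISTORY_SLOTS */
} HistoryQueue;

/* Add a run, overwriting the oldest entry once the queue is full. */
static void
history_push(HistoryQueue *q, VacuumRun run)
{
	assert(q != NULL);
	q->slots[q->next] = run;
	q->next = (q->next + 1) % HISTORY_SLOTS;
	if (q->count < HISTORY_SLOTS)
		q->count++;
}

/* Return the i-th oldest retained run (0 = oldest), or NULL if out of range. */
static const VacuumRun *
history_get(const HistoryQueue *q, size_t i)
{
	size_t		oldest;

	assert(q != NULL);
	if (i >= q->count)
		return NULL;
	oldest = (q->next + HISTORY_SLOTS - q->count) % HISTORY_SLOTS;
	return &q->slots[(oldest + i) % HISTORY_SLOTS];
}
```

With four slots and six runs pushed, only the last four survive. Keeping
"enough" history therefore means guessing the capacity in advance.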

FWIW I'm not sure what you mean by "pgss errors" - I'm not saying it's
flawless, but OTOH reading/writing a flat file with entries is not
particularly hard. (It'd need to be a bit more complex to evict stuff
for dropped objects, but not by much).
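
As a rough illustration of the flat-file idea, the sketch below appends
fixed-size entries to a file, reads them back, and evicts entries for a
dropped object by rewriting the file. Everything in it is an assumption made
for the example: RunEntry, runs_append, runs_read and runs_evict are
hypothetical names, it uses plain C stdio rather than PostgreSQL's file
APIs, and it omits locking, fsync, checksums and versioning. Replacing the
file with rename() assumes POSIX semantics.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical on-disk entry; written in native layout, not portable. */
typedef struct RunEntry
{
	uint32_t	relid;
	int64_t		finished_at;
} RunEntry;

/* Append one entry to the file. Returns 0 on success, -1 on failure. */
static int
runs_append(const char *path, const RunEntry *e)
{
	FILE	   *f;
	size_t		written;

	assert(path != NULL && e != NULL);
	f = fopen(path, "ab");
	if (f == NULL)
		return -1;
	written = fwrite(e, sizeof(*e), 1, f);
	if (fclose(f) != 0)
		return -1;
	return (written == 1) ? 0 : -1;
}

/* Read up to max entries. Returns the number read, or -1 if open fails. */
static long
runs_read(const char *path, RunEntry *out, size_t max)
{
	FILE	   *f = fopen(path, "rb");
	size_t		n;

	if (f == NULL)
		return -1;
	n = fread(out, sizeof(RunEntry), max, f);
	fclose(f);
	return (long) n;
}

/* Drop all entries for one relation by copying survivors to a new file. */
static int
runs_evict(const char *path, const char *tmp_path, uint32_t dropped_relid)
{
	FILE	   *in = fopen(path, "rb");
	FILE	   *out;
	RunEntry	e;
	int			rc = 0;

	if (in == NULL)
		return -1;
	out = fopen(tmp_path, "wb");
	if (out == NULL)
	{
		fclose(in);
		return -1;
	}
	while (fread(&e, sizeof(e), 1, in) == 1)
	{
		if (e.relid != dropped_relid && fwrite(&e, sizeof(e), 1, out) != 1)
			rc = -1;
	}
	fclose(in);
	if (fclose(out) != 0)
		rc = -1;
	if (rc == 0 && rename(tmp_path, path) != 0)
		rc = -1;
	return rc;
}
```

The eviction step is the only part that goes beyond append and sequential
read, which is the extra complexity referred to above.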

> - Push the stats with the new hooks.
> - Perhaps less useful, but it is possible to control the timing where
> the data is pushed.

Not sure what you mean by "pushed". Pushed where, or how would that matter?

> - Add SQL wrappers on top to fetch the data from pgstat.

OK, although I think this is not a particularly complicated part of the
code. It could even be simplified further.

regards

--
Tomas Vondra
