Re: Pluggable cumulative statistics

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Pluggable cumulative statistics
Date: 2024-06-21 04:28:11
Message-ID: ZnUBW0fJ6dcLG-zR@paquier.xyz
Lists: pgsql-hackers

On Fri, Jun 21, 2024 at 01:09:10PM +0900, Kyotaro Horiguchi wrote:
> At Thu, 13 Jun 2024 16:59:50 +0900, Michael Paquier <michael(at)paquier(dot)xyz> wrote in
>> * The kind IDs may change across restarts, meaning that any stats data
>> associated to a custom kind is stored with the *name* of the custom
>> stats kind. Depending on the discussion happening here, I'd be open
>> to use the same concept as custom RMGRs, where custom kind IDs are
>> "reserved", fixed in time, and tracked in the Postgres wiki. It is
>> cheaper to store the stats this way, as well, while managing conflicts
>> across extensions available in the community ecosystem.
>
> I prefer to avoid having a central database if possible.

I was thinking the same originally, but the experience with custom
RMGRs has made me change my mind. There are more of these than I
thought originally:
https://wiki.postgresql.org/wiki/CustomWALResourceManagers

> If we don't intend to move stats data alone out of a cluster for use
> in another one, can't we store the relationship between stats names
> and numeric IDs (or index numbers) in a separate file, which is loaded
> just before and synced just after extension preloading finishes?

Yeah, I've implemented a prototype that does pretty much that, with
the stats name restricted to NAMEDATALEN, except that I've added the
kind ID <-> kind name mapping at the beginning of the main stats file
rather than in a separate file. In the end, it still felt weird and
over-engineered to me, like the v1 prototype posted upthread. We end
up with a strange mix when reloading the dshash: the builtin IDs are
handled with fixed values, while the custom kinds need to update their
hash keys to the new values based on the ID/name mapping stored at the
beginning of the file itself. That requires more code paths when
doing the serialize callback dance for stats kinds like replication
slots.
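
To illustrate the remapping step described above, here is a minimal
sketch in C. It is not taken from the prototype: the struct, the
function and the DEMO_ constants are hypothetical names, and it assumes
the mapping at the start of the file is a flat array of (ID, name)
pairs.

```c
#include <string.h>

#define DEMO_NAMEDATALEN   64    /* assumption: stands in for NAMEDATALEN */
#define DEMO_INVALID_KIND  (-1)  /* hypothetical "no such kind" marker */

/* One entry of the hypothetical ID/name mapping. */
typedef struct DemoKindMapEntry
{
    int  kind_id;                 /* ID assigned when the map was built */
    char name[DEMO_NAMEDATALEN];  /* stable name of the custom kind */
} DemoKindMapEntry;

/*
 * Translate a kind ID read from the stats file into the ID assigned to
 * the same-named kind at this startup.  file_map is the mapping stored
 * at the beginning of the file, cur_map the mapping built while loading
 * extensions.  Returns DEMO_INVALID_KIND if the ID is not listed in the
 * file or if the kind is no longer loaded.
 */
static int
demo_remap_kind(const DemoKindMapEntry *file_map, int nfile,
                const DemoKindMapEntry *cur_map, int ncur,
                int file_kind_id)
{
    for (int i = 0; i < nfile; i++)
    {
        if (file_map[i].kind_id != file_kind_id)
            continue;

        /* Found the name used at write time, look it up in the new map. */
        for (int j = 0; j < ncur; j++)
        {
            if (strncmp(file_map[i].name, cur_map[j].name,
                        DEMO_NAMEDATALEN) == 0)
                return cur_map[j].kind_id;
        }
        return DEMO_INVALID_KIND;   /* extension not loaded anymore */
    }
    return DEMO_INVALID_KIND;       /* ID not present in the file map */
}
```

Every hash key read back for a custom kind would have to go through
such a translation, while builtin kinds would skip it, which is the
mix of code paths mentioned above.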

Things get more complicated once you try to add more ways to write
some stats kinds to other places, depending on what a custom kind
wants to achieve. I can see the benefits of both approaches, but
fixing the IDs in time leads to a lot of simplicity in this infra,
which is very appealing on its own before tackling the next issues
where I would rely on the proposed APIs.
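
For comparison, a minimal sketch of what registration could look like
with IDs fixed in time. The names and the 128..255 range are
assumptions made for illustration, not the proposed API: each extension
passes the ID it has reserved, and a conflict is detected at
registration with no mapping to store or replay.

```c
#include <stdbool.h>
#include <stddef.h>

#define DEMO_KIND_CUSTOM_MIN  128   /* assumption: start of custom range */
#define DEMO_KIND_CUSTOM_MAX  255   /* assumption: end of custom range */
#define DEMO_KIND_CUSTOM_SIZE \
    (DEMO_KIND_CUSTOM_MAX - DEMO_KIND_CUSTOM_MIN + 1)

/* Names of the registered custom kinds, indexed by (ID - MIN). */
static const char *demo_custom_kinds[DEMO_KIND_CUSTOM_SIZE];

/*
 * Register a custom kind under the fixed ID it has reserved.  Returns
 * false if the name is missing, the ID is outside the custom range, or
 * the ID is already taken by another kind.
 */
static bool
demo_register_kind(int kind_id, const char *name)
{
    if (name == NULL)
        return false;
    if (kind_id < DEMO_KIND_CUSTOM_MIN || kind_id > DEMO_KIND_CUSTOM_MAX)
        return false;
    if (demo_custom_kinds[kind_id - DEMO_KIND_CUSTOM_MIN] != NULL)
        return false;               /* ID conflict between extensions */

    demo_custom_kinds[kind_id - DEMO_KIND_CUSTOM_MIN] = name;
    return true;
}

/* Return the name registered for an ID, or NULL if there is none. */
static const char *
demo_kind_name(int kind_id)
{
    if (kind_id < DEMO_KIND_CUSTOM_MIN || kind_id > DEMO_KIND_CUSTOM_MAX)
        return NULL;
    return demo_custom_kinds[kind_id - DEMO_KIND_CUSTOM_MIN];
}
```

With this shape the ID written to the stats file is the same one used
after a restart, so the hash keys can be reloaded as-is.
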
--
Michael
