Re: Pluggable cumulative statistics

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Pluggable cumulative statistics
Date: 2024-06-20 00:46:30
Message-ID: ZnN75rx5Gb5otGCn@paquier.xyz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Jun 13, 2024 at 04:59:50PM +0900, Michael Paquier wrote:
> - How should the persistence of the custom stats be achieved?
> Callbacks to give custom stats kinds a way to write/read their data,
> push everything into a single file, or support both?
> - Should this do like custom RMGRs and assign to each stats kinds ID
> that are set in stone rather than dynamic ones?

These two questions have been itching me in terms of how it would work
for extension developers, after noticing that custom RMGRs are used
more than I thought:
https://wiki.postgresql.org/wiki/CustomWALResourceManagers

The result is proving to be nicer, shorter by 300 lines in total and
much simpler when it comes to think about the way stats are flushed
because it is possible to achieve the same result as the first patch
set without manipulating any of the code paths doing the read and
write of the pgstats file.

In terms of implementation, pgstat.c's KindInfo data is divided into
two parts, for efficiency:
- The exiting in-core stats with designated initializers, renamed as
built-in stats kinds.
- The custom stats kinds are saved in TopMemoryContext, and can only
be registered with shared_preload_libraries. The patch reserves a set
of 128 harcoded slots for all the custom kinds making the lookups for
the KindInfos quite cheap. Upon registration, a custom stats kind
needs to assign a unique ID, with uniqueness on the names and IDs
checked at registration.

The backend code does ID -> information lookups in the hotter paths,
meaning that the code only checks if an ID is built-in or custom, then
redirects to the correct array where the information is stored.
There is one code path that does a name -> information lookup for the
undocumented SQL function pg_stat_have_stats() used in the tests,
which is a bit less efficient now, but that does not strike me as an
issue.

modules/injection_points/ works as previously as a template to show
how to use these APIs, with tests for the whole.

With that in mind, the patch set is more pleasant to the eye, and the
attached v2 consists of:
- 0001 and 0002 are some cleanups, same as previously to prepare for
the backend-side APIs.
- 0003 adds the backend support to plug-in custom stats.
- 0004 includes documentation.
- 0005 is an example of how to use them, with a TAP test providing
coverage.

Note that the patch I've proposed to make stats persistent at
checkpoint so as we don't discard everything after a crash is able to
work with the custom stats proposed on this thread:
https://commitfest.postgresql.org/48/5047/
--
Michael

Attachment Content-Type Size
v2-0001-Switch-PgStat_Kind-from-enum-to-uint32.patch text/x-diff 3.1 KB
v2-0002-Replace-hardcoded-identifiers-in-pgstats-file-by-.patch text/x-diff 2.6 KB
v2-0003-Introduce-pluggable-APIs-for-Cumulative-Statistic.patch text/x-diff 13.6 KB
v2-0004-doc-Add-section-for-Custom-Cumulative-Statistics-.patch text/x-diff 3.3 KB
v2-0005-injection_points-Add-statistics-for-custom-points.patch text/x-diff 14.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Smith 2024-06-20 01:05:35 Re: DOCS: Generated table columns are skipped by logical replication
Previous Message Tom Lane 2024-06-19 23:51:52 Re: suspicious valgrind reports about radixtree/tidstore on arm64