Re: Enhancing Memory Context Statistics Reporting

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Rahila Syed <rahilasyed90(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Enhancing Memory Context Statistics Reporting
Date: 2024-10-22 06:48:39
Message-ID: ZxdKx0DywUTAvkEF@paquier.xyz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Oct 21, 2024 at 11:54:21PM +0530, Rahila Syed wrote:
> On the other hand, [2] provides the statistics for all backends but logs
> them in a file, which may not be convenient for quick access.

To be precise, pg_log_backend_memory_contexts() pushes the memory
context stats to LOG_SERVER_ONLY or stderr, hence this is appended to
the server logs.

> A fixed-size shared memory block, currently accommodating 30 records,
> is used to store the statistics. This number was chosen arbitrarily,
> as it covers all parent contexts at level 1 (i.e., direct children of the
> top memory context)
> based on my tests.
> Further experiments are needed to determine the optimal number
> for summarizing memory statistics.

+ * Statistics are shared via fixed shared memory which
+ * can hold statistics for 29 contexts. The rest of the
[...]
+ MemoryContextInfo memctx_infos[30];
[...]
+ memset(&memCtxState->memctx_infos, 0, 30 * sizeof(MemoryContextInfo));
[...]
+ size = add_size(size, mul_size(30, sizeof(MemoryContextInfo)));
[...]
+ memset(&memCtxState->memctx_infos, 0, 30 * sizeof(MemoryContextInfo));
[...]
+ memset(&memCtxState->memctx_infos, 0, 30 * sizeof(MemoryContextInfo));

This number is tied to MemoryContextState added by the patch. Sounds
like this would be better as a constant properly defined rather than
hardcoded in all these places. This would make the upper-bound more
easily switchable in the patch.

+ Datum path[128];
+ char type[128];
[...]
+ char name[1024];
+ char ident[1024];
+ char type[128];
+ Datum path[128];

Again, constants. Why these values? You may want to use more
#defines here.

> Any additional statistics that exceed the shared memory capacity
> are written to a file per backend in the PG_TEMP_FILES_DIR. The client
> backend
> first reads from the shared memory, and if necessary, retrieves the
> remaining data from the file,
> combining everything into a unified view. The files are cleaned up
> automatically
> if a backend crashes or during server restarts.

Is the addition of the file to write any remaining stats really that
useful? This comes with a heavy cost in the patch with the "in_use"
flag, the various tweaks around the LWLock release/acquire protecting
the shmem area and the extra cleanup steps required after even a clean
restart. That's a lot of facility for this kind of information.
Another thing that may be worth considering is to put this information
in a DSM per the variable-size nature of the information, perhaps cap
it to a max to make the memory footprint cheaper, and avoid all
on-disk footprint because we don't need it to begin with as this is
information that makes sense only while the server is running.

Also, why the single-backend limitation? One could imagine a shared
memory area indexed similarly to pgproc entries, that includes
auxiliary processes as much as backends, so as it can be possible to
get more memory footprints through SQL for more than one single
process at one moment in time. If each backend has its own area of
shmem to deal with, they could use a shared LWLock on the shmem area
with an extra spinlock while the context data is dumped into memory as
the copy is short-lived. Each one of them could save the information
in a DSM created only when a dump of the shmem is requested for a
given PID, for example.
--
Michael

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Smith 2024-10-22 07:28:18 Re: Pgoutput not capturing the generated columns
Previous Message Tom Lane 2024-10-22 06:41:16 Re: Fix C23 compiler warning