Enhancing Memory Context Statistics Reporting

From: Rahila Syed <rahilasyed90(at)gmail(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Enhancing Memory Context Statistics Reporting
Date: 2024-10-21 18:24:21
Message-ID: CAH2L28v8mc9HDt8QoSJ8TRmKau_8FM_HKS41NeO9-6ZAkuZKXw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

PostgreSQL provides following capabilities for reporting memory contexts
statistics.
1. pg_get_backend_memory_contexts(); [1]
2. pg_log_backend_memory_contexts(pid); [2]

[1] provides a view of memory context statistics for a local backend,
while [2] prints the memory context statistics of any backend or auxiliary
process to the PostgreSQL logs. Although [1] offers detailed statistics,
it is limited to the local backend, restricting its use to PostgreSQL
client backends only.
On the other hand, [2] provides the statistics for all backends but logs
them in a file,
which may not be convenient for quick access.

I propose enhancing memory context statistics reporting by combining these
capabilities and offering a view of memory statistics for all PostgreSQL
backends
and auxiliary processes.

Attached is a patch that implements this functionality. It introduces a SQL
function
that takes the PID of a backend as an argument, returning a set of records,
each containing statistics for a single memory context. The underlying C
function
sends a signal to the backend and waits for it to publish its memory
context statistics
before returning them to the user. The publishing backend copies these
statistics
during the next CHECK_FOR_INTERRUPTS call.

This approach facilitates on-demand publication of memory statistics
for a specific backend, rather than collecting them at regular intervals.
Since past memory context statistics may no longer be relevant,
there is little value in retaining historical data. Any collected
statistics
can be discarded once read by the client backend.

A fixed-size shared memory block, currently accommodating 30 records,
is used to store the statistics. This number was chosen arbitrarily,
as it covers all parent contexts at level 1 (i.e., direct children of the
top memory context)
based on my tests.
Further experiments are needed to determine the optimal number
for summarizing memory statistics.

Any additional statistics that exceed the shared memory capacity
are written to a file per backend in the PG_TEMP_FILES_DIR. The client
backend
first reads from the shared memory, and if necessary, retrieves the
remaining data from the file,
combining everything into a unified view. The files are cleaned up
automatically
if a backend crashes or during server restarts.

The statistics are reported in a breadth-first search order of the memory
context tree,
with parent contexts reported before their children. This provides a
cumulative summary
before diving into the details of each child context's consumption.

The rationale behind the shared memory chunk is to ensure that the
majority of contexts which are the direct children of TopMemoryContext,
fit into memory
This allows a client to request a summary of memory statistics,
which can be served from memory without the overhead of file access,
unless necessary.

A publishing backend signals waiting client backends using a condition
variable when it has finished writing its statistics to memory.
The client backend checks whether the statistics belong to the requested
backend.
If not, it continues waiting on the condition variable, timing out after 2
minutes.
This timeout is an arbitrary choice, and further work is required to
determine
a more practical value.

All backends use the same memory space to publish their statistics.
Before publishing, a backend checks whether the previous statistics have
been
successfully read by a client using a shared flag, "in_use."
This flag is set by the publishing backend and cleared by the client
backend once the data is read. If a backend cannot publish due to shared
memory being occupied, it exits the interrupt processing code,
and the client backend times out with a warning.

Please find below an example query to fetch memory contexts from the backend
with id '106114'. Second argument -'get_summary' is 'false',
indicating a request for statistics of all the contexts.

postgres=#
select * FROM pg_get_remote_backend_memory_contexts('116292', false) LIMIT
2;
-[ RECORD 1 ]-+----------------------
name | TopMemoryContext
ident |
type | AllocSet
path | {0}
total_bytes | 97696
total_nblocks | 5
free_bytes | 15376
free_chunks | 11
used_bytes | 82320
pid | 116292
-[ RECORD 2 ]-+----------------------
name | RowDescriptionContext
ident |
type | AllocSet
path | {0,1}
total_bytes | 8192
total_nblocks | 1
free_bytes | 6912
free_chunks | 0
used_bytes | 1280
pid | 116292

TODO:
1. Determine the behaviour when the statistics don't fit in one file.

*[1] **PostgreSQL: Re: Creating a function for exposing memory usage of
backend process
<https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.postgresql.org%2Fmessage-id%2F0a768ae1-1703-59c7-86cc-7068ff5e318c%2540oss.nttdata.com&data=05%7C02%7Csyedrahila%40microsoft.com%7C3b35e97c29cf4796042408dcee8a4dbb%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C638647525436604911%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C0%7C%7C%7C&sdata=cbO2DBP6IsgMPTEVFNh%2FKeq4IoK3MZvTpzKkCQzNPMo%3D&reserved=0>*

[2] *PostgreSQL: Re: Get memory contexts of an arbitrary backend process
<https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.postgresql.org%2Fmessage-id%2Fbea016ad-d1a7-f01d-a7e8-01106a1de77f%2540oss.nttdata.com&data=05%7C02%7Csyedrahila%40microsoft.com%7C3b35e97c29cf4796042408dcee8a4dbb%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C638647525436629740%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C0%7C%7C%7C&sdata=UCwkwg6kikVEf0oHf3%2BlliA%2FTUdMG%2F0cOiMta7fjPPk%3D&reserved=0>*

Thank you,
Rahila Syed

Attachment Content-Type Size
0001-Function-to-report-memory-context-stats-of-any-backe.patch application/octet-stream 33.8 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alexander Lakhin 2024-10-21 19:00:00 Re: race condition in pg_class
Previous Message Heikki Linnakangas 2024-10-21 17:32:33 Re: CSN snapshots in hot standby