Re: Enhancing Memory Context Statistics Reporting

From: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
To: Rahila Syed <rahilasyed90(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Enhancing Memory Context Statistics Reporting
Date: 2024-10-25 00:21:52
Message-ID: 2c54ef2a67a3da3a8bc15428ed85e58a@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2024-10-24 14:59, Rahila Syed wrote:
> Hi Torikoshia,
>
> Thank you for reviewing the patch!
>
> On Wed, Oct 23, 2024 at 9:28 AM torikoshia
> <torikoshia(at)oss(dot)nttdata(dot)com> wrote:
>
>> On 2024-10-22 03:24, Rahila Syed wrote:
>>> Hi,
>>>
>>> PostgreSQL provides following capabilities for reporting memory
>>> contexts statistics.
>>> 1. pg_get_backend_memory_contexts(); [1]
>>> 2. pg_log_backend_memory_contexts(pid); [2]
>>>
>>> [1] provides a view of memory context statistics for a local
>> backend,
>>> while [2] prints the memory context statistics of any backend or
>>> auxiliary
>>> process to the PostgreSQL logs. Although [1] offers detailed
>>> statistics,
>>> it is limited to the local backend, restricting its use to
>> PostgreSQL
>>> client backends only.
>>> On the other hand, [2] provides the statistics for all backends
>> but
>>> logs them in a file,
>>> which may not be convenient for quick access.
>>>
>>> I propose enhancing memory context statistics reporting by
>> combining
>>> these
>>> capabilities and offering a view of memory statistics for all
>>> PostgreSQL backends
>>> and auxiliary processes.
>>
>> Thanks for working on this!
>>
>> I originally tried to develop something like your proposal in [2],
>> but
>> there were some difficulties and settled down to implement
>> pg_log_backend_memory_contexts().
>
> Yes. I am revisiting this problem :)
>
>>> Attached is a patch that implements this functionality. It
>> introduces
>>> a SQL function
>>> that takes the PID of a backend as an argument, returning a set of
>>> records,
>>> each containing statistics for a single memory context. The
>> underlying
>>> C function
>>> sends a signal to the backend and waits for it to publish its
>> memory
>>> context statistics
>>> before returning them to the user. The publishing backend copies
>>> these statistics
>>> during the next CHECK_FOR_INTERRUPTS call.
>>
>> I remember waiting for dumping memory contexts stats could cause
>> trouble
>> considering some erroneous cases.
>>
>> For example, just after the target process finished dumping stats,
>> pg_get_remote_backend_memory_contexts() caller is terminated before
>> reading the stats, calling pg_get_remote_backend_memory_contexts()
>> has
>> no response any more:
>>
>> [session1]$ psql
>> (40699)=#
>>
>> $ kill -s SIGSTOP 40699
>>
>> [session2] psql
>> (40866)=# select * FROM
>> pg_get_remote_backend_memory_contexts('40699', false); -- waiting
>>
>> $ kill -s SIGSTOP 40866
>>
>> $ kill -s SIGCONT 40699
>>
>> [session3] psql
>> (47656) $ select pg_terminate_backend(40866);
>>
>> $ kill -s SIGCONT 40866 -- session2 terminated
>>
>> [session3] (47656)=# select * FROM
>> pg_get_remote_backend_memory_contexts('47656', false); -- no
>> response
>>
>> It seems the reason is memCtxState->in_use is now and
>> memCtxState->proc_id is 40699.
>> We can continue to use pg_get_remote_backend_memory_contexts() after
>>
>> specifying 40699, but it'd be hard to understand for users.
>
> Thanks for testing and reporting. While I am not able to reproduce
> this problem,
> I think this may be happening because the requesting backend/caller is
> terminated
> before it gets a chance to mark memCtxState->in_use as false.

Yeah, when I attached a debugger to 47656 when it was waiting on
pg_get_remote_backend_memory_contexts('47656', false),
memCtxState->in_use was true as you suspected:

(lldb) p memCtxState->in_use
(bool) $1 = true
(lldb) p memCtxState->proc_id
(int) $2 = 40699
(lldb) p pid
(int) $3 = 47656

> In this case memCtxState->in_use should be marked as
> 'false' possibly during the processing of ProcDiePending in
> ProcessInterrupts().
>
>>> This approach facilitates on-demand publication of memory
>> statistics
>>> for a specific backend, rather than collecting them at regular
>>> intervals.
>>> Since past memory context statistics may no longer be relevant,
>>> there is little value in retaining historical data. Any collected
>>> statistics
>>> can be discarded once read by the client backend.
>>>
>>> A fixed-size shared memory block, currently accommodating 30
>> records,
>>> is used to store the statistics. This number was chosen
>> arbitrarily,
>>> as it covers all parent contexts at level 1 (i.e., direct
>> children of
>>> the top memory context)
>>> based on my tests.
>>> Further experiments are needed to determine the optimal number
>>> for summarizing memory statistics.
>>>
>>> Any additional statistics that exceed the shared memory capacity
>>> are written to a file per backend in the PG_TEMP_FILES_DIR. The
>> client
>>> backend
>>> first reads from the shared memory, and if necessary, retrieves
>> the
>>> remaining data from the file,
>>> combining everything into a unified view. The files are cleaned up
>>> automatically
>>> if a backend crashes or during server restarts.
>>>
>>> The statistics are reported in a breadth-first search order of the
>>> memory context tree,
>>> with parent contexts reported before their children. This
>> provides a
>>> cumulative summary
>>> before diving into the details of each child context's
>> consumption.
>>>
>>> The rationale behind the shared memory chunk is to ensure that the
>>> majority of contexts which are the direct children of
>>> TopMemoryContext,
>>> fit into memory
>>> This allows a client to request a summary of memory statistics,
>>> which can be served from memory without the overhead of file
>> access,
>>> unless necessary.
>>>
>>> A publishing backend signals waiting client backends using a
>> condition
>>>
>>> variable when it has finished writing its statistics to memory.
>>> The client backend checks whether the statistics belong to the
>>> requested backend.
>>> If not, it continues waiting on the condition variable, timing out
>>> after 2 minutes.
>>> This timeout is an arbitrary choice, and further work is required
>> to
>>> determine
>>> a more practical value.
>>>
>>> All backends use the same memory space to publish their
>> statistics.
>>> Before publishing, a backend checks whether the previous
>> statistics
>>> have been
>>> successfully read by a client using a shared flag, "in_use."
>>> This flag is set by the publishing backend and cleared by the
>> client
>>> backend once the data is read. If a backend cannot publish due to
>>> shared
>>> memory being occupied, it exits the interrupt processing code,
>>> and the client backend times out with a warning.
>>>
>>> Please find below an example query to fetch memory contexts from
>> the
>>> backend
>>> with id '106114'. Second argument -'get_summary' is 'false',
>>> indicating a request for statistics of all the contexts.
>>>
>>> postgres=#
>>> select * FROM pg_get_remote_backend_memory_contexts('116292',
>> false)
>>> LIMIT 2;
>>> -[ RECORD 1 ]-+----------------------
>>> name | TopMemoryContext
>>> ident |
>>> type | AllocSet
>>> path | {0}
>>> total_bytes | 97696
>>> total_nblocks | 5
>>> free_bytes | 15376
>>> free_chunks | 11
>>> used_bytes | 82320
>>> pid | 116292
>>> -[ RECORD 2 ]-+----------------------
>>> name | RowDescriptionContext
>>> ident |
>>> type | AllocSet
>>> path | {0,1}
>>> total_bytes | 8192
>>> total_nblocks | 1
>>> free_bytes | 6912
>>> free_chunks | 0
>>> used_bytes | 1280
>>> pid | 116292
>>
>> 32d3ed8165f821f introduced 1-based path to
>> pg_backend_memory_contexts,
>> but pg_get_remote_backend_memory_contexts() seems to have 0-base
>> path.
>
> Right. I will change it to match this commit.
>
>> pg_backend_memory_contexts has "level" column, but
>> pg_get_remote_backend_memory_contexts doesn't.
>>
>> Are there any reasons for these?
>
> No particular reason, I can add this column as well.
>
> Thank you,
> Rahila Syed

--
Regards,

--
Atsushi Torikoshi
Seconded from NTT DATA GROUP CORPORATION to SRA OSS K.K.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Paul Jungwirth 2024-10-25 00:22:32 Re: SQL:2011 application time
Previous Message Greg Sabino Mullane 2024-10-25 00:13:28 Re: Changing the default random_page_cost value