Re: Trim the heap free memory

From: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
To: shawn wang <shawn(dot)wang(dot)pg(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Trim the heap free memory
Date: 2024-08-26 11:04:49
Message-ID: CAExHW5uDqsh2u8r52YGzT+PguFLh8dsZZrpXoJ9HQa7AxAkQ4Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Shawn,
It will be good to document usage of this function. Please add
document changes in your patch. We need to document the impact of this
function so that users can judiciously decide whether or not to use
this function and under what conditions. Also they would know what to
expect when they use this function.

Running it after a query finishes is one thing but that can't be
guaranteed because of the asynchronous nature of signal handlers.
malloc_trim() may be called while a query is being executed. We need
to assess that impact as well.

Can you please share some numbers - TPS, latency etc. with and without
this function invoked during a benchmark run?

--
Best Wishes,
Ashutosh Bapat

On Sat, Aug 24, 2024 at 8:12 AM shawn wang <shawn(dot)wang(dot)pg(at)gmail(dot)com> wrote:
>
> Hi Ashutosh, thank you for your response.
> Firstly, the purpose of caching memory in malloc is for performance, so when we execute malloc_trim(), it will affect the efficiency of memory usage in the subsequent operation. Secondly, the function of malloc_trim() is to lock and traverse the bins, then execute madvise on the memory that can be released. When there is a lot of memory in the bins, the traversal time will also increase. I once placed malloc_trim() to execute at the end of each query, which resulted in a 20% performance drop. Therefore, I use it as such a function. The new v2 patch has included the omitted code.
>
> Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> 于2024年8月23日周五 20:02写道:
>>
>> Hi Shawn,
>>
>>
>> On Fri, Aug 23, 2024 at 2:24 PM shawn wang <shawn(dot)wang(dot)pg(at)gmail(dot)com> wrote:
>> >
>> > Hi hackers,
>> >
>> > Currently, all processes in PostgreSQL actually use malloc to allocate and free memory. In the case of long connections where business queries are executed over extended periods, the distribution of memory can become extremely complex.
>> >
>> > Under certain circumstances, a common issue in memory usage due to the caching strategy of malloc may arise: even if memory is released through the free function, it may not be returned to the OS in a timely manner. This can lead to high system memory usage, affecting performance and the operation of other applications, and may even result in Out-Of-Memory (OOM) errors.
>> >
>> > To address this issue, I have developed a new function called pg_trim_backend_heap_free_memory, based on the existing pg_log_backend_memory_contexts function. This function triggers the specified process to execute the malloc_trim operation by sending signals, thereby releasing as much unreturned memory to the operating system as possible. This not only helps to optimize memory usage but can also significantly enhance system performance under memory pressure.
>> >
>> > Here is an example of using the pg_trim_backend_heap_free_memory function to demonstrate its effect:
>> >>
>> >> CREATE OR REPLACE FUNCTION public.partition_create(schemaname character varying, numberofpartition integer)
>> >> RETURNS integer
>> >> LANGUAGE plpgsql
>> >> AS $function$
>> >> declare
>> >> currentTableId integer;
>> >> currentSchemaName varchar(100);
>> >> currentTableName varchar(100);
>> >> begin
>> >> execute 'create schema ' || schemaname;
>> >> execute 'create table ' || schemaname || '.' || schemaname || 'hashtable (p1 text, p2 text, p3 text, p4 int, p5 int, p6 int, p7 int, p8 text, p9 name, p10 varchar, p11 text, p12 text, p13 text) PARTITION BY HASH(p1);';
>> >> currentTableId := 1;
>> >> loop
>> >> currentTableName := schemaname || '.' || schemaname || 'hashtable' || ltrim(currentTableId::varchar(10));
>> >> execute 'create table ' || currentTableName || ' PARTITION OF ' || schemaname || '.' || schemaname || 'hashtable' || ' FOR VALUES WITH(MODULUS ' || numberofpartition || ', REMAINDER ' || currentTableId - 1 || ')';
>> >> currentTableId := currentTableId + 1;
>> >> if (currentTableId > numberofpartition) then exit; end if;
>> >> end loop;
>> >> return currentTableId - 1;
>> >> END $function$;
>> >>
>> >> select public.partition_create('test3', 5000);
>> >> select public.partition_create('test4', 5000);
>> >> select count(*) from test4.test4hashtable a, test3.test3hashtable b where a.p1=b.p1;
>> >
>> > You are now about to see the memory size of the process executing the query.
>> >>
>> >> postgres 68673 1.2 0.0 610456 124768 ? Ss 08:25 0:01 postgres: postgres postgres [local] idle
>> >> Size: 89600 kB
>> >> KernelPageSize: 4 kB
>> >> MMUPageSize: 4 kB
>> >> Rss: 51332 kB
>> >> Pss: 51332 kB
>> >>
>> >> 02b65000-082e5000 rw-p 00000000 00:00 0 [heap]
>> >
>> >
>> >
>> > After use pg_trim_backend_heap_free_memory, you will see:
>> >>
>> >> postgres=# select pg_trim_backend_heap_free_memory(pg_backend_pid());
>> >> 2024-08-23 08:27:53.958 UTC [68673] LOG: trimming heap free memory of PID 68673
>> >> pg_trim_backend_heap_free_memory
>> >> ----------------------------------
>> >> t
>> >> (1 row)
>> >> 02b65000-082e5000 rw-p 00000000 00:00 0 [heap]
>> >> Size: 89600 kB
>> >> KernelPageSize: 4 kB
>> >> MMUPageSize: 4 kB
>> >> Rss: 4888 kB
>> >> Pss: 4888 kB
>> >>
>> >> postgres 68673 1.2 0.0 610456 75244 ? Ss 08:26 0:01 postgres: postgres postgres [local] idle
>> >
>> >
>> > Looking forward to your feedback,
>> Looks useful.
>>
>> How much time does malloc_trim() take to finish? Does it affect the
>> current database activity in that backend? It may be good to see
>> effect of this function by firing the function on random backends
>> while the query is running through pgbench.
>>
>> In the patch I don't see definitions of
>> ProcessTrimHeapFreeMemoryInterrupt() and
>> HandleTrimHeapFreeMemoryInterrupt(). Am I missing something?
>>
>> --
>> Best Wishes,
>> Ashutosh Bapat

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2024-08-26 11:05:02 Re: Introduce XID age and inactive timeout based replication slot invalidation
Previous Message Andy Fan 2024-08-26 11:01:23 Re: New function normal_rand_array function to contrib/tablefunc.