Re: Add the ability to limit the amount of memory that can be allocated to backends.

From: "Anton A(dot) Melnikov" <a(dot)melnikov(at)postgrespro(dot)ru>
To: Tomas Vondra <tomas(at)vondra(dot)me>, James Hunter <james(dot)hunter(dot)pg(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Andrei Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>, Stephen Frost <sfrost(at)snowman(dot)net>, reid(dot)thompson(at)crunchydata(dot)com, Arne Roland <A(dot)Roland(at)index(dot)de>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, vignesh C <vignesh21(at)gmail(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>, "stephen(dot)frost" <stephen(dot)frost(at)crunchydata(dot)com>
Subject: Re: Add the ability to limit the amount of memory that can be allocated to backends.
Date: 2024-12-28 12:36:37
Message-ID: 600384a7-09bc-41f8-a38c-2f3d2195054b@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi!

On 28.12.2024 04:48, Tomas Vondra wrote:
> On 12/27/24 20:14, James Hunter wrote:
>> Reviving this thread, because I am thinking about something related --
>> please ignore the "On Fri, Dec 27, 2024" date, this seems to be an
>> artifact of me re-sending the message, from the list archive. The
>> original message was from January 28, 2024.
>>
>> On Fri, Dec 27, 2024 at 11:02 AM Tomas Vondra
>> <tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
>>>
>>> Firstly, I agree with the goal of having a way to account for memory
>>> used by the backends, and also ability to enforce some sort of limit.
>>> It's difficult to track the memory at the OS level (interpreting RSS
>>> values is not trivial), and work_mem is not sufficient to enforce a
>>> backend-level limit, not even talking about a global limit.
>>>
>>> But as I said earlier, it seems quite strange to start by introducing
>>> some sort of global limit, combining memory for all backends. I do
>>> understand that the intent is to have such global limit in order to
>>> prevent issues with the OOM killer and/or not to interfere with other
>>> stuff running on the same machine. And while I'm not saying we should
>>> not have such limit, every time I wished to have a memory limit it was a
>>> backend-level one. Ideally a workmem-like limit that would "adjust" the
>>> work_mem values used by the optimizer (but that's not what this patch
>>> aims to do), or at least a backstop in case something goes wrong (say, a
>>> memory leak, OLTP application issuing complex queries, etc.).
>>
>> I think what Tomas suggests is the right strategy.

I'm also interested in this topic. And agreed that it's best to move from the limit
for a separate backend to the global one. In more details let me suggest
the following steps or parts:
1) realize memory limitation for a separate backend independent from the work_mem GUC;
2) add workmem-like limit that would "adjust" the work_mem values used by
the optimize as Thomas suggested;
3) add global limit for all backends.

As for p.1 there is a patch that was originally suggested by my colleague
Maxim Orlov <orlovmg(at)gmail(dot)com> and which i modified for the current master.
This patch introduces the only max_backend_memory GUC that specifies
the maximum amount of memory that can be allocated to a backend.
Zero value means no limit.
If the allocated memory size is exceeded, a standard "out of memory" error will be issued.
Also the patch introduces the pg_get_backend_memory_contexts_total_bytes() function,
which allows to know how many bytes have already been allocated
to the process in contexts. And in the build with asserts it adds
the pg_get_backend_memory_allocation_stats() function that allows
to get additional information about memory allocations for debug purposes.

>> This strategy solves the ongoing problem of how to set work_mem, if
>> some queries have lots of operators and others don't -- now we just
>> set backend_work_mem, as a limit on the entire query's total work_mem.
>> And a bit of integration with the optimizer will allow us to
>> distribute the total backend_work_mem to individual execution nodes,
>> with the goal of minimizing spilling, without exceeding the
>> backend_work_mem limit.

As for p.2 maybe one can set a maximum number of parallel sort or
hash table operations before writing to disk instead of absolute
value in the work_mem GUC? E.g. introduce а max_query_operations GUC
or a variable in such a way that old work_mem will be equal
to max_backend_memory divided by max_query_operations.

What do you think about such approach?

With the best wishes,

--
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment Content-Type Size
0001-v1-limit-backend-heap-memory-allocation.patch text/x-patch 27.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2024-12-28 14:57:44 Re: Add the ability to limit the amount of memory that can be allocated to backends.
Previous Message jian he 2024-12-28 10:34:27 Re: Re: proposal: schema variables