Re: BUG #18675: Postgres is not realasing memory causing OOM

From:	Tomas Vondra <tomas(at)vondra(dot)me>
To:	Maciej Jaros <eccenux(at)gmail(dot)com>, "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject:	Re: BUG #18675: Postgres is not realasing memory causing OOM
Date:	2024-10-28 19:26:10
Message-ID:	ac7d412b-dd3e-4c6f-afb6-80d2f11bf833@vondra.me
Lists:	pgsql-bugs

On 10/28/24 19:07, Maciej Jaros wrote:
> David G. Johnston (28.10.2024 14:42):
>> On Monday, October 28, 2024, PG Bug reporting form
>> <noreply(at)postgresql(dot)org> wrote:
>>
>> The following bug has been logged on the website:
>>
>> Bug reference: 18675
>> Logged by: Maciej Jaros
>> Email address: eccenux(at)gmail(dot)com
>> PostgreSQL version: 16.4
>> Operating system: Ubuntu 22.04
>> Description:
>>
>>
>> or maybe
>> PostgreSQL should include garbage collection?
>>
>>
>> Garbage collection is typically used in relation to a programming
>> language feature to make writing applications in those languages
>> easier. Applications themselves don’t really implement garbage
>> collection. And C, the language PostgreSQL is written in, doesn’t
>> have garbage collection. To our knowledge, though, there are no
>> significant memory leaks in supported versions.
>>
>>
>> RAMforPG = shared_buffers + (temp_buffers + work_mem) *
>> max_connections;
>>
>>
>> The expression: work_mem * max_connections is incorrect. See the doc
>> for work_mem for how it is used.
>>
>> There is so much more info needed to conclude there is a bug here -
>> which there probably is not. Exploring the query and tuning the
>> system is better discussed on the -general mailing list.
>>
>> David J.
>>
>
> Could you share what would be the correct expression to calculate or at
> least estimate max RAM usage then? I've checked and haven't found
> anything in the docs. I've found that expression in user space. I know
> autovac might need to be accounted for, but as said we are not using it.
> How would this estimation of 20GB go to 50GB?
>

Unfortunately there's no universal formula, because it depends on what
queries you run. For example, a query that needs to do 10 sorts may need
to use 10 x work_mem, and so on. Yes, this is unfortunate, we'd like to
have a per-session memory limit, but we don't have that. So the only
recommendation is to set these limits conservatively, not too close to
the available memory limit.
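
To make the "10 sorts" point concrete, here is a rough sketch of how the
formula quoted above changes once each query is allowed more than one
work_mem-sized operation. The function name, the mem_nodes_per_query
parameter and all the numbers are made up for illustration; this is not an
upper bound either, since it ignores parallel workers, maintenance
operations and anything allocated outside our memory accounting.

```python
def estimate_peak_mb(shared_buffers_mb, temp_buffers_mb, work_mem_mb,
                     max_connections, mem_nodes_per_query=1):
    """Back-of-the-envelope estimate in MB (hypothetical helper).

    mem_nodes_per_query is an assumption: the number of sort/hash style
    operations in one query that may each use up to work_mem.
    """
    per_backend_mb = temp_buffers_mb + mem_nodes_per_query * work_mem_mb
    return shared_buffers_mb + per_backend_mb * max_connections


# Hypothetical settings: 8 GB shared_buffers, 8 MB temp_buffers,
# 64 MB work_mem, 100 connections.
one_node = estimate_peak_mb(8192, 8, 64, 100, mem_nodes_per_query=1)
ten_nodes = estimate_peak_mb(8192, 8, 64, 100, mem_nodes_per_query=10)
print(one_node, ten_nodes)
```

With the same settings, the estimate moves from about 15 GB to about 71 GB
just by assuming ten such operations per query, which is why the limits
should be set conservatively.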

Also, if you really found a memory leak, these formulas are pointless. A
memory leak is usually about "breaking" such limits, and we may not even
know about all memory that gets allocated (for external libraries).

> There just seems to be no limit on RAM usage, so it does seem like a
> memory leak. It just grows until there is no more RAM available and we
> restart the service. The operations and connections are the same
> (pooling on the Java side), and it just grows every day. It seems to be a
> memory leak. It doesn't seem to have an end.
>

The question is how you define a memory leak. All memory allocated by a
query (using "our" infrastructure) is tied to a "memory context" and
should be released at the end. It's possible for a query to allocate a
lot of memory, and perhaps not release it right away, but it should be
released at the end of the query. I'm not going to rule out a bug that
breaks this (e.g. by using a long-lived memory context), but it's very
unlikely we'd not find that pretty soon.

Also, the memory leak seems to be permanent - in your chart the memory
usage grows over a week. Presumably your queries are shorter than that,
so that's not consistent with this type of memory leak.

What I think might be more likely is that you're using something that
allocates memory by directly calling malloc() - say, an extension using
some external library, etc. These allocations are completely outside our
control, and if not freed explicitly, would be a "normal" memory leak.
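
The difference between the two kinds of allocation can be shown with a toy
model. This is only a sketch of the idea, not how the real memory contexts
are implemented: the MemoryContext class, run_query() and the untracked
list (standing in for direct malloc() calls made by an external library)
are all invented for illustration.

```python
class MemoryContext:
    """Toy stand-in for a memory context: it tracks every chunk it hands out."""

    def __init__(self, name):
        self.name = name
        self.chunks = []

    def alloc(self, size):
        chunk = bytearray(size)
        self.chunks.append(chunk)
        return chunk

    def allocated(self):
        return sum(len(c) for c in self.chunks)

    def reset(self):
        # Everything allocated through the context is released in one step.
        self.chunks.clear()


# Stands in for memory obtained by calling malloc() directly, which the
# context knows nothing about.
untracked = []


def run_query(ctx, leaky_library=False):
    ctx.alloc(1024)
    ctx.alloc(4096)
    if leaky_library:
        untracked.append(bytearray(512))  # never freed by anyone
    peak = ctx.allocated()
    ctx.reset()  # end of query: context memory goes away
    return peak


ctx = MemoryContext("per-query")
for _ in range(3):
    run_query(ctx, leaky_library=True)
print(ctx.allocated(), sum(len(b) for b in untracked))
```

After three queries the context is back to zero, while the untracked
allocations keep accumulating. That is the pattern that would survive
across queries and grow for days.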

This is why people were asking about JIT earlier. JIT relies on LLVM,
which allocates memory directly, and so a bug in LLVM could leak memory
like this. The first thing to do is to disable JIT, and see if the
memory leak goes away. If it does, then we know it's either a bug in
LLVM, or in how we use it (e.g. we may not trigger cleanup).
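
To tell whether the growth goes away with JIT disabled (the jit setting
turned off), it helps to record the private memory of the backend processes
before and after the change. A minimal sketch, assuming Linux and the
RssAnon field of /proc/<pid>/status, which excludes shared memory such as
shared_buffers; the helper names and the sample values are made up.

```python
import re


def anon_rss_kb(status_text):
    """Return the RssAnon value in kB from /proc/<pid>/status text, or None."""
    match = re.search(r"^RssAnon:\s+(\d+)\s+kB$", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def read_status(pid):
    """Read the status file of a process (Linux only)."""
    with open(f"/proc/{pid}/status") as f:
        return f.read()


def growth_kb(samples):
    """Difference between the last and first sample of a series, in kB."""
    return samples[-1] - samples[0]


# Made-up excerpt of a status file, for illustration only.
sample = "Name:\tpostgres\nVmRSS:\t  204800 kB\nRssAnon:\t   51200 kB\nRssShmem:\t  153600 kB\n"
print(anon_rss_kb(sample))
```

Sampling the same long-lived backend (for example one held open by the
connection pool) at intervals and comparing growth_kb() with JIT on and
off would show whether the leak follows JIT.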

But all this is just a wild guess. We don't even know if you're using
some other extensions which might also leak memory, etc.

regards

--
Tomas Vondra

In response to

Re: BUG #18675: Postgres is not realasing memory causing OOM at 2024-10-28 18:07:45 from Maciej Jaros

Responses

Re: BUG #18675: Postgres is not realasing memory causing OOM at 2024-10-29 12:26:09 from Maciej Jaros