Re: BUG #18675: Postgres is not realasing memory causing OOM

From: Maciej Jaros <eccenux(at)gmail(dot)com>
To: Tomas Vondra <tomas(at)vondra(dot)me>
Cc: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: BUG #18675: Postgres is not realasing memory causing OOM
Date: 2024-10-31 12:21:54
Message-ID: CAOn6LZoQi+LyPLaf17qP11a5pyy5v+SNu8CAvf+4aTUwhD0O=w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

>> We terminate all queries running longer than 30 minutes....
> That just supports my speculation this is not the in-query memory leak
where we allocate memory in a memory context, because that'd be freed at
the end of a query. I'm not sure about what happens to memory allocated by
LLVM if a query gets interrupted because of a timeout. How often do queries
hit the 30-minute limit?

Very rarely. Aside from bad deployments, long-running query kills happen
maybe once a week or even once a month. It depends on the load (e.g. had
more last month due to the beginning of the school year).

I did some more JIT testing, and, at least in our case, JIT might be
problematic (an example at the end). I expected that JIT compiler would run
once and then bytecode could be reused at least for subsequent queries, at
least in the default configuration. However, testing proved this assumption
wrong. We do have the default JIT configuration, yet we observe that JIT is
10x slower for some queries and remains the same speed on repeat. The
slowest queries involve counts with EXISTS subqueries and similar queries
listing the top 25 rows. So, JIT is now off as you suggest.

I still have too little data to be sure if it also means problems in terms
of memory consumption. I'll get back to you once I have more stats to
compare memory with and without JIT...

But I still believe Postgres should take responsibility for the memory it
uses. I mean, I'm a dev too, so I understand the appeal of saying, "we are
using library X, and it's not our problem," but come on, someone chose
those default libraries and default configuration. It wasn't me, the user
of PGSQL. You (PG devs) should take responsibility for how you compile
queries and how you release memory. JVM does that, browsers do that. I
mean, if I close a browser tab, memory is released. If I create objects
that are serialized for an API response, that memory is released sometime
after the API call. Once I finalize a transaction in PG, I think memory
should be released too and that doesn't seem to happen. I understand this
isn't easy, but both JVM and most browser engines are written in C++, so
it's not impossible. Sure, JVM is a different monster, but PGSQL is also
using a programming language, optimizing compilation, creating and
releasing objects.

An example of how slow default JIT can be in our case:
-- test without JIT
SET jit_above_cost = 100000000; -- should not apply to queries
EXPLAIN ANALYZE select
from
"m6035".AuthorEntry this_
where
this_.bibliographicDatabaseId = 30
and lower(this_.base) like 'suwała%'
---...subqueries
limit 25;

-- forcing JIT
SET jit_above_cost = 1000; -- force JIT
EXPLAIN ANALYZE select
---...query exactly as above

Without JIT:

- Planning Time: 2.228 ms
- * Execution Time: 47*.993 ms

With JIT:

- Planning Time: 2.282 ms
- JIT:
- Functions: 61
- Options: Inlining true, Optimization true, Expressions true,
Deforming true
- Timing: Generation 3.750 ms, Inlining 18.327 ms, Optimization
309.607 ms, Emission 196.158 ms, Total 527.841 ms
- * Execution Time: 574*.460 ms

Repeating this yields similar results.

wt., 29 paź 2024 o 17:16 Tomas Vondra <tomas(at)vondra(dot)me> napisał(a):

>
>
> On 10/29/24 13:26, Maciej Jaros wrote:
> > Thanks, Tomas. That helped me understand the problem and the comments a
> > bit more. To answer some questions about our setup and possible causes
> > of the issues:
> >
> > * We are not using any non-standard extensions. We use PL/pgSQL in
> > some maintenance scripts, but that extension is built-in, so I guess
> > you didn't mean that.
> > * This is mostly a default configuration (aside from memory and CPU
> > adjustments). More specifically all of the JIT options are in their
> > default states (none are set).
>
> OK. Yesterday you posted this:
>
> ^ name ^ value ^
> | jit | on |
> | jit_above_cost | 100000 |
> | jit_debugging_support | off |
> | jit_dump_bitcode | off |
> | jit_expressions | on |
> | jit_inline_above_cost | 500000 |
> | jit_optimize_above_cost | 500000 |
> | jit_profiling_support | off |
> | jit_provider | llvmjit |
> | jit_tuple_deforming | on |
>
> Which means the JIT is enabled.
>
> > * For context about connections, we have 10 app servers (Tomcat, JDK
> > 17, Hibernate 5) connecting to the database in question, each with
> > around 20 active connections in practice (pooling, though some
> > schedules might add a bit). There is also a data warehouse with
> > pooling and should not exceed 20 connections. So, in practice, we
> > have around 250 connections, not the 500 we have in |
> > max_connections| setting. Also most of the connections are idle most
> > of the time. So at least our |max_connections| is quite
> > conservative, I think.
>
> Seems like that. You haven't shared any information about how much
> memory is used by individual backends, but it might be interesting to
> look at that, and check if the memory usage is high for some subset of
> backends (say, those for the warehouse).
>
> > * We terminate all queries running longer than 30 minutes. Typical
> > queries are below 1 second, and Java responses are mostly limited to
> > 20 seconds. Additionally, most queries have a limit of 25 (25 items
> > per page).
>
> That just supports my speculation this is not the in-query memory leak
> where we allocate memory in a memory context, because that'd be freed at
> the end of a query. I'm not sure about what happens to memory allocated
> by LLVM if a query gets interrupted because of a timeout. How often do
> queries hit the 30-minute limit?
>
> > * The application is in use from 8 am to 6-7 pm, and it is mostly idle
> > at night. There is some maintenance at night (including vacuum and
> > vacuumlo). RAM availability stays flat at night, and I would expect
> > it to drop at some point around 7-8 pm. RAM usage on separate app
> > servers does drop after hours.
> >
> > So, yes, that RAM usage is strange, and that’s why I reported it. It
> > doesn’t seem like this is a problem unique to us. I found questions
> > about memory usage on Stack Overflow, like this one, for example:
> > HowtolimitthememoryavailableforPostgreSQLserver <https://
> > stackoverflow.com/questions/28844170/how-to-limit-the-memory-that-is-
> > available-for-postgresql-server>. There is a comment there that seems to
> > describe what could be a bug (looks like a bug to me). Maybe not a bug-
> > bug, but definitely an unwanted behavior:
> >
> > Note that even if postgres logically releases memory it has
> > allocated, it may not be returned to operating system depending on
> > the |malloc()|/|free()| implementation of your execution
> > environment. That may result in multiple PostgreSQL processes
> > getting over the limit due use of hash aggregation as described
> > above and the memory is never released to OS even though PostgreSQL
> > isn't actually using it either. This happens because technically |
> > malloc()| may use |brk()| behind the scenes and releasing memory
> > back to OS is only possible only in some special cases.
> >
>
> That's not a bug, that's what glibc does for everything in user space.
> Yes, it can interfere with overcommit if you have the limit set too llow
> (but that's not your case), and most of the time it's not an issue
> thanks to virtual memory etc. It also should not lead to indefinite
> growth, the memory should be reused for future allocations.
>
> For the "memory limit", it's true we don't have a way to do that, but
> it's also not clear it'd actually help in any way. If you have a memory
> leak in the JIT code, that's completely outside our control - we don't
> even know how much memory LLVM allocated etc. so this would not be
> covered by the limit.
>
> > So, that comment led me to suggest adding some kind of process. I called
> > it a garbage collector, but maybe David is right; maybe that’s not
> > accurate. Anyway, that process, in my view, could try to actually
> > release memory to the system to prevent the OOM killer from doing its
> > bidding. Is that possible? I don’t know, don't know inner workings of
> > PG. I also don’t understand why calling |free| would not release memory.
> > I’m also not sure if that description of malloc/free is accurate, but it
> > does seem to align with what I’m seeing.
> >
>
> There are different ways to define garbage collection, but memory
> contexts could be seen as doing that. Of course, that's only "our" side,
> it has no impact on what happens in glibc. That's a different layer, we
> have no visibility into that.
>
> Anyway, people have already suggested you try disabling JIT by setting
>
> jit = off
>
> and see it that fixes the issue. If yes, that significantly narrows the
> area where the bug could be.
>
>
> regards
>
> --
> Tomas Vondra
>
>

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message PG Bug reporting form 2024-10-31 13:09:38 BUG #18680: [ECPG] heap-use-after-free (read)
Previous Message PG Bug reporting form 2024-10-30 23:02:57 BUG #18679: Planner issue with bitmap scan recheck on external TOAST