From: | Michael Banck <mbanck(at)gmx(dot)net> |
---|---|
To: | Kirk Wolak <wolakk(at)gmail(dot)com> |
Cc: | Daniel Gustafsson <daniel(at)yesql(dot)se>, Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> |
Subject: | Re: Oom on temp (un-analyzed table caused by JIT) V16.1 [ NOT Fixed ] |
Date: | 2024-02-22 21:49:20 |
Message-ID: | 65d7c161.050a0220.68515.f205@mx.google.com |
Lists: | pgsql-hackers |
Hi,
On Wed, Jan 24, 2024 at 02:50:52PM -0500, Kirk Wolak wrote:
> On Mon, Jan 22, 2024 at 1:30 AM Kirk Wolak <wolakk(at)gmail(dot)com> wrote:
> > On Fri, Jan 19, 2024 at 7:03 PM Daniel Gustafsson <daniel(at)yesql(dot)se> wrote:
> >> > On 19 Jan 2024, at 23:09, Kirk Wolak <wolakk(at)gmail(dot)com> wrote:
> > Thank you, that made it possible to build and run...
> > UNFORTUNATELY this has a CLEAR memory leak (visible in htop)
> > I am watching it already consuming 6% of my system memory.
> >
> Daniel,
> In the previous email, I made note that once the JIT was enabled, the
> problem exists in 17Devel.
> I re-included my script, which forced the JIT to be used...
>
> I attached an updated script that forced the settings.
> But this is still leaking memory (outside of the
> pg_backend_memory_context() calls).
> Probably because it's at the LLVM level? And it does NOT happen from
> planning/opening the query. It appears I have to fetch the rows to
> see the problem.
I had a look at this (and blogged about it here[1]) and was also
wondering what was going on with 17devel and the recent back-branch
releases, because I could also reproduce those continuing memory leaks.
Adding some debug logging when llvm_inline_reset_caches() is called
solves the mystery: because you are calling a function, the fix that is
in 17devel and the back-branch releases does not apply, and
llvm_inline_reset_caches() is only called after the function returns
(presumably because llvm_jit_context_in_use_count is greater than zero
while the function runs, so execution never reaches the call site of
llvm_inline_reset_caches()).
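To make the reasoning above concrete, here is a small standalone model of the counter logic, not the actual PostgreSQL source. The struct, the function names and the threshold MODEL_REUSE_MAX are hypothetical and chosen for illustration; the model also assumes that the query calling the function holds its own JIT context for as long as the function runs, which is the presumed cause described above.

```c
#include <stddef.h>

/* Hypothetical threshold: how many contexts may be created before a reset
 * is attempted. The real value in PostgreSQL is not taken from this thread. */
#define MODEL_REUSE_MAX 3

/* Simplified stand-in for the backend's JIT bookkeeping (illustrative only). */
typedef struct JitModel
{
    size_t in_use_count;  /* JIT contexts currently alive */
    size_t reuse_count;   /* contexts created since the last reset */
    size_t cache_resets;  /* how often the caches were reset */
} JitModel;

/* Creating a context first tries to recycle the shared state. The reset is
 * skipped while any other context is alive or the threshold is not exceeded. */
static void
model_create_context(JitModel *m)
{
    if (m->in_use_count > 0 || m->reuse_count <= MODEL_REUSE_MAX)
        m->reuse_count++;
    else
    {
        m->cache_resets++;  /* models the llvm_inline_reset_caches() call */
        m->reuse_count = 0;
    }
    m->in_use_count++;
}

static void
model_release_context(JitModel *m)
{
    m->in_use_count--;
}

/* n JIT-compiled queries run back to back with no outer context held,
 * which is the assumed situation for queries in a DO loop. */
static size_t
model_run_top_level(size_t n)
{
    JitModel m = {0, 0, 0};

    for (size_t i = 0; i < n; i++)
    {
        model_create_context(&m);
        model_release_context(&m);
    }
    return m.cache_resets;
}

/* The same n queries, but run while an outer (calling) query keeps its own
 * context alive, which is the assumed situation inside a function call. */
static size_t
model_run_inside_function(size_t n)
{
    JitModel m = {0, 0, 0};

    model_create_context(&m);       /* context of the calling query */
    for (size_t i = 0; i < n; i++)
    {
        model_create_context(&m);
        model_release_context(&m);
    }
    model_release_context(&m);      /* function has returned */
    return m.cache_resets;
}
```

In this model the top-level variant resets the caches periodically, while the in-function variant never does until the outer context is released, so memory held by the caches can only grow for the duration of the call.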
If you instead run your SQL in a DO loop (as in the blog post) and not
as a PL/pgSQL function, you should see that it no longer leaks. This
might be obvious to some (and Andres mentioned it in
https://www.postgresql.org/message-id/20210421002056.gjd6rpe6toumiqd6%40alap3.anarazel.de)
but it took me a while to figure out.
Michael