From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Jakub Wartak <jakub(dot)wartak(at)enterprisedb(dot)com>, Andres Freund <andres(at)anarazel(dot)de>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: memory leak in trigger handling (since PG12)
Date: 2023-05-24 13:17:52
Message-ID: 9574e4b6-3334-777b-4287-29d81151963a@enterprisedb.com
Lists: pgsql-hackers
On 5/24/23 10:19, Jakub Wartak wrote:
> Hi, just two cents:
>
> On Tue, May 23, 2023 at 8:01 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
>>
>> Hi,
>>
>> On 2023-05-23 13:28:30 -0400, Tom Lane wrote:
>>> Andres Freund <andres(at)anarazel(dot)de> writes:
>>>> Could it help to have a mode where the executor shutdown hook checks how much
>>>> memory is allocated in ExecutorState and warns if it's too much?
>>>
>>> It'd be very hard to set a limit for what's "too much", since the amount
>>> of stuff created initially will depend on the plan size.
>>
>> I was thinking of some limit that should really never be reached outside of a
>> leak or work_mem based allocations, say 2GB or so.
>
> RE: instrumentation subthread:
> if that helps, the technique below can work reasonably well on regular
> binaries for end users (given debug symbols are installed), so maybe
> we don't need that much added infrastructure to see the hot code path:
>
> # size is arg0 of palloc, passed in RDI on x86_64
> perf probe -x /path/to/postgres 'palloc' 'size=%di:u64'
> # cannot record longer than ~3s, huge overhead (~3s = ~2GB)
> perf record -avg --call-graph dwarf -e probe_postgres:palloc -aR -p <pid> sleep 3
>
> it produces:
> 50.27% (563d0e380670) size=24
> |
> ---palloc
> bms_copy
> ExecUpdateLockMode
> ExecBRUpdateTriggers
> ExecUpdate
> [..]
>
> 49.73% (563d0e380670) size=16
> |
> ---palloc
> bms_copy
> RelationGetIndexAttrBitmap
> ExecUpdateLockMode
> ExecBRUpdateTriggers
> ExecUpdate
> [..]
>
> Now we know that those small palloc() calls are the culprit, but Tomas
> and I didn't know that at the time. The problem here is that palloc()
> cannot tell us - from its own arguments - which MemoryContext the
> allocation is destined for. That is problematic for perf: on RHEL8 I
> was not able to create an uprobe capable of reading a global variable
> (CurrentMemoryContext) at that time.
>
I think there are a couple even bigger issues:
(a) There are other functions that allocate memory - repalloc, palloc0,
MemoryContextAlloc, and so on. But presumably we can trace all of them
at once? We'd have to trace context resets/deletes too.
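For (a), a shell sketch of setting one uprobe per allocation entry point (the binary path and PID are placeholders; it assumes the x86-64 SysV ABI, where arg0 is in %rdi and arg1 in %rsi, so `size` is %di for palloc/palloc0/palloc_extended but %si for MemoryContextAlloc/repalloc, whose first argument is a context or old chunk):

```shell
# Sketch: probe every common allocation entry point in one go.
# Requires root and a postgres binary with debug symbols.
BIN=/path/to/postgres

# size is the first argument of these:
for fn in palloc palloc0 palloc_extended; do
    perf probe -x "$BIN" "$fn" 'size=%di:u64'
done

# size is the second argument of these:
for fn in MemoryContextAlloc MemoryContextAllocZero repalloc; do
    perf probe -x "$BIN" "$fn" 'size=%si:u64'
done

# resets/deletes, so bulk frees show up too:
perf probe -x "$BIN" MemoryContextReset
perf probe -x "$BIN" MemoryContextDelete

# one recording session catches all of the probes at once:
perf record --call-graph dwarf -e 'probe_postgres:*' -p "$PID" -- sleep 3
```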
(b) This doesn't tell us if/when the allocated chunks get freed - even
with a fix, we'll still do exactly the same number of allocations,
except that we'll free the memory shortly afterwards. I wonder if we
could print a bit more info for each probe, to match the alloc/free
requests.
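For (b), perf can capture the chunk pointer returned by an allocation (via a return probe) and the pointer handed to pfree, which is enough to pair allocations with frees in post-processing. A sketch, with the probe name `palloc_ret` invented here:

```shell
# Sketch: record alloc/free pairs; requires root and debug symbols.
BIN=/path/to/postgres
perf probe -x "$BIN" 'palloc_ret=palloc%return ptr=$retval'
perf probe -x "$BIN" 'pfree' 'ptr=%di:u64'
perf record -e probe_postgres:palloc_ret -e probe_postgres:pfree -p "$PID" -- sleep 3
# perf script prints one line per event; any ptr that is allocated but
# never pfree'd within the window is a leak candidate.
perf script
```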
> Additionally, what was even more frustrating when diagnosing that case
> on the customer's system was that such OOMs were crashing other
> PostgreSQL clusters on the same OS. Even knowing the exact guilty
> statement, it was impossible to limit the RSS usage of that particular
> backend. So what you are proposing makes a lot of sense. It also got
> me thinking about a safety-net GUC -
> debug_query_limit_backend_memory=X MB - that would inject
> setrlimit(RLIMIT_DATA) through an external extension via hook(s) and
> unset it later. However, the man page states that RLIMIT_DATA covers
> mmap() only since Linux 4.7, so it is future-proof but won't work e.g.
> on RHEL7 - maybe that's still good enough? Or maybe hack palloc() a
> little, but that probably has too much overhead, right? (just thinking
> out loud)
>
Not sure about setting a hard limit - that seems pretty tricky and may
easily backfire. It's already possible to set such a memory limit using
e.g. cgroups, or even better, to use VMs to isolate the instances.
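For a one-off investigation, something close to the quoted setrlimit(RLIMIT_DATA) idea can already be applied from the outside with prlimit(1) from util-linux, no extension required (the PID and the 2GB figure are just examples):

```shell
# Cap the data segment of one running backend at 2GB; on Linux >= 4.7
# RLIMIT_DATA also covers mmap(), so this catches malloc'ed memory too.
prlimit --pid "$PID" --data=2147483648
```

Once the limit is hit, malloc() starts failing and the backend raises an ordinary "out of memory" error - dumping memory context statistics to the server log - instead of being taken out by the OOM killer.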
I certainly agree it's annoying that when OOM hits, we end up with no
information about what used the memory. Maybe we could have a threshold
that triggers a call to MemoryContextStats? That way we'd have at least
some memory usage info in the log in case the OOM killer intervenes.
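Until something like that exists, the same report can be pulled from a live backend with a debugger, assuming symbols are available:

```shell
# Dump per-context memory usage of a running backend; MemoryContextStats
# writes to the process's stderr, i.e. the server log.
gdb -batch -p "$PID" -ex 'call (void) MemoryContextStats(TopMemoryContext)' -ex detach
```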
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company