Re: excessive amounts of consumed memory (RSS), triggering OOM killer

From: Tomas Vondra <tv(at)fuzzy(dot)cz>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: excessive amounts of consumed memory (RSS), triggering OOM killer
Date: 2014-12-02 09:59:14
Message-ID: e010519fbe83b1331ee0dfcb122a616a@fuzzy.cz
Lists: pgsql-hackers

On 2014-12-02 02:52, Tom Lane wrote:
> Tomas Vondra <tv(at)fuzzy(dot)cz> writes:
>> On 2.12.2014 01:33, Tom Lane wrote:
>>> What I suspect you're looking at here is the detritus of creation of
>>> a huge number of memory contexts. mcxt.c keeps its own state about
>>> existing contexts in TopMemoryContext. So, if we posit that those
>>> contexts weren't real small, there's certainly room to believe that
>>> there was some major memory bloat going on recently.
>
>> Aha! MemoryContextCreate allocates the memory for the new context from
>> TopMemoryContext explicitly, so that it survives resets of the parent
>> context. Is that what you had in mind by keeping state about existing
>> contexts?
>
> Right.
>
>> That'd probably explain the TopMemoryContext size, because array_agg()
>> creates a separate context for each group. So if you have 1M groups, you
>> have 1M contexts. Although I don't see how the size of those contexts
>> would matter?
>
> Well, if they're each 6K, that's your 6GB right there.

Yeah, but this memory should be freed after the query finishes, no?

>> Maybe we could move this info (list of child contexts) to the parent
>> context somehow, so that it's freed when the context is destroyed?
>
> We intentionally didn't do that, because in many cases it'd result in
> parent contexts becoming nonempty when otherwise they'd never have
> anything actually in them. The idea was that such "shell" parent
> contexts should be cheap, requiring only a control block in
> TopMemoryContext and not an actual allocation arena. This idea of a
> million separate child contexts was never part of the design of course;
> we might need to rethink whether that's a good idea. Or maybe there need
> to be two different policies about where to put child control blocks.

Maybe. For me, the 130MB is not really a big deal, because for this to
happen there really need to be many child contexts at the same time,
consuming much more memory. With 6.5GB consumed in total, 130MB amounts
to ~2%, which is negligible. Unless we can fix the RSS bloat.
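
The arithmetic behind those figures can be checked with a short sketch. The 130MB and 6.5GB values are the ones reported in this thread; the group count and the 6K per-context size are the hypothetical "1M groups at 6K each" from the exchange above, not measurements.

```python
# Back-of-envelope check of the numbers discussed in this thread.
KB = 1024
MB = 1024 * KB
GB = 1024 * MB

n_contexts = 1_000_000          # hypothetical: one child context per group
context_bytes = 6 * KB          # hypothetical size of each child context
top_context_bytes = 130 * MB    # TopMemoryContext size reported in the thread
total_bytes = 6.5 * GB          # total consumption reported in the thread

# Memory held by the child contexts themselves.
child_total = n_contexts * context_bytes
# Share of the total that sits in TopMemoryContext.
overhead = top_context_bytes / total_bytes
# Implied control data per child context, under the 1M-context assumption.
per_context_control = top_context_bytes / n_contexts

print(f"child contexts: {child_total / GB:.2f} GiB")
print(f"TopMemoryContext share: {overhead:.1%}")
print(f"control data per context: {per_context_control:.0f} bytes")
```

Under these assumptions the control data is small next to the child contexts, which is why the RSS that stays behind matters more than the TopMemoryContext size itself.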

>> Also, this explains the TopMemoryContext size, but not the RSS size
>> (or am I missing something)?
>
> Very possibly you're left with "islands" that prevent reclaiming very
> much of the peak RAM usage. It'd be hard to be sure without some sort
> of memory map, of course.

Yes, that's something I was thinking about too - I believe what happens
is that allocations of info in TopMemoryContext and the actual contexts
are interleaved, and at the end only the memory contexts are deleted.
The blocks allocated in TopMemoryContext are kept, creating the
"islands".

If that's the case, allocating the child context info within the parent
context would solve this, because these pieces would be reclaimed with
the rest of the parent memory.
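
A toy model of the suspected effect, as a sketch only. It assumes a brk-style heap where allocations are laid out in address order and only the free space above the highest surviving allocation can be returned to the OS; real allocators are more complicated than this. The block sizes and the helper names (`retained_bytes`, `build_layout`) are made up for illustration.

```python
CONTROL_BYTES = 128    # hypothetical size of per-context info
CONTEXT_BYTES = 6144   # hypothetical size of each child context

def retained_bytes(layout):
    """layout: list of (size, freed) in address order.

    Returns the heap size that cannot be trimmed, i.e. the end offset
    of the highest allocation that is still live.
    """
    offset = 0
    top_live = 0
    for size, freed in layout:
        offset += size
        if not freed:
            top_live = offset
    return top_live

def build_layout(n_groups, control_in_parent):
    """Interleave per-context info with the context blocks themselves.

    If control_in_parent is True, the info is released together with
    the parent's memory; otherwise it stays behind in a long-lived
    context.
    """
    layout = []
    for _ in range(n_groups):
        layout.append((CONTROL_BYTES, control_in_parent))
        layout.append((CONTEXT_BYTES, True))  # child context is deleted
    return layout

n = 1000
interleaved = retained_bytes(build_layout(n, control_in_parent=False))
colocated = retained_bytes(build_layout(n, control_in_parent=True))
live = n * CONTROL_BYTES

print(f"live bytes kept on purpose: {live}")
print(f"untrimmable heap, info in long-lived context: {interleaved}")
print(f"untrimmable heap, info freed with parent: {colocated}")
```

In this model the long-lived pieces pin nearly the whole peak heap even though they account for only a small fraction of it, while freeing them with the parent leaves nothing pinned.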

But then again, there are probably other ways to create such islands
(e.g. allocating an additional block in a long-lived context while the
child contexts exist).

regards
Tomas
