Re: Out of Memory errors are frustrating as heck!

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Gunther <raj(at)gusw(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-performance(at)lists(dot)postgresql(dot)org, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>
Subject: Re: Out of Memory errors are frustrating as heck!
Date: 2019-04-15 15:26:30
Message-ID: 20190415152630.e3havpoul2uvqnaz@development
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Sun, Apr 14, 2019 at 11:59:45PM -0400, Gunther wrote:
>On 4/14/2019 23:24, Tom Lane wrote:
>>> ExecutorState: 2234123384 total in 266261 blocks; 3782328 free (17244 chunks); 2230341056 used
>>Oooh, that looks like a memory leak right enough. The ExecutorState
>>should not get that big for any reasonable query.
>2.2 GB is massive yes.
>>Your error and stack trace show a failure in HashBatchContext,
>>which is probably the last of these four:
>>
>>> HashBatchContext: 57432 total in 3 blocks; 16072 free (6 chunks); 41360 used
>>> HashBatchContext: 90288 total in 4 blocks; 16072 free (6 chunks); 74216 used
>>> HashBatchContext: 90288 total in 4 blocks; 16072 free (6 chunks); 74216 used
>>> HashBatchContext: 100711712 total in 3065 blocks; 7936 free (0 chunks); 100703776 used
>>Perhaps that's more than it should be, but it's silly to obsess over 100M
>>when there's a 2.2G problem elsewhere.
>Yes.
>> I think it's likely that it was
>>just coincidence that the failure happened right there. Unfortunately,
>>that leaves us with no info about where the actual leak is coming from.
>
>Strange though, that the vmstat tracking never showed that the cache
>allocated memory goes much below 6 GB. Even if this 2.2 GB memory leak
>is there, and even if I had 2 GB of shared_buffers, I would still have
>enough for the OS to give me.
>

Depends on how the kernel is configured. What are vm.overcommit_memory
and vm.overcommit_ratio set to, for example?

It may easily be the case that the kernel is only allowing 50% of RAM to
be committed to user space, and then refusing to allocate more despite
having free memory. That's fairly common issue on swapless systems.

Try running the query again, watch

cat /proc/meminfo | grep Commit

and if it crashes when Committed_AS hits the CommitLimit.

That doesn't explain where the memory leak is, though :-(

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2019-04-15 15:28:42 Re: Out of Memory errors are frustrating as heck!
Previous Message Mahmoud Moharam 2019-04-15 15:02:20 iscsi performance