Re: Out of Memory errors are frustrating as heck!

From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Gunther <raj(at)gusw(dot)net>
Cc: postgres performance list <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Out of Memory errors are frustrating as heck!
Date: 2019-10-08 17:44:39
Message-ID: CAHyXU0x0TNaMYCeRXwo4Y5weZn_bO4zMsm_0Bvcj1zH=g4kjEA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Sun, Apr 14, 2019 at 3:51 PM Gunther <raj(at)gusw(dot)net> wrote:
>
> For weeks now, I am banging my head at an "out of memory" situation. There is only one query I am running on an 8 GB system, whatever I try, I get knocked out on this out of memory. It is extremely impenetrable to understand and fix this error. I guess I could add a swap file, and then I would have to take the penalty of swapping. But how can I actually address an out of memory condition if the system doesn't tell me where it is happening?
> We can't really see anything too worrisome. There is always lots of memory used by cache, which could have been mobilized. The only possible explanation I can think of is that in that moment of the crash the memory utilization suddenly skyrocketed in less than a second, so that the 2 second vmstat interval wouldn't show it??? Nah.
>
> I have already much reduced work_mem, which has helped in some other cases before. Now I am going to reduce the shared_buffers now, but that seems counter-intuitive because we are sitting on all that cache memory unused!
>
> Might this be a bug? It feels like a bug. It feels like those out of memory issues should be handled more gracefully (garbage collection attempt?) and that somehow there should be more information so the person can do anything about it.

I kind of agree that nothing according to vmstat suggests you have a
problem. One thing you left out is the precise mechanics of the
failure; is the database getting nuked by the oom killer? Do you have
the logs?

*) what are values of shared_buffers and work_mem and maintenance_work_mem?

*) Is this a 32 bit build? (I'm guessing no, but worth asking)

*) I see that you've disabled swap. Maybe it should be enabled?

*) Can you get the query to run through? an 'explain analyze' might
point to gross misses in plan; say, sort memory overuse

*) If you're still getting failures, maybe we need to look at sampling
frequency of memory usage.

*) iowait is super high.

*) I see optimization potential in this query; explain analyze would
help here too.

merlin

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Merlin Moncure 2019-10-08 17:50:32 Re: Out of Memory errors are frustrating as heck!
Previous Message Michael Lewis 2019-10-07 21:53:17 Re: distinct on extract returns composite type