Re: Out of memory/corrupted shared memory problem on server

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Johann Spies <johann(dot)spies(at)gmail(dot)com>
Cc: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Out of memory/corrupted shared memory problem on server
Date: 2017-08-29 11:48:33
Message-ID: 32546.1504007313@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Johann Spies <johann(dot)spies(at)gmail(dot)com> writes:
> On 25 August 2017 at 13:48, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Remember that "work_mem" is "work memory per plan node", so a complex
>> query could easily chew up a multiple of that number --- and that's
>> with everything going according to plan. If, say, the planner
>> underestimates the number of table entries involved in a hash
>> aggregation, the actual consumption might be much larger.

> The main source of this query (doing a lot of calculations) is another
> Materialized View
> with more than 700 million records. I then analyzed that MV and this
> morning the good news was:

> # REFRESH MATERIALIZED VIEW wos_2017_1.citation_window_mv ;
> REFRESH MATERIALIZED VIEW
> Time: 27128469.899 ms

OK, so almost certainly the crash was caused by a hash aggregate
using so much memory that it triggered the OOM killer. Whether
a hash aggregate's hashtable will stay within work_mem is dependent
on whether the planner correctly predicts the number of entries needed.
Analyzing the input MV must have improved that estimate and led the
planner to choose some other plan.

regards, tom lane

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Stuart Bishop 2017-08-29 12:52:43 Re: Create Action for psql when NOTIFY Recieved
Previous Message Michael Paquier 2017-08-29 11:19:52 Re: Unlogged Crash Detection