Re: Out of memory/corrupted shared memory problem on server

From: Johann Spies <johann(dot)spies(at)gmail(dot)com>
To: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Out of memory/corrupted shared memory problem on server
Date: 2017-08-29 05:59:29
Message-ID: CAGZ55DRDW9C6h81BOu5+dTsdX71dq0v5VKGY+MT8569mpJaWNA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 25 August 2017 at 13:48, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> How complex is "complex"? I can think of two likely scenarios:
> 1. You've stumbled across some kind of memory-leak bug in Postgres.
> 2. The query's just using too much memory. In this connection, it's
> not good that you've got
>> work_mem = 2GB
> Remember that "work_mem" is "work memory per plan node", so a complex
> query could easily chew up a multiple of that number --- and that's
> with everything going according to plan. If, say, the planner
> underestimates the number of table entries involved in a hash
> aggregation, the actual consumption might be much larger.
>
> My first move would be to reduce work_mem by an order of magnitude
> or two. If that doesn't help, check the plan for the view's query
> and see if it contains any hash aggregation steps --- if so, does
> "set enable_hashagg = off" help? (Also, make sure the view's input
> tables have been ANALYZEd recently.)
>
> If none of that helps, we should investigate the memory-leak-bug
> theory. One thing you could do in that direction is to run
> the postmaster with a "ulimit -v" size less than what will trigger
> the ire of the OOM killer, so that the query encounters a normal
> ENOMEM error rather than SIGKILL when it's eaten too much memory.
> That should result in it dumping a memory consumption map to stderr,
> which would give some clue where the problem is. We'd need to see
> that map as well as details about your query to make progress.

Thanks Tom and Christoph Moench-Tegeder.

I first tried to refresh it after bringing down the work_mem to 1 GB.
It failed again.
The main source of this query (doing a lot of calculations) is another
Materialized View
with more than 700 million records. I then analyzed that MV and this
morning the good news was:

# REFRESH MATERIALIZED VIEW wos_2017_1.citation_window_mv ;
REFRESH MATERIALIZED VIEW
Time: 27128469.899 ms

Regards
Johann
--
Because experiencing your loyal love is better than life itself,
my lips will praise you. (Psalm 63:3)

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Gersner 2017-08-29 08:17:58 Unlogged Crash Detection
Previous Message David G. Johnston 2017-08-29 02:02:57 Re: Create Action for psql when NOTIFY Recieved