Re: Postgresql out-of-memory kill

From: Israel Brewster <israel(at)ravnalaska(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: Postgresql out-of-memory kill
Date: 2017-02-01 23:23:58
Message-ID: FA223BA5-FEC2-4729-A4F8-BFBCF4A3021F@ravnalaska.net
Lists: pgsql-general

On Feb 1, 2017, at 1:45 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> Israel Brewster <israel(at)ravnalaska(dot)net> writes:
>> So just a bit ago I ran into a bit of excitement when the kernel decided
>> to kill one of my postmaster processes due to an out-of-memory issue,
>
> Fun :-(
>
>> So a single postmaster process was using over 72GB of ram.
>
> No, the kernel was blaming it for 72GB, which is an entirely different
> statement. The Linux OOM killer makes some assumptions that are
> ludicrously wrong for Postgres: not only does it blame a parent process
> for the total memory consumption of all its children, but if the children
> share a large shared memory segment, *it counts the shared memory segment
> over again for each child*. At least this was true last I looked;
> perhaps very recent kernels are a bit less insane about shared memory.
> In any case, the core problem is blaming the parent process for the
> sins of a child.
>
> Now the PG postmaster itself consumes very little memory, and this is
> quite unlikely to suddenly go wrong because it doesn't do very much.
> A child backend process might go crazy, but what you want to happen then
> is for the OOM killer to kill the child process not the postmaster.
> That will still result in a database crash/restart scenario, but as long
> as the postmaster is alive everything should recover automatically.
>
> Your problem, then, is that the OOM killer is egregiously and with malice
> aforethought killing the wrong process.
>
> The usual fix for this is to configure things so that the postmaster is
> excluded from OOM kill but its children aren't. See
> https://www.postgresql.org/docs/current/static/kernel-resources.html#LINUX-MEMORY-OVERCOMMIT
> (but be sure to consult the page for your PG version, as we've changed
> the support mechanism for that in the past.)
>
> If you're using a vendor-supplied packaging of PG and it doesn't have some
> easy way to turn on this behavior, complain to the vendor ...
>
> regards, tom lane
>
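
To make the accounting described above concrete, here is a small Python sketch of the arithmetic. It models only the naive scheme as the quoted message describes it (parent charged for every child, shared segment counted once per child), not the kernel's actual scoring formula. The function names and all the sizes are hypothetical, chosen for illustration, and are not measurements from this system.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def naive_blame(shared_bytes, private_per_child, parent_private=0):
    """Memory charged to the parent if every child is billed to it and the
    shared segment is counted again for each child (the scheme described
    in the quoted message, not the kernel's real formula)."""
    return parent_private + sum(shared_bytes + p for p in private_per_child)


def actual_footprint(shared_bytes, private_per_child, parent_private=0):
    """Memory really in use: the shared segment exists only once."""
    return parent_private + shared_bytes + sum(private_per_child)


# Hypothetical numbers: a 4 GiB shared segment, ten backends with 50 MiB of
# private memory each, and a postmaster with 10 MiB of its own.
children = [50 * MIB] * 10
blamed = naive_blame(4 * GIB, children, parent_private=10 * MIB)
real = actual_footprint(4 * GIB, children, parent_private=10 * MIB)

print("blamed on parent: %.2f GiB" % (blamed / GIB))
print("actually in use:  %.2f GiB" % (real / GIB))
```

With these made-up sizes the amount blamed on the parent is roughly nine times the memory actually in use, which is why a small postmaster can end up looking like the biggest process on the box.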

Thanks for the explanation. This is a CentOS 6 box, kernel 2.6.32-642.11.1.el6.x86_64, running Postgres 9.6.1 from the PostgreSQL-supplied packages, so hopefully the information on that page applies. I'll mess around with modifying the init.d script to exclude the postmaster process. Thanks again!
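
For reference, a rough sketch of what such a start-up change does, written in Python 3 rather than as an init.d shell script. It assumes the mechanism the linked documentation page describes for recent PG versions: the launcher lowers its own OOM score through /proc/self/oom_score_adj before starting the postmaster, and the PG_OOM_ADJUST_FILE and PG_OOM_ADJUST_VALUE environment variables tell the postmaster to reset the score in its children. The function names, the "postgres" user name, and the default values are illustrative assumptions. Older kernels may only provide /proc/self/oom_adj with a different value range, so check the page for your version.

```python
import os
import pwd

# Assumption: this kernel exposes oom_score_adj; older ones may only have
# /proc/self/oom_adj, which uses a different value range.
OOM_ADJ_FILE = "/proc/self/oom_score_adj"


def protect_self(adj_file=OOM_ADJ_FILE, value=-1000):
    """Lower the current process's OOM score. Writing a negative value
    normally requires root. The setting is inherited across fork/exec."""
    with open(adj_file, "w") as f:
        f.write("%d\n" % value)


def child_env(base_env, adj_file=OOM_ADJ_FILE, child_value=0):
    """Copy base_env and add the variables the postmaster reads so that
    its child processes are put back to a normal OOM score."""
    env = dict(base_env)
    env["PG_OOM_ADJUST_FILE"] = adj_file
    env["PG_OOM_ADJUST_VALUE"] = str(child_value)
    return env


def launch(postgres_bin, data_dir, user="postgres"):
    """Protect this process, drop root privileges, then exec the postmaster.
    Hypothetical helper: a packaged init script does more than this."""
    protect_self()
    pw = pwd.getpwnam(user)
    os.initgroups(user, pw.pw_gid)  # postgres refuses to run as root
    os.setgid(pw.pw_gid)
    os.setuid(pw.pw_uid)
    os.execve(postgres_bin, [postgres_bin, "-D", data_dir],
              child_env(os.environ))
```

Calling launch() with the paths to the postgres binary and the data directory for your installation would replace the current process with a postmaster that the OOM killer skips, while the backends it forks remain eligible to be killed.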

-----------------------------------------------
Israel Brewster
Systems Analyst II
Ravn Alaska
5245 Airport Industrial Rd
Fairbanks, AK 99709
(907) 450-7293
-----------------------------------------------

