Re: slave restarts with kill -9 coming from somewhere, or nowhere

From: Bert <biertie(at)gmail(dot)com>
To: "pgsql-admin(at)postgresql(dot)org" <pgsql-admin(at)postgresql(dot)org>
Subject: Re: slave restarts with kill -9 coming from somewhere, or nowhere
Date: 2013-04-04 06:02:04
Message-ID: CAFCtE1movdoU5fBnfoCi4gZpMsFr0v0+gsC_fA0_4gOyKk9oPg@mail.gmail.com
Lists: pgsql-admin

hi,

this is strange: a single connection almost killed the server, so it was not
a combination of many connections. I saw one connection grow to over
100GB. I then cancelled it before the OOM killer became active
again.
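
To confirm from the kernel log that it really was the OOM killer, something like the sketch below could be used. It is only a sketch: it assumes the kernel writes a line containing "Killed process <pid> (<name>)", which is the usual wording but can differ between kernel versions, and the sample lines are made up for illustration, not taken from this server.

```python
import re

# Assumed wording of the kernel's OOM kill message; it may differ
# between kernel versions, so adjust the pattern to what your log shows.
KILLED_RE = re.compile(r"Killed process (\d+) \(([^)]+)\)")


def find_oom_kills(lines):
    """Return (pid, process_name) for every line that looks like an OOM kill."""
    kills = []
    for line in lines:
        match = KILLED_RE.search(line)
        if match:
            kills.append((int(match.group(1)), match.group(2)))
    return kills


if __name__ == "__main__":
    # Hypothetical sample lines, for illustration only.
    sample = [
        "kernel: Out of memory: Kill process 28463 (postgres) score 900 or sacrifice child",
        "kernel: Killed process 28463 (postgres) total-vm:1kB, anon-rss:1kB, file-rss:1kB",
        "kernel: some unrelated message",
    ]
    for pid, name in find_oom_kills(sample):
        print(f"OOM kill: pid={pid} name={name}")
```

In practice the lines would come from the kernel log (for example the output of `dmesg`), and the PID can be compared with the one in the postgres log message.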

These are my memory settings:
shared_buffers = 20GB
temp_buffers = 1GB
max_prepared_transactions = 10
work_mem = 4GB
maintenance_work_mem = 1GB
max_stack_depth = 8MB
wal_buffers = 32MB
effective_cache_size = 88GB

The server has 128GB of RAM.
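
For a sense of scale, the arithmetic behind these settings can be sketched as below. It is a rough estimate, not a measurement. It assumes the documented behaviour that work_mem is a limit per sort or hash operation (so one query can use several multiples of it) and that temp_buffers is per session. The operation counts are hypothetical, and the function names are made up for this example.

```python
# Rough upper-bound arithmetic for the settings above (illustrative only).
GB = 1024 ** 3

SETTINGS = {
    "shared_buffers": 20 * GB,
    "temp_buffers": 1 * GB,
    "work_mem": 4 * GB,
    "maintenance_work_mem": 1 * GB,
}
TOTAL_RAM = 128 * GB


def estimate_backend_bytes(work_mem_ops, settings=SETTINGS):
    """Estimate private memory of one backend that runs `work_mem_ops`
    sort/hash operations at once, each using up to work_mem."""
    return settings["temp_buffers"] + work_mem_ops * settings["work_mem"]


def ops_to_exceed_ram(total_ram=TOTAL_RAM, settings=SETTINGS):
    """Smallest number of work_mem-sized operations at which
    shared_buffers plus a single backend no longer fits in RAM."""
    available = total_ram - settings["shared_buffers"] - settings["temp_buffers"]
    return available // settings["work_mem"] + 1


if __name__ == "__main__":
    for ops in (1, 8, 25):  # hypothetical operation counts
        print(ops, "ops ->", estimate_backend_bytes(ops) // GB, "GB")
    print("ops needed to exceed RAM:", ops_to_exceed_ram())
```

Under these assumptions a single query with about 25 concurrent sort or hash steps would already account for roughly 100GB, so one connection exhausting the RAM is at least arithmetically possible with work_mem = 4GB.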

How is it possible that one connection (query) can use all the RAM? And how
can I avoid it?

PS: the database is a data warehouse (DWH). I don't need a lot of
connections, but I want to process a lot of data fast.

cheers,
Bert

On Wed, Apr 3, 2013 at 10:10 AM, Bert <biertie(at)gmail(dot)com> wrote:

> Hi all,
>
> I have set vm.overcommit_memory to 1.
>
> It's pretty much a dedicated machine anyway, except for some postgres
> maintenance scripts I run in Python / bash from the server.
>
> We'll see what it gives.
>
> cheers,
> Bert
>
>
> On Wed, Apr 3, 2013 at 8:45 AM, Bert <biertie(at)gmail(dot)com> wrote:
>
>> Hi Tom,
>>
>> thanks for the tip! it was indeed the oom killer.
>>
>> Is it wise to disable the OOM killer? Or will the server really go down
>> without postgres doing something about it?
>>
>> For now I have already lowered the shared_memory value a bit.
>>
>> cheers,
>> Bert
>>
>>
>> On Tue, Apr 2, 2013 at 8:06 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>
>>> Bert <biertie(at)gmail(dot)com> writes:
>>> > I'm running the latest postgres version (9.2.3), and today for the
>>> > first time I encountered this:
>>>
>>> > 12774 2013-04-02 18:13:10 CEST LOG: server process (PID 28463) was
>>> > terminated by signal 9: Killed
>>>
>>> AFAIK there are only two possible sources of signal 9: a manual kill,
>>> or the Linux kernel's OOM killer. If it's the latter there should be
>>> a concurrent entry in the kernel logfiles about this. If you find one,
>>> suggest reading up on how to disable OOM kills, or at least reconfigure
>>> your system to make them less probable.
>>>
>>> regards, tom lane
>>>
>>
>>
>>
>> --
>> Bert Desmet
>> 0477/305361
>>
>
>
>
> --
> Bert Desmet
> 0477/305361
>

--
Bert Desmet
0477/305361
