Re: [ADMIN]openvz and shared memory trouble

From: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Willy-Bas Loos <willybas(at)gmail(dot)com>, lst_hoe02(at)kwsoft(dot)de, pgsql-admin <pgsql-admin(at)postgresql(dot)org>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: [ADMIN]openvz and shared memory trouble
Date: 2014-03-31 15:14:36
Message-ID: 5339865C.9020209@aklaver.com
Lists: pgsql-admin pgsql-general

On 03/31/2014 08:01 AM, Tom Lane wrote:
> Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com> writes:
>> On 03/31/2014 04:12 AM, Willy-Bas Loos wrote:
>>> I'm still worried that it's like Tom Lane said in another discussion: "So
>>> basically, you've got a broken kernel here: it claimed to give PG circa
>>> (135MB) of memory, but what's actually there is only about (128MB). I
>>> don't see any connection between those numbers and the shmmax/shmall
>>> settings, either --- so I think this must be some busted implementation
>>> of a VM-level limitation."
>>> (here:
>>> http://www.postgresql.org/message-id/CAK3UJREBcyVBtr8D7vMfU=uDdkjXkrPnGcuy8erYB0tMfKe1LA@mail.gmail.com)
>>>
>>> And it makes me wonder what other issues may arise from that. But
>>> especially, what I can do about it.
>
> FWIW, I went back and re-read that message while perusing this thread,
> and this time it struck me that there was a significant bit of evidence
> I'd overlooked: namely, that the buffer block array is by no means the
> last thing in Postgres' shared memory segment. There are a bunch of
> other shared data structures allocated after it, some of which almost
> certainly had to have been touched by the startup subprocess. The gdb
> output makes it clear that the kernel stopped providing memory at
> 0xb6c4b000; but either it resumed doing so further on, or the whole shared
> memory segment *had* been provisioned originally, and then part of it
> got unmapped again while the startup process was running.
>
> So it's still clearly a kernel bug, but it seems less likely that it is
> triggered by some static limit on shared memory size. Perhaps instead,
> the kernel had been filling in pages for the shared segment on-demand,
> and then when it got to some limit it refused to do so anymore and allowed
> a SIGBUS to happen instead.
>
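
A rough way to test the on-demand theory might be to map a shared segment and write to every page up front, so that any refusal shows up immediately rather than partway through startup. The sketch below is only an illustration: `probe_shared_segment` is a made-up name, it uses an anonymous shared mapping from Python's `mmap` module, and it is not how PostgreSQL allocates or touches its own segment. The touching happens in a forked child so that a SIGBUS kills only the child.

```python
import mmap
import os


def probe_shared_segment(size_bytes):
    """Touch every page of an anonymous shared mapping in a child process.

    Returns None if every page could be written, or the number of the
    signal (for example signal.SIGBUS) that killed the child.
    Hypothetical helper for illustration; Unix only.
    """
    page = mmap.PAGESIZE
    pid = os.fork()
    if pid == 0:
        # Child: a fatal signal here does not take down the caller.
        seg = mmap.mmap(-1, size_bytes)  # anonymous mapping, shared by default on Unix
        for offset in range(0, size_bytes, page):
            seg[offset] = 1  # first write forces the kernel to back this page
        os._exit(0)
    _, status = os.waitpid(pid, 0)
    if os.WIFSIGNALED(status):
        return os.WTERMSIG(status)
    return None


if __name__ == "__main__":
    # 16 pages is an arbitrary small size; raise it toward the size that fails.
    print(probe_shared_segment(16 * mmap.PAGESIZE))
```

If the probe returns a signal number only above some size, that would point toward a limit being hit while pages are filled in, though it would not say which limit.
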
>> I do not use openvz so I do not have a test bed to try out, but this
>> page seems to be related to your problem:
>> http://openvz.org/Resource_shortage
>> or if you want more detail and a link to what looks to be a replacement for
>> beancounters:
>> http://openvz.org/Setting_UBC_parameters
>
> If this software's idea of resource management is to allow SIGBUS to
> happen upon attempting to use memory that had been successfully granted,
> then it's a piece of junk that you should get rid of ASAP. (No, I
> don't like Linux's OOM-kill solution to resource overcommit either.)

At this point, memory allocation being the problem is as much conjecture
as anything else, at least to me. So what is causing the SIGBUS is still
an open question in my mind.
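
One way to narrow it down might be to look at the failure counters inside the container. As far as I understand the OpenVZ pages linked above, /proc/user_beancounters lists each resource with held, maxheld, barrier, limit and failcnt columns, and a nonzero failcnt means a request for that resource was refused. The sketch below assumes that column layout; the function name and the sample numbers are invented for illustration and are not output from the affected system.

```python
def failed_beancounters(text):
    """Return {resource: failcnt} for rows whose failcnt is nonzero.

    Assumes each data row ends with five integers:
    held, maxheld, barrier, limit, failcnt.
    """
    failures = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip the version line, the column header, and anything malformed.
        if len(fields) < 6 or not all(f.isdigit() for f in fields[-5:]):
            continue
        resource, failcnt = fields[-6], int(fields[-1])
        if failcnt > 0:
            failures[resource] = failcnt
    return failures


# Made-up sample in the assumed layout, not real data from this thread.
SAMPLE = """\
Version: 2.5
       uid  resource           held    maxheld    barrier      limit    failcnt
      101:  kmemsize        2718490    2967552   14372700   14790164          0
            privvmpages       33554      65536      65536      69632         12
            shmpages           1280       1296      21504      21504          0
"""

if __name__ == "__main__":
    print(failed_beancounters(SAMPLE))
```

On a real container the text would come from reading /proc/user_beancounters, which normally requires root. Comparing the counters before and after a failed startup would show whether any limit was actually hit.
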

>
> regards, tom lane
>
>

--
Adrian Klaver
adrian(dot)klaver(at)aklaver(dot)com
