Re: Changing shared_buffers without restart

From: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>
To: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Changing shared_buffers without restart
Date: 2024-11-29 16:47:27
Message-ID: 6qqzpbypx5qniaehxphntm2uhbve53ww2l66v5b2ecearghrpe@onpwthbyukek
Lists: pgsql-hackers

> On Fri, Nov 29, 2024 at 01:56:30AM GMT, Matthias van de Meent wrote:
>
> I mean, we can do the following to get a nice contiguous empty address
> space no other mmap(NULL)s will get put into:
>
> /* reserve size bytes of memory */
> base = mmap(NULL, size, PROT_NONE, ...flags, ...);
> /* use the first small_size bytes of that reservation */
> allocated_in_reserved = mmap(base, small_size, PROT_READ |
> PROT_WRITE, MAP_FIXED, ...);
>
> With the PROT_NONE protection option the OS doesn't actually allocate
> any backing memory, but guarantees no other mmap(NULL, ...) will get
> placed in that area such that it overlaps with that allocation until
> the area is munmap-ed, thus allowing us to reserve a chunk of address
> space without actually using (much) memory.

From what I understand, it's not much different from the scenario where we
just map as much as we want in advance. The actual memory will not be
allocated in either case due to CoW, and oom_score seems to be the same. I
agree it sounds attractive, but after some experimenting it looks like
it won't work with huge pages inside a cgroup v2 (=container).
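
For reference, the reservation approach quoted above can be sketched in C
roughly as follows. This is only an illustration: the helper names
reserve_address_space() and commit_prefix() and the exact flag choices
(MAP_NORESERVE for the reservation, MAP_SHARED for the usable part) are
assumptions, not taken from any patch in this thread.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: reserve `size` bytes of contiguous address space.
 * PROT_NONE means the range is inaccessible; it only keeps other
 * mmap(NULL, ...) calls from landing inside it.
 */
static void *reserve_address_space(size_t size)
{
    void *base = mmap(NULL, size, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);

    return base == MAP_FAILED ? NULL : base;
}

/*
 * Hypothetical helper: make the first `used` bytes of the reservation
 * readable and writable by mapping over them in place with MAP_FIXED.
 */
static void *commit_prefix(void *base, size_t reserved, size_t used)
{
    assert(used <= reserved);

    void *p = mmap(base, used, PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_ANONYMOUS | MAP_FIXED, -1, 0);

    return p == MAP_FAILED ? NULL : p;
}
```

A caller would reserve the maximum size once, commit a prefix of it, and
later munmap() the whole reserved range.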

The reason is Linux has recently learned to apply memory reservation
limits on hugetlb inside a cgroup, which are applied to mmap. Nowadays
this feature is often configured out of the box in various container
orchestrators, meaning that a scenario "set hugetlb=1GB on a container,
reserve 32GB with PROT_NONE" will fail. I've also tried to mix and
match: reserve some address space via a non-hugetlb mapping and allocate
a hugetlb mapping out of it, but it doesn't work either (the smaller mmap
complains about MAP_HUGETLB with EINVAL).
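
The "mix and match" attempt can be sketched roughly like this. It is a
minimal illustration, not the actual test program: the helper name
try_hugetlb_inside_reservation() and the flag combination are assumptions,
and the result depends on the kernel, the huge page pool and the cgroup
configuration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: reserve `reserved` bytes with an ordinary
 * (non-hugetlb) PROT_NONE mapping, then try to place a MAP_HUGETLB
 * mapping of `huge_len` bytes at its start.
 *
 * Returns 0 on success, otherwise the errno of the mmap() that failed.
 */
static int try_hugetlb_inside_reservation(size_t reserved, size_t huge_len)
{
    int rc = 0;

    assert(huge_len <= reserved);

    void *base = mmap(NULL, reserved, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    if (base == MAP_FAILED)
        return errno;

#ifdef MAP_HUGETLB
    /* The smaller mapping, requested with huge pages inside the reservation. */
    void *p = mmap(base, huge_len, PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_ANONYMOUS | MAP_HUGETLB | MAP_FIXED,
                   -1, 0);
    if (p == MAP_FAILED)
        rc = errno;
#else
    /* No MAP_HUGETLB on this platform; report that instead of trying. */
    rc = ENOSYS;
#endif

    /* Drop the whole range, whatever ended up mapped inside it. */
    munmap(base, reserved);
    return rc;
}
```

Calling it with, say, a 64MB reservation and a 2MB huge mapping and
printing strerror() of the result shows which error the second mmap
reports on a given system.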
