From: | Dmitry Dolgov <9erthalion6(at)gmail(dot)com> |
---|---|
To: | Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Changing shared_buffers without restart |
Date: | 2024-11-29 16:47:27 |
Message-ID: | 6qqzpbypx5qniaehxphntm2uhbve53ww2l66v5b2ecearghrpe@onpwthbyukek |
Lists: | pgsql-hackers |
> On Fri, Nov 29, 2024 at 01:56:30AM GMT, Matthias van de Meent wrote:
>
> I mean, we can do the following to get a nice contiguous empty address
> space no other mmap(NULL)s will get put into:
>
> /* reserve size bytes of memory */
> base = mmap(NULL, size, PROT_NONE, ...flags, ...);
> /* use the first small_size bytes of that reservation */
> allocated_in_reserved = mmap(base, small_size, PROT_READ |
> PROT_WRITE, MAP_FIXED, ...);
>
> With the PROT_NONE protection option the OS doesn't actually allocate
> any backing memory, but guarantees no other mmap(NULL, ...) will get
> placed in that area such that it overlaps with that allocation until
> the area is munmap-ed, thus allowing us to reserve a chunk of address
> space without actually using (much) memory.
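For reference, the reservation pattern quoted above could be sketched in C roughly as follows. This is only an illustration under some assumptions: the helper names reserve_address_space and commit_prefix are made up, and MAP_PRIVATE | MAP_ANONYMOUS stands in for the flags elided in the quoted snippet (a real shared memory segment would use different flags).

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Reserve `size` bytes of contiguous address space without backing memory.
 * PROT_NONE makes the range inaccessible until part of it is remapped.
 * Returns the base address, or NULL on failure.
 * (Hypothetical helper; the flags are an assumption for illustration.)
 */
void *
reserve_address_space(size_t size)
{
    void *base = mmap(NULL, size, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    return (base == MAP_FAILED) ? NULL : base;
}

/*
 * Make the first `small_size` bytes of an existing reservation usable.
 * MAP_FIXED replaces the PROT_NONE pages at exactly `base`, so the usable
 * region stays inside the reserved range.
 * Returns 0 on success, -1 on failure (errno is set by mmap).
 */
int
commit_prefix(void *base, size_t small_size)
{
    void *used = mmap(base, small_size, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);

    if (used == MAP_FAILED)
        return -1;

    /* With MAP_FIXED a successful mmap returns the requested address. */
    assert(used == base);
    return 0;
}
```

A caller would reserve the maximum size once, commit a prefix, and later grow the usable part by calling commit_prefix-style remaps further into the same reservation. Releasing everything is a single munmap over the whole reserved range.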
From what I understand, it's not much different from the scenario where we
just map as much as we want in advance. The actual memory will not be
allocated in either case due to CoW, and oom_score seems to be the same. I
agree it sounds attractive, but after some experimenting it looks like
it won't work with huge pages inside a cgroup v2 (i.e. a container).

The reason is that Linux has recently learned to apply memory reservation
limits on hugetlb inside a cgroup, and those limits are enforced at mmap
time. Nowadays this feature is often configured out of the box in various
container orchestrators, meaning that a scenario like "set hugetlb=1GB on a
container, reserve 32GB with PROT_NONE" will fail. I've also tried to mix
and match: reserve some address space via a non-hugetlb mapping, then
allocate a hugetlb mapping out of it. That doesn't work either (the smaller
mmap complains about MAP_HUGETLB with EINVAL).