Re: Identify huge pages accessibility using madvise

From: Peter Eisentraut <peter(at)eisentraut(dot)org>
To: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Gabriele Bartolini <gabriele(dot)bartolini(at)enterprisedb(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Identify huge pages accessibility using madvise
Date: 2024-11-12 10:07:50
Message-ID: 3fc1ea14-af2b-4b6e-9309-1557190d1591@eisentraut.org
Lists: pgsql-hackers

On 08.11.24 09:54, Dmitry Dolgov wrote:
> Looks like there is a plot twist. After talking to Gabriele off list and
> testing on an EKS, I've discovered that since version 5.7 the Linux kernel
> supports hugetlb reservation via hugetlbfs [1]. That means that in addition
> to the original limit at page fault time there is now one at reservation
> time, which has a separate knob in cgroupfs:
>
> # cgroup v2, hugetlb controller
> #
> # original limit, page fault level
> hugetlb.2MB.limit_in_bytes
> #
> # new one, reservation level
> hugetlb.2MB.rsvd.limit_in_bytes
>
> This means that there could still be people facing the original issue the
> patch is trying to address: for that one needs to either run an older kernel,
> or have a container orchestration tool that does not set the rsvd value (it
> looks like there are such examples). But in the long term I would expect
> everyone to converge on using reservation limits correctly, so maybe the
> patch is not needed after all.

Ah good, it looks like the issue was addressed properly in the kernel
then, and we don't need the workaround your patch proposes anymore.

So, I think we don't need to proceed with your patch. The issue will
hopefully go away over time (or has already), and those who are still
affected by it for some reason can refer to this thread for discussion
and maybe choose to apply the patch on their own.
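
For anyone who wants to check which of the two limits is actually set in
their container, a rough sketch along the following lines might help. It
only reads the limit files named above from a cgroup directory. The `.max`
file names, the function name `read_hugetlb_limits`, and the default path
`/sys/fs/cgroup` are assumptions for illustration; none of them come from
the patch or from this thread.

```python
import sys
from pathlib import Path

# Candidate file names for the two limits. The *.limit_in_bytes names are
# the ones quoted in this thread; the *.max names are an assumption about
# how the same knobs may be spelled on other setups.
FILE_PATTERNS = {
    "fault": ("hugetlb.{size}.limit_in_bytes", "hugetlb.{size}.max"),
    "rsvd": ("hugetlb.{size}.rsvd.limit_in_bytes", "hugetlb.{size}.rsvd.max"),
}


def read_hugetlb_limits(cgroup_dir, size="2MB"):
    """Return {"fault": ..., "rsvd": ...} for one cgroup directory.

    Each value is an int (bytes), the string "max", or None when none of
    the candidate files could be read.
    """
    limits = {}
    for kind, patterns in FILE_PATTERNS.items():
        limits[kind] = None
        for pattern in patterns:
            path = Path(cgroup_dir) / pattern.format(size=size)
            try:
                text = path.read_text().strip()
            except OSError:
                continue  # file absent or unreadable, try the next name
            limits[kind] = text if text == "max" else int(text)
            break
    return limits


if __name__ == "__main__":
    # Hypothetical default; pass the real cgroup directory as an argument.
    cgroup_dir = sys.argv[1] if len(sys.argv) > 1 else "/sys/fs/cgroup"
    for kind, value in read_hugetlb_limits(cgroup_dir).items():
        shown = value if value is not None else "no limit file found"
        print(f"{kind}: {shown}")
```

If `fault` comes back as a number while `rsvd` is `max` or has no file at
all, that would be consistent with the case described above, where only the
page-fault limit is enforced. Note that an unset limit may also show up as
a very large number rather than `max`, depending on the interface.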
