Re: Changing shared_buffers without restart

From: Ni Ku <jakkuniku(at)gmail(dot)com>
To: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>
Cc: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: Changing shared_buffers without restart
Date: 2025-03-21 08:48:30
Message-ID: CAPuPUJxHKNbrFm2gxGbOoffWvrxX1Nbo1_dsQ_+=dcOCSmxt_Q@mail.gmail.com
Lists: pgsql-hackers

Thanks for your insights and confirmation, Dmitry.
Right, I think the anonymous fd approach would keep the memory contents
intact between the munmap and the mmap with the new size, so buffer pool
expansion would work.
Shrinking still seems problematic, though: that approach requires the
anonymous fd to remain open (to protect the memory contents), so munmap
would not release the memory back to the OS right away; it is released only
when the fd is closed. From my testing this is true for hugepage memory at
least.
Is there a way around this? Or maybe I misunderstood what you have in mind
;)
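
To make the shrinking concern concrete, here is a minimal sketch that measures how many blocks the fd still holds after the tail of the mapping is unmapped. It rests on several assumptions: it uses a regular-page memfd rather than a hugetlb one (MFD_HUGETLB would be needed for that), it reads st_blocks as a rough proxy for memory held, and shrink_demo, fd_blocks and struct shrink_result are hypothetical names, not part of the patch. The final ftruncate is only one candidate for releasing the tail while the fd stays open; whether it is suitable for the hugepage buffer pool is not something this sketch settles.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical result holder: 512-byte blocks charged to the fd at each step. */
struct shrink_result
{
    long        blocks_mapped;          /* after touching the whole mapping */
    long        blocks_after_munmap;    /* after unmapping the tail only */
    long        blocks_after_truncate;  /* after also truncating the fd */
};

/* Number of blocks currently allocated to the file behind fd, or -1. */
static long
fd_blocks(int fd)
{
    struct stat st;

    if (fstat(fd, &st) != 0)
        return -1;
    return (long) st.st_blocks;
}

/* Returns 0 on success and fills *res; returns -1 if any system call fails. */
int
shrink_demo(struct shrink_result *res)
{
    size_t      page = (size_t) sysconf(_SC_PAGESIZE);
    size_t      old_size = 8 * page;
    size_t      new_size = 4 * page;
    char       *buf = MAP_FAILED;
    size_t      mapped = 0;
    int         rc = -1;
    int         fd;

    assert(res != NULL);

    /* Assumption: regular pages; a hugetlb memfd may behave differently. */
    fd = memfd_create("buffers", MFD_CLOEXEC);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, (off_t) old_size) != 0)
        goto done;

    buf = mmap(NULL, old_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (buf == MAP_FAILED)
        goto done;
    mapped = old_size;

    /* Touch every page so that it is actually allocated. */
    memset(buf, 0xAB, old_size);
    res->blocks_mapped = fd_blocks(fd);

    /* Shrink the mapping only: the fd stays open and keeps the pages. */
    if (munmap(buf + new_size, old_size - new_size) != 0)
        goto done;
    mapped = new_size;
    res->blocks_after_munmap = fd_blocks(fd);

    /* Candidate workaround: shrink the file behind the fd as well. */
    if (ftruncate(fd, (off_t) new_size) != 0)
        goto done;
    res->blocks_after_truncate = fd_blocks(fd);
    rc = 0;

done:
    if (buf != MAP_FAILED && mapped > 0)
        munmap(buf, mapped);
    close(fd);
    return rc;
}
```

Calling shrink_demo() from a small main() and printing the three counters shows whether the tail pages are still charged to the fd after munmap and whether the truncate gives them back.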

Regards,

Jack Ng

On Thu, Mar 20, 2025 at 6:21 PM Dmitry Dolgov <9erthalion6(at)gmail(dot)com> wrote:

> > On Thu, Mar 20, 2025 at 04:55:47PM GMT, Ni Ku wrote:
> >
> > I ran some simple tests (outside of PG) on linux kernel v6.1, which has
> > this commit that added some hugepage support to mremap
> > (https://patchwork.kernel.org/project/linux-mm/patch/20211013195825.3058275-1-almasrymina@google.com/).
> >
> > From reading the kernel code and testing, for a hugepage-backed mapping
> > it seems mremap supports only shrinking but not growing. Further, for
> > shrinking, what I observed is that after mremap is called the hugepage
> > memory is not released back to the OS, rather it's released when the fd
> > is closed (or when the memory is unmapped for a mapping created with
> > MAP_ANONYMOUS).
> > I'm not sure if this behavior is expected, but being able to release
> > memory back to the OS immediately after mremap would be important for
> > use cases such as supporting "serverless" PG instances on the cloud.
> >
> > I'm no expert in the linux kernel so I could be missing something. It'd
> > be great if you or somebody can comment on these observations and
> > whether this mremap-based solution would work with hugepage bufferpool.
>
> Hm, I think you're right. I didn't realize there is such a limitation, but
> I just verified on the latest kernel build and hit the same condition you
> mentioned above when increasing a hugetlb mapping. That's annoying of
> course, but I've got another approach I was originally experimenting
> with -- instead of mremap do munmap and mmap with the new size and rely
> on the anonymous fd to keep the memory content in between. I'm currently
> reworking the mmap'ing part of the patch, let me check if this new approach
> is something we could universally rely on.
>
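
For reference, the munmap-then-mmap idea described in the quoted message can be sketched roughly as below. This is only an illustration under stated assumptions, not the code from the patch: it uses a regular-page memfd (MFD_HUGETLB would be needed for hugetlb backing and requires reserved huge pages), and grow_mapping and grow_demo are hypothetical names.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the patch): grow a MAP_SHARED mapping backed
 * by an anonymous fd by unmapping it and mapping the fd again at the new
 * size. The pages belong to the fd, so they outlive the munmap.
 */
static void *
grow_mapping(int fd, void *old_addr, size_t old_size, size_t new_size)
{
    assert(new_size >= old_size);

    if (munmap(old_addr, old_size) != 0)
        return MAP_FAILED;
    if (ftruncate(fd, (off_t) new_size) != 0)
        return MAP_FAILED;
    return mmap(NULL, new_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
}

/*
 * Returns 0 if data written before the resize is still readable afterwards,
 * 1 if it was lost, and -1 if a system call failed.
 */
int
grow_demo(void)
{
    size_t      page = (size_t) sysconf(_SC_PAGESIZE);
    size_t      old_size = 4 * page;
    size_t      new_size = 8 * page;
    const char  marker[] = "buffer contents";
    char       *buf;
    int         fd;
    int         rc;

    /* Assumption: regular pages; pass MFD_HUGETLB for a hugetlb-backed fd. */
    fd = memfd_create("buffers", MFD_CLOEXEC);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, (off_t) old_size) != 0)
    {
        close(fd);
        return -1;
    }

    buf = mmap(NULL, old_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (buf == MAP_FAILED)
    {
        close(fd);
        return -1;
    }
    memcpy(buf, marker, sizeof(marker));

    /* Between the munmap and the mmap inside, only the fd holds the data. */
    buf = grow_mapping(fd, buf, old_size, new_size);
    if (buf == MAP_FAILED)
    {
        close(fd);
        return -1;
    }

    rc = (memcmp(buf, marker, sizeof(marker)) == 0) ? 0 : 1;
    munmap(buf, new_size);
    close(fd);
    return rc;
}
```

In this sketch the new mapping may land at a different address than the old one, so anything that keeps pointers into the region would need to account for that.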
