Changing shared_buffers without restart

From: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Changing shared_buffers without restart
Date: 2024-10-18 19:21:19
Message-ID: cnthxg2eekacrejyeonuhiaezc7vd7o2uowlsbenxqfkjwgvwj@qgzu6eoqrglb
Lists: pgsql-hackers

TL;DR A PoC for changing shared_buffers without PostgreSQL restart, via
changing shared memory mapping layout. Any feedback is appreciated.

Hi,

Being able to change PostgreSQL configuration on the fly is an important
property for performance tuning, since it reduces the feedback time and
invasiveness of the process. In certain cases it even becomes highly desirable,
e.g. when doing automatic tuning. But there are a couple of important
configuration options that cannot be modified without a restart, the most
notorious example being shared_buffers.

I've recently been working on an idea for how to change that, allowing
shared_buffers to be modified without a restart. To demonstrate the approach,
I've prepared a PoC that ignores lots of stuff, but works in the limited set of
use cases I was testing. I would like to discuss the idea and get some feedback.

Patches 1-3 prepare the infrastructure and shared memory layout. They could be
useful even with multithreaded PostgreSQL, when there will be no need for
shared memory. I assume that in the multithreaded world there will still be a
need for a contiguous chunk of memory to share between threads, and its layout
would be similar to the one with shared memory mappings.

Patch 4 actually does the resizing. It's shared memory specific of course, and
utilizes the Linux-specific mremap, which leaves portability questions open.

Patch 5 is somewhat independent, but quite convenient to have. It also utilizes
a Linux-specific call, memfd_create.

The patch set still doesn't address lots of things, e.g. shared memory segment
detach/reattach and portability questions, and it doesn't touch the
EXEC_BACKEND code or huge pages.

So far I have only done some rudimentary testing: spinning up PostgreSQL, then
increasing shared_buffers and running pgbench with a scale factor large
enough to extend the data set into the newly allocated buffers:

-- shared_buffers 128 MB
=# SELECT * FROM pg_buffercache_summary();
 buffers_used | buffers_unused | buffers_dirty | buffers_pinned
--------------+----------------+---------------+----------------
          134 |          16250 |             1 |              0

-- change shared_buffers to 512 MB
=# select pg_reload_conf();
=# SELECT * FROM pg_buffercache_summary();
 buffers_used | buffers_unused | buffers_dirty | buffers_pinned
--------------+----------------+---------------+----------------
          221 |          65315 |             1 |              0

-- round of pgbench read-only load
=# SELECT * FROM pg_buffercache_summary();
 buffers_used | buffers_unused | buffers_dirty | buffers_pinned
--------------+----------------+---------------+----------------
        41757 |          23779 |           216 |              0
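
The totals in the output above can be cross-checked with a little arithmetic.
This is only a sketch: it assumes the default 8 kB block size (BLCKSZ is a
compile-time option, so other builds may differ), and the helper name
buffers_for_mb is made up for illustration.

```c
#include <stddef.h>

/* Assumption: default PostgreSQL block size of 8 kB. */
#define ASSUMED_BLCKSZ 8192

/*
 * Number of buffers that fit into a shared_buffers setting given in
 * megabytes. Hypothetical helper, not part of the patch set.
 */
size_t
buffers_for_mb(size_t megabytes)
{
    return megabytes * 1024 * 1024 / ASSUMED_BLCKSZ;
}
```

With this, 128 MB corresponds to 16384 buffers (134 used + 16250 unused) and
512 MB to 65536 buffers (221 + 65315 right after the reload, 41757 + 23779
after the pgbench run), so used plus unused matches the new size in both
later snapshots.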

Here is the breakdown:

v1-0001-Allow-to-use-multiple-shared-memory-mappings.patch

Preparation, introduces the possibility to work with many shmem mappings. To
make it less invasive, I've duplicated the shmem API to extend it with the
shmem_slot argument, while redirecting the original API to it. There are
probably better ways of doing that, I'm open to suggestions.

v1-0002-Allow-placing-shared-memory-mapping-with-an-offse.patch

Implements a new layout of shared memory mappings to include room for resizing.
I've done a couple of tests to verify that such space in between doesn't affect
how the kernel calculates actual used memory, to make sure that e.g. cgroup
will not trigger OOM. The only change seems to be in VmPeak, which is total
mapped pages.
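
To make the idea of "mappings with room in between" concrete, here is a
minimal sketch of one way to get such a layout on Linux. It is an assumption
for illustration, not the method used in the patch: it reserves one large
PROT_NONE region and overlays usable shared mappings at fixed strides inside
it. The function name layout_with_gaps and its parameters are hypothetical.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/*
 * Reserve nslots * stride bytes of address space, then overlay a usable
 * shared mapping of `used` bytes at the start of each stride. The rest of
 * each stride stays PROT_NONE and untouched, so it occupies address space
 * only. `stride` and `used` must be multiples of the page size.
 * Returns the base address, or NULL on failure.
 */
void *
layout_with_gaps(size_t nslots, size_t stride, size_t used)
{
    char   *base;

    if (nslots == 0 || used == 0 || used > stride)
        return NULL;

    /* Address space reservation only: inaccessible, no swap accounting. */
    base = mmap(NULL, nslots * stride, PROT_NONE,
                MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    if (base == MAP_FAILED)
        return NULL;

    for (size_t i = 0; i < nslots; i++)
    {
        /* MAP_FIXED is safe here because we own the whole reservation. */
        void   *seg = mmap(base + i * stride, used, PROT_READ | PROT_WRITE,
                           MAP_SHARED | MAP_ANONYMOUS | MAP_FIXED, -1, 0);

        if (seg == MAP_FAILED)
        {
            munmap(base, nslots * stride);
            return NULL;
        }
    }
    return base;
}
```

In this variant the gaps remain reserved, so nothing else can be mapped into
them; a later resize would have to replace part of a gap rather than expand
into free address space.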

v1-0003-Introduce-multiple-shmem-slots-for-shared-buffers.patch

Splits shared_buffers into multiple slots, moving out structures that depend on
NBuffers into separate mappings. There are two large gaps here:

* Shmem size calculation for those mappings is not correct yet, it includes too
  many other things (no particular issues here, I just haven't had time).
* It makes hardcoded assumptions about the upper limit for resizing, which is
  currently low purely for experiments. Ideally there should be a new
  configuration option to specify the total available memory, which would be
  the base for subsequent calculations.

v1-0004-Allow-to-resize-shared-memory-without-restart.patch

Performs the shared_buffers change without a restart. The current approach is
clumsy: it adds an assign hook for shared_buffers and goes from there, using
mremap to resize the mappings. But I haven't immediately found any better
approach. Currently it supports only increasing shared_buffers.

v1-0005-Use-anonymous-files-to-back-shared-memory-segment.patch

Allows an anonymous file to back a shared mapping. This makes certain things
easier, e.g. the visual representation of mappings, and gives an fd for
possible future customizations.
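
The combination of an anonymous file and mremap can be sketched in a single
process as follows. This is a simplified illustration under stated
assumptions, not the code from the patches: the ResizableSeg struct and the
seg_create/seg_grow names are hypothetical, sizes are expected to be multiples
of the page size, and nothing here deals with other backends having to update
their own mappings. The file is grown with ftruncate before mremap so that the
new pages are backed by the file.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical bookkeeping for one resizable segment. */
typedef struct ResizableSeg
{
    int     fd;     /* anonymous file from memfd_create() */
    void   *addr;   /* start of the mapping */
    size_t  size;   /* currently mapped bytes */
} ResizableSeg;

/*
 * Create a segment of `initial` bytes and leave free address space up to
 * `max` bytes behind it, by mapping `max` bytes and unmapping the tail.
 */
int
seg_create(ResizableSeg *seg, size_t initial, size_t max)
{
    if (initial == 0 || initial > max)
        return -1;
    seg->fd = memfd_create("resizable_seg", 0);
    if (seg->fd < 0)
        return -1;
    if (ftruncate(seg->fd, (off_t) initial) != 0)
    {
        close(seg->fd);
        return -1;
    }
    seg->addr = mmap(NULL, max, PROT_READ | PROT_WRITE, MAP_SHARED,
                     seg->fd, 0);
    if (seg->addr == MAP_FAILED)
    {
        close(seg->fd);
        return -1;
    }
    if (max > initial)
        munmap((char *) seg->addr + initial, max - initial);
    seg->size = initial;
    return 0;
}

/* Grow the segment in place; fails instead of moving the mapping. */
int
seg_grow(ResizableSeg *seg, size_t new_size)
{
    void   *p;

    if (new_size <= seg->size)
        return -1;
    /* Extend the backing file first, otherwise new pages are not backed. */
    if (ftruncate(seg->fd, (off_t) new_size) != 0)
        return -1;
    /* flags = 0, i.e. no MREMAP_MAYMOVE: the start address must not change. */
    p = mremap(seg->addr, seg->size, new_size, 0);
    if (p == MAP_FAILED)
        return -1;
    seg->size = new_size;
    return 0;
}
```

Growing without MREMAP_MAYMOVE is a deliberate choice in this sketch: existing
pointers into the segment stay valid, at the price of the call failing if
something else has been mapped into the space behind the segment.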

In this thread I'm hoping to answer the following questions:

* Are there any concerns about this approach?
* What would be a better mechanism to handle resizing than an assign hook?
* Assuming I'll be able to address the already known missing bits, what are the
  chances the patch series could be accepted?

Attachment                                                       Content-Type  Size
v1-0001-Allow-to-use-multiple-shared-memory-mappings.patch       text/plain    28.7 KB
v1-0002-Allow-placing-shared-memory-mapping-with-an-offse.patch  text/plain    8.6 KB
v1-0003-Introduce-multiple-shmem-slots-for-shared-buffers.patch  text/plain    10.7 KB
v1-0004-Allow-to-resize-shared-memory-without-restart.patch      text/plain    12.4 KB
v1-0005-Use-anonymous-files-to-back-shared-memory-segment.patch  text/plain    6.8 KB