From: | Dmitry Dolgov <9erthalion6(at)gmail(dot)com> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org |
Cc: | Robert Haas <robertmhaas(at)gmail(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> |
Subject: | Re: Changing shared_buffers without restart |
Date: | 2025-02-25 09:52:05 |
Message-ID: | xuuxvlhom2tiinwwnh7r6wds74o2fkwryy6palehytuzm76l4t@3q7lszfqic3b |
Lists: | pgsql-hackers |
> On Fri, Oct 18, 2024 at 09:21:19PM GMT, Dmitry Dolgov wrote:
> TL;DR A PoC for changing shared_buffers without PostgreSQL restart, via
> changing shared memory mapping layout. Any feedback is appreciated.
Hi,
Here is a new version of the patch, which contains a proposal for how to
coordinate shared memory resizing between backends. The rest is more or less
the same; feedback about the coordination is appreciated. It's a lot to read,
but the main differences are:
1. Decoupling a GUC value change from actually applying it, sort of a
"pending" change. The idea is to let custom logic be triggered from an assign
hook, which then takes responsibility for what happens later and how the value
is going to be applied. This allows using the regular GUC infrastructure in
cases where a value change requires some complicated processing. I tried to
keep the change minimally invasive, and GUC reporting is not implemented yet.
2. The shared memory resizing patch became more complicated due to the
coordination between backends. The current implementation was chosen from a few
more or less equal alternatives, which all follow these lines:
* There should be one "coordinator" process overseeing the change. Having the
postmaster fulfill this role, as in this patch, seems like a natural idea,
but it poses certain challenges since it doesn't have locking infrastructure.
Another option would be to elect a single backend to be the coordinator, which
would handle the postmaster as a special case. If there is ever a
"coordinator" worker in Postgres, it would be useful here.
* The coordinator uses EmitProcSignalBarrier to reach out to all other backends
and trigger the resize process. Backends join a Barrier to synchronize and wait
until everyone is finished.
* There is some resizing state stored in shared memory, which is there to
handle backends that were for some reason late or didn't receive the signal.
What to store there is open for discussion.
* Since we want to make sure all processes share the same understanding of what
the NBuffers value is, any failure is mostly a hard stop, since rolling back
the change would need coordination as well, which sounds a bit too complicated
for now.
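To make the "resizing state in shared memory" point more concrete, here is a minimal single-process sketch of one possible content for that state: a generation counter plus the target buffer count, which a late backend compares against what it last applied. All names (DemoResizeState, DemoBackend, demo_coordinator_request, demo_backend_catch_up) are hypothetical and not taken from the patch, and the sketch deliberately leaves out the locking, atomics and ProcSignalBarrier machinery a real implementation would need.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical shared state; the actual patch may store something else. */
typedef struct DemoResizeState
{
    unsigned generation;      /* bumped by the coordinator on every resize */
    int      target_nbuffers; /* buffer count every backend should adopt */
} DemoResizeState;

/* Hypothetical per-backend view of the last resize it has applied. */
typedef struct DemoBackend
{
    unsigned seen_generation;
    int      nbuffers;
} DemoBackend;

/* Coordinator side: publish a new target and advance the generation. */
void
demo_coordinator_request(DemoResizeState *shared, int new_nbuffers)
{
    assert(new_nbuffers > 0);
    shared->target_nbuffers = new_nbuffers;
    shared->generation++;
}

/*
 * Backend side: called when the signal arrives, or later by a backend that
 * missed it. Returns true if the backend had to apply a resize.
 */
bool
demo_backend_catch_up(const DemoResizeState *shared, DemoBackend *backend)
{
    if (backend->seen_generation == shared->generation)
        return false;           /* already up to date */

    backend->nbuffers = shared->target_nbuffers;
    backend->seen_generation = shared->generation;
    return true;
}
```

With such a counter, a backend that missed the signal can still notice the discrepancy the next time it looks at the shared state, instead of continuing with a stale NBuffers.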
We've tested this change only manually so far, although it might be useful to
try out injection points. The testing strategy, which has caught plenty of
bugs, was simply to run a pgbench workload against a running instance and
change shared_buffers on the fly. Some more subtle cases were verified by
manually injecting delays to trigger expected scenarios.
To reiterate, here is the patch breakdown:
Patches 1-3 prepare the infrastructure and shared memory layout. They could be
useful even with multithreaded PostgreSQL, when there will be no need for
shared memory. I assume that in the multithreaded world there will still be a
need for a contiguous chunk of memory to share between threads, and its layout
would be similar to the one with shared memory mappings. Note that patch 2 will
go away as soon as I get to implementing shared memory address reservation,
but for now it's needed.
Patch 4 is a new addition to handle "pending" GUC changes.
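As a rough illustration of the "pending" idea, here is a minimal standalone sketch in which an assign hook can claim responsibility for a new value, so that it is recorded but not applied until some later step. DemoGuc, demo_defer_hook, demo_set and demo_apply_pending are hypothetical names and are not PostgreSQL's GUC API; how the patch represents the pending flag may well differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for a GUC entry. */
typedef struct DemoGuc
{
    int  value;          /* value currently in effect */
    int  pending_value;  /* requested value, not applied yet */
    bool has_pending;
    /* Returns true if the hook takes over applying the value later. */
    bool (*assign_hook) (int newval);
} DemoGuc;

/* Example hook that always defers, e.g. because a resize must run first. */
bool
demo_defer_hook(int newval)
{
    (void) newval;
    return true;
}

/* Set a new value; if the hook defers, only remember it as pending. */
void
demo_set(DemoGuc *guc, int newval)
{
    assert(guc != NULL);
    if (guc->assign_hook != NULL && guc->assign_hook(newval))
    {
        guc->pending_value = newval;
        guc->has_pending = true;
        return;
    }
    guc->value = newval;
    guc->has_pending = false;
}

/* Called once the custom processing is done, to make the value effective. */
void
demo_apply_pending(DemoGuc *guc)
{
    assert(guc != NULL);
    if (!guc->has_pending)
        return;
    guc->value = guc->pending_value;
    guc->has_pending = false;
}
```

In this sketch the visible value stays at the old setting until demo_apply_pending runs, which is the decoupling described above.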
Patch 5 actually does the resizing. It's shared memory specific of course, and
utilizes the Linux-specific mremap, which leaves portability questions open.
Patch 6 is somewhat independent, but quite convenient to have. It also utilizes
the Linux-specific call memfd_create.
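For readers unfamiliar with the two Linux calls, here is a minimal sketch of how they can be combined: an anonymous file from memfd_create backs a shared mapping, and growing it means extending the file with ftruncate and then extending the mapping with mremap. The DemoSegment type and the demo_segment_* functions are hypothetical and not taken from the patches. The sketch uses MREMAP_MAYMOVE for simplicity; that is an assumption of this demo, and a multi-process setup where other processes hold pointers into the mapping presumably cannot let the mapping move freely.

```c
#define _GNU_SOURCE             /* for memfd_create() and mremap() */
#include <assert.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper type: one resizable shared mapping. */
typedef struct DemoSegment
{
    int    fd;      /* anonymous file backing the mapping */
    void  *addr;    /* start of the mapping */
    size_t size;    /* current size in bytes (multiple of the page size) */
} DemoSegment;

/* Create an anonymous file of the given size and map it shared. */
int
demo_segment_create(DemoSegment *seg, size_t size)
{
    seg->fd = memfd_create("demo_segment", 0);
    if (seg->fd < 0)
        return -1;
    if (ftruncate(seg->fd, (off_t) size) != 0)
    {
        close(seg->fd);
        return -1;
    }
    seg->addr = mmap(NULL, size, PROT_READ | PROT_WRITE,
                     MAP_SHARED, seg->fd, 0);
    if (seg->addr == MAP_FAILED)
    {
        close(seg->fd);
        return -1;
    }
    seg->size = size;
    return 0;
}

/* Grow the backing file first, then extend the mapping over it. */
int
demo_segment_grow(DemoSegment *seg, size_t new_size)
{
    void *p;

    assert(new_size >= seg->size);
    if (ftruncate(seg->fd, (off_t) new_size) != 0)
        return -1;
    p = mremap(seg->addr, seg->size, new_size, MREMAP_MAYMOVE);
    if (p == MAP_FAILED)
        return -1;
    seg->addr = p;
    seg->size = new_size;
    return 0;
}

/* Unmap the segment and close the backing file. */
void
demo_segment_destroy(DemoSegment *seg)
{
    munmap(seg->addr, seg->size);
    close(seg->fd);
}
```

Extending the file before the mapping matters: pages of a shared file mapping that lie beyond the end of the backing file cannot be accessed safely, so the anonymous file gives the resize something to grow into.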
I would like to get some feedback on the synchronization part. While waiting,
I'll proceed with implementing shared memory address space reservation, and
Ashutosh will continue with buffer eviction to support shared memory reduction.
Attachment | Content-Type | Size |
---|---|---|
v2-0001-Allow-to-use-multiple-shared-memory-mappings.patch | text/plain | 30.1 KB |
v2-0002-Allow-placing-shared-memory-mapping-with-an-offse.patch | text/plain | 8.7 KB |
v2-0003-Introduce-multiple-shmem-segments-for-shared-buff.patch | text/plain | 11.1 KB |
v2-0004-Introduce-pending-flag-for-GUC-assign-hooks.patch | text/plain | 11.9 KB |
v2-0005-Allow-to-resize-shared-memory-without-restart.patch | text/plain | 32.9 KB |
v2-0006-Use-anonymous-files-to-back-shared-memory-segment.patch | text/plain | 6.7 KB |