Re: Changing shared_buffers without restart

From: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Changing shared_buffers without restart
Date: 2024-11-26 19:17:58
Message-ID: pw5phxoc34xh4o5mtv7x4ujgf3jmdsiihbi5n5d5ucy64ksjow@7i6gkeeqrp6w
Lists: pgsql-hackers

> On Mon, Nov 25, 2024 at 02:33:48PM GMT, Robert Haas wrote:
>
> I think the idea of having multiple shared memory segments is
> interesting and makes sense, but I would prefer to see them called
> "segments" rather than "slots" just as we do for DSMs. The name
> "slot" is somewhat overused, and invites confusion with replication
> slots, inter alia. I think it's possible that having multiple fixed
> shared memory segments will spell trouble on Windows, where we already
> need to use a retry loop to try to get the main shared memory segment
> mapped at the correct address. If there are multiple segments and we
> need whatever ASLR stuff happens on Windows to not place anything else
> overlapping with any of them, that means there's more chances for
> stuff to fail than if we just need one address range to be free.
> Granted, the individual ranges are smaller, so maybe it's fine? But I
> don't know.

I haven't had a chance to experiment with that on Windows, but I'm
hoping that, in the worst case, falling back to a single mapping via the
proposed infrastructure (with the consequent limitations) would be
acceptable.

> The big thing that worries me is synchronization, and while I've only
> looked at the patch set briefly, it doesn't look to me as though
> there's enough machinery here to make that work correctly. Suppose
> that shared_buffers=8GB (a million buffers) and I change it to
> shared_buffers=16GB (2 million buffers). As soon as any one backend
> has seen that changed and expanded shared_buffers, there's a
> possibility that some other backend which has not yet seen the change
> might see a buffer number greater than a million. If it tries to use
> that buffer number before it absorbs the change, something bad will
> happen. The most obvious way for it to see such a buffer number - and
> possibly the only one - is to do a lookup in the buffer mapping table
> and find a buffer ID there that was inserted by some other backend
> that has already seen the change.

Right, I haven't put much effort into synchronization yet. It's on my
list for the next iteration of the patch.
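
To make the hazard above concrete, here is a minimal sketch, not taken
from the patch. BackendView, buffer_id_is_usable and nbuffers_for_bytes
are hypothetical names invented for illustration, and the 8192-byte block
size is assumed to be the default BLCKSZ. It shows the bounds check a
backend would fail if it picked up a buffer ID from the mapping table
before absorbing the new size.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-backend view of the buffer pool; not PostgreSQL's real state. */
typedef struct BackendView
{
    int known_nbuffers;     /* number of buffers this backend currently has mapped */
} BackendView;

/* Number of buffers that fit in size_bytes, assuming fixed-size blocks. */
static int
nbuffers_for_bytes(unsigned long long size_bytes, unsigned int block_size)
{
    assert(block_size > 0);
    return (int) (size_bytes / block_size);
}

/*
 * True if the 0-based buf_id lies inside what this backend has mapped.
 * A backend that has not yet absorbed an expansion gets false for any
 * ID handed out by a backend that already did.
 */
static bool
buffer_id_is_usable(const BackendView *view, int buf_id)
{
    return buf_id >= 0 && buf_id < view->known_nbuffers;
}
```

With 8GB and 8kB blocks, nbuffers_for_bytes gives 1048576, the "million
buffers" from the example. A backend still holding that value would
reject buffer ID 1500000, while one that has seen the 16GB setting would
accept it.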

> code, but I'm not sure exactly which points are safe. If we have no
> code anywhere that assumes the address of an unpinned buffer can't
> change before we pin it, then I guess the check for pins is the only
> thing we need, but I don't know that to be the case.

I'm probably missing something here. What scenario do you have in mind
in which the address of a buffer changes?

> I guess I would have imagined that a change like this would have to be
> done in phases. In phase 1, we'd tell all of the backends that
> shared_buffers had expanded to some new, larger value; but the new
> buffers wouldn't be usable for anything yet. Then, once we confirmed
> that everyone had the memo, we'd tell all the backends that those
> buffers are now available for use. If shared_buffers were contracted,
> phase 1 would tell all of the backends that shared_buffers had
> contracted to some new, smaller value. Once a particular backend
> learns about that, they will refuse to put any new pages into those
> high-numbered buffers, but the existing contents would still be valid.
> Once everyone has been told about this, we can go through and evict
> all of those buffers, and then let everyone know that's done. Then
> they shrink their mappings.

Yep, sounds good. I was pondering a cruder approach, but doing this in
phases seems to be the way to go.
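
The phased scheme described above could be sketched roughly as follows.
This is only an illustration under stated assumptions: ResizeState,
resize_begin, backend_ack and resize_finish are hypothetical names, the
acknowledgement counter stands in for whatever signalling mechanism would
really be used, and the eviction step for shrinking is reduced to a
comment.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical shared state for a two-phase resize; not from the patch. */
typedef struct ResizeState
{
    int mapped_nbuffers;    /* buffers every backend must have mapped */
    int usable_nbuffers;    /* buffers that may receive new pages */
    int pending_acks;       /* backends that have not absorbed the announcement */
} ResizeState;

/* Phase 1: announce the new size without changing what is safe to rely on. */
static void
resize_begin(ResizeState *st, int new_nbuffers, int nbackends)
{
    assert(st->pending_acks == 0);
    if (new_nbuffers > st->mapped_nbuffers)
        st->mapped_nbuffers = new_nbuffers;     /* expand: map more, do not use yet */
    else
        st->usable_nbuffers = new_nbuffers;     /* shrink: stop filling high buffers */
    st->pending_acks = nbackends;
}

/* Called once per backend after it has absorbed the announcement. */
static void
backend_ack(ResizeState *st)
{
    assert(st->pending_acks > 0);
    st->pending_acks--;
}

/* Phase 2: only once everyone has the memo; returns false if it is too early. */
static bool
resize_finish(ResizeState *st, int new_nbuffers)
{
    if (st->pending_acks > 0)
        return false;
    /* When shrinking, buffers at or above new_nbuffers would be evicted here. */
    st->usable_nbuffers = new_nbuffers;
    st->mapped_nbuffers = new_nbuffers;
    assert(st->usable_nbuffers <= st->mapped_nbuffers);
    return true;
}
```

The point of the sketch is the invariant usable_nbuffers <=
mapped_nbuffers: between the two phases no backend can be handed a buffer
that some other backend has not mapped yet, and when shrinking nobody
unmaps buffers that may still hold valid pages.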

> It looks to me like the patch doesn't expand the buffer mapping table,
> which seems essential. But maybe I missed that.

Do you mean the "Shared Buffer Lookup Table"? It does expand it, but
under the somewhat unfitting name STRATEGY_SHMEM_SLOT. Now that I look
at the code, though, I see a few issues around that -- so I would have
to improve it anyway. Thanks for pointing that out.
