Re: connection establishment versus parallel workers

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Nathan Bossart <nathandbossart(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: connection establishment versus parallel workers
Date: 2025-02-06 22:53:58
Message-ID: CA+hUKGLLDcDTsHypTmCzAtKxMKwkd84cwy8PsMGMcpU6CQO76A@mail.gmail.com
Lists: pgsql-hackers

On Mon, Jan 20, 2025 at 6:33 PM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
> Here's the WIP code I have come up with for that so far.
>
> Remaining opportunities not attempted:
> 1. When a child exits, we could use a hash table to find it by pid.
> 2. When looking for a bgworker slot that is not in use, we could do
> something better than linear search.

I haven't had time to work on this again due to other projects, but I
wanted to write down the ideas I thought about for the record.
Obviously we'd want some kind of free list, but the postmaster would
need to push free slots into it when workers exit, and it can't use
locks or any data structures that can cause it to get stuck just
because a backend has corrupted shared memory. It must always be able
to process shutdown commands and coordinate crash restarts, no matter
how bananas the backends go. With that in mind:

Idea #1: A CAS-based linked list, relying on CAS never being emulated
with locks, and probably requiring 16 bit indexes since we can only
expect 32 bit atomic hardware and you might need to change both the
head and tail of a hypothetical list head when going from empty to one
element. But you have to convince yourself that it's OK to run a
CAS-loop in the postmaster, which might in theory be prevented
from completing...

Idea #2: The free list could be a simple circular buffer of slot
index numbers. That would be symmetrical with
0001-Remove-BackgroundWorkerStateChange-s-outer-loop.patch's shared
memory "start" queue, and could be coded essentially the same way.
The "start" queue and the "free" queue would then both be simple
arrays with a head and a tail, and in both cases only the consumers
(regular backends) need an lwlock to serialise against each other,
while the producer (the postmaster) can get away with careful memory
barriers and just has to range-check the head/tail indexes to deal
with untrusted shared memory contents. A rogue backend can jam up the
bgworker subsystem, but that's already true. It still can't prevent
shutdown or crash restart.
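
To make the shape of idea #2 concrete, here is a rough standalone
sketch using C11 atomics rather than PostgreSQL's own atomics and
lwlock APIs. All names (FreeQueue, free_queue_push, free_queue_pop,
FREE_QUEUE_SIZE) are made up for illustration and do not come from the
patch. The consumer-side lock is assumed to be held by the caller and
is not shown.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define FREE_QUEUE_SIZE 8		/* hypothetical capacity + 1 */

static_assert(FREE_QUEUE_SIZE > 1, "queue needs at least two entries");

typedef struct FreeQueue
{
	_Atomic uint32_t head;		/* next entry to consume (backends) */
	_Atomic uint32_t tail;		/* next entry to fill (postmaster) */
	uint32_t	slots[FREE_QUEUE_SIZE];
} FreeQueue;

/*
 * Producer side, i.e. the postmaster.  No locks and no retry loops.
 * Both indexes come from shared memory, so they are range-checked
 * before being used to address the array.
 */
static bool
free_queue_push(FreeQueue *q, uint32_t slot)
{
	uint32_t	tail = atomic_load_explicit(&q->tail, memory_order_relaxed);
	uint32_t	head = atomic_load_explicit(&q->head, memory_order_acquire);
	uint32_t	next;

	if (tail >= FREE_QUEUE_SIZE || head >= FREE_QUEUE_SIZE)
		return false;			/* corrupted indexes: give up */
	next = (tail + 1) % FREE_QUEUE_SIZE;
	if (next == head)
		return false;			/* full */
	q->slots[tail] = slot;
	/* Publish the entry only after its contents have been written. */
	atomic_store_explicit(&q->tail, next, memory_order_release);
	return true;
}

/*
 * Consumer side, i.e. a regular backend.  The caller is assumed to
 * hold a lock that serialises consumers against each other.
 */
static bool
free_queue_pop(FreeQueue *q, uint32_t *slot)
{
	uint32_t	head = atomic_load_explicit(&q->head, memory_order_relaxed);
	uint32_t	tail = atomic_load_explicit(&q->tail, memory_order_acquire);

	if (head >= FREE_QUEUE_SIZE || tail >= FREE_QUEUE_SIZE)
		return false;			/* corrupted indexes */
	if (head == tail)
		return false;			/* empty */
	*slot = q->slots[head];
	atomic_store_explicit(&q->head, (head + 1) % FREE_QUEUE_SIZE,
						  memory_order_release);
	return true;
}
```

One entry is always left unused so that head == tail can only mean
"empty". If the push fails, the postmaster just carries on, which is
the property that matters here.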

Idea #3: Suggested by Robert in an off-list chat about all this:
backends could maintain a shared memory free-list using existing dlist
technology protected by an lwlock. When it's empty *they* (not the
postmaster) would fill it up again using a linear slot search. It's
simple but not quite as satisfying, because it is still possible to
degrade to high frequency linear searches that only find a small
number of free slots each time if you're unlucky, i.e. you can entirely
fail to amortise.

Idea #4: Maintain a bitmap of free slot indexes, which the
postmaster sets with atomic_fetch_or().
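
A rough standalone sketch of idea #4, again with C11 atomics and
made-up names (FreeBitmap, free_bitmap_release, free_bitmap_claim,
MAX_SLOTS). It assumes 32 bit atomic operations that are not emulated
with locks. How backends claim a slot is not specified above; clearing
the bit with atomic_fetch_and() is just one possibility.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define MAX_SLOTS 64			/* hypothetical number of slots */
#define BITMAP_WORDS ((MAX_SLOTS + 31) / 32)

static_assert(MAX_SLOTS > 0, "need at least one slot");

typedef struct FreeBitmap
{
	_Atomic uint32_t words[BITMAP_WORDS];	/* bit set = slot is free */
} FreeBitmap;

/* Postmaster side: mark a slot free with one atomic OR. */
static void
free_bitmap_release(FreeBitmap *b, uint32_t slot)
{
	if (slot >= MAX_SLOTS)
		return;					/* ignore out-of-range input */
	atomic_fetch_or(&b->words[slot / 32], UINT32_C(1) << (slot % 32));
}

/* Position of the single bit set in 'bit', which must be non-zero. */
static uint32_t
bit_position(uint32_t bit)
{
	uint32_t	i = 0;

	while ((bit >> i) != 1)
		i++;
	return i;
}

/*
 * Backend side: find a set bit and try to clear it.  Whoever sees the
 * bit still set in the value returned by atomic_fetch_and() owns the
 * slot.  Returns the slot index, or -1 if no slot is free.
 */
static int
free_bitmap_claim(FreeBitmap *b)
{
	for (uint32_t w = 0; w < BITMAP_WORDS; w++)
	{
		uint32_t	bits = atomic_load(&b->words[w]);

		while (bits != 0)
		{
			uint32_t	bit = bits & (~bits + 1);	/* lowest set bit */
			uint32_t	old = atomic_fetch_and(&b->words[w], ~bit);

			if (old & bit)
				return (int) (w * 32 + bit_position(bit));
			bits = old & ~bit;	/* someone else got it; try the next */
		}
	}
	return -1;
}
```

The scan on the claim side is still linear, but over words rather than
slots, so it inspects 32 slots per load.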

I lean towards idea #2, but haven't actually tried it.

A similar situation exists for DSM slot management, though that has
different complications: no postmaster interaction, but funky handle
requirements due to portability concerns. I think the handles could
probably be changed to encode the slot index + generation for O(1)
lookup at "attach" time, and free slots could be stored in a circular
queue for O(1) slot allocation. I think the use of random numbers
stemmed from SysV shared memory's need to find a free key in a 32 bit
OS-wide namespace (yuck). I haven't looked at that code in a while
but I don't recall any reason why even those couldn't be hidden inside
the slot itself, instead of being exposed in the handle, forcing
linear searches. A generation scheme would also be more robust
against weird random number collisions, and could detect handles that
were once valid but no longer are in a more obvious way, instead of "I
looked everywhere and I couldn't find it".
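
For illustration only, a handle that encodes slot index + generation
might look something like the sketch below. The bit split, the names
(Handle, Slot, make_handle, lookup_handle) and the result codes are
all assumptions made up for this example and are not taken from
dsm.c. A real version would also have to keep any reserved "invalid
handle" value out of the encoding.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SLOT_BITS 12			/* hypothetical: up to 4096 slots */
#define SLOT_MASK ((UINT32_C(1) << SLOT_BITS) - 1)
#define GEN_MASK  (UINT32_MAX >> SLOT_BITS)

static_assert(SLOT_BITS > 0 && SLOT_BITS < 32, "bad handle bit split");

typedef uint32_t Handle;

typedef struct Slot
{
	uint32_t	generation;		/* bumped each time the slot is reused */
	bool		in_use;
} Slot;

typedef enum
{
	HANDLE_OK,					/* slot found, generation matches */
	HANDLE_STALE,				/* slot exists, but was freed or reused */
	HANDLE_INVALID				/* index out of range */
} HandleStatus;

/* Pack the generation into the high bits, the slot index into the low. */
static Handle
make_handle(uint32_t index, uint32_t generation)
{
	return ((generation & GEN_MASK) << SLOT_BITS) | (index & SLOT_MASK);
}

/* O(1) lookup at attach time: decode, range-check, compare generation. */
static HandleStatus
lookup_handle(const Slot *slots, uint32_t nslots, Handle h, uint32_t *index)
{
	uint32_t	idx = h & SLOT_MASK;
	uint32_t	gen = h >> SLOT_BITS;

	if (idx >= nslots)
		return HANDLE_INVALID;
	if (!slots[idx].in_use || (slots[idx].generation & GEN_MASK) != gen)
		return HANDLE_STALE;
	*index = idx;
	return HANDLE_OK;
}
```

The stale case is the "more obvious" failure mentioned above: the slot
is found immediately and the generation mismatch says why the handle
is no good.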
