Re: [HACKERS] Reducing sema usage (was Postmaster dies with many child processes)

From: The Hermit Hacker <scrappy(at)hub(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: [HACKERS] Reducing sema usage (was Postmaster dies with many child processes)
Date: 1999-01-31 01:52:42
Message-ID: Pine.BSF.4.05.9901302150410.13391-100000@thelab.hub.org
Lists: pgsql-hackers

On Sat, 30 Jan 1999, Tom Lane wrote:

> I said:
> > Another thing we ought to look at is changing the use of semaphores so
> > that Postgres uses a fixed number of semaphores, not a number that
> > increases as more and more backends are started. Kernels are
> > traditionally configured with very low limits for the SysV IPC
> > resources, so having a big appetite for semaphores is a Bad Thing.
>
> I've been looking into this issue today, and it looks possible but messy.
>
> The source of the problem is the lock manager
> (src/backend/storage/lmgr/proc.c), which wants to be able to wake up a
> specific process that is blocked on a lock. I had first thought that it
> would be OK to wake up any one of the processes waiting for a lock, but
> after looking at the lock manager that seems a bad idea --- considerable
> thought has gone into the queuing order of waiting processes, and we
> don't want to give that up. So we need to preserve this ability.
>
> The way it's currently done is that each extant backend has its own
> SysV-style semaphore, and when you want to wake up a particular backend
> you just V() its semaphore. (BTW, the semaphores get allocated in
> chunks of 16, so an out-of-semaphores condition will always occur when
> trying to start the 16*N+1'th backend...) This is simple and reliable
> but fails if you want to have more backends than the kernel has SysV
> semaphores. Unfortunately kernels are usually configured with not
> very many semaphores --- 64 or so is typical. Also, running the system
> down to nearly zero free semaphores is likely to cause problems for
> other subsystems even if Postgres itself doesn't run out.
>
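
The chunking arithmetic described above can be sketched in a few lines of C. This is only an illustration: the chunk size of 16 comes from the message, but the function names are hypothetical and the kernel limit is modeled as one system-wide total, which ignores per-set and per-system set limits.

```c
#include <assert.h>

#define SEMAS_PER_SET 16   /* chunk size mentioned in the message */

/* Semaphores held by Postgres when n_backends are running (hypothetical helper). */
static int semas_allocated(int n_backends)
{
    assert(n_backends >= 0);
    int sets = (n_backends + SEMAS_PER_SET - 1) / SEMAS_PER_SET;
    return sets * SEMAS_PER_SET;
}

/* True if starting the backend_no'th backend (1-based) needs a fresh chunk. */
static int needs_new_set(int backend_no)
{
    assert(backend_no >= 1);
    return (backend_no - 1) % SEMAS_PER_SET == 0;
}

/* First backend whose startup fails, assuming a single system-wide limit. */
static int first_failing_backend(int kernel_sema_limit)
{
    assert(kernel_sema_limit >= 0);
    int usable_sets = kernel_sema_limit / SEMAS_PER_SET;
    return usable_sets * SEMAS_PER_SET + 1;   /* always of the form 16*N+1 */
}
```

With a limit of 64 semaphores this gives backend 65 as the first failure; with a limit of 60, the last 12 semaphores are unusable and backend 49 already fails.
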
> What seems practical to do instead is this:
> * At postmaster startup, allocate a fixed number of semaphores for
> use by all child backends. ("Fixed" can really mean "configurable",
> of course, but the point is we won't ask for more later.)
> * The semaphores aren't dedicated to use by particular backends.
> Rather, when a backend needs to block, it finds a currently free
> semaphore and grabs it for the duration of its wait. The number
> of the semaphore a backend is using to wait with would be recorded
> in its PROC struct, and we'd also need an array of per-sema data
> to keep track of free and in-use semaphores.
> * This works with very little extra overhead until we have more
> simultaneously-blocked backends than we have semaphores. When that
> happens (which we hope is really seldom), we overload semaphores ---
> that is, we use the same sema to block two or more backends. Then
> the V() operation by the lock's releaser might wake the wrong backend.
> So, we need an extra field in the LOCK struct to identify the intended
> wakee. When a backend is released in ProcSleep, it has to look at
> the lock it is waiting on to see if it is supposed to be wakened
> right now. If not, it V()s its shared semaphore a second time (to
> release the intended wakee), then P()s the semaphore again to go
> back to sleep itself. There probably has to be a delay in here,
> to ensure that the intended wakee gets woken and we don't have its
> bed-mates indefinitely trading wakeups among the wrong processes.
> This is why we don't want this scenario happening often.
>
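
A rough sketch of the bookkeeping this scheme would need, in C. It models only the per-sema array and the "was this wakeup meant for me?" check; it makes no real semaphore calls. All names (SemaSlot, sema_acquire, and so on) are hypothetical, the pool size of 4 is arbitrary, and a real version would keep the array in shared memory under a spinlock.

```c
#include <assert.h>

#define NUM_SEMAS 4   /* fixed pool size; arbitrary value for illustration */

/* Hypothetical per-sema data: how many backends are blocked on each sema. */
typedef struct {
    int waiters;
} SemaSlot;

static SemaSlot sema_pool[NUM_SEMAS];   /* zero-initialized: all free */

/*
 * Pick a semaphore for a backend that is about to block.  A free sema
 * (waiters == 0) is preferred; when none is free, the least-loaded one
 * is overloaded.  The returned number would be stored in the PROC struct.
 */
static int sema_acquire(void)
{
    int best = 0;
    for (int i = 1; i < NUM_SEMAS; i++)
        if (sema_pool[i].waiters < sema_pool[best].waiters)
            best = i;
    sema_pool[best].waiters++;
    return best;
}

/* Give the semaphore back once the backend has really been woken. */
static void sema_release(int sema_no)
{
    assert(sema_no >= 0 && sema_no < NUM_SEMAS);
    assert(sema_pool[sema_no].waiters > 0);
    sema_pool[sema_no].waiters--;
}

/*
 * Check made by a backend after P() returns.  intended_wakee stands for
 * the extra field in the LOCK struct.  If this returns 0, the backend
 * would V() the sema again and go back to sleep.
 */
static int wakeup_is_mine(int intended_wakee, int my_backend_id)
{
    return intended_wakee == my_backend_id;
}
```

Overloading only starts with the fifth simultaneous waiter here, which matches the point that the expensive path should be rare.
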
> I think this could be made to work, but it would be a delicate and
> hard-to-test change in what is already pretty subtle code.
>
> A considerably more straightforward approach is just to forget about
> incremental allocation of semaphores and grab all we could need at
> postmaster startup. ("OK, Mac, you told me to allow up to N backends?
> Fine, I'm going to grab N semaphores at startup, and if I can't get them
> I won't play.") This would force the DB admin to either reconfigure the
> kernel or reduce MaxBackendId to something the kernel can support right
> off the bat, rather than allowing the problem to lurk undetected until
> too many clients are started simultaneously. (Note there are still
> potential gotchas with running out of processes, swap space, or file
> table slots, so we wouldn't have really guaranteed that N backends can
> be started safely.)
>
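
For what it's worth, the "grab N at startup or refuse to play" idea might look roughly like this with the standard SysV calls semget() and semctl(). The function names and the choice to keep allocating in sets of 16 are assumptions for illustration, not what proc.c does today.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

#define SEMAS_PER_SET 16   /* assumed: keep the existing chunk size */

/* Number of 16-semaphore sets needed for max_backends backends. */
static int sets_needed(int max_backends)
{
    return (max_backends + SEMAS_PER_SET - 1) / SEMAS_PER_SET;
}

/* Remove previously created sets, e.g. on failure or at shutdown. */
static void release_sema_sets(const int *set_ids, int n_sets)
{
    for (int i = 0; i < n_sets; i++)
        (void) semctl(set_ids[i], 0, IPC_RMID);
}

/*
 * Reserve every semaphore up front.  Returns the number of sets created,
 * or -1 if the kernel (or the caller's array) cannot hold them all, in
 * which case nothing is left allocated and startup should be refused.
 */
static int reserve_sema_sets(int max_backends, int *set_ids, int ids_capacity)
{
    assert(set_ids != NULL);
    if (max_backends <= 0 || sets_needed(max_backends) > ids_capacity)
        return -1;

    int n_sets = sets_needed(max_backends);
    for (int i = 0; i < n_sets; i++) {
        set_ids[i] = semget(IPC_PRIVATE, SEMAS_PER_SET, IPC_CREAT | 0600);
        if (set_ids[i] < 0) {
            release_sema_sets(set_ids, i);   /* undo the partial grab */
            return -1;
        }
    }
    return n_sets;
}
```

The postmaster would call reserve_sema_sets() once with the configured backend limit and exit with an error on -1, so the admin finds out about the kernel limit immediately instead of under load.
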
> If we make MaxBackendId settable from a postmaster command-line switch
> then this second approach is probably not too inconvenient, though it
> surely isn't pretty.
>
> Any thoughts about which way to jump? I'm sort of inclined to take
> the simpler approach myself...

I'm inclined to agree...get rid of the 'hard coded' max, make it an
option settable at run time, and 'reserve the semaphores' at startup...

Marc G. Fournier
Systems Administrator @ hub.org
primary: scrappy(at)hub(dot)org secondary: scrappy(at){freebsd|postgresql}.org
