From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Magnus Hagander <magnus(at)hagander(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: windows doesn't notice backend death |
Date: | 2009-05-04 23:41:11 |
Message-ID: | 20447.1241480471@sss.pgh.pa.us |
Lists: | pgsql-hackers |
I wrote:
> I don't think we'll be able to prevent PHP from doing that :-(. But
> it now seems clear that we should try to make the database as a whole
> recover with some degree of grace. I'll go work up a patch.
Attached is a proposed patch for the "dead man switch" idea. The switch is
armed when a child process acquires a regular PGPROC and disarmed when the
PGPROC is released. (So there is no coverage for auxiliary processes, but
I doubt we need that.) I chose to put the management code into pmsignal.c
--- if you hold your head at the proper angle, this can be seen as a form
of child-to-postmaster signaling, so that seemed like a reasonable place.
The array slot number is passed down to child processes in the same way as
for MyCancelKey.
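
To illustrate the arm/disarm idea described above, here is a minimal sketch. It is not taken from the attached patch: the function names, the three-state enum, and the fixed array size are all hypothetical, and a plain static array stands in for the shared-memory array. It assumes the postmaster assigns a slot before forking, the child arms the flag when it acquires its PGPROC and disarms it on release, and the postmaster inspects the flag when it notices the child is gone.

```c
#include <assert.h>
#include <stdbool.h>

/* All names here are hypothetical; this is not the code from the patch. */
#define NUM_CHILD_SLOTS 8       /* illustrative size only */

typedef enum
{
    SLOT_UNUSED = 0,            /* no child assigned to this slot */
    SLOT_ASSIGNED,              /* postmaster gave the slot to a child */
    SLOT_ARMED                  /* child holds a PGPROC: switch is armed */
} SlotState;

/* Stand-in for an array that would really live in shared memory. */
static volatile SlotState child_slots[NUM_CHILD_SLOTS];

/* Postmaster: pick a free slot before forking; returns -1 if none is free. */
int
assign_child_slot(void)
{
    for (int i = 0; i < NUM_CHILD_SLOTS; i++)
    {
        if (child_slots[i] == SLOT_UNUSED)
        {
            child_slots[i] = SLOT_ASSIGNED;
            return i;
        }
    }
    return -1;
}

/* Child: arm the switch when it acquires its PGPROC. */
void
arm_switch(int slot)
{
    assert(slot >= 0 && slot < NUM_CHILD_SLOTS);
    child_slots[slot] = SLOT_ARMED;
}

/* Child: disarm the switch when it releases its PGPROC. */
void
disarm_switch(int slot)
{
    assert(slot >= 0 && slot < NUM_CHILD_SLOTS);
    child_slots[slot] = SLOT_ASSIGNED;
}

/*
 * Postmaster: called once the child is known to be gone.  Frees the slot
 * and returns true if the switch was still armed, meaning the child went
 * away without releasing its PGPROC.
 */
bool
release_child_slot(int slot)
{
    bool        was_armed;

    assert(slot >= 0 && slot < NUM_CHILD_SLOTS);
    was_armed = (child_slots[slot] == SLOT_ARMED);
    child_slots[slot] = SLOT_UNUSED;
    return was_armed;
}
```

In this sketch a true result from release_child_slot() is the cue for the postmaster to treat the exit as a crash, whatever exit status the operating system reported.
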
Also, since the number of array slots needed is exactly the same as the
size of the ShmemBackendArray for the EXEC_BACKEND case, I tweaked the
code for the latter a little bit to avoid duplicate array-searching ---
the array slot number assigned for the deadman switch flag is also used to
index ShmemBackendArray.
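
A rough sketch of the slot-sharing idea follows. It is an illustration under assumptions, not the patch's code: BackendEntry, its fields, and both functions are hypothetical stand-ins. The point it shows is that once a slot number has been chosen for the flag, the same number can index a parallel per-backend array directly, so removal needs no second search.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names and sizes, for illustration only. */
#define MAX_SLOTS 8

typedef struct
{
    int         pid;            /* 0 means the entry is free */
    long        cancel_key;
} BackendEntry;

static bool         slot_in_use[MAX_SLOTS];     /* stands in for the flag array */
static BackendEntry backend_array[MAX_SLOTS];   /* parallel per-backend array */

/*
 * Find a free slot once, and use the same index for both arrays.
 * Returns the slot number, or -1 if every slot is taken.
 */
int
register_backend(int pid, long cancel_key)
{
    for (int slot = 0; slot < MAX_SLOTS; slot++)
    {
        if (!slot_in_use[slot])
        {
            slot_in_use[slot] = true;
            backend_array[slot].pid = pid;
            backend_array[slot].cancel_key = cancel_key;
            return slot;
        }
    }
    return -1;
}

/* Remove a backend by slot number: direct indexing, no array search. */
void
unregister_backend(int slot)
{
    assert(slot >= 0 && slot < MAX_SLOTS && slot_in_use[slot]);
    slot_in_use[slot] = false;
    backend_array[slot].pid = 0;
    backend_array[slot].cancel_key = 0;
}
```
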
One bit of ugliness is that in the EXEC_BACKEND case, InitProcess is done
before CreateSharedMemoryAndSemaphores; so it's necessary to have
special-case code to pass down the shmem state pointer for pmsignal.c,
similarly to what we do for ProcGlobal and a few other pointers.
We could avoid that by arming the switch only sometime after
CreateSharedMemoryAndSemaphores ... but if we're gonna have this mechanism
at all, I'd like it to cover the process's entire ownership of
shared-memory objects, not only part of it.
This patch doesn't include the proposed change to use an atexit callback to
ensure that proc_exit cleanup gets done --- that looks like a trivial
refactoring in proc_exit, but it's a separate idea anyway.
Barring objections I'll go ahead and apply this to HEAD. I'm wondering
whether we are sufficiently worried about the Windows task manager issue
to risk back-patching into 8.3 and 8.2 ... comments?
regards, tom lane
Attachment | Content-Type | Size |
---|---|---|
dead-man-switch-1.patch.gz | application/octet-stream | 6.9 KB |