From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Charles Hornberger <charlie(at)hss(dot)caltech(dot)edu> |
Cc: | pgsql-admin(at)postgresql(dot)org |
Subject: | Re: postmaster dead but backends still running? |
Date: | 2003-06-19 17:47:48 |
Message-ID: | 10023.1056044868@sss.pgh.pa.us |
Lists: | pgsql-admin |

Charles Hornberger <charlie(at)hss(dot)caltech(dot)edu> writes:
> However, I think I know the cause (though I haven't tested to see if this
> indeed causes the postmaster to die): A few hours before I noticed that
> the postmaster was dead, one of the sysadmins made a typo that caused an
> NFS mount to become unavailable -- the very NFS mount that held the
> postgres executable (all our Solaris boxes share the same executables). So
> the theory is that the postmaster tried to fork() a process using a
> non-existent executable, and died as a result. Does this make any sense?

A fork() failure would not cause the postmaster to die (it's not
uncommon to see fork() failures due to resource limits, so this path is
really pretty well tested). I'm not familiar enough with Solaris to know
whether other fatal error conditions might arise in this scenario.
(I know HP-UX gets rather unhappy if you try to delete an executable file
or shared library that's in use by live processes...) But the trouble
with that line of thought is that the postmaster and the backends are
all the same executable; if the postmaster went south because of loss of
the executable file, I'd expect the backends not to survive it either.
Unless maybe the backends weren't actually doing anything --- is it
possible that the connected clients had issued no commands in the whole
episode?

regards, tom lane
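As a rough illustration of why a failed fork() need not be fatal to the parent, here is a minimal Python sketch. It is not PostgreSQL's actual code (the postmaster is written in C), and the names `spawn_child` and `failing_fork` are hypothetical, introduced only for this example. The parent treats a fork() error as something to report and then keeps running:

```python
import errno
import os
import sys


def spawn_child(work, fork=os.fork):
    """Fork a child that runs work(); return its PID, or None if fork() failed.

    The fork function is injectable only so the failure path can be
    exercised without actually exhausting system resources.
    """
    try:
        pid = fork()
    except OSError as exc:
        # fork() failed, for example with EAGAIN because a process limit
        # was reached. Report it and let the parent carry on.
        print(f"could not fork new process: {exc}", file=sys.stderr)
        return None

    if pid == 0:
        # Child process: do the work, then exit without returning
        # into the parent's code path.
        try:
            work()
        finally:
            os._exit(0)

    # Parent process: the child is running; hand back its PID.
    return pid


def failing_fork():
    """Hypothetical stand-in that simulates fork() hitting a resource limit."""
    raise OSError(errno.EAGAIN, os.strerror(errno.EAGAIN))
```

Calling `spawn_child(handler, fork=failing_fork)` prints the error to stderr and returns `None`, so the caller can refuse that one request and continue serving others.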