Re: Mac OS X: system shutdown prevents checkpoint

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Peter Bierman <bierman(at)apple(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Mac OS X: system shutdown prevents checkpoint
Date: 2002-05-02 04:45:19
Message-ID: 4752.1020314719@sss.pgh.pa.us
Lists: pgsql-general pgsql-hackers

Peter Bierman <bierman(at)apple(dot)com> writes:
> Is fork() disallowed after shutdown starts?
>>
>> No, it's allowed. But, depending upon timing, the new process may be
>> hammered with a SIGTERM right away (maybe even before main()).

Good point. The fork is executed with SIGTERM blocked --- but the
checkpoint child process currently will enable SIGTERM shortly after
being forked. On reflection that seems like a bad idea; probably the
checkpoint process should ignore SIGTERM so that it won't get killed
prematurely during system shutdown.
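
A minimal sketch of that idea, not the actual postmaster code: fork with
SIGTERM blocked, then have the child set SIGTERM to be ignored before it
restores the signal mask, so there is no window in which a shutdown-time
SIGTERM can kill it. The names fork_ignoring_sigterm() and
sample_checkpoint_work() are hypothetical, introduced only for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>   /* waitpid(), for the caller that reaps the child */
#include <unistd.h>

/* Hypothetical stand-in for the checkpoint work: sends SIGTERM to itself
 * to show that the signal is ignored, then reports exit status 7. */
int sample_checkpoint_work(void)
{
    raise(SIGTERM);     /* ignored, because the caller set SIG_IGN */
    return 7;
}

/* Hypothetical helper: fork a child that ignores SIGTERM.
 * Parent: returns the child's pid, or -1 on failure.
 * Child: runs work() and exits with its return value. */
pid_t fork_ignoring_sigterm(int (*work)(void))
{
    sigset_t block, saved;
    pid_t pid;

    assert(work != NULL);

    /* Block SIGTERM across the fork, as the message describes. */
    sigemptyset(&block);
    sigaddset(&block, SIGTERM);
    if (sigprocmask(SIG_BLOCK, &block, &saved) != 0)
        return -1;

    pid = fork();
    if (pid == 0) {
        /* Child: ignore SIGTERM first, and only then restore the mask. */
        signal(SIGTERM, SIG_IGN);
        sigprocmask(SIG_SETMASK, &saved, NULL);
        _exit(work());
    }

    /* Parent, or fork() failure: restore the original mask. */
    sigprocmask(SIG_SETMASK, &saved, NULL);
    return pid;
}
```

A caller would invoke fork_ignoring_sigterm(sample_checkpoint_work) and later
reap the child with waitpid(); the child should report a normal exit rather
than death by SIGTERM.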

However, that doesn't explain our OS X problem. I added some debug
printouts, and can now report positively that (a) the fork() call
returns normally in the parent process, providing an apparently-correct
child PID value; but (b) the fork never returns in the child. It
doesn't ever get as far as trying to enable SIGTERM.
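
One way to make that observation mechanical, sketched here as an assumption
rather than as the debug code actually used: have the child write a byte to a
pipe as its first action after fork() returns, and let the parent wait a
bounded time for it. fork_and_confirm() and its timeout parameter are
hypothetical names.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <poll.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical diagnostic: fork, then report whether the child ever got
 * control back from fork() within timeout_ms milliseconds.
 * Returns 1 if the child checked in, 0 if it did not, -1 on error.
 * *child_pid receives the value fork() returned in the parent. */
int fork_and_confirm(int timeout_ms, pid_t *child_pid)
{
    int fds[2];
    pid_t pid;
    struct pollfd pfd;
    char byte;
    int checked_in;

    assert(child_pid != NULL);
    if (pipe(fds) != 0)
        return -1;

    pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: the first thing done once fork() has returned. */
        close(fds[0]);
        if (write(fds[1], "x", 1) != 1)
            _exit(1);
        _exit(0);
    }

    /* Parent: fork() returned a pid; now see whether the child shows up. */
    close(fds[1]);
    *child_pid = pid;
    fprintf(stderr, "fork() returned %ld in parent\n", (long) pid);

    pfd.fd = fds[0];
    pfd.events = POLLIN;
    /* A timeout, or EOF without data, both mean the child never checked in. */
    checked_in = poll(&pfd, 1, timeout_ms) > 0 && read(fds[0], &byte, 1) == 1;
    close(fds[0]);
    return checked_in;
}
```

A result of 0 with a valid *child_pid would match case (b): the parent sees a
pid, but the child never reaches the code after fork().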

>> Is fork really returning a PID in the parent, and it just looks like the
>> child didn't make it to returning from its fork() call? There are some
>> preparation things that happen in dyld and libc as part of returning from
>> fork in the child, and these run before we make it look like fork()
>> returned in the child. If they encounter an error (maybe because the
>> services they need to talk to are no longer available), they have nothing
>> else to do but call _exit() - making it look like the child never returned
>> from fork().

Hmmm ... that seems very close to what I'm seeing.

>> But in either the dyld/libc exit case, or the signal case, the parent
>> should get a wait result indicating why the child went away so
>> prematurely.

The parent is not getting any wait() result indicating that its child died.
(If it were, we'd not have the problem being complained of.)
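
For reference, a sketch of what such a wait result would look like if it did
arrive: the status from waitpid() distinguishes an _exit() in the child from
death by signal. reap_and_describe() is a hypothetical helper, not a function
from the PostgreSQL source.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: wait for child `pid` and describe in buf why it
 * went away. Returns 0 on success, -1 if waitpid() failed. */
int reap_and_describe(pid_t pid, char *buf, size_t len)
{
    int status;
    pid_t r;

    assert(buf != NULL && len > 0);

    /* Retry if the wait itself is interrupted by a signal. */
    do {
        r = waitpid(pid, &status, 0);
    } while (r == -1 && errno == EINTR);
    if (r != pid)
        return -1;

    if (WIFEXITED(status))
        /* The child called exit() or _exit(), e.g. from dyld/libc setup. */
        snprintf(buf, len, "exited with status %d", WEXITSTATUS(status));
    else if (WIFSIGNALED(status))
        /* The child was killed, e.g. by a shutdown-time SIGTERM. */
        snprintf(buf, len, "killed by signal %d (%s)",
                 WTERMSIG(status), strsignal(WTERMSIG(status)));
    else
        snprintf(buf, len, "unexpected wait status %d", status);
    return 0;
}
```

Note that this call blocks until the child changes state, so a child stuck
inside fork() would leave the parent waiting, which is consistent with no
wait result ever being seen.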

Is it possible that something in the child's fork() processing will wait
around for a response from a service that's already died? Why is fork()
dependent on any outside service whatever --- isn't that a certain
recipe for system failures?

regards, tom lane
