- From: bowen@sgi.com (Jerre Bowen)
-
- Folks:
-
- I'm wondering if there is an easy way in POSIX to be absolutely
- certain that a process which calls a library routine that forks and waits
- on a child does not lose any SIGCLDs. I apologize for the length of this
- article. Here's the scenario:
-
-
- #include <signal.h>
- #include <stdlib.h>
- #include <unistd.h>
- #include <sys/types.h>
- #include <sys/wait.h>
-
- void cldhandler(int sig);
- void forkit(void);
-
- pid_t pid;
-
- int
- main(void)
- {
-     sigset_t mtmask;
-     struct sigaction action;
-
-     sigemptyset(&mtmask);   /* sigsuspend with no sigs blocked */
-
-     /* SIGCLD handler runs with SIGCLD blocked */
-     sigemptyset(&action.sa_mask);
-     sigaddset(&action.sa_mask, SIGCLD);
-     action.sa_handler = cldhandler;
-     action.sa_flags = 0;
-     sigaction(SIGCLD, &action, NULL);
-
-     if ((pid = fork()) == 0) {
-         sleep(1);
-         exit(0);
-     }
-     else {
-         forkit();
-         sigsuspend(&mtmask);    /* will parent awaken? */
-     }
-     return 0;
- }
-
- void
- cldhandler(int sig)
- {
-     int stat;
-
-     waitpid(pid, &stat, (WNOHANG|WUNTRACED));
- }
-
- void
- forkit(void)
- {
-     struct sigaction act, oact;
-
-     act.sa_handler = SIG_DFL;
-     sigemptyset(&act.sa_mask);
-     act.sa_flags = 0;
-     sigaction(SIGCLD, &act, &oact);  /* default handling for SIGCLD */
-     /* process forks and execs a program which runs for at least 1 sec */
-     /* process does a waitpid() on its child process */
-     sigaction(SIGCLD, &oact, NULL);  /* reinstall prior handling */
- }
-
-
- The problem here is that the original child of the parent will
- exit while forkit() is executing, and since SIGCLD is SIG_DFL'ed during
- that time, a zombie *will* be created, but the SIGCLD will *not* be delivered.
- The parent then suspends waiting for the SIGCLD indicating that
- its child exited, which of course never arrives. (Obviously, I am
- primarily concerned about the case where forkit() is a library routine, and
- the user has no idea what the routine is doing with signals--and
- *shouldn't* need to either.)
-
- SysV solves this problem in signal() and sigset() by checking for
- zombied children at the bottom of the kernel code, and--if any exist--
- re-raising a SIGCLD, thus creating the impression that it is impossible to
- lose a SIGCLD.
-
- BSD requires the user to get around the problem of lost SIGCHLDs
- by calling wait3(WNOHANG) until no more children remain to be reaped
- whenever one SIGCHLD is received. But in a BSD version of the above code,
- you never get any SIGCHLD, so the parent hangs.
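
A minimal sketch of the BSD-style idiom described above, using POSIX waitpid() in place of wait3(). The names reap_all, install_reaper and reaped are illustrative assumptions, not from the original post. The loop only helps once a SIGCHLD is actually delivered, so it does not by itself cure the hang in the sample program.

```c
#define _POSIX_C_SOURCE 200809L   /* request the POSIX declarations */
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical counter (not in the original post): children reaped so far. */
static volatile sig_atomic_t reaped = 0;

/* One SIGCHLD may stand for several exited children, so keep reaping
 * until waitpid() reports that none are ready (0) or none exist (-1). */
static void reap_all(int sig)
{
    int saved_errno = errno;   /* waitpid() may overwrite errno */
    int status;

    (void)sig;
    while (waitpid(-1, &status, WNOHANG) > 0)
        reaped++;
    errno = saved_errno;
}

/* Hypothetical helper: install reap_all() as the SIGCHLD handler. */
static int install_reaper(void)
{
    struct sigaction sa;

    sa.sa_handler = reap_all;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(SIGCHLD, &sa, NULL);
}
```

Because pending signals of the same type are not queued, several children that exit while SIGCHLD is blocked produce a single delivery; the loop is what keeps the handler from leaving zombies behind in that case.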
-
- POSIX has provided waitpid in order to allow library routines
- such as system(3) and popen/pclose(3), which need to fork and wait for
- child processes, to be implemented reliably even in the case that the
- calling program has child processes that may terminate while in the
- library routines. But the above program example shows that a conforming
- implementation still does not necessarily allow an application program
- to depend on facilities like system(3). The reason is that POSIX explicitly
- leaves undefined the question of whether SIGCHLD is raised when a process
- with a terminated child for which it has not waited establishes a handler
- for SIGCHLD (see section 3.3.1.3 paragraph 3(e)). One way in which an
- implementation can make the above program work properly is to raise
- SIGCHLD in this case (i.e. whenever a process with an outstanding zombie
- calls sigaction to set a handler for SIGCHLD).
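
For illustration, the behavior described above can be approximated in user code. This is a sketch under assumptions, not something the standard discussed here provides: sigchld_action, note_sigchld and sigchld_seen are hypothetical names, and waitid() with WNOWAIT comes from later revisions of POSIX. The wrapper sets the SIGCHLD action and then raises SIGCHLD itself if a terminated child is still unreaped.

```c
#define _POSIX_C_SOURCE 200809L   /* request waitid() and WNOWAIT */
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demonstration handler: counts SIGCHLD deliveries. */
static volatile sig_atomic_t sigchld_seen = 0;

static void note_sigchld(int sig)
{
    (void)sig;
    sigchld_seen++;
}

/* Hypothetical wrapper (not a POSIX interface): set the SIGCHLD action,
 * then re-raise SIGCHLD if some terminated child has not been waited for. */
static int sigchld_action(const struct sigaction *act, struct sigaction *oact)
{
    siginfo_t info;

    if (sigaction(SIGCHLD, act, oact) == -1)
        return -1;
    if (act == NULL)            /* query only: nothing was installed */
        return 0;

    /* WNOWAIT leaves the child as a zombie so the handler can still reap it;
     * WNOHANG returns at once, with si_pid == 0, if no child has terminated. */
    info.si_pid = 0;
    if (waitid(P_ALL, 0, &info, WEXITED | WNOHANG | WNOWAIT) == 0
        && info.si_pid != 0)
        return raise(SIGCHLD);
    return 0;
}
```

A routine like forkit() could call sigchld_action(&oact, NULL) where it now calls sigaction() to reinstall the prior handling. A child that exits between the two calls inside the wrapper can cause one extra SIGCHLD, so the handler should still use WNOHANG.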
-
- Is there a compelling reason for the standard not to require this
- behavior? Granted, the implementor has the ability to make things work
- correctly. But if the behavior isn't required, the writer of conforming
- applications can't depend on it.
-
- Is there some other, better solution to the problem posed by the sample
- program?
-
- Thanks -- Jerre Bowen (bowen@sgi.com)
-
- Volume-Number: Volume 18, Number 79
-
-