- Submitted-by: davis@unidata.ucar.edu (Glenn P. Davis)
-
- I have a program which demultiplexes input and hands it out to
- various child processes. The child process can be anything that
- reads from stdin and terminates upon reading EOF. The parent
- calls pipe(), fork(), dup() (actually dup2()), close() and
- execve() in the usual idiomatic way.
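-
- A minimal sketch of that idiom, for reference. The helper name
- spawn_child() and its signature are hypothetical and are not taken
- from the actual program:

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Hypothetical helper: start `path` with its stdin connected to a new pipe.
 * Returns the parent's write end of the pipe, or -1 on failure.
 * On success the child's pid is stored in *pid_out. */
int spawn_child(const char *path, char *const argv[], pid_t *pid_out)
{
    int fds[2];                 /* fds[0] = read end, fds[1] = write end */
    pid_t pid;

    if (pipe(fds) == -1)
        return -1;

    pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {                         /* child */
        close(fds[1]);                      /* child does not write to its own pipe */
        if (dup2(fds[0], STDIN_FILENO) == -1)
            _exit(127);
        if (fds[0] != STDIN_FILENO)
            close(fds[0]);                  /* stdin now refers to the pipe */
        execve(path, argv, environ);
        _exit(127);                         /* reached only if execve() fails */
    }

    close(fds[0]);                          /* parent keeps only the write end */
    *pid_out = pid;
    return fds[1];
}
```

- With a single child, closing the returned descriptor and then calling
- waitpid() on the stored pid is the sequence described below.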
-
- The only thing slightly unusual about the program is that it
- maintains a list of data structures representing each subprocess,
- effectively maintaining open pipe connections to multiple running children.
- This list has a maximum size, and if a new child connection needs to be
- created when the list is full, the least recently used connection is
- closed and the associated resources are recycled.
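-
- A rough sketch of such a list, assuming a fixed-size table and a
- logical clock for recency. Only the field names process_id and
- pipe_file_descriptor come from the calls shown below; MAX_CHILDREN,
- the other fields, and both functions are hypothetical:

```c
#include <stddef.h>
#include <sys/types.h>

#define MAX_CHILDREN 4              /* hypothetical list maximum */

struct child {
    pid_t process_id;
    int pipe_file_descriptor;       /* parent's write end of the pipe */
    unsigned long last_used;        /* logical clock; larger = more recent */
    int in_use;
};

struct child_table {
    struct child slots[MAX_CHILDREN];
    unsigned long clock;
};

/* Record that slot i was just used. */
void mark_used(struct child_table *t, size_t i)
{
    t->slots[i].last_used = ++t->clock;
}

/* Return the first free slot, or the least recently used slot if the
 * table is full. *must_evict is set to 1 when the returned slot is
 * occupied and its connection has to be closed before reuse. */
size_t pick_slot(const struct child_table *t, int *must_evict)
{
    size_t i, lru = 0;

    for (i = 0; i < MAX_CHILDREN; i++) {
        if (!t->slots[i].in_use) {
            *must_evict = 0;
            return i;
        }
        if (t->slots[i].last_used < t->slots[lru].last_used)
            lru = i;
    }
    *must_evict = 1;
    return lru;
}
```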
-
- This is where the problem comes in, and I don't understand what is
- going on. I'm working under SunOS 4.1.1.
- The sequence of calls is
-
- close(child->pipe_file_descriptor) ;
- waitpid(child->process_id, &status, 0) ; /* Blocks til child exits */
-
- What I am seeing is that the child blocks waiting for input
- instead of reading EOF, and the parent blocks in waitpid() --> deadlock.
- 'ps -l' reports both processes sleeping (STAT S), with the parent's WCHAN
- as "child" and the child's as "socket". This is true even if a call to
- sleep() is placed between close() and waitpid().
-
- The problem does not occur if the child is the "most recently opened".
- If I set the list maximum size down to 1, things work properly:
- the child reads EOF after the parent close() and exits,
- the parent waitpid() returns and we proceed.
- When I shut down all descriptors sequentially from the Most Recently Used to
- the Least Recently Used, as occurs in normal parent program shutdown, things
- work properly.
-
- Some interesting observations can be made if the waitpid call is changed
- to
- waitpid(-1, &status, WNOHANG) ; /* Don't Block, reap any child */
-
- The parent program now proceeds. There is logic in the main loop of the
- parent to close the least recently used connection if there was no input
- in a time interval. Thus all output channels will be closed after some
- number of timeouts with no input. While this is going on, 'ps' reports
- all children sleeping, waiting for input until the most recently used
- connection is closed. At this point, all of the children start reporting
- EOF and exiting, starting with the most recently used!
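-
- The ordering above can be reproduced in a small test. This is only a
- sketch under stated assumptions: the children are stand-ins that read
- until EOF without calling execve(), and the names start_reader() and
- older_child_outlives_close() are hypothetical. Note that each child
- closes only the write end of its own pipe; every other descriptor the
- parent has open at fork() time is inherited as-is.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Stand-in child: reads from its pipe until EOF, then exits.
 * Returns the child's pid, or -1 on failure. */
static pid_t start_reader(int *write_fd_out)
{
    int fds[2];
    pid_t pid;
    char buf[256];

    if (pipe(fds) == -1)
        return -1;
    pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        close(fds[1]);      /* closes only this pipe's write end; descriptors
                               inherited from the parent stay open */
        while (read(fds[0], buf, sizeof buf) > 0)
            ;
        _exit(0);
    }
    close(fds[0]);
    *write_fd_out = fds[1];
    return pid;
}

/* Returns 1 if the older child was still running after its pipe was
 * closed, 0 if it had already exited, -1 on error. */
int older_child_outlives_close(void)
{
    int old_fd, new_fd, status;
    pid_t old_pid, new_pid, r;

    old_pid = start_reader(&old_fd);
    if (old_pid == -1)
        return -1;
    new_pid = start_reader(&new_fd);
    if (new_pid == -1) {
        close(old_fd);
        waitpid(old_pid, &status, 0);
        return -1;
    }

    close(old_fd);                              /* close the older connection */
    sleep(1);
    r = waitpid(old_pid, &status, WNOHANG);     /* 0 means still running */

    close(new_fd);                              /* close the newer connection */
    waitpid(new_pid, &status, 0);
    if (r == 0)
        waitpid(old_pid, &status, 0);           /* older child exits only now */

    if (r == 0)
        return 1;
    return (r == old_pid) ? 0 : -1;
}
```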
-
- ------
-
- So, what is going on here? My major concern is that the child processes
- are not able to detect EOF in a timely manner after the parent closes its
- end of the pipe. Asynchronous wait()ing does not really solve the problem:
- in the presence of continuous input, child processes accumulate
- until system resources such as process table entries or memory are exhausted.
- Am I misunderstanding something, or is this an OS bug?
-
- Please reply directly.
-
- Thanks.
-
- Glenn P. Davis
- UCAR / Unidata
- PO Box 3000 3300 Mitchell Lane, Suite 170
- Boulder, CO 80307-3000 Boulder, CO 80301
- (303) 497 8643
-
- Volume-Number: Volume 28, Number 18
-
-