NetNews Usenet Archive 1992 #27

Internet Message Format | 1992-11-24 | 7.4 KB

Path: sparky!uunet!cis.ohio-state.edu!magnus.acs.ohio-state.edu!usenet.ins.cwru.edu!agate!dog.ee.lbl.gov!horse.ee.lbl.gov!torek
From: torek@horse.ee.lbl.gov (Chris Torek)
Newsgroups: comp.unix.wizards
Subject: Re: Changing the owner of a process
Date: 24 Nov 1992 03:20:20 GMT
Organization: Lawrence Berkeley Laboratory, Berkeley
Lines: 155
Message-ID: <27630@dog.ee.lbl.gov>
References: <1992Nov21.022833.24351@exlog.com>
Reply-To: torek@horse.ee.lbl.gov (Chris Torek)
NNTP-Posting-Host: 128.3.112.15
Keywords: process ownership

In article <1992Nov21.022833.24351@exlog.com>
mcdowell@exlogcorp.exlog.com (Steve McDowell) writes:
>Calm down Chris, there's no need for a personal exchange here. You don't
>know me or what my background is, so be very careful before you presume
>the wrong things.

True (although the `exlog.com' is a bit of a giveaway :-) ).  Perhaps
I was overly grouchy and irritable---moving tends to do that....  I
could have included a smiley or something.

>>... I will not try to tell you how to sell commercial systems.

>But you don't understand, I wish you would -- we really do need the help....

Well, if you insist :-)

>Or, in the case that my original article was in response to, you have a user
>who fancies himself a "system programmer". I'm a big believer in the
>condom approach to end-user computing -- let a user stroke himself all
>he wants to but don't let his results impregnate the integrety of my
>operating system.

One can easily take this too far.  For instance, it is possible on
many (most? all?) Unix machines to open /dev/mem and alter the actual
kernel code.  (I did this once on a live multi-user VAX, to patch a
bug in the old 4.1BSD CMU IPC.  Quite an experience. :-) )  If someone
replaces the kernel code, recovery is difficult---maybe not impossible
(take a look at the techniques people use in CORE WARS), but certainly
not worthwhile to everyone.

>... If there's a situation that can be recovered from, then recover.

This is a *very* hard problem in general.  Some researchers argue that
fault tolerance is in fact *too* hard, and that a `reboot and recover
work-in-progress' approach is more tractable.  What we do in practice
(in both research and, in my previous incarnation at Univ. of MD,
systems support) is try to anticipate `likely failures' and insert
recovery code for these.  In this case (per-UID process counts), the
failure is unlikely, and the recovery code will mostly be `dead
weight'.  The benefit just does not match the cost.

Note that costs and benefits differ for others; people who buy Tandem
systems to run banks are willing to spend the extra money for dual or
triple redundant systems, and willing to take speed reduction factors
of 2 or more, to achieve higher reliability.

Other things, like the `timeout table overflow' panic, are symptoms
not of *errors* but of *overload*, and in this case I would be happy
to replace it with something more effective (if I could just think of
something simple and reliable...).

There are about 140 panics in the 4.4-alpha `kern' directory right
now.  Of these, I only see two or three that I think should `never' be
there; the rest represent sanity checks that make sure the rest of the
kernel is behaving in the manner expected.  Errors in device drivers,
file systems, and so forth can trigger them, but presumably those
installing such drivers or file systems will prefer to debug these.
Note that many of these are under `#ifdef DIAGNOSTIC': if you believe
the system works, you can simply turn them off entirely.  (This is not
the same as `handling the situation'!)

>Of course, for your purposes developing an operating system is a means
>unto itself. You don't have to worry about irrelevant things like
>"customers" and "applications".

True to some extent, anyway.  Our systems have to be reliable enough
to keep us working, but not so reliable as to put us out of a job.
:-)

Just for fun, let us consider several panics and possible
alternatives, all from /sys/kern/kern_descrip.c.

        dup2(p, uap, retval)
        ...
                if (new >= fdp->fd_nfiles) {
                        if (error = fdalloc(p, new, &i))
                                return (error);
                        if (new != i)
                                panic("dup2: fdalloc");
                } else if ...

Now, fdalloc's job is to allocate the first free file descriptor
greater than or equal to the given one (arg 2; i holds the result).
fdp->fd_nfiles is the limit on valid descriptors: all are in
[0..fd_nfiles).  Since new >= fd_nfiles, we must (by definition) get
back the desired descriptor.  If new != i, what went wrong?

 - maybe fd_nfiles is wrong.
 - maybe fdalloc() is broken.
 - maybe someone snuck in and allocated a descriptor while we were
   not watching (i.e., there is a race).

Each of these appears to deserve different treatment.  If fd_nfiles
was wrong, just fix it and continue.  If fdalloc() is broken, there is
not much we can do here; we could return an error (EINTERNAL) to say
that the system is broken, but we will never be able to get anything
done that way.  If there is a race, we could try again and maybe win
the race---but we have to be careful not to loop forever in this case.

        fstat()
        ...
                switch (fp->f_type) {
                ...
                default:
                        panic("fstat");
                }

Since the only known `types' are vnodes and sockets, something is
definitely wrong.  Perhaps someone overwrote the original type.  In
this case we really have no way to recover it.  I would argue that the
right cure for this panic is to remove the distinctions between vnodes
and sockets (this is now possible with the VFS, or would be with a few
minor tweaks), so that instead of switching on the type, we just call
through the appropriate pointer:

                struct vnode *vp;

                vp = fp->f_vp;
                vp->v_ops->vo_stat(p, vp, &ub);

In this case, if any of the pointers are overwritten, we will probably
just crash immediately; but at least we will have written less code. :-)

        closef()
        ...
                if (fp->f_count < 0)
                        panic("closef: count < 0");

Those who ran 4.2BSD on the VAX when it first came out have some
experience here.  This panic did not appear in the original code.
Instead, the count actually did go negative, due to the ability to
longjmp() out of a close when (e.g.) closing a tty with a SIGALRM
pending.  When that happened, the kernel would eventually trash a file
system.  The panic prevented that, and gave the clues needed to
diagnose the problem.  Since then I have not seen this panic occur;
its purpose is to act as a `firewall'.

Nonetheless, how would one fix this?  The count could be recovered,
but only by scanning all descriptor tables for all processes---and in
any case it is probably too late: the underlying close routine may
have been called already, so the network connection to a peer (if that
is what this represents) is already gone, with no way to get it back.

In all these cases, whenever a count could be wrong, the pointer that
points to it could be wrong instead.  How do we tell the difference?
Should we test every pointer before following it, in case it would
cause a trap?  Or perhaps try to recover from within the trap handler?
(This is *not* straightforward on many machines, and both methods have
a high cost.)

So, all in all, the 4.4BSD line generally panics when a consistency
check fails, because that makes it debuggable.  The system reboots in
a few minutes (fsck is quicker than it was in 4.3BSD), and after
saving the evidence (the vmunix and vmcore files) the machine is back
up and running.
--
In-Real-Life: Chris Torek, Lawrence Berkeley Lab CSE/EE (+1 510 486 5427)
Berkeley, CA            Domain: torek@ee.lbl.gov
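
Editorial sketch, not part of the original article.  The dup2()
discussion above suggests that a suspected race could be handled by
trying again, provided the retry cannot loop forever.  The user-space
toy below illustrates only that bounded-retry idea.  Every name in it
(fdtable_sim, fdalloc_sim, claim_exact_sim, the *_SIM constants, and
the `glitches' fault-injection counter) is hypothetical and is not
taken from the 4.4BSD sources; the real fdalloc() has a different
signature and different behaviour.

```c
#include <assert.h>
#include <stdbool.h>

#define NFILES_SIM      8     /* hypothetical table size */
#define MAX_RETRIES_SIM 3     /* hypothetical bound on retries */
#define EINTERNAL_SIM   (-1)  /* stand-in for the "system is broken" error */
#define EMFILE_SIM      (-2)  /* stand-in for "no free descriptor" */

struct fdtable_sim {
    bool in_use[NFILES_SIM];
    int  glitches;  /* fault injection: calls that will skip one free slot */
};

/*
 * Toy allocator: claim the first free slot >= want and store it in *got.
 * While t->glitches > 0, one free slot is skipped per call, imitating a
 * rival that briefly held the slot we expected to get.
 */
static int fdalloc_sim(struct fdtable_sim *t, int want, int *got)
{
    bool skip = false;

    assert(want >= 0);
    if (t->glitches > 0) {
        t->glitches--;
        skip = true;
    }
    for (int i = want; i < NFILES_SIM; i++) {
        if (t->in_use[i])
            continue;
        if (skip) {          /* pretend someone else got here first */
            skip = false;
            continue;
        }
        t->in_use[i] = true;
        *got = i;
        return 0;
    }
    return EMFILE_SIM;
}

/*
 * Claim exactly slot `want`.  On a mismatch, release the wrong slot and
 * retry, but give up with EINTERNAL_SIM after MAX_RETRIES_SIM attempts
 * instead of looping forever.
 */
static int claim_exact_sim(struct fdtable_sim *t, int want)
{
    for (int attempt = 0; attempt < MAX_RETRIES_SIM; attempt++) {
        int got;
        int error = fdalloc_sim(t, want, &got);

        if (error)
            return error;
        if (got == want)
            return 0;
        t->in_use[got] = false;  /* undo the wrong allocation */
    }
    return EINTERNAL_SIM;
}
```

A caller would zero a struct fdtable_sim, optionally set `glitches',
and call claim_exact_sim(&table, 3).  With fewer glitches than
MAX_RETRIES_SIM the claim eventually succeeds; with more, the caller
gets EINTERNAL_SIM back.  As the article points out, an error return
of this kind tells the caller the system is broken without giving the
debugging evidence that a panic provides.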