- Newsgroups: comp.unix.aix
- Path: sparky!uunet!ukma!darwin.sura.net!uvaarpa!murdoch!sasha.acc.Virginia.EDU!scl
- From: scl@sasha.acc.Virginia.EDU (Steve Losen)
- Subject: Re: runaway processes (was Re: Vi is still broken)
- Message-ID: <1992Aug19.145808.2342@murdoch.acc.Virginia.EDU>
- Sender: usenet@murdoch.acc.Virginia.EDU
- Reply-To: scl@sasha.acc.Virginia.EDU (Steve Losen)
- Organization: University of Virginia
- References: <1992Aug17.163739.29534@APS.Atex.Kodak.COM> <133789@lll-winken.LLNL.GOV> <1992Aug19.132117.5939@msc.cornell.edu>
- Date: Wed, 19 Aug 1992 14:58:08 GMT
- Lines: 41
-
- [ complaints about runaway processes eating up cpu time ]
-
- In article <1992Aug19.132117.5939@msc.cornell.edu>,
- |>
- |> Yup. We see this kind of thing frequently. Since we are about to start
- |> chargeback accounting, this will be a severe pain.
- |>
- |> Our most recent method of producing the problem is to close a window in
- |> which a process which is on the receiving end of a pipe is running.
- |>
- |> This has been going on for years but has lately become intolerable. I
- |> must now try to come up with a reproducible example and phone up IBM
- |> to see if it's a 'supported defect' :-}
-
- We've had this problem since day one way back at AIX 3.1. No AIX upgrade
- has fixed it yet. In our case, the runaways are all interactive jobs such
- as editors, mail readers, news readers, etc., and they all seem to happen
- when a telnet session ends abnormally.
-
- Very early on I wrote a perl script that runs "ps caux" every few minutes
- and looks for runaways. I have a "hit list" of interactive commands that
- are known to run away, including vi, jove, more, less, telnetd, rlogind,
- mail, mush, etc. The script kills off any of these commands if ps
- indicates that it is using over 9% of the cpu and has accumulated 2
- minutes of cpu time. I just pulled these heuristics out of thin air, but
- they have worked well on several loaded 540s and 550s.
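-
- The rule described above can be sketched roughly as follows. This is not
- the perl script from the post, only a minimal Python illustration of the
- same two thresholds. The function names, the tuple layout, and the
- assumption that ps reports cpu time as MM:SS or HH:MM:SS are all
- hypothetical. Parsing the ps output and sending the kill signal are
- deliberately left out.

```python
# Hit list: the commands named in the post (the original list was longer, "etc.").
HIT_LIST = {"vi", "jove", "more", "less", "telnetd", "rlogind", "mail", "mush"}

CPU_PERCENT_LIMIT = 9.0   # "using over 9% of the cpu"
CPU_SECONDS_LIMIT = 120   # "has accumulated 2 minutes of cpu time"


def cpu_time_to_seconds(text):
    """Convert a TIME field such as '3:10' or '1:02:03' to seconds.

    Assumes colon-separated integer fields; the real ps format may differ.
    """
    seconds = 0
    for part in text.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


def is_runaway(command, cpu_percent, cpu_time):
    """Apply the heuristic: on the hit list, over 9% cpu, at least 2 minutes used."""
    return (
        command in HIT_LIST
        and cpu_percent > CPU_PERCENT_LIMIT
        and cpu_time_to_seconds(cpu_time) >= CPU_SECONDS_LIMIT
    )


def find_runaways(processes):
    """Return the pids to kill from (pid, command, cpu_percent, cpu_time) tuples."""
    return [
        pid
        for pid, command, cpu_percent, cpu_time in processes
        if is_runaway(command, cpu_percent, cpu_time)
    ]
```

- Requiring both thresholds means a freshly started editor that briefly
- spikes the cpu is left alone, as is a long-lived session that has slowly
- accumulated cpu time but is currently idle.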
-
- Sure beats getting called up several times a day to kill these things off.
-
- I would post the perl script, but it has grown very large because it does
- a whole lot of other stuff, such as renicing long-running cpu burners,
- detecting when a user is running more than one cpu burner at a time, etc.
- Also, I will have to fix the script to run under 3.2, because IBM has
- changed the output format of ps. Thankfully the new format is easier to
- parse. Under 3.1.5, some of the fields can run together. I think under
- 3.2, you are always guaranteed at least one space of separation.
-
- --
- Steve Losen scl@virginia.edu
-
- University of Virginia Academic Computing Center
-