- Newsgroups: comp.sys.sun.admin
- Path: sparky!uunet!cs.utexas.edu!zaphod.mps.ohio-state.edu!pacific.mps.ohio-state.edu!linac!newsaintmail
- From: billq@fnal.gov (William R. Quayle)
- Subject: Re: Amd is not perfect
- Message-ID: <Y0S7O09EP@linac.fnal.gov>
- Sender: daemon@linac.fnal.gov (The Background Man)
- Nntp-Posting-Host: boise.fnal.gov
- Reply-To: billq@pogo.fnal.gov
- Organization: Independent Consultant
- References: <Bs88Cu.7zp@fiu.edu>
- Date: Fri, 31 Jul 1992 13:42:34 GMT
- Lines: 33
-
- Is anyone at Sun listening? This bug has been reported at least four times
- that I know of. I've run into it twice here at Fermi and twice when I
- was at AT&T Bell Labs, and I saw it posted roughly two months ago
- (I wish I had saved the article). The symptoms are exactly as Carlos describes:
- all the nfsd's go into DW state, the load rises dramatically, and the server deadlocks.
- Root is the only one that can log in and function. Then, if you are
- patient enough not to reboot, the system falls back into a normal state.
- I've seen the machine (a 4/490) deadlocked for 20-30 minutes before
- regaining its sanity. This is SunOS 4.1.2 with the Sun automounter and ~20 automounted filesystems.
-
- What gives?
-
- In article <Bs88Cu.7zp@fiu.edu>, ibarrac@kzin.fiu.edu (Carlos A. Ibarra) writes:
- [deleted]
- |> Around once a month, all the processes referencing files through amd on
- |> one of our servers get stuck. A ps shows them all in DW state. Meanwhile,
- |> amd itself keeps running fine. The load increases monotonically. Each new
- |> process that attempts to access a filesystem through the automounter
- |> gets stuck in kernel wait. Sometimes, if we are lucky, this stops by itself.
- |> The load goes back down and everything works fine. Other times we have
- |> to resort to a reboot. It looks to me like some kind of deadlock, but
- |> I have not been able to find out where the cyclical wait occurs. It may
- |> also be an amd bug.
- |>
- |> This used to happen a lot with Sun's automounter. Amd reduced, but
- |> did not eliminate, the frequency of this problem.
-
- ----------------------------------------------------------------------------
- William R. Quayle | UNIX Systems Administration
- Fermi National Accelerator Laboratory | Distributed Computing Department
- P.O. Box 500, MS-368 | Internet: billq@fnal.fnal.gov
- Batavia, IL 60510 | (708) 840-8254
- ----------------------------------------------------------------------------
-