- Path: sparky!uunet!ornl!utkcs2!darwin.sura.net!jvnc.net!rutgers!cmcl2!adm!news
- From: nmw@ionospheric-physics.leicester.ac.uk (N.M. Wade)
- Newsgroups: comp.sys.sgi
- Subject: IP12 client nfs slow
- Message-ID: <34007@adm.brl.mil>
- Date: 10 Nov 92 00:47:49 GMT
- Sender: news@adm.brl.mil
- Lines: 68
-
-
-
- Hi,
-
- If I get any of this wrong I hope someone from SGI will correct the mistakes.
-
- In January this year we took delivery of a new SGI system comprising a 4D/220
- server and 8 Indigo clients. Whilst running the acceptance tests I noticed
- that NFS performance was considerably worse than that of a SUN sparcstation
- acting as a client to the same server.
-
- After various checks and tests it became obvious that the problem only occurred
- when the client was an Indigo (i.e. the Indigos/SUNs/4D/220 performed perfectly
- as servers, and the SUNs and 4D/220 were ok as clients.)
-
- I initially reported this to our Hotline service in the UK at the end of
- January. As far as I know this was the first report SGI received on the
- problem; certainly the UK support centre found no other references.
- We had several visits from SGI engineers to check our network and system
- configuration, and to monitor the problem. During these tests we had access to
- NetVisualizer software, and it soon became apparent that the problem
- would not manifest itself when NetVis was running. Our only explanation of this
- effect was that NetVis puts the ethernet port into promiscuous mode, and this
- "cured" the problem. We could see the client sending the initial request and
- the server responding. The problem was that the client missed the response,
- thus causing a timeout and retransmission. In some cases several of these would
- occur in succession causing a gap in data transfer of many seconds. Increasing
- the timeout period in the NFS mount only made the delay worse, since the client
- waited even longer before retransmitting the request.
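-
- To put rough numbers on that, here is a minimal sketch of the stall caused by
- a run of missed replies. It assumes the client doubles its timeout after each
- retransmission, which is typical of NFS over UDP but was not something we
- measured; the function name and parameters are made up for illustration.

```python
def stall_seconds(initial_timeout, lost_replies, backoff=2.0, max_timeout=None):
    """Idle time spent waiting when `lost_replies` consecutive replies are missed.

    Assumption: the timeout is multiplied by `backoff` after every miss,
    optionally capped at `max_timeout`. All values are in seconds.
    """
    total = 0.0
    timeout = initial_timeout
    for _ in range(lost_replies):
        total += timeout          # wait out the current timeout, then retransmit
        timeout *= backoff        # back off before the next attempt
        if max_timeout is not None:
            timeout = min(timeout, max_timeout)
    return total
```

- Under these assumptions, three missed replies in a row cost 7 seconds with a
- 1 second initial timeout (1 + 2 + 4) and 28 seconds with a 4 second one
- (4 + 8 + 16), so a longer timeout lengthens the gap rather than shortening it.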
-
- After many days of tests, phone calls and faxes the call was eventually
- escalated to the US at the end of February. During March other sites began
- to report similar problems and SGI in the US started to actually take some
- notice.
-
- At the end of March I received an unofficial patch from Georges Lauri at Berkeley
- who had been helping SGI in the States with their tests. The patch was a
- simple program, which could be run in the background, and put the ethernet
- port into promiscuous mode, as does NetVis. I later received an official patch
- from SGI which was a mod. to the if_ec kernel module, but was effectively
- the same as the unofficial patch in operation.
-
- As I understand it the "fix" which is incorporated in 4.0.5 is a slightly more
- sophisticated version of the original patch. Now, instead of running continually
- in promiscuous mode, the port runs as normal unless the kernel notices excessive
- collisions, in which case it puts the port into promiscuous mode for a
- predetermined period.
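-
- As a toy model of that behaviour (not SGI's code; I have not seen the real
- thresholds or timings, so the class, the collision threshold and the hold
- period below are all hypothetical), the logic might look something like this:

```python
class AdaptivePromiscuous:
    """Toy model: switch to promiscuous mode for a fixed period after
    an interval with excessive collisions, then fall back to normal mode."""

    def __init__(self, collision_threshold, hold_ticks):
        self.collision_threshold = collision_threshold  # hypothetical tunable
        self.hold_ticks = hold_ticks                    # hypothetical tunable
        self.remaining = 0                              # promiscuous intervals left

    @property
    def promiscuous(self):
        return self.remaining > 0

    def tick(self, collisions):
        """Process one sampling interval; return the mode for the next one."""
        if collisions > self.collision_threshold:
            # Excessive collisions: (re)start the promiscuous hold period.
            self.remaining = self.hold_ticks
        elif self.remaining > 0:
            # Quiet interval: count the hold period down.
            self.remaining -= 1
        return self.promiscuous
```

- With a threshold of 5 and a hold period of 2 intervals, a single burst of 9
- collisions keeps the port promiscuous for the next two intervals, after which
- it drops back to normal operation unless another burst arrives.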
-
- During this period we had installed a bridge between our small network and the
- main campus ethernet. This completely cured the problem for us. We still
- don't know why, but it worked and we have kept it that way.
-
- It should be noted that the problem is in the hardware of the Indigo network
- interface board, and the fix is simply a patch to hide the problem. It does
- not cure the machines which run it. Also, running in promiscuous mode may
- add CPU overhead; we have not tested this yet. Obviously, we are
- reluctant to reconfigure our network to recreate a problem so that we can
- check the fix!!
-
-
-
- Nigel Wade,
-
- Sys. Admin., Ionospheric Physics Group, Leicester University, UK.
-
- e-mail to Janet (UK) : nmw@uk.ac.le.ion
- phone : +44 533 523568
- fax : +44 533 523555
-