NetNews Usenet Archive 1992 #26

Internet Message Format | 1992-11-12 | 5.2 KB

Path: sparky!uunet!cs.utexas.edu!zaphod.mps.ohio-state.edu!pacific.mps.ohio-state.edu!linac!att!ucbvax!lrw.com!leichter
From: leichter@lrw.com (Jerry Leichter)
Newsgroups: comp.os.vms
Subject: re: VAX 4000/200 ethernet problems
Message-ID: <9211121236.AA02598@uu3.psi.com>
Date: 12 Nov 92 11:23:57 GMT
Sender: daemon@ucbvax.BERKELEY.EDU
Distribution: world
Organization: The Internet
Lines: 111

> We had a MicroVAX II. We upgraded it to VMS 5.4-2. Then we purchased
> an upgrade kit to convert it to a VAX 4000/200. The upgrade kit
> included a new CPU board, memory board, a DSSI disk, and various
> cables and patch panels.
>
> Anyhow, the VAX 4000/200 CPU board has an onboard ethernet port,
> EZA0. I've been trying to get this port working, but haven't been
> getting anywhere. The old DEQNA board still works, but is rumored to
> be much slower than the EZA0 channel. Not only that, but VMS 5.4-2
> doesn't support it, at least for LAVC connections.
>
> After I took the ethernet cable off the DEQNA (AUI cable) and
> attached it to the EZA0 port, I would get lots of DECnet errors on
> the console like "Circuit up", "Circuit Down", "Adjacency up". There
> were numbers that came with these errors, and I could look them up in
> the DECnet manual if necessary. When I would "set host" out of this
> machine to other machines, the link would be VERY VERY slow, and
> often freeze up for several seconds, as the "circuit up", "circuit
> down", and "adjacency up" errors flashed by.

Hint on diagnosing network problems: All of these are "high level"
error conditions. About all they tell you is that something is wrong
at the physical circuit level. Slow and delayed response in SET HOST
means that the lower layers are losing or damaging a lot of packets;
DECnet is recovering by retransmitting, but that takes time. The actual
details, at this level, are almost never of any help in diagnosing the
problem.

> I got a new CPU board from DEC, and also replaced all the cables and
> the patch panel.
> After that, I couldn't "set host" at all, because the destination
> would always be unavailable. I don't know how the second board
> differed from the first, except that the problem got worse. I also no
> longer get any "circuit up", "circuit down", and "adjacency up"
> errors. When I executed a "mcr ncp sho known circuits" it would give
> me
>
>     circuit     state
>     isa-0       on - synchronizing

Again, this is still a fairly high level indication of a problem.

> When I execute a "mcr ncp sho active lines counters" I would get:

Ah, now we are getting to the APPROPRIATE level.

> ...
> 83 send failures, including:
>       excessive collisions
>       carrier check failed
>       remote failure to defer
> 1 collision detect check failure

These are the important numbers. In fact, they pretty much pinpoint the
problem.

> 0 user boffer unavailable
          ^
Really! Just what ARE you doing with that system? :-)

> Anyhow I called Colorado Springs. Colorado Springs sent software
> patch CSCPAT_0252019 which is a new ezdriver.exe. I've installed the
> patch, but the problem is still occurring. After the patch, here is a
> sample of "mcr ncp sho active lines counters"

A bad driver can, of course, make a piece of hardware appear to fail in
any arbitrary way; but somehow I doubt that's your problem.

> 91 send failures, including:
>       carrier check failed

Again, this is the important counter. In a properly configured system,
the only acceptable number for send failures, under normal conditions
(i.e., when you aren't playing with the hardware) is zero. (On two
systems I have here, in a total of 213065 blocks sent, the total count
is indeed zero.)

> Also, "mcr ncp sho known circuits" now states that isa-0's state is
> on, and not "on - synchronizing". This is where I am right now. Has
> anyone ever heard of this problem? I actually swapped out the
> 4000/200 CPU once more since then, but that didn't fix the problem
> either. Each time I install a new board, I run
> sys$manager:netconfig to set up the parameters.
> I've also swapped out the AUI cable and the transceiver to the
> backbone.

I'll bet you swapped the transceiver for an identical one. I'll further
guess that you have an old transceiver, most likely a DEC H4000.

I'd say the odds are 99% or better that you have a hardware
configuration problem of some sort. My bet is on the following: The
DEQNA was built at a time when the original Ethernet standards were in
effect, while your new interface complies with the IEEE specs. The two
are almost, but not quite, the same. You need different transceivers
(or at least different transceiver configurations) for devices built to
the different specs. The H4000 was eventually replaced by the H4005,
identical except that it complied with IEEE rather than Ethernet specs.
I'm sure the H4005 is long obsolete; I don't know what its modern
replacement would be.

The effect of using the wrong kind of transceiver would be to cause the
timing of some of the self-check features to be wrong. This will show
up as carrier check failures, spurious collisions, spurious remote
failures to defer, and so on.

Some transceivers allow you to select either Ethernet or IEEE operating
mode. If yours is one of those, try changing the setting. If not,
you'll need a new transceiver.

                                                        -- Jerry
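
The rule of thumb in this reply (send failures should be zero on a healthy line, and carrier check failures, excessive collisions, or remote failures to defer point toward the transceiver) can be sketched as a small check. This is only an illustration in Python, not anything NCP provides: the function name, the argument layout, and the wording of the verdicts are assumptions, and the counter values are meant to be typed in by hand from the "mcr ncp sho active lines counters" display rather than parsed from it.

```python
# Hypothetical helper: nothing here talks to NCP or parses its output.
# Counter values are entered by hand from the line counters display.

# Send-failure reasons that the post associates with a transceiver built
# to the wrong spec (original Ethernet vs. IEEE self-check timing).
TRANSCEIVER_SUSPECT_REASONS = {
    "carrier check failed",
    "excessive collisions",
    "remote failure to defer",
}


def diagnose_line(send_failures, reasons=()):
    """Return a one-line verdict for a line's send-failure counter."""
    if send_failures == 0:
        # The only acceptable value under normal conditions.
        return "ok: no send failures"
    suspects = sorted(
        r.lower() for r in reasons if r.lower() in TRANSCEIVER_SUSPECT_REASONS
    )
    if suspects:
        return (
            f"{send_failures} send failures ({', '.join(suspects)}): "
            "check transceiver type or mode"
        )
    return f"{send_failures} send failures: look at the physical layer"


if __name__ == "__main__":
    print(diagnose_line(0))
    print(diagnose_line(91, ["carrier check failed"]))
```

Feeding it the figures quoted above (91 send failures, reason "carrier check failed") yields the transceiver verdict; a count of zero is the only input treated as healthy.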