- Path: sparky!uunet!cs.utexas.edu!zaphod.mps.ohio-state.edu!pacific.mps.ohio-state.edu!linac!att!ucbvax!lrw.com!leichter
- From: leichter@lrw.com (Jerry Leichter)
- Newsgroups: comp.os.vms
- Subject: re: VAX 4000/200 ethernet problems
- Message-ID: <9211121236.AA02598@uu3.psi.com>
- Date: 12 Nov 92 11:23:57 GMT
- Sender: daemon@ucbvax.BERKELEY.EDU
- Distribution: world
- Organization: The Internet
- Lines: 111
-
-
- We had a MicroVAX II. We upgraded it to VMS 5.4-2. Then we purchased
- an upgrade kit to convert it to a VAX 4000/200. The upgrade kit
- included a new CPU board, memory board, a DSSI disk, and various
- cables and patch panels.
-
- Anyhow, the VAX 4000/200 CPU board has an onboard ethernet port, EZA0.
- I've been trying to get this port working, but haven't been getting
- anywhere. The old DEQNA board still works, but is rumored to be much
- slower than the EZA0 channel.
-
- Not only that, but VMS 5.4-2 doesn't support it, at least for LAVC
- connections.
-
- After I took the ethernet cable off the DEQNA (AUI cable) and attached
- it to the EZA0 port, I would get lots of decnet errors on the console
- like "Circuit up", "Circuit Down", "Adjacency up". There were numbers
- that came with these errors, and I could look them up in the DECNET
- manual if necessary. When I would "set host" out of this machine to
- other machines, the link would be VERY VERY slow, and often freeze up
- for several seconds, as the "circuit up", "circuit down", and
- "adjacency up" errors flashed by.
-
- Hint on diagnosing network problems: All of these are "high level" error
- conditions. About all they tell you is that something is wrong at the
- physical circuit level. Slow and delayed response in SET HOST means that the
- lower layers are losing or damaging a lot of packets; DECnet is recovering by
- retransmitting, but that takes time.
-
- The actual details, at this level, are almost never of any help in diagnosing
- the problem.
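To put a rough number on why retransmission makes SET HOST feel slow, here is a minimal sketch. It assumes each send is lost independently with the same probability and that the sender waits one fixed timeout before each retry. Both are simplifications, and the function names and the timeout value are hypothetical, not DECnet parameters.

```python
def expected_attempts(loss_rate):
    """Mean sends per delivered packet when each send is lost independently."""
    if not 0 <= loss_rate < 1:
        raise ValueError("loss_rate must be in [0, 1)")
    # Geometric distribution: mean number of trials until the first success.
    return 1.0 / (1.0 - loss_rate)


def expected_extra_delay(loss_rate, timeout_s):
    """Mean time spent waiting on retransmit timers per delivered packet.

    timeout_s is a hypothetical fixed retransmit timeout in seconds.
    """
    return (expected_attempts(loss_rate) - 1.0) * timeout_s


if __name__ == "__main__":
    for p in (0.0, 0.1, 0.5):
        print(p, expected_attempts(p), expected_extra_delay(p, timeout_s=1.0))
```

Even under this simple model the delay grows quickly with the loss rate, which fits a link that works but stalls for seconds at a time.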
-
- I got a new CPU board from DEC, and also replaced all the cables and
- the patch panel. After that, I couldn't "set host" at all,
- because the destination would always be unavailable. I don't know how
- the second board differed from the first, except that the problem got
- worse. I also no longer get any "circuit up", "circuit down", and
- "adjacency up" errors.
-
- When I executed a "mcr ncp sho known circuits" it would give me
- circuit state
- isa-0 on - synchronizing
-
- Again, this is still a fairly high level indication of a problem.
-
- When I execute a "mcr ncp sho active lines counters" I would get:
-
- Ah, now we are getting to the APPROPRIATE level.
- ...
- 83 send failures, including:
- excessive collisions
- carrier check failed
- remote failure to defer
- 1 collision detect check failure
-
- These are the important numbers. In fact, they pretty much pinpoint the
- problem.
-
- 0 user boffer unavailable
- ^
-
- Really! Just what ARE you doing with that system? :-)
-
- Anyhow I called Colorado Springs. Colorado Springs sent software
- patch CSCPAT_0252019 which is a new ezdriver.exe. I've installed the
- patch, but the problem is still occurring. After the patch, here is a
- sample of "mcr ncp sho active lines counters"
-
- A bad driver can, of course, make a piece of hardware appear to fail in any
- arbitrary way; but somehow I doubt that's your problem.
-
- 91 send failures, including:
- carrier check failed
-
- Again, this is the important counter. In a properly configured system, the
- only acceptable number for send failures, under normal conditions (i.e.,
- when you aren't playing with the hardware) is zero. (On two systems I have
- here, in a total of 213065 blocks sent, the total count is indeed zero.)
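If you capture the counter output to a file, the check described above (send failures should be zero) is easy to automate. The following is only a sketch: the helper name is hypothetical, and the text layout it expects is assumed from the excerpts quoted in this post, so real NCP output may need a different pattern.

```python
import re


def send_failures(counter_text):
    """Return (count, reasons) from line-counter text, or (None, []) if absent.

    Assumed layout: a line such as "91 send failures, including:" followed by
    reason lines that carry no leading number.
    """
    count = None
    reasons = []
    collecting = False
    for raw in counter_text.splitlines():
        line = raw.strip()
        m = re.match(r"(\d+)\s+send failures?\b", line, re.IGNORECASE)
        if m:
            count = int(m.group(1))
            collecting = line.lower().endswith("including:")
            continue
        if collecting:
            # Reason lines have no leading count; a blank line or the next
            # counter line (which starts with a digit) ends the list.
            if line and not line[0].isdigit():
                reasons.append(line)
            else:
                collecting = False
    return count, reasons


SAMPLE = """\
     91  send failures, including:
           carrier check failed
      0  user buffer unavailable
"""

if __name__ == "__main__":
    count, reasons = send_failures(SAMPLE)
    if count:
        print(f"{count} send failures: {', '.join(reasons)}")
```

Any nonzero count on a system that is not being reconfigured is worth chasing at the hardware level before looking at drivers or DECnet settings.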
-
- Also, "mcr ncp sho known circuits" now states that isa-0's state is
- "on", and not "on - synchronizing".
-
- This is where I am right now. Has anyone ever heard of this problem?
- I actually swapped out the 4000/200 CPU once more since then, but that
- didn't fix the problem either. Each time I install a new board, I run
- sys$manager:netconfig to set up the parameters. I've also swapped out
- the AUI cable and the transceiver to the backbone.
-
- I'll bet you swapped the transceiver for an identical one. I'll further
- guess that you have an old transceiver, most likely a DEC H4000.
-
- I'd say the odds are 99% or better that you have a hardware configuration
- problem of some sort. My bet is on the following: The DEQNA was built at a
- time when the original Ethernet standards were in effect, while your new
- interface complies with the IEEE specs. The two are almost, but not quite,
- the same. You need different transceivers (or at least different transceiver
- configurations) for devices built to the different specs. The H4000 was
- eventually replaced by the H4005, identical except that it complied with IEEE
- rather than Ethernet specs. I'm sure the H4005 is long obsolete; I don't know
- what its modern replacement would be.
-
- The effect of using the wrong kind of transceiver would be to cause the timing
- of some of the self-check features to be wrong. This will show up as carrier
- check failures, spurious collisions, spurious remote failures to defer, and
- so on.
-
- Some transceivers allow you to select either Ethernet or IEEE operating mode.
- If yours is one of those, try changing the setting. If not, you'll need a
- new transceiver.
- -- Jerry
-
-