- Path: sparky!uunet!bcstec!shamu.ca.boeing.com!mr2461
- From: mr2461@shamu.ca.boeing.com
- Newsgroups: comp.os.vms
- Subject: How to get accurate memory error count
- Message-ID: <1993Jan5.152750.137@shamu.ca.boeing.com>
- Date: 5 Jan 93 15:27:50 -0800
- Organization: Central Storage Facility
- Lines: 35
-
-
-
-
- I'm developing a suite of routines to monitor various VAXes, and notify
- appropriate people when an "important event" has occurred. One of the
- categories of events that I'm watching for is hardware errors. I use the
- $GETDVI system service to retrieve the number of errors most devices
- have accumulated, and (ugh!) parse the output of the $SHOW ERROR command
- to see if there are any memory or CPU errors. Recently, on a 7610 we
- just installed, we noticed that we were generating massive numbers of
- memory errors that were not showing up via the $SHOW ERROR command, but
- were being logged in the error log. I know that VAXsim+ can report errors
- of this type (although we have been told by DEC not to run that product
- on a 7610), so I assume this information is detectable from software.
- Unfortunately, I don't have source for VAXsim+ available to look up
- its methods.
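-
- As a rough illustration of the parsing step described above, here is a
- minimal sketch in Python. It assumes the $SHOW ERROR output has already
- been captured as text and that each data line is a device name followed
- by an integer count. The sample text and the parse_show_error name are
- hypothetical, and the real layout may differ between VMS versions.

```python
import re

# Hypothetical sample of captured SHOW ERROR output (assumed layout:
# a header line, then one "name  count" line per device).
SAMPLE = """\
Device                           Error Count
CPU                                        1
MEMORY                                     0
DUA0:                                      3
"""

# A data line is a single non-blank token followed by an integer.
LINE = re.compile(r"^\s*(\S+)\s+(\d+)\s*$")


def parse_show_error(text):
    """Return {device_name: error_count} for every 'name  count' line."""
    counts = {}
    for line in text.splitlines():
        match = LINE.match(line)
        if match:  # the header and blank lines do not match and are skipped
            counts[match.group(1)] = int(match.group(2))
    return counts


counts = parse_show_error(SAMPLE)
```

- A parser like this only sees what $SHOW ERROR chooses to print, so it
- cannot recover errors that reach the error log without incrementing
- the displayed count.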
-
- Three questions:
-
- 1. Why do some errors get written to the error log without incrementing
- the count reported by the $SHOW ERROR command?
-
- 2. Does this behavior apply to devices other than memory, and thus
- make the ERRCNT values returned by my $GETDVI calls inaccurate?
-
- 3. How (without parsing the error log file!) can I get the true
- number of errors for a device, particularly memory and CPU
- boards?
-
- Thanks in advance for any help you can provide. If there is
- enough interest, I'll post a summary.
-
-
- - Matt Robertson
- mr2461@shamu.ca.boeing.com
-