- Path: sparky!uunet!stanford.edu!bu.edu!lll-winken!uwm.edu!spool.mu.edu!howland.reston.ans.net!usc!news.service.uci.edu!unogate!mvb.saic.com!info-multinet
- From: NED@SIGURD.INNOSOFT.COM (Ned Freed)
- Newsgroups: vmsnet.networks.tcp-ip.multinet
- Subject: RE: mail problem
- Message-ID: <01GTZNGPT3T68Y4XEH@SIGURD.INNOSOFT.COM>
- Date: 27 Jan 93 04:21:44 GMT
- Organization: Info-Multinet<==>Vmsnet.Networks.Tcp-Ip.Multinet Gateway
- Lines: 50
- X-Gateway-Source-Info: Mailing List
-
- > There are very few computer/network applications in our environment that
- > don't provide an immediate indication when an error occurs. Unfortunately
- > one of our most critical computer/network applications (e-mail) is an
- > application that doesn't provide immediate notification. Do any Multinet
- > sites have an automated alarm system to determine when e-mail is not being
- > delivered? Waiting for user complaints is not appropriate in our business.
-
- This sort of system-level monitoring of the overall health of the email system
- is quite a different problem from per-user, per-message reporting. Although
- probe messages are a useful tool, it is usually more important to monitor
- aggregate statistics such as message queue sizes, mean time to delivery, and so
- on.
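-
- As a rough illustration of what monitoring aggregate statistics might look
- like, here is a minimal Python sketch. Everything in it is an assumption made
- for the example: the MessageRecord type, the function names, and the
- thresholds are hypothetical and do not correspond to any Multinet or PMDF
- interface.
-

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class MessageRecord:
    """Hypothetical record of one message; times are seconds since some epoch."""
    queued_at: float
    delivered_at: Optional[float] = None  # None means still sitting in the queue


def queue_size(records: List[MessageRecord]) -> int:
    """Count messages that have not been delivered yet."""
    return sum(1 for r in records if r.delivered_at is None)


def mean_delivery_seconds(records: List[MessageRecord]) -> Optional[float]:
    """Mean queue-to-delivery time over delivered messages, or None if there are none."""
    times = [r.delivered_at - r.queued_at
             for r in records if r.delivered_at is not None]
    if not times:
        return None
    return sum(times) / len(times)


def check_health(records: List[MessageRecord],
                 max_queue: int,
                 max_mean_seconds: float) -> List[str]:
    """Return one alarm string per threshold exceeded (thresholds are assumed values)."""
    alarms = []
    size = queue_size(records)
    if size > max_queue:
        alarms.append(f"queue size {size} exceeds {max_queue}")
    mean = mean_delivery_seconds(records)
    if mean is not None and mean > max_mean_seconds:
        alarms.append(f"mean delivery time {mean:.0f}s exceeds {max_mean_seconds:.0f}s")
    return alarms


records = [
    MessageRecord(0.0, 30.0),
    MessageRecord(10.0, 100.0),
    MessageRecord(20.0),
    MessageRecord(25.0),
]
alarms = check_health(records, max_queue=1, max_mean_seconds=120.0)
```

- How the records are gathered (queue directory scan, log parsing, and so on)
- is left out on purpose, since that part is entirely site-specific.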
-
- The problem with monitoring and reporting on email is that email itself is,
- as often as not, the tool of choice for providing notification about application
- errors. (For example, the print symbiont we use locally reports PostScript
- errors via email.) But if there's a problem in the email system, the last thing
- you want to use to report it is email! I've seen email-reports-on-email setups
- go into catastrophic overload and meltdown on more than one occasion; one such
- mess took several days of staff time to clean up.
-
- The obvious tool to use for this sort of monitoring would be SNMP, of course.
- (I won't bother to explain SNMP's characteristics that make it a good choice
- here.) However, application monitoring with SNMP has received almost total
- neglect in IETF circles -- I don't know why this is so, but it sure is!
-
- There is a very rudimentary Applications MIB that provides facilities for
- monitoring mail systems and directory services. However, in my opinion
- it is not suitable at present for serious production use.
-
- But there is some hope -- there's a new IETF working group called MADMAN (Mail
- And Directory Management) whose task is to put together a pair of MIBs for
- monitoring mail and directory systems. The group is off to a slow start but I
- am hopeful that things will begin to move in the near future.
-
- > >The only thing that's unreliable about nondelivery notices is the inconsistency
- > >of their generation. However, users adapt amazingly quickly to the nature of
- > >the systems they use, so they usually take these differences in stride.
-
- > And the amount of time that must pass before they are generated. The default
- > value on systems here ranges from 7 days to a few hours. The user community
- > prefers a standard delay (preferably single digit hours), but that is hard
- > to implement when some vendors hardcode this value.
-
- Not only must this be settable to a wide range of values, it must also
- vary from one service to the next. User expectations of email, FAX, and pagers
- are quite different. PMDF, for example, lets this be specified in units of
- hours or days on a per-channel basis.
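-
- To make the per-service idea concrete, here is a small Python sketch. It is
- not PMDF configuration syntax; the channel names, the delay values, and the
- notice_due function are all hypothetical, chosen only to show a notification
- delay that varies by channel.
-

```python
from datetime import datetime, timedelta

# Hypothetical per-channel delays before a nondelivery notice is generated.
# The values are illustrative assumptions, not defaults of any product.
NOTICE_DELAY = {
    "email": timedelta(days=3),
    "fax": timedelta(hours=4),
    "pager": timedelta(hours=1),
}


def notice_due(channel: str, queued_at: datetime, now: datetime) -> bool:
    """True once an undelivered message has waited at least its channel's delay."""
    return now - queued_at >= NOTICE_DELAY[channel]


queued = datetime(1993, 1, 27, 4, 0)
two_hours_later = queued + timedelta(hours=2)
pager_due = notice_due("pager", queued, two_hours_later)
email_due = notice_due("email", queued, two_hours_later)
```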
-
- Ned
-