NetNews Usenet Archive 1993 #3

Internet Message Format | 1993-01-28 | 4.6 KB

Path: sparky!uunet!stanford.edu!bu.edu!rpi!usc!howland.reston.ans.net!sol.ctr.columbia.edu!hamblin.math.byu.edu!arizona.edu!mvb.saic.com!info-multinet
From: RAS@GV3.CACD.CR.ROCKWELL.COM (Robert A. Schneider II . 299/106-180 . 435-3863)
Newsgroups: vmsnet.networks.tcp-ip.multinet
Subject: RE: mail problem
Message-ID: <930127074724.2200271a@GV3.CACD.CR.ROCKWELL.COM>
Date: 27 Jan 93 13:47:24 GMT
Organization: Info-Multinet<==>Vmsnet.Networks.Tcp-Ip.Multinet Gateway
Lines: 70
X-Gateway-Source-Info: Mailing List

>> There are very few computer/network applications in our environment that
>> don't provide an immediate indication when an error occurs. Unfortunately
>> one of our most critical computer/network applications (e-mail) is an
>> application that doesn't provide immediate notification. Do any Multinet
>> sites have an automated alarm system to determine when e-mail is not being
>> delivered? Waiting for user complaints is not appropriate in our business.
>
>This sort of system-level monitoring of the overall health of the email system
>is quite a different problem than per-user-per-message reporting. Although
>probe messages are a useful tool, it is usually more important to monitor
>aggregrate statistics like message queue sizes, mean time to delivery, and so
>on.

I agree that the aggregate statistics are useful, especially for a system
which functions as a central e-mail hub. However, many of our users are on
workstations that receive mail directly (POP and IMAP are rarely used).
Aggregate statistics aren't really needed on the workstations, but a prompt
indication that e-mail isn't going through is.

Very rarely does e-mail fail to deliver a message from a local user to a
local user. The local mail system is either up or down, and it is very
obvious to the user community if it is down. E-mail delivery problems tend to
occur in delivering a message to a remote node. It is these messages where
nondelivery notification is needed.
And it is these problems which are often caught by the e-mail probes.

>The problem with monitoring and reporting on email is that email itself often
>as not is the tool of choice for providing notification about application
>errors. (For example, the print symbiont we use locally reports PostScript
>errors via email.) But if there's a problem in the email system the last thing
>you want to use to report it is email! I've seen email-reports-on-email setups
>go into catastrophic overload and meltdown on more than one occasion; one such
>mess took several days of staff time to clean up.

Very true. However, realistically, what other method is available to deliver
a message to a user who may or may not be logged in when the message is
issued? Sending messages to the user's terminal doesn't guarantee 100%
delivery (the user may not be logged in). Perhaps an error file could be
written in the user's home directory and the login process could display the
file contents at login, but many users only log in 2 or 3 times a day, so
this wouldn't be a timely mechanism. And neither of these mechanisms works
well if the user is on a remote node. E-mail seems to be the most appropriate
method.

>The obvious tool to use for this sort of monitoring would be SNMP, of course.
>(I won't bother to explain SNMP's characteristics that make it a good choice
>here.) However, application monitoring with SNMP has received almost total
>neglect in IETF circles -- I don't know why this is so, but it sure is!

If an SNMP station is not monitored 7 X 24, how do you propose to deliver the
SNMP messages to the cognizant people? Often those individuals who support
the environment have multiple responsibilities. Even when support is
distributed amongst several individuals, they may not invoke an application
to check SNMP status indicators frequently enough to respond quickly to an
alarm condition.
However, I can guarantee that any support person who is logged in (or who has
just logged in and is doing a quick scan of received mail) will respond
quickly to a mail message received from "ALARM@MONITOR."

Yes, SNMP could gather statistics and signal alarm conditions. But,
realistically speaking, I believe most sites will want to use e-mail to
deliver the alarms and statistics to the cognizant support staff. Now, when
client-server functionality truly becomes a commodity within the support
community (i.e. the SNMP stations fully support X-windows and all the support
staff have X-window display capability on their desktop), this may change.
Until then, rightly or wrongly, e-mail will be the mechanism of choice for
most users and application developers.

Bob