- Path: sparky!uunet!stanford.edu!bu.edu!rpi!usc!howland.reston.ans.net!sol.ctr.columbia.edu!hamblin.math.byu.edu!arizona.edu!mvb.saic.com!info-multinet
- From: RAS@GV3.CACD.CR.ROCKWELL.COM (Robert A. Schneider II . 299/106-180 . 435-3863)
- Newsgroups: vmsnet.networks.tcp-ip.multinet
- Subject: RE: mail problem
- Message-ID: <930127074724.2200271a@GV3.CACD.CR.ROCKWELL.COM>
- Date: 27 Jan 93 13:47:24 GMT
- Organization: Info-Multinet<==>Vmsnet.Networks.Tcp-Ip.Multinet Gateway
- Lines: 70
- X-Gateway-Source-Info: Mailing List
-
-
- >> There are very few computer/network applications in our environment that
- >> don't provide an immediate indication when an error occurs. Unfortunately
- >> one of our most critical computer/network applications (e-mail) is an
- >> application that doesn't provide immediate notification. Do any Multinet
- >> sites have an automated alarm system to determine when e-mail is not being
- >> delivered? Waiting for user complaints is not appropriate in our business.
- >
- >This sort of system-level monitoring of the overall health of the email system
- >is quite a different problem than per-user-per-message reporting. Although
- >probe messages are a useful tool, it is usually more important to monitor
- >aggregate statistics like message queue sizes, mean time to delivery, and so
- >on.
-
- I agree that aggregate statistics are useful, especially for a system
- which functions as a central e-mail hub. However, many of our users are on
- workstations that receive mail directly (POP and IMAP are rarely used).
- Aggregate statistics aren't really needed on the workstations, but a prompt
- indication that e-mail isn't going through is.
-
- E-mail very rarely fails to deliver a message from one local user to
- another. The local mail system is either up or down, and it is very obvious
- to the user community when it is down. Delivery problems tend to occur when
- a message is bound for a remote node. Those are the messages that need
- nondelivery notification, and those are the problems that e-mail probes
- often catch.
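
For illustration only, here is a minimal Python sketch of the bookkeeping a
probe-based alarm needs: note when a probe goes out to a remote node, clear it
when the echo comes back, and report any node whose probe is overdue. The
class, the method names, the node names, and the 10-minute timeout are all
hypothetical. Nothing here is part of MultiNet or any real mail package, and
actually sending the probe and recognizing its echo is left out.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class ProbeTracker:
    """Tracks probe messages sent to remote nodes (hypothetical helper)."""

    timeout_seconds: float
    # probe id -> (remote node, time the probe was sent, in seconds)
    outstanding: Dict[str, Tuple[str, float]] = field(default_factory=dict)

    def sent(self, probe_id: str, node: str, now: float) -> None:
        """Record that a probe was mailed to `node` at time `now`."""
        self.outstanding[probe_id] = (node, now)

    def echoed(self, probe_id: str) -> None:
        """Record that the probe came back; unknown ids are ignored."""
        self.outstanding.pop(probe_id, None)

    def overdue(self, now: float) -> List[str]:
        """Return the sorted nodes whose probes have exceeded the timeout."""
        late = {
            node
            for node, sent_at in self.outstanding.values()
            if now - sent_at > self.timeout_seconds
        }
        return sorted(late)


# Example: two probes go out at t=0; only the hub echoes its probe back.
tracker = ProbeTracker(timeout_seconds=600)
tracker.sent("p1", "hub.example", now=0)
tracker.sent("p2", "ws1.example", now=0)
tracker.echoed("p1")

quiet = tracker.overdue(now=300)   # inside the timeout window
alarms = tracker.overdue(now=601)  # ws1.example never echoed its probe
```

How the resulting alarm list reaches the support staff (terminal broadcast,
pager, or e-mail over a path known to be working) is a separate decision, and
it is the one argued over in the rest of this message.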
-
- >The problem with monitoring and reporting on email is that email itself often
- >as not is the tool of choice for providing notification about application
- >errors. (For example, the print symbiont we use locally reports PostScript
- >errors via email.) But if there's a problem in the email system the last thing
- >you want to use to report it is email! I've seen email-reports-on-email setups
- >go into catastrophic overload and meltdown on more than one occasion; one such
- >mess took several days of staff time to clean up.
-
- Very true. However, realistically, what other method is available to deliver
- a message to a user who may or may not be logged in when the message is
- issued? Sending messages to the user's terminal doesn't guarantee 100%
- delivery (user may not be logged in). Perhaps an error file could be written
- in the user's home directory and the login process could display the file
- contents at login, but many users only log in 2 or 3 times a day, so this
- wouldn't be a timely mechanism. But neither of these mechanisms works well
- if the user is on a remote node. E-mail seems to be the most appropriate
- method.
-
- >The obvious tool to use for this sort of monitoring would be SNMP, of course.
- >(I won't bother to explain SNMP's characteristics that make it a good choice
- >here.) However, application monitoring with SNMP has received almost total
- >neglect in IETF circles -- I don't know why this is so, but it sure is!
-
- If an SNMP station is not monitored 7 X 24, how do you propose to deliver
- the SNMP messages to the cognizant people? Often those individuals who
- support the environment have multiple responsibilities. Even when support
- is distributed amongst several individuals, they may not invoke an application
- to check SNMP status indicators frequently enough to respond quickly to an
- alarm condition. However, I can guarantee that any support person who is
- logged in (or who has just logged in and is doing a quick scan of received
- mail) will respond quickly to a mail message received from "ALARM@MONITOR."
-
- Yes, SNMP could gather statistics and signal alarm conditions. But,
- realistically speaking, I believe most sites will want to use e-mail to
- deliver the alarms and statistics to the cognizant support staff.
-
- Now when client-server functionality truly becomes a commodity within the
- support community (i.e. the SNMP stations fully support X-windows and all
- the support staff have X-window display capability on their desktop) this
- may change. Until then, rightly or wrongly, e-mail will be the mechanism of
- choice for most users and application developers.
-
- Bob
-