monitor.log

Contains the log of web site events and performance measurements. It can be broken down into three kinds of records: events, cycperf and urlperf.

FORMAT

{   #GMTTIME,
    #AGENT,
    #URL,
    #TYPE,
    #LOCALTIME,
    #MESSAGE
}

#TYPE is one of: "Failure", "Restart", "Warning", "Info".
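
The six fields above map naturally onto a small record type. The following is a minimal sketch, assuming each record is one comma-separated line with double-quoted fields, as in the event examples of this document. The names LogRecord and parse_record are illustrative only and are not part of the product.

```python
import csv
from typing import NamedTuple


class LogRecord(NamedTuple):
    gmt_time: str
    agent: str
    url: str
    event_type: str
    local_time: str
    message: str


VALID_TYPES = {"Failure", "Restart", "Warning", "Info"}


def parse_record(line: str) -> LogRecord:
    """Split one CSV log line into its six named fields."""
    # skipinitialspace tolerates a blank after a comma, as seen in some sample lines
    fields = next(csv.reader([line], skipinitialspace=True))
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}")
    record = LogRecord(*(field.strip() for field in fields))
    if record.event_type not in VALID_TYPES:
        raise ValueError(f"unknown #TYPE: {record.event_type!r}")
    return record


sample = ('"09/07/99 09:44","WebMonitor1.1_cont", "http://www.mycompany.com/",'
          '"Restart","09/07/99 09:44","Works ok !"')
record = parse_record(sample)
print(record.event_type, "|", record.message)  # prints: Restart | Works ok !
```

Stripping each field is a deliberate choice here, because some sample lines carry a leading blank inside the quotes.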

monitor.log: events
EXAMPLE

This message records that the web site works correctly the first time it is tested.

#GMTTIME #AGENT #URL #TYPE #LOCALTIME #MESSAGE

"09/07/99 09:44","WebMonitor1.1_cont", "http://www.mycompany.com/","Restart","09/07/99 09:44","Works ok !"

This message records a web site failure.
 

#GMTTIME #AGENT #URL #TYPE #LOCALTIME #MESSAGE

"06/07/99 11:51","WebMonitor1.1_cont","http://www.mycompany.com/","Failure"," 06/07/99 11:51",
    "NoAlarm:Invalid HTTP response code 0"

This message records a web site restart.
 

#GMTTIME #AGENT #URL #TYPE #LOCALTIME #MESSAGE

"06/07/99 12:15","WebMonitor1.1_cont","http://www.mycompany.com/","Restart"," 06/07/99 12:15"," Works again!"

The message may start with "Alarm:" or "NoAlarm:", indicating whether an alarm was raised and therefore whether there is a corresponding entry in alarms.log.
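
The prefix can be separated from the rest of the message with a few lines of code. This is a minimal sketch; the function name split_alarm_prefix is hypothetical, and returning None for messages without a prefix (such as "Works again!") is an assumption about how such messages should be treated.

```python
def split_alarm_prefix(message: str):
    """Return (alarm_raised, text).

    alarm_raised is True for "Alarm:", False for "NoAlarm:",
    and None when the message carries no prefix at all.
    """
    text = message.strip()
    for prefix, raised in (("NoAlarm:", False), ("Alarm:", True)):
        if text.startswith(prefix):
            return raised, text[len(prefix):].strip()
    return None, text


print(split_alarm_prefix("NoAlarm:Invalid HTTP response code 0"))
print(split_alarm_prefix("Alarm:cannot connect."))
```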
 

MESSAGES (see HTTPD ERROR CODES for more)
 
Type Message Note
"Failure" "NoAlarm:Invalid HTTP response code 0" Web site failure, see the HTTP error code table.
"Failure" "Content has changed: verify manually" The content has changed, or the expected keyword was not found in the content.
"Failure" "HTTP was public, is now protected" The web site was public and is now password protected (this may or may not be allowed).
"Failure" "HTTP was protected, is now public" The web site was password protected and is now public (this may or may not be allowed).
"Failure" "Sensor timed out too many times." There have been [3] successive timeouts when testing the URL. This means the web site may be down or hanging.
"Warning" "DNS lookup failure for [url]" The name lookup failed but the server is up (this may or may not be allowed).
"Warning" "Performance is downgraded" The average performance over the last [15] connections is significantly lower than the average performance since inception (see performance curves).
"Warning" "Sensor timed out.1", "Sensor timed out.2", "Sensor timed out.3" The URL connection could not be completed within [1] minute. If this happens [3] times, a "Sensor timed out too many times." failure is recorded.
"Info" "IP address for site http://www.mycompany.com/ is http://192.123.59.134/" Initial record of the IP address on the first test (after restart).
"Info" "IP address for site http://www.mycompany.com/ was http://192.123.59.134/ is now http://192.123.59.137/" The IP address has changed.
"Info" "Shutdown" The application was shut down.
"Info" "Agent restarted" The application was restarted.

This list is not exhaustive.
 
 

HTTPD ERROR CODES

Most Failure messages contain the HTTPD error code (the HTTP status code) and give an explanation, according to the HTTP standard:

6.1.1 Status Code and Reason Phrase

The Status-Code element is a 3-digit integer result code of the
attempt to understand and satisfy the request. These codes are fully
defined in section 10. The Reason-Phrase is intended to give a short
textual description of the Status-Code. The Status-Code is intended
for use by automata and the Reason-Phrase is intended for the human
user. The client is not required to examine or display the
Reason-Phrase.

The first digit of the Status-Code defines the class of response. The
last two digits do not have any categorization role. There are 5
values for the first digit:

- 1xx: Informational - Request received, continuing process
- 2xx: Success - The action was successfully received,
  understood, and accepted
- 3xx: Redirection - Further action must be taken in order to
  complete the request
- 4xx: Client Error - The request contains bad syntax or cannot
  be fulfilled
- 5xx: Server Error - The server failed to fulfill an apparently
  valid request

The individual values of the numeric status codes defined for
HTTP/1.1, and an example set of corresponding Reason-Phrase's, are
presented below. The reason phrases listed here are only
recommendations -- they MAY be replaced by local equivalents without
affecting the protocol.

Status-Code =
"100" ; Section 10.1.1: Continue
| "101" ; Section 10.1.2: Switching Protocols
| "200" ; Section 10.2.1: OK
| "201" ; Section 10.2.2: Created
| "202" ; Section 10.2.3: Accepted
| "203" ; Section 10.2.4: Non-Authoritative Information
| "204" ; Section 10.2.5: No Content
| "205" ; Section 10.2.6: Reset Content
| "206" ; Section 10.2.7: Partial Content
| "300" ; Section 10.3.1: Multiple Choices
| "301" ; Section 10.3.2: Moved Permanently
| "302" ; Section 10.3.3: Found
| "303" ; Section 10.3.4: See Other
| "304" ; Section 10.3.5: Not Modified
| "305" ; Section 10.3.6: Use Proxy
| "307" ; Section 10.3.8: Temporary Redirect
| "400" ; Section 10.4.1: Bad Request
| "401" ; Section 10.4.2: Unauthorized
| "402" ; Section 10.4.3: Payment Required
| "403" ; Section 10.4.4: Forbidden
| "404" ; Section 10.4.5: Not Found
| "405" ; Section 10.4.6: Method Not Allowed
| "406" ; Section 10.4.7: Not Acceptable
| "407" ; Section 10.4.8: Proxy Authentication Required
| "408" ; Section 10.4.9: Request Time-out
| "409" ; Section 10.4.10: Conflict
| "410" ; Section 10.4.11: Gone
| "411" ; Section 10.4.12: Length Required
| "412" ; Section 10.4.13: Precondition Failed
| "413" ; Section 10.4.14: Request Entity Too Large
| "414" ; Section 10.4.15: Request-URI Too Large
| "415" ; Section 10.4.16: Unsupported Media Type
| "416" ; Section 10.4.17: Requested range not satisfiable
| "417" ; Section 10.4.18: Expectation Failed
| "500" ; Section 10.5.1: Internal Server Error
| "501" ; Section 10.5.2: Not Implemented
| "502" ; Section 10.5.3: Bad Gateway
| "503" ; Section 10.5.4: Service Unavailable
| "504" ; Section 10.5.5: Gateway Time-out
| "505" ; Section 10.5.6: HTTP Version not supported
| extension-code
extension-code = 3DIGIT
Reason-Phrase = *<TEXT, excluding CR, LF>

HTTP status codes are extensible. HTTP applications are not required
to understand the meaning of all registered status codes, though such
understanding is obviously desirable. However, applications MUST
understand the class of any status code, as indicated by the first
digit, and treat any unrecognized response as being equivalent to the
x00 status code of that class, with the exception that an
unrecognized response MUST NOT be cached. For example, if an
unrecognized status code of 431 is received by the client, it can
safely assume that there was something wrong with its request and
treat the response as if it had received a 400 status code.

Extract from draft-ietf-http-v11-spec-rev-06.txt
Source: Internet Engineering Task Force
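
The class rule quoted above (first digit gives the class, unrecognized codes fall back to the x00 code of their class) can be sketched as follows. The function names are hypothetical, and labelling anything outside 100-599 as "Invalid" is an assumption chosen to match the "Invalid HTTP response code 0" message shown earlier.

```python
STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def status_class(code: int) -> str:
    """Name the class of a status code from its first digit."""
    if not 100 <= code <= 599:
        return "Invalid"  # assumption: e.g. the code 0 seen in Failure messages
    return STATUS_CLASSES[code // 100]


def fallback_code(code: int) -> int:
    """x00 code of the same class, used when the exact code is not recognized."""
    return (code // 100) * 100


print(status_class(404), fallback_code(431), status_class(0))
```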


How to read monitor.log (events)

The usual sequence of events for a web site is:

Time Type Message Note

10:00 Restart Works ok! // first test: web site is ok !

15:00 Failure Alarm:cannot connect. // failure is detected, alarm is sent

15:35 Restart Works again! // the web site was restarted

When a "Failure" is followed by a "Restart", the downtime can be determined from the two timestamps. In this example the web site was down for about 35 minutes.
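
The downtime calculation can be automated by pairing each "Failure" with the next "Restart". This is a sketch under stated assumptions: timestamps are taken to be day/month/year (the samples do not settle this), the events belong to a single URL and are in chronological order, and the date used below is made up because the sequence above shows only times.

```python
from datetime import datetime

TIME_FORMAT = "%d/%m/%y %H:%M"  # assumption: day/month/year


def downtimes(events):
    """Yield (failure_time, restart_time, duration) for each Failure/Restart pair.

    events: iterable of (time_string, type) tuples in chronological order.
    """
    failure_at = None
    for stamp, event_type in events:
        when = datetime.strptime(stamp.strip(), TIME_FORMAT)
        if event_type == "Failure" and failure_at is None:
            failure_at = when  # start of an outage
        elif event_type == "Restart" and failure_at is not None:
            yield failure_at, when, when - failure_at
            failure_at = None


events = [
    ("06/07/99 10:00", "Restart"),   # first test: web site is ok
    ("06/07/99 15:00", "Failure"),
    ("06/07/99 15:35", "Restart"),
]
for start, end, duration in downtimes(events):
    minutes = duration.total_seconds() / 60
    print(f"down from {start:%H:%M} to {end:%H:%M}: {minutes:.0f} minutes")
# prints: down from 15:00 to 15:35: 35 minutes
```

The initial "Restart" is ignored because no failure precedes it, and only the first of several consecutive failures opens an outage.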
 

monitor.log: cycperf
For example,

#GMTTIME #AGENT #URL #TYPE #LOCALTIME #MESSAGE

11/07/99 00:02 WebMonitor1.1_cont http://www.mycompany.com/ Info 11/07/99 00:02
"cycPerfDuration = 928, avgPerfDuration = 157, avgPerfCount = 765"

The performance information is contained in the message field.

This example would have generated a performance warning, because the cycle duration (928) is significantly longer than the usual, average duration (157).
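
The comparison between the cycle duration and the average can be sketched in code. The document does not state what "significantly" means, so the factor parameter below is a purely hypothetical threshold, and both function names are illustrative.

```python
import re


def parse_perf_message(message: str) -> dict:
    """Extract every 'name = integer' pair from a performance message."""
    pairs = re.findall(r"(\w+)\s*=\s*(\d+)", message)
    return {name: int(value) for name, value in pairs}


def is_degraded(perf: dict, factor: float = 3.0) -> bool:
    """True when the cycle took more than `factor` times the average.

    factor is an assumed threshold, not the product's actual rule.
    """
    return perf["cycPerfDuration"] > factor * perf["avgPerfDuration"]


perf = parse_perf_message(
    "cycPerfDuration = 928, avgPerfDuration = 157, avgPerfCount = 765"
)
print(perf["cycPerfDuration"], perf["avgPerfDuration"], is_degraded(perf))
```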

Why not indicate the throughput (debit)?

Because the name lookup and the opening of the connection do not depend on the length of the content, it makes no sense to calculate a throughput from cycPerfDuration.
 
 

monitor.log: urlperf
For example,

#GMTTIME #AGENT #URL #TYPE #LOCALTIME #MESSAGE

11/07/99 00:02 WebMonitor1.1_cont http://www.mycompany.com/ Info 11/07/99 00:02
"urlPerfDuration = 105, length = 2096, debit = 19 kb per sec"

The performance information is contained in the message field.
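
The three values in a urlperf message appear to be related: dividing the length by the duration reproduces the logged debit. The sketch below checks this, assuming that length is in bytes, that urlPerfDuration is in milliseconds, that "kb" means 1000 bytes, and that the logged value is truncated. None of these is confirmed by the document, and the function names are hypothetical.

```python
import re

URLPERF_PATTERN = re.compile(
    r"urlPerfDuration\s*=\s*(\d+),\s*length\s*=\s*(\d+),\s*debit\s*=\s*(\d+)"
)


def parse_urlperf(message: str):
    """Return (duration_ms, length_bytes, logged_debit) from a urlperf message."""
    match = URLPERF_PATTERN.search(message)
    if match is None:
        raise ValueError("not a urlperf message")
    duration_ms, length_bytes, logged_debit = map(int, match.groups())
    return duration_ms, length_bytes, logged_debit


def throughput_kb_per_sec(length_bytes: int, duration_ms: int) -> float:
    # bytes per millisecond equals kilobytes (1000 bytes) per second
    return length_bytes / duration_ms


duration_ms, length_bytes, logged_debit = parse_urlperf(
    "urlPerfDuration = 105, length = 2096, debit = 19 kb per sec"
)
computed = throughput_kb_per_sec(length_bytes, duration_ms)
print(f"computed {computed:.2f} kb/s, logged {logged_debit} kb/s")
```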

How to read cycperf and urlperf

The logs of several monitors are needed for an accurate performance analysis. cycperf and urlperf must be read together for diagnosis.
 

Plotting curves

The curves can be generated with MS Excel from the CSV files by extracting the performance information. For example:

[Figure: Cycle avg duration variation during the day (ms), using cycperf]

[Figure: Debit variation during the day (kb/s), using urlperf]
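
Before plotting, the extracted values have to be grouped by time of day. As an alternative to doing this in a spreadsheet, the grouping can be sketched as below. The function name is hypothetical, the day/month/year timestamp format is an assumption, and only the first sample comes from this document; the other two are invented for illustration.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def hourly_average(samples, time_format="%d/%m/%y %H:%M"):
    """Average the values per hour of day.

    samples: iterable of (local_time_string, value) pairs.
    Returns a dict {hour: average}, ordered by hour.
    """
    by_hour = defaultdict(list)
    for stamp, value in samples:
        hour = datetime.strptime(stamp.strip(), time_format).hour
        by_hour[hour].append(value)
    return {hour: mean(values) for hour, values in sorted(by_hour.items())}


samples = [
    ("11/07/99 00:02", 928),  # cycPerfDuration from the cycperf example
    ("11/07/99 00:32", 172),  # invented sample
    ("11/07/99 13:05", 160),  # invented sample
]
for hour, average in hourly_average(samples).items():
    print(f"{hour:02d}:00  {average}")
```

The resulting hour/average pairs can be written out as CSV and charted in any tool.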