MS BackOffice Unleashed

General Monitoring Concepts
Windows NT Monitoring Tools
Parameters that Can Be Monitored
BackOffice Integration with NT Monitoring
Extending Windows NT Monitoring for Applications
Summary

— 4 —

Monitoring Environment

Integrated architectures and security environments are a good foundation on which to build systems. So far, so good. Now it is time to leave the theoretical and architectural worlds behind and start talking about the actual implementation of the BackOffice family. The first topic in this chapter should be close to the hearts of administrators and systems support staff. It deals with the tools that are available to monitor the BackOffice family during operation to see how it is performing.

A good starting point is a clarification of what tools are being discussed in this chapter. Are you going to learn the tools that monitor the operating system, Windows NT? Does this chapter cover the tools needed to monitor SQL Server? How about Exchange Server, SNA Server, SMS, and IIS? The answer to these questions is yes, but there is a really neat catch. You are going to look at only the set of tools that is built into the Windows NT operating system. The neat catch is that when you learn the NT tools, you also learn the tools for BackOffice, because the folks at Microsoft were kind enough to make their developers use the NT tools for the various BackOffice components.

This is good in a number of ways. It is good for Microsoft because they do not have to pay people to develop, test, and support a number of different monitoring tools. That obviously saves them some money and enables them to use programmers elsewhere to develop useful features that are specific to the various BackOffice components. It is also very good for those of us who use the monitoring tools in the field. Learning one set of monitoring tools is much easier than having to learn how Exchange Server developers thought and then learn how the SMS people approached the issue. The support staff of most systems that I have run across have more than enough to do already, so every bit of time savings is generally appreciated.

What functions do monitoring tools perform for you? It generally depends on who you are talking to. Security types say that the main functions of monitoring tools are to capture who has access to the various types of data and validate these accesses against the standard list of security privileges. Users, on the other hand, are generally interested is having their huge database queries complete within a fraction of a second. Therefore, you are usually under some pressure to monitor performance of the system and see what you can do to improve it. Finally, because software and hardware are imperfect, it is sometimes useful to have tools that record messages from the various hardware and software components to see what they are doing. This can often take the form of the last words from a dying process that try to capture what caused the problem.

Where do monitoring tools fit into the duties of BackOffice support staff? I tend to think of them as part of the first aid kit for the operating system and other BackOffice components. You typically do not use them every day. You typically do not use them a lot. However, when the system is crashing and users are desperate to get their work done, these are the essential tools that help you determine what the problem is and what you need to do to fix it. Therefore, it is wise to become comfortable with the various tools, what each one can do for you, and which bits of information provided by these tools are most useful when you have a particular problem.

This chapter begins with an overview of some of the general monitoring concepts that you should consider when getting acquainted with your monitoring tools. Next, you look at the tools that are provided by Windows NT for monitoring your operating system and BackOffice components. An overview of the parameters that can be monitored and which are most useful in common situations is followed by several sample monitoring plans for your consideration when you plan for your own installations. You explore the integration of BackOffice with Windows NT monitoring tools next. This chapter wraps up with a discussion on how your developers can integrate their applications into the NT/BackOffice monitoring environment.

General Monitoring Concepts

Before you jump into the details of the monitoring tools that are provided in Windows NT and BackOffice, let's look at a few of the design considerations and functional requirements of a good set of monitoring tools. This will enable you to better understand the way things have been set up in NT and reduce the amount of time that it will take for you to learn the environment. The first concept to be covered is parameter. Basically, it is something that can be measured. Examples of computer parameters could include disk space used or abnormal termination of a particular application (which I hope you won't observe very often).

There are an enormous number of parameters of a computer and its operation. Some are not very useful (the color of the metal on the inside of the computer’s cover, for example). Others are useful at planning time, but after that are seldom used (the size of the computer case, for example). This chapter focuses on those parameters that are important to computer operations—and therefore are of interest to BackOffice and NT administrators.

After eliminating the parameters that do not deal with computer operations, you still have a large number of parameters that can be measured or observed. These measurements can include usage (how much of a disk drive is currently filled with data), activity (number of reads and writes to a particular disk drive per second), or other quantities (such as an internal application error occurring). It seems appropriate to break these parameters down into categories so that they can be studied in more detail. For purposes of this discussion, the operational parameters of the system are divided into three categories:

Hardware parameters relate to the physical pieces of equipment in your computer, such as the central processing unit (CPU) or disk drives.
Operating system parameters relate to tunable portions of the operating system.
Application parameters are specific to the individual applications.

These parameters form a hierarchy, as shown in Figure 4.1. The key item to remember is that you need to have all parts of this hierarchy working together to get the performance that you want. For example, you can have the best operating system and applications in the world, but there are fixed limits as to the number of computations an 80286 processor can complete in a given period of time. Therefore, it is important to be able to understand what is going on at all levels to achieve the performance (and reliability) that you need.

FIGURE 4.1. Hierarchy for performance and monitoring.

The first set of parameters, hardware items, typically fall into the areas of capacity (megabytes of disk storage), performance (number of bytes transferred per second to a given disk drive), and events (such as an error when trying to read from a disk drive). They can also be divided according to hardware components. The three big items monitored on most computer systems are the CPU, disk drives, and physical memory. Of course, you may be interested in the transfer capacity to a CD-ROM drive at certain times, so you also need to be able to monitor the less commonly used parameters.

The next set of parameters relate to the operating system. If operating systems were fixed entities that performed the same for all computer systems and applications, you probably would not need to be concerned with this category of parameters, because they would map directly to the performance parameters of the hardware. However, operating system designers are quite clever and work hard to enable their creations to adapt to different hardware configurations and application needs. For example, if you put additional memory into a Windows NT system, it will reconfigure itself to use that additional memory to speed up application processing. You also may want to intervene to adjust various operating system parameters to adapt to special needs of your application (complex numerical models of the interaction of galaxies with one another have different needs than those of a word processor).

The final set of parameters relate to the applications that are running on the computer. This could liberally be thought of as all software that did not come as part of the operating system distribution media. This is often the most difficult area to deal with from a monitoring point of view, because many developers record little about the execution of their application other than fatal error messages. It is also one of the more pleasant design features of BackOffice tools, because this set tends to record a substantial amount of information (startup, shutdown, and even processing levels). The more data that you have about critical application parameters, the better you will be able to support those applications and keep performance levels high.

One thing that you have to be careful about is the volume of data that you monitor. As you will learn throughout the rest of this section, Windows NT is designed to enable you to select the parameters that will be monitored instead of recording every scrap of information that is available to it. The good news is that you can see the status of important parameters right away, without having to wade through a mountain of numbers. The bad news is that you have to figure out which of the hundreds of parameters available are important to you. Later sections in this chapter cover the list of parameters and preparation of monitoring plans.

Another factor in the monitoring process is the utility of the numbers themselves. Suppose I told you that a disk drive has had 100M of data transferred to it since the operating system was started up and its counters reset. What exactly does that tell you? Actually, it tells you very little. If the operating system had just been started a few minutes ago, this could be a sign that this is a heavily loaded (and perhaps overloaded) disk drive. If the operating system was last started up a month ago, it could be a very normal number. The point is that to have true utility, numbers need to be expressed in terms that are useful (bytes read per second) for the particular parameter measured and also provide a reasonable graduation of time over which the data was taken (what was the average over the last minute as opposed to the average over the last six months). A particularly useful unit of measurement is a percentage of total capacity. For example, knowing the number of instructions processed per second by the CPU is nice, but being told that the CPU is operating at 20 percent of its processing capacity is much more useful.

The next consideration when you are trying to monitor the operations and performance of your system is the presentation format of the data. A few dozen screens with a few hundred numbers each is generally difficult for mere human beings to wade through. However, a graph that has a line showing how the disk throughput varied over the last day and another one showing CPU utilization are easy formats for administrators to review. Other parameters, such as the log of problems with the operating system, do not lend themselves to graphical formats. However, Windows NT uses time-ordered lists of errors and uses icons and filters to enable you to sort out the problem messages from the normal routine. The key to remember is that you need to be familiar with the various display formats of the monitoring utilities so that you can select the one that is most useful to you on a given task.

Now you have a set of data regarding only those parameters that are truly important to you. A logical question at this point is to ask when a given parameter being monitored is cause for concern. That is actually quite a tricky question in many cases. Some parameters are easy to understand and have fixed capacity values. For example, a disk drive will store a certain number of megabytes and no more. CPU utilization is another example where if you see that your CPU is running at 100 percent utilization and all your other parameters are well within their limits, it may be time to get a faster CPU.

The problem comes when you try to measure performance of a large number of the parameters in your system. For example, you find that a disk drive is transferring 10M per second to your system. Is that too high or absolutely normal? It depends on the type of disk drive, the type of disk drive controller, and the type of transfers involved. (Do they cause the heads of the disk drive to have to keep moving around or is it one long sequential read?) With all these variables, it is almost impossible to come up with an absolute set of numbers that can be used to interpret the numbers that are coming from your monitoring efforts.

The only practical solution for many parameters is to just observe the values for the parameters and the performance of the system. Take a set of numbers when the system is lightly loaded and everything is working well. Then try taking numbers at higher load levels and observe the performance of your applications. If everything is functioning well and the users are happy, that number of 10M-per-second data transfer to a disk drive must be okay. Save this data for problem times. Then, when you monitor your system, you may find that the disk drive is transferring 12M per second and that the queue of requests for that disk drive is very high. This is your indicator that the drive has reached capacity. You can store this number for future reference to help you avoid problems with this and similar disk drives.

Another important point about the monitoring process is that it should not bias any data that you are receiving. This can be tricky on a computer where monitoring a large number of parameters can take up a lot of CPU time and disk transfer capacity. The good news is that you have a fairly fine level of control over the number of parameters and where you are storing your data. You just need to keep this in the back of your mind before you start logging every event that is happening to your computer.

That is probably enough discussion on the basics of monitoring. These are some of the general monitoring considerations that you need to think about and that do not fit in with the discussion of this tool or that pull-down menu. To close out this section, let's look at a brief presentation of another way of classifying parameters that matches well with the tools that are provided in Windows NT:

Events are records of a discrete event occurrences. They usually do not have a value associated with them. Instead, they contain a text message, such as the fact that the browser service started successfully or that a print job from a certain user was completed.
Performance parameters are records of performance or utilization measurements. They almost always have numbers associated with them that can be graphed and compared with other measurements.
Auditing records are similar to event monitoring, but they tend to focus on actions taken by users and applications. They are typically used for security and other such applications. An example of this would be recording the fact that a specific user used a particular system administrator privilege.

Windows NT Monitoring Tools

With this general discussion completed, it is time to move on to the specific tools and parameters that you will use in Windows NT to measure the performance of your BackOffice system. This section focuses on the tools that are available to you. Its goal is to cover the basics of how to set up and use these tools for common monitoring tasks. The next sections review the various parameters and what they mean.

As mentioned in the last section, Windows NT provides three basic tools to enable you to monitor activities on your system. These tools are used for monitoring both Windows NT operating system activities and the activities internal to most of the BackOffice tools. Figure 4.2 shows these tools and their basic functions.

FIGURE 4.2. Windows NT performance monitoring tools.

The first tool that you need is the Performance Monitor. Its main task is to measure performance parameters for both the operating system and the hardware. It has been extended to be able to present performance parameters for most of the BackOffice applications. You also have the option of using it to capture performance information for your applications. It provides a number of display formats and data capture options to enable you to see the data in the way that makes the most sense to you.

The second tool in the Windows NT monitoring tool kit is the Event Viewer. Its goal is to provide a means of displaying records of activities that have occurred on the system. There are several different logs (security, application, and system events) that can be displayed. There are also a number of display formats.

The next Windows NT monitoring tool is the auditing record. The good news is that the auditing data is presented in a tool with which you will soon be familiar—the Event Viewer. The only trick is that you have to set your options using another tool, the User Manager.

The final monitoring tool is the Server application, which can be found on Control Panel. Although it does not produce fancy graphics, it is the best source for finding out things such as who is connected to the system and what resources they are sharing. There is not much to discuss about this tool, because it is relatively simple; however, that is part of its charm. The key points are that this is a tightly integrated monitoring environment and there are only a few simple tools to learn.

Performance Monitor

Figure 4.3 shows my favorite Performance Monitor view—the line graph. Performance Monitor is a flexible tool that presents data in a number of different formats. It is also very graphically oriented and easy to get used to. The basic interface for this tool has the following characteristics:

Across the top of the windows is the standard Windows menu bar. It enables you to access all your basic processing options.
Just below the menu is a toolbar that enables you to pick the most commonly used functions with a single menu click. The first set of four buttons enables you to select the type of display that you want to view: chart view, alert view, log view, or report. The next three buttons enable you to add, modify, or delete a particular counter from your list via a pop-up dialog box. The next button enables you to take a picture of the parameters data at the current moment. The next-to-last button enables you to write a comment to the log file at a particular time that might help you to remember what was going on later when you are reviewing the log. The final button displays the Display Options dialog box to enable you to set the environment to your liking.
The rest of the display is used to display the actual performance data itself. As mentioned earlier, there are four display formats. Each of these formats is covered in the next sections of this chapter.

FIGURE 4.3. Performance Monitor’s line graph interface.

Before getting into the specifics of the various displays in Performance Monitor, let's look at an option that might be useful to some of you. Many people are comfortable with only one window opened up to the full size of the monitor at a time. Those of you who are comfortable with several windows or toolbars being open on your desktop at a given time, however, can use a few control keys to minimize the Performance Monitor display and keep it open for your review at all times.

Figure 4.4 shows a Performance Monitor graph of SPU process time use in a nice, graph-only window. How do you accomplish this? First, set up the Performance Monitor graph that you want to see (the details of this process are shown in a few paragraphs). To trim the windows, press the following keys: Control+M (which toggles the menu on and off), Control+S (which toggles the status line on and off), and Control+T (which toggles the toolbar on and off). You can also use Control+P to set up Performance Monitor so that it stays on top of all the other windows on your desktop. In summary, set up the graph that you want and then press a few control keys to customize the display to suit your tastes.

FIGURE 4.4. Trimmed Performance Monitor window.

The best thing about Performance Manager is that it is installed when you install Windows NT. You do not have to go through any additional steps to get the software on your system. When you start Performance Monitor, you will notice that is not doing anything. There are no monitoring parameters selected, so it is just sitting there waiting for you to instruct it on what you want done. This makes sense, because there are so many different parameters and monitoring needs that Microsoft could not even begin to guess what it was that you needed by default.

Your first task is to set the options, counters (monitored parameters), and instances (for monitored objects such as disk drives, you have to tell it which disk drive you are interested in when you have more than one) for the monitoring task at hand. This is a relatively easy task. Click on the plus sign icon in the toolbar to add the counters for the parameters that you need to monitor. Performance Monitor displays the Add to Chart dialog box that is shown in Figure 4.5. The following are the options that you specify on this dialog:

The Computer box enables you to select the computer for which you want the monitoring to occur. Performance Monitor is running on your current computer, but it can collect and display statistics for any Windows NT computer on your network. You can even get a version of Performance Monitor for Windows 95 that enables you to monitor remote NT Servers from a Windows 95 computer. The selection box at the right enables you to search the network for a list of computers that you can monitor (using the standard computer name that you put in when you installed that computer). You can, of course, type in the computer name at the prompt using the standard double-backslash notation (for example, \\MASTER_SERVER).
The Object drop-down listbox enables you to select the objects (that is, physical disk drives or CPU) that you wish to monitor. The object selected determines the values that are available in the Counter and Instance scrollable listboxes.
The Counter scrollable listbox enables you to select which of the various counters (parameters) is applicable to the selected object that you wish to monitor. You can select multiple counters for a given object.
The Instance scrollable listbox enables you to choose which particular object for the selected object type that you wish to monitor. In most instances, there is only one instance of the object (that is, systems with only one CPU or one set of random access memory). However, there are cases, such as multiple physical and logical disk drives, that are common.
The Add button enables you to add the monitoring Computer, Object, Counter, and Instance configurations specified to be added to the parameters being monitored in your current instance of the Performance Monitor application. You can have multiple instances of Performance Monitor running on your computer to monitor the performance of different computers or different parameters on a single computer.
The Explain button is for those of you who do not have the time to become intimately familiar with all the various counters. It displays some additional text at the bottom of the dialog box. It is not the most detailed explanation of the subject matter, but it is helpful.
The Cancel and Done buttons enable you to close out the Add to Chart dialog box.

FIGURE 4.5. Add to Chart dialog box.

The same basic Add to dialog box is used for all the views (chart, alert, log, and report). This reduces the number of user interfaces that you have to learn to work with. The last word in the title bar changes to show the type of view that you are working with (Add to Chart, Add to Log). You will also notice the color, scale, width, and style controls at the bottom of the dialog box. Performance Monitor automatically cycles through a predefined set of color and line patterns as you add counters. However, those of you with finer artistic tastes may want to take matters into your own hands and specifically set the patterns on your graphs. These buttons enable you to accomplish this task.

The Chart View in Performance Monitor

The chart view is my personal favorite for displaying performance data. I have a science and engineering background, so I have gotten quite used to following lines across the page. There are a few considerations that you should keep in mind when you are setting up a chart view. The first is that the chart view is best suited for viewing numeric data that varies over time. It is not that great when you are looking for a rare occurrence (a spike in CPU utilization, for example), because the data on the chart view is routinely overwritten with new data. Therefore, you may want to use the alert view to catch rare events.

Another important chart consideration is avoiding excessively busy graphs. The human eye can follow a few lines as they vary over time, but it can be very confusing if you have a large number of lines moving all over the screen (see Figure 4.6). This graph illustrates several points. First, you probably find it difficult to distinguish which line is associated with a given counter. Very few people can follow a large number of lines, especially if they intersect (it is okay to have a few more lines if some are at the top of the graph and the others are at the bottom).

You can solve the problem of excessive graph complexity simply by starting up multiple instances of Performance Monitor with several, less-busy graphs running at the same time.

Another consideration that most people might forget is that Microsoft uses color well to help you distinguish the lines. It is easier to differentiate the green line from the red line, especially when they intersect. However, you may have difficulty presenting your wonderful color charts to a boss who is color-blind. If you do not have access to a wonderful color printer, the charts that make so much sense to you when you look at them on your color screen may be impossible to read in black and white off your laser printer. The line pattern can be used to compensate for this somewhat; however, if you have to present them to others, you may want to keep the complexity of the charts down (see Figure 4.6).

FIGURE 4.6. A busy Performance Monitor chart.

Perhaps you are a displaced graphical artist. You spend a lot of time customizing the counters, colors, patterns, and other chart settings to get that perfect graphical display of the key parameters of your system. It would be annoying to have to go through this every time you start Performance Monitor. On the File Menu, you have the option of saving your chart settings as shown in Figure 4.7. You also have the option of saving your charts, option menu selections, and so forth using the Save Workspace menu option. The goal of these menu picks is to let you build up a series of Performance Monitor charts, alerts, and so forth that provide you with key system information. You can do this while everything is going along smoothly and you have a little spare time. Then, when everything is crashing down around your feet, you can call up the appropriate monitors to determine the true problems quickly. Put this on your list of things to do.

FIGURE 4.7. Saving Performance Monitor settings to a file.

If you have ever done business application development, you have probably come to appreciate the diverse formatting and display demands of different users of the system. I have seen people argue for hours over the exact point size for the font used on reports and what the exact wording of the heading should be. Microsoft deals with this fact of life by providing you several options for presenting this data. These controls enable you to set the details about how the data on the chart is presented and collected. The Chart Options dialog box enables you to set the following presentation options (see Figure 4.8):

The Legend checkbox enables you to specify whether you want to use part of the display space to show a legend. The legend shows a sample section of a particular line (for example, the green one) and associates with a specified computer, object, counter, and instance. I recommend keeping this option activated unless the contents of a particular graph are really obvious.
The Value Bar checkbox controls whether the Last, Average, Minimum, Maximum, and Graph Time data displays are presented beneath the chart itself, as is shown in Figure 4.8.
The Vertical Grid checkbox controls whether vertical lines are drawn in the middle of the graph to help you see the value of a given data point. Figure 4.8 shows both vertical and horizontal grid lines.
The Horizontal Grid checkbox controls whether horizontal lines are drawn in the middle of the graph to help you see the time that corresponds to a given data point.
The Vertical Labels checkbox controls whether the scale of values for the vertical label is displayed.
The Gallery option contains radio buttons to select either Graph (a line graph where all the data points are represented as dots connected together with a line) or Histogram (a bar chart where each data point is represented as a vertical bar whose height corresponds to the value being measured).
The Vertical Maximum edit box enables you to control whether you display the whole range of values possible, as determined by Windows NT (0 to 100 percent), or whether you want to focus on a narrower range where the interesting data lies (0 to 50 percent). This can be useful in situations where you want to see finer variations in data that is confined to a narrower range.
The Update Time section enables you to specify whether you want Periodic Update or Manual Update. Periodic Update is the default in which Performance Monitor collects a data point for the parameters being monitored at the interval (in seconds) specified in the edit box. Manual Update collects data for all the parameters being monitored when you select the Update Now selection from the Options menu, when you click the toolbar item that looks like a camera, or when you press Control-U.

FIGURE 4.8. Chart Options dialog box.

In summary, the chart view in Performance Monitor provides an impressive array of data presentation options. You are probably uncertain as to which options suit your tastes from this brief discussion. There is no substitute for actually sitting down and trying out the options. As mentioned earlier, it is beneficial to figure out the key parameters and charting options that you like when you have time to work with the system. So far, however, you have seen only one of the four views of data provided by Performance Monitor. The good news is that although the format of the displays differs between these views, user interface is very similar so you should be able to adapt quickly to the other three presentation formats.

The Alert View in Performance Monitor

The next Performance Monitor view is the alert view (see Figure 4.9). The basic concept behind this display is quite simple. Suppose you want to monitor several parameters on several servers to detect any problems. You are not interested in the many hours of normal, within-limits data that would be generated on these servers. Instead, you want to know only when action is required on your part. The alert view enables you to tell the system to write an entry into the log whenever a parameter on a given system exceeds the value that you specify. The computer then takes on the task of sorting through the data that is collected at routine intervals and writing the values that meet your criteria.

FIGURE 4.9. Alert Log display.

To set up monitoring using the alert view, you first select this view from the toolbar (the icon has a log book with an exclamation mark on it). You add the systems, objects, counters, and instances that are of interest to you using the Add to Alert dialog box, which is very similar to the Add to Chart dialog box discussed previously (see Figure 4.10). The key data element that you want to enter on this panel is at the bottom of the screen. The Alert If control enables you to specify when the alert record is written. You can specify that you are interested in values that are either under or over the number that you specify.

FIGURE 4.10. Add to Alert dialog box.

An interesting option that can be used to provide highly automated systems is the Run Program on Alert control. You have already defined a condition that will trigger activity based on the built-in monitoring utilities in Windows NT. This control enables you to run a specified program when this limit value is reached. You could, for example, run a program that clears out temporary log files when a certain disk drive starts to get too full. Another option is to activate a mail utility to send you mail or a telephony utility that pages you with a coded message. You have the option of running this program every time the alert condition is reached or only the first time that you run across this condition (depending on how you want to implement your administrative programs).

The alert view is a very useful option when you want to run your monitoring on a continuous basis over long periods of time. It collects only that data that is of interest to you. It gives you a quick summary of what occurred and when. This makes it ideally suited to running over long periods of time, trying to catch when problem conditions arise. You can even run a specified program when the alert condition is reached that either tries to fix the problem or at least notifies you that the problem was detected. You still have to make time on a routine basis to read the data, but at least you avoid having to sort through large volumes of data to find the interesting events.

The Log View in Performance Monitor

The next view provided by Performance Monitor is the log view. The principle behind this view is simple. You write all the data to a log file instead of displaying it on the screen in a chart or scanning it for out-of-limits parameters as with the Alert view. Later, you open this log file using Performance Monitor and either view the results or export the results to a data file that can be imported into spreadsheets such as Microsoft Excel or Lotus for further processing. You can even write your own software applications to read these files and massage the data.

The log view is relatively simple to work with. First, you need to specify the computer, objects, counters, and instances that you wish to monitor using the Add to Log menu or toolbar options. Next, you need to specify the file that is going to contain the logging data and how often the data points are to be collected. You have to be careful because collecting data for even a few counters with a relatively frequent sampling interval can add up to a fair amount of disk space when running over several days or weeks. Figure 4.11 shows the Log Options dialog box.

FIGURE 4.11. Log Options dialog box.

The Log Options dialog box contains the standard file selection controls. You can add data to an existing file or create a new file. You may want to consider locating all your log files (from all applications) in a separate directory so that it is easy to find them and also easy to purge them when they are no longer needed. I have run across systems where log files have been capturing data without anyone even knowing that they are there leading to log files that are many tens of megabytes in size. Once you have specified the filename, the only other real decision is the update interval. You need to be careful to pick an interval that captures problems that occur (daily averages will not show peak usage at certain hours of the day) but does not overload you with data that has to be processed (as a 1-second average might). For reference purposes, if you choose a 10-second interval, you will generate over 8,600 data points per counter selected per day. Then click the Start Log button to begin data capture. You need to keep Performance Monitor running to continue to collect data to this log file.

If you wish to collect performance data in an automated fashion without keeping a particular computer logged in at all times, you can use scripts to start and stop the Performance Monitoring logging. This is actually useful, because you can pick different monitoring profiles for different periods of the day. You may not want to collect a lot of data at night when no one is logged in to the system. You may also want to collect more detailed data between 8 and 9 a.m. when you experience very slow response times.

Automated performance monitoring is accomplished using an interesting little utility in the Windows NT Resource Kit. This datalog.exe utility enables you to capture performance data using a Performance Monitor settings file. You can customize your data collection to various parts of your processing cycle using multiple settings files. You turn this service on and off using the at scheduling utility of Windows NT (similar to cron on UNIX). The following starts the monitoring utility at 8 a.m. and stops it at 11:59 a.m. (the Windows NT Resource Kit provides more details about this and other interesting utilities):

c:\> at 8:00 "monitor START"

c:\> at 11:59 "monitor STOP"

The Report View in Performance Monitor

The final view provided by Performance monitor is the report view. It functions, once again, according to a very simple concept. You specify a set of counters that you want to monitor, and Performance Monitor gives you a screen that lists the parameter and the value observed in the last time interval. Figure 4.12 shows the report view’s display. This is one view where manual data collection can be especially useful. Imagine a situation where everything is running fine on your system until you start up an application. You may want to collect a set of performance data manually when the application is not running to a report (which you save for later reference). You then start up the offending application and capture a new set of data to see what the problems are.

FIGURE 4.12. Report view in Performance Monitor.

Selection Data Sources in Performance Monitor

Each of the views in Performance Monitor is suited to a specific set of monitoring problems and also different user’s tastes. There is one especially interesting feature in Performance Monitor that has become available with Windows NT 4.0 (although you could download the files from the Internet for Windows 3.51 several months before 4.0 was available). This modified version of Performance Monitor enables you to monitor the performance of remote NT computers almost as easily as if you were monitoring your local machine. There is even a version for Windows 95 that lets you monitor remote NT (but not Windows 95) computers. This can be really convenient when you have a series of servers and workstations scattered around the building or in a data center, but your desk is in a traditional office area. The remote-capable version of Performance Monitor is actually part of a series of remote administration tools that enable you to manage user accounts and a number of other functions remotely. Many functions that involve sensitive activities (such as adding user accounts) require that you be a member of a Windows Domain, but that is not especially difficult to set up.

A lot more could be written about performance monitoring and the Windows NT Performance Monitoring tool. It is a flexible tool that can be extended by application developers to meet additional needs. It is relatively easy to set up. It can be customized to suit your individual needs and tastes. Best of all, it comes as part of the Windows NT operating system, so there are no additional software packages to purchase and install.

Auditing

Auditing information is actually displayed using the Event Viewer tool that is discussed in the next section. I have intentionally split auditing into a separate section for several reasons. First, auditing is typically associated with the gathering of information for future review. This includes such things as user access to sensitive information and logging in as a privileged (administrative) user and performing security maintenance tasks. The other reason is to bring out the auditing configuration tools, which are separate from the Event Viewer display.

Auditing is typically viewed by computer people as a defense against hackers and others who are trying to get at data and resources that they are not allowed to access. Although it is true that you do have to waste a lot of your precious time putting up a defense against these less-than-desirable individuals, auditing can be used for other, more beneficial purposes. One of the most difficult problem solving tasks on a computer is to figure out why an application crashed or started performing poorly and then started to perform well at a later time. Usually, by the time you get your other monitoring tools started up, the problem has already gone away or the system has restarted. If you have a good audit trail, you can look at what was happening on the system just before the crash or slow down to see what the possible causes of the problem were.

Let's start this discussion of auditing by exploring the three areas on which Windows NT auditing provides information:

Operating system events
System security events
Application events

The first area is the operating system events. Microsoft has trained Windows NT to write a number of useful bits of information that can be viewed using Event Viewer. This information includes the following types of system events:

Services that fail to start
Hardware conflicts detected
Start of key services, such as the Event Viewer
Print jobs completed
Anonymous login requests, such as those from the World Wide Web
Disks nearing or completely out of space
Access requests to the CD-ROM drive where there is no CD in the drive

The second area that NT monitors captures data related to system security events, such as the following:

Activation of Remote Access Server (RAS) processes
Use of special privileges by users
User login and logout
Failed logins
Access to certain objects (for example, the auditing setup)
Policy changes
System shutdown and restart
User and group management activities

The final, and most expandable area, of monitoring under Windows NT is the capability of capturing events from within applications on the system, including those of the BackOffice applications. The developers of these programs have to insert code to make their applications write event data to the log. The good news is that the developers of BackOffice have written a fair amount of this event logging code, and therefore you get a good picture of what is happening in your BackOffice applications when reviewing the Event Log. The following are some examples of audited application events:

Setting up BackOffice applications, such as Exchange Server
Starting up BackOffice applications
Events from your locally developed applications that might indicate that there are problems or suspicious activities going on

The one auditing tool that is not located within the Event Viewer is contained as a menu selection within the User Manager tool. Although the system and application logging is a pretty straightforward application, the security logging mechanisms need finer control to accommodate the wide variety of environments available. If you audited everything imaginable in security, your log file would be so large that you would have to write an application just to filter out the events that might possibly be of interest (which would not be all that difficult, actually). However, some groups that I work with do not even want to waste the disk space for security auditing because everyone has full access to all resources, and therefore these records have no meaning (small software development environments can be fun in this regard).

You can set security auditing policies from within User Manager using the Policies menu’s Audit option. The Audit option displays the Audit Policy dialog shown in Figure 4.13. The radio buttons are important, because they make the basic selection as to whether you perform any of these detailed security auditing functions. If you select to perform these security auditing actions, you are given two basic choices for auditing. The first column of checkboxes enables you to audit when users perform the functions listed successfully (such as logging on with the correct user ID and password). The second column of checkboxes enables you to audit when users fail when trying to perform a function. You can greatly reduce the number of events recorded if you look only for failures. This catches hackers or people trying to access a resource for which they lack permission. However, if you are trying to gather statistics on usage or logging all (legal) access to a resource, you use the first column's checkboxes. Of course, if you really want a lot of information, you can audit both successes and failures.

FIGURE 4.13. Audit Policy dialog box.

Splitting the security auditing features into these seven categories enables you to pick the items that meet your needs and ignore the others, which can also reduce the number of log file entries you have to review. The categories you have to choose from are as follows:

Logon and Logoff.
File and Object Access. This shows when a given user reads or changes a particular object (for example, read a file or print to a printer.) This can be very detailed, but it works only for NTFS file systems that have access rights (FAT is a pretty open storage mechanism designed for nonsecure DOS environments).
Use of User Rights. This refers to rights that have been granted to a user or group (for example, performing system administrative functions with a personal NT account).
User and Group Management. This catches when someone alters the privileges of another account or creates an account to perform some bad activities and then deletes it. It can also detect unauthorized password changes.
Security Policy Changes. This is another area where the whole security picture of your system can be changed, and therefore it is a common audit function.
Restart, Shutdown, and System. This can be useful for recording downtime and determining when serious problems such as power outages occurred.
Process Tracking. This captures some very detailed information, such as process activation and indirect object access.

Event Viewer

With all the general discussion of auditing out of the way, let's look at the tool that you use to get at that wealth of event information that NT is collecting for you. The first key concept to grasp with the Event Viewer is that this one interface displays all three types of Windows NT audit records: system events, security events, and application events. You select which of these event types is displayed using the Log menu. The Event Viewer remembers the type of log you were looking at the last time you used this utility and displays this same log type when you next start it. To determine which log is displayed at any given time, you simply look at the title. The title contains two useful bits of information. First is the type of auditing performed. The other is the machine whose audit log is being displayed. Once of the nice features for administrators who are in charge of multiple systems is the capability of selecting various Windows NT computers for which you want to display the audit log. Of course, you have to have the appropriate trust relationships established with these machines, but it is worth it if you have several systems to keep track of.

Although these columns provide useful summary information, they are not everything you would want or need to solve a really nasty problem. They are useful to scan through quickly to see whether you have any real problems or whether your log is just full of routine information messages and expected conditions. To get the full details about a particular event that catches your interest, double-click on that event. When you do, you see a detailed display dialog, as shown in Figure 4.14.

FIGURE 4.14. Event Detail dialog box.

This dialog echoes the information contained on the summary display, along with several other useful bits of information. The exact details depend somewhat on the event encountered. For example, in Figure 4.14 where I intentionally typed in a bad password, you can see a text description of the reason and even the logon authentication package used. Some events display data in the bottom text area that can give you clues as to the state and activities that were occurring to cause the event. I always like to screen-print messages that I might have to take action on (or copy them down if screen printing is too much of a problem).

The next log to review is the security log, which is shown in Figure 4.15. Again, you see the same basic columns as in the system log. You get detailed information about a particular event by double-clicking on it just as in the system log. The key differences for this log are the icons that you see at the left edge of the display, the use of the Category field, and the fact that the User column is often the most important bit of information displayed.

FIGURE 4.15. Security Log.

The two icons you typically see are a key (indicating success, which could be bad in the case of hackers accessing something they shouldn't have) and a lock (indicating that the system stopped a user from doing something). It is normal to see a few failures now and then (for example, I have so many passwords that I have trouble keeping them all straight). You are really looking for several failures occurring in a short period of time, which might indicate hacking. You can also scan through the Category list for events that are of interest to you (for example, logon/logoff events). Finally, the User column can help you zero in on the activities of a particular user that you might be watching.

One thing I considered a bit strange with the Event Viewer display was the fact that it does not enable you to sort on any of the categories in the multicolumn listbox display by simply clicking the column heading (such as Category). You can find this type of control on the My Computer utility on your desktop. This makes it really easy to sort the list by user, for example, and see who is being naughty and nice. You do have some of this functionality by using the Filter option on the View menu. This gives you a dialog that enables you to focus on a particular category, user, and so forth (see Figure 4.16). You have to be somewhat careful because you don't see events that don't match your input criteria but might be important to you (use the Clear button on the Filter dialog to view all events). This is an interesting technique to keep in your inventory when you have to audit a large number of events.

FIGURE 4.16. Event Viewer Filter dialog box.

The final log is the application log, shown in Figure 4.17. This is perhaps the least-exploited log in its basic form. About the only applications I have loaded that make good use of this log are the Microsoft BackOffice applications, and even they could make greater use of it. I like the concept of being able to record application events in one central point rather than dozens of different log files scattered all over the system. (I have had to deal with these files before, and they can be a bit of a pain.) The software development kits from Microsoft enable programmers to interface to these audit files. You might want to consider this if you have any input into the application development at your facility.

FIGURE 4.17. Event Viewer Application Log.

The best way to get used to these logs is to run your system for a while and review the events that get recorded. You should be especially sensitive to the icons displayed on the far left side of the list of events, because these help you sort out the problems from the routine items. The next issue you have to deal with is controlling the event log itself. The control over the size of the data and time it is retained is set using the Log Settings option of the Log menu (see Figure 4.18).

FIGURE 4.18. Event Log Settings dialog box.

You set each of the log record sizes separately as controlled by the drop-down listbox. First, you set the maximum size of the log. Basically, I accept the default and adjust it only if I am retaining information for too long or too short a period. The next feature you set here is the one that controls the Windows NT auditing log self-cleaning feature. I really like this because I have seen some huge log files that someone (like me) forgot to clean out. Your options are to overwrite as needed, to overwrite events only if they are more than a certain number of days old, or to never overwrite (to clear the log manually). Remember: you have to set this up for each of the three logs used by NT's auditing (system, security, and application).

One final set of controls you have in the Event Viewer tool is shown in Figure 4.19. You have already learned about the All Events and Filter Events options. This menu also gives you the option of displaying the events from earliest to latest or latest to earliest. The Find dialog looks similar to the Filter dialog, but instead of limiting what is displayed, it takes you to the records that meet your input criteria. The Detail option is similar to double-clicking on a particular event. Finally, Refresh updates the display with any new events.

FIGURE 4.19. Event Viewer's View menu.

As you can see, you can set up a lot with NT auditing and Event Viewer. The interface that controls and reviews the auditing information is also a clean and simple one. Your steps in setting up auditing on an NT Server should include the following:

Determine the factors that affect your auditing needs.
Develop an audit plan.
Use Event Viewer's Event Log Settings dialog to control the amount of information retained in each of the logs.
Use the User Manager's Audit Policy dialog to control which security auditing features are implemented on your system.
Make time to review the logs using Event Viewer (often the biggest challenge related to auditing).

The Server Control Panel Tool

So far, you have explored the traditional monitoring tools that most people would consider. There are graphs that show system utilization in Performance Monitor and all the traditional auditing in the Event Viewer. There is one more standard Windows NT tool. Under the Control Panel, there is a little tool titled Server (not to be confused with the little tool in Control Panel titled Services). Figure 4.20 shows the Server window.

FIGURE 4.20. The Server tool under Control Panel.

The Server tool is a place to check several parameters quickly that impact system operations and performance without setting up formal Performance Monitor counters. The items that you can use for monitoring are the following:

The number of users who currently have sessions on your server. You can click the Users button to get a list of these users. If you have an intermittent problem, it might be useful to see who is always logged on when the problem occurs. You can then explore what that user is doing that might be causing the problem. This sounds tedious, but it may be the only way to figure out those problems that crop up and then disappear before you have time to set up formal monitoring routines.
The number of files that are locked. Certain applications are very interested in maintaining data integrity. They therefore lock up their data files while a thread is in the process of updating that file. The problem is that other user applications that need to access those files might be stuck waiting for the first application to release the data file. This period of being stuck seems the same to an end-user as a grossly overloaded CPU—it is just time sitting there looking at the hourglass cursor.
The other indicators. These might be interesting in certain cases. You might want to explore them when you get some time.

Third-Party Monitoring Tools

The discussion so far emphasized the monitoring capabilities that are provided by Microsoft with Windows NT. As with almost everything in the computer industry, there are a number of third-party products that are also targeted at monitoring for the Windows NT environment. I am hesitant to take on a detailed discussion of these products for several reasons. First, the tools that I have described have the advantage of coming with Windows NT. You do not have to study the alternatives, purchase them separately, or install them. Second, it is difficult enough to keep up with all the revisions that are going on in the Windows NT/BackOffice environment without trying to track new features and releases of third-party monitoring tools. You should know that these tools exist, however and may be exactly what you want to meet some of your special needs.

Parameters that Can Be Monitored

Actually, a complete listing of all the counters that are associated with a given Windows NT/BackOffice system would probably fill an entire chapter—and it would probably go out of date fairly quickly as new revisions of NT and the BackOffice components are released. There is just no substitute for getting into Performance Monitor, Event Viewer, and the User Manager Audit Policies windows and trying out the various configurations. To pique your interest about Performance Monitor, here are the objects that I found on my Windows NT 4.0 server installation and the number of counters that are associated with these objects:

Browser (21)
Cache (27)
IP (17)
Logical Disk (21)
Memory (27)
NetBEUI (39)
NetBEUI Resource (3)
Network Interface (17)
NWLink IPX (39)
NWLink NetBIOS (39)
NWLink SPX (39)
Objects (Events, Mutexes, Processes, Sections, Semaphores, and Threads)
Paging File (2)
Physical Disk 19)
Process (18)
Processor (10)
RAS Port (17)
RAS Total (18)
Redirector (37)
Server (26)
Server Work Queue (17)
System (24)
TCP (9)
Thread (12)

As you can see, there are a large number of parameters that can be monitored. In reality, there are only a few that are important to most administrators. The short list of items to consider include the following:

Processor: percent processor time
Memory: pages/second
Physical disks: percent time
Physical disks: queue length
Server: bytes total/second

BackOffice Integration with NT Monitoring

Remember: the various BackOffice tools do not come with monitoring utilities. Instead, Microsoft has integrated the monitoring services of these applications with those of the host operating system, Windows NT. You use the standard Windows NT tools, such as Performance Monitor and Event Viewer, to check on the activity within your BackOffice applications. The one BackOffice component that is not tightly integrated into this architecture is sort of the grandfather of the BackOffice family—Microsoft Mail. However, its heir apparent, Exchange Server, is very tightly integrated into the Windows NT monitoring environment.

This tight integration gives you even more incentive to become familiar with the Windows NT monitoring tools. You have the advantage of having to learn only one set of tools to aid in your administration of a number of tools. The best time to become familiar with these tools and build settings files is when everything is running well. That way, all you have to do is call up your standard Exchange Server Performance Monitor settings file to see what might be going wrong.

Extending Windows NT Monitoring for Applications

You can interface your locally developed applications with the Windows NT monitoring environment. Microsoft has published the specifications for this interface, and a number of development tools vendors have build object classes, subroutines, and so forth that enable you to write records to the event log or set up a Performance Monitor counter. You should check with your development tool documentation to see what these classes, methods, and so forth are.

Summary

Windows NT provides a fairly robust set of monitoring utilities as part of the basic operating system itself. You can examine various performance factors and review logs of events that have occurred both within the operating system and your application software. BackOffice products are especially well-integrated into this environment and use standard Windows NT tools for monitoring purposes. You can even extend this monitoring environment to locally developed applications. This integration theme continues in the next chapter where you explore the Windows NT administrative environment.