Support for VMS. Patches provided by Robin Garner.
A bug fix: the domain charts used to be based on all usages, regardless of settings for sites and documents to be ignored. Now they are based only on accesses included in the total access count. (Ignored sites and documents are not included. Hidden documents are included.)
Support for the PLEXUS server log format. Provided by Glenn Heinle. I have not tested this code myself; you can contact him at heinle@cmf.nrl.navy.mil if you have questions about the PLEXUS support.
Bug fix: you can now ignore/hide/etc. accesses of your root document by blocking accesses to "/".
Compatibility enhancement: since the CERN server uses "Welcome.html" and "welcome.html" in the way the NCSA server uses "index.html", all three are now aliased to "/" when used in the root directory.
Uppercase site names are now converted to lowercase before being analyzed.
Identity check information (person@) is now stripped off of site names before they are analyzed.
A bug fix for the double-reports that have appeared recently in 3.0.2. Hopefully. Hard to tell since I've seen them at my site but attempts to deliberately produce them fail. Let me know what happens with 3.1 if you've seen them with 3.0.2.
Compatibility fix: strdup() has been banished, replaced by a simple mystrdup() function. This will make those whose C libraries lack strdup() happy.
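For readers whose C library lacks strdup(), a replacement along these lines is all that is needed. This is a minimal sketch written for illustration, not the code shipped in wusage.c; only the name mystrdup() comes from the note above, and the NULL handling is an assumption.

```c
#include <stdlib.h>
#include <string.h>

/* Illustrative stand-in for strdup(); NOT copied from the wusage source.
   Returns a freshly malloc'd copy of s, or NULL if s is NULL or memory
   is exhausted. The caller is responsible for freeing the result. */
char *mystrdup(const char *s)
{
    size_t len;
    char *copy;

    if (s == NULL)
        return NULL;
    len = strlen(s) + 1;        /* include the terminating '\0' */
    copy = malloc(len);
    if (copy != NULL)
        memcpy(copy, s, len);
    return copy;
}
```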
I have received patches for VMS and am working on built-in MSDOS support, among other things, for the next release. My apologies to those who hoped to see them in 3.1.
A small fix: graphs weren't coming out right on the first pass, but were OK if wusage was run a second time. Fixed -- graphs now come out right the first time.
Just a small fix to version 3.0: graphs were displaying an extra week with zero accesses. Fixed up.
Support for the new common log file format. Those of you who have just upgraded to NCSA httpd 1.2 will note that wusage has stopped working. This is because the log file format has changed. The first line of your wusage.conf file now most likely reads either NCSA or CERN; change it to COMMON if you have the latest version of either server. Note that data in the old format can't be read once you do this, and will be ignored. If you've been running wusage all along, this is not a problem, since you already have reports for all previous weeks; you may need to fudge your results a bit for the one week during which you switched servers.
Please see configuring wusage for your server.
Incredibly dumb mistake on my part while creating 2.3 which led to problems even worse than those in 2.2 is now fixed. It's amazing how much trouble one line of code can cause. THIS VERSION WORKS, at least on the systems available to me for testing.
The bullet is no longer inside the anchor in the list of weeks, owing to a problem with at least one client that won't accept this (although I believe it's valid HTML).
Compatibility fixes for yet more compilers.
Upgrade Notes: ppmfig has not changed since version 2.0, but both wusage.c and usagegraph.c have changed in this version. So if you already have a working ppmfig, you can forgo rebuilding it. If you're having problems, though, be sure to try rebuilding ppmfig against an up-to-date pbmplus version.
NCSA_HTTPD
CERN_HTTPD
Version 2.1 now ignores white space at the end of entries in the exclusion lists; not strictly a bug in 2.0, but it saves a lot of grief.
Second, Version 2.0 supports the exclusion of unwanted accesses such as gif files, personal files, and other materials that distort the statistical picture of the server, in the opinion of the operator. This mechanism is entirely under the control of the operator-- no code changes are needed.
Third, a major bug resulting in incorrect top-ten lists when wusage attempted to take care of several unprocessed weeks in one pass was fixed.
Written by Thomas Boutell, 11/93 - 5/94.
The GIF code is based on that found in the pbmplus utilities, which in turn is based on GIFENCOD by David Rowley. See the notice below:
/*
** Based on GIFENCOD by David Rowley .A
** Lempel-Zim compression based on "compress".
**
** Modified by Marcel Wijkstra
**
** Copyright (C) 1989 by Jef Poskanzer.
**
** Permission to use, copy, modify, and distribute this software and its
** documentation for any purpose and without fee is hereby granted, provided
** that the above copyright notice appear in all copies and that both that
** copyright notice and this permission notice appear in supporting
** documentation. This software is provided "as is" without express or
** implied warranty.
**
** The Graphics Interchange Format(c) is the Copyright property of
** CompuServe Incorporated. GIF(sm) is a Service Mark property of
** CompuServe Incorporated.
*/
wusage maintains usage statistics for a WWW server. Specifically, it updates the following information, week by week:
To use wusage, you will need the following:
That's it! Previous versions required the presence of the pbmplus utilities and of a Unix shell. These requirements have been lifted. Version 3.2 should be a very easy (even trivial) port to MSDOS, including the GIF support routines. If you do this, please contact me so I can combine your code into the official package and make your binary available!
wusage is intended for use with the NCSA or CERN httpd servers, or with any server which produces the new "common logfile format". If you use a different server with a different access log file format, it will be necessary to patch the wusage.c source code appropriately, which should not be overly difficult. I will be glad to assist as best I can. Note that the author of your server should be using the new common log file format, so if they are not doing so I suggest you point this out to them.
You can fetch wusage as a compressed tar file here. Or you can FTP it directly from isis.cshl.org, in the subdirectory pub/wusage.
In order to build wusage, first untar the wusage.tar file with the following command:
uncompress wusage3.2.tar.Z
tar -xf wusage3.2.tar
This will create the directory "wusage3.2" beneath the current directory.
cd to this directory and examine the Makefile, which you may need to change slightly. Specifically, if you are using a different C compiler which is not named or aliased to "cc" (this is quite uncommon), change
CC=cc
to read
CC=acc
or to name another appropriate compiler.
If you are using the SGI C compiler, you will need to add "-cckr" to the CFLAGS line.
Now, to build the package, just type "make all". If all goes well, the program "wusage" will be compiled and linked without incident.
You have now built wusage. All that remains is to configure it for use with your server.
#Type of server log: COMMON (all new servers), NCSA_HTTPD, CERN_HTTPD,
#or PLEXUS_HTTPD.
#The latter three are for older versions of those servers; newer versions
# should use the COMMON log file format (but CHECK YOUR DOCUMENTATION).
COMMON
#Name of your server as it should be presented
Quest
#File to use as a prefix; MUST BE A COMPLETE FILE SYSTEM PATH. REALLY.
#NOT A URL.
/home/www/prefix
#File to use as a suffix; MUST BE A COMPLETE FILE SYSTEM PATH. REALLY.
#NOT A URL.
/home/www/suffix
#Directory where html pages generated by usage program should be located
/home/www/web/usage
#URL to which locations of html pages should be appended for usage reports
#(the same as the first line, but in web space, not filesystem space)
/usage
#Path of ncsa httpd log file
/home/www/ncsa/logs/access_log
#Your top-level domain name (org, edu, com... just the topmost level)
org
#Hidden items
{
}
#Ignored items
{
}
#Ignored sites
{
}
#Domain aliases or "none"
#none
{
    {
        aliasname
        domain
        domain
        domain
    }
    ... More aliases, if any ...
}
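In the sample above, every line beginning with # is a comment, and the settings are the remaining lines taken in order. A minimal sketch of reading such a file follows. The function name next_config_line() is hypothetical and this is not taken from wusage.c; in particular, skipping blank lines is an assumption made here for convenience.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not from the wusage source: fetches the next line
   of the configuration file that is neither blank nor a '#' comment.
   The line ending is stripped. Returns buf, or NULL at end of file. */
char *next_config_line(FILE *in, char *buf, size_t size)
{
    while (fgets(buf, (int) size, in) != NULL) {
        buf[strcspn(buf, "\r\n")] = '\0';   /* drop the line ending */
        if (buf[0] == '\0' || buf[0] == '#')
            continue;                        /* skip blanks and comments */
        return buf;
    }
    return NULL;
}
```

Called repeatedly on the sample file, this would yield COMMON, then Quest, then /home/www/prefix, and so on.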
The first non-comment line should read:
COMMON
NCSA_HTTPD
CERN_HTTPD
or
PLEXUS_HTTPD
as appropriate to your server's log file format. Note that the latest versions of CERN and NCSA servers produce the COMMON log file format, and setting this line to a different value won't work for those versions! UPPERCASE REQUIRED.
Note to those upgrading: once you switch to the COMMON log file setting, wusage can't read any data in the old format that may be lying around, but it can skip over it tactfully. The upshot of this is that if you've been running wusage all along, you'll simply be able to start using it again and will only need to adjust the results for the one week during which you made the changeover to a common-logfile-format server version.
For those using wusage for the first time, this is a thornier problem, but it can be handled with some ingenuity (by switching the setting of the first line after running wusage on the pre-common format part of your log, then deleting the older content). I encourage server authors (and anyone else for that matter!) to write a conversion filter to translate old-style log file formats to the new style. It shouldn't be very difficult. At worst, you'll have statistics only from the point at which you switched to a common-logfile-format server.
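For anyone writing such a filter, or patching wusage.c for another server, it may help to see the target format concretely. A common logfile format line consists of the fields host, ident, authuser, [date], "request", status and bytes, in that order. The sketch below pulls out the fields a usage report would care about. The structure and function names are hypothetical and the field sizes are arbitrary; this is not how wusage itself parses the log.

```c
#include <stdio.h>

/* Hypothetical record for one access; sizes are arbitrary. */
struct clf_entry {
    char host[256];
    char date[64];      /* text between '[' and ']' */
    char method[16];
    char path[1024];
    int status;
};

/* Parses one common-logfile-format line of the form
     host ident authuser [date] "request" status bytes
   Returns 1 on success, 0 if the line does not have that shape. */
int parse_clf_line(const char *line, struct clf_entry *e)
{
    char request[1300];
    int n;

    /* ident and authuser are read and discarded with %*s */
    n = sscanf(line, "%255s %*s %*s [%63[^]]] \"%1299[^\"]\" %d",
               e->host, e->date, request, &e->status);
    if (n != 4)
        return 0;
    /* The request is "METHOD path" optionally followed by a protocol. */
    if (sscanf(request, "%15s %1023s", e->method, e->path) != 2)
        return 0;
    return 1;
}
```

Reading the whole quoted request first and splitting it afterwards keeps the parser working for requests that carry no protocol version after the path.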
The second non-comment line should contain the name of your server as you would like it to be referred to in the usage page.
The third line should contain the full filesystem path (NOT URL) of a file you would like to have copied in at the beginning of each page generated by wusage, or the word
none
(in lowercase letters). You can use this mechanism to add a link up to your home page, or an illustration of your choice.
The fourth line is just like the third, but specifies a suffix file to be appended at the end of each page.
Sample prefix and suffix files are provided. Note the link to the wusage documentation in the suffix file. You are not required to keep this link, but we will greatly appreciate it if you do so. (Of course, if your site is strictly internal and behind a firewall, you should remove the link, since it won't work for your users.)
The fifth line contains the directory in your file system in which html pages generated by wusage should reside. This will usually be a subdirectory of your server root directory called "usage". (In our case, DOCUMENT_ROOT is /home/www/web, so the fifth line is /home/www/web/usage.)
IMPORTANT: this directory should not be shared with other information! Please give usage a subdirectory to itself, since it creates and deletes files fairly freely and assumes its directory is a safe place in which to do so.
The sixth line is the "base URL" for html pages generated by wusage. This is similar to the fifth line, but is the location in web space, not in filesystem space. Thus, if DOCUMENT_ROOT is /home/www/web and you set the fifth line to /home/www/web/usage, the sixth line should be set to just /usage.
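The relationship between the report directory and the base URL can be stated as a small function: strip DOCUMENT_ROOT from the front of the directory and what remains is the URL. The sketch below is illustrative only. base_url_for() is a hypothetical name, and it assumes the server maps DOCUMENT_ROOT directly to "/" with no aliases involved.

```c
#include <string.h>

/* Hypothetical helper: given the server's document root and the directory
   where reports are written, returns the web-space part of report_dir
   (a pointer into report_dir), or NULL if report_dir is not beneath
   doc_root. Assumes doc_root is served as "/". */
const char *base_url_for(const char *doc_root, const char *report_dir)
{
    size_t n = strlen(doc_root);

    /* tolerate a trailing slash on the document root */
    while (n > 1 && doc_root[n - 1] == '/')
        n--;
    if (strncmp(report_dir, doc_root, n) != 0)
        return NULL;
    if (report_dir[n] != '/')
        return NULL;            /* e.g. /home/www/webx is not under /home/www/web */
    return report_dir + n;
}
```

With the values used in this document, base_url_for("/home/www/web", "/home/www/web/usage") gives "/usage".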
The seventh line is the location of the NCSA server access_log file, which wusage needs to be able to read in order to compute statistics. This file is located in .../ncsa/logs; ... is the location at which you installed the server. In our case it is installed beneath /home/www.
The eighth line is the default domain, which should be the top-level domain in which your own server is located. For instance, if your server's name is siva.cshl.org, this line should read
org
This is new in version 3.1 and later.
#Hidden items
{
    *.gif
}
#Ignored items
{
    /~*
}
#Ignored sites
{
    www.ourcompany.com
    *.ourschool.edu
}
This mechanism makes it much easier to arrive at a meaningful top-ten list.
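The entries in these lists use * as a wildcard. As an illustration of how such patterns can be tested against a document path or site name, here is a minimal matcher in which * stands for any run of characters, including none. The name wild_match() is hypothetical, and the exact matching rules wusage applies may differ from this sketch.

```c
/* Hypothetical matcher, not from the wusage source. Returns 1 if text
   matches pattern, where '*' in pattern matches any run of characters
   (including an empty run) and every other character matches itself. */
int wild_match(const char *pattern, const char *text)
{
    if (*pattern == '\0')
        return *text == '\0';
    if (*pattern == '*') {
        /* let '*' absorb 0, 1, 2, ... characters of text in turn */
        do {
            if (wild_match(pattern + 1, text))
                return 1;
        } while (*text++ != '\0');
        return 0;
    }
    return *pattern == *text && wild_match(pattern + 1, text + 1);
}
```

Under these rules *.gif matches /images/logo.gif, /~* matches any path beginning with /~, and *.ourschool.edu matches lab.ourschool.edu.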
To make them more useful, it is possible to combine countries into continent domains.
The entire set of aliases is enclosed in a { ... } pair.
See the provided wusage.conf file for examples.
-c (location of wusage.conf file)
which specifies the location of the configuration file.
You can simply run wusage by hand with the -c option, for example:
wusage -c wusage.conf
You will need to do so once a week.
To run it from a cgi script, create a cgi script which executes the above command and echoes back a reasonable page to the user indicating success. (Since reports are weekly no matter how often the program is run, it is recommended that such a button be placed on a private page, since it has no dramatic effect and need not be run incessantly by users.)
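A cgi "script" of this kind can also be a small compiled program. The sketch below shows the idea: emit the CGI header, run the command, then report success or failure. It is an illustration under stated assumptions, not part of the wusage package; run_report_cgi() is a hypothetical name, and the wording of the page is invented.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical CGI helper, not part of wusage. Writes a CGI header and a
   short HTML page to out, running the given shell command in between.
   Returns whatever system() returned (0 normally means success). */
int run_report_cgi(const char *command, FILE *out)
{
    int status;

    /* the header and the blank line after it must precede the body */
    fputs("Content-type: text/html\n\n", out);
    fflush(out);

    status = system(command);

    fputs("<TITLE>Usage report</TITLE>\n", out);
    if (status == 0)
        fputs("<P>The usage report has been updated.</P>\n", out);
    else
        fputs("<P>The usage report could not be updated.</P>\n", out);
    return status;
}
```

A main() for the cgi program would simply call run_report_cgi() with the wusage command line shown above and stdout as the output stream. If the command prints anything to standard output, redirect it within the command string so that it does not end up in the page.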
In order to install wusage as a regularly-scheduled automatically-run program, you need to add it to your crontab file and submit it to the program "crontab". Our crontab file looks like this:
1 0 * * 0 /home/www/wusage -c /home/www/wusage.conf
... other jobs, if any ...
The crontab file is submitted to the Unix system with the following command, assuming it is called "crontab.txt":
crontab crontab.txt
Of course, if you run the www server as root, you no doubt already have a crontab file for root, to which you will want to add this line, following this with a reinstall using crontab. (We created a separate www account to facilitate this sort of thing; I recommend this strategy to other server administrators.)
wusage -c /home/www/wusage.conf
(Substitute the directory where wusage.conf resides on your system for /home/www in the above.)
Now, if all has gone well, edit your home page to include a link to the usage report. Here is the relevant excerpt from our home page:
<P>Usage of the Quest WWW server is kept track of through
<A HREF="http://www.nbs.gov/usage/index.html">
<IMG ALIGN=TOP SRC="/usage/usage.graph.small.gif"></A>
<A HREF="/usage/index.html">usage statistics</A>.</P>
In addition to obvious name changes, you may need to change the directory linked to if you did not use /usage in your configuration file.
Note that in addition to a normal text link, a small usage graph is provided as an icon. This graph is genuine-- it is updated at the same time as the larger graph on the main usage page!
Important note: if you do purge your access_log file, be sure to back up the directory in which wusage keeps its html pages. This directory contains important summary information for previous weeks, which wusage must have in order to graph information regarding past weeks no longer in the access_log file. Of course, you should also compress and back up your old access_log data.