WebUtil
User manual
Copyright © 1998 by Harms Software Engineering
All rights reserved.
Table of contents
1.1. Requirements
6.1. Web sites and web pages [HTTP]
The Internet has changed the way that we use computers. In fact, the Internet has changed the way we live. Because it makes information of all kinds available to anyone anywhere in the world, it is safe to say that we have even become dependent on the Internet. The Internet has also caused us to change. We have become more demanding, for example. Since we can get almost any information, we want to get it as fast as possible. Search engines such as Lycos and Alta Vista have become a commercial success as a result of our desire for information.
The Internet also has a down side. It is slow. Millions of people are experimenting with the Internet, and for many of those people it will become an important part of their lives. The large number of people using the Internet makes it a success but, at the same time, slows it down, making it sometimes frustrating to use. Just as search engines were an answer to finding information quickly, a new concept called Web spiders is an answer to retrieving information from the Internet automatically, so that you do not have to do it manually, thereby speeding up the use of the information.
A Web spider is a tool that collects the information that you are interested in from the Internet, at a time when it is most convenient for you. For example, a Web spider can pick up the news from an international news agency in the middle of the night, while you are sleeping, so that you can read the news when you wake up in the morning. Stock prices, weather forecasts, new software releases, you name it, a Web spider can get it for you, at a convenient time, and without manual intervention!
Just as the Internet has grown large and slow, most Web spiders are not cut out to be used without manual intervention. They are large, have thousands of features, and require a college degree to understand what everything means. If you are not familiar with the technical details of the Internet, you probably will not understand most of the options that they offer. The whole purpose of a Web spider is to make your life easier. It must be easy to understand, easy to install, easy to configure, and easy to use!
WebUtil is an answer to the large and slow Web spiders currently on the market. WebUtil has been designed to be easy to install, easy to configure, and very easy to use. It does not have many options, but it does have a lot of features! The power of WebUtil lies in the fact that it offers the functionality that you would expect without bothering you with the technical details of the Internet that you probably do not care to know about.
WebUtil requires only a few kilobytes of harddrive space, and it does not require much system memory or CPU time. In short, it is a very small and efficient program.
For example, if you instruct WebUtil to pick up an entire web site, then all of the links between the pages of the web site are translated in such a way that you can view the entire site on your local harddrive without having a connection to the Internet!
If you instruct WebUtil to pick up only a limited number of pages, then only the links to the downloaded pages will be translated. Later, when you view those pages with your favorite Internet browser and you select a link that was not translated, your browser will automatically try to retrieve that page from the Internet. This means that, as long as you are connected to the Internet when viewing your downloaded pages, all links will still point to valid pages, regardless of whether they were translated or not. If you do not have a connection to the Internet at that moment, then your browser will generate an error message saying that it cannot find the site.
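The link translation described above can be illustrated with a short Python sketch. It is not WebUtil's own code: the function name translate_links, the regular expression, and the mapping of downloaded addresses to local file names are assumptions made for this example. Links to pages that were downloaded are rewritten to the local copy, and all other links are left pointing at the Internet.

```python
import re

# Matches href="..." attributes; a simplification that ignores
# single-quoted and unquoted attribute values.
HREF = re.compile(r'href="([^"]+)"', re.IGNORECASE)


def translate_links(html, downloaded):
    """Rewrite links to downloaded pages so they point at local files.

    downloaded maps an original address to the local file name it was
    saved under (a hypothetical structure for this sketch).
    """
    def swap(match):
        local = downloaded.get(match.group(1))
        if local is None:
            return match.group(0)  # not downloaded: keep the Internet link
        return 'href="%s"' % local

    return HREF.sub(swap, html)
```

A page that links to both a downloaded and a non-downloaded page therefore ends up with one local link and one link that still needs an online connection.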
To sum up what we have just explained:
For files, this process is only a little different, in that you would not use your browser to view them.
Another very powerful feature of WebUtil is the ability to pick up web pages or files only if they have changed since the last download. WebUtil will determine whether the file has changed and, if it has, pick it up. If it has not changed, WebUtil will skip it.
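This manual does not describe how WebUtil detects a change. One common approach for web pages is to compare the date in the server's Last-Modified header with the time stamp of the local copy. The Python sketch below illustrates that idea only; the function name needs_download and the comparison rule are assumptions, not WebUtil's actual logic.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def needs_download(last_modified_header, local_mtime):
    """Return True when the remote copy should be fetched.

    last_modified_header: value of an HTTP Last-Modified header, or None.
    local_mtime: POSIX time stamp of the local copy, or None if missing.
    """
    if local_mtime is None or last_modified_header is None:
        # No local copy, or the server gave no date: fetch to be safe.
        return True
    remote = parsedate_to_datetime(last_modified_header)
    if remote.tzinfo is None:
        # Treat a date without a time zone as UTC.
        remote = remote.replace(tzinfo=timezone.utc)
    local = datetime.fromtimestamp(local_mtime, tz=timezone.utc)
    return remote > local
```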
For FTP sites, WebUtil is even more powerful! It can synchronize the contents of a directory on your local harddrive with a directory on an FTP site. You can specify which location is the "leading" location. For example, you may want the FTP site to contain exactly the same files as a directory on your local harddrive. In another situation, you may want exactly the opposite. WebUtil is extremely flexible. You can synchronize a specific set of files, entire directories, or a combination of the two!
Below is a summary of the features we have just explained in the above paragraphs. WebUtil can:
WebUtil consists of one executable and a configuration file. The configuration file contains the information that instructs WebUtil what to pick up from the Internet. WebUtil itself does not contain any buttons or any menu items. When started up, WebUtil will carry out the instructions in the configuration file and then shut down. Executing WebUtil, therefore, consists of simply entering the name WEBUTIL on the OS/2 commandline or double clicking the WebUtil icon on the desktop.
WebUtil has only one optional commandline parameter, which can be used to tell WebUtil to use a different configuration file. Normally, WebUtil will look for a file named WEBUTIL.INI. If a different name is specified on the commandline, WebUtil will use that one instead. Example usage:
WEBUTIL MYCONFIG.INI
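The rule for choosing the configuration file can be written down in a few lines of Python. This is only a sketch of the behaviour described above; the names config_file and DEFAULT_CONFIG are invented for the example.

```python
DEFAULT_CONFIG = "WEBUTIL.INI"


def config_file(argv):
    """Pick the configuration file from a command line such as sys.argv."""
    # argv[0] is the program name; argv[1], if present, names the file.
    return argv[1] if len(argv) > 1 else DEFAULT_CONFIG
```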
As we stated in chapter 1, WebUtil is easy to configure.
To install WebUtil, simply un-zip the WebUtil archive into a new directory.
Then type install.
The installation procedure will create a new folder on your desktop with the name WebUtil. This folder will contain icons for the program WEBUTIL.EXE, a sample configuration file WEBUTIL.INI, the help file WEBUTIL.HTM, and the registration file WEBUTIL.REG.
;
; Sample WEBUTIL.INI file.
;
; Copyright (C) 1998, Harms Software Engineering, all rights reserved.
;
[MAIN]
LOG=c:\webutil\webutil.log
KEY=unregistered
NAME=Harald Harms
[END]
;
[HTTP]
NAME=ALLFIX WebSite
URI=http://www.allfix.com
LEVEL=3
LOCATION=d:\webutil\allfix
IF_MOD=TRUE
[END]
;
[FTP]
NAME=ALLFIX FtpSite
URI=ftp.allfix.com
USER=harald
PASSWORD=test
TRANSFER=c:\files\myfiles.zip,/harald/,UP,IFMOD
[END]
;
As can be seen in the example above, each block begins with a name enclosed in square brackets and ends with the word END, also enclosed in square brackets. The block MAIN contains the general information. A block of the type HTTP contains instructions for which web sites or web pages need to be picked up, and a block of the type FTP contains instructions for which files need to be collected from or placed on an FTP site.
The instructions in the blocks consist of a verb followed by an equal sign which is in turn followed by a value. Some verbs are simple Yes/No items. In those situations, the value of the verb is either YES or NO, as can be seen in the HTTP block above (see verb IF_MOD).
Lines that start with a semicolon (;) are regarded as comments. This means that WebUtil will ignore those lines. We suggest that you include comments in your configuration file because it makes it easier for you, and for others, to understand what you have done.
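To make the file format concrete, here is a minimal Python sketch of a reader for it, assuming only the rules stated above: a block name in square brackets, verb=value lines, an [END] marker, and comment lines starting with a semicolon. The function name parse_config and the returned structure are choices made for this example; WebUtil's own parser may behave differently, for instance in how it treats upper and lower case.

```python
def parse_config(text):
    """Parse WebUtil-style configuration text into a list of blocks.

    Each block is returned as (block_type, entries), where entries is a
    list of (verb, value) pairs in file order.
    """
    blocks = []
    current = None  # (block_type, entries) while inside a block
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(";"):
            continue  # blank line or comment
        if line.startswith("[") and line.endswith("]"):
            name = line[1:-1].strip().upper()
            if name == "END":
                if current is not None:
                    blocks.append(current)
                current = None
            else:
                current = (name, [])
        elif current is not None and "=" in line:
            # Split on the first equal sign only, so values may contain "=".
            verb, value = line.split("=", 1)
            current[1].append((verb.strip().upper(), value.strip()))
    return blocks
```

A list of pairs is used instead of a dictionary so that a verb appearing more than once in a block is not lost.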
LOG
KEY
NAME
Web pages contain links to other pages on the Internet. The main page, often called INDEX.HTML, for example, may contain links to 10 other pages, which in turn contain links to many more pages. Each time a link is followed from one page to another, we say we have gone a level deeper. This means that if you follow a link in INDEX.HTML to PRODUCTS.HTML, and then to HELP.HTML, we would say that you are currently at level 3 in the web site.
Levels are very important because you can tell WebUtil how many levels to pick up. If you specify a level of 1, then only the page that you include in the HTTP block will be picked up. If you specify 5 levels, then WebUtil will follow each link, picking up the subsequent pages, until it has arrived at level 5.
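The effect of the level setting can be shown with a small Python sketch that walks a table of links instead of the real Internet. The function name pages_to_fetch and the links table are made up for this illustration, and the order in which WebUtil actually visits pages is not documented here.

```python
from collections import deque


def pages_to_fetch(start, links, level):
    """Return the pages reached from start within the given number of levels.

    links maps each page to the pages it links to. Level 1 means only the
    start page itself, matching the description above.
    """
    seen = {start}
    order = [start]
    queue = deque([(start, 1)])
    while queue:
        page, depth = queue.popleft()
        if depth >= level:
            continue  # do not follow links beyond the requested level
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                order.append(target)
                queue.append((target, depth + 1))
    return order
```

With INDEX.HTML linking to PRODUCTS.HTML, and PRODUCTS.HTML linking to HELP.HTML, a level of 3 is needed before HELP.HTML is included.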
The HTTP block can contain a number of different verbs. Below is a list of the verbs and an explanation of what they mean:
NAME
URI
LEVEL
LOCATION
IF_MOD
TRANSLATE
SHORTNAMES
When downloading pages, the directory where they are stored can become quite a mess. WebUtil will automatically clean up the directories each time it is started up, unless the IF_MOD feature has been turned on. If this feature is turned off, the directories will not be cleaned up.
The FTP block can contain a number of different verbs. Below is a list of the verbs and an explanation of what they mean:
NAME
URI
USER
PASSWORD
TRANSFER
[location][filespec],[location][filespec],UP|DOWN|SYNCH,IFMOD|DELETE
For example:
c:\files\myfiles.zip,/harald/,UP,IFMOD
or
/harald/myfiles.zip,c:\files\,DOWN,IFMOD
As you can see from the format shown above, the different parameters must be separated with a comma. The first two parameters are used to specify locations and files. A "filespec" is a filename, which may include wildcards. The keywords UP, DOWN, or SYNCH indicate the direction of the transfer, from your point of view. In other words, UP sends something to the FTP site and DOWN picks something up from the FTP site. The last parameter is also a keyword, which indicates whether files should only be transferred if they have been modified (IFMOD) or whether the files should be deleted after they have been transferred (DELETE). The last parameter may be left out, or both the IFMOD and the DELETE parameters may be used together, in which case they need to be separated with a pipe symbol (|).
The two parameters are a little confusing at first. The first parameter always indicates the location and filespec that an operation must be performed on. In the case of an upload (UP), it indicates the location and filespec that needs to be sent to the FTP site. In the case of a download (DOWN), it indicates the location and filespec on the FTP site that needs to be downloaded. The second parameter always specifies the destination where the transferred file should be placed. This can be either a directory on the FTP site or a directory on your local harddrive.
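The parameter layout can be summarized in a short Python sketch that splits a TRANSFER value into its parts. The function name parse_transfer is invented for this example, and the sketch assumes that file names contain no commas; it is not a description of how WebUtil itself reads the line.

```python
def parse_transfer(value):
    """Split a TRANSFER value into its parts.

    Returns (source, destination, direction, options), where options is a
    set drawn from {"IFMOD", "DELETE"} and may be empty.
    """
    parts = [p.strip() for p in value.split(",")]
    if len(parts) not in (3, 4):
        raise ValueError("expected 3 or 4 comma-separated parameters")
    source, destination, direction = parts[0], parts[1], parts[2].upper()
    if direction not in ("UP", "DOWN", "SYNCH"):
        raise ValueError("unknown direction: " + direction)
    options = set()
    if len(parts) == 4 and parts[3]:
        # IFMOD and DELETE may be combined with a pipe symbol.
        options = {opt.strip().upper() for opt in parts[3].split("|")}
        unknown = options - {"IFMOD", "DELETE"}
        if unknown:
            raise ValueError("unknown option(s): " + ", ".join(sorted(unknown)))
    return source, destination, direction, options
```

For an upload the first returned item is a local file and the second a directory on the FTP site; for a download it is the other way around.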
It becomes a little trickier when you want to use the synchronize feature. In that case, the first and second parameters may both contain file specs and directory names. There are four different combinations of synchronizing files that can be identified. These four are explained below:
1. Both directories (local and remote) must contain the same files. In this situation, both the first and the second parameters must indicate a directory, and should not contain any filespecs.
Example: c:\files\,/harald/,SYNCH
2. Certain files on the local harddrive need to be synchronized with the files on the FTP site. In this case, the first parameter should contain a location and filespec. The second parameter should contain only a location.
Example: c:\files\*.ZIP,/harald/,SYNCH
3. Certain files on the FTP site need to be synchronized with the files on the harddrive. In this case, the first parameter should only contain a location. The second parameter should contain location and filespec.
Example: c:\files\,/harald/*.ZIP,SYNCH
4. Certain files on the FTP site need to be synchronized with files on the local harddrive AND certain (other) files on the local harddrive need to be synchronized with the FTP site. In this case both the first and the second parameters should contain location and filespecs.
Example: c:\files\*.ZIP,/harald/*.ARJ,SYNCH
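The four combinations above differ only in which of the two parameters carries a filespec. The Python sketch below makes that explicit. It assumes, as in the examples, that a parameter naming only a directory ends with a backslash or a slash; the function names has_filespec and synch_case are hypothetical.

```python
def has_filespec(param):
    """True when the parameter names files, not just a directory.

    Assumption: a bare directory ends with a path separator, as in the
    examples above.
    """
    return not param.endswith(("\\", "/"))


def synch_case(first, second):
    """Return the number (1-4) of the synchronization case listed above."""
    local_files = has_filespec(first)
    remote_files = has_filespec(second)
    if not local_files and not remote_files:
        return 1  # whole directories on both sides
    if local_files and not remote_files:
        return 2  # selected local files, remote directory
    if not local_files and remote_files:
        return 3  # local directory, selected remote files
    return 4      # selected files on both sides
```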
Note:
6.1 Web sites and web pages [HTTP]
Unable to establish connection
Unable to find <name>
Unable to create file on local drive <name>
File not modified, skipping download
Unable to establish connection
User name incorrect
Password error
Unable to establish data channel
Unable to find file
Login unsuccessful
File not modified, skipping download
WebUtil has been released under the shareware concept. This means that you are allowed to use it for a maximum of 30 days. If you enjoy using WebUtil and you wish to continue using it, then you are required to register the program. By registering the program, you receive a registration key which will make all of the features available.
In the un-registered version of WebUtil, the following features have been disabled:
Upon registering WebUtil, the above features will become available to you.
A registration key is valid for the current release version and for the next two release versions. This means that if you register WebUtil version 1.00, you will also be able to use 1.10 and 1.20 (assuming that those are the next two versions that are released).
You can register WebUtil by filling out the electronic registration form on our Web site, www.allfix.com, or by completing the registration form (WEBUTIL.REG). Please consult the registration form for more information.
Abbreviation | Meaning
TCP/IP | Transmission Control Protocol/Internet Protocol
URI | Universal Resource Identifier
FTP | File Transfer Protocol
HTTP | Hyper Text Transfer Protocol
HTML | Hyper Text Markup Language