CHECKURL Manual

CHECKURL.REX - Program to Check URLS (Windows and OS/2)

This program's basic task is to check URLs in an attempt to eliminate 404 errors and similar problems from your web pages. It can be run under Windows 95/98/NT/W2K as well as OS/2 (rename it to ".CMD").

The macros in the ppwizard "VALRURL.H" header file are one way to create URL lists. Another is the Netscape "Create URL object" option (or dragging and dropping pages from Netscape). When used with the ppwizard header, the location in your source is identified (not the location in the target HTML).

Syntax - Windows

    regina.exe CHECKURL.REX  Command  [CommandsParameters] [Options]
    

Use under Windows requires the free Regina REXX interpreter as well as the contents of the enclosed (also free) "W32RXSCK.ZIP" for socket support (unzip its contents into the current or Windows directory).

Syntax - OS/2

    CHECKURL.CMD  Command  [CommandsParameters] [Options]
    

The native interpreter is used under OS/2; the only requirement is that you rename "CHECKURL.REX" to "CHECKURL.CMD".

CHECKURL Commands

This program supports a large number of commands and options, most of which you will not need. You may wish to look at the example in the next section, which is the batch file I used when checking all my URLs.

Options may contain text of the form "{x??}", where "??" is the hexadecimal value of the character you wish substituted. You will not need to do this very often, but when you do you will probably want "{x20}" for the space character.
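For example, to supply a value containing spaces (the value shown here is made up) on the "/PageMoved" switch described later:

    /PageMoved:Page{x20}has{x20}moved

Each "{x20}" is replaced with a space before the value is used.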

The program takes a number of commands as follows:

  1. VERSION?
    The program will return (not displayed) its version number.

  2. SOCKETVERSION?
    The program will return (not displayed) the version of "RxSock.DLL" in use (for "http" checks).

  3. FTPVERSION?
    The program will return (not displayed) the version of "RxFtp.DLL" in use (for "ftp" checks).

  4. SOCKETREADY?
    The program will return (not displayed) "OK" if "http" checking is possible or a reason why not.

  5. FTPREADY?
    The program will return (not displayed) "OK" if "ftp" checking is possible or a reason why not.

  6. CHECK1URL OneUrl
    The program will return (not displayed) "OK" if the URL (which must begin with "http://" or "ftp://") can be accessed, or a reason why it couldn't.

  7. CHECKLISTEDURLS [+]UrlFileMask1 ...
    The program will read the file or files specified and sort the URLs listed. If a file mask is preceded by "+" then all subdirectories below the one specified are also processed.

    A file whose name ends in '.url' and whose first line starts with '[' is considered to be a Windows URL shortcut; the URL on the line that begins with 'URL=' is processed and all other lines are ignored.

    This command also allows OS/2 WPS URL objects to be tested; just open up an object's properties to determine the path.

    Within each file, blank lines and lines that begin with ';' are ignored, as are leading and trailing whitespace on all lines and duplicate URLs (see the example after this list).

    The progress of URL checking is displayed on screen and the return code is the total number of failing URLs.

    It is highly recommended that you use the "/CheckDays" switch even if you give it a value of 1 (recheck every day).

  8. CHECKURLSINHTML [+]HtmlFileMask1 ...
    The program will read the file or files specified and scan them for URLs; otherwise it functions much like the "CHECKLISTEDURLS" command.
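As an illustration, a URL list file for the "CHECKLISTEDURLS" command (the file name "MYURLS.LST" and the URLs below are made up) might look like this:

    ;My URL list - comment lines start with ';'
    http://www.example.com/index.html
    ftp://ftp.example.com/pub/readme.txt

Under Windows it could then be checked with something like:

    regina.exe CHECKURL.REX CHECKLISTEDURLS MYURLS.LST /MemoryFile:MYURLS.MEM /CheckDays:14-21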

CHECKURL Options

  1. /MemoryFile[:NameOfFile]
    A memory file is used to hold details about checked URLs. It becomes really useful if the "/CheckDays" option is also used as this program will then not retest all URLs but can be more selective (for example not retest URLs you only checked yesterday).

  2. /CheckDays[:OkUrlMinAge[-OkUrlMaxAge]]
    This determines how long ago is too long since the last successful check of a URL. For example, setting this to 14-14 means that a URL which tested OK today will not be retested for 2 weeks. You must have used the "/MemoryFile" switch to specify a file.

    If instead you had said "14-21", then for each URL a random day between 14 and 21 would be chosen. This allows the URL checking to spread out gradually, so that you are not doing the full check every two weeks.

    Note that once a URL has failed, this check is bypassed; the next time you run this program it will always retest URLs that failed on the previous run.

  3. /Exclude[:[+]UrlFileMask]
    This switch is only useful when the CHECKLISTEDURLS command is also used. You can use this switch as many times as you wish, and order can be important (when input file masks are mixed with exclude masks).

  4. /ForgetDays[:NumberOfDays]
    By default old URLs are dropped from memory if not referred to for a significant period of time. This option can be used to specify how old a URL may become, or to turn off the dropping altogether (by not specifying a value).

  5. /ReadTimeOut[:NumberOfSeconds]
    Allows you to specify the maximum number of seconds to wait for a server to respond to a request. A blank value restores the default.

  6. /TimeOutRetry[:NumberOfSeconds]
    If a URL check fails with a timeout, you may specify that these URLs be retried after URL checking has completed. A value of 0 means no retry is wished; otherwise the value is the timeout in seconds to be used for the retry. A blank value restores the default. It may be wise to use a value larger than that given on the "/ReadTimeOut" switch.

  7. /GetEnv:NameOfEnvVar
    Allows you to pick up options from an environment variable.

  8. /ErrorFile[:[+]NameOfFile]
    Create a new error file or append to an old one. This file will hold the complete list of failing URLs, along with the reason for each failure, in a format that this program can accept as input (rename the file first!). This is probably less useful if "/CheckDays" is specified.

  9. /FtpEmail:YourEmailAddress
    When checking FTP addresses this value is used as the password for the "Anonymous" user; if not supplied a default (obviously incorrect) value is used.

  10. /TestUrl[:Url]
    I don't know of any way to tell whether the network is available. By default a known internet URL is checked; as this URL is expected to exist, finding it means you must have access to the internet. You can either change the URL (if testing intranet URLs etc.) or turn the test off altogether.

  11. /MemoryBackupLevel[:Level]
    This determines how many memory file backups are kept. A value of 0 turns off backups otherwise a value of 1 to 9 is required.

  12. /OkResponses:OkResponseList
    Some pages/servers return annoying return codes which you know are "OK"; I know of a page that returns "500" and another that actually returns "404". The most common return codes you would want ignored are "301" and "302".

    The parameter on this switch is the name of a file. Blank lines or lines starting with ';' are ignored, and all leading and trailing whitespace is removed. Each remaining line has two or three parameters separated by one or more spaces.

    There are a number of different command types; the "IGNORE" command mentioned under the "/IgnoreFor" switch below is one of them.

  13. /IgnoreFor:NumberOfDays
    This switch will output an "IGNORE" command into the error file suitable for cut+paste into a "/OkResponses" file. You specify the period of time (in days) for which you would like to ignore this problem. Quite frequently problems will come and go; where you feel this is a possibility you can simply cut+paste the text rather than being forced to enter the command yourself...

  14. /PageMoved:PageMovedText
    Allows you to choose some text which from experience you know is included in pages that have moved. Some sites will not return server 301/302 codes to indicate that a page has been moved. Note that you will need to use "{x20}" to represent spaces.

  15. /MaxPageLng:NumberOfCharacters
    This program will keep reading HTML up until the page becomes longer than the value specified here. Note that making this value too small can affect the detection of page movement where the server does NOT return 301 or 302 response codes.

  16. /CheckPoint:HowOften
    How often (after how many URLs) a checkpoint to the "memory file" is made. If you test 200 URLs you don't want to start from scratch if something "happens".

  17. /HttpUserAgent[:SimulateWhichBrowser]
    This allows you to make the checking program look like a specific browser to the server. The server will see this information as "HTTP_USER_AGENT".
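Putting several options together, a typical invocation under Windows (the file names and masks shown are made up) might be:

    regina.exe CHECKURL.REX CHECKLISTEDURLS +D:\URLS\*.URL /Exclude:+D:\URLS\OLD\*.URL /MemoryFile:CHECKURL.MEM /CheckDays:14-21 /ErrorFile:FAILED.LST

Since the return code is the number of failing URLs, a calling batch file can test it and, for example, only examine "FAILED.LST" when it is non-zero.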

CHECKURL Environment Variables

  1. CHECKURL_DEBUG=[+]NameOfFile
    Create a new debug file or append to an old one. This file will hold much more detail about the program's internal workings, and will slow down URL checking. If reporting problems, please send me the debug output.

  2. CHECKURL_OPTIONS=Options
    This allows you to specify command line options in the environment.

When a filename is required it may contain the text "{Date}", "{DateNumbers}" or "{Time}"; these are replaced with "YYMonDD", "yyyymmdd" or "hhmmss" respectively.
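For example (made-up file names again), the options could be placed into the environment and a dated error file produced:

    SET CHECKURL_OPTIONS=/MemoryFile:CHECKURL.MEM /CheckDays:14-21
    regina.exe CHECKURL.REX CHECKLISTEDURLS MYURLS.LST /ErrorFile:ERRORS-{DateNumbers}.TXT

A run on 5 June 2000 would then write its failures to "ERRORS-20000605.TXT".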

