Based on the previous sample program, the main program of the URL checker is pretty short:

    /* URLCHECK.CMD - IBM REXX Sample Program */
    Parse Arg URLList HTMLFile

    /* Load REXX Socket library if not already loaded */
    If RxFuncQuery("SockLoadFuncs") Then
      Do
        Call RxFuncAdd "SockLoadFuncs","RXSOCK","SockLoadFuncs"
        Call SockLoadFuncs
      End

    /* check all URLs in the specified file for expiration */
    URLS.0 = 0
    Changed = ""
    Unchanged = ""
    Commented = ""
    Call CheckURLs URLList
    Call WriteHTML HTMLFile, Changed, Unchanged, Commented
    Exit

The following variables are used in the main program:
Based on the information in the three index variables and the 'URLS' stem, the function 'WriteHTML' then writes an HTML file with links to all URLs from the input file, grouped by their status.
The following functions are reused from the previous sample and are not listed again:

    /********************************************************/
    /*                                                      */
    /* Procedure: CheckURLs                                 */
    /* Purpose:   Check the modification dates of all URLs  */
    /*            listed in the specified file. If the date */
    /*            has changed, update the list file with    */
    /*            the new date.                             */
    /* Arguments: URLFile - file containing URL list        */
    /* Returns:   nothing                                   */
    /*                                                      */
    /********************************************************/
    CheckURLs: Procedure Expose URLS. Changed Unchanged Commented
      Parse Arg URLFile

      Index = 0
      Do While Lines(URLFile)
        /* read line with URL and last modification date */
        URLLine = LineIn(URLFile)

        /* remember line for later update of file */
        Index = Index + 1
        URLS.0 = Index
        URLS.Index = URLLine

        /* if first character is not a "#" then process URL */
        If SubStr(URLLine, 1, 1) \= "#" Then
          Do
            /* retrieve header for specified URL */
            Parse Var URLLine URL ModDate
            Header = GetHeader(URL)
            If Length(Header) \= 0 Then
              Do
                /* header could be read, find date */
                DocDate = GetModificationDate(Header)
                If Length(ModDate) = 0 | ModDate \= DocDate Then
                  Do
                    /* this URL has been changed, add to list */
                    /* of changed URLs and update the date    */
                    Changed = Changed Index
                    URLS.Index = URL DocDate
                  End
                Else
                  /* add index to list of unchanged URLs */
                  Unchanged = Unchanged Index
              End
            Else
              /* header could not be read, treat URL as unchanged */
              Unchanged = Unchanged Index
          End
        Else
          /* add index to list of all commented out URLs */
          Commented = Commented Index
      End

      /* close input stream, erase it and then rewrite it */
      Call Stream URLFile, "C", "CLOSE"
      "@DEL" URLFile
      Do Index = 1 To URLS.0
        Call LineOut URLFile, URLS.Index
      End
      Call Stream URLFile, "C", "CLOSE"
      Return

After all documents have been checked, the result is formatted into an HTML file with links to the original documents. For details of HTML (HyperText Markup Language) see RFC 1866. The output file is created in the function 'WriteHTML'.
It deletes an already existing version of the output file, creates a simple header, formats the lists of changed, unchanged and commented documents, and finally closes the file with a simple trailer containing the current time:

    /********************************************************/
    /*                                                      */
    /* Procedure: WriteHTML                                 */
    /* Purpose:   Create a new HTML document with links to  */
    /*            the input URLs grouped by modification.   */
    /* Arguments: HTML      - output filename               */
    /*            Changed   - list of changed URL indices   */
    /*            Unchanged - list of unchanged URL indices */
    /*            Commented - list of commented URL indices */
    /* Returns:   nothing                                   */
    /*                                                      */
    /********************************************************/
    WriteHTML: Procedure Expose URLS.
      Parse Arg HTML, Changed, Unchanged, Commented

      /* write new HTML document with links to URLs */
      "@DEL" HTML "1>NUL 2>NUL"
      Call LineOut HTML, "<html><head>"
      Call LineOut HTML, "<title>My link list</title>"
      Call LineOut HTML, "</head><body>"
      Call LineOut HTML, "<h1>Changed documents</h1>"
      Call FormatURLList HTML, Changed
      Call LineOut HTML, "<h1>Unchanged documents</h1>"
      Call FormatURLList HTML, Unchanged
      Call LineOut HTML, "<h1>Commented documents</h1>"
      Call FormatURLList HTML, Commented
      Call LineOut HTML, "<p><i>Documents checked at",
                         Date() "on" Time() "</i>"
      Call LineOut HTML, "</body></html>"
      Return

The function 'FormatURLList' is used to format a single index list into the HTML output format with one URL per line. This version of the formatter simply creates a hyperlink to the document, lists the URL of the document, and separates the entries with line breaks. Another solution would be to format the URLs in an unordered list; see the HTML reference for more formatting options.

    /********************************************************/
    /*                                                      */
    /* Procedure: FormatURLList                             */
    /* Purpose:   Format a list of URL indices into a HTML  */
    /*            formatted list with links to the URLs.    */
    /* Arguments: HTML - output filename                    */
    /*            List - list of indices                    */
    /* Returns:   nothing                                   */
    /*                                                      */
    /********************************************************/
    FormatURLList: Procedure Expose URLS.
      Parse Arg HTML, List

      /* are there any indices in the list? */
      If Words(List) > 0 Then
        Do
          Do Index = 1 To Words(List)
            Idx = Word(List, Index)
            Parse Var URLS.Idx URL ModDate
            URL = Strip(URL, "L", "#")
            Call LineOut HTML, "<br><a href=""" || URL || """>"
            Call LineOut HTML, URL || "</a>"
            If Length(ModDate) > 0 Then
              Call LineOut HTML, ", last modified at" ModDate
          End
        End
      Else
        Call LineOut HTML, "<p><i>no documents in list</i><p>"
      Return

This is a sample input file for the URL checker:

    http://www.ibm.com
    http://www.myhost.mydomain/users/chris.html
    #http://www2.hursley.ibm.com/rexx/

After running the checker, the resulting HTML file could look like this:

    <html><head>
    <title>My link list</title>
    </head><body>
    <h1>Changed documents</h1>
    <p><i>no documents in list</i><p>
    <h1>Unchanged documents</h1>
    <br><a href="http://www.ibm.com">
    http://www.ibm.com</a>
    , last modified at THU, 18 JUL 1996 17:41:10 GMT
    <br><a href="http://www.myhost.mydomain/users/chris.html">
    http://www.myhost.mydomain/users/chris.html</a>
    , last modified at MONDAY, 22-JUL-96 19:51:25 GMT
    <h1>Commented documents</h1>
    <br><a href="http://www2.hursley.ibm.com/rexx/">
    http://www2.hursley.ibm.com/rexx/</a>
    <p><i>Documents checked at 24 Jul 1996 on 18:43:56 </i>
    </body></html>

Running the program every night or early morning on a server gives you a daily updated list of the documents you want to follow. By using the generated HTML file as the startup page in your web browser, you can access the changed documents directly via their links.
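The description of 'FormatURLList' mentions that the URLs could also be formatted as an unordered list. The following is only a sketch of how such a variant might look. The name 'FormatURLListUL' is hypothetical and not part of the sample; the routine assumes the same 'URLS' stem and the same arguments as 'FormatURLList'.

```rexx
/* Hypothetical variant of FormatURLList (not part of the   */
/* original sample): writes the URLs as an unordered list.  */
/* Arguments: HTML - output filename                        */
/*            List - list of indices into the URLS stem     */
FormatURLListUL: Procedure Expose URLS.
  Parse Arg HTML, List

  /* are there any indices in the list? */
  If Words(List) > 0 Then
    Do
      Call LineOut HTML, "<ul>"
      Do Index = 1 To Words(List)
        Idx = Word(List, Index)
        Parse Var URLS.Idx URL ModDate
        /* remove the comment marker of commented out URLs */
        URL = Strip(URL, "L", "#")
        /* build one list item containing the link */
        Item = "<li><a href=""" || URL || """>" || URL || "</a>"
        /* append the modification date if one is known */
        If Length(ModDate) > 0 Then
          Item = Item || ", last modified at" ModDate
        Call LineOut HTML, Item || "</li>"
      End
      Call LineOut HTML, "</ul>"
    End
  Else
    Call LineOut HTML, "<p><i>no documents in list</i></p>"
  Return
```

To try it, the three calls to 'FormatURLList' in 'WriteHTML' would be changed to call 'FormatURLListUL' instead; the rest of the program stays the same.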
The shown URL checker can be improved in many ways, e.g.: