HTML Check (Version 1.10 Nov 6, 1996)
Changes:
--------
1.10
- Added support to verify HTTP:// links and reports
1.03
- Fixed yet another memory problem
- Added support for BODY - BACKGROUND Tag
- Read REXX routines into memory for faster running
1.02 -
- Major Memory screw up!!! (Fixed)
1.01 -
- Fixed bad HTML check, won't Trap out on bad HTML
- Fixed the Physical path name on CGI-BIN files, and # references
- Added a buffer size 'MemAlloc=' in the INI file for increasing the
amount of memory available to the link list of Directories & HTML files
- Changed the Calls to the REXX function from MacroSpace to Disk call
(I couldn't get the MacroSpace cleaned up if there was an error in the
REXX macro)
- Display the Error messages to the Error list on the main display
- Removed the requirement of calling hcLoadVars. Do not call this
routine anymore; it will display an error if you do. Just delete the
call; all of the REXX variables will be defined automagically for you.
1.00 Initial release
Andy Wysocki
3109 Village Rd West
Norwood, MA 02062-2542
awysocki@bearsoft.com
http://www.bearsoft.com/abs/htmlchk.html
-or-
http://www.bearsoft.com/abs/abs_soft.html
HTMLChk was written because one day I found out that I was missing
a bunch of .GIF files from the HTTP server. I only came across it
when I looked at the ERROR.LOG file to see that the files were missing
for the past 10 days. The ERROR.LOG file would do a good enough job
for finding missing files, but when you move files or change HTML you
don't always want to BROWSE all the pages you just moved/changed. This
is where HTMLChk will do the work for you and produce a report with the
information you need.
I have added support to verify HTTP links in an HTML page. Just click
the check box on the screen and away you go.
I am open to all Suggestions and Ideas as to the functionality of
HTMLChk. So PLEASE feel free to submit your problems & suggestions.
REGISTRATION:
In version 1.01 there is no check for being a registered user. After
a couple of versions I will be putting in a REGISTERED CHECK!! So for now,
enjoy. To register it will cost $20.00 USD; make all checks payable to
AB Software
122 Richland Road
Norwood, MA 02062-5540
INSTALLATION:
To install HTMLChk you probably already UNZIPed the file into a directory
and are reading this README file. If you did it with the proper options
(no specific options for OS/2, '-do' for DOS) you should have two directories
that were created under the directory the files were UNZIPed into. They are
the MAC and RPT directories. Assuming you created a directory called
\HTMLCHK, the UNZIP should have created \HTMLCHK\MAC and \HTMLCHK\RPT.
If not, make the two directories under your directory and use 'MOVE *.MAC MAC'
to move the macro files into the MAC directory.
DESKTOP:
I have included a small REXX program to create a DESKTOP Object. You can
run the MAKEWPS.CMD file to create a desktop object. NOTE: You must run
this .CMD file from the directory where the files were installed, i.e. if
you installed the programs in the HTMLCHK directory, then you must be in
that directory to run the MAKEWPS.CMD file. After you create the object
I recommend that you open up the settings on the object and add the default
path to it.
cd \htmlchk
makewps
CUSTOMIZING:
There are a couple of files you can customize to make HTMLCHK work better
for you. The HTMLCHK.INI file and the .MAC files.
HTMLCHK.INI
Below I will describe the Keywords that can be defined in the HTMLCHK.INI
file. The keywords can be any case (mixed, lower, upper). All the
basic INI stuff is kept under the [HTMLCheck] heading. An example INI
file would be as follows, excluding the six -'s
------
[HTMLCheck]
Debug=Off
ServerURL=http://www.bearsoft.com
ServerRoot=d:\os2httpd
DocumentRoot=d:\os2httpd\docs
ReportDefault=1
ReportDesc1=Standard Report
ReportKey1=hcStandard
[hcStandard]
hcInit=mac\hcinit.mac
hcSSec=mac\hcssec.mac
hcLine=mac\hcline.mac
hcESec=mac\hcesec.mac
hcTerm=mac\hcTerm.mac
Report=rpt\htmlchk.rpt
------
Debug=
Can be set to ON or OFF. ON turns on debug tracing to the HTMLCHK.DBG
file. No real information is kept here so for normal runs keep this
set to OFF.
The default = OFF
ServerRoot=
This is the Drive and Subdirectory of the Server ROOT directory.
For OS2HTTPD this would be D:\OS2HTTPD (with the drive letter
changing for the drive it's on)
The default = C:\OS2HTTPD
ServerURL=
This is the URL for your server (root). It is used when parsing
the HTML: if a link matches it, HTMLChk will assume the file is LOCAL and
verify that it exists. Currently only 1 ServerURL is supported.
The default = HTTP://
DocumentRoot=
This is the drive and subdirectory where the documents start.
This is usually C:\OS2HTTPD\DOCS.
The default = C:\OS2HTTPD\DOCS
TopHTML=
This is the Top HTML file that the server will send to the client
when the server is hit from the top.
The default = INDEX.HTML
IndexTypes=
This is the list of valid HTML names if a client gives a URL of
just the directory. The server will normally look for INDEX.HTML
and send it back to the user. This list should contain the same
list that the server would use. The file names are separated by
commas and HAVE NO BLANKS between. Some other files could be
index.shtml,index.sht,index.htm,index.html
The default = index.html,index.htm
HTMLTypes=
This is the list of valid HTML extensions to determine what files
are to be treated as HTML files. This list should contain the
extensions of the HTML files separated by commas with no blanks
in between.
The default = .html,.htm
OffRoot=
This is the list of valid Server Root directories that the HTTP Server
can access.
The default = /ICONS/,/CGI-BIN/
IgnoreDir=
This is a list of Drive:Paths to ignore when gathering the information.
If you have a directory that you are working in and you don't want
to scan it, enter the Drive:Path of the directories you want to skip.
REMEMBER separate by commas and NO EXTRA BLANKS. Use IgnorePrint
if some of the files in a directory are accessed by HTML but you
don't want a report on the directory.
ie:IgnorePrint=C:\OS2HTTPD\DOCS\NCSA,C:\OS2HTTPD\DOCS\TEST
The default =
IgnorePrint=
This is the list of Drive:Paths to ignore when sending stuff to the
REXX Macros. If you have a directory that you are working in and
you don't want it to appear in the reports, enter the Drive:Path of
the directories you want to skip. REMEMBER separate by commas and
NO EXTRA BLANKS
ie:IgnorePrint=C:\OS2HTTPD\DOCS\NCSA,C:\OS2HTTPD\DOCS\TEST
The default =
Browser=
This is the editor or browser to call up when the Browse report
button is pressed on the main screen.
The default = c:\os2\e.exe
ReportDefault=
This is the default report to select when the program is first
run. It should be any number between 1 and the max number of
reports defined.
The default = 1
ReportDesc_=
This is the report description for report number '_' where the _
is any valid sequential number. The number must start with 1 and
proceed upward. With each ReportDesc_ defined you must supply a
ReportKey_ keyword too.
The default = (none set)
ReportKey_=
This is the Keyword that is associated with the report. When
HTMLChk runs the report it will use this KeyWord as the heading
to search for.
The default = (none set)
MemAlloc=
This is the number of 4K blocks of memory to allocate for the
directory tree. The default is 256 (1Meg). You should only have to
set this bigger if you have a big directory tree, or long
path names.
The default = 256
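The "commas and NO EXTRA BLANKS" rule for list keywords such as IgnoreDir and IgnorePrint can be illustrated with a small Python sketch. This is only a guess at the kind of matching involved, not HTMLChk's actual code: the function names split_list and is_ignored are made up for this example, and the case-insensitive prefix comparison is an assumption.

```python
import ntpath  # Windows/OS/2-style path handling, usable on any platform


def split_list(value):
    # Split on commas only. Blanks are deliberately NOT stripped, to show
    # why a stray blank in the INI value stops an entry from matching.
    return [item for item in value.split(",") if item]


def is_ignored(path, ignore_value):
    # Hypothetical check: a path is ignored when it is one of the listed
    # directories or lies underneath one (case-insensitive, an assumption).
    target = ntpath.normcase(ntpath.normpath(path))
    for entry in split_list(ignore_value):
        root = ntpath.normcase(ntpath.normpath(entry))
        if target == root or target.startswith(root + "\\"):
            return True
    return False


good = r"C:\OS2HTTPD\DOCS\NCSA,C:\OS2HTTPD\DOCS\TEST"
bad = r"C:\OS2HTTPD\DOCS\NCSA, C:\OS2HTTPD\DOCS\TEST"  # note the blank

print(is_ignored(r"C:\OS2HTTPD\DOCS\TEST\a.gif", good))
print(is_ignored(r"C:\OS2HTTPD\DOCS\TEST\a.gif", bad))
```

With the blank in place the second entry becomes " C:\OS2HTTPD\DOCS\TEST", which no real path starts with, so the directory is silently not skipped.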
----------------
These keywords have to be defined under a ReportKey HEADING (see sample)
None of these keys have a default!
hcInit=
The Macro/Rexx file called at the start of a report
hcSSec=
The Macro/Rexx file called at the beginning of a new directory
or the start of an HTML file.
hcLine=
The Macro/Rexx file called for EACH line to be processed.
hcESec=
The Macro/Rexx file called at the end of a directory
or the end of an HTML file.
hcTerm=
The Macro/Rexx file called at the end of a report
Report=
The report file to use for output
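To show how ReportDefault, ReportDesc_, ReportKey_ and the report heading fit together, here is a rough Python sketch that reads the sample INI from above and follows the chain from the default report number to its macro files. It uses Python's standard configparser module purely as an illustration; HTMLChk itself does not work this way as far as this document says, and the function name resolve_report is made up.

```python
import configparser

# The sample INI file shown earlier in this document.
SAMPLE_INI = r"""
[HTMLCheck]
Debug=Off
ServerURL=http://www.bearsoft.com
ServerRoot=d:\os2httpd
DocumentRoot=d:\os2httpd\docs
ReportDefault=1
ReportDesc1=Standard Report
ReportKey1=hcStandard

[hcStandard]
hcInit=mac\hcinit.mac
hcSSec=mac\hcssec.mac
hcLine=mac\hcline.mac
hcESec=mac\hcesec.mac
hcTerm=mac\hcTerm.mac
Report=rpt\htmlchk.rpt
"""


def resolve_report(ini_text, number=None):
    # interpolation=None keeps any '%' in a value literal.
    cfg = configparser.ConfigParser(interpolation=None)
    cfg.read_string(ini_text)
    main = cfg["HTMLCheck"]
    if number is None:
        # Fall back to the documented default of 1.
        number = int(main.get("ReportDefault", "1"))
    # ReportKey_ names the heading that holds the macro and report files.
    key = main["ReportKey%d" % number]
    return {
        "description": main["ReportDesc%d" % number],
        "key": key,
        # configparser lowercases keyword names, much like the INI keywords
        # here can be given in any case.
        "files": dict(cfg[key]),
    }


report = resolve_report(SAMPLE_INI)
print(report["description"], "->", report["files"]["hcline"])
```

Adding a second report would then mean adding ReportDesc2 and ReportKey2 under [HTMLCheck] plus a heading named after the ReportKey2 value.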
MACRO VARIABLES & FUNCTIONS:
I have included REXX as the program language of choice for building the
reports. There are 5 REXX/MAC macros that the program will call when
generating the reports. They are INIT, SSEC, LINE, ESEC and TERM.
See the INI file descriptions for when/where/why these macros are called.
This chart will show when these variables are valid and which REXX/MAC
macros can use them.
Init SSec Line ESec Term
---------------------------------------------------------------------
hcReportName X X
hcReportDescription X X
hcServerURL X X
hcServerRoot X X
hcDocumentRoot X X
hcTopHTML X X
hcDeep X X X
hcTotalBytes X
hclDoc X X
hclPhysical X X
hclFileSize X X
hclHTMLTag X X
hclFileType X X
hclMatched X X
hclLocalFound X X
hclServerRoot X X
hclOffSite X X
hclParent X X
hclAccessCount X X
hcpDoc X X
hcpPhysical X X
hcpFileSize X X
hcpHTMLTag X X
hcpFileType X X
hcpMatched X X
hcpLocalFound X X
hcpServerRoot X X
hcpOffSite X X
hcpAccessCount X X
REXX/MAC VARIABLE DESCRIPTIONS:
hcReportName
- The Report Name file as defined by the REPORT INI Variable
hcReportDescription
- The report description as defined by the ReportDesc_
INI variable
hcServerURL
- The Server URL as defined by the ServerURL INI Variable
hcServerRoot
- The Server Root Path as defined by the ServerRoot INI Variable
hcDocumentRoot
- The Document Root Path as defined by the DocumentRoot INI Variable
hcTopHTML
- The TOP HTML File name as defined by the TopHTML INI Variable
hcDeep
- This variable is incremented every time it traverses a new
Directory or HTML file. So looking at a PATH of
E:\DOCS\USERS\ANDY\INDEX.HTML, hcDeep would equal 3 for the
Directory and 4 when it was parsing the HTML file.
hcTotalBytes
- The total number of bytes used by the files in a directory.
hclDoc
- The File name in the directory OR the HTML Tag Source
hclPhysical
- The Physical file name of the URL
hclFileSize
- The size of the file, HTML tags will always be zero
hclHTMLTag
- The Type of HTML tag that was processed. Valid Values are
A, IMG, HTML, LINK, FORM, ISINDEX
hclFileType
- The Type of file that was processed. Valid Values are
HTML, DIRECTORY, OTHER, HTMLTAG
hclMatched
- Yes/No field to say if this file is accessed by an HTML tag.
hclLocalFound
- Yes/No field to say if the Physical file was found.
hclServerRoot
- Yes/No field to say this file is accessed from the Server Root
hclOffSite
- Yes/No field to say this HTML URL is referencing a file somewhere
other than this server.
hclParent
- Yes/No field to say if the 'hcp...' variables are valid. This will
be set to yes when inside an HTML file.
hclAccessCount
- The meaning of this number depends on the entry. For a flat file it
will always be 1. For a directory it should be the number of files in
the directory. For an HTML tag it will be the number of times inside
the parent document that this file is referenced.
--The rest of these variables are the same as the 'hcl...' variables except
they reference the parent document!
hcpDoc
hcpPhysical
hcpFileSize
hcpHTMLTag
hcpFileType
hcpMatched
hcpLocalFound
hcpServerRoot
hcpOffSite
hcpAccessCount
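The hcDeep example given earlier (E:\DOCS\USERS\ANDY\INDEX.HTML) can be reproduced with a few lines of Python. This is only a sketch of the counting as described in this document, assuming the depth is simply the number of path components after the drive letter; the function name hc_deep is made up and is not part of HTMLChk.

```python
import ntpath  # Windows/OS/2-style path handling, usable on any platform


def hc_deep(path):
    # Drop the drive letter, then count the non-empty path components.
    _drive, rest = ntpath.splitdrive(path)
    return len([part for part in rest.split("\\") if part])


print(hc_deep(r"E:\DOCS\USERS\ANDY"))             # the directory
print(hc_deep(r"E:\DOCS\USERS\ANDY\INDEX.HTML"))  # the HTML file inside it
```

A report macro could use a value like this to indent its output by nesting level.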
--- That's all folks --- End of Document ---