README.UNH (from unh222.zip, 1999-05-23)

UNH is an OS/2 command line utility that strips HTML codes from
files saved from WebX or other web browsers. If it is executed
without any arguments, the following message is displayed:

    UNH 2.22 HTML stripper
    by Don Hawkinson, author of CCA, DH-Grep-PM,
    PMStripper, Pastry Box, and DH_ClipSave/2
    http://www2.southwind.net/~dwhawk
    dwhawk@southwind.net

    usage: unh file1 file2 <file3>
      file1 == html file
      file2 == stripped text output file
      file3 == URLs from html source file - optional

UNH does not check for the existence of the output file, and will
overwrite any existing file. UNH is HPFS aware.
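
Because UNH overwrites its output files without asking, a small wrapper can
check for them first. The following is only a sketch, assuming Python is
available and that unh is on the PATH. The function names and the runner
parameter are hypothetical and are not part of UNH.

```python
import os
import subprocess


def build_unh_command(html_file, text_file, url_file=None):
    """Build the argument list for: unh file1 file2 <file3>."""
    cmd = ["unh", html_file, text_file]
    if url_file is not None:
        # file3 (the URL list) is optional, as in the usage message.
        cmd.append(url_file)
    return cmd


def run_unh_safely(html_file, text_file, url_file=None, runner=subprocess.run):
    """Run UNH only if none of its output files already exist.

    Assumption: 'unh' can be found on the PATH. The runner parameter
    exists only so the call can be replaced when testing.
    """
    for path in (text_file, url_file):
        if path is not None and os.path.exists(path):
            raise FileExistsError(f"refusing to overwrite {path}")
    return runner(build_unh_command(html_file, text_file, url_file), check=True)
```

For example, run_unh_safely("page.htm", "page.txt", "urls.txt") raises
FileExistsError instead of starting UNH if page.txt or urls.txt is already
present.
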
UNH does not attempt to recreate the format of the Web page. UNH does
not attempt to force any format on the output text, nor does it attempt
to remove any existing text format. Although the layout of tables and
lists is lost during stripping, their data is placed on separate lines
for legibility.
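
The general idea of dropping tags while keeping table and list data on
separate lines can be sketched with Python's standard html.parser module.
This is not UNH's or PMStripper's actual logic, which this file does not
describe. The BLOCK_TAGS set and the names TagStripper and strip_html are
assumptions made for illustration, and unlike UNH this sketch also trims
surrounding whitespace and drops empty lines.

```python
from html.parser import HTMLParser

# Assumption: tags whose content should start on a new output line.
BLOCK_TAGS = {"p", "br", "li", "tr", "td", "th", "div",
              "h1", "h2", "h3", "h4", "h5", "h6"}


class TagStripper(HTMLParser):
    """Collect text content and discard the tags themselves."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # Start a new line for table cells, list items and similar tags.
        if tag in BLOCK_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        self.parts.append(data)


def strip_html(html):
    """Return the text of an HTML string, one non-empty line per item."""
    parser = TagStripper()
    parser.feed(html)
    parser.close()
    lines = (line.strip() for line in "".join(parser.parts).splitlines())
    return "\n".join(line for line in lines if line)
```

With this sketch, strip_html("<ul><li>one</li><li>two</li></ul>") returns the
two items on separate lines. The contents of script and style elements are
not filtered out.
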
UNH has a filter which translates any embedded NULL characters
to spaces. I have no idea why anyone would use NULL characters
on a web page, but I have encountered at least one Web site that
has done this.
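
A filter of this kind can be sketched in a single Python function. The name
nul_to_space is hypothetical, and this shows only the general technique,
not UNH's own code.

```python
def nul_to_space(data: bytes) -> bytes:
    """Replace every NUL byte (0x00) with an ASCII space.

    The length of the data is unchanged, since each NUL byte is
    replaced by exactly one space.
    """
    return data.replace(b"\x00", b" ")
```

Applied to the bytes of a saved page before stripping, it leaves all other
characters untouched.
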
This program is free, but the author retains all rights. See the file
license.txt for further information.
The command line utility UNH.EXE uses the same logic as PMStripper
to strip the HTML codes from files.