- DGARBAGE VERSION 1.1
- --------------------
-
-
- DISCLAIMER
- ----------
-
- This program is provided as is and I accept no responsibility
- for any loss or damage of any files or data that may be
- caused by using this program. I have performed fairly
- extensive testing of the program, but it is not possible for
- me to exhaustively test every potential situation. If the
- data in the file you wish to process is important to you then
- make a copy of the file and process that. I have never had
- any problems with the input file during my testing, but
- anything is possible.
-
- Once again, I AM NOT RESPONSIBLE IN ANY WAY FOR ANYTHING.
- (Or as my mother always says, I am totally irresponsible).
-
-
-
-
- INTRODUCTION
- ------------
-
- This utility program removes all non-printable characters
- from an ASCII text file and restricts the line length to a
- specified maximum number of characters. It was designed to
- remove the "line noise" characters from capture files made by
- a communications program on line to a bulletin board.
-
- The text is read from a file specified on the command line
- and the resulting text after processing is output to a second
- file, the name of which is also specified on the command
- line. The original file is left untouched which allows
- multiple processing with different options set until the
- desired result is achieved.
-
- All characters with an ASCII value less than 32, excepting
- ASCII 10 (Line Feed) and ASCII 13 (Carriage Return) are
- removed from the text. By default, all characters with an
- ASCII value of greater than 127 are also removed, although
- this can be changed by command line options so that either
- the character is converted to a seven bit character by
- stripping the eighth bit, or is left entirely unchanged.
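-
- The filtering rules above can be sketched in Python as follows.
- This is only an illustration, not the program's actual code;
- the function name, the "high" parameter and its three mode
- names are assumptions made for the example. It also assumes
- that a character which falls below 32 after its eighth bit is
- stripped is removed like any other control character, which
- the text above does not state.

```python
CR, LF = 13, 10


def clean_bytes(data: bytes, high: str = "remove") -> bytes:
    """Filter raw capture bytes.

    high selects the treatment of values above 127 (hypothetical names):
    "remove" (the default), "keep" (K option) or "strip" (S option).
    """
    if high not in ("remove", "keep", "strip"):
        raise ValueError(f"unknown mode: {high!r}")
    out = bytearray()
    for b in data:
        if b > 127:
            if high == "remove":
                continue
            if high == "strip":
                b &= 0x7F  # clear the eighth bit
        # Values below 32 are dropped unless they are CR or LF.
        # Assumption: this also applies to a value produced by stripping.
        if b < 32 and b not in (CR, LF):
            continue
        out.append(b)
    return bytes(out)
```

- For example, clean_bytes(b"Hi\x07\r\n\xcd!") drops the bell
- character and the value 205 but keeps the CR/LF pair.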
-
-
-
-
- USAGE
- -----
-
-
- The syntax for the command is:-
-
- DGARBAGE infile outfile (options)
-
- Infile and outfile MUST be specified - they are,
- respectively, the file containing the raw text to be
- processed and the file that will hold the text after
- processing.
-
- Infile must exist and must either be present in the current
- directory or must have an appropriate path prefixed to its
- name; e.g.:-
-
- DGARBAGE file1 file2
- (if the file is in the current directory)
-
- DGARBAGE \dir1\dir2\file1 file2
- (if the file is in the directory \dir1\dir2).
-
- Outfile need not exist and will be created if necessary. If
- it DOES exist you will be prompted as to whether or not you
- wish to overwrite it. For example if "file2" already exists
- and you enter the following command (in the root directory):-
-
- DGARBAGE file1 file2
-
- then the following prompt will be displayed:-
-
- File C:\FILE2 Already Exists
- Overwrite it (Y/N) ?
-
- If you then press Y, the file will be overwritten and its
- current contents will be lost. However, if you press N then
- the program will terminate.
-
- If fewer than two parameters are specified after DGARBAGE, or
- if the same file name is given for infile and outfile, the
- program will display its help screen and terminate. If
- infile does not exist, or if outfile cannot be opened (e.g.
- if it is set to READ ONLY), then an appropriate error message
- will be displayed and the program will terminate.
-
-
-
-
- OPTIONS
- -------
-
- The options listed below may be entered in any order,
- separated by a space. They may be entered in upper or lower
- case. Note the comments for each option on the restrictions
- (if any) in using that option with other options.
-
-
-
- B Use BIOS screen writing method.
-
- The default is to use direct screen memory writing,
- but this method doesn't work on all types of video
- adapter. If you have problems with the display, then
- use the B option.
-
-
-
- D Display the text on the screen as it is processed.
-
- The default is that the text is not displayed as it is
- processed, just an information box at the top of the
- screen which shows the progress of the processing. This
- is the fastest method, allowing approximately 9000
- characters to be processed per second. However, because
- the program uses very large buffers (30k for infile and
- 31k for outfile) a seemingly long time can pass between
- updates to the information box, which only take place
- as each 30k block of text is read from disk. Therefore
- the D option has been included for those people who
- like to be sure that something is happening.
-
- The effect of the D option depends on whether or not
- the B option is also specified. If the B option is not
- specified then the text is displayed in a window at the
- bottom of the screen using the fast screen write
- method. This slows the processing speed down to
- approximately 1200 characters per second. If the B
- option is specified in addition to the D option then
- the text is displayed using the entire screen (the
- information box is not displayed) and the processing
- speed slows down to approximately 300 characters per
- second.
-
-
- K/S The K and S options control what action the program
- takes when it encounters a character with the 8th bit
- set - i.e. a character with an ASCII value greater than
- 127. By default, these characters are removed from the
- text stream entirely and do not appear in outfile. This
- was done because most "line noise" characters appear to
- be in this range. However, these characters are also
- the IBM graphics characters, so there may be some
- occasions when it is not desirable to remove them.
-
- The K option tells the program to pass these characters
- through unchanged (Keep them) so they will appear in
- outfile exactly as they appeared in infile.
-
- The S option tells the program to strip the 8th bit
- from these characters converting them to a character in
- the range 0-127. For example, the IBM horizontal double
- line character (═) which has an ASCII value of 205 will
- be converted to a character with an ASCII value of 77
- (205 - 128), which is the letter M.
-
- These options only control what processing (if any) is
- performed on characters with an ASCII value greater
- than 127. Characters with an ASCII value less than 32
- (excluding CR/LF) are ALWAYS removed from the text
- stream.
-
- Note that the K and S options are mutually exclusive,
- so if both options are specified on the command line
- the program will display the help screen and terminate.
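-
- As a worked example of the S option's arithmetic, the short
- Python sketch below clears the eighth bit of two IBM line
- drawing characters. The function name is an assumption made
- for the example; it is not taken from the program.

```python
def strip_eighth_bit(code: int) -> int:
    """Clear bit 7, mapping the range 128-255 onto 0-127."""
    return code & 0x7F


# 205 is the double horizontal line, 196 the single horizontal line.
for code in (205, 196):
    low = strip_eighth_bit(code)
    print(code, "->", low, chr(low))
```

- This prints "205 -> 77 M" and "196 -> 68 D", so a stripped
- box border turns into a run of letters.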
-
-
- Lnnn The L option controls the number of characters allowed
- between successive Carriage Return/Line Feeds (CR/LF).
-
- The default line length is 72 characters as this is the
- standard length used by most bulletin board message
- systems. If it is desired to have a different line
- length, the required value should be specified after
- the L, without any intervening spaces. For example L80
- will allow a line length of 80 characters before a new
- line is forced.
-
- Note that this option only controls the MAXIMUM number
- of characters allowed between CR/LFs; it does NOT
- expand short lines to the specified length. If a CR/LF
- is encountered in infile before the specified line
- length is reached, the counter is reset to 0 and
- counting begins again. Only if more characters than the
- specified line length have been read from infile
- without encountering a CR/LF is one inserted by the
- program.
-
- The acceptable range for this option is 1 - 255. Any
- values outside this range will cause the program to
- abort with an "Out of Range" error message.
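-
- The counting rule described above can be sketched in Python
- as follows. This is an illustration under stated assumptions,
- not the program's own code: the function name is invented,
- and either a CR or an LF on its own is assumed to reset the
- counter.

```python
def limit_line_length(data: bytes, max_len: int = 72) -> bytes:
    """Insert CR/LF wherever more than max_len characters occur in a row."""
    out = bytearray()
    count = 0  # characters seen since the last CR or LF
    for b in data:
        if b in (13, 10):
            # An existing line break resets the counter (assumed for both
            # CR and LF individually).
            count = 0
            out.append(b)
            continue
        if count == max_len:
            # The line is already full, so force a break before this byte.
            out += b"\r\n"
            count = 0
        out.append(b)
        count += 1
    return bytes(out)
```

- With a maximum of 4, the input "abcdef" followed by CR/LF
- becomes "abcd", CR/LF, "ef", CR/LF. Lines that are already
- short enough pass through unchanged.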
-
-
- The above parameters are the only ones currently recognised
- by the program. Any other characters entered on the command
- line will be ignored.
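-
- The option rules in this section can be summarised by the
- Python sketch below. It is a hedged illustration only: the
- function name, the dictionary keys and the exception messages
- are assumptions, and an L with no digits after it is assumed
- to be ignored like any other unrecognised parameter.

```python
import re


def parse_options(args):
    """Parse DGARBAGE-style options given after infile and outfile."""
    opts = {"bios": False, "display": False, "high": "remove",
            "line_length": 72}
    keep = strip = False
    for arg in args:
        a = arg.upper()  # options may be upper or lower case
        m = re.fullmatch(r"L([0-9]+)", a)
        if a == "B":
            opts["bios"] = True
        elif a == "D":
            opts["display"] = True
        elif a == "K":
            keep = True
        elif a == "S":
            strip = True
        elif m:
            n = int(m.group(1))
            if not 1 <= n <= 255:
                raise ValueError("Out of Range")
            opts["line_length"] = n
        # Anything else is ignored.
    if keep and strip:
        # The real program shows its help screen and terminates here.
        raise ValueError("K and S are mutually exclusive")
    if keep:
        opts["high"] = "keep"
    elif strip:
        opts["high"] = "strip"
    return opts
```

- For example, parse_options(["d", "L80"]) turns the display on
- and sets the line length to 80, leaving the other defaults
- alone.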
-
-
-
-
-
- ERROR MESSAGES
- --------------
-
- If fewer than two parameters are specified on the command
- line, if both the K and S options are specified, or if the
- same file name is given for both infile and outfile, the
- program will display its help screen, followed by a short
- descriptive message, and terminate.
-
- In all other cases where an error is encountered a short
- descriptive message is displayed, preceded by the characters
- "*** ERROR:" and followed by a line giving the error code,
- program location and module in which the error occurred. The
- program will then terminate.
-
- For example if you specify a file that does not exist as
- infile the following might be displayed:-
-
- *** ERROR: File Not Found
-
- Error 2 at Program Location 0000:0316 in Module < Open Files>
-
-
- Other error messages that you may encounter are "File Access
- Denied" if the access permission for infile or outfile is not
- such as to allow the files to be accessed correctly by the
- program, and "Out Of Range" if a value outside the range of
- 1-255 is specified for the L option.
-
- There is also the possibility that the program will abort
- with a "Range Check Error" (error 201) in module "Process
- Data". This is caused by specifying a line length less than
- the length of all the lines in infile. For example, if the
- length of all the lines in infile is 80 and you specify a
- line length of 79 (L79) then the program will add a CR/LF
- after the 79th character of EACH line. Eventually, this will
- overflow the output buffer and the program will abort. The
- output buffer has been made 1k greater than the input buffer
- to allow for some lines being a different length than
- specified, but it cannot handle the situation just
- described. If you encounter this situation, adjust the line
- length value until the program works satisfactorily.
-
-
- OTHER ERRORS
- ------------
-
- The program has been fairly extensively tested, but, no
- doubt, there are errors in there that I haven't found. If you
- come across something that can't be solved in any of the ways
- described above, please either leave a message for me
- on the Salt Air Bulletin Board or write to me at the address
- below, giving the error message that was displayed (BOTH
- lines please) and, if possible, a sample of the file that was
- being processed.
-
- I will also welcome any suggestions for improvements or
- upgrades although I do not promise to implement any such
- suggestions.
-
-
-
-
- KNOWN BUG
- ---------
-
- The program adds one (or sometimes two) spurious characters
- to the end of outfile after processing. I haven't figured out
- why yet, but it doesn't cause any real problems (just delete
- the character if you don't want it).
-
-
-
- HISTORY
- -------
-
-
- Version 1.1 Added code to detect if the same file name
- had been specified for both infile and
- outfile and abort with an appropriate message
- in this case.
-
- Added code to detect if outfile already
- existed and, if so, prompt the user as to
- whether or not to overwrite the file.
-
- (Thanks to Jerry Clark for finding both of
- these).
-
-
- Filenames are now converted to uppercase
- characters within the program. This was
- necessary to permit the comparison of infile
- and outfile, and also makes the display look
- a bit neater.
-
- The program will prefix the drive specifier
- to the filenames if it is not already
- provided. Previously, only the path was
- prefixed.
-
- Messages were added to explain why the
- program has terminated in the event of
- incorrect or missing command line parameters.
-
- The operation of the program has been speeded
- up substantially, especially in the default
- mode where the throughput has been increased
- from around 2000 characters per second to
- 9000 characters per second. The throughput in
- the fast display mode has increased from 900
- to 1200 cps. The throughput for the slow
- display mode remains at around 300 cps.
-
- Tidied up the documentation a bit.
-
-
-
-
- Version 1.0 Initial Release.
-
-
-
-
- Address until mid-June 1988:-
-
- Peter Byrne,
- Exchange Operations Computer Centre,
- Room 103,
- Saudi Telecom Headquarters,
- Riyadh 11132,
- Kingdom of Saudi Arabia.
-
- Address after mid-June 1988:-
-
- Peter Byrne,
- Winthrop
- North Heath Lane
- Horsham
- Sussex
- England.
-
-
-
- 09 Ramadhan 1408.
- (25 April 1988)
-