-
- TEXT FILE PORTABILITY AIDS 6/6/92 public domain
- -------------------------- Alex Matulich, Unicorn Research Corp
- 4621 N. Landmark Drive
- Orlando, FL 32817-1235
- (407) 657-4974
- alex%bilver@peora.sdc.ccur.com
-
- This archive contains four programs plus C source code for each:
- AddCR -- converts unix or AmigaDOS text files to MS-DOS
- StripCR -- converts MS-DOS text files to unix or AmigaDOS
- StripHR -- selectively strips hard returns from a text file.
- Reformat -- reformats a text file using a different line length
-
- Typing the name of each command on the command line will display its
- function and syntax.
-
- These 4 utilities are helpful when transferring text files between
- dissimilar operating systems or when importing text files into word
- processing (WP) or desktop-publishing (DTP) software, or uploading
- text into BBS message editors.
-
- IMPORTANT: Normally, you specify an input file to process, and an output
- file to write. However, all these utilities are capable of processing a
- file "in place" if you do not specify an output file. This practice is
- not recommended because (1) it is MUCH slower, and (2) your original file
- may be destroyed if something goes wrong. Be warned!
-
-
- Explanation (i.e. why bother?)
- ------------------------------
- The end-of-line (EOL) code used in MS-DOS text files is different
- from that used in other operating systems such as unix or AmigaDOS.
- MS-DOS uses two characters at the end of each text line: a carriage
- return (CR) plus a linefeed (LF). Unix uses only one character as the
- EOL code, LF.
-
- The programs StripCR and AddCR will convert EOL codes from CR+LF to
- LF and back again. StripCR strips out the CR characters from a text
- file, leaving only LF characters at the end of each line. AddCR looks
- for LF characters in a file, and inserts a CR in front of those that are
- not already preceded by CR, thus ensuring that all LF's will be
- converted to CR+LF.
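-
- As a rough illustration of the StripCR half of this, here is a minimal
- sketch that works on a memory buffer instead of a file. The function
- name strip_cr and its interface are assumptions made for this example;
- they are not taken from the source code in the archive.
-
```c
#include <stddef.h>

/* Hypothetical helper: copy n bytes from in to out, dropping every
   CR (0x0D) so that only LF remains at the end of each line.
   out must have room for at least n bytes.
   Returns the number of bytes written. */
size_t strip_cr(const char *in, size_t n, char *out)
{
    size_t i, j = 0;

    for (i = 0; i < n; i++) {
        if (in[i] != '\r')
            out[j++] = in[i];
    }
    return j;
}
```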
-
- Now that you have your text file all nice and compatible with your
- operating system, you probably want to import it into WP or DTP
- software, right? Before doing so, run the text file through the StripHR
- utility. Why, you ask? Well, all WP or DTP packages have the ability
- to import text. Invariably, however, the imported text will not fit
- properly within the margins you wish to use, which means that you will
- have to reformat everything in your WP program manually. Many software
- packages give you the option to strip out the hard returns (EOL codes)
- in your text file as it is being imported, so that the software can
- format it to your margins by itself. Trouble is, often ALL, or too
- many, of the hard returns are stripped out, and this messes up paragraph
- breaks and section titles, so you still have to go through your text
- and reformat everything manually.
-
- StripHR is more selective about removing hard returns. It removes
- all hard returns except the ones that precede a paragraph indent or a
- blank line. This means that the paragraph formatting in your text file
- will be preserved. A normal paragraph which is flush to the left will
- have all hard returns removed within it, but there will still be a hard
- return at the end of the paragraph. Also, indented blocks of text
- (where each line in the block starts with a space or a tab) will be left
- alone, as will lines that are intentionally blank. After running your
- text file through StripHR, just import it straight into your WP or DTP
- software, without stripping any more hard returns.
-
- Reformat will take a text file and reformat it to fit within a
- specified line length. This is useful if you have to upload a text file
- into an online editor that has its own idea about where the wrap margin
- is located. Two situations can occur: The lines in your text file
- might be a little too long, and the editor will split each line, usually
- leaving single words on lines by themselves. Or, if you have a
- "continuous" text file (say, the output of StripHR), you might be using
- an online editor that doesn't insert carriage returns into the text as
- you upload it, resulting in long one-line paragraphs that wrap around
- and around the screen. Running a text file through Reformat first
- should take care of most problems, though some minor post-editing may be
- necessary.
-
- StripHR and Reformat leave indented text alone. So if you have a
- text file with a "left margin" throughout, these utilities won't have any
- effect. If I feel like it, I'll modify these utilities to handle this
- special case, but it's not likely to be soon, because I'm satisfied.
-
-
- Usage
- -----
-
- Each of these utilities will give you an explanation on its use when you
- enter its name by itself on the command line. More detailed information
- is below.
-
-
- ----- AddCR -----
-
- To add CR characters to a text file, type:
-
- AddCR file (operates on the file "in place" - avoid this!)
- or AddCR infile outfile (infile left alone, new version is outfile)
- Example: addcr unixfile.txt msdos.txt
-
- If there are existing CR+LF codes, AddCR will leave them alone, instead
- of converting them to CR+CR+LF codes, which are meaningless.
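-
- The rule above (insert a CR only where one is missing) can be sketched
- on a memory buffer as follows. The name add_cr and its interface are
- assumptions for illustration only, not the archive's actual code.
-
```c
#include <stddef.h>

/* Hypothetical helper: copy n bytes from in to out, inserting a CR
   in front of every LF that is not already preceded by one.
   out must have room for 2*n bytes in the worst case.
   Returns the number of bytes written. */
size_t add_cr(const char *in, size_t n, char *out)
{
    size_t i, j = 0;
    char prev = '\0';          /* previous input character */

    for (i = 0; i < n; i++) {
        if (in[i] == '\n' && prev != '\r')
            out[j++] = '\r';   /* bare LF: supply the missing CR */
        out[j++] = in[i];
        prev = in[i];
    }
    return j;
}
```
-
- Because existing CR+LF pairs pass through unchanged, running the
- function a second time on its own output changes nothing.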
-
-
- ----- StripCR -----
-
- To strip CR characters from a text file, type:
-
- StripCR infile (operates in place)
- or StripCR infile outfile (infile left alone, stripped version is
- saved in outfile)
- Example: stripcr msdos.txt unixfile.txt
-
- When stripping is done in place, the resulting text is the same length
- as or shorter than the original (because characters are being removed).
- The end of the new text is flagged with a ctrl-Z character, which may be
- followed by some useless text left over from the old version of the
- file. Most MS-DOS software will ignore anything after the ctrl-Z. You
- can delete anything after the ctrl-Z with a text editor if you wish.
-
-
- ----- StripHR -----
-
- To strip out hard returns from a file, leaving paragraphs intact, type:
-
- StripHR infile (operates in place)
- or StripHR infile outfile (creates a new modified file)
- Example: striphr article.txt article.wp
-
- Here's the decision logic that determines which hard returns will be
- kept and which will be thrown out: A hard return is discarded unless
- (1) it is followed by "white space" (a space, tab, or blank line), or
- (2) it is preceded by another hard return.
- These rules ensure that paragraphs and blank lines will be kept intact.
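-
- The two rules can be sketched as below, for a buffer that uses LF as
- its hard return. Three things here are assumptions, not taken from the
- archive's source: the name strip_hr, replacing each discarded return
- with a single space so that words do not run together, and always
- keeping a return that is the very last character of the text.
-
```c
#include <stddef.h>

/* Hypothetical helper: copy n bytes from in to out, turning each
   discarded hard return (LF) into a space. A return is kept when
   (1) the next character is a space, tab, or another return, or
   (2) the previous character is a return, or
   (3) it is the last character (assumption made for this sketch).
   Output length always equals input length; returns n. */
size_t strip_hr(const char *in, size_t n, char *out)
{
    size_t i;

    for (i = 0; i < n; i++) {
        char c = in[i];

        if (c == '\n') {
            int at_end = (i + 1 == n);
            int before_white = !at_end &&
                (in[i + 1] == ' ' || in[i + 1] == '\t' ||
                 in[i + 1] == '\n');
            int after_return = (i > 0 && in[i - 1] == '\n');

            if (!(at_end || before_white || after_return))
                c = ' ';       /* join this line to the next one */
        }
        out[i] = c;
    }
    return n;
}
```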
-
-
- ----- Reformat -----
-
- To reformat a text file to new line lengths, the command syntax is:
-
- Reformat [-r] [-lN] [-tM] infile [outfile]
- Example: Reformat -l70 -t4 article.wp article.txt
-
- Parameters for Reformat:
- -r = Re-wrap lines that happen to be shorter than the specified
- length. The default is to leave shorter lines alone. Either
- setting may be a compromise, depending on what you want.
- -lN = Set maximum line length to N characters. Default is 72.
- -tM = Assume that tabs are M characters wide. Default assumes 8.
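-
- To give an idea of what fitting text to a maximum line length involves,
- here is a minimal greedy word-wrap sketch for one continuous paragraph.
- It is not Reformat's actual algorithm: the name wrap_paragraph is made
- up for this example, tabs are not handled (so there is no equivalent of
- -tM), and a single word longer than the limit is left unbroken.
-
```c
#include <stddef.h>

/* Hypothetical helper: wrap one paragraph in place by replacing
   selected spaces with LF so that no line exceeds width characters
   (except where a single word is longer than width). */
void wrap_paragraph(char *text, size_t width)
{
    size_t line_start = 0;     /* index where the current line begins */
    size_t last_space = 0;     /* most recent space on the current line */
    int have_space = 0;        /* nonzero once last_space is valid */
    size_t i;

    for (i = 0; text[i] != '\0'; i++) {
        if (text[i] == ' ') {
            last_space = i;
            have_space = 1;
        }
        /* i - line_start + 1 is the line length including text[i] */
        if (i - line_start + 1 > width && have_space) {
            text[last_space] = '\n';
            line_start = last_space + 1;
            have_space = 0;
        }
    }
}
```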
-
- Reformat can also be used as a functional equivalent to StripHR. You
- could delete StripHR altogether, and set up a command alias to use
- Reformat, something like:
-
- alias StripHR = Reformat -r -l20000
-
- This will work exactly like StripHR, assuming each paragraph is less
- than 20,000 characters long! (StripHR itself has no limitation, and
- it's a bit faster as well.)
-
- -----
-
- >>> IMPORTANT! >>> Be sure your input text file is compatible with
- your operating system, or the output of Reformat/StripHR could be trash!
- If you're running MS-DOS, make sure that the EOL codes are the CR+LF
- type. To be safe, specify both an input and output file when using
- StripHR or Reformat; this will preserve your original file in case
- something goes awry, or if you just don't like the result.
-
-
- Changes from the previous release
- ---------------------------------
-
- All utilities are now capable of operating on files in place, instead of
- just the stripping utilities. This practice is not recommended, however,
- because it is slower and you risk destroying your original file if
- something goes wrong. But if you have little disk space to spare, this
- method might be the best way.
-
- In-place operations used to be excruciatingly slow. Now the in-place
- mode is significantly faster. Instead of reading and writing one
- character at a time, the utilities hold the output in a 16K buffer,
- which makes the reads smoother.
-
- The Reformat utility was added to the collection.
-
- Online help is now more informative.
-
- StripHR used to lose the last character in a file. No more.
-
- Endless-loop condition on buffer-emptying process in AddCR fixed.
-