- 23-Apr-86 19:32:57-PST,5277;000000000001
- Date: Wed, 23 Apr 86 20:04:12 EST
- From: Edward_Vielmetti%UMich-MTS.Mailnet@MIT-MULTICS.ARPA
- To: info-ibmpc@USC-ISIB.ARPA
- Subject: Breakup.doc
-
-
- Documentation for BREAKUP
- December, 1983
- Charles Roth
-
-
- BREAKUP is a utility for "breaking up" large files on MSDOS systems. It
- is intended primarily as a companion utility for The FinalWord editor, but it
- may have other uses as well. BREAKUP was written in C by Charles Roth, and
- is in the public domain.
-
- Many text editors for microcomputers can only deal with files of a
- certain maximum size. We know, by Murphy's Law, that some files will
- always be larger than any given size. Thus, there is a need to be able
- to break up large files in a convenient way. Also, when moving a collection
- of files from one machine to another, it is sometimes easier to concatenate
- all of the files together, ship them as one file, and then break them up
- again.
-
- BREAKUP uses Unix-ish command arguments to allow the user to break up
- a file in a variety of ways. Each "command" is actually a pair of
- arguments that specify the next place to break the file. The user can tell
- BREAKUP to break after so many bytes, or after so many lines, or when a
- particular string is encountered. The general syntax looks like:
-
- BREAKUP File.Ext -C1 A1 -C2 A2 -C3 A3 etc....
-
- where "File.Ext" is the name of the file to be broken, and each "-Cn An" pair
- specifies a breaking point (as described below). The pieces of the broken-up
- file are put in the files File.000, File.001, and so on. Note that there is
- usually one more file than the number of breaking points specified.
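-
- The naming scheme can be sketched in C as follows. This is only an
- illustration, not BREAKUP's actual source: the function name piece_name,
- its parameters, and the limit of 1000 pieces (three digits) are assumptions
- made for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not taken from BREAKUP): build the name of piece
 * number `index` by replacing the extension of `input` with a three-digit
 * number.  Assumes `input` is a bare file name with no directory part.
 * Returns 0 on success, -1 if the index is out of range or the result
 * does not fit in `out`. */
int piece_name(const char *input, int index, char *out, size_t outsz)
{
    assert(input != NULL && out != NULL);
    if (index < 0 || index > 999)
        return -1;

    /* Keep everything before the last '.', or the whole name if none. */
    const char *dot = strrchr(input, '.');
    size_t stem = dot ? (size_t)(dot - input) : strlen(input);

    int n = snprintf(out, outsz, "%.*s.%03d", (int)stem, input, index);
    return (n < 0 || (size_t)n >= outsz) ? -1 : 0;
}
```

- With this sketch, piece_name("File.Ext", 0, ...) yields "File.000" and
- index 12 yields "File.012".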
-
- The command specifiers for the "-Cn An" pairs are:
-
-     -B  nnnn      Break after nnnn Bytes, where nnnn is a decimal number
-     -L  nnnn      Break after nnnn Lines, where nnnn is a decimal number
-     -S  string    Break after the next occurrence of "string"
-     -LB nnnn      Break at the first end-of-line after nnnn bytes
-     -LS string    Break at the first end-of-line after next occurrence of "string"
-     -R            Repeat the last command specifier indefinitely
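-
- One way a program might recognize these specifiers is sketched below in C.
- It is not BREAKUP's own code; the names break_kind, classify_flag, and
- takes_argument are invented for the example. Matching ignores case because
- the examples in this document write the specifiers in lower case.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical classification of the specifiers in the table above. */
enum break_kind {
    BK_BYTES, BK_LINES, BK_STRING, BK_LINE_BYTES, BK_LINE_STRING,
    BK_REPEAT, BK_UNKNOWN
};

/* Compare two strings without regard to letter case. */
static int equal_ignoring_case(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        if (toupper((unsigned char)*a) != toupper((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == '\0' && *b == '\0';
}

/* Map a command-line word such as "-lb" to its kind. */
enum break_kind classify_flag(const char *flag)
{
    static const struct { const char *text; enum break_kind kind; } table[] = {
        { "-B",  BK_BYTES },      { "-L",  BK_LINES },
        { "-S",  BK_STRING },     { "-LB", BK_LINE_BYTES },
        { "-LS", BK_LINE_STRING },{ "-R",  BK_REPEAT }
    };
    assert(flag != NULL);
    for (size_t i = 0; i < sizeof table / sizeof table[0]; i++)
        if (equal_ignoring_case(flag, table[i].text))
            return table[i].kind;
    return BK_UNKNOWN;
}

/* -R stands alone; every other known specifier is followed by one argument. */
int takes_argument(enum break_kind kind)
{
    return kind != BK_REPEAT && kind != BK_UNKNOWN;
}
```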
-
-
- EXAMPLES:
-
- BREAKUP File.Ext -b 1000 -b 1000
- breaks "File.Ext" into three pieces. File.000 would contain the first 1000
- bytes, File.001 would contain the second 1000 bytes, and File.002 would
- contain everything else that was in File.Ext.
-
- BREAKUP File.Ext -l 1000 -r
- would chop File.Ext after every 1000 lines. (The last piece might be
- smaller than 1000 lines, of course.)
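-
- The arithmetic behind a repeated line break can be sketched in C. The
- function names are invented, and the sketch assumes that no empty trailing
- piece is produced when the line count is an exact multiple of the piece
- size; this document does not say what BREAKUP does in that case.

```c
#include <assert.h>

/* Hypothetical: number of pieces produced by "-l per_piece -r" on a file
 * of total_lines lines, assuming no empty trailing piece is written. */
unsigned long piece_count(unsigned long total_lines, unsigned long per_piece)
{
    assert(per_piece > 0);
    return total_lines / per_piece + (total_lines % per_piece != 0);
}

/* Hypothetical: number of lines in the last of those pieces. */
unsigned long last_piece_lines(unsigned long total_lines, unsigned long per_piece)
{
    assert(per_piece > 0);
    if (total_lines == 0)
        return 0;
    unsigned long rest = total_lines % per_piece;
    return rest != 0 ? rest : per_piece;
}
```

- Under that assumption, a 2500-line file chopped every 1000 lines gives
- three pieces, the last holding 500 lines.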
-
- BREAKUP File.Ext -l 200 -s Mom -s "Apple Pie"
- breaks File.Ext at 3 points: at the (end of the) 200th line; at the next
- occurrence of the string "Mom" in the text; and at the first occurrence of
- the string "Apple Pie" after "Mom". (Quotes are optional and are not part
- of the string searched for. They are required if the string contains one
- or more blanks.)
-
- NOTES:
-
- 1) Breaking at a point is inclusive. That is, breaking at 200 bytes
- means the first piece will contain the 200th byte. Ditto for lines and
- strings, i.e. breaking at "Mom" means the piece will end with "Mom".
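-
- The inclusive rule can be illustrated in C on text held in memory. This
- is a sketch only: the function names are invented, and returning the whole
- length when the line count or string is never reached is an assumption,
- since this document does not describe that case.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Length of the first piece when breaking after `n` lines: the piece ends
 * with the n-th newline.  Returns the whole length if there are fewer
 * than n newlines (an assumption for this sketch). */
size_t split_after_lines(const char *text, size_t n)
{
    size_t i = 0, seen = 0;
    assert(text != NULL);
    while (seen < n && text[i] != '\0') {
        if (text[i] == '\n')
            seen++;
        i++;
    }
    return i;
}

/* Length of the first piece when breaking at `pattern`: the piece ends
 * with the last character of the pattern.  Returns the whole length if
 * the pattern does not occur (an assumption for this sketch). */
size_t split_after_string(const char *text, const char *pattern)
{
    assert(text != NULL && pattern != NULL);
    const char *hit = strstr(text, pattern);
    if (hit == NULL)
        return strlen(text);
    return (size_t)(hit - text) + strlen(pattern);
}
```

- For the text "Hi Mom, bye", split_after_string with "Mom" returns 6, so
- the first piece is "Hi Mom" and the next piece begins with the comma.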
-
- 2) The size of a file in bytes has two slightly different meanings.
- To programs written in C (BREAKUP, FinalWord) the end of a line is marked by
- a single character. Inside MSDOS, the end of a line is marked by the two
- characters Carriage-Return and Line-Feed. Thus, breaking off a piece at
- 100 bytes may result in a file that (according to DIR) is slightly larger
- than 100 bytes.
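-
- The difference can be worked out with a short C sketch. The function
- name size_on_disk is invented, and the sketch assumes the piece is written
- through a text-mode stream that stores each end-of-line character as the
- two-character Carriage-Return, Line-Feed pair.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical: size DIR would report for `count` characters as a C
 * program sees them, assuming every '\n' is stored as two characters. */
size_t size_on_disk(const char *buf, size_t count)
{
    size_t total = count;
    assert(buf != NULL);
    for (size_t i = 0; i < count; i++)
        if (buf[i] == '\n')
            total++;    /* one extra byte for the Carriage-Return */
    return total;
}
```

- So a 100-byte piece that happens to contain three line ends would show up
- as 103 bytes in a directory listing.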
-
- 3) The -s strings may include control characters. Of course, you
- can't just type the control characters as part of the -s string; MSDOS will
- try to interpret them right away. So instead, BREAKUP uses a special
- notation (borrowed from the C language) for control characters that always
- begins with a "\" (backslash). Similarly, since " and \ already mean
- something special, we must have a way to represent a single " or \.
- These special notations are listed below.
-     \ddd    is the character with the OCTAL value ddd. Must be 3 digits.
-     \\      is a single backslash
-     \"      is a double-quote character
-     \n      is a newline (end-of-line character)
- The last sequence is particularly useful. Breaking at -s "\nA" would mean
- "break at the next place where there is an A at the beginning of the line".
- Warning: do NOT try to break at a null character, i.e. \000. Since the
- C string routines use \0 as a string terminator, BREAKUP will not understand
- its use as a breakpoint.
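-
- Decoding this notation might look like the following C sketch. It is
- not BREAKUP's own routine: the name decode_escapes is invented, and
- rejecting malformed sequences and \000 with an error return is a choice
- made for the example.

```c
#include <assert.h>
#include <stddef.h>

static int is_octal(char c)
{
    return c >= '0' && c <= '7';
}

/* Hypothetical decoder for the notations listed above.  Writes the decoded
 * string, NUL-terminated, into `out` (capacity `outsz`).  Returns the
 * decoded length, or -1 for a malformed sequence, for \000, for an octal
 * value above 255, or if `out` is too small. */
int decode_escapes(const char *in, char *out, size_t outsz)
{
    size_t n = 0;
    assert(in != NULL && out != NULL);
    while (*in != '\0') {
        int ch;
        if (*in != '\\') {
            ch = (unsigned char)*in++;          /* ordinary character */
        } else if (in[1] == '\\' || in[1] == '"') {
            ch = in[1];                         /* \\ or \" */
            in += 2;
        } else if (in[1] == 'n') {
            ch = '\n';                          /* \n */
            in += 2;
        } else if (is_octal(in[1]) && is_octal(in[2]) && is_octal(in[3])) {
            ch = (in[1] - '0') * 64 + (in[2] - '0') * 8 + (in[3] - '0');
            in += 4;                            /* \ddd, exactly 3 digits */
            if (ch == 0 || ch > 255)
                return -1;
        } else {
            return -1;                          /* unknown or short escape */
        }
        if (n + 1 >= outsz)                     /* leave room for the NUL */
            return -1;
        out[n++] = (char)ch;
    }
    if (n >= outsz)
        return -1;
    out[n] = '\0';
    return (int)n;
}
```

- Given the three characters \nA as typed on the command line, this sketch
- produces a two-character string: a newline followed by "A".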
-
- 4) BREAKUP prints out the filenames of the pieces as they are produced.
- You can redirect this output to a file, if you wish, by placing
- >filename
- after the list of breakpoint specifiers. (You do not need MSDOS 2.x to do
- this.)
-