- ;;12-22-87
- FINREP.DOC Version 2.6
- 01/02/88
-
- Eric Gans
- French Department, UCLA
- Los Angeles, CA 90024
-
- * MS-DOS users: FINREP now exists for DOS. MS-DOS v2.4 is *
- * (more or less) equivalent to CP/M v2.6. *
-
- Version 2.6
- Corrected a bug in reading words across sectors.
-
- Version 2.5
- Corrected a bug in the wild-card file routine (thanks to faithful
- FINREP user John Stensvaag). Improved verify routine (as per DOS
- version); reduced program size.
-
- Version 2.4
- The search routines have been extensively revised and debugged.
- FINREP should now find just about any string, however perverse.
-
- Version 2.3
- Fixed bug that made verification incompatible with multiple
- (wildcard) files. Allow wildcard (?, not *) at end of search
- string.
-
- Version 2.2
- Added "V" flag to allow user verification ("Replace (y/n)?")
- before replacement in text files; a few minor improvements.
-
- Version 2.1
- Fixed bug that treated wildcard filetypes as single files. Added
- a couple of clarifications to DOC file.
-
- Version 2.0
- Allows wildcard searches (various options), wildcard filetypes.
- Easier entry of caps (!string! instead of !s!t!r!i!n!g). Allows
- control characters other than letters (e.g., ^[,^@).
-
- *****
-
- FINREP is a search/replace program that remedies most of the
- deficiencies of Wordstar's ^QA and other similar commands. Aside
- from being faster, it has important additional features:
-
- - allows wildcards in search string (v2.0)
- - allows wildcard filename (find/replace in groups of files)
- - command-line entry allows batch processing by SUBMIT, etc.
- - allows entry of control or hex characters (0-FF)
- - can be used with object files (e.g., COM files)
- - sets capitalization (first letter or whole word) and high
- bit of the last character according to the old string
-
- This last feature means that, for example, if you are writing a
- scenario where the characters' names appear sometimes in CAPS and
- sometimes just Capitalized, you don't need two search/replaces to
- replace one name with another: JOE will be replaced by HARRY, Joe
- by Harry, and even joe by harry.
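-
- The capitalization rule can be sketched in Python as follows. This
- is an illustration only, not FINREP's actual code: the function
- name match_case is hypothetical, and the "first two letters" test
- is taken from note 4 below and applied here to the matched text.

```python
def match_case(old, new):
    """Give `new` the capitalization pattern found in `old`."""
    letters = [c for c in old if c.isalpha()]
    # Two leading capitals: treat the whole word as CAPS.
    if len(letters) >= 2 and letters[0].isupper() and letters[1].isupper():
        return new.upper()
    # One leading capital: capitalize only the first character.
    if letters and letters[0].isupper():
        return new[:1].upper() + new[1:]
    # Otherwise leave the replacement exactly as entered.
    return new


for old in ("JOE", "Joe", "joe"):
    print(old, "->", match_case(old, "harry"))
```

- Capitals already present in the replacement are left alone, in
- line with the statement that capitals entered in the replace
- string are respected.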
-
- *****
-
- Format: finrep [d:]fn [newfn] /[switches]/ oldstring [newstring]
-
- (Enter "finrep" alone for a brief command summary.)
-
- If a second filename is given, the changes will be placed in that
- file; if not, the old filename will hold the changes and the
- original file will be changed from fn.ft to fn.BAK (unless the
- "B" switch is used).
-
- Wildcards (*,?) may be used anywhere in the filename; if there
- are wildcards in the filetype (after the ".") the B switch will
- be set automatically to suppress creation of BAK files. If
- wildcards are used, a second filename cannot be entered. If you
- enter:
-
- A>finrep urk*.doc // "blurk" "zap"
-
- the files urk01.doc, urk33.doc, urktty.doc would be modified as
- expected and files urk01.bak, urk33.bak etc. would be created.
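-
- The file and BAK naming described above might be modelled like
- this in Python. It is a sketch under assumptions: plan_files is a
- hypothetical name, the directory is passed in as a plain list,
- and Python's fnmatch stands in for CP/M's own wildcard matching,
- which may differ in detail.

```python
import fnmatch


def plan_files(pattern, directory, no_bak=False):
    """Return (file, backup name or None) for each matching file."""
    filetype = pattern.partition(".")[2]
    # Wildcards in the filetype force the B switch (no BAK files).
    if "*" in filetype or "?" in filetype:
        no_bak = True
    plans = []
    for name in sorted(directory):
        if fnmatch.fnmatchcase(name.lower(), pattern.lower()):
            stem = name.partition(".")[0]
            plans.append((name, None if no_bak else stem + ".bak"))
    return plans


files = ["urk01.doc", "urk33.doc", "urktty.doc", "other.doc", "urk01.txt"]
print(plan_files("urk*.doc", files))   # BAK names are planned
print(plan_files("urk01.*", files))    # B switch set automatically
```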
-
- The characters "//" must be entered even if no switches are used.
-
- The switches are as follows:
-
- B = no BAK file. This switch disables making a BAK file; the
- original file will be lost. (Use only if you did not enter a
- second filename.)
-
- Q = allow wildcards in search string. (The program runs faster if
- this switch is not used.) The various options for this command
- are described below.
-
- V = verify replacement. If this switch is used, the context of
- the search string will be displayed on the screen and you will be
- queried re replacement. This switch cannot be used along with
- the O or H switches (see below).
-
- O = Object file. If this switch is used, the program will ignore
- end-of-file markers (1AH), as in PIP's "o" command. Use for
- search/replace in non-text files. WARNING: if you don't use "O"
- with a non-text file it will be cut off after the first 1AH.
- That's why FINREP makes BAK files!
-
- H = keep High bit. With this switch, all bytes are searched
- exactly as they are; letters with the high bit set will not be
- identified with their standard ASCII counterparts.
-
- W = no Whole-word search. This switch is used to search a string
- whether or not it is a whole word; with it, a search for "the"
- will find "other", "their" etc.
-
- NB - The program defines a "word" as anything preceded and
- followed by something other than a letter (space, punctuation
- mark, number, control character, beginning or end of file). Thus
- this switch is not needed if the search string is a series of
- words, a word preceded by a control character that is not
- contiguous to another word, etc.
-
- C = respect case. This switch allows you to distinguish capital
- from lower-case letters: in a search for "the", "The" will not be
- found/replaced. (NB: Upper case letters cannot be entered within
- quotes; see below.)
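-
- The definition of a "word" given under the W switch can be
- sketched in Python. This is only an illustration of the rule, not
- FINREP's search routine; find_words and its whole_word parameter
- are hypothetical names, and only ASCII letters count as letters.

```python
import string

LETTERS = set(string.ascii_letters)


def find_words(text, word, whole_word=True):
    """Return start offsets of `word` in `text`, ignoring case."""
    hay, needle = text.lower(), word.lower()
    hits = []
    start = hay.find(needle)
    while start != -1:
        end = start + len(needle)
        # A word boundary is the file edge or any non-letter.
        before_ok = start == 0 or text[start - 1] not in LETTERS
        after_ok = end == len(text) or text[end] not in LETTERS
        if not whole_word or (before_ok and after_ok):
            hits.append(start)
        start = hay.find(needle, start + 1)
    return hits


sample = "the other, their the3 The."
print(find_words(sample, "the"))                    # whole words only
print(find_words(sample, "the", whole_word=False))  # as with the W switch
```

- A digit counts as a boundary, so "the3" is a hit even in
- whole-word mode, while "other" and "their" are not.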
-
-
- In normal operation (with no switches), the search will include
- whole words only; it will ignore case and high bits, but will set
- the new string to correspond to the old in this respect,
- capitalizing the first letter or the whole string and setting the
- high bit of the last as required. This last feature is only
- useful if the replacement string is one word long; if it contains
- more than one word, you may set the high bits when you enter the
- string, or let your word-processor (e.g. Wordstar) do it. If you
- include capitals in your replacement string, they will be
- respected even if the find string is not capitalized.
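-
- The high-bit handling can be sketched in Python as below. This is
- an assumption-laden illustration, not FINREP's code: plain and
- carry_high_bit are hypothetical names, and only the last byte is
- treated, as in the description above.

```python
def plain(data: bytes) -> bytes:
    """Clear the high bit of every byte, for comparison purposes."""
    return bytes(b & 0x7F for b in data)


def carry_high_bit(old: bytes, new: bytes) -> bytes:
    """Set the high bit on the last byte of `new` if `old` had it."""
    if old and new and old[-1] & 0x80:
        return new[:-1] + bytes([new[-1] | 0x80])
    return new


# "joe" with the high bit set on its final letter.
old = b"jo" + bytes([ord("e") | 0x80])
print(plain(old))                        # compares equal to b"joe"
print(carry_high_bit(old, b"harry"))     # last byte gets the high bit
```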
-
- If you want to search for a capitalized word, you must use the
- "C" switch (or the "H" or "O" switch); FINREP will give you an
- error message if you don't.
-
- The last four switches are in the relation O > H > C > W; each
- "higher" switch includes the lower ones. Thus if the "H" switch
- is used, capitals and lower case will be distinguished, and the
- search will not be limited to whole words.
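-
- The O > H > C > W relation amounts to a small expansion step,
- sketched here in Python. The names ORDER and effective_switches
- are hypothetical; other switches (B, Q, V) pass through unchanged.

```python
ORDER = "OHCW"  # highest to lowest


def effective_switches(switches):
    """Expand a switch string so each switch implies the lower ones."""
    given = set(switches.upper())
    ranks = [ORDER.index(s) for s in given if s in ORDER]
    if ranks:
        # Add every switch at or below the highest one present.
        given |= set(ORDER[min(ranks):])
    return given


print(sorted(effective_switches("h")))    # H brings in C and W
print(sorted(effective_switches("qw")))   # Q is unaffected
```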
-
- *****************************************************************
-
- String entry:
-
- The find and replace strings must be separated by a space from
- the switch entry and from each other. Strings should be entered
- as follows:
-
- ASCII - in quotes: "blurk", "54%**90er @"
-
- The following characters must NOT be placed between quotes:
-
- HEX - separate by commas: d,1A,cd,10,ff,3
- CAPITALS - between !!: !A!,!hello! [NEW IN V2.0]
- CONTROL CHARACTERS - preceded by ^: ^M,^m^j,^c,^C,^[,^^
- WILDCARDS - ????, ?n (1 <= n <= 9) or ?* (indeterminate)
- The "|" character is used to display a break in the replace
- string (see below).
-
- All ascii letters entered within quotes will be treated as LOWER
- CASE. If you want to search upper case letters with the "C"
- switch, or to put upper case letters in the replace string, you
- must surround them with "!!", unless you enter them as hex
- characters (A = 41, B = 42 ...). Sorry about this, but the CP/M
- command line cannot distinguish upper from lower case.
-
- Any combination of characters is valid; for clarity, groups
- should be separated by commas, although this is only necessary
- for individual hex characters: !h!"ello",^m^j,e5,?7,32,!blurk!,^q
- Quotes and !..! must be closed. To search/replace the quotation
- mark, enter it as a hex character ("=22h). You can search for
- "!" if you keep it between quotes.
-
- The length of the find/replace strings is limited to 30 bytes;
- this length applies to the strings themselves and not to the
- keyboard entry, which cannot exceed 127 bytes in all (blame CP/M
- for this). Thus ^j,cd,ff,3d is 4 bytes long. In the case of
- indeterminate wildcards, up to 255 bytes are allowed, but the
- limit of 30 still stands for the find/replace strings themselves.
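-
- The entry syntax can be made concrete with a small Python parser.
- This is a sketch of the rules as described, not FINREP's own
- parser: parse_entry and its token format are hypothetical, and
- the treatment of ambiguous input (e.g. a "?" followed by a digit
- always being read as ?n) is an assumption.

```python
import re

HEX = re.compile(r"[0-9A-Fa-f]{1,2}")


def parse_entry(entry):
    """Parse a string entry into literal bytes and wildcard tokens.

    Literal bytes are ints; ("?", n) is a run of n wildcard bytes;
    ("?*",) is the indeterminate wildcard.
    """
    out, i = [], 0
    while i < len(entry):
        ch = entry[i]
        if ch == ",":                      # group separator
            i += 1
        elif ch in '"!':                   # "lower case" or !CAPITALS!
            end = entry.find(ch, i + 1)
            if end < 0:
                raise ValueError("unclosed " + ch)
            text = entry[i + 1:end]
            text = text.lower() if ch == '"' else text.upper()
            out.extend(ord(c) for c in text)
            i = end + 1
        elif ch == "^":                    # control character
            if i + 1 >= len(entry):
                raise ValueError("dangling ^")
            out.append(ord(entry[i + 1].upper()) & 0x1F)
            i += 2
        elif ch == "?":                    # wildcards
            nxt = entry[i + 1:i + 2]
            if nxt == "*":
                out.append(("?*",))
                i += 2
            elif nxt != "" and nxt in "123456789":
                out.append(("?", int(nxt)))
                i += 2
            else:
                out.append(("?", 1))
                i += 1
        else:                              # one or two hex digits
            m = HEX.match(entry, i)
            if not m:
                raise ValueError("cannot parse at offset %d" % i)
            out.append(int(m.group(), 16))
            i = m.end()
    return out


print(parse_entry('!h!"ello",^m^j,e5,?7,32,!blurk!,^q'))
print(len(parse_entry("^j,cd,ff,3d")))     # counts stored bytes, not keystrokes
```

- Counting the parsed items rather than the typed characters is
- what makes ^j,cd,ff,3d four bytes long.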
-
- If you do not enter a replace string, the searched-for string
- will be replaced by nothing, i.e., deleted.
-
- WILDCARDS
-
- The wildcard search has a great deal of flexibility. For obvious
- reasons, wildcards cannot appear at the beginning of the search
- string. (In versions below 2.3 they can't be at the end either.
- For some not-so-obvious reason this seemed a bad thing at the
- time.) The options are as follows:
-
- 1. Simple wildcard search: all bytes of the search string will be
- replaced.
-
- finrep zz.txt /q/ "d"?2"e" "xxyz" will replace all words like
- "done", "dare", "dove" etc. by "xxyz". A maximum of four wildcard
- groups is allowed in this form: thus "a"?"cd"?4"ijk"??"nopq"?"s"
- is a permissible search string.
-
- 2. Simple wildcard search with break. Only one wildcard group is
- allowed; the replace string is divided in two, with the first
- part replacing what precedes the wildcards and the second what
- follows; the intermediate bytes are left alone. The break CAN
- appear at the beginning or end of the replace string to indicate
- that the corresponding part of the find string is to be deleted.
- A blank replace string (entered simply as: | ) will delete both.
-
- finrep xx.txt /qw/ "d"?2"e" "xx"|"yzz" will replace the "d" in
- this pattern with "xx" and the "e" with "yzz"; "done" becomes
- "xxonyzz", "madame" -> "maxxamyzz", etc. (This last example only
- works if the "W" switch is used.)
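-
- The effect of the break can be reproduced with a regular
- expression in Python. This is a sketch of the behavior only:
- replace_with_break and its parameters are hypothetical, matching
- is case-sensitive, and no whole-word test is made (as with the W
- switch).

```python
import re


def replace_with_break(text, head, gap, tail, new_head, new_tail):
    """Replace head and tail around `gap` wildcard bytes, keeping the gap."""
    pattern = re.compile(
        re.escape(head) + "(.{%d})" % gap + re.escape(tail), re.DOTALL
    )
    # The captured wildcard bytes are copied through unchanged.
    return pattern.sub(lambda m: new_head + m.group(1) + new_tail, text)


for word in ("done", "madame"):
    print(word, "->", replace_with_break(word, "d", 2, "e", "xx", "yzz"))
```

- An empty new_head or new_tail corresponds to putting the break
- at the beginning or end of the replace string.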
-
- 3. Indeterminate wildcard search/replace. The indeterminate
- wildcard "?*" must be the only wildcard in the search string. In
- this option the whole string from the beginning to end is
- replaced. A match may span at most 255 characters; longer
- stretches will not be found.
-
- finrep blurk.let /qw/ "xy"?*"zq" "garbage" will replace all
- strings beginning and ending with the indicated letters:
- "xyrwerwerzq", "xyuu is the nbrzq", "xy ^C^Yzq" will all be
- replaced by "garbage".
-
- NB - Since FINREP only looks for one thing at a time, it will not
- find nested pairs of strings, and will appear to miss some pairs
- where the second half of the search string is over 255 bytes away
- from the point at which the search began. (FINREP checks this
- only every 128 bytes.) Thus if you are looking for "the"?*"of",
- FINREP will sometimes miss the apparent "hit" in a text like
- this: ... the [ ... the ... ] of ...
- where the [] contain over 255 bytes. This is not a bug, but a
- limitation of the program.
-
- 4. Indeterminate wildcard with break. This is a very powerful
- option that allows you, for example, to replace PerfectWriter
- "fences" with WordStar control toggles (& vice versa). Here again
- only one wildcard group is allowed in the search string; the
- intermediate bytes are left unchanged.
-
- finrep zap.kkk /qw/ "123"?*"45" "6"|"789" will replace
- "zz123blurk blurk xxxc oo45rr" by "zz6blurk blurk xxxc oo789rr";
- finrep perf.wri /qwc/ "@"!ux!"{"?*"}" ^s|^s will replace the PW
- underline fence @UX{ ... } by WS's ^S ... ^S. Note that the "C"
- flag is used here to search for caps; if l.c. as well as caps are
- acceptable, it could be omitted and the search string written
- "@ux{"?*"}". You can delete the fences altogether by replacing
- the ^s|^s by | in the last example.
-
- One user thought the word "break" was misleading and should be
- replaced by "save," since the "|" in the replace string means
- that you preserve the wildcard part of the search string. In
- other words: finrep zap.txt /qw/ "<<"?*">>" will kill everything
- between the "<<..>>" whereas: finrep zap.txt /qw/ "<<"?*">>" |
- will just kill the "<<>>" and "save" their contents.
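-
- The difference between replacing the whole span and "saving" its
- contents can be sketched in Python. Assumptions: replace_span is
- a hypothetical name, the 255-byte limit is applied to the whole
- match, and Python's regex engine rescans from every position, so
- it will find nested or distant pairs that FINREP may miss.

```python
import re


def replace_span(text, head, tail, replacement="", limit=255):
    """Replace head ... tail; a "|" in `replacement` keeps the middle."""
    inner_max = max(0, limit - len(head) - len(tail))
    pattern = re.compile(
        re.escape(head) + "(.{0,%d}?)" % inner_max + re.escape(tail),
        re.DOTALL,
    )
    if "|" in replacement:
        new_head, new_tail = replacement.split("|", 1)
        return pattern.sub(lambda m: new_head + m.group(1) + new_tail, text)
    return pattern.sub(lambda m: replacement, text)


line = "a <<keep me>> b"
print(replace_span(line, "<<", ">>"))        # everything between is removed
print(replace_span(line, "<<", ">>", "|"))   # only the markers are removed
print(replace_span("zz123blurk blurk xxxc oo45rr", "123", "45", "6|789"))
```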
-
- FINREP can be aborted at any time by typing <ESC> (=1B HEX). I
- preferred this to ^C since an extra ^C will be read by CP/M as a
- Warm Boot.
-
- Except when the "V" switch is used, the only screen output is the
- number of strings replaced and, if you use wildcards, the names &
- total number of files processed. If you want to see the
- replacement procedure in action, use a word-processor!
-
- Notes:
-
- 1. FINREP will modify files of any length; it uses the entire
- memory below the CCP as its buffer, and writes to disk whenever
- the buffer fills up. Since it doesn't overwrite the CCP, it
- doesn't have to end with a Warm Boot.
-
- 2. There is no intrinsic limit on the number of files allowed
- under the wildcard filename option; for sanity's sake, you will
- get an error message if there are more than 255.
-
- 3. If you want to create a version of FINREP with some of the
- switches preset, run the program without a filename: finrep
- /[sw1][sw2]../ After it returns to the CP/M prompt, save 13
- finrep1.com will keep the switches as you like them. This
- procedure is NOT REVERSIBLE, so keep your original FINREP
- unchanged.
-
- 4. In deciding whether to capitalize a whole word/string, FINREP
- looks at the first two letters. If the find string has only one
- letter, only the first letter of the replacement string will be
- capitalized. If the word to be replaced has unusual
- capitalization (e.g. BBrrOOOmm), use the "C" switch and/or enter
- separate replacement strings for different variants.
-
- 5. In using indeterminate wildcards, you should use the "W"
- switch unless BOTH HALVES of the search string begin and end on
- word boundaries.
-
- 6. Re speed, FINREP is somewhat faster than Wordstar's ^QA
- command. But if all you want to do is replace a string, it is
- over three times faster, since its time includes loading and
- saving the file. Measured on a long (84 K) file, FINREP took 27
- seconds and WS 34 for a typical search/replace. But WS needs at
- least 10 seconds to load and a good minute to save the file and
- exit. With a little practice, the command line can be entered as
- fast as WS's, and it can be included in SUBMIT files or
- reproduced by programs like SYNONYM or my SYN.COM.
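-
- The "over three times faster" figure in note 6 can be checked
- with a few lines of Python arithmetic. The load and save times
- are taken as exactly 10 and 60 seconds, which is an assumption:
- the text only says "at least 10 seconds" and "a good minute".

```python
finrep_total = 27             # seconds, includes loading and saving
ws_search = 34                # seconds for the search/replace itself
ws_load, ws_save = 10, 60     # assumed values, see the note above

ws_total = ws_search + ws_load + ws_save
ratio = ws_total / finrep_total
print("WordStar total: %d s, ratio: %.2f" % (ws_total, ratio))
```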
-
- *****
-
- FINREP was written at the request of John-Mark Stensvaag of
- Vanderbilt University. At first I couldn't see the use for it,
- but he convinced me (he is a professor of Law). The wildcard
- features added in v1.1 and v2.0 were also his idea; the
- verification feature in v2.2 was suggested by J. Olsen of
- Chicago. I would appreciate hearing from you about (a) bugs and
- (b) suggestions for further enhancements.