- ;;05-23-85
- Eric Gans
- French Department UCLA
- Los Angeles, CA 90024
- WINDEX.DOC
- v2.0
-
- Version 2.0 update (05/22/85):
-
- A number of features have been added at the request of users,
- whom I thank for their interest. WINDEX now:
-
- - carries all-caps words into the index unaltered (Paul Foote)
- - permits a negative page offset (Simon Heifetz)
- - allows tagging keys within the text file with ^PP, and
- - allows indexing of strings (John-Mark Stensvaag)
- - allows indexing of (hard-)hyphenated words
- - uses entire free memory (max of about 11000 page references and
- a 17 K NDX file for a 60 K system -- e.g., Kaypro-10)
-
- The I/O has been speeded up (again), a bug or two corrected, and
- tabs replaced by spaces in the NDX file, plus a few minor
- changes.
-
- Version 1.2 update (4/6/85):
- Finds words divided by "soft" hyphens at the end or middle of the
- line; more efficient I/O for long index files.
-
- Version 1.1 update (4/3/85):
- Fixed bug in v1.0 that forgot to open file to be indexed (!) when
- keywords entered from console. (Thanks to B. Cardozo for
- catching this one.)
-
- *****
-
- WINDEX creates indexes for Wordstar files written in document
- mode. It can be used to index a manuscript of any length,
- including books of up to 9999 pages, with a maximum of 254 keys.
-
- The keywords to be WINDEXed can be entered in three modes:
-
- 1. (windex [d:]fn.ft) Direct keyboard entry; you will be prompted
- at the console. The simplest approach for short indexes.
-
- 2. (windex [d:]fn.ft * NB: * replaces / used in previous
- versions) Keywords will be sought in a file fn.KWD on the same
- drive/user. In creating this file, you need only avoid hyphens
- and the exotic punctuation marks [\]^_`. The character / may be
- used between words to index strings; it will be treated as the
- equivalent of a non-break space (^O) in the file. All other
- characters, including numbers, periods, commas, semicolons or
- blanks, are permitted as separators: the simplest way is to list
- the words with a CR after each. Hyphenated words may be indexed
- as single keys (this also means that a key is not matched inside a
- hyphenated group: a search for Mac will not find the Mac in
- Big-Mac); no other internal punctuation (e.g., apostrophes) is
- permitted. First or all letters may be
- capitalized and will remain so in the output (the search function
- won't pay attention to capitalization). You need not enter the
- words in order; the program will alphabetize them. The same
- criteria hold true if you prefer to enter your word list from the
- keyboard.
-
- 3. (windex [d:]fn.ft #) Keywords may be tagged in the file to be
- indexed. This allows you to create your index as you go along
- (duplicate entries will be discarded). ^P (entered as ^PP) must
- precede each keyword. To index strings, separate words by a non-
- break space (^O, entered as ^PO), which keeps the whole string on
- the same line. To end the index string, you may enter a second
- ^P, although any non-alphabetical character except - and ^O will
- suffice. The maximum string length permitted is 29 characters.
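
The keyword-list rules above can be sketched in Python. This is only an illustration, not WINDEX's actual code: the function name parse_keywords is hypothetical, discarding duplicates in a KWD list is an assumption (the text states it only for tagged keys), and the sketch treats hard hyphens and / as part of an entry while every other non-letter acts as a separator.

```python
import re

MAX_KEYS = 254  # key limit stated in the text


def parse_keywords(text):
    """Split a keyword list into unique, alphabetized entries (sketch)."""
    # An entry starts with a letter and may contain letters, hard hyphens
    # and '/' (which joins the words of a string); anything else separates.
    tokens = re.findall(r"[A-Za-z][A-Za-z/-]*", text)
    seen = {}
    for token in tokens:
        token = token.rstrip("/-")      # drop a dangling joiner
        key = token.lower()             # the search ignores capitalization
        if key not in seen:             # assumption: duplicates are dropped
            seen[key] = token           # keep the spelling as first entered
    entries = [seen[key] for key in sorted(seen)]
    if len(entries) > MAX_KEYS:
        raise ValueError("more than 254 keys: split the list into groups")
    return entries


if __name__ == "__main__":
    print(parse_keywords("Zlonk, apple; Big-Mac\nblurk/zap/zlonk APPLE 42"))
```

Capitalization is kept for output but ignored when comparing, which mirrors the behavior described for the search function.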
-
-
- The output file (on the same drive) will be fn.NDX. An
- approximate right margin of 65 will be adhered to; CR's will be
- added after each line, and second and succeeding lines of index
- entries will be indented (with spaces rather than tabs as of
- v2.0). This file can be edited with Wordstar
- and converted if you like to document mode (this doesn't seem
- appropriate for an index, however).
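
As a rough picture of the NDX layout described above, the following Python sketch wraps one entry at a right margin of 65 and indents continuation lines. The exact entry format (keyword, a space, then comma-separated pages), the indent width of 8 spaces and the name format_entry are assumptions made for illustration; the text does not specify them.

```python
import textwrap


def format_entry(keyword, pages, margin=65, indent=8):
    """Wrap one index entry; continuation lines are indented (sketch)."""
    body = keyword + " " + ", ".join(str(page) for page in pages)
    return textwrap.fill(
        body,
        width=margin,
        subsequent_indent=" " * indent,  # assumed width of the indent
        break_long_words=False,
        break_on_hyphens=False,          # keep hyphenated keys intact
    )


if __name__ == "__main__":
    print(format_entry("Smith", range(1, 60)))
```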
-
- If you have more than 254 keywords, you should divide them
- alphabetically into two or more groups. (ALPHA.COM will do this
- for you.) You can then combine the indexes later in alphabetical
- order using PIP or Wordstar's ^KR command.
-
- The current version of WINDEX allocates about 2/3 of the free
- memory to the page-reference buffer and about 1/3 for the NDX
- file. This allows (on a 60 K Kaypro-10) for about 17 K for the
- file and 34 K for the buffer, or about 11000 references at 3
- bytes each. (This proportion is based on the fact that many
- references are multiple appearances on the same page that do not
- appear in the NDX file.) This should be enough for any normal
- use of the program (110 references/page in a 100-page
- manuscript!) In case you somehow do run out of memory, WINDEX
- will recognize when the CCP is overwritten and do a Warm Boot,
- but it doesn't check if you go even further. But long before you
- get to that point, you should divide your keyword list into
- smaller alphabetical groups and index them separately. As long
- as you keep the different indexes in alphabetical order (you'll
- also have to change their names if you keep them on the same
- disk), you can PIP them together with no internal editing save
- removal of a few headings.
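
The arithmetic behind these figures can be checked with a few lines of Python. The 51 K of free memory used below is inferred from the 17 K + 34 K quoted above and is an assumption, as is the function name memory_budget.

```python
BYTES_PER_REF = 3  # size of one page reference, as stated in the text


def memory_budget(free_bytes):
    """Split free memory 1/3 for the NDX file, 2/3 for references."""
    ndx_bytes = free_bytes // 3
    ref_bytes = free_bytes - ndx_bytes
    return ndx_bytes, ref_bytes, ref_bytes // BYTES_PER_REF


if __name__ == "__main__":
    ndx, refs, count = memory_budget(51 * 1024)  # assumed free memory
    print(ndx // 1024, "K for the NDX file")
    print(refs // 1024, "K for the buffer, room for", count, "references")
```

With these inputs the buffer holds a little over 11000 references, in line with the figure given above.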
-
- Page offset:
-
- You will be prompted for an offset between -255 and 9999 (default
- = 0). All page numbers will be increased by the offset. This
- feature allows you to index manuscripts that don't start at page
- 1 (say, chapters in a book). A negative offset may be used if
- page 1 is preceded by prefatory material; index entries that come
- before page 1 will be listed as "-#".
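
A minimal Python sketch of the offset rule follows. How WINDEX numbers the pages that fall before page 1 is not spelled out here, so the sketch simply prints the signed sum of page and offset; that choice and the name page_label are assumptions.

```python
MIN_OFFSET, MAX_OFFSET = -255, 9999  # range accepted at the prompt


def page_label(file_page, offset=0):
    """Return the page number as it would be listed in the index (sketch)."""
    if not MIN_OFFSET <= offset <= MAX_OFFSET:
        raise ValueError("offset must be between -255 and 9999")
    # Assumption: pages before page 1 are shown as the plain signed sum.
    return str(file_page + offset)


if __name__ == "__main__":
    print(page_label(1, 99))   # a chapter whose first page is 100
    print(page_label(3, -5))   # prefatory material before page 1
```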
-
- Hyphens:
-
- Wordstar distinguishes between hard hyphens (those you enter
- yourself) and soft hyphens (entered for formatting purposes).
- WINDEX skips over soft hyphens, since they merely break words at
- the end of lines; hard hyphens are treated as letters in the
- keyword and in the file. This remains true even when they occur
- at the end of a line; the difference is that you entered them as
- part of a hyphenated word.
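
The hyphen handling can be pictured with the Python sketch below. It is not taken from WINDEX: the byte values used for soft hyphens (1E and 1F hex) and the stripping of the high bit are assumptions about the Wordstar document format, and split_words is a hypothetical name.

```python
SOFT_HYPHENS = {0x1E, 0x1F}   # assumed Wordstar soft-hyphen codes
LINE_BREAKS = {0x0D, 0x0A}    # CR and LF (soft CR after high-bit strip)


def split_words(data):
    """Split document bytes into words, joining hyphen-broken words."""
    found, current = [], ""
    after_hyphen = False          # True right after any hyphen
    for byte in data:
        code = byte & 0x7F        # assumption: document mode sets bit 7
        if code in SOFT_HYPHENS:
            after_hyphen = True   # skip it; the word may continue
            continue
        if code in LINE_BREAKS and after_hyphen:
            continue              # line break inside a broken word
        char = chr(code)
        if char.isalpha() or char == "-":
            current += char       # a hard hyphen counts as a letter
            after_hyphen = char == "-"
        else:
            if current.strip("-"):
                found.append(current.strip("-"))
            current = ""
            after_hyphen = False
    if current.strip("-"):
        found.append(current.strip("-"))
    return found


if __name__ == "__main__":
    sample = b"in\x1f\x8d\x0adex of Big-\r\nMac and Mac"
    print(split_words(sample))
```

A word broken by a soft hyphen comes back whole, while a hard-hyphenated word keeps its hyphen even across a line break.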
-
- String example:
-
- (Console or KWD file) The entry: blurk/zap/zlonk will search for
- the string "blurk^Ozap^Ozlonk", which will appear in print as
- "blurk zap zlonk" (all words on the same line). To tag this
- string in your text file, it should appear on the screen as:
- "^Pblurk^Ozap^Ozlonk^P"; the second ^P can be replaced by a
- (normal) space.
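
Tag extraction along these lines might look as follows in Python. ^P and ^O are the control characters 10 hex and 0F hex; everything else (the name tagged_keys, skipping keys longer than 29 characters, comparing duplicates without regard to case) is an assumption made for the sketch.

```python
CTRL_P = "\x10"            # tag marker, typed as ^PP in Wordstar
CTRL_O = "\x0f"            # non-break space, typed as ^PO
JOINERS = "-" + CTRL_O     # non-letters allowed inside a tagged key
MAX_LEN = 29               # maximum string length stated in the text


def tagged_keys(text):
    """Collect the keys tagged with ^P in a text (sketch)."""
    keys, seen = [], set()
    i = 0
    while i < len(text):
        if text[i] != CTRL_P:
            i += 1
            continue
        j = i + 1
        while j < len(text) and (text[j].isalpha() or text[j] in JOINERS):
            j += 1
        key = text[i + 1:j].replace(CTRL_O, " ")
        # Assumption: over-long keys are skipped, duplicates discarded.
        if key and len(key) <= MAX_LEN and key.lower() not in seen:
            seen.add(key.lower())
            keys.append(key)
        # A closing ^P ends the key and must not open a new one.
        i = j + 1 if j < len(text) and text[j] == CTRL_P else j
    return keys


if __name__ == "__main__":
    sample = "The \x10blurk\x0fzap\x0fzlonk\x10 met \x10Smith's friend."
    print(tagged_keys(sample))
```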
-
- Features:
- - makes use of a binary tree for maximum search speed (5-6
- seconds for a 40K file)
- - occupies less than 3K on disk
-
- Limitations:
- - won't recognize words with internal punctuation
- other than hyphens (apostrophes, accents, &c.)
- - only works with files saved in document mode (it needs
- this for its page-count feature)
- - all files (fn.ft, fn.KWD, fn.NDX) must be on same drive
-
- Warning:
- - WINDEX will create a new NDX file each time it is run
- and delete any previous file of the same name. You must rename
- old index files you want to save before rerunning the program.
-
- Trick:
- - You can rename an old NDX file as a KWD file, since the
- program won't notice the numbers following the entries. Before
- you do this, don't forget to delete the heading within the NDX
- file.
-
- *****************************************************************
-
- ALPHA.COM v1.0
-
- In order to facilitate finding keywords in your files, I have
- included my word-sorting program ALPHA in this library. This
- program uses a binary tree to sort all the words in a file in
- alphabetical order and gives the number of times (up to 255) each
- is used (words used more than 255 times will wrap around; this
- may be changed in a future version but isn't very important for
- our purposes here.) Entering a / after the filename (i.e.,
- entering: alpha fn.ft /) will limit the word list to capitalized
- words. Many of these will be The or This, but you will also find
- all the proper names in your file. ALPHA doesn't create a
- separate file, but you can print it out (use CP/M ^P, or if you
- have a Kaypro, my PRINT or BIOSP print-screen patch in
- EGKTEN.LBR). ALPHA allows internal apostrophes, but not hyphens.
- (I didn't allow for apostrophes in WINDEX since when you write an
- index you usually want to include possessives under the
- possessor: if you are indexing "Smith," you want instances of
- "Smith's" to be included, not listed separately.)
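
The approach described for ALPHA, a binary tree that sorts words and counts them in a one-byte counter, can be sketched in Python as below. This is an illustration under assumptions rather than ALPHA's code: the names Node, insert, in_order and alpha are hypothetical, and comparing words without regard to case is a guess at what "alphabetical order" means here.

```python
import re


class Node:
    def __init__(self, word):
        self.key = word.lower()   # assumption: ordering ignores case
        self.word = word          # spelling as first seen
        self.count = 1
        self.left = None
        self.right = None


def insert(root, word):
    """Insert a word into the tree, or bump its count if present."""
    if root is None:
        return Node(word)
    key, node = word.lower(), root
    while True:
        if key == node.key:
            node.count = (node.count + 1) % 256   # one-byte counter wraps
            return root
        side = "left" if key < node.key else "right"
        child = getattr(node, side)
        if child is None:
            setattr(node, side, Node(word))
            return root
        node = child


def in_order(node):
    """Yield (word, count) pairs in alphabetical order."""
    if node is not None:
        yield from in_order(node.left)
        yield node.word, node.count
        yield from in_order(node.right)


def alpha(text, capitalized_only=False):
    root = None
    # Internal apostrophes are allowed, hyphens are not.
    for word in re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)*", text):
        if capitalized_only and not word[0].isupper():
            continue
        root = insert(root, word)
    return list(in_order(root))


if __name__ == "__main__":
    for word, count in alpha("the cat and the Smith's hat. The Cat"):
        print(word, count)
```

An in-order walk of the tree yields the words already sorted, so no separate sorting pass is needed; the capitalized_only flag plays the role of the / option.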
-
- Note to the user:
-
- Version 2.0 contains several enhancements suggested by users.
- If you can think of anything else you'd like to see, let me
- know--we aim to please.
-
- Possibilities:
-
- - allowing numbers, etc. in index words
- - freer string entry options
- - ???