Basic Framework

The basic framework is simple: the author specifies a pattern and a key. The editor then finds each instance of the pattern, presents a menu of options, and, at the user's request, inserts the index command with the key as its argument. In our example, if both the pattern and the key are alpha, the string inserted after an instance of alpha in the document is \index{alpha}. The insertion is visible in a source-language based system and invisible in a direct-manipulation system (or visible as hidden text in its shadow pages).

Before making the actual insertion, it is desirable to issue a confirmation request that presents a menu of options, of which confirm and ignore are the most obvious. Thus, for each instance of the pattern found, the user can decide whether it is to be indexed.
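
As a rough illustration, the query-insert loop described above might be sketched as follows in Python. The function name query_insert and the confirm callback are hypothetical stand-ins for the editor's search and menu machinery; they are assumptions made for this sketch, not the interface of the system described here.

```python
import re
from typing import Callable


def query_insert(text: str, pattern: str, key: str,
                 confirm: Callable[[re.Match], bool]) -> str:
    """Return text with \\index{key} inserted after each confirmed match.

    confirm is a hypothetical callback standing in for the editor's menu:
    it returns True for "confirm" and False for "ignore".
    """
    pieces = []
    last = 0
    for match in re.finditer(pattern, text):
        # Copy everything up to and including this instance of the pattern.
        pieces.append(text[last:match.end()])
        if confirm(match):
            pieces.append("\\index{" + key + "}")
        last = match.end()
    pieces.append(text[last:])
    return "".join(pieces)
```

Calling query_insert("alpha and alpha", "alpha", "alpha", lambda m: True) confirms every instance; an interactive editor would instead ask the user at each match.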

Representing patterns as regular expressions gives significantly more power to this query-insert operation. A single key can then stand for a basic pattern together with its capitalized form, its acronym, and other abbreviations. For instance, the following patterns may all be indexed by the key UCB:

        University of California, Berkeley
        Berkeley
        berkeley
        UCB
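
One way to express these four patterns as a single regular expression is sketched below in Python. The names UCB_PATTERN and index_ucb are hypothetical, and the \b word-boundary anchors are an added assumption intended to avoid matching inside longer words.

```python
import re

# One alternation covering the full name, both capitalizations of
# "Berkeley", and the acronym. Matching is leftmost, so the full name is
# consumed as a single instance rather than as a bare "Berkeley".
UCB_PATTERN = re.compile(
    r"\b(?:University of California, Berkeley|[Bb]erkeley|UCB)\b"
)


def index_ucb(text: str) -> str:
    """Append \\index{UCB} after every instance of any of the patterns."""
    # A function replacement is used so the backslash is inserted literally.
    return UCB_PATTERN.sub(lambda m: m.group(0) + "\\index{UCB}", text)
```

This version inserts unconditionally; in the framework described here each match would first go through the confirmation prompt.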

As a special case of this <key, pattern> setup, one can use the words in the neighborhood of the current cursor position as the implicit value for both the key and the pattern. For example, one can position the cursor after the desired pattern and, with one editor command (typically two or three keystrokes), insert an index entry with the preceding word (or words) as the implicit key. The advantage of this facility is that there is no need to type the key-pattern pair. Some editors allow the use of special characters to delimit word boundaries; these characters can be used in searching to reduce the number of ``false drops''. The same idea also applies to a region of text, which is a contiguous piece of text in the document. In Emacs, a region is everything between the mark and the current cursor position. More generally, the implicit operand can be the current selection, in which case the bounding positions of the selected text need not coincide with the insertion point.
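
The implicit-key case can be sketched in Python as follows. The function index_preceding_words and its parameters are hypothetical; the document is modeled as a plain string and the cursor as a character offset, which is a simplifying assumption about how an editor represents its buffer.

```python
import re


def index_preceding_words(text: str, cursor: int, words: int = 1) -> str:
    """Insert \\index{...} at cursor, keyed by the word(s) just before it."""
    if words < 1:
        raise ValueError("words must be at least 1")
    # Collect the words that appear before the cursor position.
    tokens = re.findall(r"\w+", text[:cursor])
    if len(tokens) < words:
        return text  # not enough words before the cursor; leave text as is
    key = " ".join(tokens[-words:])
    return text[:cursor] + "\\index{" + key + "}" + text[cursor:]
```

With the cursor placed immediately after "expression" in "a regular expression here", words=1 yields the key expression and words=2 yields the key regular expression.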

In our system, there is also a special facility to index every author name that appears in the bibliography or references section of a document. This feature skips citation entries that have no author field and, for each author name found, issues a query-insert prompt similar to the normal case. Instead of entering a name directly as the index term, it is better to display it in the form of last name followed by first and middle names for confirmation, as in

        Confirm: Knuth, Donald E.

This reordering yields last names as primary sort keys. Our name separation heuristic does not always work for multi-word last names. The confirmation prompt allows the user to correct it before insertion.
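
The text does not spell out the name separation heuristic, so the Python sketch below assumes the simplest possible one: the last whitespace-separated token is taken as the last name. The function last_name_first is hypothetical, and the sketch is meant only to show why multi-word last names need the confirmation step.

```python
def last_name_first(name: str) -> str:
    """Reorder "First Middle Last" as "Last, First Middle".

    Assumed heuristic: the final token is the last name. This is a guess
    made for illustration, not the heuristic used by the system.
    """
    parts = name.split()
    if len(parts) < 2:
        return name.strip()  # a single-token name has nothing to reorder
    return parts[-1] + ", " + " ".join(parts[:-1])
```

Under this assumption "Donald E. Knuth" becomes "Knuth, Donald E.", whereas "Mary-Claire van Leunen" becomes "Leunen, Mary-Claire van", which the user would correct at the prompt.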