Detex - Version 2.5

Detex is a program to remove TeX constructs from a text file.  It recognizes
the \input command.

This program assumes it is dealing with LaTeX input if it sees the string
"\begin{document}" in the text.  It recognizes the \include and \includeonly
commands.

This directory contains the following files:

README -        you're looking at it.

Makefile -      makefile for generating detex on a 4.2BSD Unix system.

detex.1l -      troff source for the detex manual page.
                Assuming you have the -man macros, use "make man-page" to
                generate it.

detex.h -       Various global definitions.  These should be modified to suit
                the local installation.

detex.l -       Lex and C source for the detex program.

states.sed -    sed(1) script to munge the state names in detex.l so that
                reasonable names can be used in the source without causing
                lex(1) to overflow.

Feel free to redistribute this program, but distribute the complete contents
of this directory.  Send comments and fixes to me via email.

Daniel Trinkle
Department of Computer Sciences
Purdue University
West Lafayette, IN 47907-1398

April 26, 1986

Modified -- June 4, 1986
Changed so that it automatically recognizes LaTeX source and ignores several
environment modes such as array.

Modified (Version 2.0) -- June 30, 1984
Now handles white space in sequences like "\begin { document }".  Detex is not
as easily confused by such things as display mode ends and begins that don't
match up.

Modified -- September 19, 1986
Added the "-e " option to ignore specified LaTeX environments.

Modified -- June 30, 1987
Added the "-n" no-follow option, to allow detex to ignore \input and \include
commands.  Also changed the algorithm for locating the input files.  It now
interprets the "." more reasonably (i.e. it is not always the beginning of an
extension).

Modified -- December 15, 1988
Added handling of verbatim environment in LaTeX mode and added it to the list
of modes ignored by default.
Because of limitations with lex, it was necessary to shorten the names of
some of the existing start states before adding a new one (ugh).

Modified -- January 3, 1988
Added ignore of \$ inside $$ (math) pair.

Modified (Version 2.2) -- June 25, 1990
Control sequences are no longer replaced by space, but just removed.  This
means accents no longer cause output words to be broken.  The "-c" option was
added to show the arguments of \cite, \ref, and \pageref macros.  This is
useful when using something like style(1) on the output.

Modified (Version 2.3) -- September 7, 1990
Fixed the handling of Ctl mode a little better and added an exception for
\index on suggestions from kcb@hss.caltech.edu (KC Border).  Also changed the
value for DEFAULTINPUTS to coincide with a local change.

Modified -- February 10, 1991
Added -t option to force TeX mode even when seeing the "\begin{document}"
string in the input.

Modified -- February 23, 1991
Based on suggestions from pinard@iro.umontreal.ca (Francois Pinard), I added
support for the SysV string routines (-DUSG), added defines for the flex
lexical scanner (-DFLEX_SCANNER), changed NULL to '\0' when using it as a
character (his sys defined NULL as (void *)0), changed the Makefile to use
${CC} instead of cc, and added comments about the new compile time options.

Modified (Version 2.4) -- September 2, 1992
Corrected the way CITEBEGIN worked.  Due to serious braindeath I had the
condition wrong in the if test.  It should be (fLatex && !fCite).  This solves
the problem a couple of people reported with amstex style \ref entries.

Added a preprocessing sed(1) command to replace all the long, easy to read
state names with short two letter state names (SA-S?) so that lex won't
overflow and I don't have to keep shortening the state names every time I add
one.  If a state is added, it must also be added to states.sed (order is
important) along with its unique S? replacement.

Added \pagestyle, \setcounter, and \verb handling from
K.Sattar@cs.exeter.ac.uk (Khalid Sattar).  Also allows invocation as "delatex"
to force LaTeX mode.

Applied patches from queinnec@cenatls.cena.dgac.fr (Philippe Queinnec) to
handle nested {}s in state (\bibitem, \cite, \index).

Added special ligature handling (\aa, \ae, \oe, \l, \o, \i, \j, \ss) at the
suggestion of gwp@dido.caltech.edu (G. W. Pigman II).

Cleaned up the comments in detex.h, added mathmatica to DEFAULTENV.

Modified (Version 2.5) -- January 28, 1993
Leading spaces in macros are no longer stripped.  This means
"foo\footnote{ bar}" comes out as "foo bar" instead of "foobar".

Fixed special ligature handling so it works in cases like {\ss} instead of
just when followed by a space.

Porting notes -- March 30, 1992
When using gcc, or compiling on a NeXT, you should compile with
-fwritable-strings.

On a NeXT, it has been reported that lex replaces the '\0' with NULL, and then
the compiler complains about it.  I think this is an old bug that is no longer
applicable.