The C Users' Group Library 1994 August

Text File | 1991-06-11 | 5KB | 82 lines

THE THEORY OF OPERATION OF SPP

SPP is a strange combination of a one-pass C compiler and a file copying program. With the exception of about 20 lines of code, the two programs are completely independent!

The job of the compiler is to know the types of all functions and variables, to insert entry macros before the first executable statement of a function, to replace return statements by exit macros and to insert exit macros at the end of every function if flow could fall off the end of the function. In order to do this, all include files must be processed and all macros must be correctly expanded.

The job of the file copying program is to copy all the characters of the main file while allowing Sherlock macros to be inserted and return statements to be revised. Characters from included files are not copied, and an option is provided to eliminate unnecessary white space, including comments. The file copier is bulletproof: it will, for instance, correctly copy a Pascal program even though the parser becomes (as it should become) hopelessly confused.

The compiler is based on a recursive descent parser, found in PAR.C, DCL.C and EXP.C. The only complication here is that the parser accepts both K&R C and ANSI Standard C. Oh yes, the parser also accepts some Borland and Microsoft extensions. To add your own extensions, change the function allow_mkeys().

The parser calls the get_token() routine to get a stream of C tokens. The get_token() routine handles all preprocessor directives and macro expansion transparently to the parser. This has always been the part of SPP where the most real bugs lurked. Be warned. The files SPP.C, DEF.C, DIR.C, TOK.C and UTL.C contain the heart of the token processing routines. The file MST.C contains the macro symbol table used to record and expand macro definitions.

The parser calls semantic routines in SEM.C. There are three kinds of semantic routines:

1. The functions whose names start with sd_ (for semantic declaration) keep track of the declared types of all functions and variables. The file ST.C contains the symbol table used by the semantic routines.

2. The functions whose names start with sf_ (for semantic flow) keep track of the possible flow of control through the executable statements in a function. These functions maintain a "flow stack" telling whether flow could ever fall through the end of the current statement.

3. The functions whose names start with so_ (for semantic output) create the additional macro calls that are generated by SPP. Because of the one-pass nature of SPP, calling these routines at exactly the right time is tricky.

Header files: The values of tokens used throughout SPP are defined in enum.h. The definitions of all global variables are found in glb.h. The prototypes of all globally visible functions are found in tmp.h. All other constants are found in spp.h.

SPP is designed in separate modules. Variables and constants that are used only by a single module are declared static so that they are invisible outside the file in which they are defined. The partitioning of SPP into separate files was intended to reduce the number of global variables. In my opinion, this aspect of the design of SPP has been a complete success.

The file SYS.C contains low level routines that might have to be rewritten should SPP be ported to another machine or operating system. Even in the days of the ANSI standard, the SYS.C module is quite useful. One of these low level routines is sysnext(), which returns the next character from the current input file. This is the most often called routine in the whole program, and has been tweaked to make it run fast.

The file copier program operates inside sysnext(), at the lowest level of the whole program. As characters are returned from sysnext(), they are placed in the hold buffer. Characters are not placed in the hold buffer if they come from an included file or from a macro expansion. The hold buffer is output when syshflush() is called, or emptied (either partially or completely) without being output when syshkill() or syshnldel() is called.

These calls to syshflush(), syshkill() and syshnldel() form the interface between the file copier and the parser. Although only 11 calls to these routines appear throughout the parser, getting these 11 calls right is tricky: syshflush() must be called often enough that the hold buffer never overflows, but syshflush() must never be called when a Sherlock macro may have to be output, since in that case either syshkill() or syshnldel() would be called instead.
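
To make the hold buffer protocol described above more concrete, here is a minimal sketch in C. It is not the code in SYS.C. The names hold_put(), hold_flush(), hold_kill(), emit(), hold_demo() and EXIT_MACRO, the buffer sizes, and the use of an in-memory string in place of the output file are all assumptions made for this example. The sketch holds characters from the main file, discards a held return statement so that a macro call can be written in its place, and flushes everything else unchanged.

```c
#include <assert.h>
#include <string.h>

#define HOLD_MAX 64                 /* assumed capacity, chosen for the demo */

static char   hold[HOLD_MAX];       /* characters read but not yet copied */
static size_t hold_len = 0;
static char   out[256];             /* stands in for the output file */
static size_t out_len = 0;

/* Hold one character. Characters that do not come from the main file
 * (include files, macro expansions) are never copied, so they are dropped. */
static void hold_put(int c, int from_main_file)
{
    if (!from_main_file)
        return;
    assert(hold_len < HOLD_MAX);    /* flushing too rarely would overflow */
    hold[hold_len++] = (char)c;
}

/* Copy the held characters to the output, then empty the buffer. */
static void hold_flush(void)
{
    assert(out_len + hold_len < sizeof out);
    memcpy(out + out_len, hold, hold_len);
    out_len += hold_len;
    out[out_len] = '\0';
    hold_len = 0;
}

/* Empty the buffer without copying anything to the output. */
static void hold_kill(void)
{
    hold_len = 0;
}

/* Write generated text (such as a macro call) straight to the output. */
static void emit(const char *s)
{
    size_t n = strlen(s);
    assert(out_len + n < sizeof out);
    memcpy(out + out_len, s, n + 1);
    out_len += n;
}

/* Copy "int f(){ return 1; }" while replacing the return statement.
 * EXIT_MACRO is a hypothetical macro name, not a real Sherlock macro. */
static const char *hold_demo(void)
{
    const char *p;

    hold_len = out_len = 0;
    out[0] = '\0';

    for (p = "int f(){ "; *p; p++) hold_put(*p, 1);
    hold_put('#', 0);               /* from an include file: not held */
    hold_flush();                   /* safe: no macro is pending here */

    for (p = "return 1;"; *p; p++) hold_put(*p, 1);
    hold_kill();                    /* drop the original return statement */
    emit("EXIT_MACRO(1);");         /* write the replacement instead */

    for (p = " }"; *p; p++) hold_put(*p, 1);
    hold_flush();
    return out;
}
```

In this sketch, hold_demo() returns the string "int f(){ EXIT_MACRO(1); }". Had hold_flush() been called while "return 1;" was still in the buffer, the original statement would already have been written out and could no longer be replaced, which is the timing problem noted above.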
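
The "flow stack" kept by the sf_ routines can be pictured along the following lines. This is only a guess at the general technique, not the code in SEM.C: the names flow_push(), flow_pop(), flow_simple(), flow_return(), flow_seq() and flow_if_else(), the nesting limit, and the combining rules are all assumptions made for this example. Each entry records whether control can fall through the end of one statement, and entries are combined as the parser finishes larger statements.

```c
#include <assert.h>

#define FLOW_MAX 32                 /* assumed nesting limit for the sketch */

static int flow[FLOW_MAX];          /* 1 = control can fall off the end */
static int flow_top = -1;

static void flow_push(int can_fall)
{
    assert(flow_top + 1 < FLOW_MAX);
    flow[++flow_top] = can_fall;
}

static int flow_pop(void)
{
    assert(flow_top >= 0);
    return flow[flow_top--];
}

/* An ordinary statement such as "y = 2;" falls through. */
static void flow_simple(void) { flow_push(1); }

/* A return statement never falls through. */
static void flow_return(void) { flow_push(0); }

/* "a; b" falls through only if both a and b do. */
static void flow_seq(void)
{
    int b = flow_pop();
    int a = flow_pop();
    flow_push(a && b);
}

/* "if (e) a; else b;" falls through if either branch does. */
static void flow_if_else(void)
{
    int b = flow_pop();
    int a = flow_pop();
    flow_push(a || b);
}
```

With these rules, the body "if (x) return 1; else return 2;" leaves 0 on the stack, so no exit macro would be needed at the closing brace, while "if (x) return 1; else y = 2;" leaves 1, so an exit macro would have to be inserted at the end of the function.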