home
***
CD-ROM
|
disk
|
FTP
|
other
***
search
/
Power-Programmierung
/
CD1.mdf
/
modula2
/
alexcoco
/
readme.wrd
< prev
next >
Wrap
Text File
|
1987-07-16
|
13KB
|
309 lines
Automatic Compiler Generation with Alex and Coco
H. Mössenböck
University of Linz, Institut für Informatik
Altenbergerstr. 69, A-4040 Linz
Compiler construction is a well understood programming discipline which makes
use of various standard techniques. It is therefore possible to write
programs that apply those techniques to the automatic generation of a
compiler from a formal compiler description.
What has compiler construction to do with your own programming tasks?
Don't be confused by the term "compiler construction techniques". In practice
there are a lot of situations where the same techniques can also be applied
to problems which are not in the typical domain of compiler construction.
Consider for example a command language for an interactive user interface or
the processing of a file with variable length records each having its own
structure.
In fact every time a structured input has to be processed it is possible to
specify its syntax and its translation by means of a so called attributed
grammar. This description is nothing else than a program in a very high level
language specifically suited for problems of this kind. It can be
automatically translated into the specified "compiler" by a tool called a
compiler generator. The advantage of high level compiler descriptions
compared to compilers written in a general purpose language like Modula-2 is,
that they are usually much shorter and easier to read, because no coding
effort has to be spent for syntax analysis, error handling and attribute
propagation. These actions are included automaticall by the generator,
helping the programmer to focus his attention on the crucial parts of the
translation.
Contents of the Demo Disk
-------------------------
The demo disk contains two compiler generating tools: Alex is a scanner
generator which builds a lexical analyzer from a description of the terminal
symbols of a language. Coco is a compiler generator which builds a topdown
parser and a semantic evaluator from an attributed grammar. To give you
an impression of how Alex and Coco work we have used them to write a compiler
for a small programming language which we have called Taste (because it gives
a taste of how to generate a compiler). The disk includes the source code
of the Taste compiler. Play with it, add some new features and learn how to
construct your own compiler.
Alex and Coco come in demo versions which allow generation of compilers for
languages with at most 32 terminal symbols. This is enough to realize the
benefits of the tools over conventional hand coding. Real languages are of
much bigger size. So if you like Alex and Coco, you should order the full
version of the programs (for sfr 1200.-) from
A.+L. Meier-Vogt
Im Späten 23
CH-8906 Bonstetten / ZH
Tel.: (+41) 1/700 30 37
Further material regarding the input languages and the implementation of
Alex and Coco can be found in:
Mössenböck H.: Alex - A Simple and Efficient Scanner Generator.
SIGPLAN-Notices, Vol 21, Nr.5, May 1986.
Rechenberg P., Mössenböck H.: Ein Compiler-Generator für Mikrocomputer.
Hanser-Verlag, 1985 (in German).
Files on the Disk
-----------------
Alex
alex.EXE scanner generator Alex
alextab tables which are needed by Alex
alexframe scanner frame needed by Alex
Coco
coco.EXE compiler generator Coco
cocotab tables which are needed by Coco
cocosynf syntax analyzer frame needed by Coco
cocosemf semantic evaluator frame needed by Coco
Library
CocoAlex.DIR This M2SDS-Library contains the following files
CocoAlex.LIB in a ready-to-link form
Standard Modules
FileIO.DEF+IMP IO Module (InOut for multiple Files)
Errors.DEF+IMP standard error message module
Keywords.DEF+IMP recognizes abbreviated program parameters
Taste (an example of a generated compiler)
taste.LEX scanner description of Taste (to be processed by Alex)
taste.ATG attributed grammar of Taste (to be processed by Coco)
taste.MOD main program of the Taste compiler
tastelex.DEF+IMP scanner (generated by Alex from taste.LEX)
tastesyn.DEF+IMP parser (generated by Coco from taste.ATG)
tastesem.DEF+IMP semantic evaluator (generated by Coco from taste.ATG)
tastetab parser tables (generated by Coco from taste.ATG; they are
read by the module tastesyn at run time)
tastesym.DEF+IMP symbol list module
tastecod.DEF+IMP code generator
Test.TAS example main program written in Taste
The Taste Language
------------------
Taste is a very simple programming language which is derived from Modula-2.
It has variables of type INTEGER and BOOLEAN and non-recursive procedures
without parameters. It allows assignments, procedure calls, IF- and WHILE-
statements. Integers may be read from the keyboard and written to the screen,
each of them in a single line. It has arithmetic expressions (+,-,*,/) and
relational expressions (=,<,>). For a full grammar see the file taste.ATG.
A Taste program may look like this:
MODULE Example;
VAR n: INTEGER;
PROCEDURE SumUp; (*build the sum of all integers from 1 to n*)
VAR sum: INTEGER;
BEGIN
sum:=0;
WHILE n>0 DO sum:=sum+n; n:=n-1 END;
WRITE sum
END SumUp;
BEGIN
READ n;
WHILE n>0 DO SumUp; READ n END
END Example.
Of course Taste is too restrictive to be used as a real programming language.
It was our purpose to give just a taste of how to write a compiler with Alex
and Coco.
Modules of the Taste Compiler
-----------------------------
The Taste compiler is a compile-and-go compiler, which means that it reads a
source program and translates it into a target language which is executed
(i.e. interpreted) immediately after the compilation. The following figure
shows the structure of the compiler (arrows denote procedure call dependencies).
taste
|
...................|........
. tastesyn .
. | .
. +------------+------------+
. | | . |
. tastelex <-- tastesem ---> Errors
...................|........
+---------+---------+
| |
tastesym tastecod
The modules in the dotted rectangle are generated by Alex and Coco, the
other modules are written by hand. Errors is a standard module. The
modules have the following tasks:
taste
Main program. It reads the name of the source file from the keyboard and
calls the parser and the interpreter.
tastesyn
Parser. It is generated by Coco from the attributed grammar taste.ATG.
The parser contains a language independent error recovery mechanism
which is derived by Coco from the attributed grammar.
tastesem
Semantic evaluator. It is generated by Coco from the attributed grammar
taste.ATG. It contains all the semantic actions of the grammar. The actions
are called and executed during parsing and do the actual compilation work.
tastelex
Scanner. It is generated by Alex from the scanner description taste.LEX.
tastesym
Symbol list module. This module contains procedures to handle scopes and
to store and retrieve object information.
tastecod
Code generator. This module contains procedures to emit instructions
as well as the interpreter and its data structures
Errors
General module for error messages. It is a standard module in compilers
generated by Coco.
The Target Code
---------------
We define an abstract stack machine for the interpretation of Taste programs.
The compiler translates a source program into instructions of that machine
which are interpreted later on. The machine uses the following data
structures (code and data are filled by the compiler):
data: array of integer data memory
code: array of byte code memory
stack: array of integer expression stack of the stack machine
pc: integer program counter
The instructions have variable length. They consist of an operation code
(one byte) and of an operand if necessary (two bytes). The instructions
are described by the following table:
LOAD adr Load value
Push(data[adr]);
LIT val Load literal constant (or address)
Push(val);
STO Store
Pop(val); Pop(adr); data[adr]:=val;
ADD Add
Pop(val2); Pop(val1); Push(val1+val2);
SUB Subtract
Pop(val2); Pop(val1); Push(val1-val2);
DIV Divide
Pop(val2); Pop(val1); Push(val1/val2);
MUL Multiply
Pop(val2); Pop(val1); Push(val1*val2);
EQU Compare if equal
Pop(val2); Pop(val1); if val1=val2 then Push(1) else Push(0) end;
LSS Compare if less
Pop(val2); Pop(val1); if val1<val2 then Push(1) else Push(0) end;
GTR Compare if greater
Pop(val2); Pop(val1); if val1>val2 then Push(1) else Push(0) end;
CALL adr Call procedure
Push(pc); pc:=adr;
RET Return from procedure
Pop(pc);
JMP adr Jump
pc:=adr;
FJMP adr False jump
Pop(val); if val=1 then pc:=adr end;
HALT End of the program
Halt;
NEG Negation
Pop(val); Push(-val);
READ adr Read integer
Write("?"); Read(val); data[adr]:=val; WriteLn;
WRITE Write integer
Pop(val); Write(val); WriteLn;
How to use Alex and Coco
------------------------
If you want to build a compiler for a source language of your own,
proceed as follows:
1. Write a scanner description (i.e. the syntax of the terminal symbols).
The scanner description of Taste is on the file taste.LEX. Feed the
scanner description to Alex to get the scanner as a Modula-2 module.
2. Write an attributed grammar of the compiler's source language. An
attributed grammar consists of a context free grammar and of attributes
and semantic actions in Modula-2. Attributes may be regarded as parameters
of the terminals and nonterminals. Semantic actions may be regarded
as translation actions. They are placed at those points in the grammar
where a computation with the attributes or with arbitrary Modula-2
variables are necessary. The attributed grammar of Taste is on the file
taste.ATG. Feed the attributed grammar to Coco to get the parser and
the semantic evaluator of the compiler as Modula-2 modules. Coco also
produces grammar tables which are read by the generated compiler at run
time.
3. Write a main program. It has to initialize the compiler and to call
the parser. The main program of the Taste compiler is on the file
taste.MOD.
4. Write semantic modules as needed. These modules contain procedures that
are called in the semantic actions of the attributed grammar. For
example, the Taste compiler has a symbol list module (tastesym.IMP) and
a code generator (tastecod.IMP). A standard module which prints error
messages can be found in the file Errors.IMP.
5. Put the pieces together to get a running compiler.
Examples for the Use of Alex and Coco
-------------------------------------
At our university we have used Alex and Coco for various tasks. The following
list should give you some ideas about what might be useful applications of
the generators:
- An analyzer for the static complexity of programs. The analyzer evaluates
the kind of operators and statements, the program nesting and the use
of local and global variables to obtain a measure of the program complexity
and an indication if the program is well structured.
- A cross reference generator which lists all occurences of the objects in
a program according to their scope together with information where the
objects have been assigned a value and where they have been referenced.
- An "intelligent" pretty printer which uses the structure and the length
of statements for proper indentation.
- A program which generates an index for books and reports. The index is
generated from relations between page numbers and keywords entered in an
appropriate way.
- The front end of a syntax oriented editor. A Modula-2 program is translated
into a tree representation which is the internal data structure of the
editor.
MBlock1wor