home *** CD-ROM | disk | FTP | other *** search
- Path: sparky!uunet!europa.asd.contel.com!darwin.sura.net!wupost!uwm.edu!rutgers!faatcrl!iecc!compilers-sender
- From: sells@cs.umn.edu (Chris Sells)
- Newsgroups: comp.compilers
- Subject: Compiler Construction Using OO Techniques
- Keywords: C++, yacc, OOP
- Message-ID: <92-08-181@comp.compilers>
- Date: 31 Aug 92 02:35:41 GMT
- Sender: compilers-sender@iecc.cambridge.ma.us
- Reply-To: sells@cs.umn.edu (Chris Sells)
- Organization: Spanlink Communications
- Lines: 115
- Approved: compilers@iecc.cambridge.ma.us
-
- Compiler Construction Using OO Techniques: A Case Study
-
- I am interested in pursuing a Master's Thesis by expanding the currently
- available compiler generation tools (i.e. yacc) using object-oriented
- techniques.
-
- I've been doing research on the net w/ this topic in mind and have come up
- w/ some good references, including some requests for what I'd like to do,
- but have not come up w/ anyone who's done what I propose.
-
- The first thing I want to pursue is to generate parse classes
- automatically from a yacc grammar. For example, the following yacc code is
- present in just about every compiler known to man:
-
- prog : stmt prog {$$ = new prog($1,$2);}
- | stmt /* default {$$ = $1;} */
- ;
-
- stmt : expr '\n' {$$ = new stmt($1,$2);}
- | 'q' {$$ = new constant($1);}
- ;
-
- expr : expr '+' term {$$ = new expr($1,new constant($2),$3);}
- | expr '-' term {$$ = new expr($1,new constant($2),$3);}
- | term /* default {$$ = $1;} */
- ;
-
- term : term '*' factor {$$ = new term($1,new constant($2),$3);}
- | term '/' factor {$$ = new term($1,new constant($2),$3);}
- | factor /* default ($$ = $1;} */
- ;
-
- factor : primary '^' factor {$$ = new factor($1,new constant($2),$3);}
- | primary /* default {$$ = $1;} */
- ;
-
- primary : '(' expr ')' {$$ = new primary($1,new constant($2),$3};}
- | integer /* default {$$ = $1;} */
- ;
-
- From this, my 'generator' would read the yacc grammar and derive
- sub-classes (from a parent 'symbol') for prog, stmt, expr, term, factor,
- and primary and build the proper relationships between them to allow for
- the trivial construction of a parse tree (as shown above). All that it
- would take to generate code would be 'prog.emit()', once each object's
- 'emit' method was developed (that would still have to be done by hand).
- Likewise, to execute, 'prog.exec()' or to pretty-print,
- 'prog.pretty_print()'.
-
- The beauty of it is how well the tree-structure of OO maps to the
- tree-structure of the internal representation of a program, i.e. a
- parse-tree.
-
- My parse-tree generater would, for each non-terminal, create a sub-class
- of 'symbol' w/ the correct constructors (i.e. one for each rule). The
- instance then contains the actual arguments passed (treated as children in
- the tree). The 'emit' method would then be coded, by hand, to generate
- code for each possible rule for the class. For example,
-
- term : term '*' factor {$$ = new term($1,new constant($2),$3);}
- | term '/' factor {$$ = new term($1,new constant($2),$3);}
- | factor
-
- would generate (not worried about syntax, feel free to correct, it's
- 1:30am):
-
- class term:public Symbol {
- char *name; /* type string or constant value */
- int token; /* token value */
- SymbolList *arglist; /* list of arguments */
- public:
- int token() { return token; }
- const char* name() { return name; }
- Symbol& arg(int index) { return arglist->element(index); }
- term& emit();
- term& term(term* t,constant* op,factor* f)
- { arglist->add(t)->add(op)->add(f);name="term";token=TERM;return
- this;}
- ~term() {arglist->purge();}
- }
-
- /* knows that it's arguments are a term, an op and a factor */
- /* this code written by hand but constructor(s) and destructor auto-coded
- */
- term& term::emit()
- {
- cout << '(';
- arg(0).emit();
- cout << arg(1).type;
- arg(2).emit();
- cout << ')';
-
- return this;
- }
-
- This 'generator' would be able to handle any grammar that could be
- expressed as a yacc grammar. The actual yacc code would be almost trivial,
- I think. The compiler constructor would be able to write each 'emit'
- method and not have to deal w/ tree construction or traversal.
-
- The other areas I wish to involve involve symbols (variables, constants,
- arguments, etc) and scopes, but I have less concrete ideas about the
- benefits of OO w/ these areas. I plan on building a real-world compiler
- based on a C sub-set and to use OO when/where-ever I can.
-
- Any ideas, references, comments, criticisms, flames, encouragement, etc.
- are more than welcome. Lord knows I rather have my lame ideas corrected
- now rather than before the review board. Thanx in advance.
-
- | Chris Sells |
- | Spanlink Communications |
- | sells@cs.umn.edu |
- --
- Send compilers articles to compilers@iecc.cambridge.ma.us or
- {ima | spdcc | world}!iecc!compilers. Meta-mail to compilers-request.
-