home
***
CD-ROM
|
disk
|
FTP
|
other
***
search
/
OS/2 Shareware BBS: 10 Tools
/
10-Tools.zip
/
pccts1.zip
/
UPDT110.TXT
< prev
Wrap
Text File
|
1993-09-08
|
63KB
|
2,109 lines
The Purdue Compiler Construction Tool Set
Version 1.10 Release Notes
ANTLR and DLG
August 31, 1993
Terence Parr
Army High Performance Computing Research Center,
University of MN
(parrt@acm.org)
Will Cohen and Hank Dietz
School of Electrical Engineering
Purdue University
(cohenw@ecn.purdue.edu)
(hankd@ecn.purdue.edu)
This is the TEXT version, the original was in PostScript.
Greek letters have been replaced with uppercase Arabic.
Subscripts have been encoded as _i postfixes.
Sorry for any errors. DLH - 930908
1. Introduction
This document describes the changes/enhancements in PCCTS since
version 1.06. As with the 1.06 release notes, these notes do not con-
stitute the complete reference manual. Unfortunately, the reference
manual is in the same condition as it was for the 1.00 release in the
Spring of 1992. We are working on a total rewrite of the manual,
which might end up in a book consisting of the theory behind practical
k-token lookahead for k>1 (Terence Parr's Ph.D. thesis), ANTLR imple-
mentation notes, and the PCCTS user's manual.
The 1.10 release of PCCTS has four main enhancements: fully
implemented semantic predicates (<<...>>?), infinite lookahead (plus
selective backtracking that uses it), increased ANTLR (see -ck and
ZZUSE_MACROS sections) and DLG speed, and the ability to link multiple
ANTLR parsers together. A number of bug fixes have been incorporated
as well. The tutorials have not been updated much for this release.
To better support our user's, we have established a mailing list
called pccts-users that you can subscribe to by sending email to
pccts-users-request@arc.umn.edu with a body of "subscribe pccts-users
your-name". Users that have registered with the PCCTS mail server at
pccts@ecn.purdue.edu have not been automatically subscribed. Once you
have subscribed, posting a message to the PCCTS community is as simple
as sending email to pccts-users@ahpcrc.umn.edu (with any Subject: and
body). You can also send a body of help to pccts-users-
request@ahpcrc.umn.edu to get help on using the mailing list server.
We have finally agreed on a numbering scheme for PCCTS releases:
x.yz where x reflects the major release number (new tool additions), y
reflects major new feature additions, and z reflects bug fixes and
minor feature additions (minor releases).
Page 1
PCCTS
2. Semantic Predicates
The fundamental idea behind semantic predicates has not changed
since the 1.06 release - semantic predicates indicate the semantic
validity of continuing with the parse or of predicting a particular
production. However, we now collect all predicates visible to a syn-
tactically ambiguous parsing decision rather than just the first one
encountered as in 1.06. In addition, the context of the predicate can
be computed and hoisted with the predicates; helpful warning are also
generated for incompletely disambiguated parser decisions. The only
backward incompatibilities are that parsing does not halt automati-
cally if a semantic validation predicate fails and the -pr is
obsolete - the specification of a predicate implies that it may be
used by ANTLR to validate and disambiguate as it sees fit. In this
section we discuss all of these issues.
2.1. Visible Semantic Predicates
Given a syntactically ambiguous parser decision (or, more accu-
rately, a non-deterministic decision), ANTLR attempts to resolve it
with semantic information - ANTLR searches for visible predicates. A
visible predicate is a semantic predicate that could be evaluated
without consuming an input token or executing a user action (except
initialization actions, which generally define variables). All visi-
ble predicates reside on the left edge of productions; predicates not
on the left edge can only function as validation predicates (see 1.06
release notes). Consider a simple example:
a : <<pred_1>>? ID A
| <<pred_2>>? ID B
;
Assuming that lookahead information is insufficient to predict produc-
tions of rule a, ANTLR would incorporate pred_1 into the prediction
expression for the first production and pred_2 into the second predic-
tion expression. Multiple predicates can be hoisted (which may or may
not be what you want):
decl : <<pred_1>>? var
| <<pred_2>>? ID
;
var : <<is_var(LATEXT(1))>>? ID
;
In this case, the prediction expression for production one of rule
decl would resemble (with context computation turned off - see below):
Page 2
PCCTS
if ( pred_1 && is_var(LATEXT(1)) && LA(1)==ID ) {
var();
}
...
Here, two visible predicates were found to disambiguate the prediction
of the first production of rule decl whereas only one was found for
the prediction of the second production.
The action of evaluating a predicate in a decision is called
hoisting. In the first example of this section, predicates were not
moved since they reside at the decision point in rule a, but techni-
cally we say that they were hoisted into the decision. In the second
example, pred_1 was hoisted from decl and is_var(LATEXT(1)) was
hoisted from rule var1 to predict the first production of decl.
2.2. Context of Semantic Predicates
In release 1.06, predicates were hoisted without computing and
hoisting the context of that predicate. Context is important because,
as we saw in the last section, predicates may be evaluated in totally
different rules. Imagine a rule that had many alternative produc-
tions, two of which were syntactically nondeterministic because of a
common lookahead of ID (assuming that only one symbol of lookahead is
available for simplicity).
a : A ...
| B ...
| classname
| C ...
| varname
| D ...
;
classname
: <<pred_1>>? ID
;
varname
: <<pred_2>>? ID
;
Simply incorporating pred_i into the production prediction expressions
for alternatives three and five is not safe for two reasons:
[1] Evaluation of pred_i may cause a program execution error if
evaluated on the wrong type of data. pred_i will be evaluated on
any input, which is {A, B, C, D, ID} in our case - pred_i may
"core" if fed non-ID token types.
[2] pred_i may give misleading results even if it does not "core".
pred_i may return false even though the production is not
Page 3
PCCTS
dependent on the predicate for that token type. For example:
a : (var | NUM) ...
| <<!is_var(LATEXT(1))>>? ID ...
;
var : <<is_var(LATEXT(1))>>? ID
;
The first production will never match a NUM because
is_var(LATEXT(1)) will always evaluate to false for that token
type since numbers are not variables ever (the predicate in var
is hoisted for use in the decision for rule a).
The way to solve both problems is to change pred_i to:
LA(1)==ID ? pred_i : 1
The 1 merely indicates that if the lookahead is not an ID then enable
the production for normal parsing - we have no semantic information
that establishes the validity or invalidity of the production.
Context computation similar to this is can now be done automati-
cally (-prc on). Unfortunately, as mentioned previously in this docu-
ment, computing full LL(k) lookahead is, in general, an exponential
problem; hence, for large grammars you may want to keep this off with
-prc off (default) and include context tests manually in your predi-
cates. The old -pr option to turn on parsing with predicates is now
ignored as the specification of a predicate indicates that it should
be used.
ANTLR does its best to warn the user when a possibly incompletely
disambiguated grammar has been specified. In other words, when a syn-
tactically ambiguous decision is resolved with semantic predicates,
all mutually ambiguous productions must have at least one semantic
predicate associated with it. For example:
a : <<pred>>? ID ...
| ID ...
;
This grammar will yield a warning when run through ANTLR with -w2 set
because semantic information was not provided to indicate the validity
of the second production:
t.g, line 1: warning: alt 2 of the rule itself has no predicate to resolve ambiguity
Page 4
PCCTS
However, rule a will behave correctly because if pred fails, the
second production will be attempted as the default case (remember that
a missing semantic predicate is equivalent to <<1>>?). Adding a third
production that began with ID would not behave correctly as the last
ID-prefixed production would never be matched.
As a more complicated example, consider the following incorrectly
disambiguated grammar:
a : b
| <<pred_1>>? ID
| <<pred_2>>? NUM
;
b : <<pred_3>>? ID
| NUM
;
Rule a cannot predict which production to match upon lookahead ID or
NUM. Alternatives 2 and 3 have been disambiguated, but the first pro-
duction hoists a predicate that only "covers" ID's. As a result, the
following message is generated by ANTLR:
t.g, line 1: warning: alt 1 of the rule itself has no predicate to resolve ambiguity upon { NUM }
This detection is a great help during grammar development.
Ambiguity warnings are now turned off for decisions that have
semantic predicates covering all ambiguous lookahead sequences.
2.3. Failure of predicates
Predicates that are not used in disambiguating parsing decisions
are called validation predicates. Previously, validation predicates
that failed during parsing printed out a message and terminated the
parser:
if (!pred) {fprintf(stderr,"failed predicate: 'pred'\n);exit(1);}
The latest release of ANTLR generates a call to a macro that the user
may define called zzfailed_pred(), which is passed a string represent-
ing the predicate that failed:
if (!pred) {zzfailed_pred("pred");}
while this solution is not ideal, it is much better than before.
Page 5
PCCTS
3. Infinite lookahead and Backtracking
There are a number of grammatical constructs that normal LL(k)
recursive-descent parsing cannot handle. The most obvious example
would be left-recursion, but left-recursion can be removed by well-
known algorithms. The nastiest grammar construct is one in which two
alternative productions cannot be distinguished without examining all
or most of the production. While left-factoring can handle many of
these cases, some cannot be handled due to things like action place-
ment, non-identical left-factors, or alternatives productions that
cannot be reorganized into the same rule. The solution to the
arbitrarily-large common left-factor problem is simply to use arbi-
trary lookahead; i.e., as much lookahead as necessary to uniquely
determine which production to apply.
ANTLR 1.10 provides two mechanisms for using "infinite" amounts
of lookahead. The first is to use semantic predicates in conjunction
with a user-defined function that scans arbitrarily ahead using a set
of macros provided in this release. The second is a more implicit
scheme by which the user can annotate those sections of the grammar,
which defy normal LL(k) analysis, with syntactic predicates. ANTLR
will then generate code that simply tries out the indicated alterna-
tive production to see if it would match a portion of the remaining
input. If not, the generated parser would try the next viable alter-
native production. This scheme is a form of selective backtracking
(and, hence, can recognize the class of context free languages) where
most of a parser is deterministic and only the "hard" parts are done
using trial-and-error. As a direct consequence, ANTLR can now gen-
erate parsers with the semantic flexibility of LL(k), that are
stronger than full LR(k) (in theory), and are nearly linear in com-
plexity; note that the semantic predicates (first introduced in the
1.06 release) can take ANTLR-generated parsers beyond the context-free
language limit into the context-sensitive.
We begin this section by introducing the notion of infinite loo-
kahead through an example problem that we solve with semantic and then
with syntactic predicates. Following this, we describe in detail the
syntax and use of syntactic predicates, which employ infinite looka-
head to perform selective backtracking.
3.1. Examples
This section presents a simple grammar whose productions have
common left-factors that we assume, for the sake of demonstration pur-
poses, to be non left-factorable. With nothing but the grammar, ANTLR
would be unable to construct a deterministic parser. We first provide
a solution by writing a function that explicitly accesses the infinite
lookahead buffer to determine which production should be attempted.
This solution is efficient, but would become somewhat tedious for the
programmer if it had to be done for each such problem in a large gram-
mar. Fortunately, an easier and more concise solution is provided by
syntactic predicates, which we also demonstrate using the same gram-
mar.
Page 6
PCCTS
Consider ML which has multiple assignment and list statements.
E.g.,
stat: list Assign list ";" <<printf("list = list\n");>>
| list ";" <<printf("list\n");>>
;
This grammar is not LL(k) for any k as list can be arbitrarily long.
The following grammar using semantic predicates to access the infinite
lookahead buffer to explicitly compute which production will be
matched.
Page 7
PCCTS
/* example use of the infinite lookahead buffer macros
* compile with:
* antlr list.g
* dlg parser.dlg scan.c
* cc -Iantlr_includes -DZZINF_LOOK -o list list.c scan.c err.c
*/
#header <<#include "charbuf.h">>
<<
main() { ANTLR(stat(), stdin); }
/* Scan for a "=", but only before a ";" -- return 1 if found, else 0
This performs the same function as using the syntactic predicate:
(list Assign list ";")?
but uses a semantic predicate coupled with the infinite-lookahead feature.
It is somewhat faster as it does not actually *parse* the "list =", it just
scans ahead.
MUST HAVE "ZZINF_LOOK" PREPROCESSOR FLAG DEFINED
(in #header or on compiler command line)
*/
which()
{
int i;
for (i=1; ZZINF_LA_VALID(i); i++)
{
if ( ZZINF_LA(i) == Assign ) return 1;
else if ( ZZINF_LA(i) == Semi ) return 0;
}
return 0;
}
>>
#token "[\ \t]+" <<zzskip();>>
#token "\n" <<zzskip(); zzline++;>>
#token Assign "="
#token Semi ";"
stat: <<which()>>? list Assign list ";" <<printf("list = list\n");>>
| list ";" <<printf("list\n");>>
;
list: "\(" elem ("," elem)* "\)"
;
elem: ID
| INT
;
#token ID "[a-zA-Z]+"
#token INT "[0-9]+"
Page 8
PCCTS
The infinite lookahead buffer may be accessed with the following mac-
ros:
ZZINF_LA(i)
Return the ith token of lookahead relative to the current posi-
tion. Hence, ZZINF_LA(1)..ZZINF_LA(k) are equivalent to
LA(1)..LA(k). The difference is that i can range from the
current token of lookahead until the last token of lookahead with
ZZINF_LA(i).
ZZINF_LATEXT(i)
Identical to ZZINF_LA(i) except that the text of the ith token is
returned.
ZZINF_LA_VALID(i)
Returns 1 if i if at least i non-EOF tokens are left in the input
stream else it returns 0.
Naturally, the use of infinite lookahead by defining ZZINF_LOOK is
inconsistent with interactive parsers as the entire input stream is
read in before parsing begins.
As mentioned above, this method could be tedious for large gram-
mars, hence, ANTLR provides a more elegant solution. The same problem
can be solved with a syntactic predicate by changing rule stat in the
following way:
stat: ( list Assign list ";" )? <<printf("list = list\n");>>
| list ";" <<printf("list\n");>>
;
Using this implicit method, the need for the semantic predicate and
the which() function disappears.
Let's now consider a small chunk of the vast C++ declaration syn-
tax. Can you tell exactly what type of object f is after having seen
the left parenthesis?
int f(
The answer is "no.". Object f could be an integer initialized to some
previously defined symbol a:
int f(a);
or a function prototype or definition:
Page 9
PCCTS
int f(float a) {...}
The following is a greatly simplified grammar for these two declara-
tion types:
decl : type ID "\(" expr_list "\)" ";"
| type ID "\(" arg_decl_list "\)" func_def
;
One notices that left-factoring type ID "\(" would be trivial because
our grammar is so small and the left-prefixes are identical. However,
if a user action were required before recognition of the reference to
rule type, left-factoring would not be possible:
decl : <</* dummy init action so next action is not taken as init */>>
<<printf("var init\n");>> type ID "\(" expr_list "\)" ";"
|
<<printf("func def\n");>> type ID "\(" arg_decl_list "\)" func_def
;
The solution to the problem involves looking arbitrarily ahead (type
could be arbitrarily big, in general) to determine what appears after
the left-parenthesis. This problem is easily solved implicitly by
using the new (...)? syntactic predicate:
decl :
( <<;>> <<printf("var init\n");>> type ID "\(" expr_list "\)" ";" )?
|
<<printf("func def\n");>> type ID "\(" arg_decl_list "\)" func_def
;
The (...)? says that it is impossible to decide, from the left edge of
rule decl with a finite amount of lookahead, which production to
predict. Any grammar construct inside a (...)? block is attempted
and, if it fails, the next alternative production that could match the
input is attempted. This represents selective backtracking and is
similar to allowing ANTLR parsers to guess without being "penalized"
for being wrong. Note that the first action of any block is con-
sidered an init action and, hence, cannot be disabled (by placing it
inside {...}) since it may define variables; the first action of the
block is a dummy action.
At this point, some readers may argue that scanning ahead arbi-
trarily far, using the infinite lookahead via a semantic or syntactic
predicate, renders the parser non-linear in nature. While this is
true, the slowdown is negligible as the parser is mostly linear.
Further, it is better to have a capability that is slightly ineffi-
cient than not to have the capability at all.
Page 10
PCCTS
3.2. Syntactic Predicates
Just as semantic predicates indicate when a production is valid,
syntactic predicates also indicate when a production is a candidate
for recognition. The difference lies in the type of information used
to predict alternative productions. Semantic predicates employ infor-
mation about the "meaning" of the input (e.g., symbol table informa-
tion) whereas syntactic predicates employ structural information like
normal LL(k) parsing decisions. Syntactic predicates specify a gram-
matical construct that must be seen on the input stream for a produc-
tion to be valid. Moreover, this construct may match input streams
that are arbitrarily long; normal LL(k) parsers are restricted to
using the next k symbols of lookahead. This section describes the
form and use of syntactic predicates as well as their implementation.
3.2.1. Syntactic Predicate Form
Syntactic predictions have the form
( A )? B
or, the shorthand form
( A )?
which is identical to
( A )? A
where A and B are arbitrary Extended BNF (EBNF) grammar fragments that
do not define new nonterminals. The notation is similar to the (A)*
and (A)+ closure blocks already present in PCCTS. The meaning of the
long form syntactic predicate is: "If A is matched on the input
stream, attempt to recognize B." Note the similarity to the semantic
predicate:
<<A>>? B
which means: "If A evaluates to true at parser run-time, attempt to
match B."
Decisions, which are nondeterministic (non-LL(k) for finite k),
are resolved via (..)? in the following manner:
Page 11
PCCTS
a : Y_1
| Y_2
...
| ( A_i )? Y_i
...
| Y_j
...
| Y_n
;
where productions i and j are mutually nondistinguishable from the
left-edge. If production i fails, production j will be attempted.
Typically, the number of syntactic predicates employed is n-1 where n
is the number of mutually nondeterministic productions in a decision;
the last production is attempted by default.
When a production to be predicted must be predicted with itself
(nothing less sophisticated is sufficient) or when efficiency is not a
major concern, the short form is used:
a : Y_1
| Y_2
...
| ( Y_i )?
...
| Y_j
...
| Y_n
;
3.2.2. Modified LL(k) Parsing Scheme
Decisions that are not augmented with syntactic predicates are
parsed deterministically with finite lookahead up to depth k as is
normal for PCCTS-generated parsers. When at least one syntactic
predicate is present in a decision, rule recognition proceeds as fol-
lows:
[1] Find the first viable production; i.e. the first production in
the alternative list predicted by the current finite lookahead,
according to the associated finite-lookahead prediction-
expression.
[2] If the first element in that production is not a syntactic predi-
cate, predict that production and go to [4] else attempt to match
its predicting grammar fragment.
[3] If the grammar fragment is matched, predict the associated pro-
duction and go to [4] else find the next viable production and go
Page 12
PCCTS
to [2].
[4] Proceed with the normal recognition of the production predicted
in [2] or [3].
For successful predicates, both the predicting grammar fragment and
the remainder of the production are actually matched, hence, the short
form, (A)?, actually matches A twice - once to predict and once to
apply A normally.
3.2.3. Syntactic Predicate Placement
Syntactic predicates may only appear as the first element of a
production because that is the only place decisions are required. For
example, the (..)? block in the first production of following grammar
has little utility.
a : Y_1 ( A )? B
| Y_2
...
| Y_n
;
There is no question that B is to be matched after \ and trying to
predict this situation is redundant. 1
Syntactic predicates may appear on the left edge of any produc-
tion within any subrule, not just in productions at the rule block
level.
3.2.4. Nested Syntactic Predicate Invocation
Because syntactic predicates may reference any defined nontermi-
nal and because of the recursive nature of grammars, it is possible
for the parser to return to a point in the grammar which had already
requested backtracking. This nested invocation poses no problem from
a theoretical point of view, but can cause unexpected parsing delays
in practice.
3.2.5. Grammar Fragments within Syntactic Predicates
The grammar fragments within (A)? may be any valid PCCTS produc-
tion right-hand-side; i.e. any expression except new nonterminal
definitions. A may contain semantic actions and semantic predicates,
although only the semantic predicates will be executed during predic-
tion.
3.2.6. Efficiency
In terms of efficiency, the order of alternative productions in a
decision is significant. Productions in a PCCTS grammar are always
Page 13
PCCTS
attempted in the order specified. For example, the parsing strategy
outline above indicates that the following rule is most efficient when
\ is less complex than \ .
1 2
a : ( Y_1 )?
| Y_2
;
Any parsing decisions made inside a (..)? block are made deter-
ministically unless they themselves are prefixed with syntactic predi-
cates. For example,
a : ( (A)+ X | (B)+ X )?
| (A)* Y
;
specifies that the parser should attempt to match the nonpredicated
subrule
( (A)+ X
| (B)+ X
)
using normal the normal finite-lookahead parsing strategy. If a sen-
tence recognizable by this grammar fragment is found on the input
stream, then restore the state of the parser to what it was before the
predicate invocation and parse the grammar fragment again; else, if
the attempt failed, apply the next production in the outer block:
(A)* Y
3.2.7. Resolving Ambiguous C++ Statements
Quoting from Ellis and Stroustrup ["The Annotated C++ Reference
Manual," Margaret A. Ellis and Bjarne Stroustrup, Addison Wesley Pub-
lishing Company; Reading, Massachusetts; 1990],
"There is an ambiguity in the grammar involving expression-
statements and declarations... The general cases cannot be
resolved without backtracking... In particular, the looka-
head needed to disambiguate this case is not limited."
The authors use the following examples to make their point:
Page 14
PCCTS
T(*a)->m=7; // expression-statement
T(*a)(int); // declaration
Clearly, the two types of statements are not distinguishable from the
left as an arbitrary amount of symbols may be seen before a decision
can be made; here, the -> symbol is the first clue that the first
example is a statement. Quoting Ellis and Stroustrup further,
"In a parser with backtracking the disambiguating rule can
be stated very simply:
[1] If it looks like a declaration, it is; otherwise
[2] if it looks like an expression, it is; otherwise
[3] it is a syntax error."
The solution in PCCTS using syntactic predicates is simply:
stat: (declaration)?
| expression
;
The semantics of rule stat are exactly that of the quoted solution.
The production declaration will, however, be recognized twice upon a
valid declaration and once upon an expression to decide that it is not
a declaration.
3.2.8. Revisiting the ML Example
To illustrate the utility of the full form of syntactic predi-
cates, reconsider the grammar for the ML-style statements provided in
the example section above:
stat: list "=" list ";"
| list ";"
;
Rule stat is not LL because list could be arbitrarily long and, hence,
predicting which production to apply beforehand is impossible with a
finite lookahead depth. There are two solutions in using syntactic
predicates, one more efficient than the other. The first method is,
as before, to specify:
stat: (list "=" list ";")?
| list ";"
;
However, this specification unnecessarily matches the list following
Page 15
PCCTS
the assignment operator twice. A more efficient, but functionally
equivalent, specification is as follows:
stat: (list "=")? list "=" list ";"
| list ";"
;
This description indicates that, as soon as the "=" has been seen, the
first production is uniquely predicted.
3.2.9. Syntactic Predicates Effect on Grammar Analysis
ANTLR still constructs normal LL(k) decisions throughout predi-
cated parsers. Only when necessary are arbitrary lookahead predictors
used. Constructing LL(k) parsers is an exponential problem that ANTLR
goes to great lengths to avoid or reduce in size on average. Unfor-
tunately, for large grammars and k values of more than 2 or 3 ANTLR
can take an impractical amount of time. Part of the benefit of (..)?
blocks is that, by definition, they defy LL(k) analysis. Hence, the
exponential, full LL(k) grammar analysis is turned off for any produc-
tion beginning with a syntactic predicate. In its place, a linear
approximation to LL(k) analysis, called LL'(k), is used. This reduces
the number of times that arbitrary lookahead (..)? blocks are
attempted unnecessarily, though no finite lookahead decision is actu-
ally required as the arbitrary lookahead mechanism will accurately
predict the production.
If the current finite lookahead can predict which production to
apply, syntactic predicates are not evaluated. For example, referring
to the C++ declaration versus expression grammar example above, if the
current input token were 42, rule stat would immediately attempt the
second production - expression. On the other hand, if the current
input token were ID, then the declaration rule would be attempted
before attempting expression. If neither productions successfully
match the input, a syntax occurs.
When constructing finite lookahead sets, the grammar fragment
within the (..)? block is ignored. In other words, FIRST ((A)? B) is
FIRST (B). k
k
3.2.10. The Effect of Nondeterminism upon Translation and Semantic
Predicates
Syntactic predicates are, by definition, not guaranteed to match
the current input. Therefore, actions with side-effects, for which no
"undo" exists, cannot be executed during nondeterministic syntactic
prediction ("guess" mode). This section describes how ANTLR handles
the execution of user-supplied actions and semantic predicates.
3.2.10.1. The Effect upon User Actions
PCCTS language specifications do not allow the execution of any
Page 16
PCCTS
semantic action during a syntactic prediction as no undo mechanism
exists; this conservative scheme avoids affecting the parser state in
an irreversible manner. The only exception to this rule is that ini-
tialization actions, which usually define variables visible to the
entire rule/function, are not enclosed in if {..} statements to "gate"
them out; hence, initialization actions with side effects must be
avoided by the PCCTS user.
3.2.10.2. The Effect upon Semantic Predicates
Semantic predicates are always evaluated because they are res-
tricted to side-effect-free expressions. During arbitrary lookahead
prediction, the semantic predicates that are evaluated must be func-
tions of values computed when actions were turned on. For example, if
your grammar has a predicate that examines the symbol table, all sym-
bols needed to direct the parse during prediction must be entered into
the table before prediction has begun. Consider the following grammar
fragment which recognizes simplified C declarations.
decl: "typedef" type declarator ";" /* define new type */
| (type declarator "\{")? type declarator func_body /* define function */
| type declarators ";" /* def/decl var(s) */
;
type: built_in_type
| <<is_type(LATEXT(1))>>? ID
;
declarator
: ...
/* recognizes a declarator such as "array[3]" */
/* add symbols, both types and vars, to the symbol table */
;
This rule is unnecessarily inefficient, but will illustrate the
evaluation of semantic predicates during nondeterministic prediction.
For the purposes of our discussion, we restrict new types to be intro-
duced using a typedef (structures and unions are not allowed). Con-
sider the recognition of the two sentences:
typedef int My_int;
My_int i;
The first production of rule decl will match the first sentence,
adding My_int to the symbol table as a type name. Production two of
decl attempts to match the second sentence with its syntactic predi-
cate. Rule type is entered, which evaluates is_type(LATEXT(1)) (where
is_type() is some user-defined function that looks up its symbol argu-
ment in the symbol table and returns true if that symbol is defined
and is a type). Because the text of the current token of lookahead,
Page 17
PCCTS
My_int, is a valid type, the predicate evaluates to true. Production
two of type is applicable semantically and is, therefore, applied.
After consuming My_int, the parser successfully applies declarator to
i. The next input token is ; which does not match . The nondeter-
ministic prediction fails and production three is predicted by default
and is applied.
The second production of rule decl could not be rewritten as
( type declarator func_body )? /* define function */
because, presumably, a func_body could define new types. The actions
that add these new types to the symbol table would not be executed,
however, as the parser would be in nondeterministic mode. Although
the semantic predicates would be evaluated correctly, the symbol table
would not hold the information necessary to parse the function body
during nondeterministic prediction. Also, this revision is very inef-
ficient as it would match the entire function, which could be large,
twice.
3.2.11. Comparing the Use of Semantic and Syntactic Predicates
Language constructs exists that are totally ambiguous syntacti-
cally, but easily distinguishable semantically. For example, array
references and function calls in Fortran are identical syntactically,
but very different semantically. The associated grammatical descrip-
tion is non-LL(k), non-LALR(k), and non-context-free; not even back-
tracking or infinite lookahead will help this problem.
expratom: ID "\(" expr_list "\)"
| ID "\(" expr_list "\)"
...
;
where expr_list is some rule matching a comma-separated expression
list. Putting (..)? around the first alternative production will not
change the fact that both productions match the same sentence. How-
ever, semantic predicates may be used to semantically disambiguate the
rule:
expratom: <<isVar(LATEXT(1))>>? ID "\(" expr_list "\)"
| <<isFunc(LATEXT(1))>>? ID "\(" expr_list "\)"
...
;
Page 18
PCCTS
3.2.12. Implementation
The discussion thus far has described the functionality of syn-
tactic predicates, but their implementation is an equally important
topic so that users can understand the new ANTLR parsing mechanism
(e.g., so that users can follow along in a debugger while tracking
down bugs in their grammar).
Because productions are assumed to be attempted in the order
specified, a nested if-then-else structure is generated. To illus-
trate the integration of syntactic predicates into the normal ANTLR
code generation scheme, consider the following abstract grammar.
a : Y_1
...
| ( A_i )? Y_i
...
| ( A_j )? Y_j
...
| Y_m
;
ANTLR generates the following, "slightly sanitized", code:
Page 19
PCCTS
a()
{
zzGUESS_BLOCK
if ( (T_1 , ..., T_k ) member LOOK_k (Y_1 ) ) {
match Y_1 ;
}
else {
zzGUESS
if ( !zzrv && (T_1 , ..., T_k ) member LOOK_k (Y_i ) ) {
match A_i ;
zzGUESS_DONE
match Y_i ;
}
else {
if ( zzguessing ) zzGUESS_DONE;
zzGUESS
if ( !zzrv && (T_1 , ..., T_k ) member LOOK_k (Y_j ) ) {
match A_j ;
zzGUESS_DONE
match Y_j ;
}
else {
if ( zzguessing ) zzGUESS_DONE;
if ( (T_1 , ..., T_k ) member LOOK_k (Y_m ) ) {
match Y_m ;
}
else goto fail;
}
}
}
return;
fail:
if ( zzguessing ) {zzGUESS_FAIL;}
gen syntax error message;
resynch parser;
}
where LOOK_k(Y_i) is the set of lookahead k-tuples that predict Y_i.
This notation is used as a convenience here whereas ANTLR generates
decisions that use as little lookahead as possible in practice.
The macros/variables themselves are defined as follows:
zzGUESS_BLOCK
Define a block of memory to hold the current parser state and the
return value of setjmp(), zzrv.
zzGUESS
Save the current parser state, turn on guessing mode and call
setjmp() to record the current C run-time stack state. The
result of setjmp() is placed into zzrv.
Page 20
PCCTS
zzGUESS_FAIL
Long jump - restore the C run-time stack to the state it held
before guessing began.
zzGUESS_DONE
Restore the parser state to the previously saved contents.
zzguessing
This variable is 1 if a prediction is currently underway and 0
when normal parsing is proceeding. User actions are turned off
when this variable is 1.
zzrv
This variable is the result of doing the setjmp() call, which
returns 0 always. When a longjmp() occurs, the C run-time stack
will be reset to the state held at the time of the setjmp() and
zzrv will be set to a nonzero value. In the view of the C pro-
gram, it will appear as if the setjmp() has returned without ever
having attempted the code in the if following it; execution con-
tinues past the if the second time.
All semantic actions except initialization actions are enclosed in
zzNON_GUESS_MODE {
user-defined-semantic-action;
}
so that they can be "turned off" during a prediction.
zzNON_GUESS_MODE is defined as follows:
if ( !zzguessing )
The effect of this type of code generation is that a stack of parser
states is maintained such that nested nondeterministic predictions can
be made.
As an optimization, when the prediction grammar fragment for a
production is regular, simpler recognition schemes could be used.
4. DLG Enhancements
There have been a number of changes to dlg from 1.06 to 1.10.
The main difference is that DLG execution speed is up to 7 times fas-
ter than the 1.06 version. A -Wambiguity option has been added to
indicate where ambiguities in DLG specifications exists. It numbers
the expressions and prints out for an accept state the possible
expressions that could be recognized. Also, a macro called ANTLRs()
has been added that behaves just like ANTLR() except that it accepts
input from a string rather than a stream:
Page 21
PCCTS
#define ANTLRs(rule, string) {...}
5. Linear-Approximation Lookahead
ANTLR-generated parsers predict which rule alternative to match
by examining up to k symbols of lookahead. Unfortunately, computing
(during ANTLR grammar analysis) and examining (during parser execu-
tion) the set of possible k-sequences is an exponentially large prob-
lem. A linear approximation to this full lookahead exists that
requires linear time to compute and to test; further, this approxima-
tion handles the majority of parsing lookahead decisions. To avoid
the, possibly exponential, computation of full lookahead, ANTLR
attempts to use the linear approximation first - computing full looka-
head as a last resort. The reason that ANTLR occasionally goes "off
the deep end" when analyzing some big grammars is that ANTLR found a
parsing decision that could not be solved with the approximate looka-
head and required exponential time to compute the full lookahead.
Because the approximation has linear time and space complexity,
its lookahead depth can be much deeper than that of the full looka-
head. Consequently, the approximate lookahead is sometimes stronger
than the full lookahead because it can look farther ahead without con-
suming an impractical amount of system resources. We have added an
ANTLR option, called -ck n, that allows the user to specify how deep
the linear approximation analysis should go before giving up and try-
ing the full lookahead computation. This new feature is best
described with an example:
a : (A B|C D) E
| A D F
;
The full LL(2) lookahead (as would be computed by "antlr -k 2 ...") is
summarized in the following table
LL(2)
________________________
|Lookahead | Alternative |
|__________|_____________|
| A B | 1 |
| C D | 1 |
| A D | 2 |
|__________|_____________|
whereas the linear approximate lookahead, denoted LL'(2) (as would be
computed by "antlr -ck 2 ..."), is
Page 22
PCCTS
LL'(2)
__________________________
| Lookahead | Alternative |
|____________|_____________|
|{A,C} {B,D} | 1 |
| {A} {D} | 2 |
|____________|_____________|
where lookahead {A,C} {B,D} implies that the first symbol of lookahead
can be either A or C and the second can be either B or D; this looka-
head therefore matches the set of sequences {A B, A D, C B, C D},
which is like the cross product of the sets (note the loss of sequence
information, which results in the approximation). The decision is
LL(2), but is not LL'(2) because the sequence A D predicts both alter-
natives (i.e., A can be seen first by both and D can be seen second).
However, if ANTLR were allowed to look 3 symbols ahead - LL'(3) - the
linear approximation would be sufficient and the complex LL(3) would
not be computed. The LL'(3) information ("antlr -ck 3") is summarized
in the following table:
LL'(3)
______________________________
| Lookahead | Alternative |
|________________|_____________|
|{A,C} {B,D} {E} | 1 |
| {A} {D} {F} | 2 |
|________________|_____________|
Notice that, now, the third symbol of lookahead alone can uniquely
identify which alternative to predict.
Let's augment our example to have one LL(2) decision and one
LL'(3) decision:
a : (A B | C D) E /* LL'(3) */
| A D F b
;
b : (A B | C D) Z /* LL(2) */
| A D Z
;
Although LL(3) ("antlr -k 3 ...") handles both the LL'(3) and LL(2)
decisions, we can make ANTLR and the resulting parser more efficient
by specifying "antlr -ck 3 -k 2 ...". The resulting parser decisions
are illustrated in the following (sanitized) code fragment:
Page 23
PCCTS
a()
{
/* there are no sequence comparisons for rule 'a' because LL'(3)
* is sufficient and full LL(3) analysis is not invoked
*/
if ( LA(1)member{A,C} && LA(2)member{B,D} && LA(3)==E ) {
if ( (LA(1)==A) ) {
zzmatch(A); zzCONSUME;
zzmatch(B); zzCONSUME;
}
else if ( (LA(1)==C) ) {
zzmatch(C); zzCONSUME;
zzmatch(D); zzCONSUME;
}
zzmatch(E); zzCONSUME;
}
else if ( (LA(1)==A) && (LA(2)==D) && (LA(3)==F) ) {
zzmatch(A); zzCONSUME;
zzmatch(D); zzCONSUME;
zzmatch(F); zzCONSUME;
b();
}
}
b()
{
/* LL(2) decision */
if ( (LA(1)==A && LA(2)==B) || (LA(1)==C && LA(2)==D) ) {
if ( (LA(1)==A) ) {
zzmatch(A); zzCONSUME;
zzmatch(B); zzCONSUME;
}
else if ( (LA(1)==C) ) {
zzmatch(C); zzCONSUME;
zzmatch(D); zzCONSUME;
}
}
else if ( LA(1)==A && LA(2)==D ) {
zzmatch(A); zzCONSUME;
zzmatch(D); zzCONSUME;
zzmatch(Z); zzCONSUME;
}
}
These examples are small and, hence, the savings are not
apparent, but the "compressed" approximation to full lookahead can be
used to reduce the ANTLR execution time and resulting parser
speed/size.
Page 24
PCCTS
6. Faster Compilation of ANTLR-Generated Parsers
Previous versions of ANTLR used macros rather than function calls
for many operations during parsing. Because the macros were invoked
numerous times, compilation of these files was slow and generated
large object files. The operations are now, by default, function
calls which makes compilation about 2 times as fast and results in
object files about half as large. The macros can be used if necessary
by defining ZZUSE_MACROS on the compile line (-DZZUSE_MACROS).
7. Linking Together Multiple ANTLR Parsers
Because of the lack of sophisticated "information hiding" in C,
many ANTLR program symbols are globally visible and, hence, linking
multiple ANTLR-generated parsers together would cause many name colli-
sions. To overcome this, we have introduced a new ANTLR directive:
#parser "my_parser_name"
which prefixes all global, externally visible symbols with prefix
my_parser_name_ (remember this when debugging ANTLR parsers). This,
clearly, renders the previous "generate prefix" option (-gp) obsolete.
Variables, functions and rule names are remapped through the inclusion
of a file called remap.h, which is automatically generated by ANTLR
when it encounters a #parser directive. In the future, we expect this
to be the name of a C++ object of some class Parser; variables and
functions will be referenced as my_parser_name.var_or_func.
Consider the following ANTLR example. Files t.g and t2.g are
identical except for the parser name.
File t.g
#header <<#include "charbuf.h">>
#parser "t"
<<
void parse_t()
{
ANTLR(a(), stdin);
}
>>
#token "[\ \t\n]" <<zzskip();>>
a : INT INT
;
#token INT "[0-9]+"
Page 25
PCCTS
File t2.g
#header <<#include "charbuf.h">>
#parser "t2"
<<
void parse_t2()
{
ANTLR(a(), stdin);
}
>>
#token "[\ \t\n]" <<zzskip();>>
a : INT INT
;
#token INT "[0-9]+"
File main.c
#include <stdio.h>
extern void parse_ter();
extern void parse_ter2();
main()
{
parse_ter();
parse_ter2();
}
File makefile
Page 26
PCCTS
DLG_FILE = parser.dlg
ERR_FILE = err.c
HDR_FILE = stdpccts.h
TOK_FILE = tokens.h
K = 1
ANTLR_H = ../h
BIN = ../bin
ANTLR = ../bin/antlr
DLG = $(BIN)/dlg
CFLAGS = -I. -I$(ANTLR_H) -g
AFLAGS = -fe err.c -fl parser.dlg -ft tokens.h -fr remap.h -fm mode.h \
-gt -gk
AFLAGS2= -fe t2_err.c -fl t2_parser.dlg -ft t2_tokens.h -fr t2_remap.h \
-fm t2_mode.h -gt -gk
DFLAGS = -C2 -i
GRM = t.g
SRC1 = scan.c t.c err.c
SRC2 = t2.c t2_scan.c t2_err.c main.c
OBJ1 = scan.o t.o err.o
OBJ2 = t2.o t2_scan.o t2_err.o main.o
CC=g++
t: $(OBJ1) $(OBJ2)
$(CC) -o t $(CFLAGS) $(OBJ1) $(OBJ2)
t.o : mode.h tokens.h t.g
scan.c mode.h : parser.dlg
$(DLG) $(DFLAGS) parser.dlg scan.c
t.c parser.dlg tokens.h : t.g
$(ANTLR) $(AFLAGS) t.g
t2.o : t2_mode.h t2_tokens.h t2.g
t2_scan.c t2_mode.h : t2_parser.dlg
$(DLG) $(DFLAGS) -m t2_mode.h t2_parser.dlg t2_scan.c
t2.c t2_parser.dlg t2_tokens.h : t2.g
$(ANTLR) $(AFLAGS2) t2.g
The input to the parser is 4 integers because each of the invoked
parsers matches 2.
The preprocessor symbol zzparser is set the parser name string
specified in the #parser "name" directive.
WARNING: the remapping of symbols to avoid collisions is not a
foolproof system. For example, if you have a rule named type and a
field in a structure named type, the field name will get renamed as
well - this is the price you pay for being able to link things
together without C++.
Page 27
PCCTS
8. Creating Customized Syntax Error Routines
Many users have asked how to create their own zzsyn() error han-
dling routine. Here's how:
[1] Make new zzsyn() with same parameters.
[2] Define the preprocessor symbol USER_ZZSYN on the compile line (-
DUSER_ZZSYN).
9. Lexical Changes to ANTLR Input
The manner in which ANTLR interprets user actions has changed.
Strings, character literals, and C/C++ comments are now totally
ignored. For example,
<<
// nothing in here is examined $1 ' "
/* or in here ' " $ #[jfd] '"''''" */
'"' // that's a character
"'" // that's an apostrophe
"$1 is", $1 /* $1 inside string is ignored */
>>
As a result of this change, you may experience a slight difference in
how ANTLR treats your actions. Comments inside actions are still
passed through to the parser.
C++ comments are now accepted outside of actions as well:
// this rule does nothing
a : ;
Watch out for this:
...
<< // a comment >>
...
a : A
;
The C++ style comment in side the action will scarf til end of line
and ignore the >> end action symbol. This could be avoided, but I'm
feeling lazy just now.
10. New ANTLR Options
Release 1.10 introduces the following ANTLR command-line options:
Page 28
PCCTS
- ANTLR now accepts input from stdin by using the - option; e.g.,
antlr -
A file called stdin.c is created as the output parser.
-ck nUse up to n symbols of lookahead when using compressed (linear
approximation) lookahead. This type of lookahead is very cheap
to compute and is attempted before full LL(k) lookahead, which is
of exponential complexity in the worst case. In general, the
compressed lookahead can be much deeper (e.g, -ck 10) than the
full lookahead (which usually must be less than 4).
-fm mode_file
Rename file with lexical mode definitions, mode.h, to file.
-fr file
Rename file which remaps globally visible symbols, remap.h, to
file. This file is only created if a #parser directive is found.
-prc on
Turn on the computation and hoisting of predicate context.
-prc off
Turn off the computation and hoisting of predicate context. This
option makes 1.10 behave like the 1.06 release with option -pr on
(default).
-w1 Set low warning level. Do not warn if semantic predicates and/or
(...)? blocks are assumed to cover ambiguous alternatives.
-w2 Ambiguous parsing decisions yield warnings even if semantic
predicates or (...)? blocks are used. Warn if -prc on and some
lookahead sequences are not disambiguated with a hoisted predi-
cate.
11. ANTLR Generates "Androgynous" Code
The distribution source of 1.06 PCCTS was generated using 1.06 on
a 32-bit machine. Unfortunately, the source code dumped bit sets to
arrays of unsigned's according to the word size of the machine that
generated the parser - regardless of the word size of the various tar-
get machines. To overcome this, ANTLR always dumps its bit sets as
arrays of unsigned char, which are 8 bits (or more) on any machine
that we'd ever want to work on. As a result, ANTLR itself should
bootstrap on any machine with a C compiler a enough memory. We have
gotten it to compile with 16-bit Microsoft and Borland C on the PC
with only a few whimpers. The makefiles in the ANTLR and DLG direc-
tories have sections for each of the various compilers.
Page 29
PCCTS
12. Printing out grammars
Using the -p option generates grammar listings that are somewhat
nicer.
13. C Grammar Changes
The C grammar example has been augmented with a -both option that
prints out both K&R and ANSI C prototypes for functions defined in the
input file. E.g.
% proto -both
void f(a,b)
int a;
char *b;
{;}
^D
void
#ifdef __STDC__
f( int a, char *b )
#else
f( a, b )
int a;
char *b;
#endif
Functions that already employ ANSI C style argument definitions are
handled as well.
14. C++ Now Compiles ANTLR Itself
We have modified the source code of ANTLR to compile under C++.
It is not written to take advantage of C++'s extensions to C, however,
except in rare instances. C++'s stricter type checking motivated the
modification.
15. New Preprocessor Symbol
ANTLR now generates a #define called ANTLR_VERSION that is set to
the version of ANTLR that generated the parser. For this release, you
will see:
#define ANTLR_VERSION 110
in the output files, which is an integer equivalent of the version
number.
16. Attribute Warning
A number of users have had trouble with the charptr.h attributes.
Page 30
PCCTS
Please note that they do not make copies and that the memory is freed
after the scope exits. For example, this is wrong because the memory
for the $1 attribute of A or B in the (...) scope will be freed upon
exit even though $0 will still point to it.
#header<<#include "charptr.h">>
<<
#include "charptr.c"
main() { ANTLR(a(),stdin); }
>>
#token "[\ \t\n]" << zzskip(); >>
a: "ick" ("A" << $0=$1; >>| "B" << $0=$1; >>) "ugh"
<< printf("$1, $2, $3 are %s, %s, %s\n",$1, $2, $3); >>
;
One should make a copy of the local attribute (or use charbuf.h) as
the mem is freed at the end of the scope ($0=strdup($1);).
17. Generation of Line Information
The normal form of line information is:
# line_number "file"
However, many compilers, such as Borland C, prefer it as
#line line_number "file"
This can be easily changed by looking file generic.h in the antlr
directory for the following:
/* User may redefine how line information looks */
#define LineInfoFormatStr "# %d \"%s\"\n"
Simply change it as your compiler wants it and recompile the antlr
source.
18. Incompatibilities
There should be very few incompatibilities with your 1.06-based
grammars. Should you find any please let us know.
1.06 semantic predicates were not hoisted into parsing decisions
Page 31
PCCTS
without the -pr flag (now obsolete). In 1.10, the use of a predicate
indicates that it may be hoisted.
Semantic predicates used to halt parser upon failure whereas 1.10
does not.
The interpretation of strings, character literals, and comments
are now handled differently; see above.
19. Future Directions
This section briefly describes some of the future enhancements
either being discussed, planned, or developed.
o A graphical user interface is planned for ANTLR grammars that
will allow the simultaneous display/manipulation of BNF and syn-
tax diagram representations of user grammars.
o A source-to-source translator-generator called SORCERER is in
prototype form. It's input looks like ANTLR and is integrated so
that one description will contain lexical, syntactic, and tree-
translation information.
o A number of groups are working on a C++ grammar. Things are
starting to heat up as it is pretty much certain that 1.10 ANTLR
is the minimum necessary system to parse C++.
o A code-generator generator, called PIGG, is in prototype form.
o An assembler generator is in prototype form.
o DLG backtracking will be added.
o A new "magic" token type, ".", will be introduced which means
"match any single token."
o A new operator will be introduced, "~", which will allow con-
structs like ~(A|B|C) - implying "match a single token not from
the set {A, B, C}."
o A new ANTLR directive will be introduced:
#tokclass name { token_list }
that creates a set of tokens like the #errclass directive, but
one which can be referenced in the grammar. For example:
Page 32
PCCTS
#tokclass AOP { "-" "+" }
#tokclass MOP { "/" " }
#tokclass OP { AOP MOP }
...
e : e1 ( AOP e1 )* ;
e1: e2 ( MOP e2 )* ;
...
o Simple left-factoring will be introduced to remove identical left
factors from alternative productions (assuming user actions do
not interfere).
o A version of ANTLR (called ANTLR-lite?) is being considered that
would accept most ANTLR description syntax, delay grammar
analysis to run time (where it could be done much more quickly -
with nonexponential complexity), scan for tokens with an NFA
regular-expression interpreter rather a DFA, and place all output
in one nice little file. The reduction in parser size would be
substantial, but at a parser run-time cost.
20. Portability
PCCTS 1.10 is known to compile "out of the box" on the following
machines and/or operating systems:
[1] DECSTATION 5000
[2] SGI, Running IRIX 4.0.5
[3] Sun SparcStation (cc, gcc, g++, Cfront)
[4] DOS and OS/2, Microsft C 6.0, 16 bit
[5] DOS and OS/2, Borland C/C++, 16 bit
[6] OS/2, IBM C-Set/2, 32 bit
[7] VAX C under VMS
[8] Linux SLS 0.99, gcc/g++
[9] NeXT box
[10] Amiga, AmigaDOS--SAS/C Development System V 5.10b
Page 33
PCCTS
21. Beta Testers
The following is a group of persons (listed alphabetically) that,
in some way, have helped shape and/or debug the latest release of
PCCTS.
[1] Steven Anderson, (sea@ahpcrc.umn.edu)
[2] Douglas B. Cuthbertson, (cuthbertsond@gw1.hanscom.af.mil)
[3] Peter Dahl, (dahl@mckinley.ee.umn.edu)
[4] Ed Harfmann, (mdbs!ed@dynamo.ecn.purdue.edu)
[5] Randy Helzerman, (helz@ecn.purdue.edu)
[6] Stephen Hite, (shite@sinkhole.unf.edu)
[7] Dana "Muck" Hoggatt, (mdbs!muck@dynamo.ecn.purdue.edu)
[8] Roy Levow, (roy@gemini.cse.fau.edu)
[9] John Mejia, (mejia@mckinley.ee.umn.edu)
[10] David Poole, (dpoole@nitrogen.oscs.montana.edu)
[11] Russell Quong, (quong@ecn.purdue.edu)
[12] Aaron Sawdey, (sawdey@mckinley.ee.umn.edu)
[13] Fred Scholldorf, (scholldorf@nuclear.physics.sunysb.edu)
[14] Sumana Srinivasan, (Sumana_Srinivasan@next.com)
[15] Ariel Tamches, (tamches@cs.wisc.edu)
Page 34