home
***
CD-ROM
|
disk
|
FTP
|
other
***
search
/
The World of Computer Software
/
World_Of_Computer_Software-02-385-Vol-1of3.iso
/
p
/
pccts.zip
/
antlr
/
antlr1.txt
< prev
next >
Wrap
Text File
|
1992-12-08
|
9KB
|
249 lines
antlr - ANother Tool for Language Recognition
antlr [options] grammar_files
Antlr converts an extended form of context-free grammar into a
set of C functions which directly implement an efficient form of
deterministic recursive-descent LL(k) parser. Context-free grammars
may be augmented with predicates to allow semantics to influence pars-
ing; this allows a form of context-sensitive parsing. Antlr also pro-
duces a definition of a lexer which can be automatically converted
into C code for a DFA-based lexer by dlg. Hence, antlr serves a func-
tion much like that of yacc, however, it is notably more flexible and
is more integrated with a lexer generator (antlr directly generates
dlg code, whereas yacc and lex are given independent descriptions).
Unlike yacc which accepts LALR(1) grammars, antlr accepts LL(k) gram-
mars in an extended BNF notation - which eliminates the need for pre-
cedence rules.
Like yacc grammars, antlr grammars can use automatically-
maintained symbol attribute values referenced as dollar variables.
Further, because antlr generates top-down parsers, arbitrary values
may be inherited from parent rules (passed like function parameters).
Antlr also has a mechanism for creating and manipulating abstract-
syntax-trees.
There are various other niceties in antlr, including the ability
to spread one grammar over multiple files or even multiple grammars in
a single file, the ability to generate a version of the grammar with
actions stripped out (for documentation purposes), and lots more.
-cr Generate a cross-reference for all rules. For each rule, print a
list of all other rules that reference it.
-e1 Ambiguities/errors shown in low detail (default).
-e2 Ambiguities/errors shown in more detail.
-e3 Ambiguities/errors shown in excrutiating detail.
-fe file
Rename err.c to file.
-fh file
Rename stdpccts.h header (turns on -gh) to file.
-fl file
Rename lexical output, parser.dlg, to file.
-ft file
Rename tokens.h to file.
Page 1
PCCTS
-ga Generate ANSI-compatible code (default case). This has not been
rigorously tested to be ANSI XJ11 C compliant, but it is close.
The normal output of antlr is currently compilable under both K&R
and ANSI C-this option does nothing, but is reserved for the
future.
-gc Indicates that antlr should generate no C code, i.e., only per-
form analysis on the grammar.
-gd C code is inserted in each of the antlr generated parsing func-
tions to provide for user-defined handling of a detailed parse
trace. The inserted code consists of calls to the user-supplied
macros or functions called zzTRACEIN and zzTRACEOUT. The only
argument is a char * pointing to a C-style string which is the
grammar rule recognized by the current parsing function. If no
definition is given for the trace functions, upon rule entry and
exit, a message will be printed indicating that a particular rule
as been entered or exited.
-ge Generate an error class for each non-terminal.
-gh Generate stdpccts.h for non-ANTLR-generated files to include.
This file contains all defines needed to describe the type of
parser generated by antlr (e.g. how much lookahead is used and
whether or not trees are constructed) and contains the header
action specified by the user.
-gk Generate parsers that delay lookahead fetches until needed.
Without this option, antlr generates parsers which always have k
tokens of lookahead available. This option is incompatible with
-pr and renders references to LA(i) invalid as one never knows
when the ith token of lookahead will be fetched.
-gl Generate line info about grammar actions in C parser of the form
# line "file" which makes error messages from the C/C++ compiler
make more sense as they will "point" into the grammar file not
the resulting C file. Debugging is easier as well, because you
will step through the grammar not C file.
-gp prefix
Prefix all functions generated from rules with prefix.
-gs Do not generate sets for token expression lists; instead generate
a ||-separated sequence of LA(1)==token_number. The default is
to generate sets.
-gt Generate code for Abstract-Syntax Trees.
-gx Do not create the lexical analyzer files (dlg-related). This
option should be given when the user wishes to provide a custom-
ized lexical analyzer. It may also be used in make scripts to
cause only the parser to be rebuilt when a change not affecting
the lexical structure is made to the input grammars.
Page 2
PCCTS
-k n Set k of LL(k) to n; i.e. set tokens of look-ahead (default==1).
-p The complete grammar, collected from all input grammar files and
stripped of all comments and embedded actions, is listed to
stdout. This is intended to aid in viewing the entire grammar as
a whole and to eliminate the need to keep actions concisely
stated so that the grammar is easier to read. Hence, it is
preferable to embed even complex actions directly in the grammar,
rather than to call them as subroutines, since the subroutine
call overhead will be saved.
-pa This option is the same as -p except that the output is annotated
with the first sets determined from grammar analysis.
-pr Turn on use of predicates in parsing decisions (alpha version).
When a syntactic ambiguity is discovered, antlr searches for
predicates that can be used to disambiguate the decision. Predi-
cates have dual roles as semantic validation and disambiguation
predicates.
-rl Limit the maximum number of tree nodes used by grammar analysis
to n. Occasionally, antlr is unable to analyze a grammar submit-
ted by the user. This rare situation can only occur when the
grammar is large and the amount of lookahead is greater than one.
A nonlinear analysis algorithm is used by PCCTS to handle the
general case of LL(k) parsing. The average complexity of
analysis, however, is near linear due to some fancy footwork in
the implementation which reduces the number of calls to the full
LL(k) algorithm. An error message will be displayed, if this
limit is reached, which indicates the grammar construct being
analyzed when antlr hit a non-linearity. Use this option if
antlr seems to go out to lunch and your disk start thrashing; try
n=10000 to start. Once the offending construct has been identi-
fied, try to remove the ambiguity that antlr was trying to over-
come with large lookahead analysis.
Antlr works... we think. There is no implicit guarantee of any-
thing. We reserve no legal rights to the software known as the Purdue
Compiler Construction Tool Set (PCCTS) - PCCTS is in the public
domain. An individual or company may do whatever they wish with
source code distributed with PCCTS or the code generated by PCCTS,
including the incorporation of PCCTS, or its output, into commerical
software. We encourage users to develop software with PCCTS. How-
ever, we do ask that credit is given to us for developing PCCTS. By
"credit", we mean that if you incorporate our source code into one of
your programs (commercial product, research project, or otherwise)
that you acknowledge this fact somewhere in the documentation,
research report, etc... If you like PCCTS and have developed a nice
tool with the output, please mention that you developed it using
PCCTS. As long as these guidelines are followed, we expect to con-
tinue enhancing this system and expect to make other tools available
as they are completed.
Page 3
PCCTS
*.c output C parser
parser.dlg
output dlg lexical analyzer
err.ctoken string array, error sets and error support routines
tokens.h
output #defines for tokens used and function prototypes for func-
tions generated for rules
stdpccts.h
list of definitions needed by C files, not generated by PCCTS,
that reference PCCTS objects. This is not generated by default.
dlg(1), pccts(1)
Page 4