ANTLR(1) PCCTS Manual Pages ANTLR(1)

NAME
antlr - ANother Tool for Language Recognition

SYNTAX
antlr [options] grammar_files

DESCRIPTION
Antlr converts an extended form of context-free grammar
into a set of C functions which directly implement an
efficient form of deterministic recursive-descent LL(k)
parser. Context-free grammars may be augmented with
predicates to allow semantics to influence parsing; this
allows a form of context-sensitive parsing. Selective
backtracking is also available to handle non-LL(k) and
even non-LALR(k) constructs. Antlr also produces a
definition of a lexer which can be automatically converted
into C code for a DFA-based lexer by dlg. Hence, antlr
serves a function much like that of yacc; however, it is
notably more flexible and is more integrated with a lexer
generator (antlr directly generates dlg code, whereas yacc
and lex are given independent descriptions). Unlike yacc,
which accepts LALR(1) grammars, antlr accepts LL(k)
grammars in an extended BNF notation, which eliminates the
need for precedence rules.
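
To make the phrase "deterministic recursive-descent LL(k) parser"
concrete, here is a minimal hand-written sketch of the general
style for k=1: one C function per rule, each choosing what to do
by inspecting a single token of lookahead. This is not antlr
output. The grammar and every name in it (la, consume, term,
expr, parse_sum) are invented for illustration, and each input
character is treated as one token.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical grammar, for illustration only:
 *     expr : term ( '+' term )* ;
 *     term : DIGIT ;
 */
static const char *input;   /* remaining input; one char is one token */
static int failed;          /* set to 1 when a syntax error is seen   */

/* Current lookahead token (0 at end of input). */
static int la(void) { return (unsigned char)*input; }

/* Advance to the next token, never past the terminator. */
static void consume(void) { if (*input != '\0') input++; }

/* term : DIGIT ; returns the digit's value. */
static int term(void)
{
    if (isdigit(la())) {
        int value = la() - '0';
        consume();
        return value;
    }
    failed = 1;
    return 0;
}

/* expr : term ( '+' term )* ; returns the sum of the terms. */
static int expr(void)
{
    int value = term();
    while (!failed && la() == '+') {   /* one-token (LL(1)) decision */
        consume();
        value += term();
    }
    return value;
}

/* Parse text as a sum of single digits and store it in *result.
 * Returns 0 on success, -1 on a syntax error or trailing input. */
int parse_sum(const char *text, int *result)
{
    assert(text != NULL && result != NULL);
    input = text;
    failed = 0;
    *result = expr();
    return (failed || la() != '\0') ? -1 : 0;
}
```

Calling parse_sum("1+2+3", &r) returns 0 and leaves 6 in r, while
parse_sum("1+", &r) returns -1 because term finds no digit after
the '+'.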

Like yacc grammars, antlr grammars can use automatically
maintained symbol attribute values referenced as dollar
variables. Further, because antlr generates top-down
parsers, arbitrary values may be inherited from parent
rules (passed like function parameters). Antlr also has a
mechanism for creating and manipulating abstract syntax
trees.

There are various other niceties in antlr, including the
ability to spread one grammar over multiple files or to put
multiple grammars in a single file, the ability to
generate a version of the grammar with actions stripped out
(for documentation purposes), and lots more.

OPTIONS
-ck n   Use up to n symbols of lookahead when using
        compressed (linear approximation) lookahead. This
        type of lookahead is very cheap to compute and is
        attempted before full LL(k) lookahead, which is of
        exponential complexity in the worst case. In
        general, the compressed lookahead can be much deeper
        (e.g., -ck 10) than the full lookahead (which
        usually must be less than 4).
-CC     Generate C++ output from both ANTLR and DLG.
-cr     Generate a cross-reference for all rules. For each
        rule, print a list of all other rules that
        reference it.
-ct     Do not make copies of tokens passed to the parser
        in C++ mode (the default is to copy). When using DLG
        in conjunction with ANTLR, you will always want ANTLR
        to make copies because DLG only has space for one
        ANTLRToken (which is passed to the scanner with
        setToken); this address is always returned and,
        hence, without copies, all $-variables would point
        to the same ANTLRToken.
-e1     Ambiguities/errors shown in low detail (default).
-e2     Ambiguities/errors shown in more detail.
-e3     Ambiguities/errors shown in excruciating detail.
-fe file
        Rename err.c to file.
-fh file
        Rename stdpccts.h header (turns on -gh) to file.
-fl file
        Rename lexical output, parser.dlg, to file.
-fm file
        Rename file with lexical mode definitions, mode.h,
        to file.
-fr file
        Rename file which remaps globally visible symbols,
        remap.h, to file.
-ft file
        Rename tokens.h to file.
-ga     Generate ANSI-compatible code (default case). This
        has not been rigorously tested to be ANSI X3J11 C
        compliant, but it is close. The normal output of
        antlr is currently compilable under K&R C, ANSI C,
        and C++; this option does nothing because antlr
        generates a bunch of #ifdef's to do the right thing
        depending on the language.
-gc     Indicates that antlr should generate no C code,
        i.e., only perform analysis on the grammar.
-gd     C code is inserted in each of the antlr-generated
        parsing functions to provide for user-defined
        handling of a detailed parse trace. The inserted code
        consists of calls to the user-supplied macros or
        functions called zzTRACEIN and zzTRACEOUT. The
        only argument is a char * pointing to a C-style
        string which is the grammar rule recognized by the
        current parsing function. If no definition is
        given for the trace functions, upon rule entry and
        exit, a message will be printed indicating that a
        particular rule has been entered or exited.
-ge     Generate an error class for each non-terminal.
-gh     Generate stdpccts.h for non-ANTLR-generated files
        to include. This file contains all defines needed
        to describe the type of parser generated by antlr
        (e.g., how much lookahead is used and whether or not
        trees are constructed) and contains the header
        action specified by the user.
-gk     Generate parsers that delay lookahead fetches until
        needed. Without this option, antlr generates
        parsers which always have k tokens of lookahead
        available. This option is incompatible with -pr
        and renders references to LA(i) invalid as one
        never knows when the ith token of lookahead will be
        fetched.
-gl     Generate line info about grammar actions in the C
        parser, of the form # line "file". This makes error
        messages from the C/C++ compiler make more sense, as
        they will point into the grammar file rather than
        the resulting C file. Debugging is easier as well,
        because you will step through the grammar, not the
        C file.
-gp prefix
        Prefix all functions generated from rules with
        prefix. This is now obsolete. Use the #parser "name"
        antlr directive.
-gs     Do not generate sets for token expression lists;
        instead generate a ||-separated sequence of
        LA(1)==token_number. The default is to generate
        sets.
-gt     Generate code for Abstract-Syntax Trees.
-gx     Do not create the lexical analyzer files
        (dlg-related). This option should be given when the
        user wishes to provide a customized lexical
        analyzer. It may also be used in make scripts to
        cause only the parser to be rebuilt when a change
        not affecting the lexical structure is made to the
        input grammars.
-k n    Set k of LL(k) to n; i.e., set the number of tokens
        of lookahead (default=1).
-o dir  Directory where output files should go
        (default="."). This is very nice for keeping the
        source directory clear of ANTLR and DLG spawn.
-p      The complete grammar, collected from all input
        grammar files and stripped of all comments and
        embedded actions, is listed to stdout. This is
        intended to aid in viewing the entire grammar as a
        whole and to eliminate the need to keep actions
        concisely stated so that the grammar is easier to
        read. Hence, it is preferable to embed even
        complex actions directly in the grammar, rather than
        to call them as subroutines, since the subroutine
        call overhead will be saved.
-pa     This option is the same as -p except that the
        output is annotated with the first sets determined
        from grammar analysis.
-pr     Obsolete -- used to turn on use of predicates in
        parsing decisions in release 1.06. Now, in 1.10,
        the specification of a predicate implies that it
        should be used. When a syntactic ambiguity is
        discovered, antlr searches for predicates that can be
        used to disambiguate the decision. Predicates have
        dual roles as semantic validation and
        disambiguation predicates.
-prc on
        Turn on the computation and hoisting of predicate
        context.
-prc off
        Turn off the computation and hoisting of predicate
        context. This option makes 1.10 behave like the
        1.06 release with option -pr on. Context
        computation is off by default.
-rl n   Limit the maximum number of tree nodes used by
        grammar analysis to n. Occasionally, antlr is
        unable to analyze a grammar submitted by the user.
        This rare situation can only occur when the grammar
        is large and the amount of lookahead is greater
        than one. A nonlinear analysis algorithm is used
        by PCCTS to handle the general case of LL(k)
        parsing. The average complexity of analysis, however,
        is near linear due to some fancy footwork in the
        implementation which reduces the number of calls to
        the full LL(k) algorithm. An error message will be
        displayed, if this limit is reached, which
        indicates the grammar construct being analyzed when
        antlr hit a non-linearity. Use this option if
        antlr seems to go out to lunch and your disk starts
        thrashing; try n=10000 to start. Once the offending
        construct has been identified, try to remove
        the ambiguity that antlr was trying to overcome
        with large lookahead analysis. The introduction of
        (...)? backtracking blocks eliminates some of these
        problems -- antlr does not analyze alternatives
        that begin with (...)? (it simply backtracks, if
        necessary, at run time).
-w1     Set low warning level. Do not warn if semantic
        predicates and/or (...)? blocks are assumed to
        cover ambiguous alternatives.
-w2     Ambiguous parsing decisions yield warnings even if
        semantic predicates or (...)? blocks are used.
        Warn if predicate context is computed and semantic
        predicates incompletely disambiguate alternative
        productions.
-       Read grammar from standard input and generate
        stdin.c as the parser file.
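
As a rough illustration of the compressed lookahead mentioned
under -ck, the sketch below assumes a hypothetical alternative
that is predicted by exactly two token pairs, (A,B) and (C,D).
Full LL(2) lookahead remembers the pairs themselves; the linear
approximation keeps only one set of tokens per lookahead depth,
so it is cheaper to store and test but also admits (A,D) and
(C,B). The token names and both functions are invented for this
example and are not antlr's code.

```c
#include <assert.h>
#include <stddef.h>

enum token { TOK_A, TOK_B, TOK_C, TOK_D, TOK_COUNT };

/* Full LL(2): the exact token sequences that predict the alternative. */
static const enum token full_pairs[][2] = {
    { TOK_A, TOK_B },
    { TOK_C, TOK_D },
};

/* Return 1 if the pair (la1, la2) is one of the exact sequences. */
int full_predicts(enum token la1, enum token la2)
{
    size_t i;
    for (i = 0; i < sizeof full_pairs / sizeof full_pairs[0]; i++)
        if (full_pairs[i][0] == la1 && full_pairs[i][1] == la2)
            return 1;
    return 0;
}

/* Linear approximation: one token set per depth, stored as bit masks.
 * depth_set[0] = {A, C}, depth_set[1] = {B, D}. */
static const unsigned depth_set[2] = {
    (1u << TOK_A) | (1u << TOK_C),
    (1u << TOK_B) | (1u << TOK_D),
};

/* Return 1 if each token is in the set for its own depth. */
int approx_predicts(enum token la1, enum token la2)
{
    assert(la1 < TOK_COUNT && la2 < TOK_COUNT);
    return ((depth_set[0] >> la1) & 1u) && ((depth_set[1] >> la2) & 1u);
}
```

Here full_predicts(TOK_A, TOK_D) is 0 but approx_predicts(TOK_A,
TOK_D) is 1: the approximation needs only k sets regardless of
how many sequences exist, at the price of accepting combinations
the grammar never produces.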
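
The trace hooks described under -gd might be supplied along the
following lines. The manual states only that zzTRACEIN and
zzTRACEOUT are user-supplied macros or functions taking a single
char * that names the rule. Everything else here is an assumption
made for illustration: the indentation scheme, the trace_depth and
trace_max_depth counters, and the stand-in functions rule_expr and
rule_term, which merely imitate the shape of generated parsing
functions. How the definitions are made visible to the generated
parser is not covered.

```c
#include <assert.h>
#include <stdio.h>

static int trace_depth;      /* current rule nesting (hypothetical)  */
static int trace_max_depth;  /* deepest nesting seen (hypothetical)  */

/* Print "enter <rule>" indented by nesting depth, then go one deeper. */
#define zzTRACEIN(rule) \
    do { \
        fprintf(stderr, "%*senter %s\n", 2 * trace_depth, "", (rule)); \
        if (++trace_depth > trace_max_depth) \
            trace_max_depth = trace_depth; \
    } while (0)

/* Come back up one level, then print "exit  <rule>" at that depth. */
#define zzTRACEOUT(rule) \
    do { \
        assert(trace_depth > 0); \
        --trace_depth; \
        fprintf(stderr, "%*sexit  %s\n", 2 * trace_depth, "", (rule)); \
    } while (0)

/* Stand-ins for generated parsing functions; they match no input. */
static void rule_term(void)
{
    zzTRACEIN("term");
    zzTRACEOUT("term");
}

static void rule_expr(void)
{
    zzTRACEIN("expr");
    rule_term();
    rule_term();
    zzTRACEOUT("expr");
}

/* Run the stand-in rules once and report the deepest nesting reached. */
int trace_demo(void)
{
    trace_depth = 0;
    trace_max_depth = 0;
    rule_expr();
    return trace_max_depth;
}
```

Calling trace_demo() writes six lines to stderr, with the two
"term" entries and exits indented beneath "enter expr" and
"exit  expr", and returns 2.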
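
The effect of the line information produced under -gl can be
seen with the standard C #line directive, which takes a line
number and a file name. In this sketch the file name expr.g, the
line number 42, and the names position, action_position and
position_is are all invented; the point is only that code
following the directive is attributed to the named file, which is
why compiler messages and debuggers then refer to the grammar.

```c
#include <assert.h>
#include <string.h>

/* Source position as the compiler sees it. */
struct position {
    const char *file;
    int line;
};

/* Record the position the compiler attributes to the statement
 * following the directive. */
struct position action_position(void)
{
    struct position p;
#line 42 "expr.g"
    p.file = __FILE__; p.line = __LINE__;  /* seen as expr.g, line 42 */
    return p;
}

/* Return 1 if p refers to the given file and line, 0 otherwise. */
int position_is(struct position p, const char *file, int line)
{
    assert(file != NULL);
    return p.line == line && strcmp(p.file, file) == 0;
}
```

With these definitions, position_is(action_position(), "expr.g",
42) is 1 no matter what the C file itself is called.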

SPECIAL CONSIDERATIONS
Antlr works... we think. There is no implicit guarantee
of anything. We reserve no legal rights to the software
known as the Purdue Compiler Construction Tool Set (PCCTS)
-- PCCTS is in the public domain. An individual or
company may do whatever they wish with source code
distributed with PCCTS or the code generated by PCCTS,
including the incorporation of PCCTS, or its output, into
commercial software. We encourage users to develop
software with PCCTS. However, we do ask that credit is given
to us for developing PCCTS. By "credit", we mean that if
you incorporate our source code into one of your programs
(commercial product, research project, or otherwise) that
you acknowledge this fact somewhere in the documentation,
research report, etc. If you like PCCTS and have
developed a nice tool with the output, please mention that
you developed it using PCCTS. As long as these guidelines are
followed, we expect to continue enhancing this system and
expect to make other tools available as they are
completed.

FILES
*.c     output C parser
*.C     output C++ parser when C++ mode is used
parser.dlg
        output dlg lexical analyzer
err.c   token string array, error sets and error support
        routines
remap.h
        file that redefines all globally visible parser
        symbols. The use of the #parser directive creates
        this file.
stdpccts.h
        list of definitions needed by C files, not
        generated by PCCTS, that reference PCCTS objects. This
        is not generated by default.
tokens.h
        output #defines for tokens used and function
        prototypes for functions generated for rules

SEE ALSO
dlg(1), pccts(1)

ANTLR April 1994