From Formal Rules to Bison Input
================================
A formal grammar is a mathematical construct. To define the language
for Bison, you must write a file expressing the grammar in Bison syntax:
a "Bison grammar" file. See Grammar File.

A nonterminal symbol in the formal grammar is represented in Bison
input as an identifier, like an identifier in C. By convention, it
should be in lower case, such as `expr', `stmt' or `declaration'.

The Bison representation for a terminal symbol is also called a
"token type". Token types can also be represented as C-like
identifiers. By convention, these identifiers should be upper case to
distinguish them from nonterminals: for example, `INTEGER',
`IDENTIFIER', `IF' or `RETURN'. A terminal symbol that stands for a
particular keyword in the language should be named after that keyword
converted to upper case. The terminal symbol `error' is reserved for
error recovery. See Symbols.

A terminal symbol can also be represented as a character literal,
just like a C character constant. You should do this whenever a token
is just a single character (parenthesis, plus-sign, etc.): use that
same character in a literal as the terminal symbol for that token.
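
The two kinds of terminal symbol described above can be seen from the
lexical analyzer's side in the following minimal sketch. It is not
Bison's own code: the function name `next_token' and the numeric values
given to `INTEGER', `IDENTIFIER' and `RETURN' are assumptions made for
illustration, since Bison normally generates the definitions of named
token types from the grammar file. The sketch shows only the idea that
a named token type is reported by its identifier, while a
single-character token is reported as the character itself.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical token codes. Bison normally generates these definitions
   from the grammar file; named token types get numbers above the range
   of character codes, so they cannot collide with character literals. */
enum { INTEGER = 258, IDENTIFIER = 259, RETURN = 260 };

/* Scan one token from *input and advance *input past it.
   Returns 0 at end of input. */
static int next_token(const char **input)
{
    const char *p;
    const char *start;
    size_t len;

    assert(input != NULL && *input != NULL);
    p = *input;

    /* Skip whitespace between tokens. */
    while (isspace((unsigned char)*p))
        p++;

    if (*p == '\0') {
        *input = p;
        return 0;
    }

    /* A run of digits is reported as the named token type INTEGER. */
    if (isdigit((unsigned char)*p)) {
        while (isdigit((unsigned char)*p))
            p++;
        *input = p;
        return INTEGER;
    }

    /* A word is either the keyword `return' or an identifier. */
    if (isalpha((unsigned char)*p) || *p == '_') {
        start = p;
        while (isalnum((unsigned char)*p) || *p == '_')
            p++;
        len = (size_t)(p - start);
        *input = p;
        /* The keyword maps to the token type named after it. */
        if (len == 6 && strncmp(start, "return", 6) == 0)
            return RETURN;
        return IDENTIFIER;
    }

    /* Single-character token: its code is the character itself. */
    *input = p + 1;
    return (unsigned char)*p;
}
```

With this sketch, scanning the text `return x + 42;' would yield
`RETURN', `IDENTIFIER', '+', `INTEGER', ';' and then 0 for end of input.
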
The grammar rules also have an expression in Bison syntax. For
example, here is the Bison rule for a C `return' statement. The
semicolon in quotes is a literal character token, representing part of
the C syntax for the statement; the naked semicolon, and the colon, are
Bison punctuation used in every rule.

     stmt:   RETURN expr ';'
             ;

See Rules.