home *** CD-ROM | disk | FTP | other *** search
Text File | 1994-04-01 | 9.9 KB | 385 lines | [TEXT/R*ch] |
- GRAMMAR TOOLS IN ABC
-
- When I have to work with grammars, I always use ABC to do it. Among
- the advantages are that you can do the work interactively, that you
- can very quickly build additional tools, and that you have the already
- powerful programming environment at your disposal.
-
- What follows is a brief description of some of the tools I use, with
- an example. There is no description of ABC: you can find a quick
- description of the language by ftp-ing the file "abc.intro" from
- uunet.uu.net, mcsun.eu.net, or hp4nl.nluug.nl, in directory
- {pub}/{programming}/languages/abc, or by sending the message
-
- request: programming/languages/abc
- topic: abc.intro
-
- to info-server@hp4nl.nluug.nl. The file tells you where to get more
- information about ABC (there's a book), and how to get the
- implementations (they're free).
-
- Some of what follows is also presented in the book, though at a more
- relaxed pace :-).
-
- (For didactic reasons, what is presented here differs in detail from
- the distributed code.)
-
- GRAMMARS
-
- The representation that I use is more or less a direct transcription
- of what a grammar is. I use a table whose keys are texts (i.e.
- strings) representing the nonterminals of the language, and whose
- items are sets of alternatives. Each alternative is a sequence of
- texts, representing terminals and nonterminals. So here is a how-to
- that displays a grammar in this form:
-
- HOW TO DISPLAY grammar:
- FOR name IN keys grammar:
- WRITE "`name`: " /
- FOR alt IN grammar[name]:
- WRITE " "
- FOR symbol IN alt:
- WRITE symbol, " "
- WRITE /
-
- and as example:
-
- >>> DISPLAY sentence
- ADJ:
- EMPTY
- clever
- shy
- BOY:
- John
- Kevin
- EMPTY:
-
- GIRL:
- Mary
- Susan
- OBJ:
- SUBJ
- SENT:
- SUBJ loves OBJ
- SUBJ:
- ADJ BOY
- ADJ GIRL
-
- You can generate a random phrase from a grammar with the following:
-
- HOW TO GENERATE sym FROM grammar:
- SELECT:
- sym in keys grammar: \ Nonterminal
- FOR new IN choice grammar[sym]:
- GENERATE new FROM grammar
- ELSE: \ Terminal symbol
- WRITE sym, " "
-
- >>> GENERATE "SENT" FROM sentence
- Susan loves clever John
-
- SETS
-
- Here are some necessary functions on sets. Set union:
-
- HOW TO RETURN set1 with set2: \ Union
- FOR x IN set2:
- IF x not.in set1:
- INSERT x IN set1
- RETURN set1
-
- Set difference:
-
- HOW TO RETURN set1 less set2: \ Difference
- FOR x IN set2:
- IF x in set1:
- REMOVE x FROM set1
- RETURN set1
-
- Here is a function that collects all symbols used in the rules of a grammar:
-
- HOW TO RETURN used grammar: \Collect all used symbols
- PUT {} IN all
- FOR rule IN grammar:
- FOR alt IN rule:
- FOR sym IN alt:
- IF sym not.in all:
- INSERT sym IN all
- RETURN all
-
- >>> WRITE used sentence
- {"ADJ"; "BOY"; "EMPTY"; "GIRL"; "John"; "Kevin"; "Mary";
- "OBJ"; "SUBJ"; "Susan"; "clever"; "loves"; "shy"}
-
- The terminals of the grammar are all the symbols less the nonterminals:
-
- >>> WRITE (used sentence) less keys sentence
- {"John"; "Kevin"; "Mary"; "Susan"; "clever"; "loves"; "shy"}
-
- and the unused nonterminals (such as the root symbol) are the
- nonterminals less the used symbols:
-
- >>> WRITE (keys sentence) less used sentence
- {"SENT"}
-
- For neater output, the function "listed" converts a set to a text:
-
- HOW TO RETURN listed set:
- PUT "" IN line
- FOR element IN set:
- PUT line ^ "`element` " IN line
- RETURN line
-
- >>> WRITE listed ((used sentence) less keys sentence)
- John Kevin Mary Susan clever loves shy
-
- A useful set is the set of nonterminals that can generate empty. This
- is generated by repeatedly doing a pass over the rules that we don't
- know yet can generate empty, until we find no more:
-
- HOW TO RETURN empties grammar:
- PUT keys grammar, {} IN to.do, empties
- WHILE SOME name IN to.do HAS empty.rule:
- INSERT name IN empties
- REMOVE name FROM to.do
- RETURN empties
- empty.rule:
- REPORT SOME alt IN grammar[name] HAS empty.alt
- empty.alt:
- REPORT EACH sym IN alt HAS sym in empties
-
- >>> WRITE listed empties sentence
- ADJ EMPTY
-
- RELATIONS
-
- Relations between symbols of the grammar are the essential element of
- the grammar tools. A relation is represented as a table whose keys are
- symbols, and whose items are sets of symbols.
-
- For instance, if symbol b follows symbol a in some rule, "b" will be
- in the set for follows["a"], so you can say, for instance:
-
- IF "b" in follows["a"]: ....
-
- Relations are sparse (i.e. a symbol is not in the keys of the relation
- if the set of elements is empty), so we use the following to access a
- relation:
-
- HOW TO RETURN relation for k: \relation[k] for sparse relations
- IF k in keys relation:
- RETURN relation[k]
- RETURN {}
-
- To add an element to a relation, we use this:
-
- HOW TO ADD element TO relation FOR thing:
- IF thing not.in keys relation: \First time
- PUT {} IN relation[thing]
- IF element not.in relation[thing]:
- INSERT element IN relation[thing]
-
- though you may prefer
-
- HOW TO ADD element TO relation FOR thing:
- PUT (relation for thing) with {element} IN relation[thing]
-
- For instance:
-
- >>> ADD "b" TO follows FOR "a"
-
- We'll display a relation with:
-
- HOW TO SHOW relation:
- FOR k IN keys relation:
- WRITE "`k`: ", listed relation[k] /
-
- Here are some general functions on relations. The inverse:
-
- HOW TO RETURN inverse relation:
- PUT {} IN inv
- FOR k IN keys relation:
- FOR x IN relation[k]:
- ADD k TO inv FOR x
- RETURN inv
-
- The product of two relations (a P c iff a R1 b and b R2 c):
-
- HOW TO RETURN r1 prod r2: \product of relations
- PUT {} IN prod
- FOR c IN keys r2:
- FOR b IN r2[c]:
- IF b in keys r1:
- FOR a IN r1[b]:
- ADD a TO prod FOR c
- RETURN prod
-
- The closure:
-
- HOW TO RETURN closure r:
- FOR i IN keys r:
- FOR j IN keys r:
- IF i in r[j]:
- PUT r[i] with r[j] IN r[j]
- RETURN r
-
- To make a relation reflexive, we use the following. Since relations
- are sparse, we also have to pass the set of symbols that it must be
- reflexive over:
-
- HOW TO RETURN symbols reflexive relation: \make the relation reflexive
- FOR sym IN symbols:
- ADD sym TO relation FOR sym
- RETURN relation
-
- SOME EXAMPLES OF RELATIONS
-
- To collect the *direct* followers for each symbol, we walk along each
- alternative, collecting adjacent symbols. There is one catch: in a
- rule like:
-
- SENT: the ADJ PERSON
-
- "the" and "ADJ" are adjacent, but if "ADJ" can generate empty, then so
- are "the" and "PERSON":
-
- HOW TO RETURN followers grammar:
- PUT {}, empties grammar IN foll, empty
- FOR rule IN grammar:
- FOR alt IN rule:
- TREAT ALT
- RETURN foll
- TREAT ALT:
- FOR i IN {1..#alt-1}:
- PUT alt item i IN this
- TREAT PART
- TREAT PART:
- FOR j IN {i+1..#alt}:
- PUT alt item j IN next
- ADD next TO foll FOR this
- IF next not.in empty: QUIT
-
- >>> SHOW followers sentence
- ADJ: BOY GIRL
- SUBJ: loves
- loves: OBJ
-
- To collect the direct starter symbols of each rule, you also have to
- deal with symbols that produce empty:
-
- HOW TO RETURN heads grammar:
- PUT {}, empties grammar IN heads, empty
- FOR name IN keys grammar:
- FOR alt IN grammar[name]:
- TREAT ALT
- RETURN heads
- TREAT ALT:
- FOR i IN {1..#alt}:
- PUT alt item i IN head
- ADD head TO heads FOR name
- IF head not.in empty: QUIT
-
- >>> SHOW heads sentence
- ADJ: EMPTY clever shy
- BOY: John Kevin
- GIRL: Mary Susan
- OBJ: SUBJ
- SENT: SUBJ
- SUBJ: ADJ BOY GIRL
-
- Similarly for the direct enders:
-
- HOW TO RETURN tails grammar:
- PUT {}, empties grammar IN tails, empty
- FOR name IN keys grammar:
- FOR alt IN grammar[name]:
- TREAT ALT
- RETURN tails
- TREAT ALT:
- FOR i' IN {-#alt..-1}:
- PUT -i' IN i
- PUT alt item i IN tail
- ADD tail TO tails FOR name
- IF tail not.in empty: QUIT
-
- The closure of the head relation represents all symbols that can start
- a rule, either directly or indirectly:
-
- >>> SHOW closure heads sentence
- ADJ: EMPTY clever shy
- BOY: John Kevin
- GIRL: Mary Susan
- OBJ: ADJ BOY EMPTY GIRL John Kevin Mary SUBJ Susan clever shy
- SENT: ADJ BOY EMPTY GIRL John Kevin Mary SUBJ Susan clever shy
- SUBJ: ADJ BOY EMPTY GIRL John Kevin Mary Susan clever shy
-
- Symbol b may follow symbol a in a phrase if b follows a in an
- alternative, or if B follows A in an alternative and b is in heads*(B)
- and a is in tails*(A). This is expressed as the product
- head* . follow . inverse(tail*).
-
- Now we have enough to define a command that prints for each symbol in
- an alternative what may follow that symbol at that point:
-
- HOW TO SHOW LOCAL FOLLOWERS grammar:
- PUT (used grammar) with keys grammar IN symbols
- PUT symbols reflexive (closure heads grammar) IN head.star
- PUT symbols reflexive (closure tails grammar) IN tail.star
- PUT followers grammar IN follow
- PUT (head.star prod follow) prod (inverse tail.star) IN deep.follow
- FOR parent IN keys grammar:
- FOR alt IN grammar[parent]:
- TREAT ALT
- TREAT ALT:
- ANNOUNCE ALT
- FOR i IN {1..#alt}:
- TREAT SYM
- TREAT SYM:
- PUT alt item i IN sym
- WRITE " `sym`: ", listed local.follow /
- local.follow:
- PUT {} IN foll
- FOR j IN {i+1..#alt}:
- PUT alt item j IN next
- PUT foll with (head.star for next) IN foll
- IF next not.in empty:
- RETURN foll
- RETURN foll with (deep.follow for parent)
- ANNOUNCE ALT:
- WRITE "`parent`: ", listed alt /
-
- This prints each alternative separately, followed by each symbol of
- the alternative indented one to a line followed by the symbols that
- can follow it at that point.
-
- For example:
-
- >>> SHOW LOCAL FOLLOWERS sentence
- ADJ: EMPTY
- EMPTY: BOY GIRL John Kevin Mary Susan
- ADJ: clever
- clever: BOY GIRL John Kevin Mary Susan
- ADJ: shy
- shy: BOY GIRL John Kevin Mary Susan
- BOY: John
- John: loves
- BOY: Kevin
- Kevin: loves
- EMPTY:
- GIRL: Mary
- Mary: loves
- GIRL: Susan
- Susan: loves
- OBJ: SUBJ
- SUBJ:
- SENT: SUBJ loves OBJ
- SUBJ: loves
- loves: ADJ BOY EMPTY GIRL John Kevin Mary OBJ SUBJ Susan clever shy
- OBJ:
- SUBJ: ADJ BOY
- ADJ: BOY John Kevin
- BOY: loves
- SUBJ: ADJ GIRL
- ADJ: GIRL Mary Susan
- GIRL: loves
-