Grammar Guide

Testing Speech Grammars

Three utility programs help you evaluate the grammars that you write:

fsgenum.exe: enumerates strings accepted by the grammar
fsgtest.exe: determines whether input strings are accepted by the grammar
fsgprint.exe: displays the graph that defines the grammar

This document provides a brief description of these tools.

FSGENUM.EXE

The program fsgenum.exe generates strings that will be accepted by a specific grammar. This tool is especially useful in determining whether a grammar overgenerates, that is, whether the grammar accepts strings that it should not. For example, suppose we want a grammar that will parse (or accept) the following sentences:

   The boy saw the dog.
   The dog saw the boy.
   The cat saw the dog.
   He saw the dog.

and we write the following simple grammar (see writing grammar files):

;; saw.bnf
;;
<s>  = <np> <vp> .
<vp> = <v> <np>  .
<np> = <det> <n> |
       <pro>     .

; lexicon
<det> = the .
<n>  = dog |
       boy |
       cat .
<pro> = he .
<v>   = saw .

After compiling this grammar using vtbnfc.exe, we want to examine which strings the grammar accepts. Using fsgenum.exe (the details will be explained shortly), we see that it generates the following:

the boy saw the dog
the cat saw the dog
the dog saw the boy
he saw the cat
the dog saw the dog

the dog saw he

As we can now see, the grammar overgenerates: it accepts a string that is not in the language, namely the dog saw he. We can now rewrite the grammar to prevent this string from being accepted. To give a more practical example, rules that allow for the acceptance of twelve hundred dollars might also contribute to the acceptance of the ill-formed four thousand twelve hundred dollars. Overgeneration is a common error in writing complex grammars, and this tool is a useful debugging aid.
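
The overgeneration check described above can be pictured with a small Python model. This is only a sketch: it does not call fsgenum.exe, the dictionary encoding of saw.bnf is an assumption made for illustration, and the nonterminal names subj and obj in the revised grammar are hypothetical. The revision shown is one possible fix, not necessarily the one the authors had in mind.

```python
from itertools import product

# Toy model of saw.bnf: each nonterminal maps to a list of alternatives,
# and each alternative is a sequence of nonterminals or terminal words.
GRAMMAR = {
    "s": [["np", "vp"]],
    "vp": [["v", "np"]],
    "np": [["det", "n"], ["pro"]],
    "det": [["the"]],
    "n": [["dog"], ["boy"], ["cat"]],
    "pro": [["he"]],
    "v": [["saw"]],
}

def expand(symbol, grammar):
    """Return every word sequence that symbol can produce (grammar must not be recursive)."""
    if symbol not in grammar:              # anything that is not a key is a terminal word
        return [[symbol]]
    results = []
    for alternative in grammar[symbol]:
        # Expand each part, then combine the expansions in every possible way.
        parts = [expand(part, grammar) for part in alternative]
        for combo in product(*parts):
            results.append([word for piece in combo for word in piece])
    return results

def sentences(grammar, start="s"):
    return [" ".join(words) for words in expand(start, grammar)]

def overgenerated(strings):
    """Hypothetical well-formedness check: 'he' must not appear in object position."""
    return [s for s in strings if s.endswith(" he")]

# One possible revision (an assumption): separate subject and object noun phrases
# so that the pronoun is only allowed as a subject.
FIXED = dict(GRAMMAR)
FIXED["s"] = [["subj", "vp"]]
FIXED["vp"] = [["v", "obj"]]
FIXED["subj"] = [["det", "n"], ["pro"]]
FIXED["obj"] = [["det", "n"]]

print(overgenerated(sentences(GRAMMAR)))   # includes 'the dog saw he'
print(overgenerated(sentences(FIXED)))     # []
```

Enumerating the full set and filtering it against a well-formedness test is the same workflow the tool supports: generate, inspect, revise, and generate again.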

Syntax of fsgenum.exe

The syntax of this tool is:

fsgenum [options] fsg-filename

The available options are:

-r num: This option generates a set of num strings described by the fsg file. For example, fsgenum -r 10 saw.fsg will display 10 strings that saw.fsg will accept.
-R num: This option generates a random set of num strings described by the fsg file. That is, a different set of strings will be generated each time the fsgenum -R num command is issued. By contrast, repeated uses of the fsgenum -r num command generate the same set of strings.
-f: This option generates the full set of strings produced by the fsg file.
-a: This option shows the annotations associated with each generated string.
-p: This option displays the probability associated with each string.
-l num: This option defines the maximum number of times a Kleene star or plus is iterated.
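
The difference between -r and -R can be illustrated with a short Python sketch. It is a model of the described behavior only: STRINGS is a hypothetical stand-in for the strings a compiled grammar accepts, and taking the first num strings for the repeatable case is an assumption, since the documentation does not say how fsgenum chooses them.

```python
import random

# Hypothetical stand-in for the strings accepted by a compiled grammar.
STRINGS = ["the dog", "the old dog", "the old old dog", "the old old old dog"]

def fixed_sample(strings, num):
    """Model of -r as described: the same num strings on every call.
    Taking the first num strings is an assumption of this sketch."""
    return strings[:num]

def random_sample(strings, num, rng=random):
    """Model of -R as described: a freshly drawn set of num strings on each call."""
    return rng.sample(strings, min(num, len(strings)))

print(fixed_sample(STRINGS, 2))    # identical on every run
print(random_sample(STRINGS, 2))   # may differ from run to run
```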

Example

We compile the following BNF file to produce np.fsg:

;; np.bnf
;;
<np>  = <det> <adj>* <n>.

; lexicon

<adj> = old .
<det> = the .
<n>   = dog .

fsgenum np.fsg and fsgenum -f np.fsg both produce the following output:

the dog
the old dog

fsgenum -f -l 5 np.fsg produces the output:

the dog
the old dog
the old old dog
the old old old dog
the old old old old dog
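
The effect of limiting Kleene star iteration can be sketched in Python as follows. The parameter max_repeats belongs to this sketch and is not the tool's own: the documentation does not spell out how the -l value maps onto a repetition count, and in the listings above the default run stops at one repetition while -l 5 stops at four.

```python
def enumerate_np(max_repeats):
    """List the strings of 'the old* dog' with 'old' repeated 0..max_repeats times."""
    return [" ".join(["the"] + ["old"] * k + ["dog"]) for k in range(max_repeats + 1)]

# Four repetitions reproduces the five lines shown for -l 5.
for line in enumerate_np(4):
    print(line)
```

Without some bound of this kind, a starred element describes infinitely many strings, so a full enumeration would never finish.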

FSGTEST.EXE

This tool determines whether a given string (received from standard input) is accepted by the specified grammar. For example, using the np.fsg file described immediately above, we start this tool by issuing the command:

fsgtest np.fsg

After issuing this command, a user can type in a string followed by the Enter key. (Note, however, that the program does not produce a prompt.) If the entered string is accepted by np.fsg, the string is echoed. If it is rejected, the string is echoed with a preceding asterisk. This is illustrated in the following example, where each line of user input is followed by the program's response.


fsgtest np.fsg
the dog
  the dog
the old
* the old
the old dog
  the old dog
[ctrl-c]
The external process was cancelled by a Ctrl+Break or another process.
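
The accept-and-echo convention shown in this transcript can be modeled in a few lines of Python. This is a sketch, not the tool itself: the regular expression is a hand-written equivalent of np.bnf, and details such as whitespace normalization and case sensitivity are assumptions.

```python
import re

# Regular expression equivalent to np.bnf: "the", any number of "old", then "dog".
NP_PATTERN = re.compile(r"the( old)* dog")

def check(line):
    """Echo the line with two leading spaces if accepted, or '* ' if rejected."""
    text = " ".join(line.split())          # collapse runs of whitespace
    prefix = "  " if NP_PATTERN.fullmatch(text) else "* "
    return prefix + text

for line in ["the dog", "the old", "the old dog"]:
    print(check(line))
```

Run on the three inputs from the transcript, the sketch marks only the old as rejected.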

A set of test strings can be input to this program by using file redirection. For example, suppose we have a currency grammar (compiled as currency.fsg) that we would like to test. We have a file, good-set, containing strings we would like the grammar to accept:

twelve hundred dollars
four thousand two hundred and eleven dollars
a dollar fifty
a dollar and a half
one and a half dollars

We also have a file, bad-set, of ill-formed strings that we want the grammar to reject:

four thousand twelve hundred dollars
four dollars fifty
one dollar and a half

We can test good-set by using the command:

fsgtest currency.fsg < good-set

which results in the following output:

  twelve hundred dollars
  four thousand two hundred and eleven dollars
  a dollar fifty
  a dollar and a half
  one and a half dollars

We can test bad-set by using the command:

fsgtest currency.fsg < bad-set

which results in the following output:

* four thousand twelve hundred dollars
* four dollars fifty
* one dollar and a half
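
When the test sets grow, checking the output by eye becomes tedious. The following Python sketch splits captured fsgtest output into accepted and rejected strings, relying only on the leading-asterisk convention described above. The function name split_results is hypothetical, and the sketch assumes the output was produced without the -p option.

```python
def split_results(output):
    """Split fsgtest-style output into (accepted, rejected) lists of strings."""
    accepted, rejected = [], []
    for line in output.splitlines():
        if not line.strip():
            continue                       # skip blank lines
        if line.startswith("*"):
            rejected.append(line.lstrip("*").strip())
        else:
            accepted.append(line.strip())
    return accepted, rejected

# Output copied from the bad-set run shown above.
bad_output = """* four thousand twelve hundred dollars
* four dollars fifty
* one dollar and a half
"""

accepted, rejected = split_results(bad_output)
print(len(accepted), len(rejected))        # 0 3
```

For good-set the expectation is reversed: the rejected list should be empty, and any string that lands in it points to a gap in the grammar.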

Syntax of fsgtest.exe

The syntax of this tool is:

fsgtest [options] fsg-filename

The available options are:

-a: This option shows the annotations associated with the accepted input string.
-p: This option displays the probability associated with each string.

Example

(Once again, each line of user input is followed by the program's response.)


fsgtest -p np.fsg
the dog
0.50029 the dog
the old
******* the old
the old dog
0.25029 the old dog
[ctrl-c]
The external process was cancelled by a Ctrl+Break or another process.

To illustrate the use of annotations we compile the following bnf file to produce np2.fsg:

;; np2.bnf
;;
<np>  = <det> <adj>* <n>.

; lexicon

<adj> = old:"adjective" .
<det> = the:"determiner" .
<n>   = dog:"noun" .

Now we use fsgtest with the -a option:


fsgtest -a np2.fsg
the dog
  the:"determiner" dog:"noun"
the old
  the:"determiner" old:"adjective"
the old dog
  the:"determiner" old:"adjective" dog:"noun"
[ctrl-c]
The external process was cancelled by a Ctrl+Break or another process.
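
The annotated output format can be reproduced with a small Python sketch. The LEXICON table simply mirrors the annotations in the grammar above; the sketch formats words only and does not decide whether a string is accepted.

```python
# Word-to-annotation table mirroring the lexicon of the annotated grammar.
LEXICON = {"the": "determiner", "old": "adjective", "dog": "noun"}

def annotate(sentence):
    """Format each word as word:"annotation"; a word missing from LEXICON raises KeyError."""
    return " ".join(f'{word}:"{LEXICON[word]}"' for word in sentence.split())

print(annotate("the old dog"))
```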

FSGPRINT.EXE

This tool displays the fsg file as a graph. For each state in the graph it displays the transitions from that state. For each transition it displays both the terminal word associated with the transition and the local probability of that transition. For example, using the np.fsg listed above:

fsgprint np.fsg

 --- fsg np.fsg ---

state 0
         the  -> state 1    p=1.00
state 1
         dog  -> state *    p=0.50
         old  -> state 1    p=0.50
state *

This shows that the grammar has three states (state 0, state 1, and state *). There is only one transition from state 0. It accepts the word "the" and goes to state 1. If you are in state 0, the probability of taking this transition is 1.00, since it is the only transition. State 1 has two transitions. The first transition accepts the word "dog" and goes to the final state, state *. The second transition accepts the word "old" and loops back to state 1. Since there are two transitions out of state 1, the local probability of each of these transitions is 0.50.
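
The graph can be written down as a data structure and walked word by word. The Python sketch below does so and multiplies the local probabilities along the path. Treating this product as the probability of the whole string is an assumption of the sketch: the values reported by fsgtest -p earlier in this document (0.50029 and 0.25029) are close to, but not exactly, these products, and the documentation does not explain the difference.

```python
# The graph printed for np.fsg: state -> {word: (next_state, local_probability)}.
GRAPH = {
    "0": {"the": ("1", 1.00)},
    "1": {"dog": ("*", 0.50), "old": ("1", 0.50)},
    "*": {},
}

def path_probability(sentence, graph=GRAPH, start="0", final="*"):
    """Multiply local transition probabilities; return None if the string is rejected."""
    state, prob = start, 1.0
    for word in sentence.split():
        if word not in graph[state]:
            return None                    # no transition for this word
        state, p = graph[state][word]
        prob *= p
    return prob if state == final else None

print(path_probability("the dog"))         # 0.5
print(path_probability("the old dog"))     # 0.25
print(path_probability("the old"))         # None
```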

Syntax of fsgprint.exe

The syntax of this tool is:

fsgprint [options] fsg-filename

The available options are:

-v: This option displays more detailed information about the fsg file, such as the values of certain engine parameters and a more explicit description of the graph.
