java.lang.Object
  edu.cmu.sphinx.linguist.lextree.LexTreeLinguist
A linguist that can represent large vocabularies efficiently. This class
implements the Linguist interface. The main role of any linguist is to
represent the search space for the decoder. The initial state in the search
space can be retrieved by a SearchManager via a call to getInitialSearchState.
This method returns a SearchState. Successor states can be retrieved via
calls to SearchState.getSuccessors(). There are a number of
search state subinterfaces that are used to indicate different types of
states in the search space:
getSearchStateOrder can be used to retrieve the order
of states returned by the linguist.
Depending on the vocabulary size and topology, the search space represented
by the linguist may include a very large number of states. Some linguists
will generate the search states dynamically, that is, the object
representing a particular state in the search space is not created until it
is needed by the SearchManager. SearchManagers often need to be able to
determine if a particular state has been entered before by comparing states.
Because SearchStates may be generated dynamically, the SearchState.equals()
call (as opposed to the reference equals '==' method) should be used to
determine if states are equal. The states returned by the linguist will
generally provide very efficient implementations of equals and hashCode.
This will allow a SearchManager to maintain
collections of states in HashMaps efficiently.
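The point about equals versus '==' can be illustrated with a minimal sketch. It assumes a hypothetical DemoState class (not the Sphinx-4 SearchState interface) whose instances are created on demand; the field names nodeId and wordHistory and the helper visit are illustrative only.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical stand-in for a dynamically generated search state.
// It is NOT the Sphinx-4 SearchState interface.
final class DemoState {
    private final int nodeId;          // position in a (hypothetical) search tree
    private final String wordHistory;  // language model context (assumed)
    private final int hash;            // cached, because states are hashed often

    DemoState(int nodeId, String wordHistory) {
        this.nodeId = nodeId;
        this.wordHistory = wordHistory;
        this.hash = Objects.hash(nodeId, wordHistory);
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof DemoState)) return false;
        DemoState other = (DemoState) o;
        return nodeId == other.nodeId && wordHistory.equals(other.wordHistory);
    }

    @Override
    public int hashCode() {
        return hash;
    }
}

class StateEqualityDemo {
    // Keeps the best score per state; returns true if the state was seen before.
    static boolean visit(Map<DemoState, Float> best, DemoState s, float score) {
        Float prev = best.get(s);
        if (prev == null || score > prev) {
            best.put(s, score);
        }
        return prev != null;
    }

    public static void main(String[] args) {
        Map<DemoState, Float> best = new HashMap<>();
        DemoState a = new DemoState(7, "the cat");
        DemoState b = new DemoState(7, "the cat"); // same state, freshly created object
        System.out.println(a == b);                // false: different objects
        System.out.println(a.equals(b));           // true: same logical state
        visit(best, a, -10.0f);
        System.out.println(visit(best, b, -8.0f)); // true: recognized as already entered
        System.out.println(best.size());           // 1
    }
}
```

Caching the hash code in the constructor is one possible way to keep lookups cheap when the same state is hashed many times; whether the real linguist does this is not stated here.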
LexTreeLinguist Characteristics
Some characteristics of this linguist:
Design Notes
The following notes describe the design of this linguist. They may be helpful to those who want to understand how this linguist works, but are not necessary if you are only interested in using it.
Search Space Representation
It has been shown that representing the search space as a tree can greatly reduce the number of active states in a search, since the units at the beginnings of words can be shared across multiple words. For example, with a large vocabulary (60K words) and a flat representation, the end of each word must provide transitions to the initial state of every possible word. That is 60K transitions. In a tree-based system, transitions are needed only to each initial phone (within its context). That is about 1600 transitions, a substantial reduction. Conceptually, this tree consists of a node for each possible initial unit. Each node can have an arbitrary number of children, which can be either unit nodes or word nodes.
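The sharing of word-initial units can be sketched with a toy lexical tree. This is a simplified illustration only: LexNode and LexTreeSketch are hypothetical names, the four-word dictionary is made up, phones are treated as context-independent, and homophones are not handled. It is not the HMMTree implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical, simplified lexical tree node; not the Sphinx-4 HMMTree.
class LexNode {
    // Maps a unit (here a plain phone name) to the child unit node.
    final Map<String, LexNode> children = new LinkedHashMap<>();
    // Non-null if a word ends at this node (stands in for a word node).
    String word;
}

class LexTreeSketch {
    // Adds one pronunciation, reusing nodes for any shared leading units.
    static void add(LexNode root, String word, String... phones) {
        LexNode node = root;
        for (String p : phones) {
            node = node.children.computeIfAbsent(p, k -> new LexNode());
        }
        node.word = word;
    }

    // Counts this node plus all of its descendants.
    static int countNodes(LexNode n) {
        int count = 1;
        for (LexNode c : n.children.values()) {
            count += countNodes(c);
        }
        return count;
    }

    // Builds a tree from a tiny made-up dictionary.
    static LexNode build() {
        LexNode root = new LexNode();
        add(root, "cat", "K", "AE", "T");
        add(root, "cab", "K", "AE", "B");
        add(root, "can", "K", "AE", "N");
        add(root, "dog", "D", "AO", "G");
        return root;
    }

    public static void main(String[] args) {
        LexNode root = build();
        // A flat representation needs one word-end transition per word: 4.
        // The tree needs one per distinct initial unit.
        System.out.println(root.children.size());  // 2 (K and D)
        // Unit nodes in the tree, excluding the root; a flat layout would need 12.
        System.out.println(countNodes(root) - 1);  // 8
    }
}
```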
This linguist uses the HMMTree class to build and represent the tree. The HMMTree is given the dictionary and language model and builds the lex tree. Instead of representing the nodes in the tree as phonemes and words, as is typically done, the HMMTree represents the tree as HMMs and words. The HMM is essentially a unit within its context. This is typically a triphone (although for some units, such as SIL, it is a simple phone). Representing the nodes as HMMs instead of phonemes yields a much larger tree, but also has some advantages:
Nested Class Summary
class | LexTreeLinguist.LexTreeEndUnitState | Represents a unit in the search space
class | LexTreeLinguist.LexTreeEndWordState | Represents the final end of utterance word
class | LexTreeLinguist.LexTreeHMMState | Represents an HMM state in the search space
class | LexTreeLinguist.LexTreeNonEmittingHMMState | Represents a non-emitting HMM state
class | LexTreeLinguist.LexTreeUnitState | Represents a unit in the search space
class | LexTreeLinguist.LexTreeWordState | Represents a word state in the search space
Field Summary
static java.lang.String | PROP_ACOUSTIC_MODEL | A sphinx property used to define the acoustic model to use when building the search graph
static java.lang.String | PROP_CACHE_SIZE | A sphinx property that defines the size of the arc cache (zero to disable the cache).
static int | PROP_CACHE_SIZE_DEFAULT | The default value for PROP_CACHE_SIZE
static java.lang.String | PROP_DICTIONARY | Property that defines the dictionary to use for this grammar
static java.lang.String | PROP_FULL_WORD_HISTORIES | A sphinx property that determines whether or not full word histories are used to determine when two states are equal.
static boolean | PROP_FULL_WORD_HISTORIES_DEFAULT | The default value for PROP_FULL_WORD_HISTORIES
static java.lang.String | PROP_GRAMMAR | A sphinx property used to define the grammar to use when building the search graph
static java.lang.String | PROP_LANGUAGE_MODEL | A sphinx property for the language model to be used by this grammar
static java.lang.String | PROP_LOG_MATH | Sphinx property that defines the name of the logmath to be used by this search manager.
static java.lang.String | PROP_UNIT_MANAGER | A sphinx property used to define the unit manager to use when building the search graph
Constructor Summary
LexTreeLinguist()
Method Summary
void | allocate() | Allocates the linguist.
void | deallocate() | Deallocates the linguist.
LanguageModel | getLanguageModel() | Retrieves the language model for this linguist
java.lang.String | getName() | Retrieves the name for this configurable component
SearchGraph | getSearchGraph() | Retrieves search graph.
void | newProperties(PropertySheet ps) | This method is called when this configurable component has new data.
void | register(java.lang.String name, Registry registry) | Register my properties.
void | startRecognition() | Called before a recognition
void | stopRecognition() | Called after a recognition
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Field Detail
public static final java.lang.String PROP_GRAMMAR
public static final java.lang.String PROP_ACOUSTIC_MODEL
public static final java.lang.String PROP_UNIT_MANAGER
public static final java.lang.String PROP_LOG_MATH
public static final java.lang.String PROP_FULL_WORD_HISTORIES
public static final boolean PROP_FULL_WORD_HISTORIES_DEFAULT
public static final java.lang.String PROP_LANGUAGE_MODEL
public static final java.lang.String PROP_DICTIONARY
public static final java.lang.String PROP_CACHE_SIZE
public static final int PROP_CACHE_SIZE_DEFAULT
Constructor Detail
public LexTreeLinguist()
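Method Detail
public void register(java.lang.String name, Registry registry) throws PropertyException
Description copied from interface: Configurable
Register my properties.
Specified by:
register in interface Configurable
Parameters:
name - the name of the component
registry - the registry for this component
Throws:
PropertyException

public void newProperties(PropertySheet ps) throws PropertyException
Description copied from interface: Configurable
This method is called when this configurable component has new data.
Specified by:
newProperties in interface Configurable
Parameters:
ps - a property sheet holding the new data
Throws:
PropertyException - if there is a problem with the properties.

public java.lang.String getName()
Description copied from interface: Configurable
Retrieves the name for this configurable component
Specified by:
getName in interface Configurable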

public void allocate() throws java.io.IOException
Description copied from interface: Linguist
Allocates the linguist.
Implementor's Note - A well-written linguist will allow allocate to be called multiple times without harm. This will allow a linguist to be shared by multiple search managers.
Specified by:
allocate in interface Linguist
Throws:
java.io.IOException - if an IO error occurs

public void deallocate()
Description copied from interface: Linguist
Deallocates the linguist.
Implementor's Note - If the linguist is being shared by multiple searches, deallocate should only actually deallocate things when the last call to deallocate is made. Two approaches for dealing with this: (1) Keep an allocation counter that is incremented during allocate and decremented during deallocate. Only when the counter reaches zero should the actual deallocation be performed. (2) Do nothing in deallocate and just let the GC take care of things.
Specified by:
deallocate in interface Linguist
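
Approach (1) from the implementor's note can be sketched as follows. This is a minimal illustration under stated assumptions: SharedResource, its table field, and isAllocated are hypothetical names, and the code is not taken from LexTreeLinguist or any other Sphinx-4 class.

```java
// Hypothetical resource holder illustrating the allocation counter approach.
// Not Sphinx-4 code.
class SharedResource {
    private int allocationCount = 0;
    private int[] table; // stands in for expensive data such as a search graph

    // Safe to call multiple times; only the first call builds the data.
    synchronized void allocate() {
        if (allocationCount++ == 0) {
            table = new int[1024];
        }
    }

    // Releases the data only when the last user deallocates.
    synchronized void deallocate() {
        if (allocationCount == 0) {
            return; // unbalanced call: nothing to release
        }
        if (--allocationCount == 0) {
            table = null;
        }
    }

    synchronized boolean isAllocated() {
        return table != null;
    }
}

class SharedResourceDemo {
    public static void main(String[] args) {
        SharedResource r = new SharedResource();
        r.allocate();   // first search manager
        r.allocate();   // second search manager shares the same data
        r.deallocate();
        System.out.println(r.isAllocated()); // true: one user remains
        r.deallocate();
        System.out.println(r.isAllocated()); // false: last user released it
    }
}
```

Ignoring an unbalanced deallocate call is a design choice made for this sketch; throwing an exception would be an equally reasonable alternative.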

public SearchGraph getSearchGraph()
Description copied from interface: Linguist
Retrieves search graph.
Implementor's note: This method is typically called at the beginning of each recognition and therefore should be
Specified by:
getSearchGraph in interface Linguist

public void startRecognition()
Specified by:
startRecognition in interface Linguist

public void stopRecognition()
Specified by:
stopRecognition in interface Linguist

public LanguageModel getLanguageModel()