java.lang.Object
  edu.cmu.sphinx.linguist.language.ngram.large.LargeTrigramModel
Queries a binary language model file generated by the CMU-Cambridge Statistical Language Modelling Toolkit. Note that all probabilities in the grammar are stored in the LogMath log base format. Language probabilities in the language model file are stored in log base 10; they are converted to the LogMath log base.
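The base conversion mentioned above can be illustrated with a minimal sketch. It does not use the Sphinx LogMath class. The class name LogBaseSketch, the method log10ToLogBase, and the example base of 1.0001 are assumptions made purely for illustration.

```java
/** Hypothetical helper, not part of Sphinx: converts log base 10 values to another log base. */
final class LogBaseSketch {

    private LogBaseSketch() {
    }

    /**
     * Converts a value expressed in log base 10 to the same value expressed
     * in the target log base, using log_b(x) = log_10(x) * ln(10) / ln(b).
     */
    static double log10ToLogBase(double log10Value, double targetBase) {
        if (targetBase <= 0.0 || targetBase == 1.0) {
            throw new IllegalArgumentException("log base must be positive and not 1");
        }
        return log10Value * Math.log(10.0) / Math.log(targetBase);
    }

    public static void main(String[] args) {
        double log10Prob = -2.0; // a probability of 0.01 as it would appear in log base 10
        double base = 1.0001;    // assumed example base, chosen only for this sketch
        double converted = log10ToLogBase(log10Prob, base);
        // Raising the base to the converted value recovers the linear probability.
        System.out.println("converted log value: " + converted);
        System.out.println("linear probability:  " + Math.pow(base, converted));
    }
}
```

A base close to 1 spreads log values over a wide numeric range, which is one reason a recognizer might convert away from base 10. The base actually used is determined by the configured logMath component.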
Field Summary | |
static int |
BYTES_PER_BIGRAM
The number of bytes per bigram in the LM file generated by the CMU-Cambridge Statistical Language Modelling Toolkit. |
static int |
BYTES_PER_TRIGRAM
The number of bytes per trigram in the LM file generated by the CMU-Cambridge Statistical Language Modelling Toolkit. |
static java.lang.String |
PROP_APPLY_LANGUAGE_WEIGHT_AND_WIP
Sphinx property that controls whether or not the language model will apply the language weight and word insertion probability |
static boolean |
PROP_APPLY_LANGUAGE_WEIGHT_AND_WIP_DEFAULT
The default value for PROP_APPLY_LANGUAGE_WEIGHT_AND_WIP |
static java.lang.String |
PROP_BIGRAM_CACHE_SIZE
A sphinx property that defines the maximum number of bigrams to be cached. |
static int |
PROP_BIGRAM_CACHE_SIZE_DEFAULT
The default value for the PROP_BIGRAM_CACHE_SIZE property |
static java.lang.String |
PROP_CLEAR_CACHES_AFTER_UTTERANCE
A sphinx property that controls whether the bigram and trigram caches are cleared after every utterance |
static boolean |
PROP_CLEAR_CACHES_AFTER_UTTERANCE_DEFAULT
The default value for the PROP_CLEAR_CACHES_AFTER_UTTERANCE property |
static java.lang.String |
PROP_FULL_SMEAR
If true, use full bigram information to determine smear |
static boolean |
PROP_FULL_SMEAR_DEFAULT
Default value for PROP_FULL_SMEAR |
static java.lang.String |
PROP_LANGUAGE_WEIGHT
Sphinx property that defines the language weight for the search |
static float |
PROP_LANGUAGE_WEIGHT_DEFAULT
The default value for the PROP_LANGUAGE_WEIGHT property |
static java.lang.String |
PROP_LOG_MATH
Sphinx property that defines the logMath component. |
static java.lang.String |
PROP_QUERY_LOG_FILE
Sphinx property for the name of the file that logs all the queried N-grams. |
static java.lang.String |
PROP_QUERY_LOG_FILE_DEFAULT
The default value for PROP_QUERY_LOG_FILE. |
static java.lang.String |
PROP_TRIGRAM_CACHE_SIZE
A sphinx property that defines the maximum number of trigrams to be cached |
static int |
PROP_TRIGRAM_CACHE_SIZE_DEFAULT
The default value for the PROP_TRIGRAM_CACHE_SIZE property |
static java.lang.String |
PROP_WORD_INSERTION_PROBABILITY
Word insertion probability property |
static double |
PROP_WORD_INSERTION_PROBABILITY_DEFAULT
The default value for PROP_WORD_INSERTION_PROBABILITY |
Fields inherited from interface edu.cmu.sphinx.linguist.language.ngram.LanguageModel |
PROP_DICTIONARY, PROP_FORMAT, PROP_FORMAT_DEFAULT, PROP_LOCATION, PROP_LOCATION_DEFAULT, PROP_MAX_DEPTH, PROP_MAX_DEPTH_DEFAULT, PROP_UNIGRAM_WEIGHT, PROP_UNIGRAM_WEIGHT_DEFAULT |
Constructor Summary | |
LargeTrigramModel()
|
Method Summary | |
void |
allocate()
Create the language model |
void |
deallocate()
Deallocate resources allocated to this language model |
float |
getBackoff(WordSequence wordSequence)
Returns the backoff probability for the given sequence of words |
int |
getBigramMisses()
Returns the number of times when a bigram is queried, but there is no bigram in the LM (in which case it uses the backoff probabilities). |
int |
getMaxDepth()
Returns the maximum depth of the language model |
java.lang.String |
getName()
Retrieves the name for this configurable component |
float |
getProbability(WordSequence wordSequence)
Gets the ngram probability of the word sequence represented by the word list |
float |
getSmear(WordSequence wordSequence)
Gets the smear term for the given wordSequence |
float |
getSmearOld(WordSequence wordSequence)
Gets the smear term for the given wordSequence |
int |
getTrigramHits()
Returns the number of trigram hits. |
int |
getTrigramMisses()
Returns the number of times when a trigram is queried, but there is no trigram in the LM (in which case it uses the backoff probabilities). |
java.util.Set |
getVocabulary()
Returns the set of words in the language model. |
int |
getWordID(Word word)
Returns the ID of the given word. |
void |
newProperties(PropertySheet ps)
This method is called when this configurable component has new data. |
void |
register(java.lang.String name, Registry registry)
Register my properties. |
void |
start()
Called before a recognition |
void |
stop()
Called after a recognition |
Methods inherited from class java.lang.Object |
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
public static final java.lang.String PROP_QUERY_LOG_FILE
public static final java.lang.String PROP_QUERY_LOG_FILE_DEFAULT
public static final java.lang.String PROP_TRIGRAM_CACHE_SIZE
public static final int PROP_TRIGRAM_CACHE_SIZE_DEFAULT
public static final java.lang.String PROP_BIGRAM_CACHE_SIZE
public static final int PROP_BIGRAM_CACHE_SIZE_DEFAULT
public static final java.lang.String PROP_CLEAR_CACHES_AFTER_UTTERANCE
public static final boolean PROP_CLEAR_CACHES_AFTER_UTTERANCE_DEFAULT
public static final java.lang.String PROP_LANGUAGE_WEIGHT
public static final float PROP_LANGUAGE_WEIGHT_DEFAULT
public static final java.lang.String PROP_LOG_MATH
public static final java.lang.String PROP_APPLY_LANGUAGE_WEIGHT_AND_WIP
public static final boolean PROP_APPLY_LANGUAGE_WEIGHT_AND_WIP_DEFAULT
public static final java.lang.String PROP_WORD_INSERTION_PROBABILITY
public static final double PROP_WORD_INSERTION_PROBABILITY_DEFAULT
public static final java.lang.String PROP_FULL_SMEAR
public static final boolean PROP_FULL_SMEAR_DEFAULT
public static final int BYTES_PER_BIGRAM
public static final int BYTES_PER_TRIGRAM
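The language weight and word insertion probability properties listed above adjust a language model score before it is combined with acoustic scores. A common convention in the log domain is to multiply the log probability by the language weight and add the log of the word insertion probability. The sketch below assumes that convention. The source does not state the exact formula this class applies, and the names WeightSketch and applyWeights are hypothetical.

```java
/** Hypothetical sketch, not the Sphinx implementation. */
final class WeightSketch {

    private WeightSketch() {
    }

    /**
     * Scales a natural-log probability by the language weight and adds the
     * natural log of the word insertion probability. When apply is false the
     * score is returned unchanged, mirroring an on/off switch such as
     * PROP_APPLY_LANGUAGE_WEIGHT_AND_WIP.
     */
    static double applyWeights(double logProb, double languageWeight,
                               double wordInsertionProbability, boolean apply) {
        if (!apply) {
            return logProb;
        }
        if (wordInsertionProbability <= 0.0) {
            throw new IllegalArgumentException("word insertion probability must be positive");
        }
        return logProb * languageWeight + Math.log(wordInsertionProbability);
    }

    public static void main(String[] args) {
        double logProb = Math.log(0.1);
        System.out.println(applyWeights(logProb, 2.0, 0.5, true));
        System.out.println(applyWeights(logProb, 2.0, 0.5, false));
    }
}
```

In this sketch both terms use the natural log. In a real system the score and the insertion term must share whatever log base the recognizer uses.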
Constructor Detail |
public LargeTrigramModel()
Method Detail |
public void register(java.lang.String name, Registry registry) throws PropertyException
Configurable
register
in interface Configurable
name
- the name of the component
registry
- the registry for this component
PropertyException
public void newProperties(PropertySheet ps) throws PropertyException
Configurable
newProperties
in interface Configurable
ps
- a property sheet holding the new data
PropertyException
- if there is a problem with the properties.
public java.lang.String getName()
Configurable
getName
in interface Configurable
public void allocate() throws java.io.IOException
LanguageModel
allocate
in interface LanguageModel
java.io.IOException
public void deallocate()
LanguageModel
deallocate
in interface LanguageModel
public void start()
start
in interface LanguageModel
public void stop()
stop
in interface LanguageModel
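The cache size properties bound the number of cached bigrams and trigrams, and PROP_CLEAR_CACHES_AFTER_UTTERANCE controls whether the caches are cleared after every utterance. A bounded cache of this kind might be sketched as follows, using java.util.LinkedHashMap in access order with removeEldestEntry. Whether LargeTrigramModel uses this eviction policy, and whether it clears its caches from stop(), is not stated in the source. The class name BoundedCacheSketch is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of a size-bounded, least-recently-used cache. */
final class BoundedCacheSketch<K, V> extends LinkedHashMap<K, V> {

    private static final long serialVersionUID = 1L;

    private final int maxEntries;

    BoundedCacheSketch(int maxEntries) {
        // The third argument 'true' selects access order, so get() refreshes an entry.
        super(16, 0.75f, true);
        if (maxEntries <= 0) {
            throw new IllegalArgumentException("maxEntries must be positive");
        }
        this.maxEntries = maxEntries;
    }

    /** Called by LinkedHashMap after each insertion; evicts the eldest entry when over capacity. */
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        BoundedCacheSketch<String, Float> cache = new BoundedCacheSketch<>(2);
        cache.put("the cat", -1.0f);
        cache.put("cat sat", -2.0f);
        cache.get("the cat");        // marks "the cat" as recently used
        cache.put("sat down", -3.0f); // evicts "cat sat", the least recently used entry
        System.out.println(cache.keySet());
        cache.clear();                // what clearing after an utterance would amount to
        System.out.println(cache.isEmpty());
    }
}
```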
public float getProbability(WordSequence wordSequence)
getProbability
in interface LanguageModel
wordSequence
- the word sequence
public final int getWordID(Word word)
word
- the word to find the ID
public float getSmearOld(WordSequence wordSequence)
wordSequence
- the word sequence
public float getSmear(WordSequence wordSequence)
LanguageModel
getSmear
in interface LanguageModel
wordSequence
- the word sequence
public float getBackoff(WordSequence wordSequence)
wordSequence
- the sequence of words
public int getMaxDepth()
getMaxDepth
in interface LanguageModel
public java.util.Set getVocabulary()
getVocabulary
in interface LanguageModel
public int getBigramMisses()
public int getTrigramMisses()
public int getTrigramHits()
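The hit and miss counters above relate to backoff: when a queried trigram or bigram is not in the model, the score is derived from backoff probabilities instead. The following is a minimal sketch of the standard backoff scheme in the log domain, using in-memory maps rather than the binary file this class reads. All names (BackoffSketch, addTrigram, trigramProbability, and so on) are hypothetical, and the lookup details of the real class are not given in the source.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory backoff trigram model; all scores are log values. */
final class BackoffSketch {

    private final Map<String, Double> unigramProb = new HashMap<>();
    private final Map<String, Double> unigramBackoff = new HashMap<>();
    private final Map<String, Double> bigramProb = new HashMap<>();
    private final Map<String, Double> bigramBackoff = new HashMap<>();
    private final Map<String, Double> trigramProb = new HashMap<>();

    private int trigramHits;
    private int trigramMisses;
    private int bigramMisses;

    void addUnigram(String w, double logProb, double logBackoff) {
        unigramProb.put(w, logProb);
        unigramBackoff.put(w, logBackoff);
    }

    void addBigram(String w1, String w2, double logProb, double logBackoff) {
        bigramProb.put(w1 + " " + w2, logProb);
        bigramBackoff.put(w1 + " " + w2, logBackoff);
    }

    void addTrigram(String w1, String w2, String w3, double logProb) {
        trigramProb.put(w1 + " " + w2 + " " + w3, logProb);
    }

    /** Returns log P(w3 | w1 w2), backing off to the bigram when the trigram is absent. */
    double trigramProbability(String w1, String w2, String w3) {
        Double p = trigramProb.get(w1 + " " + w2 + " " + w3);
        if (p != null) {
            trigramHits++;
            return p;
        }
        trigramMisses++;
        // A missing backoff weight is treated as log(1) = 0.
        double backoff = bigramBackoff.getOrDefault(w1 + " " + w2, 0.0);
        return backoff + bigramProbability(w2, w3);
    }

    /** Returns log P(w2 | w1), backing off to the unigram when the bigram is absent. */
    double bigramProbability(String w1, String w2) {
        Double p = bigramProb.get(w1 + " " + w2);
        if (p != null) {
            return p;
        }
        bigramMisses++;
        Double unigram = unigramProb.get(w2);
        if (unigram == null) {
            throw new IllegalArgumentException("unknown word: " + w2);
        }
        return unigramBackoff.getOrDefault(w1, 0.0) + unigram;
    }

    int getTrigramHits() { return trigramHits; }

    int getTrigramMisses() { return trigramMisses; }

    int getBigramMisses() { return bigramMisses; }
}
```

In this sketch a trigram miss increments the trigram miss counter and, if the fallback bigram is also absent, the bigram miss counter as well. The counters in the real class may be maintained differently.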