Configuration Management for Sphinx-4
<config>
<component name="mySampleComponent" type="edu.cmu.sphinx.sample.MyComponent"/>
</config>
<config>
<component name="mySampleComponent" type="edu.cmu.sphinx.sample.MyComponent"/>
<component name="anotherComponent" type="edu.cmu.sphinx.sample.MyComponent"/>
<component name="aDifferentComponent" type="edu.cmu.sphinx.sample.YourComponent"/>
</config>
This configuration file defines three components: two of type MyComponent and one of type YourComponent. The two components that share a type result in two separate instances of that component being created.
<config>
<component name="dct" type="edu.cmu.sphinx.frontend.transform.DiscreteCosineTransform"/>
<component name="batchCMN" type="edu.cmu.sphinx.frontend.feature.BatchCMN"/>
<component name="liveCMN" type="edu.cmu.sphinx.frontend.feature.LiveCMN"/>
<component name="featureExtraction" type="edu.cmu.sphinx.frontend.feature.DeltasFeatureExtractor"/>
</config>
Here we see some of the components used in the front end of Sphinx-4.
<config>
<component name="concatDataSource" type="edu.cmu.sphinx.frontend.util.ConcatFileDataSource">
<property name="sampleRate" value="16000"/>
<property name="transcriptFile" value="reference.txt"/>
<property name="silenceFile" value="/lab/speech/sphinx4/data/tidigits/test/raw16k/silence1sec.raw"/>
<property name="bytesPerRead" value="320"/>
<property name="batchFile" value="tidigits.batch"/>
<property name="addRandomSilence" value="true"/>
</component>
</config>
In Sphinx-4, we call the configuration data for a component its properties. Here, we are defining six properties for the concatDataSource component. Properties are simple name/value pairs and are set with the <property> statement as shown above.
value="'Twas brillig and slithey toves" | a string |
value="3.14" | a float |
value="1E-140" | a double |
value="16000" | an integer |
value="false" | a boolean |
value="beamPruner" | a component |
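As a rough illustration of the value forms listed above, the fragment below sets one property of each kind on a single component. The component names typedExample and beamPruner, the sample types (borrowed from the earlier examples), and every property name are hypothetical; only the value syntax comes from the table. Note that the file itself never declares a type for a value, so this sketch assumes the receiving component decides how each string is interpreted.

```xml
<config>
    <!-- hypothetical component that the "pruner" property below refers to -->
    <component name="beamPruner" type="edu.cmu.sphinx.sample.YourComponent"/>

    <!-- hypothetical component; property names are for illustration only -->
    <component name="typedExample" type="edu.cmu.sphinx.sample.MyComponent">
        <property name="title" value="'Twas brillig and slithey toves"/> <!-- a string -->
        <property name="scale" value="3.14"/>                            <!-- a float -->
        <property name="threshold" value="1E-140"/>                      <!-- a double -->
        <property name="sampleRate" value="16000"/>                      <!-- an integer -->
        <property name="verbose" value="false"/>                         <!-- a boolean -->
        <property name="pruner" value="beamPruner"/>                     <!-- a component name -->
    </component>
</config>
```

A component-valued property is written exactly like a string; it is the name of another component defined in the same file.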
<component name="fileManager" type="edu.cmu.sphinx.sample.FileManager">
<propertylist name="fileNames">
<item>file1.txt</item>
<item>file2.txt</item>
<item>file3.txt</item>
</propertylist>
</component>
Property lists of components are defined similarly:
<component name="mfcLiveFrontEnd" type="edu.cmu.sphinx.frontend.FrontEnd">
<propertylist name="pipeline">
<item>concatDataSource </item>
<item>speechClassifier </item>
<item>speechMarker </item>
<item>nonSpeechDataFilter </item>
<item>premphasizer </item>
<item>windower </item>
<item>fft </item>
<item>melFilterBank </item>
<item>dct </item>
<item>liveCMN </item>
<item>featureExtraction </item>
</propertylist>
</component>
Element | Attributes | Sub-elements | Description |
<config> | none | <component> <property> <propertylist> | The top level element. It has no attributes. It can have any number of the component, property and propertylist sub-elements. |
<component> | name - the component name; type - the component type | <property> <propertylist> | Defines an instance of a component. This element must always have the name and type attributes. |
<property> | name - the property name; value - the value of the property | none | Used to define a single property of a component or a global system property. This element must always have the name and value attributes. |
<propertylist> | name - the name of the property list | <item> | Used to define a list of strings or components. This element must always have the name attribute. It can have any number of item sub-elements. |
<item> | none | none | The contents of this element define a string or a component name. |
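The nesting rules in the table can be sketched in one small file that uses each element once. This is only an illustrative skeleton: the fileManager component and its fileNames list are reused from the earlier sample, while the global property logLevel and the component property bufferSize are made-up names.

```xml
<config>
    <!-- <property> directly under <config>: a global property (hypothetical name) -->
    <property name="logLevel" value="INFO"/>

    <!-- <component> always carries both name and type -->
    <component name="fileManager" type="edu.cmu.sphinx.sample.FileManager">
        <!-- <property> inside a component: name and value are both required -->
        <property name="bufferSize" value="320"/>

        <!-- <propertylist> needs a name and holds any number of <item> elements -->
        <propertylist name="fileNames">
            <item>file1.txt</item>
            <item>file2.txt</item>
        </propertylist>
    </component>
</config>
```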
<config>
<property name="absoluteBeam" value="1000"/>
<property name="relativeBeam" value="1E-10"/>
</config>
<config>
<property name="sampleRate" value="16000"/>
<component name="concatDataSource" type="edu.cmu.sphinx.frontend.util.ConcatFileDataSource">
<property name="sampleRate" value="${sampleRate}"/>
</component>
<component name="microphone" type="edu.cmu.sphinx.frontend.util.Microphone">
<property name="sampleRate" value="${sampleRate}"/>
</component>
<component name="streamDataSource" type="edu.cmu.sphinx.frontend.util.StreamDataSource">
<property name="sampleRate" value="${sampleRate}"/>
</component>
</config>
In this example we have three components, all of which need to be configured with a sampleRate. We could explicitly use the value "16000" for each of the sampleRate properties, but if we decided to change the sample rate at a later time, we would have to change it in three places. Using a global property gives us a single point where the sample rate is defined. To change the sample rate, we only have to change that one number.
<config>
<!-- ******************************************************** -->
<!-- frequently tuned properties -->
<!-- ******************************************************** -->
<property name="absoluteBeamWidth" value="-1"/>
<property name="relativeBeamWidth" value="1E-200"/>
<property name="wordInsertionProbability" value="1E-36"/>
<property name="languageWeight" value="8"/>
<property name="silenceInsertionProbability" value="1"/>
<property name="skip" value="0"/>
<!-- ******************************************************** -->
<!-- Components -->
<!-- ******************************************************** -->
<component name="batch" type="..." >
...
</component>
<!-- more omitted .... -->
</config>
<config>
<property name="cmn" value="liveCMN"/>
<component name="mfcFrontEnd" type="edu.cmu.sphinx.frontend.FrontEnd">
<propertylist name="pipeline">
<item>streamDataSource</item>
<item>premphasizer</item>
<item>windower</item>
<item>fft</item>
<item>melFilterBank</item>
<item>dct</item>
<item>${cmn}</item>
<item>featureExtraction</item>
</propertylist>
</component>
</config>
Note that you cannot substitute global variables for component names or types. Thus, this is illegal:
<config>
<property name="cmn" value="liveCMN"/>
<!-- illegal! -->
<component name="${cmn}" type="edu.cmu.sphinx.frontend.CepstralMeanNormalizer">
</component>
</config>
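Global variables remain legal in property values, so one way to keep a component choice switchable is to leave component names and types literal and substitute the global where the component is referenced. The sketch below reuses names that appear elsewhere in this document (frontend, mfcFrontEnd, threadedScorer); the pipeline is cut down to a single item for brevity, and the streamDataSource component it names is assumed to be defined elsewhere in the file.

```xml
<config>
    <!-- global property holding the name of the front end to use -->
    <property name="frontend" value="mfcFrontEnd"/>

    <!-- legal: the component name and type are written out literally -->
    <component name="mfcFrontEnd" type="edu.cmu.sphinx.frontend.FrontEnd">
        <propertylist name="pipeline">
            <item>streamDataSource</item>
        </propertylist>
    </component>

    <!-- legal: the global is substituted in a property value, not in a name or type -->
    <component name="threadedScorer"
               type="edu.cmu.sphinx.decoder.scorer.ThreadedAcousticScorer">
        <property name="frontend" value="${frontend}"/>
    </component>
</config>
```

Changing the single global then redirects every component that refers to ${frontend}.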
java -Dmicrophone[sampleRate]=44100 edu.cmu.sphinx.tools.LiveModeRecognizer tidigits.config.xml tidigits.batch
The example above sets a component property using the form componentName[propertyName]=value. The syntax for global properties is: globalProperty=value
java -Dmicrophone[sampleRate]=44100 -DabsoluteBeamWidth=2000 -DwordInsertionProbability=.01 \
edu.cmu.sphinx.tools.LiveModeRecognizer tidigits.config.xml tidigits.batch
Of course, ant has its own syntax for setting such things. Here's an example of setting properties from an ant build file:
<target name="tidigits_wordlist_live" description="Live mode TIDIGITS test.">
<java classpath="${classes_dir}" classname="${live_main}">
<sysproperty key="live[skip]" value="1"/>
<sysproperty key="speedTracker[showResponseTime]" value="true"/>
<sysproperty key="frontend" value="mfcLiveFrontEnd"/>
<arg value="${config}"/>
</java>
</target>
Note that currently, it is not possible to set the value of propertylist properties from the command line.
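To make the effect of such overrides concrete, the fragment below sketches what a run with -Dmicrophone[sampleRate]=44100 and -DabsoluteBeamWidth=2000 should be equivalent to, assuming a command-line setting simply replaces the value the configuration file would otherwise supply. The microphone component and its type are taken from the earlier global-property example; whether tidigits.config.xml actually defines them is not shown in this document.

```xml
<config>
    <!-- effective value of the global property after -DabsoluteBeamWidth=2000 -->
    <property name="absoluteBeamWidth" value="2000"/>

    <!-- effective value of the component property after -Dmicrophone[sampleRate]=44100 -->
    <component name="microphone" type="edu.cmu.sphinx.frontend.util.Microphone">
        <property name="sampleRate" value="44100"/>
    </component>
</config>
```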
java -cp ../../../bld/classes -DshowCreations=true edu.cmu.sphinx.tools.batch.BatchModeRecognizer \
tidigits.config.xml tidigits.batch
Creating: batch
Creating: connectedDigitsRecognizer
Creating: digitsDecoder
Creating: searchManager
Creating: logMath
Creating: flatLinguist
Creating: wordListGrammar
Creating: dictionary
Creating: acousticModel
Creating: sphinx3Loader
Creating: trivialPruner
Creating: threadedScorer
Creating: mfcFrontEnd
Creating: streamDataSource
Creating: premphasizer
Creating: windower
Creating: fft
Creating: melFilterBank
Creating: dct
Creating: batchCMN
Creating: featureExtraction
Creating: activeList
Creating: accuracyTracker
Creating: speedTracker
Creating: memoryTracker
Creating: recognizerMonitor
Creating: linguistStats
============ config =============
fft:
numberFftPoints = [DEFAULT]
trivialPruner:
searchManager:
scorer = threadedScorer
activeListFactory = activeList
pruner = trivialPruner
logMath = logMath
growSkipInterval = [DEFAULT]
showTokenCount = [DEFAULT]
wantEntryPruning = [DEFAULT]
linguist = flatLinguist
relativeWordBeamWidth = [DEFAULT]
wordRecognizer:
decoder = decoder
melFilterBank:
numberFilters = [DEFAULT]
maximumFrequency = [DEFAULT]
minimumFrequency = [DEFAULT]
threadedScorer:
numThreads = 0
scoreablesKeepFeature = true
frontend = mfcFrontEnd
isCpuRelative = true
minScoreablesPerThread = 10
premphasizer:
factor = [DEFAULT]
memoryTracker:
showDetails = [DEFAULT]
showSummary = [DEFAULT]
recognizer = wordRecognizer
<?xml version="1.0" encoding="UTF-8"?>
<!--
Sphinx-4 Configuration file
-->
<!-- ******************************************************** -->
<!-- tidigits configuration file -->
<!-- ******************************************************** -->
<config>
<!-- ******************************************************** -->
<!-- frequently tuned properties -->
<!-- ******************************************************** -->
<property name="absoluteBeamWidth" value="-1"/>
<property name="relativeBeamWidth" value="1E-200"/>
<property name="wordInsertionProbability" value="1E-36"/>
<property name="languageWeight" value="8"/>
<property name="silenceInsertionProbability" value="1"/>
<property name="skip" value="0"/>
<property name="linguist" value="flatLinguist"/>
<property name="frontend" value="mfcFrontEnd"/>
<!-- ******************************************************** -->
<!-- batch tool configuration -->
<!-- ******************************************************** -->
<component name="batch"
type="edu.cmu.sphinx.tools.batch.BatchModeRecognizer">
<property name="recognizer" value="connectedDigitsRecognizer"/>
<property name="inputSource" value="streamDataSource"/>
<propertylist name="monitors">
<item>accuracyTracker </item>
</propertylist>
</component>
<!-- ******************************************************** -->
<!-- The connectedDigitsRecognizer configuration -->
<!-- ******************************************************** -->
<component name="connectedDigitsRecognizer"
type="edu.cmu.sphinx.recognizer.Recognizer">
<property name="decoder" value="digitsDecoder"/>
</component>
<!-- ******************************************************** -->
<!-- The Decoder configuration -->
<!-- ******************************************************** -->
<component name="digitsDecoder" type="edu.cmu.sphinx.decoder.Decoder">
<property name="searchManager" value="searchManager"/>
</component>
<component name="searchManager"
type="edu.cmu.sphinx.decoder.search.SimpleBreadthFirstSearchManager">
<property name="logMath" value="logMath"/>
<property name="linguist" value="${linguist}"/>
<property name="pruner" value="trivialPruner"/>
<property name="scorer" value="threadedScorer"/>
<property name="activeListFactory" value="activeList"/>
</component>
<component name="activeList"
type="edu.cmu.sphinx.decoder.search.SortingActiveListFactory">
<property name="logMath" value="logMath"/>
<property name="absoluteBeamWidth" value="${absoluteBeamWidth}"/>
<property name="relativeBeamWidth" value="${relativeBeamWidth}"/>
</component>
<component name="trivialPruner"
type="edu.cmu.sphinx.decoder.pruner.SimplePruner"/>
<component name="threadedScorer"
type="edu.cmu.sphinx.decoder.scorer.ThreadedAcousticScorer">
<property name="frontend" value="${frontend}"/>
<property name="isCpuRelative" value="true"/>
<property name="numThreads" value="0"/>
<property name="minScoreablesPerThread" value="10"/>
<property name="scoreablesKeepFeature" value="true"/>
</component>
<!-- ******************************************************** -->
<!-- The linguist configuration -->
<!-- ******************************************************** -->
<component name="flatLinguist"
type="edu.cmu.sphinx.linguist.flat.FlatLinguist">
<property name="logMath" value="logMath"/>
<property name="grammar" value="wordListGrammar"/>
<property name="acousticModel" value="acousticModel"/>
<property name="wordInsertionProbability"
value="${wordInsertionProbability}"/>
<property name="silenceInsertionProbability"
value="${silenceInsertionProbability}"/>
<property name="languageWeight" value="${languageWeight}"/>
</component>
<!-- ******************************************************** -->
<!-- The Grammar configuration -->
<!-- ******************************************************** -->
<component name="wordListGrammar"
type="edu.cmu.sphinx.linguist.language.grammar.SimpleWordListGrammar">
<property name="path" value="./tidigits.wordlist"/>
<property name="isLooping" value="true"/>
<property name="dictionary" value="dictionary"/>
<property name="optimizeGrammar" value="true"/>
<property name="logMath" value="logMath"/>
</component>
<!-- ******************************************************** -->
<!-- The Dictionary configuration -->
<!-- ******************************************************** -->
<component name="dictionary"
type="edu.cmu.sphinx.linguist.dictionary.FullDictionary">
<property name="location"
value="file:/lab/speech/sphinx4/data/tidigits_8gau_13dCep_16k_40mel_130Hz_6800Hz.bin.zip"/>
<property name="dictionaryPath" value="dictionary"/>
<property name="fillerPath" value="fillerdict"/>
<property name="addSilEndingPronunciation" value="false"/>
</component>
<!-- ******************************************************** -->
<!-- The acoustic model configuration -->
<!-- ******************************************************** -->
<component name="acousticModel"
type="edu.cmu.sphinx.linguist.acoustic.tiedstate.TiedStateAcousticModel">
<property name="loader" value="sphinx3Loader"/>
</component>
<component name="sphinx3Loader"
type="edu.cmu.sphinx.linguist.acoustic.tiedstate.Sphinx3Loader">
<property name="logMath" value="logMath"/>
<property name="isBinary" value="true"/>
<property name="location"
value="file:/lab/speech/sphinx4/data/tidigits_8gau_13dCep_16k_40mel_130Hz_6800Hz.bin.zip"/>
<property name="definition_file"
value="wd_dependent_phone.500.mdef"/>
<property name="data_location"
value="wd_dependent_phone.cd_continuous_8gau"/>
<property name="properties_file" value="am.props"/>
<property name="FeatureVectorLength" value="39"/>
</component>
<!-- ******************************************************** -->
<!-- The frontend configuration -->
<!-- ******************************************************** -->
<component name="mfcFrontEnd" type="edu.cmu.sphinx.frontend.FrontEnd">
<propertylist name="pipeline">
<item>streamDataSource</item>
<item>premphasizer</item>
<item>windower</item>
<item>fft</item>
<item>melFilterBank</item>
<item>dct</item>
<item>batchCMN</item>
<item>featureExtraction</item>
</propertylist>
</component>
<component name="premphasizer"
type="edu.cmu.sphinx.frontend.filter.Preemphasizer"/>
<component name="windower"
type="edu.cmu.sphinx.frontend.window.RaisedCosineWindower">
</component>
<component name="fft"
type="edu.cmu.sphinx.frontend.transform.DiscreteFourierTransform"/>
<component name="melFilterBank"
type="edu.cmu.sphinx.frontend.frequencywarp.MelFrequencyFilterBank">
</component>
<component name="dct"
type="edu.cmu.sphinx.frontend.transform.DiscreteCosineTransform"/>
<component name="batchCMN"
type="edu.cmu.sphinx.frontend.feature.BatchCMN"/>
<component name="featureExtraction"
type="edu.cmu.sphinx.frontend.feature.DeltasFeatureExtractor"/>
<component name="streamDataSource"
type="edu.cmu.sphinx.frontend.util.StreamDataSource">
<property name="sampleRate" value="16000"/>
</component>
<component name="cepstrumSource"
type="edu.cmu.sphinx.frontend.StreamCepstrumSource">
<property name="sampleRate" value="16000"/>
</component>
<!-- ******************************************************* -->
<!-- monitors -->
<!-- ******************************************************* -->
<component name="accuracyTracker"
type="edu.cmu.sphinx.instrumentation.AccuracyTracker">
<property name="recognizer" value="connectedDigitsRecognizer"/>
<property name="showAlignedResults" value="false"/>
<property name="showRawResults" value="false"/>
</component>
<!-- ******************************************************* -->
<!-- Miscellaneous components -->
<!-- ******************************************************* -->
<component name="logMath" type="edu.cmu.sphinx.util.LogMath">
<property name="logBase" value="1.0001"/>
<property name="useAddTable" value="true"/>
</component>
</config>