edu.cmu.sphinx.frontend.endpoint
Class SpeechMarker

java.lang.Object
  extended by edu.cmu.sphinx.frontend.BaseDataProcessor
      extended by edu.cmu.sphinx.frontend.endpoint.SpeechMarker
All Implemented Interfaces:
Configurable, DataProcessor

public class SpeechMarker
extends BaseDataProcessor

Converts a stream of SpeechClassifiedData objects, each marked as speech or non-speech, into a stream in which the regions considered speech are marked out. This is done by inserting SPEECH_START and SPEECH_END signals into the stream.

The algorithm for inserting the two signals is as follows.

The algorithm is always in one of two states: 'in-speech' or 'out-of-speech'. In the 'out-of-speech' state, it reads audio until it encounters audio that is speech. Once it has read more than 'startSpeech' amount of continuous speech, it considers that speech has started and inserts a SPEECH_START at 'speechLeader' time before the point where speech first started. The state of the algorithm then changes to 'in-speech'.

Now consider the case when the algorithm is in the 'in-speech' state. If it reads audio that is speech, that audio is output. If the audio is non-speech, the algorithm reads ahead until it has 'endSilence' amount of continuous non-speech. At that point it considers that speech has ended. A SPEECH_END signal is inserted at 'speechTrailer' time after the first non-speech audio, and the algorithm returns to the 'out-of-speech' state. If any speech audio is encountered in between, the accounting starts all over again.
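
The two-state behaviour described above can be illustrated with a small, self-contained sketch. It is not the SpeechMarker implementation: the class name SpeechMarkerSketch, the method mark, the fixed frame duration frameMs, and the "S"/"N" frame tokens are all assumptions made for illustration. Closing an open region at the end of the input is also an assumption, since the description above does not cover end of stream.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; names and frame model are hypothetical. */
final class SpeechMarkerSketch {

    /**
     * Marks speech regions in a sequence of fixed-length frames.
     * speech[i] is true for a speech frame; all times are in milliseconds.
     * Returns frames as "S"/"N" with SPEECH_START/SPEECH_END inserted.
     */
    static List<String> mark(boolean[] speech, int frameMs, int startSpeech,
                             int endSilence, int speechLeader, int speechTrailer) {
        List<String> out = new ArrayList<>();
        boolean inSpeech = false;
        int runStart = -1; // index in 'out' where the current run began
        int runMs = 0;     // length of the current continuous run
        int floor = 0;     // SPEECH_START may not be placed before this index

        for (boolean isSpeech : speech) {
            out.add(isSpeech ? "S" : "N");
            int idx = out.size() - 1;
            if (!inSpeech) {
                if (!isSpeech) {       // non-speech breaks the speech run
                    runMs = 0;
                    continue;
                }
                if (runMs == 0) runStart = idx;
                runMs += frameMs;
                if (runMs > startSpeech) {
                    // Insert the start signal 'speechLeader' before the run began.
                    int at = Math.max(floor, runStart - speechLeader / frameMs);
                    out.add(at, "SPEECH_START");
                    inSpeech = true;
                    runMs = 0;
                }
            } else {
                if (isSpeech) {        // speech restarts the silence accounting
                    runMs = 0;
                    continue;
                }
                if (runMs == 0) runStart = idx;
                runMs += frameMs;
                if (runMs >= endSilence) {
                    // Insert the end signal 'speechTrailer' after the first non-speech frame.
                    int at = Math.min(out.size(), runStart + speechTrailer / frameMs);
                    out.add(at, "SPEECH_END");
                    floor = at + 1;
                    inSpeech = false;
                    runMs = 0;
                }
            }
        }
        // Assumption: an open speech region is closed when the input ends.
        if (inSpeech) out.add("SPEECH_END");
        return out;
    }
}
```

With 10 ms frames, startSpeech = 20, endSilence = 30, and a leader and trailer of 10 ms each, the input N N S S S N N N N yields N SPEECH_START N S S S N SPEECH_END N N N: one non-speech frame is kept on each side of the speech region.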


Field Summary
static java.lang.String PROP_END_SILENCE
          The SphinxProperty for the amount of time in silence (in milliseconds) to be considered as utterance end.
static int PROP_END_SILENCE_DEFAULT
          The default value of PROP_END_SILENCE.
static java.lang.String PROP_SPEECH_LEADER
          The SphinxProperty for the amount of time (in milliseconds) before speech start to be included as speech data.
static int PROP_SPEECH_LEADER_DEFAULT
          The default value of PROP_SPEECH_LEADER.
static java.lang.String PROP_SPEECH_TRAILER
          The SphinxProperty for the amount of time (in milliseconds) after speech ends to be included as speech data.
static int PROP_SPEECH_TRAILER_DEFAULT
          The default value of PROP_SPEECH_TRAILER.
static java.lang.String PROP_START_SPEECH
          The SphinxProperty for the minimum amount of time in speech (in milliseconds) to be considered as utterance start.
static int PROP_START_SPEECH_DEFAULT
          The default value of PROP_START_SPEECH.
 
Constructor Summary
SpeechMarker()
           
 
Method Summary
 int getAudioTime(edu.cmu.sphinx.frontend.endpoint.SpeechClassifiedData audio)
          Returns the amount of audio data in milliseconds in the given SpeechClassifiedData object.
 Data getData()
          Returns the next Data object.
 void initialize()
          Initializes this SpeechMarker.
 void newProperties(PropertySheet ps)
          This method is called when this configurable component has new data.
 void register(java.lang.String name, Registry registry)
          Register my properties.
 
Methods inherited from class edu.cmu.sphinx.frontend.BaseDataProcessor
getName, getPredecessor, getTimer, setPredecessor, toString
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

PROP_START_SPEECH

public static final java.lang.String PROP_START_SPEECH
The SphinxProperty for the minimum amount of time in speech (in milliseconds) to be considered as utterance start.

See Also:
Constant Field Values

PROP_START_SPEECH_DEFAULT

public static final int PROP_START_SPEECH_DEFAULT
The default value of PROP_START_SPEECH.

See Also:
Constant Field Values

PROP_END_SILENCE

public static final java.lang.String PROP_END_SILENCE
The SphinxProperty for the amount of time in silence (in milliseconds) to be considered as utterance end.

See Also:
Constant Field Values

PROP_END_SILENCE_DEFAULT

public static final int PROP_END_SILENCE_DEFAULT
The default value of PROP_END_SILENCE.

See Also:
Constant Field Values

PROP_SPEECH_LEADER

public static final java.lang.String PROP_SPEECH_LEADER
The SphinxProperty for the amount of time (in milliseconds) before speech start to be included as speech data.

See Also:
Constant Field Values

PROP_SPEECH_LEADER_DEFAULT

public static final int PROP_SPEECH_LEADER_DEFAULT
The default value of PROP_SPEECH_LEADER.

See Also:
Constant Field Values

PROP_SPEECH_TRAILER

public static final java.lang.String PROP_SPEECH_TRAILER
The SphinxProperty for the amount of time (in milliseconds) after speech ends to be included as speech data.

See Also:
Constant Field Values

PROP_SPEECH_TRAILER_DEFAULT

public static final int PROP_SPEECH_TRAILER_DEFAULT
The default value of PROP_SPEECH_TRAILER.

See Also:
Constant Field Values
Constructor Detail

SpeechMarker

public SpeechMarker()
Method Detail

register

public void register(java.lang.String name,
                     Registry registry)
              throws PropertyException
Description copied from interface: Configurable
Register my properties. This method is called once early in the lifetime of the component, shortly after the component is constructed. This component should register any configuration properties that it needs to register. If this configurable extends another configurable, super.register should also be called.

Specified by:
register in interface Configurable
Overrides:
register in class BaseDataProcessor
Throws:
PropertyException

newProperties

public void newProperties(PropertySheet ps)
                   throws PropertyException
Description copied from interface: Configurable
This method is called when this configurable component has new data. The component should first validate the data. If it is bad the component should return false. If the data is good, the component should record the data internally and return true.

Specified by:
newProperties in interface Configurable
Overrides:
newProperties in class BaseDataProcessor
Throws:
PropertyException

initialize

public void initialize()
Initializes this SpeechMarker.

Specified by:
initialize in interface DataProcessor
Overrides:
initialize in class BaseDataProcessor

getData

public Data getData()
             throws DataProcessingException
Returns the next Data object.

Specified by:
getData in interface DataProcessor
Specified by:
getData in class BaseDataProcessor
Returns:
the next Data object, or null if none available
Throws:
DataProcessingException - if a data processing error occurs

getAudioTime

public int getAudioTime(edu.cmu.sphinx.frontend.endpoint.SpeechClassifiedData audio)
Returns the amount of audio data in milliseconds in the given SpeechClassifiedData object.

Parameters:
audio - the SpeechClassifiedData object
Returns:
the amount of audio data in milliseconds
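
This page does not show how the duration is computed. As a rough sketch, one plausible computation derives it from the number of samples and the sample rate. The class AudioTimeSketch, the method audioTimeMs, and both parameters are hypothetical and are not part of the SpeechMarker API.

```java
/** Illustrative sketch only; not part of the SpeechMarker API. */
final class AudioTimeSketch {

    /**
     * Returns the duration in milliseconds of sampleCount samples
     * recorded at sampleRate samples per second (assumed inputs).
     */
    static int audioTimeMs(int sampleCount, int sampleRate) {
        if (sampleRate <= 0) {
            throw new IllegalArgumentException("sampleRate must be positive");
        }
        // Use long arithmetic so sampleCount * 1000 cannot overflow an int.
        return (int) (sampleCount * 1000L / sampleRate);
    }
}
```

For example, 160 samples at 16000 samples per second correspond to 10 milliseconds of audio.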