java.lang.Object
  edu.cmu.sphinx.frontend.BaseDataProcessor
    edu.cmu.sphinx.frontend.transform.DiscreteFourierTransform
Computes the Discrete Fourier Transform (DFT) of an input sequence, using the Fast Fourier Transform (FFT). The Fourier Transform is the process of analyzing a signal into its frequency components. In speech, rather than analyzing the signal over its entire duration, we analyze one window of audio data at a time. This window is the product of applying a sliding Hamming window to the signal. Moreover, since the amplitude is far more important than the phase for speech recognition, this class returns the power spectrum of that window of data instead of the complex spectrum. Each value in the returned spectrum represents the strength of a particular frequency in that window of data.
By default, the number of FFT points is the smallest power of 2
that is equal to or larger than the number of samples in the incoming
window of data. The number of FFT points can also be set by the user
with the property defined by PROP_NUMBER_FFT_POINTS.
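The default rule described above can be sketched in a few lines. This is a minimal illustration only: the class name `FftPoints` and the method `defaultNumberFftPoints` are hypothetical and are not part of the Sphinx API, and the 410-sample window is just an example value.

```java
/** Hypothetical helper for illustration; not part of the Sphinx API. */
final class FftPoints {
    private FftPoints() {}

    /** Returns the smallest power of two that is equal to or larger than windowSize. */
    static int defaultNumberFftPoints(int windowSize) {
        if (windowSize < 1 || windowSize > (1 << 30)) {
            // Above 2^30 the next power of two no longer fits in an int.
            throw new IllegalArgumentException("windowSize out of range: " + windowSize);
        }
        int points = 1;
        while (points < windowSize) {
            points <<= 1; // double until we reach or pass the window size
        }
        return points;
    }

    public static void main(String[] args) {
        System.out.println(defaultNumberFftPoints(410)); // prints 512
        System.out.println(defaultNumberFftPoints(512)); // prints 512
    }
}
```

A window that is already a power of two is left unchanged, while any other size is rounded up, so the window is zero-padded rather than truncated.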
The length of the returned power
spectrum is the number of FFT points divided by 2, plus 1. Since
the input signal is real, its FFT is conjugate-symmetric, so the
information in the whole vector is already present in its first half.
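To make the output length concrete, here is a minimal sketch that computes a power spectrum with a direct (slow) DFT instead of an FFT, purely for readability. The class `PowerSpectrumSketch` and its method are hypothetical, and the choice to reject a window longer than the number of FFT points is an assumption of this sketch, not documented behavior of this class.

```java
/** Hypothetical illustration; uses a direct O(N^2) DFT rather than an FFT. */
final class PowerSpectrumSketch {
    private PowerSpectrumSketch() {}

    /**
     * Returns the squared magnitude of DFT bins 0 .. numberFftPoints / 2 of a
     * real-valued window, treating missing samples as zero padding.
     */
    static double[] powerSpectrum(double[] window, int numberFftPoints) {
        if (window.length > numberFftPoints) {
            // Assumption of this sketch: longer windows are rejected.
            throw new IllegalArgumentException("window longer than FFT size");
        }
        int bins = numberFftPoints / 2 + 1; // remaining bins mirror these ones
        double[] power = new double[bins];
        for (int k = 0; k < bins; k++) {
            double re = 0.0;
            double im = 0.0;
            for (int n = 0; n < window.length; n++) {
                double angle = -2.0 * Math.PI * (double) k * n / numberFftPoints;
                re += window[n] * Math.cos(angle);
                im += window[n] * Math.sin(angle);
            }
            power[k] = re * re + im * im; // phase is discarded here
        }
        return power;
    }

    public static void main(String[] args) {
        double[] window = new double[8];
        for (int n = 0; n < window.length; n++) {
            window[n] = Math.cos(2.0 * Math.PI * 2 * n / 8); // energy in bin 2
        }
        System.out.println(powerSpectrum(window, 8).length); // prints 5
    }
}
```

With 8 FFT points the result has 8 / 2 + 1 = 5 values, covering bin 0 (DC) through bin 4 (the Nyquist frequency).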
Note that each call to getData returns only
the spectrum of one window of data. To display the spectrogram of
the entire original audio, one has to collect the spectra from
all the windows generated from the original data. A spectrogram is
a two-dimensional representation of three-dimensional
information. The horizontal axis represents time. The vertical axis
represents frequency. If we slice the spectrogram at a given
time, we get the spectrum computed as the short-term Fourier
transform of the signal windowed around that time stamp. The
intensity of the spectrum for each time frame is given by the color
in the graph, or by the darkness in a grayscale plot. The
spectrogram can be thought of as a view from the top of a surface
generated by concatenating the spectral vectors obtained from the
windowed signal.
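The layout of a spectrogram can be sketched as follows. This is an illustration under stated assumptions: the class `SpectrogramSketch` is hypothetical, the spectra are assumed to have been collected already (one array per window, in time order), and the log scaling and the 0 to 255 darkness range are choices made for this example rather than anything prescribed by this class.

```java
import java.util.List;

/** Hypothetical illustration of arranging per-window spectra as a grayscale image. */
final class SpectrogramSketch {
    private SpectrogramSketch() {}

    /**
     * Returns image[row][column] with time on the horizontal axis and
     * frequency on the vertical axis (row 0 is the highest frequency).
     * Values are darkness levels from 0 (weakest) to 255 (strongest).
     */
    static int[][] toGrayscale(List<double[]> spectra) {
        int frames = spectra.size();
        if (frames == 0) {
            return new int[0][0];
        }
        int bins = spectra.get(0).length;
        double[][] logPower = new double[frames][bins];
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (int t = 0; t < frames; t++) {
            double[] spectrum = spectra.get(t);
            if (spectrum.length != bins) {
                throw new IllegalArgumentException("spectra differ in length");
            }
            for (int f = 0; f < bins; f++) {
                // Log compression; the +1 avoids log(0). An assumed scaling.
                logPower[t][f] = Math.log(spectrum[f] + 1.0);
                min = Math.min(min, logPower[t][f]);
                max = Math.max(max, logPower[t][f]);
            }
        }
        double range = max - min;
        int[][] image = new int[bins][frames];
        for (int t = 0; t < frames; t++) {
            for (int f = 0; f < bins; f++) {
                double norm = range == 0.0 ? 0.0 : (logPower[t][f] - min) / range;
                image[bins - 1 - f][t] = (int) Math.round(255.0 * norm);
            }
        }
        return image;
    }
}
```

Each column of the result is one window's spectrum, so reading a single column corresponds to slicing the spectrogram at one point in time.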
For example, Figure 1 below shows the audio signal of the utterance "one three nine oh", and Figure 2 shows its spectrogram, produced by putting together all the spectra returned by this FFT. Frequency is on the vertical axis, and time is on the horizontal axis. The darkness of the shade represents the strength of that frequency at that point in time:
Figure 1: The audio signal of the
utterance "one three nine oh".
Figure 2: The spectrogram
of the utterance "one three nine oh" in Figure 1.
Field Summary
static java.lang.String | PROP_NUMBER_FFT_POINTS | The name of the SphinxProperty for the number of points in the Fourier Transform.
Constructor Summary
DiscreteFourierTransform()
Method Summary
Data | getData() | Reads the next DoubleData object, which is a data frame from which we'll compute the power spectrum.
void | initialize() | Initializes this DataProcessor.
void | newProperties(PropertySheet ps) | This method is called when this configurable component has new data.
void | register(java.lang.String name, Registry registry) | Register my properties.
Methods inherited from class edu.cmu.sphinx.frontend.BaseDataProcessor
getName, getPredecessor, getTimer, setPredecessor, toString
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, wait, wait, wait
Field Detail
public static final java.lang.String PROP_NUMBER_FFT_POINTS
Constructor Detail
public DiscreteFourierTransform()
Method Detail
public void register(java.lang.String name, Registry registry) throws PropertyException
Description copied from interface: Configurable
Specified by: register in interface Configurable
Overrides: register in class BaseDataProcessor
Throws: PropertyException
public void newProperties(PropertySheet ps) throws PropertyException
Description copied from interface: Configurable
Specified by: newProperties in interface Configurable
Overrides: newProperties in class BaseDataProcessor
Throws: PropertyException
public void initialize()
Description copied from class: BaseDataProcessor
Specified by: initialize in interface DataProcessor
Overrides: initialize in class BaseDataProcessor
public Data getData() throws DataProcessingException
Specified by: getData in interface DataProcessor
Overrides: getData in class BaseDataProcessor
Throws: DataProcessingException - if there is a processing error