Sphinx-4 Transcriber Demo

A simple Sphinx-4 application that transcribes a continuous audio file containing multiple utterances. The audio file should contain connected-digits data. The default file, "10001-90210-01803.wav", contains three utterances separated by silences. To transcribe non-digits data, modify the config.xml file to use the appropriate grammar, language model, and linguist. Refer to the Programmer's Guide for how to modify the configuration file for your purposes.

Building

Check whether the bin directory already contains the Transcriber.jar file. If not, type the following in the top-level directory:

ant -find demo.xml

Running

First, make sure that you have JSAPI set up correctly. Then, to run the demo, type:

sphinx4> java -jar bin/Transcriber.jar

You will see the following result, with each utterance on its own line:

one zero zero zero one
nine oh two one oh
zero one eight zero three
      

NOTE:

  1. Make sure that you are using Java(TM) 2 SDK, Standard Edition, v1.4 or higher.
  2. If you have the source distribution, make sure that the JAR file lib/sphinx4.jar has been built. If not, go to the top-level directory and type: ant
  3. You can supply your own test files, but they must contain digits data. Make sure that the audio format matches the one specified in the config.xml file, which is 16-bit signed linear PCM, 16 kHz, little-endian. The file can be of any type readable by Java Sound, e.g., .wav or .au. To test your own file, supply it as an argument. For example, if your test file is called test.wav, type:

    java -jar bin/Transcriber.jar test.wav
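
One way to check the format of your own file before running the demo is sketched below. The class AudioFormatCheck and its method matchesDemoFormat are hypothetical names and are not part of the Sphinx-4 distribution. The sketch uses only the standard javax.sound.sampled API and assumes the format values given in note 3 (16-bit signed linear PCM, 16 kHz, little-endian). It does not check the number of channels, since this document does not state it.

```java
import java.io.File;
import java.io.IOException;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.UnsupportedAudioFileException;

// Hypothetical helper for illustration; not part of Sphinx-4.
class AudioFormatCheck {

    // Returns true if the format is 16-bit signed linear PCM,
    // sampled at 16 kHz, with little-endian byte order.
    static boolean matchesDemoFormat(AudioFormat format) {
        return AudioFormat.Encoding.PCM_SIGNED.equals(format.getEncoding())
            && format.getSampleSizeInBits() == 16
            && format.getSampleRate() == 16000.0f
            && !format.isBigEndian();
    }

    public static void main(String[] args)
            throws IOException, UnsupportedAudioFileException {
        if (args.length != 1) {
            System.err.println("Usage: java AudioFormatCheck <audio file>");
            return;
        }
        // Java Sound reads the header and reports the audio format.
        AudioFormat format =
            AudioSystem.getAudioFileFormat(new File(args[0])).getFormat();
        System.out.println("Detected format: " + format);
        if (matchesDemoFormat(format)) {
            System.out.println("Format matches the one described in note 3.");
        } else {
            System.out.println("Format differs; convert the file or adjust config.xml.");
        }
    }
}
```

To try it, save the class as AudioFormatCheck.java, compile it with javac, and run: java AudioFormatCheck test.wav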


Copyright 1999-2004 Carnegie Mellon University.
Portions Copyright 2002-2004 Sun Microsystems, Inc.
Portions Copyright 2002-2004 Mitsubishi Electric Research Laboratories.
All Rights Reserved. Usage is subject to license terms.