edu.cmu.sphinx.frontend.util
Class DataUtil

java.lang.Object
  extended by edu.cmu.sphinx.frontend.util.DataUtil

public class DataUtil
extends java.lang.Object

Defines utility methods for manipulating data values.


Method Summary
static void bytesToFile(byte[] data, java.lang.String filename)
          Saves the given bytes to the given binary file.
static short bytesToShort(byte[] byteArray, int offset)
          Converts the two bytes starting at the given offset to a short.
static double[] bytesToValues(byte[] byteArray, int offset, int length, int bytesPerValue, boolean signedData)
          Converts a big-endian byte array into an array of doubles.
static short[] byteToShortArray(byte[] byteArray, int offset, int length)
          Converts a byte array into a short array.
static java.lang.String doubleArrayToString(double[] data)
          Returns the given double array as a string.
static java.lang.String floatArrayToString(float[] data)
          Returns the given float array as a string.
static java.lang.String formatDouble(double number, int integerDigits, int fractionDigits)
          Returns a formatted string of the given number, with the given number of character positions reserved for the integer and fraction parts.
static javax.sound.sampled.AudioFormat getNativeAudioFormat(javax.sound.sampled.AudioFormat format)
          Returns a native audio format that has the same encoding, endianness and sample size as the given format, and a sample rate that is larger than the given sample rate.
static javax.sound.sampled.AudioFormat getNativeAudioFormat(javax.sound.sampled.AudioFormat format, javax.sound.sampled.Mixer mixer)
          Returns a native audio format that has the same encoding, endianness and sample size as the given format, and a sample rate that is greater than or equal to the given sample rate.
static int getSamplesPerShift(int sampleRate, float windowShiftInMs)
          Returns the number of samples in a window shift given the sample rate (in Hertz) and the window shift (in milliseconds).
static int getSamplesPerWindow(int sampleRate, float windowSizeInMs)
          Returns the number of samples per window given the sample rate (in Hertz) and window size (in milliseconds).
static double[] littleEndianBytesToValues(byte[] data, int offset, int length, int bytesPerValue, boolean signedData)
          Converts a little-endian byte array into an array of doubles.
static java.lang.String shortArrayToString(short[] data)
          Returns the string representation of the given short array.
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Method Detail

byteToShortArray

public static short[] byteToShortArray(byte[] byteArray,
                                       int offset,
                                       int length)
                                throws java.lang.ArrayIndexOutOfBoundsException
Converts a byte array into a short array. Because a byte is 8 bits and a short is 16 bits, the returned short array is half the length of the byte array. If the length of the byte array is odd, the length of the short array will be (byteArray.length - 1)/2, i.e., the last byte is discarded.

Parameters:
byteArray - a byte array
offset - which byte to start from
length - how many bytes to convert
Returns:
a short array, or null if byteArray is of zero length
Throws:
java.lang.ArrayIndexOutOfBoundsException
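
The pairing rule described above can be illustrated with a minimal sketch. This is not the DataUtil source: the class name ByteToShortSketch and the method toShorts are hypothetical, and big-endian byte order (high byte first) is an assumption, since the description does not state the order.

```java
import java.util.Arrays;

/** Illustrative sketch only; not the Sphinx implementation. */
final class ByteToShortSketch {

    /**
     * Packs consecutive byte pairs into shorts, assuming big-endian order
     * (high byte first). A trailing unpaired byte is dropped.
     */
    static short[] toShorts(byte[] bytes, int offset, int length) {
        if (bytes.length == 0) {
            return null; // mirrors the documented "null for zero length" contract
        }
        short[] out = new short[length / 2]; // integer division drops an odd last byte
        for (int i = 0; i < out.length; i++) {
            int high = bytes[offset + 2 * i];           // sign-extended high byte
            int low = bytes[offset + 2 * i + 1] & 0xFF; // low byte read as unsigned
            out[i] = (short) ((high << 8) | low);
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] pcm = {0x01, 0x02, (byte) 0xFF, (byte) 0xFE, 0x7F};
        // Five bytes yield two shorts; the fifth byte is discarded.
        System.out.println(Arrays.toString(toShorts(pcm, 0, pcm.length))); // prints [258, -2]
    }
}
```

Masking the low byte with 0xFF keeps Java's sign extension from corrupting the high bits of the result.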

bytesToValues

public static final double[] bytesToValues(byte[] byteArray,
                                           int offset,
                                           int length,
                                           int bytesPerValue,
                                           boolean signedData)
                                    throws java.lang.ArrayIndexOutOfBoundsException
Converts a big-endian byte array into an array of doubles. Each group of bytesPerValue consecutive bytes in the byte array is converted into a double, which becomes the next element in the double array. The size of the returned array is (length/bytesPerValue). Currently, only 1-byte (8-bit) and 2-byte (16-bit) samples are supported.

Parameters:
byteArray - a byte array
offset - which byte to start from
length - how many bytes to convert
bytesPerValue - the number of bytes per value
signedData - whether the data is signed
Returns:
a double array, or null if byteArray is of zero length
Throws:
java.lang.ArrayIndexOutOfBoundsException
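
As a rough illustration of the big-endian conversion, the sketch below decodes 1-byte or 2-byte samples into doubles. BigEndianValuesSketch and toValues are hypothetical names, and the handling of signedData (sign taken from the most significant byte) is an assumption rather than something the description confirms.

```java
/** Illustrative sketch only; not the Sphinx implementation. */
final class BigEndianValuesSketch {

    static double[] toValues(byte[] bytes, int offset, int length,
                             int bytesPerValue, boolean signed) {
        if (bytesPerValue != 1 && bytesPerValue != 2) {
            throw new IllegalArgumentException("only 1- or 2-byte samples are sketched here");
        }
        double[] out = new double[length / bytesPerValue];
        int pos = offset;
        for (int i = 0; i < out.length; i++) {
            // The most significant byte comes first; it carries the sign when signed.
            int value = signed ? bytes[pos] : (bytes[pos] & 0xFF);
            pos++;
            for (int b = 1; b < bytesPerValue; b++) {
                value = (value << 8) | (bytes[pos] & 0xFF); // append lower-order bytes
                pos++;
            }
            out[i] = value;
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] raw = {(byte) 0xFF, (byte) 0xFE, 0x00, 0x10};
        double[] signed = toValues(raw, 0, raw.length, 2, true);
        double[] unsigned = toValues(raw, 0, raw.length, 2, false);
        System.out.println(signed[0] + " " + unsigned[0]); // prints -2.0 65534.0
    }
}
```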

littleEndianBytesToValues

public static final double[] littleEndianBytesToValues(byte[] data,
                                                       int offset,
                                                       int length,
                                                       int bytesPerValue,
                                                       boolean signedData)
                                                throws java.lang.ArrayIndexOutOfBoundsException
Converts a little-endian byte array into an array of doubles. Each group of bytesPerValue consecutive bytes is converted into a double, which becomes the next element in the double array. The number of bytes per value is specified as an argument. The size of the returned array is (data.length/bytesPerValue).

Parameters:
data - a byte array
offset - which byte to start from
length - how many bytes to convert
bytesPerValue - the number of bytes per value
signedData - whether the data is signed
Returns:
a double array, or null if data is of zero length
Throws:
java.lang.ArrayIndexOutOfBoundsException
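
The little-endian case differs only in the order in which the bytes of each value are read. The sketch below is an assumption-laden illustration: LittleEndianValuesSketch and toValues are hypothetical names, the output is sized from the length argument, and only 1-byte or 2-byte values are considered. The example passes offset 0 and the full array length so that this sizing agrees with the formula given above.

```java
/** Illustrative sketch only; not the Sphinx implementation. */
final class LittleEndianValuesSketch {

    static double[] toValues(byte[] data, int offset, int length,
                             int bytesPerValue, boolean signed) {
        // Assumes bytesPerValue is 1 or 2, so an int holds every value.
        double[] out = new double[length / bytesPerValue];
        for (int i = 0; i < out.length; i++) {
            int start = offset + i * bytesPerValue;
            int last = start + bytesPerValue - 1;
            // The most significant byte is stored last in little-endian data.
            int value = signed ? data[last] : (data[last] & 0xFF);
            for (int pos = last - 1; pos >= start; pos--) {
                value = (value << 8) | (data[pos] & 0xFF);
            }
            out[i] = value;
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] raw = {(byte) 0xFE, (byte) 0xFF, 0x10, 0x00};
        double[] values = toValues(raw, 0, raw.length, 2, true);
        System.out.println(values[0] + " " + values[1]); // prints -2.0 16.0
    }
}
```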

bytesToShort

public static short bytesToShort(byte[] byteArray,
                                 int offset)
                          throws java.lang.ArrayIndexOutOfBoundsException
Converts the two bytes starting at the given offset to a short.

Parameters:
byteArray - the byte array
offset - where to start
Returns:
a short
Throws:
java.lang.ArrayIndexOutOfBoundsException

shortArrayToString

public static java.lang.String shortArrayToString(short[] data)
Returns the string representation of the given short array. The string will be in the form:
data.length data[0] data[1] ... data[data.length-1]

Parameters:
data - the short array to convert
Returns:
a string representation of the short array
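
The output layout can be sketched as follows. ShortArrayDumpSketch and dump are hypothetical names, and the single-space separator is an assumption read off the format line above.

```java
/** Illustrative sketch only; not the Sphinx implementation. */
final class ShortArrayDumpSketch {

    /** Emits the array length first, then every element, separated by spaces. */
    static String dump(short[] data) {
        StringBuilder sb = new StringBuilder().append(data.length);
        for (short s : data) {
            sb.append(' ').append(s);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(dump(new short[]{3, -1, 7})); // prints 3 3 -1 7
    }
}
```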

doubleArrayToString

public static java.lang.String doubleArrayToString(double[] data)
Returns the given double array as a string. The string will be in the form:
data.length data[0] data[1] ... data[data.length-1]
The doubles can be written in decimal, hexadecimal, or scientific notation. In decimal notation, each value is formatted by the method Util.formatDouble(data[i], 10, 5). Use the System property "frontend.util.dumpformat" to control the dump format (permitted values are "decimal", "hexadecimal", and "scientific").

Parameters:
data - the double array to dump
Returns:
a string representation of the double array

floatArrayToString

public static java.lang.String floatArrayToString(float[] data)
Returns the given float array as a string. The string is of the form:
data.length data[0] data[1] ... data[data.length-1]
The floats can be written in decimal, hexadecimal, or scientific notation. In decimal notation, each value is formatted by the method Util.formatDouble(data[i], 10, 5). Use the System property "frontend.util.dumpformat" to control the dump format (permitted values are "decimal", "hexadecimal", and "scientific").

Parameters:
data - the float array to dump
Returns:
a string of the given float array

formatDouble

public static java.lang.String formatDouble(double number,
                                            int integerDigits,
                                            int fractionDigits)
Returns a formatted string of the given number, with the given number of character positions reserved for the integer and fraction parts. If the integer part has fewer than integerDigits digits, spaces are prepended to it. If the fraction part has fewer than fractionDigits digits, spaces are appended to it. For example, formatDouble(12345.6789, 6, 6) gives the string
" 12345.6789  "
(one space before 1, two spaces after 9).

Parameters:
number - the number to format
integerDigits - the length of the integer part
fractionDigits - the length of the fraction part
Returns:
a formatted number
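
The padding rule can be reproduced with a short sketch. FormatDoubleSketch and format are hypothetical names. The sketch relies on Double.toString for the digits, so it only behaves as described for magnitudes that Double.toString prints in plain decimal form, and it leaves a fraction longer than fractionDigits untouched; both points are assumptions rather than documented behavior.

```java
/** Illustrative sketch only; not the Sphinx implementation. */
final class FormatDoubleSketch {

    static String format(double number, int integerDigits, int fractionDigits) {
        String text = Double.toString(number); // plain decimal for mid-range magnitudes
        int dot = text.indexOf('.');
        String intPart = text.substring(0, dot);
        String fracPart = text.substring(dot + 1);

        StringBuilder sb = new StringBuilder();
        for (int i = intPart.length(); i < integerDigits; i++) {
            sb.append(' '); // left-pad the integer part
        }
        sb.append(intPart).append('.').append(fracPart);
        for (int i = fracPart.length(); i < fractionDigits; i++) {
            sb.append(' '); // right-pad the fraction part
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Brackets make the padding visible.
        System.out.println("[" + format(12345.6789, 6, 6) + "]"); // prints [ 12345.6789  ]
    }
}
```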

getSamplesPerWindow

public static int getSamplesPerWindow(int sampleRate,
                                      float windowSizeInMs)
Returns the number of samples per window given the sample rate (in Hertz) and window size (in milliseconds).

Parameters:
sampleRate - the sample rate in Hertz (i.e., samples per second)
windowSizeInMs - the window size in milliseconds
Returns:
the number of samples per window

getSamplesPerShift

public static int getSamplesPerShift(int sampleRate,
                                     float windowShiftInMs)
Returns the number of samples in a window shift given the sample rate (in Hertz) and the window shift (in milliseconds).

Parameters:
sampleRate - the sample rate in Hertz (i.e., samples per second)
windowShiftInMs - the window shift in milliseconds
Returns:
the number of samples in a window shift
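
Both getSamplesPerWindow and getSamplesPerShift amount to converting a duration in milliseconds into a sample count. A minimal sketch of that arithmetic follows; FrameSizeSketch and samplesFor are hypothetical names, and truncating the result toward zero is an assumption, since the rounding behavior is not stated above.

```java
/** Illustrative sketch only; not the Sphinx implementation. */
final class FrameSizeSketch {

    /** samples = sampleRate (samples per second) * duration (ms) / 1000 (ms per second). */
    static int samplesFor(int sampleRate, float durationInMs) {
        return (int) (sampleRate * durationInMs / 1000f); // truncation is an assumption
    }

    public static void main(String[] args) {
        int window = samplesFor(16000, 25.625f); // 25.625 ms window at 16 kHz
        int shift = samplesFor(16000, 10f);      // 10 ms shift at 16 kHz
        System.out.println(window + " " + shift); // prints 410 160
    }
}
```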

bytesToFile

public static void bytesToFile(byte[] data,
                               java.lang.String filename)
                        throws java.io.IOException
Saves the given bytes to the given binary file.

Parameters:
data - the bytes to save
filename - the binary file name
Throws:
java.io.IOException - if an I/O error occurs
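
One plausible way to write such a helper with the standard library is sketched below. BytesToFileSketch and save are hypothetical names, and whether the real method overwrites an existing file is not stated above; FileOutputStream does overwrite by default.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;

/** Illustrative sketch only; not the Sphinx implementation. */
final class BytesToFileSketch {

    /** Writes all bytes to the named file, replacing any existing content. */
    static void save(byte[] data, String filename) throws IOException {
        // try-with-resources closes the stream even if write() throws.
        try (OutputStream out = new FileOutputStream(filename)) {
            out.write(data);
        }
    }

    public static void main(String[] args) throws IOException {
        java.io.File tmp = java.io.File.createTempFile("datautil-sketch", ".bin");
        tmp.deleteOnExit();
        save(new byte[]{1, 2, 3}, tmp.getPath());
        System.out.println(tmp.length()); // prints 3
    }
}
```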

getNativeAudioFormat

public static javax.sound.sampled.AudioFormat getNativeAudioFormat(javax.sound.sampled.AudioFormat format)
Returns a native audio format that has the same encoding, endianness and sample size as the given format, and a sample rate that is larger than the given sample rate.

Returns:
a suitable native audio format

getNativeAudioFormat

public static javax.sound.sampled.AudioFormat getNativeAudioFormat(javax.sound.sampled.AudioFormat format,
                                                                   javax.sound.sampled.Mixer mixer)
Returns a native audio format that has the same encoding, endianness and sample size as the given format, and a sample rate that is greater than or equal to the given sample rate.

Parameters:
format - the desired format
mixer - if non-null, use this Mixer; otherwise use AudioSystem
Returns:
a suitable native audio format
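
The selection criteria can be illustrated without touching audio hardware by filtering a list of candidate formats. In the sketch below, NativeFormatSketch and pick are hypothetical names, the candidate array stands in for whatever formats a Mixer or AudioSystem would report, and preferring the lowest qualifying sample rate is an assumption, since the description does not say which qualifying format is chosen.

```java
import javax.sound.sampled.AudioFormat;

/** Illustrative sketch only; not the Sphinx implementation. */
final class NativeFormatSketch {

    /**
     * Returns the candidate with the same encoding, endianness and sample size
     * as the desired format and the lowest sample rate that is still greater
     * than or equal to the desired rate, or null if none qualifies.
     */
    static AudioFormat pick(AudioFormat desired, AudioFormat[] candidates) {
        AudioFormat best = null;
        for (AudioFormat c : candidates) {
            boolean sameLayout = c.getEncoding().equals(desired.getEncoding())
                    && c.isBigEndian() == desired.isBigEndian()
                    && c.getSampleSizeInBits() == desired.getSampleSizeInBits();
            if (sameLayout
                    && c.getSampleRate() >= desired.getSampleRate()
                    && (best == null || c.getSampleRate() < best.getSampleRate())) {
                best = c;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        AudioFormat desired = new AudioFormat(16000f, 16, 1, true, false);
        AudioFormat[] candidates = {
            new AudioFormat(8000f, 16, 1, true, false),  // rate too low
            new AudioFormat(48000f, 16, 1, true, false),
            new AudioFormat(44100f, 16, 1, true, false),
            new AudioFormat(22050f, 16, 1, true, true),  // wrong endianness
        };
        System.out.println(pick(desired, candidates).getSampleRate()); // prints 44100.0
    }
}
```

A real device may report an unspecified sample rate for some formats; the sketch does not handle that case.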