*****************************************************************
Voice Recognition for the Amiga and PerfectSound 3
Voice.library (Ver 4.0) by Richard Horne - March 1992
*****************************************************************
FUNCTION OFFSET DEFINITIONS
_LVOLearn EQU -30
_LVORecognize EQU -36
_LVOAddVoiceTask EQU -42
_LVORemVoiceTask EQU -48
_LVOGainUp EQU -54
_LVOGainDown EQU -60
_LVORecDataAddress EQU -66
_LVORecMapAddress EQU -72
_LVOWordScore EQU -78
_LVOPickSampler EQU -84
************************* FUNCTION DEFINITIONS ******************
NAME:
Learn -- Learn a spoken phrase.
OFFSET:
-30
SYNOPSIS:
MapAddress = Learn (MapBuffer, Text, Screen, SequenceNum, X, Y)
d0                  a0         a1    a2      d0           d1 d2
FUNCTION:
The "Learn" function stores a frequency map of a spoken word or
phrase. Each frequency map is made up of 72 long words of data
plus a 16 byte header for the associated ASCII text (304 bytes
total). "Learn" requires the user to reserve a MapBuffer in
memory equal to the size of vocabulary desired (number of words)
times 304 bytes. MapBuffer address is passed to "Learn" in a0.
Address of a null terminated text string representing the word or
phrase to be learned is passed to "Learn" in a1.
The "Learn" function will open its own window on the screen
specified in a2 (use NULL for WBENCHSCREEN), at a position X, Y
specified in d1 and d2. The user will then be prompted to speak
the specified word or phrase to obtain three good digital
samples. Internally, these three samples are analyzed for
frequency content and transformed into a frequency map (304
bytes) which is stored in the MapBuffer according to the Sequence
Number specified in d0. "Learn" returns the memory address
within MapBuffer at which this particular frequency map is
stored. If "Learn" is intentionally cancelled using the close
gadget of the Learn Window, then a zero will be returned.
"Learn" is called separately for each word or phrase in the
vocabulary. After every word has been learned, MapBuffer will be
filled with a sequence of frequency maps (each 304 bytes). Then
the "Recognize" or "AddVoiceTask" functions can be called. These
listen to the PerfectSound digitizer, compute a frequency map of
each incoming word, compare it to the words in MapBuffer, and
indicate by Sequence Number which word or phrase is the best
match. The maximum number of words or phrases in the vocabulary
is 64.
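The sizing arithmetic above can be sketched in portable C. This
is only an illustration: the VoiceMap structure and both helper
functions are hypothetical names, not part of voice.library. It
also assumes that the 16 byte header precedes the 72 long words
and that map N sits at N times 304 bytes from the start of
MapBuffer.

```c
#include <stddef.h>
#include <stdint.h>

#define MAP_HEADER_BYTES 16   /* ASCII text header for the word */
#define MAP_TIME_POINTS  72   /* one 32-bit long word per time point */
#define MAX_VOCABULARY   64   /* documented vocabulary limit */

/* Hypothetical C view of one stored frequency map (assumed layout). */
typedef struct {
    char     text[MAP_HEADER_BYTES];
    uint32_t bands[MAP_TIME_POINTS];
} VoiceMap;

/* Bytes to reserve for `words` entries, or 0 if out of range. */
size_t map_buffer_bytes(unsigned words)
{
    if (words == 0 || words > MAX_VOCABULARY)
        return 0;
    return (size_t)words * sizeof(VoiceMap);
}

/* Assumed byte offset of the map for a Sequence Number (0 = first). */
size_t map_offset(unsigned sequence_num)
{
    return (size_t)sequence_num * sizeof(VoiceMap);
}
```

Under these assumptions a full 64 word vocabulary needs 19456
bytes of MapBuffer.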
Note that you must select an audio sampler (PerfectSound3 or
SoundMaster) using the "PickSampler" function before using the
"Learn" function.
"Learn" utilizes Amiga audio channel 0. So do not call this
function while any other application is using channel 0.
*****************************************************************
NAME:
Recognize -- Recognize a spoken word or phrase.
OFFSET:
-36
SYNOPSIS:
SequenceNum = Recognize (MapBuffer, SizeVocabulary, Resolution)
d0                       a0         d0              d1
FUNCTION:
"Recognize" assumes that the user has learned a sequence of words
or phrases using the "Learn" function. MapBuffer contains a
sequence of frequency maps produced by "Learn" corresponding to
each word or phrase in the vocabulary. MapBuffer address is
passed to "Recognize" in a0. The number of words or phrases in the
vocabulary is passed to "Recognize" in d0.
"Recognize" listens for an incoming word, computes its frequency
map, and compares this map to the sequence of maps contained in
MapBuffer. The Sequence Number of the word or phrase in
MapBuffer which is most similar to that of the incoming word is
returned in d0. Note that the number "0" represents the first
word, "1" the second, and so on.
"Recognize" will operate at either high resolution (d1 = 0) or
low resolution (d1 = 1). High resolution computes a frequency
analysis of the incoming word or phrase at twice the number of
points in time as low resolution. High resolution is somewhat
better at word recognition, but takes almost twice the processing
time.
"Recognize" will return the following error codes if it cannot
find a match.
d0 = -1 if there is no match between the incoming frequency map
and any of the maps in MapBuffer.
d0 = -2 if the incoming word causes unacceptable digital
clipping. Volume should be reduced by moving your
microphone or by using the "GainDown" function.
d0 = -3 if incoming word is too low in volume. Volume should be
increased by moving your microphone or by using the "GainUp"
function.
d0 = -4 if the incoming sample is confused by extraneous noise.
"Recognize" utilizes Amiga audio channel 0. So do not call this
function while any other application is using channel 0.
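One way a caller might act on the value returned in d0 is
sketched below in portable C. The enum names, the VoiceAction
type and action_for_result are all hypothetical. Only the
numeric codes and the suggested remedies come from the
description above.

```c
/* Hypothetical names for the documented return values of Recognize. */
enum {
    VOICE_NO_MATCH  = -1,  /* no stored map resembles the incoming word */
    VOICE_CLIPPING  = -2,  /* unacceptable digital clipping */
    VOICE_TOO_QUIET = -3,  /* incoming word too low in volume */
    VOICE_NOISE     = -4   /* sample confused by extraneous noise */
};

/* Hypothetical set of responses a caller could choose from. */
typedef enum {
    ACT_USE_WORD,      /* result is a valid Sequence Number */
    ACT_LISTEN_AGAIN,  /* call Recognize again */
    ACT_LOWER_GAIN,    /* move the microphone or call GainDown */
    ACT_RAISE_GAIN,    /* move the microphone or call GainUp */
    ACT_UNEXPECTED     /* value not covered by the documentation */
} VoiceAction;

/* Map a Recognize result to a response for the given vocabulary size. */
VoiceAction action_for_result(long result, long vocabulary_size)
{
    if (result >= 0 && result < vocabulary_size)
        return ACT_USE_WORD;
    switch (result) {
    case VOICE_NO_MATCH:
    case VOICE_NOISE:     return ACT_LISTEN_AGAIN;
    case VOICE_CLIPPING:  return ACT_LOWER_GAIN;
    case VOICE_TOO_QUIET: return ACT_RAISE_GAIN;
    default:              return ACT_UNEXPECTED;
    }
}
```

With a SoundMaster sampler the gain functions must not be
called, so the two gain responses would have to mean moving the
microphone.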
*****************************************************************
NAME:
AddVoiceTask -- Initiate a separate task to recognize a spoken
word or phrase.
OFFSET:
-42
SYNOPSIS:
AddVoiceTask (MapBuffer, MsgPort, SizeVocabulary, Resolution)
              a0         a1       d0              d1
FUNCTION:
"AddVoiceTask" is similar in function to "Recognize" except that
here, a separate task is started under the Amiga multitasking
operating system which listens for incoming words or phrases and
returns messages to the user's Message Port indicating the
Sequence Number of the frequency map in MapBuffer which best
matches the frequency map of the incoming word. MapBuffer
address and Message Port address are passed to "AddVoiceTask"
in a0 and a1. The number of words or phrases in the vocabulary is
passed to "AddVoiceTask" in d0.
"AddVoiceTask" will operate at either high resolution (d1 = 0) or
low resolution (d1 = 1). High resolution computes a frequency
analysis of the incoming word or phrase at twice the number of
points in time as low resolution. High resolution is somewhat
better at word recognition, but takes almost twice the processing
time.
The messages sent to MessagePort are designed to mimic shortened
IDCMP messages with an im_Class of $0. Thus you can receive and
process these messages at either an Intuition window IDCMP
message port or at a custom message port of your own.
Messages sent by this task are as follows.
im_Code = Sequence number of frequency map in MapBuffer that
best matches the frequency map of the incoming
word or phrase.
im_Code = -1 if there is no match between the incoming
frequency map and any of the maps in MapBuffer.
im_Code = -2 if the incoming word causes unacceptable
digital clipping. Volume should be reduced by
moving your microphone or by using the "GainDown"
function.
im_Code = -3 if incoming word is too low in volume. Volume
should be increased by moving your microphone or
by using the "GainUp" function.
im_Code = -4 if the incoming sample is confused by
extraneous noise.
Upon calling "AddVoiceTask", the PerfectSound digitizer becomes
immediately active, listening for an incoming word. After
receipt of a word or phrase, a message as described above is sent
to Message Port. The VoiceTask then goes into a WAIT mode and
remains inactive until it receives a reply to the message it has
sent to Message Port. Upon receipt of a reply, VoiceTask again
becomes active and listens for an incoming word.
While VoiceTask is in WAIT, you are free to use any audio channel
in other applications. However, when VoiceTask is active, it
utilizes audio channel 0 and will conflict with any other
application using audio channel 0.
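In the standard Intuition IntuiMessage structure the Code field
is an unsigned 16-bit word, so the negative im_Code values
listed above would read as $FFFF, $FFFE and so on if compared
as unsigned. The portable C sketch below recovers the signed
value. It assumes the task stores the codes in two's complement
form, which this document does not state, and decode_voice_code
is a hypothetical helper name.

```c
#include <stdint.h>

/* Reinterpret a 16-bit unsigned code as a signed value.
 * Assumption: negative codes are stored in two's complement. */
long decode_voice_code(uint16_t im_code)
{
    if (im_code >= 0x8000u)
        return (long)im_code - 0x10000L;  /* e.g. 0xFFFF becomes -1 */
    return (long)im_code;                 /* a Sequence Number */
}
```

The decoded value can then be handled the same way as the
result of "Recognize".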
*****************************************************************
NAME:
RemVoiceTask -- Remove task initiated by AddVoiceTask
OFFSET:
-48
SYNOPSIS:
RemVoiceTask ()
FUNCTION:
Deallocates memory and removes VoiceTask from the Amiga system.
Note that the Message Port specified for the "AddVoiceTask" function
must still exist at the time you call "RemVoiceTask".
*****************************************************************
NAME:
GainUp -- Increase gain of PerfectSound 3 audio digitizer.
OFFSET:
-54
SYNOPSIS:
GainUp()
FUNCTION:
Increases gain of the PerfectSound audio digitizer by one step.
Note that when gain reaches maximum, "GainUp" will wrap around
and return gain to its lowest value. Do not call this function
if you are using the SoundMaster audio digitizer.
*****************************************************************
NAME:
GainDown -- Decrease gain of PerfectSound 3 audio digitizer.
OFFSET:
-60
SYNOPSIS:
GainDown()
FUNCTION:
Decreases gain of the PerfectSound audio digitizer by one step.
Note that when gain reaches minimum, "GainDown" will wrap around
and return gain to its highest value. Do not call this function
if you are using the SoundMaster audio digitizer.
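The wrap-around behaviour of "GainUp" and "GainDown" can be
modelled with modular arithmetic, as in the portable C sketch
below. The number of gain steps is not given in this document,
so it is passed in as a parameter. Both function names are
hypothetical and the model does not touch the hardware.

```c
/* Model of the gain setting after one GainUp call.
 * `steps` is the number of distinct gain settings (not documented
 * here; supplied by the caller, must be greater than zero). */
unsigned gain_after_up(unsigned gain, unsigned steps)
{
    return (gain + 1u) % steps;          /* maximum wraps to lowest */
}

/* Model of the gain setting after one GainDown call. */
unsigned gain_after_down(unsigned gain, unsigned steps)
{
    return (gain + steps - 1u) % steps;  /* minimum wraps to highest */
}
```

A caller that only wants to reduce volume therefore needs to
track how many times "GainDown" has been called, since one call
too many returns the gain to its highest value.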
*****************************************************************
NAME:
RecDataAddress -- Return memory address of digital sample of
incoming word or phrase.
OFFSET:
-66
SYNOPSIS:
RecDataAddress()
FUNCTION:
When an incoming word or phrase is digitized, 3/4 second of
digital data is stored in an internal buffer. This 8-bit
digitized data is sampled at a rate of 6400 Hz. Thus the buffer
for storing this data is 4800 bytes in size. This function
returns the address of this buffer for possible additional
experimental uses.
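The buffer size follows directly from the figures above: 3/4
second at 6400 one-byte samples per second is 4800 bytes. The
portable C sketch below restates that arithmetic and converts a
byte index in the buffer to a time offset. The macro names and
sample_time_ms are hypothetical.

```c
#define SAMPLE_RATE_HZ 6400L                      /* samples per second */
#define BUFFER_BYTES   (SAMPLE_RATE_HZ * 3L / 4L) /* 3/4 second, 1 byte each */

/* Time in whole milliseconds of the sample at byte `index`,
 * or -1 if the index lies outside the buffer. */
long sample_time_ms(long index)
{
    if (index < 0 || index >= BUFFER_BYTES)
        return -1;
    return index * 1000L / SAMPLE_RATE_HZ;
}
```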
*****************************************************************
NAME:
RecMapAddress -- Return memory address of frequency map of
incoming word or phrase.
OFFSET:
-72
SYNOPSIS:
RecMapAddress()
FUNCTION:
A frequency map of each incoming word or phrase is computed for
comparison with maps learned and stored in MapBuffer. Each map
consists of a frequency analysis of 3/4 second of audio data at
72 points in time. For each of these 72 time points, the data is
examined for frequency content at 32 points between 0 Hz and 3200
Hz. A frequency map is made up of 72 32-bit words corresponding
to the 72 time points analyzed. For each of these 32-bit words,
bit 0 is set if the signal contains frequency components from
0-100 Hz. Bit 1 is set if the signal contains frequency
components from 100-200 Hz. Bit 2 is set if the signal contains
frequency components from 200-300 Hz, and so on. This function returns
the address of this frequency map for possible additional
experimental uses. Note that this internal frequency map does
not have the 16 byte ASCII header as do the frequency maps
stored in MapBuffer.
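For experimental use, the bit layout described above can be
decoded as in the portable C sketch below. The helper names are
hypothetical. The sketch only interprets one 32-bit word of a
map and makes no claim about how the library itself compares
maps.

```c
#include <stdint.h>

#define BAND_WIDTH_HZ 100  /* each bit covers a 100 Hz band */
#define BANDS_PER_WORD 32  /* bits 0..31 cover 0 Hz to 3200 Hz */

/* Lower and upper edge in Hz of the band represented by `bit`. */
int band_low_hz(int bit)  { return bit * BAND_WIDTH_HZ; }
int band_high_hz(int bit) { return (bit + 1) * BAND_WIDTH_HZ; }

/* 1 if the band for `bit` (0..31) is present at this time point. */
int band_present(uint32_t word, int bit)
{
    return (int)((word >> bit) & 1u);
}

/* Number of bands present at this time point. */
int count_bands(uint32_t word)
{
    int bit, total = 0;
    for (bit = 0; bit < BANDS_PER_WORD; bit++)
        total += band_present(word, bit);
    return total;
}
```

For example, a word with the value 5 has bits 0 and 2 set, which
indicates components in the 0-100 Hz and 200-300 Hz bands.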
*****************************************************************
NAME:
WordScore -- Return recognition score of a recognized word.
OFFSET:
-78
SYNOPSIS:
WordScore()
FUNCTION:
The "Recognize" function computes a numerical score representing the
"goodness" of a match between the frequency map of an incoming word
and each frequency map stored in MapBuffer. The recognized word
is determined by highest score. This function returns the numerical
score of the recognized word.
*****************************************************************
NAME:
PickSampler -- Specify which model audio sampler to use (either
PerfectSound3 or SoundMaster).
OFFSET:
-84
SYNOPSIS:
PickSampler (SamplerID)
             d0
FUNCTION:
Select the audio sampler to be used with this library. SamplerID = 0
for PerfectSound3. SamplerID = 1 for SoundMaster.