TRAN, version 0.4
September 1990
TRAN is a text-to-speech program for the I.B.M.-P.C. It can read ASCII
(askey) text files and translate normal English spelling to phones (the
basic building blocks of speech sounds). TRAN can also sound out each
phone through the internal speaker of the P.C., through an 8-bit DAC on
one of the P.C.'s I/O ports, or into a binary sound file.
usage: tran [ +|- flags] [-options] [file-name]
The file-name is an ASCII text file (with no word processor formatting
codes). If no file-name is given TRAN reads input from the keyboard; in
this case the program can be terminated with CTRL-C.
The TRAN program contains two independent sets of 47 phones. For the
internal speaker, TRAN uses a set of phones encoded as a sequence of
bits which determine the position (in or out) of the P.C. speaker at a
rate of about 16 kilohertz. When outputting to an I/O port, TRAN uses a
set of phones with 8-bit values for generating the speech wave-forms
with a digital-to-analog convertor (DAC) at a rate of 10 kilohertz.
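As a rough illustration of the 1-bit encoding, the sketch below expands
packed bytes into a sequence of speaker positions and computes how long
a phone lasts at each output rate. The bit order (most significant bit
first) and the function names are assumptions made for the example; this
document does not describe how TRAN stores its phones internally.

```python
ONE_BIT_RATE_HZ = 16_000    # "about 16 kilohertz" for the internal speaker
EIGHT_BIT_RATE_HZ = 10_000  # DAC output rate quoted in the text


def unpack_speaker_bits(data: bytes) -> list[int]:
    """Expand packed bytes into 0/1 speaker positions.

    Assumption: bits are stored most significant bit first.
    """
    positions = []
    for byte in data:
        for shift in range(7, -1, -1):
            positions.append((byte >> shift) & 1)
    return positions


def duration_seconds(n_samples: int, rate_hz: float) -> float:
    """Playing time of n_samples output values at the given rate."""
    return n_samples / rate_hz


# Two bytes of made-up 1-bit phone data become 16 speaker positions.
bits = unpack_speaker_bits(bytes([0b10101010, 0b11110000]))
one_bit_time = duration_seconds(len(bits), ONE_BIT_RATE_HZ)
```

At these rates one second of speech takes about 2000 bytes in the 1-bit
form (16000 bits) and 10000 bytes in the 8-bit form.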
COMMAND LINE OPTIONS
Command line flags and options control various features of the program.
A '+' turns the flag on and a '-' turns the flag off. Options can use
either '+' or '-'. The square brackets below indicate default values.
flags:  + = on, - = off
    i   use hardware interrupts to time the output  [off]
    P   type output of phone translation            [off]
    r   type rule number and phone translation      [off]
    s   say output (make sounds)                    [on]
    t   echo input to console                       [off]
    T   do phone translation                        [on]
    v   type other internal information             [off]
options:
    c      say the time of day once
    C      say the time of day every 10 seconds
    D N    make an 8-bit waveform file for phone N
    d1 N   use N for space delay [5]
    d2 N   use N for voice pitch [1]
    f N    output 8-bit wave-form to binary file N
    R      print all pronunciation rules
    p N    output 8-bit wave-forms to DAC at port N
             where N = LPT1, LPT2, or LPT3 for a printer port
             or N = VMK1, VMK2, VMK3, or VMK4 for a COVOX VMK board
             or N = INT to use the internal speaker
             or N = the hexadecimal port number
    z      print duration and loudness of each 8-bit phone
    Z      make 8-bit waveform file for each phone
    ?      type this summary of TRAN usage
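The flag and option conventions above can be sketched as a small
parser. This is only an illustration of the documented syntax, not
TRAN's own code: the function name, the returned data layout, and the
error handling are all assumptions. The default flag values are the
ones listed in square brackets above.

```python
# Default flag values as listed in the summary above.
DEFAULT_FLAGS = {"i": False, "P": False, "r": False, "s": True,
                 "t": False, "T": True, "v": False}
OPTIONS_WITH_VALUE = {"D", "d1", "d2", "f", "p"}
OPTIONS_WITHOUT_VALUE = {"c", "C", "R", "z", "Z", "?"}


def parse_args(argv):
    """Split argv into (flags, options, file_name)."""
    flags = dict(DEFAULT_FLAGS)
    options = {}
    file_name = None
    i = 0
    while i < len(argv):
        arg = argv[i]
        if len(arg) > 1 and arg[0] in ("+", "-"):
            name = arg[1:]
            if name in flags:
                # '+' turns a flag on, '-' turns it off.
                flags[name] = (arg[0] == "+")
            elif name in OPTIONS_WITH_VALUE:
                if i + 1 >= len(argv):
                    raise ValueError(f"option {arg} needs a value")
                options[name] = argv[i + 1]
                i += 1
            elif name in OPTIONS_WITHOUT_VALUE:
                # Options accept either '+' or '-'.
                options[name] = True
            else:
                raise ValueError(f"unknown flag or option: {arg}")
        else:
            file_name = arg
        i += 1
    return flags, options, file_name
```

For example, parse_args(["-s", "+P", "tran.doc"]) turns speech off,
turns phone output on, and reports tran.doc as the input file.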
USING TRAN
The following are examples of ways to use the TRAN program. The
simplest way to use TRAN is just to type "tran" without any command line
arguments. In this case TRAN will interactively read lines you type and
attempt to speak these lines as English text.
You can have TRAN type out and read this file with the command:
tran +t tran.doc
If you also want to see the phone translation add the +P flag:
tran +P +t tran.doc
You can save the phone translation in a file by typing
tran -s +P tran.doc > tran.phn
and listen to the phone translation at some other time by typing
tran -T tran.phn
You can save the sound output in a binary file by typing
tran -f tran.trn tran.doc
The sound file format is a 128 byte header followed by 8-bit (unsigned)
waveform values.
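Because the sound file is just a header followed by unsigned 8-bit
samples, it can be converted to a WAV file on a modern machine. The
sketch below skips the 128 byte header without interpreting it (its
contents are not described here) and assumes the samples are meant to
be played at the 10 kilohertz DAC rate mentioned earlier. The function
name and file names are hypothetical.

```python
import wave

HEADER_SIZE = 128        # header length stated in this document
SAMPLE_RATE_HZ = 10_000  # assumption: the 10 kHz DAC rate applies to the file


def trn_to_wav(trn_path, wav_path, sample_rate=SAMPLE_RATE_HZ):
    """Copy the samples of a TRAN sound file into a mono 8-bit WAV file.

    Returns the number of samples written.
    """
    with open(trn_path, "rb") as f:
        f.seek(HEADER_SIZE)   # skip the header; its layout is unknown
        samples = f.read()
    with wave.open(wav_path, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(1)     # 8-bit WAV data is unsigned, like the source
        w.setframerate(sample_rate)
        w.writeframes(samples)
    return len(samples)
```

A call such as trn_to_wav("tran.trn", "tran.wav") would then produce a
file that ordinary audio players can open. If the speech sounds too
fast or too slow, the sample rate assumption is the first thing to
check.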
The TRAN program will say (and type) the time of day once if you type
tran +t +c
or will continue to announce the time every 10 seconds if you type
tran +t +C
ADJUSTING THE OUTPUT RATE
When TRAN outputs to the internal speaker of the I.B.M.-P.C., two timing
parameters, d1 and d2, control the rate at which TRAN speaks. These are
set automatically, but can be adjusted if necessary. Making d1 larger
lengthens the pauses between words, and making d2 larger lowers the
pitch of the voice during phones. Both d1 and d2 must have a value of 1
or greater. On a 4.77-MHz I.B.M.-P.C./X.T., the best values for the
timing parameters are d1=2 and d2=1. Setting these values explicitly
bypasses the automatic setting, which saves a second or two when
starting the program. These values can be set on the command line
tran -d1 2 -d2 1 ...
or by using the environment variable TRAN to pass these values
set TRAN= -d1 2 -d2 1
Any of the other command line flags and options may also be set using
this environment variable.
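One plausible way to combine the environment variable with the command
line is to split its value on white space and place those arguments in
front of the ones typed by the user, so that the command line is
processed last. That ordering is an assumption made for this sketch;
this document does not say which setting wins when both are given.

```python
import os


def effective_args(argv, environ=os.environ):
    """Return the TRAN environment settings followed by argv.

    Assumption: environment settings are applied first, so a later
    command line argument can override them.
    """
    env_args = environ.get("TRAN", "").split()
    return env_args + list(argv)
```

With TRAN set to " -d1 2 -d2 1", the command "tran +t tran.doc" would
be treated as "tran -d1 2 -d2 1 +t tran.doc".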
When TRAN outputs to an I/O port, the rate may be determined either by a
programmed delay loop (as it is for the internal speaker) or by
borrowing hardware interrupts from the timer normally used by DOS to
keep track of the time of day. The default is to use the programmed
delay loop, and the loop count can be adjusted using -d1 as described
above. The +i flag enables the hardware interrupt timing method.
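The arithmetic behind borrowing the timer can be sketched as follows.
The P.C. timer chip counts a clock of about 1.193182 megahertz, and DOS
normally divides it by 65536 to get its time-of-day tick of roughly
18.2 per second. A program that wants interrupts at the 10 kilohertz
sample rate loads a smaller divisor and passes only every few hundredth
interrupt on to DOS. Whether TRAN does exactly this is an assumption;
the sketch shows only the usual technique, and the function names are
made up.

```python
PIT_CLOCK_HZ = 1_193_182   # input clock of the PC timer chip
DEFAULT_DIVISOR = 65_536   # divisor DOS uses for its ~18.2 Hz tick


def pit_divisor(sample_rate_hz):
    """Nearest integer divisor for the requested interrupt rate."""
    return round(PIT_CLOCK_HZ / sample_rate_hz)


def interrupts_per_dos_tick(divisor):
    """Fast interrupts that elapse during one normal DOS tick."""
    return DEFAULT_DIVISOR / divisor


divisor = pit_divisor(10_000)
actual_rate_hz = PIT_CLOCK_HZ / divisor
chain_every = interrupts_per_dos_tick(divisor)
```

Because the divisor must be a whole number, the actual rate comes out
slightly above 10 kilohertz, and the original DOS handler has to be
called about once every 551 interrupts to keep the clock correct.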
REFERENCES
The text-to-speech rules used in the TRAN program come mostly from an
article in an IEEE journal:
Elovitz, H.S., Johnson, R., McHugh, A., and Shore, J.E. (1976).
"Letter-to-Sound Rules for Automatic Translation of English Text to
Phonetics," IEEE Transactions on Acoustics, Speech, and Signal
Processing, Vol. ASSP-24(6), pp. 446-458.
See the file RULES.TXT for a complete list of the text-to-speech rules
used by the TRAN program.
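Letter-to-sound rules of this kind generally say that a group of
letters, when found between a given left and right context, is replaced
by certain phones. The sketch below shows the idea with a handful of
made-up rules and literal contexts only. The rules, the phone names,
and the matching order are all hypothetical; the real rules are the
ones in RULES.TXT.

```python
# Each rule is (left context, letters to match, right context, phones).
# All entries are invented for illustration and are tried in order.
RULES = [
    ("", "ch", "", ["CH"]),
    ("", "c", "e", ["S"]),   # "c" followed by "e"
    ("", "c", "", ["K"]),    # any other "c"
    ("", "a", "", ["AE"]),
    ("", "e", "", ["EH"]),
    ("", "n", "", ["N"]),
    ("", "t", "", ["T"]),
]


def translate(word):
    """Translate a word to phones using the first matching rule."""
    word = word.lower()
    phones = []
    pos = 0
    while pos < len(word):
        for left, match, right, out in RULES:
            end = pos + len(match)
            if (word.startswith(match, pos)
                    and word[:pos].endswith(left)
                    and word.startswith(right, end)):
                phones.extend(out)
                pos = end
                break
        else:
            pos += 1   # no rule applies: skip this letter
    return phones
```

With these toy rules, "cat" and "cent" begin with different phones
because the second rule looks at the letter after the "c".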
The 1-bit phones were extracted from a public domain program called
SPEECH by Andy McGuire.
The 8-bit phones were recorded from my voice with an 8-bit
analog-to-digital convertor (COVOX Voice Master) and can be output to
any 8-bit (unsigned) digital-to-analog convertor (DAC) attached to any
P.C. I/O port (for example, the Speech Thing, also from COVOX, which
attaches externally to the printer port). For further information
regarding these products, please contact:
COVOX inc
675-D Conger Street
Eugene, OR 97402