

Volume Number: 1
Issue Number: 12
Column Tag: Assembly Language Lab
The Talking Mac
By Dan Weston, The NerdWorks, Salem, OR
In the May, 1985, issue of the Macintosh Software Supplement Apple released a package of tools and code units collectively called MacinTalk 1.1. With these tools programmers can make their Macintosh programs talk without any additional hardware. In this article I'll explain the general workings of MacinTalk and develop a small application program in assembly language that will show you how to use the main features of MacinTalk in your own programs.
Overview of MacinTalk
The MacinTalk system's most basic component is a driver that contains several procedures available to your programs. The driver is contained in a file called 'MacinTalk', and this file must be on the same volume as any application that wishes to use the MacinTalk driver. The most basic function of the driver is to convert ASCII strings of phonetic codes into speech. You can also use another part of the driver to convert standard English text into phonetic codes which can then be spoken by the driver. Furthermore, there are parts of the driver that you can use to control the rate of speaking and the pitch.
Beyond the actual driver procedures that you will be using in your programs, there are a few tools that are useful while you are preparing a program that will use speech. The program 'Speech Lab' allows you to enter English text in one window and then hear the MacinTalk speech and see the phonetic translation in another window. This program is very useful for learning the tricks of the phonetic code system used by MacinTalk. For example, the English sentence "This is a test." is translated into the phonetic string "DHIHS IHZ AH TEHST.#". This program can be used to pre-translate strings that your program will speak when the strings are known ahead of time. It is more efficient, both in time and memory, to feed phonetic strings directly to the MacinTalk driver rather than relying on translation at run time. Also, if you pre-translate you will be able to fine-tune the phonetics, because the translation is not always perfect.
The translation of English to phonetics is governed by hundreds of phonetic and grammatical rules contained in the MacinTalk driver, but these rules will not get every word right. Another program in the MacinTalk 1.1 package is 'Exception Edit'. This program allows you to create a special file of tricky words and their correct phonetic translations. Exception Edit lets you experiment with the phonetic strings until you get them right, and then save those translations for later use. A file created by Exception Edit can be automatically loaded and utilized by naming it when the MacinTalk driver is opened, as shown in a later section of this article.
Fig. 1 Program Output
The MacinTalk Driver
There are seven procedures in the MacinTalk driver that your program can call. They are listed briefly below.
FUNCTION SpeechOn(ExceptionsFile: Str255; VAR theSpeech: SpeechHandle): SpeechErr;
This function opens up the driver and initializes the values for speed and pitch. theSpeech is passed by address so that the driver can store the handle to its globals there. If you pass a null string for ExceptionsFile, then the translation of English to phonetics will follow the standard rules. If you pass a valid file name for ExceptionsFile, then that file, which must have been created by Exception Edit, will be used to help guide translation. If you pass the string 'noReader' for ExceptionsFile, then the driver will be opened but it will only be able to receive phonetic input and it will not be able to translate English to phonetics.
PROCEDURE SpeechOff(theSpeech: SpeechHandle)
This procedure closes the driver and deallocates any storage that it has been using.
FUNCTION MacinTalk(theSpeech: SpeechHandle; Phonemes:Handle): SpeechErr
This is the workhorse of the driver. This is where phoneme code strings are converted to speech. The handle to the phonemes should refer to a string of ASCII phonemes without a length byte.
FUNCTION Reader(theSpeech: SpeechHandle; EnglishInput: Ptr; InputLength: LongInt; PhoneticOutput: Handle): SpeechErr
This is where English strings are translated into phonetic strings that can then be fed to MacinTalk. The Ptr to EnglishInput should not point to a length byte of a Str255. Point to the first character instead. The Handle for PhoneticOutput can start out as a zero length Handle, and Reader will dynamically grow the Handle to fit the output.
PROCEDURE SpeechRate(theSpeech: SpeechHandle; theRate:INTEGER)
This sets the rate at which words are spoken, in words/min. The rate must be between 85 and 425 words/min.
PROCEDURE SpeechPitch(theSpeech: SpeechHandle; thePitch: INTEGER; theMode: FOMode)
This sets the baseline pitch, in Hz, and sets the pitch mode, either natural or robotic.
PROCEDURE SpeechSex(theSpeech: SpeechHandle; theSex:Sex)
This is not implemented in MacinTalk 1.1.
The glue which calls the various procedures in the driver is contained in the file SpeechASM.Rel, also available in the Software Supplement. Make sure that you include SpeechASM.Rel in the link file for your application so that the driver routines will be available to your code. Also, you must XREF the individual routines that you wish to use. See the listings of CheapTalk.ASM and CheapTalk.LINK for examples.
CheapTalk: a simple speech application example
The software supplement contains the source code for a very short example program that shows how to use the speech driver. As usual, it is in Pascal, so we assembly language programmers have to muddle along and figure things out ourselves. In order to learn the system myself, and to provide a clear example of the main features of MacinTalk, I have written CheapTalk, a dialog based application that speaks pre-translated text stored in a resource file and also translates and speaks user input at run time. CheapTalk opens a dialog and speaks the static message one time. Then it waits for the user to type English text into an edit text box in the dialog. Hitting return or pressing a 'Say it' button will translate the English text into phonemes and then say it.
This application will show you how to open and close the driver, and how to use MacinTalk and Reader from assembly language. It does not use the procedures to control the speed or pitch, but I imagine that you can figure that out for yourselves.
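As a starting point, here is a minimal sketch of a SpeechRate call, written by analogy with the calls that CheapTalk does make. It assumes that SpeechASM.Rel exports glue under the name SpeechRate, that this glue takes its parameters on the stack in Pascal order like the other routines, and that the driver has already been opened, so that the CheapTalk globals theSpeech and speechOK are valid. The rate of 150 words/min is an arbitrary value inside the documented range of 85 to 425. SpeechPitch is not shown because this article does not give the encoding of its FOMode parameter.

```asm
        XREF    SpeechRate              ; assumed glue name in SpeechASM.Rel

;PROCEDURE SpeechRate(theSpeech: SpeechHandle; theRate: INTEGER)
        TST.W   speechOK(A5)            ; is the driver open?
        BEQ     @8                      ; no, skip the call
        MOVE.L  theSpeech(A5),-(SP)     ; speech globals handle
        MOVE.W  #150,-(SP)              ; rate in words/min (85 to 425)
        JSR     SpeechRate              ; a procedure, so no result to pop
@8                                      ; any unused local label will do
```

Because SpeechRate is declared as a procedure, no space is reserved on the stack for a result, unlike the SpeechOn, MacinTalk, and Reader calls.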
In my discussion of the code, listed in listing 1 as CheapTalk.ASM, I will concentrate on the parts pertinent to MacinTalk, and leave many of the details of the shell to speak for themselves.
Making the connection to SpeechASM.Rel
Toward the beginning of CheapTalk.ASM, notice the XREF statements necessary for the linker to establish the connection between our routine calls and the SpeechASM.Rel code that we link with our code.
        XREF    SpeechOn                ; open driver
        XREF    MacinTalk               ; speak phonetic string
        XREF    Reader                  ; translate English to phonetics
        XREF    SpeechOff               ; close driver
The linker control file is listed in listing 2 as CheapTalk.LINK. SpeechASM.Rel is a code file which contains the glue routines necessary to call the individual procedures contained in the driver. SpeechASM.Rel does not contain the actual speech routines, just short procedures to call the appropriate section of the MacinTalk driver. All the routines of the speech driver expect their parameters on the stack.
Setting up the global variables for speech
Next, notice the global variable, 'theSpeech', defined as a long word to hold the handle to the speech globals that will be allocated when the driver is opened. We only have to define a variable to hold the handle; the opening routine will allocate the necessary storage for the speech globals. Other globals that we need to define include a word-length flag that we use to show whether the driver was successfully opened, a 256-byte block to hold an English string, and a handle which will be used for phonetic output from Reader.
theSpeech       DS.L    1               ; handle to speech driver globals
speechOK        DS.W    1               ; our flag to show if driver opened
theString       DS.B    256             ; keep our English string here
phHandle        DS.L    1               ; handle to phonetic string
If you look at CheapTalk.ASM you will see that there are several other global variables defined to use as VAR parameters associated with maintaining the dialog box.
Opening the driver
When we call SpeechOn to open the driver, we specify the null string (a string with length 0, which we define in the static variable area at the end of the code) for the ExceptionsFile so that the Reader will translate English to phonetics using the default rules. If we had a specific exceptions file that we had created with Exception Edit, then we could pass in that file name so that that exception file would be used. We also pass the address of our global variable, theSpeech, so that it can be updated to hold the handle to the speech globals which will be allocated by the open routine.
; assume that driver will open alright, set our flag to TRUE
        MOVE.W  #1,speechOK(A5)         ; set flag to TRUE
; now open driver to use default rules for translation
;FUNCTION SpeechOn(ExceptionsFile:Str255;
;               VAR theSpeech:SpeechHandle): SpeechErr
        CLR.W   -(SP)                   ; space for result
        PEA     NULL                    ; defined at end of code
        PEA     theSpeech(A5)           ; address of handle for speech globals
        JSR     SpeechOn                ; jump to open routine
        MOVE.W  (SP)+,D0                ; get result code
        BEQ     @1                      ; branch if open OK
; if driver open not successful then clear speechOK flag
; to prevent further use of invalid driver
        MOVE.W  #0,speechOK(A5)         ; set flag to FALSE
; you could also display an error dialog here
@1                                      ; branch to this point if open is successful
You can see how the result code is checked after SpeechOn to see if the driver was opened successfully. In the event of a non-zero result, implying a problem with the opening, we set the speechOK flag to 0 and continue on with the program. All other parts of the program which use the speech driver first check the speechOK flag to make sure that there is a valid driver to work with.
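A program that only speaks pre-translated phonemes has no use for Reader. As a sketch of the 'noReader' option described in the procedure list, the open call might look like the following. The label noReadStr is a hypothetical name introduced for this example, and the sketch assumes that the driver expects the string spelled exactly as it appears in this article; everything else follows the SpeechOn call used in CheapTalk.

```asm
; open the driver for phonetic input only (no English translation)
        MOVE.W  #1,speechOK(A5)         ; assume success
        CLR.W   -(SP)                   ; space for result
        PEA     noReadStr               ; Pascal string 'noReader'
        PEA     theSpeech(A5)           ; address of handle variable
        JSR     SpeechOn                ; jump to open routine
        MOVE.W  (SP)+,D0                ; get result code
        BEQ     @1                      ; branch if open OK
        CLR.W   speechOK(A5)            ; open failed, set flag to FALSE
@1

; in the static data area at the end of the code (hypothetical label)
; 9 bytes long, so keep it last or pad to an even address
noReadStr       DC.B    8,'noReader'    ; length byte, then 8 characters
```

With the driver opened this way, calls to Reader should be avoided and only MacinTalk used for speaking.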
Speaking pre-translated speech
The static message in our dialog box is "This is a talking dialog demonstration." There is a phonetic translation of that string kept in the resource file as a resource of type PHNM. The translation was done using Speech Lab, and the resulting phonetic string put into the RMaker source file, listed in listing 3 as CheapTalk.R. I created the PHNM resource type for RMaker so that the phonetic string would not have a length byte. As a general strategy you can translate the static message of any dialog into a PHNM resource with the same resource ID number as the dialog. That way, it is easy to display the dialog and speak the message together.
When the PHNM resource is loaded into memory by GetResource, you get a handle to the phoneme string that you can pass to MacinTalk to recite. Remember, no length byte on phonetic strings! Generally, you should pre-translate any strings that you know at assembly time so as not to waste time and memory translating at run time, and also to ensure higher quality speech by testing and refining the phonetic strings. Look at the following code to see how the PHNM resource is retrieved and then fed to MacinTalk.
; first check our flag to make sure that driver is open
        TST.W   speechOK(A5)
        BEQ     @2                      ; driver not valid
                                        ; branch around speech stuff
; driver valid, go ahead and speak
;FUNCTION GetResource(theType:ResType; ID:INTEGER): Handle
        CLR.L   -(SP)                   ; space for result
        MOVE.L  #'PHNM',-(SP)           ; resource type PHNM
        MOVE.W  #theDialog,-(SP)        ; use same ID# as dialog
        _GetResource
        MOVE.L  (SP)+,A0                ; handle to phoneme string
;FUNCTION MacinTalk(theSpeech:SpeechHandle;
;               Phonemes:Handle): SpeechErr
        CLR.W   -(SP)                   ; space for result
        MOVE.L  theSpeech(A5),-(SP)     ; speech global handle
        MOVE.L  A0,-(SP)                ; handle to phonemes
        JSR     MacinTalk               ; say it
        MOVE.W  (SP)+,D0                ; get result code
@2                                      ; branch to here to avoid speaking with invalid driver
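Since the same steps apply to any dialog whose message has a matching PHNM resource, they can be wrapped in a subroutine. The following is only a sketch and is not part of CheapTalk: the name SpeakPHNM and the use of D0 to pass the resource ID are illustrative assumptions. It also tests for a NIL handle, which GetResource returns when the resource cannot be found.

```asm
; SpeakPHNM: speak the PHNM resource with a given ID (illustrative)
; entry: D0.W = resource ID, normally the same as the dialog's ID
SpeakPHNM
        TST.W   speechOK(A5)            ; is the driver open?
        BEQ     @1                      ; no, do nothing
        CLR.L   -(SP)                   ; space for result
        MOVE.L  #'PHNM',-(SP)           ; resource type PHNM
        MOVE.W  D0,-(SP)                ; resource ID from caller
        _GetResource
        MOVE.L  (SP)+,D1                ; handle to phonemes, or NIL
        BEQ     @1                      ; NIL means no such resource
        CLR.W   -(SP)                   ; space for result
        MOVE.L  theSpeech(A5),-(SP)     ; speech globals handle
        MOVE.L  D1,-(SP)                ; handle to phonemes
        JSR     MacinTalk               ; say it
        MOVE.W  (SP)+,D0                ; result code returned in D0
@1      RTS
```

A caller would load the dialog's resource ID into D0, for example with MOVE.W #theDialog,D0, and then BSR SpeakPHNM right after drawing the dialog.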
Translating English to Phonetics and then Speaking
After saying the static dialog message upon opening, the program waits for the user to enter English text in the edit text window of the dialog. The program watches the results of ModalDialog until the 'Say it' button is pushed, at which point it uses GetDItem and GetIText to get the current English text of the edit text item. That text, which is a Str255, is fed into Reader to translate it into a phonetic string. Please notice that when we pass the English text into Reader we skip over the length byte at the head of the Str255. We do, however, use the length byte, after coercing it to a long word, as the length input to Reader. The Handle which we use to hold the phonetic output of Reader is initially associated with a zero length block, but Reader grows the block automatically to fit the output. Look at this code fragment which feeds the English string to Reader. (Assume that the string has already been placed in the variable 'theString' by calls to GetDItem and GetIText.)
; set up an empty handle first for Reader to fill with phonemes
;FUNCTION NewHandle(logicalSize: Size): Handle
; logicalSize => D0, Handle => A0
        MOVEQ   #0,D0                   ; set up empty handle
        _NewHandle
        MOVE.L  A0,phHandle(A5)         ; save Handle for later
;FUNCTION Reader(theSpeech:SpeechHandle;
;               EnglishInput:Ptr;
;               InputLength:LongInt;
;               PhoneticOutput:Handle): SpeechErr
        CLR.W   -(SP)                   ; space for result
        MOVE.L  theSpeech(A5),-(SP)     ; speech globals
        PEA     theString+1(A5)         ; Ptr to string, skip length
        CLR.L   D0                      ; clear out D0
        MOVE.B  theString(A5),D0        ; put length byte in D0
        MOVE.L  D0,-(SP)                ; use LongInt for length
        MOVE.L  phHandle(A5),-(SP)      ; we just allocated this handle
        JSR     Reader                  ; do translation
        MOVE.W  (SP)+,D0                ; get result
Once we have used Reader to translate the English text into a phonetic string, we pass the handle to the phonemes to MacinTalk, much as we did earlier, to hear it spoken. Here is the code which speaks the translation and then deallocates the handle which held the phonetic string. It is important to deallocate this handle after the phonemes are spoken to avoid cluttering up memory with old sayings.
;FUNCTION MacinTalk(theSpeech: SpeechHandle;
;               Phonemes: Handle): SpeechErr
        CLR.W   -(SP)                   ; space for result
        MOVE.L  theSpeech(A5),-(SP)     ; speech globals
        MOVE.L  phHandle(A5),-(SP)      ; handle to phonemes
        JSR     MacinTalk               ; say it
        MOVE.W  (SP)+,D0                ; get result
; deallocate handle
;PROCEDURE DisposHandle(h: Handle)
; h => A0
        MOVE.L  phHandle(A5),A0         ; where phonemes are
        _DisposHandle
This process can be generalized to other situations where you want to translate arbitrary English text into speech. Just get a pointer to the first character of the text, get the length of the text, allocate an empty handle, and feed it all to Reader. The phonetic output of Reader can then be handed to MacinTalk to recite.
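The generalized process can be sketched as a single subroutine. This is an illustration built from the CheapTalk fragments, not code from the Software Supplement. The name SayText and its register interface are assumptions made for the example. It further assumes that Reader, like SpeechOn, returns zero on success, and that the glue routines preserve D3 and A2 as Pascal procedures normally do.

```asm
; SayText: translate and speak arbitrary English text (illustrative)
; entry: A0 = pointer to first character (not to a length byte)
;        D0.L = number of characters
SayText
        TST.W   speechOK(A5)            ; is the driver open?
        BEQ     @3                      ; no, do nothing
        MOVEM.L D3/A2,-(SP)             ; save the registers we use
        MOVE.L  A0,A2                   ; keep text pointer
        MOVE.L  D0,D3                   ; keep text length
        MOVEQ   #0,D0                   ; ask for an empty handle
        _NewHandle                      ; Handle => A0, result => D0
        TST.W   D0                      ; allocation OK?
        BNE     @2                      ; no, give up
        MOVE.L  A0,phHandle(A5)         ; save handle for later
        CLR.W   -(SP)                   ; space for result
        MOVE.L  theSpeech(A5),-(SP)     ; speech globals
        MOVE.L  A2,-(SP)                ; Ptr to English text
        MOVE.L  D3,-(SP)                ; length as LongInt
        MOVE.L  phHandle(A5),-(SP)      ; handle for phonetic output
        JSR     Reader                  ; do translation
        MOVE.W  (SP)+,D0                ; get result
        BNE     @1                      ; non-zero result, do not speak
        CLR.W   -(SP)                   ; space for result
        MOVE.L  theSpeech(A5),-(SP)     ; speech globals
        MOVE.L  phHandle(A5),-(SP)      ; handle to phonemes
        JSR     MacinTalk               ; say it
        MOVE.W  (SP)+,D0                ; get result
@1      MOVE.L  phHandle(A5),A0         ; where phonemes are
        _DisposHandle                   ; free them in every case
@2      MOVEM.L (SP)+,D3/A2             ; restore registers
@3      RTS
```

To speak the Str255 held in theString, a caller would load A0 with the address theString+1(A5), load D0 with the zero-extended length byte, and BSR SayText. The text pointer and length are copied to A2 and D3 first because _NewHandle uses A0 and D0 for its own parameter and results.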
Closing the driver
We merely make a call to SpeechOff with theSpeech as input to close up the driver and deallocate the memory used by it. Generally, MacinTalk will use at least 20k of memory, plus dynamic buffers equal to about 800 bytes/second of uninterrupted speech (usually less than 10 seconds). In addition, Reader utilizes 10k plus a buffer to hold the translated text.
;PROCEDURE SpeechOff(theSpeech: SpeechHandle)
        MOVE.L  theSpeech(A5),-(SP)     ; handle to speech globals
        JSR     SpeechOff               ; close it up
Putting it all together
Listings 1, 2, and 3 show the assembler source file, the linker control file, and the RMaker source file. You should assemble CheapTalk.ASM, then link it with CheapTalk.LINK. One thing to notice about the output file from the linker is that it is not a functional application until it is combined with the necessary resources by RMaker. Since Link output files are normally application type files, CheapTalk.LINK assigns a file type of 'CODE' so that the resulting output file will not have the characteristic diamond-shaped icon. The final step of the program development is to run CheapTalk.R through RMaker to create the DLOG, DITL, and PHNM resources and combine them in one application file with the output file from the linker. The output of RMaker, Cheap Talk, will be an independent application program which can be moved to any disk and run as long as the driver file, MacinTalk, is also on that disk.
Summary
This discussion has been rather superficial. You are encouraged to study the source code and steal whatever parts of it you find useful for your own applications. All parts of the MacinTalk system are available in the Software Supplement or in the DL8 area of the Mac Developers interest group (PCS-7) on CompuServe, including the MacinTalk 1.1 documentation that Apple provides. This documentation is a good place to learn more about the phonetic symbols that MacinTalk uses and some of the finer points of the available routines. You should also be aware that there is a licensing fee if you distribute programs that use MacinTalk 1.1, so contact Apple before you start shipping disks with MacinTalk on them.
Fig. 2 Program files
Listing 1: CheapTalk.ASM

; CheapTalk.ASM
; A short program to demonstrate how to
; use MacinTalk 1.1 from assembly language
; This program displays a dialog and speaks
; the written message in the dialog
; It also will speak English strings written
; into an edit text box in the dialog
; copyright August 1985
; Dan Weston

; This program uses subroutines from the file SpeechASM.rel
; You must include that file in your link file list
; and XREF the particular routines here
; You must also have the file 'MacinTalk' on the same volume
; as this application program

        XREF    SpeechOn                ; open driver
        XREF    MacinTalk               ; say something
        XREF    Reader                  ; translate English to phonemes
        XREF    SpeechOff               ; close the driver

theDialog       EQU     1               ; resource ID # of dialog
sayitbutton     EQU     1               ; item # for 'say it'
quitbutton      EQU     2               ; item # for 'quit'
usertext        EQU     3               ; item # for edit text box

        INCLUDE Mactraps.D

; --------------- Global Variables -------------------

theSpeech       DS.L    1               ; handle to speech driver globals
speechOK        DS.W    1               ; our flag to show if driver open
theString       DS.B    256             ; VAR for GetIText
phHandle        DS.L    1               ; handle to phonetic string
ItemHit         DS.W    1               ; VAR for modal dialog
theType         DS.W    1               ; VAR for GetDItem
theItem         DS.L    1               ; VAR for GetDItem
theRect         DS.W    4               ; VAR for GetDItem

; --------------- Initialization ----------------------

        BSR     InitManagers            ; at end of source file

; -------------- Open the Speech Driver ----------------
; Open speech driver to use default rules
; assume that driver will open alright, set our flag to TRUE

        MOVE.W  #1,speechOK(A5)         ; set flag to TRUE
        CLR.W   -(SP)                   ; result
        PEA     NULL                    ; defined at end of source code
        PEA     theSpeech(A5)           ; VAR theSpeech
        JSR     SpeechOn                ; jump to open routine
        MOVE.W  (SP)+,D0                ; check result
        BEQ     @1                      ; branch if ok

; If driver open not successful then clear speechOK flag
; to prevent further use of invalid driver

        MOVE.W  #0,speechOK(A5)

; You could also put an error dialog here

@1                                      ; branch to this point if open is successful

;--------------- Get the Dialog from the Resource file --

        CLR.L   -(SP)                   ;Clear Space For DialogPtr
        MOVE    #theDialog,-(SP)        ;Resource #
        CLR.L   -(SP)                   ;Storage Area on heap
        MOVE.L  #-1,-(SP)               ;Above All Others
        _GetNewDialog                   ;Get New Dialog
        MOVE.L  (SP)+,D6                ;Move DialogPtr To D6

;PROCEDURE SetPort (gp: GrafPort)
        MOVE.L  D6,-(SP)                ;Move Dialog Pointer To Stack
        _SetPort                        ;Make It The Current Port

; usually you would not use DrawDialog, but we need to draw
; the dialog contents once before saying them, then go to
; Modal dialog which will draw the contents again

;PROCEDURE DrawDialog(dp:DialogPtr)
        MOVE.L  D6,-(SP)
        _DrawDialog

;------------------- Speak pre-translated speech -------
; now Say the static text item which has been pre-translated
; into a phoneme string with the same ID as the dialog

; first, check our flag to make sure that driver is open
        TST.W   speechOK(A5)
        BEQ     @2                      ; driver not valid
                                        ; branch around speech stuff

; driver valid, go ahead and speak
        CLR.L   -(SP)                   ; space for result
        MOVE.L  #'PHNM',-(SP)           ; resource type PHNM
        MOVE.W  #theDialog,-(SP)        ; use same ID as dialog
        _GetResource
        MOVE.L  (SP)+,A0                ; handle to phoneme string

        CLR.W   -(SP)                   ; space for result code
        MOVE.L  theSpeech(A5),-(SP)     ; speech global handle
        MOVE.L  A0,-(SP)                ; phonemes, from above
        JSR     MacinTalk               ; say it
        MOVE.W  (SP)+,D0                ; get result code

@2                                      ; branch to here to avoid speaking with invalid driver

;------------------- Dialog loop ------------------
; now process the dialog

dialogloop
;PROCEDURE ModalDialog (filterProc: ProcPtr;
;               VAR itemHit: INTEGER)
        CLR.L   -(SP)                   ;default filter proc
        PEA     ItemHit(A5)             ;Item Hit Data
        _ModalDialog

; see which button was pushed
        CMP.W   #quitbutton,ItemHit(A5) ; quit button?
        BEQ     closeit
        CMP.W   #sayitbutton,ItemHit(A5) ; say it?
        BEQ     sayit

; none of the above
        BRA     dialogloop              ; go around again

;----------------- Translate English to Phonetics and speak ------

sayit
; first, check our flag to make sure that driver is open
        TST.W   speechOK(A5)
        BEQ     @3                      ; driver not valid
                                        ; branch around speech stuff

; driver valid, go ahead and speak
; get the current text in the edit text box
        MOVE.L  D6,-(SP)                ; we saved DialogPtr here
        MOVE.W  #usertext,-(SP)         ; the edit text item
        PEA     theType(A5)             ; VAR type
        PEA     theItem(A5)             ; VAR item
        PEA     theRect(A5)             ; VAR box
        _GetDItem

;PROCEDURE GetIText(item:Handle;VAR text: Str255)
        MOVE.L  theItem(A5),-(SP)       ; result of GetDItem
        PEA     theString(A5)           ; VAR text
        _GetIText

; now feed the text into reader to translate it into phonemes
; set up an empty handle first for Reader to fill with phonemes
;FUNCTION NewHandle(logicalSize: Size): Handle
; logicalSize => D0, Handle => A0
        MOVEQ   #0,D0                   ; set up empty handle
        _NewHandle
        MOVE.L  A0,phHandle(A5)         ; save Handle for later

        CLR.W   -(SP)                   ; space for result
        MOVE.L  theSpeech(A5),-(SP)     ; speech globals
        PEA     theString+1(A5)         ;Ptr to string, skip length byte
        CLR.L   D0                      ; clear out D0
        MOVE.B  theString(A5),D0        ; put length byte in D0
        MOVE.L  D0,-(SP)                ; use LongInt for length
        MOVE.L  phHandle(A5),-(SP)      ; we just allocated this
        JSR     Reader                  ; do translation
        MOVE.W  (SP)+,D0                ; get result

; now feed the phonemes to MacinTalk
;FUNCTION MacinTalk(theSpeech:SpeechHandle;Phonemes:Handle)
;               :SpeechErr
        CLR.W   -(SP)                   ; space for result code
        MOVE.L  theSpeech(A5),-(SP)     ; speech globals handle
        MOVE.L  phHandle(A5),-(SP)      ; handle to phonemes
        JSR     MacinTalk               ; say it
        MOVE.W  (SP)+,D0                ; get result code

; deallocate handle and loop back for more
; PROCEDURE DisposHandle (h: Handle)
; h => A0
        MOVE.L  phHandle(A5),A0         ; this is where the phonemes are
        _DisposHandle

@3                                      ; branch to here to avoid speaking with invalid driver
        BRA     dialogloop

;------------------ Close up shop -----------------------

closeit
;PROCEDURE CloseDialog (theDialog: DialogPtr);
        MOVE.L  D6,-(SP)                ;Get Dialog Pointer To Close
        _CloseDialog                    ;Close Window

; first, check our flag to make sure that driver is open
        TST.W   speechOK(A5)
        BEQ     @4                      ; driver not valid
                                        ; branch around speech stuff

; driver valid, go ahead and close it
; PROCEDURE SpeechOff(theSpeech: SpeechHandle)
        MOVE.L  theSpeech(A5),-(SP)     ; handle to speech globals
        JSR     SpeechOff               ; close it up

@4                                      ; branch to here to avoid closing invalid driver
        _ExitToShell                    ;Return To Finder

;--------------- Initialize Managers Subroutine ----------

InitManagers
;PROCEDURE InitGraf (globalPtr: QDPtr);
        PEA     -4(A5)                  ;Space Created For Quickdraw's Use
        _InitGraf                       ;Init Quickdraw
        _InitFonts                      ;Init Font Manager
        _InitWindows                    ;Init Window Manager
;PROCEDURE InitDialogs (restartProc: ProcPtr);
        CLR.L   -(SP)                   ; NIL restart proc
        _InitDialogs                    ;Init Dialog Manager
;PROCEDURE TEInit
        _TEInit
        _InitCursor                     ; set arrow cursor
        RTS                             ; end of InitManagers

;--------------------- Static Data -----------------------------

NULL    DC.W    0                       ; null string

Listing 2: CheapTalk.LINK

/OUTPUT CheapTalkCode

; Since this code file will not run successfully until it has been
; joined with the resources by RMaker, set its file type so
; that it cannot be mistakenly run from the desktop.
; Link output files are usually of type APPL

/TYPE 'CODE' 'LINK'

; link our code, CheapTalk, with the glue for the speech driver
; routines

CheapTalk
SpeechASM
$

Listing 3: CheapTalk.R

* CheapTalk.R
* create the application Cheap Talk
* First define all the resources, and then include the code

* output file name, File type, file creator
MDS2:Cheap Talk
APPLCHTK

* dialog resource is a vanilla dialog
* make it pre-loaded (4) to speed things up
Type DLOG
  ,1 (4)
60 100 260 400
Visible NoGoAway
1
0
1

* DITL resource for dialog has one static text item,
* one edit text item,
* and two buttons: 'Say it' and 'Quit'
* The 'Say it' button is item #1 so that hitting return is
* the same as clicking 'Say it'
* make it pre-loaded (4) to speed things up
Type DITL
demo,1 (4)
4

Button
170 200 190 250
Say it

Button
170 50 190 100
Quit

EditText
40 30 150 270
Enter English text here

StaticText Disabled
10 30 30 290
This is a talking dialog demonstration

* PHNM resource is defined by us to be a string without length
* byte. It is a phonetic translation of the static text in the DITL
* of the same resource #
* make it pre-loaded (4) to speed things up
Type PHNM = GNRL
demo,1 (4)
.S
DHIH9S, IHZ AH TAO4KIHNX DAY6AELAA1G DIH1MUNSTREY5SHUN #

* now include the code produced by the linker
INCLUDE MDS2:CheapTalkCode
