Text File | 1993-10-23 | 111.5 KB | 2,575 lines |
- Atari ST Machine Specific Programming In Assembly
-
- Chapter 5: Performance Testing
-
-
- The Never Finished Theory
-
- In a recent magazine article, the author stated that no
- program is ever finished. I have seen that viewpoint
- expressed many times. I sympathize with the sentiment
- behind the statement, I suppose, but when I am working on a
- program, I always reach a point at which I can conclude that
- its performance is satisfactory. At that stage I say that
- the program is finished.
- The reason that some programmers, and perhaps some
- users also, come to accept the never finished concept as
- gospel is that they have seen too many programs, either
- purchased or written, that never seem to perform completely
- satisfactorily, and, therefore, seem to continuously require
- fine tuning or corrections. But this program attribute is
- not an inherent consequence of program development. The
- problem with such programs is that their performance was
- judged to be satisfactory prematurely. Too often, the
- performance of a program is judged to be satisfactory by its
- author if the program seems to accomplish its primary
- function after a few cursory tests. Program testing, like
- program documentation, seems to be a distasteful chore to
- many programmers. That's probably why so many programs are
- thrust into the software market prematurely.
- The attitude that I have developed is one which views
- algorithmic design, documentation and testing as steps in a
- single process, each of which demands the same level of
- concentration, concern and quality control. If you can
- adopt a similar attitude, I guarantee that you will be a
- happier, more successful programmer than one who finds any
- phase of program development boring or distasteful.
- Documentation is your front line defense against
- programming catastrophes. To be able to fix a program, at
- repair time, you must be able to understand it as well as
- you did when you wrote it. The same level of understanding
- is required when you decide to intentionally enhance a
- program. If program documentation includes the results of
- performance testing, then a program's prior performance can
- be used to gauge performance after alterations.
- When a program seems to be malfunctioning, the first
- action you take should be to compare its current performance
- to past performance under known conditions. Many times,
- such comparisons will reveal that the execution environment,
- not the program, is at fault. Of course, it is then that
- you may decide that a new version of the program is required
- to cope with an altered execution environment.
-
- The Three For One Theory
-
- When I was working on a large mainframe, the
- manufacturer to remain nameless, for a company that shall
- also remain nameless, we programmers developed a formula for
- bug introduction into the mainframe's operating system. For
- every bug fixed, three more were introduced. This is, of
- course, one of Murphy's laws. Sometimes the new bugs were
- called enhancements to obscure the fact that they were
- screwups.
- But that's what all bugs are. They are errors that you
- make when you write your programs. This is the first truth
- that you should hold to be self-evident, if you want to
- develop programs that eventually perform satisfactorily. Once
- you realize that errors in your programs will be there
- because of your own carelessness, or in spite of your best
- efforts, you can take steps to prevent them from being
- catastrophic.
-
- Realistic Expectations
-
- When I judge the performance of an item of software or
- hardware that I have purchased, I compare its performance to
- the levels at which I have been led to believe it should be
- according to the product's designer, manufacturer and
- seller. To that extent, if the product fails to meet my
- expectations, then I have been cheated. If I am fooled
- twice by the same designer, manufacturer or seller, then I
- have cheated myself.
- When I judge the performance of an item of software or
- hardware that I have designed and constructed, I restrict
- my expectations to levels that are commensurate with my
- knowledge, experience and available tools. If I have done
- my best, then I can do no more, unless I decide to redesign
- and reconstruct after obtaining more knowledge, more
- experience or better tools.
- The performance of an item of software is inherently
- restricted by the design of the computer system--that should
- be obvious. Performance is also influenced by the extent of
- the programmer's knowledge about the system and developed
- programming ability. A programmer's ability is developed
- via education and experience. To the extent that this
- ability restricts the performance of the final product, it is a
- constituent of the overall programming environment.
- If a program's performance depends on your programming
- ability, how can you determine when its performance is
- satisfactory? Well, when a program executes a task
- according to your specifications, then its performance must
- be judged satisfactory. How stringent should your
- specifications be? Your performance demands must be
- commensurate with your programming ability. When you have
- exerted your best effort, you must be satisfied with the
- final product; or you must obtain another system; or you
- must accumulate more knowledge and experience, so that you
- can demand more stringent specifications.
- Accumulating knowledge about your computer system's
- capabilities can be a horrendous task: cataloging your
- system's real, versus advertised or reported, capabilities
- requires extensive performance testing of the system,
- because you can't trust someone else's assessments. For
- example, on page 18 of my Star NX-10 user's manual there is
- a description of a self-test and the following statements:
-
- Were you surprised? It's fast, isn't it? About 120
- characters a second, to be exact.
-
- Would any serious person use the words "about" and
- "exact" in the way they are used above? When I executed the
- test on my printer, 503 characters were printed. The elapsed
- time according to my stopwatch was 6 seconds. This means
- that the printer prints about 83 characters per second in
- that test mode. Even allowing for a one second error in
- timing, the printing speed would be increased to only about
- 100 characters per second. In order for the printer to meet
- specifications, it would have to print the 503 characters in
- 4.19 seconds.
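-
- The arithmetic behind those figures is easy to check. As a
- minimal sketch, assuming the character count and the
- stopwatch readings are simply loaded as constants, the 68000
- divu instruction yields each rate directly: it leaves the
- quotient in the low word of the destination register and the
- remainder in the high word.
-
```asm
; Printing rate = characters printed / elapsed seconds.
        move.l  #503, d0        ; Characters printed by the self-test.
        divu    #6, d0          ; Stopwatch reading of 6 seconds.
                                ; D0 low word  = 83 characters per second.
                                ; D0 high word = 5 (remainder).

        move.l  #503, d1        ; Same test, allowing a one second error.
        divu    #5, d1          ; Assume only 5 seconds actually elapsed.
                                ; D1 low word  = 100 characters per second.
                                ; D1 high word = 3 (remainder).
```
-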
- Again, on page 213 of the manual the printing speed in
- Draft pica mode is specified to be 120 characters per
- second; no "about" qualifier there. When printing an ASCII
- file in that mode, with the entire file contained in the
- printer's buffer, so that the printer's speed depends only
- upon its own capability, I measure a maximum printing speed
- of 68 characters per second.
- Are the manufacturer's specifications incorrect? Is my
- method of timing the printer's speed incorrect? I have
- learned not to be absolutely sure about anything in this
- world, but I think my time would be wasted if I were to
- spend it trying to develop a program which depended on the
- printer's ability to print at 120 characters per second.
- That would be an unrealistic expectation.
-
- Performance Measuring Tools
-
- Because of its dependency on ability and personal
- assessment, to some extent, performance testing must be
- subjective. In chapter 1, I said that I have been satisfied
- with the Star NX-10, and I have been, in spite of the
- printing speed controversy. The printer's other
- capabilities and its low cost more than compensate for that
- discrepancy, if it actually exists. Therefore, in my
- opinion, the performance of the printer is satisfactory.
- This is my personal assessment.
- Of course, one might be inclined to scold me, pointing
- out that my method of measuring elapsed time during the
- printing speed tests was crude. To which I would reply,
- "It's the only method that was available to me." And, I
- might add, I have found it to be much more reliable than
- words printed on paper in a user's manual. Any user's
- manual.
- One individual's judgement of the overall performance
- of a particular item of software is as subjective as are my
- conclusions about the Star NX-10. But specific software
- attributes can be judged objectively, if tools which can
- measure pertinent aspects of performance are available. I
- am going to provide you with some of those tools in this
- chapter.
- I will introduce utility programs with which the
- efficiency of individual instructions, algorithms and
- programs may be compared. I will provide programs that
- perform many comparisons, but, since the subject of
- performance testing must be restricted to a reasonable
- length in the book, I will concentrate more on showing you,
- by example, when and how I decide to conduct performance
- tests, rather than flit through comparisons until you become
- bored with the whole idea.
-
- The First Utility
-
- While the primary objective of the chapter is to
- provide performance measuring utilities, a secondary
- objective is to illustrate specific stages of program
- development. I begin with the specifications for a utility
- to be called SPEEDTST. Then I introduce the first of a
- series of programs, each of which is a model that represents
- a snapshot of a continuous process. The other programs
- follow after the introduction of a program on which the
- models can operate.
- The programs introduced in this chapter invoke the
- custom traps described in program 13. To install the custom
- traps, execute TRAPS.PRG. If you want the traps to be
- automatically installed during system boot, copy TRAPS.PRG
- to the AUTO folder on your boot partition or floppy disk.
- The first utility will calculate a program's load and
- execution times. As I concluded chapter 3, I mentioned that
- the first stage of increasing a program's execution speed
- involved getting it into RAM as quickly as possible.
- Methods of doing that will be discussed in this chapter. In
- order to discuss a variety of methods, I need a way to
- measure the time required to load and execute a program.
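-
- All of the timing in this chapter rests on the 200 Hz system
- counter that the operating system keeps in the longword at
- $4BA. That location can be read only in supervisor mode. The
- programs that follow read it with custom trap #3; as a sketch
- of the idea only, assuming XBIOS function 38 (supexec) is
- used in place of the custom trap, an interval can be measured
- as shown here. The labels time_something, get_ticks,
- read_counter, tick_count and start_ticks are illustrative and
- do not appear in the book's programs.
-
```asm
time_something:
        bsr     get_ticks               ; Sample the counter before the work.
        lea     start_ticks, a0
        move.l  d0, (a0)                ; Save the starting value.

        ; ... the code to be timed goes here ...

        bsr     get_ticks               ; Sample the counter after the work.
        sub.l   start_ticks, d0         ; D0 = elapsed 200 Hz intervals.
        move.l  d0, d1                  ; Save a copy to add.
        asl.l   #2, d0                  ; Shift to multiply by 4.
        add.l   d1, d0                  ; D0 = intervals * 5 = milliseconds.

terminate:
        move.w  #0, -(sp)               ; Function = GEMDOS $0 = terminate.
        trap    #1

get_ticks:                              ; Returns the 200 Hz counter in D0.
        pea     read_counter            ; Routine to execute in supervisor mode.
        move.w  #38, -(sp)              ; Function = supexec = XBIOS 38.
        trap    #14
        addq.l  #6, sp                  ; Reset stack pointer to top of stack.
        move.l  tick_count, d0          ; Fetch the value that was copied.
        rts

read_counter:                           ; Executed in supervisor mode.
        lea     tick_count, a0
        move.l  $4BA, (a0)              ; Copy the counter to program memory.
        rts

        bss
tick_count:     ds.l 1                  ; Most recent counter value.
start_ticks:    ds.l 1                  ; Counter value before the work.
        end
```
-
- Each interval lasts 5 milliseconds, which is why the count is
- multiplied by 5 with a shift and an add.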
-
- Specifications For SPEEDTST
-
- SPEEDTST must accomplish the following:
-
- 1. Spawn a process = load and execute a program.
- 2. Programs to be spawned will have a TOS or PRG
- suffix.
- 3. The spawned program will reside in the same
- directory as does SPEEDTST.
- 4. Create a disk file which is to be identified by
- the name of the spawned program with a DAT
- suffix. The disk file is to reside in the same
- directory as does SPEEDTST.
- 5. Calculate the spawned program's load and
- execution times.
- 6. Store the load and execution times in the disk
- file described in item 4.
- 7. If the spawned process directs output to the
- video screen via GEMDOS function $9, redirect
- that output to the file described in item 4.
-
- The First Model
-
- Program 16 is the first in a series of four programs
- which progress in algorithmic perfection until the program
- SPEEDTST is developed. SPEED_1 is the first working model
- of a parent program which loads and executes a child
- program. The parent calculates the spawned program's load
- and execution times, using information returned to the
- parent when the child terminates. The parent creates a disk
- file and stores the calculated values therein. If the child
- directs output to the screen using GEMDOS function $9, that
- output will be redirected to the file. The name of the file
- created by the parent is composed of the name of the child,
- without suffix, plus the extension DAT.
- While it is doing all of that, the parent also confirms
- that trap #6 has been installed by TRAPS.PRG and functions
- correctly. The parent accomplishes the verification simply
- by being able to spawn, which it can't do if custom trap #6
- fails to return excess memory to the operating system. Trap
- #6 also performs another function, but its effectiveness is
- confirmed only if the child terminates using custom trap #8.
- Refer to the extensive note in the data section of program
- 16.
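-
- When GEMDOS starts a program, it normally hands the program
- all of the free memory, so nothing is left for a child until
- the parent gives some back. Trap #6 does that for the
- programs in this book. As a sketch only, assuming the
- standard GEMDOS function $4A (mshrink) is what such a routine
- ultimately relies on, returning the excess memory looks like
- this. The trap #6 routine itself is documented in TRAPS.S and
- may differ in detail.
-
```asm
; Must run first, while 4(sp) still holds the basepage address.
release_excess_memory:
        movea.l 4(sp), a1               ; Put basepage address in A1.
        lea     program_end, a0         ; Put end of program address in A0.
        suba.l  a1, a0                  ; Program size = end - basepage.
        lea     stack, a7               ; Move to a stack inside the program.
        move.l  a0, -(sp)               ; Number of bytes to keep.
        pea     (a1)                    ; Start of the block = basepage.
        move.w  #0, -(sp)               ; Reserved word, must be zero.
        move.w  #$4A, -(sp)             ; Function = mshrink = GEMDOS $4A.
        trap    #1                      ; D0 = 0 if the memory was released.
        lea     12(sp), sp              ; Remove the 12 bytes of parameters.

        ; ... the program is now free to spawn a child ...

        bss
        ds.l    96                      ; Program stack.
stack:          ds.l 0                  ; Address of program stack.
program_end:    ds.l 0
        end
```
-
- The stack is moved before the call because the stack that
- GEMDOS supplies lies at the top of the memory that is about
- to be released.
-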
- Program 16 must be assembled in PC-relative mode and the
- executable file must be saved with a TTP extension. When it
- is executed, the program to be spawned must reside in the
- same directory as does program 16. Type the name of the
- program to be spawned on program 16's input parameter line.
- As you shall see, a program that is to be spawned by program
- 16 must be specifically prepared for the spawning operation.
- Program 16, as do programs 18 and 19, invokes custom
- traps which must be installed by programs TRAPS.PRG (program
- 13, chapter 4) and TRAP_9.PRG (program 15, chapter 5).
- Therefore, these programs must be executed from the desktop
- or from the AUTO folder of a boot partition or floppy before
- programs SPEED_1.TTP, SPEED_2.TTP or SPEED_3.TTP are
- executed. TRAP_9.S follows.
-
-
- Program 15. This program installs a custom trap for programs
- 16, 18 and 19.
-
- ; Program Name: TRAP_9.S
- ; Version 1.002
-
- ; Assembly Instructions:
-
- ; Assemble in PC-relative mode and save with a PRG extension.
-
- ; Program Function:
-
- ; This is a LSR program that establishes a user defined trap. It may be
- ; executed from the desktop, but you may prefer to copy it to the AUTO
- ; folder of your boot partition or floppy disk so that it will execute
- ; automatically during boot.
-
- ; MAJOR NOTE: SEE FURTHER DOCUMENTATION FOR THIS PROGRAM IN TRAPS.S.
-
- ; Trap #9 is special in that it is only used by three programs: SPEED_1.TTP,
- ; SPEED_2.TTP and SPEED_3.TTP. The custom trap is used simply to reduce the
- ; size of those programs.
-
- ; This program invokes a custom trap that is established by TRAPS.PRG,
- ; therefore, that program must be executed before trap #9 is invoked by a
- ; program.
-
- program_start: ; Calculate program size and retain result.
- lea program_end, a3 ; Fetch program end address.
- suba.l 4(a7), a3 ; Subtract basepage address.
-
- enter_supervisor_mode:
- move.l #0, -(sp) ; The zero turns on supervisor mode.
- move.w #$20, -(sp) ; Function = super = GEMDOS $20.
- trap #1 ; Go to supervisor mode.
- addq.l #6, sp ; Supervisor stack pointer (SSP) returned in D0.
- movea.l d0, a5 ; Save SSP in scratch register.
-
- install_trap_9_routine: ; Note: pointer = vector = pointer.
- lea trap_9_routine, a0 ; Fetch address of trap #9 routine.
- move.l a0, $A4 ; Store custom trap address in pointer.
-
- enter_user_mode:
- pea (a5) ; Restore supervisor stack pointer.
- move.w #$20, -(sp) ; Function = super = GEMDOS $20.
- trap #1 ; Go to user mode.
- addq.l #6, sp ; Reset stack pointer to top of stack.
-
- relinquish_processor_control: ; Maintain memory residency.
- move.w #0, -(sp) ; See page 121 of Internals book.
- move.l a3, -(sp) ; Program size.
- move.w #$31, -(sp) ; Function = ptermres = GEMDOS $31.
- trap #1
-
- trap_9_routine:
-
- ; Expects a program's load time in register D3 as a binary count of 200 Hz
- ; system clock intervals. Each interval lasts 5 milliseconds (msec), so
- ; this algorithm multiplies the value in D3 by 5 to convert it to msec, then
- ; prints the load time in decimal msec.
-
- ; Also expects a program's execution time in register D5. The same service
- ; is performed for the value in that register.
-
- convert_load_time_to_msec:
- move.l d3, d0 ; Save a copy to add.
- asl.l #2, d3 ; Shift to multiply by 4.
- add.l d0, d3 ; To complete multiplication by 5.
-
- print_load_time:
- cmpi.l #999, d3 ; If load time is less than 1000, then
- bgt no_space ; print a leading blank space for output
- lea space, a0 ; alignment.
- bsr print_string
- cmpi.l #99, d3 ; If load time is less than 100, then
- bgt no_space ; print another leading blank space.
- lea space, a0
- bsr print_string
- no_space:
- move.l d3, d1 ; Copy load time to D1 for decimal conversion.
- trap #4 ; Returns address of decimal string in A0.
- bsr.s print_string
- lea units_label, a0
- bsr.s print_string
-
- convert_execution_time_to_msec:
- lea execute_time_msg, a0
- bsr.s print_string
- move.l d5, d0 ; Save a copy to add.
- asl.l #2, d5 ; Shift to multiply by 4.
- add.l d0, d5 ; To complete multiplication by 5.
-
- print_execution_time:
- cmpi.l #999, d5 ; If execute time is less than 1000, then
- bgt _no_space ; print a leading blank space for output
- lea space, a0 ; alignment.
- bsr print_string
- cmpi.l #99, d5 ; If execute time is less than 100, then
- bgt.s _no_space ; print another leading blank space.
- lea space, a0
- bsr.s print_string
- _no_space:
- move.l d5, d1 ; Copy execute time for decimal conversion.
- trap #4 ; Returns address of decimal string in A0.
- bsr.s print_string
- lea units_label, a0
- bsr.s print_string
- rte
-
- ;
- ; Subroutine
- ;
-
- print_string: ; Expects address of string to be in A0.
- pea (a0) ; Push address of string onto stack.
- move.w #9, -(sp) ; Function = c_conws = GEMDOS $9.
- trap #1 ; GEMDOS call
- addq.l #6, sp ; Reset stack pointer to top of stack.
- rts
-
- data
- space: dc.b " ",0
- execute_time_msg: dc.b " Execute time: ",0
- units_label: dc.b " milliseconds", $D,$A,0
- bss
- align ; Align storage on a word boundary.
- program_end: ds.l 0
- end
-
-
- Program 16. A utility that computes a program's load and
- execution times.
-
- ; Program Name: SPEED_1.S
- ; Version: 1.006
-
- ; Assembly Instructions:
-
- ; Assemble in "PC-relative" mode and save with a TTP extension.
-
- ; Execution Instructions:
-
- ; SPEED_1.TTP will not execute unless the custom traps in program
- ; TRAPS.PRG and TRAP_9.PRG have previously been installed. The custom
- ; traps are installed when those programs are executed from the desktop
- ; or from an AUTO folder on a boot disk.
-
- ; NOTE: The time required for a program to be loaded into memory depends
- ; on the assembly mode used to assemble the program. This will be
- ; shown, using SPEEDTST.TTP, in chapter 5.
-
- ; In addition, a program's load time depends on the drive from which
- ; the program is loaded, the method used to format the disk on which
- ; the program is located, the position of the program on the disk
- ; and, in this case, the position of the child process relative to the
- ; position of the parent process.
-
- ; To eliminate the drive-variables when comparing the load and
- ; execution times of one program to that of another, the parent and
- ; the child should be isolated to an otherwise empty partition or
- ; floppy disk for each spawning instance.
-
- ; For example, if there are two programs involved in the comparison,
- ; first copy the parent, which is SPEED_1 in this case, so that it is
- ; the only item in the hard disk partition or on the floppy. Then,
- ; copy the first program to the same partition or floppy. Execute
- ; the parent, SPEED_1 in this case, and obtain the results.
-
- ; Remove the first program and copy the second. Execute the parent
- ; and obtain the results for the second program.
-
- ; Execute from the desktop. Type the name of an executable file which
- ; has a TOS or PRG extension on SPEED_1.TTP's input parameter line. The
- ; program whose name you type on the parameter line must be in the same
- ; directory as is SPEED_1.TTP and the program must be one that terminates
- ; with GEMDOS function $4C.
-
- ; Upon termination, the spawned program must return the value that is in
- ; memory location $4BA immediately after it has been loaded, hereafter called
- ; the after-load time or after-load value. Custom trap #3 (get_time) can be
- ; used to obtain that value. SPEED_1.TTP uses the value returned in D0 to
- ; calculate the spawned program's load and execution times.
-
- ; The spawned program must terminate with GEMDOS function $4C so that
- ; the after-load value can be returned in D0 by that function. The value
- ; returned in D0 by GEMDOS function $4C is limited to a 16 bit value.
-
- ; If the spawned program has any halt or wait instructions, such as wait
- ; for a keypress, etc., those should be commented out, then the program
- ; should be assembled especially for the speed test. Otherwise the
- ; execution time will include the time waiting for input.
-
- ; If custom trap #8 is used to terminate the program, the trap will
- ; execute a wait_for_keypress algorithm when the program is executed from
- ; the desktop, but it will omit the wait algorithm when the program is
- ; spawned by SPEED_1.TTP. In addition, trap #8 will return the after-load
- ; value to SPEED_1.TTP and terminate the spawned program with GEMDOS function
- ; $4C.
-
- ; Both trap #8 and SPEED_1.TTP require that the spawned program be
- ; initialized with custom trap #6. See the note in the data section, below.
-
- ; Primary Function:
-
- ; Spawn a process. Calculate the spawned program's load and execution
- ; times. Store these values in a disk file that is identified by the name
- ; of the spawned process with a DAT suffix.
-
- ; If the spawned process directs output to the screen, store that output
- ; in the same disk file. Note: only screen directed output processed by
- ; GEMDOS function $9 will be directed to the file. If BIOS function $3 is
- ; used for screen output, that output will not be redirected to the file.
-
- ; Secondary Function:
-
- ; Verify that trap #6 is resident and functions correctly. SPEED_1
- ; confirms that because it will not be able to spawn a process unless
- ; the trap #6 call has returned excess memory to the system.
-
- ; Description:
-
- ; SPEED_1 is the first in a series of programs which progress in
- ; algorithmic perfection until the program SPEEDTST is developed. Using
- ; this series of programs, I intend to help you experience selected stages
- ; of a program development process.
-
- ; The primary attribute of this development process is its dependence,
- ; during the early stages of development, on familiar documented algorithms
- ; that can easily be found in references for many programming languages.
- ; After a working model has been developed with these familiar algorithms,
- ; attempts are made to introduce unfamiliar algorithms which may be faster
- ; or consume less memory.
-
- release_excess_memory:
- lea program_end, a0 ; Put "end of program" address in A0.
- movea.l 4(a7), a1 ; Put "basepage" address in A1.
- movea.l a1, a4 ; Copy to A4 for command line access.
- trap #6 ; Calculate program size and release memory.
-
- ; NOTE: A local stack is not declared in PRG_5AP.TOS. Because of the long
- ; string that is printed by that program, this program will bomb when
- ; it spawns PRG_5AP.TOS, if a local stack is not declared here.
-
- lea stack, a7 ; Point A7 to this program's stack.
-
- ; The next task to be accomplished is an initialization algorithm. The
- ; name of the program that is to be typed on SPEED_1.TTP's input parameter
- ; line must be used in several ways. First, its suffix must be changed to
- ; DAT so that it can be passed as a NULL terminated string when GEMDOS $3C
- ; is invoked to create the disk file.
-
- ; Then it must be passed as a NULL terminated string with the program's
- ; original suffix when GEMDOS $4B is invoked to spawn the program.
-
- ; Finally, the program's name is used as part of SPEED_1.TTP's output
- ; header.
-
- ; The command line processing algorithm creates the required NULL terminated
- ; strings, storing them in locations declared in the data section of SPEED_1.
-
- process_command_line_parameters:
- lea $80(a4), a4 ; Fetch address of parameters.
- move.b (a4)+, d0 ; Fetch parameter line character count.
- lea program_name, a3 ; Load program_name address in A3.
- subq.b #1, d0 ; Set up counter.
- ext.w d0 ; Extend to match the size of the dbra
- ; instruction.
-
- ; NOTE: The dbcc instruction operates on a word length value, therefore,
- ; the value in the register that is to be decremented by a dbcc
- ; instruction must be placed there with a word size instruction, such
- ; as move.w #10, D0; or with a longword size instruction, as long as
- ; the value in the longword is limited to word size validity, or with
- ; a byte size instruction, as long as the value in the register is
- ; sign extended to word size, as is done in the instruction above.
-
- fetch_character:
- move.b (a4)+, (a3)+ ; Store character.
- dbra d0, fetch_character ; Loop until d0 becomes negative.
- move.b #0, (a3) ; Finish with a NULL.
-
- create_file_name: ; Create a file to accept standard output.
- lea filename, a4
- lea program_name, a3
- copy_name: ; Assumes program_name contains a period.
- move.b (a3)+, (a4)+
- cmpi.b #$2E, (a3) ; Is next byte of program_name the period?
- bne.s copy_name ; Continue looping until period is seen.
- move.b #$2E, (a4)+ ; Add a period.
- move.b #$44, (a4)+ ; Add letter 'D'.
- move.b #$41, (a4)+ ; Add letter 'A'.
- move.b #$54, (a4)+ ; Add letter 'T'.
- move.b #0, (a4) ; Add a NULL.
-
- create_file:
- move.w #0, -(sp) ; File attribute = read/write.
- pea filename ; Will be name of spawned process + .DAT.
- move.w #$3C, -(sp) ; Function = f_create = GEMDOS $3C.
- trap #1 ; File handle is returned in D0.
- addq.l #8, sp
- lea file_handle, a0 ; Store returned file handle.
- move.w d0, (a0)
-
- redirect_output: ; Exchange file handle with screen's handle.
- move.w file_handle, -(sp) ; This is the disk file's handle.
- move.w #1, -(sp) ; This is the video screen's handle.
- move.w #$46, -(sp) ; Function = f_force = GEMDOS $46.
- trap #1
- addq.l #6, sp
-
- get_start_time:
- lea start_time, a3 ; Fetch address of variable "start_time".
- trap #3 ; Returns value of system clock in D0.
- move.w d0, (a3) ; Save start time.
-
- load_and_execute_program:
- pea environ_string
- pea command_line
- pea program_name
- move.w #0, -(sp)
- move.w #$4B, -(sp) ; Function = GEMDOS $4B = p_exec.
- trap #1
- move.w d0, d3 ; Copy after-load value to D3 for calculation.
-
- get_end_time:
- trap #3 ; Returns value of system clock in D0.
- move.w d0, d5 ; Copy to D5 for calculation.
- sub.w d3, d5 ; Subtract after-load time from end time.
- ext.l d5 ; Extend to 32 bits.
-
- ; NOTE: D5 now contains the spawned program's execution time, but the time
- ; has not yet been converted to milliseconds. See the note below
- ; concerning the sign extension of D3 and D5.
-
- reposition_stack_pointer:
- lea $10(sp), sp
-
- ; Note the difference between the use of GEMDOS function $19 below and
- ; the way it is used on page 116 of the Internals book. In the
- ; Internals book there are two errors: (1) sp should not be referenced
- ; indirectly, as (sp); (2) the ASCII code for the letter A should be
- ; added to the contents of the register--in the Internals book the
- ; contents of the register are added to the ASCII code for the letter
- ; A.
-
- get_drive:
- move.w #$19, -(sp) ; Function = dgetdrv = GEMDOS $19.
- trap #1 ; Returns 0 for drive A, 1 for B, etc.
- addq.l #2, sp
- add.b #$41, d0 ; Add ASCII value for A to compute ASCII
- lea drive, a0 ; letter code for the drive value returned.
- move.b d0, (a0) ; Save drive's ASCII letter code.
-
- print_heading:
- lea heading, a0
- bsr print_string
- lea program_name, a0
- bsr print_string
- print_drive_for_spawned_program:
- lea drive_msg, a0
- bsr print_string
-
- compute_load_time:
- lea load_time_msg, a0
- bsr.s print_string
- lea start_time, a3
- sub.w (a3), d3 ; Subtract start time from after-load time.
- ext.l d3 ; Extend to 32 bits.
-
- ; SIGN EXTENSION NOTE
-
- ; The value in D3, above, and in D5 previously, is extended to 32 bits
- ; because, although the number of 200hz intervals we are able to utilize is
- ; limited to a word size by the value that is returned in D0 via GEMDOS
- ; function $4C, the time converted to milliseconds can extend beyond that
- ; word size limitation.
-
- trap #9 ; See description in TRAP_9.S.
-
- close_file:
- move.w file_handle, -(sp)
- move.w #$3E, -(sp) ; Function = fclose = GEMDOS $3E.
- trap #1
- addq.l #4, sp
-
- terminate:
- move.w #0, -(sp)
- trap #1
-
- print_string: ; Expects address of string to be in A0.
- pea (a0) ; Push address of string onto stack.
- move.w #9, -(sp) ; Function = c_conws = GEMDOS $9.
- trap #1 ; GEMDOS call
- addq.l #6, sp ; Reset stack pointer to top of stack.
- rts
-
- data
- heading: dc.b $D,$A,"SPEED_1.TTP Execution Results",$D,$A
- dc.b "for ",0
- drive_msg: dc.b ", loaded from drive: "
- drive: dc.b "A",$D,$A,0
- load_time_msg: dc.b $D,$A," Load time: ",0
-
- ; NOTE: Custom trap #6 checks the environmental string pointer of each
- ; program that invokes it to see if the pointer contains the address
- ; of the label "environ_string" below. That test is performed by
- ; comparing the contents of the address contained in the pointer to
- ; the ASCII string "TERM" declared below.
-
- ; When a match occurs, it means that the program invoking trap #6 has
- ; been spawned by SPEED_1 (or by a similar program), therefore, trap
- ; #6 sets the value of the boolean variable "spawned", declared by
- ; TRAPS.PRG, to all ones = true.
-
- ; When custom trap #8 is invoked by a program, the state of the
- ; variable "spawned" is tested. If the state is true, the program
- ; invoking custom trap #8 is terminated with GEMDOS function $4C and
- ; the after-load time, which was saved by custom trap #6, is returned
- ; to the parent program.
-
- ; If the state of "spawned" is false, GEMDOS function $8 is executed
- ; so that execution will pause for a keypress. When the keypress is
- ; received, GEMDOS function $0 is executed.
-
- ; In this manner, custom trap #8, working in conjunction with custom
- ; trap #6, eliminates the "wait for keypress" algorithm automatically
- ; when a program is spawned by SPEED_1 (or a similar program). This
- ; prevents the computed execution time from being corrupted by a time
- ; period that involves a wait for keyboard input.
-
- environ_string: dc.b "TERM",0
- command_line: dc.b 0
- align
- bss
- start_time: ds.w 1 ; Value in $4BA just before spawning.
- file_handle: ds.w 1 ; Handle for the filename below.
- filename: ds.l 4 ; File name for execution results.
- program_name: ds.l 4 ; Filename buffer. Must be NULL terminated.
- ds.l 96 ; Program stack.
- stack: ds.l 0 ; Address of program stack.
- program_end: ds.l 0
- end
-
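- The sign extension note in program 16 can be illustrated with
- a few literal values. The following is a sketch for
- illustration, not part of SPEED_1; the counter values are
- invented. It shows that a word size subtraction still gives
- the correct interval when the low word of the counter wraps
- past zero between the two readings. It also shows a limit
- that follows from ext.l treating the word as signed: an
- interval of 32768 or more 200 Hz counts (about 164 seconds)
- is extended to a negative number, whereas clearing the upper
- word instead would preserve intervals up to 65535 counts.
-
```asm
; Case 1: the low word of the counter wraps between readings.
        move.w  #$FFF0, d3      ; Earlier reading.
        move.w  #$0010, d5      ; Later reading, after the wrap.
        sub.w   d3, d5          ; D5.W = $0020 = 32 intervals.
        ext.l   d5              ; D5.L = $00000020.
        move.l  d5, d0          ; Save a copy to add.
        asl.l   #2, d5          ; Shift to multiply by 4.
        add.l   d0, d5          ; D5.L = 160 milliseconds.

; Case 2: an interval of 40000 counts = 200 seconds.
        move.w  #$9C40, d5      ; 40000 as a word.
        ext.l   d5              ; D5.L = $FFFF9C40, a negative longword.

        moveq   #0, d6          ; Clear all 32 bits of D6 first.
        move.w  #$9C40, d6      ; D6.L = $00009C40 = 40000, still positive.
```
-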
-
- Program 17 was prepared as a simple example to be
- executed by program 16 and the other programs in the series.
- Program 17 illustrates the use of custom traps #3, #6 and
- #8. Assemble programs 16 and 17, then, with their
- executable files in the same directory, execute program 16.
- Type the name of program 17's executable file on program
- 16's command line. Figure 5.1 shows the contents of the
- file produced by program 16. The values stored in the file
- depend on the variables mentioned in program 16's
- documentation.
-
- Program 17. Execute this program by typing PRG_5AP.TOS on
- SPEED_1.TTP's command line.
-
- ; Program Name: PRG_5AP.S
- ; Version 1.003
-
- ; Assembly Instructions:
-
- ; Assemble in PC-relative mode and save with a TOS extension.
-
- ; Execution Note:
-
- ; This program invokes custom traps which must be installed by
- ; TRAPS.PRG prior to its execution.
-
- ; Program Function:
-
- ; This program illustrates the use of custom traps #3, #6 and #8.
- ; If the program is executed from the desktop, trap #8 will execute the
- ; wait_for_keypress algorithm, then, when a key is pressed it will execute
- ; GEMDOS function 0.
-
- ; If, instead, this program is executed by typing its name on
- ; SPEEDTST.TTP's input parameter line, trap #8 will not execute the
- ; wait_for_keypress algorithm, but it will immediately execute GEMDOS
- ; function $4C.
-
- ; Trap #3 returns, in D0, the value of the system clock as it is
- ; immediately after this program has been loaded. The value in D0 is not
- ; corrupted before trap #6 is invoked, therefore, it is still valid when
- ; the trap #6 routine begins to execute. Trap #6 saves the "after-load"
- ; value of the system clock in its own local variable, where it is available
- ; for processing during the execution of trap #8.
-
- ; Trap #6 also calculates the memory occupied by this program and releases
- ; the memory not occupied by this program to the operating system.
-
- fetch_load_time:
- trap #3 ; Returns value of system clock in D0.
- release_excess_memory: ; Also stores after-load time in TRAPS bss.
- lea program_end, a0 ; Put "end of program" address in A0.
- movea.l 4(a7), a1 ; Put "basepage" address in A1.
- trap #6 ; Calculate program size and release memory.
-
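- waste_time: ; Busy loop that gives the timer something to measure.
- move.l #$1, d0 ; Outer count: dbra runs the loop D0 + 1 = 2 times.
- outer_loop:
- move.l #$FDE8, d1 ; Inner count: $FDE8 = 65000, so 65001 passes.
- inner_loop:
- move.l #$FDE8, d2 ; Filler instruction; D2 is not used elsewhere.
- dbra d1, inner_loop
- dbra d0, outer_loop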
-
- lea heading, a0
- bsr.s print_string
- lea string, a0
- bsr.s print_string
-
- trap #8 ; Terminate.
- print_string:
- pea (a0)
- move.w #9, -(sp)
- trap #1
- addq.l #6, sp
- rts
-
- data
- heading: dc.b 'PRG_5AP.TOS Execution Results',$D,$A,$D,$A,0
- string: dc.b ' When executed from the desktop, this program will print '
- dc.b 'this string on the',$D,$A
- dc.b ' video screen and pause for a keypress. But, when this '
- dc.b 'program is spawned by',$D,$A
- dc.b ' SPEED_1, SPEED_2, SPEED_3 or SPEEDTST, the string will '
- dc.b 'be stored in a file ',$D,$A
- dc.b ' named PRG_5AP.DAT and the program will not pause for a '
- dc.b ' keypress.',$D,$A,0
- bss
- align
- program_end: ds.l 0
- end ; Assembler pseudo-op.
-
- PRG_5AP.TOS Execution Results
-
- When executed from the desktop, this program will print this string on the
- video screen and pause for a keypress. But, when this program is spawned by
- SPEED_1, SPEED_2, SPEED_3 or SPEEDTST, the string will be stored in a file
- named PRG_5AP.DAT and the program will not pause for a keypress.
-
-
- Figure 5.1. Contents of PRG_5AP.DAT, the data file produced
- by program 16 to contain program 17's load and execution
- times.
-
- PRG_5AP.TOS Execution Results
-
- When executed from the desktop, this program will print this string on the
- video screen and pause for a keypress. But, when this program is spawned by
- SPEED_1, SPEED_2, SPEED_3 or SPEEDTST, the string will be stored in a file
- named PRG_5AP.DAT and the program will not pause for a keypress.
-
- SPEED_1.TTP Execution Results
- for PRG_5AP.TOS, loaded from drive: G
-
- Load time: 45 milliseconds
- Execute time: 680 milliseconds
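-
- Program 17 hands the "end of program" address and the
- basepage address to trap #6, which calculates the program's
- size and releases the remaining memory. TRAPS.S is not
- listed in this part of the chapter, so the fragment below
- is only a sketch of the conventional GEMDOS sequence that
- such a routine might perform, written as an ordinary
- subroutine. The label release_memory_sketch is
- hypothetical, and the use of GEMDOS $4A (m_shrink) is an
- assumption about trap #6, not a quotation from it.

```asm
; Hypothetical stand-in for trap #6; not taken from TRAPS.S.
; Entry: A0 = address of the program_end label.
;        A1 = address of the program's basepage.
release_memory_sketch:
        suba.l  a1, a0          ; Program size = end address - basepage address.
        move.l  a0, -(sp)       ; Number of bytes to keep.
        move.l  a1, -(sp)       ; Start of the block to shrink (the basepage).
        move.w  #0, -(sp)       ; Reserved word, must be zero.
        move.w  #$4A, -(sp)     ; Function = m_shrink = GEMDOS $4A.
        trap    #1              ; D0 = 0 on success, negative on error.
        lea     $C(sp), sp      ; Remove the 12 bytes pushed above.
        rts
```

- Everything above the program_end label is returned to the
- operating system, which is why the programs that declare a
- local stack place that label after the stack in the bss
- section.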
-
-
- The Second Model
-
- After program 16 was operational, I began to think
- about ways I might improve the command line processing
- algorithm. Also, I decided to try to improve the accuracy
- of the calculated load time by initializing the stack for
- GEMDOS $4B, withholding the invocation of trap #1, then
- invoking trap #3 to get the start time, just before invoking
- trap #1 to load and execute program 17.
- The improvements are incorporated in program 18, the
- next program in the series. In SPEED_2, the movem.l
- instruction is used to move the command line to four
- registers, then from there to a declared location in the
- bss section. Since this program is simply a model, and
- since the algorithms which create the disk file were
- developed in SPEED_1, I decided that there was no reason to
- repeat those algorithms in SPEED_2.
- However, I discovered that, for no apparent reason, the
- load time reported by SPEED_2 increased significantly, even
- though the experiments with SPEED_1 and SPEED_2 were
- executed under identical conditions. By eliminating each of
- SPEED_1's algorithms that are involved with the disk file,
- in turn, I learned that, for some reason, the load time is
- shorter when a file is created. Therefore, in order to
- maintain a valid experiment, I created a dummy file in
- SPEED_2, but wrote nothing to it.
- But by the time I got to SPEEDTST, I realized that the
- file name creation algorithm was actually a part of the
- command line processing algorithm. Therefore, in order to
- validate comparisons between the three models, I had to redo
- SPEED_2 and SPEED_3, including a file name creation
- algorithm in each. While doing that, I was able to use the
- movem.l instruction to develop a faster creation algorithm
- than that used in SPEED_1.
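-
- Every model in this series reads the system clock with
- custom trap #3, whose source is in TRAPS.S and is not
- repeated here. Program 20 converts the returned ticks to
- milliseconds by multiplying by 5, which implies a 200 Hz
- count; the ST keeps such a count in the system variable
- _hz_200 at address $4BA. The fragment below is a sketch,
- under that assumption, of how the same value could be read
- without the custom trap by using XBIOS function 38
- (Supexec). The labels get_ticks_sketch, read_hz_200 and
- ticks are hypothetical.

```asm
; Hypothetical stand-in for trap #3; not taken from TRAPS.S.
; Exit: D0.l = current value of the 200 Hz counter.
get_ticks_sketch:
        pea     read_hz_200     ; Routine to be run in supervisor mode.
        move.w  #38, -(sp)      ; Function = Supexec = XBIOS 38.
        trap    #14             ; XBIOS call.
        addq.l  #6, sp          ; Remove the 6 bytes pushed above.
        lea     ticks, a0
        move.l  (a0), d0        ; Return the saved count in D0.
        rts

read_hz_200:                    ; Executes in supervisor mode.
        lea     ticks, a0
        move.l  $4BA, (a0)      ; Copy system variable _hz_200.
        rts

        bss
ticks:  ds.l    1               ; Tick count saved by read_hz_200.
```

- The XBIOS call adds overhead of its own to every reading,
- so a routine of this kind is probably less suitable for
- fine measurements than the trap used by the models.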
-
- Program 18. The next stage of SPEEDTST.TTP's development.
-
- ; Program Name: SPEED_2.S
- ; Version 1.003
-
- ; NOTE: This program is similar to SPEED_1. The differences between the
- ; two are that this one uses a different algorithm to process
- ; the command line, and it fetches the start time at a more appropriate
- ; place in the program.
-
- ; Assembly Instructions:
-
- ; Assemble in "PC-relative" mode and save with a TTP extension.
-
- ; Function:
-
- ; Spawn a process and calculate the spawned program's load and execution
- ; times. Pause for a keypress before terminating.
-
- release_excess_memory:
- lea program_end, a0 ; Put "end of program" address in A0.
- movea.l 4(a7), a1 ; Put "basepage" address in A1.
- movea.l a1, a4 ; Copy to A4 for command line access.
- trap #6 ; Calculate program size and release memory.
-
- ; NOTE: A local stack is not declared in PRG_5AP.TOS. Because of the long
- ; string that is printed by that program, this program will bomb when
- ; it spawns PRG_5AP.TOS, if a local stack is not declared here.
-
- lea stack, a7 ; Point A7 to this program's stack.
-
- ; NOTE ABOUT THE COMMAND LINE PROCESSING ALGORITHM
-
- ; Refer to figure 2.13 of chapter 2 for an image of a command line
- ; that is stored in a program's basepage. The first byte of the command
- ; line is a count of the ASCII characters contained therein. The second
- ; byte is the first character in the command line. The last character in
- ; the command line is followed by the ASCII code for a carriage return;
- ; the carriage return is not included in the character count.
-
- ; For program SPEED_2 we know that the command line character count
- ; cannot exceed 12 characters = 12 bytes = 3 longwords. Therefore, it
- ; would be convenient if those 3 longwords could be transferred directly to
- ; three data registers. Unfortunately, the MC68000 will not permit the
- ; movem instruction to transfer data which begins at an odd address.
-
- ; Because of this restriction, it would be convenient if the operating
- ; system stored the first command line character at an even address.
- ; Unfortunately, it does not. Therefore, we are forced to fetch 4 longwords
- ; from the vicinity of the command line. That's why we must use four data
- ; registers instead of three.
-
- ; To complicate things, the command line ASCII string will be corrupted
- ; by the first byte in the first register, because it is the character count,
- ; not a valid character. So, when the data contained in the data registers
- ; are transferred to a declared variable location, this byte must be stripped
- ; from the command line ASCII string.
-
- ; I accomplish this with no wasted time by declaring two variable
- ; locations, input_line and program_name. Since input_line is one byte in
- ; length, and the first location for program_name immediately follows that
- ; byte, when the contents of the data registers are moved to the location of
- ; input_line, the variable program_name will point to the first character
- ; of the command line ASCII string, as it should.
-
- ; The carriage return at the end of the ASCII string is also transferred
- ; to the 15 byte array addressed by program_name. It must be overwritten by
- ; a NULL so that the ASCII string is NULL terminated. That is accomplished
- ; by fetching the command line character count as a byte length value, extending
- ; it to word length and using the result in an operand that uses "address
- ; register indirect with index" addressing.
-
- process_command_line:
- lea input_line, a3 ; Fetch location to contain command line.
- lea output_line, a5 ; A second location: for filename.
- movem.l $80(a4), d0-d3 ; Move 16 bytes of command line to 4 registers.
- movem.l d0-d3, (a3) ; Move them to address "input_line".
- movem.l d0-d3, (a5) ; Move them to address "output_line".
- move.b $80(a4), d0 ; Fetch command line ASCII character count.
- ext.w d0 ; Extend to word for next instruction.
- move.b #0, 1(a3,d0.w) ; Store a null at end of command line input.
- move.b #0, 1(a5,d0.w) ; Same for filename buffer.
-
- insert_filename_suffix:
- move.b #$44, -2(a5,d0.w) ; Insert letter 'D'.
- move.b #$41, -1(a5,d0.w) ; Insert letter 'A'.
- move.b #$54, 0(a5,d0.w) ; Insert letter 'T'.
-
- create_file:
- move.w #0, -(sp) ; File attribute = read/write.
- pea filename ; Will be name of spawned process + .DAT.
- move.w #$3C, -(sp) ; Function = f_create = GEMDOS $3C.
- trap #1 ; File handle is returned in D0.
- addq.l #8, sp
- lea file_handle, a0
- move.w d0, (a0)
-
- redirect_output: ; Exchange file handle with screen's handle.
- move.w file_handle, -(sp) ; This is the disk file's handle.
- move.w #1, -(sp) ; This is the video screen's handle.
- move.w #$46, -(sp) ; Function = f_force = GEMDOS $46.
- trap #1
- addq.l #6, sp
-
- ; NOTE: In order to increase the accuracy of the start time, the stack is
- ; prepared for the spawning process, then, just before trap #1 is
- ; invoked, custom trap #3 is invoked and the start time is saved.
-
- prepare_stack_for_load_and_execute_program:
- pea environ_string
- pea command_line
- pea program_name
- move.w #0, -(sp)
- move.w #$4B, -(sp) ; Function = GEMDOS $4B = p_exec.
-
- get_start_time:
- lea start_time, a3 ; Fetch address of variable "start_time".
- trap #3 ; Returns value of system clock in D0.
- move.w d0, (a3) ; Save start time.
- load_and_execute_program:
- trap #1
- move.w d0, d3 ; Copy after-load value to D3 for calculation.
-
- get_end_time:
- trap #3 ; Returns value of system clock in D0.
- move.w d0, d5 ; Copy to D5 for calculation.
- sub.w d3, d5 ; Subtract after-load time from end time.
- ext.l d5 ; Extend to 32 bits.
-
- reposition_stack_pointer:
- lea $10(sp), sp
-
- get_drive:
- move.w #$19, -(sp) ; Function = dgetdrv = GEMDOS $19.
- trap #1
- addq.l #2, sp
- add.b #'A', d0
- lea drive, a0
- move.b d0, (a0)
-
- print_heading:
- lea heading, a0
- bsr print_string
- lea program_name, a0
- bsr print_string
- print_drive_for_spawned_program:
- lea drive_msg, a0
- bsr print_string
-
- compute_load_time:
- lea load_time_msg, a0
- bsr print_string
- lea start_time, a3
- sub.w (a3), d3 ; Subtract start time from after-load time.
- ext.l d3 ; Extend to 32 bits.
- trap #9 ; See description in TRAPS.S.
-
- close_file:
- move.w file_handle, -(sp)
- move.w #$3E, -(sp) ; Function = fclose = GEMDOS $3E.
- trap #1
- addq.l #4, sp
-
- terminate:
- move.w #0, -(sp)
- trap #1
-
- print_string: ; Expects address of string to be in A0.
- move.l a0, -(sp) ; Push address of string onto stack.
- move.w #9, -(sp) ; Function = c_conws = GEMDOS $9.
- trap #1 ; GEMDOS call
- addq.l #6, sp ; Reset stack pointer to top of stack.
- rts
-
- data
- heading: dc.b $D,$A,"SPEED_2.TTP Execution Results",$D,$A
- dc.b "for ",0
- drive_msg: dc.b ", loaded from drive: "
- drive: dc.b "A",$D,$A,0
- load_time_msg: dc.b $D,$A," Load time: ",0
- environ_string: dc.b "TERM",0
- command_line: dc.b 0
- align
- bss
- start_time: ds.w 1
- file_handle: ds.w 1
- input_line: ds.b 1
- program_name: ds.b 15 ; Program name buffer.
- output_line: ds.b 1
- filename: ds.b 15 ; Filename buffer.
- ds.l 96 ; Program stack.
- stack: ds.l 0 ; Address of program stack.
- program_end: ds.l 0
- end
-
-
- SPEED_2.TTP Execution Results
-
- PRG_5AP.TOS Execution Results
-
- When executed from the desktop, this program will print this string on the
- video screen and pause for a keypress. But, when this program is spawned by
- SPEED_1, SPEED_2, SPEED_3 or SPEEDTST, the string will be stored in a file
- named PRG_5AP.DAT and the program will not pause for a keypress.
-
- SPEED_2.TTP Execution Results
- for PRG_5AP.TOS, loaded from drive: G
-
- Load time: 45 milliseconds
- Execute time: 680 milliseconds
-
-
- Program 19 serves as the final model to be considered
- in the development of SPEEDTST.TTP. Within it, a new
- command line processing algorithm is developed and the user
- declared stack is discarded. As in the two previous models,
- the command line algorithm must prepare two strings: one to
- be used as the name of a disk file, the other to be passed
- as a parameter when GEMDOS $4B is invoked. The latter
- string is also printed as part of the utility's output
- heading. Because that string is already present in the
- basepage command line, the algorithm in this model does not
- copy it. Instead, the string is altered in place: the
- carriage return that follows it is overwritten with a NULL.
- The only copying still required is that which prepares the
- filename string.
-
-
- Program 19. The final program model in SPEEDTST.TTP's
- development.
-
- ; Program Name: SPEED_3.S
- ; Version 1.004
-
- ; NOTE: This program is similar to SPEED_2. The differences are that a
- ; different algorithm is used to process the command line, and no user
- ; stack is declared in SPEED_3.
-
- ; Assembly Instructions:
-
- ; Assemble in "PC-relative" mode and save with a TTP extension.
-
- ; Function:
-
- ; Spawn a process and calculate the spawned program's load and execution
- ; times. Pause for a keypress before terminating.
-
- release_excess_memory:
- lea program_end, a0 ; Put "end of program" address in A0.
- movea.l 4(a7), a1 ; Put "basepage" address in A1.
- lea $80(a1), a3 ; Put "command line" address in A3.
- trap #6 ; Calculate program size and release memory.
-
- ; NOTE: A local stack is not declared in PRG_5AP.TOS. Because of the long
- ; string that is printed by that program, this program will bomb when
- ; it spawns PRG_5AP.TOS, if a local stack is not declared here.
-
- lea stack, a7 ; Point A7 to this program's stack.
-
- ; COMMAND LINE PROCESSING NOTE
-
- ; At this point register A3 contains the address of the command line.
- ; In the algorithm below, the address of the first ASCII character in the
- ; command line input is stored at the pointer "program_name". Then a NULL
- ; character is written over the carriage return code at the end of the
- ; command line input. Thus the command line input itself becomes the
- ; string, the address of which must be pushed on the stack during the p_exec
- ; invocation.
-
- ; Even though register A3 contains the address of the program name string,
- ; and the contents of A3 can be pushed during the p_exec invocation, the
- ; address of the string must be stored in a declared location because
- ; register A3 might be used by the spawned program. And the address of the
- ; string is still needed to print the spawned program's name in SPEED_3's
- ; output heading.
-
- process_command_line:
- lea command_line, a4 ; Fetch location to contain command line.
- movem.l (a3), d0-d3 ; Move 16 bytes of command line to 4 registers.
- movem.l d0-d3, (a4) ; Move them to address "command_line".
- move.b (a3)+, d0 ; Fetch parameter line input character count.
- ext.w d0 ; Extend to word for next instruction.
- move.b #0, 1(a4,d0.w) ; Store a null at end of string.
- lea program_name, a0 ; Fetch address of pointer to program name.
- move.l a3, (a0) ; Store address of program name in pointer.
- move.b #0, 0(a3,d0.w) ; Replace $0D at end of program name with NULL.
-
- insert_filename_suffix:
- move.b #$44, -2(a4,d0.w) ; Insert letter 'D'.
- move.b #$41, -1(a4,d0.w) ; Insert letter 'A'.
- move.b #$54, 0(a4,d0.w) ; Insert letter 'T'.
-
- create_file:
- move.w #0, -(sp) ; File attribute = read/write.
- pea filename ; Will be name of spawned process + .DAT.
- move.w #$3C, -(sp) ; Function = f_create = GEMDOS $3C.
- trap #1 ; File handle is returned in D0.
- addq.l #8, sp
- lea file_handle, a0
- move.w d0, (a0)
-
- redirect_output: ; Exchange file handle with screen's handle.
- move.w file_handle, -(sp) ; This is the disk file's handle.
- move.w #1, -(sp) ; This is the video screen's handle.
- move.w #$46, -(sp) ; Function = f_force = GEMDOS $46.
- trap #1
- addq.l #6, sp
-
- prepare_stack_for_load_and_execute_program:
- pea environ_string
- pea command_string
- pea (a3) ; Push address of program name string.
- move.w #0, -(sp)
- move.w #$4B, -(sp) ; Function = GEMDOS $4B = p_exec.
-
- get_start_time:
- lea start_time, a3 ; Fetch address of variable "start_time".
- trap #3 ; Returns value of system clock in D0.
- move.w d0, (a3) ; Save start time.
- load_and_execute_program:
- trap #1
- move.w d0, d3 ; Copy after-load value to D3 for calculation.
-
- get_end_time:
- trap #3 ; Returns value of system clock in D0.
- move.w d0, d5 ; Copy to D5 for calculation.
- sub.w d3, d5 ; Subtract after-load time from end time.
- ext.l d5 ; Extend to 32 bits.
-
- reposition_stack_pointer:
- lea $10(sp), sp
-
- get_drive:
- move.w #$19, -(sp) ; Function = dgetdrv = GEMDOS $19.
- trap #1
- addq.l #2, sp
- add.b #'A', d0
- lea drive, a0
- move.b d0, (a0)
-
- print_heading:
- lea heading, a0
- bsr print_string
- lea program_name, a0 ; Fetch address of program name string.
- movea.l (a0), a0
- bsr print_string ; Print spawned program's name.
- print_drive_for_spawned_program:
- lea drive_msg, a0 ; Print drive from which spawned program was
- bsr print_string ; loaded.
-
- compute_load_time:
- lea load_time_msg, a0
- bsr print_string
- lea start_time, a3
- sub.w (a3), d3 ; Subtract start time from after-load time.
- ext.l d3 ; Extend to 32 bits.
- trap #9 ; See description in TRAPS.S.
-
- close_file:
- move.w file_handle, -(sp)
- move.w #$3E, -(sp) ; Function = fclose = GEMDOS $3E.
- trap #1
- addq.l #4, sp
-
- terminate:
- move.w #0, -(sp)
- trap #1
-
- print_string: ; Expects address of string to be in A0.
- move.l a0, -(sp) ; Push address of string onto stack.
- move.w #9, -(sp) ; Function = c_conws = GEMDOS $9.
- trap #1 ; GEMDOS call
- addq.l #6, sp ; Reset stack pointer to top of stack.
- rts
-
- data
- heading: dc.b $D,$A,"SPEED_3.TTP Execution Results",$D,$A
- dc.b "for ",0
- drive_msg: dc.b ", loaded from drive: "
- drive: dc.b "A",$D,$A,0
- load_time_msg: dc.b $D,$A," Load time: ",0
- environ_string: dc.b "TERM",0
- command_string: dc.b 0
- align
- bss
- start_time: ds.w 1
- program_name: ds.l 1 ; Pointer to string in basepage command line.
- file_handle: ds.w 1
- command_line: ds.b 1 ; Unused character count will go here.
- filename: ds.b 15 ; File name for redirected output.
- ds.l 96 ; Program stack.
- stack: ds.l 0 ; Address of program stack.
- program_end: ds.l 0
- end
-
-
- SPEED_3.TTP Execution Results
-
- PRG_5AP.TOS Execution Results
-
- When executed from the desktop, this program will print this string on the
- video screen and pause for a keypress. But, when this program is spawned by
- SPEED_1, SPEED_2, SPEED_3 or SPEEDTST, the string will be stored in a file
- named PRG_5AP.DAT and the program will not pause for a keypress.
-
- SPEED_3.TTP Execution Results
- for PRG_5AP.TOS, loaded from drive: G
-
- Load time: 40 milliseconds
- Execute time: 685 milliseconds
-
-
- Each of the three programs, SPEED_1, SPEED_2 and
- SPEED_3, is a model which fixes attention on a particular
- phase of a continuous development cycle. It would be very
- difficult, if not impossible, to pause as each instruction
- of each algorithm is chosen in order to describe the
- creative processes which instigate the choice. Furthermore,
- the algorithmic development process is rhythmically
- recursive. At intervals, the duration of which is dictated
- by personal education and experience, the programmer is
- drawn back to the beginning of the process to verify what
- has been done and, perhaps, to refine portions of the
- product.
- The final stage of the development process involves an
- assimilation of the best features of the three programs into
- a utility that is as fast as possible, while consuming
- minimum requisite memory. In order to choose the best
- command line processing algorithm from the three that were
- introduced in the models, a comparison of their relative
- speeds and requisite memory is needed. Program 20 was
- written to perform that chore.
-
- Program 20. This program was used to compare the speeds of
- the command line processing algorithms used in programs 16,
- 18 and 19.
-
- ; Program Name: CMD_TEST.S
- ; Version 1.004
-
- ; Assembly Instructions:
-
- ; Assemble in "PC-relative" mode and save with a TOS extension.
-
- ; Execution Instructions:
-
- ; Execute program CMD_TEST.TOS from the desktop. After reading the
- ; program's output on the screen, terminate execution by pressing the
- ; Return key.
-
- ; Function:
-
- ; This program is used to compare the relative speed of the command
- ; line processing algorithms used in SPEED_1, SPEED_2 and SPEED_3.
-
- ; Description:
-
- ; Three command line processing algorithms are executed 10,000 times.
- ; The elapsed time and requisite memory for each algorithm are printed to
- ; the screen. So that this program need not be executed as a TTP program,
- ; the command line is salted with a declared string.
-
- release_excess_memory:
- lea program_end, a0 ; Put "end of program" address in A0.
- movea.l 4(a7), a1 ; Put "basepage" address in A1.
- movea.l a1, a5 ; Copy to A5 for command line access.
- trap #6 ; Calculate program size and release memory.
- lea stack, a7 ; Point A7 to this program's stack.
-
- mainline:
- lea heading, a0
- bsr print_string
-
- salt_command_line:
- lea salt, a0 ; Fetch pointer to ersatz command line input.
- movem.l (a0), d0-d3 ; Move it to registers.
- movem.l d0-d3, $80(a5) ; Copy to actual command line address.
-
- speed_1_algorithm:
- lea speed_1_msg, a0
- bsr print_string
- move.l #9999, d4 ; Initialize counter for 10000 executions.
- trap #3 ; Get start time.
- move.l d0, d5 ; Copy for calculations.
- speed_1_loop:
- lea $80(a5), a4 ; Fetch address of parameters.
- move.b (a4)+, d0 ; Fetch parameter line character count.
- lea program_name_1, a3 ; Load buffer address.
- subq.b #1, d0 ; Set up counter.
- ext.w d0 ; Extend to match the size of the dbra
- ; instruction.
- fetch_character:
- move.b (a4)+, (a3)+ ; Store character.
- dbra d0, fetch_character ; Loop until D0 becomes negative.
- move.b #0, (a3) ; Finish with a NULL.
- create_file_name: ; Create a file to accept standard output.
- lea filename_1, a4 ; Load buffer address.
- lea program_name_1, a3 ; Load buffer address.
- copy_name:
- move.b (a3)+, (a4)+
- cmpi.b #$2E, (a3) ; Is next byte of program_name the period?
- bne.s copy_name ; Continue looping until period is seen.
- move.b #$2E, (a4)+ ; Add a period.
- move.b #$44, (a4)+ ; Add letter 'D'.
- move.b #$41, (a4)+ ; Add letter 'A'.
- move.b #$54, (a4)+ ; Add letter 'T'.
- move.b #0, (a4) ; Add a NULL.
- speed_1_memory:
- dbra d4, speed_1_loop ; Loop until D4 becomes negative.
- trap #3 ; Get end time.
- bsr convert_and_print_time
-
- speed_2_algorithm:
- lea speed_2_msg, a0
- bsr print_string
- move.l #9999, d4 ; Initialize counter for 10000 executions.
- trap #3 ; Get start time.
- move.l d0, d5 ; Copy for calculations.
- speed_2_loop:
- lea input_line, a3 ; Fetch location to contain command line.
- lea output_line_2, a4 ; A second location: for filename.
- movem.l $80(a5), d0-d3 ; Move 16 bytes of command line to 4 registers.
- movem.l d0-d3, (a3) ; Move them to address "input_line".
- movem.l d0-d3, (a4) ; Move them to address "output_line_2".
- move.b $80(a5), d0 ; Fetch command line ASCII character count.
- ext.w d0 ; Extend to word for next instruction.
- move.b #0, 1(a3,d0.w) ; Store a null at end of command line input.
- move.b #0, 1(a4,d0.w) ; Same for filename buffer.
- insert_filename_suffix:
- move.b #$44, -2(a4,d0.w) ; Insert letter 'D'.
- move.b #$41, -1(a4,d0.w) ; Insert letter 'A'.
- move.b #$54, 0(a4,d0.w) ; Insert letter 'T'.
- speed_2_memory:
- dbra d4, speed_2_loop ; Loop until D4 becomes negative.
- trap #3 ; Get end time.
- bsr convert_and_print_time
-
- speed_3_algorithm:
- lea speed_3_msg, a0
- bsr print_string
- move.l #9999, d4 ; Initialize counter for 10000 executions.
- lea $80(a5), a5 ; Fetch command line address.
- trap #3 ; Get start time.
- move.l d0, d5 ; Copy for calculations.
- speed_3_loop:
-
- ; NOTE: The first instruction, below, is not used in the actual SPEED_3
- ; algorithm, but it must be included here to reset A3 to the
- ; correct address each time through the loop. This instruction
- ; adds 4 clock periods per loop, 40000 clock periods for the
- ; 10000 loops, which is 5 milliseconds. The accuracy of this error
- ; calculation was confirmed by executing CMD_TEST.TOS with and
- ; without the instruction in the loop. The 5 msec error is equal to
- ; one system clock tick, therefore, when the loop end-time is obtained
- ; with the trap #3 invocation, 1 clock tick is subtracted before the
- ; loop time is calculated.
-
- ; The memory occupied by this instruction is not included in the
- ; value reported for the algorithm's requisite memory.
-
- movea.l a5, a3
- start_memory:
- lea output_line_3, a4 ; Fetch location to contain command line.
- movem.l (a3), d0-d3 ; Move 16 bytes of command line to 4 registers.
- movem.l d0-d3, (a4) ; Move them to address "output_line_3".
- move.b (a3)+, d0 ; Fetch command line ASCII character count.
- ext.w d0 ; Extend to word for next instruction.
- move.b #0, 1(a4,d0.w) ; Store a null at end of string.
- lea program_name_ptr, a0 ; Fetch address of pointer to program name.
- move.l a3, (a0) ; Store address of program name string in pointer.
- move.b #0, 0(a3,d0.w) ; Replace $0D at end of program name with NULL.
-
- _insert_filename_suffix:
- move.b #$44, -2(a4,d0.w) ; Insert letter 'D'.
- move.b #$41, -1(a4,d0.w) ; Insert letter 'A'.
- move.b #$54, 0(a4,d0.w) ; Insert letter 'T'.
- speed_3_memory:
- dbra d4, speed_3_loop
- trap #3
- subq.w #1, d0 ; Subtract 1 clock tick to correct time.
- bsr.s convert_and_print_time
-
- speed_1_requisite_memory:
- lea speed_1_memory_msg, a0
- bsr.s print_string
- lea speed_1_loop, a1 ; Calculate number of bytes occupied by the
- lea speed_1_memory, a0 ; instructions in the loop, then print.
- bsr.s calculate_and_print_requisite_memory
-
- speed_2_requisite_memory:
- lea speed_2_memory_msg, a0
- bsr.s print_string
- lea speed_2_loop, a1 ; Calculate number of bytes occupied by the
- lea speed_2_memory, a0 ; instructions in the loop, then print.
- bsr.s calculate_and_print_requisite_memory
-
- speed_3_requisite_memory:
- lea speed_3_memory_msg, a0
- bsr print_string
- lea start_memory, a1 ; Calculate number of bytes occupied by the
- lea speed_3_memory, a0 ; instructions in the loop, then print.
- bsr.s calculate_and_print_requisite_memory
-
- wait_for_keypress:
- move.w #8, -(sp) ; Function = c_necin = GEMDOS $8.
- trap #1 ; GEMDOS call.
- addq.l #2, sp ; Reposition stack pointer at top of stack.
-
- terminate:
- move.w #0, -(sp)
- trap #1
-
- print_string: ; Expects address of string to be in A0.
- pea (a0) ; Push address of string onto stack.
- move.w #9, -(sp) ; Function = c_conws = GEMDOS $9.
- trap #1 ; GEMDOS call
- addq.l #6, sp ; Reset stack pointer to top of stack.
- rts
-
- convert_and_print_time:
- sub.l d5, d0 ; Subtract start time from end time.
- mulu #5, d0 ; Convert to milliseconds.
- move.l d0, d1 ; Convert to ASCII decimal.
- trap #4
- bsr print_string
- lea time_label, a0
- bsr print_string
- rts
-
- calculate_and_print_requisite_memory:
- suba.l a1, a0
- move.l a0, d1 ; Transfer requisite memory for trap call.
- print_speed_1_requisite_memory:
- trap #4 ; Returns address of decimal string in A0.
- bsr print_string
- lea memory_label, a0
- bsr print_string
- rts
-
- data
- salt: dc.b $B,"PRG_5AP.TOS",$D,0,0,0,0
- heading: dc.b $D,$A,"CMD_TEST Execution Results",$D,$A,$D,$A,0
- speed_1_msg: dc.b " SPEED_1 algorithm time: ",0
- speed_2_msg: dc.b " SPEED_2 algorithm time: ",0
- speed_3_msg: dc.b " SPEED_3 algorithm time: ",0
- time_label: dc.b " milliseconds",$D,$A,0
- speed_1_memory_msg: dc.b $D,$A," SPEED_1 algorithm requisite memory: ",0
- speed_2_memory_msg: dc.b " SPEED_2 algorithm requisite memory: ",0
- speed_3_memory_msg: dc.b " SPEED_3 algorithm requisite memory: ",0
- memory_label: dc.b " bytes",$D,$A,0
- align
- bss
- program_name_1: ds.l 4 ; Program name buffer for SPEED_1 algorithm.
- filename_1: ds.l 4 ; Filename buffer for SPEED_1 algorithm.
- input_line: ds.b 1 ; Command line buffer for SPEED_2 algorithm.
- program_name_2: ds.b 15 ; Program name buffer for SPEED_2 algorithm.
- output_line_2: ds.b 1 ; Second command line buffer for SPEED_2.
- filename_2: ds.b 15 ; Filename buffer for SPEED_2 algorithm.
- program_name_ptr: ds.l 1 ; Pointer to program name in command line for SPEED_3.
- output_line_3: ds.b 1 ; Command line buffer for SPEED_3 algorithm.
- filename_3: ds.b 15 ; Filename buffer for SPEED_3 algorithm.
- ds.l 96 ; Program stack.
- stack: ds.l 0 ; Address of program stack.
- program_end: ds.l 0
- end
-
-
- CMD_TEST Execution Results
-
- SPEED_1 algorithm time: 830 milliseconds
- SPEED_2 algorithm time: 350 milliseconds
- SPEED_3 algorithm time: 300 milliseconds
-
- SPEED_1 algorithm requisite memory: 60 bytes
- SPEED_2 algorithm requisite memory: 58 bytes
- SPEED_3 algorithm requisite memory: 52 bytes
-
-
- Authenticating The Results
-
- Because the final configuration of the utility will
- depend primarily on the results displayed by CMD_TEST.TOS,
- the validity of those results must be beyond question. I
- have used three validation techniques. First, I single-
- stepped through each instruction. Then I verified the data
- written by the program to its basepage command line and bss
- section. Finally, I compared the execution times reported
- to values calculated using the Motorola Programmer's
- Reference Manual.
- Figure 5.2 is a partial disassembly of program 20 as it
- was in memory after execution. There you can see that the
- basepage command line contains the salt data and has been
- altered as specified by the SPEED_3 command line processing
- algorithm. To wit: the carriage return has been replaced by
- a NULL. Also evident are the strings stored in the bss
- segment by the three algorithms. Table 5.1 lists the
- relevant declared variables, their lengths and their
- addresses in the disassembly listing.
-
- Table 5.1 Match the variable names listed and their
- addresses to the data shown in the disassembly listing.
-
- Variable Length Address
-
- program_name_1: ds.l 4 $0919C8
- filename_1: ds.l 4 $0919D8
- input_line: ds.b 1 $0919E8
- program_name_2: ds.b 15 $0919E9
- output_line_2: ds.b 1 $0919F8
- filename_2: ds.b 15 $0919F9
- program_name_ptr: ds.l 1 $091A08
- output_line_3: ds.b 1 $091A0C
- filename_3: ds.b 15 $091A0D
-
-
- Figure 5.2. Partial disassembly of CMD_TEST.TOS after
- execution, showing the basepage command line and the
- portion of the bss section relevant to the command line.
-
-
- Table 5.2 lists the instructions used in each of the
- command line processing algorithms and their required
- execution clock periods as specified in the Motorola manual.
- I want to give you the values I calculated for two reasons;
- first, to show you that it can be done from the tables in
- the Motorola guide; second, to serve as verification that
- program 20 performs its task accurately. If you desire, you
- can use this table to practice your interpretation of the
- data in the Motorola tables. A short tutorial follows the
- table.
-
- Table 5.2 The instructions used in command line processing
- algorithms of SPEED_1, SPEED_2 and SPEED_3.
-
- Instruction                          Clock Periods
-
- speed_1_loop:
-     lea     $80(a5), a4                    8
-     move.b  (a4)+, d0                      8
-     lea     program_name_1, a3             8
-     subq.b  #1, d0                         4
-     ext.w   d0                             4
-     Total = sum of 5 = 32
-
- fetch_character:
-     (There are 11 characters.)
-     move.b  (a4)+, (a3)+                  12
-     dbra    d0, fetch_character           10/14
-     Total = 11(12) + 10(10) + 14 = 246
-
-     move.b  #0, (a3)                      12
- create_file_name:
-     lea     filename_1, a4                 8
-     lea     program_name_1, a3             8
-     Total = sum of 3 = 28
-
- copy_name:
-     (There are 7 characters.)
-     move.b  (a3)+, (a4)+                  12
-     cmpi.b  #$2E, (a3)                    12
-     bne.s   copy_name                     10/8
-     Total = 7(24) + 6(10) + 8 = 236
-
-     move.b  #$2E, (a4)+                   12
-     move.b  #$44, (a4)+                   12
-     move.b  #$41, (a4)+                   12
-     move.b  #$54, (a4)+                   12
-     move.b  #0, (a4)                      12
-     Total = sum of 5 = 60
-
-     Algorithm total = 32 + 246 + 28 + 236 + 60 = 602
- speed_1_memory:
-     dbra    d4, speed_1_loop
-
- speed_2_loop:
-     lea     input_line, a3                 8
-     lea     output_line_2, a4              8
-     movem.l $80(a5), d0-d3                16+8(4) = 48
-     movem.l d0-d3, (a3)                    8+8(4) = 40
-     movem.l d0-d3, (a4)                    8+8(4) = 40
-     move.b  $80(a5), d0                   12
-     ext.w   d0                             4
-     move.b  #0, 1(a3,d0.w)                18
-     move.b  #0, 1(a4,d0.w)                18
- insert_filename_suffix:
-     move.b  #$44, -2(a4,d0.w)             18
-     move.b  #$41, -1(a4,d0.w)             18
-     move.b  #$54, 0(a4,d0.w)              18
-
-     Algorithm total = sum of 12 = 250
- speed_2_memory:
-     dbra    d4, speed_2_loop
-
- speed_3_loop:
-     movea.l a5, a3                        (not counted)
- start_memory:
-     lea     output_line_3, a4              8
-     movem.l (a3), d0-d3                   12+8(4) = 44
-     movem.l d0-d3, (a4)                    8+8(4) = 40
-     move.b  (a3)+, d0                      8
-     ext.w   d0                             4
-     move.b  #0, 1(a4,d0.w)                18
-     lea     program_name_ptr, a0           8
-     move.l  a3, (a0)                      12
-     move.b  #0, 0(a3,d0.w)                18
-
- _insert_filename_suffix:
-     move.b  #$44, -2(a4,d0.w)             18
-     move.b  #$41, -1(a4,d0.w)             18
-     move.b  #$54, 0(a4,d0.w)              18
-
-     Algorithm total = sum of 12 = 214
- speed_3_memory:
-     dbra    d4, speed_3_loop
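-
- If you want to check my sums in table 5.2 without a pencil,
- a few lines of Python will do it. This is only a sketch of
- the arithmetic; the helper loop_total and the grouping of
- the values are mine, and the clock periods are copied from
- the table. The outer dbra d4 instructions are not counted,
- just as they are not counted in the table.
-
```python
def loop_total(body, passes, taken, not_taken):
    """Clock periods for a loop whose body costs `body` per pass and whose
    closing branch is taken (passes - 1) times and falls through once."""
    return passes * body + (passes - 1) * taken + not_taken

speed_1 = (
    sum([8, 8, 8, 4, 4])            # speed_1_loop setup
    + loop_total(12, 11, 10, 14)    # fetch_character, 11 characters
    + sum([12, 8, 8])               # null terminator and two lea
    + loop_total(24, 7, 10, 8)      # copy_name, 7 characters
    + 5 * 12                        # ".DAT" suffix and null terminator
)
speed_2 = sum([8, 8, 48, 40, 40, 12, 4] + [18] * 5)
speed_3 = sum([8, 44, 40, 8, 4, 18, 8, 12, 18] + [18] * 3)

print(speed_1, speed_2, speed_3)    # 602 250 214
```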
-
-
- Instruction Execution Times Tutorial
-
- In my copy of the M68000 Programmer's Reference Manual,
- which may not be the same as yours, MC68000 instruction
- execution times are presented in Appendix D. Times for the
- MC68008 are presented in Appendix E, and those for the
- MC68010/MC68012 are in Appendix F. I am pointing out the
- locations for the other processors so that you can avoid
- them. When you are looking for MC68000 times, make sure
- that you are doing so in Appendix D.
- The Introduction to the appendix contains information
- concerning wait states that is not applicable to the Atari
- ST. The only items in the introduction which concern us
- are the notes stating that the instruction execution times
- are given in terms of external (system) clock periods and
- that the number of periods includes instruction fetch and
- all applicable operand fetches and stores. The ST's clock
- period is 1 divided by 8,000,000 = .000000125 second = 1.25
- x 10^-7 sec, because the system operates with an 8 megahertz
- (MHz) clock.
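-
- The conversion from clock periods to time is worth keeping
- at hand. Here is a minimal sketch, assuming the 8 megahertz
- clock just described; the names are mine. Working in whole
- nanoseconds avoids any rounding, since one period is exactly
- 125 nanoseconds. The value 602 is the SPEED_1 total from
- table 5.2.
-
```python
CLOCK_HZ = 8_000_000                        # the ST's system clock
NS_PER_PERIOD = 1_000_000_000 // CLOCK_HZ   # 125 nanoseconds, exactly

def periods_to_ns(periods):
    """Convert a count of clock periods to nanoseconds."""
    return periods * NS_PER_PERIOD

print(NS_PER_PERIOD)         # 125
print(periods_to_ns(8))      # 1000 ns = 1 microsecond
print(periods_to_ns(602))    # 75250 ns, about 75 microseconds
```
-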
- The first table, D-1, lists the Effective Address
- Calculation Times for the addressing modes. This table is
- one to which you must refer back when so directed by other
- tables. Reference is made to this table via a + sign
- following the number of clock periods given for a particular
- instruction. The reference indicates that you should go
- back to table D-1, fetch the appropriate time for the
- appropriate addressing mode and data length (byte, word or
- long) and add that time to the number of clock periods
- preceding the + sign.
- The other tables list base times for the instructions;
- I say base because of the need to add an effective address
- time for many instructions. The tables are arranged so that
- data is presented for groups of similar instructions. You
- use these tables by finding the one which lists the
- instruction of interest; then, if a source operand is
- involved, you locate the row specified by the source
- operand, if there are rows of source operands; then, if a
- destination operand is involved, you locate the column
- specified by the destination operand, if there are columns
- of destination operands; then, locate the data at the row-
- column intersection; then, if a + sign follows the data, go
- back to table D-1, fetch the effective address time and add
- it to the data.
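-
- The lookup procedure just described amounts to one addition.
- The sketch below shows it with a handful of rows transcribed
- from table D-1; it is not the whole table, and the names
- EA_TIME and execution_time are mine. The three examples use
- base times that are discussed later in this tutorial.
-
```python
# A few rows of table D-1, transcribed as (byte/word, long) clock periods.
EA_TIME = {
    "Dn":      (0, 0),
    "An":      (0, 0),
    "(An)":    (4, 8),
    "(An)+":   (4, 8),
    "-(An)":   (6, 10),
    "d16(An)": (8, 12),
}

def execution_time(base, plus_sign, mode=None, size="b"):
    """Return base, plus the table D-1 time when a + sign follows base.

    size is "b", "w" or "l"; mode names a key of EA_TIME.
    """
    if not plus_sign:
        return base
    byte_word_time, long_time = EA_TIME[mode]
    return base + (long_time if size == "l" else byte_word_time)

print(execution_time(8, False))                # lea $80(a5), a4   -> 8
print(execution_time(8, True, "(An)", "b"))    # cmpi.b #$2E, (a3) -> 12
print(execution_time(12, True, "(An)", "b"))   # bset #5, (sp)     -> 16
```
-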
- When you have done all of that, you will have the
- instruction's execution time in clock periods. Not all
- instructions contain both a source operand and a destination
- operand. Not all of the tables explicitly reference both
- operand types. Not all tables list destination operands in
- columns; some of them list source operands in columns.
- Therefore, your mind must be on what you are doing when you
- are reading the tables. For example, times for ADD/ADDA,
- AND, CMP/CMPA, DIVS, DIVU, EOR, MULS, MULU, OR and SUB are
- listed in table D-4. Each time given is followed by a + sign;
- therefore, an effective address time from table D-1 is
- needed for each item in the table.
- You might ask, "Does the data in table D-4 pertain to
- source operands or destination operands? Does the reference
- to table D-1 pertain to source operands or destination
- operands?" The answer to both questions is, "Yes," because
- these instructions can be written so that the effective
- addresses which head the columned data in table D-4 can be
- either source or destination operands. To see what
- I mean, look at the Assembler Syntax for the ADD
- instruction. There you see the following notation:
-
- ADD <ea>,Dn
-
- ADD Dn,<ea>.
-
- The reason that Motorola's use of the term effective
- address is confusing is that, in their manual, all
- addressing modes are discussed as if the locations specified
- by operands were somehow implied by operand format. It seems
- as though the authors of the manual had originally intended
- that the term effective address be used to indicate a
- location specified by an operand to be ultimately found in
- memory external to the processor, in contrast to processor
- registers, which are internal addresses.
- But, in fact, even when discussing Register Direct
- Modes, the manual states, "These effective addressing modes
- specify that the operand is in one of the 16 multifunction
- registers.". So, I say, let it all be effective addresses,
- as the authors apparently decided to do. But then the
- descriptive effective is redundant, and it renders the
- instruction, add effective address calculation time, which
- is indicated by the + sign, ineffective.
- What that instruction should instruct one to do is
- this: for the appropriate operand, add the additional time
- indicated in table D-1 for the appropriate addressing mode.
- Of course, one must determine the appropriate operand and
- the appropriate addressing mode. But this must be done
- regardless of terminology. However, the manual does not
- make that clear, nor does it indicate the manner in which it
- can be accomplished. I shall.
- Using instructions selected from those listed in table
- 5.2, I will conclude this tutorial by showing you how I
- obtained the execution times for those instructions; then I
- will show you how to obtain the time for at least one
- instruction listed in the Motorola tables which none of the
- instructions in table 5.2 access. I think that the
- exploration will be sufficiently comprehensive.
-
- lea input_line, a3
-
- The lea instruction is found in table D-10, JMP, JSR,
- LEA, PEA, and MOVEM Instruction Execution Times. For all of
- these instructions, the destination operand is implicit: for
- JMP and JSR the destination is the program counter (PC); for
- LEA the destination is an address register; for PEA the
- destination is a stack; for MOVEM (M->R, memory to
- registers) the destination is a register group; for MOVEM
- (R->M, registers to memory) the destination is a group of
- memory addresses. This means that the columns containing
- times in the table refer to source operands.
- The source operand for lea input_line, a3 is a label,
- therefore, the addressing mode used might seem to be
- absolute, but the program in which the instruction is used
- was assembled in AssemPro's PC-relative mode. Therefore,
- the addressing mode is program counter with displacement.
- The execution time for the instruction is found where the
- LEA row intersects with the d16(PC) column. The time is 8
- clock periods.
- Eight clock periods translates to 8(.000000125 sec) =
- .000001 sec = 1 microsecond = .001 msec. As you can see,
- this is a very short period of time. It is not possible to
- measure times that are this short with a clock that has a
- resolution of 5 msec. That's why it is necessary to execute
- instructions and entire algorithms within loops that extend
- the time period being measured. A time period being
- measured with the system clock should be sufficiently long
- to render the 5 msec resolution of the clock insignificant.
- Because the loops which execute the algorithms many
- times contain branching overhead, it is easier to compare
- relative execution times, instead of absolute execution
- times, when performing the comparisons with computer
- generated data. When absolute times are desired, it is
- easier to compute them using the tables in the Motorola
- manual.
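-
- To get a feel for how long a timing loop must be, consider
- the following sketch. It asks how many passes are needed
- before one 5 msec tick of the clock is no more than a chosen
- percentage of the total time measured. The 1 percent figure
- is only an assumption for illustration, the loop's own
- branching overhead is ignored, and the names are mine.
-
```python
NS_PER_PERIOD = 125                 # one clock period at 8 megahertz
TIMER_RESOLUTION_NS = 5_000_000     # one tick of the 200 hertz clock

def passes_needed(periods_per_pass, max_error_percent):
    """Smallest pass count for which one timer tick is no more than
    max_error_percent of the total time being measured."""
    target_ns = TIMER_RESOLUTION_NS * 100 // max_error_percent
    pass_ns = periods_per_pass * NS_PER_PERIOD
    return -(-target_ns // pass_ns)     # integer ceiling division

print(passes_needed(8, 1))      # a single lea: 500000 passes
print(passes_needed(602, 1))    # SPEED_1's algorithm: 6645 passes
```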
-
- lea $80(a5), a4
-
- Here the addressing mode of the source operand is
- address register indirect with displacement. The execution
- time for the instruction is found at the point of
- intersection specified by the LEA row and the d16(An)
- column. The time is 8 clock periods.
-
- movem.l $80(a5), d0-d3
-
- The row labeled MOVEM M->R is specified for this
- instruction. Furthermore, this row is divided into two
- subrows: Word and Long. The instruction specifies a
- longword operation, so the Long subrow must be used. The
- source operand uses address register indirect with
- displacement addressing = d16(An).
- For this instruction, the data found at the
- intersection of the specified row and column is not the
- instruction execution time. Instead, there is a formula
- from which the execution time must be calculated. The
- parameter n specified by the formula is a variable for the
- number of registers specified in the instruction. In this
- case the transfer from memory is to use 4 registers. The
- instruction execution time is 16 + 8(4) = 48 clock periods.
-
- movem.l d0-d3, (a3)
-
- Refer to the row labeled MOVEM R->M in the D-10 table.
- The formula shown at the intersection of the Long subrow and
- (An) column is similar to that for the MOVEM M->R
- instruction. The instruction execution time is 8 + 8(4) =
- 40 clock periods.
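-
- Both MOVEM formulas have the same shape, a base time plus 8
- clock periods for each longword register moved. A small
- sketch, using only the three base times that appear in table
- 5.2 (the dictionary and function names are mine):
-
```python
# Base clock periods for long MOVEM, keyed by (direction, addressing mode).
# Only the three combinations used in table 5.2 are listed here.
MOVEM_LONG_BASE = {
    ("M->R", "(An)"):    12,
    ("M->R", "d16(An)"): 16,
    ("R->M", "(An)"):     8,
}

def movem_long_time(direction, mode, registers):
    """Clock periods for movem.l moving `registers` registers."""
    return MOVEM_LONG_BASE[(direction, mode)] + 8 * registers

print(movem_long_time("M->R", "d16(An)", 4))   # movem.l $80(a5), d0-d3 -> 48
print(movem_long_time("R->M", "(An)", 4))      # movem.l d0-d3, (a3)    -> 40
print(movem_long_time("M->R", "(An)", 4))      # movem.l (a3), d0-d3    -> 44
```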
-
- move.b (a4)+, d0
-
- The execution times for move instructions are contained
- in two tables. The first table, D-2 (Move Byte and Word
- Instruction Execution Times), must be used for this
- instruction because a byte operation is specified. The
- addressing mode used by the source operand is address
- register indirect with postincrement. That used in the
- destination operand is data register direct. The
- instruction time of 8 clock periods is found at the
- intersection of the (An)+ row and the Dn column. Note that
- these tables are used for MOVE and MOVEA instructions.
-
- subq.b #1, d0
-
- The table to use is D-5 (Immediate Instruction
- Execution Times). All of the instructions in this table
- require a source operand which uses the immediate data
- addressing mode. The three columns in the table specify
- permissible destination operands. In this case, the
- instruction specifies data register direct. At the
- intersection of the SUBQ row and op #, Dn column, for a byte
- size operation, the time given is 4 clock periods.
-
- ext.w d0
-
- This instruction is found in table D-12 (Miscellaneous
- Instruction Execution Times). Although there are two
- subrows shown for the EXT row, the times for both are
- identical. This instruction requires no source operand, and
- the time is simply 4 clock periods.
-
- dbra d0, fetch_character
-
- The DBcc instruction is used to control loop exits.
- Therefore, we are most often concerned with multiple
- executions of the instruction and with a sum of execution
- times. Also, the execution time of a single DBcc execution
- depends on the state of the condition code register (CCR)
- and the state of the loop counter when loop exit takes
- place. Loop exit is forced when the DBcc condition code
- becomes true or when the value in the counter becomes
- negative.
- Refer to table D-9 (Conditional Instruction Execution
- Times). Note that the DBcc instruction is the only
- instruction in the table for which the displacement between
- the instruction and the destination does not affect the
- execution time. Depending on the manual you are using, the
- DBcc row may be divided into 2 or 3 subrows. Figure 5.4
- shows the row divided into 3 subrows.
-
- Figure 5.4. Subrows for the DBcc Instruction.
-
- Displacement            Branch Taken    Branch Not Taken
-
- cc true                      -                 12
-
- cc false, Count
- Not Expired                 10                  -
-
- cc false, Counter
- Expired                      -                 14
-
-
- The information contained in the second and third rows
- can be combined so that only one row need be used to express
- it. In that case, the second row would be:
-
- cc false                    10                 14
-
- This makes sense because when cc is false the branch can be
- taken only if the count has not expired, while it cannot be
- taken if the count has expired.
- Except for the DBT instruction, which never branches
- and never decrements, for any condition specified in a DBcc
- instruction (for DBRA = DBF, the condition is always
- false), a branch will be taken only if the condition is
- false and the value in the counter, after it has been
- decremented, is not negative; the execution time for the
- instruction will then be 10 clock periods.
- If the condition becomes true, a branch will not be taken,
- and the execution time for the instruction will be 12 clock
- periods, regardless of the value in the counter. If the
- value in the counter becomes negative before the condition
- becomes true, then the execution time for the instruction
- will be 14 clock periods.
- For a counter value n, the DBcc instruction will be
- executed N + 1 times if exit from the loop takes place
- because the condition becomes true, and the sum of DBcc
- instruction execution times will be (N)(10) + 12, where N is
- the number of branches which actually took place, not the
- value stored in the counter. If exit from the loop takes
- place because the counter becomes negative, the instruction
- will be executed n + 1 times, and the sum of execution times
- will be (n)(10) + 14.
- For the instruction being used as an example, n is
- equal to one less than the number of characters in the
- string being copied. There are 11 characters, so n equals
- 10 because the value in the counter must be one less than
- the number of times the loop is to be executed. The
- condition for the DBRA instruction is never true, so exit
- from the loop can only take place when the value in the
- counter becomes negative. The sum of execution times for
- the instruction is (10)(10) + 14 = 114 clock periods.
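-
- The behavior of DBRA can be imitated in a few lines, which
- may make the counting easier to follow. This is a sketch of
- the timing rules only, not of the processor; the function
- name is mine, and it assumes the condition is never true, as
- is the case for DBRA.
-
```python
DBCC_BRANCH_TAKEN = 10       # cc false, count not expired
DBCC_COUNTER_EXPIRED = 14    # cc false, counter expired

def simulate_dbra(counter):
    """Return (loop passes, clock periods spent in dbra) for a loop that
    is closed by dbra and entered with `counter` in the data register."""
    passes = 0
    periods = 0
    while True:
        passes += 1                  # the loop body runs, then dbra
        counter -= 1                 # dbra decrements the counter first
        if counter == -1:            # counter expired: fall through
            periods += DBCC_COUNTER_EXPIRED
            return passes, periods
        periods += DBCC_BRANCH_TAKEN

print(simulate_dbra(10))     # (11, 114): 11 passes, 10(10) + 14 periods
```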
-
- cmpi.b #$2E, (a3)
-
- This instruction is found in table D-5 (Immediate
- Instruction Execution Times). This table was discussed in
- the section under subq.b #1, d0. The source operand must
- use, and does use, the immediate data addressing mode.
- Unlike that of the previously referenced instruction, the
- destination operand of this one uses the address register
- indirect addressing mode. And at the intersection of the
- CMPI.B row and op #, M column, we find that the instruction
- execution time of 8 clock periods is followed by a + sign.
- The + sign indicates a reference to table D-1 (Effective
- Address Calculation Times). But what value is it that we
- seek there? Just under the heading for table D-5 is the
- statement that implies this information. The statement
- tells us that the time shown at the intersection is that
- which is required to fetch the immediate operand.
- We can deduce that the time we seek is that for the
- addressing mode of the destination operand. In table D-1,
- at the (An) row/byte size operation intersection we find the
- value 4, which means that we must add 4 clock periods to the
- 8 shown in table D-5. Thus the instruction execution time
- is 12 clock periods.
-
- bne.s copy_name
-
- The Bcc instruction is listed in table D-9, the same
- table which lists the DBcc instruction. The Bcc instruction
- also has a Branch Taken and a Branch Not Taken column; and
- like the DBcc instruction, the Bcc instruction's execution
- time depends on the state of the CCR; but unlike the DBcc
- instruction, it also depends on the size of the displacement
- between the instruction and the branch destination.
- For the instruction being discussed, the displacement
- is short = byte size. For a byte size displacement the
- execution time is 10 clock periods for a branch taken, 8
- clock periods for a branch not taken. Besides the branch,
- there are two instructions within the SPEED_1 copy_name
- loop, each of which requires 12 clock periods per execution.
- The body of
- the loop is executed 7 times, and the bne.s instruction is
- executed 7 times. But the branch is taken only 6 times.
- The sum of the Bcc instruction execution times will be 6(10)
- + 8 = 68 clock periods.
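-
- The whole copy_name loop can be walked through the same way.
- In the sketch below I assume that the name being copied is
- PRG_5AP.TOS, which has 11 characters, 7 of them before the
- period; table 5.2 gives only the counts, so the name itself
- is my assumption. The function name is mine as well.
-
```python
def copy_name_periods(name):
    """Imitate SPEED_1's copy_name loop on `name`, which must contain a
    period. Return (characters copied, clock periods used)."""
    copied = []
    periods = 0
    i = 0
    while True:
        copied.append(name[i])       # move.b (a3)+, (a4)+
        i += 1
        periods += 12
        periods += 12                # cmpi.b #$2E, (a3)
        if name[i] != ".":
            periods += 10            # bne.s taken
        else:
            periods += 8             # bne.s not taken: loop exit
            return "".join(copied), periods

print(copy_name_periods("PRG_5AP.TOS"))   # ('PRG_5AP', 236)
```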
-
- add.l d0, d5
-
- Refer to table D-4, Standard Instruction Execution
- Times. There are two subrows, labeled according to the size
- of the operation. At the intersection of the Long subrow
- and the op<ea>, Dn column, there is this notation:
- 6(1/0)+**. Referring to the notes under the table, we find
- that the + means that we must fetch the address calculation
- time for the source operand, and the ** means that the 6
- must be increased to 8 if the addressing mode of the source
- operand is register direct or immediate.
- Well, the addressing mode of the source operand is
- register direct, so the 6 becomes 8. Glancing back at table
- D-1, we see that the address calculation time for the
- register direct addressing mode is 0. Therefore, the
- execution time for the instruction is simply 8 clock
- periods.
-
- asl.l #2, d5
-
- The execution times for SHIFT and ROTATE instructions
- are listed in table D-7. Using the formula shown at the
- intersection of the ASL instruction's Long subrow and the
- Register column, the calculated execution time for the
- example is 8 + 2(2) = 12 clock periods. Here I have
- replaced n with the immediate value of the source operand.
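-
- The shift formula is easy to misapply because the base time
- depends on the size of the operation. The sketch below uses
- the register shift formulas from table D-7, 6 + 2n for byte
- and word sizes and 8 + 2n for long. It then adds up the
- three instruction sequence that multiplies a longword by 5
- (a register copy, a long shift by 2 and a long add); the 4
- clock periods for move.l between data registers is taken
- from the Motorola move table, and the names are mine.
-
```python
SHIFT_BASE = {"b": 6, "w": 6, "l": 8}    # register shifts, table D-7

def shift_register_time(size, count):
    """Clock periods to shift a data register `count` places."""
    return SHIFT_BASE[size] + 2 * count

asl_long = shift_register_time("l", 2)   # asl.l #2, d5
asl_word = shift_register_time("w", 2)   # asl.w #2, d5

# move.l d5, d0 / asl.l #2, d5 / add.l d0, d5
times_five_periods = 4 + asl_long + 8

print(asl_long, asl_word, times_five_periods)   # 12 10 24
```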
-
- seq (a0)
-
- Refer to table D-6, Single Operand Instruction
- Execution Times. The Scc instruction row is divided into
- two subrows labeled Byte, False and Byte, True. So we see
- that, if the addressing mode of the operand is register
- direct, the execution time depends on the state of the
- condition, which is eq in the example.
- For all other modes, the execution time is 8 clock
- periods plus the address calculation time obtained from
- table D-1. The example operand's addressing mode is address
- register indirect, and in table D-1 the address calculation
- time for that mode is 4 clock periods for a byte size
- operation. The instruction execution time is 8 + 4 = 12
- clock periods.
-
- bset #5, (sp)
-
- Table D-8 lists the execution times for the Bit
- Manipulation instructions. For all of the instructions
- listed in the table, the bit to be manipulated is specified
- by the source operand; the location of the bit to be
- manipulated is specified by the destination operand.
- There are two major columns in this table: Dynamic and
- Static. The Dynamic major column is used if the number of
- the bit to be manipulated is specified with the contents of
- a register; the Static major column is used if the number of
- the bit to be manipulated is specified with immediate data,
- such as shown in the example.
- Each of the major columns is composed of two minor
- columns. A Register minor column is used if the bit to be
- manipulated resides in a register, a Memory minor column is
- used if the bit to be manipulated resides in memory external
- to the processor. The bit to be manipulated in the example
- resides in a stack, which is memory external to the
- processor.
- An operation size indicator for any of the instructions
- shown in this table would be redundant because the size of
- the operation must be long if the bit to be manipulated
- resides in a register and it must be byte otherwise. So at
- the intersection of BSET's Byte subrow and Static-Memory
- column we find the notation: 12(2/1)+. Fetching the address
- calculation time for (sp) = (An) from table D-1, which is 4
- for a byte size operation, and adding it to the 12, we
- calculate the example instruction time as 16 clock periods.
- This concludes my Instruction Execution Times Tutorial.
- I have not dealt with table D-13, which lists the single
- instruction MOVEP, because this instruction is a little
- tricky. I will use this instruction in a later chapter, and
- I hope to remember to discuss its execution time then.
- Neither have I dealt with table D-14, which lists Exception
- Processing Execution Times because they are so easily
- derived. For example, the execution time for any trap #n
- instruction is simply 34 clock periods.
-
- Execution Speed Ratios
-
- The execution speed ratios of figure 5.2 are obtained
- from the results of one execution of CMD_TEST.TOS. On
- subsequent executions the results for SPEED_1 were sometimes
- 835, and the results for SPEED_3 were sometimes 305. At
- times, both differences appeared simultaneously. These
- differences for multiple executions are to be expected
- because the system variable _hz_200 (memory location $4BA)
- is incremented only 200 times per second (200 Hz), which
- means that the period between increments is
- 1/200 = .005 second = 5 milliseconds.
- This means that the variable measures time with a resolution
- of 5 milliseconds (msec).
- Unexpectedly, the time for SPEED_2 rarely varied. At
- first, that made me wonder if I had made an error in its
- algorithm as it is in CMD_TEST.TOS, but I have checked
- extensively and found nothing wrong. However, I mention my
- concern, just so you'll know, although it does not really
- affect the decision concerning which algorithm to choose for
- SPEEDTST.TTP.
-
-
- Figure 5.2. CMD_TEST.TOS execution speed ratios. As you can
- see, SPEED_2's command line processing algorithm is about
- 2.37 times faster than SPEED_1's, while SPEED_3's is about
- 2.77 times faster.
-
- SPEED_1 830
- ------- = --- = 2.37
- SPEED_2 350
-
- SPEED_1 830
- ------- = --- = 2.77
- SPEED_3 300
-
- SPEED_2 350
- ------- = --- = 1.17
- SPEED_3 300
-
-
- The execution speed ratios shown in figure 5.3 are
- obtained from the data in table 5.2. I have also checked
- and rechecked this data many times, but I warn you not to
- trust me, although I trust the data. Actually, the ratios
- below agree very closely with those of figure 5.2,
- especially when one considers the 5 msec resolution of the
- clock that is being used to measure execution time. In any
- case, we are much more interested in relative execution
- speeds than we are in absolute speeds.
-
- Figure 5.3. Execution speed ratios calculated from
- instruction execution timing information in the Motorola
- manual.
-
-
- SPEED_1 602
- ------- = --- = 2.41
- SPEED_2 250
-
- SPEED_1 602
- ------- = --- = 2.81
- SPEED_3 214
-
- SPEED_2 250
- ------- = --- = 1.17
- SPEED_3 214
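-
- The ratios in figures 5.2 and 5.3 can be reproduced, and the
- effect of a single clock tick can be seen, with the sketch
- below. I assume that the measured results are milliseconds;
- the dictionary and function names are mine.
-
```python
measured = {"SPEED_1": 830, "SPEED_2": 350, "SPEED_3": 300}     # figure 5.2
calculated = {"SPEED_1": 602, "SPEED_2": 250, "SPEED_3": 214}   # figure 5.3

def ratio(times, slower, faster):
    """Execution speed ratio, rounded to two places as in the figures."""
    return round(times[slower] / times[faster], 2)

pairs = [("SPEED_1", "SPEED_2"), ("SPEED_1", "SPEED_3"), ("SPEED_2", "SPEED_3")]
measured_ratios = [ratio(measured, s, f) for s, f in pairs]
calculated_ratios = [ratio(calculated, s, f) for s, f in pairs]
print(measured_ratios)      # [2.37, 2.77, 1.17]
print(calculated_ratios)    # [2.41, 2.81, 1.17]

# One extra 5 msec tick on SPEED_1 (835 instead of 830) moves the first
# measured ratio most of the way toward the calculated one.
one_tick_later = round(835 / 350, 2)
print(one_tick_later)       # 2.39
```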
-
-
- Putting the Pieces Together
-
- The final algorithm is prepared by extracting the best
- algorithms from the three models, and installing the
- instructions implemented by custom trap #9. All of the
- programs of the series, SPEED_1.TTP, SPEED_2.TTP and
- SPEED_3.TTP, as well as programs PRG_5AP.TOS, CMD_TEST.TOS,
- TRAPS.S and TRAP_9.S along with all of the execution results
- are included in the documentation package for program 21.
- In addition, program 21 contains some documentation that was
- not previously disclosed.
-
- Program 21. The final algorithm.
-
- ; Program Name: SPEEDTST.S
- ; Version: 1.006
-
- ; Assembly Instructions:
-
- ; Assemble in "PC-relative" mode and save with a TTP extension.
-
- ; Function:
-
- ; Spawn the TOS or PRG process typed on the command line. Create a disk
- ; file which is to be identified by the name of the spawned program with a
- ; DAT suffix. The disk file is to reside in the same directory as does the
- ; spawned process.
-
- ; Calculate the spawned program's load and execution times and store them
- ; in the file. If the spawned process directs output to the video screen via
- ; GEMDOS function $9, redirect that output to the file.
-
- ; Execution Instructions:
-
- ; SPEEDTST.TTP will not execute unless the custom traps in program
- ; TRAPS.PRG have previously been installed.
-
- ; Execute from the desktop. Type the name of an executable file which
- ; has a TOS or PRG extension on SPEEDTST.TTP's input parameter line. The
- ; name of the program you type on the parameter line must be in the same
- ; directory as is SPEEDTST.TTP. The program must terminate with GEMDOS
- ; function $4C, and, via that function, it must pass to SPEEDTST.TTP the
- ; word length portion of the value that was in memory location $4BA
- ; immediately after it was loaded.
-
- ; The longword value in $4BA can be obtained by invoking custom trap #3
- ; (get_time). SPEEDTST.TTP uses the word length portion of that value,
- ; which is returned in D0 by GEMDOS $4C, to calculate the spawned program's
- ; load and execution times.
-
- ; If the spawned program contains any instructions that cause it to pause,
- ; such as those that wait for a keypress or some other event, those should be
- ; commented out, and the program should be assembled especially for the speed
- test. Otherwise the execution time computed by SPEEDTST.TTP will include
- ; the time that the spawned program was waiting for the event to occur.
-
- ; If custom trap #8 is used to terminate the spawned program, the trap
- ; will execute a wait_for_keypress algorithm when the program is executed from
- ; the desktop, but it will omit the wait algorithm when the program is spawned
- ; by SPEEDTST.TTP. In addition, trap #8 will return the after-load value to
- ; SPEEDTST.TTP and terminate the spawned program with GEMDOS function $4C.
-
- ; Both trap #8 and SPEEDTST.TTP require that the spawned program be
- ; initialized with custom trap #6 or a similar algorithm. See TRAPS.S for
- ; details about custom traps #6 and #8.
-
- release_excess_memory:
- lea -$82(pc), a3 ; Put "command line" address in A3.
- lea -$80(a3), a1 ; Put "basepage" address in A1.
- lea program_end, a0 ; Put "end of program" address in A0.
- trap #6 ; Calculate program size and release memory.
-
- ; NOTE: A local stack is not declared in PRG_5AP.TOS. Because of the long
- ; string that is printed by that program, this program will bomb when
- ; it spawns PRG_5AP.TOS, if a local stack is not declared here.
-
- lea stack, a7 ; Point A7 to this program's stack.
-
- process_command_line:
- lea command_line, a4 ; Fetch location to contain command line.
- movem.l (a3), d0-d3 ; Move 16 bytes of command line to 4 registers.
- movem.l d0-d3, (a4) ; Move them to address "command_line".
- move.b (a3)+, d0 ; Fetch command line ASCII character count.
- ext.w d0 ; Extend to word for next instruction.
- move.b #0, 1(a4,d0.w) ; Store a null at end of string.
-
- lea program_name, a0 ; Fetch address of pointer to command line.
- move.l a3, (a0) ; Store address of command line string at
- ; pointer.
- move.b #0, 0(a3,d0.w) ; Replace $0D at end of command line input
- ; in basepage with a NULL.
-
- insert_filename_suffix:
- move.b #$44, -2(a4,d0.w) ; Insert letter 'D'.
- move.b #$41, -1(a4,d0.w) ; Insert letter 'A'.
- move.b #$54, 0(a4,d0.w) ; Insert letter 'T'.
-
- create_file:
- move.w #0, -(sp) ; File attribute = read/write.
- pea filename ; Will be name of spawned process + .DAT.
- move.w #$3C, -(sp) ; Function = f_create = GEMDOS $3C.
- trap #1 ; File handle is returned in D0.
- addq.l #8, sp
- lea file_handle, a0 ; Store returned file handle.
- move.w d0, (a0)
-
- redirect_output: ; Exchange file handle with screen's handle.
- move.w file_handle, -(sp) ; This is the disk file's handle.
- move.w #1, -(sp) ; This is the video screen's handle.
- move.w #$46, -(sp) ; Function = f_force = GEMDOS $46.
- trap #1
- addq.l #6, sp
-
- prepare_stack_for_load_and_execute_program:
- pea environ_string
- pea command_string
- pea (a3) ; Push address of program name string.
- move.w #0, -(sp)
- move.w #$4B, -(sp) ; Function = GEMDOS $4B = p_exec.
- get_start_time:
- lea start_time, a3 ; Fetch address of variable "start_time".
- trap #3 ; Returns value of system clock in D0.
- move.w d0, (a3) ; Save start time.
- load_and_execute_program:
- trap #1
- move.w d0, d3 ; Copy after-load value to D3 for calculation.
-
- get_end_time:
- trap #3 ; Returns value of system clock in D0.
- move.w d0, d5 ; Copy to D5 for calculation.
- sub.w d3, d5 ; Subtract after-load time from end time.
- ext.l d5 ; Extend to 32 bits.
-
- reposition_stack_pointer:
- lea $10(sp), sp
-
- get_drive:
- move.w #$19, -(sp) ; Function = dgetdrv = GEMDOS $19.
- trap #1 ; Returns 0 for drive A, 1 for B, etc.
- addq.l #2, sp
- add.b #$41, d0 ; Add ASCII value for A to compute ASCII
- lea drive, a0 ; letter code for the drive value returned.
- move.b d0, (a0) ; Save drive's ASCII letter code.
-
- print_heading:
- lea heading, a0
- bsr print_string
- lea program_name, a0 ; Fetch address of program name string.
- movea.l (a0), a0
- bsr print_string
- print_drive_for_spawned_program:
- lea drive_msg, a0
- bsr print_string
-
- compute_load_time:
- lea load_time_msg, a0
- bsr print_string
- lea start_time, a3
- sub.w (a3), d3 ; Subtract start time from after-load time.
- ext.l d3 ; Extend to 32 bits.
-
- multiply_by_five: ; Convert to milliseconds.
- move.l d3, d0 ; Save a copy to add.
- asl.l #2, d3 ; Shift to multiply by 4.
- add.l d0, d3 ; To complete multiplication by 5.
-
- print_load_time:
- cmpi.l #999, d3 ; If load time is less than 1000, then
- bgt no_space ; print a leading blank space for output
- lea space, a0 ; alignment.
- bsr print_string
- cmpi.l #99, d3 ; If load time is less than 100, then
- bgt no_space ; print another leading blank space.
- lea space, a0
- bsr print_string
- no_space:
- move.l d3, d1 ; Copy load time to D1 for decimal conversion.
- trap #4 ; Returns address of decimal string in A0.
- bsr.s print_string
- lea units_label, a0
- bsr.s print_string
-
- compute_execution_time: ; D5 already contains the execution time.
- lea execute_time_msg, a0; Here, it must only be multiplied by 5 to
- bsr.s print_string ; be converted to milliseconds.
- move.l d5, d0 ; Save a copy to add.
- asl.l #2, d5 ; Shift to multiply by 4.
- add.l d0, d5 ; To complete multiplication by 5.
-
- print_execution_time:
- cmpi.l #999, d5 ; If execute time is less than 1000, then
- bgt _no_space ; print a leading blank space for output
- lea space, a0 ; alignment.
- bsr print_string
- cmpi.l #99, d5 ; If execute time is less than 100, then
- bgt _no_space ; print another leading blank space.
- lea space, a0
- bsr print_string
- _no_space:
- move.l d5, d1 ; Copy execute time for decimal conversion.
- trap #4 ; Returns address of decimal string in A0.
- bsr.s print_string
- lea units_label, a0
- bsr.s print_string
-
- close_file:
- move.w file_handle, -(sp)
- move.w #$3E, -(sp) ; Function = fclose = GEMDOS $3E.
- trap #1
- addq.l #4, sp
-
- terminate:
- move.w #0, -(sp)
- trap #1
-
- print_string: ; Expects address of string to be in A0.
- pea (a0) ; Push address of string onto stack.
- move.w #9, -(sp) ; Function = c_conws = GEMDOS $9.
- trap #1 ; GEMDOS call
- addq.l #6, sp ; Reset stack pointer to top of stack.
- rts
-
- data
- space: dc.b " ",0
- heading: dc.b $D,$A,"SPEEDTST.TTP Execution Results",$D,$A
- dc.b "for ",0
- drive_msg: dc.b ", loaded from drive: "
- drive: dc.b "A",$D,$A,0
- load_time_msg: dc.b $D,$A," Load time: ",0
- execute_time_msg: dc.b " Execution time: ",0
- units_label: dc.b " milliseconds",$D,$A,0
- environ_string: dc.b "TERM",0
- command_string: dc.b 0
- align
- bss
- start_time: ds.w 1 ; Value in $4BA just before spawning.
- program_name: ds.l 1 ; Pointer to name in basepage command line.
- file_handle: ds.w 1 ; Handle for the filename below.
- command_line: ds.b 1 ; Unused character count will go here.
- filename: ds.b 15 ; File name for redirected output.
- ds.l 96 ; Program stack.
- stack: ds.l 0 ; Address of program stack.
- program_end: ds.l 0
- end
-
-
- SPEEDTST.TTP Execution Results
-
- PRG_5AP.TOS Execution Results
-
- When executed from the desktop, this program will print this string on the
- video screen and pause for a keypress. But, when this program is spawned by
- SPEED_1, SPEED_2, SPEED_3 or SPEEDTST, the string will be stored in a file
- named PRG_5AP.DAT and the program will not pause for a keypress.
-
- SPEEDTST.TTP Execution Results
- for PRG_5AP.TOS, loaded from drive: G
-
- Load time: 40 milliseconds
- Execution time: 685 milliseconds
-
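The timing figures above can be modeled on a host machine. What follows is a minimal sketch in C, not part of SPEEDTST itself. It assumes that the counter sampled at $4BA advances 200 times per second (5 milliseconds per tick) and that the millisecond values are derived from tick differences in that way. The names ticks_to_ms and format_time are hypothetical. The padding rule is taken from print_execution_time.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Assumption: the counter sampled at $4BA advances 200 times per
   second, so each tick is worth 5 milliseconds. */
#define MS_PER_TICK 5u

/* Hypothetical helper: milliseconds elapsed between two tick samples.
   The 32-bit unsigned subtraction remains correct across one wrap. */
uint32_t ticks_to_ms(uint32_t start, uint32_t end)
{
    return (uint32_t)(end - start) * MS_PER_TICK;
}

/* Models the padding in print_execution_time: one blank when the value
   is at most 999, a second blank when it is at most 99, then the
   digits and the units label. Values below 10 get no third blank. */
void format_time(uint32_t ms, char *out, size_t size)
{
    const char *pad = "";
    if (ms <= 99)
        pad = "  ";
    else if (ms <= 999)
        pad = " ";
    snprintf(out, size, "%s%lu milliseconds", pad, (unsigned long)ms);
}
```

Under these assumptions, a difference of 137 ticks corresponds to the 685 milliseconds reported above, and the value is printed with one leading blank so that it lines up with four-digit times.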
-
- The Second Utility
-
- I conclude this chapter with a utility that spawns a
- program and creates a file for its redirected output, but
- which does not measure load and execution times. I use this
- program when I want to save the output from a program in a
- disk file for documentation, for leisurely viewing or for
- comparison with the output of one or more other programs.
- Program 22 is simply a subset of Program 21.
-
- Program 22. A program that simply spawns a process and saves
- its redirected output in a disk file.
-
- ; Program Name: SPAWN.S
- ; Version 1.003
-
- ; Assembly Instructions:
-
- ; Assemble in "PC-relative" mode and save with a TTP extension.
-
- ; Program Function:
-
- ; Spawn the TOS or PRG process typed on the command line. Create a disk
- ; file named after the spawned program, with the last three characters
- ; of the name replaced by DAT. The disk file resides in the same
- ; directory as the spawned process.
-
- ; If the program to be executed contains any halt or wait instructions,
- ; such as a wait for a keypress, remember that the spawned process will
- ; not terminate until those conditions are satisfied.
-
- release_excess_memory:
- lea -$82(pc), a3 ; Put "command line" address in A3.
- lea -$80(a3), a1 ; Put "basepage" address in A1.
- lea program_end, a0 ; Put "end of program" address in A0.
- trap #6 ; Calculate program size and release memory.
- lea stack, a7 ; Point A7 to this program's stack.
-
- process_command_line_parameters:
- lea command_line, a4 ; Fetch location to contain command line.
- movem.l (a3), d0-d3 ; Move 16 bytes of command line to 4 registers.
- movem.l d0-d3, (a4) ; Move them to address "command_line".
- move.b (a3)+, d0 ; Fetch command line ASCII character count.
- ext.w d0 ; Extend to word for next instruction.
- move.b #0, 1(a4,d0.w) ; Store a null at end of string.
-
- lea program_name, a0 ; Fetch address of pointer to command line.
- move.l a3, (a0) ; Store address of command line string at
- ; pointer.
- move.b #0, 0(a3,d0.w) ; Replace $0D at end of command line input
- ; in basepage with a NULL.
- insert_filename_suffix:
- move.b #$44, -2(a4,d0.w) ; Insert letter 'D'.
- move.b #$41, -1(a4,d0.w) ; Insert letter 'A'.
- move.b #$54, 0(a4,d0.w) ; Insert letter 'T'.
-
- create_file:
- move.w #0, -(sp) ; File attribute = read/write.
- pea filename ; Will be name of spawned process + .DAT.
- move.w #$3C, -(sp) ; Function = f_create = GEMDOS $3C.
- trap #1 ; File handle is returned in D0.
- addq.l #8, sp
- lea file_handle, a0 ; Store returned file handle to be used when
- move.w d0, (a0) ; the file is closed later.
-
- redirect_output: ; Force standard output (handle 1) to the file.
- move.w d0, -(sp) ; This is the disk file's handle.
- move.w #1, -(sp) ; Standard output handle (normally the screen).
- move.w #$46, -(sp) ; Function = f_force = GEMDOS $46.
- trap #1
- addq.l #6, sp
-
- load_and_execute_program:
- pea environ_string
- pea command_string
- pea (a3) ; A3 contains address of program name string.
- move.w #0, -(sp) ; Load and Go option.
- move.w #$4B, -(sp) ; Function = GEMDOS $4B = p_exec.
- trap #1
- lea $10(a7), sp ; Reposition stack pointer.
-
- close_file:
- move.w file_handle, -(sp)
- move.w #$3E, -(sp) ; Function = GEMDOS $3E = f_close.
- trap #1
- addq.l #4, sp
-
- terminate:
- move.w #0, -(sp) ; Function = p_term0 = GEMDOS $0.
- trap #1
-
- data
- environ_string: dc.b "TERM",0
- command_string: dc.b 0
- align
- bss
- file_handle: ds.w 1 ; Handle for the disk file named below.
- command_line: ds.b 1 ; Unused character count will go here.
- filename: ds.b 15 ; File name for redirected output.
- program_name: ds.l 1 ; Pointer to name in basepage command line.
- ds.l 96 ; Program stack.
- stack: ds.l 0 ; Address of program stack.
- program_end: ds.l 0
- end
-
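The sequence in SPAWN.S (create a file, force standard output to it, run the program, close the file) has a close counterpart on POSIX systems. The sketch below is only an analogue for readers who want to try the idea on a host machine. It is not GEMDOS code, and the function name run_redirected is hypothetical. It assumes a POSIX environment in which system() can start a shell.

```c
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical POSIX analogue of SPAWN.S, not GEMDOS code.
   Runs `command` with its standard output sent to `outfile`.
   Returns the status from system(), or -1 on a setup failure. */
int run_redirected(const char *command, const char *outfile)
{
    /* create_file: make (or truncate) the output file. */
    int fd = open(outfile, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    fflush(stdout);                     /* Empty our own buffer first. */
    int saved = dup(STDOUT_FILENO);     /* Remember the original stdout. */
    if (saved < 0) {
        close(fd);
        return -1;
    }

    /* redirect_output: make handle 1 refer to the file. */
    if (dup2(fd, STDOUT_FILENO) < 0) {
        close(saved);
        close(fd);
        return -1;
    }

    /* load_and_execute_program: the child inherits the redirected handle. */
    int status = system(command);

    /* Unlike SPAWN.S, which simply terminates, restore stdout here. */
    fflush(stdout);
    dup2(saved, STDOUT_FILENO);
    close(saved);

    /* close_file: release the handle for the output file. */
    close(fd);
    return status;
}
```

The one deliberate difference from the listing is the restore step. SPAWN.S terminates as soon as the file is closed, so it never needs the original handle again.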
-
- Execution results for PRG_5AP.TOS as a spawned process.
-
- PRG_5AP.TOS Execution Results
-
- When executed from the desktop, this program will print this string on the
- video screen and pause for a keypress. But, when this program is spawned by
- SPEED_1, SPEED_2, SPEED_3 or SPEEDTST, the string will be stored in a file
- named PRG_5AP.DAT and the program will not pause for a keypress.
-
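The output file name is derived from the program name by overwriting its last three characters, as insert_filename_suffix does in the listing. A minimal model of that step in C follows. The function name make_dat_name is hypothetical, and the sketch assumes that the name ends in a three-letter extension and fits in the 15-byte filename buffer.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper modeling insert_filename_suffix: copy the program
   name and overwrite its last three characters with "DAT".
   Assumes the name ends in a three-letter extension such as TOS or PRG.
   Returns 0 on success, or -1 if the name is too short to carry an
   extension or too long for the output buffer. */
int make_dat_name(const char *program, char *out, size_t size)
{
    size_t len = strlen(program);
    if (len < 4 || len + 1 > size)
        return -1;
    memcpy(out, program, len + 1);      /* Copy the name and its NUL. */
    memcpy(out + len - 3, "DAT", 3);    /* Replace the extension. */
    return 0;
}
```

With a 15-byte buffer, make_dat_name("PRG_5AP.TOS", buf, 15) yields PRG_5AP.DAT, the file named in the results above. The length check is an addition of this sketch. The assembly listing copies a fixed 16 bytes and performs no such check.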
-
- Conclusion
-
- Performance testing, and the utilities with which such
- testing may be accomplished, have been the subject of this
- chapter. But the material in this chapter represents only a
- beginning. Software testing as a subject is complicated
- enough, but implementing such testing is a horrendous task.
- At this point, I have provided you with a few simple
- tools and a tutorial which should assist you in calculating
- instruction execution times. I have said that single-
- stepping through a program with AssemPro's debugger is one
- method I use to verify a program's performance. The
- debugger permits one to view registers and memory locations
- while tracing through a program in this manner.
- For short, uncomplicated programs, if you are able to
- keep your wits sharp while doing so, this is a viable method
- of verification. But many programs cannot be tested within
- the debugger. Furthermore, it is virtually impossible to
- keep track of register and memory activity for larger
- programs. Therefore, programs which do this automatically
- will be introduced in a later chapter.
- For now, it is time to take advantage of the two
- utilities introduced here to investigate the questions
- raised by material in earlier chapters. I do this in
- Chapter 6. There I will compare programs assembled in each
- of three assembly modes, and I will compare the performance
- of certain instructions, so that you can see early on why I
- choose to use them in future programs.
-
-