Text File | 1992-05-29 | 104.1 KB | 1,776 lines |
-
- A.I.B.B.
- Amiga Intuition Based Benchmarks
- A system performance evaluation utility for the Amiga
-
- Program Release Version 4.3
- Copyright 1991,1992 LaMonte Koop
-
-
- This software is provided as is. No warranty as to the performance or
- validity of data obtained within is stated or implied. Bug reports and
- suggestions for improvement are welcomed, and every effort will be made
- to evaluate such reports.
- AIBB is freely distributable provided no fee other than a moderate
- fee for disk copying charges is made for its acquirement. It may be
- distributed across any electronic network, provided no fee is charged
- specifically for its download. A broad-based download fee is acceptable
- provided it is charged universally for all such file downloads. All
- associated files included with the distribution archive of AIBB are to
- remain intact and unaltered. BBS listing notices and the like may be
- included in the archive provided no alterations are made to the actual
- distribution files themselves.
- This program and all accompanying files are not public domain. They
- are copyright material and may not be used for commercial purposes without
- permission from the author. In most circumstances such permission will
- be granted, but the author must be contacted before any distribution with
- a commercial product.
- AIBB is not shareware, as no donation or usage fee is required.
- However, any donations are always appreciated, and can only encourage
- further development of the program. This is an ongoing project, and will
- continue to be so as long as interest in it is shown.
-
-
- INTRODUCTION
-
- AIBB is a utility primarily designed to assist in the evaluation of
- system performance on a basic level. It consists of a series of system
- "benchmark" tests, the results of which are set against those of other
- systems and displayed for comparison purposes. It should be noted that
- care must be taken when making a definitive evaluation of the performance
- of any system, as much more is involved in making a thorough determination
- than the data which can be provided by AIBB alone.
- System performance evaluation, commonly referred to as "benchmarking",
- is the rather dubious science of trying to determine which system or
- system architecture is "fastest". Unfortunately, all too often it is not
- completely clear what is meant by which system is "fast".
- Computer systems in general usually consist of a number of devices
- interconnected to form a whole. These individual devices can be on one
- circuit board, as is the case with certain coprocessor devices, or can be
- completely separate entities, physically connected in some external
- fashion, such as with expansion boards. All of these devices will
- have certain advantages and disadvantages in performance levels. It is
- generally the use to which the system is put that determines how much
- of an effect these factors have on overall system performance. Before
- delving into these factors further, it is necessary to
- first clarify a few of the key components which are main players in the
- performance game.
-
- I. The system CPU.
-
- The CPU ( Central Processing Unit ) of a computer is often the focus
- of most performance discussion. This unit is generally responsible for the
- non-specific portion of any computing task. Its duties involve general
- program instruction execution, and in many cases it is the device
- responsible for 'mastering' the system and coordinating the system effort
- as a whole. Note that this is a generalization. Systems do exist which
- are distributed; their CPU is not as readily defined, or consists of multiple
- CPU units each coordinated as a whole. However, in the context of this
- discussion a single device will be assumed.
- Since the CPU of any system does often receive a great deal of the
- overall responsibility for program execution and task organization, it is
- thus a very key part in the overall performance of the system as a whole.
- However, it is often considered the sole factor which determines the
- "speed" at which a computer can perform a particular operation. This
- assumption is not always valid, and must be thought out carefully. Many
- other factors may affect the efficiency of the CPU itself in performing
- its operations, which is why the system as a whole must be evaluated
- towards a particular job which it is to be given. But before this
- relationship becomes clear, the other components which are factors must
- first be recognized.
-
- II. Coprocessor Devices
-
- A coprocessor is any system processing unit which works in conjunction
- with the primary processor (CPU) in the actions of the system. Such devices
- are often subsystem-specific, and are responsible for a particular set of
- computing tasks. For example, a system may include an FPU, or Floating
- Point Unit to take on the task of floating point computations. These
- processors are generally fine-tuned to that specific task, and thus are
- more efficient at it than the main processor would be if it were to do the
- same job.
- Thus, the primary use of coprocessors is to alleviate some of the total
- system computing load from the CPU. These devices may be directly coupled
- to the CPU, thus being closely tied to the performance of the master
- processor, or may be of a loosely coupled variety. This latter type of
- coprocessing unit is tied to the CPU only when it requires data and
- information from the main processor, and in some situations may be capable
- of accessing and modifying system memory without going through the
- CPU at all. Although this concept is not unique to coprocessors alone,
- it is relevant, and thus will be explained here. Such memory accessing
- capabilities denote a Direct Memory Access device (DMA). These devices
- do not necessarily rely on the CPU to transfer data to them, and thus are
- often 'decoupled' from the CPU in such a way as to have a different
- performance ratio from the CPU itself. Even non-DMA devices are often
- afforded a level of concurrent, or simultaneous operation with the main
- CPU, so as to provide a more efficient method of task completion. However,
- DMA devices are more closely tied with another set of subsystems to be
- considered when dealing with system performance.
-
- III. Bus interfaces.
-
- This is often a confusing topic. The term 'bus' is used a great deal,
- but all too often it is not clear what is meant by it. As stated before,
- a computer system consists of a number of devices integrated together to
- form the whole. A bus is, simply put, a communications pathway between
- devices. Over these pathways control, address, and data signals are
- transferred to devices which are required to perform a portion of any
- particular task. Most systems contain more than one bus in which this
- communication takes place. Usually, a primary bus or combination of
- specific primary buses is responsible for the majority of data transfer and
- communications between all devices in general, with lesser buses used as
- specific pathways between certain devices. Buses are often 'sized', or
- given in terms of bit-bandwidth. Basically, this is a determination of the
- maximum size of simultaneous data transfer across the pathway between devices.
- For example, an 8-bit bus can transfer an 8-bit quantity of data across
- it at once, while a 32-bit bus can transfer 32 bits at a single time
- ( Where a bit is defined as an electrical signal value representing a binary
- number, either 0 or 1 [ Logical FALSE or TRUE, which orientation depending
- upon the design of the system ] for each bit ). Although there are other
- sizing factors which come into play, this is a general idea, and suitable
- for the discussion at hand.
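-
- As a rough illustration of bus sizing, the sketch below counts how many
- transfers are needed to move a block of data across buses of different
- widths. The function name and the 1024-byte block are chosen purely for
- illustration, and the model assumes every transfer carries a full bus
- width of data, ignoring alignment and the time each transfer takes.
-
```python
def transfers_needed(payload_bytes: int, bus_width_bits: int) -> int:
    """Return how many bus transfers are needed to move payload_bytes."""
    if payload_bytes < 0:
        raise ValueError("payload size cannot be negative")
    if bus_width_bits <= 0 or bus_width_bits % 8 != 0:
        raise ValueError("bus width must be a positive multiple of 8 bits")
    bytes_per_transfer = bus_width_bits // 8
    # Ceiling division: a partial final transfer still occupies the bus once.
    return (payload_bytes + bytes_per_transfer - 1) // bytes_per_transfer


if __name__ == "__main__":
    block = 1024  # bytes, an arbitrary example size
    for width in (8, 16, 32):
        print(f"{width:2d}-bit bus: {transfers_needed(block, width)} transfers")
```
-
- Under these assumptions a 32-bit bus needs one quarter of the transfers
- an 8-bit bus does for the same data. How long each transfer takes is a
- separate matter.
-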
- As any system relies on the coordinated efforts of all its components,
- the efficiency and effectiveness of communication between each device is
- of importance when considering the overall performance of the computer. A
- bus which is not up to par with the capabilities of the devices it
- interconnects will hinder the system while one which is capable of handling
- the individual components will allow for a more efficient setup. More of
- this relationship will be given later after the other component members are
- introduced.
-
- IV. Input and Output ( I/O ) Devices.
-
- This is a loose subset of devices collectively describing such units as
- storage media devices ( disk/tape drives, etc... ), external communications
- devices ( serial and parallel communications to external units ), and
- specific control input units, such as keyboards and other data input means.
- While the latter of these devices is generally not considered to be of much
- influence in system performance, the former members, such as storage devices,
- can have a great impact on performance levels.
- Storage devices are in general the slowest of data transfer devices
- on any system. For this reason they are often considered to be a
- 'bottleneck' in system performance evaluation. However, many advances have
- been made in the design of such units, including the use of DMA access from
- storage device control units to the system main memory, which helps by
- alleviating the CPU's responsibility in data transfer from these devices.
- Generally, I/O devices are more important to systems requiring a great
- deal of access to large quantities of data, or ones involved in data
- transfer as their primary mechanism of use.
-
- V. System Memory.
-
- This subsystem has been mentioned in passing previously, but until
- this section not given full attention. System memory resources also play
- a big part in overall system performance evaluation. Memory can affect
- a system's performance in many ways. Depending on the speed of other
- devices, utilizing memory subsystems which are slower (requiring the
- addition of 'wait states' - periods of time in which the data requesting
- device waits for the data to be available - to properly interface to the
- system) can cause any data accesses to occur at a slower rate than the rest
- of the system could otherwise handle them. Many memory subsystems do
- indeed utilize wait states, as other devices are too fast for such memory and
- the memory access speeds required for zero-wait-state access would make for
- prohibitively expensive systems. Although a completely zero-wait state
- system is often not feasible, methods are available to system designers to
- try and reduce the overall memory latency periods. One widely used method
- is the use of cache memory.
-
- VI. Cache Memory.
-
- Cache memory is a memory storage medium which is usually designed for
- the fastest possible access to frequently used resources, usually
- microprocessor instructions and/or data. This area is generally small compared
- to the size of an entire memory complement, and thus can be implemented at a
- cost lower than that of employing very fast components for all of the system
- memory. The general operation of most memory caches is to store the most
- recently accessed instructions or data within the cache, then make a check for
- them there upon the next memory access call. In this sense, if the instruction
- or data is in the cache, it can be accessed almost immediately, rather than
- having the processor fetch the required data from the system's main memory
- resources. A cache 'hit' is the term used to indicate the processor did indeed
- find the data within the cache, and did not have to fetch from main memory,
- whereas a 'miss' denotes when the processor was forced to get the needed data or
- instructions from the main system memory. When a miss occurs, the cache will
- usually be updated with this new data, in the case it is called for again, thus
- keeping the data in the cache fresh.
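-
- The effect of hits and misses on access time can be estimated with a
- simple weighted average. The sketch below is a simplified model, not a
- description of any particular cache: it assumes a hit costs only the cache
- access time and a miss costs only the main memory access time. The 40 ns
- and 280 ns figures are hypothetical values chosen for illustration.
-
```python
def average_access_time_ns(hit_rate: float, cache_ns: float, memory_ns: float) -> float:
    """Average access time when a fraction hit_rate of accesses hit the cache."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit rate must lie between 0 and 1")
    # Hits are served at cache speed, misses at main memory speed.
    return hit_rate * cache_ns + (1.0 - hit_rate) * memory_ns


if __name__ == "__main__":
    cache_ns, memory_ns = 40.0, 280.0  # hypothetical access times
    for rate in (0.0, 0.5, 0.9, 1.0):
        avg = average_access_time_ns(rate, cache_ns, memory_ns)
        print(f"hit rate {rate:.0%}: {avg:.1f} ns average access")
```
-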
- The main theory behind such caches is that many programs spend a great deal
- of time within the confines of a loop. Therefore, depending on the size of the
- cache, part or all of such a loop can be held within the cache, decreasing
- execution time. Caches can be found both external to the microprocessor or,
- increasingly, within the microprocessor itself. These caches may be separated
- such that they only hold instructions or data individually, or may be set up
- such that both types of memory accesses are kept within one cache. There are
- tradeoffs to both types of design, but in general the cache in any form is a
- useful mechanism for increasing system performance. One must be cautioned,
- however, as the cache can also lead to a misrepresentation of system performance
- comparisons. Benchmarking tools are often small segments of programs, and as
- such may be easily cached on systems equipped with such. Thus, a benchmark
- result may not accurately depict the true system performance with a real-world
- application which would not be entirely housed within a system cache.
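-
- The caution above can be demonstrated with a toy cache simulation. The
- sketch below assumes a direct-mapped cache of 256 bytes with 4-byte lines
- and a program that does nothing but fetch 2-byte instruction words in a
- loop. These parameters and the function names are assumptions made for
- illustration and do not describe any specific processor.
-
```python
def hit_rate(addresses, cache_bytes=256, line_bytes=4):
    """Simulate a direct-mapped cache; return the fraction of accesses that hit."""
    num_lines = cache_bytes // line_bytes
    tags = [None] * num_lines           # tag held by each cache line (None = empty)
    hits = 0
    total = 0
    for addr in addresses:
        block = addr // line_bytes      # memory block containing this address
        index = block % num_lines       # cache line that block maps to
        tag = block // num_lines        # distinguishes blocks sharing a line
        if tags[index] == tag:
            hits += 1
        else:
            tags[index] = tag           # miss: the line is refilled
        total += 1
    return hits / total if total else 0.0


def loop_trace(loop_bytes, iterations, step=2):
    """Yield instruction fetch addresses for a loop executed repeatedly."""
    for _ in range(iterations):
        for addr in range(0, loop_bytes, step):
            yield addr


if __name__ == "__main__":
    small = hit_rate(loop_trace(128, 100))    # loop fits inside the cache
    large = hit_rate(loop_trace(1024, 100))   # loop is four times the cache size
    print(f"128-byte loop:  {small:.1%} hits")
    print(f"1024-byte loop: {large:.1%} hits")
```
-
- In this model the small loop runs almost entirely from the cache, while
- the large loop evicts its own code on every pass and hits only on the
- second word of each line. A benchmark resembling the first case says
- little about an application resembling the second.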
-
- A word on clocks and clockspeed ratings.
-
- No mention has been made of clockspeed ratings of various devices
- so far because they are often misleading terms and can be taken in the
- wrong context in many cases. Therefore this subject is placed in a separate
- section of discussion.
- "Clockspeed" ratings of devices are in actuality frequency measurements.
- Almost all digital devices operating in a computer system require some sort
- of timing input to coordinate their internal and external responses.
- Generally, this is provided by a clock signal fed to that device, and in
- some cases the device itself may be responsible for the generation of
- additional clock outputs to other devices.
- Clock frequency ratings for system components are usually today given
- in terms of MegaHertz ( MHz ). This is a cyclic frequency rating indicating
- the number of cycles per second an oscillating periodic signal undergoes.
- As an example, a rating of one MegaHertz indicates a rate of one million
- cycles per second.
- As indicated earlier, almost all digital system components require some
- form of clock input. To see where this is important, take the case of the
- CPU. Generally, instruction execution timing is stated in terms of the
- number of clocks a given instruction takes to complete. A faster clock means
- that although an instruction takes the same number of clocks to finish, more
- clock input edges occur in a given time frame, and thus afford a faster
- response. In this sense, faster clock rates generally indicate faster
- devices. The system bus and other devices are also managed in terms of clock
- input signals. These may or may not be the same input as given to the CPU,
- or the CPU itself may control said clock rates. Thus, differences in clock
- ratings between subsystems can be a source of bottlenecking, if one faster
- clocked subsystem is forced to wait to synchronize with a slower subsystem in
- order to transfer data and control signals.
- Let it not be thought that clock input frequency is the sole governing
- force in determining component speed, however. In many cases, other effects
- can cause similarly clocked devices which do the same task to finish in
- differing amounts of time. One way this can happen is if one device has
- been enhanced in such a way that its internal operations are more
- efficient, thus requiring fewer clocks to complete a given operation.
- Therefore, this factor must be weighed as well as clockspeed in even single
- device evaluations.
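-
- The relationship between clock count, clock rate and elapsed time can be
- written down directly. The sketch below is only an illustration: the
- function name and the clock counts are hypothetical and are not timings
- for any real instruction.
-
```python
def execution_time_us(clocks: float, clock_mhz: float) -> float:
    """Time in microseconds to complete an operation taking `clocks` clock cycles."""
    if clock_mhz <= 0:
        raise ValueError("clock frequency must be positive")
    # One MHz is one million cycles per second, so clocks / MHz gives microseconds.
    return clocks / clock_mhz


if __name__ == "__main__":
    # Same device design, two clock rates: fewer microseconds at the higher rate.
    print(execution_time_us(1000, 7.15))
    print(execution_time_us(1000, 14.3))
    # Same clock rate, but an enhanced device needing fewer clocks for the job.
    print(execution_time_us(600, 7.15))
```
-
- Doubling the clock rate halves the time for a fixed clock count, but so
- does halving the clock count at a fixed clock rate, which is why the
- clock rating alone does not settle the comparison.
-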
- It should be noted that the term "bus cycle" is often confused with
- the concept of clockspeed, because of the term cycle. A bus cycle
- is related to the clock cycle rate, but not usually identical. Bus cycles
- are the time required for the CPU or other device to access data and
- complete an external bus operation on it. For example, the MC68000 CPU runs
- a 4 clock memory access cycle in general ( asynchronous memory transfers ),
- requiring 4 CPU clocks to access a given memory operand. This is assuming a
- no-wait state operation. Wait states are additional clock periods added to
- this cycle time in order for the data to be validly returned from the
- accessed device, and are placed in the bus cycle period when a device is
- incapable of responding to the data transfer request within the normal 4 clock
- period. This is only given as a particular example; other CPUs and
- architectures have differing bus cycle timing layouts (e.g., the MC68020,
- MC68030, etc... run normal 3 clock asynchronous bus cycles ).
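-
- The bus cycle arithmetic described above can be sketched as follows. The
- 4 clock and 3 clock base cycles come from the text. The function names,
- the clock rates and the wait state counts used in the example are
- illustrative assumptions, and the bandwidth figure is a theoretical peak
- that assumes back-to-back bus cycles with no other activity.
-
```python
def bus_cycle_ns(clock_mhz: float, base_clocks: int = 4, wait_states: int = 0) -> float:
    """Duration of one bus cycle in nanoseconds, including any wait states."""
    if clock_mhz <= 0:
        raise ValueError("clock frequency must be positive")
    clock_period_ns = 1000.0 / clock_mhz
    return (base_clocks + wait_states) * clock_period_ns


def peak_bandwidth_mb_s(clock_mhz: float, bus_width_bits: int,
                        base_clocks: int = 4, wait_states: int = 0) -> float:
    """Peak transfer rate in millions of bytes per second."""
    bytes_per_cycle = bus_width_bits / 8
    return bytes_per_cycle * clock_mhz / (base_clocks + wait_states)


if __name__ == "__main__":
    # A 4 clock cycle on a 16-bit bus at an example clock of 7.15 MHz.
    for waits in (0, 1, 2):
        print(waits, bus_cycle_ns(7.15, 4, waits), peak_bandwidth_mb_s(7.15, 16, 4, waits))
    # A 3 clock cycle on a 32-bit bus at an example clock of 25 MHz.
    print(bus_cycle_ns(25.0, 3), peak_bandwidth_mb_s(25.0, 32, 3))
```
-
- At 7.15 MHz a 4 clock cycle with no wait states lasts roughly 560 ns, and
- each added wait state lengthens it by one further clock period.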
-
- Putting it all together
-
- These by no means represent the entire array of factors
- involved in effectively evaluating performance issues in computer systems,
- but they are a good example assortment. These factors cannot be considered
- alone, but rather must be put together in order to get a whole picture.
- Moreover, the intended use of the system is important in weighting these
- factors according to which are more influential for any particular task.
- As an example, consider a system primarily intended for data processing
- tasks. One might expect that it should have a relatively fast CPU in order
- to work through the data at a reasonable pace. However, if the system's
- memory resources are such that they require the addition of many wait states
- into their accesses, then some of the effect of having a fast CPU is
- offset. Even further, what type of data is being processed? Is it of a
- floating-point variety? If so, then a very fast CPU might not
- necessarily be as effective as a moderately fast Floating Point coprocessor
- added to the system. Another important factor might be the amount of
- data which needs to be continuously accessed from storage devices. In
- the case where a great deal is being pulled from such devices, and they
- are slow in providing the data to the system, then no blazingly fast
- component elsewhere is going to be able to make that system setup rank
- high in its environment, as the data is only able to get to the 'fast'
- devices as fast as the 'slow' storage devices can provide it.
- Thus, care must be taken in evaluating any system's performance
- in order to properly take into account all factors involved. This includes
- determination of the usage of the system, and how individual components
- may affect this speed.
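-
- The storage example above amounts to saying that a chain of devices can
- sustain no more than its slowest member. A minimal sketch of that idea
- follows. The stage names and the rates (in kilobytes per second) are
- invented for illustration and are not measurements of any system.
-
```python
def bottleneck(stage_rates: dict) -> tuple:
    """Return (name, rate) of the slowest stage, which caps sustained throughput."""
    if not stage_rates:
        raise ValueError("at least one stage is required")
    slowest = min(stage_rates, key=stage_rates.get)
    return slowest, stage_rates[slowest]


if __name__ == "__main__":
    system = {"storage": 150, "memory": 3500, "cpu": 900}   # hypothetical KB/s
    print(bottleneck(system))

    # A CPU four times faster changes nothing while storage remains the limit.
    upgraded = dict(system, cpu=3600)
    print(bottleneck(upgraded))
```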
-
- Specific performance aspects to the Amiga
-
- Until now, the discussion of system performance has been left fairly
- general. But when dealing with performance issues, the system architecture
- in question must be looked at specifically to make any kind of fair
- determination. As AIBB is a utility for comparisons between various
- Amiga computer systems, this will be the focus of this section.
- The Amiga is a particularly interesting system as a whole to evaluate,
- as it is a very complex architecture for its relative price range. It
- includes aspects of multiprocessing within its design, as well as a
- multitude of different system layouts to consider. However, only subsystems
- relevant to the type of testing performed by AIBB will be considered here,
- these being the 'core' elements of the system, discounting I/O devices and
- external communications units. Of primary interest in this discussion is
- the system CPU, coprocessing devices, and memory subsystems of the Amiga.
-
- Layout of the Amiga
-
- Primary system processors.
-
- The Motorola M68000 series of microprocessors is utilized as the
- main CPU in all Amigas in production today. Various models of Amigas
- exist which utilize most of the main variants of this microprocessor family,
- with third-party add-on accelerator units providing an upgrade path for
- many systems originally born with earlier 68000 series CPUs. An overview
- of the various M68000 microprocessors and their main uses in Amigas is
- as follows:
-
- MC68000 : This was the CPU the Amiga was born with, utilized in
- the Amiga 1000 first, and subsequently in the A500 and
- A2000 stock system models. This CPU is characterized
- by a 24-bit address bus, giving it a 16 megabyte addressing
- capability, and a 16-bit data bus. In all stock Amiga
- models utilizing this CPU, the device is clocked at the
- clock rate of the system bus, approximately 7.15 MHz.
- Certain add-on accelerators do exist for this CPU, replacing
- the stock motherboard component with an add-on board which
- runs the CPU at 14.28 MHz, or in some accelerators, 16.0 MHz.
-
- MC68010 : This CPU has not seen wide use in Amiga systems, although
- its use does exist. The MC68010 is pin-compatible with
- the MC68000, allowing for simple drop-in replacement in
- any system utilizing the latter. Most systems do not see
- a tremendous performance boost while utilizing this CPU,
- however, as its improvements over the MC68000 are not
- a tremendous leap. The MC68010 does include various
- microcode enhancements over the MC68000, allowing for
- faster instruction execution in some circumstances, as
- well as the addition of a specialized transparent 'loop mode'
- which enhances CPU performance in tight program loops by
- allowing tight code loops to be latched into the CPU
- instruction prefetch queue where external bus cycles are not
- necessary for the loop code proper. As indicated earlier
- though, this CPU has not seen a great deal of use in Amiga
- systems, and is not of primary focus here.
-
- MC68020 : A major upgrade to the line, the MC68020 includes a great
- many advances over the previous members of this
- microprocessor line. The MC68020 is the first fully
- 32-bit capable microprocessor of the M68000 series,
- incorporating full 32-bit address and data buses.
- This microprocessor also incorporates a 256 byte
- instruction cache, in order to cache program code sections
- used often within a fast-access medium. The MC68020 is
- a large step above the MC68000 or MC68010, with an
- architecture more capable of handling larger demands upon
- its resources. The MC68020 is utilized in earlier
- accelerated Amiga systems, including as the main processing
- engine of the first A2500 series of machines which utilized
- the CBM A2620 accelerator unit. Many accelerators utilizing
- this CPU were produced by third-party manufacturers,
- including many low-cost units found in some A500 units, as
- well as the A2000 line. In most units, this CPU is clocked
- at approximately 14.28 MHz, with a few of the low-cost
- accelerators running the CPU at the 7.15 MHz system clock
- of the Amiga.
-
- MC68030 : Improvements were made to the MC68020, including the
- addition of a 256-byte data cache to complement the
- existing instruction cache, and the inclusion of an
- on-board memory management unit ( MMU ) in order to
- produce the MC68030. Additional improvements exist
- internally to this CPU over the MC68020 to give this
- CPU a stand against its generation of competing
- microprocessors. The MC68030 is found as the accelerated
- CPU of the later A2500 series of Amigas, as well as being
- the main processor of the Amiga 3000 line. This
- microprocessor has also been widely implemented in
- accelerator units for all models of Amigas and is used at
- a wide variety of clock frequencies, ranging from 16.0 MHz
- to 50.0 MHz.
-
- MC68040 : This microprocessor is a next-generation leap over the
- previous MC68030 model incorporating a great many advances
- over all previous models in this series of microprocessors.
- Both instruction and data caches found in the MC68030 are
- present, but their size is increased to 4K bytes each. In
- addition, the data cache of this processor now supports a
- 'CopyBack' mode of operation, providing for faster data
- access times. On-chip MMUs exist for both data and
- instruction pipelines within the CPU, and internal
- pipelining has been streamlined for increased performance.
- A subset Floating Point Unit (FPU) is also included on-chip
- for floating-point calculations. This CPU is found in
- only 25-28 MHz rated varieties at this writing, though this
- will likely change in the near future.
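-
- The addressing capability quoted for each CPU follows directly from the
- width of its address bus. The short sketch below reproduces the 16
- megabyte figure given for the MC68000 and applies the same arithmetic to
- the 32-bit address bus of the MC68020. The function name is illustrative.
-
```python
MEGABYTE = 1024 * 1024


def addressable_bytes(address_bits: int) -> int:
    """Number of distinct byte addresses reachable with the given address bus width."""
    if address_bits <= 0:
        raise ValueError("address bus width must be positive")
    return 2 ** address_bits


if __name__ == "__main__":
    for cpu, bits in (("MC68000", 24), ("MC68020", 32)):
        print(f"{cpu}: {bits}-bit addresses, "
              f"{addressable_bytes(bits) // MEGABYTE} megabytes")
```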
-
- There are several variants of these primary microprocessor models in
- production. The newest such variants are the Motorola "EC" series of
- M680x0 parts. The "EC" ( Embedded Controller ) series are characterized
- by changes from the standard part ranging from simple packaging to the
- removal of certain internal features. This latter option is what has
- been taken with the MC68EC030 and MC68EC040 parts. The MC68EC030 is
- characterized by the lack of an on-chip MMU. Other than this change, it
- functions identically to the standard MC68030. This microprocessor has
- found some use in the Amiga by "economy" accelerator units, as the MMU
- is not used often by applications programs, and these processors are less
- expensive than the standard parts in general. The MC68EC040 has not yet
- made its appearance, but purportedly will not include the on-board MMUs
- or FPU of the MC68040.
- At this point it is worth raising a topic of common interest
- with accelerated Amiga systems: that of asynchronous vs. synchronous
- accelerator designs.
- Synchronous designs were the first accelerators to appear for the Amiga.
- These designs are generally found in the MC68020 based accelerator units,
- and also in many of the low-cost MC68000-based accelerators. A synchronous
- design is one in which the devices present on the accelerator are clocked at
- a rate which is absolutely synchronized to the main system clock signals.
- For the A500 and A2000, this means the clock rate of such accelerators
- must be an integer multiple of the 7.15 MHz system clock rate. Because of the
- difficulties involved in maintaining synchronicity at high clock rates,
- generally these accelerator units are restricted to about 14.3 MHz, or double
- the system clock rate.
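-
- As a small illustration of the synchronous restriction, the sketch below
- lists the clock rates obtained as whole-number multiples of the system
- clock. It uses the approximate 7.15 MHz figure given in this text. The
- function name is illustrative, and whether a given multiple is practical
- in a real design is not something this arithmetic can show.
-
```python
def synchronous_clocks(max_multiplier: int, system_clock_mhz: float = 7.15) -> list:
    """Clock rates (MHz) that are whole-number multiples of the system clock."""
    if max_multiplier < 1:
        raise ValueError("multiplier must be at least 1")
    # Rounded to two decimals because the system clock figure is itself approximate.
    return [round(n * system_clock_mhz, 2) for n in range(1, max_multiplier + 1)]


if __name__ == "__main__":
    for n, mhz in enumerate(synchronous_clocks(4), start=1):
        print(f"{n}x system clock: {mhz} MHz")
```
-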
- Asynchronous designs, on the other hand, have no such restrictions.
- These units are somewhat more difficult to design, but in general the
- accelerator components may be operated at nearly any clock input, provided
- they are themselves capable of performing at the given frequency. This
- operation mode is what all MC68030-based accelerator designs for the A500
- and A2000 utilize, thus giving the wide range of clock rates found in these
- accelerators.
- It must be noted however that an ambiguity exists in the terms
- synchronous and asynchronous. The 680x0 microprocessor series is characterized
- by normally running asynchronous bus cycles. This simply means the processor
- initiates a read/write action, and it is up to the external device to terminate
- ( acknowledge ) the cycle, thus completing it. This behavior is NOT related
- to accelerator design as might be confused by the use of the same terms. In
- accelerator design terms, asynchronous and synchronous are designating how the
- accelerator state machine relates to the main system clock, and NOT how
- individual bus cycles are run by the CPU in general.
-
- Many accelerated Amigas also utilize an FPU for floating-point math
- intensive operations. The main FPUs in use by the various Amigas available,
- and their add-on accelerators, are manufactured by Motorola as well, either
- as separate coprocessor devices or, as in the case of the MC68040, embedded
- within the main CPU itself. An overview of the various FPUs in use is given
- below:
-
- MC68881 : This is a separate floating point coprocessor device
- which provides fast hardware-supported floating-point
- operations to any system software which supports its
- use. This unit does provide a certain level of
- concurrency, giving it the ability to perform certain
- instructions at the same time the main CPU is performing
- other operations. Support for this coprocessor is
- provided either by a built-in hardware microcode interface,
- found on the MC68020 and MC68030, or by software trap
- interfacing for the MC68000 and MC68010. The latter
- method is used in but a few early Amiga accelerator boards,
- while the preferred interface, that to the MC68020 or
- MC68030, is supported by virtually all accelerators
- utilizing those CPUs. The MC68881 may be run asynchronous
- to the CPU clock input, meaning it need not run at the
- same clockspeed as the CPU itself. Thus, a faster FPU
- may be used to give somewhat of a boost to floating-point
- operations. The MC68881 is found mostly running at
- clock frequencies ranging from 12-20 MHz.
-
- MC68882 : The successor to the MC68881, this unit incorporates
- the same interface and operations as the former device,
- but with certain internal enhancements. The microcode
- for many operations was optimized for faster response, and
- support for further instruction concurrency was added. In
- general this FPU will perform at about 1.5 times the speed
- of the MC68881 at the same clock input frequency. The
- MC68882 is primarily operated at clock rates of 12-50 MHz,
- depending on the accelerator or system utilizing it.
-
- MC68040 : The MC68040 CPU incorporates an FPU within the CPU itself.
- This FPU unit is a basic subset FPU to the MC68882,
- eliminating mainly the transcendental (sin, cos, etc...),
- and complex functions found in hardware microcode on the
- former. Nevertheless, the optimized nature of the existing
- FPU instructions provided allow for emulation of the
- eliminated functions in such a way as to allow for faster
- execution than the MC68882 for almost all operations,
- including those that are software emulated.
-
- The custom chips.
-
- In addition to these main processing units, the Amiga also incorporates
- a number of custom designed devices, known collectively as the Amiga's
- custom chips. Their primary purposes are varied, but they are generally
- in charge of such things as DMA access and arbitration to various memory
- areas, graphics generation and effects, and sound generation and effects.
- The custom chips utilized within the Amiga are:
-
- Agnus : Probably the most talked about custom chip, Agnus is found
- in a number of flavors, ranging from the original device,
- to the 'super' version found in the A3000. Aside from minor
- internal changes, the main differences between these
- different versions is the amount of memory they can directly
- access. Agnus is responsible for control of 25 system
- DMA channels, generation of all system clocks in the A500
- and A2000, and provides control and addressing for CHIP RAM,
- or the memory accessible by these custom chips. The size
- of this memory region is determined by the Agnus in use,
- and is either 512 KBytes, 1 Megabyte, or 2 Megabytes in
- range. As the custom chips are utilized for graphics and
- sound coprocessing tasks, all such data must be located in
- this CHIP RAM area. Agnus also contains within it what is
- referred to as a Bit Blitter. This internal device is
- a fast memory copy unit designed to move areas of memory
- as efficiently as possible, and has the capability to also
- perform specific manipulations of said data in the process.
- Finally, Agnus also contains Copper. Copper is the system's
- Display Synchronized Coprocessor. This device assists with
- screen refreshes and display building.
-
- Denise : The Denise custom chip is responsible for color generation,
- and display resolution modes. This chip also contains the
- eight hardware display sprite controllers used in the system.
-
- Paula : Paula is more or less a diverse device. It controls sound
- generation, contains the system floppy disk control
- circuitry, and I/O control circuitry for the disks as well
- as external control ports. Paula also contains an
- interrupt control system within for various system
- operations.
-
- Gary : This custom chip is not heard about much, primarily because
- its role is not as prominent as that of the other chips. In
- fact, Gary is not present on the A1000 series of Amigas.
- Gary is basically a 'glue' chip, responsible for smoothly
- holding things in line. It does handle a certain amount
- of the floppy drive signals, but is primarily utilized for
- bus control and address decoding along the system bus.
-
- The custom chips of the Amiga, and the coprocessors associated with
- them, are designed in such a way as to relieve the main CPU of many
- intensive tasks, such as graphics operations and sound generation. They
- support a concurrent level of operations, allowing the main CPU to continue
- with non-specific computing tasks while the custom chips handle their
- specific operations. The custom chips are capable of DMAing directly
- into the CHIP RAM area, freeing the CPU completely from task responsibility
- in those respects.
-
- Bus layout.
-
- The separation of operations, and the designation of CHIP RAM as a
- distinct memory area, is reinforced by the fact that the Amiga utilizes
- two buses. The CHIP RAM bus is a separate entity from
- the main bus utilized by the CPU and other devices, but is accessible by
- the CPU as well. The separation can be even greater, given the fact that
- the CHIP RAM bus can be decoupled from the CPU bus completely under
- certain circumstances.
- The CHIP RAM bus is the bus primarily utilized by the custom chips,
- with the CPU also having access to it on an interleaved cycle basis
- ( every other bus cycle can be a CPU access cycle ). The custom chips
- have priority in this domain, and this is where the idea of bus contention
- arises. If a great deal of bus activity is in progress by the custom chips,
- they may 'lock out' the CPU, forcing it to wait if it needs data or
- information from this bus and its memory area. This is where the touted
- 'FAST RAM' comes in.
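- The interleaved access scheme can be pictured with a small toy model,
- given here in Python. It is only a sketch: the function name and the
- idea of a single "stolen fraction" are illustrative assumptions, and the
- numbers are not measurements of real Amiga hardware.

```python
# Toy model of CHIP RAM bus sharing. All figures are illustrative
# assumptions, not measurements of real hardware.

def cpu_chip_bus_slots(total_cycles, stolen_fraction):
    """Return how many CHIP RAM bus cycles remain available to the CPU.

    total_cycles    -- number of CHIP RAM bus cycles considered
    stolen_fraction -- share (0.0 to 1.0) of the CPU's alternate cycles
                       that custom chip DMA takes over
    """
    if not 0.0 <= stolen_fraction <= 1.0:
        raise ValueError("stolen_fraction must be between 0.0 and 1.0")
    cpu_slots = total_cycles // 2              # every other cycle is a CPU slot
    stolen = int(cpu_slots * stolen_fraction)  # slots lost to the custom chips
    return cpu_slots - stolen


if __name__ == "__main__":
    for fraction in (0.0, 0.5, 1.0):
        free = cpu_chip_bus_slots(1000, fraction)
        print(f"DMA takes {fraction:.0%} of CPU slots: {free} cycles left")
```

- With no DMA pressure the CPU keeps all of its alternate cycles. As the
- stolen fraction approaches 1.0 the CPU is effectively locked out.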
- FAST RAM is memory not on the CHIP RAM bus, but rather on the main
- system CPU bus or expansion bus. This memory is not accessible by the
- custom chips, thus no contention for its access occurs between them and
- the CPU. Due to the separate nature of the buses, it is possible for the
- CPU to be processing instructions and data utilizing FAST RAM while the
- custom chips are concurrently operating in the CHIP RAM area. This
- concurrency allows the Amiga to perform a great variety of graphics
- operations at usable speeds, even though the CHIP RAM bus is not
- operated at a high clock rate.
- The CHIP RAM bus on all Amigas is operated at a clock frequency of
- approximately 7.15 MHz. On the A500 and A2000, this is the main system
- clock frequency. For those machines, the CHIP RAM bus is accessed via
- a 16-bit wide bus port, while on the later A3000 systems the bus port
- for external accesses is a full 32-bit interface, affording larger data
- transfer sizes at the same clock rate.
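- The effect of the port width can be illustrated with simple arithmetic,
- sketched below in Python. The function name is made up for this example,
- and the sketch assumes every access uses the full width of the port.

```python
def accesses_needed(byte_count, port_bits):
    """Bus accesses required to move byte_count bytes through a port that
    is port_bits wide, assuming each access uses the full port width."""
    port_bytes = port_bits // 8
    return -(-byte_count // port_bytes)   # ceiling division


if __name__ == "__main__":
    print(accesses_needed(1024, 16))   # 16-bit port ( A500 / A2000 )
    print(accesses_needed(1024, 32))   # 32-bit port ( A3000 )
```

- At the same clock rate, the 32-bit port needs half as many accesses as
- the 16-bit port to move the same block of data.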
- Because of bus contention, a system containing only CHIP RAM may very
- well have slower operations than one which contains FAST RAM as well. The
- FAST RAM equipped machine will be capable of having the CPU operate
- concurrently on information on that bus, while the custom chips operate on
- their tasks. The CHIP RAM only system is going to have circumstances where
- the CPU will be forced to wait to access data, as the custom chips may be
- utilizing the CHIP RAM bus heavily.
- FAST RAM in the A500 and A2000 series of machines can be located on
- many devices, from standard expansion card extenders which exist on the
- system expansion bus and operate at the system clock frequency, to other
- methods of RAM addition which have been devised that do not directly use
- the common Amiga expansion routes. FAST RAM located along the standard
- expansion backplane on these systems operates at the system bus clock
- rate ( 7.15 MHz ), and is accessed accordingly. On A3000 machines, FAST
- RAM is generally located on the system motherboard, and is accessed
- according to the system clock rate of those machines, which on stock models
- may be 16 or 25 MHz. It should be noted that some systems utilizing only
- 512K of CHIP RAM have in their memory lists a region of RAM which is called
- FAST, but in fact is on the same bus as CHIP RAM. This is generally the
- memory found on the A2000 motherboard for 512K CHIP RAM machines, or on the
- A501 expansion card for A500s. This memory may suffer from the same bus
- contention that CHIP RAM is exposed to, and thus it is generally advisable
- to be sure that program code is not put here unless it has to be ( i.e., if
- true FAST RAM exists, it should be given priority ). The program
- FastMemFirst, which is supplied by CBM, is meant to do just that.
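- The idea of giving true FAST RAM priority can be pictured with a small
- Python sketch. The region names, sizes, and the numeric "priority" field
- are hypothetical. This is not a description of how AmigaOS or
- FastMemFirst work internally, only of the ordering idea.

```python
# Hypothetical model: regions are tried in descending priority order and
# the first one with a large enough free block is used.

def pick_region(regions, needed):
    """Return the name of the first region, by descending priority, whose
    largest free block can hold `needed` bytes, or None if none fits."""
    ordered = sorted(regions, key=lambda region: region["priority"], reverse=True)
    for region in ordered:
        if region["largest_free"] >= needed:
            return region["name"]
    return None


def prioritize(regions, name, priority):
    """Return a copy of the region list with one region's priority changed."""
    return [dict(r, priority=priority) if r["name"] == name else dict(r)
            for r in regions]


# Made-up configuration: the RAM on the CHIP bus that is listed as FAST
# happens to be tried before the true FAST RAM.
regions = [
    {"name": "CHIP", "priority": -10, "largest_free": 300_000},
    {"name": "FAST on CHIP bus", "priority": 5, "largest_free": 400_000},
    {"name": "true FAST", "priority": 0, "largest_free": 2_000_000},
]

if __name__ == "__main__":
    print(pick_region(regions, 200_000))
    print(pick_region(prioritize(regions, "true FAST", 10), 200_000))
```

- In this made-up configuration the request lands in the slower region
- until the true FAST RAM is moved to the front of the ordering.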
- FAST RAM located within the domain of an accelerator is not limited to
- the system bus clock rate. It may be operated at such, but in general can
- be accessed at a quite different clock rate, usually the accelerator's
- CPU clock rate. Systems utilizing accelerators benefit from this setup, as
- an accelerator does not change the system clock rate, and therefore in order
- for an accelerator's CPU to use system resources, it has to synchronize with
- the system clock, and may even have to contend with a narrower bus interface.
- Such is often the case on the A500 or A2000 when utilizing accelerators
- which employ the MC68020 or MC68030 microprocessors, which are best suited
- for 32-bit bus ports. Since those processors take a performance hit when
- accessing narrower bus ports, as well as a hit from the possibly slower
- clock rate of the system bus, accelerators are often equipped with their
- own RAM resources, which are designed to operate at the CPU clock frequency
- and utilize a more efficient bus port size ( 32-bit ). The case with the
- A3000 is slightly different.
- The A3000 utilizes a 32-bit bus across its memory resources already,
- therefore this is not a problem with accelerators for those machines.
- However, the bus on the A3000 is clocked at 16 or 25 MHz ( depending on
- the model ), and if a faster CPU is used in an accelerator it may be
- advantageous for the unit to contain its own RAM resources in order to
- keep access delays to a minimum. The A3000 does include provisions for
- an accelerator to supply its own clock signal to the motherboard, but
- as of this writing, this has not been employed by any devices.
-
- Summary and overview.
-
- It can be seen from all this that there is a great deal to be considered
- when trying to make a comparison of system performance levels. A great
- many factors come into play when trying to determine just what system is
- best and quickest for the task at hand. Various factors can determine how
- efficient an accelerator is on a particular system, or how efficient a
- system is in general. System interface efficiency, accelerator or general
- system design, and intended use all play a part in determining which setup
- is the 'winner' in the speed race. Indeed, there may not be a winner,
- except in a particular task category, and this must always be remembered.
- No benchmark or performance test can possibly hope to test all of these
- categories, and the others which also play roles. Thus, it is necessary
- to utilize data obtained from any set of benchmarks as only a portion of
- the picture to be analyzed, and not as a rock-solid performance indication.
- System design has improved to the point where many benchmarks can be fooled
- into giving higher performance measures than would be found in any typical
- application. As benchmarks are typically small pieces of code, they must
- be evaluated as such. They can indeed give clues as to the performance
- level of a system, but certainly not a definitive answer.
-
-
- OVERVIEW OF AIBB
-
- With that introduction complete, it is time to turn to the topic of
- this entire discussion. AIBB is a program primarily
- designed to test various aspects of system performance at the CPU and
- accompanying device level. It does not test such things as I/O efficiency
- and storage media data retrieval and placement efficiency ( storage I/O ).
- The tests contained within AIBB by no means give a complete picture of any
- system's performance level, but do provide some basic information and
- comparison data for a variety of systems.
- AIBB is divided into a number of sections. Several are simply
- informative in nature, and designed to give a better picture of the system
- conditions during the actual testing phases. Other portions of the program
- allow for a certain measure of system control, to give the ability to
- somewhat modify parameters under which tests are performed. It is important
- to try to pay attention to the parameters and information given by AIBB,
- as they may in turn give important clues as to the nature of the test
- results reported.
- AIBB is set up to allow a user to perform a series of tests on the host
- system, and compare those results against a series of other systems.
- Comparison data is given in both graphical and numerical form. AIBB also
- allows the entire series of tests to be performed, and the results and
- system state stored as a "load module" which may later be loaded and used
- as one of the comparison systems against which a possibly different host
- will be compared. Tests may be manipulated by code type and system
- situation in order to allow a better picture of system performance to be
- formed.
-
- System Requirements.
-
- AIBB may be run on any Amiga system utilizing AmigaOS 1.3 or greater,
- but it should be noted that the tests performed are designed primarily for
- accelerated systems or fast systems in general. Therefore, tests may be
- exceedingly long on Amigas utilizing slower CPU units, and the general
- speed of the program may seem a bit slow on unaccelerated platforms.
- Users of MC68040 based systems must be utilizing AmigaOS 2.0 or
- greater in order to run AIBB. Modified versions of AmigaOS 1.3 do exist
- which are patched to somewhat deal with the problems of that OS version
- and the 68040, but as per CBM's official stance, this is not a supported
- method of utilizing the MC68040 as a system processor. For this reason,
- AIBB will abort if it detects a 68040 and the system OS version is less
- than 2.0.
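- The startup rules described so far can be summarized in a short Python
- sketch. The function name, the way the CPU and OS version are passed in,
- and the message strings are all assumptions made for illustration. AIBB's
- actual detection code is not shown in this document.

```python
def check_requirements(cpu, os_version):
    """Apply the stated rules to a CPU name such as "68030" and an OS
    version given as a (major, minor) tuple. Returns (ok, message)."""
    if os_version < (1, 3):
        return False, "AmigaOS 1.3 or greater is required"
    if cpu == "68040" and os_version < (2, 0):
        return False, "68040 systems require AmigaOS 2.0 or greater"
    return True, "ok"


if __name__ == "__main__":
    for cpu, version in (("68000", (1, 3)), ("68040", (1, 3)), ("68040", (2, 0))):
        print(cpu, version, check_requirements(cpu, version))
```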
- AmigaOS 1.3 users with accelerators must be sure to be using the latest
- SetPatch routines for those OS versions ( SetPatch v1.34 ). SetPatch
- corrects a problem with FPU code under those OS versions, and is necessary
- for proper operation of AIBB. AmigaOS 2.0x also is shipped with a SetPatch
- routine which should be executed in the Startup-Sequence to assure any
- future OS bug fixes and corrections will be applied.
- This program has no absolute requirements other
- than those previously mentioned in order to be operated, but it does have
- some suggested configurations. In order to utilize the program's file
- functions, AIBB must be able to find one of the following shared libraries
- in the libs: directory on your system disk:
-
- 1. kd_freq.library ( library version 3.0 or greater )
- 2. req.library ( library version 2.0 or greater )
- 3. asl.library ( AmigaOS 2.0 systems only )
- 4. reqtools.library
-
- AIBB will search for these libraries in this order, and utilize the first
- one found. The library is needed primarily for the file requester
- functions within AIBB. AIBB will still operate without finding one of
- these libraries, but it will block access to the file-requesting functions
- it normally provides.
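- The search order above amounts to a "first match wins" lookup, sketched
- here in Python. The function name, the dictionary of installed
- libraries, and the version numbers used in the example are hypothetical.
- Only the library names, their order, and the stated minimum versions
- come from the list above.

```python
# (library name, minimum version, requires AmigaOS 2.0)
CANDIDATES = [
    ("kd_freq.library", 3, False),
    ("req.library", 2, False),
    ("asl.library", 0, True),
    ("reqtools.library", 0, False),
]


def pick_requester_library(installed, os_major):
    """Return the first usable library name, or None if there is none.

    installed -- dict mapping library name to its version number
    os_major  -- major AmigaOS version of the host ( 1 or 2 )
    """
    for name, min_version, needs_os2 in CANDIDATES:
        if name not in installed:
            continue
        if installed[name] < min_version:
            continue
        if needs_os2 and os_major < 2:
            continue
        return name
    return None   # file requester functions would be blocked


if __name__ == "__main__":
    print(pick_requester_library({"req.library": 2, "reqtools.library": 38}, 1))
```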
- This will be the last version of AIBB to include support for AmigaOS
- versions below 2.0. At this time, more effort is being placed into
- compatibility with later AmigaOS generations, and this will be the mode
- of support emphasized.
-
- Getting Started.
-
- AIBB may be started from either the CLI/Shell, or WorkBench. If the
- latter method is used, it is imperative that the icon used ( if not the
- supplied one ) has its STACK value set to 20000. AIBB invocations from
- the CLI/Shell have no special requirements or stack settings as AIBB will
- perform the necessary set-up in this environment. It is recommended that
- careful attention be paid to the existing system memory resources before
- starting AIBB. AIBB is quite large, and if you wish it and its test code
- to be loaded into a certain memory medium ( generally a fast medium if
- possible ), then enough contiguous memory must exist in that memory region.
- AIBB will give information as to where exactly its code is located, but
- if you are interested in loading AIBB in a certain region, this must be
- taken into account BEFORE starting the program.
- AIBB will not start if it detects the presence of the debugging tool
- "Enforcer" when first invoked. This is because the way AIBB examines the
- system causes Enforcer to complain loudly, although the "hits" it detects
- are not problematic, and are non-damaging. Enforcer was designed
- primarily as a debugging tool. It should NOT be used as a "protective
- measure" for everyday use. As this example illustrates, Enforcer cannot
- be sure that what it reports as a problem is actually
- a problem. However, for those who desire, Enforcer MAY be run once AIBB is
- up and running.
- Upon starting AIBB, a few moments may be needed by the program while
- it evaluates the system it is being operated on, the exact time depending
- on the relative speed of the host system in question. A screen displaying
- a message to that effect will be shown while this is in progress. Following
- this evaluation, you will be presented with AIBB's main program screen.
-
- Section 1: The Main Screen.
-
- AIBB's primary screen consists of several informational areas designed
- to provide information about test operations and basic system information.
- These areas are divided up as follows:
-
- Performance Graph
- The performance graph is a bar graph display of the comparisons
- made after each test is performed. Ratings are given in reference
- to the base machine for comparisons, with the highest performing
- system having its bar displayed in RED, while all others are
- in YELLOW. Note that although numerically two machines may have
- the same results out to 2 decimal places, AIBB may still show one
- in red. This is due to rounding, and the fact that the one
- highlighted machine does in fact have a higher rating if a few
- more decimal places were shown numerically. However, such small
- quantities should not be taken literally, as far too many variables
- exist to use such small values in accurate comparisons.
-
- Test Result/Information
- This area provides several pieces of data. First, it gives the
- name of the test whose information is currently being
- displayed. The numerical result of the test performed is given
- here, as well as the memory node reference number where the test
- code, and possibly any test data is located. To reference these
- node numbers, please see the section on the "System Information
- Display".
-
- Base Machine Indication
- Below the Test Result/Information area is a small reference which
- lists the current comparison system being utilized as the base for
- all comparisons performed.
-
- Comparison Information
- This section provides several key pieces of information about test
- performance. It gives the numerical ratings of all systems
- utilizing the base machine as a reference. These values are the
- same as those used to generate the performance graph. In addition,
- the type of code used for the host system when performing the test,
- as well as the code type of the comparison systems' result
- references used for ratings calculations, is displayed.
-
- Basic Information
- Located just below the performance graph, this area provides key
- pieces of information about the current state of the host system.
- The system CPU type, FPU type, and MMU type in use are displayed,
- as well as the current operational status of the MMU, and any
- CPU caches which may exist. Also displayed is the approximate
- CPU and FPU clock speed ratings, as calculated when AIBB first
- evaluated the host system on startup.
-
- Test Activation Gadgets
- These are located in the lower right-hand corner of the screen and
- serve several purposes. Normally, they are utilized to start a
- test, but this is dependent upon the mode of operation AIBB is
- currently in. See the section on "Review Mode" for further
- information of this nature. Activation of a gadget in standard
- mode starts a test with the given code parameters and general
- settings, as detailed in the appropriate sections later. Tests
- are divided into two groups: "Standard" and Floating-Point.
- Standard test types are more general to the system, and represent
- code more often found in operational situations. Floating-Point
- tests utilize a great deal of floating-point math to test the
- system's performance across that domain. Standard tests are
- denoted in AQUA lettering, while floating-point specific tests
- are given in YELLOW lettering. See the test descriptions for
- more information on the tests available from AIBB.
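-
- The Performance Graph behaviour described above, where two systems show
- the same rating to 2 decimal places yet only one is highlighted, can be
- reproduced with a short Python sketch. The system names and raw results
- are made up, and the sketch assumes that a rating is the raw result
- divided by the base system's result, with higher values being better.
- The document does not state AIBB's exact formula.

```python
def ratings(results, base):
    """Express every raw result relative to the base system's result."""
    base_value = results[base]
    return {name: value / base_value for name, value in results.items()}


def highlighted(rated):
    """Name of the system whose bar would be drawn in RED: the highest
    rating at full precision, not at displayed precision."""
    return max(rated, key=rated.get)


# Made-up raw results; "Base" is the comparison base system.
results = {"Base": 1000.0, "System A": 2151.0, "System B": 2149.0}
rated = ratings(results, "Base")

if __name__ == "__main__":
    for name, value in rated.items():
        print(f"{name}: {value:.2f}")
    print("highlighted:", highlighted(rated))
```

- Both non-base systems display as 2.15, but the comparison is made on
- the unrounded values, so only one of them is highlighted.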
-
- Main Screen Menus.
-
- AIBB's primary screen has attached to it a number of menu items, which
- give even more options and control over program operation. Those operations
- are described below, in the order of the menus as they appear on the screen.
-
- Menu 1: General
-
- About AIBB
- This option presents a requester giving credits and information
- about this version of AIBB.
-
- Enter/Exit Help Mode
- Toggling this menu item enters or exits AIBB from Help Mode.
- While in Help Mode, choosing menu options and screen gadgets
- will result in a requester giving information about the item
- just selected. After a help requester for a certain item has
- been displayed, it will not be displayed again during that
- invocation of the help option. Toggling help off and
- reactivating it will result in the requesters being displayed
- again for all items.
-
- Each requester for help shown will allow either the continuation
- of the selected function, or will allow it to be cancelled in
- the event that it was selected only for determining what its
- action would be.
-
- Load Module Prefs
- This version of AIBB allows systems other than those contained
- internally to be used for comparisons against
- the host system. This menu item will bring up a requester-like
- arrangement in which the paths to load modules to be
- used in place of the internal defaults can be specified. To
- replace an internal module at startup for comparisons, simply
- enter the full path name to the alternate load module in the
- respective entry in this requester. Leaving an entry blank
- tells AIBB to use its internal default for that system.
- Note that this configuration will take effect when AIBB is
- next started, and that the next menu item, "Save Configuration",
- as detailed below, must be selected to save the choices made
- here.
-
- Save Configuration
- This saves the current state of AIBB's menu item selections, as
- well as the current order of the comparison machines as they
- are placed. For more information on this, see the
- section on loading new comparison modules in place of the default
- systems within AIBB. AIBB currently saves this data to a file
- called "aibb.prefs", which may be located in an assigned
- directory called AIBB:, or your system S: directory. This
- file will be searched for, in that order, when AIBB is first
- invoked, and the values contained within will set AIBB's
- startup options. If AIBB cannot locate a preferences
- configuration file, it will notify you and use internal
- default values.
-
- QUIT
- This item forces termination of AIBB.
-
- Menu 2: Systems
-
- Systems Information
- Selecting this menu item produces a submenu which lists the
- current comparison machines, as well as a selection for the
- host system. Selection of one of these submenus moves AIBB
- to a systems information display, which will show pertinent
- information about that system's state during test operations.
- For a complete description of the data shown here, see the
- section "The System Information Display".
-
- AIBB Task Priority
- Another submenu-endowed item, this selection allows for the
- selection of AIBB's task priority. This is primarily for
- running tests while still allowing multitasking to occur,
- while examining the effects of different task priority levels.
- For information on disabling multitasking during test
- operations, see the "Disable Multitasking" entry under the
- Test Options menu descriptions.
-
- The next entries in this menu will be available only if the
- host system's CPU type has support for their use. Otherwise, they
- will be disabled from operation. These options are provided to
- allow some flexibility in global systems operations, and to provide
- a method of determining the effect of use or non-use of these
- CPU-related items on system performance.
-
- Switch Instruction Cache
- Toggles the activation or disabling of the CPU instruction
- cache. The state of this cache will be reflected in the
- Basic Information area of the display.
-
- Switch Data Cache
- Toggles the CPU data cache operation state. The status of
- this CPU cache is also reflected in the Basic Information area
- of the display.
-
- The cache-influencing items below toggle certain modes associated with
- the caches in question. A lot of confusion exists about such modes,
- and the MC680x0 cache BURST mode ( supported on the MC68030 and
- MC68040 ) is often not understood. BURST mode operations are a special
- form of cache filling ( updating the contents of the cache ) where an
- entire 'line' of cache data may be filled sequentially and faster than
- the single-entry mode of cache filling. A cache 'line' in this case is
- a series of 4 longwords ( 32 bits each ) arranged simplistically as:
-
- entry: 1 2 3 4
-
- line 1 ---- ---- ---- ----
- line 2 ---- ---- ---- ----
-
- where each entry is one longword. The MC68020 and MC68030 utilize
- cache sizes of 16 lines, giving 256 bytes of cache storage. The
- MC68040 increases this to give a total of 4K of cache space for each of
- data and instruction cache.
- BURST mode is essentially a compromise in performance. Average-
- case CPU performance is enhanced at the cost of worst-case performance.
- The latter is true because during BURST mode operations, the CPU bus
- controller is committed to a memory fetch sequence for a longer period
- of time than with single-entry mode. The mode enhances average and
- best case performance by allowing the CPU to sequentially fetch 3
- additional longwords from memory faster than normally done by the usual
- asynchronous single-fetch bus cycle. Once it has fetched the first
- longword, the next 3 are clocked into the cache line utilizing only 2
- clocks per fetch, thus filling one cache 'line' in 9 clocks ( assuming
- a zero-wait state fetch ) rather than 15 clocks. The theory behind
- this is that the data/operands sequentially surrounding the initial
- fetch will most likely be needed soon in any case, and placing them in
- the cache leads to their eventual faster access.
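- The cache size and burst fill figures quoted above can be checked with a
- little arithmetic, sketched here in Python. The names are illustrative.
- The 3-clock cost of the first fetch is not stated directly in the text;
- it is inferred from the 9-clock total and the 2 clocks quoted for each
- of the 3 following fetches.

```python
LONGWORD_BYTES = 4       # one longword is 32 bits
LONGWORDS_PER_LINE = 4   # one cache 'line' holds 4 longwords


def cache_bytes(lines):
    """Total storage of a cache with the given number of lines."""
    return lines * LONGWORDS_PER_LINE * LONGWORD_BYTES


def burst_line_fill_clocks(first_fetch_clocks, burst_clocks=2):
    """Clocks to fill one line: the first fetch, then the remaining
    longwords at burst_clocks each."""
    return first_fetch_clocks + (LONGWORDS_PER_LINE - 1) * burst_clocks


if __name__ == "__main__":
    print(cache_bytes(16))             # 16 lines, as on the MC68030
    print(burst_line_fill_clocks(3))   # assumed 3-clock first fetch
```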
- BURST mode operations are not universally applicable to all
- systems however. Generally, the memory controller on the system ( or
- particular memory board ) must be capable of supporting BURST mode
- operations, or the BURST request by the CPU will not be fulfilled. In
- systems not capable of these modes, activating them will not be
- detrimental, but will go unnoticed in performance terms. The CPU will
- request BURST fills when it deems appropriate, but the memory
- controller will not acknowledge the request and thus simply force the
- CPU to do single-entry fetches as in standard operation.
- AIBB allows toggling of the following cache modes on 68030 CPUs
- ( The MC68040 implicitly requests BURST mode transfers, and does not
- have a cache BURST disable option except in hardware response from the
- memory device, thus the cache BURST toggle options are not available
- for 68040 systems...this is evident by the cache status display showing
- this mode to be ON at all times on these platforms ):
-
- Switch I-Cache Burst Mode
- This activates or deactivates the CPU instruction cache BURST
- mode of operation. Note that not all memory subsystems will
- support the utilization of the cache mode, but the state of
- this cache setting will not harm the system either way.
-
- Switch D-Cache Burst Mode
- Works the same as Switch I-Cache Burst Mode, but operates on
- the CPU data cache BURST mode setting. Note again that
- not all memory configurations support this mode, and setting
- this cache operation mode may not have any effect.
-
- Switch 040 Copyback Mode
- The MC68040 data cache is capable of being operated in a
- special mode known as Copyback, and this item will toggle its
- setting. Copyback mode does not immediately write data through
- to main memory during write operations, but only to the cache,
- until such time as the cache, or particular entries/lines in it, are
- 'flushed'. This increases performance on data writes as main
- memory bus cycles do not have to be run for cache hit situations.
- However, some applications may have difficulty with this mode, as
- it is possible for memory to be altered without the cache being
- updated properly. Some DMA devices are affected by this, and
- unmodified AmigaOS versions 1.3 and below
- are not compatible with Copyback mode. Please refer to
- your accelerator or system documentation before utilizing Copyback
- mode.
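-
- The difference between writing through to memory and Copyback operation
- can be illustrated with a toy model in Python. The class below is
- purely a sketch: it assumes every write hits the cache and ignores line
- sizes, replacement, and everything else a real MC68040 cache does.

```python
class ToyCache:
    """Minimal cache model that only counts main memory bus writes."""

    def __init__(self, copyback):
        self.copyback = copyback
        self.lines = {}       # address -> value held in the cache
        self.dirty = set()    # addresses written but not yet in memory
        self.memory = {}      # address -> value held in main memory
        self.bus_writes = 0   # main memory write cycles performed

    def write(self, address, value):
        self.lines[address] = value
        if self.copyback:
            self.dirty.add(address)             # memory is updated later
        else:
            self._write_memory(address, value)  # write through at once

    def flush(self):
        """Push all pending ( dirty ) entries out to main memory."""
        for address in sorted(self.dirty):
            self._write_memory(address, self.lines[address])
        self.dirty.clear()

    def _write_memory(self, address, value):
        self.memory[address] = value
        self.bus_writes += 1


if __name__ == "__main__":
    for mode in (False, True):
        cache = ToyCache(copyback=mode)
        for value in range(100):
            cache.write(0x1000, value)
        print("copyback" if mode else "write-through",
              "bus writes before flush:", cache.bus_writes)
```

- In the Copyback case, main memory does not hold the new value until the
- flush occurs. That stale window is what can affect a DMA device reading
- the same memory.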
-
- Menu 3: Test Options
-
- Disable Multitasking
- When this item is selected, it indicates AIBB should perform
- all tests in such a way as to disable all system multitasking
- during the run of any test. This allows a figure to be
- generated which indicates the system performance FOR THAT TEST
- more accurately, as there is no task context switching during
- the test runs. Note that all comparison system figures are
- generated with this option enabled, so this should be selected
- in order to compare the systems on an even par. When this
- item is utilized, the previously mentioned ability to set AIBB's
- task priority will have no impact on test performance, as no
- task switching will occur, and thus the task priority level
- becomes meaningless.
-
- It should be noted that when using this option, it is a good
- idea NOT to be running much in the background. The Amiga's
- operating system is a near-real-time setup, requiring in many
- cases fast response to system conditions. Use of this option
- can affect certain other operations adversely, most notably
- that of serial communications and the like.
-
- Screen Overlay
- Using this option results in AIBB putting a one bitplane
- ( two color ) low-resolution screen over its main screen
- during every test. AIBB's normal screen is a high-resolution
- 4 bitplane ( 16 color ) screen, and on CHIP RAM only systems,
- and for some tests even on FAST RAM equipped systems this may
- result in a great deal of bus contention on the CHIP RAM
- bus. Subsequently, performance levels may be adversely
- affected by such bus contention. The use of this option
- attempts to alleviate some of this problem by utilizing a
- screen overlay which minimizes bus contention on the CHIP RAM
- bus, by limiting the required DMA activity by the custom chips
- to display it while it is the topmost screen. Again, all
- comparison data for the other systems is obtained with this
- option enabled, so in order to keep comparisons on par this
- option should be enabled, as it is by default.
- Note that for graphics-related tests this option will not be
- activated as it would be detrimental to what those tests are
- indeed trying to analyze. It is advised that if this option
- is enabled while multitasking is permitted that screens not
- be shuffled while a test is in progress. The uppermost screen
- is the cause of the CHIP RAM bus display DMA effects, and to
- shuffle to another screen during a test could nullify the
- advantage of using this option.
-
- Set Comparison Base
- This item contains the names of the comparison systems in a
- submenu area. Selecting one of these submenu items sets
- the current comparison base system to that machine. The
- comparison base is the system utilized as the 'base' value for
- test results when computing performance ratings. All
- ratings shown are given relative to the base system,
- with a value of 1.0 for a system indicating performance
- equal to the base system.
-
- The next items in this menu allow code type options to be selected
- for the host system ( and subsequently the tests which are
- performed ), and the comparison systems. Selection of a code option
- for the host system causes AIBB to perform any tests utilizing that
- option or options. Selections under the comparison systems result
- in AIBB using the figures for that code type ( previously obtained
- when the comparison data was generated ) when making comparisons.
- Note that not all options will be available. This is dependent on
- the system capabilities. Each selection contains the following
- options within a submenu. The first options in each submenu select
- the type of "standard", or non-floating-point specific,
- code used. The selections here affect the general code type
- utilized for all test types.
-
- Standard 68000 Code
- Having this item selected sets the code type to code which is
- compatible with all MC680x0 series microprocessors. Note
- that this means no advantage is taken of the capabilities or
- code optimizations available on later-generation
- microprocessors of this series, but it is a good base selection
- as it can be utilized on all existing Amiga systems.
-
- 68020+ Code
- This item selects code compatible with later generation
- MC680x0 series processors. It will not be compatible under
- most circumstances with earlier ( MC68000 or MC68010 ) based
- systems, but will take advantage of some of the more advanced
- capabilities of these later processors of the series.
-
- The next options available select the floating-point code type
- utilized. They affect only tests which utilize floating-point
- math.
-
- Standard Math Code
- Using this option sets the code type to use software emulation
- of floating point routines. This is compatible with all
- Amiga systems in use, as it is not hardware specific.
-
- In-Line Coprocessor Code
- This option sets the test code type to that which uses in-line
- coprocessor instructions for floating point operations. As not
- all systems will have a coprocessor available, neither this
- option nor the code type itself is usable on all
- systems.
-
- A note must be made here about the code types available. The
- math-specific code selections will ONLY affect tests which utilize
- floating point routines, while the "standard" code selections will
- be effective across all tests, INCLUDING floating-point tests.
- Again, not all code options will be available on all systems, due
- to the various configurations which may exist, and the lack of
- certain hardware therein.
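-
- Which code type options end up available follows from the host
- hardware, as the entries above describe. A Python sketch of that rule is
- given below. The function name and its arguments are hypothetical, and
- the document does not show how AIBB itself makes this decision.

```python
def available_code_types(cpu, has_fpu):
    """Return (standard options, math options) for a host system.

    cpu     -- CPU name such as "68000" or "68030"
    has_fpu -- True if a floating-point coprocessor is present
    """
    standard = ["Standard 68000 Code"]          # runs on every MC680x0
    if cpu not in ("68000", "68010"):
        standard.append("68020+ Code")          # needs a later processor
    math = ["Standard Math Code"]               # software floating point
    if has_fpu:
        math.append("In-Line Coprocessor Code")
    return standard, math


if __name__ == "__main__":
    print(available_code_types("68000", False))
    print(available_code_types("68030", True))
```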
-
- Menu 4: Special
-
- Enter/Exit Review Mode
- Entering Review Mode gives a method for reviewing previously
- performed tests and their comparisons. When this mode is
- active, selecting a test gadget, or setting a comparison option
- ( code type, etc ), will result in the display of the results
- last obtained for that test. If no test results for the host
- system are available, the information for the comparison
- systems currently in use will be shown, and the host system
- data will be marked with 'N/A', indicating the
- information is not available. The ability to display the
- comparison system data without running the actual test on the
- host system is provided to allow a quick view of the performance
- of said comparison machines before running the test(s) on the
- host.
- Code type options may be manipulated here, and if a test
- result is available for those settings, it will be displayed.
- For example, if you were to have the Matrix test as the current
- test you are viewing, and you want to see the results of the
- test under 68020+ code, selecting that item under the
- "This Machine" code type selection will show the Matrix test
- results utilizing this code type ( if they were previously
- performed, making the data available ).
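-
- The lookup behaviour described above can be pictured with a small
- sketch. The table layout, the function name and the sample figure are
- illustrative assumptions only; they are not AIBB's internal format
- or measured data.
-
```python
# Hypothetical results table keyed by (test name, code type).
# The structure and the sample number are made up for illustration.
results = {
    ("Matrix", "Standard"): 12.5,  # invented run time in seconds
}

def review(test, code_type):
    """Return the stored result for a test/code-type pair, or 'N/A'."""
    return results.get((test, code_type), "N/A")

print(review("Matrix", "Standard"))  # a result was stored for this pair
print(review("Matrix", "68020+"))    # nothing stored, so 'N/A' is shown
```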
-
- Start/Stop Log File
- AIBB has the ability to keep a "log file" of test
- activities. This option allows you to start this logging
- operation, or stop it once in progress. The log files contain
- basic information, in text form, about each test as it is
- performed, as well as essential system information.
- Starting a log file involves selecting a file name to which
- AIBB will save this data. If the file is an existing one, AIBB
- will check for the words "AIBBLogFile" at the start of the file.
- If this is not found, you will be warned and given the option of
- aborting the use of this file as a log file. Heed this...AIBB
- WILL write into any file if told it is acceptable, including
- executable load files. This checking is done in order to
- prevent accidental file damage or destruction.
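-
- The kind of check described above can be sketched as follows. This is
- a minimal illustration, assuming the marker is the literal text
- "AIBBLogFile" at the very first byte of the file; the function names
- and the sample file contents are hypothetical, and AIBB's actual
- routine may differ.
-
```python
import os
import tempfile

MARKER = b"AIBBLogFile"  # the words the documentation says AIBB looks for

def looks_like_aibb_log(path):
    """True if the file begins with the AIBB log marker."""
    with open(path, "rb") as f:
        return f.read(len(MARKER)) == MARKER

def safe_to_use_as_log(path):
    """A new file is always fine; an existing one must carry the marker."""
    return not os.path.exists(path) or looks_like_aibb_log(path)

# Small demonstration using temporary files.
with tempfile.TemporaryDirectory() as d:
    good = os.path.join(d, "run.log")
    bad = os.path.join(d, "program")
    with open(good, "wb") as f:
        f.write(b"AIBBLogFile\nsample entry\n")
    with open(bad, "wb") as f:
        f.write(b"\x00\x00\x03\xf3")  # arbitrary non-log bytes
    print(safe_to_use_as_log(good), safe_to_use_as_log(bad))
```
-
- A file which fails this check is exactly the case in which the
- warning requester would appear.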
-
- All Tests | Make Module
- This is a rather important option. As indicated earlier,
- AIBB has the ability to create a "load module" of comparison
- results in order to utilize them later in other runs as a
- comparison system. This selection allows the generation of just
- such a load module. Selecting this menu item will result in a
- requester being displayed which warns that this option may
- take considerable time, and that multitasking will not be
- functional during its operation. At this point, the operation
- may be cancelled if it is not desired at that time. When
- performing all the tests, the options "Disable Multitasking",
- and "Screen Overlay" previously mentioned are automatically
- enabled in order to give consistency to all such generated
- modules which may be utilized in AIBB. Using this option,
- all tests are performed in all possible code combinations
- available on the host system configuration, in order that
- later comparisons will have as much data to go by as possible.
- Upon completion of all the tests, a requester will be
- displayed informing you if the tests completed successfully, and
- asking if you wish to create such a load module at that time.
- If you choose to do so, a file requester will appear asking for
- the name of the file to save the module under. Following this,
- a smaller requester will appear asking for the name to use with
- the module under the graph display for it. This defaults to
- the first 8 characters of the filename, but may be changed as
- desired. Note that only names of up to 8 characters are
- supported at this time.
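-
- The naming rule above amounts to a simple truncation. The sketch
- below assumes the bare file name ( without any path ) is what gets
- truncated; the function names are hypothetical.
-
```python
MAX_NAME = 8  # longest graph label the documentation says is supported

def default_graph_name(filename):
    """Default label: the first 8 characters of the file name."""
    return filename[:MAX_NAME]

def accept_graph_name(name):
    """Accept only non-empty labels within the supported length."""
    return 0 < len(name) <= MAX_NAME

print(default_graph_name("A3000-25.module"))
print(accept_graph_name("MyAccelerated3000"))
```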
- If "Cancel" is selected in reference to the module creation
- requester, AIBB will go back to its normal operations, and
- other tests may be performed. In this manner, it is possible
- to use this option simply to perform all possible test
- combinations for later review. If you wish to review the tests
- done before making a module, this is possible by not saving the
- module at the time, and entering "Review Mode" upon finishing.
- If no further tests are performed ( which would invalidate the
- consistency of the module's data ), then selecting "All Tests |
- Make Module" again after reviewing the data will result in a
- requester informing you that the data for a module is still
- valid and asking whether you wish to create one now.
- It should be noted that comparison options and settings are
- not in effect during the performance of the tests with this
- option. AIBB will merely do all tests with all code types
- possible, and keep the results ( if desired ). Comparison
- options are only effective ( and necessary ) when viewing
- information present, and are not important when generating
- a load module.
-
- Section 2: System Information Display
-
- The System Information Display is a separate display which is brought
- up when the Main Display menu option "Systems Information" is selected.
- This display gives various information about the state of the system
- selected, and is also the location from which other load modules to enter
- as comparison systems may be selected.
- The display here is broken into several sections, giving modular
- information areas pertaining to various system data. If the host system
- is the system being viewed, the data represents the current state of the
- host system. If a comparison system's information is being viewed, then
- the data is representative of the system state when that machine's module
- was created for further comparisons.
- The upper portion of the display consists primarily of CPU/FPU/MMU
- data and state information which is fairly self-explanatory. Other
- information given in this section includes the display type in use, Agnus
- and Denise custom chip revisions of the system, and two items of
- particular interest:
-
- System Stack Memory Location
- The system stack ( or "Supervisor Stack" ) is the memory region
- reserved for use by the processor while operating in what is known
- in M680x0 terms as "Supervisor Mode". Supervisor mode is the CPU
- mode of operation most often associated with operating system
- use, and various system maintenance operations. Supervisor mode
- is characterized primarily by the fact that it allows unhindered
- access to certain CPU operations which are of primary interest only
- to system-level operating system functions. User Mode is the
- operational status in which almost all applications function, and
- said CPU operations are considered "off limits" in this mode. This
- is to protect the integrity of the system from runaway programs and
- the like, and to more easily facilitate multiprocessor/multiuser
- system environments. It is a characteristic of the M68000
- microprocessor series and serves to allow a separation between
- operating system privileges and user program privileges.
- The system stack is where much CPU state information is stored
- during operating system activities, and thus it is important to
- recognize its location in memory. Depending on the memory type
- where this stack is located, it may affect certain operation speeds,
- and its location is thus given here to allow this to be taken into
- account when evaluating system performance. It should be noted
- that although this is an important item of interest, it is generally
- not going to have much effect on the greater majority of AIBB's
- operational modes and testing.
-
- AIBB Process Stack Memory Location
- This item is probably of more interest than the System Stack
- location. AIBB's process stack is a memory region which is assigned
- to AIBB ( and any user program ) when it is invoked. Certain
- program variables and data are stored on the stack during operations,
- and thus its location can affect performance levels. This
- should be taken into account carefully, as some of the testing
- AIBB does utilizes this stack for data, and thus will be affected
- if it is located in a slower memory medium than optimal for the
- system configuration.
-
- Operating System Version
- This field identifies the operating system version in use on the
- system in question. Certain versions may have different features,
- and may affect certain of the test performance levels.
-
- Operating System Location
- On certain MMU equipped accelerated systems, or on systems with
- special hardware setups, the operating system ROM image may be
- relocated to a faster memory medium. ROM access times are generally
- slower than that of RAM resources, and in the case of an A500 or
- A2000 with an accelerator which is more at home with a 32-bit bus
- than those systems' normal 16-bit 7.15 MHz bus, it is extremely
- advantageous to move the operating system kernel code to a faster
- accessed memory region. Often, this relocation is done by
- using a system's MMU ( Memory Management Unit ), which allows for
- address translation of memory "pages". Translation occurs by
- mapping a certain memory region such that accesses to it are
- redirected to an alternate location in this kind of setup. Programs
- such as Dave Haynie's SetCPU and the CPU program which comes with
- AmigaOS 2.0 allow this type of operation. AIBB is capable of
- determining the actual memory location of the ROM code image by
- checking through the MMU translation tables, and will report where
- the code resides.
- Some accelerators allow for translation of the ROM image without
- utilizing an MMU. Such units utilize a custom hardware arrangement,
- and at this time AIBB cannot accurately determine the memory
- location of the ROM image for these systems. In these cases, it
- is recommended that such translations be noted for further
- reference if comparisons are to be made against other systems
- utilizing module or log file results so that no confusion about
- the system setup occurs.
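-
- The page translation idea described above can be pictured with a
- small sketch. The page size, the addresses and the table layout are
- assumptions chosen for illustration; they do not describe any
- particular MMU setup, nor how AIBB walks the real translation tables.
-
```python
PAGE_SIZE = 0x1000  # assumed 4 KB pages, for illustration only

# Hypothetical translation table: logical page number -> physical page
# number. One page of a ROM-like region is redirected to a RAM-like one.
page_table = {
    0x00F80000 // PAGE_SIZE: 0x07F80000 // PAGE_SIZE,
}

def translate(logical_addr):
    """Return the physical address an access is redirected to."""
    page, offset = divmod(logical_addr, PAGE_SIZE)
    physical_page = page_table.get(page, page)  # unmapped pages pass through
    return physical_page * PAGE_SIZE + offset

print(hex(translate(0x00F80010)))  # falls inside the remapped page
print(hex(translate(0x00200000)))  # not remapped, so unchanged
```
-
- Reporting where the code "really" resides is then a matter of
- looking up the page which holds it.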
-
- The bottom portion of the System Information Display contains a
- list of the system memory node regions, and data about them. The node
- number listed as the first field for each region is the number which is
- referenced by the earlier mentioned Test Code and Data locations field
- on the Main Display under Test Result/Information. Pertinent information
- about each memory region is given as follows:
-
- Node Name
- This is simply the text name which identifies the memory node,
- and is generally set by the manufacturer's initialization setup.
-
- Address Range
- This is the physical addressing range which this memory region
- occupies in the M68000 flat linear memory model. The range
- given is the hexadecimal number format addressing range which is
- utilized to access the memory in this region.
-
- Priority
- The Amiga's memory resources are arranged in a priority-based
- fashion. Regions with higher priority are checked first for
- suitable locations whenever a general memory allocation request
- is issued by a program. Thus, this priority can affect the location
- where program code and data will be located. Generally, CHIP RAM
- will be given the lowest priority on the system in order to prevent
- memory from being allocated there unless it is specifically
- requested to go there, preventing its use unless necessary.
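-
- The effect of priorities on a general allocation request can be
- sketched as below. This is a simplified model and not the operating
- system's allocator: each region is treated as one free block, and the
- region names, priority values and sizes are hypothetical.
-
```python
from dataclasses import dataclass

@dataclass
class Region:
    name: str
    priority: int
    free: int          # bytes free (simplified: one contiguous block)
    chip: bool = False

def allocate(regions, size, need_chip=False):
    """Pick the highest-priority region that can satisfy the request."""
    for region in sorted(regions, key=lambda r: r.priority, reverse=True):
        if need_chip and not region.chip:
            continue
        if region.free >= size:
            region.free -= size
            return region.name
    return None  # the request cannot be satisfied

regions = [
    Region("CHIP", priority=-10, free=512 * 1024, chip=True),
    Region("FAST", priority=0, free=1024 * 1024),
]
print(allocate(regions, 100_000))                  # general request
print(allocate(regions, 100_000, need_chip=True))  # explicit CHIP request
```
-
- With CHIP RAM at the lowest priority, a general request lands there
- only once the higher-priority regions cannot satisfy it.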
-
- Port Size
- This is the width of the bus interface to this memory region from
- the main CPU. Generally, it will be either 16 or 32 bits in width.
- A 16 bit interface is ideal for a 16-bit data bused CPU such as the
- MC68000 or MC68010, while a 32-bit ported interface is more suited
- for a 32-bit data bused CPU such as the more advanced Motorola CPUs.
- While an 8-bit memory port size is not impossible, nor infeasible
- for the MC680x0, and thus the Amiga, it is not practical, and thus
- it is not likely this port size will be seen.
-
- Node Size
- The size of the memory node's maximum usable memory is
- given here for reference purposes. Note that although a memory
- board may be listed as having 2 megabytes of memory on the board
- ( as an example ), not all of this will be usable. A small number
- of bytes will be used by the operating system to tag the board's
- memory region.
-
- The System Information Display also includes a number of menu options which
- are explained below:
-
- Select Other
- A submenu attached to this item allows you to switch to viewing
- another system's attributes from within this display.
-
- Load New
- This is the option to utilize if you wish to load a comparison
- module in place of the ones already in use. The loaded module
- will replace the currently displayed system's location in the
- comparison systems. This option is not available when viewing
- the host system's data. Subitems attached to this menu item
- allow you to select the type of module to load. These are:
-
- From File
- This should be selected if you wish to load a previously saved
- module in file form. A requester will be displayed asking
- for the file name to load. AIBB will attempt to load the
- module, and if all data consistency checks pass, it will
- place this data in the location of the previously displayed
- system.
-
- Under this option is a list of the internal default modules AIBB
- contains. This allows the rearranging of the order of the default
- systems as they appear on the graph in the Main Display, and also
- allows a default system's values to be re-loaded if one is
- superseded by a file-based module at an earlier time. Note that
- the order of the system default modules is one of the items saved
- in the AIBB.prefs file, so you may choose any ordering of the
- internal startup default systems which suits you best.
-
- Return to Main
- Returns you to the Main Display portion of AIBB.
-
-
- AIBB's internal default comparison systems were selected to give a
- broad overview of a number of system configurations and hardware types.
- These systems are:
-
- A500-NFR
- An Amiga 500 system with no FAST RAM ( NFR ) complement. This is
- an all CHIP RAM based machine, and is provided here to give a
- comparison towards systems utilizing only CHIP RAM. This is a
- stock machine, with no accelerator devices or other additional
- enhancements.
-
- A2000-FR
- This system is an Amiga 2000 equipped with FAST RAM ( FR ) as well
- as the normal CHIP RAM complement. No acceleration of the system
- was performed, and other than the addition of FAST RAM this system
- is basically stock.
-
- A2500-20
- An Amiga 2500 of the earlier MC68020-based variety comprises this
- system. It utilizes an MC68020 microprocessor running at 14.3 MHz,
- and includes an MC68881 floating point coprocessor unit ( FPU )
- running at that same clock frequency, as well as an MC68851 Memory
- Management Unit ( MMU ). Four megabytes of 32-bit ported FAST RAM
- equipped the accelerator ( a CBM A2620 ). All comparison data was
- obtained with the ROM image translated by way of the MMU into the
- 32-bit ported accelerator RAM area.
-
- A3000-25
- This is currently CBM's top of the line Amiga model: The Amiga 3000.
- The comparison data here was obtained from a 25 MHz CPU rated
- system, which utilizes the MC68030 CPU and MC68882 FPU as its
- processing engines. Sixteen megabytes of FAST RAM ( Static-Column
- type ) were included on the motherboard, as well as a normal 2
- megabyte CHIP RAM complement. The A3000 used for these comparisons
- was utilizing AmigaOS 2.04 as a RAM based image instead of a ROM.
- This is noted for reference against A3000 systems which may utilize
- a ROM based operating system image, as ROM access times are slower
- than for a RAM-based image.
-
- A2500-30
- This is CBM's later offering of the Amiga 2500 line, utilizing the
- A2630 MC68030-based accelerator. The MC68030 and included MC68882
- FPU of this system are operated at a 25.0 MHz clock rate. This
- comparison system was equipped with 4 megabytes of 32-bit ported
- FAST RAM located on the accelerator, as well as 2 megabytes of
- 16-bit FAST RAM and 1 megabyte of CHIP RAM on the system motherboard.
- All tests were performed with the system OS image translated into
- the 32-bit accelerator RAM by way of MMU translation.
-
- It should be kept in mind that all parameters for each system should
- be noted when making comparisons by checking the statistics located
- on AIBB's System Information Display. Small items such as the system
- stack location, cache settings, OS version and image location, etc...,
- could play a part in any apparent discrepancy. Making note of these is
- important to fully understand the figures being provided.
- In all the systems above, all tests performed were done with AIBB's
- test code and data located in the fastest memory medium located on each
- system.
- No third-party accelerated machines were included in the lineup as
- this would leave an unfair advantage/disadvantage to any particular
- manufacturer. Comparisons of that sort can still be carried out
- utilizing AIBB's load-module capability to bring in data from such
- systems for direct comparisons.
-
- OVERVIEW OF INCLUDED TESTS
-
- The tests AIBB incorporates are described below. The type of test,
- and its basic operations are given in the descriptions, as well as the
- amount of memory each test may need to allocate external to AIBB itself.
- The "standard" tests are as follows:
-
- WritePixel
- The WritePixel benchmark will open a low-resolution screen
- and fill it completely with a given color. The filling is done
- one pixel at a time, utilizing the operating system routines
- SetAPen() [set the current RastPort primary pen color] and
- WritePixel() [which sets a pixel to the given primary pen color].
- The test is basically a benchmark of the time needed to call these
- routines, and for them to execute. For the most part, this test
- will be primarily useful for evaluating the effective ROM image
- access time for systems which differ from the conventional ROM
- access method found on the Amiga 500 and 2000, namely accessing
- the ROM over those systems' normal 16 bit bus. As these routines
- also result in many accesses to the CHIP RAM bus, it can also
- give a hint as to the efficiency of a system's CHIP RAM bus
- interface.
-
- Memory Usage: No memory resources external to AIBB are allocated.
-
- Dhrystone
- This test should be fairly familiar to most people, as it has
- been utilized on many different systems for benchmarking purposes.
- It is a test which attempts to put conditions upon the system
- which more closely simulate a possible applications program
- section. It returns, not run-time in seconds, but rather a rating
- of Dhrystones per second, where in this case, the larger number
- indicates better performance.
-
- Memory Usage: No memory resources external to AIBB are allocated.
-
- Matrix
- A matrix manipulation benchmark utilizing three 50x50 integer matrices.
- The test simply performs a series of matrix operations
- (addition/subtraction, multiplication, transposition, etc) upon
- these matrices. The test is set up in such a way that a great
- amount of time is spent moving data, as well as performing
- arithmetic operations upon it. Therefore, this could be thought
- of as also testing memory manipulation efficiency. The test
- is an indicator of how well a processor/memory combination handles
- memory accesses to data and operations on such, as the test does
- not allow the processor to simply perform the data operations
- solely within its registers.
-
- Memory Usage: 30,000 ( 29.3K ) bytes external to AIBB are allocated.
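-
- The memory figure above follows from the matrix dimensions if each
- element occupies four bytes. The 4-byte element size is an inference
- from the stated total and is not given in the text itself.
-
```python
MATRICES = 3
ROWS = COLS = 50
BYTES_PER_INT = 4  # assumption: 32-bit integers, inferred from the total

total = MATRICES * ROWS * COLS * BYTES_PER_INT
print(total, round(total / 1024, 1))  # bytes, and the same figure in K
```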
-
- MemTest:
- This test is memory-bound, as the name implies. In essence, it is
- a memory block movement test, timing the efficiency of memory
- accesses and transfers. Memory from both CHIP and FAST RAM is
- utilized, with transfers occurring from FAST RAM to FAST RAM, FAST
- RAM to CHIP RAM, and CHIP RAM to CHIP RAM. This gives an overall
- look at the memory efficiency of both the system's FAST RAM and
- CHIP RAM complements. It should be noted that the Data Loc portion
- of the test result information will supply the node location of the
- FAST RAM portion, as the CHIP RAM portion is contiguous and known.
- The results given here will be a composite showing overall how the
- system is performing in terms of memory accesses. Systems with FAST
- RAM will show higher results, as those portions of the test will
- execute quicker, and as can be expected, 32-bit ported FAST RAM will
- perform better than its 16-bit ported cousin, also resulting in
- better overall performance. The test does report weighted results,
- with more emphasis being placed on FAST RAM-only operations, followed
- by CHIP to FAST RAM, then CHIP to CHIP RAM operations. This is to
- take into account that most CPU operations do not operate entirely
- within CHIP RAM, or even at all utilize CHIP RAM. Thus, FAST RAM
- performance will have a larger weight on the results than CHIP RAM
- performance. However, a very slow CHIP RAM interface will
- necessarily affect the test greatly as well.
-
- Memory Usage: 32,768 ( 32K ) bytes of FAST RAM external to AIBB are
- used. The same amount of CHIP RAM is also allocated.
- Note that you must have 32K of contiguous free space
- in BOTH CHIP and FAST RAM for this test to execute.
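-
- The weighting idea can be illustrated with a weighted mean. Only the
- ordering of the weights ( FAST-only first, then mixed CHIP/FAST, then
- CHIP-only ) comes from the description above; the weight values,
- the key names and the use of a weighted mean are assumptions, as the
- actual formula is not given here.
-
```python
# Hypothetical weights: the documentation gives only their ordering,
# not the actual values AIBB uses.
WEIGHTS = {"fast_to_fast": 3.0, "chip_fast": 2.0, "chip_to_chip": 1.0}

def composite(rates):
    """Weighted mean of per-transfer-type rates (higher is better)."""
    total_weight = sum(WEIGHTS.values())
    return sum(WEIGHTS[k] * rates[k] for k in WEIGHTS) / total_weight

baseline = {"fast_to_fast": 1.0, "chip_fast": 1.0, "chip_to_chip": 1.0}
faster_fast = dict(baseline, fast_to_fast=2.0)
faster_chip = dict(baseline, chip_to_chip=2.0)
print(composite(baseline), composite(faster_fast), composite(faster_chip))
```
-
- Under any such weighting, the same improvement counts for more when
- it occurs in FAST RAM than when it occurs in CHIP RAM.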
-
- Sieve:
- Another test which should be familiar to most, the Sieve of
- Eratosthenes. It uses a fairly simple algorithm to determine
- prime numbers within a range of numbers. This test simply times
- your system when implementing this algorithm, which is described
- fully in many textbooks, or one can simply look at BYTE Magazine's
- benchmarks, which use a similar Sieve test.
-
- Memory Usage: No memory resources external to AIBB are allocated.
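-
- For reference, the textbook form of the algorithm is sketched below.
- This is the general Sieve of Eratosthenes and not AIBB's own test
- code, whose range and loop structure are not given here.
-
```python
def sieve(limit):
    """Return all primes <= limit using the Sieve of Eratosthenes."""
    if limit < 2:
        return []
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    n = 2
    while n * n <= limit:
        if is_prime[n]:
            # Strike out every multiple of n, starting at n squared.
            for multiple in range(n * n, limit + 1, n):
                is_prime[multiple] = False
        n += 1
    return [i for i, flag in enumerate(is_prime) if flag]

print(sieve(30))
```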
-
- Sort:
- A series of 30,000 16-bit integers is sorted from a pseudo-random
- setup, and the procedure is timed. "Pseudo-random" here means that
- the number arrangement is not created in a random fashion, but
- rather in a mixed fashion so that on each invocation of the test
- the numbers will be created in the SAME mixed fashion. This is
- because the sorting algorithm is sensitive to the mixing, and if
- each time the test was run a different group of values was used,
- no two test results could be compared well. The mixing method I
- used was chosen to ensure that the algorithm would be forced to do the
- most work for each test.
-
- Memory Usage: 60,000 ( 58.6K ) bytes external to AIBB are allocated.
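-
- The point about repeatable input can be illustrated as follows. The
- generator shown is a fixed-seed linear congruential sequence chosen
- purely for illustration; it is not the mixing method AIBB uses, and
- it makes no attempt to force the most work from the sort. The seed and
- constants are arbitrary.
-
```python
COUNT = 30_000
MASK16 = 0xFFFF

def mixed_values(count=COUNT, seed=12345):
    """Deterministic 'mixed' 16-bit values from a fixed-seed generator."""
    values = []
    state = seed
    for _ in range(count):
        state = (state * 1103515245 + 12345) & 0x7FFFFFFF
        values.append((state >> 8) & MASK16)
    return values

first = mixed_values()
second = mixed_values()
print(first == second)  # every run sorts exactly the same input
print(COUNT * 2)        # bytes needed at 2 bytes per 16-bit integer
```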
-
- IMath:
- Integer Math. This test performs a wide variety of integer math
- functions. Included in these functions are the standard functions,
- such as addition, subtraction, multiplication, division, and a
- few additional bitwise functions, such as ANDing, ORing, and XORing.
-
- Memory Usage: No memory resources external to AIBB are allocated.
-
- TGTest:
- Text/Graphics test. This test is another one which is dependent
- upon the efficiency of the system graphics routines' execution
- speed, as well as the efficiency of the CHIP RAM bus interface
- on the system. Text is output to the screen in a repeated pattern,
- and scrolled in order to maintain its visibility on the screen.
-
- Memory Usage: No memory resources external to AIBB are allocated.
-
- The floating-point specific tests implemented by AIBB are given below.
- Note that these tests are also dependent on any standard code type
- selections which may be made, as well as the type of floating-point
- code utilized. Tests are marked as to their usage of transcendental
- functions ( sin(), cos(), log(), etc... ) for record keeping and
- comparisons by 68040 users, who should see the appropriate notes in
- this documentation concerning the built-in 68040 FPU and
- transcendental functions. The rating scale used below for such usage
- corresponds to this table:
-
- Level Meaning
- ------------------------------------------------------------------------
- NONE | No transcendental functions are used
- LIGHT | 5-20% of calculations are transcendental in nature.
- MODERATE | 21-50% of calculations are transcendental in nature.
- HEAVY | Greater than 50% of calculations are transcendental.
- ------------------------------------------------------------------------
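-
- The rating scale can be expressed as a simple mapping. The table does
- not say how a usage above zero but under 5% is rated; treating it as
- LIGHT below is an assumption, as is the function name.
-
```python
def transcendental_level(percent):
    """Map a percentage of transcendental calculations to a rating."""
    if percent <= 0:
        return "NONE"
    if percent <= 20:
        return "LIGHT"      # table lists 5-20%; under 5% is an assumption
    if percent <= 50:
        return "MODERATE"
    return "HEAVY"

for p in (0, 10, 35, 90):
    print(p, transcendental_level(p))
```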
-
- FMath:
- Floating Point Math. Similar to the IMath test, with the exception
- that Floating Point values and operations are utilized. With this
- test, no bitwise operations are performed. Single precision
- floating point operations/values are used here.
-
- Transcendental Usage: NONE.
- Memory Usage: No memory resources external to AIBB are allocated.
-
- Savage:
- This is another of the "probably familiar" tests. It is a standard
- implementation of the Savage test, which makes nested calls to
- transcendental functions to create a single value. Double
- precision floating point operations/values are used.
-
- Transcendental Usage: HEAVY; this test is almost exclusively
- transcendental in nature.
- Memory Usage: No memory resources external to AIBB are allocated.
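-
- The commonly published form of the Savage test is sketched below to
- show what "nested calls" means in practice. Whether AIBB's version
- matches this line for line, including the iteration count, is an
- assumption.
-
```python
import math

def savage(iterations=2499):
    """Savage-style loop: nested transcendental calls on one value."""
    a = 1.0
    for _ in range(iterations):
        a = math.tan(math.atan(math.exp(math.log(math.sqrt(a * a))))) + 1.0
    return a

result = savage()
# With exact arithmetic each pass would add exactly 1, giving 2500.
print(result, abs(result - 2500.0))
```
-
- The difference from the ideal value gives an indication of the
- accuracy of the math routines, in addition to the run time.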
-
- FMatrix:
- The FMatrix test is similar in concept to the Integer Matrix test
- outlined above. Again, a great deal of data movement is performed,
- in addition to the operations involved, which are floating point
- operations in this case. With the matrix operations, the results
- under Floating Point coprocessor equipped systems can be interesting
- to note, as the system is not able to keep the data within
- fast-access FPU registers, and thus must make many bus accesses for
- the data it needs. Double-precision floating point math is used
- for this test.
-
- Transcendental Usage: NONE.
- Memory Usage: 38,400 ( 37.5K ) bytes external to AIBB are allocated.
-
- SWhetstone and DWhetstone:
- These tests are identical, save that the SWhetstone utilizes
- single-precision floating point operations and data, while the
- DWhetstone is double precision in nature. The Whetstone test
- is yet another of the many "standard" types of benchmarks which
- have been used to test system performance. It tests various
- circumstances, including floating point math, function calls,
- etc. Integer math is also tested to an extent, but since this
- test does rely on floating point math as well it is kept in this
- section. The test returns values in Whetstones per second, where
- like the Dhrystone, higher values indicate better performance.
-
- Transcendental Usage: MODERATE.
- Memory Usage: No memory resources external to AIBB are allocated.
-
- BeachBall:
- The BeachBall test was originally written by Bruce Holloway of
- Weitek, and published in the March 1988 issue of Byte Magazine.
- It is essentially a very math-intensive operation which draws a
- beachball on the screen, complete with shading. The test opens a
- 640x400 interlaced 16-color screen, and proceeds to render the
- picture. This test is closer to a true "application" test, in that
- it actually does something visible, and produces an output. The
- system will end up being tested in both the floating point arena,
- and in CHIP RAM access performance, which is done through standard
- operating system graphics handling calls ( thus will be affected by
- the speed of such, which in turn can be affected by ROM image
- re-mapping, etc ).
-
- Transcendental Usage: LIGHT.
- Memory Usage: No memory resources external to AIBB are allocated.
-
- FTrace:
- Another applications-type test. FTrace implements a subset of the
- calculating functions which are used to perform ray-tracing
- operations. Ray-tracing is a particularly floating-point intensive
- art, and this test gives some indication of a system's performance
- in this type of operation. No visible result is produced, so in
- that matter it is not an 'ideal' test, but it can be used to give
- some indications in this arena.
-
- Transcendental Usage: LIGHT; Calculations are performed in such
- a way that transcendental usage is minimized.
- Memory Usage: No memory resources external to AIBB are allocated.
-
- CplxTest:
- This test implements a series of complex-number operations and
- times their execution. Complex number applications are important
- in many of the sciences, and are particularly prevalent in such
- areas as electrical engineering ( circuit analysis ) and vector
- analysis to some degree ( not specifically "complex numbers" in
- that case, but the operations are similar ). This test utilizes
- a lot of quick, small memory moves, as well as performing a
- variety of floating-point operations.
-
- Transcendental Usage: LIGHT TO MODERATE.
- Memory Usage: No memory resources external to AIBB are allocated.
-
-
- NOTES AND SUMMARY
-
- It has been indicated before, but it should again be emphasized that
- no benchmark or even suite of benchmarks can hope to give a complete picture
- of system performance alone. A full picture of the system resources, as
- well as an understanding of just what the system in question is being used
- for is necessary to make any type of evaluation. AIBB is merely one small
- tool which may be used to try to gather a sampling of data when making
- a performance determination.
- When performing tests, it is very important to keep track of just
- where test code and data is being placed in the system by using the
- information provided by AIBB, and by using other methods if need be. For
- example, if you have a 512K CHIP RAM machine, and some SLOW-FAST RAM
- ( sometimes mistakenly thought of as true FAST RAM ), this could affect
- test results in ways not expected. Keeping careful track of these
- variables can help in determining just what is occurring in the system
- during performance analysis.
- Of some interest in terms of FPU performance is the MC68040's
- built-in FPU. This FPU implements a subset of Motorola's previous MC68881
- and MC68882 coprocessors, and does not include all functions on-chip
- which were supported by the previous FPUs. Most notably, the transcendental
- functions such as sine and cosine are not hardware supported.
- Rather, the simpler functions such as floating-point multiplication,
- addition, division, etc.. have been greatly optimized and enhanced. The
- MC68040 FPU relies on software emulation of the complex functions, and
- most accelerator vendors, as well as CBM itself, supply a function library
- to emulate these routines in the form of software 'traps'. Since the
- complex functions utilize the simpler functions to derive their actions,
- in theory all functions should still execute faster than on previous
- coprocessors. However, this may not be the case.
- Trap functions such as those supplied in the aforementioned libraries
- are routines executed when the coprocessor indicates an unsupported
- function routine is being called. This is a form of 'exception' routine,
- requiring CPU/FPU internal context saving, and other related actions.
- This is because the CPU/FPU treats the function call as an error, and
- calls the error routine appropriate to it. In this case, it will be
- the math support library, which will execute the proper function and return
- the value needed. Unfortunately, all this activity results in a
- performance hit, resulting in timings which are longer than that of the
- previous coprocessors which implemented these functions in hardware.
- All this might imply that the 68040 is crippled in this respect. However,
- this is not the case. Applications written to take advantage of the
- 68040's FPU will function much faster, as they will emulate the required
- complex functions in forms not requiring the trap functions. The trap
- functions are there for programs which are using FPU code set up for the
- MC68881 or MC68882, which are at this time the more common FPU units.
- AIBB's FPU code is indeed MC68881 or MC68882 style, and the results
- of some of the FPU tests using many complex type functions will show this.
- At some future time, AIBB will offer an option to allow for the use of
- in-line FPU code which will take advantage of the 68040 FPU specifically.
-
-
- CREDITS AND ACKNOWLEDGEMENTS
-
- As with all large projects, nothing is accomplished entirely by one
- person. I have many people to thank for their assistance in the
- development of AIBB. A few of the more influential people who have
- contributed greatly to this effort are:
-
- Dr. J. Scott Thayer - Sysop of AmigaFriends BBS, and a dedicated beta
- tester extraordinaire. His comments and testing
- data were key to much of what was done with this
- program over the course of its development.
-
- Redmond Simonsen - One heck of a nice guy and thought provoking fellow.
- His help with interface ideas was very much appreciated,
- and is still instrumental in any upcoming
- versions of AIBB. By providing me with processor data
- books I had not been able to get a hold of, Redmond was
- also instrumental in the addition of many functions
- within AIBB which would have gone unimplemented.
-
- Mathew Rouch - A good friend of mine, and a computer science student at
- present. His help with several algorithmic coding problems
- let me get past some bottlenecks which would otherwise have
- taken a great deal longer to overcome.
-
- Greg Tibbs - For finding one SERIOUSLY obscure bug, and helping me to
- track it down, despite the difficulty. Also thanks to
- him for his help with specific aspects of the Motorola
- MC68040 microprocessor.
-
- Unfortunately, I cannot list everyone who has been of assistance with
- this project, but to all of them, listed and unlisted, I wish to express
- my deepest thanks and appreciation.
-
- Comments and suggestions about this program are always welcomed, as I
- hope to be able to continue its development. Please feel free to make
- any suggestion you see fit, but do try to be constructive in any
- criticism so that I may improve AIBB. Bug reports are certainly wanted,
- and I will do my best to locate and correct such problems.
- I can be reached electronically in many ways, but the following are probably
- the easiest methods:
-
- Safe Harbor BBS: (414) 548-8140 ( 16 lines rollover ). I can be
- reached here almost always by name, or by addressing
- mail to the HARDWARE SIGOP in the hardware special
- interest group.
-
- Amiga Friends BBS: (714) 870-4754 or (714) 870-6594 ( 2 lines ). I
- also frequent this BBS, and mail may be addressed
- to me here by name.
-
- If you have Internet access, there are two ways to reach me. The
- Internet addresses I can be found at are:
-
- lkoop@tigger.stcloud.msus.edu ( GP Acct )
- f00012@kanga.stcloud.msus.edu ( Engineering Acct )
-
- ( Pick your paths :) )
-
- I can also be found on BIX as lkoop, and can be reached there easily
- as well. For those wishing to correspond by mail, comments may be sent to:
-
- LaMonte Koop
- 565 Park Meadows Dr. #302
- Waite Park, MN 56387
-
- As for me, well, I'm an Electrical/Computer Engineering student
- ( currently 4th year in school ), with an added minor in Computer Science,
- and an emphasis in systems architecture design. AIBB was originally
- started as a bit of a hobby, and as time went on became a long-standing
- project. This particular version is almost a year in the making, and I
- do intend to continue enhancing the package as long as interest remains in
- it. Enjoy the program; I hope you find it useful, and that it serves
- whatever purpose you may have for it.
-
-