Text File | 1993-06-09 | 67.6 KB | 1,125 lines |
- ======================= COMPTEST 2.59 ============================
-
- Release date: June 9, 1993
-
- COMPTEST is a program that determines the system configuration and
- performance characteristics of PC compatible computers. COMPTEST
- was designed to be fast, so most parameters are determined during
- program start-up and the first page of results will come up almost
- immediately. Even for slow systems like the original IBM PC the
- first page will be displayed within a few seconds. There are at
- most three pages with results displayed sequentially, with some
- tests occurring only when the appropriate page is displayed.
-
- Usage: COMPTEST [file name] [/D] [/H]
-
- [file name] is an optional parameter specifying a file in which
- the results displayed by COMPTEST will be saved upon
- termination of COMPTEST.
-
- /D is an optional switch enabling additional messages
- that aid in debugging COMPTEST if the program should
- crash or fail to correctly determine the system
- configuration.
-
- /H is a switch that prints a short help screen for COMPTEST.
-
-
-
- COMPTEST 2.59 is Copyright (c) 1988-1993 by
-
- Norbert Juffa
- Wielandtstr. 14
- 7500 Karlsruhe 1
- Germany
-
-
- COMPTEST 2.59 is public domain software and is distributed with full
- assembly language and Turbo Pascal 6.0 source code. You are free to
- incorporate parts of the code into your own programs as long as you
- don't use it in a commercial product. Please do others a favor and
- always distribute the complete COMPTEST package, not only the binary.
-
-
- If you want to notify me of bugs you discovered in COMPTEST or want
- to comment on the program in any way, you can either contact me at the
- address above or on Internet as JUFFA@IRA.UKA.DE
-
-
- Revision history:
-
-
- Changes since version 2.58
-
- o Detection method for Cyrix 486DLC/SLC has been changed back to the method
- used in COMPTEST 2.57. The detection method used in version 2.58 that
- was based on checking the value in the destination register of a BSF
- instruction after executing it with a source register containing zero
- seems to work only with very old versions of the Intel 80486. Newer
- versions of the Intel 486DX/SX show exactly the same behavior as the
- 486DLC/SLC. Therefore, COMPTEST 2.58 reported most 486SX as 486DLC
- and 486DX as RapidCAD. Thanks to Jian Liu and David Ruggiero for
- reporting this bug.
-
- o Experimental support to detect an Intel Pentium CPU has been added.
- Detection is based on incomplete information, so Pentium detection
- and measurements may not work correctly yet.
-
-
- Changes since version 2.57:
-
- o Detection method for Cyrix 486DLC/SLC has been changed. The new method
- does not rely on timing anymore.
-
- o Timing routine has been enhanced to work more reliably with fast machines.
-
- o Documentation files have been enhanced and updated. Minor cosmetic changes
- to COMPTEST and MEMMAP programs.
-
- o New version of Turbo Pascal 6.0 run time library included.
-
-
- Changes since version 2.56:
-
- o COMPTEST 2.56 was compiled with the wrong library. Therefore, benchmark
- results for the floating-point benchmarks (Whetstone, LLL) differ from
- other versions of the program. Sorry about the mistake!
-
- o A correction was added to the determination of coprocessor clock frequency
- in the case a Cyrix 486DLC CPU is present. With a 486DLC present, the
- block of FSQRTs that determines the coprocessor clock frequency is
- executed faster than with the Intel 386DX as CPU, due to the improved
- communication between CPU and coprocessor. The observed speedup is in
- the range between 4.7% and 9.7%. For simplicity, COMPTEST 2.59 uses a simple
- correction that divides the computed frequency by 1.055.
-
-
- Changes since version 2.55:
-
- o COMPTEST 2.55 wouldn't detect a Cyrix EMC87 if it was installed and
- reported it as a Cyrix 83D87 instead. This has been fixed.
-
- o Correct detection of the presence and clock frequency of the Cyrix
- 486DLC has been tested. Note that presence of a Cyrix 486DLC may
- lead to a higher clock frequency being displayed for a coprocessor,
- if one is installed. Since the 486DLC has an improved handshaking
- with the coprocessor, the block of FSQRT instructions used to measure
- the coprocessor's clock is executed up to 10% faster and the reported
- clock frequency of the coprocessor goes up accordingly.
-
-
- Changes since version 2.54:
-
- o Enhanced POPAD bug detection by testing POPAD execution using 31
- different initial values in the EAX register. Previously, only one
- value was used, which made correct detection of the bug somewhat
- unreliable.
-
- o Detection of the SuperMath 38700DX and 38700SX coprocessors from
- Chips&Technologies has been added and successfully tested. Detection
- for the Intel RapidCAD, Intel 387DX, and Cyrix 387+ has been changed
- from tests depending on instruction timing to tests that rely on
- certain small incompatibilities between the coprocessors.
-
- o When called with a filename to store the results in, COMPTEST would
- fail to print the final message "COMPTEST terminated - press any key"
- if either there was an error in handling the file indicated or if no
- hard disk results were stored. This has been fixed.
-
- ======================================================================
-
- The following is a detailed explanation of the information provided
- by COMPTEST 2.59 and the known limitations of the program.
-
-
- o General limitations:
- Correct execution of COMPTEST depends on certain hardware resources
- provided by the hardware of PC compatibles. That a machine runs MS-DOS
- is by itself no guarantee that COMPTEST can be run successfully, as
- MS-DOS may run on machines that are not 100% compatible with industry
- standard PCs. Since it directly accesses hardware components,
- COMPTEST should not be run in the DOS-boxes supplied by Windows, OS/2
- or similar programs. Even if it does run, some or all of the information
- it provides may be wrong or misleading. For the same reason, it should
- not be run on a PC emulator, e.g. SoftPC by Insignia, even if simpler
- programs like the Landmark Speed Test run successfully in such an
- environment.
-
-
- o Computer type:
- The specific type of an IBM compatible PC is coded into one or two
- bytes in the last 16 bytes of the first MByte of the address range.
- This memory is occupied by the BIOS ROM. Quite a few values have
- been defined by IBM for its PC, AT, and PS/2 computers. COMPTEST
- decodes this information according to IBM's definitions and prints
- the result. Ordinary 286, 386, and 486 based PCs with an ISA bus
- are reported as AT compatibles.
-
-
- o CPU Type:
- COMPTEST is able to detect the following CPU types: Intel 8088,
- Intel 8086, NEC V20, NEC V30, Intel 80188, Intel 80186, Intel
- 80286, Intel 80386DX, Intel 80386SX, Intel 80486DX, Intel 80486SX,
- Intel RapidCAD, Chips&Technologies 38600DX, Cyrix 486DLC, Cyrix 486SLC.
-
- The Intel 80186/80188 are CPUs with integrated peripherals that
- have been used in only a few PCs that were manufactured around
- 1982/1983. They have an extended 8086 instruction set. The Intel
- RapidCAD is a replacement for a 80386/80387 combination and is
- basically a 80486DX without the internal cache and with a 386 pinout.
- The Chips&Technologies 38600DX is a pin compatible replacement for
- the 80386DX CPU that offers some performance improvements. The
- Intel 486SX is a 80486 without the FPU (floating point unit).
- The Cyrix 486DLC is a CPU that is software compatible with the
- 486SX, but is a replacement for the 80386DX CPU. Similarly, the
- Cyrix 486SLC is for use in 386SX systems.
-
- AMD's Am386DX, Am386DXL, Am386SX, and Am386SXL are 100% compatible
- with the Intel 386DX and Intel 386SX CPUs, respectively, and are
- reported as Intel 386DX and Intel 386SX, respectively, by COMPTEST.
- The Intel 486DX2-50, Intel 486DX2-66, and the Intel Overdrive
- processors are 486DX chips that use a clock-doubler that drives
- the CPU internally at twice the speed of all other system components.
- The 486DX2-50 is a replacement for the 486DX-25, while the 486DX2-66
- replaces the 486DX-33. The Overdrive goes into the 487SX socket
- found in many 486SX systems. These processors are all reported as
- 80486 processors by COMPTEST.
-
- To distinguish between the different CPUs and math coprocessors,
- COMPTEST in most cases takes advantage of certain incompatibilities
- between the chips that can be tested without using other system
- resources. Only a few tests involve timing differences in the execution
- of certain instructions. These depend on the correct operation of the
- PC's timer chip.
-
- To separate the 8086, 8088, V20, V30, 80186, and 80188 CPUs from
- newer CPUs, the behavior of the instruction sequence PUSH SP,
- POP AX is used. While after the execution of these instructions,
- AX=SP on the newer processors, AX=SP-2 for the 8086..80188.
-
- To distinguish among the CPUs in the first group (8086..80188),
- the following properties of the processors are used. The 80186/
- 80188 (like all newer Intel CPUs) mask off shift counts MOD 32
- in shift and rotate instructions so that no more than 31 shift
- steps are performed. The V20, V30, 8088, and 8086 do not mask off
- shift counts and perform up to the 255 shift steps allowed by the
- 8-bit counter used. If a register with non-zero contents is
- shifted left with a shift count of 32, it will be cleared after
- the operation on the 8086/8088 and V20/V30, while nothing will
- happen on the 80186/80188, since 32 MOD 32 = 0, that is, no shifting
- takes place. V20 and V30 (just like the 80186/80188 and newer CPUs
- from Intel) have a PUSHA instruction that saves all general registers
- on the stack. The 8086/8088 don't have this instruction, but the
- execution of the PUSHA opcode acts like a JMP skipping the next
- code byte. COMPTEST uses this code byte to set a flag so that
- the flag is only set on V20/V30 processors. If the flag is not
- set, an 8086/8088 must be present. This is verified by an additional
- test that checks if the highest nibble in the flag register can be set.
- This nibble is always cleared on the 8086/8088 and can not be set.
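
The decision steps for the first CPU group can be summarized in a short sketch. It is written in Python purely for illustration and is not taken from COMPTEST's assembly source. The function names and the two boolean inputs (whether the CPU masks the shift count, whether the code byte after the PUSHA opcode was executed) are assumptions that stand in for the real hardware probes.

```python
def shift_left_by_32(value, masks_shift_count):
    """Model SHL on a 16-bit register with a shift count of 32."""
    # The 80186/80188 reduce the count MOD 32; the older CPUs do not.
    count = 32 % 32 if masks_shift_count else 32
    return (value << count) & 0xFFFF


def classify_early_cpu(masks_shift_count, flag_byte_executed):
    """Pick the CPU family from the two observations described above."""
    if shift_left_by_32(0x0001, masks_shift_count) != 0:
        return "80186/80188"  # register unchanged: the count was masked to 0
    if flag_byte_executed:
        return "V20/V30"      # PUSHA exists, so the byte after it was executed
    return "8086/8088"        # PUSHA opcode acted like a jump over that byte
```

For example, `classify_early_cpu(False, True)` returns `"V20/V30"`, since the register was cleared by the shift and the flag byte was reached.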
-
- The 8086, V30, and 80186, which have a 16-bit data bus, can be
- distinguished from the 8088, V20, and 80188, which have an 8-bit data
- bus through the use of self modifying code that works differently due
- to the different length of the instruction prefetch queue, which has
- a length of 4 bytes for the CPUs with 8-bit busses, but a length of
- 6 bytes for the CPUs with 16-bit busses. Modifying an instruction
- five bytes ahead in the instruction stream will cause the modified
- instruction to be executed by the 8-bit CPU versions, while the
- original instruction will be executed on the 16-bit CPU versions,
- since it was already in the prefetch queue by the time the instruction
- was modified in memory.
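
The prefetch queue argument can be reduced to a one-line model. The sketch below is a simplification written in Python for illustration: it assumes the queue is completely full at the moment the store happens, and the names are hypothetical rather than taken from COMPTEST.

```python
# Prefetch queue lengths in bytes, as stated in the text above.
QUEUE_LENGTH = {"8-bit bus": 4, "16-bit bus": 6}


def executes_modified_instruction(queue_length, distance=5):
    """True if an instruction patched `distance` bytes ahead runs in its new form.

    Simplifying assumption: the queue is full, so every byte closer than or
    equal to `queue_length` has already been fetched in its original form.
    """
    return distance > queue_length


def bus_width_from_probe(modified_ran):
    """Map the outcome of the self-modifying probe to the CPU's bus width."""
    return "8-bit bus" if modified_ran else "16-bit bus"
```

With the distance of five bytes used in the text, the patched instruction is beyond the 4-byte queue but inside the 6-byte queue, which is what separates the two groups.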
-
- To tell apart the 80286 from the 80386 and 80486, an attempt is made
- to change certain bits in the flag register of the CPU. While they can
- be modified in the 80386 and 80486, the 80286 will not allow that to
- be done. The 80486 has a new bit in its flag register that is not defined
- in the 80386 and is always clear there. By attempting to toggle this
- bit, it can be decided whether a 80386 or 80486 is present. The 486SX
- is a 80486 without the FPU (floating point unit ~ integrated coprocessor),
- so if a 486 CPU has been detected but the test for a coprocessor or
- FPU fails, it can be concluded that a 486SX is present. The 80386SX
- has a 16-bit data bus as compared to the 32-bit data bus of the otherwise
- (almost) identical 80386DX, so 32-bit memory accesses on the 80386SX
- are slower than 16-bit memory accesses since they have to be split into
- two 16-bit accesses. On the 386DX, both 16-bit and 32-bit memory
- accesses have the same speed, if memory operands on addresses divisible
- by four are accessed. By measuring and comparing the speed of 16-bit
- and 32-bit memory accesses, COMPTEST determines if a 386SX is present.
- Intel and AMD both make 386DX and 386SX processors that are functionally
- totally identical. However, AMD makes 386DXs that are rated for 40 MHz
- and 386SXs that are rated for 33 MHz, which Intel doesn't make. So
- COMPTEST could make an educated guess on what manufacturer's CPU is
- used based on the clock frequency it determines. COMPTEST does without
- this guess, though, and reports all AMD 386 CPUs as Intel 386DX or Intel
- 386SX.
-
- Chips and Technologies has introduced CPUs that are compatible with
- the 386DX and 386SX, which are called the 38600DX and 38600SX. Also,
- a 38600DX with a small internal cache has been announced called the
- 38605DX. While AMD uses Intel's microcode in their 386 CPUs, C&T uses
- its own microcode. Therefore, the CPUs from C&T do not possess a well
- known bug present in the Intel 80386. This so called POPAD bug causes
- the EAX register to be trashed for a certain instruction sequence
- involving the POPAD instruction. COMPTEST checks for this bug to
- distinguish the C&T CPUs from Intel's 386 processors. Since I have
- found that the POPAD bug can not be reliably reproduced on Intel 386SX
- CPUs, COMPTEST reports all 386SX CPUs as Intel 386SX, whether a POPAD
- failure occurs or not. Since C&T will not offer the 38600SX before
- late in 1993, this doesn't make the CPU detection by COMPTEST less
- reliable. COMPTEST will not recognize the 38605DX, but will report
- it as a 38600DX. Since the reproducibility of the POPAD bug depends
- somewhat on the initial value of the EAX register, COMPTEST uses 31
- different values.
-
- The Intel RapidCAD is basically a 486 without the internal cache
- that is an end user replacement for a 80386DX/80387 combination.
- It is 100% software compatible with this combination and can be
- detected by checking the speed of store operations from the FPU
- to memory, which are executed much faster on the RapidCAD than in
- any 386/387 system, regardless of the coprocessor used.
-
- Cyrix now offers the 486DLC and the 486SLC that are designed for
- 386/386SX systems. However, they are software compatible with the
- Intel 486SX. Cyrix has implemented a fast array multiplier on these
- chips to speed up integer multiplications, making the MUL instruction
- faster than on any other CPU found in PC compatible computers.
- COMPTEST detects the Cyrix 486DLC/SLC by comparing the speed of the
- MUL and AAM instructions. On the 80486, the execution time for the
- MUL instruction ranges from 13 to 26 clock cycles for a 16 bit
- operand, while the AAM instruction executes in 15 clock cycles. So
- multiplication is never significantly faster than AAM. On the Cyrix
- processors, MUL takes 3 cycles with a 16 bit operand, and AAM takes
- 16 cycles. So on these processors MUL is several times faster than
- AAM.
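
The comparison of MUL and AAM timings can be sketched as a simple ratio test. This Python fragment is illustrative only: the threshold of 2.0 and the function name are assumptions, and the actual criterion used by COMPTEST is not stated in this document. The cycle counts in the comments are the ones quoted above.

```python
def looks_like_cyrix_486(mul_time, aam_time, ratio_threshold=2.0):
    """Return True if MUL is several times faster than AAM.

    mul_time and aam_time may be in any unit, as long as both use the same one.
    ratio_threshold is a hypothetical cut-off chosen for this sketch.
    """
    return aam_time / mul_time >= ratio_threshold


# Intel 80486: MUL takes 13 to 26 cycles, AAM takes 15 cycles.
intel_best_case = looks_like_cyrix_486(mul_time=13, aam_time=15)
# Cyrix 486DLC/SLC: MUL takes 3 cycles, AAM takes 16 cycles.
cyrix_case = looks_like_cyrix_486(mul_time=3, aam_time=16)
```

Because the test uses a ratio, it does not depend on the clock frequency of the machine, only on the relative speed of the two instructions.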
-
- o Clock frequency:
- Measuring the clock frequency of the CPU is based on repeated execution
- of the AAM (ASCII adjust after multiply) instruction. This instruction
- takes more than 10 clock cycles to execute on all CPUs that are supported,
- so there is enough time for the CPU to always keep its prefetch queue
- filled, resulting in very stable timings since there is no additional
- penalty for filling up the prefetch queue. Also, the AAM instruction has
- the advantage of executing in a fixed number of clock cycles, as opposed
- to some instructions like DIV that take even longer to execute than AAM,
- but whose timing depends on the input arguments. To report the clock
- frequency accurately, it is absolutely important to use the correct
- execution time for the AAM instruction in COMPTEST. Note that the AAM
- execution time stated in the Intel manual for the 8088/8086 is not
- correct. Depending on the accuracy of the oscillator that drives the
- PC's timer, the reported CPU clock frequency should be accurate to within
- +/- 2%. Note that for CPUs that use internal clock doubler circuits
- (e.g. Intel 486DX2-50), the clock frequency displayed by COMPTEST is
- the frequency at which the CPU runs internally (50 MHz in the example
- cited).
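
The arithmetic behind the clock frequency measurement can be sketched as follows. This is a Python illustration under stated assumptions, not COMPTEST's code: the function name and the loop size are hypothetical, the timer is assumed to be the standard PC interval timer running from a 1.193182 MHz input clock, and the cycle count per AAM must be supplied for the CPU that was detected (15 is the 80486 figure quoted earlier).

```python
PIT_HZ = 1_193_182  # input clock of the PC's interval timer, in Hz


def cpu_clock_mhz(aam_count, cycles_per_aam, timer_ticks):
    """Estimate the CPU clock from a timed block of AAM instructions.

    aam_count      -- number of AAM instructions executed (assumed value)
    cycles_per_aam -- fixed execution time of AAM on the detected CPU
    timer_ticks    -- elapsed time in timer ticks
    """
    elapsed_seconds = timer_ticks / PIT_HZ
    total_cycles = aam_count * cycles_per_aam
    return total_cycles / elapsed_seconds / 1e6


# 100,000 AAMs at 15 cycles each that take 54,236 timer ticks
# correspond to a clock of roughly 33 MHz.
estimate = cpu_clock_mhz(100_000, 15, 54_236)
```

The sketch also shows why a wrong value for the AAM execution time translates directly into a proportionally wrong frequency.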
-
-
- o Bus width:
- The width of the CPU data bus. This is determined by the type of the
- CPU, so this is actually redundant information.
-
-
- o Cache size:
- COMPTEST is one of very few test programs that correctly determine
- the cache size of first and second level CPU caches. This is very useful
- if you are not sure whether the CPU cache in your computer is enabled
- or functional at all. COMPTEST moves memory blocks of increasing size
- and watches for sharp drops in memory throughput to determine the cache
- sizes. Since the largest blocks tested have a size of 512 KB, COMPTEST
- is limited in that it can *not* correctly detect CPU caches that are
- bigger than 256 KB. The smallest cache size COMPTEST can determine
- is 1 KB, which is the size of the internal cache on the Cyrix 486SLC
- and 486DLC chips. COMPTEST's cache test strategy may be defeated if
- you have defined non-cacheable areas in the first 512 KB of base memory.
- Non-cacheable areas can usually be defined in the extended BIOS setups
- of 386 and 486 based machines and are only necessary if you have a
- write-back cache and have to ensure correct operation of memory mapped
- peripherals (e.g. video memory of graphics card, Weitek coprocessor).
- Usually there is no need to define a non-cacheable area in the first
- 512 KB of base memory. COMPTEST's cache detection usually is very
- reliable; only once did it indicate a cache on a system with no CPU
- cache, probably due to the mixture of page/interleave access modes
- used by that system's memory. The technique used by COMPTEST to
- determine cache size basically works as follows: Assume you have a
- machine with a n KB CPU cache. If you read a memory block of n KB or
- less twice, it will be read almost completely from the cache the
- second time it is read (some data may have been thrown out by accesses
- to the code of the test program, as the cache in the Intel 486 and
- compatible CPUs is a unified code and data cache). However, if you
- linearly read a block of 2n KB data twice, the second half of the
- data block will throw out the first half of the data block that is
- already in the cache. On the second pass through, accesses to the
- first half of the block will result in a miss for every cache line
- accessed, as the cache now contains the data from the second n KBytes
- of the block. If the times to read data blocks of 2^i KBytes are
- recorded, one sees a sharp increase in read time as soon as a data
- block larger than the cache size is read. COMPTEST uses block moves
- instead of the block reads in the example, as I have found this to
- be somewhat more reliable.
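
The search for sharp drops in throughput can be illustrated with a small sketch. It is written in Python, the timing values are invented for the example, and the jump factor of 1.5 is an assumption; COMPTEST's own threshold is not given in this document.

```python
def detect_cache_sizes(time_per_kb, jump_factor=1.5):
    """Return the block sizes (in KB) after which the time per KB jumps.

    time_per_kb maps block sizes of 2^i KB to the measured time per KB moved.
    A jump between one size and the next larger one suggests that the smaller
    size still fit into a cache while the larger one did not.
    """
    sizes = sorted(time_per_kb)
    found = []
    for smaller, larger in zip(sizes, sizes[1:]):
        if time_per_kb[larger] > jump_factor * time_per_kb[smaller]:
            found.append(smaller)
    return found


# Invented timings for a system with 8 KB internal and 256 KB external cache.
timings = {1: 1.0, 2: 1.0, 4: 1.0, 8: 1.0, 16: 2.5, 32: 2.5,
           64: 2.5, 128: 2.5, 256: 2.5, 512: 6.0}
cache_sizes = detect_cache_sizes(timings)  # [8, 256]
```

Since a jump can only be seen between two tested sizes, a largest block of 512 KB means that 256 KB is the largest cache this approach can identify, which matches the limitation described above.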
-
-
- o Maximum RAM throughput (without cache):
- This is a measure of the quality of the main memory system of the
- machine tested. As with all throughput numbers, higher numbers are
- better. Maximum throughput is determined by executing block move
- instructions moving blocks on addresses divisible by four. For processors
- up to and including the 80286, 16 bit transfers are used to move the
- data. For the 80386 and newer processors, 32 bit transfers are used to
- correctly reflect the higher memory throughput that is possible using
- instructions that can handle 32 bit data. By moving memory blocks
- that are bigger than CPU caches that may be present, COMPTEST tries
- to defeat the cache strategy and to measure the true speed of system
- RAM as if no cache(s) were present. However, this technique can back-
- fire so the values given by COMPTEST for RAM throughput without cache
- may be different from those that COMPTEST determines if the cache(s)
- are physically disabled (usually possible through the BIOS setup). The
- value reported by COMPTEST for RAM throughput without cache is really
- the memory throughput with the maximum possible number of cache misses.
- With a decent cache controller, the memory access speed in case of
- a cache miss is the same as if no cache were present at all. However,
- some cache controllers impose an additional overhead on such a memory
- access that may be as large as 40 clock cycles per cache line loaded,
- as opposed to 3-4 clocks for a good cache controller. In these cases,
- COMPTEST reports up to 6 wait states or even more for RAM access
- without cache. Even if COMPTEST does not report the correct value
- for the RAM throughput in these cases, it is still a valuable indicator
- of the quality of the cache/memory subsystem implementation. The
- higher the throughput reported, the better the system performs. On
- systems with no CPU cache, COMPTEST reliably measures the true throughput
- of the system RAM. Based on the measured throughput as compared to
- the maximum throughput for the detected CPU, COMPTEST also computes
- the equivalent number of wait states. This is not an integral number
- due to the fact that the number of wait states is usually not the
- same for every memory access. COMPTEST reports the *average* number
- of wait states needed. With wait states, lower numbers are better.
- Older 80286 based systems going faster than 10 MHz usually have at
- least 0.6-0.7 wait states but on a recently tested 80286 system running
- at 16 MHz, COMPTEST reported 0.1 wait states, which reflects the state
- of the art in memory system design. 80386DX and 80486 based systems
- typically have no less than 1.6 wait states. A brand-new 386SX system
- based on the Am386SX-33 had only 0.3 wait states, though. There are
- machines that use fast SRAM for the 640 KB base memory that doesn't
- force the CPU to insert wait states when accessing this memory. Please
- note that systems using clock doubled chips will often report more
- wait states than systems in which the CPU runs at the same speed as
- the other system components including the memory subsystem. In systems
- with a clock doubled CPU, memory always looks slow to the CPU, as one
- clock cycle on the memory data bus equals two internal CPU clock cycles.
- Since COMPTEST reports wait states measured in CPU clock cycles, one
- will see the wait states reported double when changing from a 486DX-33
- to a 486DX2-66 on the same motherboard. For example, a board for which
- COMPTEST reported 2.6 wait states when run with a 486DX-33 CPU will have
- 5.2 wait states reported after changing to a 486DX2-66 CPU. Reporting
- the wait states measured in internal CPU clock cycles is well justified,
- as the number of wait states tells the user something about the relative
- speed of the memory compared to the speed of the CPU. Clearly, a memory
- subsystem running at 33 MHz is more adequate for a CPU running at 33 MHz
- than for one running at 66 MHz. The RAM throughput measured by COMPTEST
- refers only to base memory (first 640 KB of memory or less).
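
Two points from the discussion of wait states can be made concrete with a little arithmetic: why the reported number is usually not an integer, and why it doubles with a clock-doubled CPU. The Python sketch below is illustrative only. The function names are hypothetical, and the way COMPTEST derives wait states from throughput is not reproduced here.

```python
def average_wait_states(waits_per_access):
    """Average over accesses that each needed a different number of waits."""
    return sum(waits_per_access) / len(waits_per_access)


def wait_states_in_cpu_clocks(bus_wait_states, clock_multiplier):
    """Express wait states counted on the memory bus in internal CPU clocks.

    clock_multiplier is 1 for an ordinary CPU and 2 for a clock-doubled one.
    """
    return bus_wait_states * clock_multiplier


# Three accesses without waits and one with two waits average to 0.5.
mixed = average_wait_states([0, 0, 0, 2])
# The example from the text: 2.6 wait states become 5.2 on a 486DX2-66.
doubled = wait_states_in_cpu_clocks(2.6, 2)
```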
-
-
- o Cache Throughput:
- This information is only reported if COMPTEST has found a CPU cache
- in the machine. As with memory throughput, higher numbers are better.
- Cache throughput is determined by performing block moves on addresses
- divisible by four within the cache memory using the CPU's MOVS
- instruction with the maximum data width available. If two levels of
- cache are present in the system (e.g. a 80486 system with an external
- cache of 256 KB and the 8 KB of internal cache in the 80486), the
- throughput for both is reported. You will see a performance drop
- going down the memory hierarchy. First level caches usually run with
- no wait states, that is with the full speed supported by the CPU.
- The second level cache has less throughput than the first level cache,
- but is several times larger. The system memory is even slower than
- the second level cache but much bigger. The cache throughput rates
- reported by COMPTEST usually are accurate indicators of the cache
- performance. Write-back caches may have higher throughput rates
- reported than write-through caches, as the block move performs reads
- and writes. However, write-back caches also show higher performance
- when used with real applications, so the higher performance indicated
- by COMPTEST is probably justified.
-
-
- o System memory:
- System memory here refers to the base memory within the first megabyte
- of the address space and below the start of a graphics adapter's
- display memory. COMPTEST searches for RAM in small steps from the
- bottom to the top of the address space until it reaches a graphics
- adapter or no more contiguous RAM is found. Note that this value
- can be larger than the usual 640 KB. For example, on systems that
- have only a CGA, system memory could be expanded up to address B8000h
- for a total of 736 KBytes of system memory. Similarly, system memory
- could be expanded to 704 KBytes on a system using a monochrome
- Hercules card. There are special memory cards that allow such extensions.
- Also, utilities such as the VIDRAM program included with Quarterdeck's
- QEMM memory manager can expand the base memory by either using part
- of a VGA's or EGA's display memory as system memory or mapping
- extended memory into this range, and disabling part of the EGA/VGA's
- capabilities to make basically a CGA out of them.
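
The figures of 736 KBytes and 704 KBytes follow directly from the start address of the display memory. The short Python sketch below works through the arithmetic. The table and function names are hypothetical; the start addresses are the usual ones for these adapters.

```python
# Start of display memory for common adapters (physical addresses).
DISPLAY_MEMORY_START = {
    "MDA/Hercules": 0xB0000,
    "CGA": 0xB8000,
    "EGA/VGA": 0xA0000,
}


def max_base_memory_kb(adapter):
    """Largest contiguous base memory, in KB, below the display memory."""
    return DISPLAY_MEMORY_START[adapter] // 1024


cga_limit = max_base_memory_kb("CGA")                # 736
hercules_limit = max_base_memory_kb("MDA/Hercules")  # 704
vga_limit = max_base_memory_kb("EGA/VGA")            # 640
```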
-
-
- o Memory available to DOS:
- This is the amount of system memory (base memory) that the BIOS
- reports to DOS. The BIOS determines the amount of system memory
- during a cold boot and stores the result in its data block which
- starts at address 400h. Several utility programs that extend the
- system memory above 640 KB, such as the VIDRAM program that comes
- with Quarterdeck's QEMM memory manager, manipulate this value to
- reflect the greater amount of memory now available to DOS. Note
- that the value reported by COMPTEST does not include DOS memory
- in UMBs created by MS-DOS 5.0 or similar programs. This test is
- a bit out of date and could be updated to support the new features
- of the latest DOS versions.
-
-
- o Memory permanently used by DOS and TSRs:
- Like the previous test, this one is outdated in that it doesn't take
- into consideration the state of the art with regard to DOS memory
- management. All memory below the address at which COMPTEST is loaded
- is assumed to be unavailable to DOS. Device drivers and TSRs that are
- loaded high are not included in the amount of memory reported. Use
- the program MEMMAP also included in the COMPTEST archive file to get
- a detailed list of memory blocks allocated to DOS and TSRs.
-
-
- o Extended memory (INT 15h throughput):
- Extended memory is that part of a PC's memory that is above the
- first megabyte of the CPU's address space. It can be available only
- on those computers that have an 80286 or newer CPU. Except for the
- first 64 KB block of extended memory, which is called HMA (high
- memory area), it can only be accessed in the protected mode of
- these processors. COMPTEST reports the amount of extended memory
- as determined by the system BIOS during a cold boot. This value is
- stored in the CMOS RAM of the real time clock of AT type machines.
- On newer PCs, this CMOS has been physically incorporated into the
- chip set that contains most of the discrete logic of older PCs in
- two or three chips. COMPTEST reads the amount of extended memory
- directly from the CMOS RAM. Note that the sum of base memory
- and extended memory may not add up to the total memory installed
- in the machine, as some of this memory may be used to shadow the
- BIOS ROM and/or ROM extensions (e.g. VGA BIOS) and is therefore
- unavailable for other purposes. Shadowing means that the code in
- the ROMs is copied to RAM during a cold boot and that this RAM is
- mapped to the same address as the shadowed ROM. Since ROMs are slow,
- code in the ROMs (e.g. BIOS) executes slowly and can be sped up a
- lot by shadowing. Extended memory can be accessed in several ways.
- One way is to use the services of an XMS (extended memory specification)
- driver such as HIMEM.SYS. This, however, requires such a driver to
- be loaded. There are also functions provided by the BIOS via INT 15h
- to access extended memory. COMPTEST uses these to copy a block of
- memory from extended memory to the base memory below 640 KB and
- determines the transfer rate (throughput) by measuring the time it takes
- to copy the block. Using a memory manager such as QEMM usually causes
- the INT 15h functions to be mapped to the appropriate XMS calls, so
- the INT 15h throughput value may differ significantly depending on
- whether a memory manager is loaded or not. Not all BIOSes use 32-bit
- transfers for block copies from/to extended memory when it is
- possible (that is, on 386 and 486 based machines), so the INT 15h
- throughput from extended memory may be only half of the normal system
- memory throughput. Also, access to extended memory usually requires the
- CPU to be switched to protected mode and back, which causes considerable
- overhead when it is not done via the fast methods provided by most
- 386 and 486 chip sets, but uses the traditional method which involves
- using the keyboard controller.
-
-
- o Expanded Memory:
- Expanded Memory, specified in the EMS (expanded memory specification),
- was originally designed to provide 8086 and 8088 based computers, which
- have only a one MByte address space, with up to 32 MBytes of memory.
- The LIM (Lotus, Intel, Microsoft)-EMS is a standardized application
- interface that permits several implementation techniques. Memory cards
- which support expanded memory in hardware use sort of a bank switching
- technique. Up to four blocks of 16 KBytes each can be mapped into a
- contiguous 64 KB region in the address range C8000h-E0000h. This region
- is called the EMS page frame. The memory on such EMS cards can not be
- accessed as fast as the system memory on the motherboard in most
- computers, since the data has to travel over the relatively slow ISA
- bus. On the 386 and 486 based computers mostly used nowadays, expanded
- memory is usually provided by a memory manager like 386MAX, QEMM, or
- EMM386 that manages part of the extended memory (memory above the first
- MByte of the address space) as expanded memory. These programs use the
- MMU (memory management unit) built into these CPUs to map memory blocks
- from the extended memory to the EMS page frame. There are also programs
- that use the hard disk to provide the storage for expanded memory.
-
- COMPTEST tries to detect an EMS driver in the system. If it finds one,
- it will question the driver for the total EMS memory provided by it,
- the start address of the EMS page frame and the EMS version number with
- which the EMS driver complies. The current version of EMS is 4.0, which
- defines additional services over the previous version 3.2. COMPTEST does
- not detect a memory card with hardware support for EMS if the EMS driver
- for the card has not been loaded. COMPTEST determines the throughput from
- EMS memory by reserving an EMS page, mapping it into the page frame and
- doing a block copy from the mapped-in page.
-
- Note that the total amount of memory available to programs that can
- make use of expanded and extended memory may well be lower than the
- sum of the extended memory and the expanded memory, as some of the
- extended memory physically present in the machine and reported by
- COMPTEST may have been logically converted to EMS memory by an EMS
- driver. For example, the QEMM 6.0 memory manager provides extended
- memory according to XMS and expanded memory via a built-in EMS driver,
- and satisfies memory allocation requests from a common pool for both
- types of memory. So for a machine with 8 MBytes of physical memory it
- may report 7 MBytes of EMS and 7 MBytes of extended memory.
-
-
- o other RAM:
- COMPTEST tries to find additional RAM between the end of the graphics
- card's display memory and the start of the BIOS ROM. This RAM may be
- provided by special memory cards or by some chip sets like NEAT that
- can map memory to this region physically. It can also be provided by
- 386 memory managers that use the processor's MMU (memory management unit)
- to logically map memory to this region. The latest DOS versions (e.g.
- MS-DOS 5.0) can use such memory in the form of UMBs (upper memory
- blocks). COMPTEST may also find RAM that is present on network adapters
- or certain hard disk controllers. In such cases, COMPTEST may report
- numerous very short RAM blocks. If the length of these blocks is below
- one KByte, COMPTEST prints the size as 0 KB. The memory found on network
- adapters and hard disk controllers is of course not available to DOS.
- Rather, these adapters use the RAM for buffers or to hold certain
- variables. COMPTEST does not scan the address space above the display
- memory byte by byte to find RAM. Rather it tests every 256th byte if
- it is in RAM. The byte is tested by writing two different values to
- it and checking if both can be reliably read back.
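-
- The probing scheme described above can be sketched as follows. This is
- only an illustration in Python, not COMPTEST's Turbo Pascal/assembly
- code: the FakeMemory class, the test patterns 55h/AAh and the address
- range are assumptions made for the example.

```python
class FakeMemory:
    """Hypothetical stand-in for the adapter address space (not real hardware)."""

    def __init__(self, ram_ranges):
        self.ram_ranges = ram_ranges      # list of (start, end) with end exclusive
        self.cells = {}

    def _is_ram(self, addr):
        return any(lo <= addr < hi for lo, hi in self.ram_ranges)

    def read(self, addr):
        # Non-RAM locations are modelled as always reading FFh.
        return self.cells.get(addr, 0x00) if self._is_ram(addr) else 0xFF

    def write(self, addr, value):
        # Writes to non-RAM locations are silently ignored.
        if self._is_ram(addr):
            self.cells[addr] = value & 0xFF


def find_ram_blocks(memory, start, end, step=256):
    """Probe every 256th byte; return merged (start, length) blocks of RAM."""
    blocks = []
    for addr in range(start, end, step):
        original = memory.read(addr)
        is_ram = True
        for pattern in (0x55, 0xAA):      # two different values, as in the text
            memory.write(addr, pattern)
            if memory.read(addr) != pattern:
                is_ram = False
                break
        memory.write(addr, original)      # restore the probed byte
        if not is_ram:
            continue
        if blocks and blocks[-1][0] + blocks[-1][1] == addr:
            blocks[-1] = (blocks[-1][0], blocks[-1][1] + step)
        else:
            blocks.append((addr, step))
    return blocks


mem = FakeMemory([(0xD0000, 0xD4000)])
for block_start, length in find_ram_blocks(mem, 0xC8000, 0xF0000):
    print(hex(block_start), length // 1024, "KB")
```

- Writing two different values guards against a ROM location that happens
- to contain one of the test patterns already.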
-
-
- o BIOS-extensions:
- COMPTEST searches for BIOS-extensions such as a VGA-BIOS or hard disk
- BIOS between the end of the graphics adapter's display memory and the
- start of the main BIOS-ROM. COMPTEST checks in steps of 256 bytes if
- the next two bytes read 55h, AAh, which is the common ID with which
- all BIOS extensions start. If the sequence 55h, AAh is found, COMPTEST
- reads the next byte, which stores the length of the BIOS-ROM measured
- in 0.5 KByte blocks, if a BIOS-extension is indeed present. All bytes
- in the memory region specified by the length byte are summed up. If
- the 8 lowest bits of the sum are found to be all zero, a valid BIOS
- extension has been found. COMPTEST then tries to determine if the BIOS
- extension found is a hard disk BIOS or an EGA/VGA BIOS and displays
- that additional information where applicable. Note that the same block
- of memory can be displayed as both a BIOS extension and extra RAM.
- This usually indicates that BIOS shadowing is being used and that
- the shadowed BIOS has not been write protected.
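-
- The signature and checksum test described above can be illustrated with
- a short Python sketch operating on a byte image of the adapter region.
- The function names, the base address C0000h and the make_rom helper are
- assumptions for this example; COMPTEST itself reads the memory directly.

```python
def find_bios_extensions(image, base=0xC0000, step=256):
    """Scan a memory image in 256-byte steps for valid BIOS extensions."""
    found = []
    for offset in range(0, len(image) - 2, step):
        if image[offset] != 0x55 or image[offset + 1] != 0xAA:
            continue                              # no 55h, AAh signature here
        length = image[offset + 2] * 512          # length byte counts 0.5 KByte blocks
        block = image[offset:offset + length]
        if length == 0 or len(block) != length:
            continue                              # empty or truncated block
        if sum(block) % 256 == 0:                 # 8 lowest bits of the sum are zero
            found.append((base + offset, length))
    return found


def make_rom(blocks, fill=0x90):
    """Hypothetical helper: build a ROM image with a correct checksum byte."""
    rom = bytearray([0x55, 0xAA, blocks]) + bytearray([fill]) * (blocks * 512 - 3)
    rom[-1] = 0
    rom[-1] = (-sum(rom)) % 256                   # make the byte sum a multiple of 256
    return bytes(rom)


image = bytes(0x800) + make_rom(4) + bytes(0x1000)
for address, length in find_bios_extensions(image):
    print(hex(address), length, "bytes")
```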
-
-
- o parallel ports:
- COMPTEST prints the number of parallel ports as reported in the
- BIOS's equipment byte.
-
-
- o serial ports:
- PCs are commonly prepared to manage up to four serial ports in the
- system. The BIOS checks for serial ports installed during a cold
- boot and stores the number of serial ports found in the BIOS' data
- area (starting at address 400h). It also determines the start address
- for the block of IO-ports each serial port occupies. COMPTEST evaluates
- this information. It also tests for serial ports (UARTs = universal
- asynchronous receiver transmitter) itself, searching the four
- standardized IO starting addresses used in PCs (3F8h, 2F8h, 3E8h, 2E8h).
- First it tries to establish whether a UART is present at the specified
- address by trying to switch on the loop back feature of the UART and
- then transferring a byte through the loop. If this test passes, COMPTEST
- assumes that a UART is present, since it is highly unlikely that any
- other device would correspond to the same sequence of instructions in
- the same manner. COMPTEST then tries to find out whether the UART chip
- used is an 8250, 16450, 16550, or 16550A. The 8250 is the UART chip used
- in PC compatibles. The 16450 is the successor to the 8250 chip. It
- supports transfers at higher baud rates than the 8250. It also features
- a scratch register that the 8250 does not have. By trying to store
- values in the scratch register and read them back, COMPTEST determines
- whether an 8250 or a 16450 is in the system. The 82450 is the same chip
- as the 16450 produced by a different manufacturer and is reported as
- a 16450 by COMPTEST. The 16550 is a 16450 with added send and receive
- FIFO buffers. This makes for more reliable communication and higher
- effective transfer rates in interrupt driven serial communication. The
- original 16550 had a bug that was fixed in the 16550A. The 16550 has
- two status bits that reflect the status of the send/receive FIFOs. On
- the 16550 only one of these bits works correctly, while on the 16550A,
- both of them perform as expected. This is used by COMPTEST to distinguish
- between the two chips.
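-
- The decision sequence described above might look roughly like the
- following Python sketch. It is not COMPTEST's code: the FakeUart class
- only models the register behavior needed here, and the register offsets
- (scratch register at base+7, FIFO control and interrupt identification
- at base+2, FIFO status in bits 6 and 7) follow the usual 16550 register
- layout rather than anything stated in this document.

```python
class FakeUart:
    """Hypothetical register model of the four UART types (illustration only)."""

    def __init__(self, kind, base=0x3F8):
        self.kind, self.base = kind, base
        self.scratch = 0
        self.fifo_enabled = False

    def outb(self, port, value):
        reg = port - self.base
        if reg == 7 and self.kind != "8250":      # the 8250 has no scratch register
            self.scratch = value & 0xFF
        elif reg == 2:                            # FIFO control register
            self.fifo_enabled = bool(value & 0x01)

    def inb(self, port):
        reg = port - self.base
        if reg == 7:
            return self.scratch if self.kind != "8250" else 0xFF
        if reg == 2:                              # interrupt identification register
            fifo_bits = {"16550": 0x80, "16550A": 0xC0}.get(self.kind, 0x00)
            return (fifo_bits if self.fifo_enabled else 0x00) | 0x01
        return 0xFF


def identify_uart(io, base):
    """Classify a UART by its scratch register and its FIFO status bits."""
    for pattern in (0x55, 0xAA):
        io.outb(base + 7, pattern)
        if io.inb(base + 7) != pattern:
            return "8250"                         # value could not be read back
    io.outb(base + 2, 0x01)                       # try to switch the FIFOs on
    fifo_bits = io.inb(base + 2) & 0xC0
    io.outb(base + 2, 0x00)                       # switch the FIFOs off again
    if fifo_bits == 0xC0:
        return "16550A"                           # both status bits work
    if fifo_bits == 0x80:
        return "16550"                            # only one status bit works
    return "16450"


for kind in ("8250", "16450", "16550", "16550A"):
    print(kind, "->", identify_uart(FakeUart(kind), 0x3F8))
```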
-
-
- o mathematical coprocessor:
- COMPTEST checks if an 80x87 mathematical coprocessor is present. If
- one is found, it does a detailed check on the type of coprocessor
- installed. It can determine the presence of the Intel 8087, Intel
- 80187, Intel 80287, Intel 287XL, Intel 80387, Intel 387DX, Intel
- 387SX, Intel RapidCAD, Cyrix 82S87, Cyrix 83S87, Cyrix 83D87, Cyrix
- 387+, Cyrix EMC87, IIT 2C87, IIT 3C87, IIT 3C87SX, ULSI 83S87, ULSI
- 83C87, C&T 38700DX, C&T 38700SX and coprocessor emulators using INT 7
- for emulation. The Cyrix EMC87 is a 387DX compatible coprocessor that
- also provides a memory mapped mode and goes into the EMC socket (found
- in most 386 based machines) that was originally designed for the Weitek
- coprocessors. Correct detection of most, but not all, of the chips
- mentioned has been tested. COMPTEST also tries to determine the presence
- of a Weitek Abacus 3167 or 4167 coprocessor by checking the BIOS'
- equipment status for a set Weitek bit. However, on most systems, this
- bit is only set if the Weitek coprocessor has been registered in the
- BIOS extended setup by the user, so it is not very reliable. COMPTEST
- does not check for physical presence of a Weitek coprocessor.
-
- COMPTEST takes advantage of the fact that 80x87 coprocessor instructions
- are ignored in systems with no coprocessor. It executes instructions
- that store the default status and control words of a coprocessor to
- memory. If no coprocessor is present, nothing gets stored in these
- memory locations. If the expected values are stored in these locations
- COMPTEST knows that a coprocessor is present.
-
- If a coprocessor has been found, COMPTEST tries to detect into which
- of the following four groups it belongs: emulator via INT 7, 80486,
- 8087/80287, all other coprocessors. If the emulation bit in the
- machine status word of a 286 or 386 CPU is set, COMPTEST assumes that
- the 'coprocessor' found is actually an emulator that emulates coprocessor
- instructions via the INT 7 trap. If the CPU was found to be an Intel
- 80486, COMPTEST knows that the coprocessor found is the FPU on this
- chip. The Intel 8087 and 80287 were designed before the IEEE-754 Standard
- for Binary Floating-Point Arithmetic was finally accepted in 1985. As
- opposed to all newer coprocessors, they implemented certain features
- no longer supported in the final form of the standard. One of these
- features was that they had two modes for handling infinities. In one
- mode, infinities were signed, in the other, all infinities did not
- carry a sign and were the same. COMPTEST uses this to separate the 8087
- and 80287 from other coprocessors. It generates an infinity by means
- of a division by zero, duplicates that infinity, changes the sign of
- the infinity and then compares the two values. On the 8087 and 80287,
- they will be reported to be identical, while on all other coprocessors,
- they are reported as different due to the different sign. This enables
- COMPTEST to distinguish the Intel 80287 from the 287XL and coprocessors
- compatible with the latter, such as Cyrix 82S87 and IIT 2C87. It also
- makes it possible to find out whether a 386 based system uses the 287
- or a 387 as the coprocessor.
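-
- The infinity comparison can be modelled in a few lines of Python. This
- is only a model of the two infinity modes, not a test that can be run
- against a real coprocessor. The function names are made up for the
- example, and math.inf replaces the division by zero used by COMPTEST,
- because Python raises an error on division by zero.

```python
import math


def infinities_compare_equal(projective_mode):
    """Model comparing +infinity with -infinity in either infinity mode."""
    positive = math.inf
    negative = -positive                  # "changes the sign of the infinity"
    if projective_mode:
        # Projective mode: a single unsigned infinity, so the sign is ignored.
        return math.isinf(positive) and math.isinf(negative)
    # Affine mode (the only mode left in IEEE-754): signed infinities differ.
    return positive == negative


def coprocessor_group(infinities_equal):
    """Map the comparison result to the group named in the text."""
    return "8087/80287" if infinities_equal else "287XL, 387 or newer"


print(coprocessor_group(infinities_compare_equal(projective_mode=True)))
print(coprocessor_group(infinities_compare_equal(projective_mode=False)))
```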
-
- The 287 and 387 compatible coprocessors from different manufacturers
- can be told apart by certain incompatibilities:
- The IIT coprocessors do not support denormal numbers in the coprocessor's
- internal format, while all other coprocessors do. COMPTEST tests for
- the IIT coprocessors by loading an extended precision denormal and
- adding that number to itself. On all coprocessors except the ones
- from IIT, this causes the denormal exception to be raised. Since the
- result is flushed to zero on the IIT coprocessors, the denormal exception
- is not raised.
- The ULSI coprocessors do not support the precision control feature of
- the other coprocessors. They compute all results in extended precision.
- To test for ULSI coprocessors, COMPTEST sets the precision control
- to 53 bits and then multiplies two numbers whose product can be
- represented exactly in the 64 mantissa bits of the extended precision
- format, but not in 53 mantissa bits. Therefore, the precision exception
- is raised on all coprocessors except the ones from ULSI.
- In the Cyrix coprocessors, several small bugs present in the Intel
- coprocessors have been fixed. One of them deals with the operation
- on NaNs. Intel's requirements state that an instruction that operates
- on two NaNs should return the larger of the two NaNs. However, if both
- NaNs have the same absolute value but different sign, Intel's coprocessors
- erroneously return the negative (and therefore smaller) NaN. The Cyrix
- coprocessors return the correct result in these cases. COMPTEST uses
- the FPATAN instruction to perform the test described. The successor
- to the 83D87 from Cyrix is the 387+. This is a "Europe-only" name; in
- other parts of the world, the new coprocessor is sold under the old
- name 83D87. The 387+ can be told apart from the 83D87 because of its
- extended argument range for the FYL2XP1 instruction. While the range
- for this instruction is restricted to -sqrt(2)/2..sqrt(2)/2 on all
- other 80x87 compatibles, it is unrestricted in the 387+. COMPTEST
- computes FYL2XP1 (1.0) and tests if the correct result (1.0) is returned.
- The Cyrix EMC87 can be told apart from other Cyrix coprocessors since
- the most significant bit of its control word can be written.
- The Intel 80387 exists in two versions. The newer one is called 387DX
- and provides about 20% more performance. One difference between these
- chips is what they get for the exponent when doing an FXTRACT of -1.0.
- While the older 80387 gets -0.0 as the answer, the newer 387DX gets
- +0.0. This difference is used by COMPTEST to decide which Intel 80387
- is in the machine.
- The coprocessors from Chips and Technologies are detected by the result
- they return for F2XM1 (pi). Note that F2XM1 is only defined for arguments
- in the interval -1..1. The C&T 38700 coprocessor returns pi/2 when F2XM1
- is called with an argument of pi. The Cyrix coprocessors return the same
- result, but are never submitted to the test for the C&T coprocessors.
- The Intel RapidCAD behaves like a 386/387 combination. One of the few
- differences is the way in which the value BCD INDEFINITE is stored.
- While the Intel 80387 and 387DX store it as FFFF 8000000000000000, the
- RapidCAD and the Intel 80486 store it as FFFF C000000000000000. This
- difference is used by COMPTEST to detect the RapidCAD chip.
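-
- As an illustration of the precision control test used for the ULSI
- chips, the following Python sketch picks a pair of operands whose
- product fits into 64 mantissa bits but not into 53. The operands
- (2^27 + 1 for both factors) are an assumption chosen for this example;
- the values actually used by COMPTEST are found in its source code.

```python
def significant_bits(value):
    """Number of mantissa bits needed to represent a positive integer exactly."""
    trailing_zeros = (value & -value).bit_length() - 1
    return value.bit_length() - trailing_zeros


a = b = 2**27 + 1
exact = a * b                              # exact integer product

print(significant_bits(exact))             # mantissa bits the product needs
print(significant_bits(exact) <= 64)       # exact in extended precision?
print(significant_bits(exact) <= 53)       # exact in double precision?

# Cross-check with Python floats, which carry 53 mantissa bits: the rounded
# product differs from the exact one, so a coprocessor that honors the
# 53-bit setting has to raise the precision (inexact) exception.
print(int(float(a) * float(b)) == exact)
```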
-
- COMPTEST measures the clock speed of the coprocessor by measuring the
- time it takes to execute a block of FSQRT instructions. This instruction
- was picked since it has a very stable execution time that varies
- only minimally and has a sufficiently high execution time. However,
- the execution time of coprocessor instructions in 286 and 386 systems
- may vary by a few clock cycles, depending on the chip set used. Also,
- in some systems, the CPU and the coprocessor run asynchronously,
- causing the execution time of coprocessor instructions to vary even
- more, since in 286 and 386 systems the CPU has to fetch the instructions
- and operands for the coprocessor.
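-
- The arithmetic behind such a measurement is simple and is sketched
- below in Python. The cycle count per FSQRT (125) and the timing values
- are hypothetical numbers chosen for illustration; the real cycle counts
- depend on the coprocessor type, and COMPTEST's own constants are not
- reproduced here.

```python
def coprocessor_clock_mhz(instructions, cycles_per_instruction, elapsed_seconds):
    """Clock frequency implied by timing a block of identical instructions."""
    total_cycles = instructions * cycles_per_instruction
    return total_cycles / elapsed_seconds / 1e6


# Hypothetical run: 10,000 FSQRTs at an assumed 125 cycles each take 50 ms.
nominal = coprocessor_clock_mhz(10_000, 125, 0.050)
print(round(nominal, 2), "MHz")

# A few cycles of chip-set dependent variation per instruction shift the
# result only slightly, which is why a long-running instruction is used.
low = coprocessor_clock_mhz(10_000, 125 - 3, 0.050)
high = coprocessor_clock_mhz(10_000, 125 + 3, 0.050)
print(round(low, 2), "to", round(high, 2), "MHz")
```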
-
-
- o mouse:
- COMPTEST tries to detect the presence of a mouse driver, not whether a
- mouse is physically hooked up to the PC. It calls a specific mouse
- driver function that returns the information which mouse button has
- been pressed. This has the advantage that the mouse driver's status
- is not changed.
-
-
- o games adapter:
- COMPTEST tries to determine the presence of a games adapter (used
- to connect up to two analog joysticks to the PC). This test has two
- stages: In the first stage COMPTEST asks the BIOS if a games adapter
- is present; in the second stage it tries to access the games adapter's
- registers. Unfortunately, both methods seem to be highly unreliable,
- as COMPTEST usually reports that no games adapter could be found,
- even if one is installed.
-
-
- o DOS drives:
- COMPTEST determines the number of DOS drives by trying to set the
- default drive to DOS drives 0 to 8, and returns all those drives
- as valid DOS drives that can be used as DOS default drive. Note
- that this limits the number of DOS drives that COMPTEST recognizes
- to a maximum of nine drives. This can easily be expanded by some
- changes to the source code.
-
- o floppy drives:
- COMPTEST reports the number of floppy drives as reported in the BIOS
- equipment flag. In AT compatible systems, the type of each floppy
- drive is taken from the drive information found in the CMOS RAM in
- the real time clock. In modern systems, this RAM is now part of the
- system's chip set.
-
-
- o hard disks:
- COMPTEST recognizes up to four hard disks. For each drive, it calls
- the 'drive ready' status function of the BIOS. Every drive that
- returns the 'ready' condition is included in the final tally.
- Some removable hard disks, such as Tandon's data packs, that require
- special drivers to hook into the DOS file system, are not recognized
- as hard disks by COMPTEST.
-
-
- o graphics card:
- COMPTEST reports only one graphics card in the system, even if two
- are installed (e.g. EGA and Hercules). It recognizes MDA and CGA
- (found in the original IBM-PC), EGA (introduced with the IBM-AT),
- MCGA and VGA (introduced with IBM's PS/2), also the monochrome
- Hercules card and IBM's PGA. No attempt is made to distinguish
- between the many different chip sets used in today's VGAs (e.g.
- TVGA, Tseng ET4000, Video 7). COMPTEST will also not recognize the
- Hercules RAMFont and Hercules InColor cards, 8514/A, Tiga cards or
- other accelerated graphics cards. COMPTEST detects most of the
- graphics cards it recognizes by making calls to certain functions
- in their BIOS. For EGA cards, it also reports the amount of memory
- on the adapter as reported by the EGA-BIOS.
-
-
- o Video-RAM wait states:
- The CPU usually cannot access the display memory on a graphics card
- at full speed, for a number of reasons. First, as the CRT controller on the
- graphics adapter has to read out the display memory to generate the
- CRT signals and the DRAM found on most graphics cards does not allow
- simultaneous access from the CPU and the CRT controller, the CPU may
- have to wait until the CRT controller has finished its access to a
- particular part of display memory. Second, data transferred to/from
- a graphics card has to travel over the PC's system bus, which has a
- limited throughput that is much smaller than the memory bandwidth of
- the CPU, thus slowing down the average memory access over the bus. The
- ISA bus found in most PCs is particularly slow, while the MCA and
- EISA busses provide more bandwidth. To overcome this problem, some
- manufacturers have chosen to integrate the graphics adapter on the
- mother board or couple the graphics adapter more closely to the CPU
- using a technique called local bus. Local busses are direct extensions
- of the CPU's busses. They usually run at the full speed of the system
- and provide high bandwidth, but can only drive a limited number of
- cards. A popular form of the local bus concept is the VESA local bus
- (VLB), for which numerous graphics cards are now available. Third,
- for fast machines, the speed of the DRAM chips on many graphics cards
- (60-70 ns at best) is too slow to allow zero wait state operation of
- video memory accesses. This is the same problem that affects memory
- accesses to the system RAM in these machines. Use of VRAM eases the
- problem somewhat, so all fast graphics cards now use VRAM. Fourth,
- the bus interface used by the graphics card's chip set may introduce
- additional slow down due to the physical organization of the display
- memory (e.g. remapping word accesses to byte accesses). End users can
- influence the raw video throughput (and thereby the number of wait
- states) by selecting a graphics card with a fast chip set and by
- configuring their system to use as high a bus speed as possible.
- Typical numbers for video-RAM wait states are 1 wait state per MHz
- CPU clock frequency for Hercules cards, ~15 wait states for EGA cards,
- and as low as 7 wait states for fast VGA cards (e.g. those using the
- ET4000 chip set). Note that on 486-DX2 systems, the number of wait
- states is usually higher since the whole system runs at half the
- speed of the CPU. To measure wait states for video-RAM accesses,
- COMPTEST stores a block of data to the video adapter and measures
- the time to do that. This time is then compared to the time it would
- take to store this data in zero wait state memory. From these values
- the number of wait states is computed. COMPTEST uses the memory at
- address B0000h for monochrome modes and B8000h for color modes for
- this test. On some graphics cards, the memory access at these addresses
- may have a higher number of wait states than in other parts of the
- screen memory (e.g. the memory at A0000h used by high resolution
- graphics modes of EGA/VGA cards). There is also a PD program
- called VIDSPEED that uses a technique similar to COMPTEST's to
- report video-RAM throughput and vertical and horizontal retrace
- frequencies. Note that for graphics cards with accelerator chips,
- the speed with which the CPU can access the RAM on the graphics card
- is not a good indicator of the Windows, OS/2, or X-Windows performance,
- as many operations on the cards are performed on the card itself.
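-
- How a wait state figure can be derived from the two times mentioned
- above is sketched here in Python. The formula, the assumed two clock
- cycles per zero wait state access and all numbers are assumptions for
- illustration, not COMPTEST's actual computation.

```python
def video_wait_states(measured_seconds, accesses, clock_hz, zero_ws_cycles=2):
    """Estimate the extra clock cycles (wait states) per video memory access."""
    # Time the same number of accesses would take in zero wait state memory.
    ideal_seconds = accesses * zero_ws_cycles / clock_hz
    extra_seconds = measured_seconds - ideal_seconds
    return extra_seconds * clock_hz / accesses


# Hypothetical measurement: 16,384 stores on a 33 MHz CPU, each store
# taking 9 clock cycles instead of the assumed 2.
clock_hz = 33_000_000
measured = 16_384 * 9 / clock_hz
print(round(video_wait_states(measured, 16_384, clock_hz), 3), "wait states")
```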
-
-
- o Speed of video output via BIOS:
- The speed is given in characters output per second as measured for
- function #9 of the video BIOS (write character with attribute). Note
- that there are other video-BIOS functions that write characters
- to the screen, which may be faster or slower than the function
- chosen for COMPTEST. For this reason, it is hard to compare the
- output speed determined by COMPTEST with that of other system diagnostic
- programs such as DiagSoft's PowerMeter. As this test heavily exercises
- code in the video-BIOS, there may be huge performance differences
- between a shadowed and a non-shadowed BIOS. BIOS shadowing means
- that the BIOS code is copied from slow ROMs to fast RAM for faster
- execution at system start up. This is an option in most 286/386/486
- based systems that operate at more than 10 MHz. BIOS throughput drops
- when 386 memory managers are in use, since these programs
- intercept all interrupts and therefore introduce a considerable
- overhead into the execution of BIOS interrupts. There may also be
- TSRs that hook the video BIOS interrupt and cause BIOS throughput to
- drop even further. The typical BIOS throughput in fast 386 and 486
- systems usually exceeds 100,000 characters per second. Due to the
- trend to GUIs (graphical user interfaces), the output speed of the
- BIOS has lost its earlier importance. Most programs do not even
- use the BIOS for character based output, but rather write to the
- screen directly. Note that scrolling may reduce the output speed
- significantly and is not included in the BIOS throughput test by
- COMPTEST.
-
-
- o Speed of video output via DOS:
- The speed is given in characters output per second as measured for
- DOS function #9 (print string). Note that the DOS file functions
- can also be used to print to the screen and that the output speed
- for these functions may differ from the output speed reported. Also,
- the output speed may depend on the string length of the string to
- be printed due to varying amount of overhead while calling DOS. Since
- an ANSI driver usually causes a much slower DOS video output due to
- the need of the driver to check the output stream for interpretable
- sequences, COMPTEST states if the speed shown refers to the output
- speed with or without an ANSI driver present. To check for an ANSI
- driver, COMPTEST prints an ANSI ESC sequence that causes an ANSI
- driver to report the cursor position by inserting the result string
- into the keyboard buffer. COMPTEST then checks if this information
- has arrived in the keyboard buffer and assumes the presence of an
- ANSI device driver if it finds the information in the keyboard buffer.
- Besides the use of an ANSI or similar terminal driver (e.g. EANSI,
- NANSI), the use of a 386 memory manager such as 386MAX, QEMM, or EMM386
- can slow down DOS video throughput as the use of these programs causes
- a higher overhead in the interrupt calls used in the code that prints
- to the screen.
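-
- The ANSI driver check described above relies on the "device status
- report" sequence ESC [ 6 n, to which an ANSI driver answers with a
- cursor position report of the form ESC [ row ; column R. The Python
- sketch below shows only the recognition of such an answer in the
- keyboard buffer. The function names are made up for this example, and
- the buffer is passed in as a string instead of being read from the BIOS.

```python
import re

QUERY = "\x1b[6n"                                  # device status report request
REPORT = re.compile(r"\x1b\[(\d+);(\d+)R")         # cursor position report


def parse_cursor_report(keyboard_buffer):
    """Return (row, column) if the buffer holds a cursor position report."""
    match = REPORT.search(keyboard_buffer)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def ansi_driver_present(keyboard_buffer):
    """An ANSI driver is assumed present if it answered the query."""
    return parse_cursor_report(keyboard_buffer) is not None


print(ansi_driver_present("\x1b[12;40R"))          # answer from an ANSI driver
print(ansi_driver_present(""))                     # nothing arrived in the buffer
```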
-
-
- o DOS version:
- Shows the DOS version as reported by a call to the DOSGetVersion
- function of MS-DOS. For version 5.0 or later, this may not be
- the true DOS version, since the DOS version number reported to
- an application can be manipulated using the SETVER utility of DOS.
-
-
- o Standard benchmarks:
- COMPTEST uses three widely known standard benchmarks to provide some
- measurements of system performance. Since results for these benchmarks
- depend not only on the speed of the hardware, but also on the code
- quality of the compiler, only the relative performance to the original
- IBM PC displayed by COMPTEST is really significant. The reference
- numbers used in COMPTEST were determined using my own fast replacement
- for Turbo Pascal 6.0's run-time library (available as TPL60N19.ZIP
- from garbo.uwasa.fi and additional ftp sites). The compiler switch
- settings were the same as those found in the source code of COMPTEST
- 2.59. If you use another compiler, e.g. Stony Brook Pascal+, which is
- an optimizing compiler mostly compatible with TP 6.0, or use other
- switch settings, you *must* determine new reference values for the
- IBM PC if the PC relative performance numbers are to be of any use.
-
-
- o Dhrystones:
- The results of running the Dhrystone benchmark, a synthetic benchmark
- that is supposedly representative of integer applications. Note that
- Dhrystone performance depends on the hardware as much as on the
- compiler. Therefore, Dhrystone numbers by other system test programs
- may be higher or lower than those reported by COMPTEST, depending on
- whether or not they were compiled with an optimizing compiler or run
- as a 16-bit or a 32-bit program. There are different versions of
- Dhrystone, the version used here (2.1) is the latest available from
- the author of the benchmark, Reinhold Weicker. The Dhrystone code
- fits well into a rather small cache (8 KB will be sufficient), so
- for systems with CPU caches it tests only CPU performance, *not*
- the performance of the memory system. To make the Dhrystone performance
- as determined by COMPTEST useful, the relative performance as
- compared to the original IBM-PC is given.
-
-
- o Whetstones:
- The results of running the Whetstone benchmark, a synthetic benchmark
- that stresses mainly floating point performance, including
- transcendental functions like Sin or Exp. As with the Dhrystone numbers, the
- results of the Whetstone benchmark depend as much on the hardware
- as on the code quality of the compiler (whether optimizing or not),
- although the compiler dependency is usually somewhat less than with
- the Dhrystone benchmark. Therefore, Whetstone numbers as determined
- by COMPTEST should not be compared to those determined by other programs.
- To make Whetstone results useful, the performance is also rated in
- comparison with the original IBM-PC. Note that the test uses software
- emulation of the coprocessor if the machine tested does not have
- an 80x87 mathematical coprocessor, and that in this case the performance
- is compared to the equivalent PC configuration, that is a PC without
- an 8087. There are two versions of the Whetstone benchmark, an older
- version derived from the original article published in 1976 and a newer
- version that includes sanity checks. The latest available version
- acquired from one of the original authors (Brian Wichmann) is used here.
-
-
- o MFLOPS:
- This benchmark result tells you how many millions of basic floating
- point operations (add, subtract, multiply) the tested machine is able
- to execute per second. This number is determined by running an older
- version of the Lawrence Livermore Loops, a set of 14 kernels taken
- from *real* number crunching programs and computing the average MFLOPS
- (Millions of FLoating-point OPerations per Second). There is a newer
- suite of LLL out that uses 24 kernels and provides more detailed
- diagnostics of the floating point performance. Due to its size, it
- could not easily be integrated into COMPTEST, so the older (and simpler)
- version was used. The LLL benchmark uses about 60 KB of RAM, so results
- may be influenced by the size of the CPU cache, if any is installed.
- For reference, the MFLOPS are compared to the performance of an IBM-PC
- with 8087 (if the tested machine also has a coprocessor) or to a plain
- IBM-PC using the software emulator (if the machine tested does *not*
- have a coprocessor). As with the other benchmarks, the LLL performance
- depends not only on the hardware, but also on the compiler used. Highly
- optimizing compilers that make use of 32-bit instructions where possible
- would give an MFLOPS rating that is about 50% higher.
-
-
- o Hard disk data:
- COMPTEST usually is able to test all hard disks in your system,
- regardless of whether they use the ST506, IDE, ESDI, or SCSI interface.
- However, it may fail to detect some special types of hard disks like the
- removable data packs found on some Tandon computers that use special
- drivers to hook into the DOS file system.
-
-
- o Hard disk geometry (# cylinders, # read/write heads, sectors per track):
- These disk parameters are given as reported by the BIOS. The parameters
- given may not reflect the physical geometry of the disk. For example,
- if a disk uses the zone bit recording technique, there is no
- fixed ratio of sectors per track, rather the number of sectors per
- track is greater in the outer zones and smaller in the inner zones.
- So the parameters given reflect the logical layout of the disk as
- seen by the BIOS, which may or may not coincide with the physical
- layout of the disk.
-
-
- o Hard disk storage capacity:
- The *formatted* capacity of the disk is given in bytes and MB. Note
- that a MB contains 1024x1024 = 1,048,576 bytes if computed correctly.
- Some disk manufacturers (e.g. Quantum) state the storage capacity in
- MBs consisting of only 1,000,000 bytes. Therefore, the capacity in MB
- as reported by COMPTEST may be lower than the capacity the manufacturer
- claims for the disk. Also, the capacity usable under DOS may be
- even smaller, since DOS allocates some disk memory to build allocation
- structures like partition tables or FATs.
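-
- The difference between the two kinds of MB is easily shown with a
- small Python calculation. The geometry used (1024 cylinders, 16 heads,
- 63 sectors per track) and the sector size of 512 bytes are assumptions
- chosen for this example.

```python
def disk_capacity(cylinders, heads, sectors_per_track, bytes_per_sector=512):
    """Return capacity in bytes, in MB of 1,048,576 bytes and in 10^6-byte MB."""
    total_bytes = cylinders * heads * sectors_per_track * bytes_per_sector
    return total_bytes, total_bytes / 2**20, total_bytes / 10**6


total, binary_mb, decimal_mb = disk_capacity(1024, 16, 63)
print(total, "bytes")
print(binary_mb, "MB (1,048,576 bytes each)")
print(decimal_mb, "MB (1,000,000 bytes each)")
```

- For this geometry the manufacturer's figure would be about 528 MB,
- while a program counting 1,048,576 bytes per MB arrives at 504 MB.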
-
-
- o Track-to-track seek time:
- The time it takes to move the read/write heads of your hard disks
- from one cylinder to an adjacent cylinder. The time reported may
- be zero when using certain disk cache programs (e.g. HyperDisk),
- as these suppress unnecessary head movements if no data is read
- or written in the process (as happens when COMPTEST does this
- test). Also, COMPTEST may be fooled by the so called translation
- modes mainly used by certain hard disks using the IDE interface
- to overcome limitations to the maximum number of cylinders set
- by the BIOS or to accommodate the fixed sectors/track scheme of
- PCs to the modern zone bit recording technique. With translation
- mode enabled, two logical tracks can reside on the same physical
- track, essentially nullifying the time to move between the logical
- tracks. COMPTEST moves the read/write heads over all tracks in single
- track steps and divides the total time by the number of tracks moved.
- This provides an average time, since movements between adjacent tracks
- may take different times depending on the absolute location of the
- tracks.
-
-
- o Average seek time:
- The time needed on the *average* to position the read/write heads
- over an arbitrarily selected cylinder on the disk. This time is
- roughly equal to the time it takes the heads to travel over one
- third of the total number of tracks. It can be shown that, if tracks
- are selected at random from a uniform distribution on [0..MaxTrack],
- the average difference between any pair of track numbers is equal
- to one third of the total number of tracks. Note however, that the
- assumption of a uniform distribution in track access patterns
- usually does *not* hold for practical file systems. Also, the
- average number of tracks traveled by read/write heads varies
- for the different zones of a disk using zone bit recording or
- using read/write queuing. For most disks however, the number
- reported by COMPTEST should be close to the number stated by the
- disk's manufacturer. When using certain disk cache programs, you
- will see very small times due to the fact that these programs
- suppress all unnecessary head movements. Note also that the
- average seek time by definition is not equal to the time needed
- on the average to access a specific sector on the disk. For this
- you have to add at least the rotational latency (the time needed
- until the read/write head reaches the designated sector on the
- track after having moved to the correct cylinder). This takes
- half of the time required for a full revolution of the disk in the
- average case, that is about 8.3 ms for a disk spinning at 3600 rpm. Newer
- disks often spin at a higher speed of 4400 rpm or more to reduce
- rotational latency.
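-
- Both the "one third" rule and the rotational latency can be checked
- with a few lines of Python. The sketch computes the exact average
- distance between two tracks chosen uniformly at random, which for N
- tracks works out to (N*N - 1) / (3*N). The function names and the
- track count of 100 are assumptions for this example.

```python
from fractions import Fraction


def mean_seek_distance(tracks):
    """Exact average |i - j| over all pairs of equally likely track numbers."""
    total = sum(abs(i - j) for i in range(tracks) for j in range(tracks))
    return Fraction(total, tracks * tracks)


def rotational_latency_ms(rpm):
    """Average rotational latency: half of one revolution, in milliseconds."""
    return 60_000 / rpm / 2


tracks = 100
mean = mean_seek_distance(tracks)
print(float(mean), "tracks on average, i.e.", float(mean / tracks), "of all tracks")
print(round(rotational_latency_ms(3600), 2), "ms at 3600 rpm")
print(round(rotational_latency_ms(4400), 2), "ms at 4400 rpm")
```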
-
-
- o maximum throughput:
- COMPTEST determines the maximum throughput of a disk in a manner similar
- to the well known CORETEST disk test program, by repeatedly reading the
- same block from disk. It uses the low level functions provided by
- the BIOS to maximize performance. The amount of data read is the
- data on one cylinder or 63 KBytes, whichever is smaller. Therefore,
- no movement of the read/write heads occurs during the read test.
- The disk's read throughput is determined for the first and the last
- cylinder on the disk. For hard disks using the zone bit recording
- technique, the throughput on the outermost cylinder can be about 50%
- higher than on the innermost cylinder, since more data is recorded
- on the outer cylinders. Since COMPTEST reports the maximum throughput,
- it always reports the higher of the two transfer rates it determines.
- The transfer rate in real applications is usually much lower than
- the maximum transfer rate reported by COMPTEST. By repeatedly reading
- the same block from disk, COMPTEST causes the block to be read into
- the track buffers and on-disk hardware caches present on most modern
- disks or the cache memory of a caching controller. If the block fits
- completely into such a cache, the transfer rate measured is actually
- the transfer rate between the buffer/cache and the system memory. A
- more realistic transfer rate could be determined by completely reading
- the data on several adjacent tracks. Since every data item is read
- only once, a cache will not inflate the transfer rate. The transfer
- rate determined by this process is called the linear read rate. It
- is used to measure disk performance in programs such as PCTOOLS 7.1's
- SI. The linear read rate can be useful to determine disk performance
- for operations on large files that are contiguous and are read
- sequentially with only a small amount of head movement. Many applications
- use random access files, though, and files may not be stored in
- contiguous form on the disk. In these circumstances, there is a
- considerable amount of head movement and the seek time and rotational
- latency cause overhead that further reduces the effective transfer
- rate available to applications. Note that COMPTEST uses the BIOS to
- access the disk, while applications make use of an operating system's
- file system that introduces additional overhead. Also, write access
- to a disk may be significantly slower than read accesses. Hardware
- caches on the disk or on the disk controller and software caches
- like HYPERDISK and SMARTDRV can provide better disk performance
- by storing frequently used data in cache memory that can be accessed
- faster than the disk itself. COMPTEST tries to determine the presence
- of a disk cache by repeatedly reading a small amount of data from
- different (non-adjacent) tracks. If no cache is present, the head
- movement to access the different tracks is the reason that the read
- time cannot fall below a certain level due to the physical limits
- that prevent a reduction in average seek times below about 12 ms.
- If a cache is present, all data can be held in the cache memory
- after the first read access and no additional head movements take
- place, thereby causing fast execution of COMPTEST's test to determine
- presence of a disk cache. COMPTEST does not differentiate between
- hardware and software caches.
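-
- The reasoning behind the cache test can be expressed as a simple
- decision rule, sketched below in Python. The rule itself (ignore the
- first pass, then compare the average read time against the 12 ms limit
- mentioned above) and the timing values are assumptions for
- illustration; COMPTEST's actual criterion is found in its source code.

```python
def cache_suspected(read_times_ms, min_seek_ms=12.0):
    """Guess whether repeated reads from scattered tracks were served by a cache."""
    # The first read has to come from the disk even if a cache is present,
    # so only the repeated reads are considered.
    repeated = read_times_ms[1:]
    average = sum(repeated) / len(repeated)
    # Without a cache every read involves a seek and cannot be much faster
    # than the physical seek time of the drive.
    return average < min_seek_ms


print(cache_suspected([25.0, 0.4, 0.5, 0.3]))      # hypothetical cached system
print(cache_suspected([25.0, 22.0, 27.0, 24.0]))   # hypothetical uncached system
```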