-
- DDDDDD FFFFFFF BBBBB
- DDDDDDD FFFFFFF BBBBBBB
- DD DD FF BB BB
- DD DD FFFFF BBBBB
- DD DD FFFFF BB BB
- DDDDDDD FF BBBBBBB
- DDDDDD FF BBBBB
-
- Data Flow Benchmark V 1.7
-
- (c) 1994,1995 by D.Engert
-
-
- 1. Legal stuff
-
- There is no warranty. Use this software at your own risk. Due to the
- complexity and variety of today's hardware and software which may be
- used to run this program, I am not responsible for any damage or loss of
- data caused by use of this software. It has been tested thoroughly and is
- expected to work correctly, but nobody can actually guarantee this under
- all circumstances. And because this software is free, you get what you
- pay for ...
-
- This program can be used freely for private or educational purposes. If
- you want to use it for commercial purposes or find any bugs or have
- suggestions about further enhancement, please contact the author.
-
- Author: Detlef D. Engert
- Gruentenweg 14
- D-90471 Nuernberg
- Germany
-
- Fax: +49-911-861319
- Mailer: +49-911-861319 UTC 18:00 - 23:00
- EMail: 2:2490/2576.1@fidonet
- engert@ibm.net
-
- 2. Purpose and intent of this program
-
- Today's hardware gets more and more powerful, but more complicated too. Modern
- motherboards using up-to-date chip sets may turn out to be very difficult to
- configure. And to make things worse, there are now several CPU manufacturers
- besides Intel, each with new features and options. The memory subsystems
- implemented on these motherboards are even harder to configure, given the
- different cache strategies, RAM speeds and access modes.
-
- Beyond the core of any computer system lie the peripherals (video, magnetic
- storage...) connected by a variety of bus implementations like ISA, EISA, VLB or
- PCI. Chip sets used on these peripherals are often of even higher complexity
- than the computer core.
-
- Even skilled users are often overwhelmed by the sheer complexity and variety
- of options offered. Nobody will tell them the real power available to them from
- a given computer system using a particular configuration set. How should a user
- optimize his or her computer, or how should a buyer choose between similar
- looking components based on hard facts? Maybe this program will help you!
-
-
- 3. What does this program offer?
-
- Let's have a look at the output (framed) of the current version run on my own
- machine.
-
- Machine configuration:
-
- ASUS P55TP4XE motherboard
- Intel P54C-100 CPU,
- Intel 'Triton' PCI/ISA chip set,
- 256 KByte synchronous cache RAM, 32 MByte of 60ns DRAM,
- Diamond Stealth 64 VRAM video board, 2 MByte VRAM
-
- All BIOS configuration settings are optimized for maximum throughput. That
- gives the following results:
-
- +-----------------------------------------------------------------------------+
- | Data flow benchmark v1.7 |
- | |
- | copyright (c) 1994,1995 D.Engert |
- | |
- | Processor : Intel Pentium P54 (Fam:5 Mod:2 Rev:5) |
- | Clock : 99.6 MHz |
- | virtual Interrupts : present |
- | Coprocessor : present |
- | Internal bus width : 32 bit between processor and primary cache |
- | External bus width : 32 bit between primary and secondary cache |
- | DRAM page size : 4 KByte |
- | MMU cache : 64 entries 2-way set associative, 4KByte per entry |
- | Primary cache : 8 KByte 2-way set associative |
- | Secondary cache : 256 KByte direct mapped |
- | Cache line size : 16 Bytes |
- | Cache strategy : write back, no dirty tag, dirty extra waits: 0.0 clocks|
- +-----------------------------------------------------------------------------+
-
- These figures are largely self-explanatory. The program detects the type and
- speed of the CPU and the width of the data paths between the CPU core and the
- primary cache (typically located on the same chip as the CPU core) and between
- the core or primary cache and the secondary cache (if present) or main memory.
- The program next tries to determine the effective page size, if page mode is
- implemented by the chip set. The following three lines show information about
- the address translation lookaside buffer (MMU cache) and the primary and
- secondary caches. Size and associativity are checked, the length of a cache
- line is determined, and the strategy used by the cache subsystem (write-through
- or write-back) is sensed.
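As a small worked example, the cache figures in the report above can be turned into line and set counts. The sketch below only does arithmetic on the numbers printed for my machine; the function name and constants are illustrative and say nothing about how DFB itself measures these values.

```python
# Figures copied from the sample report above.
LINE_SIZE = 16                      # bytes per cache line
PRIMARY_SIZE = 8 * 1024             # 8 KByte, 2-way set associative
PRIMARY_WAYS = 2
SECONDARY_SIZE = 256 * 1024         # 256 KByte, direct mapped (1 way)
SECONDARY_WAYS = 1
TLB_ENTRIES = 64                    # MMU cache entries
PAGE_SIZE = 4 * 1024                # bytes mapped per MMU cache entry


def cache_geometry(size, ways, line_size):
    """Return (lines, sets) for a cache of the given size and associativity."""
    lines = size // line_size
    sets = lines // ways
    return lines, sets


primary = cache_geometry(PRIMARY_SIZE, PRIMARY_WAYS, LINE_SIZE)
secondary = cache_geometry(SECONDARY_SIZE, SECONDARY_WAYS, LINE_SIZE)
tlb_reach = TLB_ENTRIES * PAGE_SIZE  # memory covered without a TLB miss

print("primary  :", primary)                      # (512, 256)
print("secondary:", secondary)                    # (16384, 16384)
print("TLB reach:", tlb_reach // 1024, "KByte")   # 256 KByte
```

On this machine the MMU cache therefore covers exactly as much memory as the secondary cache holds.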
-
- +-----------------------------------------------------------------------------+
- | Data flow and bus performance memory |
- | |
- | -- Memory -> CPU --------- |
- | Maximum 8K FETCH (Hits) : 7.7µs ( 769c) ->1053.4MB/s ( 0.09c/Byte) |
- | 8K FETCH (Miss+Hit) : 27.6µs ( 2749c) -> 294.8MB/s ( 0.34c/Byte) |
- | Minimum 8K FETCH (Misses) : 54.9µs ( 5470c) -> 148.1MB/s ( 0.67c/Byte) |
- | Maximum 4K LODSD (Hits) : 30.9µs ( 3080c) -> 132.5MB/s ( 0.75c/Byte) |
- | 4K LODSD (Miss+Hit) : 43.9µs ( 4371c) -> 93.3MB/s ( 1.07c/Byte) |
- | Minimum 4K LODSD (Misses) : 67.7µs ( 6741c) -> 60.5MB/s ( 1.65c/Byte) |
- | -- CPU -> Memory --------- |
- | Maximum 4K STOSD (Hits) : 10.8µs ( 1079c) -> 378.2MB/s ( 0.26c/Byte) |
- | Minimum 4K STOSD (Misses) : 45.8µs ( 4561c) -> 89.4MB/s ( 1.11c/Byte) |
- | -- Memory -> Memory ------ |
- | Maximum 4K MOVSD (Hits) : 10.6µs ( 1051c) -> 388.0MB/s ( 0.26c/Byte) |
- | 4K MOVSD (Miss+Hit) : 63.3µs ( 6301c) -> 64.7MB/s ( 1.54c/Byte) |
- | 4K MOVSD (Clean) : 86.5µs ( 8616c) -> 47.3MB/s ( 2.10c/Byte) |
- | Minimum 4K MOVSD (Misses) : 112.2µs ( 11178c) -> 36.5MB/s ( 2.73c/Byte) |
- +-----------------------------------------------------------------------------+
-
- These are the performance figures of the CPU <--> memory data path.
-
- There are four disciplines:
- - opcode fetch
- - data load
- - data store
- - data move.
-
- Depending on the discipline several scenarios are tested (denoted in
- parentheses):
- - hits in all memory caches (hits)
- - hit in secondary cache, but not in primary (miss+hit)
- - hit with replace in clean secondary cache, no write back necessary (clean)
- - hit with replace in dirty secondary cache, write back carried out (dirty)
- - misses in all caches (misses)
-
- Performance should be highest in the first scenario (hits) and fall with each
- following one, down to the last (misses), which gives the minimum speed.
-
- The test transfer size depends on the cache and page sizes. In the case above
- it is 4 or 8 KByte.
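Why the transfer size relative to the cache size decides between hits and misses can be shown with a toy model. The sketch below simulates a direct-mapped cache that tracks tags only; it is an illustration under simplified assumptions (one access per cache line, no primary cache, no write back) and not a description of how DFB sets up its scenarios.

```python
class DirectMappedCache:
    """Toy model of a direct-mapped cache; it tracks tags only, no data."""

    def __init__(self, size, line_size):
        self.line_size = line_size
        self.num_lines = size // line_size
        self.tags = [None] * self.num_lines

    def access(self, address):
        """Touch one address; return True on a hit, False on a miss."""
        line = address // self.line_size
        index = line % self.num_lines      # slot the line must occupy
        tag = line // self.num_lines       # identifies which line is in the slot
        hit = self.tags[index] == tag
        self.tags[index] = tag             # on a miss the old line is replaced
        return hit


def second_pass_hit_rate(cache, buffer_size):
    """Sweep a buffer twice, one access per line; return the hit rate of pass two."""
    step = cache.line_size
    for addr in range(0, buffer_size, step):
        cache.access(addr)
    hits = sum(cache.access(addr) for addr in range(0, buffer_size, step))
    return hits / (buffer_size // step)


CACHE_SIZE = 256 * 1024   # secondary cache size from the sample report
small = second_pass_hit_rate(DirectMappedCache(CACHE_SIZE, 16), 4 * 1024)
large = second_pass_hit_rate(DirectMappedCache(CACHE_SIZE, 16), 2 * CACHE_SIZE)
print(small, large)   # 1.0 0.0
```

A 4 KByte buffer stays resident, so the second sweep hits every time. A buffer twice the size of the cache evicts each line before it is reused, so the second sweep misses every time.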
-
- There are four result columns:
- - absolute time needed for one test of the mentioned size
- - the same in CPU clock cycles
- - the resulting transfer speed in MBytes per second
- - the cost of the operation in cycles per Byte
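The last three columns follow from the cycle count, the transfer size and the CPU clock. The sketch below assumes that 1 MByte means 10^6 bytes, which reproduces the LODSD, STOSD and MOVSD rows of the sample report; the FETCH rows come out slightly lower in the report than this formula gives, so DFB evidently calculates something a little different there. The function name is illustrative.

```python
def result_columns(cycles, size_bytes, clock_mhz):
    """Derive time (µs), throughput (MB/s) and cost (cycles/Byte) from a cycle count."""
    time_us = cycles / clock_mhz            # 1 MHz equals one cycle per microsecond
    mb_per_s = size_bytes / time_us         # bytes per µs equals 10^6 bytes per second
    cycles_per_byte = cycles / size_bytes
    return time_us, mb_per_s, cycles_per_byte


# "4K LODSD (Miss+Hit)" row of the sample report: 4371 cycles at 99.6 MHz.
time_us, mb_per_s, cpb = result_columns(4371, 4 * 1024, 99.6)
print(f"{time_us:.1f}µs  {mb_per_s:.1f}MB/s  {cpb:.2f}c/Byte")
# prints: 43.9µs  93.3MB/s  1.07c/Byte
```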
-
- +-----------------------------------------------------------------------------+
- | VIO info : SVGA, 2 MByte video memory |
- | Device info : manufacturer S3, chip set 86C968, 2 MByte video memory |
- | Screen : 1280x1024x256 |
- | Aperture : 64 KByte @ 0x000A0000 |
- | Bus width : 32 bit between CPU and video memory |
- +-----------------------------------------------------------------------------+
-
- The information about the video system is queried from different parts of OS/2,
- so there may be different figures for the same item. That depends more or less
- on how carefully the developer of the video drivers did the job...
-
- +-----------------------------------------------------------------------------+
- | Data flow and bus performance video |
- | |
- | -- Video -> CPU ---------- |
- | 4K LODSD : 567.1µs ( 56476T) -> 7.2MB/s (13.79T/Byte) |
- | -- CPU -> Video ---------- |
- | 4K STOSD : 48.4µs ( 4823T) -> 84.6MB/s ( 1.18T/Byte) |
- | -- Memory -> Video ------- |
- | Maximum 4K MOVSD (Hits) : 48.6µs ( 4836T) -> 84.4MB/s ( 1.18T/Byte) |
- | 4K MOVSD (Miss+Hit) : 63.3µs ( 6304T) -> 64.7MB/s ( 1.54T/Byte) |
- | Minimum 4K MOVSD (Misses) : 86.6µs ( 8625T) -> 47.3MB/s ( 2.11T/Byte) |
- | -- Video -> Memory ------- |
- | 4K MOVSD : 603.4µs ( 60093T) -> 6.8MB/s (14.67T/Byte) |
- | -- Video -> Video -------- |
- | 4K MOVSD : 941.9µs ( 93813T) -> 4.3MB/s (22.90T/Byte) |
- +-----------------------------------------------------------------------------+
-
- This is the same as above. Obviously the opcode fetch discipline is left out,
- but there are more data transfer paths.
-
- The figures for data store and move from primary cache into video memory are
- more or less meaningless on local bus systems and coprocessed video cards, but
- they give at least an idea of how carefully these buses are implemented.
-
- I won't comment on the actual figures, because each system - and most probably
- yours - is different. Compare for yourself; I will only say that this system is
- a fast one in its category...
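To compare the figures of two machines it can help to pull the numbers out of the framed report. The sketch below parses one result row with a regular expression; the pattern is inferred from the sample output shown in this file only, and the function and field names are made up for illustration.

```python
import re

# One result row: label, time in µs, cycles ('c' or 'T' suffix), MB/s, cycles per Byte.
ROW = re.compile(
    r"\|\s*(?P<label>.+?)\s*:\s*"
    r"(?P<time_us>[\d.]+)µs\s*\(\s*(?P<cycles>\d+)[cT]\)\s*->\s*"
    r"(?P<mb_per_s>[\d.]+)MB/s\s*\(\s*(?P<per_byte>[\d.]+)[cT]/Byte\)"
)


def parse_row(line):
    """Return a dict for a result row, or None for any other report line."""
    m = ROW.search(line)
    if m is None:
        return None
    return {
        "label": m["label"],
        "time_us": float(m["time_us"]),
        "cycles": int(m["cycles"]),
        "mb_per_s": float(m["mb_per_s"]),
        "cycles_per_byte": float(m["per_byte"]),
    }


# Two rows copied from the video section of the sample report.
read = parse_row("| 4K LODSD : 567.1µs ( 56476T) -> 7.2MB/s (13.79T/Byte) |")
write = parse_row("| 4K STOSD : 48.4µs ( 4823T) -> 84.6MB/s ( 1.18T/Byte) |")
header = parse_row("| -- Video -> CPU ---------- |")

print(read["label"], read["mb_per_s"])
print(write["label"], write["mb_per_s"])
print("write/read speed ratio:", write["mb_per_s"] / read["mb_per_s"])
```

Section headers and information lines such as the clock rate do not match the pattern and yield None, so a whole report can be fed through `parse_row` line by line.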
-
-
- 4. How do I start this program?
-
- That's easy: go to a command line and type
-
- DFB [options]
-
- The following options are currently implemented:
-
- /NOV[ideo] : suppress video testing
- /CC:number : set country code to number, default is from CONFIG.SYS
- 49 : Germany (deutsch)
- 39 : Italy (italiano) [DFBITA.MSG required]
- else : international (english)
- /MORE : stops output after each section to ease reading
- /DMP : dump test values to stderr
- may be redirected to file via 'DFB /DMP [...] 2>filename'
-
- Options are not case sensitive!
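The options can be combined freely. As a minimal sketch, the helper below assembles a command line from the options listed above, including the stderr redirection used with /DMP. The helper and the file name `dfb.dmp` are hypothetical; only the option names come from this file.

```python
def dfb_command(no_video=False, country_code=None, more=False, dump_file=None):
    """Build a DFB command line from the options documented above."""
    parts = ["DFB"]
    if no_video:
        parts.append("/NOV")                  # suppress video testing
    if country_code is not None:
        parts.append(f"/CC:{country_code}")   # e.g. 49 for German output
    if more:
        parts.append("/MORE")                 # stop after each section
    if dump_file is not None:
        parts.append("/DMP")                  # dump test values to stderr ...
        parts.append(f"2>{dump_file}")        # ... and redirect stderr to a file
    return " ".join(parts)


print(dfb_command(no_video=True, country_code=49, dump_file="dfb.dmp"))
# prints: DFB /NOV /CC:49 /DMP 2>dfb.dmp
```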
-
- The commands 'help DFB0000' or 'DFB /?' will give you the same information about
- the usage of the latest version of DFB.
-
- If you start DFB in a full screen session, video testing will be left out,
- since there is no video aperture available. It is better to use a windowed
- OS/2 box.
-
- To access memory and video for testing, the device driver SSMDD.SYS must be
- loaded. It is part of MMPM/2 and is also included in the DFB distribution.
- Make sure that the statement
-
- DEVICE=[path]\SSMDD.SYS
-
- is in your config.sys file. DFB will tell you if it is not. In this case, only
- a small part of the DFB functionality is available (basic CPU type checking).
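DFB reports a missing driver itself, but the check is easy to do by hand as well. The sketch below scans the text of a CONFIG.SYS file for an active DEVICE statement that loads SSMDD.SYS. It assumes that statements are matched case-insensitively and that REM lines are inactive; the function name and the example path `C:\MMOS2` are illustrative only.

```python
import re

# DEVICE statement at the start of a line, optional path, then SSMDD.SYS.
SSMDD_LINE = re.compile(r"^\s*DEVICE\s*=\s*(?:.*\\)?SSMDD\.SYS\b", re.IGNORECASE)


def has_ssmdd(config_text):
    """Return True if the CONFIG.SYS text has an active DEVICE=...SSMDD.SYS line."""
    return any(SSMDD_LINE.match(line) for line in config_text.splitlines())


# Example input only; a real check would read the CONFIG.SYS of the boot drive.
sample = "\n".join([
    "PROTSHELL=C:\\OS2\\PMSHELL.EXE",
    "REM DEVICE=C:\\OLD\\SSMDD.SYS",
    "device=C:\\MMOS2\\ssmdd.sys",
])
print(has_ssmdd(sample))                            # True
print(has_ssmdd("REM DEVICE=C:\\MMOS2\\SSMDD.SYS")) # False
```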
-
- If you create a minimum boot disk with floppy support only, there is enough
- room left to put SSMDD.SYS and DFB.EXE onto it too. So you can walk into your
- favourite computer shop and check out the different machines on offer simply
- by booting from this floppy disk.
-
-
- 5. Is it dangerous to use this program?
-
- Yes, it is!
-
- First, I am human so I am error prone :-)
-
- Second, DFB goes right down to the bones of your computer. Therefore I can't
- guarantee that it will interact with every running program or active device in
- a totally harmless manner. If you plan to start DFB, I recommend stopping all
- other running user processes and waiting until all sensitive devices are idle.
- That is not a must, but it reduces the risk (I myself run DFB in parallel with
- my communications software, an active CD-ROM and an active audio system).
-
-
- 6. A wish of the author
-
- Since I have access only to Intel machines, I would appreciate it if you ran
- the tests using the dump option and dropped me an email with a description of
- your system and the resulting dump output. If you can't reach me through
- Internet or Fidonet you may send your message to Compuserve 100275,3253. I am
- interested in any AMD (the new SL-types only!)/IBM/UMC/CYRIX/TI CPU and the
- brand new P24, P55, P6, M1 and whatever else may come.
- Everybody who provides me with new information may consider him/herself a
- registered user of a forthcoming shareware version (if there is one...)
-
- ---
-