Prelude for those that don't read the documentation:
Do not mail me bug reports. I can't fix them... Other opinions on the
program are welcome.
I do not know if this program works on a CPU without a math coprocessor
(like the 486-SX).
System Benchmark "SysBench" 0.9.0
---------------------------------
(C) 1994 Henrik Harmsen
The disk IO code: (C) 1994 Kai Uwe Rommel
Contents:
1 Introduction
2 Tests
3 Copyright notice
4 Thanks
Appendix A : Todo
Appendix B : Building
Appendix C : Example results
---
1 Introduction
I thought OS/2 needed a benchmark program, so I wrote one. This
program is not quite finished, and probably never will be, not by me
anyway, since I'm saying goodbye to OS/2 and turning my attention to
Linux. The reasons for this have less to do with OS/2, which is
still a great OS, than with Linux. Linux is slick,
super-fast, finally has drivers for my Viper card, has free TCP/IP and
last but not least, Linux is Unix.
This means I am probably not going to make updates to this program,
since I won't have OS/2 on my disk anymore. I'm saying probably, since
I can't read the future. Maybe one day my whimsical mind will think
OS/2 is more fun than Linux, who knows? :-)
It also means that I am donating this program to anyone who is willing
to continue working on it. If you think you want to continue working
on this program, make sure you clearly note that this is released by
you, not me. To do this, change the version number to 0.9.0xxx, where
xxx are your initials. For example 0.9.0hch, which would indicate that
I (Henrik C Harmsen) have made this release. The version numbering
scheme should follow that of GCC. The first number is the major
release number, to be increased when major enhancements have been made
to the program or it is considered out of beta. The second number is
the minor release number, increase it when you have made small changes
to the program. The last number should be increased when making
bug-fixes only.
Take a look at the appendices for more information on what needs to be
done, what's not quite finished yet, and how to re-build the
program. Among other things, this document needs rewriting.
Do not send me complaints about bugs and errors, since I will have no
way of fixing them...
Now, that said, let's take a look at what this program tests.
2 Tests
HANDLE WITH CARE! DO NOT BLINDLY TRUST BENCHMARK VALUES. THEY ARE ONLY
GOOD IF YOU KNOW WHAT THEY ARE TESTING AND KNOW WHAT THEY ARE NOT
TESTING...
The values obtained here are not useful for comparing against values
obtained from other benchmark programs. Even though one of the tests,
for example, measures Linpack performance and yields a value in MFLOPS,
this value is not useful for comparison with values from a
different benchmark program. The only exception here is the dhrystone
2.1 value which might possibly be compared to values from other
dhrystone 2.1 benchmarks. As a rule: Only compare values with people
running this same benchmark program.
Almost all tests are adaptive in that they will first measure the
approximate speed of your computer so the test will take about 10-15
seconds in total, no matter how slow or fast your computer is. The
ones that are not adaptive are the floating point tests and the
CPU integer tests with the exception of the dhrystone test.
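The adaptive idea described above might be sketched like this. It is
only an illustration in Python and is not taken from the SysBench
sources: the names calibrate and workload, the doubling strategy and
the 0.1 second probe threshold are all assumptions.

```python
import time

def calibrate(workload, target_seconds=10.0, probe_seconds=0.1,
              clock=time.perf_counter):
    """Estimate how many workload() calls fill about target_seconds.

    Hypothetical helper: doubles a probe count until the probe run is
    long enough to time reliably, then scales that count to the target.
    """
    count = 1
    while True:
        start = clock()
        for _ in range(count):
            workload()
        elapsed = clock() - start
        if elapsed >= probe_seconds:
            break
        count *= 2
    # Scale the probe count linearly up to the target duration.
    return max(1, int(count * target_seconds / elapsed))
```

A fast machine simply ends up with a larger iteration count, so the
timed run takes roughly the same wall-clock time everywhere.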
2.1 Graphic tests
These tests measure how fast the video hardware/display driver
combination can pump pixels to the screen. OS/2 has long had abysmal
display drivers for many cards; these tests are meant to sort out
whether the drivers really are good, bad, or stink.
Most window operations use only a few key operations of the
video card accelerator. Take a look at your windows: they're mostly
built from filled rectangles, with some text and vertical and
horizontal lines. Maybe a few bitmaps here and there (icons and such).
The PM-marks are calculated from the other values as a weighted
arithmetic mean.
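For reference, a weighted arithmetic mean is computed as in the short
Python sketch below. The weights SysBench actually uses are not given
in this document, so the scores and weights here are made up purely to
show the arithmetic.

```python
def weighted_mean(values, weights):
    """Weighted arithmetic mean: sum(w * v) / sum(w)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * v for v, w in zip(values, weights)) / total_weight

# Hypothetical scores and weights, not the ones SysBench uses.
scores = [50.0, 15.0, 20.0]
weights = [2.0, 1.0, 1.0]
print(weighted_mean(scores, weights))
```

A test with a larger weight pulls the total more strongly towards its
own value; with equal weights the result is the ordinary mean.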
2.1.1 BitBlit S->S Copy
Tests the speed of the bitblit screen->screen copy operation. One of
the most important values, since it affects how fast you can scroll
text, and move large windows.
2.1.2 BitBlit M->S Copy
Tests the speed of the bitblit memory->screen copy operation. This
affects how fast large bitmaps are updated, as well as all other
operations that copy data from RAM to Video RAM.
2.1.3 Filled rectangle, patterned filled rectangle
Tests how fast the blitter can blank areas with a color or stipple
pattern. When updating a window, the background is usually blanked
with a single color or pattern before text or other things are drawn
on it.
2.1.4 Lines
Tests the speed of line-drawing in different directions. The
horizontal and vertical line drawing speed is important when drawing
frames around windows and such.
2.1.5 Text render
Tests the speed of text rendering, an extremely important function for
speedy updates in text editors, shell windows, word processors etc.
2.2 CPU Integer tests
The CPU tests are divided into two sections. This one tests 'integer'
performance, meaning not only integer arithmetic but also every other
'normal' program that does some kind of data processing; the other
(section 2.3) tests floating-point performance. 99% of all
applications do not use floating-point arithmetic. Those that do are
usually ray-tracers, scientific and engineering programs etc.
The CPU-int marks are calculated as a weighted mean of the other
tests.
2.2.1 Dhrystone VAX MIPS
When reading about how many MIPS a computer performs, that is usually
tested by running this Dhrystone test and adjusting the result to be
relative to one VAX 11/780 MIPS. That means, this test does not
benchmark the number of million instructions per second (MIPS) as
defined by machine instructions, but rather a weighted value against
the base reference of one VAX 11/780 MIPS.
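The conversion can be sketched as below. The figure of 1757
Dhrystones per second is the commonly cited score for the VAX 11/780;
that SysBench uses exactly this constant is an assumption, and the
function name is made up for this example.

```python
# Commonly cited Dhrystone score of a VAX 11/780 (the "1 MIPS" machine).
VAX_11_780_DHRYSTONES_PER_SECOND = 1757.0

def vax_mips(dhrystones_per_second):
    """Express a Dhrystone rate relative to the VAX 11/780 reference."""
    return dhrystones_per_second / VAX_11_780_DHRYSTONES_PER_SECOND

# A machine running 69928.6 Dhrystones/s rates about 39.8 VAX MIPS.
print(round(vax_mips(69928.6), 1))
```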
This test uses very little memory, meaning it will measure the CPU
performance only, not taking into account other vital parts such as
memory speed etc.
Here is an excerpt from the sources from where I got this program:
"Dhrystone is a short synthetic benchmark program intended to be
representative for system (integer) programming. Based on published
statistics on use of programming language features: see original
publication in CACM 27,10 (Oct 1984). Originally published in ADA, now
mostly used in C. Version 2 (in C) published in SIGPLAN Notices 23,8
(Aug 1988), together with measurement rules. Version 1 is no longer
recommended since state-of-the-art compilers can eliminate too much
'dead code' from the benchmark (However, quoted MIPS numbers are often
based on version 1). Problems: Due to its small size (100 HLL
statements, 1-1.5 KB code), the memory system outside the cache is not
tested; compilers can too easily optimize for Dhrystone; string
operations are somewhat over-represented. Recommendation: Use it for
controlled experiments only; don't blindly trust single Dhrystone MIPS
numbers quoted somewhere (don't do this for any benchmark)."
This test is based on the C-version of Dhrystone 2.1.
2.2.2 Hanoi
An integer program which solves the Towers of Hanoi puzzle using
recursive function calls. It uses very little memory, and thus does
not test memory speed.
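The recursive solution looks roughly like the Python sketch below. It
only shows the shape of the algorithm and is not the benchmark's own
code; the function and peg names are chosen for this example.

```python
def hanoi(disks, source="A", target="C", spare="B"):
    """Solve Towers of Hanoi recursively; return the number of moves."""
    if disks == 0:
        return 0
    # Move the smaller disks out of the way onto the spare peg.
    moves = hanoi(disks - 1, source, spare, target)
    # Move the largest remaining disk from source to target.
    moves += 1
    # Move the smaller disks from the spare peg onto the target.
    moves += hanoi(disks - 1, spare, target, source)
    return moves

print(hanoi(10))
```

Solving n disks takes 2^n - 1 moves, so the work is almost entirely
function calls and returns on a very small working set.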
2.2.3 Heapsort
Tests how fast your computer can sort a large array of random values
using the heapsort algorithm. Tests both CPU and memory speed. The
MIPS are just a measurement against some arbitrary base MIPS
reference. This test uses about 1 MB memory.
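For readers who have not met the algorithm, here is a minimal heapsort
in Python. It is a textbook version, not the benchmark's own code, and
the array size and seed are arbitrary choices for the example.

```python
import random

def sift_down(a, root, end):
    """Restore the max-heap property for the subtree rooted at root."""
    while True:
        child = 2 * root + 1
        if child >= end:
            return
        # Pick the larger of the two children.
        if child + 1 < end and a[child] < a[child + 1]:
            child += 1
        if a[root] >= a[child]:
            return
        a[root], a[child] = a[child], a[root]
        root = child

def heapsort(a):
    """Sort the list a in place, in ascending order."""
    n = len(a)
    # Build a max-heap, starting from the last parent node.
    for root in range(n // 2 - 1, -1, -1):
        sift_down(a, root, n)
    # Repeatedly move the maximum to the end and shrink the heap.
    for end in range(n - 1, 0, -1):
        a[0], a[end] = a[end], a[0]
        sift_down(a, 0, end)

rng = random.Random(1994)
data = [rng.randrange(1_000_000) for _ in range(1000)]
heapsort(data)
```

The sift-down step jumps between indices i and 2i+1, so on a large
array the accesses are spread widely, which is why the test exercises
memory as well as the CPU.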
2.2.4 Sieve
Tests how fast your computer can find lots of prime numbers using the
sieve of Eratosthenes, with arrays ranging from 8 kB to 1.2 MB. The
result is a weighted mean value of the different speeds. Tests both
CPU and memory speed.
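A minimal sieve of Eratosthenes, as a Python sketch. It uses one byte
per candidate number, which is an assumption about the layout and not
necessarily how the benchmark stores its flags.

```python
def sieve(limit):
    """Return all primes <= limit using the sieve of Eratosthenes."""
    if limit < 2:
        return []
    # One flag byte per number from 0 to limit.
    is_prime = bytearray([1]) * (limit + 1)
    is_prime[0] = is_prime[1] = 0
    n = 2
    while n * n <= limit:
        if is_prime[n]:
            # Cross out the multiples of n, starting at n * n.
            for multiple in range(n * n, limit + 1, n):
                is_prime[multiple] = 0
        n += 1
    return [i for i, flag in enumerate(is_prime) if flag]

print(sieve(30))
```

The flag array is the working set: a small one fits in the cache,
while a large one forces the crossing-out loop to go to main memory.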
2.3 CPU floating point tests
These tests measure how fast your computer is at floating point
arithmetic. (Floating point means non-integer numbers like 2.3,
0.24 etc.)
The CPUfloat-marks are calculated as a weighted mean of the
other values.
2.3.1 Linpack
This is the Linpack program (floating-point) converted to C. Results
here are sensitive to cache effects and memory speed. This version
tests only the rolled double precision version.
2.3.2 Flops
Estimates MFLOPS rating for specific FADD, FSUB, FMUL, and FDIV
instruction mixes. Four distinct MFLOPS ratings are provided based on
the FDIV weightings from 25% to 0% and using register-register
operations. Works with both scalar and vector machines. Since the
program tries to maximize register usage the results are NOT sensitive
to main memory speed. In this sense flops yields a peak rating. The
four different values are combined into a weighted mean.
2.3.3 The Fast Fourier Transform
This program performs FFTs from 32 to 262,144 points in size using
the Duhamel-Hollmann method.
2.4 DIVE tests
DIVE means Direct Interface Video Extensions. It is a library in
OS/2 that gives fast access to video routines used for programming
games or other very demanding graphic applications. It gives the games
programmer access to the Holy Grail - a pointer to the frame buffer.
The tests here are not incorporated into the benchmark since the DIVE
functionality will not actually appear until OS/2 3.0. I will describe
them, nonetheless.
The DIVE-marks are calculated as a weighted mean of the other values.
2.4.1 Video bus bandwidth
This test makes a copy of the frame buffer and copies it back to the
screen a lot of times in order to measure how many bytes per second
you can pump data to the video RAM. On my 486-66 machine with a
Diamond Viper card this amounts to about 13 MB/s! That means about 42
frames per second in 640x480x256...
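The frame-rate figure follows from simple arithmetic, sketched here in
Python. It assumes that 256 colors means one byte per pixel and that
MB means 10^6 bytes; with 2^20-byte megabytes the result would be
closer to 44.

```python
def frames_per_second(bytes_per_second, width, height, bytes_per_pixel=1):
    """Full-screen frames per second at a given video bus bandwidth."""
    return bytes_per_second / (width * height * bytes_per_pixel)

# 13 MB/s into a 640x480 screen with 256 colors (1 byte per pixel).
fps = frames_per_second(13_000_000, 640, 480)
print(int(fps))
```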
2.4.2 DIVE fun
This was an entry I added since I had a few ideas on fun screen hacks
you can do with DIVE. One of them is smoothly turning the screen
upside down and back again. The value obtained here will be highly
correlated with the Video Bus Bandwidth test.
2.4.3 Memory to screen copy with DIVE
DIVE has built-in routines for copying a large amount of data from RAM
or Video RAM to the display with the help of a hardware blitter (if
one is available), or in software. There are three such tests. The
first test just blits an image to the screen; the second performs
pixel-doubling, effectively doubling the size of the display. The third
test measures arbitrary stretching of the bitmap when displaying it on
screen. If you have Warp II or OS/2 3.0 you will have seen the ability
to stretch a running video clip to any size you want. These tests are
not finished yet.
2.5 Disk IO tests
These tests were programmed by Kai Uwe Rommel, although I have made a
lot of changes to his source code. Thanks, Kai Uwe! The tests are
available as a free-standing package called diskio14.zip at
ftp.cdrom.com. If there are any errors or strange behaviour in these
tests then blame me, not Kai Uwe.
The tests can be run on any of the fixed disks in your system. There is
a menu choice to change which disk to test.
The DiskIO-marks are calculated as a weighted mean of the
other values.
2.5.1 Average seek time
Tests the average seek time of the currently selected disk. I have
seen that this is often a bit higher than what the disk manufacturers
promise... This is most likely due to different ways of testing
things.
2.5.2 Disk transfer speed
Measures how fast the disk can be read NOT using the cache. When I
first came across the diskio program by Kai Uwe, my disk performed at
about 1.0 MB/s. I thought that was not very good, but perhaps
acceptable. Then I started to muck around with the CMOS parameters and
by changing the IO block read delay (I think that is what it was
called) the speed of the disk jumped from 1.0 to 1.5 MB/s! Not bad, I
thought. But when I upgraded to Warp II the disk performance suddenly
jumped to 2.2 MB/s. This is probably due to OS/2 using block
(multiple-sector) transfer mode. Then finally, I changed the AT bus
speed from 8.3 MHz to 11 MHz and the disk transfer speed jumped again
from 2.2 to 2.6 MB/s!
The lesson here is that a lot can be done about slow IO. Just be
careful when you muck around with the CMOS parameters though, since
there is a very high likelihood of making mistakes that can make the
machine unusable or prone to strange errors. Usually, this is not
dangerous: just reset the value to the
old one and your machine should perform as before. Sometimes, though,
you _can_ destroy your computer by changing values incorrectly. Be
warned...
2.6 Memory speed tests
Memory speed seems to be a forgotten area when talking about the speed
of a computer. You hear a lot about CPU speed and disk speed and video
speed and such, but rarely about memory speed. This is wrong IMHO, since
a lot of the performance of a computer has to do with memory IO. When
PC Magazine measured memory speed in one of their big tests they
discovered a large difference between the good and bad performers. I
would like to bring this fact into focus: Memory IO speed is a vital
part of the performance of your computer, even more so with faster and
faster processors. A really fast RISC processor can execute as many as
40 instructions in the time of one memory read...
Of course, memory speed timing is a complex issue. How fast a memory
access is depends on:
  The pattern of the access     : Random, sequential, local, global?
  Cache                         : Primary and secondary cache size and type.
  Virtual memory                : Paging algorithm, disk IO performance.
  Motherboard memory controller : This is the key component to fast mem IO.
  Speed of SIMMs                : 60, 70 or 100 ns?
  etc. etc.
These tests are also limited. They cannot test the whole truth about
the speed of your memory IO.
The Mem-marks are calculated as a weighted mean of the other values.
2.6.1 Memory copy
This test first allocates a chunk of memory and then reads and writes
it back and forth a few times to "activate" the memory: initialize the
physical pages and read them into the caches. This is done to obtain
values that are as stable as possible between measurements. It also
has the effect of maximizing the access speed.
Then it proceeds to copy the first half of the memory to the second
and then the second half to the first. This is to diminish the strange
effects you get from write-through and copy-back caches. When it says
5 kB copy, that means copying 2.5 kB back and forth.
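The back-and-forth copy can be sketched as follows in Python. This is
an illustration of the scheme only, not the benchmark's code: the
function names, the repeat count and the way bytes are counted (each
pass moves both halves once, so one buffer's worth in total) are
assumptions.

```python
import time

def copy_halves(buf):
    """Copy the first half of buf onto the second half, then back."""
    half = len(buf) // 2
    view = memoryview(buf)
    view[half:2 * half] = view[:half]
    view[:half] = view[half:2 * half]

def copy_rate(total_bytes, repeats=1000):
    """Return an approximate copy rate in bytes per second."""
    buf = bytearray(total_bytes)
    copy_halves(buf)  # warm-up pass: touch every page before timing
    start = time.perf_counter()
    for _ in range(repeats):
        copy_halves(buf)
    elapsed = time.perf_counter() - start
    # Each pass moves half the buffer twice, i.e. total_bytes in all.
    return repeats * total_bytes / elapsed

# A "5 kB copy" in the sense above: 2.5 kB copied back and forth.
rate = copy_rate(5 * 1024)
```

Running this for growing buffer sizes should show the same kind of
steps as the real test, although the absolute numbers from an
interpreted language say little about the hardware.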
You can clearly see the effects of your caches. As long as the access
is within the cache, it is a lot faster. There is also another factor
that will make the larger (80-160kB) values jump up and down, and that
is the effect of virtual memory. The second level cache performs well
on a sequential memory range, but the virtual memory will chop the
physical memory into 4kB pages and shuffle them around in physical
memory. If you are lucky, the physical pages are sequential but they
don't have to be. When they are not, the pages are scattered around
and the second level cache (which is almost always a direct-mapped
cache) will have a larger probability of mapping several physical
pages to the same area. Set-associative caches (2-way, 4-way)
should help here, but that is not certain.
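The collision effect can be illustrated with a small Python model. It
assumes a physically indexed, direct-mapped 256 kB cache and 4 kB
pages, so the cache holds 64 page-sized regions and a physical page
lands in region (page number mod 64). The page numbers are invented
for the example.

```python
from collections import Counter

PAGE_SIZE = 4 * 1024

def conflicting_pages(physical_pages, cache_size=256 * 1024):
    """Count pages sharing a direct-mapped cache region with another."""
    slots = cache_size // PAGE_SIZE  # page-sized regions in the cache
    usage = Counter(page % slots for page in physical_pages)
    # Every page beyond the first in a region evicts its neighbours.
    return sum(count - 1 for count in usage.values())

# 40 pages = 160 kB, which fits in the 256 kB cache if laid out well.
sequential = list(range(100, 140))              # contiguous frames
scattered = [100 + 64 * i for i in range(40)]   # hypothetical worst case

print(conflicting_pages(sequential), conflicting_pages(scattered))
```

With contiguous frames nothing collides; in the contrived worst case
all 40 pages compete for the same 4 kB of cache, even though the
buffer is smaller than the cache.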
Again, CMOS settings can very much affect the speed of your memory
access. Be sure to use values as low as possible for the various wait
state entries and make sure the whole memory is cached, not just the
first 16 MB if you have more.
2.6.2 Memory read
Tested by calculating the checksum over the specified amount of bytes
over and over again.
2.6.3 Memory write
Tested by writing a value into all longwords of the specified amount
of memory.
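The access patterns of the read and write tests can be sketched as
below. This is Python and only shows what is being done to the memory;
the 32-bit wrapping sum, the function names and the buffer size are
assumptions, and an interpreted loop is far too slow to measure real
memory speed.

```python
import array

def read_checksum(words):
    """Read every word once and return a 32-bit wrapping sum."""
    total = 0
    for w in words:
        total = (total + w) & 0xFFFFFFFF
    return total

def write_fill(words, value):
    """Store value into every word of the buffer."""
    for i in range(len(words)):
        words[i] = value

# 'I' is an unsigned int, typically 4 bytes (a longword) per element.
words = array.array("I", [0]) * 1280
write_fill(words, 3)
print(read_checksum(words))
```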
3 Copyright notice
There is no warranty. Use this software at your own risk. Due to the
complexity and variety of today's hardware and software which may be
used to run this program, I am not responsible for any damage or loss
of data caused by use of this software. It was tested and is expected
to work correctly, but nobody can actually guarantee this under all
circumstances. And because this software is free, you get what you pay
for...
This program can be used freely for non-commercial purposes.
4 Thanks
Thanks to Kai Uwe Rommel (rommel@ars.muc.de) for supplying the disk IO
benchmark code and to Al Aburto (aburto@marlin.nosc.mil) for supplying
the CPU integer and CPU float benchmark code.
-- Henrik Harmsen
Email: harmsen@eritel.se
Appendix A - TODO
1 Make the CPU integer and CPU float tests adaptive to the speed of the
computer.
2 DIVE: Support for bank-switched cards. Better error handling. Finish the
Memory->Screen bitblit tests.
3 Graphics test: The Memory to screen bitblit copy is probably not
correct for 16 and 24 bit displays.
Appendix B - Building
You need C Set++ 2.1. Change to the src directory and run nmake. It is
probably quite easy to port to emx-gcc.
Why are all the source code files named pmb_*? Well, I first wanted
to call it PMBench, as a play on WinBench, but it turned out that
PC Magazine already had a PMBench program... So I changed the name
to SysBench, but I did not have time to change all the 'pmb' to 'sysb'...
Appendix C - Example results
Example of a result file, when benchmarking my own system, which is:
Software:
--------------
OS/2 2.11
Diamond Viper display drivers 1.02beta running 1024x768x8
Hardware:
--------------
CPU : 486DX2-66
Chipset : UMC
Cache : 8 kB level 1, 256 kB copy-back level 2.
Memory : 20 MB 70ns.
Harddisk: disk 1: Seagate 340 MB. disk 2: Conner CFA540A 540 MB.
Video : Diamond Viper VLB, 2MB VRAM, 2.02 BIOS.
-------
Sysbench 0.9.0 result file created Sat Oct 22 14:31:27 1994
Graphics
BitBlt S->S cpy : 52.640 Mpixels/s
BitBlt M->S cpy : 15.581 Mpixels/s
Filled Rectangle : 356.366 Mpixels/s
Pattern Fill : 90.477 Mpixels/s
Vertical Lines : 6.233 Mpixels/s
Horizontal Lines : 9.656 Mpixels/s
Diagonal Lines : 7.545 Mpixels/s
Text Render : 18.553 Mpixels/s
------------------------------------------------------------
Total : 73.835 PM-marks
CPU integer
Dhrystone : 39.800 VAX 11/780 MIPS
Hanoi : 27.083 moves/25 usec
Heapsort : 19.290 MIPS
Sieve : 37.741 MIPS
------------------------------------------------------------
Total : 32.938 CPUint-marks
CPU float
Linpack : 2.535 MFLOPS
Flops : 3.572 MFLOPS
Fast Fourier Tr. : 4.291 VAX FFT's
------------------------------------------------------------
Total : 3.472 CPUfloat-marks
Direct Interface to video extensions - DIVE
Video bus bandw. : --.--- MB/s (on Warp II, this was ca. 13 MB/s)
DIVE fun : --.--- fps
M->S, DD, 1.00:1 : --.--- fps
M->S, DD, 2.00:1 : --.--- fps
M->S, DD, 2.43:1 : --.--- fps
------------------------------------------------------------
Total : --.--- DIVE-marks
Disk I/O - disk 2: 528 MB
Average seek time : 16.852 ms
Transfer speed : 1.990 MB/s
------------------------------------------------------------
Total : 1.465 DiskIO-marks
Memory
5 kB copy : 61.561 MB/s
10 kB copy : 49.211 MB/s
20 kB copy : 33.167 MB/s
40 kB copy : 25.707 MB/s
80 kB copy : 25.571 MB/s
160 kB copy : 17.578 MB/s
320 kB copy : 15.526 MB/s
640 kB copy : 13.385 MB/s
1280 kB copy : 11.941 MB/s
5 kB read : 70.885 MB/s
10 kB read : 42.156 MB/s
20 kB read : 42.970 MB/s
40 kB read : 32.170 MB/s
80 kB read : 31.747 MB/s
160 kB read : 21.777 MB/s
320 kB read : 19.533 MB/s
640 kB read : 17.150 MB/s
1280 kB read : 15.710 MB/s
5 kB write : 50.263 MB/s
10 kB write : 47.512 MB/s
20 kB write : 49.802 MB/s
40 kB write : 50.763 MB/s
80 kB write : 48.561 MB/s
160 kB write : 47.028 MB/s
320 kB write : 44.140 MB/s
640 kB write : 44.034 MB/s
1280 kB write : 42.258 MB/s
------------------------------------------------------------
Total : 28.007 Mem-marks