-
- --------------------------------- VBENCH -----------------------------------
-
- Requirements:
- ------------
-
- * 80286 or higher processor
- * DOS 2.0 or higher
- * VGA compatible graphics card
-
- To compile you need:
-
- * C++ compiler -> BORLAND C++ 3.1
- * Assembler -> TASM 3.1+
- * Linker
- * Mark Betz's HTimer class
- Where to get: CompuServe -> the Gamer's forum, Game Design library
-
- Files Included:
- --------------
-
- In addition to this file, the following files should also be in the
- ZIP file:
-
- * VBENCH.EXE - The executable video benchmark program
-
- * VBENCH.CPP - The main C++ source module, takes care of
- calling and calculating time for the benchmarks.
-
- * BENCH.ASM - The benchmark Assembly language source module,
- includes all benchmark code.
-
- * VIDEO.ASM - The video mode setup and buffer management code.
-
- * VBENCH.MAK - The make file used to compile program
-
- * VBENCH.PRJ - The Borland C++ 3.1 project file used to compile program
-
- Description:
- -----------
-
- The VBENCH program was developed primarily to compare different blit
- (block-transfer) techniques in both mode 13h and tweaked mode (planar
- mode 13h with 4 pages). I hope the benchmarks will help graphics
- programmers choose the technique that suits their application best,
- based on some of the timing results.
- By no means should these results be used as the _sole_ reason for
- choosing a technique, because there are many special cases that aren't
- accounted for in the benchmarks.
-
- Usage:
- -----
-
- To use the benchmark program, just type its name at the command line,
- like so:
- VBENCH
-
- Press a key at the prompt and the tests will get underway. Depending
- on your system and on the number of tests being run, this may take
- some time. Once they are finished, the program will exit and display
- the benchmark results on your screen.
-
-
- Benchmark Info:
- ---------------
-
- - For Ram-to-Video AND Video-to-Ram benchmarks, a 64,016 byte buffer
- named ram_buffer was used as the Ram buffer. The extra 16 bytes on the
- end are for special non-aligned accesses. For Video-to-Video
- benchmarks, I copied from higher addresses to lower addresses, using
- an incrementing index (ex: the 1st copy moves byte 4 to byte 2, the 2nd
- copy moves byte 5 to byte 3, etc). I did this because it seemed to be
- the only way to get the average speed of moves. It may sound weird,
- but in my tests, on my system at least, I got much faster speeds when
- the source was only about 2 bytes _ahead_ of the destination than when
- it was 32 bytes ahead or when I copied 'backwards'. I'd be interested
- in hearing whether the same thing happens for others out there. All
- you need to do is change the video transfer functions in BENCH.ASM
- and compare those results to the 'backwards' moves.
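The Video-to-Video pattern described above (source a couple of bytes above the destination, index counting upward) can be modeled in ordinary memory. This is only a sketch in standard C++, not the code from BENCH.ASM; the names `forward_copy` and `make_ramp` are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Copy `count` bytes inside one buffer from `src` to `dst` using an
// incrementing index, the way REP MOVSB does with the direction flag clear.
// An ascending copy is safe for overlapping ranges only when dst <= src.
void forward_copy(std::vector<unsigned char>& buf,
                  std::size_t dst, std::size_t src, std::size_t count)
{
    assert(dst <= src);                    // overlap rule for ascending copies
    assert(src + count <= buf.size());
    for (std::size_t i = 0; i < count; ++i)
        buf[dst + i] = buf[src + i];
}

// Build a buffer holding 0, 1, 2, ... (mod 256) so moved bytes are easy to spot.
std::vector<unsigned char> make_ramp(std::size_t n)
{
    std::vector<unsigned char> v(n);
    for (std::size_t i = 0; i < n; ++i)
        v[i] = static_cast<unsigned char>(i & 0xFF);
    return v;
}
```

Calling `forward_copy(buf, 2, 4, n)` reproduces the "byte 4 to byte 2, byte 5 to byte 3" sequence from the text. Changing the gap between `src` and `dst` is the knob to turn when comparing a 2-byte offset against a 32-byte one.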
-
- - As of yet, all benchmarks are 64,000 byte moves, repeated 10
- times. The functions are not _called_ 10 times, but the functions
- are _performed_ 10 times. Perhaps in the future I will change this,
- but it doesn't make a difference right now, since I am timing video
- speed, not function speed.
- No parameters were passed to the benchmark functions, and no
- variables were accessed by the function. Each function was aligned
- on a paragraph boundary. The program was compiled in COMPACT model,
- so all CALLs were NEAR calls.
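Since every benchmark moves the same amount of data, a timing result converts directly into a transfer rate. The sketch below shows the arithmetic only; `elapsed_us` is a hypothetical timer reading in microseconds and is not part of the HTimer class.

```cpp
#include <cassert>

// Sizes taken from the text: ten 64,000-byte moves per benchmark.
const unsigned long kBytesPerMove = 64000UL;
const unsigned long kMovesPerTest = 10UL;

// Total bytes moved by one benchmark run.
unsigned long total_bytes()
{
    return kBytesPerMove * kMovesPerTest;
}

// Convert an elapsed time in microseconds into bytes per second.
// `elapsed_us` is an assumed timer reading, not an HTimer API call.
double bytes_per_second(unsigned long elapsed_us)
{
    assert(elapsed_us > 0);
    return static_cast<double>(total_bytes()) * 1000000.0
           / static_cast<double>(elapsed_us);
}
```

For example, a run that takes half a second works out to 640,000 bytes / 0.5 s = 1,280,000 bytes per second.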
-
- - One thing to note: all current functions are aligned optimally for
- the type of move being done, assuming the type is word or dword.
- Aligned moves work much faster than unaligned moves, so you should
- avoid unaligned moves as best you can. On my system, I have found that
- unaligned moves can be just as slow as, or even slower than, BYTE moves.
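A quick way to check whether an address is word- or dword-aligned before choosing a move size is to test its low bits. This is a small sketch in modern standard C++ (it would need adapting for Borland C++ 3.1 far pointers, where only the offset part matters); `is_aligned` is an illustrative name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// True when `p` sits on a multiple of `size` bytes (2 for a word,
// 4 for a dword). `size` must be a power of two.
bool is_aligned(const void* p, std::size_t size)
{
    assert(size != 0 && (size & (size - 1)) == 0);
    return (reinterpret_cast<std::uintptr_t>(p) & (size - 1)) == 0;
}
```

A pointer one byte past a dword boundary fails both the word and dword checks, which is the case the extra 16 bytes of ram_buffer are there to exercise.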
-
- - For mode 13h benchmarks, ram_buffer was treated as a 64,000 pixel
- buffer setup just like the screen (linear bitmap).
-
- - For Tweaked mode benchmarks, ram_buffer was treated as though it
- was set up in a planar fashion, meaning every 16,000 bytes of the buffer
- represented a different plane. This isn't a cheat, but an ideal setup
- for Tweaked mode video transfers.
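To make the planar arrangement concrete, here is a sketch that rearranges a linear 320x200 image into four 16,000-byte planes. It assumes the usual unchained VGA layout (pixel x belongs to plane x mod 4, 80 bytes per scan line in each plane); the text only states that each 16,000-byte block is one plane, so the ordering inside a plane is an assumption, and all names are illustrative.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

const std::size_t kWidth = 320, kHeight = 200;
const std::size_t kPlanes = 4;
const std::size_t kPlaneSize = kWidth * kHeight / kPlanes;   // 16,000 bytes

// Index of pixel (x, y) inside a plane-ordered 64,000-byte buffer:
// plane = x mod 4, and each plane stores 80 bytes per scan line.
std::size_t planar_index(std::size_t x, std::size_t y)
{
    assert(x < kWidth && y < kHeight);
    return (x % kPlanes) * kPlaneSize + y * (kWidth / kPlanes) + x / kPlanes;
}

// Rearrange a linear (mode 13h style) image into the planar order.
std::vector<unsigned char> linear_to_planar(const std::vector<unsigned char>& linear)
{
    assert(linear.size() == kWidth * kHeight);
    std::vector<unsigned char> planar(linear.size());
    for (std::size_t y = 0; y < kHeight; ++y)
        for (std::size_t x = 0; x < kWidth; ++x)
            planar[planar_index(x, y)] = linear[y * kWidth + x];
    return planar;
}
```

With the buffer prepared this way, each plane can be sent to the card as one contiguous 16,000-byte move after selecting the plane once, which is why the text calls it an ideal setup.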
-
- - I didn't measure interleaved writes because I almost always code
- my loops to use REP MOVS instructions. The interleaved write requires
- a LOOP, which will always be slower than a REP instruction, since REP
- repeats the string instruction itself without executing a separate
- branch on every pass. For a fair comparison of interleaved and
- non-interleaved writes, you would want _BOTH_ to use LOOP
- instructions, avoiding the REP MOVS instructions where possible.
-
- Specific Benchmark Info:
-
- * Shared Benchmarks (benchmarks done in both mode 13h and Tweaked mode)
-
- - Byte Blit
-
- Write to the screen, using BYTE moves.
-
- - Word Blit
-
- Write to the screen, using WORD moves.
-
- - Word Read
-
- Read from the screen, using WORD moves.
-
- * Mode 13h-specific Benchmarks
-
- - Word Video Transfer
-
- Video Transfer (moving data using the video card as the source
- and the destination), using WORD moves.
-
- * Tweaked mode-specific Benchmarks
-
- - Hardware Video Transfer
-
- Video Transfer (moving data using the video card as the source
- and the destination), with the video card in Write Mode 1,
- using BYTE moves. Write Mode 1 gives a hardware-assisted
- move which allows 32 bits to be moved with one MOVSB instruction.
- 32 bits equals 4 pixels in tweaked mode.
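The reason one MOVSB can move four pixels is that a read from video memory fills one latch per plane, and a write in Write Mode 1 stores those latches instead of the CPU's byte. The toy model below illustrates that idea in plain C++; it is a simplification that assumes all four planes are write-enabled, and `ToyVga` and its members are invented names, not real hardware registers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model of VGA memory: four planes sharing one CPU address space.
struct ToyVga {
    std::vector<unsigned char> plane[4];
    unsigned char latch[4];

    explicit ToyVga(std::size_t plane_size) {
        for (int p = 0; p < 4; ++p) {
            plane[p].assign(plane_size, 0);
            latch[p] = 0;
        }
    }

    // Any CPU read fills all four latches from the addressed offset.
    void cpu_read(std::size_t offset) {
        assert(offset < plane[0].size());
        for (int p = 0; p < 4; ++p)
            latch[p] = plane[p][offset];
    }

    // In write mode 1 the CPU data is ignored; the latches are stored.
    // Assumes the map mask enables all four planes.
    void cpu_write_mode1(std::size_t offset) {
        assert(offset < plane[0].size());
        for (int p = 0; p < 4; ++p)
            plane[p][offset] = latch[p];
    }

    // One MOVSB with both pointers in video memory: a read, then a write.
    void movsb(std::size_t dst, std::size_t src) {
        cpu_read(src);
        cpu_write_mode1(dst);
    }
};
```

Each `movsb` call in the model copies one byte in every plane, so a 16,000-iteration loop covers a full 64,000-pixel page.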
-
- Adding Benchmarks:
- -----------------
-
- Adding a benchmark is fairly straightforward. The main module,
- VBENCH.CPP, includes a file called VBENCH.H, which defines the
- benchmarks to be run by the main program. There are two class
- definitions in VBENCH.H: one named SharedBenchData, the other named
- BenchData. Each of these holds a description of the benchmark in
- string form, and a pointer or pointers to the benchmark function(s).
- The only thing these classes lack is the timing results for each test
- (which are stored elsewhere in the program); keeping them separate
- makes adding more benchmarks less work.
-
- Now, to add a benchmark test to the list, all you need to do is:
-
- 1) Define the function prototype. There are 2 different 'slots'
- for function prototypes, one for mode 13h function prototypes,
- and one for tweaked mode function prototypes. While you don't
- have to put the prototypes in these places, it does help in
- readability and organization.
-
- 2) Depending on the type of benchmark you are performing, do one of
- the following:
-
- A) For mode-specific benchmarks, find the correct list, either
- the Tweaked mode or Mode 13h list, and add another BenchData
- object to the list. To add another BenchData object, you must
- define it like so:
-
- function_address,bench_description
-
- The function_address is just the benchmark function name,
- without the parentheses(). The bench_description is a
- description of the benchmark being performed, in a string
- form. Example:
-
- New13Blit,"A new blit function"
-
- B) For shared benchmarks (benchmarks that can be performed in
- both mode 13h and Tweaked mode), find the shared_benchmark list,
- and add another SharedBenchData object to the list. To add
- another SharedBenchData object, you must define it like so:
-
- m13_function_address,tw_function_address,bench_description
-
- The m13_function_address is just the mode 13h benchmark
- function name, without the parentheses(), and the
- tw_function_address is the Tweaked mode benchmark function
- name. bench_description is a description of the benchmark
- being performed. Example:
-
- NewBlit13,NewBlitTw,"A new blit function"
-
- Note: the bench_description strings are limited to 30 characters!
-
- 3) Add the file containing the benchmark functions to the project,
- and then compile away!
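The steps above can be sketched as follows. The real definitions are in VBENCH.H and may differ; the member names, the `mode13_benchmarks` array name, and the helpers `description_fits` and `run_list` are assumptions made for illustration. Only the class names, the entry order (function address first, description last), the shared_benchmark list name, and the 30-character limit come from the text.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

typedef void (*BenchFunc)();          // benchmarks take no parameters

// Hypothetical layouts; the real classes live in VBENCH.H.
struct BenchData {
    BenchFunc   func;
    const char* description;
};

struct SharedBenchData {
    BenchFunc   m13_func;
    BenchFunc   tw_func;
    const char* description;
};

static int g_calls = 0;               // lets the stubs show they ran

// Stub benchmark functions standing in for real assembly routines.
void New13Blit() { ++g_calls; }
void NewBlit13() { ++g_calls; }
void NewBlitTw() { ++g_calls; }

// Mode 13h list (array name is an assumption).
BenchData mode13_benchmarks[] = {
    { New13Blit, "A new blit function" },
};

// Shared list: one function per video mode, then the description.
SharedBenchData shared_benchmark[] = {
    { NewBlit13, NewBlitTw, "A new blit function" },
};

const std::size_t kMaxDescription = 30;   // limit stated in the text

// Check a description against the 30-character limit.
bool description_fits(const char* text)
{
    return std::strlen(text) <= kMaxDescription;
}

// Call every entry of a mode-specific list once.
void run_list(const BenchData* list, std::size_t count)
{
    for (std::size_t i = 0; i < count; ++i) {
        assert(list[i].func != 0);
        list[i].func();
    }
}
```

Keeping the timing results out of these structures, as the text describes, means a new benchmark is a one-line addition to a list.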
-
- Guidelines for creating benchmark functions:
-
- 1) Right now I only do a loop of 10 64,000 byte moves. This can be
- performed using byte, word, or dword transfers. Just be sure to
- indicate which kind was used in the benchmark description.
-
- 2) I align all benchmark functions on a paragraph boundary, to make
- sure the timed function is at optimal speed. (though it might not
- make _too_much_ difference in time)
-
- 3) All system ram access is done on the ram_buffer which is located
- in the Uninitialized Far Data segment, and also paragraph aligned.
-
- 4) No parameters are passed, and no variables accessed from within
- the functions.
-
- 5) All functions requiring OUTs or similar setup do these within
- the loop. Even if the OUT is needed only once, it should still
- be included in the loop, to make sure no cheats are performed.
- The loop I speak of is the outer loop, not the inner transfer
- loops.
-
- You can use the current benchmark code as a reference if you like.
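As a rough illustration of the guidelines, the sketch below shows the overall shape of a benchmark function in C++: no parameters, ten passes, and the setup step repeated inside the outer loop. The real benchmarks are assembly routines in BENCH.ASM; here `fake_video` stands in for video memory, `setup_video_state` stands in for OUT instructions, and the counter exists only so the behaviour can be checked.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

const std::size_t kMoveBytes = 64000;   // one full-screen move
const int kRepeats = 10;                // passes per benchmark

// Stand-ins for ram_buffer (64,016 bytes per the text) and video memory.
static unsigned char ram_buffer[kMoveBytes + 16];
static unsigned char fake_video[kMoveBytes];

static int g_setup_calls = 0;           // counts simulated port writes

// Placeholder for the OUT instructions that configure the card.
static void setup_video_state() { ++g_setup_calls; }

// Benchmark body: no parameters, setup repeated inside the outer loop.
void ExampleByteBlit()
{
    assert(sizeof(ram_buffer) >= kMoveBytes);
    for (int pass = 0; pass < kRepeats; ++pass) {
        setup_video_state();                              // stays in the timed loop
        std::memcpy(fake_video, ram_buffer, kMoveBytes);  // one 64,000-byte move
    }
}
```

Hoisting `setup_video_state()` out of the loop would make the function look faster than it is in real use, which is the "cheat" guideline 5 rules out.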
-
- Notes:
- -----
-
- * This program can be used and distributed without any worry. It is
- asked, though, that it not be sold for profit. The benchmark can
- be modified, but with these restrictions:
-
- 1) Adding more benchmarks to the program is allowed, so long as
- the current benchmarks remain in the program.
-
- 2) Modifying the current benchmarks is allowed ONLY if you
- contact the author of that benchmark. Right now, there's only
- one programmer (me), but I hope that others will also contribute
- to this benchmark program.
-
- 3) Modifying the _way_ in which the benchmarks are timed is allowed
- ONLY if you contact me first (Dan Corritore)
-
- * Actually, I need to find some way of making sure version numbers
- and additions to the benchmarks are handled correctly, so for now,
- don't upload any additions/changes until speaking with me (Dan Corritore).
-
- * Eventually I will rewrite this documentation once other benchmarks
- are added, or if the benchmark program is changed in any way. Any
- help or suggestions on documentation layout would be greatly
- appreciated, as I'm not the best documenter.
-
- * Please, if you see a problem with the program code, or any mistakes,
- or perhaps think I'm going about doing the benchmarks totally wrong,
- let me know!
-
- Email address(es):
- -----------------
-
- Dan Corritore, author of VBENCH 1.0:
- CompuServe address: 70243,1110
-