Computer Gaming World 3D testing: What are the facts?

The 3D graphics accelerator review recently published in Computer Gaming World's November issue, "The 3D Wave hits the Shores" (p. 147), attempts to rate the performance of 3D graphics cards. However, we believe the testing methodology is flawed and, as a result, the conclusions drawn from these measurements are incorrect and not representative of the boards' actual performance.

Although Matrox has approached Computer Gaming World with these issues and requested that the magazine run their tests again, Computer Gaming World has not yet complied with this request. The following outlines Matrox's concerns about the magazine's methodology and provides supplementary performance results based on recommended methodology.

Computer Gaming World's testing premises
Computer Gaming World's review was intended to demonstrate how graphics accelerators would perform in a "real-world" 3D application. Several elements of the review, which will all be discussed in detail, undermine the effectiveness of their testing to represent 3D hardware performance.
1) The results published by Computer Gaming World cannot be reproduced by Matrox when following the same methodology on a similar system. Our testing shows the Matrox Mystique achieves higher scores than Diamond.
2) Hellbender was the only Direct3D application available at the time, but the magazine used it to draw specific conclusions about 3D hardware performance and to make product recommendations.
3) By not pushing the 3D hardware to its limits in their tests, Computer Gaming World actually measured the CPU's performance. The results Computer Gaming World printed therefore do not accurately represent the differences in 3D performance between software 3D rendering (e.g. on the STB Lightspeed 128) and a true mainstream 3D hardware accelerator (e.g. the Matrox Mystique).
4) By only publishing a min - max range, and not an average score, Computer Gaming World does not accurately represent the overall performance of the boards.
5) Computer Gaming World's methodology also introduced the element of human error in the testing, resulting in data that cannot be accurately reproduced.


1 - Hellbender benchmark results erroneous
Using Computer Gaming World's own methodology as described by the reviewer, Matrox found results that differ significantly from those published by the magazine. Specifically, the Matrox Mystique beats the Diamond Stealth 3D 2000XL, the editors' favored card.
CGW published results
               | Software (STB Lightspeed) | Diamond Stealth 3D 2000XL | Matrox Mystique
               | Min    Max                | Min    Max                | Min    Max
Cockpit "on"   | 7      15                 | 7      18                 | 4      19
System: P5-166, Triton II chipset, 32MB EDO RAM, 512KB pipeline burst L2 cache, 75Hz refresh

Matrox's results using the same methodology
               | Software (STB Lightspeed) | Diamond Stealth 3D 2000XL | Matrox Mystique
               | Min    Max                | Min    Max                | Min    Max
Cockpit "on"   | 7      18                 | 4      19                 | 4      25
System: P5-166, Triton FX chipset, 16MB EDO RAM, 256KB cache, 85Hz refresh, using the latest drivers available for each graphics accelerator at the time of the testing.


2 - Using Hellbender as the only application to draw conclusions
Although Computer Gaming World does acknowledge, in their sidebar, that using Hellbender as the sole application tested could only give "an indication of 3D performance, not the final word," (p. 153) they use these benchmark numbers to draw some very definite conclusions. In fact, the reviewer claims that "the 3D performance of some of these cards isn't all it's cracked up to be," which is a strong statement considering the lack of comprehensive tests used to draw such a conclusion. In addition, as will be explained next, the results the magazine printed were not representative of the boards' 3D capabilities in general, but were specific to the game they chose to test, which also makes the conclusions they draw inaccurate.

3 - Testing methodology with Hellbender not representative of true performance
As shown in the following table, the performance difference between hardware and software rendering is larger when the cockpit is "off" than when it is "on". By testing the boards with the cockpit "on", Computer Gaming World created a situation in which 3D hardware performance looks similar to software performance because both are limited by the CPU bottleneck. The results therefore do not reflect the graphics boards' actual 3D capabilities.

Tests performed by Matrox
                            | Software (STB Lightspeed) | Diamond Stealth 3D 2000XL | Matrox Mystique
                            | Min    Max                | Min    Max                | Min    Max
Cockpit "on"                | 7      18                 | 4      19                 | 4      25
No cockpit                  | 4      17                 | 4      17                 | 4      34
Overlay (partial cockpit)   | 5      16                 | 4      15                 | 4      25
System: P5-166, Triton FX chipset, 16MB EDO RAM, 256KB cache, 85Hz refresh, using the latest drivers available for each graphics accelerator at the time of the testing.
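
The gap can be read directly off the table above by dividing the Mystique's peak frame rate by the software renderer's peak in each mode. The short Python sketch below does only that arithmetic on the figures from Matrox's table; the function and variable names are illustrative and not part of any test tool.

```python
# Peak (max) frame rates in fps, copied from Matrox's table above.
# Each entry: mode -> (software on STB Lightspeed, Matrox Mystique).
peak_fps = {
    'Cockpit "on"': (18, 25),
    "No cockpit": (17, 34),
    "Overlay (partial cockpit)": (16, 25),
}


def hardware_speedup(software_fps, hardware_fps):
    """Ratio of the hardware peak frame rate to the software peak frame rate."""
    return hardware_fps / software_fps


speedups = {mode: hardware_speedup(sw, hw) for mode, (sw, hw) in peak_fps.items()}

for mode, ratio in speedups.items():
    print(f"{mode}: {ratio:.2f}x")
```

On these peak figures the Mystique is roughly 1.4 times as fast as software rendering with the cockpit on, and exactly twice as fast with no cockpit.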


Q: Why would it make a difference whether the cockpit is on or off?
A: Turning the cockpit "on" during testing shrinks the viewing area to be rendered to about half the size of the full screen. Consequently, the performance bottleneck in this situation is not the speed of the 3D rendering card, which is what Computer Gaming World intended to test, but the CPU's ability to transform and set up triangles (software performance).

When rendering a 3D scene, a mainstream 3D accelerator depends on the CPU to calculate the vertices of the triangles to be rendered. The larger the triangles are, or the more pixels there are in the scene, the larger the share of the work that is left to the hardware accelerator. Reducing the viewing area by turning the cockpit on also reduces the number of pixels the hardware has to render, and so does not stress the hardware's capabilities.

The actual benefit of using hardware 3D acceleration is being able to add more detail to the scene and to run at a higher resolution, making the application graphically more appealing. By testing in the reduced viewing area, CGW is not testing the fundamental improvement 3D hardware brings to the gamer, which is the capability to handle 3D in higher resolutions at high frame rates. Computer Gaming World's test therefore does not accurately reflect the boards' performance in a situation where the hardware would be heavily relied upon.
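
The bottleneck argument above can be sketched as a simple two-stage model in which the frame rate is capped by whichever is slower: the CPU's geometry and setup work, or the renderer's pixel fill. The Python below is only an illustration under stated assumptions. The resolution, CPU ceiling and fill rates are hypothetical values, not measurements, and the model ignores overdraw and everything else a real game does.

```python
def frame_rate(pixels_per_frame, fill_rate_pixels_per_s, cpu_limit_fps):
    """Frame rate under a simple two-stage bottleneck model.

    The CPU (geometry and triangle setup) and the renderer (pixel fill)
    each cap the frame rate; the slower stage decides the result.
    """
    fill_limit_fps = fill_rate_pixels_per_s / pixels_per_frame
    return min(cpu_limit_fps, fill_limit_fps)


# All values below are hypothetical, chosen only to show the shape of the argument.
FULL_SCREEN = 640 * 480          # assumed pixel count of the full view
HALF_SCREEN = FULL_SCREEN // 2   # cockpit "on": about half the viewing area
CPU_LIMIT_FPS = 30.0             # assumed ceiling set by CPU geometry/setup work
SLOW_FILL = 5_000_000            # assumed fill rate of a slower renderer, pixels/s
FAST_FILL = 20_000_000           # assumed fill rate of a faster renderer, pixels/s

for label, pixels in (("cockpit on", HALF_SCREEN), ("cockpit off", FULL_SCREEN)):
    slow = frame_rate(pixels, SLOW_FILL, CPU_LIMIT_FPS)
    fast = frame_rate(pixels, FAST_FILL, CPU_LIMIT_FPS)
    print(f"{label}: slow renderer {slow:.1f} fps, fast renderer {fast:.1f} fps")
```

With these assumed numbers, both renderers hit the same CPU ceiling when the view is halved, and only the full view separates them. That is the pattern the cockpit "on" test would hide.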

Q: Wouldn't most gamers play with the cockpit on?
A: Playing with the cockpit on or off is usually a personal preference. However, while cockpit instruments can contain some important information in certain games, the cockpit in Hellbender was specifically designed to block out as much of the visible area as possible. By so reducing the amount of rendering required, the game maintains a reasonable frame rate in the case where rendering would have to be handled in software (if 3D hardware acceleration is not present). However, playing without the cockpit not only increases the viewing area for the player, therefore creating a more immersive experience, it also allows the user to truly benefit from 3D hardware performance. In order to truly measure the benefit of using hardware 3D acceleration, the cockpit should have been turned off.

If, in fact, the user requires this information while playing Hellbender, the overlay mode available in the game displays all essential dials while leaving the main viewing area almost completely open. This mode could also be used to run the tests, and would also demonstrate (as shown in the above benchmarks) a significant difference between hardware and software rendering.

[Screenshots: Hellbender with cockpit on, cockpit off, and overlay]


4 - Publishing a range, instead of an average number, is not representative of overall performance
By publishing only the minimum and maximum frame rates reached by the hardware, Computer Gaming World does not accurately represent the average performance delivered by each board. These minimum and maximum scores may have been reached for only a few seconds, at specific points in the game, and might represent as little as one percent of total game play. They therefore say little about the performance the hardware delivers over the entire game.
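
A small Python sketch makes the point concrete. The two frame-rate traces below are hypothetical (one sample per second of play, invented for illustration); they share the same 4-25 fps range yet deliver very different averages.

```python
from statistics import mean


def summarize(fps_samples):
    """Return (minimum, maximum, average) for a list of per-second frame rates."""
    return min(fps_samples), max(fps_samples), mean(fps_samples)


# Hypothetical traces, one sample per second of play.
# Both touch 4 fps and 25 fps once, but spend the rest of the time very differently.
mostly_fast = [4] + [24] * 8 + [25]
mostly_slow = [4] + [6] * 8 + [25]

for name, trace in (("mostly fast", mostly_fast), ("mostly slow", mostly_slow)):
    lo, hi, avg = summarize(trace)
    print(f"{name}: range {lo}-{hi} fps, average {avg:.1f} fps")
```

Reported as a range, the two traces are indistinguishable; reported as an average, one comes out near 22 fps and the other under 8 fps.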

5 - Introducing the human error element in testing produces unreliable scores
When performing their tests, Computer Gaming World chose not to use the pre-set flight pattern available in the looped demo of Hellbender and instead produced their own path by flying through the game with a joystick. Although it might be argued that using the joystick is more representative of actual game play, it also introduces the element of human error. In this particular case, a slight variation in flight pattern produces significantly different scores; even an imperceptible tilt of the joystick upwards or downwards can result in large differences in frame rates. Since it was impossible for CGW to reproduce the exact same flight pattern for each board they tested, their methodology created an uneven playing field and produced unreliable test results. Using the demo loop at the beginning of the game would have provided a consistent flight path, generating easy-to-reproduce frame rates.
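
One way to check whether a test procedure is reproducible is to repeat it several times on the same board and look at how much the results spread. The Python sketch below uses the coefficient of variation (standard deviation divided by the mean) for that purpose. The run results are hypothetical numbers invented for illustration, not measurements from either Matrox or Computer Gaming World.

```python
from statistics import mean, pstdev


def relative_spread(run_averages):
    """Coefficient of variation: standard deviation divided by the mean."""
    return pstdev(run_averages) / mean(run_averages)


# Hypothetical average frame rates (fps) from five repeated runs on one board.
demo_loop_runs = [20.1, 19.9, 20.0, 20.2, 19.8]   # same scripted path every time
joystick_runs = [14.0, 23.5, 18.0, 26.0, 16.5]    # hand-flown, path differs per run

print(f"demo loop: {relative_spread(demo_loop_runs):.1%} spread")
print(f"joystick:  {relative_spread(joystick_runs):.1%} spread")
```

If the run-to-run spread on a single board is as large as the differences between boards, the comparison between boards cannot be trusted.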

Conclusion
The testing methodology used by Computer Gaming World in their November issue fails to correctly depict the benefits of using a mainstream 3D hardware accelerator such as the Mystique in a real-world application. In addition, the test measurements published by Computer Gaming World misrepresent the graphics boards' 3D performance, and the conclusions they draw in their article are therefore inconclusive and incorrect.

The following test results provide a more accurate measurement of the differences in performance between software and hardware, as well as an actual assessment of each board's capabilities in real-world applications available today, and those coming to the market in the near future.

Card | Matrox Mystique | Number Nine Reality 332f | Diamond Stealth 3D 2000XL | ATI 3D Xpression PC2TV | Hercules Dynamite 128/Video | STB Lightspeed 128 | Orchid Righteous 3D | Diamond Stealth 3D 3000XL
Chipset | 1064SG | Virge | Virge | Rage2 | ET6000 | ET6000 | Voodoo | Virge/VX
Graphics Winmark 1024x768x8 | 44 | 26.6 | 38 | 34.5 | 34.5 | 44 | N.S. | 39.6
Graphics Winmark 640x480x16 | 38 | 20.4 | 34.1 | 28.5 | 31.1 | 36.7 | N.S. | 35.0
Cbench VGA (fps) | 123.3 | 114.1 | 116.3 | 83.7 | 123.1 | 123.1 | N.S. | 106.7
Cbench SVGA (fps) | 35.9 | 33.4 | 34.2 | 36.1 | 38.3 | 38.3 | N.S. | 31.2
Quake (320x200) (fps) | 32.9 | 32.44 | 32.35 | 31.3 | 33.89 | 34.44 | N.S. | 33.6
Quake (640x480) (fps) | 13.7 | - | N.S. | 12.5 | 14.86 | 14.31 | N.S. | N.S.
Min D3D Test, Fill Rate (megapixels per second) | 11.20 | - | 8.92 | 12.18 | - | - | 32.88 | 9.31
Min D3D Test, Polygon Throughput (kilopixels per second) | 142.14 | - | 200.88 | 139.04 | - | - | 306.53 | 178.56
Mid D3D Test, Fill Rate (megapixels per second) | 11.20 | - | 5.20 | 10.11 | - | - | - | 5.78
Mid D3D Test, Polygon Throughput (kilopixels per second) | 129.0 | - | 142.14 | 76.53 | - | - | - | 171.12
Max D3D Test, Fill Rate (megapixels per second) | N.S. | - | 4.65 | 6.72 | - | - | 24.7 | 4.78
Max D3D Test, Polygon Throughput (kilopixels per second) | N.S. | - | 141.36 | 76.53 | - | - | 230.64 | 171.75
D3D Tunnel Test (512x384) (fps) | 72.99 | - | 18.58 | 35.08 | - | - | N.S. | 21.92
D3D Tunnel Test (640x400) (fps) | 57.14 | - | N.S. | 33.11 | - | - | 120 | N.S.
D3D Twist Test (512x384) (fps) | 232.55 | - | 69.93 | 69.93 | - | - | N.S. | 60.60
D3D Twist Test (640x400) (fps) | 190.0 | - | N.S. | 69.93 | - | - | 120 | N.S.
Hellbender, Slowest-Fastest (fps), No Cockpit | 4-34 | - | 4-17 | 4-25 | 4-18 | 4-17 | 7-38 | 4-19
Monster Truck Madness, Slowest-Fastest (fps), Dashboard Off | 4-30 | - | 4-11 | 4-16 | 4-11 | 4-11 | 8-32 | 4-10


Copyright © 1996 Matrox Graphics Inc. All rights reserved.