NetNews Usenet Archive 1992 #30

Text File | 1992-12-16 | 16.9 KB | 433 lines

Newsgroups: comp.benchmarks
Path: sparky!uunet!think.com!ames!data.nas.nasa.gov!amelia.nas.nasa.gov!eugene
From: eugene@amelia.nas.nasa.gov (Eugene N. Miya)
Subject: [l/m 12/1/92] SPEC info sources (16/28) c.be. FAQ
Keywords: who, what, where, when, why, how
Sender: news@nas.nasa.gov (News Administrator)
Organization: NAS Program, NASA Ames Research Center, Moffett Field, CA
Date: Wed, 16 Dec 92 12:25:10 GMT
Message-ID: <1992Dec16.122510.18274@nas.nasa.gov>
Reply-To: eugene@amelia.nas.nasa.gov (Eugene N. Miya)
Lines: 420

16      SPEC                            <This panel>
17      Benchmark invalidation methods
18
19      WPI Benchmark
20      Equivalence
21      TPC
22
23
24
25      Ridiculously short benchmarks
26      Other miscellaneous benchmarks
27
28      References
1       Introduction to the FAQ chain and netiquette
2
3       PERFECT Club
4
5       Performance Metrics
6
7       Music to benchmark by
8       Benchmark types
9       Linpack
10
11      NIST source and .orgs
12      Measurement Environments
13      SLALOM
14
15      12 Ways to Fool the Masses with Benchmarks

Contents:

1. What is SPEC
2. How to Contact SPEC
3. SPEC's Products and Services
4. Current SPEC Benchmarks
   4.1 CPU Benchmarks
   4.2 SDM Benchmarks
5. Outdated SPEC Benchmarks
6. Forthcoming SPEC Benchmarks
7. Membership in SPEC

1. What is SPEC
===============

SPEC, the System Performance Evaluation Cooperative, is a non-profit
corporation formed to "establish, maintain and endorse a standardized
set of relevant benchmarks that can be applied to the newest generation
of high-performance computers". The founders of this organization
believe that the user community will benefit greatly from an objective
series of applications-oriented tests, which can serve as common
reference points and be considered during the evaluation process. While
no one benchmark can fully characterize overall system performance, the
results of a variety of realistic benchmarks can give valuable insight
into expected real performance.

The members of SPEC are currently: Apple, AT&T/NCR, Bull, Compaq,
Control Data, Data General, DEC, Fujitsu, Hal Computer, Hewlett-Packard,
IBM, Intel, Intergraph, MIPS, Motorola, NeXT, Prime, Siemens Nixdorf,
Silicon Graphics, Solbourne, Sun, Unisys.

SPEC Associates are currently: Leibniz Computing Center of the Bavarian
Academy of Science, National Taiwan University, SERC Daresbury
Laboratory.

Legally, SPEC is a non-profit corporation (legal name: Standard
Performance Evaluation Corporation) registered in California.

SPEC basically does two things:

- SPEC develops suites of benchmarks intended to measure computer
  performance. These suites are packaged with source code and tools
  and are extensively tested for portability before release. They are
  available to the public for a fee covering development and
  administration costs. By license agreement, SPEC members and
  customers agree to run and report results as specified in each
  benchmark suite's documentation.

- SPEC publishes a quarterly report of SPEC news and benchmark
  results: The SPEC Newsletter. This provides a centralized source of
  information for SPEC benchmark results. Both SPEC members and
  non-SPEC members may publish in the SPEC Newsletter, though there is
  a fee for non-members. (Note that results may be published elsewhere
  as long as the format specified in the SPEC Run Rules is followed.)

2. How to Contact SPEC
======================

   SPEC [Systems Performance Evaluation Corporation]
   c/o NCGA [National Computer Graphics Association]
   2722 Merrilee Drive
   Suite 200
   Fairfax, VA 22031
   USA

   Phone:  +1-703-698-9600 Ext. 318
   FAX:    +1-703-560-2752
   E-Mail: spec-ncga@cup.portal.com

For technical questions regarding the SPEC benchmarks (e.g., problems
with execution of the benchmarks), Dianne Dean (she is the person
normally handling SPEC matters at NCGA) refers the caller to an expert
at a SPEC member company.

3. SPEC Products and Services
=============================

The SPEC benchmark sources are generally available, but not free.
SPEC charges separately for its benchmark suites; the income from the
benchmark source tapes is intended to support the administrative costs
of the corporation - making tapes, answering questions about the
benchmarks, and so on. Buyers of the benchmark tapes have to sign a
license stating the conditions of use (site license only) and the rules
for result publications. All benchmark suites come on QIC 24 tapes,
written in UNIX tar format. Accredited universities receive a 50 %
discount on SPEC tape products.

Current prices are:

   CINT92          $  425.00
   CFP92           $  575.00
   CINT92&CFP92    $  900.00
   SDM             $ 1450.00
   Release1.2b     $ 1000.00

The SPEC Newsletter appears quarterly; it contains result publications
for a variety of machines (typically, about 50-70 pages of result pages
per issue) as well as articles dealing with SPEC and benchmarking.

   Newsletter      $  550.00   (1 year subscription, 4 issues)

4. Current SPEC Benchmarks
==========================

4.1 CPU Benchmarks
==================

There are currently two suites of compute-intensive SPEC benchmarks,
measuring the performance of CPU, memory system, and compiler code
generation. They use UNIX as the portability vehicle, but the
percentage of time spent in operating system and I/O functions is
generally negligible.

CINT92, current release: Rel. 1.1
---------------------------------

This suite contains 6 benchmarks performing integer computations, all
written in C. The individual programs are:

Number and name   Area                      Approx. size
                                             gross     net

008.espresso      Circuit Theory             14800   11000
022.li            LISP interpreter            7700    5000
023.eqntott       Logic Design                3600    2600
026.compress      Data Compression            1500    1000
072.sc            Spreadsheet Calculator      8500    7100
085.gcc           GNU C Compiler             87800   58800
                                            ------   -----
                                            123900   85500

The approximate static size is given in numbers of source code lines,
including declarations (header files). "Gross" numbers include comments
and blank lines, "net" numbers exclude them.

CFP92, current release: Rel. 1.1
--------------------------------

This suite contains 14 benchmarks performing floating-point
computations. 12 of them are written in Fortran, 2 in C. The individual
programs are:

Number and name   Area                     Lang.   Approx. size
                                                    gross     net

013.spice2g6      Circuit Design             F      18900   15000
015.doduc         Monte Carlo Simulation     F       5300    5300
034.mdljdp2       Quantum Chemistry          F       4500    3600
039.wave5         Maxwell's Equation         F       7600    6400
047.tomcatv       Coordinate Translation     F        200     100
048.ora           Optical Ray Tracing        F        500     300
052.alvinn        Robotics                   C        300     200
056.ear           Medical Modeling           C       5200    3300
077.mdljsp2       Quantum Chemistry          F       3900    3100
078.swm256        Shallow Water Model        F        500     300
089.su2cor        Quantum Physics            F       2500    1700
090.hydro2d       Astrophysics               F       4500    1700
093.nasa7         NASA Kernels               F       1300     800
094.fpppp         Quantum Chemistry          F       2700    2100
                                                    -----   -----
                                                    57900   43900

More information about the individual benchmarks is contained in
description files in each benchmark's subdirectory on the SPEC
benchmark tape.

The CPU benchmarks can be used for measurement in two ways:

- Speed measurement
- Throughput measurement

Speed Measurement
-----------------

The results ("SPEC Ratio" for each individual benchmark) are expressed
as the ratio of the wall clock time to execute one single copy of the
benchmark, compared to a fixed "SPEC reference time" (which was chosen
early-on as the execution time on a VAX 11/780).

As is apparent from results publications, the different SPEC ratios for
a given machine can vary widely. SPEC encourages the public to look at
the individual results for each benchmark; users should compare the
characteristics of their workload with that of the individual SPEC
benchmarks and consider those benchmarks that best approximate their
jobs.
However, SPEC also recognizes the demand for aggregate result numbers
and has defined the integer and floating-point averages

   SPECint92 = geometric average of the 6 SPEC ratios from CINT92
   SPECfp92  = geometric average of the 14 SPEC ratios from CFP92

Throughput Measurement
----------------------

With this measurement method, called the "homogeneous capacity method",
several copies of a given benchmark are executed; this method is
particularly suitable for multiprocessor systems. The results, called
SPEC rate, express how many jobs of a particular type (characterized by
the individual benchmark) can be executed in a given time (the SPEC
reference time happens to be a week; the execution times are normalized
with respect to a VAX 11/780). The SPEC rates therefore characterize
the capacity of a system for compute-intensive jobs of similar
characteristics.

As with the speed metric, SPEC has defined averages

   SPECrate_int92 = geometric average of the 6 SPEC rates from CINT92
   SPECrate_fp92  = geometric average of the 14 SPEC rates from CFP92

Because of the different units, the values SPECint92/SPECfp92 and
SPECrate_int92/SPECrate_fp92 cannot be compared directly.

No more SPECmark computation
----------------------------

While the old average "SPECmark[89]" has been popular with the industry
and the press (see section 5: Outdated SPEC Benchmarks), SPEC has
intentionally *not* defined an average "SPECmark92" over all CPU
benchmarks of the 1992 suites, for the following reasons:

- With 6 integer and 14 floating-point benchmarks, the average would
  be biased too much towards floating-point,
- Customers' workloads are different, some integer-only, some
  floating-point intensive, some mixed,
- Current processors have developed their strengths in a more diverse
  way (some more emphasizing integer performance, some more
  floating-point performance) than in 1989.
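
The speed averages defined in section 4.1 can be sketched in a few
lines of Python. This is only an illustration, not SPEC's own tooling:
the function names are made up for this example, all ratio values are
invented (they are not measured results), and the SPEC ratio is assumed
to be the reference time divided by the measured wall clock time, so
that larger values mean a faster machine. The last lines show the bias
mentioned above: a single geometric average over all 20 ratios would
weight the floating-point side 14/20 and the integer side only 6/20.

```python
import math


def spec_ratio(reference_seconds, measured_seconds):
    """Speed ratio for one benchmark: reference time / measured wall clock time."""
    if reference_seconds <= 0 or measured_seconds <= 0:
        raise ValueError("times must be positive")
    return reference_seconds / measured_seconds


def geometric_mean(values):
    """Geometric average: the n-th root of the product of n positive values."""
    values = list(values)
    if not values or any(v <= 0 for v in values):
        raise ValueError("need at least one value, all positive")
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical SPEC ratios, invented for illustration only.
int_ratios = [20.0, 25.0, 30.0, 22.0, 28.0, 24.0]        # 6 CINT92 programs
fp_ratios = [40.0, 35.0, 50.0, 45.0, 38.0, 42.0, 60.0,
             33.0, 47.0, 52.0, 36.0, 44.0, 41.0, 39.0]   # 14 CFP92 programs

specint92 = geometric_mean(int_ratios)
specfp92 = geometric_mean(fp_ratios)

# A combined average over all 20 ratios (which SPEC deliberately does not
# define) is the same as weighting the two averages by 6/20 and 14/20.
combined = geometric_mean(int_ratios + fp_ratios)
weighted = specint92 ** (6 / 20) * specfp92 ** (14 / 20)

print(f"SPECint92 (hypothetical) = {specint92:.1f}")
print(f"SPECfp92  (hypothetical) = {specfp92:.1f}")
print(f"combined over 20 ratios  = {combined:.1f}")
print(f"6/20 and 14/20 weighting = {weighted:.1f}")
```

The two last printed values agree, which is why a "SPECmark92" would
mostly reflect floating-point performance.
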

4.2 SDM Benchmarks
==================

SDM stands for "Systems Development Multiuser"; the two benchmarks in
this suite (current release: 1.1) characterize the capacity of a system
in a multiuser UNIX environment. In contrast to the CPU benchmarks, the
SDM benchmarks contain UNIX shell scripts (consisting of commands like
"cd", "mkdir", "find", "cc", "nroff", etc.) that exercise the operating
system as well as the CPU and I/O components of the system. The
workloads of the benchmarks are intended to represent UNIX software
development environments.

For each benchmark, throughput numbers (scripts, i.e. simulated user
loads per hour) are given for several values of concurrent workloads.
The reader can determine the peak throughput as well as the ability of
a system to sustain throughput over a range of concurrent workloads.
Since the workloads for the two benchmarks (057.sdet, 061.kenbus1) are
different, the throughput values for different benchmarks cannot be
compared.

5. Outdated SPEC Benchmarks
===========================

SPEC published its first CPU benchmark suite in 1989; the current
release of it is 1.2b. It contains 10 compute-intensive programs,
4 integer (written in C) and 6 floating-point (written in Fortran).
The following average values were defined:

   SPECint89  = geometric average of the SPEC ratios of the 4 integer
                programs in rel. 1.2b (CPU-Suite of 1989)
   SPECfp89   = geometric average of the SPEC ratios of the
                6 floating-point programs in rel. 1.2b
   SPECmark89 = geometric average of all 10 SPEC ratios of the
                programs in rel. 1.2b

In addition, there was the possibility of throughput measurements, with
2 copies of a benchmark running per CPU, called "Thruput Method A"
(there was never a "Method B").
The following average values were defined:

   SPECintThruput89 = geometric average of the Thruput Ratios of the
                      4 integer programs
   SPECfpThruput89  = geometric average of the Thruput Ratios of the
                      6 floating-point programs
   SPECThruput89 ("aggregate thruput")
                    = geometric average of the Thruput Ratios of all
                      10 programs

SPEC now discourages use of the 1989 benchmark suite and recommends use
of the CINT92 and CFP92 suites, for the following reasons:

- The new suites cover a wider area of programs (20 programs instead
  of 10),
- The execution times for some of the old benchmarks became too short
  on today's fast machines, with the danger of timing inaccuracies,
- Input files have now been provided for most benchmarks in the 1992
  suites, eliminating the danger of unintended compiler optimizations
  (constant propagation),
- The new suites no longer contain a benchmark (030.matrix300) that
  was too much influenced by a particular compiler optimization. This
  optimization, while legal and a significant step in compiler
  technology (it is still often used with the benchmarks of 1992),
  inflated the SPEC ratio for this benchmark since it executed only
  code susceptible to this optimization.

However, SPEC is aware of the fact that results with the old benchmark
suite will still be quoted for a while and used for comparison
purposes.

SPEC will discontinue sales of Rel. 1.2b tapes after December 1992 and
discontinue result publications for it after June 1993.

6. Forthcoming SPEC Benchmarks
==============================

SPEC is currently working on an NFS benchmark, also known as the LADDIS
benchmark, measuring the performance of network file servers that
follow the NFS protocol. Release of this benchmark is expected in March
1993. A beta test version ("pre-LADDIS") has been made available
earlier to interested parties; however, SPEC explicitly disallowed the
use of pre-LADDIS results in company publications.
There have been significant changes between pre-LADDIS and the final
version of LADDIS.

A number of other areas have been considered or are being considered by
SPEC for future benchmark efforts:

- I/O benchmarks
- Client/Server benchmarks
- RTE-based benchmarks
- Commercial computing benchmarks

SPEC is always open to suggestions from the computing community for
future benchmarking directions. Of course, SPEC welcomes even more
proposals for actual programs that can be used as benchmarks.

7. Membership in SPEC
=====================

The costs for SPEC membership are

   Annual Dues      $ 5000.00
   Initiation Fee   $ 1000.00

There is also the category of a "SPEC Associate", intended for
accredited educational institutions or non-profit organizations:

   Annual Dues      $ 1000.00
   Initiation Fee   $  500.00

Associates have no voting privileges; otherwise they have the same
benefits as SPEC members: newsletters and benchmark tapes as they are
available, with company-wide license. Probably more important are early
access to benchmarks that are being developed, and the possibility to
participate in the technical work on the benchmarks. The intention for
associates is that they can act in an advisory capacity to SPEC,
getting first-hand experience in an area that is widely neglected in
academia but nevertheless very important in the "real world", and
providing technical input to SPEC's task.

SPEC meetings are held about every six weeks, for technical work and
decisions about the benchmarks. Every member or associate can
participate and make proposals; decisions are made by a Steering
Committee (9 members) elected by the general membership at the Annual
Meeting. All members vote before a benchmark is finally accepted.

8. Acknowledgment
=================

This summary of SPEC's activities was written initially by Reinhold
Weicker (Siemens Nixdorf). Portions of the text have been carried over
from earlier Usenet postings (Answers to Frequently Asked Questions, No.
16: SPEC) by Eugene Miya (NASA Ames Research Center). Additional input
has been provided by Jeff Reilly (Intel). This summary is regularly
updated by Jeff Reilly and possibly other SPEC people.

Managerial and technical inquiries about SPEC should be directed to
NCGA (see section 2). E-Mail questions that do not interfere too much
with our real work :-) can also be mailed to

   jwreilly@mipos2.intel.com            from North America
   Reinhold.Weicker@stm.mchp.sni.de     from Europe or elsewhere

                  ^ A
               s / \ r
              m /   \ c
             h /     \ h
            t /       \ i
           i /         \ t
          r /           \ e
         o /             \ c
        g /               \ t
       l /                 \ u
      A /                   \ r
       <_____________________> e
              Language