NetNews Usenet Archive 1992 #19


Text File | 1992-09-02 | 6.0 KB | 170 lines

Newsgroups: comp.benchmarks
Path: sparky!uunet!sun-barr!ames!data.nas.nasa.gov!amelia!eugene
From: eugene@amelia.nas.nasa.gov (Eugene N. Miya)
Subject: [l/m 4/23/92] good conceptual benchmarking (2/28) c.be FAQ
Keywords: who, what, where, when, why, how
Sender: news@nas.nasa.gov (News Administrator)
Organization: NAS Program, NASA Ames Research Center, Moffett Field, CA
Date: Wed, 2 Sep 92 11:25:10 GMT
Message-ID: <1992Sep2.112510.576@nas.nasa.gov>
Reply-To: eugene@amelia.nas.nasa.gov (Eugene N. Miya)
Lines: 157

2	Benchmarking concepts		<this panel>
3	PERFECT Club/Suite
4
5	Performance Metrics
6	Temporary scaffold of New FAQ material
7	Music to benchmark by
8	Benchmark types
9	Linpack
10
11
12	Benchmark Environments
13
14
15	12 Ways to Fool the Masses with Benchmarks
16	SPEC
17	Benchmark invalidation methods
18
19	WPI Benchmark
20	Equivalence
21	TPC
22
23	RFC 1242 terminology (network benchmarking)
24
25	Ridiculously short benchmarks
26	Other miscellaneous benchmarks
27
28	References
1	Introduction to FAQ chain and netiquette

Benchmarking is a difficult black art which combines several technical
and social problems. It is a juggling act; as such, any solution must
combine several components, both technical and social. In particular,
the social problems require some degree of consensus, very much like
the problems of international measurement, a la the metric system.

Benchmarking is usually seen as a linear process:

                     -----------
                     |  test   |
 "optional input" -->| program |---> "output [time]"
                     |         |
                     -----------

Sort of like a ruler or scale. This is probably too simplistic;
it really is a more detailed process.
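The linear view above can be sketched in a few lines of Python. This is
only an illustration of the "input -> test program -> output [time]"
picture, not anything from the FAQ itself: the names run_benchmark and
sum_of_squares are hypothetical, and the wall-clock timer is an assumed
choice.

```python
import time


def run_benchmark(test_program, optional_input=None):
    """Naive linear view: input -> test program -> (output, elapsed time)."""
    start = time.perf_counter()
    output = test_program(optional_input)
    elapsed = time.perf_counter() - start
    return output, elapsed


def sum_of_squares(n):
    # Hypothetical test program, chosen only for illustration.
    return sum(i * i for i in range(n or 0))


if __name__ == "__main__":
    output, elapsed = run_benchmark(sum_of_squares, 100_000)
    print(f"output={output} time={elapsed:.6f}s")
```

A single number from a single run is all this produces, which is the
sense in which the linear picture resembles a ruler or scale.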
A more useful figure:

  -----------  -----------  -----------  -----------
  |pre      |  |pre      |  |         |  |post     |
->|compiled |->|test     |->|test     |->|test     |->
  |condition|  |execution|| |         | ||execution|
  -----------  -----------| ----------- |-----------
                          |             |
                          | ----------- |
                          | |control  | |
                          |-|condition|-|
                            |         |
                            -----------

From this figure one can see some of the more detailed elements and
issues of the basic measurement problem: equivalence, concurrency,
control, intrusive (invasive) measurement, overheads, preparation, etc.

Before you ever say, "That's trying to measure apples and oranges,"
you had best realize that the biologists and biochemists did just that
several decades ago. They did. They discovered that apples and oranges
have a very common base: it's called DNA, and the gene maps between the
two differ very little.

Let's make some clear distinctions:

Performance Evaluation
	The overall process. (Analysis and measurement)

Performance Analysis
	Like mathematical analysis. The implication should be
	mathematical or simulation. Susceptible to illusion and
	deception. Never the last word.
	Ideally: deterministic.

Performance Measurement
	The emphasis should be empirical. Benchmarks run on simulations
	are "Analysis." Measurement is a verification of real hardware
	performance. It's bound by the laws of physics. It can be
	spoofed. It appears as "the last word." This is where
	benchmarking lies.
	Ideally: demonstrable, repeatable, and reproducible.

The history of the area is such that many architectures are claimed to
deliver one level of performance and in reality under-perform (usually).

[Wulf81]:
	We want to learn about the consequences of different designs on
	the useability and performance of multiprocessors.
	Unfortunately, each decision we make precludes us from
	exploring its alternatives. This is unfortunate, but probably
	inevitable for hardware. Perhaps, however, it is not inevitable
	for the software.... and especially for the facilities provided
	by the operating system.

Quoting Georg von Bekesy . . .
	As I see it, the difference between successful and unsuccessful
	research is basically a problem of asking the right question.
	I can distinguish the following types of questions:
	1. The unimportant question
	2. The premature question
	3. The strategic question
	4. The stimulating question
	5. The embarrassing question (the kind asked at meetings)
	6. The pseudo-question (often a consequence of a different
	   definition or a different approach)
	As a beginner I wanted to find a strategic question, but was
	unable to do so.

Pierce (and Bekesy) likes stimulating questions: they motivate you to
do something.

%A Willem A van\ Bergeijk
%A John R. Pierce
%A Edward E. David, Jr.
%T Waves and the Ear
%I Doubleday
%C Garden City, New York
%D 1960

	Every science begins with the observation of striking events
	like thunderstorms or fevers, and soon establishes rough
	connections between them and other events, such as hot weather
	or infection. The next stage is a stage of exact observation
	and measurement, and it is often very difficult to know what we
	should measure in order to best explain the events we are
	investigating. In the case of both thunderstorms and fevers the
	clue came from measuring the lengths of mercury columns in
	glass tubes, but what prophet could have predicted this? Then
	comes a stage of innumerable graphs and tables of figures, the
	despair of the student, the laughing-stock of the man in the
	street. And out of this intellectual mess there suddenly
	crystallizes a new and easily grasped idea, the idea of a
	cyclone or an electron, a bacillus or an antitoxin, and
	everybody wonders why it had not been thought of before.

%A J.B.S. Haldane
%T The Future of Biology
%B On Being the Right Size and other Essays
%O Oxford Univ. Press
%C Oxford, England
%D 1985
%X Also good for "What 'Hot' means" (terminology) and pseudo science
essays.

               ^   A
          s   / \   r
         m   /   \   c
        h   /     \   h
       t   /       \   i
      i   /         \   t
     r   /           \   e
    o   /             \   c
   g   /               \   t
  l   /                 \   u
 A   /                   \   r
    <_____________________>   e
           Language
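
As a rough illustration of the staged figure earlier in this panel
(pre-test execution, test, post-test execution, with a control
condition alongside), here is a minimal Python sketch. Everything in it
is an assumption made for illustration: the names timed and
staged_benchmark, the empty control function, and the use of the median
and the min-to-max spread over repeated runs as a crude check on
repeatability.

```python
import statistics
import time


def timed(fn, arg):
    """Return the wall-clock seconds taken by one call of fn(arg)."""
    start = time.perf_counter()
    fn(arg)
    return time.perf_counter() - start


def staged_benchmark(test, make_input, cleanup=None, repeats=5):
    """Untimed setup, timed test, untimed cleanup, plus a control run."""

    def control(_):
        # Control condition: does nothing, so its time is harness overhead.
        return None

    test_times, control_times = [], []
    for _ in range(repeats):
        data = make_input()                          # pre-test execution
        control_times.append(timed(control, data))   # control condition
        test_times.append(timed(test, data))         # the test itself
        if cleanup is not None:
            cleanup(data)                            # post-test execution
    return {
        "median": statistics.median(test_times),
        "overhead": statistics.median(control_times),
        "spread": max(test_times) - min(test_times),
        "runs": len(test_times),
    }


if __name__ == "__main__":
    # Hypothetical workload: sort a reversed list built fresh for each run.
    print(staged_benchmark(sorted, lambda: list(range(50_000, 0, -1))))
```

Keeping setup and cleanup outside the timed region, and reporting the
control time separately, is one way to keep preparation and measurement
overhead from being mistaken for the test's own time.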