NetNews Usenet Archive 1992 #20

Text File | 1992-09-15 | 58.6 KB | 1,193 lines

Newsgroups: comp.sys.ibm.pc.hardware
Path: sparky!uunet!usc!sol.ctr.columbia.edu!ira.uka.de!uni-heidelberg!rz.uni-karlsruhe.de!usenet
From: S_JUFFA@iravcl.ira.uka.de (|S| Norbert Juffa)
Subject: What you always wanted to know about math coprocessors 4/4
Message-ID: <1992Sep15.163011.11189@rz.uni-karlsruhe.de>
Sender: usenet@rz.uni-karlsruhe.de (USENET News System)
Organization: University of Karlsruhe (FRG) - Informatik Rechnerabt.
Date: Tue, 15 Sep 1992 16:30:11 GMT
X-News-Reader: VMS NEWS 1.23
Lines: 1181

Test results for accuracy of transcendental functions for double
extended precision as returned by the program TRANCK. 100,000 trials
per function.

%wrong      is the percentage of results that differ from the 'exact'
            result (infinitely precise result rounded to 64 bits).
ULP_hi      is the number of results where the returned result was
            greater than the 'exact' (correctly rounded) result by one
            ULP (the numeric weight of the last mantissa bit, 2**-63
            to 2**-64 depending on the size of the number).
ULPs_hi     is the number of results where the returned result was
            greater than the 'exact' result by two or more ULPs.
ULP_lo      is the number of results where the returned result was
            smaller than the 'exact' (correctly rounded) result by one
            ULP (the numeric weight of the last mantissa bit, 2**-63
            to 2**-64 depending on the size of the number).
ULPs_lo     is the number of results where the returned result was
            smaller than the 'exact' result by two or more ULPs.
max ULP err is the maximum deviation of a returned result from the
            'exact' answer expressed in ULPs.


Franke387 V2.4 emulator
                                                                   max
funct.  interval       %wrong  ULP_hi  ULPs_hi  ULP_lo  ULPs_lo  ULP err

SIN     0,pi/4         39.042   25301      708   13029        4        2
COS     0,pi/4         75.714   49827    25887       0        0        3
TAN     0,pi/4         76.976   14230    10029   24323    28394        9
ATAN    0,1            55.826   26028     1529   24044     4225        4
2XM1    0,0.5          96.717       0        0   47910    48807        5
YL2XP1  0,sqrt(2)-1    93.007     578        9   27416    65004        8
YL2X    0.1,10         62.252   16817     4712   37082     3641     2953


Microsoft's coproc. emulator (part of MS-C and MS-Fortran libraries)
                                                                   max
funct.  interval       %wrong  ULP_hi  ULPs_hi  ULP_lo  ULPs_lo  ULP err

SIN     0,pi/4            N/A     N/A      N/A     N/A      N/A      N/A
COS     0,pi/4            N/A     N/A      N/A     N/A      N/A      N/A
TAN     0,pi/4         40.828   27764     1520   11445       99        2
ATAN    0,1            32.307   18893      485   12530      299        2
2XM1    0,0.5          52.163    8585      189   37745     5644        3
YL2XP1  0,sqrt(2)-1    88.801    4714      916   14239    68932       11
YL2X    0.1,10         36.598   13813     3272   13866     5647       11


INTEL 8087, 80287
                                                                   max
funct.  interval       %wrong  ULP_hi  ULPs_hi  ULP_lo  ULPs_lo  ULP err

SIN     0,pi/4            N/A     N/A      N/A     N/A      N/A      N/A
COS     0,pi/4            N/A     N/A      N/A     N/A      N/A      N/A
TAN     0,pi/4         37.001   18756      524   17405      316        2
ATAN    0,1             9.666    6065        0    3601        0        1
2XM1    0,0.5          19.920       0        0   19920        0        1
YL2XP1  0,sqrt(2)-1     7.780     868        0    6912        0        1
YL2X    0.1,10          1.287     723        0     564        0        1


INTEL 387
                                                                   max
funct.  interval       %wrong  ULP_hi  ULPs_hi  ULP_lo  ULPs_lo  ULP err

SIN     0,pi/4         28.872    2467        0   26392       13        2
COS     0,pi/4         27.213   27169       35       9        0        2
TAN     0,pi/4         10.532     441        0   10091        0        1
ATAN    0,1             7.088    2386        0    4691        1        2
2XM1    0,0.5          32.024       0        0   32024        0        1
YL2XP1  0,sqrt(2)-1    22.611    3461        0   19150        0        1
YL2X    0.1,10         13.020    6508        0    6512        0        1


INTEL 387DX
                                                                   max
funct.  interval       %wrong  ULP_hi  ULPs_hi  ULP_lo  ULPs_lo  ULP err

SIN     0,pi/4         28.873    2467        0   26393       13        2
COS     0,pi/4         27.121   27090       22       9        0        2
TAN     0,pi/4         10.711     457        0   10254        0        1
ATAN    0,1             7.088    2386        0    4691        1        2
2XM1    0,0.5          32.024       0        0   32024        0        1
YL2XP1  0,sqrt(2)-1    22.611    3461        0   19150        0        1
YL2X    0.1,10         13.020    6508        0    6512        0        1


ULSI 83C87
                                                                   max
funct.  interval       %wrong  ULP_hi  ULPs_hi  ULP_lo  ULPs_lo  ULP err

SIN     0,pi/4         35.530    4989        6   30238      297        2
COS     0,pi/4         43.989   11193      675   31393      728        2
TAN     0,pi/4         48.539   18880     1015   26349     2295        3
ATAN    0,1            20.858      62        0   20796        0        1
2XM1    0,0.5          21.257       4        0   21253        0        1
YL2XP1  0,sqrt(2)-1    27.893    9446        0   18213      234        2
YL2X    0.1,10         13.603    9816        0    3787        0        1


IIT 3C87
                                                                   max
funct.  interval       %wrong  ULP_hi  ULPs_hi  ULP_lo  ULPs_lo  ULP err

SIN     0,pi/4         18.650   11171        0    7479        0        1
COS     0,pi/4          7.700    3024        0    4676        0        1
TAN     0,pi/4         20.973    9681        0   11291        1        2
ATAN    0,1            19.280   13186        0    6094        0        1
2XM1    0,0.5          25.660   17570        0    8090        0        1
YL2XP1  0,sqrt(2)-1    45.830   23503     1896   19654      777        3
YL2X    0.1,10         10.888    5638      357    4845       48        3


C&T 38700DX
                                                                   max
funct.  interval       %wrong  ULP_hi  ULPs_hi  ULP_lo  ULPs_lo  ULP err

SIN     0,pi/4          1.821    1272        0     549        0        1
COS     0,pi/4         23.358   12458        0   10901        0        1
TAN     0,pi/4         17.178   10725        0    6453        0        1
ATAN    0,1             9.359    7082        0    2277        0        1
2XM1    0,0.5          15.188    3039        0   12149        0        1
YL2XP1  0,sqrt(2)-1    19.497   12109        0    7388        0        1
YL2X    0.1,10         46.868     261        0   46607        0        1


CYRIX 83D87
                                                                   max
funct.  interval       %wrong  ULP_hi  ULPs_hi  ULP_lo  ULPs_lo  ULP err

SIN     0,pi/4          1.554    1015        0     539        0        1
COS     0,pi/4          0.925     143        0     782        0        1
TAN     0,pi/4          4.147     881        0    3266        0        1
ATAN    0,1             0.656     229        0     427        0        1
2XM1    0,0.5           2.628    1433        0    1194        0        1
YL2XP1  0,sqrt(2)-1     3.242     825        0    2417        0        1
YL2X    0.1,10          0.931     256        0     675        0        1


CYRIX 387+
                                                                   max
funct.  interval       %wrong  ULP_hi  ULPs_hi  ULP_lo  ULPs_lo  ULP err

SIN     0,pi/4          1.486     864        0     622        0        1
COS     0,pi/4          2.072      12        0    2060        0        1
TAN     0,pi/4          0.602      63        0     539        0        1
ATAN    0,1             0.384      12        0     372        0        1
2XM1    0,0.5           1.985      27        0    1958        0        1
YL2XP1  0,sqrt(2)-1     3.662    1705        0    1957        0        1
YL2X    0.1,10          0.764     367        0     397        0        1


INTEL RapidCAD, Intel 486
                                                                   max
funct.  interval       %wrong  ULP_hi  ULPs_hi  ULP_lo  ULPs_lo  ULP err

SIN     0,pi/4         16.991    1517        0   15474        0        1
COS     0,pi/4          9.003    7603        0    1400        0        1
TAN     0,pi/4         10.532     441        0   10091        0        1
ATAN    0,1             7.078    2386        0    4691        1        2
2XM1    0,0.5          32.025       0        0   32025        0        1
YL2XP1  0,sqrt(2)-1    21.800     533        0   21267        0        1
YL2X    0.1,10          3.894    1879        0    2015        0        1


The test results above indicate that none of the 80x87 compatibles
exceeds Intel's stated error bound of 3 ULPs for the transcendental
functions. However, some coprocessors are more accurate than others.
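
To make the column definitions used in the tables concrete, here is a small Python sketch of how a single result could be sorted into the ULP_hi/ULPs_hi/ULP_lo/ULPs_lo classes. It is not the TRANCK program: it works on IEEE double precision (53-bit mantissa) instead of the 64-bit extended format, it uses exact rational arithmetic as the 'infinitely precise' reference, and the names ulp_error and classify are made up for this illustration. It needs Python 3.9 or later for math.ulp and math.nextafter.

```python
from fractions import Fraction
import math

def ulp_error(returned: float, exact: Fraction) -> int:
    """Signed error of `returned` in ULPs, measured against the
    correctly rounded value of `exact` (positive = result too large)."""
    correct = float(exact)                 # int/int division rounds correctly
    ulp = Fraction(math.ulp(correct))      # weight of the last mantissa bit
    # Exact difference in units of one ULP; truncation only matters
    # next to a power of two, where the ULP size changes.
    return int((Fraction(returned) - Fraction(correct)) / ulp)

def classify(returned: float, exact: Fraction) -> str:
    """Map a result to the column names used in the tables."""
    err = ulp_error(returned, exact)
    if err == 0:
        return "exact"
    if err == 1:
        return "ULP_hi"
    if err > 1:
        return "ULPs_hi"
    if err == -1:
        return "ULP_lo"
    return "ULPs_lo"

# Hypothetical test values: 1/3 and its neighbours in double precision
reference = Fraction(1, 3)
correct = float(reference)
one_up = math.nextafter(correct, 1.0)
two_down = math.nextafter(math.nextafter(correct, 0.0), 0.0)

print(classify(correct, reference))
print(classify(one_up, reference))
print(classify(two_down, reference))
```

Counting these classes over many random arguments, and keeping the largest absolute error seen, would give the %wrong and max ULP err columns.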
Rating the coprocessors according to the accuracy of their
transcendental functions gives the following list (highest accuracy
first): Cyrix 387+, Cyrix 83D87, Intel 486, Intel RapidCAD, Intel
80287(!), C&T 38700DX, Intel 387DX, Intel 80387, IIT 3C87, ULSI 83C87.

The tests also show that the problems with excessive inaccuracy of the
transcendental functions in early versions of the IIT coprocessors,
with errors of up to 8 ULPs [8], have been eliminated. According to
[56], certain problems with the FPATAN instruction on the IIT 3C87
occurring under the UNIX version of AutoCAD were corrected in June
1990.

The Franke387 has acceptable accuracy for the FSIN, FCOS, and FPATAN
instructions, considering that, according to its documentation,
Franke387 uses only 64 bits of precision for intermediate results,
while coprocessors typically use 68 bits or more. However, the larger
errors in the FPTAN, F2XM1, FYL2XP1, and especially the FYL2X
operations show that the emulator doesn't use state-of-the-art
algorithms, which ensure an error of only a very few ULPs even if no
extra-precise intermediate results are available.

Microsoft's coprocessor emulator provides transcendental functions
with rather good accuracy, except for the logarithmic operations,
which seem to contain some minor flaws.

Chips and Technologies has included a program SMDIAG on the diagnostic
disk V1.0 for the SuperMATH 38700DX to test the compatibility of the
computational results and flag settings returned by the coprocessor
with the Intel 387DX. However, the tests for the transcendental
functions seem to have been tweaked to let the C&T 38700DX pass, while
coprocessors like the Intel RapidCAD and the Cyrix 83D87 fail.
Also, SMDIAG shows failure in the FSCALE test for the Intel RapidCAD,
Cyrix 83D87, Cyrix 387+, and ULSI 83C87, although they return the
correct result according to Intel's documentation for the Intel 387DX
(Intel's second generation 387), which is indeed returned by the
387DX. SMDIAG expects the result returned by the original Intel 80387.


Results of running the SMDIAG program on 387 compatible coprocessors
(p = passed, f = failed)

                  Intel     Intel  Intel  Cyrix  Cyrix  IIT    ULSI   C&T
Test              RapidCAD  387DX  80387  387+   83D87  3C87   83C87  38700

 1 (fstore)       f         p      p      p      f      f      f      p     ##
 2 (fiall)        p         p      p      p      p      p      f      p
 3 (faddsub)      p         p      p      p      p      p      p      p
 4 (faddsub_nr)   p         p      p      p      f      f      f      p
 5 (faddsub_cp)   p         p      p      p      f      f      f      p
 6 (faddsub_dn)   p         p      p      p      f      f      f      p
 7 (faddsub_up)   p         p      p      p      f      f      f      p
 8 (fmul)         p         p      p      p      p      f      f      p
 9 (fdivn)        p         p      p      p      p      p      p      p
10 (fdiv)         p         p      p      p      p      p      f      p
11 (fxch)         p         p      p      p      p      p      p      p
12 (fyl2x)        p         p      p      f      f      f      f      p     ++
13 (fyl2xp1)      f         p      p      f      f      f      f      p     ++
14 (fsqrt)        p         p      p      p      p      p      p      p
15 (fsincos)      f         p      p      f      f      f      f      p     ++
16 (fptan)        p         p      p      f      p      f      f      p     ++
17 (fpatan)       p         p      p      f      f      f      f      p     ++
18 (f2xm1)        p         p      p      f      f      f      f      p     ++
19 (fscale)       f         f      p      f      f      f      f      p     **
20 (fcom1)        p         p      p      p      p      f      f      p
21 (fprem)        p         p      p      p      p      p      p      p
22 (misc1)        p         p      p      p      p      f      f      p
23 (misc3)        p         p      p      p      p      p      p      p
24 (misc4)        p         p      p      p      f      f      p      p

failed modules:   4         1      0      7      12     16     17     0

##  the failure of the Intel RapidCAD is caused by the fact that it
    stores the value of BCD INDEFINITE differently from the Intel
    387DX. It uses FFFFC000000000000000, while the 387DX uses
    FFFF8000000000000000. However, both encodings are valid according
    to Intel's documentation, which defines the BCD INDEFINITE as
    FFFFUUUUUUUUUUUUUUUU, where U is undefined. So failure of the
    RapidCAD to deliver the same answer as the 387DX is *not* an
    error, just a very slight incompatibility.

**  the FSCALE errors reported for the Intel 387DX, Intel RapidCAD,
    Cyrix 83D87, Cyrix 387+, and ULSI 83C87 are due to a single
    'wrong' result each returned by one of the FSCALE computations.
    SMDIAG expects the result returned by the first generation Intel
    80387 (and the C&T 38700DX). However, this result is wrong
    according to Intel's documentation and the behavior was corrected
    in the second generation Intel 387DX. Therefore, the Intel
    RapidCAD, Cyrix 83D87, Cyrix 387+, and ULSI 83C87 return the
    result compatible with the Intel 387DX.

++  The failures for the test of transcendental functions are caused
    by the tested coprocessor returning results that differ from the
    ones returned by the Intel 387DX. On the Cyrix 83D87, Cyrix 387+,
    and Intel RapidCAD, this is simply due to the improved accuracy
    these coprocessors provide over the Intel 387DX. The failures of
    the IIT 3C87 and ULSI 83C87 are mainly due to the lesser accuracy
    in the transcendental functions of these coprocessors, but for
    the IIT 3C87 an additional source of failures is its inability to
    handle extended precision denormals.


References

[1]  Schnurer, G.: Zahlenknacker im Vormarsch.
     c't 1992, Heft 4, Seiten 170-186
[2]  Curnow, H.J.; Wichmann, B.A.: A synthetic benchmark.
     Computer Journal, Vol. 19, No. 1, 1976, pp. 43-49
[3]  Wichmann, B.A.: Validation code for the Whetstone benchmark.
     NPL Report DITC 107/88, National Physical Laboratory, UK,
     March 1988
[4]  Curnow, H.J.: Whither Whetstone? The Synthetic Benchmark after
     15 Years. In: Aad van der Steen (ed.): Evaluating
     Supercomputers. London: Chapman and Hall 1990
[5]  Dongarra, J.J.: The Linpack Benchmark: An Explanation.
     In: Aad van der Steen (ed.): Evaluating Supercomputers.
     London: Chapman and Hall 1990
[6]  Dongarra, J.J.: Performance of Various Computers Using Standard
     Linear Equations Software. Report CS-89-85, Computer Science
     Department, University of Tennessee, March 11, 1992
[7]  Huth, N.: Dichtung und Wahrheit oder Datenblatt und Test.
     Design & Elektronik 1990, Heft 13, Seiten 105-110
[8]  Ungerer, B.: Sockelfolger.
     c't 1990, Heft 4, Seiten 162-163
[9]  Coonen, J.T.: Contributions to a Proposed Standard for Binary
     Floating-Point Arithmetic. Ph.D. thesis, University of
     California, Berkeley, 1984
[10] IEEE: IEEE Standard for Binary Floating-Point Arithmetic.
     SIGPLAN Notices, Vol. 22, No. 2, 1985, pp. 9-25
[11] IEEE Standard for Binary Floating-Point Arithmetic.
     ANSI/IEEE Std 754-1985. New York, NY: Institute of Electrical
     and Electronics Engineers 1985
[12] FasMath 83D87 Compatibility Report.
     Cyrix Corporation, Nov. 1989. Order No. B2004
[13] FasMath 83D87 Accuracy Report.
     Cyrix Corporation, July 1990. Order No. B2002
[14] FasMath 83D87 Benchmark Report.
     Cyrix Corporation, June 1990. Order No. B2004
[15] FasMath 83D87 User's Manual.
     Cyrix Corporation, June 1990. Order No. L2001-003
[16] Brent, R.P.: A FORTRAN multiple-precision arithmetic package.
     ACM Transactions on Mathematical Software, Vol. 4, No. 1,
     March 1978, pp. 57-70
[17] 387DX User's Manual, Programmer's Reference.
     Intel Corporation, 1989. Order No. 231917-002
[18] Volder, J.E.: The CORDIC Trigonometric Computing Technique.
     IRE Transactions on Electronic Computers, Vol. EC-8, No. 5,
     September 1959, pp. 330-334
[19] Walther, J.S.: A unified algorithm for elementary functions.
     AFIPS Conference Proceedings, Vol. 38, SJCC 1971, pp. 379-385
[20] Esser, R.; Kremer, F.; Schmidt, W.G.: Testrechnungen auf der
     IBM 3090E mit Vektoreinrichtung. Arbeitsbericht RRZK-8803,
     Regionales Rechenzentrum an der Universität zu Köln,
     Februar 1988
[21] McMahon, H.H.: The Livermore Fortran Kernels: A test of the
     numerical performance range. Technical Report UCRL-53745,
     Lawrence Livermore National Laboratory, USA, December 1986
[22] Nave, R.: Implementation of Transcendental Functions on a
     Numerics Processor. Microprocessing and Microprogramming,
     Vol. 11, No. 3-4, March-April 1983, pp. 221-225
[23] Yuen, A.K.: Intel's Floating-Point Processors.
     Electro/88 Conference Record, Boston, MA, USA, 10-12 May 1988,
     pp. 48/5-1 - 48/5-7
[24] Stiller, A.; Ungerer, B.: Ausgerechnet.
     c't 1990, Heft 1, Seiten 90-92
[25] Rosch, W.L.: Handfeste Hilfe oder Seifenblase?
     PC Professionell, Juni 1991, Seiten 214-237
[26] Intel 80286 Hardware Reference Manual.
     Intel Corporation, 1987. Order No. 210760-002
[27] AMD 80C287 80-bit CMOS Numeric Processor.
     Advanced Micro Devices, June 1989. Order No. 11671B/0
[28] Intel RapidCAD(tm) Engineering CoProcessor Performance Brief.
     Intel Corporation, 1992
[29] i486(tm) Microprocessor Performance Report.
     Intel Corporation, April 1990. Order No. 240734-001
[30] Intel486(tm) DX2 Microprocessor Performance Brief.
     Intel Corporation, March 1992. Order No. 241254-001
[31] Abacus 3167 Floating-Point Coprocessor Data Book.
     Weitek Corporation, July 1990. DOC No. 9030
[32] WTL 4167 Floating-Point Coprocessor Data Book.
     Weitek Corporation, July 1989. DOC No. 8943
[33] Abacus Software Designer's Guide.
     Weitek Corporation, September 1989. DOC No. 8967
[34] Stiller, A.: Cache & Carry.
     c't 1992, Heft 6, Seiten 118-130
[35] Stiller, A.: Cache & Carry, Teil 2.
     c't 1992, Heft 7, Seiten 28-34
[36] Palmer, J.F.; Morse, S.P.: Die mathematischen Grundlagen der
     Numerik-Prozessoren 8087/80287. München: tewi 1985
[37] 80C187 80-bit Math Coprocessor Data Sheet.
     Intel Corporation, September 1989. Order No. 270640-003
[38] IIT-2C87 80-bit Numeric Co-Processor Data Sheet.
     IIT, May 1990
[39] Engineering note 4x4 matrix multiply transformation.
     IIT, 1989
[40] Tscheuschner, E.: 4 mal 4 auf einen Streich.
     c't 1990, Heft 3, Seiten 266-276
[41] Goldberg, D.: Computer Arithmetic. In: Hennessy, J.L.;
     Patterson, D.A.: Computer Architecture A Quantitative Approach.
     San Mateo, CA: Morgan Kaufmann 1990
[42] 8087 Math Coprocessor Data Sheet.
     Intel Corporation, October 1989. Order No. 205835-007
[43] 8086/8088 User's Manual, Programmer's and Hardware Reference.
     Intel Corporation, 1989. Order No. 240487-001
[44] 80286 and 80287 Programmer's Reference Manual.
     Intel Corporation, 1987. Order No. 210498-005
[45] 80287XL/XLT CHMOS III Math Coprocessor Data Sheet.
     Intel Corporation, May 1990. Order No. 290376-001
[46] Cyrix FasMath(tm) 82S87 Coprocessor Data Sheet.
     Cyrix Corporation, 1991. Document 94018-00 Rev. 1.0
[47] IIT-3C87 80-bit Numeric Co-Processor Data Sheet.
     IIT, May 1990
[48] 486(tm)SX(tm) Microprocessor/ 487(tm)SX(tm) Math CoProcessor
     Data Sheet. Intel Corporation, April 1991.
     Order No. 240950-001
[49] Schnurer, G.: Die große Verlade.
     c't 1991, Heft 7, Seiten 55-57
[50] Schnurer, G.: Eine 4 für alle.
     c't 1991, Heft 6, Seite 25
[51] Intel486(tm)DX Microprocessor Data Book.
     Intel Corporation, June 1991. Order No. 240440-004
[52] i486(tm) Microprocessor Hardware Reference Manual.
     Intel Corporation, 1990. Order No. 240552-001
[53] i486(tm) Microprocessor Programmer's Reference Manual.
     Intel Corporation, 1990. Order No. 240486-001
[54] Ungerer, B.: Kalte Hüte.
     c't 1992, Heft 8, Seiten 140-144
[55] Ungerer, B.: Heiße Sache.
     c't 1991, Heft 4, Seiten 104-108
[56] Rosch, W.L.: Handfeste Hilfe oder Seifenblase?
     PC Professionell, Juni 1991, Seiten 214-237
[57] Niederkrüger, W.: Lebendige Vergangenheit.
     c't 1990, Heft 12, Seiten 114-116
[58] ULSI Math*Co Advanced Math Coprocessor Technical Specification.
     ULSI System, 5/92, Rev. E
[59] 387(tm)DX Math CoProcessor Data Sheet.
     Intel Corporation, September 1990. Order No. 240448-003
[60] 387(tm) Numerics Coprocessor Extension Data Sheet.
     Intel Corporation, February 1989. Order No. 231920-005
[61] Koren, I.; Zinaty, O.: Evaluating Elementary Functions in a
     Numerical Coprocessor Based on Rational Approximations.
     IEEE Transactions on Computers, Vol. C-39, No. 8, August 1990,
     pp. 1030-1037
[62] 387(tm) SX Math CoProcessor Data Sheet.
     Intel Corporation, November 1989. Order No. 240225-005
[63] Frenkel, G.: Coprocessors Speed Numeric Operations.
     PC-Week, August 27, 1990
[64] Schnurer, G.; Stiller, A.: Auto-Matt.
     c't 1991, Heft 10, Seiten 94-96
[65] Grehan, R.: FPU Face-Off. Byte, November 1990, pp.
     194-200
[66] Tang, P.T.P.: Testing Computer Arithmetic by Elementary Number
     Theory. Preprint MCS-P84-0889, Mathematics and Computer Science
     Division, Argonne National Laboratory, August 1989
[67] Ferguson, W.E.: Selecting math coprocessors.
     IEEE Spectrum, July 1991, pp. 38-41
[68] Schnabel, J.: Viermal 387.
     Computer Persönlich 1991, Heft 22, Seiten 153-156
[69] Hofmann, J.: Starke Rechenknechte.
     mc 1990, Heft 7, Seiten 64-67
[70] Woerrlein, H.; Hinnenberg, R.: Die Lust an der Power.
     Computer Live 1991, Heft 10, Seiten 138-149
[71] email from Peter Forsberg (peterf@vnet.ibm.com),
     email from Alan Brown (abrown@Reston.ICL.COM)
[72] email from Eric Johnson (johnsone%camax01@uunet.UU.NET),
     email from Jerry Whelan (guru@stasi.bradley.edu),
     email from Arto Viitanen (av@cs.uta.fi),
     email from Richard Krehbiel (richk@grebyn.com)
[73] email from Fred Dunlap (cyrix!volt!fred@texsun.Central.Sun.COM)
[74] correspondence with Bengt Ask (f89ba@efd.lth.se)
[75] email from Thomas Hoberg (tmh@prosun.first.gmd.de)
[76] Microsoft Macro Assembler Programmer's Guide Version 6.0,
     Microsoft Corporation, 1991. Document No. LN06556-0291


Manufacturers' addresses

Intel Corporation
3065 Bowers Avenue
Santa Clara, CA 95051
USA

IIT Integrated Information Technology, Inc.
2540 Mission College Blvd.
Santa Clara, CA 95054
USA

ULSI Systems, Inc.
58 Daggett Drive
San Jose, CA 95134
USA

Chips & Technologies, Inc.
3050 Zanker Road
San Jose, CA 95134
USA

Weitek Corporation
1060 East Arques Avenue
Sunnyvale, CA 94086
USA

AMD Advanced Micro Devices, Inc.
901 Thompson Place
P.O.B. 3453
Sunnyvale, CA 94088-3453
USA

Cyrix Corporation
P.O.B. 850118
Richardson, TX 75085
USA


Appendix A


{$N+,E+}

PROGRAM PCtrl;

VAR B,c: EXTENDED;
    Precision, L: WORD;

PROCEDURE SetPrecisionControl (Precision: WORD);
(* This procedure sets the internal precision of the NDP. Available *)
(* precision values:  0  -  24 bits (SINGLE)                        *)
(*                    1  -  n.a. (mapped to single)                 *)
(*                    2  -  53 bits (DOUBLE)                        *)
(*                    3  -  64 bits (EXTENDED)                      *)

VAR CtrlWord: WORD;

BEGIN {SetPrecisionControl}
   IF Precision = 1 THEN
      Precision := 0;
   Precision := Precision SHL 8; { make mask for PC field in ctrl word}
   ASM
      FSTCW   [CtrlWord]         { store NDP control word             }
      MOV     AX, [CtrlWord]     { load control word into CPU         }
      AND     AX, 0FCFFh         { mask out precision control field   }
      OR      AX, [Precision]    { set desired precision in PC field  }
      MOV     [CtrlWord], AX     { store new control word             }
      FLDCW   [CtrlWord]         { set new precision control in NDP   }
   END;
END; {SetPrecisionControl}

BEGIN {main}
   FOR Precision := 1 TO 3 DO BEGIN
      B := 1.2345678901234567890;
      SetPrecisionControl (Precision);
      FOR L := 1 TO 20 DO BEGIN
         B := Sqrt (B);
      END;
      FOR L := 1 TO 20 DO BEGIN
         B := B*B;
      END;
      SetPrecisionControl (3);   { full precision for printout }
      WriteLn (Precision, B:28);
   END;
END.

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

{$N+,E+}

PROGRAM RCtrl;

VAR B,c: EXTENDED;
    RoundingMode, L: WORD;

PROCEDURE SetRoundingMode (RCMode: WORD);
(* This procedure selects one of four available rounding modes      *)
(*    0  -  Round to nearest (default)                              *)
(*    1  -  Round down (towards negative infinity)                  *)
(*    2  -  Round up (towards positive infinity)                    *)
(*    3  -  Chop (truncate, round towards zero)                     *)

VAR CtrlWord: WORD;

BEGIN {SetRoundingMode}
   RCMode := RCMode SHL 10;      { make mask for RC field in control word}
   ASM
      FSTCW   [CtrlWord]         { store NDP control word             }
      MOV     AX, [CtrlWord]     { load control word into CPU         }
      AND     AX, 0F3FFh         { mask out rounding control field    }
      OR      AX, [RCMode]       { set desired rounding mode in RC field }
      MOV     [CtrlWord], AX     { store new control word             }
      FLDCW   [CtrlWord]         { set new rounding control in NDP    }
   END;
END; {SetRoundingMode}

BEGIN {main}
   FOR RoundingMode := 0 TO 3 DO BEGIN
      B := 1.2345678901234567890e100;
      SetRoundingMode (RoundingMode);
      FOR L := 1 TO 51 DO BEGIN
         B := Sqrt (B);
      END;
      FOR L := 1 TO 51 DO BEGIN
         B := -B*B;
      END;
      SetRoundingMode (0);       { round to nearest for printout }
      WriteLn (RoundingMode, B:28);
   END;
END.
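
The bit manipulation done by SetPrecisionControl and SetRoundingMode can be checked without a coprocessor. The following Python sketch only reproduces the mask arithmetic on the 16-bit control word (PC field in bits 8-9, RC field in bits 10-11, as implied by the masks 0FCFFh and 0F3FFh above); it does not touch any FPU state. The constant DEFAULT_CW = 037Fh is an assumption of this sketch, namely the control word value an 80x87 holds after FINIT.

```python
PC_MASK = 0xFCFF      # clears bits 8-9, the precision control (PC) field
RC_MASK = 0xF3FF      # clears bits 10-11, the rounding control (RC) field
DEFAULT_CW = 0x037F   # assumption: control word after FINIT

def set_precision_control(ctrl_word: int, precision: int) -> int:
    """Return ctrl_word with the PC field set (0=24, 2=53, 3=64 bits)."""
    if precision == 1:            # encoding 1 is not available,
        precision = 0             # map it to single precision
    return (ctrl_word & PC_MASK) | (precision << 8)

def set_rounding_mode(ctrl_word: int, rc_mode: int) -> int:
    """Return ctrl_word with the RC field set
    (0=nearest, 1=down, 2=up, 3=chop)."""
    return (ctrl_word & RC_MASK) | (rc_mode << 10)

print(hex(set_precision_control(DEFAULT_CW, 2)))   # 0x27f
print(hex(set_rounding_mode(DEFAULT_CW, 3)))       # 0xf7f
```

Both functions leave all other bits of the control word (exception masks, the other control field) unchanged, which is the reason the Pascal procedures read the current control word with FSTCW before modifying it.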
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

{$N+,E+}

PROGRAM DenormTs;

VAR E: EXTENDED;
    D: DOUBLE;
    S: SINGLE;

BEGIN
   WriteLn ('Testing support and printing of denormals');
   WriteLn;
   Write   ('Coprocessor is: ');
   CASE Test8087 OF
      0: WriteLn ('Emulator');
      1: WriteLn ('8087 or compatible');
      2: WriteLn ('80287 or compatible');
      3: WriteLn ('80387 or compatible');
   END;
   WriteLn;

   S := 1.18e-38;
   S := S * 3.90625e-3;
   IF S = 0 THEN
      WriteLn ('SINGLE denormals not supported')
   ELSE BEGIN
      WriteLn ('SINGLE denormals supported');
      WriteLn ('SINGLE denormal prints as:   ', S);
      WriteLn ('Denormal should be printed as 4.60943...E-0041');
   END;
   WriteLn;

   D := 2.24e-308;
   D := D * 3.90625e-3;
   IF D = 0 THEN
      WriteLn ('DOUBLE denormals not supported')
   ELSE BEGIN
      WriteLn ('DOUBLE denormals supported');
      WriteLn ('DOUBLE denormal prints as:   ', D);
      WriteLn ('Denormal should be printed as 8.75...E-0311');
   END;
   WriteLn;

   E := 3.37e-4932;
   E := E * 3.90625e-3;
   IF E = 0 THEN
      WriteLn ('EXTENDED denormals not supported')
   ELSE BEGIN
      WriteLn ('EXTENDED denormals supported');
      WriteLn ('EXTENDED denormal prints as: ', E);
      WriteLn ('Denormal should be printed as 1.3164...E-4934');
   END;
END.
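
The expected values printed by DenormTs can be cross-checked on any machine with IEEE arithmetic. The Python sketch below repeats the SINGLE and DOUBLE parts of the test; the EXTENDED part is left out because plain Python has no 80-bit type. The helper to_single is made up for this illustration and emulates single precision by a round trip through the 4-byte format.

```python
import struct
import sys

def to_single(x: float) -> float:
    """Round an IEEE double to IEEE single precision and back."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

SCALE = 3.90625e-3                 # exactly 2**-8, as in DenormTs
SINGLE_MIN_NORMAL = 2.0 ** -126    # smallest normalized SINGLE
DOUBLE_MIN_NORMAL = sys.float_info.min

# SINGLE: start just above the smallest normal, scale into denormals
s = to_single(to_single(1.18e-38) * SCALE)
# DOUBLE: same test one format up
d = 2.24e-308 * SCALE

for name, value, limit in (("SINGLE", s, SINGLE_MIN_NORMAL),
                           ("DOUBLE", d, DOUBLE_MIN_NORMAL)):
    if value == 0.0:
        print(name, "denormals not supported")
    else:
        print(name, "denormal:", value, "(below min normal:",
              value < limit, ")")
```

The SINGLE value comes out near 4.60943E-41 instead of 4.609375E-41 because the product has to be rounded to the coarse grid of single precision denormals, whose spacing is 2**-149.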
Appendix B


; FILE: APFELM4.ASM
; assemble with MASM /e APFELM4 or TASM /e APFELM4

CODE       SEGMENT BYTE PUBLIC 'CODE'
           ASSUME  CS: CODE

           PAGE    ,120

           PUBLIC  APPLE87;

APPLE87    PROC    NEAR
           PUSH    BP                   ; save caller's base pointer
           MOV     BP, SP               ; make new frame pointer
           PUSH    DS                   ; save caller's data segment
           PUSH    SI                   ; save register
           PUSH    DI                   ;  variables
           LDS     BX, [BP+04]          ; pointer to parameter record
           FINIT                        ; init 80x87          FSP->R0
           FILD    WORD PTR [BX+02]     ; maxrad              FSP->R7
           FLD     QWORD PTR [BX+08]    ; qmax                FSP->R6
           FSUB    QWORD PTR [BX+16]    ; qmax-qmin           FSP->R6
           DEC     WORD PTR [BX+04]     ; ymax-1
           FIDIV   WORD PTR [BX+04]     ; (qmax-qmin)/(ymax-1)FSP->R6
           FSTP    QWORD PTR [BX+16]    ; save delta_q        FSP->R7
           FLD     QWORD PTR [BX+24]    ; pmax                FSP->R6
           FSUB    QWORD PTR [BX+32]    ; pmax-pmin           FSP->R6
           DEC     WORD PTR [BX+06]     ; xmax-1
           FIDIV   WORD PTR [BX+06]     ; delta_p             FSP->R6
           MOV     AX, [BX]             ; save maxiter,[BX] needed for
           MOV     [BX+2], AX           ;  80x87 status now
           XOR     BP, BP               ; y=0
           FLD     QWORD PTR [BX+08]    ; qmax                FSP->R5
           CMP     WORD PTR [BX+40], 0  ; fast mode on 8087 desired ?
           JE      yloop                ; no, normal mode
           FSTCW   [BX]                 ; save NDP control word
           AND     WORD PTR [BX], 0FCFFh; set PCTRL = single precision
           FLDCW   [BX]                 ; get back NDP control word
yloop:     XOR     DI, DI               ; x=0
           FLD     QWORD PTR [BX+32]    ; pmin                FSP->R4
xloop:     FLDZ                         ; j**2= 0             FSP->R3
           FLDZ                         ; 2ij = 0             FSP->R2
           FLDZ                         ; i**2= 0             FSP->R1
           MOV     CX, [BX+2]           ; maxiter
           MOV     DL, 41h              ; mask for C0 and C3 cond.bits
iteration: FSUB    ST, ST(2)            ; i**2-j**2           FSP->R1
           FADD    ST, ST(3)            ; i**2-j**2+p = i     FSP->R1
           FLD     ST(0)                ; duplicate i         FSP->R0
           FMUL    ST(1), ST            ; i**2                FSP->R0
           FADD    ST, ST(0)            ; 2i                  FSP->R0
           FXCH    ST(2)                ; 2*i*j               FSP->R0
           FADD    ST, ST(5)            ; 2*i*j+q = j         FSP->R0
           FMUL    ST(2), ST            ; 2*i*j               FSP->R0
           FMUL    ST, ST(0)            ; j**2                FSP->R0
           FST     ST(3)                ; save j**2           FSP->R0
           FADD    ST, ST(1)            ; i**2+j**2           FSP->R0
           FCOMP   ST(7)                ; i**2+j**2 > maxrad? FSP->R1
           FSTSW   [BX]                 ; save 80x87 cond.codeFSP->R1
           TEST    BYTE PTR [BX+1], DL  ; test carry and zero flags
           LOOPNZ  iteration            ; until maxiter if not diverg.
           MOV     DX, CX               ; number of loops executed
           NEG     CX                   ; carry set if CX <> 0
           ADC     DX, 0                ; adjust DX if no. of loops<>0

; plot point here (DI = X, BP = y, DX has the color)

           FSTP    ST(0)                ; pop i**2            FSP->R2
           FSTP    ST(0)                ; pop 2ij             FSP->R3
           FSTP    ST(0)                ; pop j**2            FSP->R4
           FADD    ST,ST(2)             ; p=p+delta_p         FSP->R4
           INC     DI                   ; x:=x+1
           CMP     DI, [BX+6]           ; x > xmax ?
           JBE     xloop                ; no, continue on same line
           FSTP    ST(0)                ; pop p               FSP->R5
           FSUB    QWORD PTR [BX+16]    ; q=q-delta_q         FSP->R5
           INC     BP                   ; y:=y+1
           CMP     BP, [BX+4]           ; y > ymax ?
           JBE     yloop                ; no, picture not done yet
groesser:  POP     DI                   ; restore
           POP     SI                   ;  register variables
           POP     DS                   ; restore caller's data segm.
           POP     BP                   ; restore caller's base pointer
           RET     4                    ; pop parameters and return
APPLE87    ENDP
CODE       ENDS
           END

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

UNIT Time;

INTERFACE

FUNCTION Clock: LONGINT;        { same as VMS; time in milliseconds }

IMPLEMENTATION

FUNCTION Clock: LONGINT; ASSEMBLER;
ASM
            PUSH    DS           { save caller's data segment       }
            XOR     DX, DX       { initialize data segment to       }
            MOV     DS, DX       {  access ticker counter           }
            MOV     BX, 46Ch     { offset of ticker counter in segm.}
            MOV     DX, 43h      { timer chip control port          }
            MOV     AL, 4        { freeze timer 0                   }
            PUSHF                { save caller's int flag setting   }
            STI                  { allow update of ticker counter   }
            LES     DI, DS:[BX]  { read BIOS ticker counter         }
            OUT     DX, AL       { latch timer 0                    }
            LDS     SI, DS:[BX]  { read BIOS ticker counter         }
            IN      AL, 40h      { read latched timer 0 lo-byte     }
            MOV     AH, AL       { save lo-byte                     }
            IN      AL, 40h      { read latched timer 0 hi-byte     }
            POPF                 { restore caller's int flag        }
            XCHG    AL, AH       { correct order of hi and lo       }
            MOV     CX, ES       { ticker counter 1 in CX:DI:AX     }
            CMP     DI, SI       { ticker counter updated ?         }
            JE      @no_update   { no                               }
            OR      AX, AX       { update before timer freeze ?     }
            JNS     @no_update   { no                               }
            MOV     DI, SI       { use second                       }
            MOV     CX, DS       {  ticker counter                  }
@no_update: NOT     AX           { counter counts down              }
            MOV     BX, 36EDh    { load multiplier                  }
            MUL     BX           { W1 * M                           }
            MOV     SI, DX       { save W1 * M (hi)                 }
            MOV     AX, BX       { get M                            }
            MUL     DI           { W2 * M                           }
            XCHG    BX, AX       { AX = M, BX = W2 * M (lo)         }
            MOV     DI, DX       { DI = W2 * M (hi)                 }
            ADD     BX, SI       { accumulate                       }
            ADC     DI, 0        {  result                          }
            XOR     SI, SI       { load zero                        }
            MUL     CX           { W3 * M                           }
            ADD     AX, DI       { accumulate                       }
            ADC     DX, SI       {  result in DX:AX:BX              }
            MOV     DH, DL       { move result                      }
            MOV     DL, AH       {  from DL:AX:BX                   }
            MOV     AH, AL       {   to                             }
            MOV     AL, BH       {    DX:AX:BH                      }
            MOV     DI, DX       { save result                      }
            MOV     CX, AX       {  in DI:CX                        }
            MOV     AX, 25110    { calculate correction             }
            MUL     DX           {  factor                          }
            SUB     CX, DX       { subtract correction              }
            SBB     DI, SI       {  factor                          }
            XCHG    AX, CX       { result back                      }
            MOV     DX, DI       {  to DX:AX                        }
            POP     DS           { restore caller's data segment    }
END;

BEGIN
   Port [$43] := $34;    { need rate generator, not square wave }
   Port [$40] := 0;      { generator as prog. by some BIOSes    }
   Port [$40] := 0;      { for timer 0                          }
END. { Time }

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

{$A+,B-,R-,I-,V-,N+,E+}

PROGRAM PeakFlop;

USES Time;

TYPE ParamRec = RECORD
                   MaxIter, MaxRad, YMax, XMax: WORD;
                   Qmax, Qmin, Pmax, Pmin: DOUBLE;
                   FastMod: WORD;
                   PlotFkt: POINTER;
                   FLOPS:LONGINT;
                END;

VAR Param: ParamRec;
    Start: LONGINT;

{$L APFELM4.OBJ}

PROCEDURE Apple87 (VAR Param: ParamRec); EXTERNAL;

BEGIN
   WITH Param DO BEGIN
      MaxIter:= 50;
      MaxRad := 30;
      YMax   := 30;
      XMax   := 30;
      Pmin   :=-2.1;
      Pmax   := 1.1;
      Qmin   :=-1.2;
      Qmax   := 1.2;
      FastMod:= Word (FALSE);
      PlotFkt:= NIL;
      Flops  := 0;
   END;
   Start := Clock;
   Apple87 (Param);                  { executes 104002 FLOP }
   Start := Clock - Start;           { elapsed time in milliseconds }
   WriteLn ('Peak-MFLOPS: ', 104.002 / Start);
END.
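
The arithmetic behind Clock and the final WriteLn of PeakFlop can be written out in a few lines. The Python sketch below is not a translation of the assembly routine: it uses floating-point division where Clock uses a 16-bit fixed-point multiply (multiplier 36EDh plus a correction term), and it takes the two hardware readings as plain arguments. The function names and the constant names are made up for this illustration. The timer input frequency of 1,193,182 Hz is the standard PC value, and 65536 timer cycles per BIOS tick assumes timer 0 runs as a rate generator with a count of 0, as set up in the initialization part of the Time unit.

```python
PIT_HZ = 1_193_182          # input clock of PC timer chip, cycles/second
CYCLES_PER_TICK = 65_536    # timer 0 cycles per BIOS ticker increment

def clock_ms(bios_ticks: int, latched_count: int) -> float:
    """Combine BIOS ticker counter and latched timer 0 count into ms."""
    # Timer 0 counts down, so invert the 16-bit count (NOT AX in Clock)
    elapsed_in_tick = (CYCLES_PER_TICK - 1) - latched_count
    cycles = bios_ticks * CYCLES_PER_TICK + elapsed_in_tick
    return cycles * 1000.0 / PIT_HZ

def peak_mflops(flop_count: int, elapsed_ms: float) -> float:
    """FLOP per microsecond, i.e. millions of FLOP per second."""
    return flop_count / (elapsed_ms * 1000.0)

# One full BIOS tick is about 54.9 ms (the familiar 18.2 Hz rate)
print(clock_ms(1, 65535))

# Hypothetical run: the 104002 FLOP of Apple87 taking 10 ms
print(peak_mflops(104002, 10.0))
```

Dividing 104002 FLOP by a time in milliseconds and then by 1000 is the same as the expression 104.002 / Start used in PeakFlop.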
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ ; FILE: M4X4.ASM ; ; assemble with TASM /e M4X4 or MASM /e M4X4 CODE SEGMENT BYTE PUBLIC 'CODE' ASSUME CS:CODE PUBLIC MUL_4x4 PUBLIC IIT_MUL_4x4 FSBP0 EQU DB 0DBh, 0E8h ; declare special IIT FSBP1 EQU DB 0DBh, 0EBh ; instructions FSBP2 EQU DB 0DBh, 0EAh F4X4 EQU DB 0DBh, 0F1h ;--------------------------------------------------------------------- ; ; MUL_4x4 multiplicates a four-by-four matrix by an array of four ; dimensional vectors. This operation is needed for 3D transformations ; in graphics data processing. There are arrays for each component of ; a vector. Thus there is an ; array containing all the x components, ; another containing all the y components and so on. Each component is ; an 8 byte IEEE floating point number. Two indices into the array of ; vectors are given. The first is the index of the vector that will be ; processed first, the second is the index of the vector processed ; last. ; ;--------------------------------------------------------------------- MUL_4x4 PROC NEAR AddrX EQU DWORD PTR [BP+24] ; address of X component array AddrY EQU DWORD PTR [BP+20] ; address of Y component array AddrZ EQU DWORD PTR [BP+16] ; address of Z component array AddrW EQU DWORD PTR [BP+12] ; address of W component array AddrT EQU DWORD PTR [BP+8] ; addr. of 4x4 transform. mat. F EQU WORD PTR [BP+6] ; first vector to process K EQU WORD PTR [BP+4] ; last vector to process RetAddr EQU WORD PTR [BP+2] ; return address saved by call SavdBP EQU WORD PTR [BP+0] ; saved frame pointer SavdDS EQU WORD PTR [BP-2] ; caller's data segment PUSH BP ; save TURBO-Pascal frame pointer MOV BP, SP ; new frame pointer PUSH DS ; save TURBO-Pascal data segment MOV CX, K ; final index SUB CX, F ; final index - start index JNC $ok ; must not JMP $nothing ; be negative $ok: INC CX ; number of elements MOV SI, F ; init offset into arrays SHL SI, 1 ; each SHL SI, 1 ; element SHL SI, 1 ; has 8 bytes LDS DI, AddrT ; addr. 
of transformation mat. FLD QWORD PTR [DI] ; load a[0,0] = R7 FLD QWORD PTR [DI+8] ; load a[0,1] = R6 $mat_mul: LES BX, AddrX ; addr. of x component array FLD QWORD PTR ES:[BX+SI] ; load x[a] = R5 LES BX, AddrY ; addr. of y component array FLD QWORD PTR ES:[BX+SI] ; load y[a] = R4 LES BX, AddrZ ; addr. of z component array FLD QWORD PTR ES:[BX+SI] ; load z[a] = R3 LES BX, AddrW ; addr. of w component array FLD QWORD PTR ES:[BX+SI] ; load w[a] = R2 FLD ST(5) ; load a[0,0] = R1 FMUL ST, ST(4) ; a[0,0] * x[a] = R1 FLD ST(5) ; load a[0,1] = R0 FMUL ST, ST(4) ; a[0,1] * y[a] = R0 FADDP ST(1), ST ; a[0,0]*x[a]+a[0,1]*y[a]=R1 FLD QWORD PTR [DI+16] ; load a[0,2] = R0 FMUL ST, ST(3) ; a[0,2] * z[a] = R0 FADDP ST(1), ST ; a[0,0]*x[a]...a[0,2]*z[a]=R1 FLD QWORD PTR [DI+24] ; load a[0,3] = R0 FMUL ST, ST(2) ; a[0,3] * w[a] = R0 FADDP ST(1), ST ; a[0,0]*x[a]...a[0,3]*w[a]=R1 LES BX, AddrX ; get address of x vector FSTP QWORD PTR ES:[BX+SI] ; write new x[a] FLD QWORD PTR [DI+32] ; load a[1,0] = R1 FMUL ST, ST(4) ; a[1,0] * x[a] = R1 FLD QWORD PTR [DI+40] ; load a[1,1] = R0 FMUL ST, ST(4) ; a[1,1] * y[a] = R0 FADDP ST(1), ST ; a[1,0]*x[a]+a[1,1]*y[a]=R1 FLD QWORD PTR [DI+48] ; load a[1,2] = R0 FMUL ST, ST(3) ; a[1,2] * z[a] = R0 FADDP ST(1), ST ; a[1,0]*x[a]...a[1,2]*z[a]=R1 FLD QWORD PTR [DI+56] ; load a[1,3] = R0 FMUL ST, ST(2) ; a[1,3] * w[a] = R0 FADDP ST(1), ST ; a[1,0]*x[a]...a[1,3]*w[a]=R1 LES BX, AddrY ; get address of y vector FSTP QWORD PTR ES:[BX+SI] ; write new y[a] FLD QWORD PTR [DI+64] ; load a[2,0] = R1 FMUL ST, ST(4) ; a[2,0] * x[a] = R1 FLD QWORD PTR [DI+72] ; load a[2,1] = R0 FMUL ST, ST(4) ; a[2,1] * y[a] = R0 FADDP ST(1), ST ; a[2,0]*x[a]+a[2,1]*y[a]=R1 FLD QWORD PTR [DI+80] ; load a[2,2] = R0 FMUL ST, ST(3) ; a[2,2] * z[a] = R0 FADDP ST(1), ST ; a[2,0]*x[a]...a[2,2]*z[a]=R1 FLD QWORD PTR [DI+88] ; load a[2,3] = R0 FMUL ST, ST(2) ; a[2,3] * w[a] = R0 FADDP ST(1), ST ; a[2,0]*x[a]...a[2,3]*w[a]=R1 LES BX, AddrZ ; get address of z vector FSTP QWORD PTR ES:[BX+SI] 
; write new z[a] FLD QWORD PTR [DI+96] ; load a[3,0] = R1 FMULP ST(4), ST ; a[3,0] * x[a] = R5 FLD QWORD PTR [DI+104] ; load a[3,1] = R1 FMULP ST(3), ST ; a[3,1] * y[a] = R4 FLD QWORD PTR [DI+112] ; load a[3,2] = R1 FMULP ST(2), ST ; a[3,2] * z[a] = R3 FLD QWORD PTR [DI+120] ; load a[3,3] = R1 FMULP ST(1), ST ; a[3,3] * w[a] = R2 FADDP ST(1), ST ; a[3,3]*w[a]+a[3,2]*z[a]=R3 FADDP ST(1), ST ; a[3,3]*w[a]...a[3,1]*y[a]=R4 FADDP ST(1), ST ; a[3,3]*w[a]...a[3,0]*x[a]=R5 LES BX, AddrW ; get address of w vector FSTP QWORD PTR ES:[BX+SI] ; write new w[a] ADD SI, 8 ; new offset into arrays DEC CX ; decrement element counter JZ $done ; no elements left, done JMP $mat_mul ; transform next vector $done: FSTP ST(0) ; clear FSTP ST(0) ; FPU stack $nothing: POP DS ; restore TP data segment POP BP ; restore TP frame pointer RET 24 ; pop parameters and return MUL_4X4 ENDP ;--------------------------------------------------------------------- ; ; IIT_MUL_4x4 multiplicates a four-by-four matrix by an array of four ; dimensional vectors. This operation is needed for 3D transformations ; in graphics data processing. There are arrays for each component of ; a vector. Thus there is an array containing all the x components, ; another containing all the y components and so on. Each component is ; an 8 byte IEEE floating point number. Two indices into the array of ; vectors are given. The first is the index of the vector that will be ; processed first, the second is the index of the vector processed ; last. This subroutine uses the special instructions only available ; on IIT coprocessors to provide fast matrix multiply capabilities. ; So make sure to use it only on IIT coprocessors. 
;
;---------------------------------------------------------------------

IIT_MUL_4x4 PROC  NEAR

AddrX     EQU     DWORD PTR [BP+24]       ; address of X component array
AddrY     EQU     DWORD PTR [BP+20]       ; address of Y component array
AddrZ     EQU     DWORD PTR [BP+16]       ; address of Z component array
AddrW     EQU     DWORD PTR [BP+12]       ; address of W component array
AddrT     EQU     DWORD PTR [BP+8]        ; addr. of 4x4 transf. matrix
F         EQU     WORD PTR [BP+6]         ; first vector to process
K         EQU     WORD PTR [BP+4]         ; last vector to process
RetAddr   EQU     WORD PTR [BP+2]         ; return address saved by call
SavdBP    EQU     WORD PTR [BP+0]         ; saved frame pointer
SavdDS    EQU     WORD PTR [BP-2]         ; caller's data segment
Ctrl87    EQU     WORD PTR [BP-4]         ; caller's 80x87 control word

          PUSH    BP                      ; save TURBO-Pascal frame ptr
          MOV     BP, SP                  ; new frame pointer
          PUSH    DS                      ; save TURBO-Pascal data seg.
          SUB     SP, 2                   ; make local variable
          FSTCW   [Ctrl87]                ; save 80x87 ctrl word
          LES     SI, AddrT               ; ptr to transformation matrix
          FINIT                           ; initialize coprocessor
          FSBP2                           ; set register bank 2
          FLD     QWORD PTR ES:[SI]       ; load a[0,0]
          FLD     QWORD PTR ES:[SI+32]    ; load a[1,0]
          FLD     QWORD PTR ES:[SI+64]    ; load a[2,0]
          FLD     QWORD PTR ES:[SI+96]    ; load a[3,0]
          FLD     QWORD PTR ES:[SI+8]     ; load a[0,1]
          FLD     QWORD PTR ES:[SI+40]    ; load a[1,1]
          FLD     QWORD PTR ES:[SI+72]    ; load a[2,1]
          FLD     QWORD PTR ES:[SI+104]   ; load a[3,1]
          FINIT                           ; initialize coprocessor
          FSBP1                           ; set register bank 1
          FLD     QWORD PTR ES:[SI+16]    ; load a[0,2]
          FLD     QWORD PTR ES:[SI+48]    ; load a[1,2]
          FLD     QWORD PTR ES:[SI+80]    ; load a[2,2]
          FLD     QWORD PTR ES:[SI+112]   ; load a[3,2]
          FLD     QWORD PTR ES:[SI+24]    ; load a[0,3]
          FLD     QWORD PTR ES:[SI+56]    ; load a[1,3]
          FLD     QWORD PTR ES:[SI+88]    ; load a[2,3]
          FLD     QWORD PTR ES:[SI+120]   ; load a[3,3]
                                          ; transformation matrix loaded
          MOV     AX, F                   ; index of first vector
          MOV     DX, K                   ; index of last vector
          MOV     BX, AX                  ; index 1st vector to process
          MOV     CL, 3                   ; component has 8 (2**3) bytes
          SHL     BX, CL                  ; compute offset into arrays
          FINIT                           ; initialize coprocessor
          FSBP0                           ; set register bank 0
$mat_loop:LES     SI, AddrW               ; addr. of W component array
          FLD     QWORD PTR ES:[SI+BX]    ; W component current vector
          LES     SI, AddrZ               ; addr. of Z component array
          FLD     QWORD PTR ES:[SI+BX]    ; Z component current vector
          LES     SI, AddrY               ; addr. of Y component array
          FLD     QWORD PTR ES:[SI+BX]    ; Y component current vector
          LES     SI, AddrX               ; addr. of X component array
          FLD     QWORD PTR ES:[SI+BX]    ; X component current vector
          F4X4                            ; mul 4x4 matrix by 4x1 vector
          INC     AX                      ; next vector
          MOV     DI, AX                  ; next vector
          SHL     DI, CL                  ; offset of vector into arrays
          FSTP    QWORD PTR ES:[SI+BX]    ; store X comp. of curr. vect.
          LES     SI, AddrY               ; address of Y component array
          FSTP    QWORD PTR ES:[SI+BX]    ; store Y comp. of curr. vect.
          LES     SI, AddrZ               ; address of Z component array
          FSTP    QWORD PTR ES:[SI+BX]    ; store Z comp. of curr. vect.
          LES     SI, AddrW               ; address of W component array
          FSTP    QWORD PTR ES:[SI+BX]    ; store W comp. of curr. vect.
          MOV     BX, DI                  ; ofs nxt vect. in comp. arrays
          CMP     AX, DX                  ; nxt vector past upper bound?
          JLE     $mat_loop               ; no, transform next vector
          FLDCW   [Ctrl87]                ; restore orig 80x87 ctrl word
          ADD     SP, 2                   ; get rid of local variable
          POP     DS                      ; restore TP data segment
          POP     BP                      ; restore TP frame pointer
          RET     24                      ; pop parameters and return
IIT_MUL_4x4 ENDP

CODE      ENDS

          END

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

{$N+,E+}

PROGRAM Trnsform;

USES Time;

CONST VectorLen = 8190;

TYPE  Vector    = ARRAY [0..VectorLen] OF DOUBLE;
      VectorPtr = ^Vector;
      Mat4      = ARRAY [1..4, 1..4] OF DOUBLE;

VAR   X:      VectorPtr;
      Y:      VectorPtr;
      Z:      VectorPtr;
      W:      VectorPtr;
      T:      Mat4;
      K:      INTEGER;
      L:      INTEGER;
      First:  INTEGER;
      Last:   INTEGER;
      Start:  LONGINT;
      Elapsed:LONGINT;

PROCEDURE MUL_4X4     (X, Y, Z, W: VectorPtr;
                       VAR T: Mat4;
                       First, Last: INTEGER); EXTERNAL;

PROCEDURE IIT_MUL_4X4 (X, Y, Z, W: VectorPtr;
                       VAR T: Mat4;
                       First, Last: INTEGER); EXTERNAL;

{$L M4X4.OBJ}

BEGIN
   WriteLn ('Test8087 = ', Test8087);
   New (X);
   New (Y);
   New (Z);
   New (W);
   FOR L := 1 TO VectorLen DO BEGIN
      X^ [L] := Random;
      Y^ [L] := Random;
      Z^ [L] := Random;
      W^ [L] := Random;
   END;
   X^ [0] := 1;
   Y^ [0] := 1;
   Z^ [0] := 1;
   W^ [0] := 1;
   FOR K := 1 TO 4 DO BEGIN
      FOR L := 1 TO 4 DO BEGIN
         T [K, L] := (K-1)*4 + L;
      END;
   END;
   First := 0;
   Last  := 8190;
   Start := Clock;
   MUL_4X4 (X, Y, Z, W, T, First, Last);
   { IIT_MUL_4X4 (X, Y, Z, W, T, First, Last); }
   Elapsed := Clock - Start;
   WriteLn ('Number of vectors: ', Last-First+1);
   WriteLn ('Time: ', Elapsed, ' ms');
   WriteLn ('Equivalent to ', (28.0*(Last-First+1)/1e6)/
            (Elapsed*1e-3):0:4, ' MFLOPS');
   WriteLn;
   WriteLn ('Last vector:');
   WriteLn;
   WriteLn (X^[Last]);
   WriteLn (Y^[Last]);
   WriteLn (Z^[Last]);
   WriteLn (W^[Last]);
END.
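
To check the output of either assembler routine, a plain Pascal version of the same transform can help. The following is only a minimal sketch and not part of the original sources: the names RefCheck, Vec4 and RefMul4x4 are made up for illustration, and it transforms a single vector instead of walking the component arrays. It fills T the same way as Trnsform does and applies it to the vector (1,1,1,1), which is what Trnsform stores at index 0.

```pascal
{$N+,E+}
PROGRAM RefCheck;   { hypothetical name, not part of the original post }

TYPE  Mat4 = ARRAY [1..4, 1..4] OF DOUBLE;
      Vec4 = ARRAY [1..4] OF DOUBLE;        { x, y, z, w of one vector }

VAR   T:    Mat4;
      V, R: Vec4;
      K, L: INTEGER;

{ R := T * V. The result goes to a separate vector so that every row }
{ is computed from the original x, y, z, w, as in the assembler code }
PROCEDURE RefMul4x4 (VAR T: Mat4; VAR V, R: Vec4);
VAR   Row, Col: INTEGER;
      Sum:      DOUBLE;
BEGIN
   FOR Row := 1 TO 4 DO BEGIN
      Sum := 0.0;
      FOR Col := 1 TO 4 DO
         Sum := Sum + T [Row, Col] * V [Col];
      R [Row] := Sum;
   END;
END;

BEGIN
   FOR K := 1 TO 4 DO
      FOR L := 1 TO 4 DO
         T [K, L] := (K-1)*4 + L;           { same matrix as Trnsform }
   FOR K := 1 TO 4 DO
      V [K] := 1;                           { same as vector 0 in Trnsform }
   RefMul4x4 (T, V, R);
   FOR K := 1 TO 4 DO
      WriteLn (R [K]:0:1);                  { prints 10.0 26.0 42.0 58.0 }
END.
```

With this matrix each result is simply the sum of one matrix row, so vector 0 should come out as 10, 26, 42 and 58 from MUL_4X4 and IIT_MUL_4X4 as well. For the random vectors the last bits may differ between the three versions, since the order of the additions and the precision of intermediate results are not the same. Each vector needs 4 rows of 4 multiplications and 3 additions, which is where the factor 28.0 in the MFLOPS computation comes from.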