Turbo-Pascal 6.0 Runtime Library Update - Release 1.9       02-16-1993

This library is a complete replacement for the runtime library that came
with your Turbo Pascal 6.0 compiler. Due to lots of optimizations,
programs compiled with this version of TURBO.TPL will be faster. This
library maintains 99.9% compatibility with the original library.
Differences are usually due to enhancements and should not cause any
compatibility problems. Some bugs from the original library supplied by
Borland have been eliminated, but there can be no guarantee that new ones
have not crept in. If you discover any bugs, or have other comments,
please let me know. My email and snail mail addresses are given below.

Due to the nature of Borland's licensing of the TPL source code I am not
allowed to distribute the source code of my enhanced library, so I can
only provide the binary. What I am including, starting with version 1.8,
is the source for the LONGINT and the REAL arithmetic routines. Also
included for the first time in this version is the source of most of the
string routines. Since this code does not contain a single line of code
written by Borland, I think they can't object to the fact that I am
making *my* code public. The source for the arithmetic routines is
contained in the file ARISOURC.ZIP. The source code of the string
routines is contained in the file STRSOURC.ZIP.

The code of the arithmetic and string routines is hereby released into
the public domain. You may use it in your own programs under the
condition that you do not include it in a commercial product. Parties
interested in commercial use of my code should contact me at my address
below.

THIS VERSION OF THE LIBRARY REPLACEMENT FOR TP 6.0 IS DEFINITELY THE
FINAL VERSION. NO FURTHER UPDATES WILL BE MADE, AS THE NEW 7.0 VERSION OF
TURBO PASCAL (AND BORLAND PASCAL) WAS INTRODUCED IN NOVEMBER 1992 AND IS
NOW AVAILABLE IN LOCALIZED VERSIONS EVERYWHERE.
I AM PREPARING A SIMILAR LIBRARY FILE CALLED TPL70N10.ZIP FOR USE WITH
TP/BP 7.0.

Original library code is Copyright (C) 1983,91 Borland International

New / additional library code is Copyright (C) 1988-1993
Norbert Juffa, Wielandtstr. 14, 7500 Karlsruhe 1, Germany
Internet: S_JUFFA@IRAVCL.IRA.UKA.DE


Contents of this document:

   I.   Capabilities of RTL replacement
   II.  Revision History
   III. References


I. Capabilities of RTL replacement
==================================

Improvements in SYSTEM module
-----------------------------

o REAL type software arithmetic operations now comply with ANSI/IEEE
  Standard 754-1985 for Binary Floating Point Arithmetic [1,2] as much
  as possible. Note that REAL arithmetic by design differs from the
  standard in many ways, especially in the available numeric formats,
  the value set, and the available operations. The rounding mode
  implemented here is "round to nearest or even" as specified by the
  standard. Add, Subtract, Multiply, Squaring, Division, and Square Root
  deliver correctly rounded results with regard to this rounding mode,
  as demanded by the standard. Conversions from REAL to LONGINT and from
  EXTENDED to REAL use rounding to nearest or even, as specified in the
  standard. Correct implementation of the above features was tested with
  the PARANOIA test program [3]. The correctness of basic REAL
  arithmetic functions has also been tested against the
  coprocessor/emulator EXTENDED format with the program FUN1_TST. The
  EXTENDED format carries approximately 19 decimal digits of precision.

o REAL arithmetic operations have been sped up. Speed-up for SQRT varies
  between a factor of 12 for an 8086 and 29 for a Cyrix 486DLC. FRAC now
  executes at nearly three times the original speed. Speed-up for SIN,
  COS, ARCTAN, LN, EXP is between 50% and 100%. Division is now between
  60% and 300% faster than before, depending on the CPU.
  Overall numeric processing power using REAL arithmetic increases by
  about 51% for an 8086, by 62% for an Intel 386DX, and by 80% for a
  Cyrix 486DLC, as measured by the WHETSTONE benchmark [4,5].

o Overall accuracy of REAL arithmetic transcendental functions has been
  improved, as indicated by Cody & Waite's ELEFUNT tests [6]: DLOG,
  DEXP, DATAN, DSIN. Correct argument reduction ensures that the
  relative error over the whole argument range does not exceed 1.9e-12
  for Exp, 2.8e-12 for Arctan, and 2.7e-12 for Ln. These values have
  been determined by comparing the results of the REAL transcendental
  functions to the values computed on a Cyrix 83D87 coprocessor for the
  EXTENDED format. For Sin and Cos, the relative error is also in the
  above range when the argument is reasonably small (e.g. in the range
  -100..100) and not very close to an integer multiple of 0.25*Pi. The
  error of the transcendental functions expressed in ULPs (units in the
  last place) over the whole argument range does not exceed 1.6 ULPs for
  Exp, 1.8 ULPs for Arctan, and 2.2 ULPs for Ln. These values were
  determined using the ULPERR program.

o Execution of coprocessor floating point computations using an 80287 or
  80387 has been accelerated. For these coprocessors, NOPs instead of
  WAITs are inserted before every floating point instruction converted
  from an emulator interrupt. As a result of this optimization, an
  improvement in execution speed of 15% has been observed running the
  Lawrence Livermore Loops (LLL) [7] on a Cyrix 83D87; the improvement
  for the WHETSTONE benchmark on the 83D87 is 9.4%. The maximum
  performance gain from this measure for tight loops (e.g. fractal
  computation) is about 22%.

o On 80287XL, 80387, 80486DX or compatible chips the Sin and Cos
  functions take advantage of the FSIN and FCOS instructions of these
  coprocessors, speeding up these functions by almost a factor of two.
  As a side effect, there is also some improvement in accuracy as
  measured by the DSIN test program from the ELEFUNT test suite. Also,
  the Arctan function takes advantage of the increased argument range of
  the FPATAN instruction. These optimizations result in another 19%
  increase in WHETSTONE power, so that the total combined speedup over
  the original library is 25% for this benchmark when run on a 387
  compatible coprocessor.

o STRING operations are faster, especially for longer strings. The most
  dramatic increase is in the INSERT procedure, with execution times
  reduced to as little as one fourth of those of the original version of
  the RTL. Faster string operations cause a 7% performance increase for
  the DHRYSTONE [8,9] benchmark on an 8086.

o Improved speed of random number generation. Random for REAL numbers is
  10-20% faster, Random for EXTENDED numbers is 5% faster. Due to the
  improvements in the uniform distribution of integer random numbers,
  the speed of integer random number generation decreases by about 5%.

o Binary to decimal conversions used in the Str and Write procedures
  have been sped up by up to 70% for integers (BYTE, SHORTINT, INTEGER,
  WORD, LONGINT), up to 5% for REAL numbers and about 3% for EXTENDED
  numbers.

o Improved speed of LONGINT arithmetic. Division enjoys a fourfold
  reduction of execution time on an 8086; for an Intel RapidCAD CPU the
  speed-up factor is 5.2 and for a Cyrix 486DLC the speed-up factor is
  10.1.

o Several of the functions of the heap manager have been tuned,
  resulting in 6%-18% faster operation for these routines, depending on
  the CPU used.

o Set operations have been sped up by a few percent, and the operation
  that adds a variable range to a set may be up to eight times as fast.

o The UPCASE function has been enhanced to support the complete IBM
  character set. This means that the characters ä,ü,ö,å,æ,é,ñ,ç are
  converted to upper case by this function.

o Several bugs of the original RTL supplied by Borland have been fixed:

  Using REAL arithmetic in $N- mode, Trunc and Round could not produce
  the smallest legal LONGINT number, -2147483648. Arguments that should
  result in this number caused run-time error 207 instead. Trunc/Round
  now return the correct result of -2147483648. Correct implementation
  can be checked using the ROUNDTST program.

  The Random function could return 1.0 when compiled in the $N+ state,
  although the specifications call for a return value 0 <= Random < 1.
  This has been corrected. Return values from Random are now strictly
  smaller than 1.

  The integer Random function would return unevenly distributed random
  numbers if the upper limit passed to the routine was not a power of
  two. The new function in TPL60N19.ZIP should return uniformly
  distributed random numbers for all arguments of the integer Random
  function.

  GetDir now correctly raises run-time error 15 (invalid drive) when
  called with a nonexistent drive. Differing from the original, it also
  signals all errors reported by DOS as run-time errors. For example,
  when applied to a floppy drive that does not contain a floppy, it now
  raises run-time error 152 (drive not ready), where previously it would
  incorrectly signal successful completion of the operation
  (InOutRes = 0).

  The LONGINT Read and Val routines now accept the smallest LONGINT
  number, -2147483648, as decimal input.

  For programs compiled with $N+, only true INFs are printed as INF,
  whereas with the original library some NaNs are also printed as INF.
  Correct operation can be tested with the INFBUG program.

  Addition/subtraction in REAL arithmetic was sometimes unnecessarily
  inaccurate due to incorrect handling of discarded digits of the
  operand with the smaller absolute value. This has been fixed with the
  introduction of a completely new add/subtract routine.

  Multiplication of REAL numbers by small integers was inaccurate,
  causing, among other problems, unnecessary inaccuracies in
  binary <-> decimal conversion. This has been eliminated with the new
  REAL arithmetic modules.

  The REAL arithmetic EXP function no longer signals overflow when
  called with small arguments, but underflows to zero instead, as it
  should.

  Denormals in EXTENDED computations no longer cause an invalid state on
  the 8087 coprocessor when being converted to true zeros. Consistency
  between register contents and tag bits is now ensured. Removal of this
  bug can be tested with the BUG87 program.

  Denormals in EXTENDED format are now correctly converted to decimal
  strings by the Str and Write routines. The original routines printed
  EXTENDED precision denormals as zero. Note that TP 6.0 supports
  EXTENDED denormals only if your machine has an 80287XL, 80387, 80486
  or equivalent. On the 8087 and Intel's original 80287 coprocessor,
  denormals are only supported for the SINGLE and DOUBLE formats.

  The stack checking routine for programs compiled in the $S+ state now
  reliably detects all stack overflows. This bug in TP 6.0 has also been
  fixed in the second release of TP 6.0 by Borland, but Borland's code
  is slower than the one used here.

  The program initialization routine now tries to prevent programs
  compiled with the $G+ (286 code generation) switch from being run on
  8086 and 8088 CPUs. The checks done are not 100% safe, but catch most
  of these cases, displaying the message "CPU > 8086 required" and
  aborting the program with a return code of 254 ($FE) instead of
  letting it crash. Note that this check lets programs compiled with $G+
  run on 80186 and V20/V30 processors, since these have the ability to
  execute all 80286 real mode instructions produced by Turbo Pascal.

o Improved functionality only marginally increases the overall code
  length of the SYSTEM unit, by 1244 bytes (about 6%). This is due to
  careful optimizing in numerous routines.
  Most programs compiled with the new RTL will be smaller due to the
  finer granularity of the RTL modules. Savings are usually in the
  0.5 KB range for reasonably large programs.


Improvements in CRT module
--------------------------

o Bug fix in routine DirectWrite. The method used to prevent "snow" when
  writing directly to a CGA graphics card was not entirely safe. When
  used in a heavily interrupted program (e.g. serial communication as a
  background task), it would not always write during the time when
  scanning was in the invisible parts of the screen. The method used now
  is 100% safe and is even faster, since it takes advantage of the
  horizontal and vertical retrace periods, as opposed to the old method,
  which only used the horizontal retrace time. The new routine has been
  tested successfully on an original IBM CGA card.


II. Revision History
====================

Changes since version 1.8, dated 12-04-92
-----------------------------------------

o While previous versions of the replacement library had been optimized
  for speed and size (in that order), version 1.9 has been tuned for
  speed exclusively. All entries to library routines are now aligned on
  double word boundaries, as are some timing-critical loops. This
  optimization is aimed specifically at 386 and 486 processors. The
  increased speed of this library version was achieved by putting in
  additional code, so programs compiled with version 1.9 of the library
  tend to be somewhat bigger than those compiled with version 1.8.
  However, the increase in size is a few hundred bytes at most.

o Speed of REAL arithmetic transcendental functions has been increased
  by about 20%.

o Speed of heap memory allocation/deallocation has been increased by
  about 5%.


Changes since version 1.7, dated 10-12-92
-----------------------------------------

o Fixed bug in LONGINT division. If the dividend was -2147483648 and the
  divisor 65536, the division routine would incorrectly return 32768
  instead of -32768.

o Fixed bug in Str -> LONGINT conversion introduced in version 1.6.
  Because of this bug, the valid input $80000000 would cause a run-time
  error.

o Speed of the LONGINT -> Str conversion has been improved by a factor
  of up to 2, depending on the CPU.

o Speed of LONGINT shift routines (SHL, SHR) has been improved by a
  factor of up to 3, depending on the CPU.

o Speed of LONGINT division has been improved for large divisors.

o The integer Random function has been enhanced to deliver more evenly
  distributed numbers. The algorithm used has been taken from BP 7.


Changes since version 1.6, dated 08-31-92
-----------------------------------------

o The Round function for REAL arithmetic would sometimes not return the
  correct IEEE-rounded result due to wrong handling of the sticky flag.
  This has been fixed.

o Size of the RTL has been reduced a bit.


Changes since version 1.5, dated 06-10-92
-----------------------------------------

o The Test8087 variable would sometimes contain the wrong value when the
  environment contained the string 87=Y. This has been fixed.

o Speed and accuracy of the REAL arithmetic Sin and Cos functions have
  been enhanced. The previously introduced limits for arguments to these
  functions have been removed. Sin and Cos now accept all REAL numbers
  as arguments without producing a run-time error, just like the
  routines in the original TP 6.0 RTL. Due to the nature of the argument
  reduction scheme now used, the accuracy of the Sin and Cos functions
  takes a rather sharp drop if the arguments exceed an absolute value of
  about 1e8. Outside this range, accuracy is about the same as for the
  Sin and Cos from the original library. There is a gradual loss of
  precision towards bigger arguments. For arguments with an absolute
  value of more than about 5e12, there is a total loss of precision. Sin
  always returns 0, Cos always returns 1 for arguments exceeding this
  limit.

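
The total loss of precision for very large arguments can be made
plausible with a little arithmetic. The following sketch is not taken
from the library. It assumes that the 6-byte REAL type carries 40
significant bits (39 stored mantissa bits plus an implicit leading one)
and computes the spacing between adjacent REAL values near a given
magnitude. Once that spacing exceeds 2*Pi, neighbouring REAL numbers lie
more than one full period apart, which is at least consistent with the
limit of about 5e12 quoted above. The name real_spacing is hypothetical,
and Python is used only because the sketch is easy to run there.

```python
import math

# Assumption: 39 stored mantissa bits plus one implicit leading bit.
REAL_SIGNIFICANT_BITS = 40


def real_spacing(x: float) -> float:
    """Gap between adjacent values of a 40-bit binary format near x."""
    if x == 0.0:
        raise ValueError("spacing is not defined for zero in this sketch")
    # frexp returns (m, e) with abs(x) == m * 2**e and 0.5 <= m < 1.
    _, exponent = math.frexp(abs(x))
    return math.ldexp(1.0, exponent - REAL_SIGNIFICANT_BITS)


if __name__ == "__main__":
    for x in (1.0, 100.0, 1e8, 5e12):
        gap = real_spacing(x)
        periods = gap / (2.0 * math.pi)
        print(f"|x| = {x:8.1e}  spacing = {gap:.3e}  "
              f"spacing / (2*Pi) = {periods:.3e}")
```

Under the stated assumption the spacing near 5e12 is 8, which is larger
than 2*Pi, while near 1e8 it is still only about 1.2e-4.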

Changes since version 1.4, dated 05-07-92
-----------------------------------------

o Fixed fatal bug in the Reset procedure. When applying Reset to an
  already open file, the procedure would crash the program during the
  automatic closing of the file before reopening it. This bug was
  reported by Chris Dennis (dennis_c@kosmos.wcc.govt.nz). Thanks!

o Fixed bug in the Float -> String conversion routine of the original
  RTL that caused EXTENDED precision denormals to be printed as zero.
  The conversion routine has been further enhanced to correctly print
  unnormals on machines with a 387 or 486. Note that these processors do
  not support unnormals, whereas the 8087 and 287 do. The original RTL
  only prints unnormals correctly on machines that have a coprocessor
  that supports this format. Since it is possible that unnormal numbers
  stored by an 8087/287 are loaded into a 387/486, e.g. via a binary
  data file, the conversion routine was changed to handle unnormals on
  all processors. Note that due to this change, there is a difference in
  printout between the original RTL and this RTL on machines with an
  8087/287. The original RTL prints unnormals in an unnormalized format,
  e.g. -0.2777289e-4387, whereas this library prints unnormals in a
  normalized format, e.g. -2.777289e-4388. The new conversion routine
  can handle all EXTENDED encodings, even those not supported by the
  387/486 processors [10]. Zeroes and Pseudozeros are printed as zero;
  NaNs, Pseudo-NaNs, and Indefinite (a special NaN) are printed as NAN;
  Infinities and Pseudoinfinities are printed as INF; unnormals are
  printed the same way as normals/denormals after normalizing them as
  much as possible.

o Improved speed and accuracy of REAL ArcTan function.

o Improved speed and accuracy of REAL Exp function.

o Improved speed of REAL addition/subtraction and Round/Trunc to match
  more closely the speed of these routines in the original RTL.
  These routines had previously suffered a drop in performance due to
  increased accuracy requirements.


Changes since version 1.3, dated 05-01-92
-----------------------------------------

o Fixed bug that set the Test8087 variable incorrectly for 8087/80287
  coprocessors.

o Fixed bug in the error return of the REAL Ln function. Previously it
  would return the biggest possible REAL number if called with an
  argument <= 0. It now correctly raises error 207 (invalid
  floating-point operation).


Changes since version 1.2, dated 11-01-92
-----------------------------------------

o Fixed bug in Rename procedure. Due to this error, Rename would not
  work at all, but always return with error code 3 (path not found).
  This has been corrected. This error was reported by ShinKuang Chang
  (skchang@csemail.cropsci.ncsu.edu). Thanks!

o Cleaned up the source code of the ELEFUNT test programs a bit. Since
  these programs were ported from the FORTRAN original in a 'quick and
  dirty' way, they were looking quite messy.


Changes since version 1.1 beta, dated 04-01-92
----------------------------------------------

o The Round and Trunc functions were unable to produce the smallest
  LONGINT number, -2147483648. If a call to these functions resulted in
  this number, an error was raised instead of returning the correct
  result. This has been fixed. Valid inputs to the Trunc function are
  REAL numbers x for which -2147483649 < x < 2147483648 holds. Valid
  inputs to the Round function are numbers x for which
  -2147483648.5 <= x < 2147483647.5 holds.


Changes since first beta release (version 1.0, dated 03-21-92)
--------------------------------------------------------------

o Fixed bug in the routine that adds variable ranges to sets (as in
  s := [foo..bar], where s is a set and foo and bar are variables of the
  set's base type).

o Switched back code in the REAL add/subtract routine to plain 8086
  code. Forgot to remove the use386 flag when building code for the
  original release 1.0 beta.

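
The input ranges given above for Trunc and Round (changes since version
1.1 beta) can be reproduced with a short model. The following sketch is
not the RTL code. It assumes that Round rounds to nearest with ties
going to the even neighbour, as stated in section I, and it raises
OverflowError where the RTL would raise run-time error 207. The names
trunc_longint and round_longint are hypothetical, and Python is used
purely for illustration.

```python
LONGINT_MIN = -2147483648
LONGINT_MAX = 2147483647


def _check_range(value: int) -> int:
    """Return value if it fits a LONGINT, else signal an overflow."""
    if not LONGINT_MIN <= value <= LONGINT_MAX:
        # Stands in for run-time error 207 in this model.
        raise OverflowError("result does not fit into a LONGINT")
    return value


def trunc_longint(x: float) -> int:
    """Model of Trunc: discard the fraction (round toward zero)."""
    return _check_range(int(x))


def round_longint(x: float) -> int:
    """Model of Round: round to nearest, ties to the even neighbour."""
    # Python's built-in round() on a float uses round-half-to-even.
    return _check_range(round(x))


if __name__ == "__main__":
    print(trunc_longint(-2147483648.9))   # smallest LONGINT is reachable
    print(round_longint(-2147483648.5))   # tie goes to the even value
    try:
        round_longint(2147483647.5)       # tie would round up to 2**31
    except OverflowError as err:
        print("overflow:", err)
```

In this model the lower bound of Round is inclusive because the tie at
-2147483648.5 goes to the even value -2147483648, while the tie at
2147483647.5 goes to 2147483648, which is out of range.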

Changes since the alpha release
-------------------------------

o There was an error in the 8087 float to string conversion in the alpha
  release, which has been fixed.

o A bug in the coprocessor identification that sets the Test8087
  variable, present in the alpha release, has been fixed.

o For string -> LONGINT conversion, it is now possible to input the
  smallest LONGINT number, -2147483648, in decimal.

o An enhanced argument reduction has been implemented for the REAL
  arithmetic SIN, COS, and EXP functions, delivering much more accurate
  results over the complete argument range. This has slowed these
  functions down somewhat; however, none of them runs slower than in the
  original TP 6.01 RTL. As a result of the new argument reduction,
  arguments to SIN and COS are now restricted to the range
  -3.37325e9..3.37325e9. Arguments to these functions were previously
  unrestricted. For arguments outside the range given, an error 207 will
  result. This is consistent with the coprocessor/emulator generated
  SIN/COS functions, which also signal error 207 for arguments out of
  range (-9.22337e18..9.22337e18).

o SIN, COS, and ARCTAN functions compiled in the $N+ state will now use
  the faster coprocessor instructions available on the 387 and 486 if
  such a coprocessor/FPU is present.

o A check has been included to prevent programs compiled with $G+ (286
  code generation) from running on an 8086.

o The Random function has been fixed to return values strictly smaller
  than 1 when compiled with $N+.


III. References
===============

[1]  IEEE: IEEE Standard for Binary Floating-Point Arithmetic. SIGPLAN
     Notices, Vol. 22, No. 2, 1987, pp. 9-25
[2]  IEEE Standard for Binary Floating-Point Arithmetic. ANSI/IEEE Std
     754-1985. New York, NY: Institute of Electrical and Electronics
     Engineers 1985
[3]  Karpinski, R.: Paranoia: A Floating-Point Benchmark. Byte, February
     1985, pp. 223-235
[4]  Curnow, H.J.; Wichmann, B.A.: A synthetic benchmark. Computer
     Journal, Vol. 19, No. 1, 1976, pp. 43-49
[5]  Wichmann, B.A.: Validation code for the Whetstone benchmark. NPL
     Report DITC 107/88, National Physical Laboratory, UK, March 1988
[6]  Cody, W.J.; Waite, W.: Software Manual for the Elementary
     Functions. Englewood Cliffs, NJ: Prentice Hall 1980
[7]  McMahon, F.H.: The Livermore Fortran Kernels: A Test of the
     Numerical Performance Range. Technical Report UCRL-53745, Lawrence
     Livermore National Laboratory, December 1986, p. 179
[8]  Weicker, R.P.: Dhrystone: A Synthetic Systems Programming
     Benchmark. Communications of the ACM, Vol. 27, No. 10, October
     1984, pp. 1013-1030
[9]  Weicker, R.P.: Dhrystone Benchmark: Rationale for Version 2 and
     Measurement Rules. SIGPLAN Notices, Vol. 23, No. 8, August 1988,
     pp. 49-62
[10] 387DX User's Manual, Programmer's Reference. Intel 1989

Note: PARANOIA, DHRYSTONE, WHETSTONE, LLL, and ELEFUNT source code is
      available from NETLIB@ORNL.GOV
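
As an illustration of the uneven distribution of integer random numbers
mentioned in section I: whenever a fixed number of raw generator values
is mapped onto a limit that does not divide that number evenly, some
results must occur more often than others. The following sketch
reproduces neither the original RTL algorithm nor the one taken from
BP 7. It assumes a hypothetical generator that yields 16-bit raw values,
maps them to 0..limit-1 by multiply-and-shift scaling, and counts the
results exhaustively. It also shows one common remedy, rejection of the
surplus raw values. All names are hypothetical, and Python is used only
for convenience.

```python
from collections import Counter

# Assumption: a hypothetical generator producing 16-bit raw values.
STATE_BITS = 16


def scaled_bucket(raw: int, limit: int) -> int:
    """Map a raw 16-bit value to 0..limit-1 by multiply and shift."""
    return (raw * limit) >> STATE_BITS


def bucket_counts(limit: int) -> Counter:
    """Count each result over all 2**16 possible raw values."""
    return Counter(scaled_bucket(raw, limit)
                   for raw in range(1 << STATE_BITS))


def unbiased_bucket(raw_values, limit: int) -> int:
    """Rejection sampling: skip raw values from the incomplete block."""
    usable = (1 << STATE_BITS) // limit * limit
    for raw in raw_values:
        if raw < usable:
            return raw % limit
    raise ValueError("raw value stream exhausted")


if __name__ == "__main__":
    for limit in (3, 4, 10):
        counts = bucket_counts(limit)
        print(limit, min(counts.values()), max(counts.values()))
```

For a limit that is a power of two all results occur equally often in
this model. For a limit of 3, one result occurs 21846 times and the
other two 21845 times each.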