- Turbo-Pascal 6.0 Runtime Library Update - Release 1.9 02-16-1993
-
-
- This library is a complete replacement for the runtime library that
- came with your Turbo Pascal 6.0 compiler. Due to lots of optimizations,
- programs compiled with this version of TURBO.TPL will be faster.
- This library maintains 99.9% compatibility with the original library.
- Differences are usually due to enhancements and should not cause
- any compatibility problems. Some bugs from the original library
- supplied by Borland have been eliminated, but there can be no guarantee
- that new ones have not crept in. If you discover any bugs, or have
- other comments, please let me know. My email and snail mail addresses
- are given below. Due to the nature of Borland's licensing of the
- TPL source code I am not allowed to distribute the source code of
- my enhanced library, so I can only provide the binary. What I am
- including, starting with version 1.8, is the source for the LONGINT
- and the REAL arithmetic routines. Also included for the first time in
- this version is the source of most of the string routines. Since all
- this code does not contain a single line of code written by Borland,
- I think they can't object to the fact that I am making *my* code public.
- The source for the arithmetic routines is contained in the file
- ARISOURC.ZIP. The source code of the string routines is contained in
- file STRSOURC.ZIP. The code of the arithmetic and string routines is
- hereby released into the public domain. You may use it in your own
- programs under the condition that you do not include it into a
- commercial product. Parties interested in commercial use of my code
- should contact me at my address below.
-
-
- THIS VERSION OF THE LIBRARY REPLACEMENT FOR TP 6.0 IS DEFINITELY THE
- FINAL VERSION. NO FURTHER UPDATES WILL BE MADE, AS THE NEW 7.0 VERSION
- OF TURBO PASCAL (AND BORLAND PASCAL) WAS INTRODUCED IN NOVEMBER
- 1992 AND IS NOW AVAILABLE IN LOCALIZED VERSIONS EVERYWHERE. I AM
- PREPARING A SIMILAR LIBRARY FILE CALLED TPL70N10.ZIP FOR USE WITH
- TP/BP 7.0.
-
-
- Original library code is Copyright (C) 1983,91 Borland International
-
-
- New / additional library code is Copyright (C) 1988-1993
-
- Norbert Juffa, Wielandtstr. 14, 7500 Karlsruhe 1, Germany
- Internet: S_JUFFA@IRAVCL.IRA.UKA.DE
-
-
-
- Contents of this document:
-
- I. Capabilities of RTL replacement
- II. Revision History
- III. References
-
-
-
- I. Capabilities of RTL replacement
- ==================================
-
- Improvements in SYSTEM module
- -----------------------------
-
- o REAL type software arithmetic operations now comply with ANSI/IEEE
- Standard 754-1985 for Binary Floating Point Arithmetic [1,2] as much
- as possible. Note that REAL arithmetic by design differs from the
- standard in many ways, especially available numeric formats, value
- set, and available operations. The rounding mode implemented here
- is "round to nearest or even" as specified by the standard. Add,
- Subtract, Multiply, Squaring, Division, and Square Root deliver
- exact results with regard to this rounding mode, as demanded by the
- standard. Conversions from REAL to LONGINT and from EXTENDED to REAL
- use rounding to nearest or even, as specified in the standard. Correct
- implementation of above features was tested with the PARANOIA test
- program [3]. The correctness of basic REAL arithmetic functions has
- also been tested against the coprocessor/emulator EXTENDED format
- with the program FUN1_TST. The EXTENDED format carries approximately
- 19 decimal digits of precision.
-
- o REAL arithmetic operations have been sped up. Speed-up for SQRT varies
- between a factor of 12 for an 8086 and 29 for a Cyrix 486DLC. FRAC now
- executes at nearly three times the original speed. Speed-up for SIN,
- COS, ARCTAN, LN, EXP is between 50% and 100%. Division is now between
- 60% and 300% faster than before, depending on the CPU. Overall numeric
- processing power using REAL arithmetic increases by about 51% for an
- 8086, by 62% for an Intel 386DX, and 80% for a Cyrix 486DLC as measured
- by the WHETSTONE benchmark [4,5].
-
- o Overall accuracy of REAL arithmetic transcendental functions has been
- improved as indicated by Cody&Waite's ELEFUNT tests [6]: DLOG, DEXP,
- DATAN, DSIN. Correct argument reduction ensures that relative error
- over the whole argument range does not exceed 1.9e-12 for Exp, 2.8e-12
- for Arctan, and 2.7e-12 for Ln. These values have been determined
- by comparing the function returns of the REAL transcendental functions
- to the values computed on a Cyrix 83D87 coprocessor for the EXTENDED
- format. For Sin and Cos, relative error is also in the above range
- when the argument is reasonably small (e.g. in range -100..100) and
- not very close to an integer multiple of 0.25*Pi. The error of the
- transcendental functions expressed in ULPs (units in the last place)
- over the whole argument range does not exceed 1.6 ULPs for Exp, 1.8
- ULPs for Arctan, and 2.2 ULPs for Ln. These values were determined
- using the ULPERR program.
-
- o Execution of coprocessor floating point computations using an 80287 or
- 80387 has been accelerated. For these coprocessors, NOPs will be inserted
- before every floating point instruction converted from an emulator
- interrupt instead of WAITs. As a result of this optimization, an
- improvement in execution speed of 15% has been observed running the
- Lawrence Livermore Loops (LLL) [7] on a Cyrix 83D87; the improvement
- for the WHETSTONE benchmark on the 83D87 is 9.4%. Maximum performance
- gain for tight loops (e.g. fractal computation) by this measure is about
- 22%.
-
- o On 80287XL, 80387, 80486DX or compatible chips the Sin and Cos functions
- take advantage of the FSIN and FCOS instructions of these coprocessors,
- speeding up these functions by almost a factor of two. As a side effect,
- there is also some improvement in accuracy as measured by the DSIN test
- program from the ELEFUNT test suite. Also, the Arctan function takes
- advantage of the increased argument range of the FPATAN function. These
- optimizations result in another 19% increase in WHETSTONE power, so
- that the total combined speedup over the original library is 25%
- for this benchmark when run on a 387 compatible coprocessor.
-
- o STRING operations are faster, especially for longer strings. The most
- dramatic increase is in the INSERT function, with execution times
- reduced to as little as one fourth of those of the original version of
- the RTL. Faster string operations yield a 7% performance increase for
- the DHRYSTONE [8,9] benchmark on an 8086.
-
- o Improved speed of random number generation. Random for REAL numbers
- is 10-20% faster, Random for EXTENDED numbers is 5% faster. Due to
- the improvements in the uniform distribution of integer random numbers,
- there is a decrease in the speed of integer random number generation
- of about 5%.
-
- o Binary to decimal conversions used in Str and Write procedures have
- been sped up by up to 70% for integers (BYTE, SHORTINT, INTEGER,
- WORD, LONGINT), up to 5% for REAL numbers and about 3% for EXTENDED
- numbers.
-
- o Improved speed of LONGINT arithmetic. Division enjoys a fourfold reduction
- of execution time on an 8086; for an Intel RapidCAD CPU the speed-up factor
- is 5.2, and for a Cyrix 486DLC the speed-up factor is 10.1.
-
- o Several of the functions of the heap manager have been tuned, resulting
- in 6%-18% faster operation for these routines, depending on the CPU used.
-
- o Set functions have been sped up by a few percent, but the add variable
- range operation may be up to eight times as fast.
-
- o UPCASE function has been enhanced to support the complete IBM character
- set. This means that characters ä,ü,ö,å,æ,é,ñ,ç are converted to upper
- case by this function.
-
- o Several bugs of the original RTL supplied by Borland have been fixed:
-
- Using REAL arithmetic in $N- mode, Trunc and Round could not produce
- the smallest legal LONGINT number -2147483648. Arguments that should
- result in this number caused a run time error 207 instead. Trunc/Round
- now will return the correct result of -2147483648. Correct implementation
- can be checked using the ROUNDTST program.
-
- The Random function could return 1.0 when compiled in the $N+ state,
- although the specifications call for a return value 0 <= Random < 1.
- This has been corrected. Return values from Random are strictly
- smaller than 1 now.
-
- The integer Random function would return unevenly distributed random
- numbers if the upper limit passed to the routine was not a power of
- two. The new function in TPL60N19.ZIP should return uniformly distributed
- random numbers for all arguments of the integer Random function.
-
- GetDir now correctly returns a run-time error 15 (invalid drive)
- when called with a nonexistent drive. Differing from the original,
- it also signals all errors reported by DOS as run-time errors. E.g.
- when applied to a floppy drive that does not contain a floppy, it
- will now return run-time error 152 (drive not ready), where previously
- it would incorrectly signal successful completion of the operation
- (InOutRes = 0).
-
- LONGINT Read and Val routines now accept the smallest LONGINT
- number -2147483648 as decimal input.
-
- For programs compiled with $N+, only true INFs are printed out as
- INF where with the original library some NaNs are also printed as
- INF. Correct operation can be tested with the INFBUG program.
-
- Addition/Subtraction of REAL arithmetic sometimes was unnecessarily
- inaccurate due to incorrect handling of discarded digits of the
- operand with smaller absolute value. This has been fixed with the
- introduction of a completely new add/subtract routine.
-
- Multiplication of REAL numbers by small integers was inaccurate,
- causing, among other problems, unnecessary inaccuracies in binary
- <-> decimal conversion. This has been eliminated with the new
- REAL arithmetic modules.
-
- REAL arithmetic EXP function no longer signals overflow when
- called with small arguments, but underflows to zero instead as it
- should.
-
- Denormals in EXTENDED computations no longer cause invalid state
- on 8087 coprocessor when being converted to true zeros. Consistency
- between register contents and tag bits is now asserted. Removal of
- this bug can be tested with the BUG87 program.
-
- Denormals in EXTENDED format are now correctly converted to decimal
- strings by the Str and Write routines. The original routines printed
- EXTENDED precision denormals as zero. Note that TP 6.0 supports
- EXTENDED denormals only if your machine has an 80287XL, 80387, 80486
- or equivalent. On the 8087 and Intel's original 80287 coprocessor
- denormals are only supported for the SINGLE and DOUBLE formats.
-
- Stack checking routine for programs compiled in the $S+ state now
- reliably detects all stack overflows. This bug in TP 6.0 has also
- been fixed in the second release of TP 6.0 by Borland, but Borland's
- code is slower than the one used here.
-
- Program initialization routine now tries to prevent programs
- compiled with the $G+ (286 code generation) switch from running on 8086
- and 8088. The checks done are not 100% safe, but catch most of these
- cases, displaying the message "CPU > 8086 required" and aborting the
- program with a return code of 254 ($FE) instead of letting it crash.
- Note that this check lets programs compiled with $G+ run on 80186 and
- V20/V30 processors, since these have the ability to execute all 80286
- real mode instructions produced by Turbo Pascal.
-
- o Improved functionality only marginally increases overall code length
- of the SYSTEM unit by 1244 bytes (about 6%). This is due to careful
- optimizing in numerous routines. Most programs compiled with the new
- RTL will be smaller due to finer granularity of the RTL modules.
- Savings are usually in the 0.5 KB range for reasonably large programs.
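-
- As an illustration of the UPCASE enhancement listed above, the following
- Python sketch models the conversion for the eight characters named in
- that item. It assumes that "IBM character set" means code page 437; the
- character codes, the table CP437_UPPER, and the function name upcase are
- assumptions made for this sketch and are not taken from the library.
-
```python
# Lower-case -> upper-case pairs for the eight characters named above,
# given as code page 437 character codes (an assumption of this sketch).
CP437_UPPER = {
    0x84: 0x8E,  # a-umlaut  -> A-umlaut
    0x81: 0x9A,  # u-umlaut  -> U-umlaut
    0x94: 0x99,  # o-umlaut  -> O-umlaut
    0x86: 0x8F,  # a-ring    -> A-ring
    0x91: 0x92,  # ae        -> AE
    0x82: 0x90,  # e-acute   -> E-acute
    0xA4: 0xA5,  # n-tilde   -> N-tilde
    0x87: 0x80,  # c-cedilla -> C-cedilla
}

def upcase(code):
    """Return the upper-case character code for a single character code."""
    if 0x61 <= code <= 0x7A:          # plain ASCII 'a'..'z'
        return code - 0x20
    return CP437_UPPER.get(code, code)  # everything else is left unchanged

print(bytes(upcase(b) for b in "gr\u00fc\u00dfe".encode("cp437")).decode("cp437"))
```
-
- Characters without an upper-case counterpart in the table are returned
- unchanged, which mirrors the usual behaviour of an UpCase routine.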
-
-
-
- Improvements in CRT module
- --------------------------
-
- o Bug fix in routine DirectWrite. The method used to prevent "snow"
- when writing directly to a CGA graphics card was not entirely safe.
- When used in a heavily interrupted program (e.g. serial communication
- as a background task), it would not always write during the time
- when scanning was in the invisible parts of the screen. The method
- used now is 100% safe and is even faster, since it takes advantage
- of the horizontal and vertical retrace periods, as opposed to the
- old method which only used the horizontal retrace time. The new routine
- has been tested successfully on an original IBM-CGA card.
-
-
-
- II. Revision History
- ====================
-
- Changes since version 1.8, dated 12-04-92
- -----------------------------------------
-
- o While previous versions of the replacement library had been optimized
- for speed and size (in that order), version 1.9 has been tuned for
- speed exclusively. All entries to library routines are now aligned on
- double word boundaries, as are some timing-critical loops. This
- optimization is aimed specifically at 386 and 486 processors. The
- increased speed of this library version was achieved by putting in
- additional code, so programs compiled with version 1.9 of the library
- tend to be somewhat bigger than those compiled with version 1.8.
- However, the increase in size is well in the range of a few hundred
- bytes at most.
-
- o Speed of REAL arithmetic transcendental functions has been increased by
- about 20%.
-
- o Speed of heap memory allocation/deallocation has been increased by
- about 5%.
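-
- The double word alignment of routine entries mentioned in the first
- item above amounts to rounding each entry address up to the next
- multiple of four. A minimal Python sketch of that arithmetic follows;
- the function name align_up is hypothetical, and the sketch says nothing
- about how the library build actually performs the alignment.
-
```python
def align_up(address, boundary=4):
    """Round a non-negative address up to a multiple of boundary.

    boundary must be a power of two; 4 corresponds to a double word.
    """
    return (address + boundary - 1) & ~(boundary - 1)

# Padding needed in front of a routine that would otherwise start at 0x1235.
start = 0x1235
print(hex(align_up(start)), align_up(start) - start)
```
-
- The padding bytes inserted this way are one reason why programs built
- with an aligned library can grow slightly.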
-
-
- Changes since version 1.7, dated 10-12-92
- -----------------------------------------
-
- o Fixed bug in LONGINT division. If the dividend was -2147483648 and the
- divisor 65536, the division routine would incorrectly return 32768
- instead of -32768.
-
- o Fixed bug in Str -> LONGINT conversion introduced in version 1.6. Because
- of this bug, the valid input $80000000 would cause a run-time error.
-
- o Speed of the LONGINT -> Str conversion has been improved by a factor
- of up to 2, depending on the CPU.
-
- o Speed of LONGINT shift routines (SHL, SHR) has been improved by a factor
- of up to 3, depending on the CPU.
-
- o Speed of LONGINT division has been improved for large divisors.
-
- o The integer Random function has been enhanced to deliver more evenly
- distributed numbers. The algorithm used has been taken from BP 7.
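-
- This document does not describe how the integer Random function maps
- the raw generator output onto the requested range. Purely as an
- illustration of what "evenly distributed" means for such a mapping, the
- Python sketch below uses a multiply-and-shift reduction and counts how
- often each result occurs over all 16-bit inputs. The names reduce_range
- and bucket_counts and the 16-bit width are assumptions; this is not
- claimed to be the algorithm of BP 7 or of this library.
-
```python
from collections import Counter

def reduce_range(raw, limit, bits=16):
    """Map a raw value in 0..2**bits-1 onto 0..limit-1 (hypothetical helper)."""
    return (raw * limit) >> bits

def bucket_counts(limit, bits=16):
    """Count how many raw inputs land on each result in 0..limit-1."""
    return Counter(reduce_range(raw, limit, bits) for raw in range(1 << bits))

counts = bucket_counts(10)
# For an even mapping the largest and smallest bucket differ by at most one.
print(max(counts.values()) - min(counts.values()))
```
-
- A reduction whose bucket sizes differ by more than one favours some
- results over others when the limit is not a power of two.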
-
-
- Changes since version 1.6, dated 08-31-92
- -----------------------------------------
-
- o The Round function for REAL arithmetic would sometimes not return the
- correct IEEE-rounded result due to wrong handling of the sticky flag.
- This has been fixed.
-
- o Size of the RTL has been reduced a bit.
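-
- The role of the sticky flag in the Round fix above can be sketched in
- Python as follows. The sketch rounds an integer significand to nearest
- or even while dropping a given number of low-order bits. The function
- name and the integer representation are assumptions made for
- illustration and do not reproduce the library's assembly code.
-
```python
def round_nearest_even(significand, extra_bits):
    """Drop extra_bits (>= 1) low-order bits, rounding to nearest or even."""
    kept = significand >> extra_bits
    # Guard bit: the most significant of the discarded bits.
    guard = (significand >> (extra_bits - 1)) & 1
    # Sticky flag: set if any discarded bit below the guard bit is set.
    sticky = (significand & ((1 << (extra_bits - 1)) - 1)) != 0
    # Round up when above the halfway point, or exactly halfway and odd.
    if guard and (sticky or (kept & 1)):
        kept += 1
    return kept

for value in (0b1001, 0b1010, 0b1011, 0b1110):
    print(bin(value), "->", bin(round_nearest_even(value, 2)))
```
-
- If the sticky flag were ignored, 0b1011 would be mistaken for an exact
- tie and rounded down to the even neighbour instead of up.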
-
-
- Changes since version 1.5, dated 06-10-92
- -----------------------------------------
-
- o The Test8087 variable would sometimes contain the wrong value when
- the environment contained the string 87=Y. This has been fixed.
-
- o Speed and accuracy of the REAL arithmetic Sin and Cos functions have
- been enhanced. The previously introduced limits for arguments to these
- functions have been removed. Sin and Cos now accept all REAL numbers
- as arguments without producing a run-time error, just like the routines
- in the original TP 6.0 RTL. Due to the nature of the argument reduction
- scheme now used, the accuracy of the Sin and Cos functions takes a
- rather sharp drop if the arguments exceed an absolute value of about 1e8.
- Outside this range, accuracy is about the same as for the Sin and Cos
- from the original library. There is a gradual loss of precision towards
- bigger arguments. For arguments with an absolute value of more than
- about 5e12, there is a total loss of precision. Sin always returns 0,
- Cos always returns 1 for arguments exceeding this limit.
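-
- One way to see why accuracy must eventually vanish for very large
- arguments is to look at the spacing between neighbouring representable
- numbers. The Python sketch below computes that spacing (one ULP) under
- the assumption that the REAL type carries a 40-bit significand; this
- width and the function name real_ulp are assumptions of the sketch and
- are not stated in this document.
-
```python
import math

def real_ulp(x, precision_bits=40):
    """Spacing of representable values near x for the given precision."""
    if x == 0:
        raise ValueError("spacing near zero depends on the exponent range")
    _, exponent = math.frexp(abs(x))   # abs(x) = m * 2**exponent, 0.5 <= m < 1
    return math.ldexp(1.0, exponent - precision_bits)

for x in (1.0, 1e8, 5e12):
    print(x, real_ulp(x), real_ulp(x) > 2 * math.pi)
```
-
- Under this assumption the spacing near 5e12 is larger than the period
- 2*Pi, so neighbouring arguments no longer resolve a single period of
- Sin or Cos.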
-
-
- Changes since version 1.4, dated 05-07-92
- -----------------------------------------
-
- o Fixed fatal bug in the Reset procedure. When applying Reset to an
- already open file, the procedure would crash the program during
- the automatic closing of the file before reopening it. This bug was
- reported by Chris Dennis (dennis_c@kosmos.wcc.govt.nz). Thanks!
-
- o Fixed bug in the Float->String conversion routine of the original RTL
- that caused EXTENDED precision denormals to be printed as zero. The
- conversion routine has been further enhanced to correctly print
- unnormals on machines with a 387 or 486. Note that these processors
- do not support unnormals, whereas the 8087 and 287 do. The original
- RTL only prints unnormals correctly on machines that have a coprocessor
- that supports this format. Since it is possible that unnormal numbers
- stored by an 8087/287 are loaded into a 387/486, e.g. via a binary data
- file, the conversion routine was changed to handle unnormals on all
- processors. Note that due to this change, there is a difference in
- printout between the original RTL and this RTL on machines with an
- 8087/287. The original RTL prints unnormals in an unnormalized format,
- e.g. -0.2777289e-4387, whereas this library prints unnormals in a
- normalized format, e.g. -2.777289e-4388. The new conversion routine
- can handle all EXTENDED encodings, even those not supported by the
- 387/486 processors [10]. Zeroes and Pseudozeros are printed as zero,
- NaNs, Pseudo-NaNs, and Indefinite (a special NaN) are printed as NAN,
- Infinities and Pseudoinfinities are printed as INF, unnormals are
- printed the same way as normals/denormals after normalizing them as
- much as possible.
-
- o Improved speed and accuracy of REAL ArcTan function.
-
- o Improved speed and accuracy of REAL Exp function.
-
- o Improved speed of REAL addition/subtraction and Round/Trunc to match
- more closely the speed of these routines in the original RTL. These
- routines had suffered a drop in performance due to increased accuracy
- requirements before.
-
-
- Changes since version 1.3, dated 05-01-92
- -----------------------------------------
-
- o Fixed bug that set Test8087 variable incorrectly for 8087/80287
- coprocessors.
-
- o Fixed bug in REAL Ln function error return. Previously it would return
- the biggest possible REAL number if called with an argument <= 0. It now
- correctly raises error 207 (invalid floating-point operation).
-
-
- Changes since version 1.2, dated 11-01-92
- -----------------------------------------
-
- o Fixed bug in Rename procedure. Due to this error, Rename would not
- work at all, but always return with error code 3 (path not found).
- This has been corrected. This error was reported by ShinKuang Chang
- (skchang@csemail.cropsci.ncsu.edu). Thanks!
-
- o Cleaned up the source code of the ELEFUNT test programs a bit. Since
- these programs were ported from the FORTRAN original in a 'quick and
- dirty' way, they were looking quite messy.
-
-
- Changes since version 1.1 beta, dated 04-01-92
- ----------------------------------------------
-
- o The Round and Trunc functions were unable to produce the smallest
- LONGINT number, -2147483648. If a call to these functions resulted
- in this number, an error was raised instead of returning the correct
- result. This has been fixed. Valid inputs to the Trunc functions are
- REAL numbers x for which -2147483649 < x < 2147483648 holds. Valid
- inputs to the Round function are numbers x for which -2147483648.5
- <= x < 2147483647.5 holds.
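-
- The input ranges stated above can be checked with a short Python
- sketch that uses exact rational arithmetic. The function names are
- hypothetical, and raising OverflowError stands in for run-time error
- 207; neither is taken from the library itself.
-
```python
import math
from fractions import Fraction

def trunc_to_longint(x):
    """Truncate toward zero; valid for -2147483649 < x < 2147483648."""
    x = Fraction(x)
    if not (-2147483649 < x < 2147483648):
        raise OverflowError("run-time error 207")
    return math.trunc(x)

def round_to_longint(x):
    """Round to nearest or even; valid for -2147483648.5 <= x < 2147483647.5."""
    x = Fraction(x)
    if not (Fraction("-2147483648.5") <= x < Fraction("2147483647.5")):
        raise OverflowError("run-time error 207")
    return round(x)   # Fraction rounds exact ties to the even neighbour

print(trunc_to_longint("-2147483648.9"), round_to_longint("-2147483648.5"))
```
-
- The asymmetric bounds for Round follow from the tie rule: -2147483648.5
- rounds to the even value -2147483648, whereas 2147483647.5 would round
- to 2147483648, which is not a LONGINT.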
-
-
- Changes since first beta release (version 1.0, dated 03-21-92)
- --------------------------------------------------------------
-
- o Fixed bug in the routine that adds variable ranges to sets (as in
- s := [foo..bar], where s is a set and foo and bar are variables of
- the set's base type).
-
- o Switched back code in the REAL add/subtract routine to plain 8086
- code. Forgot to remove the use386 flag when building code for the
- original release 1.0 beta.
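-
- The set construction s := [foo..bar] from the first item above can be
- modelled in Python by treating a set as a bit mask with one bit per
- element. This is only a sketch of the operation's meaning; the function
- name add_range and the use of a Python integer as the set are
- assumptions and do not reflect the library's implementation.
-
```python
def add_range(set_bits, low, high):
    """Return set_bits with all elements low..high (0..255) added."""
    if low > high:
        return set_bits            # in Pascal, [low..high] is empty here
    mask = ((1 << (high - low + 1)) - 1) << low
    return set_bits | mask

s = add_range(0, 3, 5)
print(bin(s), [e for e in range(256) if (s >> e) & 1])
```
-
- Handling the whole range with one mask, rather than setting elements
- one at a time, is the kind of shortcut that makes a range operation fast.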
-
-
- Changes since the alpha release
- -------------------------------
-
- o There was an error in the 8087 float to string conversion in the
- alpha release which has been fixed.
-
- o A bug in the coprocessor identification that sets the Test8087 variable
- present in the alpha release has been fixed.
-
- o For string -> LONGINT conversion, it is now possible to input the
- smallest LONGINT number -2147483648 in decimal.
-
- o An enhanced argument reduction has been implemented for REAL arithmetic
- SIN, COS, and EXP functions, delivering much more accurate results over
- the complete argument range. This has slowed these functions down
- somewhat; however, none of them runs slower than in the original TP 6.01
- RTL. As a result of the new argument reduction, arguments to SIN and
- COS are restricted to the range -3.37325e9..3.37325e9 now. Arguments to
- these functions were previously unrestricted. For arguments outside the
- range given, an error 207 will result. This is consistent with the
- coprocessor/emulator generated SIN/COS functions, which also signal
- error 207 for arguments out of range (-9.22337e18..9.22337e18).
-
- o SIN, COS, and ARCTAN functions compiled in the $N+ state will now use
- the faster coprocessor instructions available on the 387 and 486 if
- such a coprocessor/FPU is present.
-
- o A check has been included to prevent programs compiled with $G+ (286 code
- generation) from running on an 8086.
-
- o The Random function has been fixed to return values strictly smaller than
- 1 when compiled with $N+.
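-
- The string -> LONGINT item above concerns an asymmetry of the LONGINT
- range: the magnitude 2147483648 is valid only with a minus sign. The
- Python sketch below shows one way a decimal parser can take this into
- account. The function name val_longint and the use of ValueError are
- assumptions, and the sketch does not claim to show why the original
- routine rejected this input.
-
```python
LONGINT_MAX = 2147483647

def val_longint(text):
    """Parse a decimal string into the LONGINT range or raise ValueError."""
    negative = text.startswith("-")
    digits = text[1:] if text[:1] in ("+", "-") else text
    if not digits or any(c not in "0123456789" for c in digits):
        raise ValueError("invalid numeric format")
    # The permitted magnitude is one larger for negative numbers.
    limit = LONGINT_MAX + 1 if negative else LONGINT_MAX
    magnitude = 0
    for c in digits:
        magnitude = magnitude * 10 + (ord(c) - ord("0"))
        if magnitude > limit:
            raise ValueError("value out of LONGINT range")
    return -magnitude if negative else magnitude

print(val_longint("-2147483648"), val_longint("2147483647"))
```
-
- A parser that compares the magnitude against 2147483647 regardless of
- sign would reject -2147483648 even though it is a legal LONGINT.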
-
-
-
- III. References
- ===============
-
- [1] IEEE: IEEE Standard for Binary Floating-Point Arithmetic.
- SIGPLAN Notices, Vol. 22, No. 2, 1987, pp. 9-25
-
- [2] IEEE Standard for Binary Floating-Point Arithmetic.
- ANSI/IEEE Std 754-1985.
- New York, NY: Institute of Electrical and Electronics Engineers 1985
-
- [3] Karpinski, R.: Paranoia: A Floating-Point Benchmark.
- Byte, February 1985, pp. 223-235
-
- [4] Curnow, H.J.; Wichmann, B.A.: A synthetic benchmark.
- Computer Journal, Vol. 19, No. 1, 1976, pp. 43-49
-
- [5] Wichmann, B.A.: Validation code for the Whetstone benchmark.
- NPL Report DITC 107/88, National Physical Laboratory, UK, March 1988
-
- [6] Cody, W.J.; Waite, W.: Software Manual for the Elementary Functions.
- Englewood Cliffs, NJ: Prentice Hall 1980
-
- [7] McMahon, F.H.: The Livermore Fortran Kernels: A Test of the Numerical
- Performance Range.
- Technical Report UCRL-53745, Lawrence Livermore National Laboratory,
- December 1986, p. 179
-
- [8] Weicker, R.P.: Dhrystone: A Synthetic Systems Programming Benchmark.
- Communications of the ACM, Vol. 27, No. 10, October 1984, pp. 1013-1030
-
- [9] Weicker, R.P.: Dhrystone Benchmark: Rationale for Version 2 and
- Measurement Rules.
- SIGPLAN Notices, Vol. 23, No. 8, August 1988, pp. 49-62
-
- [10] 387DX User's Manual, Programmer's Reference. Intel 1989
-
-
- Note:
-
- PARANOIA, DHRYSTONE, WHETSTONE, LLL, and ELEFUNT source code is
- available from NETLIB@ORNL.GOV
-
-