Turbo-Pascal 6.0 Runtime Library Update - Release 1.9 02-16-1993
This library is a complete replacement for the runtime library that
came with your Turbo Pascal 6.0 compiler. Due to lots of optimizations,
programs compiled with this version of TURBO.TPL will be faster.
This library maintains 99.9% compatibility with the original library.
Differences are usually due to enhancements and should not cause
any compatibility problems. Some bugs from the original library
supplied by Borland have been eliminated, but there can be no guarantee
that new ones have not crept in. If you discover any bugs, or have
other comments, please let me know. My email and snail mail addresses
are given below. Due to the nature of Borland's licensing of the
TPL source code I am not allowed to distribute the source code of
my enhanced library, so I can only provide the binary. What I am
including, starting with version 1.8, is the source for the LONGINT
and the REAL arithmetic routines. Also included for the first time in
this version is the source of most of the string routines. Since all
this code does not contain a single line of code written by Borland,
I think they can't object to the fact that I am making *my* code public.
The source for the arithmetic routines is contained in the file
ARISOURC.ZIP. The source code of the string routines is contained in
file STRSOURC.ZIP. The code of the arithmetic and string routines is
hereby released into the public domain. You may use it in your own
programs under the condition that you do not include it in a
commercial product. Parties interested in commercial use of my code
should contact me at my address below.
THIS VERSION OF THE LIBRARY REPLACEMENT FOR TP 6.0 IS DEFINITELY THE
FINAL VERSION. NO FURTHER UPDATES WILL BE MADE, AS THE NEW 7.0 VERSION
OF TURBO PASCAL (AND BORLAND PASCAL) WAS INTRODUCED IN NOVEMBER
1992 AND IS NOW AVAILABLE IN LOCALIZED VERSIONS EVERYWHERE. I AM
PREPARING A SIMILAR LIBRARY FILE CALLED TPL70N10.ZIP FOR USE WITH
TP/BP 7.0.
Original library code is Copyright (C) 1983,91 Borland International
New / additional library code is Copyright (C) 1988-1993
Norbert Juffa, Wielandtstr. 14, 7500 Karlsruhe 1, Germany
Internet: S_JUFFA@IRAVCL.IRA.UKA.DE
Contents of this document:
I. Capabilities of RTL replacement
II. Revision History
III. References
I. Capabilities of RTL replacement
==================================
Improvements in SYSTEM module
-----------------------------
o REAL type software arithmetic operations now comply with ANSI/IEEE
Standard 754-1985 for Binary Floating Point Arithmetic [1,2] as much
as possible. Note that REAL arithmetic by design differs from the
standard in many ways, especially available numeric formats, value
set, and available operations. The rounding mode implemented here
is "round to nearest or even" as specified by the standard. Add,
Subtract, Multiply, Squaring, Division, and Square Root deliver
exact results with regard to this rounding mode, as demanded by the
standard. Conversions from REAL to LONGINT and from EXTENDED to REAL
use rounding to nearest or even, as specified in the standard. Correct
implementation of above features was tested with the PARANOIA test
program [3]. The correctness of basic REAL arithmetic functions has
also been tested against the coprocessor/emulator EXTENDED format
with the program FUN1_TST. The EXTENDED format carries approximately
19 decimal digits of precision.
o REAL arithmetic operations have been sped up. Speed-up for SQRT varies
between a factor of 12 for an 8086 and 29 for a Cyrix 486DLC. FRAC now
executes at nearly three times the original speed. Speed-up for SIN,
COS, ARCTAN, LN, EXP is between 50% and 100%. Division is now between
60% and 300% faster than before, depending on the CPU. Overall numeric
processing power using REAL arithmetic increases by about 51% for an
8086, by 62% for an Intel 386DX, and 80% for a Cyrix 486DLC as measured
by the WHETSTONE benchmark [4,5].
o Overall accuracy of REAL arithmetic transcendental functions has been
improved as indicated by Cody&Waite's ELEFUNT tests [6]: DLOG, DEXP,
DATAN, DSIN. Correct argument reduction ensures that relative error
over the whole argument range does not exceed 1.9e-12 for Exp, 2.8e-12
for Arctan, and 2.7e-12 for Ln. These values have been determined
by comparing the function returns of the REAL transcendental functions
to the values computed on a Cyrix 83D87 coprocessor for the EXTENDED
format. For Sin and Cos, relative error is also in the above range
when the argument is reasonably small (e.g. in range -100..100) and
not very close to an integer multiple of 0.25*Pi. The error of the
transcendental functions expressed in ULPs (units in the last place)
over the whole argument range does not exceed 1.6 ULPs for Exp, 1.8
ULPs for Arctan, and 2.2 ULPs for Ln. These values were determined
using the ULPERR program.
o Execution of coprocessor floating point computations using an 80287 or
80387 has been accelerated. For these coprocessors, NOPs will be inserted
before every floating point instruction converted from an emulator
interrupt instead of WAITs. As a result of this optimization, an
improvement in execution speed of 15% has been observed running the
Lawrence Livermore Loops (LLL) [7] on a Cyrix 83D87, the improvement
for the WHETSTONE benchmark on the 83D87 is 9.4%. Maximum performance
gain for tight loops (e.g. fractal computation) by this measure is about
22%.
o On 80287XL, 80387, 80486DX or compatible chips the Sin and Cos functions
take advantage of the FSIN and FCOS instructions of these coprocessors,
speeding up these functions by almost a factor of two. As a side effect,
there is also some improvement in accuracy as measured by the DSIN test
program from the ELEFUNT test suite. Also, the Arctan function takes
advantage of the increased argument range of the FPATAN function. These
optimizations result in another 19% increase in WHETSTONE power, so
that the total combined speedup over the original library is 25%
for this benchmark when run on a 387 compatible coprocessor.
o STRING operations are faster, especially for longer strings. The most
dramatic improvement is in the INSERT routine, whose execution time
drops to as little as one fourth of that of the original version of
the RTL. Faster string operations yield a 7% performance increase for
the DHRYSTONE [8,9] benchmark on an 8086.
o Improved speed of random number generation. Random for REAL numbers
is 10-20% faster, Random for EXTENDED numbers is 5% faster. Due to
the improvements in the uniform distribution of integer random numbers,
there is a decrease in the speed of integer random number generation
of about 5%.
o Binary to decimal conversions used in Str and Write procedures have
been sped up by up to 70% for integers (BYTE, SHORTINT, INTEGER,
WORD, LONGINT), up to 5% for REAL numbers and about 3% for EXTENDED
numbers.
o Improved speed of LONGINT arithmetic. Division enjoys a fourfold
reduction of execution time on an 8086; for an Intel RapidCAD CPU the
speed-up factor is 5.2, and for a Cyrix 486DLC it is 10.1.
o Several of the functions of the heap manager have been tuned, resulting
in 6%-18% faster operation for these routines, depending on the CPU used.
o Set functions have been sped up by a few percent, but the add variable
range operation may be up to eight times as fast.
o UPCASE function has been enhanced to support the complete IBM character
set. This means that characters ä,ü,ö,å,æ,é,ñ,ç are converted to upper
case by this function.
o Several bugs of the original RTL supplied by Borland have been fixed:
Using REAL arithmetic in $N- mode, Trunc and Round could not produce
the smallest legal LONGINT number -2147483648. Arguments that should
result in this number caused a run time error 207 instead. Trunc/Round
now will return the correct result of -2147483648. Correct implementation
can be checked using the ROUNDTST program.
The Random function could return 1.0 when compiled in the $N+ state,
although the specifications call for a return value 0 <= Random < 1.
This has been corrected. Return values from Random are strictly
smaller than 1 now.
The integer Random function would return unevenly distributed random
numbers if the upper limit passed to the routine was not a power of
two. The new function in TPL60N19.ZIP should return uniformly distributed
random numbers for all arguments of the integer Random function.
GetDir now correctly returns a run-time error 15 (invalid drive)
when called with a non existent drive. Differing from the original,
it also signals all errors reported by DOS as run-time errors. E.g.
when applied to a floppy drive that does not contain a floppy, it
will now return run-time error 152 (drive not ready), where previously
it would incorrectly signal successful completion of the operation
(InOutRes = 0).
LONGINT Read and Val routines now accept the smallest LONGINT
number -2147483648 as decimal input.
For programs compiled with $N+, only true INFs are printed out as
INF where with the original library some NaNs are also printed as
INF. Correct operation can be tested with the INFBUG program.
Addition/Subtraction of REAL arithmetic sometimes was unnecessarily
inaccurate due to incorrect handling of discarded digits of the
operand with smaller absolute value. This has been fixed with the
introduction of a completely new add/subtract routine.
Multiplication of REAL numbers by small integers was inaccurate,
causing, among other problems, unnecessary inaccuracies in binary
<-> decimal conversion. This has been eliminated with the new
REAL arithmetic modules.
The REAL arithmetic Exp function no longer signals overflow when
called with very negative arguments, but underflows to zero instead,
as it should.
Denormals in EXTENDED computations no longer cause invalid state
on 8087 coprocessor when being converted to true zeros. Consistency
between register contents and tag bits is now asserted. Removal of
this bug can be tested with the BUG87 program.
Denormals in EXTENDED format are now correctly converted to decimal
strings by the Str and Write routines. The original routines printed
EXTENDED precision denormals as zero. Note that TP 6.0 supports
EXTENDED denormals only if your machine has an 80287XL, 80387, 80486
or equivalent. On the 8087 and Intel's original 80287 coprocessor
denormals are only supported for the SINGLE and DOUBLE formats.
Stack checking routine for programs compiled in the $S+ state now
reliably detects all stack overflows. This bug in TP 6.0 has also
been fixed in the second release of TP 6.0 by Borland, but Borland's
code is slower than the one used here.
The program initialization routine now tries to prevent programs
compiled with the $G+ (286 code generation) switch from running on 8086
and 8088 CPUs. The checks are not 100% safe, but catch most of these
cases, displaying the message "CPU > 8086 required" and aborting the
program with a return code of 254 ($FE) instead of letting it crash.
Note that this check lets programs compiled with $G+ run on 80186 and
V20/V30 processors, since these have the ability to execute all 80286
real mode instructions produced by Turbo Pascal.
o Improved functionality only marginally increases overall code length
of the SYSTEM unit by 1244 bytes (about 6%). This is due to careful
optimizing in numerous routines. Most programs compiled with the new
RTL will be smaller due to finer granularity of the RTL modules.
Savings are usually in the 0.5 KB range for reasonably large programs.
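
The ULP figures quoted above for Exp, Arctan, and Ln can be made
concrete with a small sketch. The Python code below is not part of this
library and is not the ULPERR program. It only illustrates how an error
is expressed in units in the last place, under the assumption that a
REAL carries a 40-bit significand (39 stored bits plus an implicit
leading bit). The names REAL_BITS, ulp, and ulp_error are made up for
this illustration.

```python
import math
from fractions import Fraction

REAL_BITS = 40  # assumption: significand width of the 6-byte REAL type


def ulp(x, bits=REAL_BITS):
    """Spacing of floating point numbers with `bits` significand bits near x."""
    # frexp returns (m, e) with x = m * 2**e and 0.5 <= |m| < 1
    _, e = math.frexp(x)
    return Fraction(2) ** (e - bits)


def ulp_error(computed, reference, bits=REAL_BITS):
    """Absolute error of `computed`, measured in ULPs of `reference`."""
    err = abs(Fraction(computed) - Fraction(reference))
    return float(err / ulp(reference, bits))


# A result that is off by three steps of 2**-39 near 1.0 has an error of 3 ULPs
print(ulp_error(1.0 + 3 * 2.0 ** -39, 1.0))
```

Under this assumption one ULP corresponds to a relative error between
2^-40 and 2^-39, depending on where the value lies within its binade.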
Improvements in CRT module
--------------------------
o Bug fix in routine DirectWrite. The method used to prevent "snow"
when writing directly to a CGA graphics card was not entirely safe.
When used in a heavily interrupted program (e.g. serial communication
as a background task), it would not always write during the time
when scanning was in the invisible parts of the screen. The method
used now is 100% safe and is even faster, since it takes advantage
of the horizontal and vertical retrace periods, as opposed to the
old method which only used the horizontal retrace time. New routine
has been tested successfully on original IBM-CGA card.
II. Revision History
====================
Changes since version 1.8, dated 12-04-92
-----------------------------------------
o While previous versions of the replacement library had been optimized
for speed and size (in that order), version 1.9 has been tuned for
speed exclusively. All entries to library routines are now aligned on
double word boundaries, as are some timing-critical loops. This
optimization is aimed specifically at 386 and 486 processors. The
increased speed of this library version was achieved by putting in
additional code, so programs compiled with version 1.9 of the library
tend to be somewhat bigger than those compiled with version 1.8.
However, the increase in size is well in the range of a few hundred
bytes at most.
o Speed of REAL arithmetic transcendental functions has been increased by
about 20%.
o Speed of heap memory allocation/deallocation has been increased by
about 5%.
Changes since version 1.7, dated 10-12-92
-----------------------------------------
o Fixed bug in LONGINT division. If the dividend was -2147483648 and the
divisor 65536, the division routine would incorrectly return 32768
instead of -32768.
o Fixed bug in Str -> LONGINT conversion introduced in version 1.6. Because
of this bug, the valid input $80000000 would cause a runtime-error.
o Speed of the LONGINT -> Str conversion has been improved by a factor
of up to 2, depending on the CPU.
o Speed of LONGINT shift routines (SHL, SHR) has been improved by a factor
of up to 3, depending on the CPU.
o Speed of LONGINT division has been improved for large divisors.
o The integer Random function has been enhanced to deliver more evenly
distributed numbers. The algorithm used has been taken from BP 7.
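
Why an upper limit that is not a power of two can skew the distribution
is easy to see in a sketch. The Python code below is neither the
algorithm taken from BP 7 nor this library's code. It assumes a
hypothetical generator that reduces a 16-bit raw value with a plain
modulo, counts how often each result occurs, and contrasts that with
rejection sampling, one well-known way to remove the bias. RAW_RANGE,
modulo_counts, and unbiased_random are names invented here.

```python
from collections import Counter
import random

RAW_RANGE = 1 << 16  # assumption: the raw generator yields 0..65535


def modulo_counts(limit):
    """Count how often each result occurs if every raw value is taken mod limit."""
    return Counter(raw % limit for raw in range(RAW_RANGE))


def unbiased_random(limit, next_raw=lambda: random.getrandbits(16)):
    """Rejection sampling: discard raw values from the incomplete last block."""
    usable = RAW_RANGE - RAW_RANGE % limit  # largest multiple of limit
    while True:
        raw = next_raw()
        if raw < usable:
            return raw % limit


counts = modulo_counts(40000)
# Results below 25536 are hit by two raw values, all others by only one
print(counts[0], counts[39999])
# With a power of two every result is hit equally often
print(set(modulo_counts(4).values()))
```

Rejection sampling trades a small, variable number of extra raw draws
for an exactly uniform result, which fits the slight slowdown of integer
Random reported in section I, although the actual cause there may differ.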
Changes since version 1.6, dated 08-31-92
-----------------------------------------
o The Round function for REAL arithmetic would sometimes not return the
correct IEEE-rounded result due to wrong handling of the sticky flag.
This has been fixed.
o Size of the RTL has been reduced a bit.
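
The role of the sticky flag in the Round fix above can be shown with a
minimal sketch. This is generic round-to-nearest-or-even logic written
in Python on integer significands, not the library's assembly code. The
function name and its parameters are invented for this illustration.

```python
def round_nearest_even(sig, drop):
    """Round integer significand `sig` after discarding its low `drop` bits.

    guard  = most significant discarded bit
    sticky = True if any lower discarded bit is set
    """
    if drop <= 0:
        return sig
    kept = sig >> drop
    guard = (sig >> (drop - 1)) & 1
    sticky = (sig & ((1 << (drop - 1)) - 1)) != 0
    # Round up when above the halfway point, or exactly halfway with odd kept part
    if guard and (sticky or (kept & 1)):
        kept += 1
    return kept


print(round_nearest_even(0b1010100, 3))  # exact tie, kept part even: stays 10
print(round_nearest_even(0b1011100, 3))  # exact tie, kept part odd: becomes 12
print(round_nearest_even(0b1010101, 3))  # sticky set, above the tie: becomes 11
```

If the sticky information is lost, the third case is mistaken for an
exact tie and rounds to 10 instead of 11. A real implementation must
also renormalize when the increment carries out of the kept bits.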
Changes since version 1.5, dated 06-10-92
-----------------------------------------
o The Test8087 variable would sometimes contain the wrong value when
the environment contained the string 87=Y. This has been fixed.
o Speed and accuracy of the REAL arithmetic Sin and Cos functions have
been enhanced. The previously introduced limits for arguments to these
functions have been removed. Sin and Cos now accept all REAL numbers
as arguments without producing a run-time error, just like the routines
in the original TP 6.0 RTL. Due to the nature of the argument reduction
scheme now used, the accuracy of the Sin and Cos functions takes a
rather sharp drop if the arguments exceed an absolute value of about 1e8.
Outside this range, accuracy is about the same as for the Sin and Cos
from the original library. There is a gradual loss of precision towards
bigger arguments. For arguments with an absolute value of more than
about 5e12, there is a total loss of precision. Sin always returns 0,
Cos always returns 1 for arguments exceeding this limit.
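
Part of the precision loss for large arguments is imposed by the number
format itself, independent of any reduction scheme. The Python sketch
below does not describe the argument reduction used in this library. It
assumes a 40-bit significand for the REAL type (39 stored bits plus an
implicit leading bit) and compares the gap between neighbouring REAL
values with the period 2*Pi. The names REAL_BITS and real_spacing are
made up for this illustration.

```python
import math

REAL_BITS = 40  # assumption: significand width of the 6-byte REAL type


def real_spacing(x):
    """Gap between neighbouring REAL values near x, under the assumption above."""
    _, e = math.frexp(abs(x))  # abs(x) = m * 2**e with 0.5 <= m < 1
    return math.ldexp(1.0, e - REAL_BITS)


for x in (1.0, 1e8, 5e12):
    gap = real_spacing(x)
    print(f"x = {x:g}: spacing = {gap:g}, "
          f"fraction of period = {gap / (2 * math.pi):g}")
```

Near 5e12 the spacing is 8, which is larger than one full period of
about 6.28. Neighbouring REAL values then fall on unrelated points of
the sine curve, so no digit of a Sin or Cos result can be meaningful.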
Changes since version 1.4, dated 05-07-92
-----------------------------------------
o Fixed fatal bug in the Reset procedure. When applying Reset to an
already open file, the procedure would crash the program during
the automatic closing of the file before reopening it. This bug was
reported by Chris Dennis (dennis_c@kosmos.wcc.govt.nz). Thanks!
o Fixed bug in the Float->String conversion routine of the original RTL
that caused EXTENDED precision denormals to be printed as zero. The
conversion routine has been further enhanced to correctly print
unnormals on machines with a 387 or 486. Note that these processors
do not support unnormals, whereas the 8087 and 287 do. The original
RTL only prints unnormals correctly on machines that have a coprocessor
that supports this format. Since it is possible that unnormal numbers
stored by a 8087/287 are loaded into a 387/486, e.g. via a binary data
file, the conversion routine was changed to handle unnormals on all
processors. Note that due to this change, there is a difference in
printout between the original RTL and this RTL on machines with a
8087/287. The original RTL prints unnormals in an unnormalized format
e.g. -0.2777289e-4387, whereas this library prints unnormals in a
normalized format, e.g. -2.777289e-4388. The new conversion routine
can handle all EXTENDED encodings, even those not supported by the
387/486 processors [10]. Zeroes and Pseudozeros are printed as zero,
NaNs, Pseudo-NaNs, and Indefinite (a special NaN) are printed as NAN,
Infinities and Pseudoinfinities are printed as INF, unnormals are
printed the same way as normals/denormals after normalizing them as
much as possible.
o Improved speed and accuracy of REAL ArcTan function.
o Improved speed and accuracy of REAL Exp function.
o Improved speed of REAL addition/subtraction and Round/Trunc to match
more closely the speed of these routines in the original RTL. These
routines had suffered a drop in performance due to increased accuracy
requirements before.
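
The printing rules for EXTENDED encodings listed above can be sketched
as a small classifier. The Python code below is an illustration only,
not the library's conversion routine. It assumes the usual 10-byte
little-endian layout: a 64-bit significand with an explicit integer
bit, followed by a 15-bit exponent and the sign bit. The function name
and the returned labels are invented here.

```python
def classify_extended(raw):
    """Classify a 10-byte little-endian EXTENDED value for printing."""
    if len(raw) != 10:
        raise ValueError("EXTENDED values are 10 bytes long")
    significand = int.from_bytes(raw[:8], "little")  # explicit integer bit in bit 63
    sign_exp = int.from_bytes(raw[8:], "little")
    exponent = sign_exp & 0x7FFF
    integer_bit = significand >> 63
    fraction = significand & ((1 << 63) - 1)

    if exponent == 0x7FFF:
        # Infinity and pseudo-infinity differ only in the integer bit;
        # NaN, pseudo-NaN, and Indefinite all have a nonzero fraction
        return "INF" if fraction == 0 else "NAN"
    if significand == 0:
        # True zero (exponent 0) or pseudo-zero (exponent nonzero)
        return "zero"
    if exponent == 0:
        return "denormal"
    # Integer bit clear with a nonzero exponent marks an unnormal
    return "normal" if integer_bit else "unnormal"


one = (0x8000000000000000).to_bytes(8, "little") + (0x3FFF).to_bytes(2, "little")
print(classify_extended(one))
```

A value classified as "unnormal" would then be shifted left until its
integer bit is set, with the exponent decremented accordingly, before
being printed like a normal number.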
Changes since version 1.3, dated 05-01-92
-----------------------------------------
o Fixed bug that set Test8087 variable incorrectly for 8087/80287
coprocessors.
o Fixed bug in REAL Ln function error return. It would return the biggest
possible REAL number before if called with an argument <= 0. It now
correctly raises error 207 (invalid floating-point operation).
Changes since version 1.2, dated 11-01-92
-----------------------------------------
o Fixed bug in Rename procedure. Due to this error, Rename would not
work at all, but always return with error code 3 (path not found).
This has been corrected. This error was reported by ShinKuang Chang
(skchang@csemail.cropsci.ncsu.edu). Thanks!
o Cleaned up the source code of the ELEFUNT test programs a bit. Since
these programs were ported from the FORTRAN original in a 'quick and
dirty' way, they were looking quite messy.
Changes since version 1.1 beta, dated 04-01-92
----------------------------------------------
o The Round and Trunc functions were unable to produce the smallest
LONGINT number, -2147483648. If a call to these functions resulted
in this number, an error was raised instead of returning the correct
result. This has been fixed. Valid inputs to the Trunc functions are
REAL numbers x for which -2147483649 < x < 2147483648 holds. Valid
inputs to the Round function are numbers x for which -2147483648.5
<= x < 2147483647.5 holds.
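
The valid input ranges given above follow directly from truncation and
from rounding to nearest or even. The Python sketch below is not the
library code. It models run-time error 207 with a hypothetical
exception class and relies on Python's built-in round(), which also
resolves ties to the even integer.

```python
import math

LONGINT_MIN = -2147483648
LONGINT_MAX = 2147483647


class RuntimeError207(Exception):
    """Stand-in for run-time error 207 (invalid floating-point operation)."""


def trunc_longint(x):
    """Truncate toward zero; valid for -2147483649 < x < 2147483648."""
    if not (LONGINT_MIN - 1 < x < LONGINT_MAX + 1):
        raise RuntimeError207(x)
    return math.trunc(x)


def round_longint(x):
    """Round to nearest or even; valid for -2147483648.5 <= x < 2147483647.5."""
    if not (LONGINT_MIN - 0.5 <= x < LONGINT_MAX + 0.5):
        raise RuntimeError207(x)
    return round(x)


print(trunc_longint(-2147483648.9))  # fraction is cut off toward zero
print(round_longint(-2147483648.5))  # tie resolves to the even neighbour
```

The lower bound of Round is inclusive because the tie at -2147483648.5
resolves to the even value -2147483648. The upper bound is exclusive
because the tie at 2147483647.5 would resolve to 2147483648, which does
not fit into a LONGINT.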
Changes since first beta release (version 1.0, dated 03-21-92)
--------------------------------------------------------------
o Fixed bug in the routine that adds variable ranges to sets (as in
s := [foo..bar], where s is a set and foo and bar are variables of
the set's base type).
o Switched back code in the REAL add/subtract routine to plain 8086
code. Forgot to remove the use386 flag when building code for the
original release 1.0 beta.
Changes since the alpha release
-------------------------------
o There was an error in the 8087 float to string conversion in the
alpha release which has been fixed.
o A bug present in the alpha release in the coprocessor identification,
which sets the Test8087 variable, has been fixed.
o For string -> LONGINT conversion, it is now possible to input the
smallest LONGINT number -2147483648 in decimal.
o An enhanced argument reduction has been implemented for the REAL arithmetic
SIN, COS, and EXP functions, delivering much more accurate results over
the complete argument range. This has slowed these functions down
somewhat; however, none of them runs slower than in the original TP 6.01
RTL. As a result of the new argument reduction, arguments to SIN and
COS are restricted to the range -3.37325e9..3.37325e9 now. Arguments to
these functions were previously unrestricted. For arguments outside the
range given, an error 207 will result. This is consistent with the
coprocessor/emulator generated SIN/COS functions, that also signal
error 207 for arguments out of range (-9.22337e18..9.22337e18).
o SIN, COS, and ARCTAN functions compiled in the $N+ state will now use
the faster coprocessor instructions available on the 387 and 486 if
such a coprocessor/FPU is present.
o A check has been included to prevent programs compiled with $G+ (286 code
generation) from running on an 8086.
o The Random function has been fixed to return values strictly smaller than
1 when compiled with $N+.
III. References
===============
[1] IEEE: IEEE Standard for Binary Floating-Point Arithmetic.
SIGPLAN Notices, Vol. 22, No. 2, 1985, pp. 9-25
[2] IEEE Standard for Binary Floating-Point Arithmetic.
ANSI/IEEE Std 754-1985.
New York, NY: Institute of Electrical and Electronics Engineers 1985
[3] Karpinski, R.: Paranoia: A Floating-Point Benchmark.
Byte, February 1985, pp. 223-235
[4] Curnow, H.J.; Wichmann, B.A.: A synthetic benchmark.
Computer Journal, Vol. 19, No. 1, 1976, pp. 43-49
[5] Wichmann, B.A.: Validation code for the Whetstone benchmark.
NPL Report DITC 107/88, National Physical Laboratory, UK, March 1988
[6] Cody, W.J.; Waite, W.: Software Manual for the Elementary Functions.
Englewood Cliffs, NJ: Prentice Hall 1980
[7] McMahon, F.H.: The Livermore Fortran Kernels: A Test of the Numerical
Performance Range.
Technical Report UCRL-53745, Lawrence Livermore National Laboratory,
December 1986, p. 179
[8] Weicker, R.P.: Dhrystone: A Synthetic Systems Programming Benchmark.
Communications of the ACM, Vol. 27, No. 10, October 1984, pp. 1013-1030
[9] Weicker, R.P.: Dhrystone Benchmark: Rationale for Version 2 and
Measurement Rules.
SIGPLAN Notices, Vol. 23, No. 8, August 1988, pp. 49-62
[10] 387DX User's Manual, Programmer's Reference. Intel 1989
Note:
PARANOIA, DHRYSTONE, WHETSTONE, LLL, and ELEFUNT source code is
available from NETLIB@ORNL.GOV