- Path: sparky!uunet!cs.utexas.edu!qt.cs.utexas.edu!yale.edu!ira.uka.de!uka!uka!news
- From: S_JUFFA@iravcl.ira.uka.de (|S| Norbert Juffa)
- Newsgroups: comp.lang.pascal
- Subject: Turbo Pascal 6.0 bug list (long!)
- Date: 2 Sep 1992 13:41:57 GMT
- Organization: University of Karlsruhe (FRG) - Informatik Rechnerabt.
- Lines: 2259
- Distribution: world
- Message-ID: <182gb5INNsm4@iraul1.ira.uka.de>
- NNTP-Posting-Host: irav21.ira.uka.de
- X-News-Reader: VMS NEWS 1.23
-
- ++++++++++++++++++++++ Bug List TURBO-Pascal 6.0 ++++++++++++++++++++++++++++++
-
- This list is a compilation of all the bug reports I (Norbert Juffa, email:
- S_JUFFA@IRAVCL.IRA.UKA.DE) sent to Borland between 10-01-90 and 07-28-92
- regarding bugs in Turbo-Pascal 6.0 that have not been fixed up till now.
- There were more bugs in the original release of TP 6.0, which Borland fixed
- in a subsequent release of TP 6.0, so these are not included in this list.
- For a more complete list of Turbo Pascal 6.0 bugs, look for the one that
- Duncan Murdoch (dmurdoch@mast.queensu.ca) publishes irregularly on the
- Internet. If you find any bug in TP 6.0, be it in the compiler, run-time
- library or Turbo Vision, please send a description of the bug to Duncan.
- Include a demonstration program that reliably reproduces the bug whenever
- possible.
-
- +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
-
-
- 1. Error in coprocessor underflow exception handler on i8087
-
- This bug was also present in version 5.5 of the compiler.
- It can lead to some really strange behavior in programs that
- operate on the IEEE temporary floating point format (EXTENDED)
- when the program is executed on an 8087 or 80287. This error
- is not reproducible on an 80387, 80287XL or 80486. The bug may
- be demonstrated by running the following program with an
- 8087/287 coprocessor:
-
-
- {$A+,B+,D+,E-,F-,G-,I+,L+,N+,O-,R+,S+,V+,X-}
- {$M 16384,0,655360}
-
- PROGRAM Bug87; { demonstrates some strange behavior on 8087/287 }
-
- VAR X: EXTENDED; { allows storing of denormal }
- L: WORD;
-
- BEGIN
- WriteLn ('Turbo-Pascal 6.0 floating point exception bug demo program');
- WriteLn;
- WriteLn ('Continously dividing 4e-4932 by 1.1...');
- WriteLn;
- X := 4e-4932; { close to smallest normalized EXTENDED number }
- FOR L := 1 TO 5 DO BEGIN
- X := X / 1.1;
- Write (X:25);
- IF L > 1 THEN { after 1st iter. underflow w/ flush to zero }
- WriteLn (' should be: ', 0.0:25)
- ELSE
- WriteLn (' should be: ', X:25);
- END;
- END. {Bug87}
-
-
- The output of this program will look like this when executed on a
- system with an 8087/287 coprocessor:
-
-
- Turbo-Pascal 6.0 floating point exception bug demo program
-
- Continously dividing 4e-4932 by 1.1...
-
- 3.6363636363636364E-4932 should be: 3.6363636363636364E-4932
- 0.0000000000000000E+0000 should be: 0.0000000000000000E+0000
- 1.1000000000000000E+0000 should be: 0.0000000000000000E+0000
- 1.0000000000000000E+0000 should be: 0.0000000000000000E+0000
- 9.0909090909090909E-0001 should be: 0.0000000000000000E+0000
-
-
- The bug cannot be demonstrated using the coprocessor emulator,
- which does not exhibit this error. The problem is in the denormal
- exception handler for the coprocessor. Since denormalized numbers
- are not supported by Turbo-Pascal, whenever a denormal is loaded
- from memory into the coprocessor, it is changed to a true zero
- (so-called "flush to zero" response). Denormals can only be stored
- to an EXTENDED type variable. When stored to DOUBLE or SINGLE type
- variables, the rounding provided by the coprocessor will generate
- zero.
-
- On loading the denormal, the coprocessor raises the denormal and
- underflow exceptions. Since the denormal exception is unmasked by
- the Turbo-Pascal start-up code, the appropriate hardware interrupt
- (INT 02, 'NMI' on a PC type machine) is executed. The INT 02 handler
- of Turbo-Pascal 6.0 then performs the following steps:
- First, it saves the coprocessor state using the FSTENV instruction
- of the coprocessor. The saved state includes the control word, the
- status word, the tag word, and the instruction pointer and opcode
- of the instruction causing the exception.
- Second, it does some analysis to figure out what kind of exception
- was the cause of the interrupt, since all coprocessor exceptions
- trap through the same interrupt.
- Third, once the handler is sure that a denormal triggered the
- interrupt, it empties the top of stack register (TOS) of the
- coprocessor, which contains the unwanted denormal, by executing
- an FSTP ST(0). After discarding the denormal, it loads a true zero
- with FLDZ.
- Finally, the handler exits through its standard exit, using FLDENV
- to restore the coprocessor environment. This is where things start
- to go wrong on an 8087/287. Although the TOS now contains zero,
- the associated tag code for that register still holds a 'special'
- tag, because the old tag word was reloaded, thus mirroring the
- state of the coprocessor before the coprocessor exception trap was
- taken. The tag code for the TOS should now contain a 'zero' tag.
- Register contents and coprocessor tag word are not consistent at
- this stage. This causes the coprocessor to ignore the next
- instruction involving that register, which in the above example
- program is an FDIVP instruction. The divisor will be left on the
- coprocessor stack and stored to memory instead of the quotient, thus
- giving 1.1 as a result in the above demonstration program.
- The error described should only occur on the 8087/80287, not on
- the 80387. From the 80387 on, Intel coprocessors examine tag codes
- only to distinguish empty ('11') from nonempty ('00', '01', '10')
- registers.
-
- This error should be quite easy to fix. After discarding the
- denormal and replacing it with zero, the saved NDP state's
- memory image must be changed to reflect the new register
- contents. First, extract the value of ST from the saved
- status word memory image. Then generate a mask for the TOS's
- tag code such that subtracting it from the saved tag word
- memory image will decrement the tag code for the TOS from '10'
- (= SPECIAL) to '01' (= ZERO). The code might look like this:
-
- .
- FSTP ST(0) ; dispose unwanted denormal
- FLDZ ; load zero instead
- PUSH CX ; only DS, AX, and BX saved so far
- MOV CL, [SavedStatusWord+1]; get saved NDP status word MSB
- AND CL, 00111000b ; extract stack top field (0..56)
- SHR CL, 1 ; generate rotate
- SHR CL, 1 ; counter (0,2,4,..14)
- MOV AX, 1 ; load initial mask
- ROL AX, CL ; generate correct number for TOS
- SUB [SavedTagWord], AX ; correct TOS tag code
- POP CX ; don't need it any longer
- .
- .
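-
- The effect of the proposed patch on the saved tag word can be checked
- with a small model. The Python sketch below is not part of the original
- report: the function name fix_tos_tag and the sample register values
- are assumptions chosen for illustration. It only mirrors the integer
- arithmetic of the assembly fragment above.

```python
def fix_tos_tag(status_word: int, tag_word: int) -> int:
    """Model of the proposed patch: change the top-of-stack tag from
    '10' (special) to '01' (zero) in a saved 16-bit tag word."""
    cl = (status_word >> 8) & 0x38      # TOP field in the status word MSB (0..56)
    cl >>= 2                            # two SHR CL,1: shift count 0, 2, .., 14
    mask = 1 << cl                      # MOV AX,1 / ROL AX,CL
    return (tag_word - mask) & 0xFFFF   # SUB [SavedTagWord],AX


# Hypothetical saved state: TOP = 7, register 7 tagged 'special' (10),
# all other registers tagged 'empty' (11).
status = 7 << 11
tags = 0xBFFF
print(f"{fix_tos_tag(status, tags):04X}")   # prints 7FFF: register 7 now 'zero' (01)
```

- The subtraction is only valid when the TOS tag really is '10', which
- is the case the handler has established before reaching this code.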
-
-
- 2. Bugs in the handling of denormal numbers of types SINGLE, DOUBLE,
- EXTENDED (IEEE numeric data types)
-
- One of the important features of the IEEE-754 Standard for Binary
- Floating Point Arithmetic [1] is that it demands the implementation
- of denormal numbers. If the result of a computation cannot be
- represented as a normalized number (biased exponent > 0,
- 1 <= mantissa < 2) there is an *underflow* condition. The required
- response from an IEEE-754 compliant system to an underflow is the
- generation of a denormal number, if that is at all possible (the
- number could be too small to be represented even as a denormal).
- Denormal numbers have a biased exponent of zero, the mantissa
- can have any value between 01....0 and 00....1. Turbo Pascal 6.0
- generally supports denormal numbers for its IEEE floating point
- types SINGLE, DOUBLE, and EXTENDED. However, there are some bugs
- and undocumented features involved in the handling of denormals.
-
- 1) Coprocessor emulator does not support denormal numbers
- Presumably for performance reasons, support for denormals
- of any IEEE data types is not included in the coprocessor
- emulator. If the result of an operation cannot be represented
- as a normalized number, the emulator will convert it to zero
- ("flush to zero" response). This is one of the many instances
- where the emulator differs from a real coprocessor and which
- is *not* documented in the manuals. The emulator's inability
- to support denormals can result in inexplicable differences
- in the results of the same computation depending on whether
- the emulator or a real coprocessor was used. There are
- real applications that underflow quite frequently, so the
- support of gradual underflow with the help of denormals can
- make for differences in the final results [2]. The inability
- of the emulator to support denormals should be flagged as a
- bug.
-
- 2) There is an undocumented quirk in the handling of denormalized
- numbers when using a program compiled with $N+ with a real
- coprocessor. On an 8087 and the original 80287, denormal numbers
- are only supported for the SINGLE and DOUBLE types; EXTENDED
- type denormals are flushed to zero upon being loaded. On the
- 80287XL, 80387, and 80486 denormals are supported for SINGLE,
- DOUBLE and EXTENDED. This difference is due to differences in
- the coprocessors and how Turbo Pascal initializes them.
- On an 8087/80287 loading a denormal of any data type will raise
- a denormalized number exception. For an 8087/287 TP has this
- exception unmasked. The exception response for the 8087/287
- in TP is to normalize SINGLE and DOUBLE numbers in the
- internal (EXTENDED) format (note that the 8087/287 do not do
- this automatically) and to flush EXTENDED denormals to zero
- (probably because denormals in the internal format give rise
- to further complications on an 8087/287, especially since the
- denormal exception is unmasked, which it has to be to enable
- at least correct operation on SINGLE and DOUBLE denormals).
- On an 80287XL/387/486 loading a SINGLE or DOUBLE denormal will
- raise a denormalized number exception, while loading an EXTENDED
- denormal will *not* raise that exception [3]. For these
- coprocessors, TP has the denormal exception masked. The masked
- response to a denormal exception on these coprocessors is to
- automatically normalize SINGLE and DOUBLE denormals in the
- internal (EXTENDED) format. Since the denormal exception is
- masked and the handling of denormals has been enhanced for
- the 287XL/387/486, operations on EXTENDED denormals are safe
- on these coprocessors.
- The approach chosen for TP is reasonable. Computational results
- can vary depending on the coprocessor used, though. Therefore,
- the different handling of denormals for 8087/287 and 80287XL/
- 387/486 must be documented in TP's manuals.
-
- 3) There is a bug in the float to string conversion routine in
- the TP 6.0 run time library that causes EXTENDED denormals
- to be printed as zero, even on coprocessors where EXTENDED
- denormals are supported by TP 6.0. This is caused by an
- incomplete test that results in everything with a zero
- exponent being printed as zero. Unfortunately, EXTENDED
- denormals also have a zero exponent. The conversion routine
- can be enhanced to correctly print out EXTENDED denormals
- with very little additional code and only marginally increased
- timing overhead for the normalized number conversion.
-
- 4) The compiler does not allow typed constants of types
- SINGLE, DOUBLE, and EXTENDED to be initialized with a constant
- representing a denormal. Without warning, zero is assigned
- to these constants. For example the declaration
-
- CONST Foo: SINGLE = 1e-40;
- Bar: DOUBLE = 1e-320;
-
- results in zero being stored to typed constants Foo and Bar.
- This problem is probably caused by the use of the emulator
- routines to compile programs with the $N+ flag. Since this
- behavior is not documented in the manuals and no warning is
- given during compilation, it is considered a bug.
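-
- That the two constants Foo and Bar above are representable as denormals
- can be checked outside Turbo Pascal. The following Python sketch is an
- illustration only (the function names are made up for this example);
- it rounds a value to the IEEE SINGLE or DOUBLE format with the standard
- struct module and inspects the biased exponent and mantissa fields.

```python
import struct


def classify(exponent: int, fraction: int, max_exponent: int) -> str:
    """Classify an IEEE encoding from its biased exponent and fraction field."""
    if exponent == 0:
        return "zero" if fraction == 0 else "denormal"
    if exponent == max_exponent:
        return "inf or nan"
    return "normal"


def classify_single(x: float) -> str:
    """Round x to IEEE single precision and classify the result."""
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    return classify((bits >> 23) & 0xFF, bits & ((1 << 23) - 1), 0xFF)


def classify_double(x: float) -> str:
    """Classify x as stored in IEEE double precision."""
    (bits,) = struct.unpack("<Q", struct.pack("<d", x))
    return classify((bits >> 52) & 0x7FF, bits & ((1 << 52) - 1), 0x7FF)


print("Foo = 1e-40  as SINGLE:", classify_single(1e-40))    # denormal
print("Bar = 1e-320 as DOUBLE:", classify_double(1e-320))   # denormal
```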
-
- The following program tests the support for denormals and the
- correct printing of denormals:
-
- {$N+,E+}
-
- PROGRAM DenormTst;
-
- VAR E: EXTENDED;
- D: DOUBLE;
- S: SINGLE;
-
- BEGIN
- WriteLn ('Testing support and printing of denormals');
- WriteLn;
- Write ('Coprocessor is: ');
- CASE Test8087 OF
- 0: WriteLn ('Emulator');
- 1: WriteLn ('8087 or compatible');
- 2: WriteLn ('80287 or compatible');
- 3: WriteLn ('80387 or compatible');
- END;
- WriteLn;
- S := 1.18e-38;
- S := S * 3.90625e-3;
- IF S = 0 THEN
- WriteLn ('SINGLE denormals not supported')
- ELSE BEGIN
- WriteLn ('SINGLE denormals supported');
- WriteLn ('SINGLE denormal prints as: ', S);
- WriteLn ('Denormal should be printed as 4.60943...E-0041');
- END;
- WriteLn;
- D := 2.24e-308;
- D := D * 3.90625e-3;
- IF D = 0 THEN
- WriteLn ('DOUBLE denormals not supported')
- ELSE BEGIN
- WriteLn ('DOUBLE denormals supported');
- WriteLn ('DOUBLE denormal prints as: ', D);
- WriteLn ('Denormal should be printed as 8.75...E-0311');
- END;
- WriteLn;
- E := 3.37e-4932;
- E := E * 3.90625e-3;
- IF E = 0 THEN
- WriteLn ('EXTENDED denormals not supported')
- ELSE BEGIN
- WriteLn ('EXTENDED denormals supported');
- WriteLn ('EXTENDED denormal prints as: ', E);
- WriteLn ('Denormal should be printed as 1.3164...E-4934');
- END;
- END.
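-
- The "should be printed as" values for SINGLE and DOUBLE in the program
- above can be cross-checked on any machine with IEEE arithmetic. The
- Python sketch below is an assumption-laden illustration, not the
- author's method: Python floats are IEEE doubles, SINGLE is modelled by
- rounding through the struct module, and EXTENDED cannot be modelled
- this way, so it is left out.

```python
import struct


def to_single(x: float) -> float:
    """Round a Python float (IEEE double) to the nearest IEEE single."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


# S := 1.18e-38; S := S * 3.90625e-3;  (3.90625e-3 is exactly 1/256)
s = to_single(to_single(1.18e-38) * 3.90625e-3)
# D := 2.24e-308; D := D * 3.90625e-3;
d = 2.24e-308 * 3.90625e-3

print(f"SINGLE denormal: {s:.5e}")   # 4.60943e-41
print(f"DOUBLE denormal: {d:.2e}")   # 8.75e-311
```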
-
- References:
-
- [1] IEEE Standard for Binary Floating-Point Arithmetic.
- ANSI/IEEE Std 754-1985.
- New York, NY: Institute of Electrical and Electronics Engineers
- 1985
- [2] Goldberg, D.: Computer Arithmetic.
- In: Hennessy, J.L; Patterson, D.A: Computer Architecture - A
- Quantitative Approach. San Mateo, CA: Morgan Kaufmann 1990
- Page A-27
- [3] Intel: 387DX User's Manual. Programmers Reference. Intel 1989
-
-
-
- 3. Error in EXTENDED to string conversion (Str, Write)
-
- There is an error in the internal conversion routine Float2Str
- that converts an EXTENDED number to a string of decimal digits.
- This bug causes some NANs to be printed as INFs. The code in
- the routine fails to do a complete check on the mantissa to
- figure out if it is an INF. It checks only the sixteen most
- significant mantissa bits. Therefore, NANs with a mantissa between
- 8000000000000001h and 8000FFFFFFFFFFFFh are printed as INF.
- The following program demonstrates the bug:
-
- {$N+,E+}
- PROGRAM INFBug;
-
- VAR X: EXTENDED;
- XA: ARRAY [1..5] OF WORD ABSOLUTE X;
-
- BEGIN
- WriteLn ('Testing correct printing of NANs');
- XA [5] := $7FFF;
- XA [4] := $8000;
- XA [3] := $0000;
- XA [2] := $0000;
- XA [1] := $0001;
- WriteLn ('First NAN (7FFF 8000 0000 0000 0001) prints as: ', X);
- XA [5] := $FFFF;
- XA [4] := $8000;
- XA [3] := $0000;
- XA [2] := $8000;
- XA [1] := $0000;
- WriteLn ('Second NAN (FFFF 8000 0000 8000 0000) prints as: ', X);
- XA [5] := $7FFF;
- XA [4] := $8000;
- XA [3] := $4000;
- XA [2] := $0000;
- XA [1] := $0000;
- WriteLn ('Third NAN (7FFF 8000 4000 0000 0000) prints as: ', X);
- END.
-
-
-
- The following is an excerpt from the Float2Str routine showing the
- faulty code that causes the bug:
-
- CMP AX,7FFFH ; INF or NAN ? (check exponent)
- JNE @@10 ; no, Normal
- CMP Value.w6,8000H ; INF ? <----- incomplete check !
- JE @@3 ; yes, print INF
- MOV AX,'AN' ; no, print NAN
- STOSW
- MOV AL,'N'
- STOSB
-
- This fragment should be replaced by the following code which
- eliminates the bug:
-
- CMP AX,7FFFH ; NAN or INF ? (check exponent)
- JNE @@10 ; no, Normal
- MOV DX, Value.w0 ; if any of
- OR DX, Value.w2 ; last 48 mantissa bits
- OR DX, Value.w4 ; is not zero,
- JNZ @@3a ; must be NAN
- CMP Value.w6, 8000H ; INF ?
- JE @@3 ; yes, print INF
- @@3a:
- MOV AX,'AN' ; no, print NAN
- STOSW
- MOV AL,'N'
- STOSB
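-
- The difference between the two checks can be modelled in a few lines.
- This Python sketch is an illustration only: the function names and the
- tuple layout (the five words of an EXTENDED, least significant first,
- like XA[1..5] in the demo program) are assumptions, and the sign bit is
- assumed to have been masked off before the exponent comparison.

```python
def classify_buggy(words):
    """Incomplete check: only the top 16 mantissa bits (w6) are examined."""
    w0, w2, w4, w6, sign_exp = words
    if sign_exp & 0x7FFF != 0x7FFF:
        return "finite"
    return "INF" if w6 == 0x8000 else "NAN"


def classify_fixed(words):
    """Complete check: the low 48 mantissa bits must be zero for an INF."""
    w0, w2, w4, w6, sign_exp = words
    if sign_exp & 0x7FFF != 0x7FFF:
        return "finite"
    if (w0 | w2 | w4) != 0:
        return "NAN"
    return "INF" if w6 == 0x8000 else "NAN"


# The three NANs of the demo program, followed by a true infinity.
samples = [
    (0x0001, 0x0000, 0x0000, 0x8000, 0x7FFF),
    (0x0000, 0x8000, 0x0000, 0x8000, 0xFFFF),
    (0x0000, 0x0000, 0x4000, 0x8000, 0x7FFF),
    (0x0000, 0x0000, 0x0000, 0x8000, 0x7FFF),
]
for sample in samples:
    print(classify_buggy(sample), classify_fixed(sample))
```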
-
-
-
- 4. Bug in string -> LONGINT conversion
-
- The VAL and READ procedures for LONGINTs do not allow the smallest
- LONGINT number -2147483648 (-2^31) to be read in decimal form. It
- can be entered in hexadecimal form $80000000 though. VAL and READ
- should be changed to allow all valid LONGINTs to be read, especially
- since this would not slow down the conversion process if done properly.
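-
- One way to accept the full LONGINT range without extra cost is to make
- the magnitude limit depend on the sign. The Python sketch below shows
- the idea; it is a hypothetical routine written for this illustration
- (val_longint is not a Turbo Pascal name) and says nothing about how
- the run-time library actually performs the conversion.

```python
def val_longint(text: str) -> int:
    """Parse a decimal LONGINT; raise ValueError if malformed or out of range."""
    s = text.strip()
    negative = s.startswith("-")
    digits = s[1:] if s and s[0] in "+-" else s
    if not digits or any(ch not in "0123456789" for ch in digits):
        raise ValueError("not a decimal integer: %r" % text)
    # Negative numbers may reach 2^31, nonnegative ones only 2^31 - 1.
    limit = 2**31 if negative else 2**31 - 1
    magnitude = 0
    for ch in digits:
        magnitude = magnitude * 10 + (ord(ch) - ord("0"))
        if magnitude > limit:
            raise ValueError("out of LONGINT range: %r" % text)
    return -magnitude if negative else magnitude


print(val_longint("-2147483648"))   # accepted: smallest LONGINT
print(val_longint("2147483647"))    # accepted: largest LONGINT
```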
-
-
-
- 5. Bug in Random function for $N+ state
-
- The Random function in programs compiled with $N+ can return the
- number 1, although Random is specified to deliver values strictly
- smaller than 1. This error occurs since the unsigned 32-bit integer
- delivered by the random number generator is read into the coprocessor
- as a signed 32-bit integer. To avoid negative numbers, the absolute
- value is then taken before the number is divided by 2^31. If,
- however, the 32-bit integer delivered by the random number generator
- is 80000000h, it will be converted to 2^31 by taking the absolute
- value in the coprocessor. Division by 2^31 will then return 1. The
- Random routine should be changed to read the 32-bit integers in
- a 64-bit format, thus avoiding the negative number problem and the
- FABS.
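-
- The arithmetic of the failure is easy to reproduce. The Python sketch
- below models the conversion described above; the function names are
- invented for this example, and the divisor 2^32 in the corrected
- variant is an assumption, since the report does not state which scale
- factor the fixed routine would use.

```python
def random_n_plus_buggy(raw: int) -> float:
    """raw is the unsigned 32-bit value from the generator (0 .. 2^32 - 1)."""
    signed = raw - 2**32 if raw >= 2**31 else raw   # loaded as signed 32-bit
    return abs(signed) / 2**31                      # FABS, then divide by 2^31


def random_n_plus_fixed(raw: int) -> float:
    """Loaded as a nonnegative 64-bit integer, so no FABS is needed.
    Dividing by 2^32 is an assumption of this sketch."""
    return raw / 2**32


print(random_n_plus_buggy(0x80000000))   # 1.0, outside the specified range
print(random_n_plus_fixed(0x80000000))   # 0.5
```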
-
-
-
- 6. Differences in compile-time and run-time evaluation of certain functions
-
- It seems that at least the Round function behaves differently at
- compile time as compared to run time evaluation as demonstrated
- by the following program:
-
- {$N+,E+}
- PROGRAM RoundBug;
-
- CONST Y = 4.5;
- J = Round (Y);
-
- VAR I: INTEGER;
- X: EXTENDED;
-
- BEGIN
- X := Y;
- I := Round (X);
- WriteLn (I:5, J:5);
- END.
-
- One would expect I and J to be equal in the output, but actual
- output is:
-
- 4 5
-
- The problem here is that the run-time version of Round uses the
- Coprocessor/Emulator which provides correct IEEE-754 rounding to
- nearest or even, while the compile time version of Round uses the
- REAL software arithmetic with simple round to nearest or up. The
- problem can easily be solved by implementing correct IEEE style
- rounding for REAL arithmetic, which is strongly recommended
- regardless of the bug given above.
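-
- The two rounding rules differ only for exact ties. A minimal Python
- sketch of both follows; the function names are made up, and the
- "nearest or up" rule is modelled as floor(x + 0.5), which is an
- assumption because the report does not say how the REAL arithmetic
- treats ties of negative values.

```python
import math


def round_half_even(x: float) -> int:
    """IEEE-754 round to nearest, ties to even (Python's round() does this)."""
    return round(x)


def round_half_up(x: float) -> int:
    """Simple rounding: ties go to the next larger integer (assumed rule)."""
    return math.floor(x + 0.5)


for value in (4.5, 5.5, 4.4, 4.6):
    print(value, round_half_even(value), round_half_up(value))
```

- For 4.5 the first function yields 4 and the second 5, which is the
- discrepancy between I and J shown above; for 5.5 both yield 6.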
-
-
-
- 7. Documentation enhancement needed with regard to Sin/Cos functions
-
- When in $N+ mode, a call to the Sin and Cos functions with an
- argument whose absolute value is > 2^63 = 9.22e18 will result
- in an error. This makes sense, since a total loss of precision
- will occur outside this range. However, the current version of
- the documentation does not document this type of error.
- In addition, the REAL type software arithmetic ($N-) will return
- zero for the sine and cosine of large arguments. It does not
- raise an error. When doing REAL computations in $N+ mode, an
- error occurs in these cases. This difference in REAL arithmetic
- between $N+ and $N- mode should be explained in the TP 6.0 manuals.
-
-
-
- 8. Errors in REAL type software arithmetic
-
- There is an error in the REAL-Add/Subtract routine of the TP 6.0
- runtime library that may cause results to be less accurate
- than would be possible. Before shifting the smaller operand's
- mantissa to the right for alignment prior to mantissa addition,
- a test is performed whether the shift would be for more than
- 39 bits. It is assumed that any shift count >= 40 would make
- the second operand so small that it cannot affect the result.
- This was true as long as Turbo-Pascal truncated results in REAL
- arithmetic, which was the case up to and including version 5.0
- of the compiler. Due to the rounding introduced with TP 5.5
- the above assumption no longer holds. Because of rounding,
- which takes place at the 41st mantissa bit, a carry may
- propagate to the significant 40th bit of the final result.
- Therefore, the mantissa addition should be skipped only when
- the shift count needed for alignment is greater than or equal
- to 41.
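-
- A case with a shift count of exactly 40 shows why the threshold
- matters. The Python sketch below is a simplified model, not the
- library routine: operands are positive pairs (mantissa, exponent) with
- a 40-bit integer mantissa and value mantissa * 2^exponent, the first
- operand is assumed to be the larger one, and ties are rounded up. All
- names are hypothetical.

```python
def real_add(a, b, skip_threshold):
    """Add two positive 40-bit floating point numbers, a >= b."""
    (ma, ea), (mb, eb) = a, b
    shift = ea - eb
    if shift >= skip_threshold:          # smaller operand considered negligible
        return a
    total = (ma << shift) + mb           # exact sum in units of 2^eb
    drop = max(total.bit_length() - 40, 0)
    if drop:
        half = 1 << (drop - 1)
        rounded = (total + half) >> drop  # round at the 41st bit
        if rounded == 1 << 40:            # carry out of the mantissa
            rounded >>= 1
            drop += 1
        total = rounded
    return (total, eb + drop)


one = (1 << 39, -39)            # 1.0
small = (3 << 38, -79)          # 1.5 * 2^-40, i.e. 0.75 units in the last place of 1.0

print(real_add(one, small, 40))   # addition skipped: result stays 1.0
print(real_add(one, small, 41))   # correctly rounded: 1.0 + 2^-39
```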
-
- There are constant errors in the REAL-Exp and REAL-Ln routines
- that cause some unnecessary inaccuracies in the function results.
- The constant Sqrt(2) in EXP should be coded as 81 FA 33 F3 04 35,
- not as 81 FB 33 F3 04 35, since the mantissa to six bytes of
- accuracy would be 0.3504F333F9DE. Likewise, the constant 0.5*Sqrt(2)
- used by LN should be 80 FA 33 F3 04 35.
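-
- The corrected byte sequences can be recomputed independently. The
- Python sketch below encodes a positive number in the 6-byte REAL
- layout implied by the constants above (exponent byte first with a bias
- of 129, then the mantissa with the least significant byte first and
- the sign in the top bit of the last byte). The helper name
- encode_real48 is made up, and the sketch handles positive normalized
- values only.

```python
import math


def encode_real48(x: float) -> bytes:
    """Encode a positive number as a 6-byte Turbo Pascal REAL."""
    m, e = math.frexp(x)            # x = m * 2**e with 0.5 <= m < 1
    mant = round(m * 2**40)         # 40 significant bits, rounded to nearest
    if mant == 1 << 40:             # rounding carried into the next binade
        mant >>= 1
        e += 1
    field = mant - (1 << 39)        # drop the implied leading 1; sign bit = 0
    return bytes([e + 128]) + field.to_bytes(5, "little")


print(encode_real48(math.sqrt(2)).hex(" ").upper())         # 81 FA 33 F3 04 35
print(encode_real48(0.5 * math.sqrt(2)).hex(" ").upper())   # 80 FA 33 F3 04 35
```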
-
- The REAL-Exp function returns with a runtime error 205 (overflow),
- when called with an argument smaller than about -88.029. However,
- even an argument of -88.72 would still deliver a result bigger than
- the smallest normalized REAL number, which is 2^-128 or 2.94e-39.
- Thus, Exp does not make use of the available argument range. Exp
- should not abort with an error when called with very small arguments
- anyhow. Since the exponential function approaches zero as the argument
- approaches negative infinity, it should simply return zero when the
- result is too small to be represented as a normalized real number.
- This behavior of Exp would be consistent with the rest of the REAL
- operations, since REAL arithmetic always takes the "flush to zero"
- approach when results underflow.
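-
- The numbers quoted above and the suggested behavior can be checked
- with a short Python sketch. It uses IEEE double arithmetic rather than
- the REAL software arithmetic, and real_exp is a hypothetical name for
- an Exp that flushes to zero instead of aborting.

```python
import math

MIN_REAL = 2.0 ** -128      # smallest normalized REAL, about 2.94e-39


def real_exp(x: float) -> float:
    """Exp with a flush-to-zero response when the result underflows."""
    result = math.exp(x)
    return result if result >= MIN_REAL else 0.0


print(math.exp(-88.72) > MIN_REAL)   # True: still a representable result
print(real_exp(-100.0))              # 0.0 instead of run-time error 205
```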
-
-
-
- 9. Error in REAL-type multiplication
-
- The multiplication routine for REAL type software arithmetic
- provides a faster multiplication if all but the first sixteen
- bits of the mantissa in one of the factors is zero. This makes
- multiplication much faster when a floating point number is to
- be multiplied by a small integer, such as in 3.1415926*10, since
- the converted integer does not use more than the first sixteen
- mantissa bits.
- This part of the multiplication contains a logical error. The
- bug will introduce a relative error of at most 3e-12 in the
- result, whereas all other basic arithmetical operations (with
- the exception of addition, see above) are accurate to the
- theoretical limit of the arithmetic (9.09e-13). The bug causes
- incorrect results in about 3% of the described type of
- multiplications. The error is caused by not using all necessary
- mantissa bits in the computation. The code looks as follows:
- .
- .
- 8BC5 MOV AX, BP ; the partial product
- 8AC4 MOV AL, AH ; of CH * (LSB of BP) is
- F6E5 MUL CH ; not taken into consideration
- 8BD8 MOV BX, AX ; by this computation
- .
-
- The bug can be eliminated by applying the following patch:
-
-
- 89C8 MOV AX, CX ; this code performs the correct
- F7E5 MUL BP ; computation. Note that the
- 89D3 MOV BX, DX ; correct value stored in BX
- 90 NOP ; may exceed the corresponding
- 90 NOP ; value above by up to three
-
-
-
- 10. Incorrectly restricted argument range for REAL arithmetic Round/Trunc
-
- When using REAL arithmetic ($N- mode), the Round/Trunc functions
- raise an error for all inputs that would cause the smallest
- LONGINT number -2147483648 to be returned. Thus inputs to Round
- are restricted to -2147483647.5 < x < 2147483647.5, although the
- correct available argument range should be -2147483648.5 <= x <
- 2147483647.5 . Inputs to Trunc are similarly incorrectly restricted
- to -2147483648 < x < 2147483648 where the correct range would be
- -2147483649 < x < 2147483648. This bug can be corrected with no
- increase in execution time. Correct implementation of Round/Trunc
- can be tested with the ROUNDTST program given below. Try a run
- with {$N+} and then try it with {$N-}.
-
-
- PROGRAM RoundTst;
-
- VAR X,Y,Z: REAL;
- I: LONGINT;
-
- BEGIN
- Y := 4.5;
- Z := 5.5;
- WriteLn ('Testing implementation of Round/Trunc for correct range',
- ' and IEEE-rounding');
- WriteLn;
- WriteLn;
- Write ('Testing range of Round towards lower limit ... ');
- X := -2147483647.0;
- REPEAT
- I := Round (X);
- (* WriteLn (X+2147483648.0);*)
- X := X - 1.0/256.0;
- UNTIL X < -2147483648.5;
- WriteLn ('passed');
- WriteLn;
- Write ('Testing range of Round towards upper limit ... ');
- X := 2147483647.0;
- REPEAT
- I := Round (X);
- (* writeln (x-2147483648.0);*)
- X := X + 1.0/256.0;
- UNTIL X >= 2147483647.5;
- WriteLn ('passed');
- WriteLn;
- Write ('Testing range of Trunc towards lower limit ... ');
- X := -2147483647.0;
- REPEAT
- I := Trunc (X);
- (* writeln (x+2147483648.0);*)
- X := X - 1.0/256.0;
- UNTIL X <= -2147483649.0;
- WriteLn ('passed');
- WriteLn;
- Write ('Testing range of Trunc towards upper limit ... ');
- X := 2147483647.0;
- REPEAT
- I := Trunc (X);
- (* writeln (x-2147483648.0);*)
- X := X + 1.0/256.0;
- UNTIL X >= 2147483648.0;
- WriteLn ('passed');
- WriteLn;
- Write ('Round (4.5) should be: 4, actual value is: ', Round (Y));
- IF Round (Y) = 4 THEN
- WriteLn (' passed')
- ELSE
- WriteLn (' failed');
- Write ('Round (5.5) should be: 6, actual value is: ', Round (Z));
- IF Round (Z) = 6 THEN
- WriteLn (' passed')
- ELSE
- WriteLn (' failed');
- WriteLn;
- Y := -4.5;
- Z := -5.5;
- Write ('Round (-4.5) should be:-4, actual value is:', Round (Y));
- IF Round (Y) =-4 THEN
- WriteLn (' passed')
- ELSE
- WriteLn (' failed');
- Write ('Round (-5.5) should be:-6, actual value is:', Round (Z));
- IF Round (Z) =-6 THEN
- WriteLn (' passed')
- ELSE
- WriteLn (' failed');
- END.
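-
- The argument ranges stated at the start of this section follow
- directly from the rounding rule. The Python sketch below models Round
- and Trunc with a LONGINT range check; the function names are invented
- for this example, and Python's round() stands in for IEEE rounding to
- nearest or even.

```python
import math

LONGINT_MIN, LONGINT_MAX = -2**31, 2**31 - 1


def round_longint(x: float) -> int:
    """Round to nearest (ties to even) and check the LONGINT range."""
    r = round(x)
    if not LONGINT_MIN <= r <= LONGINT_MAX:
        raise OverflowError("Round result out of LONGINT range")
    return r


def trunc_longint(x: float) -> int:
    """Truncate towards zero and check the LONGINT range."""
    t = math.trunc(x)
    if not LONGINT_MIN <= t <= LONGINT_MAX:
        raise OverflowError("Trunc result out of LONGINT range")
    return t


print(round_longint(-2147483648.5))    # -2147483648, lower limit is included
print(trunc_longint(-2147483648.99))   # -2147483648
```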
-
-
-
- 11. Errors in coprocessor emulator
-
- Certain coprocessor instructions will not be correctly emulated
- by the emulation package of Turbo-Pascal 6.0. This bug has also
- been present in all previous versions of the emulator. The
- following program will demonstrate the faulty emulation of
- FDECSTP:
-
- {$A+,B-,D+,E+,F-,G-,I-,L+,N+,O-,R-,S-,V+,X-}
- {$M 16384,0,655360}
- PROGRAM EMUBUG;
-
- VAR StackPointer: BYTE;
- Control87, Status87: WORD;
-
- BEGIN
- WriteLn ('Turbo Pascal 6.0 coprocessor emulator bug demo program');
- WriteLn;
- IF Test8087 <> 0 THEN
- WriteLn ('Initializing coprocessor')
- ELSE
- WriteLn ('Initializing emulator');
- WriteLn ('Loading π, 1, and 0 into coprocessor / emulator');
- ASM
- FSTCW [Control87] { save control word as set by Turbo Pascal}
- FINIT { initialize coprocessor / emulator }
- FLDPI { load π, stack pointer = 7 }
- FLD1 { load 1, stack pointer = 6 }
- FLDZ { load 0, stack pointer = 5 }
- FSTSW [Status87] { save status word containing stack pointer}
- FWAIT { wait until saved }
- MOV AX, [Status87] { load status word }
- AND AH, 38h { extract stack pointer field }
- SHR AH, 1 { make }
- SHR AH, 1 { it }
- SHR AH, 1 { right-aligned in byte }
- MOV [StackPointer], AH{ store stack pointer value }
- END;
- IF Test8087 = 0 THEN
- WriteLn ('emulator stack pointer now: ', StackPointer,
- ' should be: 5')
- ELSE
- WriteLn ('coprocessor stack pointer now: ', StackPointer,
- ' should be: 5');
- WriteLn ('executing / emulating FDECSTP instruction');
- ASM
- FDECSTP { decrement coprocessor/emulator stack ptr }
- FSTSW [Status87] { store status word containing stack ptr }
- FWAIT { wait until stored }
- MOV AH, BYTE PTR [Status87+1] { get status word MSB }
- AND AH, 38h { isolate stack ptr field in status word }
- SHR AH, 1 { make }
- SHR AH, 1 { it right- }
- SHR AH, 1 { aligned in byte }
- MOV [StackPointer], AH{ store stack pointer value }
- END;
- IF Test8087 = 0 THEN
- WriteLn ('emulator stack pointer now: ', StackPointer,
- ' should be: 4')
- ELSE
- WriteLn ('coprocessor stack pointer now: ', StackPointer,
- ' should be: 4');
- ASM
- FINIT { initialize coprocessor/emulator }
- FLDCW [Control87] { restore TURBO Pascal control word }
- END;
- END.
-
-
- When the above program is run with a coprocessor, the results will
- be as expected:
-
-
- Turbo Pascal 6.0 coprocessor emulator bug demo program
-
- Initializing coprocessor
- Loading π, 1, and 0 into coprocessor / emulator
- coprocessor stack pointer now: 5 should be: 5
- executing / emulating FDECSTP instruction
- coprocessor stack pointer now: 4 should be: 4
-
-
-
- However, when run with the emulator, strange things happen:
-
-
- Turbo Pascal 6.0 coprocessor emulator bug demo program
-
- Initializing emulator
- Loading π, 1, and 0 into coprocessor / emulator
- emulator stack pointer now: 5 should be: 5
- executing / emulating FDECSTP instruction
- emulator stack pointer now: 6 should be: 4
-
-
- It seems that at least FINCSTP and FDECSTP are incorrectly
- emulated. Tests show that FINCSTP actually decreases the stack pointer,
- while FDECSTP increases it. This is caused by faulty entries in
- the jump index table for emulated opcodes D9E0 to D9FF. Exchanging
- the indices for FINCSTP and FDECSTP will cause FINCSTP to function
- correctly, but because of another error FDECSTP will disturb the
- emulated registers of the 80x87 emulator, which it shouldn't do.
- FINCSTP and FDECSTP instructions will not be generated by the
- compiler. However, programs that link with modules written in
- assembly language or use the new ASM directive of Turbo-Pascal 6.0
- might contain them. When run with the emulator, these programs
- will behave oddly or might even crash. In addition, the wraparound
- stack addressing provided by the coprocessor is unavailable on the
- emulator. On the coprocessor, an instruction such as FADD ST(7),ST
- would write to register #2 if the current stacktop was three. The
- emulator computes a linear offset and tries to write to a
- non-implemented register #10. In doing so, it destroys other emulator
- data residing at offsets C0h to E5h in the stack segment, just
- above the emulator's register file (60h to BFh). This will cause
- the emulator to malfunction or will crash the program.
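-
- The register arithmetic involved is small enough to model directly.
- The Python sketch below is illustrative only: the function names are
- invented, and the register size of 12 bytes is inferred from the
- register file occupying offsets 60h to BFh for eight registers, which
- is an assumption about the emulator's layout.

```python
REG_BASE = 0x60     # start of the emulator's register file
REG_SIZE = 12       # assumed: (0BFh - 60h + 1) bytes / 8 registers


def physical_register(top: int, i: int) -> int:
    """Register addressed by ST(i) on a real coprocessor (wraps modulo 8)."""
    return (top + i) & 7


def linear_register(top: int, i: int) -> int:
    """Register index as computed by the emulator (no wraparound)."""
    return top + i


def fincstp(top: int) -> int:
    return (top + 1) & 7


def fdecstp(top: int) -> int:
    return (top - 1) & 7


print(physical_register(3, 7))                                # 2
print(linear_register(3, 7))                                  # 10
print(hex(REG_BASE + linear_register(3, 7) * REG_SIZE))       # 0xd8, past the file
print(fdecstp(5))                                             # 4
```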
-
- The best way to get rid of these bugs would be to fix the emulator
- so that it correctly emulates all coprocessor instructions and the
- wraparound addressing. This should not prove too difficult,
- since the faulty instructions are among the easiest to emulate. When
- figuring out which register is meant in stack-top-relative addressing,
- the result should be taken modulo 8 to provide correct wraparound.
- The additional code needed would be minimal.
- The other solution would be to trap all unemulated or faulty
- instructions (e.g. FINCSTP, FADD ST(7),ST) upon invocation of the
- emulator. The emulator would then emit an error message such as
- 'run-time error xx, unemulated coprocessor instruction' and abort
- the program. In this case, the documentation should provide explicit
- information which instructions are not emulated and should not be
- used if the program is to perform correctly with the emulator. Also,
- all other differences between coprocessor and emulator should be
- explained.
-
-
-
- 12. Deficiencies of coprocessor emulator
-
- The emulator of Turbo Pascal 6.0 does not emulate the following
- features of a physical coprocessor: precision control and rounding
- control. This can be proved by running the following programs
- with and without a coprocessor.
-
- {$N+,E+}
- PROGRAM PCtrlTst;
-
- VAR B: EXTENDED;
- Precision, L: WORD;
-
- PROCEDURE SetPrecisionControl (Precision: WORD);
- (* This procedure sets the internal precision of the NDP. Available *)
- (* precision values: 0 - 24 bits (SINGLE) *)
- (* 1 - n.a. (mapped to single) *)
- (* 2 - 53 bits (DOUBLE) *)
- (* 3 - 64 bits (EXTENDED) *)
-
- VAR CtrlWord: WORD;
-
- BEGIN {SetPrecisionCtrl}
- IF Precision = 1 THEN
- Precision := 0;
- Precision := Precision SHL 8; { make mask for PC field in ctrl word }
- ASM
- FSTCW [CtrlWord] { store NDP control word }
- MOV AX, [CtrlWord] { load control word into CPU }
- AND AX, 0FCFFh { mask out precision control field }
- OR AX, [Precision] { set desired precision in PC field }
- MOV [CtrlWord], AX { store new control word }
- FLDCW [CtrlWord] { set new precision control in NDP }
- END;
- END; {SetPrecisionCtrl}
-
- BEGIN {main}
- FOR Precision := 1 TO 3 DO BEGIN
- B := 1.2345678901234567890;
- SetPrecisionControl (Precision);
- FOR L := 1 TO 20 DO BEGIN
- B := Sqrt (B);
- END;
- FOR L := 1 TO 20 DO BEGIN
- B := B*B;
- END;
- SetPrecisionControl (3); { full precision for printout }
- WriteLn (Precision, B:28);
- END;
- END.
-
-
- The output of the above program looks like this when executed
- with a coprocessor present:
-
- 1 1.13311278820037842E+0000 (* single precision *)
- 2 1.23456789006442125E+0000 (* double precision *)
- 3 1.23456789012337585E+0000 (* extended precision *)
-
- However, when executed with the emulator, output is as follows:
-
- 1 1.23456789012351396E+0000
- 2 1.23456789012351396E+0000
- 3 1.23456789012351396E+0000
-
- Changing the value of precision control obviously has no effect at
- all on the emulator. It always works with extended precision in
- internal calculations. This deviation of the emulator from a real
- coprocessor should be documented in the TP 6.0 User Manual.
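-
- The effect of reduced internal precision on this computation can be
- reproduced without a coprocessor. The Python sketch below is a rough
- model, not a substitute for the Pascal program: 24-bit precision is
- imitated by rounding every result to IEEE single through the struct
- module, 53-bit precision is Python's native double, and 64-bit
- extended precision is not modelled. The printed digits are therefore
- not claimed to match the coprocessor output above exactly.

```python
import math
import struct


def to_single(x: float) -> float:
    """Round a Python float (IEEE double) to the nearest IEEE single."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def sqrt_square_roundtrip(x: float, rnd) -> float:
    """Take 20 square roots, then square 20 times, rounding each result."""
    for _ in range(20):
        x = rnd(math.sqrt(x))
    for _ in range(20):
        x = rnd(x * x)
    return x


start = 1.2345678901234567890
single_result = sqrt_square_roundtrip(start, to_single)
double_result = sqrt_square_roundtrip(start, lambda v: v)
print(f"24-bit precision: {single_result:.17f}")
print(f"53-bit precision: {double_result:.17f}")
```

- With 24 bits the starting value is badly damaged, while with 53 bits
- it is recovered to roughly ten digits, which is the qualitative
- pattern of the coprocessor results listed above.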
-
- {$N+,E+}
- PROGRAM RCtrlTst;
-
- VAR B: EXTENDED;
- RoundingMode, L: WORD;
-
-
- PROCEDURE SetRoundingMode (RCMode: WORD);
- (* This procedure selects one of four available rounding modes *)
- (* 0 - Round to nearest (default) *)
- (* 1 - Round down (towards negative infinity) *)
- (* 2 - Round up (towards positive infinity) *)
- (* 3 - Chop (truncate, round towards zero) *)
-
- VAR CtrlWord: WORD;
-
- BEGIN
- RCMode := RCMode SHL 10; { make mask for RC field in control word}
- ASM
- FSTCW [CtrlWord] { store NDP control word }
- MOV AX, [CtrlWord] { load control word into CPU }
- AND AX, 0F3FFh { mask out rounding control field }
- OR AX, [RCMode] { set desired rounding mode in RC field }
- MOV [CtrlWord], AX { store new control word }
- FLDCW [CtrlWord] { set new rounding control in NDP }
- END;
- END;
-
- BEGIN
- FOR RoundingMode := 0 TO 3 DO BEGIN
- B := 1.2345678901234567890e100;
- SetRoundingMode (RoundingMode);
- FOR L := 1 TO 51 DO BEGIN
- B := Sqrt (B);
- END;
- FOR L := 1 TO 51 DO BEGIN
- B := -B*B;
- END;
- SetRoundingMode (0); { round to nearest for printout }
- WriteLn (RoundingMode, B:28);
- END;
- END.
-
-
- The calculations performed in the above program were selected so
- that every rounding mode would lead to a distinct final value. As
- expected, four different values are printed when the program is
- run with a coprocessor:
-
- 0 -1.23427629010100635E+0100 (* round nearest *)
- 1 -1.23427623555772409E+0100 (* round down *)
- 2 -1.23457760966801097E+0100 (* round up *)
- 3 -1.23397493540770643E+0100 (* chop *)
-
- With the emulator, four identical results are produced, indicating
- that the emulator does not support the IEEE rounding modes of the
- coprocessor.
-
- 0 -1.23457766383395931E+0100
- 1 -1.23457766383395931E+0100
- 2 -1.23457766383395931E+0100
- 3 -1.23457766383395931E+0100
-
- This deviation from the behavior of the actual coprocessor should
- be mentioned in TP 6.0 documentation.
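
Both test programs change one two-bit field of the coprocessor control word: precision control (PC) in bits 8-9 and rounding control (RC) in bits 10-11. The masking arithmetic can be checked in plain Pascal without touching the NDP. The following is only a sketch: CtrlWordDemo, SetField and HexW are names introduced here, and $037F is assumed as the sample control word because that is the value an 80387 holds after FINIT.

```pascal
PROGRAM CtrlWordDemo;

CONST HexDigits: ARRAY [0..15] OF CHAR = '0123456789ABCDEF';
      SampleCW = $037F;      { assumed: 80387 control word after FINIT }

FUNCTION HexW (W: WORD): STRING;   { hypothetical helper: WORD -> 4 hex digits }
BEGIN
  HexW := HexDigits [W SHR 12] + HexDigits [(W SHR 8) AND 15] +
          HexDigits [(W SHR 4) AND 15] + HexDigits [W AND 15];
END;

FUNCTION SetField (CW, Value, Shift: WORD): WORD;
{ clears the two-bit field starting at bit Shift, then inserts Value (0..3) }
VAR Mask: WORD;
BEGIN
  Mask := 3 SHL Shift;       { 0300h for PC (Shift = 8), 0C00h for RC (Shift = 10) }
  SetField := (CW AND NOT Mask) OR (Value SHL Shift);
END;

VAR V: WORD;

BEGIN
  FOR V := 0 TO 3 DO
    WriteLn ('PC=', V, ' -> ', HexW (SetField (SampleCW, V, 8)),
             '   RC=', V, ' -> ', HexW (SetField (SampleCW, V, 10)));
END.
```

NOT Mask evaluates to 0FCFFh and 0F3FFh for the two fields, which are the AND masks used in SetPrecisionControl and SetRoundingMode.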
-
-
-
- 13. Deficiencies in Coprocessor emulator
-
- The coprocessor emulator used by programs compiled in the
- $N+,E+ mode when a coprocessor is absent at run-time does
- not correctly handle special arguments like ZERO, INF, and
- NANs. Specific problems are:
-
- - multiplication and division with INFs resulting in NANs
- instead of INFs
- - 0+(-0) = -0, but (-0)+0 = 0
- - operations on QNANs (quiet NaNs) signaling an exception
-
- The following program will demonstrate the bugs:
-
-
- {$N+,E+}
- PROGRAM InfTest;
-
- VAR INF, NEGINF: EXTENDED;
- QNAN, SNAN: EXTENDED;
- X, NEGX: EXTENDED;
- Z, NEGZ: EXTENDED;
- PSEUDOZERO: EXTENDED;
-
- INFA: ARRAY [1..5] OF WORD ABSOLUTE INF;
- QA: ARRAY [1..5] OF WORD ABSOLUTE QNAN;
- SA: ARRAY [1..5] OF WORD ABSOLUTE SNAN;
- PA: ARRAY [1..5] OF WORD ABSOLUTE PSEUDOZERO;
-
- BEGIN
- INFA [5] := $7FFF;
- INFA [4] := $8000;
- INFA [3] := $0000;
- INFA [2] := $0000;
- INFA [1] := $0000;
- QA [5] := $7FFF;
- QA [4] := $C000;
- QA [3] := $0000;
- QA [2] := $0000;
- QA [1] := $0001;
- SA [5] := $7FFF;
- SA [4] := $8000;
- SA [3] := $0000;
- SA [2] := $0000;
- SA [1] := $0001;
- PA [5] := $52FB;
- PA [4] := $0000;
- PA [3] := $0000;
- PA [2] := $0000;
- PA [1] := $0000;
- NEGINF := -INF;
- X := 5;
- NEGX := -5;
- Z := 0;
- NEGZ := -Z;
-
- WriteLn (' INF + INF: ', INF + INF:6:0, ' should be INF');
- WriteLn ('-INF + -INF: ', NEGINF + NEGINF:6:0, ' should be -INF');
- WriteLn (' INF + X : ', INF + X:6:0, ' should be INF');
-
- WriteLn (' INF - -INF: ', INF - NEGINF:6:0, ' should be INF');
- WriteLn ('-INF - INF: ', NEGINF - INF:6:0, ' should be -INF');
- WriteLn (' X - INF: ', X - INF:6:0, ' should be -INF');
-
- WriteLn (' INF * INF: ', INF * INF:6:0, ' should be INF');
- WriteLn ('-INF * INF: ', NEGINF * INF:6:0, ' should be -INF');
- WriteLn (' INF * -INF: ', INF * NEGINF:6:0, ' should be -INF');
- WriteLn ('-INF * -INF: ', NEGINF * NEGINF:6:0, ' should be INF');
- WriteLn (' X * INF: ', X * INF:6:0, ' should be INF');
- WriteLn (' -X * INF: ', NEGX * INF:6:0, ' should be -INF');
-
- WriteLn (' INF / 0 : ', INF / 0:6:0, ' should be INF');
- WriteLn ('-INF / 0 : ', NEGINF / 0:6:0, ' should be -INF');
- WriteLn (' X / INF: ', X / INF:6:0, ' should be 0');
- WriteLn (' INF / -X : ', INF / NEGX:6:0, ' should be INF');
-
- WriteLn (' Sqrt (INF): ', Sqrt (INF):6:0, ' should be INF');
-
- WriteLn (' -0 + -0 : ', NEGZ + NEGZ:6:0, ' should be -0');
- WriteLn (' 0 + -0 : ', Z + NEGZ:6:0, ' should be 0');
- WriteLn (' -0 + 0 : ', NEGZ + Z:6:0, ' should be 0');
- WriteLn (' -0 * 0 : ', NEGZ * Z:6:0, ' should be -0');
- WriteLn (' 0 * -0 : ', Z * NEGZ:6:0, ' should be -0');
- WriteLn (' -0 * X : ', NEGZ * X:6:0, ' should be -0');
- WriteLn (' X * -0 : ', X * NEGZ:6:0, ' should be -0');
- WriteLn (' -X * 0 : ', NEGX * Z:6:0, ' should be -0');
- WriteLn (' -X * -0 : ', NEGX * NEGZ:6:0, ' should be 0');
- WriteLn (' Sqrt (-0) : ', Sqrt (NEGZ):6:0, ' should be -0');
-
- WriteLn ('QNAN * QNAN: ', QNAN * QNAN:6:0, ' should be NAN');
- WriteLn ('QNAN + QNAN: ', QNAN + QNAN:6:0, ' should be NAN');
- WriteLn ('QNAN / QNAN: ', QNAN / QNAN:6:0, ' should be NAN');
-
- WriteLn ('Sqrt (QNAN): ', Sqrt (QNAN):6:0, ' should be NAN');
-
- END. { InfTest}
-
-
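- When run on an 80387 coprocessor, the output of the above program
- is as follows (the label on the INF / -X line is a slip in the test
- program; the correct result is -INF, which the 80387 delivers):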
-
- INF + INF: INF should be INF
- -INF + -INF: -INF should be -INF
- INF + X : INF should be INF
- INF - -INF: INF should be INF
- -INF - INF: -INF should be -INF
- X - INF: -INF should be -INF
- INF * INF: INF should be INF
- -INF * INF: -INF should be -INF
- INF * -INF: -INF should be -INF
- -INF * -INF: INF should be INF
- X * INF: INF should be INF
- -X * INF: -INF should be -INF
- INF / 0 : INF should be INF
- -INF / 0 : -INF should be -INF
- X / INF: 0 should be 0
- INF / -X : -INF should be INF
- Sqrt (INF): INF should be INF
- -0 + -0 : -0 should be -0
- 0 + -0 : 0 should be 0
- -0 + 0 : 0 should be 0
- -0 * 0 : -0 should be -0
- 0 * -0 : -0 should be -0
- -0 * X : -0 should be -0
- X * -0 : -0 should be -0
- -X * 0 : -0 should be -0
- -X * -0 : 0 should be 0
- Sqrt (-0) : -0 should be -0
- QNAN * QNAN: NAN should be NAN
- QNAN + QNAN: NAN should be NAN
- QNAN / QNAN: NAN should be NAN
- Sqrt (QNAN): NAN should be NAN
-
-
- However, when run with the emulator, the program's output looks
- like this:
-
- INF + INF: INF should be INF
- -INF + -INF: -INF should be -INF
- INF + X : INF should be INF
- INF - -INF: INF should be INF
- -INF - INF: -INF should be -INF
- X - INF: -INF should be -INF
- INF * INF: NAN should be INF <----- error
- -INF * INF: NAN should be -INF <----- error
- INF * -INF: NAN should be -INF <----- error
- -INF * -INF: NAN should be INF <----- error
- X * INF: NAN should be INF <----- error
- -X * INF: NAN should be -INF <----- error
- INF / 0 : NAN should be INF <----- error
- -INF / 0 : NAN should be -INF <----- error
- X / INF: 0 should be 0
- INF / -X : NAN should be INF <----- error
- Sqrt (INF): INF should be INF
- -0 + -0 : -0 should be -0
- 0 + -0 : -0 should be 0 <----- error
- -0 + 0 : 0 should be 0
- -0 * 0 : -0 should be -0
- 0 * -0 : -0 should be -0
- -0 * X : -0 should be -0
- X * -0 : -0 should be -0
- -X * 0 : -0 should be -0
- -X * -0 : 0 should be 0
- Sqrt (-0) : -0 should be -0
- QNAN * QNAN: Runtime error 207 at 0000:09F8. <----- error
-
-
- This handling of QNANs also violates the IEEE-754 standard for
- binary floating point arithmetic, which states in section
- 6.2 (Operations with NaNs): "Every operation involving one
- or two input NaNs, none of them signaling, shall signal no
- exception but, if a floating-point result is to be delivered,
- shall deliver as its result a quiet NaN, which should be one
- of the input NaNs".
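
As long as the emulator traps on quiet NaNs, a program can protect itself by looking at the bit pattern of an EXTENDED before using it in arithmetic. The sketch below is one way to do that, under the assumption that only the encodings used in InfTest matter (pseudo-NaN and other unsupported encodings are ignored). ClassDemo, ExtWords and Classify are names introduced here for illustration.

```pascal
{$N+,E+}
PROGRAM ClassDemo;

TYPE ExtWords = ARRAY [1..5] OF WORD;  { same layout as INFA, QA, SA in InfTest }

FUNCTION Classify (VAR X: EXTENDED): STRING;
{ inspects the bit pattern only; X is never loaded onto the NDP stack }
BEGIN
  IF (ExtWords (X) [5] AND $7FFF) <> $7FFF THEN
    Classify := 'finite'               { exponent is not all ones }
  ELSE IF (ExtWords (X) [4] = $8000) AND (ExtWords (X) [3] = 0) AND
          (ExtWords (X) [2] = 0) AND (ExtWords (X) [1] = 0) THEN
    Classify := 'INF'                  { only the integer bit is set }
  ELSE IF (ExtWords (X) [4] AND $4000) <> 0 THEN
    Classify := 'QNAN'                 { top fraction bit set }
  ELSE
    Classify := 'SNAN';
END;

VAR Q: EXTENDED;

BEGIN
  ExtWords (Q) [5] := $7FFF;           { same QNAN pattern as in InfTest }
  ExtWords (Q) [4] := $C000;
  ExtWords (Q) [3] := $0000;
  ExtWords (Q) [2] := $0000;
  ExtWords (Q) [1] := $0001;
  WriteLn (Classify (Q));              { prints QNAN }
END.
```

Passing X as a VAR parameter is deliberate: a value parameter would have to be copied, and the point is to avoid any emulated floating point operation on the suspect value.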
-
-
-
-
- 14. Error in inline assembler
-
- The inline assembler incorrectly accepts a type name as a variable
- name for memory operands, if the type size is the same as the size
- of the memory operand. The following program fragment is legal in
- the current version of the inline assembler:
-
- ASM
- ...
- MOV AX, [WORD]
- MOV AL, [BYTE]
- LES DI, [LONGINT]
- ...
- END;
-
- The address generated for the memory operands in these cases is
- always 0. This bug should be fixed immediately.
-
-
-
- 15. Error in inline assembler
-
- The PTR operator of the inline assembler will incorrectly accept
- an expression of class register if the size of the register is
- the same as the type it is cast to. The following statements are
- legal in the current version of the inline assembler:
-
- ASM
- ...
- MOV BYTE PTR AL, 5
- ADD WORD PTR AX, 5
- MOV WORD PTR ES, 6
- ...
- END;
-
- The assembler will generate memory references to memory location
- 0 in these cases, e.g. MOV BYTE PTR [0], 5. The PTR operator
- should be fixed to only accept expressions of class memory.
-
-
-
- 16. Possible problems arising out of the use of the SEG and OFFSET
- operators with local variables
-
- Use of the SEG operator with local variables and parameters is
- allowed in the inline assembler. One would expect that the SEG
- value of a local (auto) variable is the value of the stack
- segment (SS), just as the SEG value of a global (static) variable
- is the data segment (DS, @Data). However, the value stored by the
- inline assembler seems always to be 0. Applying the SEG operator
- to local variables should either be forbidden, or the correct
- value should be supplied by the assembler.
-
- The Programmer's Guide states that the OFFSET of local variables,
- parameters, and the @Result symbol is the offset relative to
- the framepointer of the entity in which they were declared. This
- works as expected, but there is a problem when nested procedures/
- functions are used. Although the local variables of an outer
- procedural level are visible to inner procedures, no meaningful
- OFFSET value can be computed for these variables relative to
- the framepointer of the inner procedure. Therefore, the assembler
- should either forbid references to local variables declared at an
- outer level or generate the code necessary to address across the
- several levels of indirection involved. The following example
- illustrates the problem:
-
-
- FUNCTION Outer (R: WORD): WORD; { copies input R to function output }
-
- FUNCTION Inner (W: WORD): WORD; ASSEMBLER; {should copy R to func. output}
- ASM
- MOV AX, R { will not load R !!}
- END;
-
- BEGIN { Outer }
- R := Inner (R); { does *not* copy R to R !! }
- ASM
- MOV DI, OFFSET R { load offset of R relative to Outer's BP}
- MOV AX, [BP+DI] { load parameter R }
- MOV SI, OFFSET @Result { load offset of Outer's result }
- MOV [BP+SI], AX { store value of R into Outer's result }
- END;
- END;
-
- BEGIN
- WriteLn (Outer(5));
- END.
-
- Instead of printing '5', as one would expect, this program prints a
- value like '9094' depending on the initial stack size set in the
- TP configuration.
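
Until the assembler handles locals of an outer level, one way around the problem is to hand the value to the inner routine as a parameter, since a routine's own parameters are addressed relative to its own frame pointer. A minimal sketch, assuming that parameter addressing in ASSEMBLER routines works as the Programmer's Guide describes (NestedOK is a hypothetical program name):

```pascal
PROGRAM NestedOK;

FUNCTION Outer (R: WORD): WORD;

  FUNCTION Inner (W: WORD): WORD; ASSEMBLER;
  ASM
    MOV AX, W        { W is Inner's own parameter, addressed via Inner's BP }
  END;

BEGIN { Outer }
  Outer := Inner (R);  { R is copied into W by the ordinary Pascal call }
END;

BEGIN
  WriteLn (Outer (5));
END.
```

The price is one extra parameter per outer-level variable, and results have to be passed back through the function value or a VAR parameter.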
-
-
-
- 17. Error when using the WITH statement with the inline assembler
-
- When accessing parts of a record using inline assembler from within
- a WITH block, the inline assembler doesn't correctly compute the
- addresses of the record's parts. It uses the offset of the record
- part within the record as the address. However, the base address of
- the record must be added to this value to get the correct address
- of the record part. Writing to record parts using the inline assembler
- from within a WITH block will destroy other data in the data segment.
- The following program illustrates the problem. It first initializes
- a record using the inline assembler without making use of a WITH
- block. It prints the contents of the record, then updates it using
- inline assembler from within a WITH block and prints the record again.
- If the inline assembler worked correctly, two different printouts
- would be the result. Actually, the second record update doesn't
- change the record but destroys other data in the data segment.
- Therefore, the same data is printed out twice.
-
-
- {$A+,N-,I-,S-,R-,B-}
-
- PROGRAM WITHBug;
-
- VAR Student: RECORD
- ID: LONGINT;
- Name: STRING;
- GPA: REAL;
- END;
-
- BEGIN
- { first student: ID = 12345678, Name = JOHN, GPA = 1.0 }
-
- ASM
- MOV WORD PTR [Student.ID], 5678
- MOV WORD PTR [Student.ID+2], 1234
- MOV WORD PTR [Student.GPA], 81h
- MOV WORD PTR [Student.GPA+2], 0
- MOV WORD PTR [Student.GPA+4], 0
- MOV BYTE PTR [Student.Name], 4
- MOV WORD PTR [Student.Name+1], 'OJ'
- MOV WORD PTR [Student.Name+3], 'NH'
- END;
- WriteLn ('Student''s ID: ', Student.ID);
- WriteLn (' Name: ', Student.Name);
- WriteLn (' GPA: ', Student.GPA:0:2);
-
- { second student: ID = 87654321, Name = JANE, GPA = 2.0 }
-
- WITH Student DO
- ASM
- MOV WORD PTR [ID], 4321
- MOV WORD PTR [ID+2], 8765
- MOV WORD PTR [GPA], 82h
- MOV WORD PTR [GPA+2], 0
- MOV WORD PTR [GPA+4], 0
- MOV BYTE PTR [Name], 4
- MOV WORD PTR [Name+1], 'AJ'
- MOV WORD PTR [Name+3], 'EN'
- END;
- WriteLn ('Student''s ID: ', Student.ID);
- WriteLn (' Name: ', Student.Name);
- WriteLn (' GPA: ', Student.GPA:0:2);
- END.
-
-
- The following excerpts from the resulting code show the error:
-
- 1738:003B C70644002E16 MOV Word Ptr [0044],162E ; 1st record
- 1738:0041 C7064600D204 MOV Word Ptr [0046],04D2 ; initialization
- 1738:0047 C70648018100 MOV Word Ptr [0148],0081 ; using
- 1738:004D C7064A010000 MOV Word Ptr [014A],0000 ; correct
- 1738:0053 C7064C010000 MOV Word Ptr [014C],0000 ; addresses
- 1738:0059 C606480004 MOV Byte Ptr [0048],04
- 1738:005E C70649004A4F MOV Word Ptr [0049],4F4A
- 1738:0064 C7064B00484E MOV Word Ptr [004B],4E48
-
- 1738:00E4 C7060000E110 MOV Word Ptr [0000],10E1 ; 2nd record
- 1738:00EA C70602003D22 MOV Word Ptr [0002],223D ; initialization
- 1738:00F0 C70604018200 MOV Word Ptr [0104],0082 ; using offsets
- 1738:00F6 C70606010000 MOV Word Ptr [0106],0000 ; into record
- 1738:00FC C70608010000 MOV Word Ptr [0108],0000 ; instead of
- 1738:0102 C606040004 MOV Byte Ptr [0004],04 ; addresses
- 1738:0107 C70605004A41 MOV Word Ptr [0005],414A
- 1738:010D C70607004E45 MOV Word Ptr [0007],454E
-
-
-
- 18. Incorrect assembly of certain JMPs and CALLs by inline assembler
-
- The inline assembler will incorrectly assemble certain JMPs and
- CALLs that are invalid and are rejected by the MASM and TASM
- assemblers. It will also incorrectly assemble JMPs and CALLs to
- destinations declared with the ABSOLUTE directive. The bugs can
- be demonstrated by the following program:
-
- PROGRAM Jmp_Call;
-
- VAR AbsPointer: POINTER ABSOLUTE $1234:$5678;
- NormPointr: POINTER;
-
- PROCEDURE FarProc; FAR; ASSEMBLER;
- ASM
- END;
-
- PROCEDURE NearProc; NEAR; ASSEMBLER;
- ASM
- END;
-
- BEGIN
- ASM
- JMP NEAR PTR AbsPointer { This is illegal in MASM / TASM }
- JMP FAR PTR AbsPointer { incorrectly assembled by inline assmbl.! }
- JMP AbsPointer { This is illegal in MASM / TASM }
- JMP NEAR PTR NormPointr { This is illegal in MASM / TASM }
- JMP FAR PTR NormPointr
- JMP NormPointr
- JMP NEAR PTR FarProc
- JMP FAR PTR FarProc
- JMP FarProc
- JMP NEAR PTR NearProc
- JMP FAR PTR NearProc
- JMP NearProc
- CALL NEAR PTR AbsPointer { This is illegal in MASM / TASM }
- CALL FAR PTR AbsPointer { incorrectly assembled by inline assmbl.! }
- CALL AbsPointer { This is illegal in MASM / TASM }
- CALL NEAR PTR NormPointr { This is illegal in MASM / TASM }
- CALL FAR PTR NormPointr
- CALL NormPointr
- CALL NEAR PTR FarProc
- CALL FAR PTR FarProc
- CALL FarProc
- CALL NEAR PTR NearProc
- CALL FAR PTR NearProc
- CALL NearProc
- END;
- END.
-
-
- The instructions marked as "illegal in MASM / TASM" should be
- flagged as errors by the inline assembler. "JMP NEAR PTR AbsPointer"
- and "JMP NEAR PTR NormPointr" are near jumps to a different CS,
- and in "JMP AbsPointer", AbsPointer can't be addressed with the
- currently assumed registers. The same remarks apply to the
- equivalent CALL statements in the source. Consequently, the
- inline assembler produces garbage for the illegal statements.
- "JMP FAR PTR AbsPointer" should assemble to "JMP 1234:5678", but
- the inline assembler produces something very different. Instead
- of the absolute segment 1234h it uses the value of CS and in
- addition mangles the offset value. The assembly language program
- below, which is equivalent to the above PASCAL program, shows
- that "JMP FAR PTR AbsPointer" and "CALL FAR PTR AbsPointer" can
- be assembled correctly by TASM / MASM, so the inline assembler
- should do this as well.
-
-
- DOSSEG
-
- AbsSeg SEGMENT AT 1234h
- ORG 5678H
- AbsPointer DD ?
- AbsSeg ENDS
-
-
- DATA SEGMENT WORD PUBLIC 'DATA'
- ASSUME DS:DATA
- NormPointr DD ?
- DATA ENDS
-
-
- CODE SEGMENT BYTE PUBLIC 'CODE'
- ASSUME CS:CODE, DS:DATA
-
- FarProc PROC FAR
- RET
- FarProc ENDP
-
- NearProc PROC NEAR
- RET
- NearProc ENDP
-
- Main: MOV AX, SEG (NormPointr)
- MOV DS, AX
- ; JMP NEAR PTR AbsPointer ; error !
- JMP FAR PTR AbsPointer
- ; JMP AbsPointer ; error !
- ; JMP NEAR PTR NormPointr ; error !
- JMP FAR PTR NormPointr
- JMP NormPointr
- JMP NEAR PTR FarProc
- JMP FAR PTR FarProc
- JMP FarProc
- JMP NEAR PTR NearProc
- JMP FAR PTR NearProc
- JMP NearProc
- ; CALL NEAR PTR AbsPointer ; error !
- CALL FAR PTR AbsPointer
- ; CALL AbsPointer ; error !
- ; CALL NEAR PTR NormPointr ; error !
- CALL FAR PTR NormPointr
- CALL NormPointr
- CALL NEAR PTR FarProc
- CALL FAR PTR FarProc
- CALL FarProc
- CALL NEAR PTR NearProc
- CALL FAR PTR NearProc
- CALL NearProc
-
- CODE ENDS
-
-
- STACK SEGMENT STACK
- DB 100h DUP (?)
- STACK ENDS
-
- END MAIN
-
-
-
- 19. Other bugs in the inline assembler (ASM directive)
-
- Several instructions that have a format with an immediate operand
- support senseless or incorrect ranges for the immediate value. The
- IN, OUT, and INT instructions will accept values between -128 and
- 255. Since negative values make no sense here, the possible range
- should be restricted to 0 to 255. The ENTER instruction will also
- take negative arguments. Again, this is not a very sensible choice.
- How are -5 bytes reserved for local variables? The allowed ranges
- for its two arguments (frame size and nesting level) should be
- 0 to 65535 and 0 to 255, respectively.
- Although it is not officially documented by Intel, the AAM and AAD
- instructions may take additional arguments that indicate the base
- on which to perform the conversion. This is supported by the inline
- assembler. However, it accepts arguments between -128 and 127 while
- it should accept bases between 0 and 255, since the bases available
- with AAM and AAD must be positive.
-
- The inline assembler performs no check on the index in
- the stack top relative addressing mode of the coprocessor.
- Very large or even negative values are allowed. For example,
- FADD ST, ST(123456) will be accepted as perfectly legal.
- This must be fixed to make sure the index is between zero and
- seven.
-
- There is no way to code an absolute far jump such as JMP F000:FFF0
- (to perform a warm start). The same restriction applies to far
- calls. In a conventional assembler such a jump could be coded as
- follows:
-
- BIOS SEGMENT AT 0F000h
- ORG 0FFF0h
- Restart LABEL FAR
- BIOS ENDS
-
- CODE SEGMENT BYTE PUBLIC 'CODE'
- JMP Restart
- CODE ENDS
-
- There are no segment declarations available with the inline
- assembler, so the jump has to be either hand coded with DBs or
- changed to a memory indirect far jump using an appropriately
- initialized pointer. This problem should be documented in
- the Turbo-Pascal manuals.
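
Both workarounds mentioned above can be sketched as follows. This is untested illustration code: WarmBoot, JumpIndirect and JumpHandCoded are names introduced here, and setting the word at 0040:0072 to 1234h is the usual BIOS convention for requesting a warm start, which is an addition not taken from the text above. Running either procedure restarts the machine.

```pascal
PROGRAM WarmBoot;    { caution: running this restarts the machine }

VAR Restart: POINTER;

PROCEDURE JumpIndirect;
BEGIN
  Restart := Ptr ($F000, $FFF0);   { reset entry point as segment:offset }
  ASM
    JMP DWORD PTR [Restart]        { memory indirect far jump }
  END;
END;

PROCEDURE JumpHandCoded;
BEGIN
  ASM
    DB 0EAh                        { opcode of the direct far jump }
    DW 0FFF0h, 0F000h              { offset first, then segment }
  END;
END;

BEGIN
  MemW [$0040:$0072] := $1234;     { assumed BIOS flag: warm start, skip memory test }
  JumpIndirect;                    { JumpHandCoded would serve equally well }
END.
```

The indirect form costs a four byte variable but keeps the target readable; the hand coded form needs no data but hides the target in the DW operands.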
-
- One of the standard syntaxes available with the IMUL instruction is
- not accepted by the inline assembler. IMUL reg16, immed8 is not
- allowed, rather the inline assembler expects this to be coded as
- IMUL reg16, reg16, immed8, where the two registers are identical.
- The IMUL reg16, immed8 is commonly accepted by assemblers and is
- also listed in Intel's documentation. Therefore, this syntax should
- be supported by the inline assembler.
-
- Some 286 protected mode instructions (LLDT, LMSW, LTR, SMSW, VERR,
- VERW) when used with memory operands require the use of the PTR
- directive to establish operand size with an untyped operand (e.g.,
- [BX+SI]). Two other instructions, SGDT and SIDT, do not require the
- use of PTR in these cases. This usage is inconsistent. Since the
- operand size can be deduced from the instruction itself (just as
- can be done in the case of a MOV AX, [BX]) no PTR directive at all
- should be required. Likewise, the POP mem16 instruction should not
- require a WORD PTR directive with an untyped memory operand, since
- memory operand size is obvious from the instruction.
-
-
-
- 20. Errors / problems / documentation deficiencies using coprocessor
- instructions with the new ASM directive of TP 6.0
-
- In the $G+,N+ compiler mode the inline assembler does not
- assemble coprocessor instructions into emulator interrupts
- regardless of the $E switch setting. Instead it always generates
- optimized coprocessor instructions (without inserted WAITs).
- This causes programs compiled with $G+,N+,E+ to fail if no
- coprocessor is present. The assembler must ensure that the
- $E switch is off before performing this optimization.
-
- Coprocessor instructions in the no-wait form (e.g. FNINIT,
- FNSTSW, FNSTCW, FNSTENV, FNCLEX) are not encoded into emulator
- interrupts, since it makes no sense to use them with the
- emulator which cannot work in parallel with the CPU. This
- may lead to problems if programmers are not aware of the
- fact that these instructions will have absolutely no effect
- in an emulator environment. Since it is desirable to have the
- no-wait instructions available, programmers should be warned
- by the documentation not to use them in programs or routines
- that may be executed by the emulator or to explicitly code
- around this problem by using the system variable Test8087.
- An example of a work around solution follows.
-
- ASM
- . { some other code }
- .
- CMP Test8087, 0 { coprocessor present ? }
- JE @Emulate { no (Test8087 = 0), do specific code for emulator}
- FNINIT { can be safely used with 8087 }
- JMP @Continue { skip emulator code }
-
- @Emulate:
- FINIT { this can be emulated }
-
- @Continue: { continue with more code }
- .
- .
- END;
-
-
-
- 21. TP wrongly flags coprocessor instruction as 286 specific
-
- Turbo Pascal's inline assembler BASM will not allow one to
- assemble the coprocessor instruction FLDLN2 (load Ln(2) to
- TOS). During compilation it gives a compile time error 159,
- "286/287 instructions are not enabled". However, FLDLN2 is
- by no means 286 specific, it is included in the Intel docu-
- mentation for the 8087 (see for example "Microprocessors,
- Volume 1", page 2-141, Intel 1991). TP 6.0 handles FLDLN2
- correctly in $G+ mode.
-
- {$N+,E-,G-}
-
- PROGRAM FLDBUG; { will not compile under TP 6.0 }
-
- BEGIN
- ASM
- FLDLN2 { TP 6.0 refuses to compile this in $G- mode }
- FSTP ST(0);
- END;
- END.
-
-
- 22. Inconsistent error messages emitted by inline assembler
-
- When constants that are out of range are supplied to assembler
- instructions that take some kind of immediate operand, two different
- error messages are emitted depending on the type of the destination
- operand. If the destination operand is a byte operand, as in
- ADD AL, 256, the compilation will result in error 155, 'Invalid
- combination of opcode and operands'. However, if the destination
- is a word operand, as in ADD AX, 65536, the resulting error will be
- #76, 'Constant out of range'. This discrepancy should be resolved
- by always emitting the 'Constant out of range' error when an
- immediate value is not within the specified limits called for by the
- destination operand.
-
- Instructions that require one of their operands to be a memory
- reference (BOUND, LDS, LES, LEA, SGDT, SIDT, LGDT, LIDT) should
- cause compile error 156 (memory reference expected) to be emitted
- when a register is supplied instead of a memory reference. This
- will give a more detailed description of the error than the
- currently used error 155 (invalid combination of opcode and
- operand).
-
- There are space saving sign extending encodings available for OR,
- AND, and XOR instructions that the inline assembler fails to use.
- These encodings are the equivalents of the sign extending encodings
- used with the ADD, ADC, SUB, SBB, and CMP instructions. A list of
- the additional instructions follows:
-
- Instruction | Encoding
- ---------------------+-------------------------------------------
- OR reg16, const8 | 83 mod 001 r/m data8
- OR mem16, const8 | 83 mod 001 r/m (disp) (disp) data8
- AND reg16, const8 | 83 mod 100 r/m data8
- AND mem16, const8 | 83 mod 100 r/m (disp) (disp) data8
- XOR reg16, const8 | 83 mod 110 r/m data8
- XOR mem16, const8 | 83 mod 110 r/m (disp) (disp) data8
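
Whether one of these short encodings applies depends only on the constant: it must be reproduced exactly when its low byte is sign extended to 16 bits. A small sketch of that test (SignExtDemo and FitsSignExt8 are names introduced here, not part of the assembler):

```pascal
PROGRAM SignExtDemo;

FUNCTION FitsSignExt8 (Imm: WORD): BOOLEAN;
{ TRUE if sign extending the low byte of Imm reproduces all 16 bits }
BEGIN
  FitsSignExt8 := (Imm <= $007F) OR (Imm >= $FF80);
END;

BEGIN
  WriteLn (FitsSignExt8 ($000F));  { OR  AX, 000Fh  : TRUE, short form usable }
  WriteLn (FitsSignExt8 ($FFF0));  { AND AX, 0FFF0h : TRUE, short form usable }
  WriteLn (FitsSignExt8 ($00FF));  { XOR AX, 00FFh  : FALSE, needs data16 }
END.
```

Masks such as 0FFF0h are common with AND, so the saving of one byte per instruction is not just theoretical.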
-
-
- 23. Compiler Switch /V doesn't export names of SYSTEM routines
-
- When using the /V switch of TPC or choosing standalone debugging within
- the Turbo Pascal IDE, all public identifiers are supposed to be
- included into the EXE file for debugging purposes. However, the
- names of just about every routine from the SYSTEM unit are not
- included, although the variables (such as HeapOrg) are included.
- Among the few exceptions are the MemAvail and MaxAvail routines,
- which are sometimes included into the debug information. This bug
- is very annoying when programs are profiled with the Turbo Profiler
- and one wants to know how much time the program spends in certain
- SYSTEM routines. Also, when debugging programs with Turbo Debugger
- one would rather like the disassembly to display a call as e.g.
- CALL SYSTEM.LONGMUL instead of a cryptic CALL 152E:05B8. This makes
- the disassembled code hard to follow. I therefore urge Borland to
- ensure correct inclusion of *all* public symbols in the debug
- information generated by the /V switch.
-
-
-
- 24. Problem with AAM xx and AAD xx instructions when stepping/tracing
- through inline ASM code with Turbo-Pascal's built-in debugger
-
- The inline assembler correctly allows parameters with the AAM and
- AAD instructions. Although this feature is not officially documented
- by Intel, it works on all 80x86 processors and compatibles, such as
- NEC's V30. The inline assembler will correctly assemble an instruction
- like AAM 16, which is quite useful when one wants to print a number
- in hexadecimal format. Turbo-Pascal's internal debugger does not
- recognize AAM opcodes other than the plain AAM opcode. When stepping/
- tracing through inline assembly code, it seems to skip the instruction,
- causing the program to behave differently than in an ordinary run.
- For example, the following instruction sequence will give 0505 in AX
- when run on any 80x86, but will give 0055 in AX when stepped through
- with Turbo-Pascal's internal debugger.
- .
- .
- MOV AX, 0055h
- AAM 16
- . { AX should contain 0505h now }
-
- Another example involving the AAD instruction will give 0066h in AX
- when simply run, but AX will contain 00CEh when the code is stepped.
- .
- .
- MOV AX, 0606h
- AAD 16
- . { AX should contain 0066h now }
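
The results quoted for a normal run can be checked with a plain Pascal model of the two instructions. This is only a sketch of their arithmetic (flags are ignored, and AamAadDemo, Aam and Aad are names introduced here); like the real AAM, the model fails with a division error for a base of 0.

```pascal
PROGRAM AamAadDemo;

FUNCTION Aam (AX: WORD; Base: BYTE): WORD;
{ models AAM Base:  AH := AL DIV Base,  AL := AL MOD Base }
VAR AL: WORD;
BEGIN
  AL := AX AND $FF;
  Aam := ((AL DIV Base) SHL 8) OR (AL MOD Base);
END;

FUNCTION Aad (AX: WORD; Base: BYTE): WORD;
{ models AAD Base:  AL := (AH * Base + AL) MOD 256,  AH := 0 }
BEGIN
  Aad := ((AX SHR 8) * Base + (AX AND $FF)) AND $FF;
END;

BEGIN
  WriteLn (Aam ($0055, 16) = $0505);   { TRUE: 55h splits into digits 5 and 5 }
  WriteLn (Aad ($0606, 16) = $0066);   { TRUE: 6 * 16 + 6 = 66h }
END.
```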
-
-
-
- 25. Error in heap manager (GetMem, New)
-
- Turbo-Pascal 6.0 allows memory allocation functions to allocate
- data structures of more than 65528 bytes on the heap. Data
- structures on the heap of size greater than 65528 bytes may
- cause segment wrap-around, thereby destroying other data on the
- heap or causing a general protection exception on processors
- from the 80286 on upwards. This general protection exception
- #GP(0) is triggered when a word is accessed at offset FFFFh in
- a segment, even when the processor is in real mode. With no valid
- #GP(0) handler present, the system will crash upon returning
- from the INT 0Dh service routine since the exception has pushed
- an error code *after* pushing the return address, which will not
- be removed from the stack without a valid #GP(0) handler present
- when the INT 0Dh handler executes its IRET. 386 memory managers like
- QEMM or the DOS box of Windows in 386 Enhanced mode catch a #GP(0)
- exception, but plain DOS, even with MS-DOS 5.0, crashes. The
- following program illustrates the problems:
-
-
- PROGRAM HeapBug;
-
- TYPE SpcRecord = RECORD
- W1: WORD;
- W2: WORD;
- B1: BYTE;
- END;
- SmallArray = ARRAY [1..8] OF CHAR;
- BigArray = ARRAY [1..65535] OF CHAR;
- SpcArray = ARRAY [1..13107] OF SpcRecord;
-
-
- VAR P1 : ^SmallArray;
- P2 : ^BigArray;
- P3 : ^SpcArray;
- Hptr: POINTER;
-
- BEGIN
- HPtr := HeapPtr; { save initial value of heap pointer }
- WHILE HeapPtr = HPtr DO BEGIN
- New (P1); { use up blocks in freelist }
- END;
- IF Ofs (HeapPtr^) <> 8 THEN
- New (P1); { make sure large array will have ofs of 8 }
- New (P2);
- FillChar (P1^, 8, 'A'); { initialize 1st array }
- FillChar (P2^, 65534, 'B');{ initialize 2nd array -> trashes 1st array }
- IF P1^[6] <> 'A' THEN { chk if 1st array's integrity was violated }
- WriteLn ('First array trashed!');
- P3 := Pointer (P2);
- P3^[13106].W2 := $55AA; { access at ofs FFFF causes #GP(0) on 80286 }
- END.
-
- The problem here is that 80x86 segments start at 16-byte boundaries
- (paragraph boundaries), while allocation of data structures on the
- heap is aligned at 8-byte boundaries. If a data structure in the
- heap has a start address with an offset of 8 and is greater than
- 65528 bytes, accessing the very last bytes of that data structure
- will cause undesired segment wrap around. Therefore, maximum allowed
- allocation for data structures on the heap should be 65528 bytes.
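
The limit of 65528 bytes follows directly from the offset arithmetic, which can be checked with a few lines of Pascal. A sketch, with WrapDemo and Wraps as names introduced here:

```pascal
PROGRAM WrapDemo;

FUNCTION Wraps (StartOfs, Size: LONGINT): BOOLEAN;
{ TRUE if a block of Size bytes starting at offset StartOfs runs past FFFFh }
BEGIN
  Wraps := StartOfs + Size > 65536;
END;

BEGIN
  WriteLn ('65528 bytes at ofs 8: ', Wraps (8, 65528));   { FALSE }
  WriteLn ('65535 bytes at ofs 8: ', Wraps (8, 65535));   { TRUE  }
  WriteLn ('bytes wrapped       : ', 8 + 65535 - 65536);  { 7     }
END.
```

For BigArray, seven bytes land at offsets 0 to 6 of the same segment, which is where the small array allocated just before it lives in HeapBug.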
-
-
-
- 26. Logical error in GRAPH.TextWidth function
-
- The TextWidth function delivers incorrect results when fonts
- are scaled with the SetUserCharSize procedure. To compute the
- width of the string passed to it, the TextWidth function adds
- the width of all characters in the string. Depending on the
- current setting of the Direction parameter within GRAPH the
- resulting value is then multiplied and divided by either the
- MultX and DivX or the MultY and DivY scaling factors. If these
- scaling factors are not unity, this method will compute the
- wrong text width. Since text justification using the OutTextXY
- and SetTextJustify procedures relies on the TextWidth function
- for computing the starting position for string output, this
- output is not correctly justified. The TextWidth function, when
- used with user supplied font scaling factors, usually returns
- a width that is bigger than the actual width of the string. The
- correct way to compute text width is to compute the actual size
- of every character in the string using the scale factors supplied
- by the user and add these values up. An example:
-
- Suppose we want to compute the width of the string 'World'.
- Assume that the unscaled width of the characters as taken from
- the font information is 10, 7, 7, 5, 7, the output direction is
- horizontal and that the scale factors are MultX = 5 and DivX = 8.
- The current implementation of TextWidth would compute the width
- as ((10+7+7+5+7) * 5) DIV 8 = (36 * 5) DIV 8 = 180 DIV 8 = 22.
- A correct implementation, however, would calculate the width as
- follows: (10*5) DIV 8 + 3 * ((7*5) DIV 8) + (5*5) DIV 8 = (6+12+3)
- = 21. This version is correct since it uses character sizes as
- used in the OutText and OutTextXY procedures.
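
The two ways of computing the width can be compared with a short program. It is a sketch that uses only the numbers from the example above; the character widths are the assumed values for 'World' and WidthDemo is a hypothetical program name, so nothing here reflects the actual font data in GRAPH.

```pascal
PROGRAM WidthDemo;

CONST N = 5;
      CharW: ARRAY [1..N] OF WORD = (10, 7, 7, 5, 7);  { assumed widths of 'World' }
      MultX = 5;
      DivX  = 8;

VAR I, Sum, PerChar: WORD;

BEGIN
  Sum := 0;
  PerChar := 0;
  FOR I := 1 TO N DO BEGIN
    Sum := Sum + CharW [I];                             { unscaled total }
    PerChar := PerChar + (CharW [I] * MultX) DIV DivX;  { scale, truncate, add }
  END;
  WriteLn ('scaled after summing : ', (Sum * MultX) DIV DivX);  { 22 }
  WriteLn ('scaled per character : ', PerChar);                 { 21 }
END.
```

The difference comes from truncation: scaling each character discards a fraction five times, scaling the sum discards it only once.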
-
-
-
- 27. Length of descender not taken into account by SetTextJustify
-
- If text is to be written at the very bottom of the current
- graphics window, one uses SetTextJustify (AnyMode, BottomText)
- and OutTextXY (AnyX, ViewMaxY, AnyText) to accomplish that.
- However, if the text contains letters that descend below the
- base line for letters, descenders are outside the window and
- clipped off. If one wants to output text in the manner described,
- this is very annoying, since the programmer has to adjust the
- Y-coordinate himself according to the font size in effect. The
- same problem occurs if text is to be written horizontally at
- the very right of the graphics window. Obviously, the TextHeight
- function used by the SetTextJustify procedure does not account
- for descender length. To fix the problem described, justification
- should be changed to account for the overall height of characters
- including descenders.
-
-
-
-
- 28. "Snow" prevention fails on CGA due to unsafe algorithm
-
- The internal DirectWrite routine of module CRT is designed to prevent
- "snow" when writing directly to the CGA screen. However, a logic
- error keeps this snow checking from being 100% safe. The same
- criticism applies to the WriteView method in module VIEWS of Turbo
- Vision. The following is an excerpt from CRT.DirectWrite:
-
- .
- .
- @@2: LODSB ; 1; get char
- MOV BL,AL ; 2;
- @@3: IN AL,DX ; 3; wait until out of current horiz. sync,if in
- TEST AL,1 ; 4;
- JNE @@3 ; 5;
- CLI ; 6;
- @@4: IN AL,DX ; 7; wait until next horiz. sync starts
- TEST AL,1 ; 8;
- JE @@4 ; 9;
- MOV AX,BX ;10;
- STOSW ;11; write to screen
- STI ;12;
- LOOP @@2 ;13;
- .
- .
-
-
- If an interrupt occurs after line 3 and before line 6 in the
- above code fragment, the program will *not* wait for the *start* of
- the horizontal sync but only test if the CGA is *in* a horizontal
- sync upon returning from the interrupt service routine. Since
- horizontal sync allows only for the output of exactly one character
- if output starts at the very beginning of horizontal sync, there
- is a good chance that the above program writes to the screen after
- the horizontal sync has been completed, thereby causing the CGA to
- "snow". Of course, failure of the above code to prevent "snow" is
- only noticeable in a system with very high interrupt rates, e.g.
- one running serial communication as a background TSR. One additional
- disadvantage of the above code is that it makes use only of the
- horizontal sync period, although this is much shorter than the
- vertical retrace period.
-
- The following enhanced code is 100% safe in preventing snow and uses
- both the vertical and horizontal retrace periods. It has been tested
- on an original IBM CGA. Interrupt latency is only marginally higher
- than with the original code and still allows interrupt-driven serial
- communication to run at the highest possible rate of 115200 baud.
-
- DirectWrite:
-
- CMP SI, DI ; start address = end address ?
- JE EmptyStr ; yes, nothing to write
- PUSH CX ; save
- PUSH DX ; registers
- PUSH DI ; that
- PUSH DS ; must be
- PUSH ES ; preserved
- MOV CX, DI ; string end address
- SUB CX, SI ; number of characters to write
- MOV DL, CheckSnow ; get flag for snow check
- MOV DH, TextAttr ; get current attribute
- XOR AX, AX ; address BIOS data area
- MOV DS, AX ; via segment 0
- MOV AL, DS:CrtWidth+400h; width of scan line in current mode
- MUL BH ; multiply by cursor y-position
- XOR BH, BH ; clear hi-byte to prepare for addition
- ADD AX, BX ; add cursor x-position
- ADD AX, AX ; two screen bytes for every character
- XCHG AX, DI ; offset into screen memory to DI
- MOV AX, DS:Addr6845+400h; get 6845 base address
- ADD AX, 6 ; 6845 status port
- XCHG AX, DX ; AX = CheckSnow/TextAttr, DX = port
- MOV BX, 0B800H ; screen segment for color modes
- CMP DS:CrtMode+400h, 7 ; monochrome mode ?
- JNE ColorMode ; no, one of the color modes
- MOV BH, 0B0H ; screen at segment B000h if mono
- ColorMode:PUSH ES ; address character string
- POP DS ; via DS
- MOV ES, BX ; extra segment addresses screen seg
- CLD ; autoincrement for string instruct.
- OR AL, AL ; CheckSnow = TRUE ? (AH=attribute)
- JE OutLoop ; no, don't check for snow
- WriteChr: LODSB ; get character to write, AH = attrib
- XCHG AX, BX ; save character/attribute to write
- WaitHor: CLI ; interrupts disturb critical timing
- IN AL, DX ; read 6845 status
- TEST AL, 8 ; in vertical retrace ?
- JNZ WriteScr ; yes, it is safe to write to screen
- TEST AL, 1 ; in horizontal retrace ?
- JNZ WaitHor ; yes, wait until out of hor. retrace
- WaitHor2: IN AL, DX ; read 6845 status
- TEST AL, 1 ; horizontal or vertical retrace ?
- JZ WaitHor2 ; no, wait until either kind of retr.
- WriteScr: XCHG AX, BX ; in horiz. or vert. retrace: get ch
- STOSW ; write character and attribute
- STI ; interrupts ok now
- LOOP WriteChr ; write next character until all thru
- JMPS WriteDone ; screen write done
- OutLoop: LODSB ; get character to write
- STOSW ; write character and attribute
- LOOP OutLoop ; until all characters printed
- WriteDone:POP ES ; restore
- POP DS ; destroyed
- POP DI ; registers
- POP DX
- POP CX
- EmptyStr: RET
-
-
-
-
- 29. GetDir doesn't report use of invalid drive number
-
- The GetDir procedure should emit run time error 15 "Invalid
- drive number" when passed an invalid drive number. However, the
- procedure does not do the required check on the DOS return code
- and therefore never raises run time error 15. Instead, it always
- returns the string "X:\", where X can be any character
- in the IBM character set. The bug can easily be fixed by adding
- a few lines of code to the source module DIRH.ASM. The following
- program will demonstrate the bug:
-
-
- PROGRAM GetDirBug;
-
- VAR DriveNr: INTEGER;
- PathName: STRING;
-
- BEGIN
- REPEAT
- Write ('Enter Drivenumber (try also numbers > 100, 99 exits): ');
- ReadLn (DriveNr);
- GetDir (DriveNr, PathName);
- WriteLn('The path on drive ', DriveNr, ' is ', PathName);
- UNTIL DriveNr = 99;
- END. {GetDirBug}
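-
- Until the library is fixed, a program can validate the drive number
- itself before calling GetDir. The following is a minimal sketch of
- such a workaround, not the fix to DIRH.ASM mentioned above. It
- assumes a real-mode DOS environment and the Registers type, MsDos
- procedure and FCarry constant of the DOS unit; DriveIsValid is a
- hypothetical helper name. It relies on DOS function 47h (get current
- directory) returning with the carry flag set for an invalid drive.

```pascal
{$B-}   { short-circuit evaluation, so the range test guards the call }
PROGRAM CheckedGetDir;

USES Dos;

VAR DriveNr:  INTEGER;
    PathName: STRING;

{ Hypothetical helper: asks DOS directly whether the drive exists. }
FUNCTION DriveIsValid (Drive: BYTE): BOOLEAN;
VAR Regs: Registers;
    Buf:  ARRAY [0..63] OF CHAR;    { DOS wants a 64 byte path buffer }
BEGIN
  Regs.AH := $47;                   { DOS: get current directory }
  Regs.DL := Drive;                 { 0 = default, 1 = A:, 2 = B:, ... }
  Regs.DS := Seg (Buf);
  Regs.SI := Ofs (Buf);
  MsDos (Regs);
  DriveIsValid := (Regs.Flags AND FCarry) = 0;
END;

BEGIN
  Write ('Enter drive number: ');
  ReadLn (DriveNr);
  IF (DriveNr >= 0) AND (DriveNr <= 255) AND DriveIsValid (DriveNr) THEN
  BEGIN
    GetDir (DriveNr, PathName);
    WriteLn ('The path on drive ', DriveNr, ' is ', PathName);
  END
  ELSE
    WriteLn ('Invalid drive number ', DriveNr);
END. {CheckedGetDir}
```

- The explicit range test keeps values outside 0..255 from ever
- reaching the BYTE parameter, which matters when range checking
- ($R+) is enabled.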
-
-
-
-
- 30. Help bug
-
- Context sensitive help (Ctrl-F1) for the predefined arrays Port
- and PortW is missing. There was no help for these arrays in TP5.5
- either.
-
-
-
- 31. Problems with the file selector box in IDE
-
- The history list of a file selector box contains only those
- files that were selected by entering the file name in the input
- box, not those selected by double clicking the name in the
- file list, which is the standard way to select a file if the
- mouse is heavily used. Even when working mainly with the mouse
- a history list is still useful, since the desired files may
- be at the end of a file list 100 files long and one has to
- get to the right part of the file list before being able to
- double click the file name. By the way, this is also a problem
- on the Apple Macintosh, since its file select boxes do not
- have a history list feature at all. This can really be a pain
- in the neck. It is therefore strongly recommended that all
- files that have been selected with either method (that is, by
- entering the name in the input box or by double clicking the
- name in the file list) be put in the history list.
-
-
-
- 32. Possible problems in unit APP.PAS
-
- APP.PAS contains an assembler function ISqr that computes the
- integral part of the square root of its integer argument. This
- function has several shortcomings. First of all, it should more
- appropriately be named ISqrt. Second, for all arguments > 32760,
- it will enter an endless loop. Finally, it is not very fast, since
- it makes use of the IMUL instruction. Unfortunately, it is not
- clear to me whether these shortcomings pose any threat to
- program integrity. If it is desirable to fix the function,
- the following substitute could be used. It uses a more elegant
- and faster algorithm and returns the correct result for all
- non-negative INTEGERs. The code length is identical to that of
- the original routine ISqr.
-
- { ISqrt (I) computes INT (SQRT (I)), that is, the integral part of the }
- { square root of integer I. It does not check for negative arguments. }
- { For all arguments 0..MaxInt the correct result is returned. The }
- { algorithm exploits the following property: }
- { n**2 = Sigma (2i-1), summed over i = 1..n }
-
- FUNCTION ISqrt (I: INTEGER): INTEGER; ASSEMBLER;
-
- ASM
- MOV CX, I { load argument }
- MOV AX, -1 { init result }
- CWD { init odd numbers to -1 }
- XOR BX, BX { init perfect squares to 0 }
- @loop:INC AX { increment result }
- INC DX { compute }
- INC DX { next odd number }
- ADD BX, DX { next perfect square }
- CMP BX, CX { perfect square > argument ? }
- JBE @loop { until square greater than argument }
- END;
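-
- For readers who want to check the algorithm without an assembler,
- the same summation of odd numbers can be restated in plain Pascal.
- This is only a sketch for verification purposes; ISqrtRef is a
- hypothetical name, and the running square is kept in a LONGINT
- because 182*182 no longer fits into a 16-bit INTEGER.

```pascal
PROGRAM ISqrtDemo;

{ Reference version of the odd-number summation used by ISqrt. }
FUNCTION ISqrtRef (I: INTEGER): INTEGER;
VAR Root:   INTEGER;
    Odd:    LONGINT;
    Square: LONGINT;          { wider than INTEGER to avoid overflow }
BEGIN
  Root   := -1;
  Odd    := -1;
  Square := 0;
  REPEAT
    Inc (Root);               { next candidate result }
    Odd    := Odd + 2;        { next odd number }
    Square := Square + Odd;   { next perfect square }
  UNTIL Square > I;           { until square greater than argument }
  ISqrtRef := Root;
END;

BEGIN
  WriteLn (ISqrtRef (0));
  WriteLn (ISqrtRef (15));
  WriteLn (ISqrtRef (16));
  WriteLn (ISqrtRef (32767));
END. {ISqrtDemo}
```

- The four calls print 0, 3, 4 and 181. Like the assembler version,
- the sketch does not check for negative arguments.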
-
-
-
-
- 33. Poor performance of REAL type arithmetic
-
- Although this does not constitute a real bug, an analysis of the
- poor performance of the REAL type arithmetic will be given. The
- rationale here is that a 'TURBO product' should also deliver
- turbo performance wherever it can be achieved. One obvious example
- showing that there is ample room for speed improvements is the REAL
- Sqrt function. It takes more time to compute the square root to 12
- decimal places than the coprocessor emulator needs to compute the
- function result to 19 decimal places. I feel that such performance
- is unacceptable. Unfortunately, there were no improvements in TP6.0
- over TP5.5.
-
- Improvements are also possible in the LONGINT arithmetic, especially
- the division, which can be accelerated by a factor of four to six
- (depending on the CPU) when coded using the DIV instruction.
-
- Performance can be enhanced by careful register scheduling within
- all routines, thus avoiding unnecessary memory accesses. This
- measure will also reduce the overall instruction count for a routine.
- Wherever possible, time saving CPU instructions such as MUL or DIV
- should be used. This will vastly improve performance, especially on
- the 286, 386, and 486 CPUs. Most important is the choice of the
- appropriate algorithm for each function. Tests show that the REAL
- division uses the slowest out of four possible algorithms. This
- clearly indicates that not much time was invested in finding short
- but fast algorithms. On the other hand, the square root routine
- uses a basically fast algorithm (Newton's iteration), but
- obliterates its advantages by poor implementation. The transcendental
- functions are based on polynomial approximations. It seems that no
- care was taken to find the shortest and most accurate polynomials
- possible. The speed advantages possible by a careful recoding of
- the complete REAL arithmetic range from a few percent for simple
- functions like LONGINT to REAL conversion up to a factor of 20
- for the Sqrt function.
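-
- For reference, Newton's iteration for the square root of A repeats
- the step X := (X + A/X) / 2 until X stops changing. The following
- plain textbook sketch only illustrates the method; it is neither the
- run-time library routine nor the optimized replacement, and the name
- NewtonSqrt and the iteration limit are assumptions of this example.

```pascal
PROGRAM NewtonSqrtDemo;

{ Textbook Newton iteration; returns 0 for arguments <= 0. }
FUNCTION NewtonSqrt (A: REAL): REAL;
VAR X, Prev: REAL;
    N: INTEGER;
BEGIN
  IF A <= 0.0 THEN
  BEGIN
    NewtonSqrt := 0.0;          { sketch only: no error for A < 0 }
    Exit;
  END;
  IF A > 1.0 THEN X := A ELSE X := 1.0;  { start at or above the root }
  N := 0;
  REPEAT
    Prev := X;
    X := 0.5 * (X + A / X);     { one Newton step }
    Inc (N);
  UNTIL (X >= Prev) OR (N = 100);  { iterates no longer decrease }
  NewtonSqrt := X;
END;

BEGIN
  WriteLn (NewtonSqrt (2.0));
  WriteLn (Sqrt (2.0));         { library result for comparison }
END. {NewtonSqrtDemo}
```

- Starting at or above the root makes the iterates decrease
- monotonically, so the loop can stop as soon as a step fails to
- reduce X. A fast implementation would instead derive a good first
- guess from the exponent of A and run a small fixed number of steps.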
-
-
-
- 34. Inefficient string handling
-
- The string handling operations Insert, Delete, and Pos have
- always been implemented in a very simple but quite inefficient
- manner in Turbo-Pascal. There were no improvements in
- Turbo-Pascal 6.0. Since an acceleration of 300% - 400% can be
- achieved, this is hard to accept.
-
-
- *** Note: The above mentioned improvements have been realized in a
- replacement for the original SYSTEM.TPU. The source has been
- made available to BORLAND, but will not be given here. The
- library replacement (not the source though) is available
- as TPL60N15.ZIP via anonymous FTP from garbo@uwasa.fi
-
-
-
-
- ++++++++++++++++++++ Suggestions for enhancements ++++++++++++++++++++++++++
-
-
- 1. Suggested improvements for coprocessor / emulator arithmetic
-
- The routine that patches the emulator interrupts (INT 34 to INT 3D)
- back to coprocessor instructions at runtime if a coprocessor is
- present always inserts WAITs (9Bh) before the coprocessor instruction.
- However, for all coprocessors except the 8087 these WAITs are
- unnecessary, since the 287 and 387 synchronize with the CPU at
- hardware level, using ports F0h thru FFh. These WAITs can therefore
- be replaced by NOPs, resulting in somewhat faster code. With this
- simple change, performance improvements of up to 6% were observed
- in programs that make heavy use of simple coprocessor instructions
- (e.g. a linear equation solver). A new routine, which inserts NOPs
- instead of WAITs where appropriate, is presented here.
-
-
- CODE SEGMENT BYTE PUBLIC 'CODE'
-
- ASSUME CS:CODE
-
- JMPS EQU <JMP SHORT>
-
- ;------------------------------------------------------------
- ; PATCH87 is the routine responsible for converting emulator
- ; interrupts back to coprocessor opcodes if a coprocessor is
- ; detected by the startup code.
- ;
- ; This routine is 1 byte shorter than the original one and has
- ; been enhanced to generate NOPs instead of WAITs before each
- ; coprocessor instruction when the coprocessor is a 287 or 387.
- ;
- ; INPUT: none
- ; OUTPUT: none; the desired side effect is
- ; patching the code at run-time.
- ;
- ; DESTROYS: -
- ;
- ; All rights reserved (c) 1988, 1989, 1990, 1991 Norbert Juffa
- ;
- ; Borland is free to use this code if desired !
- ;-------------------------------------------------------------
-
- PATCH87 PROC FAR
- PUSH BP ; save TURBO-Pascal framepointer
- MOV BP, SP ; make new framepointer
- PUSH AX ; save
- PUSH SI ; destroyed
- PUSH DS ; registers
- TEST BYTE PTR [BP+7], 2; interrupts allowed before int ?
- JZ $intdis ; no
- STI ; yes, enable interrupts
- $intdis:LDS SI, [BP+2] ; load return address
- DEC SI ; point to int data
- MOV AX, WORD PTR [SI] ; get interrupt number & data
- DEC SI ; point to patch
- SUB AL, 34h ; 34..3D --> 0..9
- CMP AL, 9 ; interrupt valid (between 0..9) ?
- JA $invald ; invalid interrupt
- JE $fwait ; interrupt $3D --> FWAIT
- CMP AL, 8 ; interrupt $3C ?
- JE $spcial ; yes, handle segment overrides
- ADD AL, 0D8h ; new opcode
- $tst286:MOV AH, AL ; second byte of opcode
- MOV AL, 90h ; first byte is a nop
- PUSH SP ; test if
- POP BP ; 286 or
- CMP SP, BP ; higher
- JE $patch ; 286
- MOV AL, 9Bh ; convert nop to wait
- $patch: MOV WORD PTR [SI], AX ; store new opcode
- MOV BP, SP ; address stack via BP
- MOV WORD PTR [BP+8],SI; set new return address
- $endptc:POP DS ; restore
- POP SI ; destroyed
- POP AX ; registers
- POP BP ; restore TURBO-Pascal frameptr
- IRET ; done
- $fwait: MOV AX, 9B90h ; store FWAIT
- JMPS $patch ; patch it in
- $spcial:TEST AH, 20h ; bit 5 set indicates spec. func.
- JNZ $invald ; not supported, invalid
- MOV AL, AH ; generate
- AND AX, 07C0h ; segment
- SHR AL, 1 ; override
- SHR AL, 1 ; byte
- SHR AL, 1 ; and
- XOR AL, 18h ; coprocessor
- ADD AX, 0D826h ; opcode
- MOV BYTE PTR [SI+2],AH; set new opcode
- JMPS $tst286 ; put in new opcode
- $invald:JMPS $endptc ; no error handling, ignore
- PATCH87 ENDP
-
- CODE ENDS
-
- END
-
-
- Another optimization could be performed if a program is
- compiled in the $N+,E- mode. Since no emulator is used
- anyhow, the compiler could give up generating emulator
- interrupts and generate real coprocessor instructions
- instead. On CPUs > 286 neither NOPs nor WAITs would have
- to be inserted before NDP instructions. This would save
- space as well as time.
-
- Those functions that use the Borland shortcut interrupt 3Eh
- could test which NDP is present whenever this interrupt is
- called. If Test8087 = 3, the enhanced instructions (e.g. FSIN,
- FCOS) available on the 387/486/287XL could be executed. There
- would be only minimal timing overhead, but vast performance
- improvements on 386/486 machines. Since no elaborate argument
- reduction schemes are necessary, the additional code would be
- quite short.
-
- The Borland shortcut interrupt provides some functions not
- accessible from Turbo-Pascal 6.0. These functions are the tangent
- Tan (subcode F0h), the dyadic logarithm Ld (subcode F6h), the
- common logarithm Log (subcode F8h), power of two (subcode FCh),
- and power of ten (subcode FEh). Tests show that these undocumented
- functions are provided with a coprocessor as well as with the
- emulator and are fully operational. These functions should be
- made available to programmers through the SYSTEM unit and be
- documented. Tan in particular is quite useful, since it takes
- only 40% of the time of the equivalent construct Sin/Cos.
-
-
-
- 2. Inclusion of LOADALL in inline assembler
-
- Since the undocumented AAM xx and AAD xx instructions are provided
- by the inline assembler, the undocumented LOADALL instruction
- (opcode 0F05h) could be provided as well when the compiler is in
- $G+ mode. The Turbo-Debugger will correctly disassemble LOADALL.
-
-
-
- 3. Suggestions regarding 286 code generation feature ($G+)
-
- Programs compiled with the $G+ switch will have reduced memory
- requirements and will execute somewhat faster on a 286/386/486
- CPU. Typically, memory and time savings will not exceed 2%.
- Additionally, setting the $G switch on will allow the use of
- real and protected mode 286 instructions. As explained in section
- five of the README file, programs compiled with $G+ will not
- check for the presence of an appropriate processor at runtime. It
- is strongly recommended that this behavior be changed. At least
- two cases are known (one involving Borland's biggest competitor)
- where programs were shipped that had been compiled with a 286
- switch setting. Customers using them on PC type machines were
- puzzled when they discovered that the programs crashed on their
- systems although they had performed flawlessly on their office
- computers. Finally someone found the bug by tracing the program
- with a debugger. To avoid such unpleasant confusion, programs
- compiled with $G+ should execute a short routine at startup to
- determine if a 286 or later processor is present. If this is not
- the case, it should emit an error message and abort the program,
- just as programs compiled with $N+,$E- abort if they fail to
- detect a coprocessor.
-
- Since 286 real mode instructions can also be executed on NEC's
- V20/V30 processors and on the 80186/188, it might be desirable
- to have a 186 code generation feature. This would effectively
- split the $G switch into two separate switches. No changes would
- have to be made to the code generator, since it generates no 286
- protected mode instructions. Thus, generated code would be the
- same with either the 186 or the 286 switch on. However, the inline
- assembler would only recognize protected mode instructions when
- the 286 switch is on. This would allow maximum utilization of the
- 286 real mode instructions and a run time check for the CPU at the
- same time. Below is some code that can be used to distinguish between
- 8086/8088, 80188/186/V20/V30, and 80286/386/486.
-
-
- ;--------------------------------------------------------------------
- ; CPU_Test distinguishes between three groups of CPUs commonly used
- ; in computers and returns an associated code for each.
- ;
- ; OUTPUT: AX = 0 Group #0 may execute 8086 code only (8086/8088)
- ; AX = 1 Group #1 may additionally execute 286 real mode
- ; instructions (V20/V30, 80186/80188)
- ; AX = 2 Group #2 may additionally execute 286 protected
- ; mode instructions
- ;--------------------------------------------------------------------
-
- CPU_Test PROC FAR
- PUSH SP ; test updating
- POP AX ; of stackpointer
- CMP AX, SP ; stackpointer updated before push ?
- JE @Grp2 ; no, must be 286, 386 or 486
- CLC ; make sure carry clear
- PUSHA ; PUSHA executed on 88/86 as JMP $+2
- STC ; carry set if V20/V30 or 186/188
- @8086: JC @Grp1 ; yes, it's group #1
- XOR AX, AX ; CPU is 8088/8086
- RET ; done
- @Grp1: POPA ; remove pushed bytes
- MOV AX, 1 ; CPU is V20/V30 or 80186/80188
- RET ; done
- @Grp2: MOV AX, 2 ; CPU is 286/386/486
- RET ; done
- CPU_Test ENDP
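-
- As long as the compiler does not generate such a check itself, a
- program can perform one in a small unit of its own. The sketch below
- assumes that the SYSTEM unit exports the variable Test8086 with the
- values 0 = 8086/8088, 1 = 80286 and 2 = 80386 or later, as later
- Borland Pascal releases document it; if it is not available, the
- result of CPU_Test above can be used in the same way. The unit name
- Chk286 is made up for this example.

```pascal
{$G-}                      { the check itself must be plain 8086 code }
UNIT Chk286;

INTERFACE

IMPLEMENTATION

BEGIN
  { Assumption: Test8086 is 0 on 8086/8088 class processors. }
  IF Test8086 < 1 THEN
  BEGIN
    WriteLn ('This program requires an 80286 or later processor.');
    Halt (1);
  END;
END. {Chk286}
```

- The unit should come first in the USES clause of the main program,
- so that its initialization part runs before any code compiled with
- $G+. Note that this simple test also rejects V20/V30 and 80186/188
- machines, although they could execute 286 real mode instructions;
- telling them apart requires a routine like CPU_Test.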
-
-
-
- 4. Suggestions for enhancements in the code generator
-
- 4.1 Enhancing procedure entry/exit code in $G+ mode (286 code generation)
-
- When a procedure/function does not use local variables, the
- standard exit code in $G- mode is:
-
- POP BP
- RET
-
- This is replaced by the following code in $G+ mode:
-
- LEAVE
- RET
-
- However, for procedures/functions that have no local variables,
- it would be advantageous to always use the first sequence in
- either mode, $G- and $G+. Although both sequences take the same
- number of clock cycles on 286 and 386 processors, the first is
- considerably faster on the 486. Since the code generator already
- checks if no local variables are declared to generate optimized
- entry code in $G+ mode, the optimized exit code could be
- generated just as easily.
-
- Although the use of the ENTER imm16, 0 instruction does produce
- shorter code when a procedure/function has both parameters and
- local variables, the equivalent but longer (two or three bytes more)
- standard procedure entry code will execute faster than ENTER on
- all Intel processors. Therefore, it should be considered whether
- it is really desirable to use ENTER at all. As tests indicate, a
- lot of programs really do run slower on a 386DX machine if
- compiled with $G+ instead of $G-.
-
-  processor | ENTER imm16, 0     | standard entry sequence
- -----------+--------------------+------------------------
-  80286     | 11 clocks          | 3 + 2 + 3 = 8 clocks
-  80386     | 10 clocks          | 5 + 2 + 2 = 9 clocks
-  80486     | 14 clocks          | 1 + 2 + 1 = 4 clocks
-
-
-
- 4.2 Optimizing entry code for non nested procedures without parameters
- and local variables
-
- If a procedure/function takes neither any parameters nor declares
- any local variables and is not statically nested within another
- procedure/function, there is no need for any entry code. Turbo
- Pascal performs this optimization only for assembler procedures,
- but skips it for normal procedures, probably so that nested and
- non-nested procedures can use the same branch of the code
- generator. The code generator could be enhanced to generate
- procedure entry code only for those procedures that are either
- statically nested (and thus have a hidden parameter, namely the
- framepointer of the preceding procedure in the static chain),
- take parameters, or declare local variables.
-
-
-
- 5. Suggestions for IDE
-
- The status line for the edit mode should be enhanced to include the
- shortcuts F5 Zoom and F6 Next. These additional hints will exactly
- fit into the remaining space. When the IDE is in the stepping/debugging
- mode, shortcuts F4 Goto Cursor and Ctrl-F9 Run should be added to
- the status line. This would accelerate debugging sessions, since
- all program flow control could be exerted using simple mouse clicks
- on the status line.
-
-
- 6. Suggestion regarding TURBO command-line options
-
- There should be a help switch like /? or /Help on the Turbo-Pascal
- Programmer's Platform command line that displays a help screen
- which describes the other command-line switches that are available
- and explains what they will do.
-
-