(c) Copyright 1989 Commodore-Amiga, Inc. All rights reserved.
The information contained herein is subject to change without notice, and
is provided "as is" without warranty of any kind, either expressed or implied.
The entire risk as to the use of this information is assumed by the user.
Introduction to 1.3 IEEE
Double Precision Libraries
by Dale Luck
The basic double precision IEEE library has been rewritten for V1.3.
The new library is up to 4 times faster than the old one that came with
V1.2. Several bugs have also been fixed, and the routines now produce
slightly more accurate results. I've listed some benchmarks comparing
the two versions of the libraries at the end of this article.
Besides the faster software emulation of floating point, the new IEEE math
library recognizes and uses the 68020/68881 processor combination and
will use the special floating point instructions available. Also, if
an auto-configured math resource is available, it will use that as well.
Typically, this resource would point to the base of a 68881 designed as
a 16 bit IO port. But it could be another device as well.
With the new library, you also have the ability to programmatically trap
math errors such as overflow and divide by zero. Your program can now
ignore them or take suitable action without visiting the GURU.
In addition to a new version of the basic mathieeedoubbas.library, a
second library supporting transcendental functions has been added. The
new library is named mathieeedoubtrans.library, for IEEE double
precision transcendental library. It supports the same functions as the
transcendental library for the Motorola fast floating point, such as sine,
cosine, square root, etc. This library can also identify and use the
68020/68881 combination or other math resources, and it has a very fast
software square root routine.
When Should You Use These Libraries?
These libraries have been benchmarked as the fastest IEEE double precision
libraries available on the Amiga as well as outperforming almost all other
software math libraries in the Amiga class personal workstation market.
If you need the precision of IEEE double, and wish to have a transparent
improvement in speed when your programs run on machines with math
coprocessors, then you should use these libraries. All the decision
making is done by the library when it is first initialized and it will
use the fastest available resources to do your math. You only need
one program to support a standard Amiga, a 68020/68881 Amiga, or an
external math coprocessor Amiga. It works automatically.
When Should You Avoid These Libraries?
If you don't need double precision, use the Motorola fast floating point
routines. As you can see from the benchmarks, the Motorola routines are
still quite a bit faster.
If you want your math to be the fastest possible, you will want to use
the new instructions available on the 68020/68881 directly in your code.
In that case, you would not need the IEEE libraries. However this would
prevent your code from running on conventional 68000 based Amigas unless
you supply different versions of your code for each configuration.
Floating Point Formats
Here's a chart comparing the various methods of representing floating
point numbers used by Amiga system software. The IEEE double precision
libraries operate on 64 bit quantities. The Motorola FFP libraries use
32 bits.
Note that there is a "hidden" bit in the fraction part of IEEE numbers.
Since all numbers are normalized, the leading 1 is dropped off.
                                     Motorola     Single       Double
Field Size (bits)                    FFP          IEEE         IEEE

Sign                                 1            1            1
Exponent                             7            8            11
Fraction                             24           23+1         52+1
Total                                32           32           64

Minimum (+) number                   5.4e-20      1.2e-38      2.2e-308
Largest (+) number                   9.2e+18      3.4e+38      1.8e+308
Minimum (+) number (denormalized)    n/a          1.4e-45      4.9e-324

Denormalized means reduced in precision so that numbers closer
to zero can be represented.
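
The double precision limits in the chart follow directly from the field
sizes. Here is a minimal sketch in portable C, not Amiga library code, that
builds them from bit patterns. It assumes the host's double is a 64 bit
IEEE number stored with the same byte order as a 64 bit integer. The
function names are made up for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Assemble an IEEE double from its three fields:
   1 sign bit, 11 exponent bits, 52 stored fraction bits.
   ASSUMPTION: the host double is a 64 bit IEEE number. */
double dp_from_fields(unsigned sign, unsigned exponent, uint64_t fraction)
{
    uint64_t bits = ((uint64_t)(sign & 1u) << 63)
                  | ((uint64_t)(exponent & 0x7FFu) << 52)
                  | (fraction & 0x000FFFFFFFFFFFFFULL);
    double value;

    memcpy(&value, &bits, sizeof value);   /* reinterpret the bit pattern */
    return value;
}

/* Smallest normalized number: exponent field 1, fraction all zeros. */
double dp_min_normal(void)
{
    return dp_from_fields(0, 1, 0);
}

/* Largest finite number: exponent field 0x7FE, fraction all ones. */
double dp_max_finite(void)
{
    return dp_from_fields(0, 0x7FE, 0x000FFFFFFFFFFFFFULL);
}

/* Smallest denormalized number: exponent field 0, lowest fraction bit set. */
double dp_min_denormal(void)
{
    return dp_from_fields(0, 0, 1);
}
```

Printing the three results with "%e" should reproduce the Double IEEE
column of the chart.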
Floating Point Representation
+--------+--------+--------+--------+
|ffffffff|ffffffff|ffffffff|Seeeeeee| Motorola FFP
+--------+--------+--------+--------+
+--------+--------+--------+--------+
|Seeeeeee|efffffff|ffffffff|ffffffff| IEEE Single
+--------+--------+--------+--------+
IEEE Double
+--------+--------+--------+--------+--------+--------+--------+--------+
|Seeeeeee|eeeeffff|ffffffff|ffffffff|ffffffff|ffffffff|ffffffff|ffffffff|
+--------+--------+--------+--------+--------+--------+--------+--------+
S = Sign bit
f = fraction bits
e = exponent bits
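
To show how the FFP layout differs from the IEEE ones, here is a rough
sketch of a decoder in portable C. The bit positions come from the diagram.
The rest is an assumption on my part that this article does not spell out:
the exponent is taken to be stored excess-64, the fraction is taken to be a
plain 24 bit value between 0.5 and 1.0 with no hidden bit, and an all-zero
fraction is taken to mean zero. The function name is made up for
illustration.

```c
#include <stdint.h>

/* Decode a 32 bit Motorola FFP pattern laid out as in the diagram:
   24 fraction bits on top, then the sign bit, then 7 exponent bits.
   ASSUMPTIONS (not stated in the article): excess-64 exponent,
   fraction in [0.5, 1.0) with no hidden bit, zero fraction means 0.0. */
double ffp_to_double(uint32_t ffp)
{
    uint32_t fraction = ffp >> 8;                 /* upper 24 bits */
    int      negative = (int)((ffp >> 7) & 1u);   /* S bit */
    int      exponent = (int)(ffp & 0x7Fu) - 64;  /* remove the bias */
    double   value;

    if (fraction == 0)
        return 0.0;

    value = (double)fraction / 16777216.0;        /* fraction / 2^24 */

    /* Scale by 2^exponent; every step is exact in binary. */
    while (exponent > 0) { value *= 2.0; exponent--; }
    while (exponent < 0) { value /= 2.0; exponent++; }

    return negative ? -value : value;
}
```

Under those assumptions the pattern 0x80000041 decodes to 1.0 and
0x800000C1 decodes to -1.0.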
The scheme used in IEEE floating point representation includes a few
"special" numbers. Certain patterns of bits are used to represent
exceptions:
o NAN 'Not A Number' (result of 0/0)
o INF 'Infinity' (result of 1/0)
There are other assigned patterns in addition to these two.
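
The special patterns can be recognized from the exponent and fraction
fields alone. The sketch below is portable C rather than anything from the
Amiga libraries, and it assumes the host's double is a 64 bit IEEE number.
The names dp_class, dp_from_bits and dp_classify are made up for
illustration.

```c
#include <stdint.h>
#include <string.h>

enum dp_class { DP_ZERO, DP_DENORMAL, DP_NORMAL, DP_INF, DP_NAN };

/* Reinterpret a 64 bit pattern as a double (assumes IEEE doubles). */
double dp_from_bits(uint64_t bits)
{
    double value;

    memcpy(&value, &bits, sizeof value);
    return value;
}

/* Classify an IEEE double by looking only at its bit fields. */
enum dp_class dp_classify(double value)
{
    uint64_t bits;
    uint64_t fraction;
    unsigned exponent;

    memcpy(&bits, &value, sizeof bits);
    exponent = (unsigned)((bits >> 52) & 0x7FFu);   /* 11 exponent bits */
    fraction = bits & 0x000FFFFFFFFFFFFFULL;        /* 52 fraction bits */

    if (exponent == 0x7FFu)                 /* exponent all ones */
        return fraction == 0 ? DP_INF : DP_NAN;
    if (exponent == 0)                      /* exponent all zeros */
        return fraction == 0 ? DP_ZERO : DP_DENORMAL;
    return DP_NORMAL;
}
```

An exponent of all ones with a zero fraction is INF, and with any nonzero
fraction it is a NAN. The sign bit is ignored, so both signs of zero and
infinity classify the same way.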
Using the Libraries
The new IEEE libraries should be placed in the LIBS: directory. Use
the mathieeedoubbas.library to replace the old library of that same
name. The mathieeedoubtrans.library is an all new addition.
Code that calls routines in these libraries will have to be linked
to the new .lib files which also have awkward names. They are
mathieeedoubbas_lib.lib and mathieeedoubtrans_lib.lib. And there
is a new .fd file for the transcendental functions.
Using the IEEE routines is straightforward - they are a standard
library. Simply open the library, use the routines and close the
library when you are done. For example, to use the Sine routine:
/* IEEE Sine Routine                                      */
/* Compile under Lattice 4.0 by linking with c.o +        */
/* mathieeedoubbas_lib.lib + mathieeedoubtrans_lib.lib    */
/* + lcm.lib + lc.lib + amiga.lib                         */

#include <stdio.h>

double IEEEDPSin();

extern int MathIeeeDoubBasBase;   /* defined elsewhere at link time */
int MathIeeeDoubTransBase;

void
main()
{
    double x=0;

    /* Open the basic library first; version 0 accepts any version. */
    MathIeeeDoubBasBase=OpenLibrary("mathieeedoubbas.library",0);
    if(MathIeeeDoubBasBase==0) exit(20);    /* 20 = RETURN_FAIL */

    MathIeeeDoubTransBase=OpenLibrary("mathieeedoubtrans.library",0);
    if(MathIeeeDoubTransBase==0)
    {
        CloseLibrary(MathIeeeDoubBasBase);
        exit(20);
    }

    /* The argument is an angle in radians, not degrees. */
    x=IEEEDPSin( (double) 60 );
    printf("sin 60 = %e\n",x);

    /* Close in the reverse order of opening. */
    CloseLibrary(MathIeeeDoubTransBase);
    CloseLibrary(MathIeeeDoubBasBase);
}
Hardware Developer Information
To make use of CBM's standard peripheral support for 68881 you must design
your peripheral to autoconfig. Your autoconfig software must create a
resource and add it to the resource list. The name of this resource is
"MathIEEE.resource". The IEEE library will attempt to open this resource.
If it finds it, it will extract the BaseAddr pointer and copy it into its
library structure. If the BaseAddr pointer is non-null it will use a
different list of routine entry points when the IEEE library is initialized.
After the IEEE library is initialized, the library again checks the resource
for alternate function bits in Flags of the resource. The Basic library only
checks the DblBasAlt bit, and the transcendental library only checks the
DblTransAlt bit. If they are set, the library routine will call the function
whose address is in the corresponding Init field. The arguments passed are
a6=sysbase, a1=resource and a2=mathlibrary.
If your device is not a 68881, you may need to use this alternate
initialization mechanism. There are separate bits for different library
capabilities in case your math resource is only able to handle a limited
set of functions. This lets you tie in a math processor that may only
provide addition, subtraction, multiplication and division functions. The
rest of the software will use it transparently by calling your alternate
routines.
The Amiga does not provide for arbitrating a math accelerator in a multitasking
environment. Therefore, you must provide your own support for this when your
device autoconfigs. The only exception is the 68020/68881 combination, for which
support has been standard since V1.2. Arbitration usually involves
saving and restoring the state of your hardware device between task switches.
We recommend that you look at the tc_Switch and tc_Launch vectors in the task
data structure. These are called each time control transfers from one task to
another. Remember not to assume that you are the only process needing to use
those vectors.
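
The point about not being the only user of those vectors can be sketched
as follows. This is portable C with a made-up mock_task structure standing
in for the real task structure, so treat it as an outline of the chaining
idea only. It also keeps a single saved pair of vectors, where real code
would need one pair per task.

```c
#include <stddef.h>

typedef void (*task_hook)(void);

/* Hypothetical stand-in for the two vectors in the exec task structure. */
struct mock_task {
    task_hook tc_Switch;   /* called when the task gives up the CPU */
    task_hook tc_Launch;   /* called when the task gets the CPU back */
};

int state_saves;      /* times the (pretend) device state was saved */
int state_restores;   /* times it was restored */
int other_calls;      /* calls that reached somebody else's hook */

static task_hook previous_switch;   /* whoever owned the vectors before us */
static task_hook previous_launch;

/* Somebody else's switch hook, used only to demonstrate chaining. */
void other_switch(void) { other_calls++; }

static void my_switch(void)
{
    state_saves++;                  /* save the device state here */
    if (previous_switch != NULL)
        previous_switch();          /* then let the earlier owner run */
}

static void my_launch(void)
{
    if (previous_launch != NULL)
        previous_launch();          /* earlier owner first */
    state_restores++;               /* restore the device state here */
}

/* Remember the old vectors before installing our own. */
void install_hooks(struct mock_task *task)
{
    previous_switch = task->tc_Switch;
    previous_launch = task->tc_Launch;
    task->tc_Switch = my_switch;
    task->tc_Launch = my_launch;
}

/* Put the old vectors back when we are done. */
void remove_hooks(struct mock_task *task)
{
    task->tc_Switch = previous_switch;
    task->tc_Launch = previous_launch;
}
```

The important part is that install_hooks() remembers what was there
before, and each hook passes control on to the earlier owner.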
The resource data structure is as follows:
    STRUCTURE MathIEEE,LN_SIZE
        UWORD   MathIEEE_Flags
        ULONG   MathIEEE_BaseAddr        ; for standard 68881 support
        ULONG   MathIEEE_DblBasInit      ; something else besides 68881
        ULONG   MathIEEE_DblTransInit    ; something else besides 68881
        ULONG   MathIEEE_SnglBasInit     ; something else besides 68881
        ULONG   MathIEEE_SnglTransInit   ; something else besides 68881
        LABEL   MathIEEE_sizeof

*
* Bits for MathIEEE_Flags. All unassigned bits must be 0
*
    BITDEF  MathIEEE,DblBasAlt,0         ; alternate Basic library
    BITDEF  MathIEEE,DblTransAlt,1       ; alternate Trans library
    BITDEF  MathIEEE,SnglBasAlt,2        ; alternate Basic library
    BITDEF  MathIEEE,SnglTransAlt,3      ; alternate Trans library

The MathIEEE resource structure may grow in the future. Extensions will be
added as Commodore-Amiga adds new standards such as 80 bit extended format.
The 'Init' entries in the math resource structure are only used if the
corresponding Bit is set in the Flags field. So if you are just a 68881,
you do not need the Init entries. Make sure you have cleared the Flags field.
This should allow us to add Extended Precision later. For Init users, make
sure you add yourself into the Open/Close/Expunge vectors for this library.
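
Here is how I would sketch the initialization steps described above for
the basic double precision library, in portable C. Everything in it is
illustrative: mock_math_resource is a cut-down mirror of the resource with
the list node left out, the three pointer arguments stand in for a6, a1 and
a2, the backend names are made up, and the check for a null Init pointer is
my own addition. The sketch ignores detection of an on-board 68020/68881.

```c
#include <stddef.h>
#include <stdint.h>

/* Bit masks built from the BITDEF bit numbers. */
#define MATHIEEE_DBLBASALT     (1u << 0)
#define MATHIEEE_DBLTRANSALT   (1u << 1)
#define MATHIEEE_SNGLBASALT    (1u << 2)
#define MATHIEEE_SNGLTRANSALT  (1u << 3)

/* Stand-in for an Init routine; the arguments mirror a6, a1 and a2. */
typedef void (*alt_init_fn)(void *sysbase, void *resource, void *mathlibrary);

/* Hypothetical C mirror of the resource (leading list node omitted). */
struct mock_math_resource {
    uint16_t    flags;
    void       *base_addr;        /* base of a peripheral 68881, or NULL */
    alt_init_fn dbl_bas_init;
    alt_init_fn dbl_trans_init;
    alt_init_fn sngl_bas_init;
    alt_init_fn sngl_trans_init;
};

enum math_backend { BACKEND_SOFTWARE, BACKEND_68881_IO };

int alt_init_calls;   /* counts calls to the demonstration Init routine */

/* Demonstration Init routine; a vendor would patch entry points here. */
void demo_alt_init(void *sysbase, void *resource, void *mathlibrary)
{
    (void)sysbase; (void)resource; (void)mathlibrary;
    alt_init_calls++;
}

/* Outline of what the basic double library does with the resource. */
enum math_backend init_dbl_bas(struct mock_math_resource *res,
                               void *sysbase, void *mathlibrary)
{
    enum math_backend backend = BACKEND_SOFTWARE;

    if (res == NULL)                  /* resource not found */
        return backend;

    if (res->base_addr != NULL)       /* use the alternate entry points */
        backend = BACKEND_68881_IO;

    /* The basic library looks only at its own Alt bit. */
    if ((res->flags & MATHIEEE_DBLBASALT) && res->dbl_bas_init != NULL)
        res->dbl_bas_init(sysbase, res, mathlibrary);

    return backend;
}
```

A plain peripheral 68881 would fill in base_addr and leave flags at zero,
so none of the Init routines is ever called.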
The library structure that is used is tentatively laid out as shown below.
I say tentatively because the name of the entries may change yet. The order
of entries, their usage and size will not change. Naturally we may add new
fields to the end.
    STRUCTURE MI,LIB_SIZE           ; Standard library node
        UBYTE   io8_Flags           ; is this 68881?
        UBYTE   io8_pad             ; line up to next 32bit boundary
        ULONG   io8_68881           ; ptr to io68881 base
        ULONG   io8_SysLib          ; ptr to SysBase
        ULONG   io8_SegList         ; ptr to this SegList
        ULONG   io8_Resource        ; ptr to mathIEEE.resource
        ULONG   io8_opentask        ; called when task opens
        ULONG   io8_closetask       ; called when task closes
        LABEL   MI_SIZE

Of particular interest to hardware developers are the opentask and closetask
entry points. These functions will be called when a task calls OpenLibrary
and CloseLibrary. This will give the vendor the opportunity to set up any
per task initialization necessary. The Amiga library presently sets them up
as NOPs in the case of straight emulation. It puts the 68881 initialization
code in there for the 68020/68881 as well as the peripheral 68881. That
initialization code currently sets up rounding modes and interrupt requests.
If you need to override the defaults, you will have to set the appropriate
Alt bits in the Resource structure and overwrite the opentask/closetask
fields when your AltInit function is called. The OpenLibrary routine checks
the return value of opentask for errors. If a nonzero is in d0.l then
OpenLibrary will return 0 to the task trying to OpenLibrary.
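
The open and close behavior just described can be outlined like this.
It is a portable C mock, not the library's actual code: mock_math_lib is a
made-up structure holding only the two entry points, and the long return
value stands in for d0.

```c
#include <stddef.h>

typedef long (*task_fn)(void);

/* Hypothetical cut-down library base holding just the two entry points. */
struct mock_math_lib {
    int     open_count;
    task_fn opentask;    /* called when a task opens the library */
    task_fn closetask;   /* called when a task closes it */
};

/* Default for straight software emulation: nothing to set up. */
long nop_task(void) { return 0; }

/* Example vendor hook that reports a failure (nonzero in d0). */
long failing_opentask(void) { return 1; }

/* The open succeeds only if the per-task setup returns zero. */
struct mock_math_lib *mock_open(struct mock_math_lib *lib)
{
    if (lib->opentask() != 0)
        return NULL;             /* the caller sees a failed open */
    lib->open_count++;
    return lib;
}

/* Give the vendor a chance to clean up per-task state on close. */
void mock_close(struct mock_math_lib *lib)
{
    lib->closetask();
    lib->open_count--;
}
```

A vendor's Init routine would replace nop_task in both fields with its own
per-task setup and cleanup code.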
On the 68020/68881 some new exceptions are generated. Unfortunately the
V1.2 operating system does not properly initialize these. For users of the
new ramkick/A2024 system, the fixes have been added to the exec.library.
For the rest we provide a program to run during your startup sequence to
initialize the vectors and redirect processing back to exec when the new
exceptions occur. This is only necessary on 68020/68881 systems.
Benchmarks
This section contains some benchmarks comparing the performance of the
various Amiga math libraries. Use these as a guide when selecting the
math routines to be used for your application.
All these benchmarks show the results when compiling under Green Hills C.
The results you get with another compiler will vary.
How does V1.3 stack up to V1.2?
                    A Comparison of Software

                      V1.2          V1.3          V1.2
                      IEEE          IEEE          MathFFP
Float
  10000 (secs)        92.14         45.22         17.64
  256000 (secs)       580.58        282.52        136.78
Calcpi
  (kflops/sec)        2.07          4.93          11.14
  PI error            -5.5e-14      -1.4e-11      6.1e-5
Whetstone
  (kwhets/sec)        12            24            78
Savage
  (secs)              N/A           470           98.2

System tested: A1000, 512k chip memory, 1 external floppy
Transparent Increase in Speed
                      V1.3/000      000/881       020/881
Float
  10000 (secs)        45.22         19.18         13.46
  256000 (secs)       282.52        179.98        122.46
Calcpi
  (kflops/sec)        4.93          7.89          11.78
  PI error            -1.39e-11     -2.78e-11     -2.78e-11
Whetstone
  (kwhets/sec)        24            81            124
Savage
  (secs)              470           20.4          15.2
  error               -6.9e-7       -5.6e-7       -5.6e-7

Systems tested:
V1.3/000 was an A1000 with 512k.
000/881 was an A1000 with 512k plus 2M and Microbotic's "881 Starmath".
020/881 was an A2000 with CSA's 68020/68881, 2M memory and a 2090a.
Penultimate Speed Tests:
                    Comparison of Speed Using
                      Inline F instructions

                      V1.3/000      020/881
Float
  10000 (secs)        45.22         0.26*
  256000 (secs)       282.52        15.86
Calcpi
  (kflops/sec)        4.93          81.3
Whetstone
  (kwhets/sec)        24            459
Savage
  (secs)              470           4.6

Systems tested:
V1.3/000 was an A1000 with 512k and 1 external floppy.
020/881 was an A2000 with CSA's 68020/881, 2M memory and a 2090a.
Note: Under this test, the 020/881 test code will not run on a
standard 68000 based system.
* The Green Hills compiler may have optimized this benchmark to nothing.
Penultimate Speed Tests, II:
                    Inline Results With
                    Fast 32-Bit Memory

                                                          Inline      Inline
                      020         020/881     030/882     020/881     030/882
Float
  10000 (secs)        25.6        6.08        5.16        0.24*       0.18*
  256000 (secs)       168.74      54.08       47.52       15.28       13.16
Calcpi
  (kflops/sec)        8.44        25.29       28.8        90.09       114.42
Whetstone
  (kwhets/sec)        39          263         291         673         889
Savage
  (secs)              320.8       8.4         7.6         4.46        3.98

Systems tested:
020 was an A2000 with CSA's 020 board running at 14 MHz.
020/881 was an A2000 with CSA's 020/881 board running at 14 MHz.
030/882 was an A2000 with CSA's 030/882 board running at 14/16 MHz.
* The Green Hills compiler may have optimized this benchmark to nothing.