- INTEL 80387 PROGRAMMER'S REFERENCE MANUAL 1987
-
- MARCOM DISCLAIMER -- New word: Intel Certified, iRMK, SupportNET
- May 26, 1987
-
- Intel Corporation makes no warranty for the use of its products and
- assumes no responsibility for any errors which may appear in this document
- nor does it make a commitment to update the information contained herein.
-
- Intel retains the right to make changes to these specifications at any
- time, without notice.
-
- Contact your local sales office to obtain the latest specifications before
- placing your order.
-
- The following are trademarks of Intel Corporation and may only be used to
- identify Intel Products:
-
- Above, BITBUS, COMMputer, CREDIT, Data Pipeline, FASTPATH, Genius, i, î,
- ICE, iCEL, iCS, iDBP, iDIS, I²ICE, iLBX, im, iMDDX, iMMX, Inboard,
- Insite, Intel, intel, intelBOS, Intel Certified, Intelevision,
- inteligent Identifier, inteligent Programming, Intellec, Intellink,
- iOSP, iPDS, iPSC, iRMK, iRMX, iSBC, iSBX, iSDM, iSXM, KEPROM, Library
- Manager, MAPNET, MCS, Megachassis, MICROMAINFRAME, MULTIBUS, MULTICHANNEL,
- MULTIMODULE, MultiSERVER, ONCE, OpenNET, OTP, PC BUBBLE, Plug-A-Bubble,
- PROMPT, Promware, QUEST, QueX, Quick-Pulse Programming, Ripplemode, RMX/80,
- RUPI, Seamless, SLD, SugarCube, SupportNET, UPI, and VLSiCEL, and the
- combination of ICE, iCS, iRMX, iSBC, iSBX, iSXM, MCS, or UPI and a numerical
- suffix, 4-SITE.
-
- MDS is an ordering code only and is not used as a product name or
- trademark. MDS(R) is a registered trademark of Mohawk Data Sciences
- Corporation.
-
- *MULTIBUS is a patented Intel bus.
- Unix is a trademark of AT&T Bell Labs.
- MS-DOS, XENIX, and Multiplan are trademarks of Microsoft Corporation.
- Lotus and 1-2-3 are registered trademarks of Lotus Development Corporation.
- SuperCalc is a registered trademark of Computer Associates International.
- Framework is a trademark of Ashton-Tate.
- System 370 is a trademark of IBM Corporation.
- AT is a registered trademark of IBM Corporation.
-
- Additional copies of this manual or other Intel literature may be obtained
- from:
-
- Intel Corporation
- Literature Distribution
- Mail Stop SC6-59
- 3065 Bowers Avenue
- Santa Clara, CA 95051
-
- (c)INTEL CORPORATION 1987 CG-5/26/87
-
-
- Customer Support
-
- ───────────────────────────────────────────────────────────────────────────
-
- Customer Support is Intel's complete support service that provides Intel
- customers with hardware support, software support, customer training, and
- consulting services. For more information contact your local sales offices.
-
- After a customer purchases any system hardware or software product,
- service and support become major factors in determining whether that
- product will continue to meet a customer's expectations. Such support
- requires an international support organization and a breadth of programs
- to meet a variety of customer needs. As you might expect, Intel's customer
- support is quite extensive. It includes factory repair services and
- worldwide field service offices providing hardware repair services,
- software support services, customer training classes, and consulting
- services.
-
- Hardware Support Services
-
- Intel is committed to providing an international service support package
- through a wide variety of service offerings available from Intel Hardware
- Support.
-
- Software Support Services
-
- Intel's software support consists of two levels of contracts. Standard
- support includes TIPS (Technical Information Phone Service), updates and
- subscription service (product-specific troubleshooting guides and COMMENTS
- Magazine). Basic support includes updates and the subscription service.
- Contracts are sold in environments which represent product groupings
- (e.g., the iRMX environment).
-
- Consulting Services
-
- Intel provides field systems engineering services for any phase of your
- development or support effort. You can use our systems engineers in a
- variety of ways ranging from assistance in using a new product, developing
- an application, personalizing training, and customizing or tailoring an
- Intel product to providing technical and management consulting. Systems
- Engineers are well versed in technical areas such as microcommunications,
- real-time applications, embedded microcontrollers, and network services.
- You know your application needs; we know our products. Working together we
- can help you get a successful product to market in the least possible time.
-
- Customer Training
-
- Intel offers a wide range of instructional programs covering various
- aspects of system design and implementation. In just three to ten days a
- limited number of individuals learn more in a single workshop than in
- weeks of self-study. For optimum convenience, workshops are scheduled
- regularly at Training Centers worldwide or we can take our workshops to
- you for on-site instruction. Covering a wide variety of topics, Intel's
- major course categories include: architecture and assembly language,
- programming and operating systems, BITBUS and LAN applications.
-
- Training Center Locations
-
- To obtain a complete catalog of our workshops, call the nearest Training
- Center in your area.
-
- Boston (617) 692-1000
- Chicago (312) 310-5700
- San Francisco (415) 940-7800
- Washington D.C. (301) 474-2878
- Israel (972) 349-491-099
- Tokyo 03-437-6611
- Osaka (Call Tokyo) 03-437-6611
- Toronto, Canada (416) 675-2105
- London (0793) 696-000
- Munich (089) 5389-1
- Paris (01) 687-22-21
- Stockholm (468) 734-01-00
- Milan 39-2-82-44-071
- Benelux (Rotterdam) (10) 21-23-77
- Copenhagen (1) 198-033
- Hong Kong 5-215311-7
-
-
- Preface
-
- ───────────────────────────────────────────────────────────────────────────
-
- This manual describes the 80387 Numeric Processor Extension (NPX) for the
- 80386 microprocessor. Understanding the 80387 requires an understanding of
- the 80386; therefore, a brief overview of 80386 concepts is presented first.
- A detailed discussion of the 80386 microprocessor can be found in the 80386
- Programmer's Reference Manual.
-
- The 80386 Microsystem
-
- The 80386 is the basis of a new VLSI microprocessor system with exceptional
- capabilities for supporting large-system applications. This powerful
- microsystem is designed to support multiuser reprogrammable and real-time
- multitasking applications. Its dedicated system support circuits simplify
- system hardware; sophisticated hardware and software tools reduce both the
- time and the cost of product development. The 80386 microsystem offers a
- total-solution approach, enabling you to develop high-speed, interactive,
- multiuser, multitasking──even multiprocessor──systems more rapidly and at
- higher performance than ever before.
-
- ■ Reliability and system up-time are becoming increasingly important in
- all applications. Information must be protected from misuse or
- accidental loss. The 80386 includes a sophisticated and flexible
- four-level protection mechanism that can isolate layers of operating
- system programs from application programs to maintain a high degree of
- system integrity.
-
- ■ The 80386 addresses up to 4 gigabytes of physical memory to support
- today's application requirements. This large physical memory enables
- the 80386 to keep many large programs and data structures
- simultaneously in memory for high-speed access.
-
- ■ For applications with dynamically changing memory requirements, such
- as multiuser business systems, the 80386 CPU provides on-chip memory
- management and virtual memory support. On an 80386-based system, each
- user can have up to 64 terabytes of virtual-address space. This large
- address space virtually eliminates restrictions on the size of programs
- that may be part of the system. The memory management features are
- subject to control of systems software; therefore, systems software
- designers can choose among a variety of memory-organization models.
- Systems designers can choose to view memory in terms of fixed-length
- pages, in terms of variable length segments, or as a combination of
- pages and segments. The sizes of segments can range from one byte to 4
- gigabytes. Virtual memory can be implemented either at the level of
- segments or at the level of pages.
-
- ■ Large multiuser or real-time multitasking systems are easily supported
- by the 80386. High-performance features, such as a very high-speed task
- switch, fast interrupt-response time, intertask protection,
- page-oriented virtual memory, and a quick and direct operating system
- interface, make the 80386 highly suited to multiuser/multitasking
- applications.
-
- ■ The 80386 has two primary operating modes: real-address mode and
- protected mode. In real-address mode, the 80386/80387 is fully upward
- compatible from the 8086, 8088, 80186, and 80188 microprocessors and
- from the 80286 real-address mode; all of the extensive libraries of
- 8086 and 8088 software execute 15 to 20 times faster on the 80386,
- without any modification.
-
- ■ In protected-address mode, the advanced memory management
- and protection features of the 80386 become available, without any
- reduction in performance. Upgrading 8086 and 8088 application
- programs to use these new memory management and protection features
- usually requires only reassembly or recompilation (some programs may
- require minor modification). Entire 80286 protected-mode applications
- can run in this mode without modification.
-
- ■ The virtual-8086 mode of the 80386 is available when the primary mode
- is protected mode. Virtual-8086 mode enables direct execution of
- multiple 8086/8088 programs within a protected-mode environment. Most
- 8086 and 8088 application programs can be executed in this environment
- without alteration (refer to the 80386 Programmer's Reference Manual
- for differences from 8086). This high degree of compatibility between
- 80386 and earlier members of the 8086 processor family reduces both
- the time and the cost of software development.
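-
- The address-space figures quoted in the list above can be cross-checked
- with a little arithmetic. The Python sketch below assumes the usual 80386
- accounting, in which a task can name 2^(14) segments (a 13-bit selector
- index plus one table-indicator bit) and each segment is addressed by a
- 32-bit offset. That breakdown is taken from the 80386 architecture rather
- than from this preface, and the constant names are illustrative only.
-
```python
# Cross-check of the 80386 address-space sizes quoted in the preface.
# Assumption (not stated in the preface itself): a task can name 2**14
# segments, and each segment is addressed by a 32-bit offset.

PHYSICAL_ADDRESS_BITS = 32   # width of a physical address
OFFSET_BITS = 32             # width of an offset within one segment
SEGMENT_SELECT_BITS = 14     # 13-bit index + 1 table-indicator bit (assumed)

GIB = 2**30
TIB = 2**40

physical_bytes = 2**PHYSICAL_ADDRESS_BITS
max_segment_bytes = 2**OFFSET_BITS
virtual_bytes = 2**SEGMENT_SELECT_BITS * max_segment_bytes

print(physical_bytes // GIB, "gigabytes of physical memory")
print(max_segment_bytes // GIB, "gigabytes in the largest segment")
print(virtual_bytes // TIB, "terabytes of virtual-address space per task")
```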
-
- The Organization of This Manual
-
- This manual describes the 80387 Numeric Processor Extension (NPX) for the
- 80386 microprocessor. The material in this manual is presented from the
- perspective of software designers, both at an applications and at a systems
- software level.
-
- ■ Chapter 1, "Introduction to the 80387 Numerics Processor Extension,"
- gives an overview of the 80387 NPX and reviews the concepts of numeric
- computation using the 80387.
-
- ■ Chapter 2, "80387 Numerics Processor Architecture," presents the
- registers and data types of the 80387 to both applications and systems
- programmers.
-
- ■ Chapter 3, "Special Computational Situations," discusses the special
- values that can be represented in the 80387's real formats──denormal
- numbers, zeros, infinities, NaNs (not a number)──as well as numerics
- exceptions. This chapter should be read thoroughly by systems
- programmers, but may be skimmed by applications programmers. Many of
- these special values and exceptions may never occur in applications
- programs.
-
- ■ Chapter 4, "80387 Instruction Set," provides functional information
- for software designers generating applications for systems containing
- an 80386 CPU with an 80387 NPX. The 80386/80387 instruction set
- mnemonics are explained in detail.
-
- ■ Chapter 5, "Programming Numeric Applications," provides a description
- of programming facilities for 80386/80387 systems. A comparative 80387
- programming example is given.
-
- ■ Chapter 6, "System-Level Numeric Programming," provides information of
- interest to systems software writers, including details of the 80387
- architecture and operational characteristics.
-
- ■ Chapter 7, "Numeric Programming Examples," provides several detailed
- programming examples for the 80387, including conditional branching,
- the conversion between floating-point values and their ASCII
- representations, and the use of trigonometric functions. These examples
- illustrate assembly-language programming on the 80387 NPX.
-
- ■ Appendix A, "Machine Instruction Encoding and Decoding," gives
- reference information on the encoding of NPX instructions. This
- information is useful to writers of debuggers, exception handlers, and
- compilers.
-
- ■ Appendix B, "Exception Summary," provides a list of the exceptions
- that each instruction can cause. This list is valuable to both
- applications and systems programmers.
-
- ■ Appendix C, "Compatibility between the 80387 and the 80287/8087,"
- describes the differences from the 80387 that are common to the 80287
- and the 8087.
-
- ■ Appendix D, "Compatibility between the 80387 and the 8087," describes
- the additional differences between the 80387 and the 8087 that are of
- concern when porting 8086/8087 programs directly to the 80386/80387.
-
- ■ Appendix E, "80387 80-Bit CHMOS III Numeric Processor Extension,"
- reproduces a data sheet of 80387 specifications that is separately
- available. The table of instruction timings in this appendix will be of
- interest to many readers of this manual. (The AC specifications have
- been deliberately left out.) The specifications in data sheets are
- subject to change; consult the most recent data sheet for design-in
- information.
-
- ■ Appendix F, "PC/AT-Compatible 80387 Connection," documents a
- nonstandard method of connecting an 80387 to an 80386 to achieve
- compatibility with the IBM PC/AT.
-
- ■ The Glossary defines 80387 and floating-point terminology. Refer to it
- as needed.
-
- Related Publications
-
- To best use the material in this manual, readers should be familiar with
- the operation and architecture of 80386 systems. The following manuals
- contain information related to the content of this manual and of interest to
- programmers of 80387 systems:
-
- ■ Introduction to the 80386, order number 231252
- ■ 80386 Data Sheet, order number 231630
- ■ 80386 Hardware Reference Manual, order number 231732
- ■ 80386 Programmer's Reference Manual, order number 230985
- ■ 80387 Data Sheet, order number 231920
-
-
- Notational Conventions
-
- This manual uses special notation to represent subscript and superscript
- characters. Subscript characters are surrounded by {curly brackets}, for
- example 10{2} = 10 base 2. Superscript characters are preceded by a caret
- and enclosed within (parentheses), for example 10^(3) = 10 to the third
- power.
-
-
- Table of Contents
-
- ────────────────────────────────────────────────────────────────────────────
-
- Chapter 1 Introduction to the 80387 Numerics Processor Extension
-
- 1.1 History
- 1.2 Performance
- 1.3 Ease of Use
- 1.4 Applications
- 1.5 Upgradability
- 1.6 Programming Interface
-
- Chapter 2 80387 Numerics Processor Architecture
-
- 2.1 80387 Registers
- 2.1.1 The NPX Register Stack
- 2.1.2 The NPX Status Word
- 2.1.3 Control Word
- 2.1.4 The NPX Tag Word
- 2.1.5 The NPX Instruction and Data Pointers
-
- 2.2 Computation Fundamentals
- 2.2.1 Number System
- 2.2.2 Data Types and Formats
- 2.2.2.1 Binary Integers
- 2.2.2.2 Decimal Integers
- 2.2.2.3 Real Numbers
-
- 2.2.3 Rounding Control
- 2.2.4 Precision Control
-
- Chapter 3 Special Computational Situations
-
- 3.1 Special Numeric Values
- 3.1.1 Denormal Real Numbers
- 3.1.1.1 Denormals and Gradual Underflow
-
- 3.1.2 Zeros
- 3.1.3 Infinity
- 3.1.4 NaN (Not-a-Number)
- 3.1.4.1 Signaling NaNs
- 3.1.4.2 Quiet NaNs
-
- 3.1.5 Indefinite
- 3.1.6 Encoding of Data Types
- 3.1.7 Unsupported Formats
-
- 3.2 Numeric Exceptions
- 3.2.1 Handling Numeric Exceptions
- 3.2.1.1 Automatic Exception Handling
- 3.2.1.2 Software Exception Handling
-
- 3.2.2 Invalid Operation
- 3.2.2.1 Stack Exception
- 3.2.2.2 Invalid Arithmetic Operation
-
- 3.2.3 Division by Zero
- 3.2.4 Denormal Operand
- 3.2.5 Numeric Overflow and Underflow
- 3.2.5.1 Overflow
- 3.2.5.2 Underflow
-
- 3.2.6 Inexact (Precision)
- 3.2.7 Exception Priority
- 3.2.8 Standard Underflow/Overflow Exception Handler
-
- Chapter 4 The 80387 Instruction Set
-
- 4.1 Compatibility with the 80287 and 8087
- 4.2 Numeric Operands
- 4.3 Data Transfer Instructions
- 4.3.1 FLD source
- 4.3.2 FST destination
- 4.3.3 FSTP destination
- 4.3.4 FXCH//destination
- 4.3.5 FILD source
- 4.3.6 FIST destination
- 4.3.7 FISTP destination
- 4.3.8 FBLD source
- 4.3.9 FBSTP destination
-
- 4.4 Nontranscendental Instructions
- 4.4.1 Addition
- 4.4.2 Normal Subtraction
- 4.4.3 Reversed Subtraction
- 4.4.4 Multiplication
- 4.4.5 Normal Division
- 4.4.6 Reversed Division
- 4.4.7 FSQRT
- 4.4.8 FSCALE
- 4.4.9 FPREM---Partial Remainder (80287/8087-Compatible)
- 4.4.10 FPREM1---Partial Remainder (IEEE Std. 754-Compatible)
- 4.4.11 FRNDINT
- 4.4.12 FXTRACT
- 4.4.13 FABS
- 4.4.14 FCHS
-
- 4.5 Comparison Instructions
- 4.5.1 FCOM//source
- 4.5.2 FCOMP//source
- 4.5.3 FCOMPP
- 4.5.4 FICOM source
- 4.5.5 FICOMP source
- 4.5.6 FTST
- 4.5.7 FUCOM//source
- 4.5.8 FUCOMP//source
- 4.5.9 FUCOMPP
- 4.5.10 FXAM
-
- 4.6 Transcendental Instructions
- 4.6.1 FCOS
- 4.6.2 FSIN
- 4.6.3 FSINCOS
- 4.6.4 FPTAN
- 4.6.5 FPATAN
- 4.6.6 F2XM1
- 4.6.7 FYL2X
- 4.6.8 FYL2XP1
-
- 4.7 Constant Instructions
- 4.7.1 FLDZ
- 4.7.2 FLD1
- 4.7.3 FLDPI
- 4.7.4 FLDL2T
- 4.7.5 FLDL2E
- 4.7.6 FLDLG2
- 4.7.7 FLDLN2
-
- 4.8 Processor Control Instructions
- 4.8.1 FINIT/FNINIT
- 4.8.2 FLDCW source
- 4.8.3 FSTCW/FNSTCW destination
- 4.8.4 FSTSW/FNSTSW destination
- 4.8.5 FSTSW AX/FNSTSW AX
- 4.8.6 FCLEX/FNCLEX
- 4.8.7 FSAVE/FNSAVE destination
- 4.8.8 FRSTOR source
- 4.8.9 FSTENV/FNSTENV destination
- 4.8.10 FLDENV source
- 4.8.11 FINCSTP
- 4.8.12 FDECSTP
- 4.8.13 FFREE destination
- 4.8.14 FNOP
- 4.8.15 FWAIT (CPU Instruction)
-
- Chapter 5 Programming Numeric Applications
-
- 5.1 Programming Facilities
- 5.1.1 High-Level Languages
- 5.1.2 C Programs
- 5.1.3 PL/M-386
- 5.1.4 ASM386
- 5.1.4.1 Defining Data
- 5.1.4.2 Records and Structures
- 5.1.4.3 Addressing Methods
-
- 5.1.5 Comparative Programming Example
- 5.1.6 80387 Emulation
-
- 5.2 Concurrent Processing with the 80387
- 5.2.1 Managing Concurrency
- 5.2.1.1 Incorrect Exception Synchronization
- 5.2.1.2 Proper Exception Synchronization
-
- Chapter 6 System-Level Numeric Programming
-
- 6.1 80386/80387 Architecture
- 6.1.1 Instruction and Operand Transfer
- 6.1.2 Independent of CPU Addressing Modes
- 6.1.3 Dedicated I/O Locations
-
- 6.2 Processor Initialization and Control
- 6.2.1 System Initialization
- 6.2.2 Hardware Recognition of the NPX
- 6.2.3 Software Recognition of the NPX
- 6.2.4 Configuring the Numerics Environment
- 6.2.5 Initializing the 80387
- 6.2.6 80387 Emulation
- 6.2.7 Handling Numerics Exceptions
- 6.2.8 Simultaneous Exception Response
- 6.2.9 Exception Recovery Examples
-
- Chapter 7 Numeric Programming Examples
-
- 7.1 Conditional Branching Example
- 7.2 Exception Handling Examples
- 7.3 Floating-Point to ASCII Conversion Examples
- 7.3.1 Function Partitioning
- 7.3.2 Exception Considerations
- 7.3.3 Special Instructions
- 7.3.4 Description of Operation
- 7.3.5 Scaling the Value
- 7.3.5.1 Inaccuracy in Scaling
- 7.3.5.2 Avoiding Underflow and Overflow
- 7.3.5.3 Final Adjustments
-
- 7.3.6 Output Format
-
- 7.4 Trigonometric Calculation Examples (Not Tested)
-
- Appendix A Machine Instruction Encoding and Decoding
-
- Appendix B Exception Summary
-
- Appendix C Compatibility Between the 80387 and the 80287/8087
-
- Appendix D Compatibility Between the 80387 and the 8087
-
- Appendix E 80387 80-Bit CHMOS III Numeric Processor Extension
-
- Appendix F PC/AT-Compatible 80387 Connection
-
- Glossary of 80387 and Floating-Point Terminology
-
-
- Figures
-
- 1-1 Evolution and Performance of Numeric Processors
-
- 2-1 80387 Register Set
- 2-2 80387 Status Word
- 2-3 80387 Control Word Format
- 2-4 80387 Tag Word Format
- 2-5 Protected Mode 80387 Instruction and Data Pointer Image in Memory,
- 32-Bit Format
- 2-6 Real Mode 80387 Instruction and Data Pointer Image in Memory,
- 32-Bit Format
- 2-7 Protected Mode 80387 Instruction and Data Pointer Image in Memory,
- 16-Bit Format
- 2-8 Real Mode 80387 Instruction and Data Pointer Image in Memory,
- 16-Bit Format
- 2-9 80387 Double-Precision Number System
- 2-10 80387 Data Formats
-
- 3-1 Floating-Point System with Denormals
- 3-2 Floating-Point System without Denormals
- 3-3 Arithmetic Example Using Infinity
-
- 4-1 FSAVE/FRSTOR Memory Layout (32-Bit)
- 4-2 FSAVE/FRSTOR Memory Layout (16-Bit)
- 4-3 Protected Mode 80387 Environment, 32-Bit Format
- 4-4 Real Mode 80387 Environment, 32-Bit Format
- 4-5 Protected Mode 80387 Environment, 16-Bit Format
- 4-6 Real Mode 80387 Environment, 16-Bit Format
-
- 5-1 Sample C-386 Program
- 5-2 Sample 80387 Constants
- 5-3 Status Word Record Definition
- 5-4 Structure Definition
- 5-5 Sample PL/M-386 Program
- 5-6 Sample ASM386 Program
- 5-7 Instructions and Register Stack
- 5-8 Exception Synchronization Examples
-
- 6-1 Software Routine to Recognize the 80287
-
- 7-1 Conditional Branching for Compares
- 7-2 Conditional Branching for FXAM
- 7-3 Full-State Exception Handler
- 7-4 Reduced-Latency Exception Handler
- 7-5 Reentrant Exception Handler
- 7-6 Floating-Point to ASCII Conversion Routine
- 7-7 Relationships between Adjacent Joints
- 7-8 Robot Arm Kinematics Example
-
-
- Tables
-
- 1-1 Numeric Processing Speed Comparisons
- 1-2 Numeric Data Types
- 1-3 Principal NPX Instructions
-
- 2-1 Condition Code Interpretation
- 2-2 Correspondence between 80387 and 80386 Flag Bits
- 2-3 Summary of Format Parameters
- 2-4 Real Number Notation
- 2-5 Rounding Modes
-
- 3-1 Arithmetic and Nonarithmetic Instructions
- 3-2 Denormalization Process
- 3-3 Zero Operands and Results
- 3-4 Infinity Operands and Results
- 3-5 Rules for Generating QNaNs
- 3-6 Binary Integer Encodings
- 3-7 Packed Decimal Encodings
- 3-8 Single and Double Real Encodings
- 3-9 Extended Real Encodings
- 3-10 Masked Responses to Invalid Operations
- 3-11 Masked Overflow Results
-
- 4-1 Data Transfer Instructions
- 4-2 Nontranscendental Instructions
- 4-3 Basic Nontranscendental Instructions and Operands
- 4-4 Condition Code Interpretation after FPREM and FPREM1
- Instructions
- 4-5 Comparison Instructions
- 4-6 Condition Code Resulting from Comparisons
- 4-7 Condition Code Resulting from FTST
- 4-8 Condition Code Defining Operand Class
- 4-9 Transcendental Instructions
- 4-10 Results of FPATAN
- 4-11 Constant Instructions
- 4-12 Processor Control Instructions
-
- 5-1 PL/M-386 Built-In Procedures
- 5-2 ASM386 Storage Allocation Directives
- 5-3 Addressing Method Examples
-
- 6-1 NPX Processor State Following Initialization
-
-
- Chapter 1 Introduction to the 80387 Numerics Processor Extension
-
- ────────────────────────────────────────────────────────────────────────────
-
- The 80387 NPX is a high-performance numerics processing element that
- extends the 80386 architecture by adding significant numeric capabilities
- and direct support for floating-point, extended-integer, and BCD data types.
- The 80386 CPU with 80387 NPX easily supports powerful and accurate numeric
- applications through its implementation of the IEEE Standard 754 for Binary
- Floating-Point Arithmetic. The 80387 provides floating-point performance
- comparable to that of large minicomputers while offering compatibility with
- object code for 8087 and 80287.
-
-
- 1.1 History
-
- The 80387 Numeric Processor Extension (NPX) is compatible with its
- predecessors, the earlier Intel 8087 NPX and 80287 NPX. As the 80386 runs
- 8086 programs, so programs designed to use the 8087 and 80287 should run
- unchanged on the 80387.
-
- The 8087 NPX was designed for use in 8086-family systems. The 8086 was the
- first microprocessor family to partition the processing unit to permit
- high-performance numeric capabilities. The 8087 NPX for this processor
- family implemented a complete numeric processing environment in compliance
- with an early proposal for the IEEE 754 Floating-Point Standard.
-
- With the 80287 Numeric Processor Extension, high-speed numeric computations
- were extended to 80286 high-performance multitasking and multiuser systems.
- Multiple tasks using the numeric processor extension were afforded the full
- protection of the 80286 memory management and protection features.
-
- The 80387 Numeric Processor Extension is Intel's third generation numerics
- processor. The 80387 implements the final IEEE standard, adds new
- trigonometric instructions, and uses a new design and CHMOS-III process to
- allow higher clock rates and require fewer clocks per instruction. Together,
- the 80387 with additional instructions and the improved standard bring even
- more convenience and reliability to numerics programming and make this
- convenience and reliability available to applications that need the
- high-speed and large memory capacity of the 32-bit environment of the 80386
- CPU.
-
- Figure 1-1 illustrates the relative performance of 5-MHz 8086/8087,
- 8-MHz 80286/80287, and 20-MHz 80386/80387 systems in executing
- numerics-oriented applications.
-
-
- Figure 1-1. Evolution and Performance of Numeric Processors
-
- 16│ 80386/80387 (20 MHz)
- 15│
- 14│
- 13│
- 12│
- 11│
- RELATIVE 10│
- PERFORMANCE 9│
- 8│
- 7│
- 6│
- 5│
- 4│
- 3│ 80286/80287 (8 MHz)
- 2│
- 1│ 8086/8087 (5 MHz)
- └─────────────────────────────────────
- 1980 1983 1987
-
- YEAR INTRODUCED
-
-
- 1.2 Performance
-
- Table 1-1 compares the execution times of several 80387 instructions with
- the equivalent operations executed on an 8-MHz 80287. As indicated in the
- table, the 16-MHz 80387 NPX provides about 5 to 6 times the performance of
- an 8-MHz 80287 NPX. A 16-MHz 80387 multiplies 32-bit and 64-bit
- floating-point numbers in about 1.9 and 2.8 microseconds, respectively. Of
- course, the actual performance of the NPX in a given system depends on the
- characteristics of the individual application.
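-
- The multiply times quoted above can be restated as approximate clock
- counts, which is the unit used in instruction timing tables. The sketch
- below is plain unit conversion (microseconds times megahertz gives clock
- periods); the function name is illustrative and the results are only as
- precise as the rounded times in the text.
-
```python
def clocks(time_us: float, clock_mhz: float) -> float:
    """Clock periods that elapse in time_us microseconds at clock_mhz MHz."""
    return time_us * clock_mhz


# Times quoted in the text for a 16-MHz 80387.
for label, time_us in (("32-bit multiply", 1.9), ("64-bit multiply", 2.8)):
    print(f"{label}: about {clocks(time_us, 16):.0f} clocks at 16 MHz")
```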
-
- Although the performance figures shown in Table 1-1 refer to operations on
- real (floating-point) numbers, the 80387 also manipulates fixed-point
- binary and decimal integers of up to 64 bits or 18 digits, respectively. The
- 80387 can improve the speed of multiple-precision software algorithms for
- integer operations by 10 to 100 times.
-
- Because the 80387 NPX is an extension of the 80386 CPU, no software
- overhead is incurred in setting up the NPX for computation. The 80387 and
- 80386 processors coordinate their activities in a manner transparent to
- software. Moreover, built-in coordination facilities allow the 80386 CPU to
- proceed with other instructions while the 80387 NPX is simultaneously
- executing numeric instructions. Programs can exploit this concurrency of
- execution to further increase system performance and throughput.
-
-
- Table 1-1. Numeric Processing Speed Comparisons
-
- Approximate performance ratio = 16 MHz 80386/80387 ÷ 8 MHz 80286/80287
-
- Floating-Point Instruction                   Operation        Ratio
- ───────────────────────────────────────────────────────────────────
- FADD    ST, ST(i)                            Addition         6.2
- FDIV    dword_var                            Division         4.7
- FYL2X   stack (0), (1) assumed               Logarithm        6.0
- FPATAN  stack (0) assumed                    Arctangent       2.6
- F2XM1   stack (0) assumed                    Exponentiation   2.7
-
-
- 1.3 Ease of Use
-
- The 80387 NPX offers more than raw execution speed for
- computation-intensive tasks. The 80387 brings the functionality and power of
- accurate numeric computation into the hands of the general user. These
- features are available in most high-level languages available for the 80386.
-
- Like the 8087 and 80287 that preceded it, the 80387 is explicitly designed
- to deliver stable, accurate results when programmed using straightforward
- "pencil and paper" algorithms. The IEEE standard 754 specifically addresses
- this issue, recognizing the fundamental importance of making numeric
- computations both easy and safe to use.
-
- For example, most computers can overflow when two single-precision
- floating-point numbers are multiplied together and then divided by a third,
- even if the final result is a perfectly valid 32-bit number. The 80387
- delivers the correctly rounded result. Other typical examples of undesirable
- machine behavior in straightforward calculations occur when computing
- financial rate of return, which involves the expression (1 + i)^(n) or when
- solving for roots of a quadratic equation:
-
- -b ± √(b² - 4ac)
- ────────────────
- 2a
-
- If a does not equal 0, the formula is numerically unstable when the roots
- are nearly coincident or when their magnitudes are wildly different. The
- formula is also vulnerable to spurious over/underflows when the coefficients
- a, b, and c are all very big or all very tiny. When single-precision
- (4-byte) floating-point coefficients are given as data and the formula is
- evaluated in the 80387's normal way, keeping all intermediate results in
- its stack, the 80387 produces impeccable single-precision roots. This
- happens because, by default and with no effort on the programmer's part, the
- 80387 evaluates all those subexpressions with so much extra precision and
- range as to overwhelm any threat to numerical integrity.
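-
- The effect of wide intermediates on the quadratic formula can be sketched
- in Python. This is a model, not 80387 code: Python's double-precision
- float (53-bit significand) stands in for the 80387's extended format
- (64-bit significand), single precision is imitated by rounding through
- the struct module, and the coefficients are hypothetical values chosen so
- that the roots differ widely in magnitude.
-
```python
import math
import struct


def to_single(x: float) -> float:
    """Round a Python float (IEEE double) to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def small_root_single_steps(a: float, b: float, c: float) -> float:
    """Textbook formula with every intermediate rounded to single precision."""
    disc = to_single(to_single(b * b) - to_single(to_single(4.0 * a) * c))
    root = to_single(math.sqrt(disc))
    return to_single(to_single(-b + root) / to_single(2.0 * a))


def small_root_wide_steps(a: float, b: float, c: float) -> float:
    """Same formula; intermediates stay wide, with one rounding at the end."""
    disc = b * b - 4.0 * a * c
    return to_single((-b + math.sqrt(disc)) / (2.0 * a))


a, b, c = 1.0, 200.0, 1.0   # hypothetical single-precision coefficients
# Reference value from an algebraically equivalent form without cancellation.
reference = 2.0 * c / (-b - math.sqrt(b * b - 4.0 * a * c))

for name, fn in (("single-precision steps", small_root_single_steps),
                 ("wide intermediates", small_root_wide_steps)):
    root = fn(a, b, c)
    rel_err = abs(root - reference) / abs(reference)
    print(f"{name}: {root!r}  relative error {rel_err:.1e}")
```
-
- With these coefficients the version that rounds every step loses several
- significant digits of the smaller root to cancellation, while the version
- that rounds only once returns a root accurate to single precision.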
-
- If double-precision data and results were at issue, a better formula would
- have to be used, and once again the 80387's default evaluation of that
- formula would provide substantially enhanced numerical integrity over mere
- double-precision evaluation.
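-
- The rate-of-return expression (1 + i)^(n) mentioned earlier gives a
- compact illustration of what "a better formula" can mean when only double
- precision is available. The sketch below is an assumption-laden model in
- Python, not 80387 code: the values of i and n are hypothetical, and the
- library functions math.log1p and math.expm1 play roles that resemble those
- of the FYL2XP1 and F2XM1 instructions listed in Chapter 4.
-
```python
import math


def growth_naive(i: float, n: int) -> float:
    """(1 + i)**n - 1 evaluated exactly as written."""
    return (1.0 + i) ** n - 1.0


def growth_log_form(i: float, n: int) -> float:
    """Same quantity as exp(n * ln(1 + i)) - 1, never forming 1 + i explicitly."""
    return math.expm1(n * math.log1p(i))


i, n = 1e-12, 10**6   # hypothetical values, chosen so the effect is visible
print("naive   :", growth_naive(i, n))
print("log form:", growth_log_form(i, n))
```
-
- Forming 1 + i discards low-order bits of a tiny i before the power is
- taken, so the naive result drifts; the rearranged form keeps them.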
-
- On most machines, straightforward algorithms will not deliver consistently
- correct results (and will not indicate when they are incorrect). To obtain
- correct results on traditional machines under all conditions usually
- requires sophisticated numerical techniques that are foreign to most
- programmers. General application programmers using straightforward
- algorithms will produce much more reliable programs using the 80387. This
- simple fact greatly reduces the software investment required to develop
- safe, accurate computation-based products.
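-
- The multiply-then-divide case described earlier in this section is a
- concrete instance of a straightforward calculation going wrong silently.
- The Python sketch below models it under stated assumptions: single
- precision is imitated by rounding through the struct module, Python's
- double stands in for the 80387's wider internal format, and the operand
- values are hypothetical.
-
```python
import math
import struct


def to_single(x: float) -> float:
    """Round to single precision; a value too large becomes a signed infinity."""
    try:
        return struct.unpack("<f", struct.pack("<f", x))[0]
    except OverflowError:
        return math.copysign(math.inf, x)


def mul_div_single_steps(a: float, b: float, c: float) -> float:
    """(a * b) / c with the product rounded to single precision first."""
    return to_single(to_single(a * b) / c)


def mul_div_wide_steps(a: float, b: float, c: float) -> float:
    """(a * b) / c with the product held in the wider format."""
    return to_single((a * b) / c)


a = b = c = to_single(1e30)   # each operand is a valid single-precision number
print("single-precision steps:", mul_div_single_steps(a, b, c))
print("wide intermediate     :", mul_div_wide_steps(a, b, c))
```
-
- The final quotient is representable in single precision in both cases;
- only the intermediate product is out of range, and only the version that
- rounds it to single precision reports infinity.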
-
- Beyond traditional numerics support for scientific applications, the 80387
- has built-in facilities for commercial computing. It can process decimal
- numbers of up to 18 digits without round-off errors, performing exact
- arithmetic on integers as large as 2^(64) or 10^(18). Exact arithmetic is
- vital in accounting applications where rounding errors may introduce
- monetary losses that cannot be reconciled.
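-
- A minimal Python sketch of the 18-digit decimal format follows. It assumes
- the packed decimal layout presented with the data types in Chapter 2, as
- understood here: ten bytes, two decimal digits per byte with the least
- significant pair first, and the sign in the top bit of the last byte. The
- function names are illustrative and are not part of any Intel tool.
-
```python
def pack_decimal(value: int) -> bytes:
    """Encode an integer of up to 18 digits as a 10-byte packed decimal."""
    if abs(value) >= 10**18:
        raise ValueError("packed decimal holds at most 18 digits")
    digits = f"{abs(value):018d}"
    # Two digits per byte, least significant pair first (assumed layout).
    body = bytes(int(digits[k:k + 2], 16) for k in range(16, -2, -2))
    sign = 0x80 if value < 0 else 0x00
    return body + bytes([sign])


def unpack_decimal(raw: bytes) -> int:
    """Decode the 10-byte layout produced by pack_decimal."""
    magnitude = int(raw[8::-1].hex())   # bytes 8..0, most significant first
    return -magnitude if raw[9] & 0x80 else magnitude


largest = 10**18 - 1
# Every 18-digit integer fits in a 64-bit significand without rounding.
assert largest < 2**63

print(pack_decimal(-1234).hex())
print(unpack_decimal(pack_decimal(largest)) == largest)
```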
-
- The NPX contains a number of optional facilities that can be invoked by
- sophisticated users. These advanced features include directed rounding,
- gradual underflow, and programmed exception-handling facilities.
-
- These automatic exception-handling facilities permit a high degree of
- flexibility in numeric processing software, without burdening the
- programmer. While performing numeric calculations, the NPX automatically
- detects exception conditions that can potentially damage a calculation (for
- example, X ÷ 0 or √X when X < 0). By default, on-chip exception logic
- handles these exceptions so that a reasonable result is produced and
- execution may proceed without program interruption. Alternatively, the NPX
- can signal the CPU, invoking a software exception handler to provide special
- results whenever various types of exceptions are detected.
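-
- The two styles of exception handling can be modelled in a few lines of
- Python. This is a sketch of the behavior described above, not a
- description of the on-chip logic: the function names are hypothetical,
- and a generic NaN stands in for whatever special value the NPX actually
- delivers for an invalid operation (see Chapter 3).
-
```python
import math


def masked_divide(x: float, y: float) -> float:
    """Model of a masked response: x / y never interrupts the program."""
    if y == 0.0:
        if x == 0.0 or math.isnan(x):
            return math.nan                      # 0 / 0 has no meaningful value
        sign = math.copysign(1.0, x) * math.copysign(1.0, y)
        return math.copysign(math.inf, sign)     # signed infinity
    return x / y


def masked_sqrt(x: float) -> float:
    """Model of a masked response: the root of a negative number is a NaN."""
    if x < 0.0:
        return math.nan
    return math.sqrt(x)


print(masked_divide(1.0, 0.0))   # masked: execution continues with infinity
print(masked_sqrt(-4.0))         # masked: execution continues with a NaN

try:
    1.0 / 0.0                    # Python's own division acts like the unmasked case
except ZeroDivisionError as exc:
    print("handler invoked:", exc)
```
-
- The first two calls correspond to the default, in which a reasonable
- result is substituted and the program continues; the try block corresponds
- to the alternative, in which control passes to a software handler.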
-
-
- 1.4 Applications
-
- The 80386's versatility and performance make it appropriate to a broad
- array of numeric applications. In general, applications that exhibit any of
- the following characteristics can benefit by implementing numeric processing
- on the 80387:
-
- ■ Numeric data vary over a wide range of values, or include nonintegral
- values.
-
- ■ Algorithms produce very large or very small intermediate results.
-
- ■ Computations must be very precise; i.e., a large number of significant
- digits must be maintained.
-
- ■ Performance requirements exceed the capacity of traditional
- microprocessors.
-
- ■ Consistently safe, reliable results must be delivered using a
- programming staff that is not expert in numerical techniques.
-
- Note also that the 80387 can reduce software development costs and improve
- the performance of systems that use not only real numbers, but operate on
- multiprecision binary or decimal integer values as well.
-
- A few examples, which show how the 80387 might be used in specific numerics
- applications, are described below. In many cases, these types of systems
- have been implemented in the past with minicomputers or small mainframe
- computers. The advent of the 80387 brings the size and cost savings of
- microprocessor technology to these applications for the first time.
-
- ■ Business data processing──The NPX's ability to accept decimal operands
- and produce exact decimal results of up to 18 digits greatly simplifies
- accounting programming. Financial calculations that use power functions
- can take advantage of the 80387's exponentiation and logarithmic
- instructions. Many business software packages can benefit from the
- speed and accuracy of the 80387; for example, Lotus* 1-2-3*,
- Multiplan*, SuperCalc*, and Framework*.
-
- ■ Simulation──The large (32-bit) memory space of the 80386, coupled with
- the raw speed of the 80386 and 80387 processors, makes 80386/80387
- microsystems suitable for attacking large simulation problems, which
- heretofore could only be executed on expensive mini and mainframe
- computers. For example, complex electronic circuit simulations using
- SPICE can now be performed on an 80386/80387 microcomputer.
- Simulation of mechanical systems using finite element analysis can
- employ more elements, resulting in more detailed analysis or simulation
- of larger systems.
-
- ■ Graphics transformations──The 80387 can be used in graphics terminals
- to locally perform many functions that normally demand the attention of
- a main computer; these include rotation, scaling, and interpolation. By
- also using an 82786 Graphics Display Controller to perform high-speed
- drawing and window management, very powerful and highly self-sufficient
- terminals can be built from a relatively small number of 80386 family
- parts.
-
- ■ Process control──The 80387 solves dynamic range problems
- automatically, and its extended precision allows control functions to
- be fine-tuned for more accurate and efficient performance. Control
- algorithms implemented with the NPX also contribute to improved
- reliability and safety, while the 80387's speed can be exploited in
- real-time operations.
-
- ■ Computer numerical control (CNC)──The 80387 can move and position
- machine tool heads with accuracy in real-time. Axis positioning also
- benefits from the hardware trigonometric support provided by the 80387.
-
- ■ Robotics──Coupling small size and modest power requirements with
- powerful computational abilities, the 80387 is ideal for on-board
- six-axis positioning.
-
- ■ Navigation──Very small, lightweight, and accurate inertial guidance
- systems can be implemented with the 80387. Its built-in trigonometric
- functions can speed and simplify the calculation of position from
- bearing data.
-
- ■ Data acquisition──The 80387 can be used to scan, scale, and reduce
- large quantities of data as it is collected, thereby lowering storage
- requirements and time required to process the data for analysis.
-
- The preceding examples are oriented toward traditional numerics
- applications. There are, in addition, many other types of systems that do
- not appear to the end user as computational, but can employ the 80387 to
- advantage. Indeed, the 80387 presents the imaginative system designer with
- an opportunity similar to that created by the introduction of the
- microprocessor itself. Many applications can be viewed as numerically-based
- if sufficient computational power is available to support this view (e.g.,
- character generation for a laser printer). This is analogous to the
- thousands of successful products that have been built around "buried"
- microprocessors, even though the products themselves bear little
- resemblance to computers.
-
-
- 1.5 Upgradability
-
- The architecture of the 80386 CPU is specifically adapted to allow easy
- upgradability to use an 80387, simply by plugging in the 80387 NPX. For this
- reason, designers of 80386 systems may wish to incorporate the 80387 NPX
- into their designs in order to offer two levels of price and performance at
- little additional cost.
-
- Two features of the 80386 CPU make the design and support of upgradable
- 80386 systems particularly simple:
-
- ■ The 80386 can be programmed to recognize the presence of an 80387 NPX;
- that is, software can recognize whether it is running on an 80386 with
- or without an 80387 NPX.
-
- ■ After determining whether the 80387 NPX is available, the 80386 CPU
- can be instructed to let the NPX execute all numeric instructions. If
- an 80387 NPX is not available, the 80386 CPU can emulate all 80387
- numeric instructions in software. This emulation is completely
- transparent to the application software──the same object code may be
- used by 80386 systems both with and without an 80387 NPX. No relinking
- or recompiling of application software is necessary; the same code will
- simply execute faster with the 80387 NPX than without.
-
- To facilitate this design of upgradable 80386 systems, Intel provides a
- software emulator for the 80387 that provides the functional equivalent of
- the 80387 hardware, implemented in software on the 80386. Except for timing,
- the operation of this 80387 emulator (EMUL387) is the same as for the 80387
- NPX hardware. When the emulator is combined as part of the systems software,
- the 80386 system with 80387 emulation and the 80386 with 80387 hardware are
- virtually indistinguishable to an application program. This capability
- makes it easy for software developers to maintain a single set of programs
- for both systems. System manufacturers can offer the NPX as a simple plug-in
- performance option without necessitating any changes in the user's software.
-
-
- 1.6 Programming Interface
-
- The 80386/80387 pair is programmed as a single processor; all of the 80387
- registers appear to a programmer as extensions of the basic 80386 register
- set. The 80386 has a class of instructions known as ESCAPE instructions, all
- having a common format. These ESC instructions are the numeric instructions
- for the 80387 NPX; they are simply encoded into the instruction stream along
- with 80386 instructions.
-
- All of the CPU memory-addressing modes may be used in programming the NPX,
- allowing convenient access to record structures, numeric arrays, and other
- memory-based data structures. All of the memory management and protection
- features of the CPU (both paging and segmentation) are extended to the NPX
- as well.
-
- Numeric processing in the 80387 centers around the NPX register stack.
- Programmers can treat these eight 80-bit registers either as a fixed
- register set, with instructions operating on explicitly-designated
- registers, or as a classical stack, with instructions operating on the top
- one or two stack elements.
-
- Internally, the 80387 holds all numbers in a uniform 80-bit extended
- format. Operands that may be represented in memory as 16-, 32-, or 64-bit
- integers, 32-, 64-, or 80-bit floating-point numbers, or 18-digit packed BCD
- numbers, are automatically converted into extended format as they are loaded
- into the NPX registers. Computation results are subsequently converted back
- into one of these destination data formats when they are stored into memory
- from the NPX registers.
-
- Table 1-2 lists each of the seven data types supported by the 80387,
- showing the data format for each type. All operands are stored in memory
- with the least significant digits starting at the initial (lowest) memory
- address. Numeric instructions access and store memory operands using only
- this initial address. For maximum system performance, all operands should
- start at memory addresses divisible by four.
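-
- As an illustration of the memory layout just described, the following
- Python sketch encodes and decodes a packed decimal operand. It assumes the
- usual 10-byte image: two BCD digits per byte in bytes 0-8, least significant
- digits at the lowest address, and the sign in bit 7 of byte 9. The function
- names are hypothetical and are not part of any Intel tool.
-
```python
def to_packed_decimal(value: int) -> bytes:
    """Encode an integer as a 10-byte packed decimal memory image."""
    if abs(value) >= 10**18:
        raise ValueError("packed decimal holds at most 18 decimal digits")
    digits = f"{abs(value):018d}"  # 18 characters, most significant first
    image = bytearray(10)
    for i in range(9):
        low = int(digits[17 - 2 * i])   # digit 2i (less significant)
        high = int(digits[16 - 2 * i])  # digit 2i + 1
        image[i] = (high << 4) | low    # lowest address holds lowest digits
    # Assumed layout: sign in bit 7 of the last byte, other bits left zero.
    image[9] = 0x80 if value < 0 else 0x00
    return bytes(image)


def from_packed_decimal(image: bytes) -> int:
    """Decode a 10-byte packed decimal memory image back to an integer."""
    if len(image) != 10:
        raise ValueError("packed decimal operands are 10 bytes long")
    magnitude = 0
    for byte in reversed(image[:9]):
        magnitude = magnitude * 100 + (byte >> 4) * 10 + (byte & 0x0F)
    return -magnitude if image[9] & 0x80 else magnitude
```
-
- Because every digit is stored explicitly, a value such as 1234 round-trips
- without any binary rounding; only the 18-digit limit restricts the range.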
-
- Table 1-3 lists the 80387 instructions by class. No special programming
- tools are necessary to use the 80387, because all of the NPX instructions
- and data types are directly supported by the ASM386 Assembler, by high-level
- languages from Intel, and by assemblers and compilers produced by many
- independent software vendors. Software routines for the 80387 may be written
- in ASM386 Assembler or any of the following higher-level languages from
- Intel:
-
- PL/M-386
- C-386
-
- In addition, all of the development tools supporting the 8086/8087 and
- 80286/80287 can also be used to develop software for the 80386/80387.
-
- All of these high-level languages provide programmers with access to the
- computational power and speed of the 80387 without requiring an
- understanding of the architecture of the 80386 and 80387 chips. Such
- architectural considerations as concurrency and synchronization are handled
- automatically by these high-level languages. For the ASM386 programmer,
- specific rules for handling these issues are discussed in a later section
- of this manual.
-
- The following operating systems are known or expected to support the
- 80387: RMX-286/386, MS-DOS, Xenix-286/386, and Unix-286/386. Advanced
- in-circuit debugging support is provided by ICE-386.
-
-
- Table 1-2. Numeric Data Types
-
-   Data Type        Bits   Significant   Approximate Range (Decimal)
-                           Digits
-                           (Decimal)
-
-   Word integer      16        4         -32,768 ≤ X ≤ +32,767
-   Short integer     32        9         -2*10^(9) ≤ X ≤ +2*10^(9)
-   Long integer      64       18         -9*10^(18) ≤ X ≤ +9*10^(18)
-   Packed decimal    80       18         -99...99 ≤ X ≤ +99...99 (18 digits)
-   Single real       32      6-7         1.18*10^(-38) ≤ │X│ ≤ 3.40*10^(38)
-   Double real       64     15-16        2.23*10^(-308) ≤ │X│ ≤ 1.80*10^(308)
-   Extended real     80       19         3.30*10^(-4932) ≤ │X│ ≤ 1.21*10^(4932)
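-
- The single and double real ranges in Table 1-2 can be checked with a few
- lines of arithmetic. The sketch below assumes the conventional field widths
- (8 exponent bits and 23 fraction bits for single real; 11 and 52 for double
- real), which are not stated in this section. The extended real range exceeds
- what a Python float can hold, so it is not computed here.
-
```python
import math


def normalized_range(exponent_bits: int, fraction_bits: int):
    """Return (smallest, largest) positive normalized magnitudes."""
    bias = 2 ** (exponent_bits - 1) - 1
    smallest = math.ldexp(1.0, 1 - bias)                     # 1.0 * 2^(1 - bias)
    largest = math.ldexp(2.0 - 2.0 ** -fraction_bits, bias)  # all fraction bits set
    return smallest, largest


single = normalized_range(8, 23)    # assumed single real field widths
double = normalized_range(11, 52)   # assumed double real field widths
print(f"single: {single[0]:.2e} .. {single[1]:.2e}")
print(f"double: {double[0]:.2e} .. {double[1]:.2e}")
```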
-
-
- Table 1-3. Principal NPX Instructions
-
-   Class               Instruction Types
-
-   Data Transfer       Load (all data types), Store (all data types), Exchange
-
-   Arithmetic          Add, Subtract, Multiply, Divide, Subtract Reversed,
-                       Divide Reversed, Square Root, Scale, Remainder, Integer
-                       Part, Change Sign, Absolute Value, Extract
-
-   Comparison          Compare, Examine, Test
-
-   Transcendental      Tangent, Arctangent, Sine, Cosine, Sine and Cosine,
-                       2^(x) - 1, Y * Log{2}(X), Y * Log{2}(X+1)
-
-   Constants           0, 1, π, Log{10}2, Log{e}2, Log{2}10, Log{2}e
-
-   Processor Control   Load Control Word, Store Control Word, Store Status
-                       Word, Load Environment, Store Environment, Save,
-                       Restore, Clear Exceptions, Initialize
-
-
- Chapter 2 80387 Numerics Processor Architecture
-
- ────────────────────────────────────────────────────────────────────────────
-
- To the programmer, the 80387 NPX appears as a set of additional registers,
- data types, and instructions──all of which complement those of the 80386.
- Refer to Chapter 4 for detailed explanations of the 80387 instruction set.
- This chapter explains the new registers and data types that the 80387 brings
- to the architecture of the 80386.
-
-
- 2.1 80387 Registers
-
- The additional registers consist of
-
- ■ Eight individually-addressable 80-bit numeric registers, organized as
- a register stack
-
- ■ Three sixteen-bit registers containing:
-
- the NPX status word
- the NPX control word
- the tag word
-
- ■ Two 48-bit registers containing pointers to the current instruction
- and operand (these registers are actually located in the 80386)
-
- All of the NPX numeric instructions focus on the contents of these NPX
- registers.
-
-
- 2.1.1 The NPX Register Stack
-
- The 80387 register stack is shown in Figure 2-1. Each of the eight numeric
- registers in the 80387's register stack is 80 bits wide and is divided into
- fields corresponding to the NPX's extended real data type.
-
- Numeric instructions address the data registers relative to the register on
- the top of the stack. At any point in time, this top-of-stack register is
- indicated by the TOP (stack TOP) field in the NPX status word. Load or push
- operations decrement TOP by one and load a value into the new top register.
- A store-and-pop operation stores the value from the current TOP register and
- then increments TOP by one. Like 80386 stacks in memory, the 80387 register
- stack grows down toward lower-addressed registers.
-
- Many numeric instructions have several addressing modes that permit the
- programmer to implicitly operate on the top of the stack, or to explicitly
- operate on specific registers relative to the TOP. The ASM386 Assembler
- supports these register addressing modes, using the expression ST(0), or
- simply ST, to represent the current Stack Top and ST(i) to specify the ith
- register from TOP in the stack (0 ≤ i ≤ 7). For example, if TOP contains
- 011B (register 3 is the top of the stack), the following statement would add
- the contents of two registers in the stack (registers 3 and 5):
-
- FADD ST, ST(2)
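-
- The top-relative addressing used by this example can be modeled in a few
- lines of Python. This is only a sketch: the class and method names are
- hypothetical, register contents are ordinary Python floats, and the wrap
- of the 3-bit TOP field modulo 8 and the initial TOP value of 0 are
- assumptions of the model.
-
```python
class RegisterStack:
    """Toy model of the eight-register NPX stack (values are Python floats)."""

    def __init__(self, top: int = 0):
        self.regs = [None] * 8      # None marks an empty physical register
        self.top = top              # 3-bit TOP field (assumed to start at 0)

    def physical(self, i: int) -> int:
        """Physical register number addressed by ST(i)."""
        return (self.top + i) % 8

    def push(self, value: float) -> None:
        """Load: decrement TOP, then write the new top register."""
        new_top = (self.top - 1) % 8
        if self.regs[new_top] is not None:
            raise OverflowError("stack overflow: target register not empty")
        self.top = new_top
        self.regs[new_top] = value

    def pop(self) -> float:
        """Store-and-pop: read the top register, then increment TOP."""
        value = self.regs[self.top]
        if value is None:
            raise IndexError("stack underflow: top register is empty")
        self.regs[self.top] = None
        self.top = (self.top + 1) % 8
        return value

    def fadd(self, dest: int, src: int) -> None:
        """Model of FADD ST(dest), ST(src)."""
        d, s = self.physical(dest), self.physical(src)
        self.regs[d] = self.regs[d] + self.regs[s]


stack = RegisterStack()
for v in (10.0, 20.0, 30.0, 40.0, 50.0):
    stack.push(v)                   # five loads leave TOP at 3
stack.fadd(0, 2)                    # FADD ST, ST(2): registers 3 and 5
```
-
- After the five loads TOP is 011B, so ST refers to physical register 3 and
- ST(2) to physical register 5, matching the FADD example above.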
-
- The stack organization and top-relative addressing of the numeric registers
- simplify subroutine programming by allowing routines to pass parameters on
- the register stack. By using the stack to pass parameters rather than using
- "dedicated" registers, calling routines gain more flexibility in how they
- use the stack. As long as the stack is not full, each routine simply loads
- the parameters onto the stack before calling a particular subroutine to
- perform a numeric calculation. The subroutine then addresses its parameters
- as ST, ST(1), etc., even though TOP may, for example, refer to physical
- register 3 in one invocation and physical register 5 in another.
-
-
- Figure 2-1. 80387 Register Set
-
-                   80387 DATA REGISTERS                     TAG
-                                                           FIELD
-     79  78    64 63                                 0      1 0
-   ╔════╤════════╤════════════════════════════════════╗    ╔═══╗
- R0║SIGN│EXPONENT│            SIGNIFICAND             ║    ║   ║
- R1╟────┼────────┼────────────────────────────────────╢    ╟───╢
- R2╟────┼────────┼────────────────────────────────────╢    ╟───╢
- R3╟────┼────────┼────────────────────────────────────╢    ╟───╢
- R4╟────┼────────┼────────────────────────────────────╢    ╟───╢
- R5╟────┼────────┼────────────────────────────────────╢    ╟───╢
- R6╟────┼────────┼────────────────────────────────────╢    ╟───╢
- R7╟────┼────────┼────────────────────────────────────╢    ╟───╢
-   ╚════╧════════╧════════════════════════════════════╝    ╚═══╝
-
-    15                0           47                                0
-   ╔═══════════════════╗         ╔═══════════════════════════════════╗
-   ║ CONTROL REGISTER  ║         ║        INSTRUCTION POINTER        ║
-   ╟───────────────────╢         ╟───────────────────────────────────╢
-   ║  STATUS REGISTER  ║         ║           DATA POINTER            ║
-   ╟───────────────────╢         ╚═══════════════════════════════════╝
-   ║     TAG WORD      ║
-   ╚═══════════════════╝
-
-
- 2.1.2 The NPX Status Word
-
- The 16-bit status word shown in Figure 2-2 reflects the overall state of
- the 80387. This status word may be stored into memory using the
- FSTSW/FNSTSW, FSTENV/FNSTENV, and FSAVE/FNSAVE instructions, and can be
- transferred into the 80386 AX register with the FSTSW AX/FNSTSW AX
- instructions, allowing the NPX status to be inspected by the CPU.
-
- The B-bit (bit 15) is included for 8087 compatibility only. It reflects the
- contents of the ES bit (bit 7 of the status word), not the status of the
- BUSY# output of the 80387.
-
- The four NPX condition code bits (C{3}-C{0}) are similar to the flags in a
- CPU: the 80387 updates these bits to reflect the outcome of arithmetic
- operations. The effect of these instructions on the condition code bits is
- summarized in Table 2-1. These condition code bits are used principally for
- conditional branching. The FSTSW AX instruction stores the NPX status word
- directly into the CPU AX register, allowing these condition codes to be
- inspected efficiently by 80386 code. The 80386 SAHF instruction can copy
- C{3}-C{0} directly to 80386 flag bits to simplify conditional branching.
- Table 2-2 shows the mapping of these bits to the 80386 flag bits.
-
- Bits 11-13 of the status word point to the 80387 register that is the
- current Top of Stack (TOP). The significance of the stack top has been
- described in the prior section on the register stack.
-
- Figure 2-2 shows the six exception flags in bits 0-5 of the status word.
- Bit 7 is the exception summary status (ES) bit. ES is set if any unmasked
- exception bits are set, and is cleared otherwise. If this bit is set, the
- ERROR# signal is asserted. Bits 0-5 indicate whether the NPX has detected
- one of six possible exception conditions since these status bits were last
- cleared or reset. They are "sticky" bits, and can only be cleared by the
- instructions FINIT, FCLEX, FLDENV, FSAVE, and FRSTOR.
-
- Bit 6 is the stack fault (SF) bit. This bit distinguishes invalid
- operations due to stack overflow or underflow from other kinds of invalid
- operations. When SF is set, bit 9 (C{1}) distinguishes between stack
- overflow (C{1} = 1) and underflow (C{1} = 0).
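-
- The field positions described in this section can be summarized by a small
- decoder. The Python sketch below is illustrative only: the function names
- are hypothetical, and the input is assumed to be a 16-bit value such as the
- one FSTSW AX leaves in AX.
-
```python
EXCEPTION_FLAGS = ("IE", "DE", "ZE", "OE", "UE", "PE")   # bits 0-5


def decode_status_word(sw: int) -> dict:
    """Split a 16-bit NPX status word into its named fields."""
    return {
        "exceptions": [name for bit, name in enumerate(EXCEPTION_FLAGS)
                       if (sw >> bit) & 1],
        "SF": bool((sw >> 6) & 1),     # stack fault
        "ES": bool((sw >> 7) & 1),     # exception summary
        "C0": (sw >> 8) & 1,
        "C1": (sw >> 9) & 1,
        "C2": (sw >> 10) & 1,
        "TOP": (sw >> 11) & 0b111,     # bits 11-13
        "C3": (sw >> 14) & 1,
        "B": bool((sw >> 15) & 1),     # mirrors ES (8087 compatibility)
    }


def stack_fault_kind(sw: int):
    """Return 'overflow', 'underflow', or None when SF is clear."""
    if not (sw >> 6) & 1:
        return None
    return "overflow" if (sw >> 9) & 1 else "underflow"


fields = decode_status_word(0x1A41)    # TOP=3, C1=1, SF=1, IE=1
```
-
- For the sample value 1A41H the decoder reports TOP = 3 with IE and SF set,
- and C{1} = 1 identifies the invalid operation as a stack overflow.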
-
-
- Figure 2-2. 80387 Status Word
-
-     ┌───────────────────────────────────────────────── 80387 BUSY
-     │       ┌───┬───┬───────────────────────────────── TOP OF STACK POINTER
-     │   ┌───│───│───│───┬───┬───┬───────────────────── CONDITION CODE
-
-    15                               7                           0
-   ╔═══╤═══╤═══╤═══╤═══╤═══╤═══╤═══╤═══╤═══╤═══╤═══╤═══╤═══╤═══╤═══╗
-   ║ B │ C │    TOP    │ C │ C │ C │ E │ S │ P │ U │ O │ Z │ D │ I ║
-   ║   │ 3 │           │ 2 │ 1 │ 0 │ S │ F │ E │ E │ E │ E │ E │ E ║
-   ╚═══╧═══╧═══╧═══╧═══╧═══╧═══╧═══╧═══╧═══╧═══╧═══╧═══╧═══╧═══╧═══╝
-
-   ERROR SUMMARY STATUS ─────────────┘   │   │   │   │   │   │   │
-   STACK FAULT ──────────────────────────┘   │   │   │   │   │   │
-   EXCEPTION FLAGS                           │   │   │   │   │   │
-     PRECISION ──────────────────────────────┘   │   │   │   │   │
-     UNDERFLOW ──────────────────────────────────┘   │   │   │   │
-     OVERFLOW ───────────────────────────────────────┘   │   │   │
-     ZERO DIVIDE ────────────────────────────────────────┘   │   │
-     DENORMALIZED OPERAND ───────────────────────────────────┘   │
-     INVALID OPERATION ──────────────────────────────────────────┘
-
- ─────────────────────────────────────────────────────────────────────────────
- NOTE:
- ES IS SET IF ANY UNMASKED EXCEPTION BIT IS SET; CLEARED OTHERWISE.
- SEE TABLE 2-1 FOR INTERPRETATION OF CONDITION CODE.
- TOP VALUES:
- 000 = REGISTER 0 IS TOP OF STACK
- 001 = REGISTER 1 IS TOP OF STACK
- .
- .
- .
- 111 = REGISTER 7 IS TOP OF STACK
- FOR DEFINITIONS OF EXCEPTIONS, REFER TO CHAPTER 3.
- ─────────────────────────────────────────────────────────────────────────────
-
-
- Table 2-1. Condition Code Interpretation
-
-   Instruction         C0 (S)     C3 (Z)     C1 (A)        C2 (C)
-
-   FPREM, FPREM1       Three least significant bits        Reduction
-                       of quotient
-
-                       Q2         Q1         Q0            0=complete
-                                             or O/U#       1=incomplete
-
-   FCOM, FCOMP,        Result of comparison  Zero          Operand is not
-   FCOMPP, FTST,                             or O/U#       comparable
-   FUCOM, FUCOMP,
-   FUCOMPP, FICOM,
-   FICOMP
-
-   FXAM                Operand class         Sign          Operand class
-                                             or O/U#
-
-   FCHS, FABS,         UNDEFINED             Zero          UNDEFINED
-   FXCH, FINCTOP,                            or O/U#
-   FDECTOP, Constant
-   loads, FXTRACT,
-   FLD, FILD, FBLD,
-   FSTP (ext real)
-
-   FIST, FBSTP,        UNDEFINED             Roundup       UNDEFINED
-   FRNDINT, FST,                             or O/U#
-   FSTP, FADD, FMUL,
-   FDIV, FDIVR, FSUB,
-   FSUBR, FSCALE,
-   FSQRT, FPATAN,
-   F2XM1, FYL2X,
-   FYL2XP1
-
-   FPTAN, FSIN,        UNDEFINED             Roundup       Reduction
-   FCOS, FSINCOS                             or O/U#,      0=complete
-                                             undefined     1=incomplete
-                                             if C2=1
-
-   FLDENV, FRSTOR      Each bit loaded from memory
-
-   FLDCW, FSTENV,      UNDEFINED
-   FSTCW, FSTSW,
-   FCLEX, FINIT,
-   FSAVE
-
-
- ────────────────────────────────────────────────────────────────────────────
-   NOTES
-   O/U#        When both IE and SF bits of status word are set,
-               indicating a stack exception, this bit distinguishes
-               between stack overflow (C1=1) and underflow (C1=0).
-
-   Reduction   If FPREM or FPREM1 produces a remainder that is less
-               than the modulus, reduction is complete. When reduction
-               is incomplete the value at the top of the stack is a
-               partial remainder, which can be used as input to further
-               reduction. For FPTAN, FSIN, FCOS, and FSINCOS, the
-               reduction bit is set if the operand at the top of the
-               stack is too large. In this case the original operand
-               remains at the top of the stack.
-
-   Roundup     When the PE bit of the status word is set, this bit
-               indicates whether the last rounding in the instruction
-               was upward.
-
-   UNDEFINED   Do not rely on finding any specific value in these bits.
- ────────────────────────────────────────────────────────────────────────────
-
-
- Table 2-2. Correspondence between 80387 and 80386 Flag Bits
-
-   80387 Flag      80386 Flag
-
-   C{0}            CF
-   C{1}            (none)
-   C{2}            PF
-   C{3}            ZF
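-
- The mapping in Table 2-2 follows from where the condition code bits fall in
- AH after FSTSW AX. The Python sketch below models that transfer. The
- function names are hypothetical, and the comparison encoding used in
- compare_outcome (C{3}, C{2}, C{0} after a compare instruction) is taken as
- an assumption here, since it is defined with the instruction descriptions
- rather than in this section.
-
```python
def flags_after_sahf(status_word: int) -> dict:
    """Model FSTSW AX followed by SAHF for the three mapped bits."""
    ah = (status_word >> 8) & 0xFF     # high byte of the status word
    return {
        "CF": ah & 1,                  # AH bit 0 = C0
        "PF": (ah >> 2) & 1,           # AH bit 2 = C2
        "ZF": (ah >> 6) & 1,           # AH bit 6 = C3
    }


# Assumed encoding of a compare result, keyed by (C3, C2, C0).
_COMPARE = {
    (0, 0, 0): "ST > operand",
    (0, 0, 1): "ST < operand",
    (1, 0, 0): "ST = operand",
    (1, 1, 1): "unordered",
}


def compare_outcome(status_word: int) -> str:
    """Interpret the condition codes left by a compare instruction."""
    f = flags_after_sahf(status_word)
    return _COMPARE.get((f["ZF"], f["PF"], f["CF"]), "undefined")
```
-
- With the flags set this way, ordinary 80386 conditional jumps on CF, PF,
- and ZF can branch on the outcome of a numeric comparison.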
-
-
- 2.1.3 Control Word
-
- The NPX provides the programmer with several processing options, which are
- selected by loading a word from memory into the control word. Figure 2-3
- shows the format and encoding of the fields in the control word.
-
- The low-order byte of this control word configures the 80387 exception
- masking. Bits 0-5 of the control word contain individual masks for each of
- the six exception conditions recognized by the 80387. The high-order byte of
- the control word configures the 80387 processing options, including
-
- ■ Precision control
- ■ Rounding control
-
- The precision-control bits (bits 8-9) can be used to set the 80387 internal
- operating precision at less than the default precision (64-bit significand).
- These control bits can be used to provide compatibility with the
- earlier-generation arithmetic processors having less precision than the
- 80387. The precision-control bits affect the results of only the following
- five arithmetic instructions: ADD, SUB(R), MUL, DIV(R), and SQRT. No other
- operations are affected by PC.
-
- The rounding-control bits (bits 10-11) provide for the common
- round-to-nearest mode, as well as directed rounding and true chop. Rounding
- control affects only the arithmetic instructions (refer to Chapter 3 for
- lists of arithmetic and nonarithmetic instructions).
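-
- The four rounding-control settings differ only in how an inexact result is
- brought to a representable value. The Python sketch below illustrates the
- difference by rounding decimal strings to integers with the standard
- decimal module. It is an analogy for the NPX behavior, not a model of its
- binary arithmetic, and the function name is hypothetical.
-
```python
from decimal import (Decimal, ROUND_CEILING, ROUND_DOWN, ROUND_FLOOR,
                     ROUND_HALF_EVEN)

# RC field encodings (bits 10-11 of the control word).
RC_MODES = {
    0b00: ROUND_HALF_EVEN,   # round to nearest or even
    0b01: ROUND_FLOOR,       # round down (toward -infinity)
    0b10: ROUND_CEILING,     # round up (toward +infinity)
    0b11: ROUND_DOWN,        # chop (truncate toward zero)
}


def round_to_integer(x: str, rc: int) -> int:
    """Round the decimal string x to an integer under rounding control rc."""
    return int(Decimal(x).quantize(Decimal(1), rounding=RC_MODES[rc]))


for rc in sorted(RC_MODES):
    print(f"RC={rc:02b}: 2.5 -> {round_to_integer('2.5', rc)}, "
          f"-2.5 -> {round_to_integer('-2.5', rc)}")
```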
-
-
- Figure 2-3. 80387 Control Word Format
-
-     ┌───┬───┬───────────────────────────────────────── RESERVED
-     │   │   │   ┌───────────────────────────────────── (INFINITY CONTROL)
-     │   │   │   │   ┌───┬───────────────────────────── ROUNDING CONTROL
-     │   │   │   │   │   │   ┌───┬───────────────────── PRECISION CONTROL
-
-    15                               7                           0
-   ╔═══╤═══╤═══╤═══╤═══╤═══╤═══╤═══╤═══╤═══╤═══╤═══╤═══╤═══╤═══╤═══╗
-   ║ X   X   X │ X │  RC   │  PC   │ X   X │ P │ U │ O │ Z │ D │ I ║
-   ║           │   │       │       │       │ M │ M │ M │ M │ M │ M ║
-   ╚═══╧═══╧═══╧═══╧═══╧═══╧═══╧═══╧═══╧═══╧═══╧═══╧═══╧═══╧═══╧═══╝
-
-   RESERVED ─────────────────────────┴───┘   │   │   │   │   │   │
-   EXCEPTION MASKS                           │   │   │   │   │   │
-     PRECISION ──────────────────────────────┘   │   │   │   │   │
-     UNDERFLOW ──────────────────────────────────┘   │   │   │   │
-     OVERFLOW ───────────────────────────────────────┘   │   │   │
-     ZERO DIVIDE ────────────────────────────────────────┘   │   │
-     DENORMALIZED OPERAND ───────────────────────────────────┘   │
-     INVALID OPERATION ──────────────────────────────────────────┘
-
- ─────────────────────────────────────────────────────────────────────────────
- NOTE:
-   PRECISION CONTROL                     ROUNDING CONTROL
-     00--24 BITS (SINGLE PRECISION)        00--ROUND TO NEAREST OR EVEN
-     01--(RESERVED)                        01--ROUND DOWN (TOWARD -∞)
-     10--53 BITS (DOUBLE PRECISION)        10--ROUND UP (TOWARD +∞)
-     11--64 BITS (EXTENDED PRECISION)      11--CHOP (TRUNCATE TOWARD ZERO)
- ─────────────────────────────────────────────────────────────────────────────
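-
- The control word fields shown in Figure 2-3 can be extracted or modified
- with simple shifts and masks. The Python sketch below is illustrative: the
- function names are hypothetical, and the sample value 037FH (all exceptions
- masked, extended precision, round to nearest) is assumed here as a typical
- initial setting.
-
```python
MASKS = ("IM", "DM", "ZM", "OM", "UM", "PM")             # bits 0-5
PRECISION = {0b00: "24 bits", 0b01: "reserved",
             0b10: "53 bits", 0b11: "64 bits"}           # bits 8-9
ROUNDING = {0b00: "nearest or even", 0b01: "down",
            0b10: "up", 0b11: "chop"}                    # bits 10-11


def decode_control_word(cw: int) -> dict:
    """Split a 16-bit NPX control word into its documented fields."""
    return {
        "masked": [name for bit, name in enumerate(MASKS) if (cw >> bit) & 1],
        "PC": PRECISION[(cw >> 8) & 0b11],
        "RC": ROUNDING[(cw >> 10) & 0b11],
    }


def with_rounding(cw: int, rc: int) -> int:
    """Return cw with the RC field replaced; other bits are preserved."""
    return (cw & ~(0b11 << 10) & 0xFFFF) | ((rc & 0b11) << 10)


settings = decode_control_word(0x037F)   # assumed sample value
chop = with_rounding(0x037F, 0b11)       # same masks and precision, RC = chop
```
-
- A program would normally store the current control word to memory, change
- only the field of interest as with_rounding does, and load the result back,
- so that the mask and precision settings are left undisturbed.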
-
-
- 2.1.4 The NPX Tag Word
-
- The tag word indicates the contents of each register in the register stack,
- as shown in Figure 2-4. The tag word is used by the NPX itself to
- distinguish between empty and nonempty register locations. Programmers of
- exception handlers may use this tag information to check the contents of a
- numeric register without performing complex decoding of the actual data in
- the register. The tag values from the tag word correspond to physical
- registers 0-7. Programmers must use the current top-of-stack (TOP) pointer
- stored in the NPX status word to associate these tag values with the
- relative stack registers ST(0) through ST(7).
-
- The exact values of the tags are generated during execution of the FSTENV
- and FSAVE instructions according to the actual contents of the nonempty
- stack locations. During execution of other instructions, the 80387 updates
- the TW only to indicate whether a stack location is empty or nonempty.
-
-
- Figure 2-4. 80387 Tag Word Format
-
-  15                                                                    0
- ╔════╤═══╤════╤═══╤════╤═══╤════╤═══╤════╤═══╤════╤═══╤════╤═══╤════╤═══╗
- ║ TAG (7)│ TAG (6)│ TAG (5)│ TAG (4)│ TAG (3)│ TAG (2)│ TAG (1)│ TAG (0)║
- ╚════╧═══╧════╧═══╧════╧═══╧════╧═══╧════╧═══╧════╧═══╧════╧═══╧════╧═══╝
- TAG VALUES:
- 00 = VALID
- 01 = ZERO
- 10 = INVALID OR INFINITY
- 11 = EMPTY
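-
- Because the tags are indexed by physical register, software must combine
- the tag word with TOP to find the tag for a given ST(i). The Python sketch
- below shows one way to do this. The function name is hypothetical, and the
- inputs are assumed to be the 16-bit tag and status words as stored by
- FSTENV or FSAVE.
-
```python
TAG_NAMES = {0b00: "valid", 0b01: "zero",
             0b10: "invalid or infinity", 0b11: "empty"}


def tags_by_stack_position(tag_word: int, status_word: int) -> list:
    """Return the tag names for ST(0) through ST(7), in that order."""
    top = (status_word >> 11) & 0b111
    # TAG(r) occupies bits 2r+1..2r of the tag word (physical register r).
    physical = [(tag_word >> (2 * r)) & 0b11 for r in range(8)]
    return [TAG_NAMES[physical[(top + i) % 8]] for i in range(8)]


# Register 3 valid, register 4 zero, all others empty; TOP = 3.
tags = tags_by_stack_position(0xFD3F, 0x1800)
```
-
- In this example ST(0) is tagged valid and ST(1) zero, even though the
- corresponding tags sit at positions 3 and 4 of the tag word.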
-
-
- 2.1.5 The NPX Instruction and Data Pointers
-
- The instruction and data pointers provide support for programmed
- exception-handlers. These registers are actually located in the 80386, but
- appear to be located in the 80387 because they are accessed by the ESC
- instructions FLDENV, FSTENV, FSAVE, and FRSTOR. Whenever the 80386 decodes
- an ESC instruction, it saves the instruction address, the operand address
- (if present), and the instruction opcode.
-
- When stored in memory, the instruction and data pointers appear in one of
- four formats, depending on the operating mode of the 80386 (protected mode
- or real-address mode) and depending on the operand-size attribute in effect
- (32-bit operand or 16-bit operand). When the 80386 is in virtual-8086 mode,
- the real-address mode formats are used.
-
- Figures 2-5 through 2-8 show these pointers as they are stored following
- an FSTENV instruction.
-
- The FSTENV and FSAVE instructions store this data into memory, allowing
- exception handlers to determine the precise nature of any numeric exceptions
- that may be encountered.
-
- The instruction address saved in the 80386 (as in the 80287) points to any
- prefixes that preceded the instruction. This is different from the 8087, for
- which the instruction address points only to the ESC instruction opcode.
-
- Note that the processor control instructions FINIT, FLDCW, FSTCW, FSTSW,
- FCLEX, FSTENV, FLDENV, FSAVE, FRSTOR, and FWAIT do not affect the data
- pointer. Note also that, except for the instructions just mentioned, the
- value of the data pointer is undefined if the prior ESC instruction did not
- have a memory operand.
-
-
- Figure 2-5. Protected Mode 80387 Instruction and Data Pointer Image in
- Memory, 32-Bit Format
-
-                       32-BIT PROTECTED MODE FORMAT
-
-    31                23                15                7               0
-   ╔═════════════════╪═════════════════╪═════════════════╪═════════════════╗
-   ║             RESERVED              │           CONTROL WORD            ║0H
-   ╟─────────────────┼─────────────────┼─────────────────┼─────────────────╢
-   ║             RESERVED              │            STATUS WORD            ║4H
-   ╟─────────────────┼─────────────────┼─────────────────┼─────────────────╢
-   ║             RESERVED              │             TAG WORD              ║8H
-   ╟─────────────────┼─────────────────┴─────────────────┼─────────────────╢
-   ║                               IP OFFSET                               ║CH
-   ╟───────────┬─────┼─────────────────┬─────────────────┼─────────────────╢
-   ║ 0 0 0 0 0 │     OPCODE 10..0      │            CS SELECTOR            ║10H
-   ╟───────────┴─────┼─────────────────┴─────────────────┼─────────────────╢
-   ║                          DATA OPERAND OFFSET                          ║14H
-   ╟─────────────────┼─────────────────┬─────────────────┼─────────────────╢
-   ║             RESERVED              │         OPERAND SELECTOR          ║18H
-   ╚═════════════════╪═════════════════╪═════════════════╪═════════════════╝
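-
- An exception handler that has stored the environment with FSTENV can pick
- the fields of Figure 2-5 apart as shown in the Python sketch below. The
- function names are hypothetical. The image is assumed to be the 28 bytes at
- offsets 0H through 1BH, in little-endian byte order. The opcode_bytes
- helper additionally assumes that the five opcode bits not stored in the
- image are the fixed ESC pattern 11011B.
-
```python
import struct


def parse_env32_protected(image: bytes) -> dict:
    """Decode a 32-bit protected mode environment image (Figure 2-5)."""
    if len(image) < 28:
        raise ValueError("environment image must be at least 28 bytes")
    cw, sw, tw, ip_off, cs_op, op_off, op_sel = struct.unpack_from("<7I", image)
    return {
        "control_word": cw & 0xFFFF,          # upper halves are reserved
        "status_word": sw & 0xFFFF,
        "tag_word": tw & 0xFFFF,
        "ip_offset": ip_off,
        "cs_selector": cs_op & 0xFFFF,
        "opcode": (cs_op >> 16) & 0x7FF,      # OPCODE 10..0
        "operand_offset": op_off,
        "operand_selector": op_sel & 0xFFFF,
    }


def opcode_bytes(opcode: int) -> bytes:
    """Rebuild the two opcode bytes, assuming the ESC prefix bits 11011B."""
    return bytes([0xD8 | ((opcode >> 8) & 0b111), opcode & 0xFF])
```
-
- For example, an opcode field of 0C2H corresponds to the byte pair D8 C2,
- the encoding of the FADD ST, ST(2) example used earlier in this chapter.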
-
-
- Figure 2-6. Real Mode 80387 Instruction and Data Pointer Image in
- Memory, 32-Bit Format
-
-                     32-BIT REAL ADDRESS MODE FORMAT
-
-    31                23                15                7               0
-   ╔═════════════════╪═════════════════╪═════════════════╪═════════════════╗
-   ║             RESERVED              │           CONTROL WORD            ║0H
-   ╟─────────────────┼─────────────────┼─────────────────┼─────────────────╢
-   ║             RESERVED              │            STATUS WORD            ║4H
-   ╟─────────────────┼─────────────────┼─────────────────┼─────────────────╢
-   ║             RESERVED              │             TAG WORD              ║8H
-   ╟─────────────────┼─────────────────┼─────────────────┼─────────────────╢
-   ║             RESERVED              │     INSTRUCTION POINTER 15..0     ║CH
-   ╟─────────┬───────┼─────────────────┼───────────┬─┬───┼─────────────────╢
-   ║ 0 0 0 0 │     INSTRUCTION POINTER 31..16      │0│    OPCODE 10..0     ║10H
-   ╟─────────┴───────┼─────────────────┼───────────┴─┴───┼─────────────────╢
-   ║             RESERVED              │       OPERAND POINTER 15..0       ║14H
-   ╟─────────┬───────┼─────────────────┼───────────┬─────┼─────────────────╢
-   ║ 0 0 0 0 │       OPERAND POINTER 31..16        │0 0 0 0 0 0 0 0 0 0 0 0║18H
-   ╚═════════╧═══════╪═════════════════╪═══════════╧═════╪═════════════════╝
-
-
- Figure 2-7. Protected Mode 80387 Instruction and Data Pointer Image in
- Memory, 16-Bit Format
-
-       16-BIT PROTECTED MODE FORMAT
-
-    15                7              0
-   ╔═════════════════╪════════════════╗
-   ║           CONTROL WORD           ║ 0H
-   ╟─────────────────┼────────────────╢
-   ║           STATUS WORD            ║ 2H
-   ╟─────────────────┼────────────────╢
-   ║             TAG WORD             ║ 4H
-   ╟─────────────────┼────────────────╢
-   ║            IP OFFSET             ║ 6H
-   ╟─────────────────┼────────────────╢
-   ║           CS SELECTOR            ║ 8H
-   ╟─────────────────┼────────────────╢
-   ║          OPERAND OFFSET          ║ AH
-   ╟─────────────────┼────────────────╢
-   ║         OPERAND SELECTOR         ║ CH
-   ╚═════════════════╪════════════════╝
-
-
- Figure 2-8. Real Mode 80387 Instruction and Data Pointer Image in
- Memory, 16-Bit Format
-
-        16-BIT REAL-ADDRESS MODE
-      AND VIRTUAL-8086 MODE FORMAT
-
-    15               7              0
-   ╔════════════════╪════════════════╗
-   ║          CONTROL WORD           ║ 0H
-   ╟────────────────┼────────────────╢
-   ║           STATUS WORD           ║ 2H
-   ╟────────────────┼────────────────╢
-   ║            TAG WORD             ║ 4H
-   ╟────────────────┼────────────────╢
-   ║    INSTRUCTION POINTER 15..0    ║ 6H
-   ╟─────────┬─┬────┼────────────────╢
-   ║IP 19..16│0│    OPCODE 10..0     ║ 8H
-   ╟─────────┴─┴────┼────────────────╢
-   ║      OPERAND POINTER 15..0      ║ AH
-   ╟─────────┬─┬────┼────────────────╢
-   ║OP 19..16│0│0 0 0 0 0 0 0 0 0 0 0║ CH
-   ╚═════════╧═╧════╪════════════════╝
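-
- In the 16-bit real-address format of Figure 2-8, each 20-bit pointer is
- split across two words. The Python sketch below reassembles them. The
- function name is hypothetical, and the image is assumed to be the 14 bytes
- at offsets 0H through DH, in little-endian byte order.
-
```python
import struct


def parse_env16_real(image: bytes) -> dict:
    """Decode a 16-bit real-address mode environment image (Figure 2-8)."""
    if len(image) < 14:
        raise ValueError("environment image must be at least 14 bytes")
    cw, sw, tw, ip_lo, ip_hi_op, op_lo, op_hi = struct.unpack_from("<7H", image)
    return {
        "control_word": cw,
        "status_word": sw,
        "tag_word": tw,
        # Bits 15..12 of the word at 8H hold IP 19..16.
        "instruction_pointer": ((ip_hi_op >> 12) << 16) | ip_lo,
        "opcode": ip_hi_op & 0x7FF,            # OPCODE 10..0
        # Bits 15..12 of the word at CH hold the operand pointer 19..16.
        "operand_pointer": ((op_hi >> 12) << 16) | op_lo,
    }
```
-
- The reassembled pointers are 20-bit physical addresses rather than
- selector and offset pairs, which is the main practical difference from the
- protected mode formats.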
-
-
- 2.2 Computation Fundamentals
-
- This section covers 80387 programming concepts that are common to all
- applications. It describes the 80387's internal number system and the
- various types of numbers that can be employed in NPX programs. The most
- commonly used options for rounding and precision (selected by fields in the
- control word) are described, with exhaustive coverage of less frequently
- used facilities deferred to later sections. Exception conditions that may
- arise during execution of NPX instructions are also described along with the
- options that are available for responding to these exceptions.
-
-
- 2.2.1 Number System
-
- The system of real numbers that people use for pencil and paper
- calculations is conceptually infinite and continuous. There is no upper or
- lower limit to the magnitude of the numbers one can employ in a calculation,
- or to the precision (number of significant digits) that the numbers can
- represent. When considering any real number, there are always arbitrarily
- many numbers both larger and smaller. There are also arbitrarily many
- numbers between (i.e., with more significant digits than) any two real
- numbers. For example, between 2.5 and 2.6 are 2.51, 2.5897, 2.500001, etc.
-
- While ideally it would be desirable for a computer to be able to operate on
- the entire real number system, in practice this is not possible. Computers,
- no matter how large, ultimately have fixed-size registers and memories that
- limit the system of numbers that can be accommodated. These limitations
- determine both the range and the precision of numbers. The result is a set
- of numbers that is finite and discrete, rather than infinite and
- continuous. This set is a subset of the real numbers that is designed
- to form a useful approximation of the real number system.
-
- Figure 2-9 superimposes the basic 80387 real number system on a real number
- line (decimal numbers are shown for clarity, although the 80387 actually
- represents numbers in binary). The dots indicate the subset of real numbers
- the 80387 can represent as data and final results of calculations. The
- 80387's range of double-precision, normalized numbers is approximately
- ±2.23 * 10^(-308) to ±1.80 * 10^(308). Applications that are required to
- deal with data and final results outside this range are rare. For reference,
- the range of the IBM System 370* is about ±0.54 * 10^(-78) to
- ±0.72 * 10^(76).
-
- The finite spacing in Figure 2-9 illustrates that the NPX can represent a
- great many, but not all, of the real numbers in its range. There is always a
- gap between two adjacent 80387 numbers, and it is possible for the result of
- a calculation to fall in this space. When this occurs, the NPX rounds the
- true result to a number that it can represent. Thus, a real number that
- requires more digits than the 80387 can accommodate (e.g., a 20-digit
- number) is represented with some loss of accuracy. Notice also that the
- 80387's representable numbers are not distributed evenly along the real
- number line. In fact, an equal number of representable numbers exists
- between successive powers of 2 (i.e., as many representable numbers exist
- between 2 and 4 as between 65,536 and 131,072). Therefore, the gaps between
- representable numbers are larger as the numbers increase in magnitude. All
- integers in the range ±2^(64) (approximately ±1.8 * 10^(19)), however, are
- exactly representable in the 80387's internal (extended real) format.
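-
- The spacing described above can be observed on any host whose floating-point
- type is IEEE double precision. The following Python sketch is an
- illustration only, not 80387 code; the helper names are hypothetical, and
- math.nextafter assumes Python 3.9 or later.
-
```python
import math


def gap_above(x: float) -> float:
    """Distance from x to the next larger representable double."""
    return math.nextafter(x, math.inf) - x


def count_between(lo: float) -> int:
    """Count the doubles in [lo, 2*lo), where lo is a power of 2."""
    return round(lo / gap_above(lo))


for lo in (2.0, 65536.0):
    # The gap grows with magnitude, but the count per interval stays the same.
    print(lo, gap_above(lo), count_between(lo))
```
-
- Under these assumptions both intervals hold 2^(52) numbers, while the gap
- above 65,536 is 2^(15) times the gap above 2.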
-
- In its internal operations, the 80387 actually employs a number system that
- is a substantial superset of that shown in Figure 2-9. The internal format
- (called extended real) extends the 80387's range to about ±3.30 * 10^(-4932)
- to ±1.21 * 10^(4932), and its precision to about 19 (equivalent decimal)
- digits. This format is designed to provide extra range and precision for
- constants and intermediate results, and is not normally intended for data
- or final results.
-
- From a practical standpoint, the 80387's set of real numbers is
- sufficiently large and dense so as not to limit the vast majority of
- microprocessor applications. Compared to most computers, including
- mainframes, the NPX provides a very good approximation of the real number
- system. It is important to remember, however, that it is not an exact
- representation, and that arithmetic on real numbers is inherently
- approximate.
-
- Conversely, and equally important, the 80387 does perform exact arithmetic
- on integer operands. That is, if an operation on two integers is valid and
- produces a result that is in range, the result is exact. For example, 4 ÷ 2
- yields an exact integer, 1 ÷ 3 does not, and 2^(40) * 2^(30) + 1 does not,
- because the result requires greater than 64 bits of precision.
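-
- Whether an integer result is exact depends only on how many significant
- bits it needs. The Python sketch below is a hypothetical helper, not an
- 80387 facility; it tests an integer against a 64-bit significand and
- ignores the exponent range. A quotient such as 1 ÷ 3 is not an integer at
- all and is therefore outside its scope.
-
```python
def fits_significand(n: int, p: int = 64) -> bool:
    """True if the integer n can be held exactly in a p-bit significand."""
    n = abs(n)
    if n == 0:
        return True
    # Trailing zero bits are absorbed by the exponent, so strip them first.
    while n % 2 == 0:
        n //= 2
    return n.bit_length() <= p


print(fits_significand(4 // 2))              # the quotient 2 needs one bit
print(fits_significand(2**40 * 2**30))       # 2^70 also needs only one bit
print(fits_significand(2**40 * 2**30 + 1))   # 2^70 + 1 needs 71 bits
```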
-
-
- Figure 2-9. 80387 Double-Precision Number System
-
- |─── NEGATIVE RANGE (NORMALIZED) ──|
- | |
- | -5 -4 -3 -2 -1 |
- ┌───┬───┬──┬┐┌───┬───┬───┬───┬───┬───┐
- │ │ │ │││░░░│░░░│▒▒▒│▒▒▒│▓▓▓│███│
- └───┴───┴──┴┘└───┴───┴───┴───┴───┴───┘
-
- │ -2.23 X 10^(-308)┘
- └ -1.80 X 10^(308)
- ┌────────────────────────────────┐
- │ ───────┬───────── │
- │ ▓▓▓▓▓▓▓│▒▒▒▒▒▒▒▒▒ │
- |── POSITIVE RANGE (NORMALIZED) ───| │ ▓▓▓▓▓▓▓│▒▒▒▒▒▒▒▒▒ │
- | | │ ─∙─────∙─────∙─── │
- | 1 2 3 4 5 | │ │─┬─│ │
- ┌───┬───┬───┬───┬───┬───┐┌┬──┬───┬───┐ │ │ │ └2.00000000000000000 │
- │███│▓▓▓│▒▒▒│▒▒▒│░░░│░░░│││ │ │ │ │ │ └ (NOT REPRESENTABLE) │
- └───┴───┴───┴───┴───┴───┘└┴──┴───┴───┘ │ └──────1.99999999999999999 │
- └───┤ │ PRECISION│─ 18 DIGITS ─│ │
- │ └────────┐ 1.80 X 10^(308)┘ │ │
- └ 2.23 X 10^(-308) └─────────────────────┴────────────────────────────────┘
-
-
- 2.2.2 Data Types and Formats
-
- The 80387 recognizes seven numeric data types for memory-based values,
- divided into three classes: binary integers, packed decimal integers, and
- binary reals. A later section describes how these formats are stored in
- memory (the sign is always located in the highest-addressed byte).
-
- Figure 2-10 summarizes the format of each data type. In the figure, the
- most significant digits of all numbers (and fields within numbers) are the
- leftmost digits.
-
-
- Figure 2-10. 80387 Data Formats
-
- ┌─────────┬─────────┬─────────┬───────────────────────────────────────────┐
- │ │ │ │MOST HIGHEST ADDRESSED │
- │DATA │ RANGE │PRECISION│SIGNIFICANT BYTE BYTE │
- │FORMATS │ │ ├───┬───┬───┬───┬───┬───┬───┬───┬───┬───┐ │
- │ │ │ │7 0│7 0│7 0│7 0│7 0│7 0│7 0│7 0│7 0│7 0│ │
- ├─────────┼─────────┼─────────┼───┴───┴───┴───┴───┴───┴───┴───┴───┴───┴───┤
- │WORD │ │ ├──────┐(TWO'S │
- │INTEGER │ 10^(4) │ 16 BITS ├──────┘COMPLEMENT) │
- │ │ │ │15 0 │
- ├─────────┼─────────┼─────────┼───────────────────────────────────────────┤
- │SHORT │ │ ├───────────────┐(TWO'S │
- │INTEGER  │ 10^(9)  │ 32 BITS ├───────────────┘COMPLEMENT)                │
- │ │ │ │31 0 │
- ├─────────┼─────────┼─────────┼───────────────────────────────────────────┤
- │LONG │ │ ├───────────────────────────────┐(TWO'S │
- │INTEGER │ 10^(19) │ 64 BITS ├───────────────────────────────┘COMPLEMENT)│
- │         │         │         │63                             0           │
- ├─────────┼─────────┼─────────┼───────────────────────────────────────────┤
- │ │ │ │ MAGNITUDE │
- │PACKED │ │ ├─┬───┬─────────────∙∙∙───────────────────┐ │
- │BCD │ 10^(18) │18 DIGITS│S│ X │d{17} d{16} d{2} d{1} d{0}│ │
- │ │ │ ├─┴───┴─────┴─────┴─∙∙∙─┴─────┴─────┴─────┘ │
- │         │         │         │79    72                                  0│
- ├─────────┼─────────┼─────────┼───────────────────────────────────────────┤
- │ │ │ ├─┬─────┬───────┐ │
- │SINGLE │ 10^(±38)│ 24 BITS │S│ BE │ SIGN. │ │
- │PRECISION│ │ ├─┴─────┴───────┘ │
- │ │ │ │31 23 0 │
- ├─────────┼─────────┼─────────┼───────────────────────────────────────────┤
- │ │ │ ├─┬────────┬────────────────────┐ │
- │DOUBLE │10^(±308)│ 53 BITS │S│ BE │ SIGNIFICAND │ │
- │PRECISION│ │ ├─┴────────┴────────────────────┘ │
- │ │ │ │63 52 0 │
- ├─────────┼─────────┼─────────┼───────────────────────────────────────────┤
- │ │ │ ├─┬────────────┬───────────────────────────┐│
- │EXTENDED │10^(4932)│ 64 BITS │S│ BE ├─┐ SIGNIFICAND ││
- │PRECISION│ │ ├─┴────────────┴I┴─────────────────────────┘│
- │ │ │ │79 64 63 0 │
- └─────────┴─────────┴─────────┴───────────────────────────────────────────┘
-
- ─────────────────────────────────────────────────────────────────────────────
- NOTE:
- (1) BE = BIASED EXPONENT
- (2) S = SIGN BIT (0 = positive, 1 = negative)
-      (3) d{n} = DECIMAL DIGIT (TWO PER BYTE)
- (4) X = BITS HAVE NO SIGNIFICANCE; 80387 IGNORES WHEN LOADING,
- ZEROS IN WHEN STORING
- (5) = POSITION OF IMPLICIT BINARY POINT
-      (6) I    = INTEGER BIT OF SIGNIFICAND; STORED IN EXTENDED REAL,
- IMPLICIT IN SINGLE AND DOUBLE PRECISION
- (7) EXPONENT BIAS (NORMALIZED VALUES):
- SINGLE: 127 (7FH)
- DOUBLE: 1023 (3FFH)
- EXTENDED REAL: 16383 (3FFFH)
- (8) PACKED BCD: (-1)^(S) (D{17}...D{0})
- (9) REAL: (-1)^(S) (2^(E-BIAS)) (F{0}F{1}...)
- ─────────────────────────────────────────────────────────────────────────────
-
-
- 2.2.2.1 Binary Integers
-
- The three binary integer formats are identical except for length, which
- governs the range that can be accommodated in each format. The leftmost bit
- is interpreted as the number's sign: 0 = positive and 1 = negative. Negative
- numbers are represented in standard two's complement notation (the binary
- integers are the only 80387 format to use two's complement). The quantity
- zero is represented with a positive sign (all bits are 0). The 80387 word
- integer format is identical to the 16-bit signed integer data type of the
- 80386; the 80387 short integer format is identical to the 32-bit signed
- integer data type of the 80386.
-
- The binary integer formats exist in memory only. When used by the 80387,
- they are automatically converted to the 80-bit extended real format. All
- binary integers are exactly representable in the extended real format.
-
-
- 2.2.2.2 Decimal Integers
-
- Decimal integers are stored in packed decimal notation, with two decimal
- digits "packed" into each byte, except the leftmost byte, which carries the
- sign bit (0 = positive, 1 = negative). Negative numbers are not stored in
- two's complement form and are distinguished from positive numbers only by
- the sign bit. The most significant digit of the number is the leftmost
- digit. All digits must be in the range 0-9.
-
- The decimal integer format exists in memory only. When used by the 80387,
- it is automatically converted to the 80-bit extended real format. All
- decimal integers are exactly representable in the extended real format.
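-
- The packing rule can be sketched in Python. The function names are
- hypothetical, and the byte order is an assumption: the 10-byte operand is
- taken to hold digits d{1} d{0} in its lowest-addressed byte and the sign
- bit in bit 7 of its highest-addressed byte.
-
```python
def encode_packed_bcd(value: int) -> bytes:
    """Encode an integer of up to 18 decimal digits as 10 packed-BCD bytes."""
    if abs(value) >= 10**18:
        raise ValueError("packed BCD holds at most 18 decimal digits")
    digits = f"{abs(value):018d}"          # d17 ... d0, most significant first
    body = bytearray()
    for i in range(16, -1, -2):            # pairs (d1,d0), (d3,d2), ...
        body.append((int(digits[i]) << 4) | int(digits[i + 1]))
    sign_byte = 0x80 if value < 0 else 0x00
    return bytes(body) + bytes([sign_byte])


def decode_packed_bcd(raw: bytes) -> int:
    """Decode 10 packed-BCD bytes; reject digits outside the range 0-9."""
    if len(raw) != 10:
        raise ValueError("expected 10 bytes")
    magnitude = 0
    for byte in reversed(raw[:9]):         # most significant pair first
        high, low = byte >> 4, byte & 0x0F
        if high > 9 or low > 9:
            raise ValueError("digit outside 0-9")
        magnitude = magnitude * 100 + high * 10 + low
    return -magnitude if raw[9] & 0x80 else magnitude


print(encode_packed_bcd(-1234).hex())
print(decode_packed_bcd(encode_packed_bcd(-1234)))
```
-
- Note that the magnitude bytes of -1234 and +1234 are identical; only the
- sign bit differs, as the text states.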
-
-
- 2.2.2.3 Real Numbers
-
- The 80387 represents real numbers of the form:
-
- (-1)^(s)2^(E)(b{0}b{1}b{2}b{3}..b{p-1})
-
- ...where...
-
- s = 0 or 1
- E = any integer between Emin and Emax, inclusive
- b{i} = 0 or 1
- p = number of bits of precision
-
- Table 2-3 summarizes the parameters for each of the three real-number
- formats.
-
- The 80387 stores real numbers in a three-field binary format that resembles
- scientific, or exponential, notation. The format consists of the following
- fields:
-
- ■ The number's significant digits are held in the significand field,
- b{0} b{1} b{2} b{3}..b{p-1}. (The term "significand" is analogous
- to the term "mantissa" used to describe floating point numbers on some
- computers.)
-
- ■ The exponent field, e = E+bias, locates the binary point within the
- significant digits (and therefore determines the number's magnitude).
- (The term "exponent" is analogous to the term "characteristic" used to
-       describe floating point numbers on some computers.)
-
- ■ The 1-bit sign field indicates whether the number is positive or
- negative. Negative numbers differ from positive numbers only in the
- sign bits of their significands.
-
- Table 2-4 shows how the real number 178.125 (decimal) is stored in the
- 80387 single real format. The table lists a progression of equivalent
- notations that express the same value to show how a number can be converted
- from one form to another. (The ASM386 and PL/M-386 language translators
- perform a similar process when they encounter programmer-defined real number
- constants.) Note that not every decimal fraction has an exact binary
- equivalent. The decimal number 1/10, for example, cannot be expressed
- exactly in binary (just as the number 1/3 cannot be expressed exactly in
- decimal). When a translator encounters such a value, it produces a rounded
- binary approximation of the decimal value.
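-
- The difference between an exact and a rounded conversion can be examined
- with exact rational arithmetic. This Python sketch assumes the host stores
- its floating-point values in IEEE double format; it illustrates the point
- and does not model the ASM386 or PL/M-386 translators.
-
```python
from fractions import Fraction

stored = Fraction(0.1)                  # exact value of the double nearest 1/10
print(stored == Fraction(1, 10))        # the stored value is not 1/10
print(float(abs(stored - Fraction(1, 10))))     # size of the conversion error
print(Fraction(178.125) == Fraction(1425, 8))   # 178.125 converts exactly
```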
-
- The NPX usually carries the digits of the significand in normalized form.
- This means that, except for the value zero, the significand contains an
- integer bit and fraction bits as follows:
-
- 1{}fff...ff
-
- where {} indicates an assumed binary point. The number of fraction bits
- varies according to the real format: 23 for single, 52 for double, and 63
- for extended real. By normalizing real numbers so that their integer bit is
- always a 1, the 80387 eliminates leading zeros in small values (│X│ < 1).
- This technique maximizes the number of significant digits that can be
- accommodated in a significand of a given width. Note that, in the single
- and double formats, the integer bit is implicit and is not actually stored;
- the integer bit is physically present in the extended format only.
-
- If one were to examine only the significand with its assumed binary point,
- all normalized real numbers would have values greater than or equal to 1 and
- less than 2. The exponent field locates the actual binary point in the
- significant digits. Just as in decimal scientific notation, a positive
- exponent has the effect of moving the binary point to the right, and a
- negative exponent effectively moves the binary point to the left, inserting
- leading zeros as necessary. An unbiased exponent of zero indicates that the
- position of the assumed binary point is also the position of the actual
- binary point. The exponent field, then, determines a real number's
- magnitude.
-
- In order to simplify comparing real numbers (e.g., for sorting), the 80387
- stores exponents in a biased form. This means that a constant is added to
- the true exponent described above. As Table 2-3 shows, the value of this
- bias is different for each real format. It has been chosen so as to
- force the biased exponent to be a positive value. This allows two real
- numbers (of the same format and sign) to be compared as if they are unsigned
- binary integers. That is, when comparing them bitwise from left to right
- (beginning with the leftmost exponent bit), the first bit position that
- differs orders the numbers; there is no need to proceed further with the
- comparison. A number's true exponent can be determined simply by
- subtracting the bias value of its format.
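-
- The ordering property can be checked with a short Python sketch. It uses
- the standard struct module to obtain single-format bit patterns on the
- host; the helper name is hypothetical, and only positive values are
- compared, in keeping with the same-sign condition stated above.
-
```python
import struct


def single_bits(x: float) -> int:
    """Bit pattern of x in single format, as an unsigned 32-bit integer."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


values = [0.75, 1.0, 1.5, 178.125, 3.0e38]        # ascending, all positive
patterns = [single_bits(v) for v in values]
print(patterns == sorted(patterns))               # integer order matches

biased = (single_bits(178.125) >> 23) & 0xFF      # 8-bit exponent field
print(biased, biased - 127)                       # true exponent = biased - bias
```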
-
- The single and double real formats exist in memory only. If a number in one
- of these formats is loaded into an 80387 register, it is automatically
- converted to extended format, the format used for all internal operations.
- Likewise, data in registers can be converted to single or double real for
- storage in memory. The extended real format may be used in memory also,
- typically to store intermediate results that cannot be held in registers.
-
- Most applications should use the double format to store real-number data
- and results; it provides sufficient range and precision to return correct
- results with a minimum of programmer attention. The single real format is
- appropriate for applications that are constrained by memory, but it should
- be recognized that this format provides a smaller margin of safety. It is
- also useful for the debugging of algorithms, because roundoff problems will
- manifest themselves more quickly in this format. The extended real format
- should normally be reserved for holding intermediate results, loop
- accumulations, and constants. Its extra length is designed to shield final
- results from the effects of rounding and overflow/underflow in intermediate
- calculations. However, the range and precision of the double format are
- adequate for most microcomputer applications.
-
-
- Table 2-3. Summary of Format Parameters
-
- Parameter ┌──────── Format ────────┐
- Single Double Extended
-
- Format width in bits 32 64 80
- p (bits of precision) 24 53 64
- Exponent width in bits 8 11 15
- Emax +127 +1023 +16383
- Emin -126 -1022 -16382
- Exponent bias +127 +1023 +16383
-
-
- Table 2-4. Real Number Notation
-
-   Notation                 Value
-
-   Ordinary Decimal         178.125
-   Scientific Decimal       1{}78125E2
-   Scientific Binary        1{}0110010001E111
-   Scientific Binary        1{}0110010001E10000110
-     (Biased Exponent)
-   80387 Single Format      Sign   Biased Exponent   Significand
-     (Normalized)           0      10000110          01100100010000000000000
-                                                     1{}(implicit)
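-
- The last row of Table 2-4 can be reproduced on a host that supports the
- IEEE single format. The Python sketch below is an illustration with a
- hypothetical helper name; it splits the stored pattern into its three
- fields and rebuilds the value from them.
-
```python
import struct


def single_fields(x: float):
    """Return (sign, biased exponent, fraction) of x in single format."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    return bits >> 31, (bits >> 23) & 0xFF, bits & 0x7FFFFF


sign, exponent, fraction = single_fields(178.125)
print(sign, f"{exponent:08b}", f"{fraction:023b}")

# Rebuild the value: restore the implicit integer bit and remove the bias.
value = (-1) ** sign * 2.0 ** (exponent - 127) * (1 + fraction / 2**23)
print(value)
```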
-
-
- 2.2.3 Rounding Control
-
- Internally, the 80387 employs three extra bits (guard, round, and sticky
- bits) that enable it to round numbers in accord with the infinitely precise
- true result of a computation; these bits are not accessible to programmers.
- Whenever the destination can represent the infinitely precise true result,
- the 80387 delivers it. Rounding occurs in arithmetic and store operations
- when the format of the destination cannot exactly represent the infinitely
- precise true result. For example, a real number may be rounded if it is
- stored in a shorter real format, or in an integer format. Or, the infinitely
- precise true result may be rounded when it is returned to a register.
-
- The NPX has four rounding modes, selectable by the RC field in the control
- word (see Figure 2-3). Given a true result b that cannot be represented by
- the target data type, the 80387 determines the two representable numbers a
- and c that most closely bracket b in value (a < b < c). The processor then
- rounds (changes) b to a or to c according to the mode selected by the RC
- field as shown in Table 2-5. Rounding introduces an error in a result that
- is less than one unit in the last place to which the result is rounded.
-
- ■ "Round to nearest" is the default mode and is suitable for most
- applications; it provides the most accurate and statistically unbiased
- estimate of the true result.
-
- ■ The "chop" or "round toward zero" mode is provided for integer
-       arithmetic applications.
-
- ■ "Round up" and "round down" are termed directed rounding and can be
- used to implement interval arithmetic. Interval arithmetic generates a
- certifiable result independent of the occurrence of rounding and other
- errors. The upper and lower bounds of an interval may be computed by
- executing an algorithm twice, rounding up in one pass and down in the
- other.
-
- Rounding control affects only the arithmetic instructions (refer to Chapter
- 3 for lists of arithmetic and nonarithmetic instructions).
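-
- The four modes can be modeled with exact rational arithmetic. The Python
- sketch below is a model, not a description of the 80387's internal
- mechanism; the function name and the mode strings are hypothetical, and
- the exponent range is treated as unbounded.
-
```python
import math
from fractions import Fraction


def round_to_bits(b: Fraction, p: int, mode: str) -> Fraction:
    """Round the exact value b to p significant bits."""
    if b == 0:
        return b
    mag = abs(b)
    # Find E such that 2**E <= |b| < 2**(E+1).
    e = mag.numerator.bit_length() - mag.denominator.bit_length()
    if Fraction(2) ** e > mag:
        e -= 1
    ulp = Fraction(2) ** (e - (p - 1))     # spacing of p-bit numbers near b
    scaled = b / ulp
    a, c = math.floor(scaled), math.ceil(scaled)   # neighbors, a <= b <= c
    if a == c:
        return b                           # b is representable; no rounding
    if mode == "nearest":
        if scaled - a == c - scaled:
            n = a if a % 2 == 0 else c     # tie: choose the even significand
        else:
            n = a if scaled - a < c - scaled else c
    elif mode == "down":
        n = a
    elif mode == "up":
        n = c
    elif mode == "chop":
        n = a if b > 0 else c
    else:
        raise ValueError(f"unknown mode: {mode}")
    return n * ulp


third = Fraction(1, 3)
for mode in ("nearest", "down", "up", "chop"):
    print(mode, round_to_bits(third, 4, mode), round_to_bits(-third, 4, mode))
```
-
- For any value b, the "down" and "up" results bracket b, which is the
- property that interval arithmetic relies on.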
-
-
- 2.2.4 Precision Control
-
- The 80387 allows results to be calculated with either 64, 53, or 24 bits of
- precision in the significand as selected by the precision control (PC) field
- of the control word. The default setting, and the one that is best suited
- for most applications, is the full 64 bits of significance provided by the
- extended real format. The other settings are required by the IEEE standard
- and are provided to obtain compatibility with the specifications of certain
- existing programming languages. Specifying less precision nullifies the
- advantages of the extended format's extended fraction length. When reduced
- precision is specified, the rounding of the fractional value clears the
- unused bits on the right to zeros.
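-
- The effect of the three settings on a significand can be sketched in
- Python. The helper below is hypothetical; it assumes round to nearest and
- a value in the range 1 <= value < 2 that does not round up to 2, and it
- displays the result in a 64-bit field.
-
```python
from fractions import Fraction


def significand_field(value: Fraction, p: int) -> str:
    """Round value (1 <= value < 2) to p bits; show it in a 64-bit field."""
    n = round(value * 2 ** (p - 1))     # round() on a Fraction is half-to-even
    return f"{n:b}".ljust(64, "0")      # unused low-order bits read as zeros


x = Fraction(4, 3)                      # 1.010101... in binary
for p in (24, 53, 64):
    print(p, significand_field(x, p))
```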
-
-
- Table 2-5. Rounding Modes
-
-   RC Field   Rounding Mode            Rounding Action
-
-   00         Round to nearest         Closer to b of a or c; if equally
-                                       close, select even number (the one
-                                       whose least significant bit is zero).
-   01         Round down (toward -∞)   a
-   10         Round up (toward +∞)     c
-   11         Chop (toward 0)          Smaller in magnitude of a or c.
-
- ────────────────────────────────────────────────────────────────────────────
- NOTE
- a < b < c; a and c are successive representable numbers; b is not
- representable.
- ────────────────────────────────────────────────────────────────────────────
-
-
- Chapter 3 Special Computational Situations
-
- ───────────────────────────────────────────────────────────────────────────
-
- Besides being able to represent positive and negative numbers, the 80387
- data formats may be used to describe other entities. These special values
- provide extra flexibility, but most users will not need to understand them
- in order to use the 80387 successfully. This section describes the special
- values that may occur in certain cases and the significance of each. The
- 80387 exceptions are also described, for writers of exception handlers and
- for those interested in probing the limits of computation using the 80387.
-
- The material presented in this section is mainly of interest to programmers
- concerned with writing exception handlers. Many readers will only need to
- skim this section.
-
- When discussing these special computational situations, it is useful to
- distinguish between arithmetic instructions and nonarithmetic instructions.
- Nonarithmetic instructions are those that have no operands or transfer their
- operands without substantial change; arithmetic instructions are those that
- make significant changes to their operands. Table 3-1 defines these two
- classes of instructions.
-
-
- Table 3-1. Arithmetic and Nonarithmetic Instructions
-
-   Nonarithmetic Instructions          Arithmetic Instructions
-
-   FABS                                F2XM1
-   FCHS                                FADD(P)
-   FCLEX                               FBLD
-   FDECSTP                             FBSTP
-   FFREE                               FCOM(P)(P)
-   FINCSTP                             FCOS
-   FINIT                               FDIV(R)(P)
-   FLD (register-to-register)          FIADD
-   FLD (extended format from memory)   FICOM(P)
-   FLD constant                        FIDIV(R)
-   FLDCW                               FILD
-   FLDENV                              FIMUL
-   FNOP                                FIST(P)
-   FRSTOR                              FISUB(R)
-   FSAVE                               FLD (conversion)
-   FST(P) (register-to-register)       FMUL(P)
-   FSTP (extended format to memory)    FPATAN
-   FSTCW                               FPREM
-   FSTENV                              FPREM1
-   FSTSW                               FPTAN
-   FWAIT                               FRNDINT
-   FXAM                                FSCALE
-   FXCH                                FSIN
-                                       FSINCOS
-                                       FSQRT
-                                       FST(P) (conversion)
-                                       FSUB(R)(P)
-                                       FTST
-                                       FUCOM(P)(P)
-                                       FXTRACT
-                                       FYL2X
-                                       FYL2XP1
-
-
-
- 3.1 Special Numeric Values
-
- The 80387 data formats encompass encodings for a variety of special values
- in addition to the typical real or integer data values that result from
- normal calculations. These special values have significance and can express
- relevant information about the computations or operations that produced
- them. The various types of special values are
-
- ■ Denormal real numbers
- ■ Zeros
- ■ Positive and negative infinity
- ■ NaN (Not-a-Number)
- ■ Indefinite
- ■ Unsupported formats
-
- The following sections explain the origins and significance of each of
- these special values. Tables 3-6 through 3-9 at the end of this section
- show how each of these special values is encoded for each of the numeric
- data types.
-
-
- 3.1.1 Denormal Real Numbers
-
- The 80387 generally stores nonzero real numbers in normalized
- floating-point form; that is, the integer (leading) bit of the significand
- is always a one. (Refer to Chapter 2 for a review of operand formats.) This
- bit is explicitly stored in the extended format, and is implicitly assumed
- to be a one (1{}) in the single and double formats. Since leading zeros are
- eliminated, normalized storage allows the maximum number of significant
- digits to be held in a significand of a given width.
-
- When a numeric value becomes very close to zero, normalized floating-point
- storage cannot be used to express the value accurately. The term tiny is
- used here to precisely define what values require special handling by the
- 80387. A number R is said to be tiny when -2^(Emin) < R < 0 or
- 0 < R < +2^(Emin). (As defined in Chapter 2, Emin is -126 for single format,
- -1022 for double format, and -16382 for extended format.) In other words, a
- nonzero number is tiny if its exponent would be too negative to store in the
- destination format.
-
- To accommodate these instances, the 80387 can store and operate on reals
- that are not normalized, i.e., whose significands contain one or more
- leading zeros. Denormals typically arise when the result of a calculation
- yields a value that is tiny.
-
- Denormal values have the following properties:
-
- ■ The biased floating-point exponent is stored at its smallest value
- (zero)
-
- ■ The integer bit of the significand (whether explicit or implicit) is
- zero
-
- The leading zeros of denormals permit smaller numbers to be represented, at
- the possible cost of some lost precision (the number of significant bits is
- reduced by the leading zeros). In typical algorithms, extremely small values
- are most likely to be generated as intermediate, rather than final, results.
- By using the NPX's extended real format for holding intermediate values,
- quantities as small as ±3.4 * 10^(-4932) can be represented; this makes the
- occurrence of denormal numbers a rare phenomenon in 80387 applications.
- Nevertheless, the NPX can load, store, and operate on denormalized real
- numbers when they do occur.
-
- Denormals receive special treatment by the 80387 in three respects:
-
- ■ The 80387 avoids creating denormals whenever possible. In other words,
- it always normalizes real numbers except in the case of tiny numbers.
-
- ■ The 80387 provides the unmasked underflow exception to permit
- programmers to detect cases when denormals would be created.
-
- ■ The 80387 provides the denormal exception to permit programmers to
- detect cases when denormals enter into further calculations.
-
- Denormalizing means incrementing the true result's exponent and inserting a
- corresponding leading zero in the significand, shifting the rest of the
- significand one place to the right. Denormal values may occur in any of the
- single, double, or extended formats. Table 3-2 illustrates how a result
- might be denormalized to fit a single format destination.
-
- Denormalization produces either a denormal or a zero. Denormals are readily
- identified by their exponents, which are always the minimum for their
- formats; in biased form, this is always the bit string: 00..00. This same
- exponent value is also assigned to the zeros, but a denormal has a nonzero
- significand. A denormal in a register is tagged special. Tables 3-8 and
- 3-9 show how denormal values are encoded in each of the real data formats.
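-
- The identification rule can be sketched in Python for the single format.
- The helper names are hypothetical, and only the zero and denormal
- encodings discussed in this section are distinguished; every other
- pattern is reported without further classification.
-
```python
import struct


def classify_single(bits: int) -> str:
    """Classify a 32-bit single-format pattern by its exponent field."""
    biased_exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    if biased_exponent != 0:
        return "not zero or denormal"
    return "zero" if fraction == 0 else "denormal"


def denormal_value(bits: int) -> float:
    """Value of a single-format denormal: 0.fraction * 2**Emin, Emin = -126."""
    sign = -1.0 if bits >> 31 else 1.0
    return sign * (bits & 0x7FFFFF) * 2.0 ** (-126 - 23)


smallest = 0x00000001                        # exponent 00..00, fraction 0..01
print(classify_single(smallest), denormal_value(smallest))
print(struct.unpack(">f", smallest.to_bytes(4, "big"))[0])   # host's decoding
```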
-
- The denormalization process causes loss of significance if low-order
- one-bits are shifted off the right of the significand. In a severe
- case, all the significand bits of the true result are shifted out and
- replaced by the leading zeros. In this case, the result of denormalization
- is a true zero, and, if the value is in a register, it is tagged as a zero.
-
- Denormals are rarely encountered in most applications. Typical debugged
- algorithms generate extremely small results during the evaluation of
- intermediate subexpressions; the final result is usually of an appropriate
- magnitude for its single or double format real destination. If intermediate
- results are held in extended real format, as is recommended, the great range of
- this format makes underflow very unlikely. Denormals are likely to arise
- only when an application generates a great many intermediates, so many that
- they cannot be held on the register stack or in extended format memory
- variables. If storage limitations force the use of single or double format
- reals for intermediates, and small values are produced, underflow may occur,
- and, if masked, may generate denormals.
-
- When a denormal number in single or double format is used as a source
- operand and the denormal exception is masked, the 80387 automatically
- normalizes the number when it is converted to extended format.
-
-
- Table 3-2. Denormalization Process
-
-   Operation           Sign    Exponent    Significand
-
-   True Result         0       -129        1{}01011100..00
-   Denormalize         0       -128        0{}101011100..00
-   Denormalize         0       -127        0{}0101011100..00
-   Denormalize         0       -126        0{}00101011100..00
-   Denormal Result     0       -126        0{}00101011100..00
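-
- The steps of Table 3-2 can be sketched as a loop. In this Python model
- (a hypothetical helper, not the 80387's internal algorithm) the
- significand is held as a 24-bit integer whose top bit is the integer bit,
- and bits shifted off the right are reported but not rounded.
-
```python
def denormalize(exponent: int, significand: int, emin: int = -126):
    """Shift the significand right until the exponent reaches emin."""
    lost_ones = False
    while exponent < emin:
        lost_ones = lost_ones or bool(significand & 1)   # a one-bit falls off
        significand >>= 1          # insert a leading zero
        exponent += 1              # and raise the exponent to compensate
    return exponent, significand, lost_ones


# True result of Table 3-2: exponent -129, significand 1.01011100..00
exp, sig, lost = denormalize(-129, 0b101011100000000000000000)
print(exp, f"{sig:024b}", lost)

# A severe case: every significand bit is shifted out, leaving a true zero.
print(denormalize(-160, 0b101011100000000000000000))
```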
-
-
- 3.1.1.1 Denormals and Gradual Underflow
-
- Floating-point arithmetic cannot carry out all operations exactly for all
- operands; approximation is unavoidable when the exact result is not
- representable as a floating-point variable. To keep the approximation
- mathematically tractable, the hardware is made to conform to accuracy
- standards that can be modeled by certain inequalities instead of equations.
- Let the assignment
-
-   X ← Y @ Z  (where @ is some operation)
-
- represent a typical operation. In the default rounding mode (round to
- nearest), each operation is carried out with an absolute error no larger
- than half the separation between the two floating-point numbers closest to
- the exact result. Let x be the value stored for the variable whose name in
- the program is X, and similarly y for Y, and z for Z. Normally y and z will
- differ by accumulated errors from what is desired and from what would have
- been obtained in the absence of error. For the calculation of x we assume
- that y and z are the best approximations available, and we seek to compute x
- as well as we can. If y@z is representable exactly, then we expect x = y@z,
- and that is what we get for every algebraic operation on the 80387 (i.e.,
- when y@z is one of y+z, y-z, y*z, y÷z, sqrt z). But if y@z must be
- approximated, as is usually the case, then x must differ from y@z by no
- more than half the difference between the two representable numbers that
- straddle y@z. That difference depends on two factors:
-
- 1. The precision to which the calculation is carried out, as determined
- either by the precision control bits or by the format used in memory.
- On the 80387, the precisions are single (24 significant bits), double
- (53 significant bits), and extended (64 significant bits).
-
- 2. How close y@z is to zero. In this respect the presence of denormal
- numbers on the 80387 provides a distinct advantage over systems that
- do not admit denormal numbers.
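-
- The half-separation bound can be checked for one operation with exact
- rational arithmetic. This Python sketch assumes the host computes in IEEE
- double precision with round to nearest (it does not exercise an 80387);
- the function name is hypothetical, and math.ulp assumes Python 3.9 or
- later.
-
```python
import math
from fractions import Fraction


def product_error_ok(y: float, z: float) -> bool:
    """Check that the stored y*z lies within half a gap of the exact product."""
    exact = Fraction(y) * Fraction(z)      # y@z with no rounding at all
    x = y * z                              # the value actually stored
    return abs(Fraction(x) - exact) <= Fraction(math.ulp(x)) / 2


print(product_error_ok(0.1, 0.3))          # product must be approximated
print(Fraction(0.5) * Fraction(0.25) == Fraction(0.5 * 0.25))   # exact case
```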
-
- In any floating-point number system, the density of representable numbers
- is greater near zero than near the largest representable magnitudes.
- However, machines that do not use denormal numbers suffer from an enormous
- gap between zero and its closest neighbors. Figures 3-1 and 3-2 show what
- happens near zero in two kinds of floating-point number systems.
-
- Figure 3-1 shows a floating-point number system that (like the 80387)
- admits denormal numbers. For simplicity, only the non-negative numbers
- appear and the figure illustrates a number system that carries just four
- significant bits instead of the 24, 53, or 64 significant bits that the
- 80387 offers.
-
- Each vertical mark stands for a number representable in four significant
- bits, and the bolder marks stand for the normal powers of 2. The denormal
- numbers lie between 0 and the nearest normal power of 2. They are no less
- dense than the remaining normal nonzero numbers.
-
- Figure 3-2 shows a floating-point number system that (unlike the 80387)
- does not admit denormal numbers. There are two yawning gaps, one on the
- positive side of zero (as illustrated) and one on the negative side of zero
- (not illustrated). The gap between zero and the nearest neighbor of zero
- differs from the gap between that neighbor and the next bigger number by a
- factor of about 8.4 * 10^(6) for single, 4.5 * 10^(15) for double, and
- 9.2 * 10^(18) for extended format. Those gaps would horribly complicate error
- analysis.
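-
- The quoted factors follow from the format parameters of Table 2-3. In a
- system without denormals, the gap below the smallest normal number is
- 2^(Emin), while the gap above it is one unit in the last place. The
- Python sketch below (a hypothetical helper) computes the ratio exactly.
-
```python
from fractions import Fraction

# (p, Emin) for each format, taken from Table 2-3.
FORMATS = {"single": (24, -126), "double": (53, -1022), "extended": (64, -16382)}


def gap_ratio(p: int, emin: int) -> Fraction:
    """Gap from 0 to the smallest normal, divided by the next gap above it."""
    smallest_normal = Fraction(2) ** emin
    next_gap = smallest_normal / 2 ** (p - 1)    # one unit in the last place
    return smallest_normal / next_gap


for name, (p, emin) in FORMATS.items():
    print(f"{name}: {float(gap_ratio(p, emin)):.1e}")
```
-
- The ratio reduces to 2^(p-1) in every format, independent of Emin.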
-
- The advantage of denormal numbers is apparent when one considers what
- happens in either case when the underflow exception is masked and y@z falls
- into the space between zero and the smallest normal magnitude. The 80387
- returns the nearest denormal number. This action might be called "gradual
- underflow." The effect is no different than the rounding that can occur when
- y@z falls in the normal range.
-
- On the other hand, the system that does not have denormal numbers returns
- zero as the result, an action that can be much more inaccurate than
- rounding. This action could be called "abrupt underflow."
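-
- The two behaviors can be contrasted on a host that supports denormal
- doubles. In the Python sketch below, the function abrupt is a
- hypothetical model of a system without denormals; the subtraction itself
- is performed by the host, which is assumed not to flush denormals to zero.
-
```python
import sys

SMALLEST_NORMAL = sys.float_info.min      # 2**-1022 for IEEE double


def abrupt(v: float) -> float:
    """Model a system without denormals: tiny results are replaced by zero."""
    return 0.0 if 0.0 < abs(v) < SMALLEST_NORMAL else v


y = 1.5 * SMALLEST_NORMAL
z = SMALLEST_NORMAL
gradual = y - z                   # exact result 2**-1023 is a denormal
print(gradual != 0.0)             # gradual underflow: y != z gives y - z != 0
print(abrupt(gradual) == 0.0)     # abrupt underflow: y - z is reported as 0
```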
-
-
- Figure 3-1. Floating-Point System with Denormals
-
- 0+++++++│+++++++│-+-+-+-+-+-+-+-│---+---+---+---+---+---+---+---│------+...
-
- └──┬──┘ - - - - - - - - Normal Numbers - - - - - -
- Denormals
-
-
- Figure 3-2. Floating-Point System without Denormals
-
- 0 │+++++++│-+-+-+-+-+-+-+-│---+---+---+---+---+---+---+---│------+---...
-
- - - - - - - - - Normal Numbers - - - - - -
-
-
- 3.1.2 Zeros
-
- The value zero in the real and decimal integer formats may be signed either
- positive or negative, although the sign of a binary integer zero is always
- positive. For computational purposes, the value of zero always behaves
- identically, regardless of sign, and typically the fact that a zero may be
- signed is transparent to the programmer. If necessary, the FXAM instruction
- may be used to determine a zero's sign.
-
- If a zero is loaded or generated in a register, the register is tagged
- zero. Table 3-3 lists the results of instructions executed with zero
- operands and also shows how a zero may be created from nonzero operands.
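-
- Several of these zero results can be reproduced on a host with IEEE
- arithmetic. The Python sketch below uses math.copysign in place of FXAM
- to expose the sign of a zero; it assumes the default round-to-nearest
- mode and illustrates the rules rather than executing 80387 instructions.
-
```python
import math


def sign_of(x: float) -> float:
    """Return +1.0 or -1.0 according to the sign bit of x, even for zeros."""
    return math.copysign(1.0, x)


pos, neg = 0.0, -0.0
print(pos == neg)                 # the two zeros compare equal
print(sign_of(neg))               # but the sign of -0 can still be examined
print(sign_of(pos * -3.0))        # +0 * -X gives -0
print(sign_of(-neg))              # changing the sign of -0 gives +0
print(sign_of(abs(neg)))          # the absolute value of -0 is +0
print(sign_of(math.sqrt(neg)))    # the square root of -0 is -0
print(sign_of(pos + neg))         # +0 plus -0 is +0 when rounding to nearest
```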
-
-
- Table 3-3. Zero Operands and Results
-
- ╓┌──────────────────┌───────────────────────────┌────────────────────────────╖
- Key to symbols used in this table
-
- Operation        Operands                    Result
-
- FLD,FBLD         +0                          +0
-                  -0                          -0
- FILD             +0                          +0
- FST,FSTP         +0                          +0
-                  -0                          -0
-                  +X                          +0
-                  -X                          -0
- FBSTP            +0                          +0
-                  -0                          -0
- FIST,FISTP       +0                          +0
-                  -0                          -0
-                  +X                          +0
-                  -X                          -0
- Addition         +0 plus +0                  +0
-                  -0 plus -0                  -0
-                  +0 plus -0, -0 plus +0      ±0
-                  -X plus +X, +X plus -X      ±0
-                  ±0 plus ±X, ±X plus ±0      #X
- Subtraction      +0 minus -0                 +0
-                  -0 minus +0                 -0
-                  +0 minus +0, -0 minus -0    ±0
-                  +X minus +X, -X minus -X    ±0
-                  ±0 minus ±X                 -#X
-                  ±X minus ±0                 #X
- Multiplication   +0 * +0, -0 * -0            +0
-                  +0 * -0, -0 * +0            -0
-                  +0 * +X, +X * +0            +0
-                  +0 * -X, -X * +0            -0
-                  -0 * +X, +X * -0            -0
-                  -0 * -X, -X * -0            +0
-                  +X * +Y, -X * -Y            +0
-                  +X * -Y, -X * +Y            -0
- Division         ±0 ÷ ±0                     Invalid Operation
-                  ±X ÷ ±0                     Φ∞ (Zero Divide)
-                  +0 ÷ +X, -0 ÷ -X            +0
-                  +0 ÷ -X, -0 ÷ +X            -0
-                  -X ÷ -Y, +X ÷ +Y            +0
-                  -X ÷ +Y, +X ÷ -Y            -0
- FPREM, FPREM1    ±0 rem ±0                   Invalid Operation
-                  ±X rem ±0                   Invalid Operation
-                  +0 rem ±X                   +0
-                  -0 rem ±X                   -0
- FPREM            +X rem ±Y                   +0 (Y exactly divides X)
-                  -X rem ±Y                   -0 (Y exactly divides X)
- FPREM1           +X rem ±Y                   +0 (Y exactly divides X)
-                  -X rem ±Y                   -0 (Y exactly divides X)
- FSQRT            +0                          +0
-                  -0                          -0
- Compare          ±0 : +X                     ±0 < +X
-                  ±0 : ±0                     ±0 = ±0
-                  ±0 : -X                     ±0 > -X
- FTST             ±0                          ±0 = 0
- FXAM             +0                          C{3}=1; C{2}=C{1}=C{0}=0
-                  -0                          C{3}=C{1}=1; C{2}=C{0}=0
- FCHS             +0                          -0
-                  -0                          +0
- FABS             ±0                          +0
- F2XM1            +0                          +0
-                  -0                          -0
- FRNDINT          +0                          +0
-                  -0                          -0
- FSCALE           ±0 scaled by -∞             *0
-                  ±0 scaled by +∞             Invalid Operation
-                  ±0 scaled by X              *0
- FXTRACT          +0                          ST=+0,ST(1)=-∞, Zero divide
-                  -0                          ST=-0,ST(1)=-∞, Zero divide
- FPTAN            ±0                          *0
- FSIN (or         ±0                          *0
- SIN result of
- FSINCOS)
- FCOS (or         ±0                          +1
- COS result of
- FSINCOS)
- FPATAN           ±0 ÷ +X                     *0
-                  ±0 ÷ -X                     *π
-                  ±X ÷ ±0                     #π/2
-                  ±0 ÷ +0                     *0
-                  ±0 ÷ -0                     *π
-                  +∞ ÷ ±0                     +π/2
-                  -∞ ÷ ±0                     -π/2
-                  ±0 ÷ +∞                     *0
-                  ±0 ÷ -∞                     *π
- FYL2X            ±Y * log(±0)                Zero Divide
-                  ±0 * log(±0)                Invalid Operation
- FYL2XP1          +Y * log(±0+1)              *0
-                  -Y * log(±0+1)              -0
-
-
- 3.1.3 Infinity
-
- The real formats support signed representations of infinities. These values
- are encoded with a biased exponent of all ones and a significand of
- 1.00..00; if the infinity is in a register, it is tagged special.
-
- A programmer may code an infinity, or it may be created by the NPX as its
- masked response to an overflow or a zero divide exception. Note that
- depending on rounding mode, the masked response may create the largest valid
- value representable in the destination rather than infinity.
-
- The signs of the infinities are observed, and comparisons are possible.
- Infinities are always interpreted in the affine sense; that is, -∞ < (any
- finite number) < +∞. Arithmetic on infinities is always exact and,
- therefore, signals no exceptions, except for the invalid operations
- specified in Table 3-4.
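-
- The affine interpretation and the exactness of infinity arithmetic can be
- checked with any IEEE implementation. The sketch below uses Python floats
- (IEEE doubles) rather than the 80387 itself; the helper is_invalid is a
- hypothetical name, and a NaN result stands in for the masked response to an
- invalid operation.
-

```python
import math
import sys

INF = math.inf
LARGEST = sys.float_info.max   # largest finite double

def is_invalid(result: float) -> bool:
    """A NaN result marks an operation that Table 3-4 lists as invalid."""
    return math.isnan(result)

# Affine ordering: -infinity < (any finite number) < +infinity.
assert -INF < -LARGEST < 0.0 < LARGEST < INF

# Arithmetic on infinities is exact.
assert INF + INF == INF
assert INF + LARGEST == INF
assert -1.0 / INF == 0.0 and math.copysign(1.0, -1.0 / INF) == -1.0

# Invalid combinations from Table 3-4 deliver a NaN in place of a number.
assert is_invalid(INF - INF)
assert is_invalid(INF * 0.0)
assert is_invalid(INF / INF)
```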
-
-
- Table 3-4. Infinity Operands and Results
-
- Key to symbols used in this table
-
- Operation          Operands                  Result
-
- Addition           +∞ plus +∞                +∞
-                    -∞ plus -∞                -∞
-                    +∞ plus -∞                Invalid Operation
-                    -∞ plus +∞                Invalid Operation
-                    ±∞ plus ±X                *∞
-                    ±X plus ±∞                *∞
- Subtraction        +∞ minus -∞               +∞
-                    -∞ minus +∞               -∞
-                    +∞ minus +∞               Invalid Operation
-                    -∞ minus -∞               Invalid Operation
-                    ±∞ minus ±X               *∞
-                    ±X minus ±∞               -*∞
- Multiplication     ±∞ * ±∞                   Φ∞
-                    ±∞ * ±Y, ±Y * ±∞          Φ∞
-                    ±0 * ±∞, ±∞ * ±0          Invalid Operation
- Division           ±∞ ÷ ±∞                   Invalid Operation
-                    ±∞ ÷ ±X                   Φ∞
-                    ±X ÷ ±∞                   Φ0
-                    ±∞ ÷ ±0                   Φ∞
- FSQRT              -∞                        Invalid Operation
-                    +∞                        +∞
- FPREM, FPREM1      ±∞ rem ±∞                 Invalid Operation
-                    ±∞ rem ±X                 Invalid Operation
-                    ±X rem ±∞                 $X, Q = 0
- FRNDINT            ±∞                        *∞
- FSCALE             ±∞ scaled by -∞           Invalid Operation
-                    ±∞ scaled by +∞           *∞
-                    ±∞ scaled by ±X           *∞
-                    ±0 scaled by -∞           ±0
-                    ±0 scaled by +∞           Invalid Operation
-                    ±Y scaled by +∞           #∞
-                    ±Y scaled by -∞           #0
- FXTRACT            ±∞                        ST = *∞, ST(1) = +∞
- Compare            +∞ : +∞                   +∞ = +∞
-                    -∞ : -∞                   -∞ = -∞
-                    +∞ : -∞                   +∞ > -∞
-                    -∞ : +∞                   -∞ < +∞
-                    +∞ : ±X                   +∞ > X
-                    -∞ : ±X                   -∞ < X
-                    ±X : +∞                   X < +∞
-                    ±X : -∞                   X > -∞
- FTST               +∞                        +∞ > 0
-                    -∞                        -∞ < 0
- FPATAN             ±∞ ÷ ±X                   *π/2
-                    ±Y ÷ +∞                   #0
-                    ±Y ÷ -∞                   #π
-                    ±∞ ÷ +∞                   *π/4
-                    ±∞ ÷ -∞                   *3π/4
-                    ±∞ ÷ ±0                   *π/2
-                    +0 ÷ +∞                   +0
-                    +0 ÷ -∞                   +π
-                    -0 ÷ +∞                   -0
-                    -0 ÷ -∞                   -π
- F2XM1              +∞                        +∞
-                    -∞                        -1
- FYL2X, FYL2XP1     ±∞ * log(1)               Invalid Operation
-                    ±∞ * log(Y>1)             *∞
-                    ±∞ * log(0<Y<1)           -*∞
-                    ±Y * log(+∞)              #∞
-                    ±0 * log(+∞)              Invalid Operation
-                    ±Y * log(-∞)              Invalid Operation
-
-
- 3.1.4 NaN (Not-a-Number)
-
- A NaN (Not a Number) is a member of a class of special values that exists
- in the real formats only. A NaN has an exponent of 11..11B, may have either
- sign, and may have any significand except 1.00..00B, which is assigned to
- the infinities. A NaN in a register is tagged special.
-
- There are two classes of NaNs: signaling (SNaN) and quiet (QNaN). Among the
- QNaNs, the value real indefinite is of special interest.
-
-
- 3.1.4.1 Signaling NaNs
-
- A signaling NaN is a NaN that has a zero as the most significant bit of its
- significand. The rest of the significand may be set to any value. The 80387
- never generates a signaling NaN as a result; however, it recognizes
- signaling NaNs when they appear as operands. Arithmetic operations (as
- defined at the beginning of this chapter) on a signaling NaN cause an
- invalid-operation exception (except for load operations, FXCH, FCHS, and
- FABS).
-
- By unmasking the invalid operation exception, the programmer can use
- signaling NaNs to trap to the exception handler. The generality of this
- approach and the large number of NaN values that are available provide the
- sophisticated programmer with a tool that can be applied to a variety of
- special situations.
-
- For example, a compiler could use signaling NaNs as references to
- uninitialized (real) array elements. The compiler could preinitialize each
- array element with a signaling NaN whose significand contained the index
- (relative position) of the element. If an application program attempted to
- access an element that it had not initialized, it would use the NaN placed
- there by the compiler. If the invalid operation exception were unmasked, an
- interrupt would occur, and the exception handler would be invoked. The
- exception handler could determine which element had been accessed, since the
- operand address field of the exception pointers would point to the NaN, and
- the NaN would contain the index number of the array element.
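-
- The array-initialization idea can be sketched at the bit level. The Python
- example below works on 64-bit integers that hold double-real encodings, so
- that no host FPU has a chance to alter the signaling NaN. The function names
- and the choice to store index + 1 (an all-zero significand field would encode
- infinity, not a NaN) are assumptions made for illustration, not part of any
- compiler described here.
-

```python
EXP_ALL_ONES = 0x7FF << 52        # biased exponent field, all ones
QUIET_BIT = 1 << 51               # most significant bit of the significand field
PAYLOAD_MASK = QUIET_BIT - 1      # remaining 51 significand bits

def make_snan(index: int) -> int:
    """Encode an array index in a positive double-real signaling NaN."""
    payload = index + 1           # offset by 1: an all-zero field means infinity
    if not 0 < payload <= PAYLOAD_MASK:
        raise ValueError("index does not fit in the NaN significand")
    return EXP_ALL_ONES | payload

def is_snan(bits: int) -> bool:
    """True if the 64-bit pattern is a signaling NaN."""
    return ((bits & EXP_ALL_ONES) == EXP_ALL_ONES
            and (bits & QUIET_BIT) == 0
            and (bits & PAYLOAD_MASK) != 0)

def index_from_snan(bits: int) -> int:
    """Recover the array index, as an exception handler might."""
    if not is_snan(bits):
        raise ValueError("not a signaling NaN")
    return (bits & PAYLOAD_MASK) - 1

# Preinitialize a four-element "array" and decode one element.
array = [make_snan(i) for i in range(4)]
print(hex(array[2]), index_from_snan(array[2]))   # prints "0x7ff0000000000003 2"
```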
-
-
- 3.1.4.2 Quiet NaNs
-
- A quiet NaN is a NaN that has a one as the most significant bit of its
- significand. The 80387 creates the quiet NaN real indefinite (defined below)
- as its default response to certain exceptional conditions. The 80387 may
- derive other QNaNs by converting an SNaN. The 80387 converts an SNaN by
- setting the most significant bit of its significand to one, thereby
- generating a QNaN. The remaining bits of the significand are not changed;
- therefore, diagnostic information that may be stored in these bits of the
- SNaN is propagated into the QNaN.
-
- The 80387 will generate the special QNaN, real indefinite, as its masked
- response to an invalid operation exception. This NaN is signed negative; its
- significand is encoded 1.100..00. All other NaNs represent values created
- by programmers or derived from values created by programmers.
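-
- As a rough illustration of the conversion and of the real indefinite
- encoding, the following Python sketch manipulates double-real bit patterns as
- integers. The names quiet and REAL_INDEFINITE are assumptions for this
- example; in the double format the leading 1 of the significand is implicit,
- so only the top stored significand bit appears.
-

```python
import math
import struct

SIGN_BIT = 1 << 63
EXP_ALL_ONES = 0x7FF << 52
QUIET_BIT = 1 << 51            # most significant stored significand bit

def quiet(snan_bits: int) -> int:
    """Convert an SNaN pattern to a QNaN; all other bits are preserved."""
    return snan_bits | QUIET_BIT

# Real indefinite: negative sign, exponent all ones, significand 1.100..00.
REAL_INDEFINITE = SIGN_BIT | EXP_ALL_ONES | QUIET_BIT

snan = EXP_ALL_ONES | 0x1234                 # SNaN carrying diagnostic bits
print(hex(quiet(snan)))                      # prints "0x7ff8000000001234"
print(hex(REAL_INDEFINITE))                  # prints "0xfff8000000000000"

# The host agrees that the indefinite pattern is a NaN.
(value,) = struct.unpack("<d", struct.pack("<Q", REAL_INDEFINITE))
assert math.isnan(value)
```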
-
- Both quiet and signaling NaNs are supported in all operations. A QNaN is
- generated as the masked response for invalid-operation exceptions and as the
- result of an operation in which at least one of the operands is a QNaN. The
- 80387 applies the rules shown in Table 3-5 when generating a QNaN.
-
- Note that handling of a QNaN operand has greater priority than all
- exceptions except certain invalid-operation exceptions (refer to the section
- "Exception Priority" in this chapter).
-
- Quiet NaNs could be used, for example, to speed up debugging. In its early
- testing phase, a program often contains multiple errors. An exception
- handler could be written to save diagnostic information in memory whenever
- it was invoked. After storing the diagnostic data, it could supply a quiet
- NaN as the result of the erroneous instruction, and that NaN could point to
- its associated diagnostic area in memory. The program would then continue,
- creating a different NaN for each error. When the program ended, the NaN
- results could be used to access the diagnostic data saved at the time the
- errors occurred. Many errors could thus be diagnosed and corrected in one
- test run.
-
-
- Table 3-5. Rules for Generating QNaNs
-
- Operation                          Action
-
- Real operation on an SNaN and      Deliver the QNaN operand.
- a QNaN
-
- Real operation on two SNaNs        Deliver the QNaN that results from
-                                    converting the SNaN that has the larger
-                                    significand.
-
- Real operation on two QNaNs        Deliver the QNaN that has the larger
-                                    significand.
-
- Real operation on an SNaN and      Deliver the QNaN that results from
- another number                     converting the SNaN.
-
- Real operation on a QNaN and       Deliver the QNaN.
- another number
-
- Invalid operation that does not    Deliver the default QNaN real indefinite.
- involve NaNs
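-
- The rules of Table 3-5 can be written out as a small decision procedure. The
- Python sketch below models them on double-real bit patterns; it is not the
- 80387's internal algorithm. The function name qnan_result is hypothetical,
- and returning the first operand when two NaNs have equal significands is an
- assumption, since the table does not cover that case.
-

```python
SIGN_BIT = 1 << 63
EXP_ALL_ONES = 0x7FF << 52
FRACTION_MASK = (1 << 52) - 1
QUIET_BIT = 1 << 51
REAL_INDEFINITE = SIGN_BIT | EXP_ALL_ONES | QUIET_BIT

def is_nan(bits: int) -> bool:
    return (bits & EXP_ALL_ONES) == EXP_ALL_ONES and (bits & FRACTION_MASK) != 0

def is_qnan(bits: int) -> bool:
    return is_nan(bits) and (bits & QUIET_BIT) != 0

def qnan_result(a: int, b: int) -> int:
    """Masked result for an invalid operation or one involving a NaN."""
    nans = [x for x in (a, b) if is_nan(x)]
    if not nans:
        return REAL_INDEFINITE               # invalid operation, no NaN involved
    qnans = [x for x in nans if is_qnan(x)]
    if len(nans) == 2 and len(qnans) == 1:
        return qnans[0]                      # SNaN and QNaN: deliver the QNaN
    # Remaining cases: pick the NaN with the larger significand (ties keep
    # the first operand, an assumption) and quiet it. Quieting a QNaN is a no-op.
    chosen = max(nans, key=lambda x: x & FRACTION_MASK)
    return chosen | QUIET_BIT

snan_small = EXP_ALL_ONES | 1
snan_large = EXP_ALL_ONES | 2
print(hex(qnan_result(snan_small, snan_large)))   # prints "0x7ff8000000000002"
```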
-
-
- 3.1.5 Indefinite
-
- For every 80387 numeric data type, one unique encoding is reserved for
- representing the special value indefinite. The 80387 produces this encoding
- as its response to a masked invalid-operation exception.
-
- In the case of reals, the indefinite value is a QNaN as discussed in the
- prior section.
-
- Packed decimal indefinite may be stored by the NPX in an FBSTP instruction;
- attempting to use this encoding in an FBLD instruction, however, will have an
- undefined result; thus indefinite cannot be loaded from a packed decimal
- integer.
-
- In the binary integers, the same encoding may represent either indefinite
- or the largest negative number supported by the format (-2^(15), -2^(31), or
- -2^(63)). The 80387 will store this encoding as its masked response to
- an invalid operation, or when the value in a source register represents or
- rounds to the largest negative integer representable by the destination. In
- situations where its origin may be ambiguous, the invalid-operation
- exception flag can be examined to see if the value was produced by an
- exception response. When this encoding is loaded or used by an integer
- arithmetic or compare operation, it is always interpreted as a negative
- number; thus indefinite cannot be loaded from a binary integer.
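-
- The ambiguity of the integer indefinite encoding can be made concrete with a
- small model. The Python sketch below imitates a masked store to a 16-bit word
- integer under round-to-nearest; the function name fist_word and its returned
- flag are assumptions for illustration. The flag corresponds to the
- invalid-operation flag that the text suggests examining.
-

```python
import math

def fist_word(value: float):
    """Model a masked store to a 16-bit word integer, round-to-nearest mode.

    Returns (stored_integer, invalid_flag).
    """
    indefinite = -2**15                       # encoding 8000H
    if math.isnan(value) or math.isinf(value):
        return indefinite, True
    rounded = round(value)                    # Python rounds halves to even
    if not -2**15 <= rounded <= 2**15 - 1:
        return indefinite, True
    return rounded, False

# Same stored value, different origin: only the flag tells them apart.
print(fist_word(1.0e9))      # prints "(-32768, True)"
print(fist_word(-32768.0))   # prints "(-32768, False)"
```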
-
-
- 3.1.6 Encoding of Data Types
-
- Tables 3-6 through 3-9 show how each of the special values just
- described is encoded for each of the numeric data types. In these tables,
- the least-significant bits are shown to the right and are stored in the
- lowest memory addresses. The sign bit is always the left-most bit of the
- highest-addressed byte.
-
-
- 3.1.7 Unsupported Formats
-
- The extended format permits many bit patterns that do not fall into any of
- the previously mentioned categories. Some of these encodings were supported
- by the 80287 NPX; however, most of them are not supported by the 80387 NPX.
- These encodings were dropped because the final version of the IEEE 754
- standard eliminated these data types.
-
- The categories of encodings formerly known as pseudozeros, pseudo-NaNs,
- pseudoinfinities, and unnormal numbers are not supported by the 80387. The
- 80387 raises the invalid-operation exception when they are encountered as
- operands.
-
- The encodings formerly known as pseudodenormal numbers are not generated by
- the 80387; however, they are correctly utilized when encountered in operands
- to 80387 instructions. The exponent is treated as if it were 00..01 and the
- mantissa is unchanged. The denormal exception is raised.
-
-
- Table 3-6. Binary Integer Encodings
-
- Class Sign Magnitude
- ┌────────────────────────────────────────────────────────────
- │ (Largest) 0 11...11
- │ ∙ ∙
- Positives ∙ ∙
- │ ∙ ∙
- │ (Smallest) 0 00...01
- └────────────────────────────────────────────────────────────
- Zero 0 00...00
- ┌────────────────────────────────────────────────────────────
- │ (Smallest) 1 11...11
- │ ∙ ∙
- Negatives ∙ ∙
- │ ∙ ∙
- │ (Largest/Indefinite) 1 00...00
- └────────────────────────────────────────────────────────────
- Word: ───15 bits───
- Short: ───31 bits───
- Long: ───63 bits───
-
-
- Table 3-7. Packed Decimal Encodings
-
-
- ┌─────────────────── Magnitude ───
- Class Sign digit digit digit
- ┌────────────────────────────────────────────────────────────────────────
- │ (Largest) 0 0000000 1 0 0 1 1 0 0 1 1 0 0 1
- │ ∙ ∙ ∙
- │ ∙ ∙ ∙
- Positives ∙ ∙ ∙
- │ (Smallest) 0 0000000 0 0 0 0 0 0 0 0 0 0 0 0
- │
- │ Zero 0 0000000 0 0 0 0 0 0 0 0 0 0 0 0
- └────────────────────────────────────────────────────────────────────────
- ┌────────────────────────────────────────────────────────────────────────
- │ Zero 1 0000000 0 0 0 0 0 0 0 0 0 0 0 0
- │
- │ (Smallest) 1 0000000 0 0 0 0 0 0 0 0 0 0 0 0
- Negatives ∙ ∙ ∙
- │ ∙ ∙ ∙
- │ ∙ ∙ ∙
- │ (Largest) 1 0000000 1 0 0 1 1 0 0 1 1 0 0 1
- └────────────────────────────────────────────────────────────────────────
-
- Indefinite 1 1111111 1 1 1 1 1 1 1 1 U U U U
- ──── 1 byte ─── ──────────────────────── 9 bytes
-
-
- Table 3-8. Single and Double Real Encodings
-
- Biased Significand
- Class Sign Exponent ff--ff
-
-
-
- ┌──────┬───────────────────────────────────────────────────────
- │ │ Quiet 0 11...11 11...11
- │ │ ∙ ∙
- │ │ ∙ ∙
- │ │ ∙ ∙
- │ │ ∙ ∙
- │ │ 0 11...11 10...00
- │ NaNs ──────────────────────────────────────────────
- │ │ Signaling 0 11...11 01...11
- │ │ ∙ ∙
- │ │ ∙ ∙
- │ │ ∙ ∙
- │ │ 0 11...11 00...01
- │ └───────────────────────────────────────────────────────
- │ ∞ 0 11...11 00...00
- │ ┌───────────────────────────────────────────────────────
- │ │ Normals 0 11...10 11...11
- │ │ ∙ ∙
- Positives │ ∙ ∙
- │ │ ∙ ∙
- │ │ 0 00...01 00...00
- │ │ ──────────────────────────────────────────────
- │ Reals Denormals 0 00...00 11...11
- │ │ ∙ ∙
- │ │ ∙ ∙
- │ │ ∙ ∙
- │ │ ∙ ∙
- │ │ 0 00...00 00...01
- │ │ ──────────────────────────────────────────────
- │ │ Zero 0 00...00 00...00
- └──────────────────────────────────────────────────────────────
- ┌──────────────────────────────────────────────────────────────
- │ │ Zero 1 00...00 00...00
- │ │ ──────────────────────────────────────────────
- │ │ Denormals 1 00...00 00...01
- │ │ ∙ ∙
- │ │ ∙ ∙
- │ Reals ∙ ∙
- │ │ 1 00...00 11...11
- │ │ ──────────────────────────────────────────────
- │ │ Normals 1 00...01 00...00
- │ │ ∙ ∙
- │ │ ∙ ∙
- │ │ ∙ ∙
- │ │ ∙ ∙
- │ │ 1 11...10 11...11
- Negatives └───────────────────────────────────────────────────────
- │ ∞ 1 11...11 00...00
- │ ┌─────────┬─────────────────────────────────────────────
- │ │ │ 1 11...11 00...01
- │ │ │ ∙ ∙
- │ │ Signaling ∙ ∙
- │ │ │ ∙ ∙
- │ │ │ 1 11...11 01...11
- │ NaNs ├─────────────────────────────────────────────
- │ │ │ Indefinite 1 11...11 10...00
- │ │ │ ─────────────────────────────────────────
- │ │ │ ∙ ∙
- │ │ Quiet ∙ ∙
- │ │ │ ∙ ∙
- │ │ │ 1 11...11 11...11
- └──────┴─────────┴─────────────────────────────────────────────
- Single: │ ───8 bits── │ ──23 bits── │
- Double: │ ──11 bits── │ ──52 bits── │
-
-
- Table 3-9. Extended Real Encodings
-
- Biased Significand
- Class Sign Exponent 1.ff--ff
- ┌──────┬──────────────────────────────────────────────────
- │ │ 0 11...11 1 11..11
- │ │ Quiet ∙ ∙ ∙
- │ │ ∙ ∙ ∙
- │ │ 0 11...11 1 10..01
- │ NaNs ───────────────────────────────────────────────
- │ │ 0 11...11 1 01..11
- │ │ Signaling ∙ ∙ ∙
- │ │ ∙ ∙ ∙
- │ │ ∙ ∙ ∙
- │ │ 0 11...11 1 00..01
- │ └───────────────────────────────────────────────────
- │ ∞ 0 11...11 1 00..00
- │ ┌────────────────────────────────────────────────────
- │ │ 0 11...10 1 11..11
- │ │ Normals ∙ ∙ ∙
- │ │ ∙ ∙ ∙
- │ │ 0 00...01 1 00..00
- │ │ ───────────────────────────────────────────────
- Positives │ 0 11...10 0 11..11
- │ Reals Unsupported ∙ ∙ ∙
- │ │ 8087 Unnormals ∙ ∙ ∙
- │ │ 0 00...01 0 00..00
- │ │ ───────────────────────────────────────────────
- │ │ 0 00...00 1 11..11
- │ │ Pseudo- ∙ ∙ ∙
- │ │ denormals ∙ ∙ ∙
- │ │ 0 00...00 1 00..00
- │ │ ───────────────────────────────────────────────
- │ │ 0 00...00 0 11..11
- │ │ Denormals ∙ ∙ ∙
- │ │ ∙ ∙ ∙
- │ │ 0 00...00 0 00..01
- │ ├───────────────────────────────────────────────────
- │ │ Zero 0 00...00 000...00
- └──────────────────────────────────────────────────────────
- ┌──────────────────────────────────────────────────────────
- │ │ Zero 1 00...00 000...00
- │ ├───────────────────────────────────────────────────
- │ │ 1 00...00 0 00..01
- │ │ Denormals ∙ ∙ ∙
- │ │ ∙ ∙ ∙
- │ │ 1 00...00 0 11..11
- │ │ ───────────────────────────────────────────────
- │ │ 1 00...00 1 00..00
- │ Reals Pseudo- ∙ ∙ ∙
- │ │ denormals ∙ ∙ ∙
- │ │ 1 00...00 1 11..11
- │ │ ───────────────────────────────────────────────
- │ │ 1 00...01 0 00..00
- Negatives │ Unsupported ∙ ∙ ∙
- │ │ 8087 Unnormals ∙ ∙ ∙
- │ │ 1 11...10 0 11..11
- │ │ ───────────────────────────────────────────────
- │ │ 1 00...01 1 00..00
- │ │ Normals ∙ ∙ ∙
- │ │ ∙ ∙ ∙
- │ │ 1 11...10 1 11..11
- │ └────────────────────────────────────────────────────
- │ ∞ 1 11...11 1 00..00
- │ ┌───┬───────────────────────────────────────────────
- │ │ │ 1 11...11 1 00..01
- │ │ Signaling ∙ ∙ ∙
- │ │ │ ∙ ∙ ∙
- │ │ │ ∙ ∙ ∙
- │ │ │ 1 11...11 1 01..11
- │ │ ├───────────────────────────────────────────────
- │ NaNs │ Indefinite 1 11...11 110...00
- │ │ │ ──────────────────────────────────────────
- │ │ │ 1 11...11 1 10..00
- │ │ Quiet ∙ ∙ ∙
- │ │ │ ∙ ∙ ∙
- │ │ │ 1 11...11 1 11..11
- └──────┴───┴───────────────────────────────────────────────
- │──15 bits──│──64 bits──│
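-
- The classes in Table 3-9 follow from three fields: the 15-bit biased
- exponent, the explicit integer bit of the significand, and the 63-bit
- fraction. The Python sketch below classifies an extended-real encoding
- accordingly. The function name and the returned strings are assumptions for
- illustration; "unsupported" lumps together the former unnormals,
- pseudozeros, pseudo-NaNs, and pseudoinfinities, all of which have a zero
- integer bit with a nonzero exponent.
-

```python
INTEGER_BIT = 1 << 63              # explicit leading bit of the significand
QUIET_BIT = 1 << 62                # most significant fraction bit
FRACTION_MASK = INTEGER_BIT - 1    # low 63 bits

def classify_extended(exponent: int, significand: int) -> str:
    """Classify an extended real from its biased exponent and significand."""
    has_integer_bit = (significand & INTEGER_BIT) != 0
    if exponent == 0:
        if significand == 0:
            return "zero"
        return "pseudodenormal" if has_integer_bit else "denormal"
    if not has_integer_bit:
        return "unsupported"
    if exponent < 0x7FFF:
        return "normal"
    if (significand & FRACTION_MASK) == 0:
        return "infinity"
    return "quiet NaN" if (significand & QUIET_BIT) != 0 else "signaling NaN"

print(classify_extended(0x7FFF, 0xC000000000000000))   # prints "quiet NaN"
print(classify_extended(0x3FFF, 0x8000000000000000))   # prints "normal"
```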
-
-
- 3.2 Numeric Exceptions
-
- The 80387 can recognize six classes of numeric exception conditions while
- executing numeric instructions:
-
- 1. I: Invalid operation
-    ■ Stack fault
-    ■ IEEE standard invalid operation
- 2. Z: Divide-by-zero
- 3. D: Denormalized operand
- 4. O: Numeric overflow
- 5. U: Numeric underflow
- 6. P: Inexact result (precision)
-
-
- 3.2.1 Handling Numeric Exceptions
-
- When numeric exceptions occur, the NPX takes one of two possible courses of
- action:
-
- ■ The NPX can itself handle the exception, producing the most reasonable
- result and allowing numeric program execution to continue undisturbed.
-
- ■ A software exception handler can be invoked by the CPU to handle the
- exception.
-
- Each of the six exception conditions described above has a corresponding
- flag bit in the 80387 status word and a mask bit in the 80387 control word.
- If an exception is masked (the corresponding mask bit in the control
- word = 1), the 80387 takes an appropriate default action and continues with
- the computation. If the exception is unmasked (mask = 0), the 80387 asserts
- the ERROR# output to the 80386 to signal the exception and invoke a software
- exception handler.
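-
- The interplay of flag bits and mask bits can be modeled in a few lines. The
- Python sketch below assumes the 80387 register layout described elsewhere in
- this manual, in which the six exception flags occupy bits 0 through 5 of the
- status word and the matching masks occupy bits 0 through 5 of the control
- word. The function name is hypothetical.
-

```python
# Flag bits (status word) and mask bits (control word) share positions 0-5.
EXCEPTION_BITS = {"I": 0x01, "D": 0x02, "Z": 0x04, "O": 0x08, "U": 0x10, "P": 0x20}

def unmasked_exceptions(status_word: int, control_word: int) -> list:
    """Names of exceptions that are flagged and whose mask bit is 0."""
    return [name for name, bit in EXCEPTION_BITS.items()
            if (status_word & bit) and not (control_word & bit)]

ALL_MASKED = 0x037F                          # control word after FINIT
ZERO_DIVIDE_UNMASKED = ALL_MASKED & ~0x04    # clear only the Z mask

zero_divide_flagged = 0x0004                 # status word with the Z flag set
print(unmasked_exceptions(zero_divide_flagged, ALL_MASKED))             # prints "[]"
print(unmasked_exceptions(zero_divide_flagged, ZERO_DIVIDE_UNMASKED))   # prints "['Z']"
```

- An empty list corresponds to the masked case, in which the NPX applies its
- default response; a nonempty list corresponds to the case in which ERROR# is
- asserted.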
-
- Note that when exceptions are masked, the NPX may detect multiple
- exceptions in a single instruction, because it continues executing the
- instruction after performing its masked response. For example, the 80387
- could detect a denormalized operand, perform its masked response to this
- exception, and then detect an underflow.
-
-
- 3.2.1.1 Automatic Exception Handling
-
- The 80387 NPX has a default fix-up activity for every possible exception
- condition it may encounter. These masked-exception responses are designed to
- be safe and are generally acceptable for most numeric applications.
-
- As an example of how even severe exceptions can be handled safely and
- automatically using the NPX's default exception responses, consider a
- calculation of the parallel resistance of several values using only the
- standard formula (Figure 3-3). If R{1} becomes zero, the circuit resistance
- becomes zero. With the divide-by-zero and precision exceptions masked, the
- 80387 NPX will produce the correct result.
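-
- The parallel-resistance calculation can be imitated in Python to show why the
- masked response gives the right answer. Python raises an error on division by
- zero, so the sketch supplies a hypothetical helper, masked_divide, that
- models the masked zero-divide response (an infinity signed with the exclusive
- OR of the operand signs). It is a model of the behavior described here, not
- 80387 code.
-

```python
import math

def masked_divide(x: float, y: float) -> float:
    """Divide x by y, modeling the masked zero-divide response."""
    if y == 0.0 and x != 0.0 and math.isfinite(x):
        # Result sign is the exclusive OR of the operand signs.
        sign = math.copysign(1.0, x) * math.copysign(1.0, y)
        return math.copysign(math.inf, sign)
    return x / y   # 0/0 (an invalid operation) still raises in this sketch

def parallel_resistance(*resistances: float) -> float:
    """Equivalent resistance 1 / (1/R1 + 1/R2 + ...)."""
    total = sum(masked_divide(1.0, r) for r in resistances)
    return masked_divide(1.0, total)

print(parallel_resistance(4.0, 4.0))         # prints "2.0"
print(parallel_resistance(0.0, 10.0, 20.0))  # prints "0.0"
```

- With one resistance equal to zero, 1/R becomes +∞, the sum stays +∞, and the
- final reciprocal is zero, which is the physically correct result.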
-
- By masking or unmasking specific numeric exceptions in the NPX control
- word, NPX programmers can delegate responsibility for most exceptions to the
- NPX, reserving the most severe exceptions for programmed exception handlers.
- Exception-handling software is often difficult to write, and the NPX's
- masked responses have been tailored to deliver the most reasonable result
- for each condition. For the majority of applications, masking all
- exceptions other than invalid-operation yields satisfactory results with the
- least programming effort. An invalid-operation exception normally indicates
- a program error that must be corrected; this exception should not normally
- be masked.
-
- The exception flags in the NPX status word provide a cumulative record of
- exceptions that have occurred since these flags were last cleared. Once set,
- these flags can be cleared only by executing the FCLEX (clear exceptions)
- instruction, by reinitializing the NPX, or by overwriting the flags with an
- FRSTOR or FLDENV instruction. This allows a programmer to mask all
- exceptions (except invalid operation), run a calculation, and then inspect
- the status word to see if any exceptions were detected at any point in the
- calculation.
-
-
- Figure 3-3. Arithmetic Example Using Infinity
-
-     ───┬──────┬──────┐
-        │      │      │
-        │      │      │
-        │      │      │
-       R{1}   R{2}   R{3}
-        │      │      │
-        │      │      │
-        │      │      │
-     ───┴──────┴──────┘
-
-                                           1
-     EQUIVALENT RESISTANCE = ──────────────────────────────
-                               1/R{1} + 1/R{2} + 1/R{3}
-
-
- 3.2.1.2 Software Exception Handling
-
- If the NPX encounters an unmasked exception condition, it signals the
- exception to the 80386 CPU using the ERROR# status line between the two
- processors.
-
- The next time the 80386 CPU encounters a WAIT or ESC instruction in its
- instruction stream, the 80386 will detect the active condition of the ERROR#
- status line and automatically trap to an exception response routine using
- interrupt #16, the "processor extension error" exception.
-
- This exception response routine is normally a part of the systems software.
- Typical exception responses may include:
-
- ■ Incrementing an exception counter for later display or printing
-
- ■ Printing or displaying diagnostic information (e.g., the 80387
- environment and registers)
-
- ■ Aborting further execution
-
- ■ Using the exception pointers to build an instruction that will run
- without exception and executing it
-
- For 80386 systems having systems software support for the 80387 NPX,
- applications programmers should consult the operating system's reference
- manuals for the appropriate system response to NPX exceptions. For systems
- programmers, specific details on writing software exception handlers are
- included in Chapter 6.
-
-
- 3.2.2 Invalid Operation
-
- This exception may occur in response to two general classes of operations:
-
- 1. Stack operations
- 2. Arithmetic operations
-
- The stack flag (SF) of the status word indicates which class of operation
- caused the exception. When SF is 1, a stack operation has resulted in stack
- overflow or underflow; when SF is 0, an arithmetic instruction has
- encountered an invalid operand.
-
-
- 3.2.2.1 Stack Exception
-
- When SF is 1, indicating a stack operation, the O/U# bit of the condition
- code (bit C{1}) distinguishes between stack overflow and underflow as
- follows:
-
- O/U# = 1    Stack overflow: an instruction attempted to push down a
-             nonempty stack location.
-
- O/U# = 0    Stack underflow: an instruction attempted to read an
-             operand from an empty stack location.
-
- When the invalid-operation exception is masked, the 80387 returns the QNaN
- indefinite. This value overwrites the destination register, destroying
- its original contents.
-
- When the invalid-operation exception is not masked, the 80386 exception
- "processor extension error" is triggered. TOP is not changed, and the source
- operands remain unaffected.
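-
- An exception handler's first step, telling the two classes of invalid
- operation apart, can be sketched as follows. The Python model assumes the
- 80387 status-word layout described elsewhere in this manual (invalid flag in
- bit 0, SF in bit 6, C{1} in bit 9); the function name and the returned
- strings are hypothetical.
-

```python
IE = 0x0001   # invalid-operation flag
SF = 0x0040   # stack flag
C1 = 0x0200   # condition code C1, which serves as O/U# when SF = 1

def describe_invalid(status_word: int):
    """Return the cause of an invalid-operation exception, or None."""
    if not (status_word & IE):
        return None
    if not (status_word & SF):
        return "invalid arithmetic operand"
    return "stack overflow" if (status_word & C1) else "stack underflow"

print(describe_invalid(0x0241))   # prints "stack overflow"
print(describe_invalid(0x0041))   # prints "stack underflow"
print(describe_invalid(0x0001))   # prints "invalid arithmetic operand"
```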
-
-
- 3.2.2.2 Invalid Arithmetic Operation
-
- This class includes the invalid operations defined in IEEE Std 754. The
- 80387 reports an invalid operation in any of the cases shown in Table 3-10.
- Also shown in this table are the 80387's responses when the invalid
- exception is masked. When unmasked, the 80386 exception "processor extension
- error" is triggered, and the operands remain unaltered. An invalid operation
- generally indicates a program error.
-
-
- Table 3-10. Masked Responses to Invalid Operations
-
- Condition                            Masked Response
-
- Any arithmetic operation             Return the QNaN indefinite.
- on an unsupported format.
-
- Any arithmetic operation             Return a QNaN (refer to the section
- on a signaling NaN.                  "Rules for Generating QNaNs").
-
- Compare and test operations:         Set condition codes "not comparable."
- one or both operands is a NaN.
-
- Addition of opposite-signed          Return the QNaN indefinite.
- infinities or subtraction of
- like-signed infinities.
-
- Multiplication: ∞ * 0; or 0 * ∞.     Return the QNaN indefinite.
-
- Division: ∞ ÷ ∞; or 0 ÷ 0.           Return the QNaN indefinite.
-
- Remainder instructions FPREM,        Return the QNaN indefinite; set C{2}.
- FPREM1 when modulus (divisor)
- is zero or dividend is ∞.
-
- Trigonometric instructions FCOS,     Return the QNaN indefinite; set C{2}.
- FPTAN, FSIN, FSINCOS when
- argument is ∞.
-
- FSQRT of negative operand (except    Return the QNaN indefinite.
- FSQRT (-0) = -0), FYL2X of
- negative operand (except FYL2X
- (-0) = -∞), FYL2XP1 of operand
- more negative than -1.
-
- FIST(P) instructions when source     Store integer indefinite.
- register is empty, a NaN, ∞,
- or exceeds representable range
- of destination.
-
- FBSTP instruction when source        Store packed decimal indefinite.
- register is empty, a NaN, ∞, or
- exceeds 18 decimal digits.
-
- FXCH instruction when one or         Change empty registers to the QNaN
- both registers are tagged empty.     indefinite and then perform exchange.
-
-
- 3.2.3 Division by Zero
-
- If an instruction attempts to divide a finite nonzero operand by zero, the
- 80387 will report a zero-divide exception. This is possible for
- F(I)DIV(R)(P) as well as the other instructions that perform division
- internally: FYL2X and FXTRACT. The masked response for FDIV and FYL2X is to
- return an infinity signed with the exclusive OR of the signs of the
- operands. For FXTRACT, ST(1) is set to -∞; ST is set to zero with the same
- sign as the original operand. If the divide-by-zero exception is unmasked,
- the 80386 exception "processor extension error" is triggered; the operands
- remain unaltered.
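-
- The masked FXTRACT response for a zero operand can be compared with the
- ordinary case in a short model. The Python sketch below is an assumption-laden
- imitation, not 80387 code: the function name fxtract and its returned
- zero-divide flag are hypothetical, and NaN operands are not modeled.
-

```python
import math

def fxtract(x: float):
    """Return (st, st1, zero_divide): significand, exponent, and Z flag."""
    if x == 0.0:
        return x, -math.inf, True          # ST keeps the sign of the operand
    if math.isinf(x):
        return x, math.inf, False          # per Table 3-4
    # frexp gives x = mantissa * 2**exponent with 0.5 <= |mantissa| < 1;
    # rescale so that 1 <= |significand| < 2.
    mantissa, exponent = math.frexp(x)
    return mantissa * 2.0, float(exponent - 1), False

print(fxtract(8.0))    # prints "(1.0, 3.0, False)"
print(fxtract(-0.0))   # prints "(-0.0, -inf, True)"
```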
-
-
- 3.2.4 Denormal Operand
-
- If an arithmetic instruction attempts to operate on a denormal operand, the
- NPX reports the denormal-operand exception. Denormal operands may have
- reduced significance due to lost low-order bits; therefore, it may be
- advisable in certain applications to preclude operations on these operands.
- This can be accomplished by an exception handler that responds to unmasked
- denormal exceptions. Most users will mask this exception so that
- computation may proceed; any loss of accuracy will be analyzed by the user
- when the final result is delivered.
-
- When this exception is masked, the 80387 sets the D-bit in the status word,
- then proceeds with the instruction. Gradual underflow and denormal numbers
- as handled on the 80387 will produce results at least as good as, and often
- better than, what could be obtained from a machine that flushes underflows to
- zero. In fact, a denormal operand in single- or double-precision format will
- be normalized to the extended-real format when loaded into the 80387.
- Subsequent operations will benefit from the additional precision of the
- extended-real format used internally.
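-
- The normalization that takes place on loading can be sketched with integer
- arithmetic. The Python example below takes the 32-bit encoding of a
- single-format denormal and returns a normalized 24-bit significand with its
- exponent, which the wider exponent range of the extended format can hold.
- The function name is hypothetical, and the sketch models only the arithmetic,
- not the NPX load instruction.
-

```python
import math
import struct

def normalize_single_denormal(bits: int):
    """Return (significand, exponent) with value = significand * 2**(exponent - 23).

    The 24-bit significand has its top bit set, that is, it is normalized.
    """
    fraction = bits & 0x007FFFFF
    if ((bits >> 23) & 0xFF) != 0 or fraction == 0:
        raise ValueError("not a single-format denormal")
    shift = 24 - fraction.bit_length()     # left shift that sets bit 23
    return fraction << shift, -126 - shift

bits = 0x00000001                          # smallest positive single denormal
significand, exponent = normalize_single_denormal(bits)
print(hex(significand), exponent)          # prints "0x800000 -149"

# Cross-check against the value the host decodes from the same encoding.
(value,) = struct.unpack("<f", struct.pack("<I", bits))
assert value == math.ldexp(significand, exponent - 23)
```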
-
- When this exception is not masked, the D-bit is set and the exception
- handler is invoked. The operands are not changed by the instruction and are
- available for inspection by the exception handler.
-
- If an 8087/80287 program uses the denormal exception to automatically
- normalize denormal operands, then that program can run on an 80387 by
- masking the denormal exception. The 8087/80287 denormal exception handler
- would not be used by the 80387 in this case. A numerics program runs faster
- when the 80387 performs normalization of denormal operands. A program can
- detect at run-time whether it is running on an 80387 or 8087/80287 and
- disable the denormal exception when an 80387 is used. The following code
- sequence is recommended to distinguish between an 80387 and an 8087/80287.
-
-         FINIT                   ; Use default infinity mode:
-                                 ;   projective for 8087/80287,
-                                 ;   affine for 80387
-         FLD1                    ; Generate infinity
-         FLDZ
-         FDIV                    ;   1/0 = +infinity (zero divide is masked)
-         FLD     ST              ; Form negative infinity
-         FCHS
-         FCOMPP                  ; Compare +infinity with -infinity
-         FSTSW   temp            ; 8087/80287 will say they are equal
-         MOV     AX, temp
-         SAHF                    ; C{3} is copied into ZF
-         JNZ     Using_80387     ; Not equal, so this is an 80387
-
- The denormal-operand exception of the 80387 permits emulation of arithmetic
- on unnormal operands as provided by the 8087/80287. The standard does not
- require the denormal exception nor does it recognize the unnormal data type.
-
-
- 3.2.5 Numeric Overflow and Underflow
-
- If the exponent of a numeric result is too large for the destination real
- format, the 80387 signals a numeric overflow. Conversely, if the exponent of
- a result is too small to be represented in the destination format, a numeric
- underflow is signaled. If either of these exceptions occurs, the result of
- the operation is outside the range of the destination real format.
-
- Typical algorithms are most likely to produce extremely large and small
- numbers in the calculation of intermediate, rather than final, results.
- Because of the great range of the extended-precision format (recommended as
- the destination format for intermediates), overflow and underflow are
- relatively rare events in most 80387 applications.
-
-
- 3.2.5.1 Overflow
-
- The overflow exception can occur whenever the rounded true result would
- exceed in magnitude the largest finite number in the destination format. The
- exception can occur in the execution of most of the arithmetic instructions
- and in some of the conversion instructions; namely, FST(P), F(I)ADD(P),
- F(I)SUB(R)(P), F(I)MUL(P), FDIV(R)(P), FSCALE, FYL2X, and FYL2XP1.
-
- The response to an overflow condition depends on whether the overflow
- exception is masked:
-
- ■ Overflow exception masked. The value returned depends on the rounding
- mode as Table 3-11 illustrates.
-
- ■ Overflow exception not masked. The unmasked response depends on
- whether the instruction is supposed to store the result on the stack
- or in memory:
-
- ── Destination is the stack. The true result is divided by 2^(24,576)
- and rounded. (The bias 24,576 is equal to 3 * 2^(13).) The
- significand is rounded to the appropriate precision (according to
- the precision control (PC) bits of the control word, for those
- instructions controlled by PC, otherwise to extended precision).
- The roundup bit (C{1}) of the status word is set if the
- significand was rounded upward.
-
- The biasing of the exponent by 24,576 normally translates the
- number as nearly as possible to the middle of the exponent range
- so that, if desired, it can be used in subsequent scaled
- operations with less risk of causing further exceptions. With the
- instruction FSCALE, however, it can happen that the result is too
- large and overflows even after biasing. In this case, the unmasked
- response is exactly the same as the masked round-to-nearest
- response, namely ± infinity. The intention of this feature is to
- ensure the trap handler will discover that a translation of the
- exponent by -24,576 would not work correctly without obliging the
- programmer of Decimal-to-Binary or Exponential functions to
- determine which trap handler, if any, should be invoked.
-
- ── Destination is memory (this can occur only with the store
- instructions). No result is stored in memory. Instead, the operand
- is left intact in the stack. Because the data in the stack is in
- extended-precision format, the exception handler has the option
- either of reexecuting the store instruction after proper
- adjustment of the operand or of rounding the significand on the
- stack to the destination's precision as the standard requires. The
- exception handler should ultimately store a value into the
- destination location in memory if the program is to continue.
-
-
- Table 3-11. Masked Overflow Results
-
- Rounding       Sign of
- Mode           True Result    Result
-
- To nearest     +              +∞
-                -              -∞
- Toward -∞      +              Largest finite positive number
-                -              -∞
- Toward +∞      +              +∞
-                -              Largest finite negative number
- Toward zero    +              Largest finite positive number
-                -              Largest finite negative number
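-
- The lookup in Table 3-11 can be sketched as a small Python model. This is
- an illustration only, not 80387 code: the function name, the mode strings,
- and the use of the host's largest double as a stand-in for the largest
- finite number of the destination format are assumptions of the example.
-

```python
import math
import sys

# Hypothetical stand-in: the largest finite number of the destination format.
LARGEST_FINITE = sys.float_info.max


def masked_overflow_result(rounding_mode, negative, largest=LARGEST_FINITE):
    """Return the masked-overflow result listed in Table 3-11."""
    infinity = -math.inf if negative else math.inf
    finite = -largest if negative else largest
    if rounding_mode == "nearest":
        return infinity
    if rounding_mode == "down":      # toward -infinity
        return infinity if negative else finite
    if rounding_mode == "up":        # toward +infinity
        return finite if negative else infinity
    if rounding_mode == "chop":      # toward zero
        return finite
    raise ValueError("unknown rounding mode: %r" % (rounding_mode,))
```

-
- In this model only round-to-nearest returns an infinity for both signs;
- each directed mode returns a finite value on the side it rounds away from.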
-
-
- 3.2.5.2 Underflow
-
- Underflow can occur in the execution of the instructions FST(P), FADD(P),
- FSUB(RP), FMUL(P), F(I)DIV(RP), FSCALE, FPREM(1), FPTAN, FSIN, FCOS,
- FSINCOS, FPATAN, F2XM1, FYL2X, and FYL2XP1.
-
- Two related events contribute to underflow:
-
- 1. Creation of a tiny result which, because it is so small, may cause
- some other exception later (such as overflow upon division).
-
- 2. Creation of an inexact result; i.e. the delivered result differs from
- what would have been computed were both the exponent range and
- precision unbounded.
-
- Which of these events triggers the underflow exception depends on whether
- the underflow exception is masked:
-
- 1. Underflow exception masked. The underflow exception is signaled when
- the result is both tiny and inexact.
-
- 2. Underflow exception not masked. The underflow exception is signaled
- when the result is tiny, regardless of inexactness.
-
- The response to an underflow exception also depends on whether the
- exception is masked:
-
- 1. Masked response. The result is denormal or zero. The precision
- exception is also triggered.
-
- 2. Unmasked response. The unmasked response depends on whether the
- instruction is supposed to store the result on the stack or in memory:
-
- ■ Destination is the stack. The true result is multiplied by
- 2^(24,576) and rounded. (The bias 24,576 is equal to 3 * 2^(13).)
- The significand is rounded to the appropriate precision (according
- to the precision control (PC) bit of the control word, for those
- instructions controlled by PC, otherwise to extended precision).
- The roundup bit (C{1}) of the status word is set if the significand
- was rounded upward.
-
- The biasing of the exponent by 24,576 normally translates the
- number as nearly as possible to the middle of the exponent range so
- that, if desired, it can be used in subsequent scaled operations
- with less risk of causing further exceptions. With the instruction
- FSCALE, however, it can happen that the result is too tiny and
- underflows even after biasing. In this case, the unmasked response
- is exactly the same as the masked round-to-nearest response, namely
- ±0. The intention of this feature is to ensure the trap handler
- will discover that a translation by +24,576 would not work correctly
- without obliging the programmer of Decimal-to-Binary or Exponential
- functions to determine which trap handler, if any, should be
- invoked.
-
- ■ Destination is memory (this can occur only with the store
- instructions). No result is stored in memory. Instead, the operand
- is left intact in the stack. Because the data in the stack is in
- extended-precision format, the exception handler has the option
- either of reexecuting the store instruction after proper adjustment
- of the operand or of rounding the significand on the stack to the
- destination's precision as the standard requires. The exception
- handler should ultimately store a value into the destination
- location in memory if the program is to continue.
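-
- The exponent adjustment used by the unmasked stack responses for overflow
- and underflow can be sketched in Python. This is a rough model, not 80387
- code: the function names are hypothetical, the limits -16382 and +16383 are
- assumed to be the unbiased exponent range of normal extended-precision
- numbers, and significand rounding and denormals are ignored.
-

```python
EXPONENT_ADJUST = 3 * 2**13       # 24,576, the bias quoted in the text
E_MIN, E_MAX = -16382, 16383      # assumed normal range of extended real


def unmasked_stack_exponent(true_exponent):
    """Exponent delivered to the stack, or None if it is still out of range."""
    if true_exponent > E_MAX:            # overflow: divide by 2^24,576
        adjusted = true_exponent - EXPONENT_ADJUST
    elif true_exponent < E_MIN:          # underflow: multiply by 2^24,576
        adjusted = true_exponent + EXPONENT_ADJUST
    else:
        return true_exponent             # neither exception applies
    if E_MIN <= adjusted <= E_MAX:
        return adjusted
    return None   # FSCALE case: the masked response (infinity or zero) is delivered


def recover_true_exponent(delivered, overflowed):
    """What a handler would compute to undo the adjustment."""
    if overflowed:
        return delivered + EXPONENT_ADJUST
    return delivered - EXPONENT_ADJUST
```

-
- Under these assumptions a true exponent of 16,384 is delivered as -8,192
- and one of 32,767 as 8,191, so an overflowed product of two in-range
- operands lands near the middle of the exponent range.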
-
-
- 3.2.6 Inexact (Precision)
-
- This exception condition occurs if the result of an operation is not
- exactly representable in the destination format. For example, the fraction
- 1/3 cannot be precisely represented in binary form. This exception occurs
- frequently and indicates that some (generally acceptable) accuracy has been
- lost.
-
- All the transcendental instructions are inexact by definition; they always
- cause the inexact exception.
-
- The C{1} (roundup) bit of the status word indicates whether the inexact
- result was rounded up (C{1} = 1) or chopped (C{1} = 0).
-
- The inexact exception accompanies the underflow exception when there is
- also a loss of accuracy. When underflow is masked, the underflow exception
- is signaled only when there is a loss of accuracy; therefore the precision
- flag is always set as well. When underflow is unmasked, there may or may not
- have been a loss of accuracy; the precision bit indicates which is the case.
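-
- The interplay between the underflow and precision flags can be written as
- a short Python sketch. The function name and the one-letter flag names
- ("U" for underflow, "P" for precision) are assumptions of the example; it
- models only which flags are signaled, not the result that is delivered.
-

```python
def flags_for_result(tiny, inexact, underflow_masked):
    """Return the set of flags signaled for a result that may be tiny or inexact."""
    flags = set()
    if inexact:
        flags.add("P")
    # Masked: underflow needs tiny AND inexact. Unmasked: tiny alone suffices.
    if tiny and (inexact or not underflow_masked):
        flags.add("U")
    return flags
```

-
- In the masked case "U" never appears without "P"; in the unmasked case the
- presence or absence of "P" tells the handler whether accuracy was lost.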
-
- This exception is provided for applications that need to perform exact
- arithmetic only. Most applications will mask this exception. The 80387
- delivers the rounded or over/underflowed result to the destination,
- regardless of whether a trap occurs.
-
-
- 3.2.7 Exception Priority
-
- The 80387 deals with exceptions according to a predetermined precedence.
- Precedence in exception handling means that higher-priority exceptions are
- flagged and results are delivered according to the requirements of that
- exception. Lower-priority exceptions may not be flagged even if they occur.
- For example, dividing an SNaN by zero causes an invalid-operand exception
- (due to the SNaN) and not a zero-divide exception; the masked result is the
- QNaN real indefinite, not ∞. A denormal or inexact (precision) exception,
- however, can accompany a numeric underflow or overflow exception.
-
- The exception precedence is as follows:
-
- 1. Invalid operation exception, subdivided as follows:
-
- a. Stack underflow.
- b. Stack overflow.
- c. Operand of unsupported format.
- d. SNaN operand.
-
- 2. QNaN operand. Though this is not an exception, if one operand is a
- QNaN, dealing with it has precedence over lower-priority exceptions.
- For example, a QNaN divided by zero results in a QNaN, not a
- zero-divide exception.
-
- 3. Any other invalid-operation exception not mentioned above or zero
- divide.
-
- 4. Denormal operand. If masked, then instruction execution continues,
- and a lower-priority exception can occur as well.
-
- 5. Numeric overflow and underflow. Inexact result (precision) can be
- flagged as well.
-
- 6. Inexact result (precision).
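-
- The precedence list can be sketched as an ordered lookup in Python. The
- condition names and the function are hypothetical, and the sketch returns
- only the highest-priority condition; it does not model the cases in which
- a denormal or inexact exception accompanies a lower-priority one.
-

```python
# Highest priority first. Entries of level 3 share a level; their relative
# order here is arbitrary.
PRECEDENCE = [
    "stack underflow",          # 1a
    "stack overflow",           # 1b
    "unsupported format",       # 1c
    "SNaN operand",             # 1d
    "QNaN operand",             # 2 (not an exception, but takes precedence)
    "other invalid operation",  # 3
    "zero divide",              # 3
    "denormal operand",         # 4
    "overflow",                 # 5
    "underflow",                # 5
    "inexact",                  # 6
]


def primary_condition(detected):
    """Return the highest-priority condition among those detected, or None."""
    for name in PRECEDENCE:
        if name in detected:
            return name
    return None
```

-
- For example, an SNaN divided by zero yields "SNaN operand" in this model,
- matching the invalid-operation outcome described above.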
-
-
- 3.2.8 Standard Underflow/Overflow Exception Handler
-
- As long as the underflow and overflow exceptions are masked, no additional
- software is required to cause the output of the 80387 to conform to the
- requirements of IEEE Std 754. When unmasked, these exceptions give the
- exception handler an additional option in the case of store instructions. No
- result is stored in memory; instead, the operand is left intact on the
- stack. The handler may round the significand of the operand on the stack to
- the destination's precision as the standard requires, or it may adjust the
- operand and reexecute the faulting instruction.
-
-
-
- Chapter 4 The 80387 Instruction Set
-
- ───────────────────────────────────────────────────────────────────────────
-
- This chapter describes the operation of all 80387 instructions. Within this
- chapter, the instructions are divided into six functional classes:
-
- ■ Data Transfer instructions
- ■ Nontranscendental instructions
- ■ Comparison instructions
- ■ Transcendental instructions
- ■ Constant instructions
- ■ Processor Control instructions
-
- Throughout this chapter, the instruction set is described as it appears to
- the ASM386 programmer who is coding a program. Not included in this chapter
- are details of instruction format, encoding, and execution times. This
- detailed information may be found in Appendix A. Refer also to Appendix B
- for a summary of the exceptions caused by each instruction.
-
-
- 4.1 Compatibility With the 80287 and 8087
-
- The instruction set for the 80387 NPX is largely the same as that for the
- 80287 NPX (used with 80286 systems) and that for the 8087 NPX (used with
- 8086 and 8088 systems). Most object programs generated for the 80287 or 8087
- will execute without change on the 80387. Several instructions are new to
- the 80387, and several 80287 and 8087 instructions perform no useful
- function on the 80387. Appendix C and Appendix D give details of these
- instruction set differences.
-
-
- 4.2 Numeric Operands
-
- The typical NPX instruction accepts one or two operands as inputs, operates
- on these, and produces a result as an output. An operand is most often the
- contents of a register or of a memory location. The operands of some
- instructions are predefined; for example, FSQRT always takes the square root
- of the number in the top NPX stack element. Others allow, or require, the
- programmer to explicitly code the operand(s) along with the instruction
- mnemonic. Still others accept one explicit operand and one implicit
- operand, which is usually the top NPX stack element. All 80387 instructions
- that have a data operand use ST as one operand or as the only operand.
-
- Whether supplied by the programmer or utilized automatically, the two basic
- types of operands are sources and destinations. A source operand simply
- supplies one of the inputs to an instruction; it is not altered by the
- instruction. Even when an instruction converts the source operand from one
- format to another (e.g., real to integer), the conversion is actually
- performed in an internal work area to avoid altering the source operand. A
- destination operand may also provide an input to an instruction. It is
- distinguished from a source operand, however, because its content may be
- altered when it receives the result produced by the operation; that is, the
- destination is replaced by the result.
-
- Many instructions allow their operands to be coded in more than one way.
- For example, FADD (add real) may be written without operands, with only a
- source, or with a destination and a source. The instruction descriptions in
- this section employ the simple convention of separating alternative operand
- forms with slashes; the slashes, however, are not coded. Consecutive slashes
- indicate an option of no explicit operands. The operands for FADD are thus
- described as
-
- //source/destination, source
-
- This means that FADD may be written in any of three ways:
-
- Written Form                 Action
-
- FADD                         Add ST to ST(1), put result in ST(1), then pop ST
- FADD source                  Add source to ST(0)
- FADD destination, source     Add source to destination
-
- The assembler can allow the same instruction to be specified in different
- ways; for example:
-
- FADD          =   FADDP ST(1), ST
- FADD ST(1)    =   FADD ST, ST(1)
-
- When reading this section, it is important to bear in mind that memory
- operands may be coded with any of the CPU's memory addressing methods
- provided by the ModR/M byte. To review these methods (BASE + (INDEX * SCALE)
- + DISPLACEMENT) refer to the 80386 Programmer's Reference Manual.
- Chapter 5 also provides several addressing mode examples.
-
-
- 4.3 Data Transfer Instructions
-
- These instructions (summarized in Table 4-1) move operands among elements
- of the register stack, and between the stack top and memory. Any of the
- seven data types can be converted to extended real and loaded (pushed) onto
- the stack in a single operation; they can be stored to memory in the same
- manner. The data transfer instructions automatically update the 80387 tag
- word to reflect whether the register is empty or full following the
- instruction.
-
- Table 4-1. Data Transfer Instructions
-
- Real Transfers
-    FLD      Load real
-    FST      Store real
-    FSTP     Store real and pop
-    FXCH     Exchange registers
-
- Integer Transfers
-    FILD     Integer load
-    FIST     Integer store
-    FISTP    Integer store and pop
-
- Packed Decimal Transfers
-    FBLD     Packed decimal (BCD) load
-    FBSTP    Packed decimal (BCD) store and pop
-
-
- 4.3.1 FLD source
-
- FLD (load real) loads (pushes) the source operand onto the top of the
- register stack. This is done by decrementing the stack pointer by one and
- then copying the content of the source to the new stack top. ST(7) must be
- empty to avoid causing an invalid-operation exception. The new stack top is
- tagged nonempty. The source may be a register on the stack (ST(i)) or any of
- the real data types in memory. If the source is a register, the register
- number used is that before TOP is decremented by the instruction. Coding FLD
- ST(0) duplicates the stack top. Single and double real source operands are
- converted to extended real automatically. Loading an extended real operand
- does not require conversion; therefore, the I and D exceptions do not occur
- in this case.
-
-
- 4.3.2 FST destination
-
- FST (store real) copies the NPX stack top to the destination, which
- may be another register on the stack or a single or double (but not
- extended-precision) memory operand. If the destination is single or double
- real, the copy of the significand is rounded to the width of the destination
- according to the RC field of the control word, and the copy of the exponent
- is converted to the width and bias of the destination format. The
- over/underflow condition is checked for as well.
-
- If, however, the stack top contains zero, ±∞, or a NaN, then the stack
- top's significand is not rounded but is chopped (on the right) to fit the
- destination. The exponent is not converted either; rather, it also is chopped on
- the right and transferred "as is". This preserves the value's identification
- as ∞ or a NaN (exponent all ones) so that it can be properly loaded and used
- later in the program if desired.
-
- Note that the 80387 does not signal the invalid-operation exception when
- the destination is a nonempty stack element.
-
-
- 4.3.3 FSTP destination
-
- FSTP (store real and pop) operates identically to FST except that the NPX
- stack is popped following the transfer. This is done by tagging the top
- stack element empty and then incrementing TOP. FSTP also permits storing to
- an extended-precision real memory variable, whereas FST does not. If the
- source operand is a register, the register number used is that before TOP is
- incremented by the instruction. Coding FSTP ST(0) is equivalent to popping
- the stack with no data transfer.
-
-
- 4.3.4 FXCH //destination
-
- FXCH (exchange registers) swaps the contents of the destination and the
- stack top registers. If the destination is not coded explicitly, ST(1) is
- used. Many 80387 instructions operate only on the stack top; FXCH provides a
- simple means of effectively using these instructions on lower stack
- elements. For example, the following sequence takes the square root of the
- third register from the top (assuming that ST is nonempty):
-
- FXCH ST(3)
- FSQRT
- FXCH ST(3)
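-
- The stack behavior of FLD, FSTP, and FXCH described in this section can be
- sketched with a toy Python model. The class and method names are
- hypothetical, None stands for the "empty" tag, and Python exceptions stand
- in for an unmasked invalid-operation exception; no data conversion or
- rounding is modelled.
-

```python
class NpxStack:
    """Toy model of the eight-register stack; None models the 'empty' tag."""

    def __init__(self):
        self.regs = [None] * 8
        self.top = 0

    def _phys(self, i):
        # Physical register number of ST(i), relative to the current TOP.
        return (self.top + i) % 8

    def st(self, i=0):
        value = self.regs[self._phys(i)]
        if value is None:
            raise LookupError("invalid operation: ST(%d) is empty" % i)
        return value

    def fld(self, value):
        # ST(7) is the register that becomes the new stack top.
        if self.regs[self._phys(7)] is not None:
            raise OverflowError("invalid operation: ST(7) is not empty")
        self.top = (self.top - 1) % 8
        self.regs[self.top] = value

    def fld_st(self, i):
        # ST(i) is resolved before TOP is decremented.
        self.fld(self.st(i))

    def fstp_st(self, i):
        # ST(i) is resolved before TOP is incremented.
        self.regs[self._phys(i)] = self.st(0)
        self.regs[self.top] = None
        self.top = (self.top + 1) % 8

    def fxch(self, i=1):
        self.st(0), self.st(i)          # raises if either register is empty
        a, b = self._phys(0), self._phys(i)
        self.regs[a], self.regs[b] = self.regs[b], self.regs[a]
```

-
- In this model fld_st(0) duplicates the stack top and fstp_st(0) pops the
- stack with no visible data transfer, as the text describes for FLD ST(0)
- and FSTP ST(0).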
-
-
- 4.3.5 FILD source
-
- FILD (integer load) converts the source memory operand from its binary
- integer format (word, short, or long) to extended real and pushes the result
- onto the NPX stack. ST(7) must be empty to avoid causing an exception. The
- (new) stack top is tagged nonempty. FILD is an exact operation; the source
- is loaded with no rounding error.
-
-
- 4.3.6 FIST destination
-
- FIST (integer store) rounds the content of the stack top to an integer
- according to the RC field (rounding control) of the control word and
- transfers the result to the destination, leaving the stack top unchanged.
- The destination may define a word or short integer variable. Negative zero
- is stored in the same encoding as positive zero: 0000...00.
-
-
- 4.3.7 FISTP destination
-
- FISTP (integer store and pop) operates like FIST except that it also pops
- the NPX stack following the transfer. The destination may be any of the
- binary integer data types.
-
-
- 4.3.8 FBLD source
-
- FBLD (packed decimal (BCD) load) converts the content of the source operand
- from packed decimal to extended real and pushes the result onto the NPX
- stack. ST(7) must be empty to avoid causing an exception. The sign of the
- source is preserved, including the case where the value is negative zero.
- FBLD is an exact operation; the source is loaded with no rounding error.
-
- The packed decimal digits of the source are assumed to be in the range 0-9.
- The instruction does not check for invalid digits (A-FH), and the result of
- attempting to load an invalid encoding is undefined.
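-
- The conversion performed by FBLD can be sketched in Python. The sketch
- assumes the packed decimal layout defined elsewhere in this manual: ten
- bytes, with eighteen digits packed two per byte in bytes 0 through 8
- (least significant first) and the sign in bit 7 of byte 9. The function
- name is hypothetical, and raising an error for an invalid digit is a
- choice of the model; the 80387 itself performs no such check.
-

```python
def decode_packed_bcd(data):
    """Return (negative, magnitude) for a 10-byte packed decimal operand."""
    if len(data) != 10:
        raise ValueError("packed decimal operands are 10 bytes long")
    negative = bool(data[9] & 0x80)
    magnitude = 0
    for byte in reversed(data[:9]):      # most significant digits first
        high, low = byte >> 4, byte & 0x0F
        if high > 9 or low > 9:
            # The hardware result is undefined; the model refuses to guess.
            raise ValueError("invalid BCD digit")
        magnitude = magnitude * 100 + high * 10 + low
    return negative, magnitude
```

-
- Returning the sign separately from the magnitude lets the model keep the
- distinction between positive and negative zero that FBLD preserves.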
-
-
- 4.3.9 FBSTP destination
-
- FBSTP (packed decimal (BCD) store and pop) converts the content of the
- stack top to a packed decimal integer, stores the result at the destination
- in memory, and pops the stack. FBSTP rounds a nonintegral value according to
- the RC (rounding control) field of the control word.
-
-
- 4.4 Nontranscendental Instructions
-
- The 80387's nontranscendental instruction set (Table 4-2) provides a wealth
- of variations on the basic add, subtract, multiply, and divide operations,
- and a number of other useful functions. These range from a simple absolute
- value to a square root instruction that executes faster than ordinary
- division; 80387 programmers no longer need to spend valuable time
- eliminating square roots from algorithms because they run too slowly. Other
- nontranscendental instructions perform exact modulo division, round real
- numbers to integers, and scale values by powers of two.
-
- The 80387's basic nontranscendental instructions (addition, subtraction,
- multiplication, and division) are designed to encourage the development of
- very efficient algorithms. In particular, they allow the programmer to
- reference memory as easily as the NPX register stack.
-
- Table 4-3 summarizes the available operation/operand forms that are
- provided for basic arithmetic. In addition to the four normal operations,
- two "reversed" instructions make subtraction and division "symmetrical" like
- addition and multiplication. The variety of instruction and operand forms
- give the programmer unusual flexibility:
-
- ■ Operands may be located in registers or memory.
-
- ■ Results may be deposited in a choice of registers.
-
- ■ Operands may be a variety of NPX data types: extended real, double
- real, single real, short integer or word integer, with automatic
- conversion to extended real performed by the 80387.
-
- Five basic instruction forms may be used across all six operations, as
- shown in Table 4-3. The classical stack form may be used to make the 80387
- operate like a classical stack machine. No operands are coded in this form,
- only the instruction mnemonic. The NPX picks the source operand from the
- stack top and the destination from the next stack element. It then pops the
- stack, performs the operation, and returns the result to the new stack top,
- effectively replacing the operands by the result.
-
- The register form is a generalization of the classical stack form; the
- programmer specifies the stack top as one operand and any register on the
- stack as the other operand. Coding the stack top as the destination provides
- a convenient way to access a constant, held elsewhere in the stack, from the
- stack top. The destination need not always be ST, however. All two operand
- instructions allow use of another register as the destination. This coding
- (ST is the source operand) allows, for example, adding the stack top into a
- register used as an accumulator.
-
- Often the operand in the stack top is needed for one operation but then is
- of no further use in the computation. The register pop form can be used to
- pick up the stack top as the source operand, and then discard it by popping
- the stack. Coding operands of ST(1), ST with a register pop mnemonic is
- equivalent to a classical stack operation: the top is popped and the result
- is left at the new top.
-
- The two memory forms increase the flexibility of the 80387's
- nontranscendental instructions. They permit a real number or a binary
- integer in memory to be used directly as a source operand. This is useful in
- situations where operands are not used frequently enough to justify holding
- them in registers. Note that any memory addressing method may be used to
- define these operands, so they may be elements in arrays, structures, or
- other data organizations, as well as simple scalars.
-
- The six basic operations are discussed further in the next paragraphs, and
- descriptions of the remaining eight operations follow.
-
-
- Table 4-2. Nontranscendental Instructions
-
- Addition
-    FADD       Add real
-    FADDP      Add real and pop
-    FIADD      Integer add
-
- Subtraction
-    FSUB       Subtract real
-    FSUBP      Subtract real and pop
-    FISUB      Integer subtract
-    FSUBR      Subtract real reversed
-    FSUBRP     Subtract real reversed and pop
-    FISUBR     Integer subtract reversed
-
- Multiplication
-    FMUL       Multiply real
-    FMULP      Multiply real and pop
-    FIMUL      Integer multiply
-
- Division
-    FDIV       Divide real
-    FDIVP      Divide real and pop
-    FIDIV      Integer divide
-    FDIVR      Divide real reversed
-    FDIVRP     Divide real reversed and pop
-    FIDIVR     Integer divide reversed
-
- Other Operations
-    FSQRT      Square root
-    FSCALE     Scale
-    FPREM      Partial remainder
-    FPREM1     IEEE standard partial remainder
-    FRNDINT    Round to integer
-    FXTRACT    Extract exponent and significand
-    FABS       Absolute value
-    FCHS       Change sign
-
-
- Table 4-3. Basic Nontranscendental Instructions and Operands
-
-                             Mnemonic  Operand Forms
- Instruction Form            Form      destination, source      ASM386 Example
-
- Classical stack             Fop       [ST(1), ST]              FADD
- Classical stack, extra pop  FopP      [ST(1), ST]              FADDP
- Register                    Fop       ST(i), ST or ST, ST(i)   FSUB ST, ST(3)
- Register pop                FopP      ST(i), ST                FMULP ST(2), ST
- Real memory                 Fop       [ST,] single/double      FDIV AZIMUTH
- Integer memory              FIop      [ST,] word-integer/      FIDIV PULSES
-                                       short-integer
-
- ───────────────────────────────────────────────────────────────────────────
- NOTES
- Brackets ([]) surround implicit operands; these are not coded, and are
- shown here for information only.
-
- op=  ADD     destination ← destination + source
-      SUB     destination ← destination - source
-      SUBR    destination ← source - destination
-      MUL     destination ← destination * source
-      DIV     destination ← destination ÷ source
-      DIVR    destination ← source ÷ destination
- ───────────────────────────────────────────────────────────────────────────
-
-
- 4.4.1 Addition
-
- FADD //source/destination,source
- FADDP //destination,source
- FIADD source
-
- The addition instructions (add real, add real and pop, integer add) add the
- source and destination operands and return the sum to the destination. The
- operand at the stack top may be doubled by coding:
-
- FADD ST, ST(0)
-
- If the source operand is in memory, conversion of an integer, a single
- real, or a double real operand to extended real is performed automatically.
-
-
- 4.4.2 Normal Subtraction
-
- FSUB //source/destination,source
- FSUBP //destination,source
- FISUB source
-
- The normal subtraction instructions (subtract real, subtract real and pop,
- integer subtract) subtract the source operand from the destination and
- return the difference to the destination.
-
-
- 4.4.3 Reversed Subtraction
-
- FSUBR //source/destination,source
- FSUBRP //destination,source
- FISUBR source
-
- The reversed subtraction instructions (subtract real reversed, subtract
- real reversed and pop, integer subtract reversed) subtract the destination
- from the source and return the difference to the destination. For example,
- FSUBR ST, ST(1) means subtract ST from ST(1) and leave the result in ST.
-
-
- 4.4.4 Multiplication
-
- FMUL //source/destination,source
- FMULP //destination,source
- FIMUL source
-
- The multiplication instructions (multiply real, multiply real and pop,
- integer multiply) multiply the source and destination operands and return
- the product to the destination. Coding FMUL ST, ST(0) squares the content of
- the stack top.
-
-
- 4.4.5 Normal Division
-
- FDIV //source/destination,source
- FDIVP //destination,source
- FIDIV source
-
- The normal division instructions (divide real, divide real and pop, integer
- divide) divide the destination by the source and return the quotient to the
- destination.
-
-
- 4.4.6 Reversed Division
-
- FDIVR //source/destination,source
- FDIVRP //destination,source
- FIDIVR source
-
- The reversed division instructions (divide real reversed, divide real
- reversed and pop, integer divide reversed) divide the source operand by the
- destination and return the quotient to the destination.
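-
- The operand roles of the six basic operations in sections 4.4.1 through
- 4.4.6 can be summarized in a short Python sketch. The table name and the
- helper are hypothetical; ordinary Python floats stand in for extended real
- values, and Python raises an error on division by zero where the 80387
- would signal a zero-divide exception.
-

```python
# Each entry computes the new value of the destination operand.
BASIC_OPS = {
    "ADD":  lambda dest, src: dest + src,
    "SUB":  lambda dest, src: dest - src,
    "SUBR": lambda dest, src: src - dest,   # reversed subtraction
    "MUL":  lambda dest, src: dest * src,
    "DIV":  lambda dest, src: dest / src,
    "DIVR": lambda dest, src: src / dest,   # reversed division
}


def apply_basic_op(op, dest, src):
    """Return the value left in the destination by the named operation."""
    return BASIC_OPS[op](dest, src)
```

-
- For instance, with 2.0 in the destination and 5.0 in the source, SUB
- leaves -3.0 in the destination and SUBR leaves 3.0.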
-
-
- 4.4.7 FSQRT
-
- FSQRT (square root) replaces the content of the top stack element with its
- square root. (Note: The square root of -0 is defined to be -0.)
-
-
- 4.4.8 FSCALE
-
- FSCALE (scale) interprets the value contained in ST(1) as an integer and
- adds this value to the exponent of the number in ST. This is equivalent to
-
- ST ← ST * 2^(ST(1))
-
- Thus, FSCALE provides rapid multiplication or division by integral powers
- of 2. It is particularly useful for scaling the elements of a vector.
-
- There is no limit on the range of the scale factor in ST(1). If the value
- is not integral, FSCALE uses the nearest integer smaller in magnitude; i.e.,
- it chops the value toward 0. If the resulting integer is zero, the value in
- ST is not changed.
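-
- A minimal Python sketch of the FSCALE computation follows. The function
- name is hypothetical, the operands are assumed to be finite, and host
- doubles stand in for extended real values, so the range is much narrower
- than on the 80387 (math.ldexp raises OverflowError where the 80387 would
- signal numeric overflow).
-

```python
import math


def fscale(st, st1):
    """Model of ST <- ST * 2^(ST(1)), with ST(1) chopped toward zero."""
    scale = math.trunc(st1)          # nearest integer smaller in magnitude
    return math.ldexp(st, scale)     # st * 2**scale, computed exactly
```

-
- With these definitions fscale(3.0, 2.9) gives 12.0, and fscale(5.0, 0.7)
- leaves the value 5.0 unchanged because the scale factor chops to zero.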
-
-
- 4.4.9 FPREM ── Partial Remainder (80287/8087-Compatible)
-
- FPREM computes the remainder of division of ST by ST(1) and leaves the
- result in ST. FPREM finds a remainder REM and a quotient Q such that
-
- REM = ST - ST(1)*Q
-
- The quotient Q is chosen to be the integer obtained by chopping the exact
- value of ST/ST(1) toward zero. The sign of the remainder is the same as the
- sign of the original dividend from ST.
-
- By ignoring precision control, the 80387 produces an exact result with
- FPREM. The precision (inexact) exception does not occur and the rounding
- control has no effect.
-
- The FPREM instruction is not the remainder operation specified in the IEEE
- standard. To get that remainder, the FPREM1 instruction should be used.
-
- The FPREM instruction is designed to be executed iteratively in a
- software-controlled loop. It operates by performing successive scaled
- subtractions; therefore, obtaining the exact remainder when the operands
- differ greatly in magnitude can consume large amounts of execution time.
- Because the 80387 can only be preempted between instructions, the remainder
- function could seriously increase interrupt latency in these cases. For
- this reason, the maximum number of iterations is limited. The instruction
- may terminate before it has completed the calculation. The C2
- bit of the status word indicates whether the calculation is complete or
- whether the instruction must be executed again.
-
- FPREM can reduce the exponent of ST by up to (but not including) 64 in one
- execution. If FPREM produces a remainder that is less than the modulus
- (i.e., the divisor), the function is complete and bit C2 of the status word
- condition code is cleared. If the function is incomplete, C2 is set to 1;
- the result in ST is then called the partial remainder. Software can inspect
- C2 by storing the status word following execution of FPREM, and can
- reexecute the instruction (using the partial remainder in ST as the
- dividend) until C2 is cleared. A higher priority interrupting routine that
- needs the 80387 can force a context switch between the instructions in the
- remainder loop.
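-
- The software-controlled loop can be sketched in Python. This is a model
- under stated assumptions, not the 80387 algorithm: the function names are
- hypothetical, host doubles stand in for extended real values, the divisor
- is assumed finite and nonzero, and an incomplete execution is assumed to
- reduce the exponent difference to exactly 63. Exact rational arithmetic is
- used so that, as with FPREM, no rounding error is introduced.
-

```python
import math
from fractions import Fraction


def fprem_step(st, st1, max_reduction=63):
    """Model one execution of FPREM; return (new_st, c2)."""
    if st == 0:
        return st, 0
    # Difference between the binary exponents of dividend and divisor.
    d = math.frexp(st)[1] - math.frexp(st1)[1]
    if d < 64:
        return math.fmod(st, st1), 0            # complete: C2 cleared
    shift = d - max_reduction
    x, y = Fraction(st), Fraction(st1)
    q = math.trunc(x / y / 2**shift)            # scaled quotient, chopped
    return float(x - y * q * 2**shift), 1       # partial remainder: C2 set


def fprem_loop(st, st1):
    """Reexecute fprem_step until C2 is cleared; return (remainder, executions)."""
    executions = 0
    while True:
        st, c2 = fprem_step(st, st1)
        executions += 1
        if c2 == 0:
            return st, executions
```

-
- In this model reducing 1e30 modulo 3.0 takes two executions, and the final
- value agrees with the exact remainder returned by math.fmod.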
-
- An important use for FPREM is to reduce arguments (operands) of
- transcendental functions to the range permitted by these instructions. For
- example, the FPTAN (tangent) instruction requires its argument ST to be less
- than 2^(63). For π/4 < │ST│ < 2^(63), FPTAN (as well as the other
- trigonometric instructions) performs an internal reduction of ST to a value
- less than π/4 using an internally stored π/4 divisor that has 67 significant
- bits. Because of its greater accuracy, this method of reduction is
- recommended when the argument is within the required range.
-
- However, when │ST│ ≥ 2^(63), FPREM can be employed to reduce ST. With π/4 as
- a modulus, FPREM can reduce an argument so that it is within range of FPTAN
- and so that no further reduction is required by FPTAN.
-
- Because FPREM produces an exact result, the argument reduction does not
- introduce roundoff error into the calculation, even if several iterations
- are required to bring the argument into range. However, no machine
- representation of π is exact. The rounding of π, when it is used by FPREM
- to reduce an argument for a periodic trigonometric function, does not
- create the effect of a rounded argument, but of a rounded period.
-
- When reduction is complete, FPREM provides the least-significant three bits
- of the quotient generated by FPREM (in C{3}, C{1}, C{0}). This is also
- important for transcendental argument reduction, because it locates the
- original angle in the correct one of eight π/4 segments of the unit circle
- (see Table 4-4).
-
-
- Table 4-4. Condition Code Interpretation after FPREM and FPREM1
- Instructions
-
-       Condition Code              Interpretation after
- C2(PF)   C3     C1     C0         FPREM and FPREM1
- ───────────────────────────────────────────────────────────────────────────
- 1        X      X      X          Incomplete reduction: further iteration
-                                   required for complete reduction
-
-          Q1     Q0     Q2         Q MOD 8
-
- 0        0      0      0          0  ─┐
- 0        0      1      0          1   │
- 0        1      0      0          2   │  Complete reduction:
- 0        1      1      0          3   ├─ C0, C3, C1 contain the three least
- 0        0      0      1          4   │  significant bits of the quotient
- 0        0      1      1          5   │
- 0        1      0      1          6   │
- 0        1      1      1          7  ─┘
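-
- The bit assignment in Table 4-4 can be sketched in Python. The function
- names are hypothetical, and the quotient is assumed to be supplied as a
- non-negative integer.
-

```python
def quotient_to_condition_codes(quotient):
    """Map the low three quotient bits (Q2 Q1 Q0) onto C0, C3, C1."""
    q = quotient % 8
    return {"C0": (q >> 2) & 1, "C3": (q >> 1) & 1, "C1": q & 1}


def condition_codes_to_q_mod_8(c0, c3, c1):
    """Rebuild Q MOD 8 from the condition codes after a complete reduction."""
    return (c0 << 2) | (c3 << 1) | c1
```

-
- With a modulus of π/4, the value rebuilt by condition_codes_to_q_mod_8
- identifies which of the eight π/4 segments held the original angle.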
-
-
- 4.4.10 FPREM1──Partial Remainder (IEEE Std. 754-Compatible)
-
- FPREM1 computes the remainder of division of ST by ST(1) and leaves the
- result in ST. FPREM1 finds a remainder REM1 and a quotient Q1 such that
-
- REM1 = ST - ST(1)*Q1
-
- The quotient Q1 is chosen to be the integer nearest to the exact value of
- ST/ST(1). When ST/ST(1) is exactly N + 1/2 (for some integer N), there are
- two integers equally close to ST/ST(1). In this case the value chosen for Q1
- is the even integer.
-
- The result produced by FPREM1 is always exact; no rounding is necessary,
- and therefore the precision exception does not occur and the rounding
- control has no effect.
-
- The FPREM1 instruction is designed to be executed iteratively in a
- software-controlled loop. FPREM1 operates by performing successive scaled
- subtractions; therefore, obtaining the exact remainder when the operands
- differ greatly in magnitude can consume large amounts of execution time.
- Because the 80387 can only be preempted between instructions, the remainder
- function could seriously increase interrupt latency in these cases. For
- this reason, the maximum number of iterations is limited. The instruction
- may terminate before it has completed the calculation. The C2
- bit of the status word indicates whether the calculation is complete or
- whether the instruction must be executed again.
-
- FPREM1 can reduce the exponent of ST by up to (but not including) 64 in one
- execution. If FPREM1 produces a remainder that is less than the modulus
- (i.e., the divisor), the function is complete and bit C2 of the status word
- condition code is cleared. If the function is incomplete, C2 is set to 1;
- the result in ST is then called the partial remainder. Software can inspect
- C2 by storing the status word following execution of FPREM1, and can
- reexecute the instruction (using the partial remainder in ST as the
- dividend) until C2 is cleared. When C2 is cleared, FPREM1 also provides the
- three least-significant bits of the quotient (in C3, C1, C0).
-
- The uses for FPREM1 are the same as those for FPREM.
-
- FPREM1 differs from FPREM in these respects:
-
- ■ FPREM and FPREM1 choose the value of the quotient differently; the
- low-order three bits of the quotient as reported in bits C3, C1, C0 of
- the status word may differ by one in some cases.
-
- ■ FPREM and FPREM1 may produce different remainders. FPREM produces a
- remainder R such that 0 ≤ R < │ST(1)│ or -│ST(1)│ < R ≤ 0, depending
- on the sign of the dividend. FPREM1 produces a remainder R1 such that
- -│ST(1)│/2 ≤ R1 ≤ +│ST(1)│/2.
-
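- The difference between the two remainder definitions can be sketched in
- Python. This is an illustration only, not 80387 code: math.fmod stands in
- for the truncated quotient of FPREM, and math.remainder (Python 3.7 or
- later) for the round-to-nearest-even quotient of FPREM1. The helper names
- are hypothetical, and reporting the low-order bits of the quotient's
- magnitude is an assumption of this sketch.
-
```python
import math

def fprem_like(st, st1):
    """Remainder with the quotient truncated toward zero (FPREM's rule)."""
    return math.fmod(st, st1)

def fprem1_like(st, st1):
    """Remainder with the quotient rounded to nearest, ties to even (FPREM1's rule)."""
    return math.remainder(st, st1)

def quotient_low3(st, st1, rem):
    """Low three bits of the integer quotient implied by a given remainder."""
    q = round((st - rem) / st1)   # exact for small operands such as those below
    return abs(q) & 7             # assumption: bits of the quotient's magnitude
```
-
- For ST = 7 and ST(1) = 2 the exact ratio is 3.5, so the truncating rule
- picks a quotient of 3 (remainder +1) while the nearest-even rule picks 4
- (remainder -1); the reported low-order quotient bits therefore differ by
- one.
-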
-
- 4.4.11 FRNDINT
-
- FRNDINT (round to integer) rounds the top stack element to an integer
- according to the RC bits of the control word. For example, assume that ST
- contains the 80387 real number encoding of the decimal value 155.625.
- FRNDINT will change the value to 155 if the RC field of the control word is
- set to down or chop, or to 156 if it is set to up or nearest.
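-
- The effect of the four RC settings on the example value can be sketched in
- Python. The function name and the mode strings are hypothetical; the
- sketch assumes that Python's round(), which resolves halfway cases to the
- even integer, is an adequate stand-in for the round-to-nearest mode.
-
```python
import math

def frndint(x, rc):
    """Round x to an integer under one of the four rounding-control settings."""
    if rc == "nearest":
        return float(round(x))        # halfway cases go to the even integer
    if rc == "down":
        return float(math.floor(x))   # toward minus infinity
    if rc == "up":
        return float(math.ceil(x))    # toward plus infinity
    if rc == "chop":
        return float(math.trunc(x))   # toward zero
    raise ValueError("unknown rounding mode: " + rc)
```
-
- With 155.625 the down and chop modes agree (155), as do up and nearest
- (156); for a negative operand such as -155.625, down and chop give
- different results (-156 and -155).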
-
-
- 4.4.12 FXTRACT
-
- FXTRACT (extract exponent and significand) performs a superset of the
- IEEE-recommended logb(x) function by "decomposing" the number in the stack
- top into two numbers that represent the actual value of the operand's
- exponent and significand fields. The "exponent" replaces the original
- operand on the stack and the "significand" is pushed onto the stack. (ST(7)
- must be empty to avoid causing the invalid-operation exception.) Following
- execution of FXTRACT, ST (the new stack top) contains the value of the
- original significand expressed as a real number: its sign is the same as the
- operand's, its exponent is 0 true (16,383 or 3FFFH biased), and its
- significand is identical to the original operand's. ST(1) contains the value
- of the original operand's true (unbiased) exponent expressed as a real
- number.
-
- If the original operand is zero, FXTRACT leaves -∞ in ST(1) (the exponent)
- while ST is assigned the value zero with a sign equal to that of the
- original operand. The zero-divide exception is raised in this case, as well.
-
- To illustrate the operation of FXTRACT, assume that ST contains a number
- whose true exponent is +4 (i.e., its exponent field contains 4003H). After
- executing FXTRACT, ST(1) will contain the real number +4.0; its sign will be
- positive, its exponent field will contain 4001H (+2 true) and its
- significand field will contain 1.00...00B. In other words, the value in
- ST(1) will be 1.0 * 2² = 4. If ST contains an operand whose true exponent
- is -7 (i.e., its exponent field contains 3FF8H), then FXTRACT will return an
- "exponent" of -7.0; after the instruction executes, ST(1)'s sign and
- exponent fields will contain C001H (negative sign, true exponent of 2), and
- its significand will be 1.1100...00B. In other words, the value in ST(1)
- will be -1.75 * 2² = -7.0. In both cases, following FXTRACT, ST's sign and
- significand fields will be the same as the original operand's, and its
- exponent field will contain 3FFFH (0 true).
-
- FXTRACT is useful for power and range scaling operations. Both FXTRACT and
- the base 2 exponential instruction F2XM1 are needed to perform a general
- power operation. Converting numbers in 80387 extended real format to decimal
- representations (e.g., for printing or displaying) requires not only FBSTP
- but also FXTRACT to allow scaling that does not overflow the range of the
- extended format. FXTRACT can also be useful for debugging, because it allows
- the exponent and significand parts of a real number to be examined
- separately.
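-
- The decomposition performed by FXTRACT can be sketched in Python with
- math.frexp. This is a model on double-precision values, not on the 80387
- extended format; the function name is hypothetical, and NaN and infinity
- operands are deliberately rejected because their treatment is not covered
- by this sketch.
-
```python
import math

def fxtract(x):
    """Return (exponent, significand) as they would appear in (ST(1), ST)."""
    if math.isnan(x) or math.isinf(x):
        raise ValueError("this sketch handles only zero and finite operands")
    if x == 0.0:
        return -math.inf, x           # exponent is -infinity; zero keeps its sign
    m, e = math.frexp(x)              # x == m * 2**e with 0.5 <= |m| < 1
    return float(e - 1), m * 2.0      # rescale so that 1 <= |significand| < 2
```
-
- For example, 24.0 = 1.5 * 2^(4) yields an exponent of 4.0 and a significand
- of 1.5, and 1.25 * 2^(-7) yields -7.0 and 1.25.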
-
-
- 4.4.13 FABS
-
- FABS (absolute value) changes the top stack element to its absolute value
- by making its sign positive. Note that the invalid-operation exception is
- not signaled even if the operand is a signaling NaN or has a format that is
- not supported.
-
-
- 4.4.14 FCHS
-
- FCHS (change sign) complements (reverses) the sign of the top stack
- element. Note that the invalid-operation exception is not signaled even if
- the operand is a signaling NaN or has a format that is not supported.
-
-
- 4.5 Comparison Instructions
-
- The instructions of this class allow comparison of numbers of all supported
- real and integer data types. Each of these instructions (Table 4-5)
- analyzes the top stack element, often in relationship to another operand,
- and reports the result as a condition code in the status word.
-
- The basic operations are compare, test (compare with zero), and examine
- (report type, sign, and normalization). Special forms of the compare
- operation are provided to optimize algorithms by allowing direct comparisons
- with binary integers and real numbers in memory, as well as popping the
- stack after a comparison.
-
- The FSTSW (store status word) instruction may be used following a
- comparison to transfer the condition code to memory or to the 80386 AX
- register for inspection. The 80386 SAHF instruction is recommended for
- copying the 80387 flags from AX to the 80386 flags for easy conditional
- branching.
-
- Note that instructions other than those in the comparison group may update
- the condition code. To ensure that the status word is not altered
- inadvertently, store it immediately following a comparison operation.
-
-
- Table 4-5. Comparison Instructions
-
-   FCOM       Compare real
-   FCOMP      Compare real and pop
-   FCOMPP     Compare real and pop twice
-   FICOM      Integer compare
-   FICOMP     Integer compare and pop
-   FTST       Test
-   FUCOM      Unordered compare real
-   FUCOMP     Unordered compare real and pop
-   FUCOMPP    Unordered compare real and pop twice
-   FXAM       Examine
-
-
- 4.5.1 FCOM //source
-
- FCOM (compare real) compares the stack top to the source operand. The
- source operand may be a register on the stack, or a single or double real
- memory operand. If an operand is not coded, ST is compared to ST(1). The
- sign of zero is ignored, so that +0 = -0. Following the instruction, the
- condition codes reflect the order of the operands as shown in Table 4-6.
-
- If either operand is a NaN (either quiet or signaling) or an undefined
- format, or if a stack fault occurs, the invalid-operation exception is
- raised and the condition bits are set to "unordered."
-
-
- Table 4-6. Condition Code Resulting from Comparisons
-
-                                                      80386
-   Order           C3 (ZF)    C2 (PF)    C0 (CF)      Conditional
-                                                      Branch
-
-   ST > Operand       0          0          0         JA
-   ST < Operand       0          0          1         JB
-   ST = Operand       1          0          0         JE
-   Unordered          1          1          1         JP
-
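- The encoding in Table 4-6 can be sketched in Python. The function names
- are hypothetical; the bit positions used (C0 in bit 8, C2 in bit 10, C3 in
- bit 14 of the status word) are those of the 80387 status word.
-
```python
import math

# Bit positions of the condition-code bits within the 16-bit status word.
C0, C2, C3 = 1 << 8, 1 << 10, 1 << 14

def fcom_status(st, operand):
    """Condition-code bits that a compare would post, per Table 4-6."""
    if math.isnan(st) or math.isnan(operand):
        return C3 | C2 | C0           # unordered
    if st > operand:
        return 0
    if st < operand:
        return C0
    return C3                         # equal; +0 and -0 compare equal

def describe(status_word):
    """Interpret the C3, C2, C0 bits of a stored status word."""
    outcomes = {
        0: "ST > Operand",
        C0: "ST < Operand",
        C3: "ST = Operand",
        C3 | C2 | C0: "Unordered",
    }
    return outcomes.get(status_word & (C3 | C2 | C0), "not a compare result")
```
-
- Masking with C3, C2, and C0 before the lookup means that other fields of
- the status word (for example, the TOP field) do not disturb the result.
-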
-
- 4.5.2 FCOMP //source
-
- FCOMP (compare real and pop) operates like FCOM, and in addition pops the
- stack.
-
-
- 4.5.3 FCOMPP
-
- FCOMPP (compare real and pop twice) operates like FCOM and additionally
- pops the stack twice, discarding both operands. FCOMPP always compares ST to
- ST(1); no operands may be explicitly specified.
-
-
- 4.5.4 FICOM source
-
- FICOM (integer compare) converts the source operand, which may reference a
- word or short binary integer variable, to extended real and compares the
- stack top to it. The condition code bits in the status word are set as for
- FCOM.
-
-
- 4.5.5 FICOMP source
-
- FICOMP (integer compare and pop) operates identically to FICOM and
- additionally discards the value in ST by popping the NPX stack.
-
-
- 4.5.6 FTST
-
- FTST (test) tests the top stack element by comparing it to zero. The result
- is posted to the condition codes as shown in Table 4-7.
-
-
- Table 4-7. Condition Code Resulting from FTST
-
-                                                      80386
-   Order           C3 (ZF)    C2 (PF)    C0 (CF)      Conditional
-                                                      Branch
-
-   ST > 0.0           0          0          0         JA
-   ST < 0.0           0          0          1         JB
-   ST = 0.0           1          0          0         JE
-   Unordered          1          1          1         JP
-
-
- 4.5.7 FUCOM //source
-
- FUCOM (unordered compare real) operates like FCOM, with two differences:
-
- 1. It does not cause an invalid-operation exception when one of the
- operands is a quiet NaN. If either operand is a NaN, the condition bits
- of the status word are set to unordered as shown in Table 4-6.
-
- 2. Only operands on the NPX stack can be compared.
-
-
- 4.5.8 FUCOMP //source
-
- FUCOMP (unordered compare real and pop) operates like FUCOM and in addition
- pops the NPX stack.
-
-
- 4.5.9 FUCOMPP
-
- FUCOMPP (unordered compare real and pop twice) operates like FUCOM and in
- addition pops the NPX stack twice, discarding both operands. FUCOMPP always
- compares ST to ST(1); no operands can be explicitly specified.
-
-
- 4.5.10 FXAM
-
- FXAM (examine) reports the content of the top stack element as
- positive/negative and NaN, denormal, normal, zero, infinity, unsupported, or
- empty. Table 4-8 lists and interprets all the condition code values that
- FXAM generates.
-
-
- 4.6 Transcendental Instructions
-
- The instructions in this group (Table 4-9) perform the time-consuming core
- calculations for all common trigonometric, inverse trigonometric,
- hyperbolic, inverse hyperbolic, logarithmic, and exponential functions. The
- transcendentals operate on the top one or two stack elements, and they
- return their results to the stack. The trigonometric operations assume their
- arguments are expressed in radians. The logarithmic and exponential
- operations work in base 2.
-
- The results of transcendental instructions are highly accurate. The
- absolute value of the relative error of the transcendental instructions is
- guaranteed to be less than 2^(-62). (Relative error is the ratio between the
- absolute error and the exact value.)
-
- The trigonometric functions accept a practically unrestricted range of
- operands, whereas the other transcendental instructions require that
- arguments be more restricted in range. FPREM or FPREM1 may be used to bring
- the otherwise valid operand of a periodic function into range. Prologue and
- epilogue software may be used to reduce arguments for other instructions to
- the expected range and to adjust the result to correspond to the original
- arguments if necessary. The instruction descriptions in this section
- document the allowed operand range for each instruction.
-
-
- Table 4-8. Condition Code Defining Operand Class
-
-   C3   C2   C1   C0     Value at TOP
-
-   0    0    0    0      +Unsupported
-   0    0    0    1      +NaN
-   0    0    1    0      -Unsupported
-   0    0    1    1      -NaN
-   0    1    0    0      +Normal
-   0    1    0    1      +Infinity
-   0    1    1    0      -Normal
-   0    1    1    1      -Infinity
-   1    0    0    0      +0
-   1    0    0    1      +Empty
-   1    0    1    0      -0
-   1    0    1    1      -Empty
-   1    1    0    0      +Denormal
-   1    1    1    0      -Denormal
-
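- The classification in Table 4-8 can be sketched in Python for ordinary
- double-precision values. This is an approximation of the idea, not of the
- hardware: a Python float is never "unsupported" or "empty", and the
- denormal test uses the double-precision threshold rather than the
- extended-real one. The function name is hypothetical.
-
```python
import math
import sys

def fxam_code(x):
    """Return (C3, C2, C1, C0) from Table 4-8 for a Python float."""
    c1 = 1 if math.copysign(1.0, x) < 0 else 0    # C1 carries the sign
    if math.isnan(x):
        c3, c2, c0 = 0, 0, 1
    elif math.isinf(x):
        c3, c2, c0 = 0, 1, 1
    elif x == 0.0:
        c3, c2, c0 = 1, 0, 0
    elif abs(x) < sys.float_info.min:             # below the smallest normal double
        c3, c2, c0 = 1, 1, 0
    else:
        c3, c2, c0 = 0, 1, 0
    return c3, c2, c1, c0
```
-
- Note that C1 alone reports the sign, so each class appears twice in the
- table, once for each sign.
-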
-
- Table 4-9. Transcendental Instructions
-
-   FSIN       Sine
-   FCOS       Cosine
-   FSINCOS    Sine and cosine
-   FPTAN      Tangent of ST
-   FPATAN     Arctangent of ST(1)/ST
-   F2XM1      2^(X) - 1
-   FYL2X      Y * log{2}X; Y is ST(1), X is ST
-   FYL2XP1    Y * log{2}(X + 1); Y is ST(1), X is ST
-
-
- 4.6.1 FCOS
-
- When complete, this function replaces the contents of ST with COS(ST). ST,
- expressed in radians, must lie in the range │Θ│ < 2^(63) (for most practical
- purposes unrestricted). If ST is in range, C2 of the status word is cleared
- and the result of the operation is produced.
-
- If the operand is outside of the range, C2 is set to one (function
- incomplete) and ST remains intact (i.e., no reduction of the operand is
- performed). It is the programmer's responsibility to reduce the operand to an
- absolute value smaller than 2^(63). The instructions FPREM1 and FPREM are
- available for this purpose.
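-
- The range check and the reduction step described above can be sketched in
- Python. The function names are hypothetical. math.remainder stands in for
- an FPREM1 reduction loop; because the sketch uses a double-precision value
- of 2π, the reduced argument of a very large operand is only a coarse
- approximation.
-
```python
import math

LIMIT = 2.0 ** 63

def fcos(st):
    """Return (c2, new_st); c2 is 1 and st is unchanged when |st| >= 2**63."""
    if abs(st) >= LIMIT:
        return 1, st                  # function incomplete, operand intact
    return 0, math.cos(st)

def cos_with_reduction(st):
    """Retry after an FPREM1-style reduction when the operand is out of range."""
    c2, result = fcos(st)
    if c2:
        reduced = math.remainder(st, 2.0 * math.pi)
        c2, result = fcos(reduced)
    return result
```
-
- The caller tests C2 after the first attempt and reduces the operand only
- when necessary, so operands already in range pay no extra cost.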
-
-
- 4.6.2 FSIN
-
- When complete, this function replaces the contents of ST with SIN(ST). FSIN
- is equivalent to FCOS in the way it reduces the operand. ST is expressed in
- radians.
-
-
- 4.6.3 FSINCOS
-
- When complete, this instruction replaces the contents of ST with SIN(ST),
- then pushes COS(ST) onto the stack. (ST(7) must be empty to avoid an invalid
- exception.) FSINCOS is equivalent to FCOS in the way it reduces the operand.
- ST is expressed in radians.
-
-
- 4.6.4 FPTAN
-
- When complete, FPTAN (partial tangent) computes the function Y = TAN (ST).
- ST is expressed in radians. Y replaces ST, then the value 1 is pushed,
- becoming the new stack top. (ST(7) must be empty to avoid an invalid
- exception.) When the function is complete ST(1) = TAN (arg) and ST = 1.
- FPTAN is equivalent to FCOS in the way it reduces the operand.
-
- The fact that FPTAN places two results on the stack maintains compatibility
- with the 8087/80287 and aids the calculation of other trigonometric
- functions that can be derived from tan via standard trigonometric
- identities. For example, the cot function is given by this identity:
-
- cot x = 1/tan x.
-
- Therefore, simply executing the reverse divide instruction FDIVR after
- FPTAN yields the cot function.
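-
- The two-instruction cotangent sequence can be traced with a small Python
- model of the register stack, in which the last list element plays the role
- of ST. The function names are hypothetical, and the reverse divide is
- modelled here in its pop form (ST(1) receives ST / ST(1), then the stack is
- popped), which is an assumption about how the instruction is coded.
-
```python
import math

def fptan(stack):
    """Replace ST with tan(ST), then push 1.0 (the last list item is ST)."""
    stack[-1] = math.tan(stack[-1])
    stack.append(1.0)

def fdivr_pop(stack):
    """Reverse divide and pop: ST(1) becomes ST / ST(1)."""
    st = stack.pop()
    stack[-1] = st / stack[-1]

def cot(x):
    stack = [x]
    fptan(stack)          # stack is now [tan(x), 1.0]
    fdivr_pop(stack)      # stack is now [1.0 / tan(x)]
    return stack[-1]
```
-
- Because FPTAN has already pushed the constant 1, no separate load is
- needed before the division.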
-
-
- 4.6.5 FPATAN
-
- FPATAN (arctangent) computes the function Θ = ARCTAN (Y/X). X is taken from
- ST(0) and Y from ST(1). The instruction pops the NPX stack and returns Θ to
- the (new) stack top, overwriting the Y operand. The result is expressed in
- radians. The range of operands is not restricted; however, the range of the
- result depends on the relationship between the operands according to Table
- 4-10.
-
- The fact that the argument of FPATAN is a ratio aids calculation of other
- trigonometric functions, including Arcsin and Arccos. These can be derived
- from Arctan via standard trigonometric identities. For example, the Arcsin
- function can be easily calculated using this identity:
-
- Arcsin x = Arctan (x / √(1 - x²)).
-
- Thus, to find Arcsin (Y), push Y onto the NPX stack, then calculate
- X = √(1 - Y²), pushing the result X onto the stack. Executing FPATAN then
- leaves Arcsin (Y) at the top of the stack.
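-
- The Arcsin derivation can be checked with a short Python sketch in which
- math.atan2 stands in for FPATAN (Y from ST(1), X from ST). The function
- names are hypothetical.
-
```python
import math

def fpatan(y, x):
    """Model of FPATAN: the angle of the point (X, Y), in radians."""
    return math.atan2(y, x)

def arcsin(y):
    """Arcsin(Y) = Arctan(Y / sqrt(1 - Y*Y)), for -1 <= Y <= 1."""
    x = math.sqrt(1.0 - y * y)
    return fpatan(y, x)
```
-
- Passing Y and X separately, rather than their ratio, keeps the endpoint
- Y = 1 well defined: X is then zero and the result is π/2, with no division
- by zero.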
-
-
- 4.6.6 F2XM1
-
- F2XM1 (2 to the X minus 1) calculates the function Y = 2^(X) - 1. X is taken
- from the stack top and must be in the range -1 ≤ X ≤ 1. The result Y
- replaces the argument X at the stack top. If the argument is out of range,
- the results are undefined.
-
- This instruction is designed to produce a very accurate result even when X
- is close to 0. For values of the argument very close in magnitude to 1, a
- larger error will be incurred. To obtain Y = 2^(X), add 1 to the result
- delivered by F2XM1.
-
- The following formulas show how values other than 2 may be raised to a
- power of X:
-
- 10^(X) = 2^(X * LOG2(10))
-
- e^(X) = 2^(X * LOG2(e))
-
- Y^(X) = 2^(X * LOG2(Y))
-
- As shown in the next section, the 80387 has built-in instructions for
- loading the constants LOG{2}10 and LOG{2}e, and the FYL2X instruction may be
- used to calculate X*LOG{2}Y.
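-
- A general power routine built on these formulas can be sketched in Python.
- The function names are hypothetical; math.expm1 stands in for F2XM1,
- math.log2 for FYL2X, round() for FRNDINT, and math.ldexp for a scaling
- step. Splitting the exponent into an integer and a fraction keeps the
- argument of the F2XM1 stand-in within its permitted range.
-
```python
import math

def f2xm1(x):
    """Model of F2XM1: 2**x - 1, accurate even when x is close to 0."""
    if not -1.0 <= x <= 1.0:
        raise ValueError("argument must satisfy -1 <= x <= 1")
    return math.expm1(x * math.log(2.0))

def power(y, x):
    """y**x for y > 0, computed as 2**(x * log2(y))."""
    t = x * math.log2(y)      # the FYL2X step
    n = round(t)              # integer part of the exponent
    f = t - n                 # fraction, |f| <= 0.5
    return math.ldexp(f2xm1(f) + 1.0, n)   # (2**f) * 2**n
```
-
- The accuracy remark above is visible even in double precision: for an
- argument of 1e-20, forming 2^(X) first and then subtracting 1 gives zero,
- whereas the F2XM1 stand-in returns a nonzero result.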
-
-
- Table 4-10. Results of FPATAN
-
-   Sign(Y)   Sign(X)   │Y│ < │X│?     Final Result
-
-      +         +         Yes           0 < atan(Y/X) < π/4
-      +         +         No          π/4 < atan(Y/X) < π/2
-      +         -         No          π/2 < atan(Y/X) < 3 * π/4
-      +         -         Yes     3 * π/4 < atan(Y/X) < π
-      -         +         Yes        -π/4 < atan(Y/X) < 0
-      -         +         No         -π/2 < atan(Y/X) < -π/4
-      -         -         No     -3 * π/4 < atan(Y/X) < -π/2
-      -         -         Yes          -π < atan(Y/X) < -3 * π/4
-
-
- 4.6.7 FYL2X
-
- FYL2X (Y log base 2 of X) calculates the function Z = Y * LOG{2}X. X is
- taken from the stack top and Y from ST(1). The operands must be in the
- following ranges:
-
- 0 ≤ X < +∞
- -∞ < Y < +∞
-
- The instruction pops the NPX stack and returns Z at the (new) stack top,
- replacing the Y operand. If the X operand is out of range (i.e., is
- negative), the invalid-operation exception occurs.
-
- This function optimizes the calculations of log to any base other than two,
- because a multiplication is always required:
-
- LOG{N}x = (LOG{2}N)^(-1) * LOG{2}x
-
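- The base-conversion identity can be sketched in Python, with the
- multiplier supplied as the Y operand. The function names are hypothetical,
- and the sketch accepts only X > 0; it does not model the exceptional
- cases.
-
```python
import math

def fyl2x(x, y):
    """Model of FYL2X: Y * log2(X), with X from ST and Y from ST(1)."""
    if x <= 0.0:
        raise ValueError("this sketch accepts only X > 0")
    return y * math.log2(x)

def log_base(n, x):
    """LOG{N}x computed as (LOG{2}N)^(-1) * LOG{2}x."""
    return fyl2x(x, 1.0 / math.log2(n))

def log10_of(x):
    """Common logarithm: Y is LOG{10}2, the reciprocal of LOG{2}10."""
    return fyl2x(x, math.log10(2.0))

def ln_of(x):
    """Natural logarithm: Y is LOG{e}2, the reciprocal of LOG{2}e."""
    return fyl2x(x, math.log(2.0))
```
-
- For bases 10 and e the multiplier is a fixed constant, so the whole
- logarithm costs one constant load and one FYL2X.
-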
-
- 4.6.8 FYL2XP1
-
- FYL2XP1 (Y log base 2 of (X + 1)) calculates the function
- Z = Y * LOG{2}(X + 1). X is taken from the stack top and must be in the
- range -(1 - SQRT(2)/2) < X < 1 - SQRT(2)/2. Y is taken from ST(1) and is
- unlimited in range (-∞ < Y < +∞). FYL2XP1 pops the stack and returns Z at
- the (new) stack top, replacing Y. If the argument is out of range, the
- results are undefined.
-
- This instruction provides improved accuracy over FYL2X when computing the
- logarithm of a number very close to 1, for example 1 + ε where ε << 1.
- Providing ε rather than 1 + ε as the input to the function allows more
- significant digits to be retained.
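-
- The accuracy benefit can be demonstrated in Python, where math.log1p
- plays the role of a logarithm that takes ε directly. The function names
- are hypothetical, and the sketch raises an error for an out-of-range X,
- whereas the instruction itself leaves the result undefined.
-
```python
import math

BOUND = 1.0 - math.sqrt(2.0) / 2.0      # about 0.2929

def fyl2xp1(x, y):
    """Model of FYL2XP1: Y * log2(X + 1), computed without forming X + 1."""
    if not -BOUND < x < BOUND:
        raise ValueError("X is outside the range accepted by this sketch")
    return y * math.log1p(x) / math.log(2.0)

def naive(x, y):
    """The same quantity computed by forming 1 + X first."""
    return y * math.log2(1.0 + x)
```
-
- With X = 1e-20 the sum 1 + X rounds to exactly 1, so the naive form
- returns zero; the FYL2XP1 stand-in returns approximately X / LOG{e}2.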
-
-
- Table 4-11. Constant Instructions
-
-   FLDZ       Load +0.0
-   FLD1       Load +1.0
-   FLDPI      Load π
-   FLDL2T     Load log{2}10
-   FLDL2E     Load log{2}e
-   FLDLG2     Load log{10}2
-   FLDLN2     Load log{e}2
-
-
- 4.7 Constant Instructions
-
- Each of these instructions (Table 4-11) loads (pushes) a commonly used
- constant onto the stack. (ST(7) must be empty to avoid an invalid
- exception.) The values have full extended real precision (64 bits) and are
- accurate to approximately 19 decimal digits. Because an external real
- constant occupies 10 memory bytes, the constant instructions, which are
- only two bytes long, save storage and improve execution speed, in addition
- to simplifying programming.
-
- The constants used by these instructions are stored internally in a format
- more precise even than extended real. When loading the constant, the 80387
- rounds the more precise internal constant according to the RC (rounding
- control) bits of the control word. However, in spite of this rounding, the
- precision exception is not raised (to maintain compatibility). When the
- rounding control is set to round to nearest on the 80387, the 80387
- produces the same constant that is produced by the 80287.
-
-
- 4.7.1 FLDZ
-
- FLDZ (load zero) loads (pushes) +0.0 onto the NPX stack.
-
-
- 4.7.2 FLD1
-
- FLD1 (load one) loads (pushes) +1.0 onto the NPX stack.
-
-
- 4.7.3 FLDPI
-
- FLDPI (load π) loads (pushes) π onto the NPX stack.
-
-
- 4.7.4 FLDL2T
-
- FLDL2T (load log base 2 of 10) loads (pushes) the value LOG{2}10 onto the
- NPX stack.
-
-
- 4.7.5 FLDL2E
-
- FLDL2E (load log base 2 of e) loads (pushes) the value LOG{2}e onto the NPX
- stack.
-
-
- 4.7.6 FLDLG2
-
- FLDLG2 (load log base 10 of 2) loads (pushes) the value LOG{10}2 onto the
- NPX stack.
-
-
- 4.7.7 FLDLN2
-
- FLDLN2 (load log base e of 2) loads (pushes) the value LOG{e}2 onto the NPX
- stack.
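-
- For reference, the seven constants can be tabulated in Python. The values
- below are double-precision approximations (53 bits of significand), not
- the 64-bit extended-real values that the instructions load; the dictionary
- name is hypothetical.
-
```python
import math

# Mnemonic -> double-precision approximation of the constant it loads.
CONSTANTS = {
    "FLDZ": 0.0,
    "FLD1": 1.0,
    "FLDPI": math.pi,
    "FLDL2T": math.log2(10.0),     # log base 2 of 10
    "FLDL2E": math.log2(math.e),   # log base 2 of e
    "FLDLG2": math.log10(2.0),     # log base 10 of 2
    "FLDLN2": math.log(2.0),       # log base e of 2
}
```
-
- The logarithmic constants form reciprocal pairs: LOG{2}10 * LOG{10}2 = 1
- and LOG{2}e * LOG{e}2 = 1.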
-
-
- 4.8 Processor Control Instructions
-
- The processor control instructions are shown in Table 4-12. The instruction
- FSTSW is commonly used for conditional branching. The remaining instructions
- are not typically used in calculations; they provide control over the 80387
- NPX for system-level activities. These activities include initialization,
- exception handling, and task switching.
-
- As shown in Table 4-12, many of the NPX processor control instructions have
- two forms of assembler mnemonic:
-
- 1. A wait form, where the mnemonic is prefixed only with an F, such as
- FSTSW. This form checks for unmasked numeric exceptions.
-
- 2. A no-wait form, where the mnemonic is prefixed with an FN, such as
- FNSTSW. This form ignores unmasked numeric exceptions.
-
- When the control instruction is coded using the no-wait form of the
- mnemonic, the ASM386 assembler does not precede the ESC instruction with a
- wait instruction, and the CPU does not test the ERROR# status line from the
- NPX before executing the processor control instruction.
-
- Only the processor control class of instructions has this alternate
- no-wait form. All numeric instructions are automatically synchronized by the
- 80386; the CPU transfers all operands before initiating the next
- instruction. Because of this automatic synchronization by the 80386, numeric
- instructions for the 80387 need not be preceded by a CPU wait instruction
- in order to execute correctly.
-
- It should also be noted that the 8087 instructions FENI and FDISI and the
- 80287 instruction FSETPF perform no function in the 80387. If these opcodes
- are detected in an 80386/80387 instruction stream, the 80387 performs no
- specific operation and no internal states are affected. For programmers
- interested in porting numeric software from 80287 or 8087 environments to
- the 80386, however, it should be noted that program sections containing
- these exception-handling instructions are not likely to be completely
- portable to the 80387. Appendix C and Appendix D contain a more complete
- description of the differences between the 80387 and the 80287/8087.
-
-
- Table 4-12. Processor Control Instructions
-
-   FINIT/FNINIT          Initialize processor
-   FLDCW                 Load control word
-   FSTCW/FNSTCW          Store control word
-   FSTSW/FNSTSW          Store status word
-   FSTSW AX/FNSTSW AX    Store status word to AX
-   FCLEX/FNCLEX          Clear exceptions
-   FSTENV/FNSTENV        Store environment
-   FLDENV                Load environment
-   FSAVE/FNSAVE          Save state
-   FRSTOR                Restore state
-   FINCSTP               Increment stack pointer
-   FDECSTP               Decrement stack pointer
-   FFREE                 Free register
-   FNOP                  No operation
-   FWAIT                 CPU Wait
-
-
- 4.8.1 FINIT/FNINIT
-
- FINIT/FNINIT (initialize processor) sets the 80387 NPX into a known state,
- unaffected by any previous activity. It sets the control word to its default
- value 037FH (round to nearest, all exceptions masked, 64 bits of precision),
- clears the status word, and empties all floating-point stack registers. The
- no-wait form of this instruction causes the 80387 to abort any previous
- numeric operations currently executing in the NEU.
-
- This instruction performs the functional equivalent of a hardware RESET,
- with one exception: RESET causes the IM bit of the control word to be reset
- and the ES and IE bits of the status word to be set as a means of signaling
- the presence of an 80387; FINIT puts the opposite values in these bits.
-
- FINIT checks for unmasked numeric exceptions; FNINIT does not. Note that if
- FNINIT is executed while a previous 80387 memory-referencing instruction is
- running, 80387 bus cycles in progress are aborted. This instruction may be
- necessary to clear the 80387 if a processor-extension segment-overrun
- exception (interrupt 9) is detected by the CPU.
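-
- The default control word 037FH can be decoded with a short Python sketch.
- The function and table names are hypothetical; the field positions used
- (exception masks in bits 0 through 5, precision control in bits 8 and 9,
- rounding control in bits 10 and 11) are those of the 80387 control word.
-
```python
EXCEPTION_MASKS = ("IM", "DM", "ZM", "OM", "UM", "PM")    # bits 0 through 5
PRECISION = {0b00: "24 bits", 0b10: "53 bits", 0b11: "64 bits"}
ROUNDING = {0b00: "nearest", 0b01: "down", 0b10: "up", 0b11: "chop"}

def decode_control_word(cw):
    """Split a control word into its mask, precision, and rounding fields."""
    masked = [name for bit, name in enumerate(EXCEPTION_MASKS) if cw & (1 << bit)]
    pc = (cw >> 8) & 0b11
    rc = (cw >> 10) & 0b11
    return {
        "masked": masked,
        "precision": PRECISION.get(pc, "reserved"),
        "rounding": ROUNDING[rc],
    }
```
-
- Decoding 037FH gives all six exceptions masked, 64 bits of precision, and
- round to nearest, in agreement with the description of FINIT above.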
-
-
- 4.8.2 FLDCW source
-
- FLDCW (load control word) replaces the current processor control word with
- the word defined by the source operand. This instruction is typically used
- to establish or change the 80387's mode of operation. Note that if an
- exception bit in the status word is set, loading a new control word that
- unmasks that exception will activate the ERROR# output of the 80387. When
- changing modes, the recommended procedure is to first clear any exceptions
- and then load the new control word.
-
-
- 4.8.3 FSTCW/FNSTCW destination
-
- FSTCW/FNSTCW (store control word) writes the processor control word to the
- memory location defined by the destination. FSTCW checks for unmasked
- numeric exceptions; FNSTCW does not.
-
-
- 4.8.4 FSTSW/FNSTSW destination
-
- FSTSW/FNSTSW (store status word) writes the current value of the 80387
- status word to the destination operand in memory. The instruction is used to
-
- ■ Implement conditional branching following a comparison, FPREM, or
- FPREM1 instruction (FSTSW).
-
- ■ Invoke exception handlers (by polling the exception bits) in
- environments that do not use interrupts (FSTSW).
-
- FSTSW checks for unmasked numeric exceptions; FNSTSW does not.
-
-
- 4.8.5 FSTSW AX/FNSTSW AX
-
- FSTSW AX/FNSTSW AX (store status word to AX) is a special 80387 instruction
- that writes the current value of the 80387 status word directly into the
- 80386 AX register. This instruction optimizes conditional branching in
- numeric programs, where the 80386 CPU must test the condition of various NPX
- status bits. The wait form FSTSW AX checks for unmasked numeric
- exceptions; the no-wait form FNSTSW AX does not.
-
- When this instruction is executed, the 80386 AX register is updated with
- the NPX status word before the CPU executes any further instructions. The
- status stored is that from the completion of the prior ESC instruction.
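-
- The path from the status word in AX to a conditional branch can be
- sketched in Python. The function names are hypothetical. The sketch relies
- on the fact that SAHF loads CF, PF, and ZF from bits 0, 2, and 6 of AH,
- which are bits 8, 10, and 14 (C0, C2, and C3) of the status word.
-
```python
def sahf_flags(ax):
    """Flags that SAHF would load from AH once the status word is in AX."""
    ah = (ax >> 8) & 0xFF
    return {
        "CF": ah & 1,             # status-word bit 8  (C0)
        "PF": (ah >> 2) & 1,      # status-word bit 10 (C2)
        "ZF": (ah >> 6) & 1,      # status-word bit 14 (C3)
    }

def branch_taken(ax):
    """Name the Table 4-6 branch that applies, testing for unordered first."""
    flags = sahf_flags(ax)
    if flags["PF"]:
        return "JP"
    if flags["ZF"]:
        return "JE"
    if flags["CF"]:
        return "JB"
    return "JA"
```
-
- The unordered case is tested first because it also sets ZF and CF; a
- program that tested JE or JB first would misread an unordered result.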
-
-
- 4.8.6 FCLEX/FNCLEX
-
- FCLEX/FNCLEX (clear exceptions) clears all exception flags, the exception
- status flag and the busy flag in the status word. As a consequence, the
- 80387's ERROR# line goes inactive. FCLEX checks for unmasked numeric
- exceptions; FNCLEX does not.
-
-
- 4.8.7 FSAVE/FNSAVE destination
-
- FSAVE/FNSAVE (save state) writes the full 80387 state──environment plus
- register stack──to the memory location defined by the destination operand.
- Figure 4-1 and Figure 4-2 show the layout of the save area; the size and
- layout of the save area depend on the operating mode of the 80386
- (real-address mode or protected mode) and on the operand-size attribute in
- effect for the instruction (32-bit operand or 16-bit operand). When the
- 80386 is in virtual-8086 mode, the real-address mode formats are used.
- Typically the instruction is coded to save this image on the CPU stack.
-
- The values in the tag word in memory are determined during the execution of
- FSAVE/FNSAVE. If the tag in the status register indicates that the
- corresponding register is nonempty, the 80387 examines the data in the
- register and stores the appropriate tag in memory. Thus the tag that is
- stored always reflects the actual content of the register.
-
- FNSAVE delays its execution until all NPX activity completes normally.
- Thus, the save image reflects the state of the NPX following the completion
- of any running instruction. After writing the state image to memory,
- FSAVE/FNSAVE initializes the 80387 as if FINIT/FNINIT had been executed.
-
- FSAVE/FNSAVE is useful whenever a program wants to save the current state
- of the NPX and initialize it for a new routine. Three examples are
-
- 1. An operating system needs to perform a context switch (suspend the
- task that had been running and give control to a new task).
-
- 2. An exception handler needs to use the 80387.
-
- 3. An application task wants to pass a "clean" 80387 to a subroutine.
-
- FSAVE checks for unmasked numeric exceptions before executing; FNSAVE does
- not.
-
-
- Figure 4-1. FSAVE/FRSTOR Memory Layout (32-Bit)
-
- 31          23          15          7           0
- ╔═══════════╪═══════════╪═══════════╪═══════════╗+0H
- ╟───────────|───────────|───────────|───────────╢+4H
- ╟───────────|────────       ────────|───────────╢+8H
- ╟───────────|───── ENVIRONMENT ─────|───────────╢+CH
- ╟───────────|────────       ────────|───────────╢+10H
- ╟───────────|───────────|───────────|───────────╢+14H
- ╟───────────|───────────|───────────|───────────╢+18H
- ╚═══════════╪═══════════╪═══════════╪═══════════╝
-
-      ╔════╤════════╤══════════════════════════════════════════════╗
- ST(0)║SIGN│EXPONENT│                 SIGNIFICAND                  ║+1CH
- ST(1)╟────┼────────┼──────────────────────────────────────────────╢+26H
- ST(2)╟────┼────────┼──────────────────────────────────────────────╢+30H
- ST(3)╟────┼────────┼──────────────────────────────────────────────╢+3AH
- ST(4)╟────┼────────┼──────────────────────────────────────────────╢+44H
- ST(5)╟────┼────────┼──────────────────────────────────────────────╢+4EH
- ST(6)╟────┼────────┼──────────────────────────────────────────────╢+58H
- ST(7)╟────┼────────┼──────────────────────────────────────────────╢+62H
-      ╚════╧════════╧══════════════════════════════════════════════╝
-       79   78    64 63                                           0
-
-
- Figure 4-2. FSAVE/FRSTOR Memory Layout (16-Bit)
-
- 15         7          0
- ╔══════════╪══════════╗+0H
- ╟──────────|──────────╢+2H
- ╟───────       ───────╢+4H
- ╟──── ENVIRONMENT ────╢+6H
- ╟───────       ───────╢+8H
- ╟──────────|──────────╢+AH
- ╟──────────|──────────╢+CH
- ╚══════════╪══════════╝
-
-      ╔════╤════════╤══════════════════════════════════════════════╗
- ST(0)║SIGN│EXPONENT│                 SIGNIFICAND                  ║+EH
- ST(1)╟────┼────────┼──────────────────────────────────────────────╢+18H
- ST(2)╟────┼────────┼──────────────────────────────────────────────╢+22H
- ST(3)╟────┼────────┼──────────────────────────────────────────────╢+2CH
- ST(4)╟────┼────────┼──────────────────────────────────────────────╢+36H
- ST(5)╟────┼────────┼──────────────────────────────────────────────╢+40H
- ST(6)╟────┼────────┼──────────────────────────────────────────────╢+4AH
- ST(7)╟────┼────────┼──────────────────────────────────────────────╢+54H
-      ╚════╧════════╧══════════════════════════════════════════════╝
-       79   78    64 63                                           0
-
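- The offsets shown in Figures 4-1 and 4-2 follow from two sizes: the
- environment occupies 28 bytes (1CH) in the 32-bit layout or 14 bytes (0EH)
- in the 16-bit layout, and each register occupies 10 bytes. A short Python
- sketch (with hypothetical names) makes the arithmetic explicit.
-
```python
ENVIRONMENT_BYTES = {32: 28, 16: 14}    # 1CH and 0EH, from Figures 4-1 and 4-2
REGISTER_BYTES = 10                     # one 80-bit extended real per register

def register_offset(operand_size, i):
    """Byte offset of ST(i) within the save area."""
    if not 0 <= i <= 7:
        raise ValueError("register index must be 0 through 7")
    return ENVIRONMENT_BYTES[operand_size] + REGISTER_BYTES * i

def save_area_size(operand_size):
    """Total number of bytes written by a state save."""
    return ENVIRONMENT_BYTES[operand_size] + REGISTER_BYTES * 8
```
-
- The complete save area is therefore 108 bytes in the 32-bit layout and
- 94 bytes in the 16-bit layout.
-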
-
- 4.8.8 FRSTOR source
-
- FRSTOR (restore state) reloads the 80387 state from the memory area defined
- by the source operand. This information should have been written by a
- previous FSAVE/FNSAVE instruction and not altered by any other instruction.
- FRSTOR automatically waits (checking for interrupts) until all data
- transfers are completed before continuing to the next instruction.
-
- Note that the 80387 "reacts" to its new state at the conclusion of the
- FRSTOR. For example, if the exception and mask bits in the memory image so
- indicate, it generates an exception request when the next WAIT or
- exception-checking ESC instruction is executed.
-
-
- 4.8.9 FSTENV/FNSTENV destination
-
- FSTENV/FNSTENV (store environment) writes the 80387's basic
- status──control, status, and tag words, and exception pointers──to the
- memory location defined by the destination operand. Typically, the
- environment is saved on the CPU stack. FSTENV/FNSTENV is often used by
- exception handlers because it provides access to the exception pointers
- that identify the offending instruction and operand. After saving the
- environment, FSTENV/FNSTENV sets all exception masks in the 80387 control
- word (i.e., masks all exceptions). FSTENV checks for pending exceptions
- before executing; FNSTENV does not.
-
- Figures 4-3 through 4-6 show the format of the environment data in memory;
- the size and layout of the save area depend on the operating mode of the
- 80386 (real-address mode or protected mode) and on the operand-size
- attribute in effect for the instruction (32-bit operand or 16-bit operand).
- When the 80386 is in virtual-8086 mode, the real-address mode formats are
- used. FNSTENV does not store the environment until all NPX activity has
- completed. Thus, the data saved by the instruction reflects the 80387 after
- any previously decoded instruction has been executed.
-
- The values in the tag word in memory are determined during the execution of
- FNSTENV/FSTENV. If the tag in the status register indicates that the
- corresponding register is nonempty, the 80387 examines the data in the
- register and stores the appropriate tag in memory. Thus the tag that is
- stored always reflects the actual content of the register.
-
-
- Figure 4-3. Protected Mode 80387 Environment, 32-Bit Format
-
- 32-BIT PROTECTED MODE FORMAT
-
- 31                23                15                7                 0
- ╔═════════════════╪═════════════════╤═════════════════╪═════════════════╗
- ║             RESERVED              │           CONTROL WORD            ║0H
- ╟─────────────────┼─────────────────┼─────────────────┼─────────────────╢
- ║             RESERVED              │            STATUS WORD            ║4H
- ╟─────────────────┼─────────────────┼─────────────────┼─────────────────╢
- ║             RESERVED              │             TAG WORD              ║8H
- ╟─────────────────┼─────────────────┼─────────────────┼─────────────────╢
- ║                               IP OFFSET                               ║CH
- ╟──────────┬──────┼─────────────────┼─────────────────┼─────────────────╢
- ║ 0 0 0 0 0│      OPCODE 10..0      │            CS SELECTOR            ║10H
- ╟──────────┴──────┼─────────────────┼─────────────────┼─────────────────╢
- ║                          DATA OPERAND OFFSET                          ║14H
- ╟─────────────────┼─────────────────┼─────────────────┼─────────────────╢
- ║             RESERVED              │         OPERAND SELECTOR          ║18H
- ╚═════════════════╪═════════════════╧═════════════════╪═════════════════╝
-
-
- Figure 4-4. Real Mode 80387 Environment, 32-Bit Format
-
- 32-BIT REAL-ADDRESS MODE FORMAT
-
- 31                23                15                7                 0
- ╔═════════════════╪═════════════════╤═════════════════╪═════════════════╗
- ║             RESERVED              │           CONTROL WORD            ║0H
- ╟─────────────────┼─────────────────┼─────────────────┼─────────────────╢
- ║             RESERVED              │            STATUS WORD            ║4H
- ╟─────────────────┼─────────────────┼─────────────────┼─────────────────╢
- ║             RESERVED              │             TAG WORD              ║8H
- ╟─────────────────┼─────────────────┼─────────────────┼─────────────────╢
- ║             RESERVED              │     INSTRUCTION POINTER 15..0     ║CH
- ╟─────────┬───────┼─────────────────┼───────────┬─┬───┼─────────────────╢
- ║ 0 0 0 0 │     INSTRUCTION POINTER 31..16      │0│    OPCODE 10..0     ║10H
- ╟─────────┴───────┼─────────────────┼───────────┴─┴───┼─────────────────╢
- ║             RESERVED              │       OPERAND POINTER 15..0       ║14H
- ╟─────────┬───────┼─────────────────┼───────────┬─────┼─────────────────╢
- ║ 0 0 0 0 │       OPERAND POINTER 31..16        │0 0 0 0 0 0 0 0 0 0 0 0║18H
- ╚═════════╧═══════╪═════════════════╧═══════════╧═════╪═════════════════╝
-
-
- Figure 4-5. Protected Mode 80387 Environment, 16-Bit Format
-
- 16-BIT PROTECTED MODE FORMAT
-
- 15 7 0
- ╔════════════════╪════════════════╗
- ║ CONTROL WORD ║ 0H
- ╟────────────────┼────────────────╢
- ║ STATUS WORD ║ 2H
- ╟────────────────┼────────────────╢
- ║ TAG WORD ║ 4H
- ╟────────────────┼────────────────╢
- ║ IP OFFSET ║ 6H
- ╟────────────────┼────────────────╢
- ║ CS SELECTOR ║ 8H
- ╟────────────────┼────────────────╢
- ║ OPERAND OFFSET ║ AH
- ╟────────────────┼────────────────╢
- ║ OPERAND SELECTOR ║ CH
- ╚════════════════╪════════════════╝
-
-
- Figure 4-6. Real Mode 80387 Environment, 16-Bit Format
-
- 16-BIT REAL-ADDRESS MODE
- AND VIRTUAL-8086 MODE FORMAT
-
- 15 7 0
- ╔════════════════╪════════════════╗
- ║ CONTROL WORD ║ 0H
- ╟────────────────┼────────────────╢
- ║ STATUS WORD ║ 2H
- ╟────────────────┼────────────────╢
- ║ TAG WORD ║ 4H
- ╟────────────────┼────────────────╢
- ║ INSTRUCTION POINTER 15..0 ║ 6H
- ╟─────────┬─┬────┼────────────────╢
- ║IP 19..16│0│ OPCODE 10..0 ║ 8H
- ╟─────────┴─┴────┼────────────────╢
- ║ OPERAND POINTER 15..0 ║ AH
- ╟─────────┬─┬────┼────────────────╢
- ║OP 19..16│0│0 0 0 0 0 0 0 0 0 0 0║ CH
- ╚═════════╧═╧════╪════════════════╝
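-
- The protected-mode layouts can be mirrored in C structures, which may be
- convenient when an environment image is examined from a high-level
- language. The sketch below is illustrative only: the structure and function
- names are assumptions, and it presumes a little-endian compiler that
- inserts no padding between these fields.
-
```c
#include <stddef.h>
#include <stdint.h>

/* 32-bit protected-mode environment image (Figure 4-3), 28 bytes. */
struct env387_pm32 {
    uint16_t control_word;      /* 0H                              */
    uint16_t reserved0;
    uint16_t status_word;       /* 4H                              */
    uint16_t reserved1;
    uint16_t tag_word;          /* 8H                              */
    uint16_t reserved2;
    uint32_t ip_offset;         /* CH                              */
    uint16_t cs_selector;       /* 10H, bits 15..0                 */
    uint16_t opcode;            /* 10H, bits 26..16 = opcode 10..0 */
    uint32_t operand_offset;    /* 14H                             */
    uint16_t operand_selector;  /* 18H                             */
    uint16_t reserved3;
};

/* 16-bit protected-mode environment image (Figure 4-5), 14 bytes. */
struct env387_pm16 {
    uint16_t control_word, status_word, tag_word;
    uint16_t ip_offset, cs_selector, operand_offset, operand_selector;
};

/* 16-bit real-address-mode image (Figure 4-6): rebuild the 20-bit
 * instruction pointer from the words stored at offsets 6H and 8H. */
static uint32_t real16_instruction_pointer(uint16_t word6, uint16_t word8)
{
    return ((uint32_t)(word8 >> 12) << 16) | word6;
}

/* In the same image, the opcode occupies bits 10..0 of the word at 8H. */
static unsigned real16_opcode(uint16_t word8)
{
    return word8 & 0x07FFu;
}
```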
-
-
- 4.8.10 FLDENV source
-
- FLDENV (load environment) reloads the environment from the memory area
- defined by the source operand. This data should have been written by a
- previous FSTENV/FNSTENV instruction. CPU instructions (that do not reference
- the environment image) may immediately follow FLDENV. FLDENV automatically
- waits for all data transfers to complete before executing the next
- instruction.
-
- Note that loading an environment image that contains an unmasked exception
- causes a numeric exception when the next WAIT or exception-checking ESC
- instruction is executed.
-
-
- 4.8.11 FINCSTP
-
- FINCSTP (increment NPX stack pointer) adds 1 to the stack top pointer (TOP)
- in the status word. It does not alter tags or register contents, nor does it
- transfer data. It is not equivalent to popping the stack, because it does
- not set the tag of the previous stack top to empty. Incrementing the stack
- pointer when TOP=7 produces TOP=0.
-
-
- 4.8.12 FDECSTP
-
- FDECSTP (decrement NPX stack pointer) subtracts 1 from TOP, the stack top
- pointer in the status word. No tags or registers are altered, nor is any
- data transferred. Executing FDECSTP when TOP=0 produces TOP=7.
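-
- The wraparound behavior of FINCSTP and FDECSTP amounts to arithmetic
- modulo 8 on the three-bit stack top field. A minimal C model, with
- hypothetical function names, might look like this:
-
```c
/* The stack top pointer is a 3-bit field, so stack-pointer arithmetic
 * wraps modulo 8. Tags and register contents are not modeled here because
 * neither instruction changes them. */
static unsigned fincstp(unsigned top) { return (top + 1u) & 7u; }
static unsigned fdecstp(unsigned top) { return (top - 1u) & 7u; }
```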
-
-
- 4.8.13 FFREE destination
-
- FFREE (free register) changes the destination register's tag to empty; the
- content of the register is unaffected.
-
-
- 4.8.14 FNOP
-
- FNOP (no operation) effectively performs no operation.
-
-
- 4.8.15 FWAIT (CPU Instruction)
-
- FWAIT is not actually an 80387 instruction, but an alternate mnemonic for
- the 80386 WAIT instruction. The FWAIT or WAIT mnemonic should be coded
- whenever the programmer wants to check for a pending error before modifying
- a variable used in the previous floating-point instruction. Coding an FWAIT
- instruction after an 80387 instruction ensures that unmasked numeric
- exceptions occur and exception handlers are invoked before the next
- instruction has a chance to examine the results of the 80387 instruction.
-
- More information on when to code an FWAIT instruction is given in Chapter 5
- in the section "Concurrent Processing with the 80387."
-
-
-
- Chapter 5 Programming Numeric Applications
-
- ───────────────────────────────────────────────────────────────────────────
-
- 5.1 Programming Facilities
-
- As described previously, the 80387 NPX is programmed simply as an extension
- of the 80386 CPU. This section describes how programmers in ASM386 and in a
- variety of higher-level languages can work with the 80387.
-
- The level of detail in this section is intended to give programmers a basic
- understanding of the software tools that can be used with the 80387, but
- this information does not document the full capabilities of these
- facilities. Complete documentation is available with each program
- development product.
-
-
- 5.1.1 High-Level Languages
-
- For programmers using high-level languages, the programming and operation
- of the NPX is handled automatically by the compiler. A variety of Intel
- high-level languages are available that automatically make use of the 80387
- NPX when appropriate. These languages include C-386 and PL/M-386. In
- addition many high-level language compilers are available from independent
- software vendors.
-
- Each of these high-level languages has special numeric libraries allowing
- programs to take advantage of the capabilities of the 80387 NPX. No special
- programming conventions are necessary to make use of the 80387 NPX when
- programming numeric applications in any of these languages.
-
- Programmers in PL/M-386 and ASM386 can also use many of these library
- routines through the 80387 Support Library. This library implements many
- of the functions provided by higher-level languages, including exception
- handlers, ASCII-to-floating-point conversions, and a more complete set of
- transcendental functions than that provided by the 80387 instruction set.
-
-
- 5.1.2 C Programs
-
- C programmers automatically cause the C compiler to generate 80387
- instructions when they use the double and float data types. The float type
- corresponds to the 80387's single real format; the double type corresponds
- to the 80387's double real format. The statement #include <math.h> causes
- mathematical functions such as sin and sqrt to return values of type
- double. Figure 5-1 illustrates the ease with which C programs interface
- with the 80387.
-
-
- Figure 5-1. Sample C-386 Program
-
- XENIX286 C386 COMPILER, V0.2 COMPILATION OF MODULE SAMPLE
- OBJECT MODULE PLACED IN sample.obj
- COMPILER INVOKED BY: c386 sample.c
-
- stmt level
-
- 1 /******************************************************
- 2 * *
- 3 * SAMPLE C PROGRAM *
- 4 * *
- 5 ******************************************************/
- 6
- 7 /** Include /usr/include/stdio.h if necessary **/
- 8 /** Include math declarations for transcendentals and others **/
- 9
- 10 #include </usr/include/math.h>
- 36 #define PI 3.141592654
- 37
- 38 main()
- 39 {
- 40 1 double sin_result, cos_result;
- 41 1 double angle_deg = 0.0, angle_rad;
- 42 1 int i, no_of_trial = 4;
- 43
- 44 1 for( i = 1; i <= no_of_trial; i++){
- 45 2 angle_rad = angle_deg * PI / 180.0;
- 46 2 sin_result = sin (angle_rad);
- 47 2 cos_result = cos (angle_rad);
- 48 2 printf("sine of %f degrees equals %f\n", angle_deg, sin_result);
- 49 2 printf("cosine of %f degrees equals %f\n\n", angle_deg, cos_result);
- 50 2 angle_deg = angle_deg + 30.0;
- 51 2 }
- 52 1 /** etc. **/
- 53 1 }
-
- C386 COMPILATION COMPLETE. 0 WARNINGS, 0 ERRORS
-
-
- 5.1.3 PL/M-386
-
- Programmers in PL/M-386 can access a very useful subset of the 80387's
- numeric capabilities. The PL/M-386 REAL data type corresponds to the NPX's
- single real (32-bit) format. This data type provides a range of about
- 1.18 * 10^(-38) ≤ │X│ ≤ 3.40 * 10^(38), with about seven significant decimal
- digits. This representation is adequate for the data manipulated by many
- microcomputer applications.
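-
- The limits of the single real format can also be read from a C
- implementation's <float.h>, provided the host maps float to the IEEE
- single format (an assumption of this sketch). FLT_DIG reports the number of
- decimal digits guaranteed to survive a round trip, which is 6; the 24-bit
- significand itself carries a little over seven decimal digits. The
- function names are hypothetical.
-
```c
#include <float.h>

/* Assumes float is the IEEE single format, the same 32-bit layout as the
 * NPX single real type. */
static int   single_real_digits(void)     { return FLT_DIG; }  /* round-trip digits  */
static float single_real_min_normal(void) { return FLT_MIN; }  /* smallest normal    */
static float single_real_max(void)        { return FLT_MAX; }  /* largest finite     */
```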
-
- The utility of the REAL data type is extended by the PL/M-386 compiler's
- practice of holding intermediate results in the 80387's extended real
- format. This means that the full range and precision of the processor are
- utilized for intermediate results. Underflow, overflow, and rounding
- exceptions are most likely to occur during intermediate computations rather
- than during calculation of an expression's final result. Holding
- intermediate results in extended-precision real format greatly reduces the
- likelihood of overflow and underflow and eliminates roundoff as a serious
- source of error until the final assignment of the result is performed.
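-
- The benefit of wide intermediates can be demonstrated in C. In the sketch
- below, double stands in for the 80387's extended real format (an assumption
- made for portability), and the function names are hypothetical. The first
- routine rounds every intermediate to single precision; the second rounds
- only the final result.
-
```c
/* Every intermediate is forced back to single precision. With
 * a = 3e20 and b = 4e20, a*a exceeds the single real range and
 * overflows to infinity (assuming IEEE arithmetic). */
static float scaled_squares_narrow(float a, float b, float scale)
{
    volatile float aa = a * a;
    volatile float bb = b * b;
    volatile float sum = aa + bb;
    return sum / scale;
}

/* Intermediates are held in a wider format; only the final result is
 * rounded to single precision, so the same inputs give a finite answer. */
static float scaled_squares_wide(float a, float b, float scale)
{
    double aa = (double)a * a;
    double bb = (double)b * b;
    return (float)((aa + bb) / scale);
}
```
-
- For a = 3e20, b = 4e20, and scale = 1e30, the narrow version returns
- infinity, while the wide version returns a value near 2.5e11.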
-
- The compiler generates 80387 code to evaluate expressions that contain REAL
- data types, whether variables or constants or both. This means that
- addition, subtraction, multiplication, division, comparison, and assignment
- of REALs will be performed by the NPX. INTEGER expressions, on the other
- hand, are evaluated on the CPU.
-
- Five built-in procedures (Table 5-1) give the PL/M-386 programmer access to
- 80387 functions manipulated by the processor control instructions. Prior to
- any arithmetic operations, a typical PL/M-386 program will set up the NPX
- using the INIT$REAL$MATH$UNIT procedure and then issue SET$REAL$MODE to
- configure the NPX. SET$REAL$MODE loads the 80387 control word, and its
- 16-bit parameter has the format shown for the control word in Chapter 2.
- The recommended value of this parameter is 033EH (round to nearest, 64-bit
- precision, all exceptions masked except invalid operation). Other settings
- may be used at the programmer's discretion.
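-
- To see why 033EH selects these settings, the control word can be split
- into its fields. The C sketch below assumes the control word layout shown
- in Chapter 2 (exception masks in bits 5..0, precision control in bits 9..8,
- rounding control in bits 11..10); the structure and function names are
- hypothetical.
-
```c
#include <stdint.h>

struct control_fields {
    unsigned exception_masks;  /* bits 5..0: PM UM OM ZM DM IM (1 = masked) */
    unsigned precision;        /* bits 9..8: 00 = 24, 10 = 53, 11 = 64 bits */
    unsigned rounding;         /* bits 11..10: 00 = round to nearest        */
};

static struct control_fields decode_control_word(uint16_t cw)
{
    struct control_fields f;
    f.exception_masks = cw & 0x3Fu;
    f.precision       = (cw >> 8) & 3u;
    f.rounding        = (cw >> 10) & 3u;
    return f;
}

/* Significand width selected by the precision control field;
 * returns 0 for the reserved encoding 01. */
static int precision_bits(unsigned pc)
{
    switch (pc) {
    case 0:  return 24;
    case 2:  return 53;
    case 3:  return 64;
    default: return 0;
    }
}
```
-
- For 033EH the mask field is 3EH, so only bit 0 (invalid operation) is
- unmasked; the precision field is 11 (64 bits) and the rounding field is 00
- (round to nearest).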
-
- If any exceptions are unmasked, an exception handler must be provided in
- the form of an interrupt procedure that is designated to be invoked via CPU
- interrupt vector number 16. The exception handler can use the GET$REAL$ERROR
- procedure to obtain the low-order byte of the 80387 status word and then
- clear the exception flags. The byte returned by GET$REAL$ERROR contains the
- exception flags; these can be examined to determine the source of the
- exception.
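-
- A handler might examine the returned byte along the following lines. This
- C sketch assumes the status word layout from Chapter 2 (exception flags in
- bits 5..0, stack fault in bit 6, error summary in bit 7); the constant and
- function names are hypothetical.
-
```c
#include <stdint.h>

enum {
    FLAG_INVALID       = 0x01,
    FLAG_DENORMAL      = 0x02,
    FLAG_ZERODIVIDE    = 0x04,
    FLAG_OVERFLOW      = 0x08,
    FLAG_UNDERFLOW     = 0x10,
    FLAG_PRECISION     = 0x20,
    FLAG_STACK_FAULT   = 0x40,
    FLAG_ERROR_SUMMARY = 0x80
};

/* Return the lowest-numbered exception flag set in the error byte,
 * or 0 if none of the six exception flags is set. */
static unsigned first_exception(uint8_t error_byte)
{
    unsigned flags = error_byte & 0x3Fu;
    return flags & (~flags + 1u);      /* isolate the lowest set bit */
}

/* An invalid-operation exception accompanied by the stack fault bit
 * indicates stack overflow or underflow rather than an invalid operand. */
static int is_stack_fault(uint8_t error_byte)
{
    return (error_byte & (FLAG_INVALID | FLAG_STACK_FAULT))
           == (FLAG_INVALID | FLAG_STACK_FAULT);
}
```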
-
- The SAVE$REAL$STATUS and RESTORE$REAL$STATUS procedures are provided
- for multitasking environments where a running task that uses the 80387 may
- be preempted by another task that also uses the 80387. It is the
- responsibility of the operating system to issue SAVE$REAL$STATUS before it
- executes any statements that affect the 80387; these include the
- INIT$REAL$MATH$UNIT and SET$REAL$MODE procedures as well as arithmetic
- expressions. SAVE$REAL$STATUS saves the 80387 state (registers, status, and
- control words, etc.) on the CPU's stack. RESTORE$REAL$STATUS reloads the
- state information; the preempting task must invoke this procedure before
- terminating in order to restore the 80387 to its state at the time the
- running task was preempted. This enables the preempted task to resume
- execution from the point of its preemption.
-
-
- Table 5-1. PL/M-386 Built-In Procedures
-
- Procedure 80387 Description
- Instruction
-
- INIT$REAL$MATH$UNIT FINIT Initialize processor.
- SET$REAL$MODE FLDCW Set exception masks, rounding,
- precision, and infinity controls.
- GET$REAL$ERROR FNSTSW Store, then clear, exception flags.
- & FNCLEX
- SAVE$REAL$STATUS FNSAVE Save processor state.
- RESTORE$REAL$STATUS FRSTOR Restore processor state.
-
-
- 5.1.4 ASM386
-
- The ASM386 assembly language provides programmers with complete access to
- all of the facilities of the 80386 and 80387 processors.
-
- The programmer's view of the 80386/80387 hardware is a single machine with
- these resources:
-
- ■ 160 instructions
- ■ 12 data types
- ■ 8 general registers
- ■ 6 segment registers
- ■ 8 floating-point registers, organized as a stack
-
-
- 5.1.4.1 Defining Data
-
- The ASM386 directives shown in Table 5-2 allocate storage for 80387
- variables and constants. As with other storage allocation directives, the
- assembler associates a type with any variable defined with these directives.
- The type value is equal to the length of the storage unit in bytes (10 for
- DT, 8 for DQ, etc.). The assembler checks the type of any variable coded in
- an instruction to be certain that it is compatible with the instruction.
- For example, the coding FIADD ALPHA will be flagged as an error if ALPHA's
- type is not 2 or 4, because integer addition is only available for word and
- short integer (doubleword) data types. The operand's type also tells the
- assembler which machine instruction to produce; although to the programmer
- there is only an FIADD instruction, a different machine instruction is
- required for each operand type.
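-
- The assembler's choice can be pictured as a lookup from operand type to
- machine encoding. The C sketch below assumes the FIADD encodings listed in
- the instruction set reference (first opcode byte DEH for a word integer,
- DAH for a short integer); the function name is hypothetical.
-
```c
/* Map the declared type (size in bytes) of a FIADD memory operand to the
 * first opcode byte. A return value of 0 stands for the assembler's
 * "incompatible operand type" error. */
static unsigned fiadd_opcode(unsigned type_bytes)
{
    switch (type_bytes) {
    case 2:  return 0xDE;   /* FIADD word integer  */
    case 4:  return 0xDA;   /* FIADD short integer */
    default: return 0;      /* no integer add for this operand type */
    }
}
```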
-
- On occasion it is desirable to use an instruction with an operand that has
- no declared type. For example, if register BX points to a short integer
- variable, a programmer may want to code FIADD [BX]. This can be done by
- informing the assembler of the operand's type in the instruction, coding
- FIADD DWORD PTR [BX]. The corresponding overrides for the other storage
- allocations are WORD PTR, QWORD PTR, and TBYTE PTR.
-
- The assembler does not, however, check the types of operands used in
- processor control instructions. Coding FRSTOR [BP] implies that the
- programmer has set up register BP to point to the location (probably in the
- stack) where the processor's 94-byte state record has been previously saved.
-
- The initial values for 80387 constants may be coded in several different
- ways. Binary integer constants may be specified as bit strings, decimal
- integers, octal integers, or hexadecimal strings. Packed decimal values are
- normally written as decimal integers, although the assembler will accept and
- convert other representations of integers. Real values may be written as
- ordinary decimal real numbers (decimal point required), as decimal numbers
- in scientific notation, or as hexadecimal strings. Using hexadecimal strings
- is primarily intended for defining special values such as infinities, NaNs,
- and denormalized numbers. Most programmers will find that ordinary decimal
- and scientific decimal provide the simplest way to initialize 80387
- constants. Figure 5-2 compares several ways of setting the various 80387
- data types to the same initial value.
-
- Note that preceding 80387 variables and constants with the ASM386 EVEN
- directive ensures that the operands will be word-aligned in memory. The best
- performance is obtained when data transfers are double-word aligned. All
- 80387 data types occupy integral numbers of words so that no storage is
- "wasted" if blocks of variables are defined together and preceded by a
- single EVEN declarative.
-
-
- Table 5-2. ASM386 Storage Allocation Directives
-
- Directive Interpretation Data Types
-
- DW Define Word Word integer
- DD Define Doubleword Short integer, short real
- DQ Define Quadword Long integer, long real
- DT Define Tenbyte Packed decimal, temporary real
-
-
- Figure 5-2. Sample 80387 Constants
-
- ; THE FOLLOWING ALL ALLOCATE THE CONSTANT: -126
- ; NOTE TWO'S COMPLEMENT STORAGE OF NEGATIVE BINARY INTEGERS.
- ;
- ; EVEN ; FORCE WORD ALIGNMENT
- WORD_INTEGER DW 1111111110000010B ; BIT STRING
- SHORT_INTEGER DD 0FFFFFF82H ; HEX STRING MUST START
- ; WITH DIGIT
- LONG_INTEGER DQ -126 ; ORDINARY DECIMAL
- SINGLE_REAL DD -126.0 ; NOTE PRESENCE OF '.'
- DOUBLE_REAL DQ -1.26E2 ; "SCIENTIFIC"
- PACKED_DECIMAL DT -126 ; ORDINARY DECIMAL INTEGER
- ;
- ; IN THE FOLLOWING, SIGN AND EXPONENT IS 'C005'
- ; SIGNIFICAND IS 'FC00...00', 'R' INFORMS ASSEMBLER THAT
- ; THE STRING REPRESENTS A REAL DATA TYPE.
- ;
- EXTENDED_REAL DT 0C005FC00000000000000R ; HEX STRING
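-
- The bit patterns for -126 can be checked by hand or with a short program.
- The following C sketch computes the word integer, short integer, packed
- decimal, and extended real encodings of an integer. It assumes the data
- formats described in Chapter 2 (explicit integer bit and exponent bias of
- 16383 for extended real, sign byte plus 18 BCD digits for packed decimal);
- all names are hypothetical.
-
```c
#include <stdint.h>

/* Two's complement images of a small integer. */
static uint16_t word_integer(int v)  { return (uint16_t)v; }
static uint32_t short_integer(int v) { return (uint32_t)v; }

struct extended_real {
    uint16_t sign_exp;      /* sign bit and 15-bit biased exponent */
    uint64_t significand;   /* 64 bits with explicit integer bit   */
};

/* Encode an integer as an extended real by normalizing its magnitude
 * until the integer bit (bit 63) is set. */
static struct extended_real encode_extended(int64_t v)
{
    struct extended_real r = {0, 0};
    unsigned exponent = 16383 + 63;     /* bias + weight of bit 63 */
    uint64_t mag;

    if (v == 0)
        return r;
    mag = v < 0 ? 0 - (uint64_t)v : (uint64_t)v;
    while ((mag >> 63) == 0) {
        mag <<= 1;
        exponent--;
    }
    r.sign_exp = (uint16_t)((v < 0 ? 0x8000u : 0u) | exponent);
    r.significand = mag;
    return r;
}

/* Encode an integer as 10-byte packed decimal, low-order byte first. */
static void encode_packed_decimal(int64_t v, uint8_t out[10])
{
    uint64_t mag = v < 0 ? 0 - (uint64_t)v : (uint64_t)v;
    for (int i = 0; i < 9; i++) {
        unsigned lo = (unsigned)(mag % 10); mag /= 10;
        unsigned hi = (unsigned)(mag % 10); mag /= 10;
        out[i] = (uint8_t)((hi << 4) | lo);
    }
    out[9] = v < 0 ? 0x80 : 0x00;       /* sign byte */
}
```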
-
-
- 5.1.4.2 Records and Structures
-
- The ASM386 RECORD and STRUC (structure) declaratives can be very useful in
- NPX programming. The record facility can be used to define the bit fields of
- the control, status, and tag words. Figure 5-3 shows one definition of the
- status word and how it might be used in a routine that repeats FPREM1 until
- the argument reduction is complete (condition code bit C2 is clear).
-
- Because structures allow different but related data types to be grouped
- together, they often provide a natural way to represent "real world" data
- organizations. The fact that the structure template may be "moved" about in
- memory adds to its flexibility. Figure 5-4 shows a simple structure that
- might be used to represent data consisting of a series of test score
- samples. A structure could also be used to define the organization of the
- information stored and loaded by the FSTENV and FLDENV instructions.
-
-
- Figure 5-3. Status Word Record Definition
-
- ; RESERVE SPACE FOR STATUS WORD
- STATUS_WORD DW ?
- ; LAY OUT STATUS WORD FIELDS
- STATUS RECORD
- & BUSY: 1,
- & COND_CODE3: 1,
- & STACK_TOP: 3,
- & COND_CODE2: 1,
- & COND_CODE1: 1,
- & COND_CODE0: 1,
- & INT_REQ: 1,
- & S_FLAG: 1,
- & P_FLAG: 1,
- & U_FLAG: 1,
- & O_FLAG: 1,
- & Z_FLAG: 1,
- & D_FLAG: 1,
- & I_FLAG: 1
- ; REDUCE UNTIL COMPLETE
- REDUCE: FPREM1
- FNSTSW STATUS_WORD
- TEST STATUS_WORD, MASK COND_CODE2
- JNZ REDUCE
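-
- The same field layout can be expressed with masks and shifts in a
- high-level language. The C sketch below follows the bit positions implied
- by the record above, reading from the most significant bit down; the macro
- and function names are hypothetical.
-
```c
#include <stdint.h>

#define SW_BUSY       0x8000u   /* bit 15      */
#define SW_COND_CODE3 0x4000u   /* bit 14      */
#define SW_STACK_TOP  0x3800u   /* bits 13..11 */
#define SW_COND_CODE2 0x0400u   /* bit 10      */
#define SW_COND_CODE1 0x0200u   /* bit 9       */
#define SW_COND_CODE0 0x0100u   /* bit 8       */

/* Extract the 3-bit stack top pointer. */
static unsigned stack_top(uint16_t sw)
{
    return (sw & SW_STACK_TOP) >> 11;
}

/* Nonzero while a partial-remainder reduction is incomplete (C2 set). */
static int reduction_incomplete(uint16_t sw)
{
    return (sw & SW_COND_CODE2) != 0;
}

/* Gather C3, C2, C1, C0 into a 4-bit value, C3 most significant. */
static unsigned condition_codes(uint16_t sw)
{
    return ((sw & SW_COND_CODE3) ? 8u : 0u)
         | ((sw & SW_COND_CODE2) ? 4u : 0u)
         | ((sw & SW_COND_CODE1) ? 2u : 0u)
         | ((sw & SW_COND_CODE0) ? 1u : 0u);
}
```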
-
-
- Figure 5-4. Structure Definition
-
- SAMPLE STRUC
- N_OBS DD ? ; SHORT INTEGER
- MEAN DQ ? ; DOUBLE REAL
- MODE DW ? ; WORD INTEGER
- STD_DEV DQ ? ; DOUBLE REAL
- ; ARRAY OF OBSERVATIONS -- WORD INTEGER
- TEST_SCORES DW 1000 DUP (?)
- SAMPLE ENDS
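-
- For comparison, a C declaration with the same field order and sizes is
- sketched below. It relies on the #pragma pack extension, which is common
- but not part of standard C, to reproduce the unpadded assembly layout, and
- it assumes an 8-byte double; the structure name is hypothetical.
-
```c
#include <stddef.h>
#include <stdint.h>

#pragma pack(push, 1)            /* assumption: compiler supports this pragma */
struct sample {
    int32_t n_obs;               /* DD: short integer           (offset 0)  */
    double  mean;                /* DQ: double real             (offset 4)  */
    int16_t mode;                /* DW: word integer            (offset 12) */
    double  std_dev;             /* DQ: double real             (offset 14) */
    int16_t test_scores[1000];   /* DW 1000 DUP (?)             (offset 22) */
};
#pragma pack(pop)
```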
-
-
- 5.1.4.3 Addressing Methods
-
- 80387 memory data can be accessed with any of the memory addressing methods
- provided by the ModR/M byte and (optionally) the SIB byte. This means that
- 80387 data types can be incorporated in data aggregates ranging from simple
- to complex according to the needs of the application. The addressing methods
- and the ASM386 notation used to specify them in instructions make the
- accessing of structures, arrays, arrays of structures, and other
- organizations direct and straightforward. Table 5-3 gives several examples
- of 80387 instructions coded with operands that illustrate different
- addressing methods.
-
-
- Table 5-3. Addressing Method Examples
-
- Coding Interpretation
-
- FIADD ALPHA ALPHA is a simple scalar (mode is direct).
-
- FDIVR ALPHA.BETA BETA is a field in a structure that is
- "overlaid" on ALPHA (mode is direct).
-
- FMUL QWORD PTR [BX] BX contains the address of a long real
- variable (mode is register indirect).
-
- FSUB ALPHA [SI] ALPHA is an array and SI contains the
- offset of an array element from the start of
- the array (mode is indexed).
-
- FILD [BP].BETA BP contains the address of a structure on
- the CPU stack and BETA is a field in the
- structure (mode is based).
-
- FBLD TBYTE PTR [BX] [DI] BX contains the address of a packed
- decimal array and DI contains the offset of
- an array element (mode is based indexed).
-
-
- 5.1.5 Comparative Programming Example
-
- Figures 5-5 and 5-6 show the PL/M-386 and ASM386 code for a simple 80387
- program, called ARRSUM. The program references an array (X$ARRAY), which
- contains 0-100 single real values; the integer variable N$OF$X indicates the
- number of array elements the program is to consider. ARRSUM steps through
- X$ARRAY accumulating three sums:
-
- ■ SUM$X, the sum of the array values
-
- ■ SUM$INDEXES, the sum of each array value times its index, where the
- index of the first element is 1, the second is 2, etc.
-
- ■ SUM$SQUARES, the sum of each array element squared
-
- (A true program, of course, would go beyond these steps to store and use
- the results of these calculations.) The control word is set with the
- recommended values: round to nearest, 64-bit precision, interrupts enabled,
- and all exceptions masked except invalid operation. It is assumed that an
- exception handler has been written to field the invalid operation if it
- occurs, and that it is invoked by interrupt pointer 16. Either version of
- the program will run on an actual or an emulated 80387 without altering the
- code shown.
-
- The PL/M-386 version of ARRSUM (Figure 5-5) is very straightforward and
- illustrates how easily the 80387 can be used in this language. After
- declaring variables, the program calls built-in procedures to initialize the
- processor (or its emulator) and to load the control word. The program
- clears the sum variables and then steps through X$ARRAY with a DO-loop. The
- loop control takes into account PL/M-386's practice of considering the
- index of the first element of an array to be 0. In the computation of
- SUM$INDEXES, the built-in procedure FLOAT converts I+1 from integer to real
- because the language does not support "mixed mode" arithmetic. One of the
- strengths of the NPX, of course, is that it does support arithmetic on mixed
- data types (because all values are converted internally to the 80-bit
- extended-precision real format).
-
- The ASM386 version (Figure 5-6) defines the external procedure INIT387,
- which makes the different initialization requirements of the processor and
- its emulator transparent to the source code. After defining the data and
- setting up the segment registers and stack pointer, the program calls
- INIT387 and loads the control word. The computation begins with the next
- three instructions, which clear three registers by loading (pushing) zeros
- onto the stack. As shown in Figure 5-7, these registers remain at the
- bottom of the stack throughout the computation while temporary values are
- pushed on and popped off the stack above them.
-
- The program uses the CPU LOOP instruction to control its iteration through
- X_ARRAY; register ECX, which LOOP automatically decrements, is loaded with
- N_OF_X, the number of array elements to be summed. Register ESI is used to
- select (index) the array elements. The program steps through X_ARRAY from
- back to front, so ESI is initialized to point at the element just beyond the
- first element to be processed. The ASM386 TYPE operator is used to determine
- the number of bytes in each array element. This permits changing X_ARRAY to
- a double-precision real array by simply changing its definition (DD to DQ)
- and reassembling.
-
- Figure 5-7 shows the effect of the instructions in the program loop on the
- NPX register stack. The figure assumes that the program is in its first
- iteration, that N_OF_X is 20, and that X_ARRAY(19) (the 20th element)
- contains the value 2.5. When the loop terminates, the three sums are left as
- the top stack elements so that the program ends by simply popping them into
- memory variables.
-
-
- Figure 5-5. Sample PL/M-386 Program
-
- XENIX286 PL/M-386 DEBUG X291a COMPILATION OF MODULE ARRAYSUM
- OBJECT MODULE PLACED IN arraysum.obj
- COMPILER INVOKED BY: plm386 arraysum.plm
-
-
- /***********************************************************
- * *
- * ARRAYSUM MODULE *
- * *
- ***********************************************************/
-
- 1 array$sum: do;
-
- 2 1 declare (sum$x, sum$indexes, sum$squares) real;
- 3 1 declare x$array(100) real;
- 4 1 declare (n$of$x, i) integer;
- 5 1 declare control$387 literally '033eh';
-
- /* Assume x$array and n$of$x are initialized */
- 6 1 call init$real$math$unit;
- 7 1 call set$real$mode(control$387);
-
- /* Clear sums */
- 8 1 sum$x, sum$indexes, sum$squares = 0.0;
-
- /* Loop through array, accumulating sums */
- 9 1 do i = 0 to n$of$x - 1;
- 10 2 sum$x = sum$x + x$array(i);
- 11 2 sum$indexes = sum$indexes + (x$array(i)*float(i+1));
- 12 2 sum$squares = sum$squares + (x$array(i)*x$array(i));
- 13 2 end;
-
- /* etc. */
-
- 14 1 end array$sum;
-
-
- MODULE INFORMATION:
-
- CODE AREA SIZE = 000000A0H 160D
- CONSTANT AREA SIZE = 00000004H 4D
- VARIABLE AREA SIZE = 000001A4H 420D
- MAXIMUM STACK SIZE = 00000004H 4D
- 32 LINES READ
- 0 PROGRAM WARNINGS
- 0 PROGRAM ERRORS
-
- DICTIONARY SUMMARY:
-
- 8KB MEMORY USED
- 0KB DISK SPACE USED
-
- END OF PL/M-386 COMPILATION
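-
- For readers more familiar with C, the same three sums might be written as
- follows. This is only a sketch: the structure and function names are
- hypothetical, and long double is used to imitate the practice of holding
- intermediate results in a wider format than the single real inputs.
-
```c
#include <stddef.h>

struct array_sums {
    float sum_x;         /* sum of the array values           */
    float sum_indexes;   /* sum of value * (1-based index)    */
    float sum_squares;   /* sum of the squared values         */
};

/* Accumulate the three ARRSUM totals over the first n elements of x. */
static struct array_sums arrsum(const float *x, size_t n)
{
    long double sx = 0.0L, si = 0.0L, sq = 0.0L;
    struct array_sums result;

    for (size_t i = 0; i < n; i++) {
        long double v = x[i];
        sx += v;
        si += v * (long double)(i + 1);   /* first element has index 1 */
        sq += v * v;
    }
    /* Round to single precision only when storing the final results. */
    result.sum_x = (float)sx;
    result.sum_indexes = (float)si;
    result.sum_squares = (float)sq;
    return result;
}
```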
-
-
- Figure 5-6. Sample ASM386 Program
-
- XENIX286 80386 MACRO ASSEMBLER V1.0, ASSEMBLY OF MODULE ARRAYSUM
- OBJECT MODULE PLACED IN arraysum.obj
- ASSEMBLER INVOKED BY: asm386 arraysum.asm
-
- LOC OBJ LINE SOURCE
-
- 1 name arraysum
- 2
- 3 ; Define initialization routine
- 4
- 5 extrn init387:far
- 6
- 7 ; Allocate space for data
- 8
- -------- 9 data segment rw public
- 00000000 3E03 10 control_387 dw 033eh
- 00000002 ???????? 11 n_of_x dd ?
- 00000006 (100 12 x_array dd 100 dup (?)
- ????????
- )
- 00000196 ???????? 13 sum_squares dd ?
- 0000019A ???????? 14 sum_indexes dd ?
- 0000019E ???????? 15 sum_x dd ?
- -------- 16 data ends
- 17
- 18 ; Allocate CPU stack space
- 19
- -------- 20 stack stackseg 400
- 21
- 22 ; Begin code
- 23
- -------- 24 code segment er public
- 25
- 26 assume ds:data, ss:stack
- 27
- 00000000 28 start:
- 00000000 66B8---- R 29 mov ax, data
- 00000004 8ED8 30 mov ds, ax
- 00000006 66B8---- R 31 mov ax, stack
- 0000000A B800000000 32 mov eax, 0h
- 0000000F 8E00 33 mov ss, ax
- 00000011 BC00000000 R 34 mov esp, stackstart stack
- 35
- 36 ; Assume x_array and n_of_x have
- 37 ; been initialized
- 38
- 39 ; Prepare the 80387 or its emulator
- 40
- 00000016 9A00000000---- E 41 call init387
- 0000001D D92D00000000 R 42 fldcw control_387
- 43
- 44 ; Clear three registers to hold
- 45 ; running sums
- 46
- 00000023 D9EE 47 fldz
- 00000025 D9EE 48 fldz
- 00000027 D9EE 49 fldz
- 50
- 51 ; Setup ECX as loop counter and ESI
- 52 ; as index into x_array
- 53
- 00000029 8B0D02000000 R 54 mov ecx, n_of_x
- 0000002F F7E9 55 imul ecx
- 00000031 8BF0 56 mov esi, eax
- 57
- 58 ; ESI now contains index of last
- 59 ; element + 1
- 60 ; Loop through x_array and
- 61 ; accumulate sum
- 62
- 00000033 63 sum_next:
- 64 ; backup one element and push on
- 65 ; the stack
- 66
- 00000033 83EE04 67 sub esi, type x_array
- 00000036 D98606000000 R 68 fld x_array[esi]
- 69
- 70 ; add to the sum and duplicate x
- 71 ; on the stack
- 72
- 0000003C DCC3 73 fadd st(3), st
- 0000003E D9C0 74 fld st
- 75
- 76 ; square it and add into the sum of
- 77 ; (index+1) and discard
- 78
- 00000040 DCC8 79 fmul st, st
- 00000042 DEC2 80 faddp st(2), st
- 81
- 82 ; reduce index for next iteration
- 83
- 00000044 FF0D02000000 R 84 dec n_of_x
- 0000004A E2E7 85 loop sum_next
- 86
- 87 ; Pop sums into memory
- 88
- 0000004C 89 pop_results:
- 0000004C D91D96010000 R 90 fstp sum_squares
- 00000052 D91D9A010000 R 91 fstp sum_indexes
- 00000058 D91D9E010000 R 92 fstp sum_x
- 0000005E 9B 93 fwait
- 94
- 95 ;
- 96 ; Etc.
- 97 ;
- -------- 98 code ends
- 99 end start, ds:data, ss:stack
-
- ASSEMBLY COMPLETE, NO WARNINGS, NO ERRORS.
-
-
- Figure 5-7. Instructions and Register Stack
-
- FLDZ, FLDZ, FLDZ FLD X_ARRAY[ESI]
- ╔══════════════╗─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─╔══════════════╗
- ST(0)║ 0.0 ║ SUM_SQUARES ST(0)║ 2.5 ║ X_ARRAY(19)
- ╠══════════════╣ ╠══════════════╣
- ST(1)║ 0.0 ║ SUM_INDEXES ST(1)║ 0.0 ║ SUM_SQUARES
- ╠══════════════╣ ╠══════════════╣
- ST(2)║ 0.0 ║ SUM_X ST(2)║ 0.0 ║ SUM_INDEXES
- ╚══════════════╝ ╠══════════════╣
- ST(3)║ 0.0 ║ SUM_X
- ┌ ─ ─ ─ ─ ─ ─ ─ ─ ─ ╚══════════════╝
- │
- FADD ST(3), ST ─┘ FLD ST
- ╔══════════════╗─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─╔══════════════╗
- ST(0)║ 2.5 ║ X_ARRAY(19) ST(0)║ 2.5 ║ X_ARRAY(19)
- ╠══════════════╣ ╠══════════════╣
- ST(1)║ 0.0 ║ SUM_SQUARES ST(1)║ 2.5 ║ X_ARRAY(19)
- ╠══════════════╣ ╠══════════════╣
- ST(2)║ 0.0 ║ SUM_INDEXES ST(2)║ 0.0 ║ SUM_SQUARES
- ╠══════════════╣ ╠══════════════╣
- ST(3)║ 2.5 ║ SUM_X ST(3)║ 0.0 ║ SUM_INDEXES
- ╚══════════════╝ ╠══════════════╣
- ST(4)║ 2.5 ║ SUM_X
- ┌ ─ ─ ─ ─ ─ ─ ─ ─ ─ ╚══════════════╝
- │
- FMUL ST, ST ──┘ FADDP ST(2), ST
- ╔══════════════╗─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─╔══════════════╗
- ST(0)║ 6.25 ║ X_ARRAY(19)² ST(0)║ 2.5 ║ X_ARRAY(19)
- ╠══════════════╣ ╠══════════════╣
- ST(1)║ 2.5 ║ X_ARRAY(19) ST(1)║ 6.25 ║ SUM_SQUARES
- ╠══════════════╣ ╠══════════════╣
- ST(2)║ 0.0 ║ SUM_SQUARES ST(2)║ 0.0 ║ SUM_INDEXES
- ╠══════════════╣ ╠══════════════╣
- ST(3)║ 0.0 ║ SUM_INDEXES ST(3)║ 2.5 ║ SUM_X
- ╠══════════════╣ ╚══════════════╝
- ST(4)║ 2.5 ║ SUM_X │
- ╚══════════════╝ │
- ┌ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┘
- FIMUL N_OF_X ──┘ FADDP ST(2), ST
- ╔══════════════╗─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─╔══════════════╗
- ST(0)║ 50.0 ║ X_ARRAY(19)*20 ST(0)║ 6.25 ║ SUM_SQUARES
- ╠══════════════╣ ╠══════════════╣
- ST(1)║ 6.25 ║ SUM_SQUARES ST(1)║ 50.0 ║ SUM_INDEXES
- ╠══════════════╣ ╠══════════════╣
- ST(2)║ 0.0 ║ SUM_INDEXES ST(2)║ 2.5 ║ SUM_X
- ╠══════════════╣ ╚══════════════╝
- ST(3)║ 2.5 ║ SUM_X
- ╚══════════════╝
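-
- The stack motion in Figure 5-7 can be traced with a small software model.
- The C sketch below simulates only the register contents and the stack top
- pointer (tags and exceptions are left out), and every name in it is
- hypothetical. Running one step with the value 2.5 and index 20 leaves
- 6.25, 50.0, and 2.5 in ST(0), ST(1), and ST(2), matching the last panel of
- the figure.
-
```c
/* Minimal model of the eight-register NPX stack. */
struct npx_stack {
    double   reg[8];
    unsigned top;
};

/* Address of ST(i), relative to the current stack top. */
static double *st(struct npx_stack *s, unsigned i)
{
    return &s->reg[(s->top + i) & 7u];
}

static void push(struct npx_stack *s, double v)
{
    s->top = (s->top - 1u) & 7u;
    *st(s, 0) = v;
}

static void pop(struct npx_stack *s)
{
    s->top = (s->top + 1u) & 7u;
}

/* FADDP ST(i), ST: add ST into ST(i), then pop. */
static void faddp(struct npx_stack *s, unsigned i)
{
    *st(s, i) += *st(s, 0);
    pop(s);
}

/* FLDZ, FLDZ, FLDZ: clear three registers for the running sums. */
static void init_sums(struct npx_stack *s)
{
    s->top = 0;
    push(s, 0.0);
    push(s, 0.0);
    push(s, 0.0);
}

/* One loop iteration for element value x with 1-based index. */
static void arrsum_step(struct npx_stack *s, double x, int index)
{
    double v;

    push(s, x);                 /* FLD X_ARRAY[ESI]  */
    *st(s, 3) += *st(s, 0);     /* FADD ST(3), ST    */
    v = *st(s, 0);
    push(s, v);                 /* FLD ST            */
    *st(s, 0) = v * v;          /* FMUL ST, ST       */
    faddp(s, 2);                /* FADDP ST(2), ST   */
    *st(s, 0) *= index;         /* FIMUL N_OF_X      */
    faddp(s, 2);                /* FADDP ST(2), ST   */
}
```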
-
-
- 5.1.6 80387 Emulation
-
- The programming of applications to execute on both 80386 systems with an
- 80387 and 80386 systems without an 80387 is made much easier by the existence
- of an 80387 emulator for 80386 systems. The Intel EMUL387 emulator offers a
- complete software counterpart to the 80387 hardware; NPX instructions can be
- simply emulated in software rather than being executed in hardware. With
- software emulation, the distinction between 80386 systems with or without an
- 80387 is reduced to a simple performance differential. Identical numeric
- programs will simply execute more slowly (using software emulation of NPX
- instructions) on 80386 systems without an 80387 than on an 80386/80387
- system executing NPX instructions directly.
-
- When incorporated into the systems software, the emulation of NPX
- instructions on the 80386 systems is completely transparent to the
- applications programmer. Applications software needs no special libraries,
- linking, or other activity to allow it to run on an 80386 with 80387
- emulation.
-
- To the applications programmer, the development of programs for 80386
- systems is the same whether the 80387 NPX hardware is available or not. The
- full 80387 instruction set is available for use, with NPX instructions being
- either emulated or executed directly. Applications programmers need not be
- concerned with the hardware configuration of the computer systems on which
- their applications will eventually run.
-
- For systems programmers, details relating to 80387 emulators are described
- in Chapter 6.
-
- The EMUL387 software emulator for 80386 systems is available from Intel as
- a separate program product.
-
-
- 5.2 Concurrent Processing With the 80387
-
- Because the 80386 CPU and the 80387 NPX have separate execution units, it
- is possible for the NPX to execute numeric instructions in parallel with
- instructions executed by the CPU. This simultaneous execution of different
- instructions is called concurrency.
-
- No special programming techniques are required to gain the advantages of
- concurrent execution; numeric instructions for the NPX are simply placed in
- line with the instructions for the CPU. CPU and numeric instructions are
- initiated in the same order as they are encountered by the CPU in its
- instruction stream. However, because numeric operations performed by the NPX
- generally require more time than operations performed by the CPU, the CPU
- can often execute several of its instructions before the NPX completes a
- numeric instruction previously initiated.
-
- This concurrency offers obvious advantages in terms of execution
- performance, but concurrency also imposes several rules that must be
- observed in order to assure proper synchronization of the 80386 CPU and
- 80387 NPX.
-
- All Intel high-level languages automatically provide for and manage
- concurrency in the NPX. Assembly-language programmers, however, must
- understand and manage some areas of concurrency in exchange for the
- flexibility and performance of programming in assembly language. This
- section is for the assembly-language programmer or well-informed
- high-level-language programmer.
-
-
- 5.2.1 Managing Concurrency
-
- Concurrent execution of the host and 80387 is easy to establish and
- maintain. The activities of numeric programs can be split into two major
- areas: program control and arithmetic. The program control part performs
- activities such as deciding what functions to perform, calculating addresses
- of numeric operands, and loop control. The arithmetic part simply adds,
- subtracts, multiplies, and performs other operations on the numeric
- operands. The NPX and host are designed to handle these two parts separately
- and efficiently.
-
- Concurrency management is required to check for an exception before letting
- the 80386 change a value just used by the 80387. Almost any numeric
- instruction can, under the wrong circumstances, produce a numeric exception.
- For programmers in higher-level languages, all required synchronization is
- automatically provided by the appropriate compiler. In assembly language,
- exception synchronization remains the responsibility of the programmer.
-
- A complication is that a programmer may not expect a numeric program to
- cause numeric exceptions, but in some systems they may occur regularly. To
- better understand these points, consider what can happen when the NPX
- detects an exception.
-
- Depending on options determined by the software system designer, the NPX
- can perform one of two things when a numeric exception occurs:
-
- ■ The NPX can provide a default fix-up for selected numeric exceptions.
- Programs can mask individual exception types to indicate that the NPX
- should generate a safe, reasonable result whenever that exception
- occurs. The default exception fix-up activity is treated by the NPX as
- part of the instruction causing the exception; no external indication
- of the exception is given. When exceptions are detected, a flag is set
- in the numeric status register, but no information regarding where or
- when is available. If the NPX performs its default action for all
- exceptions, then the need for exception synchronization is not
- manifest. However, as will be shown later, this is not sufficient
- reason to ignore exception synchronization when designing programs that
- use the 80387.
-
- ■ As an alternative to the NPX default fix-up of numeric exceptions, the
- 80386 CPU can be notified whenever an exception occurs. When a numeric
- exception is unmasked and the exception occurs, the NPX stops further
- execution of the numeric instruction and signals this event to the CPU.
- On the next occurrence of an ESC or WAIT instruction, the CPU traps to
- a software exception handler. The exception handler can then implement
- any sort of recovery procedures desired for any numeric exception
- detectable by the NPX. Some ESC instructions do not check for
- exceptions. These are the nonwaiting forms FNINIT, FNSTENV, FNSAVE,
- FNSTSW, FNSTCW, and FNCLEX.
-
- When the NPX signals an unmasked exception condition, it is requesting
- help. The fact that the exception was unmasked indicates that further
- numeric program execution under the arithmetic and programming rules of the
- NPX is unreasonable.
-
- If concurrent execution is allowed, the state of the CPU when it recognizes
- the exception is undefined. The CPU may have changed many of its internal
- registers and be executing a totally different program by the time the
- exception occurs. To handle this situation, the NPX has special registers
- updated at the start of each numeric instruction to describe the state of
- the numeric program when the failed instruction was attempted.
-
- Exception synchronization ensures that the NPX is in a well-defined state
- after an unmasked numeric exception occurs. Without a well-defined state, it
- would be impossible for exception recovery routines to determine why the
- numeric exception occurred, or to recover successfully from the exception.
-
- The following two sections illustrate the need to always consider
- exception synchronization when writing 80387 code, even when the code is
- initially intended for execution with exceptions masked. If the code is
- later moved to an environment where exceptions are unmasked, the same code
- may not work correctly. An example of how some instructions written without
- exception synchronization will work initially, but fail when moved into a
- new environment is shown in Figure 5-8.
-
-
- Figure 5-8. Exception Synchronization Examples
-
- INCORRECT ERROR SYNCHRONIZATION
-
- FILD COUNT ; NPX instruction
- INC COUNT ; CPU instruction alters operand
- FSQRT ; subsequent NPX instruction -- error from
- ; previous NPX instruction detected here
-
- PROPER ERROR SYNCHRONIZATION
-
- FILD COUNT ; NPX instruction
- FSQRT ; subsequent NPX instruction -- error from
- ; previous NPX instruction detected here
- INC COUNT ; CPU instruction alters operand
-
-
- 5.2.1.1 Incorrect Exception Synchronization
-
- In Figure 5-8, three instructions are shown to load an integer, calculate
- its square root, then increment the integer. The 80386-to-80387 interface
- and synchronous execution of the NPX emulator will allow this program to
- execute correctly when no exceptions occur on the FILD instruction.
-
- This situation changes if the 80387 numeric register stack is extended to
- memory. To extend the NPX stack to memory, the invalid exception is
- unmasked. A push to a full register or pop from an empty register sets SF
- and causes an invalid exception.
-
- The recovery routine for the exception must recognize this situation, fix
- up the stack, then perform the original operation. The recovery routine
- will not work correctly in the first example shown in the figure. The
- problem is that the value of COUNT is incremented before the NPX can signal
- the exception to the CPU. Because COUNT is incremented before the exception
- handler is invoked, the recovery routine will load an incorrect value of
- COUNT, causing the program to fail or behave unreliably.
-
-
- 5.2.1.2 Proper Exception Synchronization
-
- Exception synchronization relies on the WAIT instruction and the BUSY# and
- ERROR# signals of the 80387. When an unmasked exception occurs in the 80387,
- it asserts the ERROR# signal, signaling to the CPU that a numeric exception
- has occurred. The next time the CPU encounters a WAIT instruction or an
- exception-checking ESC instruction, the CPU acknowledges the ERROR# signal
- by trapping automatically to Interrupt #16, the processor-extension
- exception vector. If the following ESC or WAIT instruction is properly
- placed, the CPU will not yet have disturbed any information vital to
- recovery from the exception.
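-
- The trap point described above can be modelled with a short C sketch. It
- is an illustration only, not Intel code: the type and function names are
- hypothetical, and the model assumes that an unmasked exception raised by
- one instruction is reported at the next WAIT or exception-checking ESC
- instruction in the stream.
-
```c
#include <stddef.h>

/* Instruction classes, as this section distinguishes them (names are
 * hypothetical and exist only for this model). */
typedef enum {
    INS_CPU,         /* ordinary 80386 instruction: never checks ERROR# */
    INS_ESC,         /* exception-checking numeric (ESC) instruction    */
    INS_ESC_NOWAIT,  /* FNINIT, FNSTENV, FNSAVE, FNSTSW, FNSTCW, FNCLEX */
    INS_WAIT         /* WAIT instruction                                */
} ins_class;

/* Index of the instruction at which the CPU traps, given that the
 * instruction at index `faulting` raised an unmasked exception.
 * Returns n if no checking instruction follows in the stream. */
size_t trap_index(const ins_class *stream, size_t n, size_t faulting)
{
    for (size_t i = faulting + 1; i < n; i++) {
        if (stream[i] == INS_ESC || stream[i] == INS_WAIT)
            return i;
    }
    return n;
}

/* Number of CPU instructions that execute between the faulting
 * instruction and the trap; each one may disturb data that the
 * recovery routine needs. */
size_t cpu_ops_before_trap(const ins_class *stream, size_t n, size_t faulting)
{
    size_t stop = trap_index(stream, n, faulting);
    size_t count = 0;
    for (size_t i = faulting + 1; i < stop; i++) {
        if (stream[i] == INS_CPU)
            count++;
    }
    return count;
}
```
-
- Applied to Figure 5-8, the incorrect sequence (FILD, INC, FSQRT) lets one
- CPU instruction run before the trap, while the proper sequence (FILD,
- FSQRT, INC) lets none run.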
-
-
- Chapter 6 System-Level Numeric Programming
-
- ────────────────────────────────────────────────────────────────────────────
-
- System programming for 80387 systems requires a more detailed understanding
- of the 80387 NPX than does application programming. Such things as
- emulation, initialization, exception handling, and data and error
- synchronization are all the responsibility of the systems programmer. These
- topics are covered in detail in the sections that follow.
-
-
- 6.1 80386/80387 Architecture
-
- On a software level, the 80387 NPX appears as an extension of the 80386
- CPU. On the hardware level, however, the mechanisms by which the 80386 and
- 80387 interact are more complex. This section describes how the 80387 NPX
- and 80386 CPU interact and points out features of this interaction that are
- of interest to systems programmers.
-
-
- 6.1.1 Instruction and Operand Transfer
-
- All transfers of instructions and operands between the 80387 and system
- memory are performed by the 80386 using I/O bus cycles. The 80387 appears to
- the CPU as a special peripheral device. It is special in two respects: the
- CPU initiates I/O automatically when it encounters ESC instructions, and the
- CPU uses reserved I/O addresses to communicate with the 80387. These I/O
- operations are completely transparent to software.
-
- Because the 80386 actually performs all transfers between the 80387 and
- memory, no additional bus drivers, controllers, or other components are
- necessary to interface the 80387 NPX to the local bus. The 80387 can utilize
- instructions and operands located in any memory accessible to the 80386 CPU.
-
-
- 6.1.2 Independent of CPU Addressing Modes
-
- Unlike the 80287, the 80387 is not sensitive to the addressing and memory
- management of the CPU. The 80387 operates the same regardless of whether the
- 80386 CPU is operating in real-address mode, in protected mode, or in
- virtual 8086 mode.
-
- The instruction FSETPM that was necessary in 80286/80287 systems to set the
- 80287 into protected mode is not needed for the 80387. The 80387 treats this
- instruction as a no-op.
-
- Because the 80386 actually performs all transfers between the 80387 and
- memory, 80387 instructions can utilize any memory location accessible by the
- task currently executing on the 80386. When operating in protected mode, all
- references to memory operands are automatically verified by the 80386's
- memory management and protection mechanisms as for any other memory
- references by the currently-executing task. Protection violations associated
- with NPX instructions automatically cause the 80386 to trap to an
- appropriate exception handler.
-
- To the numerics programmer, the operating modes of the 80386 affect only
- the manner in which the NPX instruction and data pointers are represented in
- memory following an FSAVE or FSTENV instruction. Each of these instructions
- produces one of four formats depending on both the operating mode and on the
- operand-size attribute in effect for the instruction. The differences are
- detailed in the discussion of the FSAVE and FSTENV instructions in
- Chapter 4.
-
-
- 6.1.3 Dedicated I/O Locations
-
- The 80387 NPX does not require that any memory addresses be set aside for
- special purposes. The 80387 does make use of I/O port addresses, but these
- are 32-bit addresses with the high-order bit set (i.e., >= 80000000H);
- therefore, these I/O operations are completely transparent to the 80386
- software. Because these addresses are beyond the 64 Kbyte I/O addressing
- limit of I/O instructions, 80386 programs cannot reference these reserved
- I/O addresses directly.
-
-
- 6.2 Processor Initialization and Control
-
- One of the principal responsibilities of systems software is the
- initialization, monitoring, and control of the hardware and software
- resources of the system, including the 80387 NPX. In this section, issues
- related to system initialization and control are described, including
- recognition of the NPX, emulation of the 80387 NPX in software if the
- hardware is not available, and the handling of exceptions that may occur
- during the execution of the 80387.
-
-
- 6.2.1 System Initialization
-
- During initialization of an 80386 system, systems software must
-
- ■ Recognize the presence or absence of the NPX.
-
- ■ Set flags in the 80386 MSW to reflect the state of the numeric
- environment.
-
- If an 80387 NPX is present in the system, the NPX must be initialized. All
- of these activities can be quickly and easily performed as part of the
- overall system initialization.
-
-
- 6.2.2 Hardware Recognition of the NPX
-
- The 80386 identifies the type of its coprocessor (80287 or 80387) by
- sampling its ERROR# input some time after the falling edge of RESET and
- before executing the first ESC instruction. The 80287 keeps its ERROR#
- output in inactive state after hardware reset; the 80387 keeps its ERROR#
- output in active state after hardware reset. The 80386 records this
- difference in the ET bit of control register zero (CR0). The 80386
- subsequently uses ET to control its interface with the coprocessor. If ET is
- set, it employs the 32-bit protocol of the 80387; if ET is not set, it
- employs the 16-bit protocol of the 80287.
-
- Systems software can (if necessary) change the value of ET. There are three
- reasons that ET may not be set:
-
- 1. An 80287 is actually present.
-
- 2. No coprocessor is present.
-
- 3. An 80387 is present but it is connected in a nonstandard manner that
- does not trigger the setting of ET.
-
- An example of case three is the PC/AT-compatible design described in
- Appendix F. In such cases, initialization software may need to change the
- value of ET.
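-
- As a rough illustration of the adjustment described above, the following C
- sketch computes a CR0 image whose ET bit matches the coprocessor that
- software recognition found. The function and type names are hypothetical;
- the position of ET (bit 4 of CR0) is taken from the 80386 documentation
- rather than from this section, and the privileged instructions needed to
- read and write CR0 are not shown.
-
```c
#include <stdint.h>

#define CR0_ET (1u << 4)  /* extension type: set selects the 80387 protocol */

typedef enum { NPX_NONE, NPX_80287, NPX_80387 } npx_kind;

/* Return the CR0 image with ET forced to match the coprocessor that
 * was actually found; all other CR0 bits are left unchanged. */
uint32_t cr0_with_et(uint32_t cr0, npx_kind found)
{
    if (found == NPX_80387)
        return cr0 | CR0_ET;       /* 32-bit protocol of the 80387 */
    return cr0 & ~CR0_ET;          /* 16-bit protocol, or no coprocessor */
}
```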
-
-
- 6.2.3 Software Recognition of the NPX
-
- Figure 6-1 shows an example of a recognition routine that determines
- whether an NPX is present, and distinguishes between the 80387 and the
- 8087/80287. This routine can be executed on any 80386, 80286, or 8086
- hardware configuration that has an NPX socket.
-
- The example guards against the possibility of accidentally reading an
- expected value from a floating data bus when no NPX is present. Data read
- from a floating bus is undefined. By expecting to read a specific bit
- pattern from the NPX, the routine protects itself from the indeterminate
- state of the bus. The example also avoids depending on any values in
- reserved bits, thereby maintaining compatibility with future numerics
- coprocessors.
-
-
- Figure 6-1. Software Routine to Recognize the 80287
-
- 8086/87/88/186 MACRO ASSEMBLER Test for presence of a Numerics Chip, Revisio
-
-
- DOS 3.20 (033-N) 8086/87/88/186 MACRO ASSEMBLER V2.0 ASSEMBLY OF MODULE TEST_
- OBJECT MODULE PLACED IN FINDNPX.OBJ
-
- LOC OBJ LINE SOURCE
-
- 1 +1 $title('Test for presence of a Numerics Chip, Revis
- 2
- 3 name Test_NPX
- 4
- ---- 5 stack segment stack 'stack'
- 0000 (100 6 dw 100 dup (?)
- ????
- )
- 00C8 ???? 7 sst dw ?
- ---- 8 stack ends
- 9
- ---- 10 data segment public 'data'
- 0000 0000 11 temp dw 0h
- ---- 12 data ends
- 13
- 14 dgroup group data, stack
- 15 cgroup group code
- 16
- ---- 17 code segment public 'code'
- 18 assume cs:cgroup, ds:dgroup
- 19
- 0000 20 start:
- 21 ;
- 22 ; Look for an 8087, 80287, or 80387 NPX.
- 23 ; Note that we cannot execute WAIT on 8086/88
- 24 ;
- 0000 25 test_npx:
- 0000 90DBE3 26 fninit ; Must use non-wait
- 0003 BE0000 R 27 mov si,offset dgroup:temp
- 0006 C7045A5A 28 mov word ptr [si],5A5AH ; Initialize te
- 000A 90DD3C 29 fnstsw [si] ; Must use non-wait
- 30 ; It is not necessa
- 31 ; after fnstsw or
- 000D 803C00 32 cmp byte ptr [si],0 ; See if correct st
- 0010 752A 33 jne no_npx ; Jump if not a val
- 34 ;
- 35 ; Now see if ones can be correctly written fr
- 36 ;
- 0012 90D93C 37 fnstcw [si] ; Look at the contr
- 38 ; Do not use a WAIT
- 0015 8B04 39 mov ax,[si] ; See if ones can b
- 0017 253F10 40 and ax,103fh ; See if selected p
- 001A 3D3F00 41 cmp ax,3fh ; Check that ones a
- 001D 751D 42 jne no_npx ; Jump if no NPX is
- 43 ;
- 44 ; Some numerics chip is installed. NPX instr
- 45 ; See if the NPX is an 8087, 80287, or 80387.
- 46 ; This code is necessary if a denormal except
- 47 ; new 80387 instructions will be used.
- 48 ;
- 001F 9BD9E8 49 fld1 ; Must use default
- 0022 9BD9EE 50 fldz ; Form infinity
- 0025 9BDEF9 51 fdiv ; 8087/287 says +in
- 0028 9BD9C0 52 fld st ; Form negative inf
- 002B 9BD9E0 53 fchs ; 80387 says +inf <
- 002E 9BDED9 54 fcompp ; See if they are t
- 0031 9BDD3C 55 fstsw [si] ; Look at status fr
- 0034 8B04 56 mov ax,[si]
- 0036 9E 57 sahf ; See if the infini
- 0037 7406 58 je found_87_287 ; Jump if 8087/287
- 59 ;
- 60 ; An 80387 is present. If denormal exception
- 61 ; they must be masked. The 80387 will automa
- 62 ; operands faster than an exception handler c
- 63 ;
- 0039 EB0790 64 jmp found_387
- 003C 65 no_npx:
- 66 ; set up for no NPX
- 67 ; ...
- 68 ;
- 003C EB0490 69 jmp exit
- 003F 70 found_87_287:
- 71 ; set up for 87/287
- 72 ; ...
- 73 ;
- 003F EB0190 74 jmp exit
- 0042 75 found_387:
- 76 ; set up for 387
- 77 ; ...
- 78 ;
- 0042 79 exit:
- ---- 80 code ends
- 81 end start,ds:dgroup,ss:dgroup:sst
-
- ASSEMBLY COMPLETE, NO ERRORS FOUND
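-
- The decision logic of Figure 6-1 can be restated as a small C function.
- This is a sketch for reading the listing, not a replacement for it: the
- function and type names are hypothetical, and the three inputs stand for
- values the assembly routine obtains from FNSTSW, FNSTCW, and the FCOMPP
- comparison of +infinity with -infinity.
-
```c
#include <stdbool.h>
#include <stdint.h>

typedef enum { NO_NPX, NPX_8087_OR_80287, NPX_80387 } npx_result;

/* status_after_fninit  : word stored by FNSTSW after FNINIT
 * control_after_fninit : word stored by FNSTCW after FNINIT
 * infinities_equal     : true if +inf and -inf compared equal */
npx_result classify_npx(uint16_t status_after_fninit,
                        uint16_t control_after_fninit,
                        bool infinities_equal)
{
    /* The listing tests only the low byte of the stored status word
     * (cmp byte ptr [si],0); a floating bus is unlikely to return it
     * as zero after the 5A5AH preload. */
    if ((status_after_fninit & 0x00FFu) != 0)
        return NO_NPX;

    /* Selected control-word bits must read back as the expected mix
     * of ones and zeroes (and ax,103fh / cmp ax,3fh). */
    if ((control_after_fninit & 0x103Fu) != 0x003Fu)
        return NO_NPX;

    /* The 8087/80287 treat +inf and -inf as equal under the default
     * control word; the 80387 does not. */
    return infinities_equal ? NPX_8087_OR_80287 : NPX_80387;
}
```
-
- Because the routine checks for a specific bit pattern, memory that still
- holds the preloaded 5A5AH is classified as "no NPX".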
-
-
- 6.2.4 Configuring the Numerics Environment
-
- Once the 80386 CPU has determined the presence or absence of the 80387 or
- 80287 NPX, the 80386 must set either the MP or the EM bit in its own control
- register zero (CR0) accordingly. The initialization routine can either
-
- ■ Set the MP bit in CR0 to allow numeric instructions to be executed
- directly by the NPX.
-
- ■ Set the EM bit in the CR0 to permit software emulation of the numeric
- instructions.
-
- The MP (monitor coprocessor) flag of CR0 indicates to the 80386 whether an
- NPX is physically available in the system. The MP flag controls the function
- of the WAIT instruction. When executing a WAIT instruction, the 80386 tests
- the task switched (TS) bit only if MP is set; if it finds TS set under these
- conditions, the CPU traps to exception #7.
-
- The Emulation Mode (EM) bit of CR0 indicates to the 80386 whether NPX
- functions are to be emulated. If the CPU finds EM set when it executes an
- ESC instruction, program control is automatically trapped to exception #7,
- giving the exception handler the opportunity to emulate the functions of an
- 80387.
-
- For correct 80386 operation, the EM bit must never be set concurrently with
- MP. The EM and MP bits of the 80386 are described in more detail in the
- 80386 Programmer's Reference Manual. More information on software
- emulation for the 80387 NPX is described in the "80387 Emulation" section
- later in this chapter. In any case, if ESC instructions are to be executed,
- either the MP or EM bit must be set, but not both.
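-
- The rules above can be summarized in a brief C sketch. The function names
- are hypothetical, and the bit positions (MP = bit 1, EM = bit 2, TS = bit
- 3 of CR0) come from the 80386 documentation rather than from this section.
- Only the behavior stated above is modelled; any other conditions under
- which an ESC instruction may trap are deliberately left out.
-
```c
#include <stdbool.h>
#include <stdint.h>

#define CR0_MP (1u << 1)  /* monitor coprocessor */
#define CR0_EM (1u << 2)  /* emulation mode      */
#define CR0_TS (1u << 3)  /* task switched       */

/* Set exactly one of MP and EM, never both, leaving other bits alone. */
uint32_t configure_numerics(uint32_t cr0, bool npx_present)
{
    cr0 &= ~(CR0_MP | CR0_EM);
    return cr0 | (npx_present ? CR0_MP : CR0_EM);
}

/* WAIT tests TS only when MP is set; it traps to exception #7 if TS is
 * then found set. */
bool wait_traps_to_7(uint32_t cr0)
{
    return (cr0 & CR0_MP) != 0 && (cr0 & CR0_TS) != 0;
}

/* An ESC instruction traps to exception #7 when EM is set, which gives
 * the emulator its entry point. */
bool esc_traps_for_emulation(uint32_t cr0)
{
    return (cr0 & CR0_EM) != 0;
}
```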
-
-
- 6.2.5 Initializing the 80387
-
- Initializing the 80387 NPX simply means placing the NPX in a known state
- unaffected by any activity performed earlier. A single FNINIT instruction
- performs this initialization. All the error masks are set, all registers are
- tagged empty, TOP is set to zero, and default rounding and precision
- controls are set. Table 6-1 shows the state of the 80387 NPX following
- FINIT or FNINIT. This state is compatible with that of the 80287 after
- FINIT or after hardware RESET.
-
- The FNINIT instruction does not leave the 80387 in the same state as that
- which results from the hardware RESET signal. Following a hardware RESET
- signal, such as after initial power-up, the state of the 80387 differs in
- the following respects:
-
- 1. The mask bit for the invalid-operation exception is reset.
-
- 2. The invalid-operation exception flag is set.
-
- 3. The exception-summary bit is set (along with its mirror image, the
- B-bit).
-
- These settings cause assertion of the ERROR# signal as described
- previously. The FNINIT instruction must be used to change the 80387 state to
- one compatible with the 80287.
-
-
- Table 6-1. NPX Processor State Following Initialization
-
-    Field                      Value     Interpretation
-
-    Control Word
-      (Infinity Control)       0         Affine
-      Rounding Control         00        Round to nearest
-      Precision Control        11        64 bits
-      Exception Masks          111111    All exceptions masked
-    Status Word
-      (Busy)                   0         ──
-      Condition Code           0000      ──
-      Stack Top                000       Register 0 is stack top
-      Exception Summary        0         No exceptions
-      Stack Flag               0         ──
-      Exception Flags          000000    No exceptions
-    Tag Word
-      Tags                     11        Empty
-    Registers                  N.C.      Not changed
-    Exception Pointers
-      Instruction Code         N.C.      Not changed
-      Instruction Address      N.C.      Not changed
-      Operand Address          N.C.      Not changed
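-
- To relate the control-word rows of Table 6-1 to an actual stored word, the
- following C sketch splits a control word into the four fields listed. The
- structure and function names are hypothetical, and the bit positions
- (masks in bits 0-5, precision control in bits 8-9, rounding control in
- bits 10-11, infinity control in bit 12) are assumed from the control word
- layout given elsewhere in the manual, not from this table.
-
```c
#include <stdint.h>

typedef struct {
    unsigned exception_masks;    /* bits 0-5: one mask bit per exception */
    unsigned precision_control;  /* bits 8-9                             */
    unsigned rounding_control;   /* bits 10-11                           */
    unsigned infinity_control;   /* bit 12                               */
} npx_control_fields;

/* Split a 16-bit control word into the fields shown in Table 6-1. */
npx_control_fields decode_control_word(uint16_t cw)
{
    npx_control_fields f;
    f.exception_masks   = cw & 0x3Fu;
    f.precision_control = (cw >> 8) & 0x3u;
    f.rounding_control  = (cw >> 10) & 0x3u;
    f.infinity_control  = (cw >> 12) & 0x1u;
    return f;
}
```
-
- Under these assumptions, a control word of 037FH decodes to the values in
- the table: masks 111111, precision control 11, rounding control 00, and
- infinity control 0.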
-
-
- 6.2.6 80387 Emulation
-
- If it is determined that no 80387 NPX is available in the system, systems
- software may decide to emulate ESC instructions in software. This emulation
- is easily supported by the 80386 hardware, because the 80386 can be
- configured to trap to a software emulation routine whenever it encounters an
- ESC instruction in its instruction stream.
-
- Whenever the 80386 CPU encounters an ESC instruction, and its MP and EM
- status bits are set appropriately (MP=0, EM=1), the 80386 automatically
- traps to interrupt #7, the "processor extension not available" exception.
- The return link stored on the stack points to the first byte of the ESC
- instruction, including the prefix byte(s), if any. The exception handler can
- use this return link to examine the ESC instruction and proceed to emulate
- the numeric instruction in software.
-
- The emulator must step the return pointer so that, upon return from the
- exception handler, execution can resume at the first instruction following
- the ESC instruction.
-
- To an application program, execution on an 80386 system with 80387
- emulation is almost indistinguishable from execution on a system with an
- 80387, except for the difference in execution speeds.
-
- There are several important considerations when using emulation on an 80386
- system:
-
- ■ When operating in protected mode, numeric applications using the
- emulator must be executed in execute-readable code segments. Numeric
- software cannot be emulated if it is executed in execute-only code
- segments. This is because the emulator must be able to examine the
- particular numeric instruction that caused the emulation trap.
-
- ■ Only privileged tasks can place the 80386 in emulation mode. The
- instructions necessary to place the 80386 in emulation mode are
- privileged instructions, and are not typically accessible to an
- application.
-
- An emulator package (EMUL387) that runs on 80386 systems is available from
- Intel. This emulation package operates in both real and protected mode as
- well as in virtual 8086 mode, providing a complete functional equivalent for
- the 80387 emulated in software.
-
- When using the EMUL387 emulator, writers of numeric exception handlers
- should be aware of one slight difference between the emulated 80387 and the
- 80387 hardware:
-
- ■ On the 80387 hardware, exception handlers are invoked by the 80386 at
- the first WAIT or ESC instruction following the instruction causing the
- exception. The return link, stored on the 80386 stack, points to this
- second WAIT or ESC instruction where execution will resume following a
- return from the exception handler.
-
- ■ Using the EMUL387 emulator, numeric exception handlers are invoked
- from within the emulator itself. The return link stored on the stack
- when the exception handler is invoked will therefore point back to the
- EMUL387 emulator, rather than to the program code actually being
- executed (emulated). An IRET return from the exception handler returns
- to the emulator, which then returns immediately to the emulated
- program. This added layer of indirection should not cause confusion,
- however, because the instruction causing the exception can always be
- identified from the 80387's instruction and data pointers.
-
-
- 6.2.7 Handling Numerics Exceptions
-
- Once the 80387 has been initialized and normal execution of applications
- has commenced, the 80387 NPX may occasionally require attention in
- order to recover from numeric processing exceptions. This section provides
- details for writing software exception handlers for numeric exceptions.
- Numeric processing exceptions have already been introduced in Chapter 3.
-
- The 80387 NPX can take one of two actions when it recognizes a numeric
- exception:
-
- ■ If the exception is masked, the NPX will automatically perform its own
- masked exception response, correcting the exception condition according
- to fixed rules, and then continuing with its instruction execution.
-
- ■ If the exception is unmasked, the NPX signals the exception to the
- 80386 CPU using the ERROR# status line between the two processors. Each
- time the 80386 encounters an ESC or WAIT instruction in its instruction
- stream, the CPU checks the condition of this ERROR# status line. If
- ERROR# is active, the CPU automatically traps to Interrupt vector #16,
- the Processor Extension Error trap.
-
- Interrupt vector #16 typically points to a software exception handler,
- which may or may not be a part of systems software. This exception handler
- takes the form of an 80386 interrupt procedure.
-
- When handling numeric errors, the CPU has two responsibilities:
-
- ■ The CPU must not disturb the numeric context when an error is
- detected.
-
- ■ The CPU must clear the error and attempt recovery from the error.
-
- Although the manner in which programmers may treat these responsibilities
- varies from one implementation to the next, most exception handlers will
- include these basic steps:
-
- ■ Store the NPX environment (control, status, and tag words, operand and
- instruction pointers) as it existed at the time of the exception.
-
- ■ Clear the exception bits in the status word.
-
- ■ Enable interrupts on the CPU.
-
- ■ Identify the exception by examining the status and control words in
- the saved environment.
-
- ■ Take some system-dependent action to rectify the exception.
-
- ■ Return to the interrupted program and resume normal execution.
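-
- The identification step in the list above amounts to a simple bit
- operation on the saved status and control words. The C sketch below shows
- one way to express it; the names are hypothetical, and it assumes the six
- exception flags and their masks occupy bits 0-5 of the status and control
- words respectively, in the same order.
-
```c
#include <stdint.h>

/* Exception flag and mask positions, shared by both words. */
#define NPX_IE 0x01u  /* invalid operation  */
#define NPX_DE 0x02u  /* denormal operand   */
#define NPX_ZE 0x04u  /* zero divide        */
#define NPX_OE 0x08u  /* overflow           */
#define NPX_UE 0x10u  /* underflow          */
#define NPX_PE 0x20u  /* precision          */
#define NPX_EXC_BITS 0x3Fu

/* Exceptions that are both flagged in the saved status word and
 * unmasked in the saved control word; these are the ones the handler
 * was invoked to service. */
unsigned unmasked_exceptions(uint16_t status, uint16_t control)
{
    return (unsigned)status & ~(unsigned)control & NPX_EXC_BITS;
}
```
-
- A flag whose mask bit is set is ignored here, because the NPX has already
- applied its own masked response for that exception.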
-
-
- 6.2.8 Simultaneous Exception Response
-
- In cases where multiple exceptions arise simultaneously, the 80387 signals
- one exception according to the precedence shown at the end of Chapter 3.
- This means, for example, that an SNaN divided by zero results in an invalid
- operation, not in a zero divide exception.
-
-
- 6.2.9 Exception Recovery Examples
-
- Recovery routines for NPX exceptions can take a variety of forms. They can
- change the arithmetic and programming rules of the NPX. These changes may
- redefine the default fix-up for an error, change the appearance of the NPX
- to the programmer, or change how arithmetic is defined on the NPX.
-
- A change to an exception response might be to automatically normalize all
- denormals loaded from memory. A change in appearance might be extending the
- register stack into memory to provide an "infinite" number of numeric
- registers. The arithmetic of the NPX can be changed to automatically extend
- the precision and range of variables when exceeded. All these functions can
- be implemented on the NPX via numeric exceptions and associated recovery
- routines in a manner transparent to the application programmer.
-
- Some other possible application-dependent actions might include:
-
- ■ Incrementing an exception counter for later display or printing
-
- ■ Printing or displaying diagnostic information (e.g., the 80387
- environment and registers)
-
- ■ Aborting further execution
-
- ■ Storing a diagnostic value (a NaN) in the result and continuing with
- the computation
-
- Notice that an exception may or may not constitute an error, depending on
- the application. Once the exception handler corrects the condition causing
- the exception, the floating-point instruction that caused the exception can
- be restarted, if appropriate. This cannot be accomplished using the IRET
- instruction, however, because the trap occurs at the ESC or WAIT instruction
- following the offending ESC instruction. The exception handler must obtain
- (using FSAVE or FSTENV) the address of the offending instruction in the task
- that initiated it, make a copy of it, execute the copy in the context of the
- offending task, and then return via IRET to the current CPU instruction
- stream.
-
- In order to correct the condition causing the numeric exception, exception
- handlers must recognize the precise state of the NPX at the time the
- exception handler was invoked, and be able to reconstruct the state of the
- NPX when the exception initially occurred. To reconstruct the state of the
- NPX, programmers must understand when, during the execution of an NPX
- instruction, exceptions are actually recognized.
-
- Invalid operation, zero divide, and denormal operand exceptions are detected
- before an operation begins, whereas overflow, underflow, and precision
- exceptions are not raised until a true result has been computed. When a
- "before" exception is detected, the NPX register stack and memory have
- not yet been updated, and appear as if the offending instruction has not
- been executed.
-
- When an "after" exception is detected, the register stack and memory appear
- as if the instruction has run to completion; i.e., they may be updated.
- (However, in a store or store-and-pop operation, unmasked over/underflow is
- handled like a "before" exception; memory is not updated and the stack is
- not popped.) The programming examples contained in Chapter 7 include an outline
- of several exception handlers to process numeric exceptions for the 80387.
-
-
- Chapter 7 Numeric Programming Examples
-
- ───────────────────────────────────────────────────────────────────────────
-
- The following sections contain examples of numeric programs for the 80387
- NPX written in ASM386. These examples are intended to illustrate some of the
- techniques for programming the 80386/80387 computing system for numeric
- applications.
-
-
- 7.1 Conditional Branching Example
-
- As discussed in Chapter 2, several numeric instructions post their results
- to the condition code bits of the 80387 status word. Although there are many
- ways to implement conditional branching following a comparison, the basic
- approach is as follows:
-
- ■ Execute the comparison.
-
- ■ Store the status word. (80387 allows storing status directly into AX
- register.)
-
- ■ Inspect the condition code bits.
-
- ■ Jump on the result.
-
- Figure 7-1 is a code fragment that illustrates how two memory-resident
- double-format real numbers might be compared (similar code could be used
- with the FTST instruction). The numbers are called A and B, and the
- comparison is A to B.
-
- The comparison itself requires loading A onto the top of the 80387 register
- stack and then comparing it to B, while popping the stack with the same
- instruction. The status word is then written into the 80386 AX register.
-
- A and B have four possible orderings, and bits C3, C2, and C0 of the
- condition code indicate which ordering holds. These bits are positioned in
- the upper byte of the NPX status word so as to correspond to the CPU's zero,
- parity, and carry flags (ZF, PF, and CF), when the byte is written into the
- flags. The code fragment sets ZF, PF, and CF of the CPU status word to the
- values of C3, C2, and C0 of the NPX status word, and then uses the CPU
- conditional jump instructions to test the flags. The resulting code is
- extremely compact, requiring only seven instructions.
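-
- The flag mapping described above can be checked with a short C model. It
- is a sketch only, with hypothetical names: it takes the status word that
- FSTSW AX would produce, extracts the upper byte that SAHF would load, and
- applies the same tests in the same order as the JP, JB, and JE
- instructions of Figure 7-1.
-
```c
#include <stdint.h>

typedef enum { A_GREATER, A_LESS, A_EQUAL, A_B_UNORDERED } ordering;

/* Classify the result of comparing A to B from the NPX status word. */
ordering classify_compare(uint16_t status_word)
{
    uint8_t ah = (uint8_t)(status_word >> 8);  /* byte that SAHF loads */
    int zf = (ah >> 6) & 1;                    /* C3 lands in ZF       */
    int pf = (ah >> 2) & 1;                    /* C2 lands in PF       */
    int cf = ah & 1;                           /* C0 lands in CF       */

    if (pf) return A_B_UNORDERED;              /* JP A_B_UNORDERED     */
    if (cf) return A_LESS;                     /* JB A_LESS            */
    if (zf) return A_EQUAL;                    /* JE A_EQUAL           */
    return A_GREATER;                          /* fall through         */
}
```
-
- The unordered test comes first because an unordered result also sets C3
- and C0, which would otherwise be mistaken for "less" or "equal".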
-
- The FXAM instruction updates all four condition code bits. Figure 7-2 shows
- how a jump table can be used to determine the characteristics of the value
- examined. The jump table (FXAM_TBL) is initialized to contain the 32-bit
- displacement of 16 labels, one for each possible condition code setting.
- Note that four of the table entries contain the same value, "EMPTY." The
- first two condition code settings correspond to "EMPTY." The two other table
- entries that contain "EMPTY" will never be used on the 80387, but may be
- used if the code is executed with an 80287.
-
- The program fragment performs the FXAM and stores the status word. It then
- manipulates the condition code bits to finally produce a number in register
- EAX that equals the condition code times 4. This involves zeroing the unused
- bits in the word that contains the code, shifting C3 to the right so that it
- is adjacent to C2, and positioning the code so that it is multiplied by 4.
- The resulting value is used as an index that selects one of the
- displacements from FXAM_TBL (the multiplication of the condition code is
- required because of the 4-byte length of each value in FXAM_TBL). The
- unconditional JMP instruction effectively vectors through the jump table to
- the labeled routine that contains code (not shown in the example) to process
- each possible result of the FXAM instruction.
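-
- The index calculation of Figure 7-2 can be traced step by step in C. The
- sketch below uses hypothetical function names and mirrors the AND, SHR,
- SAL, OR, and XOR instructions of the figure one for one. For the table of
- DD (4-byte) entries used there, the result is the 4-bit condition code
- (C3, C2, C1, C0) multiplied by 4.
-
```c
#include <stdint.h>

/* Byte offset into FXAM_TBL, computed exactly as in Figure 7-2. */
uint32_t fxam_table_offset(uint16_t status_word)
{
    uint32_t eax = status_word;      /* XOR EAX,EAX then FSTSW AX        */
    eax &= 0x4700u;                  /* AND AX,0100011100000000B         */
    eax >>= 6;                       /* SHR EAX,6: C2-C0 to AL bits 4-2, */
                                     /*            C3 to AH bit 0        */
    uint8_t al = (uint8_t)(eax & 0xFFu);
    uint8_t ah = (uint8_t)((eax >> 8) & 0xFFu);
    ah = (uint8_t)(ah << 5);         /* SAL AH,5: C3 to bit 5            */
    al |= ah;                        /* OR AL,AH: C3 next to C2          */
    return al;                       /* XOR AH,AH: upper bits now zero   */
}

/* The same condition code assembled directly, for comparison. */
unsigned fxam_condition_code(uint16_t status_word)
{
    unsigned c3   = (status_word >> 14) & 1u;
    unsigned c210 = (status_word >> 8) & 7u;
    return (c3 << 3) | c210;
}
```
-
- For every status word, fxam_table_offset returns four times the value of
- fxam_condition_code, so the offset always selects one of the 16 entries.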
-
-
- Figure 7-1. Conditional Branching for Compares
-
- .
- .
- .
- A DQ ?
- B DQ ?
- .
- .
- .
- FLD A ; LOAD A ONTO TOP OF 387 STACK
- FCOMP B ; COMPARE A:B, POP A
- FSTSW AX ; STORE RESULT TO CPU AX REGISTER
- ;
- ; CPU AX REGISTER CONTAINS CONDITION CODES
- ; (RESULTS OF COMPARE)
- ; LOAD CONDITION CODES INTO CPU FLAGS
- ;
- SAHF
- ;
- ; USE CONDITIONAL JUMPS TO DETERMINE ORDERING OF A TO B
- ;
- JP A_B_UNORDERED ; TEST C2 (PF)
- JB A_LESS ; TEST C0 (CF)
- JE A_EQUAL ; TEST C3 (ZF)
- A_GREATER: ; C0 (CF) = 0, C3 (ZF) = 0
- .
- .
- A_EQUAL: ; C0 (CF) = 0, C3 (ZF) = 1
- .
- .
- A_LESS: ; C0 (CF) = 1, C3 (ZF) = 0
- .
- .
- A_B_UNORDERED: ; C2 (PF) = 1
- .
- .
-
-
- Figure 7-2. Conditional Branching for FXAM
-
- ; JUMP TABLE FOR EXAMINE ROUTINE
- ;
- FXAM_TBL DD POS_UNNORM, POS_NAN, NEG_UNNORM, NEG_NAN,
- & POS_NORM, POS_INFINITY, NEG_NORM,
- & NEG_INFINITY, POS_ZERO, EMPTY, NEG_ZERO,
- & EMPTY, POS_DENORM, EMPTY, NEG_DENORM, EMPTY
- .
- .
- ; EXAMINE ST AND STORE RESULT (CONDITION CODES)
-
- FXAM
- XOR EAX,EAX ; CLEAR EAX
- FSTSW AX
-
- ; CALCULATE OFFSET INTO JUMP TABLE
-
- AND AX,0100011100000000B ; CLEAR ALL BITS EXCEPT C3, C2-C0
- SHR EAX,6 ; SHIFT C2-C0 INTO PLACE (000XXX00)
- SAL AH,5 ; POSITION C3 (00X00000)
- OR AL,AH ; DROP C3 IN ADJACENT TO C2 (00XXXX00)
- XOR AH,AH ; CLEAR OUT THE OLD COPY OF C3
-
- ; JUMP TO THE ROUTINE `ADDRESSED' BY CONDITION CODE
-
- JMP FXAM_TBL[EAX]
-
- ; HERE ARE THE JUMP TARGETS, ONE TO HANDLE
- ; EACH POSSIBLE RESULT OF FXAM
-
- POS_UNNORM:
- .
- POS_NAN:
- .
- NEG_UNNORM:
- .
- NEG_NAN:
- .
- POS_NORM:
- .
- POS_INFINITY:
- .
- NEG_NORM:
- .
- NEG_INFINITY:
- .
- POS_ZERO:
- .
- EMPTY:
- .
- NEG_ZERO:
- .
- POS_DENORM:
- .
- NEG_DENORM:
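-
- The offset calculation in Figure 7-2 can be checked with a small model.
- The Python sketch below is an illustration, not part of the ASM386
- example; the names FXAM_LABELS, fxam_table_offset, and fxam_label are
- assumptions. It repeats the AND, SHR, SAL, and OR steps as they are written
- in the figure and returns the byte offset into a table of 4-byte entries.
-

```python
# Jump-table labels in the order they appear in FXAM_TBL.
FXAM_LABELS = [
    "POS_UNNORM", "POS_NAN", "NEG_UNNORM", "NEG_NAN",
    "POS_NORM", "POS_INFINITY", "NEG_NORM", "NEG_INFINITY",
    "POS_ZERO", "EMPTY", "NEG_ZERO", "EMPTY",
    "POS_DENORM", "EMPTY", "NEG_DENORM", "EMPTY",
]


def fxam_table_offset(status_word):
    """Repeat the shift sequence of Figure 7-2 on a status word value."""
    eax = status_word & 0b0100011100000000   # AND AX: keep C3 and C2-C0
    eax >>= 6                                # SHR EAX,6: C2-C0 to bits 4-2
    al = eax & 0xFF
    ah = (eax >> 8) & 0xFF                   # C3 is now bit 0 of AH
    ah = (ah << 5) & 0xFF                    # SAL AH,5: C3 to bit 5
    al |= ah                                 # OR AL,AH: C3 next to C2
    return al                                # XOR AH,AH leaves only AL


def fxam_label(status_word):
    """Return the label the JMP would reach (4 bytes per table entry)."""
    return FXAM_LABELS[fxam_table_offset(status_word) // 4]
```

-
- For every 4-bit code C3 C2 C1 C0 the offset is the code times 4, so the 16
- table entries are reached at offsets 0 through 60.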
-
-
- 7.2 Exception Handling Examples
-
- There are many approaches to writing exception handlers. One useful
- technique is to consider the exception handler procedure as consisting of
- "prologue," "body," and "epilogue" sections of code. This procedure is
- invoked via interrupt number 16.
-
- At the beginning of the prologue, CPU interrupts have been disabled. The
- prologue performs all functions that must be protected from possible
- interruption by higher-priority sources. Typically, this involves saving CPU
- registers and transferring diagnostic information from the 80387 to memory.
- When the critical processing has been completed, the prologue may enable CPU
- interrupts to allow higher-priority interrupt handlers to preempt the
- exception handler.
-
- The body of the exception handler examines the diagnostic information and
- makes a response that is necessarily application-dependent. This response
- may range from halting execution, to displaying a message, to attempting to
- repair the problem and proceed with normal execution.
-
- The epilogue essentially reverses the actions of the prologue, restoring
- the CPU and the NPX so that normal execution can be resumed. The epilogue
- must not load an unmasked exception flag into the 80387 or another exception
- will be requested immediately.
-
- Figures 7-3 through 7-5 show the ASM386 coding of three skeleton
- exception handlers. They show how prologues and epilogues can be written for
- various situations, but provide comments indicating only where the
- application-dependent exception handling body should be placed.
-
- Figures 7-3 and 7-4 are very similar; their only substantial difference is
- their choice of instructions to save and restore the 80387. The tradeoff
- here is between the increased diagnostic information provided by FNSAVE and
- the faster execution of FNSTENV. For applications that are sensitive to
- interrupt latency or that do not need to examine register contents, FNSTENV
- reduces the duration of the "critical region," during which the CPU does not
- recognize another interrupt request.
-
- After the exception handler body, the epilogues prepare the CPU and the NPX
- to resume execution from the point of interruption (i.e., the instruction
- following the one that generated the unmasked exception). Notice that the
- exception flags in the memory image that is loaded into the 80387 are
- cleared to zero prior to reloading (in fact, in these examples, the entire
- status word image is cleared).
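-
- The effect of the byte store on the saved image can be sketched as follows.
- This Python fragment is an illustration only; it assumes the layout used
- by the figures (a 108-byte FNSAVE image with the status word at byte
- offset 4, which is where [EBP-104] falls within an image stored at
- [EBP-108]). The names STATE_IMAGE_SIZE, STATUS_WORD_OFFSET, and
- clear_exception_flags are hypothetical.
-

```python
STATE_IMAGE_SIZE = 108     # bytes allocated for the FNSAVE image in the figures
STATUS_WORD_OFFSET = 4     # [EBP-104] relative to an image at [EBP-108]


def clear_exception_flags(image):
    """Zero the low byte of the status word in a saved state image."""
    if len(image) != STATE_IMAGE_SIZE:
        raise ValueError("expected a 108-byte state image")
    # Equivalent of MOV BYTE PTR [EBP-104], 0H: only one byte is written.
    image[STATUS_WORD_OFFSET] = 0
    return image
```

-
- As written, the byte store touches only the low byte of the status word,
- which holds the exception flags. The high byte, with the condition codes
- and stack-top pointer, is reloaded as it was saved.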
-
- The examples in Figures 7-3 and 7-4 assume that the exception handler
- itself will not cause an unmasked exception. Where this is a possibility,
- the general approach shown in Figure 7-5 can be employed. The basic
- technique is to save the full 80387 state and then to load a new control
- word in the prologue. Note that considerable care should be taken when
- designing an exception handler of this type to prevent the handler from
- being reentered endlessly.
-
-
- Figure 7-3. Full-State Exception Handler
-
- SAVE_ALL PROC
- ;
- ; SAVE CPU REGISTERS, ALLOCATE STACK SPACE
- ; FOR 80387 STATE IMAGE
- PUSH EBP
- MOV EBP,ESP
- SUB ESP,108
- ; SAVE FULL 80387 STATE, ENABLE CPU INTERRUPTS
- FNSAVE [EBP-108]
- STI
- ;
- ; APPLICATION-DEPENDENT EXCEPTION HANDLING
- ; CODE GOES HERE
- ;
- ; CLEAR EXCEPTION FLAGS IN STATUS WORD
- ; (WHICH IS IN MEMORY)
- ; RESTORE MODIFIED STATE IMAGE
- MOV BYTE PTR [EBP-104], 0H
- FRSTOR [EBP-108]
- ; DEALLOCATE STACK SPACE, RESTORE CPU REGISTERS
- MOV ESP,EBP
- .
- .
- POP EBP
- ;
- ; RETURN TO INTERRUPTED CALCULATION
- IRET
- SAVE_ALL ENDP
-
-
- Figure 7-4. Reduced-Latency Exception Handler
-
- SAVE_ENVIRONMENT PROC
- ;
- ; SAVE CPU REGISTERS, ALLOCATE STACK SPACE
- ; FOR 80387 ENVIRONMENT
- PUSH EBP
- .
- MOV EBP,ESP
- SUB ESP,28
- ; SAVE ENVIRONMENT, ENABLE CPU INTERRUPTS
- FNSTENV [EBP-28]
- STI
- ;
- ; APPLICATION EXCEPTION-HANDLING CODE GOES HERE
- ;
- ; CLEAR EXCEPTION FLAGS IN STATUS WORD
- ; (WHICH IS IN MEMORY)
- ; RESTORE MODIFIED ENVIRONMENT IMAGE
- MOV BYTE PTR [EBP-24], 0H
- FLDENV [EBP-28]
- ; DE-ALLOCATE STACK SPACE, RESTORE CPU REGISTERS
- MOV ESP,EBP
- POP EBP
- ;
- ; RETURN TO INTERRUPTED CALCULATION
- IRET
- SAVE_ENVIRONMENT ENDP
-
-
- Figure 7-5. Reentrant Exception Handler
-
- .
- .
- .
- LOCAL_CONTROL DW ? ; ASSUME INITIALIZED
- .
- .
- .
- REENTRANT PROC
- ;
- ; SAVE CPU REGISTERS, ALLOCATE STACK SPACE FOR
- ; 80387 STATE IMAGE
- PUSH EBP
- .
- .
- .
- MOV EBP,ESP
- SUB ESP,108
- ; SAVE STATE, LOAD NEW CONTROL WORD,
- ; ENABLE CPU INTERRUPTS
- FNSAVE [EBP-108]
- FLDCW LOCAL_CONTROL
- STI
- .
- .
- .
- ; APPLICATION EXCEPTION HANDLING CODE GOES HERE.
- ; AN UNMASKED EXCEPTION GENERATED HERE WILL
- ; CAUSE THE EXCEPTION HANDLER TO BE REENTERED.
- ; IF LOCAL STORAGE IS NEEDED, IT MUST BE
- ; ALLOCATED ON THE CPU STACK.
- .
- .
- .
- ; CLEAR EXCEPTION FLAGS IN STATUS WORD
- ; (WHICH IS IN MEMORY)
- ; RESTORE MODIFIED STATE IMAGE
- MOV BYTE PTR [EBP-104], 0H
- FRSTOR [EBP-108]
- ; DE-ALLOCATE STACK SPACE, RESTORE CPU REGISTERS
- MOV ESP,EBP
- .
- .
- .
- POP EBP
- ; RETURN TO POINT OF INTERRUPTION
- IRET
- REENTRANT ENDP
-
-
- 7.3 Floating-Point to ASCII Conversion Examples
-
- Numeric programs must typically format their results at some point for
- presentation and inspection by the program user. In many cases, numeric
- results are formatted as ASCII strings for printing or display. This example
- shows how floating-point values can be converted to decimal ASCII character
- strings. The function shown in Figure 7-6 can be invoked from PL/M-386,
- Pascal-386, FORTRAN-386, or ASM386 routines.
-
- The routine favors shortness, speed, and accuracy over providing the
- maximum possible number of significant digits. An attempt is made to keep
- integers in their own domain to avoid unnecessary conversion errors.
-
- Using the extended precision real number format, this routine achieves a
- worst case accuracy of three units in the 16th decimal position for a
- noninteger value or integers greater than 10^(18). This is double precision
- accuracy. With values having decimal exponents less than 100 in magnitude,
- the accuracy is one unit in the 17th decimal position.
-
- Higher precision can be achieved with greater care in programming, larger
- program size, and lower performance.
-
-
- Figure 7-6. Floating-Point to ASCII Conversion Routine
-
- XENIX286 80386 MACRO ASSEMBLER V1.0, ASSEMBLY OF MODULE FLOATING_TO_ASCII
- OBJECT MODULE PLACED IN fpasc.obj
- ASSEMBLER INVOKED BY: asm386 fpasc.asm
-
- LOC OBJ LINE SOURCE
-
- 1 +1 $title(`Convert a floating point number
- 2
- 3 name floating_to_asci
- 4
- 00000000 5 public floating_to_asci
- 6 extrn get_power_10:nea
- 7 ;
- 8 ; This subroutine will convert the float
- 9 ; number in the top of the NPX stack to
- 10 ; string and separate power of 10 scalin
- 11 ; (in binary). The maximum width of the
- 12 ; formed is controlled by a parameter wh
- 13 ; > 1. Unnormal values, denormal values
- 14 ; zeroes will be correctly converted. Ho
- 15 ; and pseudo zeros are no longer support
- 16 ; 80387( in conformance with the IEEE fl
- 17 ; standard) and hence not generated inte
- 18 ; returned value will indicate how many
- 19 ; of precision were lost in an unnormal
- 20 ; value. The magnitude (in terms of bin
- 21 ; of a pseudo zero will also be indicate
- 22 ; less than 10**18 in magnitude are accu
- 23 ; if the destination ASCII string field
- 24 ; to hold all the digits. Otherwise the
- 25 ; to scientific notation.
- 26 ;
- 27 ; The status of the conversion is identi
- 28 ; return value, it can be:
- 29 ;
- 30 ; 0 conversion complete, str
- 31 ; 1 invalid arguments
- 32 ; 2 exact integer conversion
- 33 ; 3 indefinite
- 34 ; 4 + NAN (Not A Number)
- 35 ; 5 - NAN
- 36 ; 6 + Infinity
- 37 ; 7 - Infinity
- 38 ; 8 pseudo zero found, strin
- 39 ;
- 40 ; The PLM/386 calling convention
- 41 ;
- 42 ; floating_to_ascii:
- 43 ; procedure (number,denormal_ptr,s
- 44 ; field_size, power_ptr) word exte
- 45 ; declare (denormal_ptr,string_ptr
- 46 ; pointer;
- 47 ; declare field_size word,
- 48 ; string_size based size_ptr word;
- 49 ; declare number real;
- 50 ; declare denormal integer based d
- 51 ; declare power integer based powe
- 52 ; end floating_to_ascii;
- 53 ;
- 54 ; The floating point value is ex
- 55 ; on the top of the NPX stack. This
- 56 ; expects 3 free entries on the NPX st
- 57 ; will pop the passed value off when d
- 58 ; generated ASCII string will have a l
- 59 ; character either `-' or `+' indicati
- 60 ; of the value. The ASCII decimal dig
- 61 ; immediately follow. The numeric valu
- 62 ; ASCII string is (ASCII STRING.)*10**
- 63 ; the given number was zero, the ASCII
- 64 ; contain a sign and a single zero cha
- 65 ; value string_size indicates the tota
- 66 ; the ASCII string including the sign
- 67 ; String(0) will always hold the sign.
- 68 ; possible for string size to be less
- 69 ; field_size. This occurs for zeroes o
- 70 ; values. A pseudo zero will return a
- 71 ; return code. The denormal count wil
- 72 ; the power of two originally assoc
- 73 ; value. The power of ten and ASCII s
- 74 ; be as if the value was an ordinary z
- 75 ;
- 76 ; This subroutine is accurate up to a
- 77 ; 18 decimal digits for integers. Int
- 78 ; will have a decimal power of zero as
- 79 ; with them. For non integers, the res
- 80 ; accurate to within 2 decimal digits
- 81 ; decimal place(double precision). Th
- 82 ; instruction is also used for scaling
- 83 ; the range acceptable for the BCD dat
- 84 ; rounding mode in effect on entry to
- 85 ; subroutine is used for the conversio
- 86 ;
- 87 ; The following registers are no
- 88 ;
- 89 ; eax ebx ecx edx esi edi
- 90 ;
- 91 ;
- 92 ; Define the stack layout.
- 93 ;
- 00000000[] 94 ebp_save equ dword ptr [ebp]
- 00000004[] 95 es_save equ ebp_save + size e
- 00000008[] 96 return_ptr equ es_save + size es
- 0000000C[] 97 power_ptr equ return_ptr + size
- 00000010[] 98 field_size equ power_ptr + size
- 00000014[] 99 size_ptr equ field_size + size
- 00000018[] 100 string_ptr equ size_ptr + size s
- 0000001C[] 101 denormal_ptr equ string_ptr + size
- 102
- 0014 103 parms_size equ size power_ptr +
- 104 & size size_ptr + size stri
- 105 & size denormal_ptr
- 106 ;
- 107 ; Define constants used
- 108 ;
- 109 BCD_DIGITS equ 18 ; Number
- 110 WORD_SIZE equ 4
- 111 BCD_SIZE equ 10
- 112 MINUS equ 1 ; Define
- 113 NAN equ 4 ; The ex
- 114 INFINITY equ 6 ; here a
- 115 INDEFINITE equ 3 ; corres
- 116 PSEUDO_ZERO equ 8 ; values
- 117 INVALID equ -2 ; order
- 118 ZERO equ -4
- 119 DENORMAL equ -6
- 120 UNNORMAL equ -8
- 121 NORMAL equ 0
- 122 EXACT equ 2
- 123 ;
- 124 ; Define layout of temporary stor
- 125 ;
- 126 power_two equ word ptr [ebp - W
- 127 bcd_value equ tbyte ptr power_t
- 128 bcd_byte equ byte ptr bcd_valu
- 129 fraction equ bcd_value
- 130
- 131 local_size equ size power_two +
- 132 ;
- 133 ; Allocate stack space for the te
- 134 ; the stack will be big enough
- 135 ;
- 136 stack stackseg (local_size+6) ; Allocat
- 137 ; space f
- 138 +1 $eject
- 139 code segment public er
- 140 extrn power_table:qword
- 141 ;
- 142 ; Constants used by this function.
- 143 ;
- 144 even ;
- 00000000 0A00 145 const10 dw 10 ;
- 146 ; ; too big BCD
- 147 ;
- 148 ; Convert the C3,C2,C1,C0 encoding from
- 149 ; into meaningful bit flags and values.
- 150 ;
- 00000002 F8 151 status_table db UNNORMAL, NAN, UN
- 00000003 04 152 & NAN + MINUS, NORMAL, INFINITY,
- 00000004 F9 153 & NORMAL + MINUS, INFINITY + MINU
- 00000005 05 154 & ZERO, INVALID, ZERO + MIN
- 00000006 00 155 & DENORMAL, INVALID, DENORM
- 00000007 06
- 00000008 01
- 00000009 07
- 0000000A FC
- 0000000B FE
- 0000000C FD
- 0000000D FE
- 0000000E FA
- 0000000F FE
- 00000010 FB
- 00000011 FE
- 156
- 00000012 157 floating_to_ascii proc
- 158
- 00000012 E800000000 E 159 call tos_status ; Look a
- 160
- 161 ; Get descriptor from table
- 00000017 2E0FB68002000000 R 162 movzx eax, status_table[eax]
- 0000001F 3CFE 163 cmp al,INVALID ;
- 00000021 7527 164 jne not_empty
- 165 ;
- 166 ; ST(0) is empty! Return the st
- 167 ;
- 00000023 C21400 168 ret parms_size
- 169 ;
- 170 ; Remove infinity from stack and
- 171 ;
- 00000026 172 found_infinity:
- 00000026 DDD8 173 fstp st(0) ; OK to
- 00000028 EB02 174 jmp short exit_proc
- 175 ;
- 176 ; String space is too small!
- 177 ; Return invalid code.
- 178 ;
- 0000002A 179 small_string:
- 0000002A B0FE 180 mov al,INVALID
- 0000002C 181 exit_proc:
- 0000002C C9 182 leave ; Restore stack s
- 0000002D 07 183 pop es
- 0000002E C21400 184 ret parms_size
- 185 ;
- 186 ; ST(0) is NAN or indefinite. Store the
- 187 ; value in memory and look at the fracti
- 188 ; field to separate indefinite from an o
- 189 ;
- 00000031 190 NAN_or_indefinite:
- 00000031 DB7DF2 191 fstp fraction ; Remove
- 192 ; for examin
- 00000034 A801 193 test al,MINUS ; Look a
- 00000036 9B 194 fwait
- 00000037 74F3 195 jz exit_proc
- 196 ; positive
- 197
- 00000039 BB000000C0 198 mov ebx,0C0000000H ; Match
- 199 ;bits of fra
- 200
- 201 ; Compare bits 63-32
- 0000003E 2B5DF6 202 sub ebx, dword ptr fraction
- 203
- 204 ; Bits 31-0 must be zero
- 00000041 0B5DF2 205 or ebx, dword ptr fraction
- 00000044 75E6 206 jnz exit_proc
- 207
- 208 ; Set return value for indefinite value
- 00000046 B003 209 mov al,INDEFINITE
- 00000048 EBE2 210 jmp exit_proc
- 211 ;
- 212 ; Allocate stack space for local
- 213 ; and establish parameter addressibi
- 214 ;
- 0000004A 215 not_empty:
- 0000004A 06 216 push es ; Save w
- 0000004B C80C0000 217 enter local_size, 0 ; Setup
- 218
- 219
- 220 ; Check for enough string space
- 0000004F 8B4D10 221 mov ecx,field_size
- 00000052 83F902 222 cmp ecx,2
- 00000055 7CD3 223 jl small_string
- 224
- 00000057 49 225 dec ecx ; Adjust
- 226
- 227 ; See if string is too large for BCD
- 00000058 83F912 228 cmp ecx,BCD_DIGITS
- 0000005B 7605 229 jbe size_ok
- 230
- 231 ; Else set maximum string size
- 0000005D B912000000 232 mov ecx,BCD_DIGITS
- 00000062 233 size_ok:
- 00000062 3C06 234 cmp al,INFINITY ; Look f
- 235
- 236 ; Return status value for + or - inf
- 00000064 7DC0 237 jge found_infinity
- 238
- 00000066 3C04 239 cmp al,NAN ; Look
- 00000068 7DC7 240 jge NAN_or_indefinite
- 241 ;
- 242 ; Set default return values and check t
- 243 ; the number is normalized.
- 244 ;
- 0000006A D9E1 245 fabs ; Use positive value onl
- 246 ; sign bit in al
- 0000006C 31D2 247 xor edx,edx
- 0000006E 8B7D1C 248 mov edi,denormal_ptr; Zero d
- 00000071 668917 249 mov [edi], dx
- 00000074 8B5D0C 250 mov ebx,power_ptr ; Zero p
- 00000077 668913 251 mov [ebx], dx
- 0000007A 88C2 252 mov dl, al
- 0000007C 80E201 253 and dl, 1
- 0000007F 80C202 254 add dl, EXACT
- 00000082 3CFC 255 cmp al,ZERO
- 00000084 0F83BC000000 256 jae convert_integer ; Skip p
- 257
- 0000008A DB7DF2 258 fstp fraction
- 0000008D 9B 259 fwait
- 0000008E 8A45F9 260 mov al, bcd_byte + 7
- 00000091 804DF980 261 or byte ptr bcd_byte + 7, 80h
- 00000095 DB6DF2 262 fld fraction
- 00000098 D9F4 263 fxtract
- 0000009A A880 264 test al, 80h
- 0000009C 7524 265 jnz normal_value
- 266
- 0000009E D9E8 267 fld1
- 000000A0 DEE9 268 fsub
- 000000A2 D9E4 269 ftst
- 000000A4 9BDFE0 270 fstsw ax
- 000000A7 9E 271 sahf
- 000000A8 7510 272 jnz set_unnormal_count
- 273 ;
- 274 ; Found a pseudo zero
- 275 ;
- 000000AA D9EC 276 fldlg2 ; Develop power
- 000000AC 80C206 277 add dl, PSEUDO_ZERO - EXACT
- 000000AF DECA 278 fmulp st(2), st
- 000000B1 D9C9 279 fxch ; Get power of
- 000000B3 DF1B 280 fistp word ptr [ebx] ; Set power of
- 000000B5 E98C000000 281 jmp convert_integer
- 282
- 000000BA 283 set_unnormal_count:
- 000000BA D9F4 284 fxtract ; Get original
- 285 ; now normaliz
- 000000BC D9C9 286 fxch ; Get unnormal
- 000000BE D9E0 287 fchs
- 000000C0 DF1F 288 fistp word ptr [edi] ; Set unnormal
- 289
- 290
- 291 ; Calculate the decimal magnitude assoc
- 292 ; with this number to within one order.
- 293 ; error will always be inevitable due t
- 294 ; rounding and lost precision. As a res
- 295 ; we will deliberately fail to consider
- 296 ; LOG10 of the fraction value in calcul
- 297 ; the order. Since the fraction will al
- 298 ; be 1 <= F < 2, its LOG10 will not ch
- 299 ; the basic accuracy of the function. T
- 300 ; get the decimal order of magnitude, s
- 301 ; multiply the power of two by LOG10(2)
- 302 ; truncate the result to an integer.
- 303 ;
- 304 normal_value:
- 305 fstp fraction ; Save t
- 306 ; for later u
- 307 fist power_two ; Save p
- 308 fldlg2
- 309
- 310 fmul ; Form LOG10(of
- 311 fistp word ptr [ebx] ; Any ro
- 312
- 313 ;
- 314 ; Check if the magnitude of the
- 315 ; out treating it as an integer.
- 316 ;
- 317 ; CX has the maximum number of dec
- 318 ; allowed.
- 319 ;
- 320 fwait ; Wait for power
- 321
- 322 ; Get power of ten of value
- 323 movsx esi, word ptr [ebx]
- 324 sub esi,ecx
- 325 ; necessary
- 326 ja adjust_result ; Jump i
- 327 ;
- 328 ; The number is between 1 and 10
- 329 ; Test if it is an integer.
- 330 ;
- 331 fild power_two ; Restor
- 332 sub dl,NORMAL-EXACT ; Conver
- 333 ; value
- 334 fld fraction
- 335 fscale
- 336 ; is safe he
- 337 fst st(1)
- 338 frndint
- 339 fcomp
- 340 fstsw ax
- 341 sahf
- 342 ; an integer
- 343 jnz convert_integer
- 344
- 345 fstp st(0) ; Remove
- 346 add dl,NORMAL-EXACT ; Restor
- 347 ;
- 348 ; Scale the number to within the ra
- 349 ; by the BCD format.The scaling operati
- 350 ; produce a number within one decimal o
- 351 ; magnitude of the largest decimal numb
- 352 ; representable within the given string
- 353 ;
- 354 ; The scaling power of ten value
- 355 ;
- 000000F2 356 adjust_result:
- 000000F2 8BC6 357 mov eax,esi ;
- 000000F4 668903 358 mov word ptr [ebx],ax ;
- 359 ; of ten
- 000000F7 F7D8 360 neg eax ; Subtra
- 361 ; magnit
- 000000F9 E800000000 E 362 call get_power_10 ; Scalin
- 363 ; return
- 364 ; expone
- 000000FE DB6DF2 365 fld fraction
- 00000101 DEC9 366 fmul
- 00000103 8BF1 367 mov esi,ecx ;
- 368 ; the ma
- 00000105 C1E603 369 shl esi,3
- 370 ; the st
- 00000108 DF45FC 371 fild power_two ;
- 0000010B DEC2 372 faddp st(2),st
- 0000010D D9FD 373 fscale
- 374 ; expone
- 0000010F DDD9 375 fstp st(1) ;
- 376 ;
- 377 ; Test the adjusted value against
- 378 ; of exact powers of ten. The combine
- 379 ; of the magnitude estimate and power
- 380 ; can result in a value one order of
- 381 ; too small or too large to fit corre
- 382 ; the BCD field. To handle this probl
- 383 ; the adjusted value, if it is too sm
- 384 ; large, then adjust it by ten and ad
- 385 ; power of ten value.
- 386 ;
- 00000111 387 test_power:
- 388
- 389 ; Compare against exact power entry. Use
- 390 ; entry since cx has been decremented by
- 00000111 2EDC9608000000 E 391 fcom power_table[esi]+type po
- 00000118 9BDFE0 392 fstsw ax
- 0000011B 9E 393 sahf ; If C3 = C0
- 0000011C 720F 394 jb test_for_small ; too bi
- 395
- 0000011E 2EDE3500000000 R 396 fidiv const10 ; Else
- 00000125 80E2FD 397 and dl,not EXACT ; Remov
- 00000128 66FF03 398 inc word ptr [ebx] ; Adjus
- 0000012B EB17 399 jmp short in_range ; Conve
- 400 ; integer
- 0000012D 401 test_for_small:
- 0000012D 2EDC9600000000 E 402 fcom power_table[esi]
- 00000134 9BDFE0 403 fstsw ax
- 00000137 9E 404 sahf
- 405
- 00000138 720A 406 jc in_range
- 407
- 408
- 0000013A 2EDE0D00000000 R 409 fimul const10 ; Adjust
- 00000141 66FF0B 410 dec word ptr [ebx] ; Adjust
- 00000144 411 in_range:
- 00000144 D9FC 412 frndint
- 413 ;
- 414 ; Assert: 0 <= TOS <= 999,999,999,
- 415 ; The TOS number will be exactly r
- 416 ; in 18 digit BCD format.
- 417 ;
- 00000146 418 convert_integer:
- 00000146 DF75F2 419 fbstp bcd_value ; Store
- 420 ;
- 421 ; while the store BCD runs, setu
- 422 ; for the conversion to ASCII.
- 423 ;
- 00000149 BE08000000 424 mov esi,BCD_SIZE-2 ; Initia
- 0000014E 66B9040F 425 mov cx,0f04h
- 00000152 BB01000000 426 mov ebx,1
- 427 ; field for
- 00000157 8B7D18 428 mov edi,string_ptr ; Get ad
- 429 ; ASCII stri
- 0000015A 8CD8 430 mov ax,ds
- 0000015C 8EC0 431 mov es,ax
- 0000015E FC 432 cld
- 0000015F B02B 433 mov al,'+'
- 00000161 F6C201 434 test dl,MINUS ; Look f
- 00000164 7402 435 jz positive_result
- 436
- 00000166 B02D 437 mov al,'-'
- 00000168 438 positive_result:
- 00000168 AA 439 stosb
- 440 ; past sign
- 00000169 80E2FE 441 and dl,not MINUS ; Turn o
- 0000016C 9B 442 fwait
- 443 ;
- 444 ; Register usage:
- 445 ; ah:
- 446 ; al:
- 447 ; dx:
- 448 ; ch:
- 449 ; cl:
- 450 ; bx:
- 451 ; esi:
- 452 ; di:
- 453 ; ds,es:
- 454 ;
- 455 ; Remove leading zeroes from the
- 456 ;
- 0000016D 457 skip_leading_zeroes:
- 0000016D 8A6435F2 458 mov ah,bcd_byte[esi]
- 00000171 88E0 459 mov al,ah ;
- 00000173 D2E8 460 shr al,cl ;
- 00000175 240F 461 and al,0fh ;
- 00000177 7517 462 jnz enter_odd ;
- 463 ; non zero fo
- 464
- 00000179 88E0 465 mov al,ah ;
- 0000017B 240F 466 and al,0fh ;
- 0000017D 7519 467 jnz enter_even ;
- 468 ; digit found
- 469
- 0000017F 4E 470 dec esi
- 00000180 79EB 471 jns skip_leading_zeroes
- 472 ;
- 473 ; The significand was all zeroes.
- 474 ;
- 00000182 B030 475 mov al,'0' ;
- 00000184 AA 476 stosb
- 00000185 43 477 inc ebx
- 00000186 EB17 478 jmp short exit_with_value
- 479 ;
- 480 ; Now expand the BCD string into
- 481 ; per byte values 0-9.
- 482 ;
- 00000188 483 digit_loop:
- 00000188 8A6435F2 484 mov ah,bcd_byte[esi] ;
- 0000018C 88E0 485 mov al,ah
- 0000018E D2E8 486 shr al,cl ;
- 00000190 487 enter_odd:
- 00000190 0430 488 add al,'0' ;
- 00000192 AA 489 stosb ;
- 490 ; string area
- 00000193 88E0 491 mov al,ah ;
- 00000195 240F 492 and al,0fh
- 00000197 43 493 inc ebx ;
- 00000198 494 enter_even:
- 00000198 0430 495 add al,'0' ; Conver
- 0000019A AA 496 stosb ; Put di
- 0000019B 43 497 inc ebx ;
- 0000019C 4E 498 dec esi ;
- 0000019D 79E9 499 jns digit_loop
- 500 ;
- 501 ; Conversion complete. Set the s
- 502 ; size and remainder.
- 503 ;
- 0000019F 504 exit_with_value:
- 0000019F 8B7D14 505 mov edi,size_ptr
- 000001A2 66891F 506 mov word ptr [edi],bx
- 000001A5 8BC2 507 mov eax,edx ;
- 000001A7 E980FEFFFF 508 jmp exit_proc
- 509
- 000001AC 510 floating_to_ascii endp
- 511
- -------- 512 code ends
- 513 end
-
- ASSEMBLY COMPLETE, NO WARNINGS, NO ERRORS.
-
-
- XENIX286 80386 MACRO ASSEMBLER V1.0, ASSEMBLY OF MODULE GET_POWER_10
- OBJECT MODULE PLACED IN power10.obj
- ASSEMBLER INVOKED BY: asm386 power10.asm
-
- LOC OBJ LINE SOURCE
-
- 1 +1 $title(Calculate the value of 10**ax)
- 2 ;
- 3 ; This subroutine will calculate the
- 4 ; value of 10**eax. For values of
- 5 ; 0 <= eax < 19, the result will be exact.
- 6 ; All 80386 registers are transparent
- 7 ; and the value is returned on the TOS
- 8 ; as two numbers, exponent in ST(1) and
- 9 ; fraction in ST(0). The exponent value
- 10 ; can be larger than the largest
- 11 ; exponent of an extended real format
- 12 ; number. Three stack entries are used.
- 13 ;
- 14 name get_power_10
- 00000000 15 public get_power_10,power_
- 16
- -------- 17 stack stackseg 8
- 18
- -------- 19 code segment public er
- 20 ;
- 21 ; Use exact values from 1.0 to 1e18
- 22 ;
- 23 even ; Optimize
- 00000000 000000000000F03F 24 power_table dq 1.0,1e1,1e2,1e3
- 00000008 0000000000002440
- 00000010 0000000000005940
- 00000018 0000000000408F40
- 00000020 000000000088C340 25 dq 1e4,1e5,1e6,1e7
- 00000028 00000000006AF840
- 00000030 0000000080842E41
- 00000038 00000000D0126341
- 00000040 0000000084D79741 26 dq 1e8,1e9,1e10,1e11
- 00000048 0000000065CDCD41
- 00000050 000000205FA00242
- 00000058 000000E876483742
- 00000060 000000A2941A6D42 27 dq 1e12,1e13,1e14,1e15
- 00000068 000040E59C30A242
- 00000070 0000901EC4BCD642
- 00000078 00003420F56B0C43
- 00000080 0080E03779C34143 28 dq 1e16,1e17,1e18
- 00000088 00A0D88557347643
- 00000090 00C84E676DC1AB43
- 29
- 00000098 30 get_power_10 proc
- 31
- 00000098 3D12000000 32 cmp eax,18 ; Test for
- 0000009D 770B 33 ja out_of_range
- 34
- 0000009F 2EDD04C500000000 R 35 fld power_table[eax*8]; Get exa
- 000000A7 D9F4 36 fxtract ; S
-
-
- 7.3.1 Function Partitioning
-
- Three separate modules implement the conversion. Most of the work of the
- conversion is done in the module FLOATING_TO_ASCII. The other modules are
- provided separately, because they have a more general use. One of them,
- GET_POWER_10, is also used by the ASCII to floating-point conversion
- routine. The other small module, TOS_STATUS, identifies what, if anything,
- is in the top of the numeric register stack.
-
-
- 7.3.2 Exception Considerations
-
- Care is taken inside the function to avoid generating exceptions. Any
- possible numeric value is accepted. The only possible exception is
- insufficient space on the numeric register stack.
-
- The value passed in the numeric stack is checked for existence, type (NaN
- or infinity), and status (denormal, zero, sign). The string size is tested
- for a minimum and maximum value. If the top of the register stack is empty,
- or the string size is too small, the function returns with an error code.
-
- Overflow and underflow are avoided inside the function for very large or
- very small numbers.
-
-
- 7.3.3 Special Instructions
-
- The functions demonstrate the operation of several numeric instructions,
- different data types, and precision control. Shown are instructions for
- automatic conversion to BCD, calculating the value of 10 raised to an
- integer value, establishing and maintaining concurrency, data
- synchronization, and use of directed rounding on the NPX.
-
- Without the extended precision data type and built-in exponential function,
- the double precision accuracy of this function could not be attained with
- the size and speed of the shown example.
-
- The function relies on the numeric BCD data type for conversion from binary
- floating-point to decimal. It is not difficult to unpack the BCD digits into
- separate ASCII decimal digits. The major work involves scaling the
- floating-point value to the comparatively limited range of BCD values. To
- print a 9-digit result requires accurately scaling the given value to an
- integer between 10^(8) and 10^(9). For example, the number +0.123456789
- requires a scaling factor of 10^(9) to produce the value +123456789.0, which
- can be stored in 9 BCD digits. The scale factor must be an exact power of
- 10 to avoid changing any of the printed digit values.
-
- These routines should convert exactly all values that are exactly
- representable in decimal in the field size given. Integer values that fit in
- the given string size are not scaled, but directly stored into the BCD form.
- Noninteger
- values exactly representable in decimal within the string size limits are
- also exactly converted. For example, 0.125 is exactly representable in
- binary or decimal. To convert this floating-point value to decimal, the
- scaling factor is 1000, resulting in 125. When scaling a value, the function
- must keep track of where the decimal point lies in the final decimal value.
-
-
- 7.3.4 Description of Operation
-
- Converting a floating-point number to decimal ASCII takes three major
- steps: identifying the magnitude of the number, scaling it for the BCD data
- type, and converting the BCD data type to a decimal ASCII string.
-
- Identifying the magnitude of the result requires finding the value X such
- that the number is represented by I * 10^(X), where 1.0 ≤ I < 10.0. Scaling
- the number requires multiplying it by a scaling factor 10^(S), so that the
- result is an integer requiring no more decimal digits than provided for in
- the ASCII string.
-
- Once scaled, the numeric rounding modes and BCD conversion put the number
- in a form easy to convert to decimal ASCII by host software.
-
- Implementing each of these three steps requires attention to detail. To
- begin with, not all floating-point values have a numeric meaning. Values
- such as infinity, indefinite, or NaN may be encountered by the conversion
- routine. The conversion routine should recognize these values and identify
- them uniquely.
-
- Special cases of numeric values also exist. Denormals have numeric values,
- but should be recognized because they indicate that precision was lost
- during some earlier calculations.
-
- Once it has been determined that the number has a numeric value, and it is
- normalized (setting appropriate denormal flags, if necessary, to indicate
- this to the calling program), the value must be scaled to the BCD range.
-
-
- 7.3.5 Scaling the Value
-
- To scale the number, its magnitude must be determined. It is sufficient to
- calculate the magnitude to an accuracy of 1 unit, or within a factor of 10
- of the required value. After scaling the number, a check is made to see if
- the result falls in the range expected. If not, the result can be adjusted
- one decimal order of magnitude up or down. The adjustment test after the
- scaling is necessary due to inevitable inaccuracies in the scaling value.
-
- Because the magnitude estimate for the scale factor need only be close, a
- fast technique is used. The magnitude is estimated by multiplying the power
- of 2, the unbiased floating-point exponent, associated with the number by
- log{10}2. Rounding the result to an integer produces an estimate of
- sufficient accuracy. Ignoring the fraction value can introduce a maximum
- error of 0.32 in the result.
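-
- The estimate described above can be sketched in Python. This is an
- illustration using double precision rather than the NPX extended format,
- and the function name estimate_decimal_magnitude is an assumption.
- math.frexp stands in for the exponent extraction that FXTRACT performs in
- the listing.
-

```python
import math


def estimate_decimal_magnitude(x):
    """Estimate the decimal order of magnitude of a nonzero finite x."""
    fraction, exponent = math.frexp(abs(x))   # abs(x) = fraction * 2**exponent
    power_two = exponent - 1                  # exponent for a fraction in [1, 2)
    # Multiply the power of two by log10(2) and round; the fraction is ignored.
    return round(power_two * math.log10(2.0))
```

-
- Because the fraction is ignored and the product is rounded, the estimate
- can differ from the true decimal exponent by one, which is why the scaled
- result is range-checked afterward.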
-
- Using the magnitude of the value and size of the number string, the scaling
- factor can be calculated. Calculating the scaling factor is the most
- inaccurate operation of the conversion process. The relation
- 10^(X) = 2^(X * log{2}10) is used for this function. The exponentiate
- instruction F2XM1 is used.
-
- Due to restrictions on the range of values allowed by the F2XM1
- instruction, the power of 2 value is split into integer and fraction
- components. The relation 2^(I + F) = 2^(I) * 2^(F) allows using the FSCALE
- instruction to recombine the 2^(F) value, calculated through F2XM1, and the
- 2^(I) part.
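-
- A minimal Python sketch of this split follows. It is an illustration in
- double precision, not a transcription of GET_POWER_10, and the names
- two_to_power and ten_to_power are assumptions. The expression
- 2.0 ** f - 1.0 stands in for F2XM1 and math.ldexp stands in for FSCALE;
- rounding to the nearest integer keeps the fraction within -0.5 to +0.5.
-

```python
import math


def two_to_power(t):
    """Compute 2**t as 2**I * 2**F with I an integer and |F| <= 0.5."""
    i = round(t)                              # integer part I
    f = t - i                                 # fraction part F
    two_f_minus_1 = 2.0 ** f - 1.0            # stands in for F2XM1
    return math.ldexp(two_f_minus_1 + 1.0, i) # stands in for FSCALE


def ten_to_power(x):
    """Compute 10**x through the relation 10**x = 2**(x * log2(10))."""
    return two_to_power(x * math.log2(10.0))
```

-
- Unlike this sketch, the listing's GET_POWER_10 returns the exponent and the
- fraction as two separate stack entries instead of recombining them.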
-
-
- 7.3.5.1 Inaccuracy in Scaling
-
- The inaccuracy in calculating the scale factor arises because of the
- trailing zeros placed into the fraction value of the power of two when
- stripping off the integer valued bits. For each integer valued bit in the
- power of 2 value separated from the fraction bits, one bit of precision is
- lost in the fraction field due to the zero fill occurring in the least
- significant bits.
-
- Up to 14 bits may be lost in the fraction because the largest allowed
- floating point exponent value is 2^(14) - 1. These bits directly reduce the
- accuracy of the calculated scale factor, thereby reducing the accuracy of
- the scaled value. For numbers in the range of 10^(±30), a maximum of 8 bits
- of precision are lost in the scaling process.
-
-
- 7.3.5.2 Avoiding Underflow and Overflow
-
- The fraction and exponent fields of the number are separated to avoid
- underflow and overflow in calculating the scaling values. For example, to
- scale 10^(-4932) to 10^(8) requires a scaling factor of 10^(4940), which
- cannot be represented by the NPX.
-
- By separating the exponent and fraction, the scaling operation involves
- adding the exponents separately from multiplying the fractions. The exponent
- arithmetic involves small integers, all easily represented by the NPX.
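-
- The same idea can be demonstrated in double precision, where 10^(310) is
- out of range in the way that 10^(4940) is for the extended format. The
- Python sketch below is illustrative only; ten_power_parts and
- scale_by_power_of_ten are hypothetical names. The scale factor is never
- formed as a single number: its fraction and power of two are kept apart
- and combined with the value's own fraction and exponent.
-

```python
import math


def ten_power_parts(x):
    """Return (fraction, exponent) such that 10**x = fraction * 2**exponent."""
    t = x * math.log2(10.0)
    exponent = round(t)                  # small integer, always representable
    fraction = 2.0 ** (t - exponent)     # between about 0.707 and 1.414
    return fraction, exponent


def scale_by_power_of_ten(value, x):
    """Multiply value by 10**x without ever forming 10**x itself."""
    value_fraction, value_exponent = math.frexp(value)
    scale_fraction, scale_exponent = ten_power_parts(x)
    # Fractions are multiplied; exponents are added as plain integers.
    return math.ldexp(value_fraction * scale_fraction,
                      value_exponent + scale_exponent)
```

-
- For example, scale_by_power_of_ten(1e-300, 310) yields a value close to
- 1e10 even though 1e310 itself overflows to infinity in double precision.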
-
-
- 7.3.5.3 Final Adjustments
-
- It is possible that the power function (Get_Power_10) could produce a
- scaling value such that it forms a scaled result larger than the ASCII field
- could allow. For example, scaling 9.9999999999999999 * 10^(4900) by
- 1.00000000000000010 * 10^(-4883) produces 1.00000000000000009 * 10^(18). The
- scale factor is within the accuracy of the NPX and the result is within the
- conversion accuracy, but it cannot be represented in BCD format. This is why
- there is a post-scaling test on the magnitude of the result. The result can
- be multiplied or divided by 10, depending on whether the result was too
- small or too large, respectively.
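-
- The post-scaling test can be sketched as follows. This Python fragment is
- an illustration, not a transcription of the listing; the name
- adjust_scaled and its parameters are assumptions. It compares the scaled
- value against the exact powers of ten that bound a field of the given
- number of digits and moves the value and its power of ten by one decimal
- order when needed.
-

```python
def adjust_scaled(value, power, digits):
    """Bring a scaled value into the range 10**(digits-1) <= value < 10**digits.

    The number represented, value * 10**power, is unchanged.
    """
    upper = 10.0 ** digits          # first value too large for the field
    lower = 10.0 ** (digits - 1)    # smallest value that fills the field
    if value >= upper:              # too large: divide by 10
        return value / 10.0, power + 1
    if value < lower:               # too small: multiply by 10
        return value * 10.0, power - 1
    return value, power             # already in range
```

-
- A single adjustment is enough as long as the magnitude estimate is within
- one decimal order of the true value.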
-
-
- 7.3.6 Output Format
-
- For maximum flexibility in output formats, the position of the decimal
- point is indicated by a binary integer called the power value. If the power
- value is zero, then the decimal point is assumed to be at the right of the
- rightmost digit. Power values greater than zero indicate how many trailing
- zeros are not shown. For each unit below zero, move the decimal point to the
- left in the string.
-
- The last step of the conversion is storing the result in BCD and indicating
- where the decimal point lies. The BCD string is then unpacked into ASCII
- decimal characters. The ASCII sign is set corresponding to the sign of the
- original value.
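-
- The power-value convention can be sketched as follows. The Python
- function place_decimal_point is a hypothetical helper, not part of the
- conversion routine; it takes the unsigned ASCII digit string and the
- power value and returns the digits with the decimal point made explicit.
- Sign handling is omitted.

```python
def place_decimal_point(digits: str, power: int) -> str:
    """Render a digit string and its power value as ordinary decimal text."""
    if power >= 0:
        # Positive power values stand for trailing zeros that are not shown.
        return digits + "0" * power + "."
    shift = -power
    # Pad with leading zeros when the point moves past the first digit.
    padded = digits.rjust(shift + 1, "0")
    return padded[:-shift] + "." + padded[-shift:]

print(place_decimal_point("12345", 0))   # 12345.
print(place_decimal_point("12345", 2))   # 1234500.
print(place_decimal_point("12345", -2))  # 123.45
```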
-
-
- 7.4 Trigonometric Calculation Examples (Not Tested)
-
- In this example, the kinematics of a robot arm is modeled with the 4 * 4
- homogeneous transformation matrices proposed by Denavit and Hartenberg.
- The translational and rotational relationships between adjacent links are
- described with these matrices using the D-H matrix method. For each link,
- there is a 4 * 4 homogeneous transformation matrix that represents the
- link's coordinate system (L{i}) at the joint (J{i}) with respect to the
- previous link's coordinate system (J{i-1}, L{i-1}). The following four
- geometric quantities completely describe the motion of any rigid joint/link
- pair (J{i}, L{i}), as Figure 7-7 illustrates.
-
- Θ{i} = The angular displacement of the x{i} axis from the x{i-1} axis by
- rotating around the z{i-1} axis (anticlockwise).
-
- d{i} = The distance from the origin of the (i-1)^(th) coordinate system
- along the z{i-1} axis to the x{i} axis.
-
- a{i} = The distance of the origin of the i^(th) coordinate system from
- the z{i-1} axis along the -x{i} axis.
-
- α{i} = The angular displacement of the z{i} axis from the z{i-1} about
- the x{i} axis (anticlockwise).
-
- The D-H transformation matrix A^(i){i-1} for adjacent coordinate frames
- (from joint{i-1} to joint{i}) is calculated as follows:
-
- A^(i){i-1} = T{z,d} * T{z,Θ} * T{x,a} * T{x,α}
-
- ...where...
-
- T{z,d} represents a translation along the z{i-1} axis
-
- T{z,Θ} represents a rotation of angle Θ about the z{i-1} axis
-
- T{x,a} represents a translation along the x{i} axis
-
- T{x,α} represents a rotation of angle α about the x{i} axis
-
-              │ COS Θ{i}   -COS α{i} SIN Θ{i}    SIN α{i} SIN Θ{i}   a{i} COS Θ{i} │
- A^(i){i-1} = │ SIN Θ{i}    COS α{i} COS Θ{i}   -SIN α{i} COS Θ{i}   a{i} SIN Θ{i} │
-              │ 0           SIN α{i}             COS α{i}            d{i}          │
-              │ 0           0                    0                   1             │
-
- The composite homogeneous matrix T which represents the position and
- orientation of the joint/link pair with respect to the base system is
- obtained by successively multiplying the D-H transformation matrices for
- adjacent coordinate frames.
-
- T^(i){0} = A^(1){0} * A^(2){1} * ... * A^(i){i-1}
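-
- The construction of one D-H matrix and of the composite matrix can be
- sketched in Python. The function names dh_matrix, matmul, and composite
- are hypothetical. The fourth column is taken as a{i} COS Θ{i},
- a{i} SIN Θ{i}, d{i}, 1, which matches the a14, a24, and a34 comments in
- the trans_proc listing. Angles are in radians.

```python
import math

def dh_matrix(theta, alpha, a, d):
    """Return the 4 x 4 D-H matrix for one joint/link pair."""
    ct, st = math.cos(theta), math.sin(theta)
    ca, sa = math.cos(alpha), math.sin(alpha)
    return [
        [ct, -ca * st,  sa * st, a * ct],
        [st,  ca * ct, -sa * ct, a * st],
        [0.0, sa,       ca,      d],
        [0.0, 0.0,      0.0,     1.0],
    ]

def matmul(p, q):
    """Return the 4 x 4 matrix product p * q."""
    return [[sum(p[i][k] * q[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def composite(links):
    """Multiply the D-H matrices of all links, base first."""
    t = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    for theta, alpha, a, d in links:
        t = matmul(t, dh_matrix(theta, alpha, a, d))
    return t

# Two unit-length links in a plane, the first joint turned by 90 degrees.
t = composite([(math.pi / 2, 0.0, 1.0, 0.0), (0.0, 0.0, 1.0, 0.0)])
print(t[0][3], t[1][3])  # x and y position of the end of the second link
```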
-
- The example in Figure 7-8 illustrates how the transformation process can
- be accomplished using the 80387. The program consists of two major
- procedures. The first procedure TRANS_PROC is used to calculate the elements
- in each D-H matrix, A^(i){i-1}. The second procedure MATRIXMUL_PROC finds
- the product of two successive D-H matrices.
-
-
- Figure 7-8. Robot Arm Kinematics Example
-
- XENIX286 80386 MACRO ASSEMBLER V1.0, ASSEMBLY OF MODULE TOS_STATUS
- OBJECT MODULE PLACED IN tos.obj
- ASSEMBLER INVOKED BY: asm386 tos.asm
-
- LOC OBJ LINE SOURCE
-
- 1 +1 $title(Determine TOS register contents)
- 2 ;
- 3 ; This subroutine will return a value
- 4 ; from 0-15 in eax corresponding
- 5 ; to the contents of NPX TOS. All
- 6 ; registers are transparent and no
- 7 ; errors are possible. The return
- 8 ; value corresponds to c3,c2,c1,c0
- 9 ; of FXAM instruction.
- 10 ;
- 11 name tos_status
- 00000000 12 public tos_status
- 13
- -------- 14 stack stackseg 6
- 15
- -------- 16 code segment public er
- 17
- 00000000 18 tos_status proc
- 19
- 00000000 D9E5 20 fxam ; Get status of T
- 00000002 9BDFE0 21 fstsw ax ; Get current status
- 00000005 88E0 22 mov al,ah ; Put bit 10-8 in
- 00000007 2507400000 23 and eax,4007h ; Mask out bits c
- 0000000C C0EC03 24 shr ah, 3 ; Put bit c3 into
- 0000000F 08E0 25 or al,ah ; Put c3 into bit
- 00000011 B400 26 mov ah,0 ; Clear return va
- 00000013 C3 27 ret
- 28
- 00000014 29 tos_status endp
- 30
- -------- 31 code ends
- 32 end
-
- ASSEMBLY COMPLETE, NO WARNINGS, NO ERRORS.
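-
- The bit manipulation performed by tos_status can be restated in Python
- as a sketch. The function below takes a 16-bit status word as an integer
- (an assumption for illustration; the assembly routine obtains it with
- FSTSW AX after FXAM) and packs C3, C2, C1, and C0 into a value from
- 0 to 15.

```python
def tos_status(status_word: int) -> int:
    """Pack C3, C2, C1, C0 of an 80387 status word into a value 0-15."""
    high = (status_word >> 8) & 0xFF   # mov al,ah : status bits 15-8
    low3 = high & 0x07                 # C2, C1, C0 from status bits 10-8
    c3 = (high & 0x40) >> 3            # C3 (status bit 14) moved to bit 3
    return c3 | low3

# C3 = 1, C2 = 1, C1 = 0, C0 = 1 gives binary 1101.
print(tos_status(0x4500))  # 13
```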
-
-
- LOC OBJ LINE SOURCE
-
- 37 ; and fraction
- 000000A9 C3 38 ret ; OK to leave fxtract runni
- 39 ;
- 40 ; Calculate the value using the
- 41 ; exponentiate instruction. The following
- 42 ; relations are used:
- 43 ; 10**x = 2**(log2(10)*x)
- 44 ; 2**(I+F) = 2**I * 2**F
- 45 ; if st(1) = I and st(0) = 2**F then
- 46 ; fscale produces 2**(I+F)
- 47 ;
- 000000AA 48 out_of_range:
- 49
- 000000AA D9E9 50 fldl2t ; TOS = LOG2(10)
- 000000AC C8040000 51 enter 4,0
- 52
- 53 ; save power of 10 value, P
- 000000B0 8945FC 54 mov [ebp-4],eax
- 55
- 56 ; TOS,X = LOG2(10)*P = LOG2(10**P)
- 000000B3 DA4DFC 57 fimul dword ptr [ebp-4]
- 000000B6 D9E8 58 fld1 ; Set TOS = -1.0
- 000000B8 D9E0 59 fchs
- 000000BA D9C1 60 fld st(1) ; Copy power value
- 61 ; in base two
- 000000BC D9FC 62 frndint ; TOS = I: -inf < I <= X
- 63 ; where I is an integer
- 64 ; Rounding mode does
- 65 ; not matter
- 000000BE D9CA 66 fxch st(2) ; TOS = X, ST(1) = -1.0
- 67 ; ST(2) = I
- 000000C0 D8E2 68 fsub st,st(2) ; TOS,F = X-I:
- 69 ; -1.0 < TOS <= 1.0
- 70
- 71 ; Restore original rounding control
- 000000C2 58 72 pop eax
- 000000C3 D9F0 73 f2xm1 ; TOS = 2**(F) - 1.
- 000000C5 C9 74 leave ; Restore stack
- 000000C6 DEE1 75 fsubr ; Form 2**(F)
- 000000C8 C3 76 ret ; OK to leave fsubr
- 77
- 000000C9 78 get_power_10 endp
- 79
- -------- 80 code ends
- 81 end
-
- ASSEMBLY COMPLETE, NO WARNINGS, NO ERRORS.
-
-
- XENIX286 80386 MACRO ASSEMBLER V1.0, ASSEMBLY OF MODULE ROT_MATRIX_CAL
- OBJECT MODULE PLACED IN transx.obj
- ASSEMBLER INVOKED BY: asm386 transx.asm
-
- LOC OBJ LINE SOURCE
-
- 1 Name ROT_MATRIX_CAL
- 2
- 3
- 4
- 5 ; This example illustrates the use
- 6 ; of the 80387 floating point
- 7 ; instructions, in particular, the
- 8 ; FSINCOS function which gives both
- 9 ; the SIN and COS values.
- 10 ; The program calculates the
- 11 ; composite matrix for base to
- 12 ; end-effector transformation.
- 13 ;
- 14 ; Only the kinematics is considered in
- 15 ; this example.
- 16 ;
- 17 ; If the composite matrix mentioned above
- 18 ; is given by:
- 19 ; T1n = A1 x A2 x ... x An
- 20 ; T1n is found by successively calling
- 21 ; trans_proc and matrixmul_proc until
- 22 ; all matrices have been exhausted.
- 23 ;
- 24 ; trans_proc calculates entries in each
- 25 ; A(A1,...,An) while matrixmul_proc
- 26 ; performs the matrix multiplication for
- 27 ; Ai and Ai+1. matrixmul_proc in turn
- 28 ; calls matrix_row and matrix_elem to
- 29 ; do the multiplication.
- 30
- 31
- 32 ; Define stack space
- 33
- -------- 34 trans_stack stackseg 400
- 35
- 36 ; Define the matrix structure for
- 37 ; 4X4 transformational matrices
- 38
- -------- 39 a_matrix struc
- 00000000 40 a11 dq ?
- 00000008 41 a12 dq ?
- 00000010 42 a13 dq ?
- 00000018 43 a14 dq ?
- 00000020 44 a21 dq ?
- 00000028 45 a22 dq ?
- 00000030 46 a23 dq ?
- 00000038 47 a24 dq ?
- 00000040 48 a31 dq 0h
- 00000048 49 a32 dq ?
- 00000050 50 a33 dq ?
- 00000058 51 a34 dq ?
- 00000060 52 a41 dq 0h
- 00000068 53 a42 dq 0h
- 00000070 54 a43 dq 0h
- 00000078 55 a44 dq 1h
- -------- 56 a_matrix ends
- 57
- 58 ; Assume One joint in the storage
- 59 ; allocation and hence for
- 60 ; two sets of parameters; however,
- 61 ; more joints are possible
- 62 ;
- 63 alp_deg struc
- 00000000 64 alpha_deg1 dd ?
- 00000004 65 alpha_deg2 dd ?
- -------- 66 alp_deg ends
- 67
- -------- 68 tht_deg struc
- 00000000 69 theta_deg1 dd ?
- 00000004 70 theta_deg2 dd ?
- -------- 71 tht_deg ends
- 72
- -------- 73 A_array struc
- 00000000 74 A1 dq ?
- 00000008 75 A2 dq ?
- -------- 76 A_array ends
- 77
- -------- 78 D_array struc
- 00000000 79 D1 dq ?
- 00000008 80 D2 dq ?
- -------- 81 D_array ends
- 82
- 83 ; trans_data is the data segment
- 84 ;
- 85
- ------- 86 trans_data segment rw public
- 87
- 88 Amx a_matrix<>
- 00000000 ????????????????
- 00000008 ????????????????
- 00000010 ????????????????
- 00000018 ????????????????
- 00000020 ????????????????
- 00000028 ????????????????
- 00000030 ????????????????
- 00000038 ????????????????
- 00000040 0000000000000000
- 00000048 ????????????????
- 00000050 ????????????????
- 00000058 ????????????????
- 00000060 0000000000000000
- 00000068 0000000000000000
- 00000070 0000000000000000
- 00000078 0100000000000000
- 00000080 ???????????????? 89 Bmx a_matrix<>
- 00000088 ????????????????
- 00000090 ????????????????
- 00000098 ????????????????
- 000000A0 ????????????????
- 000000A8 ????????????????
- 000000B0 ????????????????
- 000000B8 ????????????????
- 000000C0 0000000000000000
- 000000C8 ????????????????
- 000000D0 ????????????????
- 000000D8 ????????????????
- 000000E0 0000000000000000
- 000000E8 0000000000000000
- 000000F0 0000000000000000
- 000000F8 0100000000000000
- 00000100 ???????????????? 90 Tmx a_matrix<>
- 00000108 ????????????????
- 00000110 ????????????????
- 00000118 ????????????????
- 00000120 ????????????????
- 00000128 ????????????????
- 00000130 ????????????????
- 00000138 ????????????????
- 00000140 0000000000000000
- 00000148 ????????????????
- 00000150 ????????????????
- 00000158 ????????????????
- 00000160 0000000000000000
- 00000168 0000000000000000
- 00000170 0000000000000000
- 00000178 0100000000000000
- 00000180 ???????? 91 ALPHA_DEG alp_deg<>
- 00000184 ????????
- 00000188 ???????? 92 THETA_DEG tht_deg<>
- 0000018C ????????
- 00000190 ???????????????? 93 A_VECTOR A_array<>
- 00000198 ????????????????
- 000001A0 ???????????????? 94 D_VECTOR D_array<>
- 000001A8 ????????????????
- 000001B0 00000000 95 ZERO dd 0
- 000001B4 B4000000 96 d180 dd 180
- 0001 97 NUM_JOINT equ 1
- 0004 98 NUM_ROW equ 4
- 0004 99 NUM_COL equ 4
- 000001B8 01 100 REVERSE db 1h
- -------- 101 trans_data ends
- 102
- 103 assume ds:trans_data, es:trans_data
- 104
- 105
- 106 ; trans_code contains the procedures
- 107 ; for calculating matrix elements and
- 108 ; matrix multiplications
- 109
- -------- 110 trans_code segment er public
- 111
- 112 ; create mnemonics for fsincos which is n
- 113 ; yet available from ASM386 as of now
- 114
- C MACRO 115 codemacro fsincos
- # 116 dw 0fbd9h
- # 117 endm
- 118
- 00000000 119 trans_proc proc far
- 120
- 121
- 122 ; Calculate alpha and theta in radian
- 123 ; from their values in degrees
- 124
- 00000000 D9EB 125 fldpi
- 00000002 D835B4010000 R 126 fdiv d180
- 127
- 128 ; Duplicate pi/180
- 00000008 D9C0 129 fld st
- 130
- 0000000A DC0CCD80010000 R 131 fmul qword ptr ALPHA_DEG[ecx*8]
- 00000011 D9C9 132 fxch st(1)
- 00000013 DC0CCD88010000 R 133 fmul qword ptr THETA_DEG[ecx*8]
- 134
- 135 ; theta(radians) in ST and
- 136 ; alpha(radians) in ST(1)
- 137
- 138 ; Calculate matrix elements
- 139 ; a11 = cos theta
- 140 ; a12 = - cos alpha * sin theta
- 141 ; a13 = sin alpha * sin theta
- 142 ; a14 = A * cos theta
- 143 ; a21 = sin theta
- 144 ; a22 = cos alpha * cos theta
- 145 ; a23 = -sin alpha * cos theta
- 146 ; a24 = A * sin theta
- 147 ; a32 = sin alpha
- 148 ; a33 = cos alpha
- 149 ; a34 = D
- 150 ; a31 = a41 = a42 = a43 = 0.0
- 151 ; a44 =1
- 152
- 153 ; ebx contains the offset for the mat
- 154
- 0000001A D9FB 155 fsincos ;cos theta in ST
- 156 ;sin theta in ST(1
- 0000001C D9C0 157 fld st ;duplicate cos the
- 0000001E DD13 158 fst [ebx].a11 ;cos theta in a11
- 00000020 DC0CCD90010000 R 159 fmul qword ptr A_VECTOR[ecx*8]
- 00000027 DD5B18 160 fstp [ebx].a14 ;A * cos theta in
- 0000002A D9C9 161 fxch st(1) ;sin theta in ST
- 0000002C DD5320 162 fst [ebx].a21 ;sin theta in a21
- 0000002F D9C0 163 fld st ;duplicate sin the
- 00000031 DC0CCD90010000 R 164 fmul qword ptr A_VECTOR[ecx*8]
- 00000038 DD5B38 165 fstp [ebx].a24 ;A * sin theta in
- 0000003B D9C2 166 fld st(2) ;alpha in ST
- 0000003D D9FB 167 fsincos ;cos alpha in ST
- 168 ;sin alpha in ST(1
- 169 ;sin theta in ST(2
- 170 ;cos theta in ST(3
- 0000003F DD5350 171 fst [ebx].a33 ;cos alpha in a33
- 00000042 D9C9 172 fxch st(1) ;sin alpha in ST
- 00000044 DD5348 173 fst [ebx].a32 ;sin alpha in a32
- 00000047 D9C2 174 fld ST(2) ;sin theta in ST
- 175 ;sin alpha in ST(1
- 00000049 D8C9 176 fmul st,st(1) ;sin alpha * sin t
- 0000004B DD5B10 177 fstp [ebx].a13 ;stored in a13
- 0000004E D8CB 178 fmul st,st(3) ;cos theta * sin a
- 00000050 D9E0 179 fchs ;-cos theta * sin
- 00000052 DD5B30 180 fstp [ebx].a23 ;stored in a23
- 00000055 D9C2 181 fld st(2) ;cos theta in ST
- 182 ;cos alpha in ST(1
- 183 ;sin theta in ST(2
- 184 ;cos theta in ST(3
- 00000057 D8C9 185 fmul st,st(1) ;cos theta * cos a
- 00000059 DD5B28 186 fstp [ebx].a22 ;stored in a22
- 0000005C D8C9 187 fmul st,st(1) ;cos alpha * sin t
- 188 ;
- 189 ; To take advantage of parallel opera
- 190 ; between the CPU and NPX
- 191 ;
- 0000005E 50 192 push eax ; save eax
- 193 ;
- 194 ; also move D into a34 in a faster wa
- 0000005F 8B04CDA0010000 R 195 mov eax, dword ptr D_VECTOR[ecx*
- 00000066 894358 196 mov dword ptr [ebx + 88], eax
- 00000069 8B04CDA4010000 R 197 mov eax, dword ptr D_VECTOR[ecx*
- 00000070 89435C 198 mov dword ptr [ebx + 92], eax
- 00000073 58 199 pop eax ; restore eax
- 00000074 D9E0 200 fchs ;-cos alpha * sin
- 00000076 DD5B08 201 fstp [ebx].a12 ;stored in a12
- 202 ;and all nonzero e
- 203 ;have been calcula
- 00000079 CB 204 ret
- 205
- 0000007A 206 trans_proc endp
- 207
- 208
- 0000007A 209 matrix_elem proc far
- 210
- 211 ; This procedure calculate the dot pr
- 212 ; of the ith row of the first matrix
- 213 ; the jth column of the second matrix
- 214 ;
- 215 ; Tij where Tij = sum of Aik x Bkj ov
- 216 ;
- 217 ; parameters passed from the calling
- 218 ; matrix_row:
- 219 ; ESI = (i-1)*8
- 220 ; EDI = (j-1)*8
- 221 ; local register, EBP = (k-1)*8
- 222 ;
- 0000007A 55 223 push ebp ; save ebp
- 0000007B 51 224 push ecx ; ecx to be used as
- 0000007C 8BCE 225 mov ecx, esi; save it for later
- 226
- 227 ; locating the element in the first m
- 0000007E 6BC904 228 imul ecx, NUM_COL ; ecx contai
- 229 ; to precedi
- 230 ; offset is
- 231 ; beginning
- 232
- 00000081 31ED 233 xor ebp, ebp; clear ebp, which
- 234 ; used a temp reg t
- 235 ; across the ith ro
- 236 ; matrix as well as
- 237 ; column of the sec
- 238
- 239 ; clear Tij for accumulating Aik*Bkj
- 00000083 892C39 240 mov dword ptr [ecx][edi],ebp
- 00000086 896C3904 241 mov dword ptr [ecx][edi+4], ebp
- 242
- 0000008A 51 243 push ecx ; save on stack: es
- 244 ; the offset of the
- 245 ; of the ith row fr
- 246 ; beginning of the
- 247
- 0000008B 248 NXT_k:
- 0000008B 01E9 249 add ecx, ebp ; get to the kth c
- 250 ; of the ith row o
- 251
- 252 ; load AiK into 80387
- 0000008D DD0408 253 fld qword ptr [eax][ecx]
- 254
- 255 ; locating Bkj
- 00000090 8BCD 256 mov ecx, ebp
- 00000092 6BC904 257 imul ecx, NUM_ROW ; ecx contains
- 258 ; of the begin
- 259 ; kth row from
- 260 ; beginning of
- 00000095 01F9 261 add ecx, edi ; get to the j
- 262 ; of the kth r
- 263 ; matrix
- 00000097 DC0C0B 264 fmul qword ptr [ebx][ecx]; Aik *
- 0000009A 59 265 pop ecx ; esi * num_co
- 266 ; in ecx again
- 0000009B 51 267 push ecx ; also at top
- 268 ; stack
- 269
- 270 ; add to the result in the output mat
- 0000009C 01F9 271 add ecx, edi
- 272
- 273 ; accumulating the sum of Aik * Bkj
- 0000009E DC040A 274 fadd qword ptr [edx][ecx]
- 000000A1 DD1C0A 275 fstp qword ptr [edx][ecx]
- 276 ; increment k by 1, i.e., ebp by 8
- 000000A4 83C508 277 add ebp, 8
- 278
- 279 ; Has k reached the width of the matr
- 000000A7 83FD20 280 cmp ebp, NUM_COL*8
- 000000AA 7CDF 281 jl NXT_k
- 282
- 283 ; Restore registers
- 000000AC 59 284 pop ecx ; clear esi*num_col
- 000000AD 59 285 pop ecx ; restore ecx
- 000000AE 5D 286 pop ebp ; restore ebp
- 000000AF CB 287 ret
- 288
- 000000B0 289 matrix_elem endp
- 290
- 291
- 000000B0 292 matrix_row proc far
- 293
- 000000B0 31FF 294 xor edi, edi
- 295 ; scan across a row
- 296
- 000000B2 297 NXT_COL:
- 000000B2 9A7A000000.... R 298 call matrix_elem
- 000000B9 83C708 299 add edi, 8
- 000000BC 83FF20 300 cmp edi, NUM_COL*8
- 000000BF 7CF1 301 jl NXT_COL
- 000000C1 CB 302 ret
- 303
- 000000C2 304 matrix_row endp
- 305
- 306
- 000000C2 307 matrixmul_proc proc far
- 308
- 309 ; This procedure does the matrix
- 310 ; multiplication by calling matrix_ro
- 311 ; to calculate entries in each row
- 312 ;
- 313 ; The matrix multiplication is
- 314 ; performed in the following manner,
- 315 ; Tij = Aik x Bkj
- 316 ; where i and j denote the row and co
- 317 ; respectively and k is the index for
- 318 ; scanning across the ith row of the
- 319 ; first matrix and the jth column of
- 320 ; second matrix.
- 000000C2 5A 321 pop edx ; offset Tmx in edx
- 000000C3 5B 322 pop ebx ; offset Bmx in ebx
- 000000C4 58 323 pop eax ; offset Amx in eax
- 324
- 325 ; setup esi and edi
- 326 ; edi points to the column
- 327 ; esi points to the row
- 328
- 000000C5 31F6 329 xor esi, esi ; clear esi
- 330
- 000000C7 331 NXT_ROW:
- 000000C7 9AB0000000---- R 332 call matrix_row
- 000000CE 83C608 333 add esi, 8
- 000000D1 83FE20 334 cmp esi, NUM_ROW*8
- 000000D4 7CF1 335 jl NXT_ROW
- 000000D6 CB 336 ret
- 337
- 000000D7 338 matrixmul_proc endp
- 339
- 340
- -------- 341 trans_code ends
- 342
- 343 ;***************************************
- 344 ; ;
- 345 ; ;
- 346 ; ;
- 347 ; Main program ;
- 348 ; ;
- 349 ; ;
- 350 ; ;
- 351 ;***************************************
- 352
- -------- 353 main_code segment er
- 354
- 00000000 355 START:
- 356
- 00000000 BC00000000 R 357 mov esp, stackstart trans_stack
- 358 ; save all registers
- 359
- 00000005 60 360 pushad
- 361
- 362 ; ECX denotes the number of joints
- 363 ; where no of matrices = NUM_JOINT +
- 364 ; Find the first matrix( from the bas
- 365 ; of the system to the first joint)
- 366 ; and call it Bmx
- 00000006 31C9 367 xor ecx, ecx ; 1st matrix
- 00000008 BB80000000 R 368 mov ebx, offset Bmx ;
- 0000000D 9A00000000---- R 369 call trans_proc ; is Bmx
- 00000014 41 370 inc ecx
- 371
- 00000015 372 NXT_MATRIX:
- 373 ; From the 2nd matrix and on, it
- 374 ; will be stored in Amx.
- 375 ; The result from the first matrix mu
- 376 ; is stored in Tmx but will be access
- 377 ; as Bmx in the next multiplication.
- 378 ; As a matter of fact, the roles of B
- 379 ; and Tmx alternate in successive
- 380 ; multiplications. This is achieved b
- 381 ; reversing the order of the Bmx and
- 382 ; pointers being passed onto the prog
- 383 ; stack: Thus, this is invisible to
- 384 ; matrix multiplication procedure.
- 385 ; REVERSE serves as the indicator;
- 386 ; REVERSE = 0 means that the result
- 387 ; is to placed in Tmx.
- 388
- 00000015 BB00000000 R 389 mov ebx, offset Amx ;find Amx
- 0000001A 9A00000000---- R 390 call trans_proc
- 00000021 41 391 inc ecx
- 00000022 8035B801000001 R 392 xor REVERSE, 1h
- 00000029 7511 393 jnz Bmx_as_Tmx
- 394
- 395 ; no reversing. Bmx as the second in
- 396 ; matrix while Tmx as the output matr
- 0000002B 6800000000 R 397 push offset Amx
- 00000030 6880000000 R 398 push offset Bmx
- 00000035 6800010000 R 399 push offset Tmx
- 0000003A EB0F 400 jmp CONTINUE
- 401
- 402 ; reversing. Tmx as the second input
- 403 ; matrix while Bmx as the output matr
- 0000003C 404 Bmx_as_Tmx:
- 0000003C 6800000000 R 405 push offset Amx
- 00000041 6800010000 R 406 push offset Tmx ;reversing the
- 00000046 6880000000 R 407 push offset Bmx ;pointers passe
- 408
- 0000004B 409 CONTINUE:
- 0000004B 9AC2000000---- R 410 call matrixmul_proc
- 00000052 83F901 411 cmp ecx, NUM_JOINT
- 00000055 7EBE 412 jle NXT_MATRIX
- 413
- 414 ; if REVERSE = 1 then the final answe
- 415 ; will be in Bmx otherwise, in Tmx.
- 416
- 00000057 61 417 popad
- 418
- -------- 419 main_code ends
- 420
- 421 end START, ds:trans_data, ss:trans_stack
-
- ASSEMBLY COMPLETE, NO WARNINGS, NO ERRORS.
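-
- The alternation of input and output matrices controlled by REVERSE can
- be sketched in Python. This is an illustration under stated assumptions,
- not a translation of the listing: the names matmul4 and chain_product
- are hypothetical, and the product is accumulated in the order
- A1 x A2 x ... x An given in the header comments of the example.

```python
def matmul4(a, b, out):
    """Store the 4 x 4 product a * b in out; out must not alias a or b."""
    for i in range(4):
        for j in range(4):
            out[i][j] = sum(a[i][k] * b[k][j] for k in range(4))

def chain_product(matrices):
    """Multiply matrices left to right using two alternating buffers."""
    acc = [row[:] for row in matrices[0]]   # plays the role of Bmx
    out = [[0.0] * 4 for _ in range(4)]     # plays the role of Tmx
    for nxt in matrices[1:]:                # plays the role of Amx
        matmul4(acc, nxt, out)
        acc, out = out, acc                 # swap roles, like toggling REVERSE
    return acc

def shift_x(x):
    """Homogeneous translation along x, used only to exercise the sketch."""
    return [[1.0, 0.0, 0.0, x], [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]

print(chain_product([shift_x(1.0), shift_x(2.0), shift_x(4.0)])[0][3])  # 7.0
```

- Swapping the two buffer references after each multiplication means no
- matrix is copied between steps, which is the purpose of passing the Bmx
- and Tmx pointers in alternating order.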
-
-
- Appendix A Machine Instruction Encoding and Decoding
-
- ───────────────────────────────────────────────────────────────────────────
- ╓┌────┌───────────┌─────────────┌────────────┌───────────────────────────────╖
- ┌──1st Byte──┐
- Hex Binary 2nd Byte Bytes 3-7 ASM386 Instruction Format
-
- D8 1101 1000 MOD 000 R/M SIB, displ FADD single-real
- D8 1101 1000 MOD 001 R/M SIB, displ FMUL single-real
- D8 1101 1000 MOD 010 R/M SIB, displ FCOM single-real
- D8 1101 1000 MOD 011 R/M SIB, displ FCOMP single-real
- D8 1101 1000 MOD 100 R/M SIB, displ FSUB single-real
- D8 1101 1000 MOD 101 R/M SIB, displ FSUBR single-real
- D8 1101 1000 MOD 110 R/M SIB, displ FDIV single-real
- D8 1101 1000 MOD 111 R/M SIB, displ FDIVR single-real
- D8 1101 1000 1100 0 REG FADD ST,ST(i)
- D8 1101 1000 1100 1 REG FMUL ST,ST(i)
- D8 1101 1000 1101 0 REG FCOM ST(i)
- D8 1101 1000 1101 1 REG FCOMP ST(i)
- D8 1101 1000 1110 0 REG FSUB ST,ST(i)
- D8 1101 1000 1110 1 REG FSUBR ST,ST(i)
- D8 1101 1000 1111 0 REG FDIV ST,ST(i)
- D8 1101 1000 1111 1 REG FDIVR ST,ST(i)
- D9 1101 1001 MOD 000 R/M SIB, displ FLD single-real
- D9 1101 1001 MOD 001 R/M reserved
- D9 1101 1001 MOD 010 R/M SIB, displ FST single-real
- D9 1101 1001 MOD 011 R/M SIB, displ FSTP single-real
- D9 1101 1001 MOD 100 R/M SIB, displ FLDENV 14 or 28 bytes
-
-
-
-
- D9 1101 1001 MOD 101 R/M SIB, displ FLDCW 2 bytes
- D9 1101 1001 MOD 110 R/M SIB, displ FSTENV 14 or 28 bytes
-
-
-
-
- D9 1101 1001 MOD 111 R/M SIB, displ FSTCW 2 bytes
- D9 1101 1001 1100 0 REG FLD ST(i)
- D9 1101 1001 1100 1 REG FXCH ST(i)
- D9 1101 1001 1101 0000 FNOP
- D9 1101 1001 1101 0001 reserved
- D9 1101 1001 1101 001- reserved
- D9 1101 1001 1101 01-- reserved
- D9 1101 1001 1101 1 REG reserved
- D9 1101 1001 1110 0000 FCHS
- D9 1101 1001 1110 0001 FABS
- D9 1101 1001 1110 001- reserved
- D9 1101 1001 1110 0100 FTST
- D9 1101 1001 1110 0101 FXAM
- D9 1101 1001 1110 011- reserved
- D9 1101 1001 1110 1000 FLD1
- D9 1101 1001 1110 1001 FLDL2T
- D9 1101 1001 1110 1010 FLDL2E
- D9 1101 1001 1110 1011 FLDPI
- D9 1101 1001 1110 1100 FLDLG2
- D9 1101 1001 1110 1101 FLDLN2
- D9 1101 1001 1110 1110 FLDZ
- D9 1101 1001 1110 1111 reserved
- D9 1101 1001 1111 0000 F2XM1
- D9 1101 1001 1111 0001 FYL2X
- D9 1101 1001 1111 0010 FPTAN
- D9 1101 1001 1111 0011 FPATAN
- D9 1101 1001 1111 0100 FXTRACT
- D9 1101 1001 1111 0101 FPREM1
- D9 1101 1001 1111 0110 FDECSTP
- D9 1101 1001 1111 0111 FINCSTP
- D9 1101 1001 1111 1000 FPREM
- D9 1101 1001 1111 1001 FYL2XP1
- D9 1101 1001 1111 1010 FSQRT
- D9 1101 1001 1111 1011 FSINCOS
- D9 1101 1001 1111 1100 FRNDINT
- D9 1101 1001 1111 1101 FSCALE
- D9 1101 1001 1111 1110 FSIN
- D9 1101 1001 1111 1111 FCOS
- DA 1101 1010 MOD 000 R/M SIB, displ FIADD short-integer
- DA 1101 1010 MOD 001 R/M SIB, displ FIMUL short-integer
- DA 1101 1010 MOD 010 R/M SIB, displ FICOM short-integer
- DA 1101 1010 MOD 011 R/M SIB, displ FICOMP short-integer
- DA 1101 1010 MOD 100 R/M SIB, displ FISUB short-integer
- DA 1101 1010 MOD 101 R/M SIB, displ FISUBR short-integer
- DA 1101 1010 MOD 110 R/M SIB, displ FIDIV short-integer
- DA 1101 1010 MOD 111 R/M SIB, displ FIDIVR short-integer
- DA 1101 1010 110- ---- reserved
- DA 1101 1010 1110 0--- reserved
- DA 1101 1010 1110 1000 reserved
- DA 1101 1010 1110 1001 FUCOMPP
- DA 1101 1010 1110 101- reserved
- DA 1101 1010 1110 11-- reserved
- DA 1101 1010 1111 ---- reserved
- DB 1101 1011 MOD 000 R/M SIB, displ FILD short-integer
- DB 1101 1011 MOD 001 R/M SIB, displ reserved
- DB 1101 1011 MOD 010 R/M SIB, displ FIST short-integer
- DB 1101 1011 MOD 011 R/M SIB, displ FISTP short-integer
- DB 1101 1011 MOD 100 R/M SIB, displ reserved
- DB 1101 1011 MOD 101 R/M SIB, displ FLD extended-real
- DB 1101 1011 MOD 110 R/M SIB, displ reserved
- DB 1101 1011 MOD 111 R/M SIB, displ FSTP extended-real
- DB 1101 1011 110- ---- reserved
- DB 1101 1011 1110 0000
-
-
-
-
- DB 1101 1011 1110 0001
-
-
-
-
- DB 1101 1011 1110 0010 FCLEX
- DB 1101 1011 1110 0011 FINIT
- DB 1101 1011 1110 0100
-
-
-
-
- DB 1101 1011 1110 0101 reserved
- DB 1101 1011 1110 011- reserved
- DB 1101 1011 1110 1--- reserved
- DB 1101 1011 1111 ---- reserved
- DC 1101 1100 MOD 000 R/M SIB, displ FADD double-real
- DC 1101 1100 MOD 001 R/M SIB, displ FMUL double-real
- DC 1101 1100 MOD 010 R/M SIB, displ FCOM double-real
- DC 1101 1100 MOD 011 R/M SIB, displ FCOMP double-real
- DC 1101 1100 MOD 100 R/M SIB, displ FSUB double-real
- DC 1101 1100 MOD 101 R/M SIB, displ FSUBR double-real
- DC 1101 1100 MOD 110 R/M SIB, displ FDIV double-real
- DC 1101 1100 MOD 111 R/M SIB, displ FDIVR double-real
- DC 1101 1100 1100 0 REG FADD ST(i),ST
- DC 1101 1100 1100 1 REG FMUL ST(i),ST
- DC 1101 1100 1101 0 REG reserved
- DC 1101 1100 1101 1 REG reserved
- DC 1101 1100 1110 0 REG FSUBR ST(i),ST
- DC 1101 1100 1110 1 REG FSUB ST(i),ST
- DC 1101 1100 1111 0 REG FDIVR ST(i),ST
- DC 1101 1100 1111 1 REG FDIV ST(i),ST
- DD 1101 1101 MOD 000 R/M SIB, displ FLD double-real
- DD 1101 1101 MOD 001 R/M reserved
- DD 1101 1101 MOD 010 R/M SIB, displ FST double-real
- DD 1101 1101 MOD 011 R/M SIB, displ FSTP double-real
- DD 1101 1101 MOD 100 R/M SIB, displ FRSTOR 94 or 108 bytes
-
-
-
-
- DD 1101 1101 MOD 101 R/M SIB, displ reserved
- DD 1101 1101 MOD 110 R/M SIB, displ FSAVE 94 or 108 bytes
-
-
-
-
- DD 1101 1101 MOD 111 R/M SIB, displ FSTSW 2 bytes
- DD 1101 1101 1100 0 REG FFREE ST(i)
- DD 1101 1101 1100 1 REG reserved
- DD 1101 1101 1101 0 REG FST ST(i)
- DD 1101 1101 1101 1 REG FSTP ST(i)
- DD 1101 1101 1110 0 REG FUCOM ST(i)
- DD 1101 1101 1110 1 REG FUCOMP ST(i)
- DD 1101 1101 1111 ---- reserved
- DE 1101 1110 MOD 000 R/M SIB, displ FIADD word-integer
- DE 1101 1110 MOD 001 R/M SIB, displ FIMUL word-integer
- DE 1101 1110 MOD 010 R/M SIB, displ FICOM word-integer
- DE 1101 1110 MOD 011 R/M SIB, displ FICOMP word-integer
- DE 1101 1110 MOD 100 R/M SIB, displ FISUB word-integer
- DE 1101 1110 MOD 101 R/M SIB, displ FISUBR word-integer
- DE 1101 1110 MOD 110 R/M SIB, displ FIDIV word-integer
- DE 1101 1110 MOD 111 R/M SIB, displ FIDIVR word-integer
- DE 1101 1110 1100 0 REG FADDP ST(i),ST
- DE 1101 1110 1100 1 REG FMULP ST(i),ST
- DE 1101 1110 1101 0--- reserved
- DE 1101 1110 1101 1000 reserved
- DE 1101 1110 1101 1001 FCOMPP
- DE 1101 1110 1101 101- reserved
- DE 1101 1110 1101 11-- reserved
- DE 1101 1110 1110 0 REG FSUBRP ST(i),ST
- DE 1101 1110 1110 1 REG FSUBP ST(i),ST
- DE 1101 1110 1111 0 REG FDIVRP ST(i),ST
- DE 1101 1110 1111 1 REG FDIVP ST(i),ST
- DF 1101 1111 MOD 000 R/M SIB, displ FILD word-integer
- DF 1101 1111 MOD 001 R/M SIB, displ reserved
- DF 1101 1111 MOD 010 R/M SIB, displ FIST word-integer
- DF 1101 1111 MOD 011 R/M SIB, displ FISTP word-integer
- DF 1101 1111 MOD 100 R/M SIB, displ FBLD packed-decimal
- DF 1101 1111 MOD 101 R/M SIB, displ FILD long-integer
- DF 1101 1111 MOD 110 R/M SIB, displ FBSTP packed-decimal
- DF 1101 1111 MOD 111 R/M SIB, displ FISTP long-integer
- DF 1101 1111 1100 0 REG reserved
- DF 1101 1111 1100 1 REG reserved
- DF 1101 1111 1101 0 REG reserved
- DF 1101 1111 1101 1 REG reserved
- DF 1101 1111 1110 0000 FSTSW AX
- DF 1101 1111 1110 0001 reserved
- DF 1101 1111 1110 001- reserved
- DF 1101 1111 1110 01-- reserved
- DF 1101 1111 1110 1--- reserved
- DF 1101 1111 1111 ---- reserved
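-
- Reading the table requires splitting the second byte into its MOD, middle
- three-bit, and R/M fields. The Python sketch below shows that split; the
- function names split_modrm and describe_esc are hypothetical, and the
- sketch only distinguishes memory forms from register or no-operand forms
- without looking up the mnemonic.

```python
def split_modrm(second_byte: int):
    """Split the second opcode byte into (mod, reg, rm) bit fields."""
    mod = (second_byte >> 6) & 0b11
    reg = (second_byte >> 3) & 0b111
    rm = second_byte & 0b111
    return mod, reg, rm

def describe_esc(first_byte: int, second_byte: int) -> str:
    """Classify a numerics instruction by its first two bytes."""
    if first_byte & 0xF8 != 0xD8:
        raise ValueError("first byte is not in the range D8-DF")
    mod, reg, rm = split_modrm(second_byte)
    if mod != 0b11:
        # MOD 000 R/M through MOD 111 R/M rows: operand is in memory.
        return f"{first_byte:02X} /{reg} memory operand"
    # Rows whose second byte starts with binary 11.
    return f"{first_byte:02X} {second_byte:02X} register or no-operand form"

print(describe_esc(0xDA, 0x4D))  # FIMUL short-integer row: DA MOD 001 R/M
print(describe_esc(0xD9, 0xFB))  # FSINCOS row: D9 1111 1011
```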
-
-
- Appendix B Exception Summary
-
- ────────────────────────────────────────────────────────────────────────────
-
- The following table lists the instruction mnemonics in alphabetical order.
- For each mnemonic, it summarizes the exceptions that the instruction may
- cause. When writing 80387 programs that may be used in an environment that
- employs numerics exception handlers, assembly-language programmers should be
- aware of the possible exceptions for each instruction in order to determine
- the need for exception synchronization. Chapter 4 explains the need for
- exception synchronization.
-
- ╓┌───────────────────┌───────────────────────┌────┌───┌───┌───┌───┌───┌──────╖
- Mnemonic Instruction IS I D Z O U P
-
-
-
-
-
- F2XM1 2^(X) - 1 Y Y Y Y Y
- FABS Absolute value Y
- FADD(P) Add real Y Y Y Y Y Y
- FBLD BCD load Y
- FBSTP BCD store and pop Y Y Y
- FCHS Change sign Y
- FCLEX Clear exceptions
- FCOM(P)(P) Compare real Y Y Y
- FCOS Cosine Y Y Y Y Y
- FDECSTP Decrement stack pointer
- FDIV(R)(P) Divide real Y Y Y Y Y Y Y
- FFREE Free register
- FIADD Integer add Y Y Y Y Y Y
- FICOM(P) Integer compare Y Y Y
- FIDIV Integer divide Y Y Y Y Y Y
- FIDIVR Integer divide reversed Y Y Y Y Y Y Y
- FILD Integer load Y
- FIMUL Integer multiply Y Y Y Y Y Y
- FINCSTP Increment stack pointer
- FINIT Initialize processor
- FIST(P) Integer store Y Y Y
- FISUB(R) Integer subtract Y Y Y Y Y Y
- FLD extended
- or stack Load real Y
- FLD single
- or double Load real Y Y Y
- FLD1 Load + 1.0 Y
- FLDCW Load Control word Y Y Y Y Y Y Y
- FLDENV Load environment Y Y Y Y Y Y Y
- FLDL2E Load log{2}e Y
- FLDL2T Load log{2}10 Y
- FLDLG2 Load log{10}2 Y
- FLDLN2 Load log{e}2 Y
- FLDPI Load π Y
- FLDZ Load + 0.0 Y
- FMUL(P) Multiply real Y Y Y Y Y Y
- FNOP No operation
- FPATAN Partial arctangent Y Y Y Y Y
- FPREM Partial remainder Y Y Y Y
- FPREM1 IEEE partial remainder Y Y Y Y
- FPTAN Partial tangent Y Y Y Y Y
- FRNDINT Round to integer Y Y Y Y
- FRSTOR Restore state Y Y Y Y Y Y Y
- FSAVE Save state
- FSCALE Scale Y Y Y Y Y Y
- FSIN Sine Y Y Y Y Y
- FSINCOS Sine and cosine Y Y Y Y Y
- FSQRT Square root Y Y Y Y
- FST(P) stack
- or extended Store real Y
- FST(P) single
- or double Store real Y Y Y Y Y Y
- FSTCW Store control word
- FSTENV Store Environment
- FSTSW (AX) Store status word
- FSUB(R)(P) Subtract real Y Y Y Y Y Y
- FTST Test Y Y Y
- FUCOM(P)(P) Unordered compare real Y Y Y
- FWAIT CPU Wait
- FXAM Examine
- FXCH Exchange registers Y
- FXTRACT Extract Y Y Y Y
- FYL2X Y * log{2}X Y Y Y Y Y Y Y
- FYL2XP1 Y * log{2}(X + 1) Y Y Y Y Y
-
-
- Appendix C Compatibility Between the 80387 and the 80287/8087
-
- ───────────────────────────────────────────────────────────────────────────
-
- This appendix summarizes the differences between the 80387 and its
- predecessors the 80287 and the 8087, and analyzes the impact of these
- differences on software that must be transported from the 80287 or 8087 to
- the 80387. Any migration from the 8087 directly to the 80387 must also take
- into account the additional differences between the 8087 and the 80387 as
- listed in Appendix D of this manual.
-
- ╓┌───────────────┌────────────────────────────────┌──────────────────────────
- ┌────────────────────Difference Description──────────────────
- Issue 80387 Behavior 8087/80287 Behavior
-
- C.1 INITIALIZATION SEQUENCE
-
- RESET, After a hardware RESET, No difference between
- FINIT, the ERROR# output is RESET and FINIT.
- and asserted to indicate that an
- ERROR# 80387 is present. To
- PIN accomplish this, the IE and
- ES bits of the status word
- are set, and the IM bit in
- the control word is reset.
- After FINIT, the status
- word and the control word
- have the same values as in
- an 80287/8087 after
- RESET.
-
- C.2 DATA TYPES AND EXCEPTION HANDLING
-
- NaN              The 80387 distinguishes         The 80287/8087 only
-                  between signaling NaNs          generates one kind of NaN
-                  and quiet NaNs. The 80387       (the equivalent of a quiet
-                  only generates quiet NaNs.      NaN) but raises an
-                  An invalid-operation            invalid-operation exception
-                  exception is raised only        upon encountering any kind
-                  upon encountering a             of NaN.
-                  signaling NaN (except for
-                  FCOM, FIST, and FBSTP
-                  which also raise IE for
-                  quiet NaNs).
-
- Pseudozero,      The 80387 neither               The 80287/8087 defines
- Pseudo-NaN,      generates nor supports these    and supports special
- Pseudoinfinity,  formats; it raises an           handling for these formats.
- and Unnormal     invalid-operation exception
- Formats          whenever it encounters
-                  them in an arithmetic
-                  operation.
-
- Tag Word         The encoding in the tag         The encoding for
- Bits for         word for the unsupported        pseudozero and unnormal is
- Unsupported      data formats mentioned in       "valid" (type 00); the
- Data             Section C.2.2 is "special       others are "special data"
- Formats          data" (type 10).                (type 10).
-
- Invalid-         No invalid-operation            Upon encountering a
- Operation        exception is raised upon        denormal in FSQRT, FDIV,
- Exception        encountering a denormal in      or FPREM or upon
-                  FSQRT, FDIV, or FPREM           conversion to BCD or to
-                  or upon conversion to           integer, the invalid-
-                  BCD or to integer. The          operation exception is
-                  operation proceeds by first     raised.
-                  normalizing the value.
-
- Denormal         The denormal exception is       The denormal exception is
- Exception        raised in transcendental        not raised in transcendental
-                  instructions and FXTRACT.       instructions and FXTRACT.
-
- Overflow         Overflow exception              Overflow exception
- Exception        masked.                         masked.
-                  If the rounding mode is set     The 80287/8087 does not
-                  to chop (toward zero), the      signal the overflow
-                  result is the most positive     exception when the masked
-                  or most negative number.        response is not infinity;
-                                                  i.e., it signals overflow
-                                                  only when the rounding
-                                                  control is not set to round
-                                                  to zero. If rounding is set
-                                                  to chop (toward zero), the
-                                                  result is positive or
-                                                  negative infinity.
-
-                  Overflow exception not          Overflow exception not
-                  masked.                         masked.
-                  The precision exception is      The precision exception is
-                  flagged. When the result is     not flagged and the
-                  stored in the stack, the        significand is not rounded.
-                  significand is rounded
-                  according to the precision
-                  control (PC) bit of the
-                  control word or according
-                  to the opcode.
-
- Underflow        Conditions for underflow.       Conditions for underflow.
- Exception        When the underflow              When the underflow
-                  exception is masked, the        exception is masked and
- Two related      underflow exception is          rounding is toward zero, the
- events           signaled when both the          underflow exception flag is
- contribute to    result is tiny and              raised on tininess,
- underflow:       denormalization results         regardless of loss of
-                  in a loss of accuracy.          accuracy.
- 1. The creation
- of a tiny        Response to underflow.          Response to underflow.
- result. A tiny   When the underflow              When the underflow
- number, because  exception is unmasked           exception is not masked and
- it is so small,  and the instruction is          the destination is the
- may cause some   supposed to store the           stack, the significand is
- other exception  result on the stack, the        not rounded but rather is
- later (such as   significand is rounded to       left as is.
- overflow upon    the appropriate precision
- division).       (according to the precision
-                  control (PC) bit of the
- 2. Loss of       control word, for those
- accuracy during  instructions controlled by
- the              PC, otherwise to extended
- denormalization  precision).
- of a tiny
- number.
-
- Which of these
- events triggers
- the underflow
- exception
- depends on
- whether the
- underflow
- exception is
- masked.
-
- Exception        There is no difference in       When the denormal
- Precedence       the precedence of the           exception is not masked,
-                  denormal exception,             it takes precedence over
-                  whether it be masked or         all other exceptions.
-                  not.
-
- C.3 TAG, STATUS, AND CONTROL WORDS
-
- Bits C3-C0 of    After FINIT, incomplete         After FINIT, incomplete
- Status Word      FPREM, and hardware             FPREM, and hardware
-                  reset, the 80387 sets these     reset, the 80287/8087
-                  bits to zero.                   leaves these bits intact
-                                                  (they contain the prior
-                                                  value).
-
- Bit C2 of        Bit 10 (C2) serves as an        This bit is undefined for
- Status Word      incomplete bit for FPTAN.       FPTAN.
-
- Infinity         Only affine closure is          Both affine and projective
- Control          supported. Bit 12 remains       closures are supported.
-                  programmable but has no         After RESET, the default
-                  effect on 80387 operation.      value in the control word is
-                                                  projective.
-
- Status Word      When an invalid-operation       When an invalid-operation
- Bit 6 for        exception occurs due to         exception occurs due to
- Stack Fault      stack overflow or               stack overflow or underflow,
-                  underflow, not only is bit 0    only bit 0 (IE) of the
-                  (IE) of the status word set,    status word is set. Bit 6 is
-                  but also bit 6 is set to        RESERVED.
-                  indicate a stack fault and
-                  bit 9 (C1) specifies overflow
-                  or underflow. Bit 6 is called
-                  SF and serves to distinguish
-                  invalid exceptions caused by
-                  stack overflow/underflow from
-                  those caused by numeric
-                  operations.
-
- Tag Word         When loading the tag word       The corresponding tag is
-                  with an FLDENV or               checked before each
-                  FRSTOR instruction, the         register access to determine
-                  only interpretations of tag     the class of operand in the
-                  values used by the 80387        register; the tag is updated
-                  are empty (value 11) and        after every change to a
-                  nonempty (values 00, 01,        register so that the tag
-                  and 10). Subsequent             always reflects the most
-                  operations on a nonempty        recent status of the
-                  register always examine         register. Programmers can
-                  the value in the register,      load a tag with a value that
-                  not the value in its tag.       disagrees with the contents
-                  The FSTENV and FSAVE            of a register (for example,
-                  instructions examine the        the register contains valid
-                  nonempty registers and          contents, but the tag says
-                  put the correct values in       special; the 80287/8087, in
-                  the tags before storing the     this case, honors the tag
-                  tag word.                       and does not examine the
-                                                  register).
-
- C.4 INSTRUCTION SET
-
- FBSTP, FDIV,     Operation on denormal           Operation on denormal
- FIST(P), FPREM,  operand is supported. An        operand raises
- FSQRT            underflow exception can         invalid-operation exception.
-                  occur.                          Underflow is not possible.
-
- FSCALE           The range of the scaling        The range of the scaling
-                  operand is not restricted.      operand is restricted. If 0 <
-                  If 0 < │ST(1)│ < 1, the         │ST(1)│ < 1, the result is
-                  scaling factor is zero;         undefined and no exception
-                  therefore, ST(0) remains        is signaled.
-                  unchanged. If the rounded
-                  result is not exact or if
-                  there was a loss of
-                  accuracy (masked underflow),
-                  the precision exception
-                  is signaled.
-
- FPREM1           Performs partial remainder      Does not exist.
-                  according to IEEE
-                  Standard 754.
-
- FPREM            Bits C0, C3, C1 of the          The quotient bits are
-                  status word correctly           incorrect when performing a
-                  reflect the three low-order     reduction of 64^(N) + M when
-                  bits of the quotient.           N ≥ 1 and M=1 or M=2.
-
- FUCOM, FUCOMP,   Perform unordered               Do not exist.
- FUCOMPP          compare according to
-                  IEEE Standard 754.
-
- FPTAN            Range of operand is much        Range of operand is
-                  less restricted (│ST(0)│ <      restricted (│ST(0)│ < π/4);
-                  2^(63)); reduces operand        operand must be reduced
-                  internally using an internal    to range using FPREM.
-                  π/4 constant that is more
-                  accurate.
-
-                  After a stack overflow          After a stack overflow
-                  when the invalid-operation      when the invalid-operation
-                  exception is masked, both       exception is masked, the
-                  ST and ST(1) contain quiet      original operand remains
-                  NaNs.                           unchanged, but is pushed
-                                                  to ST(1).
-
- FSIN, FCOS,      Perform three common            Do not exist.
- FSINCOS          trigonometric functions.
-
- FPATAN           Range of operands is            │ST(0)│ must be smaller
-                  unrestricted.                   than │ST(1)│.
-
- F2XM1            Wider range of operand          The supported operand
-                  (-1 ≤ ST(0) ≤ +1).              range is 0 ≤ ST(0) ≤ 0.5.
-
- FLD              Does not report denormal        Reports denormal exception.
- extended-real    exception because the
-                  instruction is not arithmetic.
-
- FXTRACT          If the operand is zero, the     If the operand is zero,
-                  zero-divide exception is        ST(1) is zero and no
-                  reported and ST(1) is -∞.       exception is reported. If
-                  If the operand is +∞, no        the operand is +∞, the
-                  exception is reported.          invalid-operation exception
-                                                  is reported.
-
- FLD constant     Rounding control is in          Rounding control is not in
-                  effect.                         effect.
-
- FLD              Loading a denormal              Loading a denormal causes
- single/double    causes the number to be         the number to be converted
- precision        converted to extended           to an unnormal.
-                  precision (because it is put
-                  on the stack).
-
- FLD              When loading a signaling        Does not raise an
- single/double    NaN, raises invalid exception.  exception when loading a
- precision                                        signaling NaN.
-
- FSETPM           Treated as FNOP (no             Informs the 80287 that the
-                  operation).                     system is in protected
-                                                  mode.
-
- FXAM             When encountering an            May generate these
-                  empty register, the 80387       combinations, among others.
-                  will not generate
-                  combinations of C3-C0 equal to
-                  1101 or 1111.
-
- All              May generate different          Round-up bit of status
- Transcendental   results in round-up bit of      word is undefined for these
- Instructions     status word.                    instructions.
-
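- The status-word rows in section C.3 of the table (bit 6 as the stack-fault
- flag, C1 as the overflow/underflow indicator) and the FPREM row in C.4
- (C0, C3, C1 as quotient bits) can be illustrated with a small decoder. This
- is only a sketch in Python, not Intel code. The function and field names
- are hypothetical, and the bit positions of C0-C3 and the rule that C1 = 1
- means stack overflow are assumptions taken from the usual status word
- layout rather than from this appendix.

```python
def decode_status_word(sw):
    """Pick out the status-word fields that sections C.3 and C.4 discuss."""
    def bit(n):
        return (sw >> n) & 1

    # Assumed layout: C0 = bit 8, C1 = bit 9, C2 = bit 10, C3 = bit 14.
    c0, c1, c2, c3 = bit(8), bit(9), bit(10), bit(14)
    fields = {
        "IE": bit(0),   # invalid-operation flag
        "SF": bit(6),   # stack fault on the 80387; reserved on the 8087/80287
        "C3-C0": (c3, c2, c1, c0),
        # After a completed FPREM, C0, C3, C1 hold quotient bits 2, 1, 0.
        "quotient_low3": (c0 << 2) | (c3 << 1) | c1,
        "stack_fault": None,
    }
    if fields["IE"] and fields["SF"]:
        # Assumed encoding: C1 = 1 means stack overflow, C1 = 0 underflow.
        fields["stack_fault"] = "overflow" if c1 else "underflow"
    return fields
```

- On an 8087/80287 image the "stack_fault" entry carries no meaning, because
- bit 6 is reserved there.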
-
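- The 80387 behavior described in the FSCALE row can be approximated in a
- few lines. The sketch below is an illustration under stated assumptions:
- it uses Python double-precision floats instead of the extended format, it
- models no exception flags, and the function name is hypothetical.

```python
import math

def fscale_80387(st0, st1):
    """Approximate 80387 FSCALE: ST(0) * 2**(ST(1) chopped toward zero)."""
    scale = math.trunc(st1)   # 0 < |ST(1)| < 1 gives a scaling factor of zero
    return math.ldexp(st0, scale)
```

- With a scaling operand such as 0.75 the result equals ST(0), which is the
- case the 8087/80287 leaves undefined.
-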
- Appendix D Compatibility Between the 80387 and the 8087
-
- ───────────────────────────────────────────────────────────────────────────
-
- The 80386/80387 operating in real-address mode will execute 8087 programs
- without major modification. However, because of differences in the handling
- of numeric exceptions between the 80387 NPX and the 8087 NPX,
- exception-handling routines may need to be changed.
-
- This appendix summarizes the additional differences between the 80387 NPX
- and the 8087 NPX (other than those already included in Appendix C), and
- provides details showing how 8087 programs can be ported to the 80387.
-
- 1. The 80387 signals exceptions through a dedicated ERROR# line to
- the 80386; no interrupt controller is needed for this purpose. The
- 8087 requires an interrupt controller (8259A) to interrupt the CPU
- when an unmasked exception occurs. Therefore, any
- interrupt-controller-oriented instructions in numeric exception
- handlers for the 8087 should be deleted.
-
- 2. The 8087 instructions FENI/FNENI and FDISI/FNDISI perform no useful
- function in the 80387. If the 80387 encounters one of these opcodes in
- its instruction stream, the instruction will effectively be
- ignored──none of the 80387 internal states will be updated. While 8087
- code containing these instructions may be executed on the 80387, it
- is unlikely that the exception-handling routines containing these
- instructions will be completely portable to the 80387.
-
- 3. In real mode and protected mode (not including virtual 8086 mode),
- interrupt vector 16 must point to the numeric exception handling
- routine. In virtual 8086 mode, the V86 monitor can be programmed to
- accommodate a different location of the interrupt vector for numeric
- exceptions.
-
- 4. The ESC instruction address saved in the 80386/80387 or 80386/80287
- includes any leading prefixes before the ESC opcode. The corresponding
- address saved in the 8086/8087 does not include leading prefixes.
-
- 5. In protected mode (not including virtual 8086 mode), the format of
- the 80387's saved instruction and address pointers is different than
- for the 8087. The instruction opcode is not saved in protected
- mode──exception handlers will have to retrieve the opcode from memory
- if needed.
-
- 6. Interrupt 7 will occur in the 80386 when executing ESC instructions
- with either TS (task switched) or EM (emulation) of the 80386 MSW set
- (TS=1 or EM=1). If TS is set, then a WAIT instruction will also cause
- interrupt 7. An exception handler should be included in 80387 code to
- handle these situations.
-
- 7. Interrupt 9 will occur if the second or subsequent words of a
- floating-point operand fall outside a segment's size. Interrupt 13
- will occur if the starting address of a numeric operand falls outside
- a segment's size. An exception handler should be included to report
- these programming errors.
-
- 8. Except for the processor control instructions, all of the 80387
- numeric instructions are automatically synchronized by the 80386
- CPU──the 80386 automatically waits until all operands have been
- transferred between the 80386 and the 80387 before executing the
- next ESC instruction. No explicit WAIT instructions are required to
- assure this synchronization. For the 8087 used with 8086 and 8088
- processors, explicit WAITs are required before each numeric
- instruction to ensure synchronization. Although 8087 programs having
- explicit WAIT instructions will execute perfectly on the 80387
- without reassembly, these WAIT instructions are unnecessary.
-
- 9. Since the 80387 does not require WAIT instructions before each
- numeric instruction, the ASM386 assembler does not automatically
- generate these WAIT instructions. The ASM86 assembler, however,
- automatically precedes every ESC instruction with a WAIT
- instruction. Although numeric routines generated using the ASM86
- assembler will generally execute correctly on the 80386/80387,
- reassembly using ASM386 may result in a more compact code image and
- faster execution.
-
- The processor control instructions for the 80387 may be coded using
- either a WAIT or No-WAIT form of mnemonic. The WAIT forms of these
- instructions cause ASM386 to precede the ESC instruction with a CPU
- WAIT instruction, in the identical manner as does ASM86.
-
- 10. The address of a memory operand stored by FSAVE or FSTENV is
- undefined if the previous ESC instruction did not refer to memory.
-
- 11. Because the 80387 automatically normalizes denormal numbers when
- possible, an 8087 program that uses the denormal exception solely to
- normalize denormal operands can run on an 80387 by masking the
- denormal exception. The 8087 denormal exception handler would not be
- used by the 80387 in this case. A numerics program runs faster when
- the 80387 performs normalization of denormal operands. A program can
- detect at run-time whether it is running on an 80387 or 8087/80287 and
- disable the denormal exception when an 80387 is used.
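-
- The interrupt 7 conditions in item 6 reduce to two bit tests on the 80386
- machine status word. The following Python sketch is illustrative only;
- the function names are hypothetical. The bit positions (MP = bit 1,
- EM = bit 2, TS = bit 3) and the requirement that MP also be set for WAIT
- to trap are not stated in item 6; they come from the 80386 architecture.

```python
MP, EM, TS = 1 << 1, 1 << 2, 1 << 3   # assumed MSW bit positions

def esc_raises_int7(msw):
    """An ESC instruction traps when either EM or TS is set."""
    return bool(msw & (EM | TS))

def wait_raises_int7(msw):
    """WAIT traps when TS is set (and, on the 80386, MP is set as well)."""
    return bool(msw & TS) and bool(msw & MP)
```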
-
-
- Appendix E 80387 80-Bit CHMOS III Numeric Processor Extension
-
- For Advance Information on the Intel 80387 please consult Appendix E of the
- printed version of this book or the 80387 Data Sheet, order number 231920.
-
-
- Appendix F PC/AT-Compatible 80387 Connection
-
- ───────────────────────────────────────────────────────────────────────────
-
- The PC/AT uses a nonstandard scheme to report 80287 exceptions to the
- 80286. When replicating the PC/AT coprocessor interface in 80386-based
- systems, the PC/AT interface cannot be used in exactly the same way;
- however, this appendix outlines a similar interface that works on
- 80386/80387 systems and maintains compatibility with the nonstandard PC/AT
- scheme.
-
- Note that the interface outlined here does not represent a new interface
- standard; it needs to be incorporated in AT-compatible designs only because
- the 80286 and 80287 in the PC/AT are not connected according to the
- standards defined by Intel. The standard 80386/80387 connection recommended
- by Intel in the 80387 Data Sheet functions properly; the 80386
- implementation has not been and will not be altered.
-
-
- F.1 The PC/AT Interface
-
- In the PC/AT, the ERROR# input to the 80286 is tied inactive (high)
- permanently. The ERROR# output of the 80287 is tied to an interrupt port
- (IRQ13). This interrupt replaces exception signaling via the 80286's ERROR#
- input. To guarantee (in the case of an 80287 exception) that INTR 13 will be
- serviced prior to the execution of any further 80287 instructions, an
- edge-triggered flip-flop latches BUSY# using ERROR# as a clock. The output
- of this latch is ORed with the BUSY# output of the 80287 and drives the
- BUSY# input of the 80286. This PC/AT scheme effectively delays deactivation
- of BUSY# at the 80286 whenever an 80287 ERROR# is signaled.
-
- Since the 80286 BUSY# input remains active after an exception, the 80286
- interrupt 13 handler is guaranteed to execute before any other 80287
- instructions may begin. The interrupt 13 handler clears the BUSY# latch (via
- a write to a special I/O port), thus allowing execution of 80287
- instructions to proceed. The interrupt 13 handler then branches to the NMI
- handler, where the user-defined numerics exception handler resides in
- PC-compatible systems.
-
- The use of an interrupt guarantees that an exception from a coprocessor
- instruction will be detected. Latching BUSY# guarantees that any coprocessor
- instruction (except FINIT, FSETPM, and FCLEX) following the instruction that
- raised the exception will not be executed before the NMI handler is
- executed.
-
- This PC/AT scheme approximates the exception reporting scheme between the
- 8087 and 8088 in the original PC.
-
-
- F.2 How to Achieve the Same Effect in an 80386 System
-
- The 80386 can use a PC/AT-compatible interface to communicate with an 80387
- provided that, when an NPX exception occurs, BUSY# active time is extended
- and PEREQ is reactivated only after 80387 BUSY# has gone inactive. The 80387
- is left active (tying STEN high) at all times. Also, the 80386 and 80387
- must be reset by the same RESET signal.
-
- The reactivation of PEREQ for the 80386 is needed for store instructions
- (for example, FST mem) because the 80387 drops PEREQ once it signals an
- exception. While the 80386 has not yet recognized the occurrence of the
- exception, it still expects the data transfers to complete via PEREQ
- reactivation. It is permissible for the 80386 to receive undefined data
- during such I/O read cycles. Disabling the 80387 is not necessary, because
- the dummy data-transfer cycles directed to the 80387 when PEREQ is
- externally reactivated for the 80386 will not disturb the operation of the
- 80387. The interrupt 13 handler should remove the extension of BUSY# and
- reactivation of PEREQ via a write to PC/AT-compatible hardware at I/O port
- F0H.
-
-
- Glossary of 80387 and Floating-Point Terminology
-
- ───────────────────────────────────────────────────────────────────────────
-
- This glossary defines many terms that have precise technical meanings as
- specified in the IEEE 754 Standard or as specified in this manual. Where
- these terms are used, they have been italicized to emphasize the precision
- of their meanings.
-
- Base
- (1) a term used in logarithms and exponentials. In both contexts, it is a
- number that is being raised to a power. The two equations (y = log base b
- of x) and (b^(y) = x) are the same.
-
- Base
- (2) a number that defines the representation being used for a string of
- digits. Base 2 is the binary representation; base 10 is the decimal
- representation; base 16 is the hexadecimal representation. In each case,
- the base is the factor of increased significance for each succeeding
- digit (working up from the bottom).
-
- Bias
- a constant that is added to the true exponent of a real number to obtain
- the exponent field of that number's floating-point representation in the
- 80387. To obtain the true exponent, you must subtract the bias from the
- given exponent. For example, the single real format has a bias of 127
- whenever the given exponent is nonzero. If the 8-bit exponent field
- contains 10000011, which is 131, the true exponent is 131-127, or +4.
-
- Biased Exponent
- the exponent as it appears in a floating-point representation of a number.
- The biased exponent is interpreted as an unsigned, positive number. In the
- above example, 131 is the biased exponent.
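-
- The arithmetic in the two entries above (131 - 127 = +4) can be checked
- against a stored number. This is a minimal Python sketch; the function
- name is hypothetical, and it relies on the standard struct module to
- produce the single real bit pattern.

```python
import struct

def single_exponents(x):
    """Return (biased, true) exponent of x stored in the single real format."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    biased = (bits >> 23) & 0xFF          # the 8-bit exponent field
    if biased == 0:
        raise ValueError("zero or denormal: the bias of 127 does not apply")
    return biased, biased - 127
```

- For 16.0, which is 2 raised to the power 4, the exponent field holds 131.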
-
- Binary Coded Decimal
- a method of storing numbers that retains a base 10 representation. Each
- decimal digit occupies 4 full bits (one hexadecimal digit). The
- hexadecimal values A through F (1010 through 1111) are not used. The
- 80387 supports a packed decimal format that consists of 9 bytes of binary
- coded decimal (18 decimal digits) and one sign byte.
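-
- As an illustration of the layout just described, the sketch below packs
- an integer into the 10-byte format. It is not taken from the manual: the
- function name is hypothetical, and the byte order (least significant
- digit pair at the lowest address, sign in bit 7 of the last byte) is an
- assumption based on the usual memory image of this format.

```python
def to_packed_decimal(n):
    """Encode an integer as 10 bytes in memory order (lowest address first)."""
    if abs(n) >= 10**18:
        raise ValueError("packed decimal holds at most 18 digits")
    digits = f"{abs(n):018d}"                       # most significant digit first
    pairs = [digits[i:i + 2] for i in range(0, 18, 2)]
    # Two decimal digits per byte, least significant pair first.
    body = bytes(int(p[0]) << 4 | int(p[1]) for p in reversed(pairs))
    sign = 0x80 if n < 0 else 0x00                  # assumed: sign is bit 7
    return body + bytes([sign])
```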
-
- Binary Point
- an entity just like a decimal point, except that it exists in binary
- numbers. Each binary digit to the right of the binary point is multiplied
- by an increasing negative power of two.
-
- C3──C0
- the four "condition code" bits of the 80387 status word. These bits are
- set to certain values by the compare, test, examine, and remainder
- functions of the 80387.
-
- Characteristic
- a term used for some non-Intel computers, meaning the exponent field of a
- floating-point number.
-
- Chop
- to set one or more low-order bits of a real number to zero, yielding the
- nearest representable number in the direction of zero.
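-
- Chopping can be demonstrated on the bit pattern of a number. The sketch
- below works on the double format through Python's struct module; the
- function name and the choice of the double format are assumptions made
- for illustration, and nbits is expected to be at most 52.

```python
import struct

def chop_low_bits(x, nbits):
    """Zero the nbits low-order significand bits of a double."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    bits &= ~((1 << nbits) - 1)           # clear the low-order bits
    (y,) = struct.unpack(">d", struct.pack(">Q", bits))
    return y
```

- Because only magnitude bits are cleared, the result moves toward zero for
- positive and negative inputs alike.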
-
- Condition Code
- the four bits of the 80387 status word that indicate the results of the
- compare, test, examine, and remainder functions of the 80387.
-
- Control Word
- a 16-bit 80387 register that the user can set, to determine the modes of
- computation the 80387 will use and the exception interrupts that will be
- enabled.
-
- Denormal
- a special form of floating-point number. On the 80387, a denormal is
- defined as a number that has a biased exponent of zero. By providing a
- significand with leading zeros, the range of possible negative
- exponents can be extended by the number of bits in the significand.
- Each leading zero is a bit of lost accuracy, so the extended exponent
- range is obtained by reducing significance.
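-
- The trade of accuracy for range can be made concrete by counting the
- leading zeros of a denormal significand. This sketch uses the single
- format for brevity; the function name is hypothetical, and zero is
- treated as not denormal, which is a choice made here rather than
- something the definition above states.

```python
import struct

def denormal_lost_bits(x):
    """Leading zero bits in the 24-bit significand of a single denormal."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    biased = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    if biased != 0 or fraction == 0:
        return None                       # normal, zero, infinity, or NaN
    # The integer bit of a denormal is zero, so it counts as a lost bit too.
    return 24 - fraction.bit_length()
```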
-
- Double Extended
- the Standard's term for the 80387's extended format, with more exponent
- and significand bits than the double format and an explicit integer bit
- in the significand.
-
- Double Format
- a floating-point format supported by the 80387 that consists of a sign, an
- 11-bit biased exponent, an implicit integer bit, and a 52-bit
- significand──a total of 64 explicit bits.
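-
- The field widths above can be checked by splitting a stored value. This
- is a short Python sketch with a hypothetical function name; the bias of
- 1023 that appears in the note after the code is the standard value for
- this format and is not stated in the entry itself.

```python
import struct

def double_fields(x):
    """Split a double into (sign, 11-bit biased exponent, 52-bit fraction)."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    sign = bits >> 63
    biased_exponent = (bits >> 52) & 0x7FF
    fraction = bits & ((1 << 52) - 1)     # the implicit integer bit is not stored
    return sign, biased_exponent, fraction
```

- For -2.0 the fields come out as sign 1, biased exponent 1024 (true
- exponent 1 under a bias of 1023), and a zero fraction.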
-
- Environment
- the 14 or 28 (depending on addressing mode) bytes of 80387 registers
- affected by the FSTENV and FLDENV instructions. It encompasses the entire
- state of the 80387, except for the 8 registers of the 80387 stack.
- Included are the control word, status word, tag word, and the instruction,
- opcode, and operand information provided by interrupts.
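-
- A program that receives an environment image can recover the first three
- words without knowing the pointer layout. The sketch below assumes the
- common arrangement in which the control, status, and tag words come first,
- as consecutive 16-bit words in the 14-byte form and as the low halves of
- consecutive 32-bit slots in the 28-byte form. The function name is
- hypothetical and the pointer fields are deliberately left undecoded.

```python
import struct

def environment_words(image):
    """Return (control, status, tag) from an FSTENV memory image."""
    if len(image) == 14:                  # 16-bit format
        return struct.unpack_from("<3H", image, 0)
    if len(image) == 28:                  # 32-bit format
        cw, sw, tw = struct.unpack_from("<3I", image, 0)
        return cw & 0xFFFF, sw & 0xFFFF, tw & 0xFFFF
    raise ValueError("expected a 14- or 28-byte environment image")
```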
-
- Exception
- any of the six conditions (invalid operation, denormal, numeric overflow,
- numeric underflow, zero-divide, and precision) detected by the 80387 that
- may be signaled by status flags or by traps.
-
- Exception Pointers
- The data maintained by the 80386 to help exception handlers identify
- the cause of an exception. This data consists of a pointer to the most
- recently executed ESC instruction and a pointer to the memory operand of
- this instruction, if it had a memory operand. An exception handler can use
- the FSTENV and FSAVE instructions to access these pointers.
-
- Exponent
- (1) any number that indicates the power to which another number is raised.
-
- Exponent
- (2) the field of a floating-point number that indicates the magnitude of
- the number. This would fall under the above more general definition (1),
- except that a bias sometimes needs to be subtracted to obtain the correct
- power.
-
- Extended Format
- the 80387's implementation of the Standard's double extended format.
- Extended format is the main floating-point format used by the 80387.
- It consists of a sign, a 15-bit biased exponent, and a significand with an
- explicit integer bit and 63 fractional-part bits.
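-
- The fields listed above are enough to decode an extended-format value
- exactly. The following Python sketch uses the fractions module so that no
- precision is lost; the function name is hypothetical, and the exponent
- bias of 16383 and the little-endian memory order are assumptions not
- stated in this entry.

```python
from fractions import Fraction

def extended_to_fraction(image):
    """Decode a 10-byte extended-format image (lowest address first)."""
    significand = int.from_bytes(image[:8], "little")  # integer bit + 63 fraction bits
    sign_exp = int.from_bytes(image[8:], "little")
    sign = -1 if sign_exp >> 15 else 1
    biased = sign_exp & 0x7FFF                         # 15-bit biased exponent
    if biased == 0x7FFF:
        raise ValueError("infinity or NaN")
    exponent = (biased or 1) - 16383                   # biased 0 uses the minimum exponent
    return sign * Fraction(significand, 1 << 63) * Fraction(2) ** exponent
```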
-
- Floating-Point
- of or pertaining to a number that is expressed as a base, a sign, a
- significand, and a signed exponent. The value of the number is the signed
- product of its significand and the base raised to the power of the
- exponent. Floating-point representations are more versatile than integer
- representations in two ways. First, they include fractions. Second, their
- exponent parts allow a much wider range of magnitude than possible with
- fixed-length integer representations.
-
- Gradual Underflow
- a method of handling the underflow error condition that minimizes the loss
- of accuracy in the result. If there is a denormal number that represents
- the correct result, that denormal is returned. Thus, digits are lost only
- to the extent of denormalization. Most computers return zero when
- underflow occurs, losing all significant digits.
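-
- Gradual underflow can be observed on any machine whose floating-point
- arithmetic follows the Standard. The sketch below assumes that Python
- floats are IEEE double-format numbers with denormals enabled, which holds
- on common platforms but is not guaranteed; the function name is
- hypothetical.

```python
import sys

def denormal_steps():
    """Halve the smallest normal double until the next halving gives zero."""
    x, steps = sys.float_info.min, 0      # 2**-1022, the smallest normal
    while x / 2 > 0.0:
        x /= 2                            # each step gives up one significand bit
        steps += 1
    return steps, x
```

- Under that assumption the loop yields 52 nonzero denormal results before
- the value finally underflows to zero. A machine without gradual underflow
- would reach zero on the first halving.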
-
- Implicit Integer Bit
- a part of the significand in the single real and double real formats
- that is not explicitly given. In these formats, the entire given
- significand is considered to be to the right of the binary point. A single
- implicit integer bit to the left of the binary point is always one, except
- in one case. When the exponent is the minimum (biased exponent is zero),
- the implicit integer bit is zero.
-
- Indefinite
- a special value that is returned by functions when the inputs are such
- that no other sensible answer is possible. For each floating-point format
- there exists one quiet NaN that is designated as the indefinite value. For
- binary integer formats, the negative number furthest from zero is often
- considered the indefinite value. For the 80387 packed decimal format, the
- indefinite value contains all 1's in the sign byte and the uppermost
- digits byte.
-
- Inexact
- The Standard's term for the 80387's precision exception.
-
- Infinity
- a value that has greater magnitude than any integer or any real number. It
- is often useful to consider infinity as another number, subject to special
- rules of arithmetic. All three Intel floating-point formats provide
- representations for +∞ and -∞.
-
- Integer
- a number (positive, negative, or zero) that is finite and has no
- fractional part. Integer can also mean the computer representation for
- such a number: a sequence of data bytes, interpreted in a standard way. It
- is perfectly reasonable for integers to be represented in a floating-point
- format; this is what the 80387 does whenever an integer is pushed onto the
- 80387 stack.
-
- Integer Bit
- a part of the significand in floating-point formats. In these formats, the
- integer bit is the only part of the significand considered to be to the
- left of the binary point. The integer bit is always one, except in one
- case: when the exponent is the minimum (biased exponent is zero), the
- integer bit is zero. In the extended format the integer bit is explicit;
- in the single format and double format the integer bit is implicit; i.e.,
- it is not actually stored in memory.
-
- Invalid Operation
- the exception condition for the 80387 that covers all cases not covered by
- other exceptions. Included are 80387 stack overflow and underflow, NaN
- inputs, illegal infinite inputs, out-of-range inputs, and inputs in
- unsupported formats.
-
- Long Integer
- an integer format supported by the 80387 that consists of a 64-bit two's
- complement quantity.
-
- Long Real
- an older term for the 80387's 64-bit double format.
-
- Mantissa
- a term used with some non-Intel computers for the significand of a
- floating-point number.
-
- Masked
- a term that applies to each of the six 80387 exceptions I, D, Z, O, U, P. An
- exception is masked if a corresponding bit in the 80387 control word is
- set to one. If an exception is masked, the 80387 will not generate an
- interrupt when the exception condition occurs; it will instead provide its
- own exception recovery.
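-
- Which exceptions are masked can be read directly from the control word.
- In this Python sketch the names are hypothetical, and the assignment of
- the six mask bits to control word bits 0 through 5 (in the order I, D, Z,
- O, U, P) is an assumption based on the usual control word layout.

```python
MASK_BITS = {"I": 0, "D": 1, "Z": 2, "O": 3, "U": 4, "P": 5}

def masked_exceptions(control_word):
    """Return the set of exceptions whose mask bit is one."""
    return {name for name, bit in MASK_BITS.items() if control_word >> bit & 1}
```

- A control word of 037FH, for example, masks all six exceptions, while
- 037EH leaves the invalid-operation exception unmasked.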
-
- Mode
- One of the control word fields "rounding control" and "precision control"
- which programs can set, sense, save, and restore to control the execution
- of subsequent arithmetic operations.
-
- NaN
- an abbreviation for "Not a Number"; a floating-point quantity that does
- not represent any numeric or infinite quantity. NaNs should be returned
- by functions that encounter serious errors. If created during a sequence
- of calculations, they are transmitted to the final answer and can contain
- information about where the error occurred.
-
- Normal
- the representation of a number in a floating-point format in which the
- significand has an integer bit of one (either explicit or implicit).
-
- Normalize
- convert a denormal representation of a number to a normal representation.
-
- NPX
- Numeric Processor Extension. This is the 80387, 80287, or 8087.
-
- Overflow
- an exception condition in which the correct answer is finite, but has
- magnitude too great to be represented in the destination format. This kind
- of overflow (also called numeric overflow) is not to be confused with
- stack overflow.
-
- Packed Decimal
- an integer format supported by the 80387. A packed decimal number is a
- 10-byte quantity, with nine bytes of 18 binary coded decimal digits and
- one byte for the sign.
-
- Pop
- to remove from a stack the last item that was placed on the stack.
-
- Precision
- The effective number of bits in the significand of the floating-point
- representation of a number.
-
- Precision Control
- an option, programmed through the 80387 control word, that allows all
- 80387 arithmetic to be performed with reduced precision. Because no
- speed advantage results from this option, its only use is for strict
- compatibility with the standard and with other computer systems.
-
- Precision Exception
- an 80387 exception condition that results when a calculation does not
- return an exact answer. This exception is usually masked and ignored; it
- is used only in extremely critical applications, when the user must know
- if the results are exact. The precision exception is called inexact
- in the standard.
-
- Pseudozero
- one of a set of special values of the extended real format. The set
- consists of numbers with a zero significand and an exponent that is
- neither all zeros nor all ones. Pseudozeros are not created by the 80387
- but are handled correctly when encountered as operands.
-
- Quiet NaN
- a NaN in which the most significant bit of the fractional part of the
- significand is one. By convention, these NaNs can undergo certain
- operations without causing an exception.
-
- Real
- any finite value (negative, positive, or zero) that can be represented by
- a (possibly infinite) decimal expansion. Reals can be represented as the
- points of a line marked off like a ruler. The term real can also refer
- to a floating-point number that represents a real value.
-
- Short Integer
- an integer format supported by the 80387 that consists of a 32-bit two's
- complement quantity. Short integer is not the shortest 80387 integer
- format──the 16-bit word integer is.
-
- Short Real
- an older term for the 80387's 32-bit single format.
-
- Signaling NaN
- a NaN that causes an invalid-operation exception whenever it enters into
- a calculation or comparison, even a nonordered comparison.
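-
- The quiet/signaling distinction can be read directly from the bit
- pattern. The sketch below does this for the 32-bit single format, where
- the most significant fraction bit is bit 22; classify_single_nan is a
- hypothetical helper, not an 80387 facility.
```python
def classify_single_nan(bits: int) -> str:
    """Classify a 32-bit single-format pattern as quiet, signaling, or neither."""
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    if exponent != 0xFF or fraction == 0:
        return "not a NaN"           # finite number or infinity
    # The top fraction bit set means quiet; otherwise the NaN is signaling.
    return "quiet" if fraction & 0x400000 else "signaling"


for pattern in (0x7FC00000, 0x7FA00000, 0x7F800000):
    print(hex(pattern), classify_single_nan(pattern))
```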
-
- Significand
- the part of a floating-point number that consists of the most significant
- nonzero bits of the number, if the number were written out in an unlimited
- binary format. The significand is composed of an integer bit and a
- fraction. The integer bit is implicit in the single format and double
- format. The significand is considered to have a binary point after the
- integer bit; the binary point is then moved according to the value of the
- exponent.
-
- Single Extended
- a floating-point format, required by the standard, that provides greater
- precision than single; it also provides an explicit integer bit in the
- significand. The 80387's extended format meets the single extended
- requirement as well as the double extended requirement.
-
- Single Format
- a floating-point format supported by the 80387, which consists of a sign,
- an 8-bit biased exponent, an implicit integer bit, and a 23-bit
- fraction, for a total of 32 explicit bits.
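-
- The three stored fields can be pulled apart with a few shifts. This
- sketch uses Python's struct module to obtain the 32-bit pattern of a
- value; split_single is a hypothetical name.
```python
import struct


def split_single(value: float):
    """Return (sign, biased exponent, fraction) of value in single format."""
    (bits,) = struct.unpack("<I", struct.pack("<f", value))
    sign = bits >> 31                     # 1 bit
    biased_exponent = (bits >> 23) & 0xFF  # 8 bits
    fraction = bits & 0x7FFFFF             # 23 bits; the integer bit is implied
    return sign, biased_exponent, fraction


print(split_single(1.0))
print(split_single(-2.5))
```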
-
- Stack Fault
- a special case of the invalid-operation exception which is indicated by a
- one in the SF bit of the status word. This condition usually results from
- stack underflow or overflow.
-
- Standard
- "IEEE Standard for Binary Floating-Point Arithmetic," ANSI/IEEE
- Std 754-1985.
-
- Status Word
- A 16-bit 80387 register that can be manually set, but which is usually
- controlled by side effects of 80387 instructions. It contains condition
- codes, the 80387 stack pointer, busy and interrupt bits, and exception
- flags.
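-
- A sketch of decoding an integer copy of the status word (for example,
- one stored by FSTSW). It assumes the layout with exception flags in bits
- 0-5, SF in bit 6, ES in bit 7, C0-C2 in bits 8-10, TOP in bits 11-13,
- C3 in bit 14, and B in bit 15. All Python names are hypothetical.
```python
# Assumed single-bit fields of the status word, by bit position.
FLAG_BITS = {"IE": 0, "DE": 1, "ZE": 2, "OE": 3, "UE": 4, "PE": 5,
             "SF": 6, "ES": 7, "B": 15}
CONDITION_BITS = {"C0": 8, "C1": 9, "C2": 10, "C3": 14}


def decode_status_word(status_word: int):
    """Return (set of flag names that are one, condition codes, TOP)."""
    flags = {name for name, bit in FLAG_BITS.items()
             if (status_word >> bit) & 1}
    codes = {name: (status_word >> bit) & 1
             for name, bit in CONDITION_BITS.items()}
    top = (status_word >> 11) & 0b111
    return flags, codes, top


flags, codes, top = decode_status_word(0x0041)
print(sorted(flags), codes, top)
```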
-
- Tag Word
- a 16-bit 80387 register that is automatically maintained by the 80387. For
- each space in the 80387 stack, it tells if the space is occupied by a
- number; if so, it gives information about what kind of number.
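-
- The tag word can be decoded two bits at a time, as sketched below. The
- sketch assumes the tag for physical register r occupies bits 2r and
- 2r+1, with 00 = valid, 01 = zero, 10 = special (NaN, infinity, denormal,
- or unsupported), and 11 = empty. decode_tag_word is a hypothetical name.
```python
TAG_MEANING = {0: "valid", 1: "zero", 2: "special", 3: "empty"}


def decode_tag_word(tag_word: int):
    """Return the tag meaning for physical registers 0 through 7."""
    return [TAG_MEANING[(tag_word >> (2 * r)) & 0b11] for r in range(8)]


print(decode_tag_word(0xFFFF))  # every register empty, as after FINIT
print(decode_tag_word(0x7FFF))  # register 7 holds a zero, the rest are empty
```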
-
- Temporary Real
- an older term for the 80387's 80-bit extended format.
-
- Tiny
- of or pertaining to a floating-point number that is so close to zero that
- its exponent is smaller than the smallest exponent that can be represented in
- the destination format.
-
- TOP
- The three-bit field of the status word that indicates which 80387 register
- is the current top of stack.
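-
- The following sketch shows how a three-bit TOP value relates stack
- references to physical registers. It assumes that ST(i) names physical
- register (TOP + i) modulo 8, that a push decrements TOP, and that a pop
- increments it. The function names are hypothetical.
```python
def st_to_physical(top: int, i: int) -> int:
    """Physical register number addressed by ST(i) for a given TOP."""
    return (top + i) % 8


def after_push(top: int) -> int:
    """TOP after a push: decremented, wrapping from 0 to 7."""
    return (top - 1) % 8


def after_pop(top: int) -> int:
    """TOP after a pop: incremented, wrapping from 7 to 0."""
    return (top + 1) % 8


top = after_push(0)                  # first push onto an initialized stack
print(top, st_to_physical(top, 1))
```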
-
- Transcendental
- one of a class of functions for which polynomial formulas are always
- approximate, never exact for more than isolated values. The 80387 supports
- trigonometric, exponential, and logarithmic functions; all are
- transcendental.
-
- Two's Complement
- a method of representing integers. If the uppermost bit is zero, the
- number is considered positive, with the value given by the rest of the
- bits. If the uppermost bit is one, the number is negative, with the value
- obtained by subtracting 2^(bit count) from the unsigned value of all the
- bits. For example, the 8-bit number 11111100 is -4, obtained by
- subtracting 2^(8) from 252.
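-
- The rule above translates into a few lines of Python. The helper name
- from_twos_complement is hypothetical; width is the bit count.
```python
def from_twos_complement(bits: int, width: int) -> int:
    """Interpret the low `width` bits of `bits` as a two's complement integer."""
    bits &= (1 << width) - 1          # keep only the given bits
    if bits >> (width - 1):           # uppermost bit is one: negative
        return bits - (1 << width)    # subtract 2^(bit count)
    return bits                       # uppermost bit is zero: value as is


print(from_twos_complement(0b11111100, 8))  # the 8-bit example: 252 - 256
```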
-
- Unbiased Exponent
- the true value that tells how far and in which direction to move the
- binary point of the significand of a floating-point number. For example,
- if a single-format exponent is 131, we subtract the bias 127 to obtain the
- unbiased exponent +4. Thus, the real number being represented is the
- significand with the binary point shifted 4 bits to the right.
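-
- The same arithmetic as a short sketch. The significand 1.5 is an
- arbitrary illustrative value, and the function names are hypothetical;
- math.ldexp multiplies by a power of two, which is what shifting the
- binary point amounts to.
```python
import math

SINGLE_BIAS = 127


def unbiased_exponent(biased: int, bias: int = SINGLE_BIAS) -> int:
    """Subtract the format's bias from the stored exponent."""
    return biased - bias


def scale(significand: float, biased: int) -> float:
    """Shift the binary point of the significand by the unbiased exponent."""
    return math.ldexp(significand, unbiased_exponent(biased))


print(unbiased_exponent(131))  # 131 - 127
print(scale(1.5, 131))         # binary 1.1 shifted 4 places right
```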
-
- Underflow
- an exception condition in which the correct answer is nonzero, but has a
- magnitude too small to be represented as a normal number in the
- destination floating-point format. The Standard specifies that an attempt
- be made to represent the number as a denormal. This denormalization may
- result in a loss of significant bits from the significand. This kind of
- underflow (also called numeric underflow) is not to be confused with stack
- underflow.
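-
- Denormalization and the resulting loss of bits can be observed in the
- single format with the sketch below. It relies on Python's struct
- module, which rounds to the single format on the host machine, so it
- illustrates the Standard's behavior rather than the 80387 itself.
- single_fields is a hypothetical helper.
```python
import struct


def single_fields(value: float):
    """Return (biased exponent, fraction) of value stored in single format."""
    (bits,) = struct.unpack("<I", struct.pack("<f", value))
    return (bits >> 23) & 0xFF, bits & 0x7FFFFF


smallest_normal = 2.0 ** -126
tiny = 2.0 ** -130                      # too small for a normal single
tiny_plus = tiny * (1 + 2.0 ** -23)     # same value with one extra low bit

print(single_fields(smallest_normal))   # biased exponent 1, fraction 0
print(single_fields(tiny))              # biased exponent 0: a denormal
# The extra low-order bit does not fit in the denormal and is rounded away.
print(single_fields(tiny_plus) == single_fields(tiny))
```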
-
- Unmasked
- a term that applies to each of the six 80387 exceptions: I,D,Z,O,U,P. An
- exception is unmasked if a corresponding bit in the 80387 control word is
- set to zero. If an exception is unmasked, the 80387 will generate an
- interrupt when the exception condition occurs. You can provide an
- interrupt routine that customizes your exception recovery.
-
- Unnormal
- an extended real representation in which the explicit integer bit of the
- significand is zero and the exponent is nonzero. Unnormal values are
- not supported by the 80387; they cause the invalid-operation exception
- when encountered as operands.
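-
- The glossary definitions of normal, unnormal, and pseudozero can be
- expressed as a small classifier over the extended format's 15-bit
- exponent and 64-bit significand (integer bit in bit 63). This is only a
- sketch: classify_extended is a hypothetical name, and it assumes that
- all-zeros and all-ones exponents belong to other classes (zeros,
- denormals, infinities, NaNs) that it does not try to distinguish.
```python
def classify_extended(exponent: int, significand: int) -> str:
    """Classify an extended-format encoding using the glossary definitions."""
    integer_bit = (significand >> 63) & 1
    if exponent == 0 or exponent == 0x7FFF:
        return "other"        # assumption: not covered by this sketch
    if integer_bit:
        return "normal"
    if significand == 0:
        return "pseudozero"   # zero significand, ordinary exponent
    return "unnormal"         # integer bit zero, exponent nonzero


print(classify_extended(0x3FFF, 0x8000000000000000))  # encoding of 1.0
print(classify_extended(0x3FFF, 0x4000000000000000))
print(classify_extended(0x3FFF, 0))
```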
-
- Unsupported Format
- Any number representation that is not recognized by the 80387. This
- includes several formats that are recognized by the 8087 and 80287,
- namely pseudo-NaN, pseudoinfinity, and unnormal.
-
- Word Integer
- an integer format supported by both the 80386 and the 80387 that consists
- of a 16-bit two's complement quantity.
-
- Zero divide
- an exception condition in which the inputs are finite, but the correct
- answer, even with an unlimited exponent, has infinite magnitude.
-