- Microsoft Macro Assembler - Programmer's Guide
-
-
-
-
-
-
-
-
- ────────────────────────────────────────────────────────────────────────────
- Microsoft (R) Macro Assembler - Programmer's Guide
-
- Version 6.0
- ────────────────────────────────────────────────────────────────────────────
-
-
- For MS (R) OS/2 and MS-DOS (R) Operating Systems
-
-
-
-
-
-
-
-
- Microsoft Corporation
-
- Information in this document is subject to change without notice and does
- not represent a commitment on the part of Microsoft Corporation. The
- software described in this document is furnished under a license agreement
- or nondisclosure agreement. The software may be used or copied only in
- accordance with the terms of the agreement. It is against the law to copy
- the software on any medium except as specifically allowed in the license or
- nondisclosure agreement. No part of this manual may be reproduced or
- transmitted in any form or by any means, electronic or mechanical, including
- photocopying and recording, for any purpose without the express written
- permission of Microsoft.
- RESTRICTED RIGHTS: Use, duplication, or disclosure by the U.S. Government is
- subject to restrictions as set forth in subparagraph (c)(1)(ii) of the
- Rights in Technical Data and Computer Software clause at DFARS 252.227-7013
- or subparagraphs (c)(1) and (2) of Commercial Computer Software -
- Restricted Rights at 48 CFR 52.227-19, as applicable.
- Contractor/Manufacturer is Microsoft Corporation, One Microsoft Way,
- Redmond, WA 98052-6399.
-
-
- (C) Copyright Microsoft Corporation, 1991. All rights reserved.
-
- Printed in the United States of America.
-
-
- Microsoft, MS, MS-DOS, CodeView, QuickC,
- and XENIX are registered trademarks and Making it all make sense,
- Microsoft QuickBasic, QuickPascal, and Windows are trademarks of Microsoft
- Corporation.
-
- U.S. Patent No. 4,955,066
-
- Hercules is a registered trademark of Hercules Computer
- Technology.
-
- IBM is a registered trademark of International Business
- Machines Corporation.
-
- Intel is a registered trademark of Intel Corporation.
-
- NEC and V25 are registered trademarks and V35 is a trademark
- of NEC Corporation.
-
- Document No. LN06556-0291
-
-
-
-
-
-
-
-
- Introduction
- New and Extended Features in MASM 6.0
- New MASM Language Features
- ML and MASM Command Lines
- Compatibility with Earlier Versions of MASM
- Scope and Organization of this Book
- Books for Further Reading
- Document Conventions
- Getting Assistance and Reporting Problems
-
- Chapter 1 Understanding Global Concepts
-
- 1.1 The Processing Environment
- 1.1.1 8086-Based Processors
- 1.1.2 Operating Systems
- 1.1.3 Segmented Architecture
- 1.1.4 Segment Protection
- 1.1.5 Segmented Addressing
- 1.1.6 Segment Arithmetic
- 1.2 Language Components of MASM
- 1.2.1 Reserved Words
- 1.2.2 Identifiers
- 1.2.3 Predefined Symbols
- 1.2.4 Integer Constants and Constant Expressions
- 1.2.5 Operators
- 1.2.6 Data Types
- 1.2.7 Registers
- 1.2.8 Statements
- 1.3 The Assembly Process
- 1.3.1 Generating and Running Executable Programs
- 1.3.2 Using the OPTION Directive
- 1.3.3 Conditional Directives
- 1.4 Related Topics in Online Help
-
- Chapter 2 Organizing MASM Segments
-
- 2.1 Overview of Memory Segments
- 2.2 Using Simplified Segment Directives
- 2.2.1 Defining Basic Attributes with .MODEL
- 2.2.2 Specifying a Processor and Coprocessor
- 2.2.3 Creating a Stack
- 2.2.4 Creating Data Segments
- 2.2.5 Creating Code Segments
- 2.2.6 Starting and Ending Code with .STARTUP and .EXIT
- 2.3 Using Full Segment Definitions
- 2.3.1 Defining Segments with the SEGMENT Directive
- 2.3.2 Controlling the Segment Order
- 2.3.3 Setting the ASSUME Directive for Segment Registers
- 2.3.4 Defining Segment Groups
- 2.4 Related Topics in Online Help
-
- Chapter 3 Using Addresses and Pointers
-
- 3.1 Programming Segmented Addresses
- 3.1.1 Initializing Default Segment Registers
- 3.1.2 Near and Far Addresses
- 3.2 Specifying Addressing Modes
- 3.2.1 Register Operands
- 3.2.2 Immediate Operands
- 3.2.3 Direct Memory Operands
- 3.2.4 Indirect Memory Operands
- 3.3 Accessing Data with Pointers and Addresses
- 3.3.1 Defining Pointer Types with TYPEDEF
- 3.3.2 Defining Register Types with ASSUME
- 3.3.3 Basic Pointer and Address Operations
- 3.4 Related Topics in Online Help
-
- Chapter 4 Defining and Using Integers
-
- 4.1 Declaring Integer Variables
- 4.1.1 Allocating Memory for Integer Variables
- 4.1.2 Data Initialization
- 4.2 Integer Operations
- 4.2.1 Moving and Loading Integers
- 4.2.2 Pushing and Popping Stack Integers
- 4.2.3 Adding and Subtracting Integers
- 4.2.4 Multiplying and Dividing Integers
- 4.3 Manipulating Integers at the Bit Level
- 4.3.1 Logical Operations
- 4.3.2 Shifting and Rotating Bits
- 4.3.3 Multiplying and Dividing with Shift Instructions
- 4.4 Related Topics in Online Help
-
- Chapter 5 Defining and Using Complex Data Types
-
- 5.1 Arrays and Strings
- 5.1.1 Declaring and Referencing Arrays
- 5.1.2 Declaring and Initializing Strings
- 5.1.3 Processing Arrays and Strings
- 5.2 Structures and Unions
- 5.2.1 Declaring Structure and Union Types
- 5.2.2 Defining Structure and Union Variables
- 5.2.3 Referencing Structures, Unions, and Fields
- 5.2.4 Nested Structures and Unions
- 5.3 Records
- 5.3.1 Declaring Record Types
- 5.3.2 Defining Record Variables
- 5.3.3 Record Operators
- 5.4 Related Topics in Online Help
-
- Chapter 6 Using Floating-Point and Binary Coded Decimal Numbers
-
- 6.1 Using Floating-Point Numbers
- 6.1.1 Declaring Floating-Point Variables and Constants
- 6.1.2 Storing Numbers in Floating-Point Format
- 6.2 Using a Math Coprocessor
- 6.2.1 Coprocessor Architecture
- 6.2.2 Instruction and Operand Formats
- 6.2.3 Coordinating Memory Access
- 6.2.4 Using Coprocessor Instructions
- 6.3 Using Emulator Libraries
- 6.4 Using Binary Coded Decimal Numbers
- 6.4.1 Defining BCD Constants and Variables
- 6.4.2 Calculating with BCDs
- 6.5 Related Topics in Online Help
-
- Chapter 7 Controlling Program Flow
-
- 7.1 Jumps
- 7.1.1 Unconditional Jumps
- 7.1.2 Conditional Jumps
- 7.2 Loops
- 7.2.1 Loop-Generating Directives
- 7.2.2 Writing Loop Conditions
- 7.3 Procedures
- 7.3.1 Defining Procedures
- 7.3.2 Passing Arguments on the Stack
- 7.3.3 Declaring Parameters with the PROC Directive
- 7.3.4 Using Local Variables
- 7.3.5 Creating Local Variables Automatically
- 7.3.6 Declaring Procedure Prototypes
- 7.3.7 Calling Procedures with INVOKE
- 7.3.8 Generating Prologue and Epilogue Code
- 7.4 DOS Interrupts
- 7.4.1 Calling DOS and ROM-BIOS Interrupts
- 7.4.2 Replacing or Redefining Interrupt Routines
- 7.5 Related Topics in Online Help
-
- Chapter 8 Sharing Data and Procedures among Modules and Libraries
-
- 8.1 Selecting Data-Sharing Methods
- 8.2 Sharing Symbols with Include Files
- 8.2.1 Organizing Modules
- 8.2.2 Declaring Symbols Public and External
- 8.2.3 Positioning External Declarations
- 8.3 Using Alternatives to Include Files
- 8.3.1 PUBLIC and EXTERN
- 8.3.2 Other Alternatives
- 8.4 Developing Libraries
- 8.4.1 Associating Libraries with Modules
- 8.4.2 Using EXTERN with Library Routines
- 8.5 Related Topics in Online Help
-
- Chapter 9 Using Macros
-
- 9.1 Text Macros
- 9.2 Macro Procedures
- 9.2.1 Creating Macro Procedures
- 9.2.2 Passing Arguments to Macros
- 9.2.3 Specifying Required and Default Parameters
- 9.2.4 Defining Local Symbols in Macros
- 9.3 Assembly Time Variables and Macro Operators
- 9.3.1 Text Delimiters (< >) and the Literal-Character
- Operator (!)
- 9.3.2 Expansion Operator (%)
- 9.3.3 Substitution Operator (&)
- 9.4 Defining Repeat Blocks with Loop Directives
- 9.4.1 REPEAT Loops
- 9.4.2 WHILE Loops
- 9.4.3 FOR Loops and Variable-Length Parameters
- 9.4.4 FORC Loops
- 9.5 String Directives and Predefined Functions
- 9.6 Returning Values with Macro Functions
- 9.7 Advanced Macro Techniques
- 9.7.1 Nesting Macro Definitions
- 9.7.2 Testing for Argument Type and Environment
- 9.7.3 Using Recursive Macros
- 9.8 Related Topics in Online Help
-
- Chapter 10 Managing Projects with NMAKE
-
- 10.1 Overview of NMAKE
- 10.2 Running NMAKE
- 10.3 NMAKE Description Files
- 10.3.1 Description Blocks
- 10.3.2 Pseudotargets
- 10.3.3 Comments
- 10.3.4 Macros
- 10.3.5 Inference Rules
- 10.3.6 Directives
- 10.3.7 Preprocessing Directives
- 10.3.8 Extracting Filename Components
- 10.4 Command-Line Options
- 10.5 NMAKE Command File
- 10.6 The TOOLS.INI File
- 10.7 Inline Files
- 10.8 Sequence of NMAKE Operations
- 10.9 A Sample NMAKE Description File
- 10.10 Differences between NMAKE and MAKE
- 10.11 Using NMK
- 10.12 Using Exit Codes with NMAKE
- 10.13 Related Topics in Online Help
-
- Chapter 11 Creating Help Files with HELPMAKE
-
- 11.1 Structure and Contents of a Help Database
- 11.1.1 Contents of a Help File
- 11.1.2 Help File Formats
- 11.2 Invoking HELPMAKE
- 11.3 HELPMAKE Options
- 11.3.1 Options for Encoding
- 11.3.2 Options for Decoding
- 11.3.3 Options for Help
- 11.4 Creating a Help Database
- 11.5 Help Text Conventions
- 11.5.1 Structure of the Help Text File
- 11.5.2 Local Contexts
- 11.5.3 Context Prefixes
- 11.5.4 Hyperlinks
- 11.6 Using Help Database Formats
- 11.6.1 QuickHelp Format
- 11.6.2 Rich Text Format
- 11.6.3 Minimally Formatted ASCII Format
- 11.7 Related Topics in Online Help
-
- Chapter 12 Linking Object Files with LINK
-
- 12.1 Overview
- 12.2 LINK Output Files
- 12.3 LINK Syntax and Input
- 12.3.1 The objfiles Field
- 12.3.2 The exefile Field
- 12.3.3 The mapfile Field
- 12.3.4 The libraries Field
- 12.3.5 The deffile Field
- 12.3.6 Examples
- 12.4 Running LINK
- 12.4.1 Specifying Input with LINK Prompts
- 12.4.2 Specifying Input in a Response File
- 12.5 LINK Options
- 12.5.1 Specifying Options
- 12.5.2 The /ALIGN Option
- 12.5.3 The /BATCH Option
- 12.5.4 The /CO Option
- 12.5.5 The /CPARM Option
- 12.5.6 The /DOSSEG Option
- 12.5.7 The /DSALLOC Option
- 12.5.8 The /EXEPACK Option
- 12.5.9 The /FARCALL Option
- 12.5.10 The /HELP Option
- 12.5.11 The /HIGH Option
- 12.5.12 The /INCR Option
- 12.5.13 The /INFO Option
- 12.5.14 The /LINE Option
- 12.5.15 The /MAP Option
- 12.5.16 The /NOD Option
- 12.5.17 The /NOE Option
- 12.5.18 The /NOFARCALL Option
- 12.5.19 The /NOGROUP Option
- 12.5.20 The /NOI Option
- 12.5.21 The /NOLOGO Option
- 12.5.22 The /NONULLS Option
- 12.5.23 The /NOPACKC Option
- 12.5.24 The /OV Option
- 12.5.25 The /PACKC Option
- 12.5.26 The /PACKD Option
- 12.5.27 The /PADC Option
- 12.5.28 The /PADD Option
- 12.5.29 The /PAUSE Option
- 12.5.30 The /PM Option
- 12.5.31 The /Q Option
- 12.5.32 The /SEG Option
- 12.5.33 The /STACK Option
- 12.5.34 The /TINY Option
- 12.5.35 The /W Option
- 12.5.36 The /? Option
- 12.6 Setting Options with the LINK Environment Variable
- 12.6.1 Setting the LINK Environment Variable
- 12.6.2 Behavior of the LINK Environment Variable
- 12.6.3 Clearing the LINK Environment Variable
- 12.7 Using Overlays under DOS
- 12.7.1 Restrictions on Overlays
- 12.7.2 Specifying Overlays
- 12.7.3 How Overlays Work
- 12.7.4 Overlay Interrupts
- 12.8 Linker Operation under DOS
- 12.8.1 Segment Alignment
- 12.8.2 Frame Number
- 12.8.3 Segment Order
- 12.8.4 Combined Segments
- 12.8.5 Groups
- 12.8.6 Fixups
- 12.9 LINK Temporary Files
- 12.10 LINK Exit Codes
- 12.11 Related Topics in Online Help
-
- Chapter 13 Module-Definition Files
-
- 13.1 Overview
- 13.2 Module Statements
- 13.2.1 Syntax Rules
- 13.2.2 Reserved Words
- 13.3 The NAME Statement
- 13.4 The LIBRARY Statement
- 13.5 The DESCRIPTION Statement
- 13.6 The STUB Statement
- 13.7 The EXETYPE Statement
- 13.8 The PROTMODE Statement
- 13.9 The REALMODE Statement
- 13.10 The STACKSIZE Statement
- 13.11 The HEAPSIZE Statement
- 13.12 The CODE Statement
- 13.13 The DATA Statement
- 13.14 The SEGMENTS Statement
- 13.15 CODE, DATA, and SEGMENTS Attributes
- 13.16 The OLD Statement
- 13.17 The EXPORTS Statement
- 13.18 The IMPORTS Statement
- 13.19 Related Topics in Online Help
-
- Chapter 14 Customizing the Microsoft Programmer's WorkBench
-
- 14.1 Setting Switches
- 14.1.1 Changing Current Assignments and Switch Settings
- 14.1.2 Editing the TOOLS.INI Initialization File
- 14.2 Assigning Functions to Keystrokes
- 14.3 Writing Macros
- 14.3.1 Macro Syntax
- 14.3.2 Macro Responses
- 14.3.3 Macro Arguments
- 14.3.4 Macro Conditionals
- 14.3.5 Recording Macros
- 14.3.6 Temporary Macros
- 14.4 Related Topics in Online Help
-
- Chapter 15 Debugging Assembly-Language Programs with CodeView
-
- 15.1 Understanding Windows in CodeView
- 15.2 Overview of Debugging Techniques
- 15.3 Viewing and Modifying Program Data
- 15.3.1 Displaying Variables in the Watch Window
- 15.3.2 Displaying Expressions in the Watch Window
- 15.3.3 Displaying Local Variables
- 15.3.4 Using Pointers to Display Arrays and Strings
- 15.3.5 Displaying Structures
- 15.3.6 Using Quick Watch
- 15.3.7 Displaying Memory
- 15.3.8 Displaying the Processor Registers
- 15.3.9 Modifying the Values of Variables, Memory,
- and Registers
- 15.4 Controlling Execution
- 15.4.1 Continuous Execution
- 15.4.2 Single-Stepping
- 15.4.3 Changing the Program Display Mode
- 15.5 Replaying a Debug Session
- 15.6 Advanced CodeView Techniques
- 15.7 CodeView Command-Line Options
- 15.8 Customizing CodeView with the TOOLS.INI File
- 15.9 Related Topics in Online Help
-
- Chapter 16 Converting C Header Files to MASM Include Files
-
- 16.1 Basic H2INC Operation
- 16.2 H2INC Syntax and Options
- 16.3 Converting Data and Data Structures
- 16.3.1 User-Defined and Predefined Constants
- 16.3.2 Variables
- 16.3.3 Pointers
- 16.3.4 Structures and Unions
- 16.3.5 Bit Fields
- 16.3.6 Enumerations
- 16.3.7 Type Definitions
- 16.4 Converting Function Prototypes
- 16.5 Related Topics in Online Help
-
- Chapter 17 Writing OS/2 Applications
-
- 17.1 OS/2 Overview
- 17.2 Differences between DOS and OS/2
- 17.3 A Sample Program
- 17.4 Building an OS/2 Application
- 17.5 Binding OS/2 MASM Programs
- 17.6 Register and Memory Initialization
- 17.7 Other OS/2 Utilities
- 17.8 Module-Definition Files
- 17.9 Related Topics in Online Help
-
- Chapter 18 Creating Dynamic-Link Libraries
-
- 18.1 DLL Overview
- 18.2 DLL Programming Requirements
- 18.2.1 Separate Stack and Data Requirement
- 18.2.2 Floating-Point Math Requirement
- 18.2.3 Re-entrance Requirement
- 18.2.4 Segment Strategy in a DLL
- 18.3 Writing the DLL Code
- 18.3.1 Choosing Module Attributes
- 18.3.2 Defining Procedures and Data
- 18.3.3 Creating Initialization and Termination Code
- 18.4 Building the DLL
- 18.4.1 Writing the Module-Definition File
- 18.4.2 Generating an Import Library with IMPLIB
- 18.4.3 Creating and Using the DLL
- 18.5 Related Topics in Online Help
-
- Chapter 19 Writing Memory-Resident Software
-
- 19.1 Terminate-and-Stay-Resident Programs
- 19.1.1 Structure of a TSR
- 19.1.2 Passive TSRs
- 19.1.3 Active TSRs
- 19.2 Interrupt Handlers in Active TSRs
- 19.2.1 Auditing Hardware Events for TSR Requests
- 19.2.2 Monitoring System Status
- 19.2.3 Determining Whether to Invoke the TSR
- 19.3 Example of a Simple TSR: ALARM
- 19.4 Using DOS in Active TSRs
- 19.4.1 Understanding DOS Stacks
- 19.4.2 Determining DOS Activity
- 19.4.3 Interrupting DOS Functions
- 19.4.4 Monitoring the Critical Error Flag
- 19.5 Preventing Interference
- 19.5.1 Trapping Errors
- 19.5.2 Preserving an Existing Condition
- 19.5.3 Preserving Existing Data
- 19.6 Communicating through the Multiplex Interrupt
- 19.6.1 The Multiplex Handler
- 19.6.2 Using the Multiplex Interrupt Under DOS Version 2.x
- 19.7 Deinstalling TSRs
- 19.8 Example of an Advanced TSR: SNAP
- 19.8.1 Building SNAP.EXE
- 19.8.2 Outline of SNAP
- 19.9 Related Topics in Online Help
-
- Chapter 20 Mixed-Language Programming
-
- 20.1 Naming and Calling Conventions
- 20.1.1 Naming Conventions
- 20.1.2 The C Calling Convention
- 20.1.3 The Pascal Calling Convention
- 20.1.4 The Standard Calling Convention
- 20.2 Writing the Assembly-Language Procedure
- 20.3 The MASM/High-Level-Language Interface
- 20.3.1 The C/MASM Interface
- 20.3.2 The FORTRAN/MASM Interface
- 20.3.3 The Basic/MASM Interface
- 20.3.4 The Pascal/MASM Interface
- 20.3.5 The QuickPascal/MASM Interface
- 20.4 Related Topics in Online Help
-
- Appendix A Differences between MASM 6.0 and 5.1
-
- A.1 New Features of Version 6.0
- A.1.1 The Assembler, Environment, and Utilities
- A.1.2 Segment Management
- A.1.3 Data Types
- A.1.4 Procedures, Loops, and Jumps
- A.1.5 Simplifying Multiple-Module Projects
- A.1.6 Expanded State Control
- A.1.7 New Processor Instructions
- A.1.8 Renamed Directives
- A.1.9 Macro Enhancements
- A.1.10 MASM 6.0 Programming Practices
- A.2 Compatibility between MASM 5.1 and 6.0
- A.2.1 Rewriting Code for Compatibility
- A.2.2 Using the OPTION Directive
- A.2.3 Changes to Instruction Encodings
-
- Appendix B BNF Grammar
-
-
- Appendix C Generating and Reading Assembly Listings
-
- C.1 Generating Listing Files
- C.1.1 Generating a First Pass Listing
- C.1.2 Controlling the Contents of the Listing File
- C.1.3 Controlling Listing Information on Macros
- C.1.4 Controlling the Page Format
- C.1.5 Precedence of Command-Line Options and Listing
- Directives
- C.2 Reading the Listing File
- C.2.1 Code Generated
- C.2.2 Error Messages
- C.2.3 Symbols and Abbreviations
- C.2.4 Reading Tables in a Listing File
-
- Appendix D MASM Reserved Words
-
- D.1 Operands and Symbols
- D.1.1 Special Operands for the 80386/486
- D.1.2 Predefined Symbols
- D.2 Registers
- D.3 Operators and Directives
- D.4 Processor Instructions
- D.4.1 8086/8088 Processor Instructions
- D.4.2 80186 Processor Instructions
- D.4.3 80286 Processor Instructions
- D.4.4 80286 and 80386 Privileged-Mode Instructions
- D.4.5 80386 Processor Instructions
- D.4.6 80486 Processor Instructions
- D.4.7 Instruction Prefixes
- D.5 Coprocessor Instructions
- D.5.1 8087 Coprocessor Instructions
- D.5.2 80287 Privileged-Mode Instruction
- D.5.3 80387 Instructions
-
- Appendix E Default Segment Names
-
-
- Appendix F Error Messages
-
- F.1 BIND Error Messages
- F.2 CodeView Error Messages
- F.3 EXEHDR Error Messages
- F.4 HELPMAKE Error Messages
- F.4.1 HELPMAKE Fatal Errors
- F.4.2 HELPMAKE Errors
- F.4.3 HELPMAKE Warnings
- F.5 H2INC Error Messages
- F.5.1 H2INC Fatal Errors
- F.5.2 H2INC Compilation Errors
- F.5.3 H2INC Warnings
- F.6 IMPLIB Error Messages
- F.6.1 IMPLIB Fatal Errors
- F.6.2 IMPLIB Errors
- F.7 LIB Error Messages
- F.7.1 LIB Fatal Errors
- F.7.2 LIB Errors
- F.7.3 LIB Warnings
- F.8 LINK Error Messages
- F.8.1 LINK Fatal Errors
- F.8.2 LINK Errors
- F.8.3 LINK Warnings
- F.9 ML Error Messages
- F.9.1 ML Fatal Errors
- F.9.2 ML Errors
- F.9.3 ML Warnings
- F.10 NMAKE Error Messages
- F.10.1 NMAKE Fatal Errors
- F.10.2 NMAKE Errors
- F.10.3 NMAKE Warnings
- F.11 PWB.COM Error Messages
- F.12 PWBRMAKE Error Messages
- F.12.1 PWBRMAKE Fatal Errors
- F.12.2 PWBRMAKE Warnings
-
- Glossary
-
-
- Index
-
-
-
-
- Introduction
- ────────────────────────────────────────────────────────────────────────────
-
- The Microsoft (R) Macro Assembler Programmer's Guide provides the
- information you need to write and debug assembly-language programs with the
- Microsoft Macro Assembler (MASM), version 6.0. This book documents enhanced
- features of the language and the programming environment for MASM 6.0. It
- also describes new features that take advantage of the capabilities of the
- 80386/486 processors.
-
- The Programmer's Guide is written for experienced programmers who know
- assembly language and are familiar with an assembler. The book does not
- teach the basics of assembly language; it does explain Microsoft-specific
- features. If you want to learn or review the basics of assembly language,
- refer to "Books for Further Reading" later in this introduction.
-
- The documentation for MASM 6.0 is an integrated set, comprehensive and
- cohesive. This book emphasizes writing efficient code with the new and
- advanced features of MASM. Installing and Using the Professional Development
- System explains not only how to set up MASM 6.0 but also how to use the
- extensive online reference system, the Microsoft Advisor.
-
- Installing and Using also introduces the integrated environment called the
- Programmer's WorkBench (PWB) and shows how to manage development projects
- with it. The Microsoft Macro Assembler Reference provides a full listing of
- all MASM instructions, directives, statements, and operators, and it serves
- as a quick reference to utility commands.
-
- For more information on these same topics, see the online Microsoft Advisor,
- which is a complete reference to Macro Assembler language topics, to the
- utilities, and to PWB. You should be able to find most of the information
- you need in the Microsoft Advisor. The printed documents give more in-depth
- and background information.
-
-
- New and Extended Features in MASM 6.0
-
- Version 6.0 of MASM differs from version 5.1 in many ways, from optional
- extensions to features that replace or modify previous assembler behavior.
-
- MASM 6.0 includes the Programmer's WorkBench, an integrated software
- development environment, and the CodeView (R) source-level debugger. From
- within PWB you can edit, build, debug, or run a program, and you can perform
- most of these operations with either menu selections or keyboard commands.
- You can also customize PWB to suit your individual programming and editing
- requirements and preferences.
-
-
- New MASM Language Features
-
- MASM 6.0 includes a number of new features, described in the list below,
- designed to make programming more efficient and intuitive and to increase
- your productivity. For example, MASM's new high-level-language features mean
- that you can get the speed of assembly language with the ease of high-level
- languages. You can also maintain your programs more easily.
-
-
- ■ MASM 6.0 has many enhancements related to types. You can now use the
- same type specifiers in initializations as in other contexts (BYTE
- instead of DB). You can also define your own types, including pointer
- types, with the new TYPEDEF directive. See Chapter 3, "Using Addresses
- and Pointers," and Chapter 4, "Defining and Using Integers."
-
- ■ The syntax for defining and using structures and records has been
- enhanced. You can also define unions with the new UNION directive. See
- Chapter 5, "Defining and Using Complex Data Types."
-
- ■ MASM now generates complete CodeView information for all types. See
- Chapter 3, "Using Addresses and Pointers," and Chapter 4, "Defining
- and Using Integers."
-
- ■ New control-flow directives let you use high-level-language constructs
- such as loops and if-then-else blocks defined with .REPEAT and .UNTIL
- (or .UNTILCXZ); .WHILE and .ENDW; and .IF, .ELSE, and .ELSEIF. The
- assembler generates the appropriate code to implement the control
- structure. See Chapter 7, "Controlling Program Flow."
-
- ■ MASM now has more powerful features for defining and calling
- procedures. The extended PROC syntax for generating stack frames has
- been enhanced in version 6.0. You can also use the PROTO directive to
- prototype a procedure, which you can then call with the INVOKE
- directive. INVOKE automatically generates code to pass arguments
- (converting them to a related type, if appropriate) and make the call
- according to the specified calling convention. See Chapter 7,
- "Controlling Program Flow."
-
- ■ MASM optimizes jumps by automatically determining the most efficient
- coding for a jump and then generating the appropriate code. See
- Chapter 7, "Controlling Program Flow."
-
- ■ Maintaining multiple-module programs is easier in MASM 6.0. The
- EXTERNDEF and PROTO directives make it easy to maintain all global
- definitions in include files shared by all the source modules of a
- project. See Chapter 8, "Sharing Data and Procedures among Modules and
- Libraries."
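-
- As a rough illustration of how several of these features fit together, the
- fragment below is a minimal sketch and is not taken from the later
- chapters. The names PBYTE, message, and StrLen are hypothetical, and the
- code assumes a small-model DOS program. It combines a TYPEDEF pointer type,
- a BYTE initializer, a PROTO/INVOKE pair, and a .WHILE loop:
-

```asm
        .MODEL  small, c            ; small model, C naming and calling convention
        .STACK
        .DATA
PBYTE   TYPEDEF PTR BYTE            ; hypothetical user-defined pointer type
message BYTE    "MASM 6.0", 0       ; BYTE used as an initializer instead of DB

StrLen  PROTO   pstr:PBYTE          ; prototype lets INVOKE check the argument

        .CODE
        .STARTUP                    ; assembler-generated start-up code
        INVOKE  StrLen, ADDR message ; passes the argument and makes the call
        .EXIT                       ; assembler-generated exit code

; Returns in AX the number of bytes before the terminating zero.
StrLen  PROC    USES si, pstr:PBYTE
        mov     si, pstr
        xor     ax, ax
        .WHILE  BYTE PTR [si] != 0  ; assembler generates the compare and jumps
            inc     ax
            inc     si
        .ENDW
        ret
StrLen  ENDP
        END
```

- In this sketch the assembler, not the programmer, produces the stack
- frame, the argument push, the stack cleanup, and the loop's jumps.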
-
-
- The assembler has many new macro features that make complex macros clearer
- and easier to write:
-
-
- ■ You can specify default values for macro arguments or mark arguments
- as required. And with the VARARG keyword, one parameter can accept a
- variable number of arguments.
-
- ■ You can implement loops inside of macros in various ways. For example,
- the new WHILE directive expands the statements in a macro body while
- an expression is not zero.
-
- ■ You can define macro functions, which return text macros. Several
- predefined text macros are also provided for processing strings. Macro
- operators and other features related to processing text macros and
- macro arguments have been enhanced. For more information on all these
- macro features, see Chapter 9, "Using Macros."
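-
- The following sketch suggests how required arguments, default values, and
- VARARG might look in practice. The macro names PushRegs and LoadVal and
- their parameter names are hypothetical, chosen only for illustration:
-

```asm
; PushRegs accepts any number of registers through one VARARG parameter.
PushRegs MACRO  regs:VARARG
        FOR     reg, <regs>         ;; repeat the body once per argument
            push    reg
        ENDM
ENDM

; LoadVal requires a destination; the value defaults to 0 when omitted.
LoadVal MACRO   dest:REQ, value:=<0>
        mov     dest, value
ENDM

        PushRegs ax, bx, cx         ; one push per register listed
        LoadVal dx                  ; default value is used
        LoadVal si, 100             ; explicit value overrides the default
```

- Omitting the dest argument in a call to LoadVal should produce an
- assembly-time error, because the parameter is marked REQ.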
-
-
- Finally, MASM 6.0 has improved customization capabilities:
-
-
- ■ With the new .STARTUP and .EXIT directives you can automatically
- generate appropriate start-up and exit code for DOS or OS/2 modules.
- See Chapter 2, "Organizing MASM Segments."
-
- ■ MASM 6.0 supports the flat memory model, available with OS/2 version
- 2.0. In flat model, segments can be as large as 4 gigabytes instead of 64K
- (kilobytes). Offsets are 32 bits instead of 16 bits. See Chapter 2,
- "Organizing MASM Segments."
-
- ■ The program H2INC.EXE converts C include files to MASM include files
- and translates data structures and declarations. See Chapter 16,
- "Converting C Header Files to MASM Include Files."
-
-
- MASM 6.0 includes many other minor new features as well as extended support
- for features of earlier versions of MASM. These features are listed in
- Appendix A, "Differences between MASM 6.0 and 5.1," with cross-references to
- the chapters where they are discussed in detail.
-
-
- ML and MASM Command Lines
-
- MASM 6.0 provides a new command-line driver, ML, which is more powerful and
- flexible than the previous driver (MASM). ML assembles and links with one
- command. The old MASM command-line syntax is still accepted, however, so
- that existing batch files and makefiles that use MASM command lines continue
- to work.
-
- ────────────────────────────────────────────────────────────────────────────
- NOTE
-
- The name MASM has traditionally been used to refer to the Microsoft Macro
- Assembler. It is used in that context throughout this book. But MASM also
- refers to MASM.EXE, which has been replaced by ML.EXE. In MASM 6.0, the
- MASM.EXE file is a small utility that translates command-line options to
- those accepted by ML.EXE, and then calls ML.EXE. The distinction between
- ML.EXE and MASM.EXE is made whenever necessary. Otherwise, MASM refers to
- the assembler and its features.
- ────────────────────────────────────────────────────────────────────────────
-
-
- Compatibility with Earlier Versions of MASM
-
- In many cases, MASM 5.1 code will assemble without modification under MASM
- 6.0. However, MASM 6.0 provides a new OPTION directive that lets you
- selectively modify the assembly process. In particular, you can use the M510
- argument with OPTION or the /Zm command-line option to set most features to
- be compatible with version 5.1 code.
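-
- As a minimal sketch, a module written for MASM 5.1 might request the
- compatible behavior with a single line near its top. The placement shown
- here is an assumption; Appendix A is the place to check the details. The
- /Zm command-line option mentioned above is the alternative when the source
- file itself should stay unchanged:
-

```asm
; Near the top of a source module originally written for MASM 5.1.
        OPTION  M510                ; set most features to version 5.1 behavior
```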
-
- See Appendix A, "Differences between MASM 6.0 and 5.1," for information
- about obsolete features that will not assemble correctly under MASM 6.0. The
- appendix also discusses how to update code to use the new features.
-
-
- Scope and Organization of this Book
-
- The Programmer's Guide describes how to get the most out of the Microsoft
- Macro Assembler 6.0 and the Programmer's WorkBench. The book is arranged by
- topic, with each topic answering a question or solving a problem. The last
- section in each chapter lists topics in the online reference system that
- provide additional information.
-
- The Programmer's Guide is divided into three parts:
-
- Part 1, "Programming in Assembly Language," explains how to program
- efficiently using both the new and old features of MASM. It reviews the
- basic components of assembly language and also describes the new and
- enhanced features.
-
- Part 2, "Improving Programmer Productivity," introduces the utility programs
- included with MASM 6.0. These programs can help you program more quickly and
- efficiently. For example, the chapters in Part 2 show you how to
- automatically update your project (Chapter 10), create help files with
- HELPMAKE (Chapter 11), use the Microsoft linker (LINK) (Chapter 12), write
- module-definition files (Chapter 13), customize PWB to suit your programming
- style (Chapter 14), use the CodeView debugger to record and play back a
- debugging session (Chapter 15), and easily port data structures from C
- programs to MASM programs (Chapter 16).
-
- Part 3, "Advanced Topics," covers specialized areas. It describes how to
- write programs to run under OS/2 (Chapter 17) and how to build dynamic-link
- libraries (Chapter 18). Chapter 19 shows how to write a
- terminate-and-stay-resident (TSR) program. Chapter 20, on mixed-language
- programming, defines the calling conventions and equivalent data types that
- allow MASM to call and be called by C, FORTRAN, Basic, and Pascal.
-
- In addition, six appendixes and a glossary detail the features of MASM 6.0.
- Of particular interest are Appendix A, "Differences between MASM 6.0 and
- 5.1," and Appendix B, "BNF Grammar." Appendix A lists the new features of
- MASM 6.0 and also explains how to update MASM 5.1 code. The BNF
- (Backus-Naur Form) grammar lets you determine the exact syntax for any MASM
- language component. It spells out recursive definitions precisely and shows
- all the available options for any placeholder. Other appendixes
- cover assembly listings, reserved words, default segment names, and error
- messages.
-
-
- Books for Further Reading
-
- The following books may help you learn to program in assembly language or
- write specialized programs. These books are listed only for your
- convenience. Microsoft makes no specific recommendations concerning any of
- these books.
-
-
- Books about Programming in Assembly Language
-
- Abrash, Michael, Zen of Assembly Language.
- Glenview, IL: Scott, Foresman and Co., 1990.
-
- Duntemann, Jeff, Assembly Language from Square One: For the PC AT and
- Compatibles.
- Glenview, IL: Scott, Foresman and Co., 1990.
-
- Fernandez, Judi N., and Ashley, Ruth, Assembly Language Programming for the
- 80386.
- New York: McGraw-Hill, 1990.
-
- Miller, Alan R., DOS Assembly Language Programming.
- San Francisco: SYBEX, 1988.
-
- Scanlon, Leo J., 80286 Assembly Language Programming on MS-DOS Computers.
- New York: Brady Communications, 1986.
-
- Turley, James L., Advanced 80386 Programming Techniques.
- Berkeley, CA: Osborne McGraw-Hill, 1988.
-
-
- Books about DOS and BIOS
-
- "Article 11." MS-DOS Encyclopedia.
- Redmond, WA: Microsoft Press, 1988. Contains information about
- terminate-and-stay-resident programs.
-
- Duncan, Ray, Advanced MS-DOS.
- 2nd ed. Redmond, WA: Microsoft Press, 1988.
-
- Jourdain, Robert, Programmer's Problem Solver for the IBM PC, XT and AT.
- New York: Brady Communications, 1986.
-
- Microsoft MS-DOS Programmer's Reference.
- Redmond, WA: Microsoft Press, 1986-87.
-
- Norton, Peter and Wilton, Richard, The New Peter Norton Programmer's Guide
- to the IBM PC and PS/2.
- Redmond, WA: Microsoft Press, 1988.
-
- Wilton, Richard, Programmer's Guide to PC & PS/2 Video Systems.
- Redmond, WA: Microsoft Press, 1987.
-
-
- Books about OS/2
-
- Duncan, Ray, Advanced OS/2 Programming.
- Redmond, WA: Microsoft Press, 1989.
-
- ───, Essential OS/2 Functions.
- Redmond, WA: Microsoft Press, 1989.
-
- Letwin, Gordon, Inside OS/2.
- Redmond, WA: Microsoft Press, 1989.
-
- OS/2 Programmer's Reference.
- 4 vols. Redmond, WA: Microsoft Press, 1989.
-
-
- Books about Other Topics
-
- Nelson, Ross P., The 80386 Book.
- Redmond, WA: Microsoft Press, 1988.
-
- Startz, Richard, 8087/80287/80387 for the IBM PC and Compatibles.
- Bowie, MD: Robert J. Brady Co., 1988.
-
- Writing ROMable Code in Microsoft C.
- Costa Mesa, CA: SSI Corporation.
-
-
- Document Conventions
-
- The following document conventions are used throughout this manual:
-
- Example of Convention             Description
- ────────────────────────────────────────────────────────────────────────────
- SAMPLE2.ASM Uppercase letters indicate file names,
- segment names, registers, and terms used
- at the command level.
-
- .MODEL Boldface type indicates
- assembly-language directives,
- instructions, type specifiers, and
- predefined macros, as well as keywords
- in other programming languages.
-
- placeholders Italic letters indicate placeholders for
- information you must supply, such as a
- file name. Italics are also occasionally
- used for emphasis in the text.
-
- target This font is used to indicate example
- programs, user input, and screen output.
-
- ; A semicolon in the first column of an
- example signals illegal code. A
- semicolon also marks a comment.
-
- SHIFT Small capital letters signify names of
- keys on the keyboard. Notice that a plus
- (+) indicates a combination of keys. For
- example, CTRL+E means to hold down the
- CTRL key while pressing the E key.
-
- «argument» Items inside double square brackets are
- optional.
-
- {register|memory} Braces and a vertical bar indicate a
- choice between two or more items. You
- must choose one of the items unless
- double square brackets surround the
- braces.
-
- Repeating elements... A horizontal ellipsis (...) following an
- item indicates that more items having
- the same form may appear.
-
- Program A vertical ellipsis tells you that part
- . of a program has been intentionally
- . omitted.
- .
- Fragment
-
-
- Getting Assistance and Reporting Problems
-
- If you need help or think you have discovered a problem in the software,
- please provide the following information to help us locate the problem:
-
-
- ■ The version of DOS or OS/2 that you are running
-
- ■ Your system configuration: the type of machine you are using, its
- total memory, and its total free memory at assembler execution time,
- as well as any other information you think might be useful
-
- ■ The assembly command line used, or the link command line if the
- problem occurred during linking
-
- ■ Any object files or libraries you linked with if the problem occurred
- at link time
-
-
- If your program is very large, please try to reduce its size to the smallest
- possible program that still produces the problem.
-
- Use the Product Assistance Request form at the back of this book to send
- this information to Microsoft. If you have comments or suggestions regarding
- any of the books accompanying this product, please indicate them on the
- Document Feedback Card at the back of this book.
-
- If you are not a registered Macro Assembler owner, you should fill out and
- return the Registration Card. This enables Microsoft to keep you informed of
- updates and other information about the assembler.
-
-
-
-
-
-
- Chapter 1 Understanding Global Concepts
- ────────────────────────────────────────────────────────────────────────────
-
- Version 6.0 of the Microsoft Macro Assembler (MASM) gives you more options
- for approaching a programming task. This chapter explains the general
- concepts of programming in assembly
- language, beginning with the environment and reviewing the components you
- need to work in the assembler environment. Even if you are familiar with
- previous versions of MASM, you should examine this chapter for information
- on new terms and features.
-
- The first section of the chapter takes a look at the available processors
- and operating systems and how they work together. It also discusses the
- relationship of segmented architecture to assembly programming and the
- differences it makes for programming in OS/2 rather than in DOS.
-
- The second section describes some of the language components of MASM that
- are common to most programs, such as reserved words, constant expressions,
- operators, and registers. The rest of this book assumes that you understand
- the information presented in this section.
-
- The last section summarizes the assembly process, from assembling a program
- through running it. You can affect this process by the way you develop your
- code. Finally, this section explores how you can change the assembly process
- with the OPTION directive and conditional assembly.
-
- ────────────────────────────────────────────────────────────────────────────
- NOTE
-
- This manual does not cover information specific to programming for Microsoft
- Windows(tm). For information on this, see the Microsoft Windows Software
- Development Kit.
- ────────────────────────────────────────────────────────────────────────────
-
-
- 1.1 The Processing Environment
-
- The processing environment for MASM 6.0 includes the processor on which your
- programs run, the operating system your programs will use, and the aspects
- of the segmented architecture that influence the choice of programming
- models. This section summarizes these elements of the environment and how
- they affect your programming choices.
-
-
- 1.1.1 8086-Based Processors
-
- The 8086 "family" of processors uses segments to control data and code. The
- later 8086-based processors have larger instruction sets and more memory
- capacity, but they still use the same segmented architecture. Knowing the
- differences between the various 8086-based processors can help you select
- the target processor for your programs.
-
- The instruction set of the 8086 processor is upwardly compatible with its
- successors. To write code that runs on the widest range of machines, select
- the 8086 instruction set. By choosing to use the instruction set of a more
- advanced processor, you increase the capabilities and efficiency of your
- program, but you also reduce the number of systems on which the program can
- run.
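-
- As a minimal sketch of this trade-off, the fragment below assumes the .186
- processor directive (which is not described in this section) and contrasts
- two instruction forms introduced with the 80186 with their 8086-compatible
- equivalents. The registers and values are illustrative only.
-
```masm
        .MODEL  small
        .186                    ; accept 80186 instructions (the default is .8086)
        .CODE
        shl     ax, 4           ; immediate shift count greater than 1: 80186 and later
        push    100             ; PUSH with an immediate operand: 80186 and later

        mov     cl, 4           ; 8086-compatible equivalents of the two lines above
        shl     ax, cl          ; a shift count other than 1 must be in CL
        mov     bx, 100
        push    bx              ; PUSH takes a register or memory operand
        END
```
-
- Without the .186 line, the assembler should reject the first two
- instructions, while the last four assemble for every processor in the family.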
-
- Table 1.1 lists modes, memory, and segment size of processors on which your
- application may need to run. Each processor is discussed in more detail
- below.
-
- Table 1.1 8086 Family of Processors
-
-                Available            Addressable         Segment
- Processor      Modes                Memory              Size
- ────────────────────────────────────────────────────────────────────────────
- 8086/8088      Real                 1 megabyte          16 bit
-
- 80186/80188    Real                 1 megabyte          16 bit
-
- 80286          Real and Protected   16 megabytes        16 bit
-
- 80386          Real and Protected   4 gigabytes         16 or 32 bit
-
- 80486          Real and Protected   4 gigabytes         16 or 32 bit
-
- ────────────────────────────────────────────────────────────────────────────
-
-
-
- Processor Modes - Real mode allows only one process to run at a time. The
- DOS operating system runs in real mode. The OS/2 operating system can
- execute programs written for DOS, but is designed to provide capabilities
- available only in protected mode. In protected mode, more than one process
- can be active at any one time. Memory that belongs to one process is
- protected from access by the other processes.
-
- Protected-mode addresses do not correspond directly to physical memory.
- Under protected-mode operating systems, the processor allocates and manages
- memory dynamically. Additional privileged instructions initialize protected
- mode and control multiple processes. Section 1.1.2 provides more information
- on operating systems.
-
- 8086 and 8088 - The 8086 is faster than the 8088 because of its 16-bit data
- bus; the 8088 has only an 8-bit data bus. The 16-bit data bus allows you to
- use EVEN and ALIGN on an 8086 processor to word-align data and thus improve
- data-handling efficiency. Memory addresses on the 8086 and 8088 refer to
- actual physical addresses.
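-
- The following data fragment is a small sketch of word alignment with EVEN
- and ALIGN. The variable names are hypothetical, and the small memory model
- is assumed only so that the fragment can stand on its own.
-
```masm
        .MODEL  small
        .DATA
flag    BYTE    1               ; one byte, so the next free address is odd
        EVEN                    ; pad so that the next item starts at an even address
total   WORD    0               ; word-aligned
mark    BYTE    '*'
        ALIGN   2               ; same effect as EVEN
limit   WORD    100             ; word-aligned
        END
```
-
- On an 8088 the padding makes no difference to speed, because the 8-bit bus
- transfers each byte of a word separately.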
-
- 80186 and 80188 - These two processors are identical to the 8086 and 8088
- except that new instructions have been added and several old instructions
- have been optimized. These processors run significantly faster than the
- 8086.
-
- 80286 - The 80286 processor runs faster than its predecessors and provides
- an optional protected mode that the operating system can use to run multiple
- processes at the same time. It also adds instructions that control protected
- mode. The 80286 is the minimum for running 16-bit versions of OS/2.
-
- 80386 - Unlike its predecessors, the 80386 processor can handle both 16-bit
- and 32-bit data. It is fully software-compatible with the 80286. It
- implements many new hardware-level features, including virtual paged memory,
- multiple virtual 8086 processes, addressing of up to four gigabytes of
- memory, and specialized debugging registers.
-
- Under DOS, the 80386 supports all the instructions of the 80286 as well as
- several additional ones. It also allows limited use of 32-bit registers and
- addressing modes. The 80386 operates at faster processor speeds than the
- 80286 and is the minimum for running 32-bit versions of OS/2 and other
- 32-bit operating systems.
-
- 80486 - The 80486 processor is an enhanced version of the 80386, with
- instruction "pipelining" that executes many instructions two to three times
- faster. It incorporates an enhanced version of the 80387 coprocessor, as
- well as an 8K (kilobyte) memory cache. The 80486 includes several new
- instructions and is fully compatible with 80386 software.
-
- 8087, 80287, and 80387 - These math coprocessors work concurrently with the
- 8086 family of processors. Performing floating-point calculations with math
- coprocessors is up to 100 times faster than emulating the calculations with
- integer instructions. Although there are technical and performance
- differences among the three coprocessors, the main difference to the
- applications programmer is that the 80287 and 80387 can operate in protected
- mode. The 80387 also has several new instructions. The 80486 does not use
- any of these coprocessors; its floating-point processor is built in and is
- functionally equivalent to the 80387.
-
-
- 1.1.2 Operating Systems
-
- With MASM, you can create programs that run under DOS, Windows, or OS/2, or
- in some cases under all three. For example, ML.EXE can produce executable files
- that run in any of the target environments, regardless of the programmer's
- environment. For information on building programs for different
- environments, see "Building and Running Programs" in PWB's online help.
-
- DOS and OS/2 provide different processing modes. DOS uses the single-process
- real mode. OS/2 uses the multiple-process protected mode. While OS/2 can
- also run in real mode, this book assumes it is being used in protected mode.
-
-
- DOS and OS/2 differ primarily in system access methods, size of addressable
- memory, and segment selection. Table 1.2 summarizes these differences.
-
- Table 1.2 The DOS and OS/2 Operating Systems
-
- Operating       System       Active     Available     Contents of  Word
- System          Access       Processes  Addressable   Segment      Length
-                                         Memory        Register
- ─────────────────────────────────────────────────────────────────────────────
- DOS (and OS/2   Direct to    One        1 megabyte    Actual       16 bit
- 1.x real mode)  hardware                              address
-
- OS/2 1.x        Operating    Multiple   16 megabytes  Segment      16 bit
- protected mode  system call                           selectors
-
- OS/2 2.x        Operating    Multiple   4 gigabytes   Segment      32 bit
-                 system call                           selectors
-
- ─────────────────────────────────────────────────────────────────────────────
-
-
- DOS - In real-mode programming, you can access system functions by calling
- DOS, calling the basic input/output system (BIOS), or directly addressing
- hardware. DOS functions are called through interrupt 21h.
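-
- As an illustration of a DOS call through interrupt 21h, the short program
- below displays a string. It is a sketch rather than a recommended pattern:
- it assumes DOS Function 09h (Display String), which is not described in this
- section, and it relies on the .STARTUP and .EXIT directives to set up the
- segment registers and return to DOS. The name msg is hypothetical.
-
```masm
        .MODEL  small
        .STACK                          ; default stack
        .DATA
msg     BYTE    "Hello from MASM", 13, 10, "$"  ; Function 09h stops at the $

        .CODE
        .STARTUP                        ; initialize DS and the stack
        mov     ah, 09h                 ; DOS function number goes in AH
        mov     dx, OFFSET msg          ; DS:DX points to the string
        int     21h                     ; call DOS
        .EXIT   0                       ; return to DOS with exit code 0
        END
```
-
- The same pattern applies to other DOS functions: load the function number
- into AH, load any arguments into the registers that the function expects,
- and execute INT 21h.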
-
- Protected-mode programs cannot directly access hardware ports.
-
- OS/2 1.x - As you can see in Table 1.2, protected mode allows for much
- larger data structures than real mode, since the addressable memory is
- extended to 16 megabytes. In protected mode, segment registers contain
- segment selectors rather than actual segment values. These selectors cannot
- be calculated by the program; they must be obtained by calling the operating
- system. Programs that attempt to calculate segment values or to address
- memory directly do not work.
-
- Note that protected-mode operating systems such as XENIX (R) and OS/2
- provide system functions for memory and hardware accesses that would be
- prohibited with direct processor commands. This software interface permits
- access without the possibility of corrupting memory or crashing the system.
-
-
- Protected mode uses privilege levels to maintain system integrity and
- security. Programs cannot access data or code that is in a higher privilege
- level. Some instructions that directly access ports or enable and disable
- interrupts (such as CLI, STI, IN, and OUT) are available only at privilege
- levels normally used by systems programmers.
-
- OS/2 protected mode enforces the separation of segment values. The segments
- have selectors that have no relationship to the offset. The operating system
- combines the segment and offset so that your programs can address up to 16
- megabytes of virtual memory in a 16-bit system.
-
- OS/2 2.x and flat model eliminate segments.
-
- OS/2 2.x - OS/2 2.x uses an unsegmented architecture. (See Section 1.1.3.)
- It creates a "flat model" in which the entire address space is within one
- 32-bit segment. Section 2.2.1, "Defining Basic Attributes with .MODEL,"
- explains how to use the flat model. In a 32-bit system, you can access up to
- four gigabytes of virtual memory. (The term "virtual memory" means that if
- the programs running under OS/2 request more memory than is physically
- available, part of the memory is temporarily swapped out to disk.) Since
- code, data, and stack are in the same segment, the value of segment
- registers never needs to change. Internal mechanisms of OS/2 2.x implement
- protection at a lower level.
-
-
- 1.1.3 Segmented Architecture
-
- The 8086 processors differ from many other microprocessors in that they use
- a segmented architecture; that is, each address is represented in two
- parts, a segment and an offset. Segmented addresses affect many aspects of
- assembly-language programming, especially addresses and pointers.
-
- Only 64K of data can be addressed by a 16-bit segment address.
-
- Segmented architecture was originally designed to enable a 16-bit processor
- to access an address space larger than 64K. (Section 1.1.5, "Segmented
- Addressing," explains how the processor uses both the segment and offset to
- create addresses larger than 64K.) DOS is an example of an operating system
- that uses segmented architecture on a 16-bit processor.
-
- With the advent of protected-mode processors such as the 80286, segmented
- architecture gained a second purpose. Segments can separate different blocks
- of code and data to protect them from undesirable interactions. OS/2 1.x is
- an operating system that takes advantage of the protection features of the
- 16-bit segments on the 80286.
-
- Segmented architecture went through another significant change with the
- release of 32-bit processors, starting with the 80386. These processors are
- backward compatible with the older 16-bit processors, but they also offer a
- 32-bit mode that minimizes the memory limitations of a 16-bit segmented
- architecture. Both offer paging to maintain segment protection. XENIX 386 is
- an example of a 32-bit segmented operating system using segment protection.
-
-
- OS/2 2.x takes advantage of the 32-bit processors to allow a nonsegmented
- memory configuration. The processor still uses 32-bit segments, but from the
- user's viewpoint, there is only one segment. The flat memory model used by
- OS/2 2.x places code and data in a single segment. See Section 2.2.1,
- "Defining Basic Attributes with .MODEL," for more information about the flat
- memory model.
-
-
- 1.1.4 Segment Protection
-
- Segmented architecture is an important part of the OS/2 memory-protection
- scheme. In a "multitasking" operating system where numerous programs can run
- simultaneously, programs must not access the code and data of another
- process without permission.
-
- In DOS, the data and code segments are usually allocated adjacent to each
- other, as shown in Figure 1.1. In OS/2, the data and code segments may be
- anywhere in memory. The programmer knows nothing about their location and
- has no control over it. The segments may even be moved to a new memory
- location or swapped to disk while the program is running.
-
- (This figure may be found in the printed book.)
-
- Segment protection prevents a bug in one program from corrupting another
- program.
-
- Segment protection makes software development easier and more reliable in
- OS/2 than in DOS because, in OS/2, any illegal access is detected
- immediately. The operating system intercepts illegal memory accesses,
- terminates the program, and displays a message. This makes the bug easier to
- track down and fix.
-
- In DOS, an illegal access is not detected and may not cause an error until
- later, when another part of the program attempts to use the corrupted
- memory.
-
-
- 1.1.5 Segmented Addressing
-
- Segmented addressing is the internal mechanism that combines a segment value
- and an offset value to create an address. The two parts of an address are
- represented as
-
- segment:offset
-
- The segment portion is always a 16-bit value. The offset portion is a 16-bit
- value in 16-bit mode or a 32-bit value in 32-bit mode.
-
- In real mode, the segment value is a physical address that has an arithmetic
- relationship to the offset value. The segment and offset together create a
- 20-bit physical address (explained in the next section). Although 20-bit
- addresses can access up to one megabyte of memory, the operating system on
- IBM (R) PCs and compatibles uses part of this memory, leaving 640K of memory
- for programs.
-
-
- 1.1.6 Segment Arithmetic
-
- Manipulating segment and offset addresses directly in real-mode programming
- is called "segment arithmetic." Programs that perform segment arithmetic are
- not portable to protected-mode operating systems, where addresses do not
- correspond to a known segment and offset.
-
- To perform segment arithmetic successfully, it helps to understand how the
- processor combines a 16-bit segment and a 16-bit offset to form a 20-bit
- linear address. In effect, the segment selects a 64K region of memory, and
- the offset selects the byte within that region. Here's how it works:
-
-
- 1. The processor shifts the segment address to the left by four binary
- places, producing a 20-bit address ending in four zeros. This
- operation has the effect of multiplying the segment address by 16.
-
- 2. The processor adds this 20-bit segment address to the 16-bit offset
- address. The offset address is not shifted.
-
- 3. The processor uses the resulting 20-bit address, often called the
- "physical address," to access an actual location in the one-megabyte
- address space.
-
-
- Figure 1.2 illustrates this process.
-
- (This figure may be found in the printed book.)
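-
- The three steps above can be sketched in code for real mode. The procedure
- name PhysAddr and the register assignments are hypothetical; the routine
- uses only 8086 instructions and changes CL as a side effect.
-
```masm
        .MODEL  small
        .CODE
; Hypothetical helper: returns in DX:AX the physical address that the
; real-mode pair ES:DI refers to.
PhysAddr PROC
        mov     ax, es
        mov     dx, ax
        mov     cl, 4
        shl     ax, cl          ; AX = low 16 bits of (segment * 16)
        mov     cl, 12
        shr     dx, cl          ; DX = bits 16-19 of (segment * 16)
        add     ax, di          ; add the unshifted offset
        adc     dx, 0           ; carry any overflow into the high word
        ret
PhysAddr ENDP
        END
```
-
- With ES = 0F00h and DI = 0800h, the routine returns DX = 0000h and
- AX = 0F800h, that is, the physical address 0F800h.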
-
- A 20-bit physical address can be specified by as many as 4,096 equivalent
- segment:offset addresses. For example, the 20-bit physical address 0F800h is
- equivalent to 0000:F800, 0F00:0800, or 0F80:0000.
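-
- The equivalence can be checked at assembly time with constant expressions.
- In this sketch the symbol names phys1 through phys3 are hypothetical, and
- the .ERRNZ directive (not described in this section) is assumed to report an
- error when its expression is nonzero.
-
```masm
phys1   EQU     (0000h SHL 4) + 0F800h  ; 0000:F800
phys2   EQU     (0F00h SHL 4) + 0800h   ; 0F00:0800
phys3   EQU     (0F80h SHL 4) + 0000h   ; 0F80:0000

        .ERRNZ  phys1 - 0F800h          ; each line reports an error only if
        .ERRNZ  phys2 - 0F800h          ; the pair does not yield 0F800h
        .ERRNZ  phys3 - 0F800h
        END
```
-
- Each expression shifts the segment left by four bits and adds the offset,
- so all three evaluate to 0F800h and the file assembles without error.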
-
- To write TSRs (see Chapter 19), to write code that handles huge arrays, or
- to determine the size of an area of memory, you may need to convert two
- segmented addresses that have different segments into equivalent addresses
- that share the same segment.
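-
- One common way to make real-mode addresses comparable is to normalize them
- so that the offset is always less than 16. The routine below is a sketch of
- that technique, not a method prescribed by this manual; the name Normalize
- and the use of DX:AX are assumptions, and the routine changes BX and CL.
-
```masm
        .MODEL  small
        .CODE
; Hypothetical helper: normalizes the real-mode pointer in DX:AX
; (segment in DX, offset in AX) so that the offset is less than 16.
Normalize PROC
        mov     bx, ax
        mov     cl, 4
        shr     bx, cl          ; BX = whole 16-byte paragraphs in the offset
        add     dx, bx          ; move those paragraphs into the segment
        and     ax, 0Fh         ; keep only the remainder as the offset
        ret
Normalize ENDP
        END
```
-
- Given 0F00:0800 or 0000:F800, the routine returns 0F80:0000 in both cases.
- Like all segment arithmetic, it works only in real mode.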
-
-
- 1.2 Language Components of MASM
-
- Programming with MASM requires that you understand the MASM concepts of
- reserved words, identifiers, predefined symbols, constants, expressions,
- operators, data types, registers, and statements. This section defines
- important terms and provides lists that summarize these topics. See online
- help or the MASM Reference for detailed information.
-
-
- 1.2.1 Reserved Words
-
- A reserved word has a special meaning fixed by the language. You can use it
- only under certain conditions. MASM's reserved words include:
-
-
- ■ Instructions, which correspond to operations the processor can execute
-
- ■ Directives, which give commands to the assembler
-
- ■ Attributes, which provide a value for a field, such as segment
- alignment
-
- ■ Operators, which are used in expressions
-
- ■ Predefined symbols, which return information to your program
-
-
- MASM reserved words are not case sensitive except for predefined symbols
- (see Section 1.2.3).
-
-
- The assembler generates an error if you use a reserved word as a variable,
- code label, or other identifier within your source code. However, if you
- need to use a reserved word for another purpose, the OPTION NOKEYWORD
- directive can selectively disable a word's status as a reserved word.
-
- For example, to remove the STR instruction, the MASK operator, and the NAME
- directive from the set of words MASM recognizes as reserved, use this
- statement in the code segment of your program prior to the first reference
- to STR, MASK, or NAME:
-
- OPTION NOKEYWORD:<STR MASK NAME>
-
- The OPTION directive is discussed in Section 1.3.2. Appendix D provides a
- complete list of MASM reserved words.
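-
- The fragment below sketches what the directive makes possible. The three
- data definitions are hypothetical; each uses as an identifier a word that
- would otherwise be reserved.
-
```masm
        OPTION  NOKEYWORD:<STR MASK NAME>

        .MODEL  small
        .DATA
str     BYTE    "sample", 0     ; STR would otherwise be an instruction
mask    WORD    00FFh           ; MASK would otherwise be an operator
name    BYTE    32 DUP (?)      ; NAME would otherwise be a directive
        END
```
-
- After the OPTION statement, the STR instruction, the MASK operator, and the
- NAME directive are no longer available under those names in this source
- file.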
-
-
- 1.2.2 Identifiers
-
- Identifiers name variables, constants, procedures, labels, segments, and types.
-
- An identifier is a name that you invent and attach to a definition.
- Identifiers can be symbols representing variables, constants, procedure
- names, code labels, segment names, and user-defined data types such as
- structures, unions, records, and types defined with TYPEDEF. Identifiers
- longer than 247 characters generate an error.
-
- Certain restrictions limit the names you can use for identifiers. Follow
- these rules to define a name for an identifier:
-
-
- ■ The first character of the identifier can be an alphabetic character
- (A-Z) or any of these four characters: @ _ $ ?
-
- ■ The other characters in the identifier can be any of the characters
- listed above or a decimal digit (0-9)
-
-
- Avoid starting an identifier with the at sign (@), because MASM 6.0
- predefines some special symbols starting with @ (see Section 1.2.3).
- Beginning an identifier with @ may also cause conflicts with future versions
- of the Macro Assembler.
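-
- The rules above can be illustrated with a few data definitions. All of the
- names are hypothetical. Following the conventions of this manual, the
- semicolon in the first column marks the one definition that is illegal.
-
```masm
        .MODEL  small
        .DATA
total       WORD    0           ; begins with a letter
_count      WORD    0           ; begins with an underscore
$amount     WORD    0           ; begins with a dollar sign
?pending    BYTE    0           ; begins with a question mark
item2       BYTE    0           ; digits are allowed after the first character
; 2ndItem   BYTE    0           ; begins with a digit
        END
```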
-
- The symbol, and thus the identifier, is visible as long as it remains within
- scope. (See Section 8.2, "Sharing Symbols with Include Files," for
- additional information about visibility and scope.)
-
-
- 1.2.3 Predefined Symbols
-
- Macros and conditional-assembly blocks often use predefined symbols.
-
- The assembler includes a number of predefined symbols (also called
- predefined equates). You can use these symbol names at any point in your
- code to represent the equate value. For example, the predefined equate
- @FileName represents the base name of the main file being assembled. If that
- source file is TASK.ASM, the value of @FileName is TASK. The MASM predefined
- symbols are listed below according to the kinds of information they provide.
- Case is important only if the /Cp option is used. (See online help on ML
- command-line options for additional details.)
-
-
- Predefined Symbols for Segment Information
-
- Symbol Description
- ────────────────────────────────────────────────────────────────────────────
- @code Provides the name of the code segment,
- except in tiny model when it returns
- DGROUP.
-
- @CodeSize Returns an integer representing the
- default code distance.
-
- @CurSeg Returns the name of the current segment.
-
- @data Expands to DGROUP except in flat model.
-
- @DataSize Returns an integer representing the
- default data distance.
-
- @fardata Represents the name of the segment
- defined by the .FARDATA directive.
-
- @fardata? Represents the name of the segment
- defined by the .FARDATA? directive.
-
- @Model Returns the selected memory model.
-
- @stack Expands to DGROUP for near stacks or
- STACK for far stacks. (See Section 2.2.3,
- "Creating a Stack.")
-
- @WordSize Provides the size attribute of the
- current segment.
-
-
-
-
- Predefined Symbols for Environment Information
-
- Symbol Description
- ────────────────────────────────────────────────────────────────────────────
- @Cpu Contains a bit mask specifying the
- processor mode.
-
- @Environ Returns values of environment variables.
-
- @Interface Contains information about the language
- parameters.
-
- @Version Represents the text equivalent of the
- MASM version number. In MASM 6.0, this
- expands to 600.
-
-
-
- Predefined Symbols for Date and Time Information
-
- Symbol Description
- ────────────────────────────────────────────────────────────────────────────
- @Date Supplies the current system date.
- @Time Supplies the current system time.
-
-
- Predefined Symbols for File Information
-
- Symbol Description
- ────────────────────────────────────────────────────────────────────────────
- @FileCur Names the current file (base and suffix).
-
- @FileName Names the base name of the main file
- being assembled as it appears on the
- command line.
-
- @Line Gives the source line number in the
- current file.
-
-
-
- Predefined Functions for Macro String Manipulation
-
- Symbol Description
- ────────────────────────────────────────────────────────────────────────────
- @CatStr Returns concatenation of two strings.
- @InStr Returns the starting position of a string within another string.
- @SizeStr Returns the length of a given string.
- @SubStr Returns substring from a given string.
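-
- The fragment below sketches how predefined symbols can drive conditional
- assembly. It assumes that @DataSize is 0 when near data is the default and
- nonzero otherwise, and that a percent sign at the start of a line expands
- the text macros in an ECHO message; neither detail is described in this
- section. The names items and itemPtr are hypothetical.
-
```masm
        .MODEL  small           ; change to large to assemble the other branch

%       ECHO    Assembling @FileName with MASM version @Version

        .DATA
items   WORD    10, 20, 30
IF      @DataSize               ; nonzero when far or huge data is the default
itemPtr DWORD   items           ; far pointer: segment and offset
ELSE
itemPtr WORD    items           ; near pointer: offset only
ENDIF

        .CODE
IF      @DataSize
        les     bx, itemPtr     ; load ES:BX from the far pointer
        mov     ax, es:[bx]
ELSE
        mov     bx, itemPtr     ; load BX from the near pointer
        mov     ax, [bx]
ENDIF
        END
```
-
- Because the test is made at assembly time, only one branch of each IF block
- is assembled into the object file.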
-
-
- 1.2.4 Integer Constants and Constant Expressions
-
- An integer constant is a series of one or more numerals followed by an
- optional radix specifier. For example, in these statements
-
- mov ax, 25
- mov ax, 0B3h
-
- the numbers 25 and 0B3h are integer constants. The h appended to 0B3
- is a radix specifier. The specifiers are
-
-
- ■ y for binary (or b if radix is less than or equal to 10)
-
- ■ o or q for octal
-
- ■ t for decimal (or d if radix is less than or equal to 10)
-
- ■ h for hexadecimal
-
-
- Radix specifiers can be either uppercase or lowercase letters; sample code
- in this book uses lowercase. If no radix is specified, the assembler
- interprets the integer according to the current radix. The default radix is
- decimal, but it can be changed with the .RADIX directive.
-
- Hexadecimal numbers must always start with a decimal digit (0-9). If
- necessary, add a leading zero to distinguish between symbols and hexadecimal
- numbers that start with a letter. For example, ABCh is interpreted as an
- identifier. The hexadecimal digits A through F can be either uppercase or
- lowercase letters. Sample code in this book uses uppercase letters.
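-
- As a brief sketch of these rules, every instruction below loads the same
- value, 255 decimal, into AX. The choice of register and value is
- illustrative only.
-
```masm
        .MODEL  small
        .CODE
        mov     ax, 11111111y   ; binary
        mov     ax, 377o        ; octal (377q is equivalent)
        mov     ax, 255t        ; decimal
        mov     ax, 255         ; no specifier: current radix, decimal by default
        mov     ax, 0FFh        ; hexadecimal; the leading 0 marks it as a number

        .RADIX  16              ; change the current radix to hexadecimal
        mov     ax, 0FF         ; no specifier: now read as hexadecimal
        mov     ax, 255t        ; an explicit specifier overrides the current radix
        END
```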
-
- Values of integer constants and expressions are known at assembly time.
-
- Constant expressions contain integer constants and, optionally, operators
- such as shift, logical, and arithmetic operators. The assembler evaluates
- them at assembly time, so they do not change value during program execution.
- (In addition to constants, expressions can contain labels, types, registers,
- and their attributes.)
-
- Symbolic Integer Constants - You can define symbolic integer constants with
- either of the data assignment directives, EQU or the equal sign (=). These
- directives assign values to symbols during assembly, not during program
- execution. Symbols defined as integer constants can then be used in
- subsequent statements as immediate operands having the assigned value.
- Symbolic constants are often used to assign mnemonic names to constant
- values, which makes your code more readable and easier to maintain.
-
- The assembler does not allocate data storage when you use either EQU or =.
- Instead, it replaces each occurrence of the symbol with the value of the
- expression.
-
- The difference between EQU and = is that integers defined with the =
- directive can be changed in your source code, but those defined with EQU
- cannot. Once a symbolic integer constant has been defined with the EQU
- directive, attempting to redefine it generates an error. The syntax for the
- EQU directive is
-
- symbol EQU expression
-
- The symbol must be a unique name. The expression can be an integer, a
- constant expression, a one- or two-character string constant (four-character
- on the 80386/486), or an expression that evaluates to an address. If a
- constant value used in numerous places in the source code needs to be
- changed, you modify the expression in one place rather than throughout the
- source code.
-
- The following example shows the correct use of EQU to define symbolic
- integers.
-
- column EQU 80 ; Constant - 80
- row EQU 25 ; Constant - 25
- screen EQU column * row ; Constant - 2000
- line EQU row ; Constant - 25
-
- .DATA
-
- .CODE
- .
- .
- .
- mov cx, column
- mov bx, line
-
- The value of a symbol defined with the = directive can be different at
- different places in the source code. However, a constant value is assigned
- during assembly for each use, and that value does not change at run time.
-
- The syntax for the = directive is
-
- symbol = expression
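-
- The following sketch shows one use of a redefinable symbol: building a
- table of powers of two. The names value and powers are hypothetical, and the
- REPEAT block (a repeat loop closed by ENDM) is assumed here; it is not
- described in this section.
-
```masm
        .MODEL  small
        .DATA
value   =       1               ; assembly-time symbol; no storage is allocated
powers  LABEL   WORD            ; names the start of the table
        REPEAT  8               ; assemble the enclosed statements eight times
        WORD    value           ; store the current value of the symbol
value   =       value * 2       ; redefine the symbol for the next repetition
        ENDM
        END
```
-
- The table holds the words 1, 2, 4, 8, 16, 32, 64, and 128. Defining value
- with EQU instead would produce an error at the first redefinition.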
-
- Size of Constants - The default word size for MASM 6.0 expressions is 32
- bits. This behavior can be modified using OPTION EXPR16 or OPTION M510. Both
- of these options set the expression word size to 16 bits, but OPTION M510
- affects other assembler behavior as well (see Appendix A).
-
- It is illegal to change the expression word size once it has been set with
- OPTION M510, OPTION EXPR16, or OPTION EXPR32, but you can repeat the same
- directive in a file. This can be useful for putting an OPTION EXPR16 in
- every include file, for example.
-
-
- 1.2.5 Operators
-
- Operators are used in expressions. The value of the expression is determined
- at assembly time and does not change when the program runs.
-
- Operators should not be confused with processor instructions. The reserved
- word ADD is an instruction. The plus sign (+) is an operator. For example,
- Amount+2 is a valid use of the plus operator (+); it tells the assembler to
- add 2 to Amount, which might be a value or an address. This operation,
- which occurs at assembly time, is different from the ADD instruction, which
- tells the processor to perform addition at run time.
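-
- The difference can be seen in a short fragment. This is a sketch that
- assumes Amount is an array of two words and that DS already addresses the
- data segment.
-
```masm
        .MODEL  small
        .DATA
Amount  WORD    10, 20          ; two adjacent words
        .CODE
        mov     ax, Amount+2    ; assembly time: address of Amount plus 2 bytes,
                                ; so this loads the second word (20)
        add     ax, 2           ; run time: the processor adds 2 to AX (22)
        END
```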
-
- The assembler evaluates expressions that contain more than one operator
- according to the following rules:
-
-
- ■ Operations in parentheses are always performed before any adjacent
- operations.
-
- ■ Binary operations of highest precedence are performed first.
-
- ■ Operations of equal precedence are performed from left to right.
-
- ■ Unary operations of equal precedence are performed right to left.
-
-
- The order of precedence for all operators is listed in Table 1.3. Operators
- on the same line have equal precedence.
-
- Table 1.3 Operator Precedence
-
- Precedence          Operators
- ────────────────────────────────────────────────────────────────────────────
- 1                   ( ), [ ]
- 2                   LENGTH, SIZE, WIDTH, MASK
- 3                   . (structure-field-name operator)
- 4                   : (segment-override operator), PTR
- 5                   LROFFSET, OFFSET, SEG, THIS, TYPE
- 6                   HIGH, HIGHWORD, LOW, LOWWORD
- 7                   +, - (unary)
- 8                   *, /, MOD, SHL, SHR
- 9                   +, - (binary)
- 10                  EQ, NE, LT, LE, GT, GE
- 11                  NOT
- 12                  AND
- 13                  OR, XOR
- 14                  OPATTR, SHORT, .TYPE
- ────────────────────────────────────────────────────────────────────────────
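-
- The constants below sketch how these rules combine. The symbol names are
- hypothetical; each comment gives the value that the rules above produce:
-
- val1       =       8 / 4 * 2       ; Equal precedence, left to right: 4
- val2       =       8 / (4 * 2)     ; Parentheses first: 1
- val3       =       2 + 3 * 4       ; * before binary +: 14
- val4       =       -2 + 6          ; Unary - before binary +: 4
- val5       =       6 OR 1 AND 3    ; AND before OR: 7
- val6       =       NOT 0 AND 0Fh   ; NOT before AND: 0Fh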
-
-
-
- 1.2.6 Data Types
-
- A "data type" describes a set of values. A variable of a given type can have
- any of a set of values within the range specified for that type.
-
- The intrinsic types for MASM 6.0 are BYTE, SBYTE, WORD, SWORD, DWORD,
- SDWORD, FWORD, QWORD, and TBYTE. These types define integers and binary
- coded decimals (BCDs); they are discussed in Chapter 6. The signed data
- types SBYTE, SWORD, and SDWORD are new to MASM 6.0. They are useful in
- conjunction with directives such as INVOKE (for calling procedures) and .IF
- (introduced in Chapter 7). The REAL4, REAL8, and REAL10 directives can be
- used to define floating-point types. See Chapter 6.
-
- Previous versions of MASM have separate directives for types and
- initializers. For example, BYTE is a type and DB is the corresponding
- initializer. The distinction has been eliminated for MASM 6.0. Any type
- (intrinsic or user-defined) can be used as an initializer.
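-
- For instance, the data declarations below are a sketch of the old and new
- forms side by side. The variable names are hypothetical, and the fragment
- assumes it follows a .MODEL directive:
-
-            .DATA
- oldStyle   DB      1               ; Version 5.1 initializer directive
- newStyle   BYTE    1               ; Type used as an initializer
- delta      SWORD   -5              ; Signed 16-bit integer
- big        SDWORD  -100000         ; Signed 32-bit integer
- ratio      REAL4   1.5             ; 4-byte floating-point value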
-
- MASM does not have specific types for arrays and strings. However, it allows
- a sequence of data units to be treated as arrays, and character (byte)
- sequences to be treated as strings. (See Section 5.1, "Arrays and Strings.")
-
-
- Types can also have attributes such as langtype and distance (NEAR and FAR).
- See Section 7.3.3, "Declaring Parameters with the PROC Directive," for
- information on these attributes.
-
- You can also define your own types with STRUCT, UNION, and RECORD. The types
- have fields that contain string or numeric data, or records that contain
- bits. These data types are similar to the user-defined data types in
- high-level languages such as C, Pascal, and FORTRAN. (See Chapter 5,
- "Defining and Using Complex Data Types.")
-
- The TYPEDEF directive defines aliases and pointer types.
-
- You can define new types, including pointer types, with the TYPEDEF
- directive, which is also new to MASM 6.0. TYPEDEF assigns a qualifiedtype
- (explained below) to a typename.
-
- ────────────────────────────────────────────────────────────────────────────
- NOTE
-
- The concept of the qualifiedtype is essential to understanding many of the
- new features in MASM 6.0, including prototypes and the .IF and INVOKE
- directives. Descriptions of these topics in later chapters refer to this
- section.
- ────────────────────────────────────────────────────────────────────────────
-
- Once assigned, the typename can be used as a data type in your program. Use
- of the qualifiedtype also allows the CodeView debugger to display
- information on the type. You cannot use a qualifiedtype as an initializer,
- but you can use a type defined with TYPEDEF.
-
- The qualifiedtype is any MASM type (such as structure types, union types,
- record types, or an intrinsic type) or can be a pointer to a type with the
- form
-
- «distance» PTR «qualifiedtype»
-
- where distance is NEAR, FAR, or any distance modifier. See Section 7.3.3,
- "Declaring Parameters with the PROC Directive," for more information on
- distance.
-
- The qualifiedtype can also be any type previously defined with TYPEDEF. For
- example, if you use TYPEDEF to create an alias for BYTE, as shown below,
- then you can use that CHAR type as a qualifiedtype when defining the
- pointer type PCHAR.
-
- CHAR TYPEDEF BYTE
- PCHAR TYPEDEF PTR CHAR
-
- Section 3.3, "Accessing Data with Pointers and Addresses," shows how to use
- the TYPEDEF directive to define pointers.
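-
- Once defined, CHAR and PCHAR can declare variables like any other type.
- The following is a minimal sketch; letter and pLetter are hypothetical
- names, and small model is assumed so that PTR yields a near pointer:
-
-            .MODEL  small
- CHAR       TYPEDEF BYTE
- PCHAR      TYPEDEF PTR CHAR
-            .DATA
- letter     CHAR    'A'             ; CHAR used as a type and initializer
- pLetter    PCHAR   letter          ; Near pointer initialized to the
-                                    ;   address of letter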
-
- Since distance and qualifiedtype are optional syntax elements, you can use
- variables of type PTR or FAR PTR. You can also define procedure prototypes
- with qualifiedtype. See Section 7.3.6, "Declaring Procedure Prototypes," for
- more information about procedure prototypes.
-
- Several rules govern the use of qualifiedtype:
-
-
- ■ The only component of a qualifiedtype definition that can be forward
- referenced is a structure or union type identifier.
-
- ■ If distance is not specified, the right operand and current memory
- model determine the type of the pointer. If the operand following PTR
- is not a distance or a function prototype, the operand is a pointer of
- the default data pointer type in the current mode. Otherwise, the type
- of the pointer is the distance of the right operand.
-
- ■ If .MODEL is not specified, SMALL model (and therefore NEAR pointers)
- is the default.
-
-
- A qualifiedtype can be used in seven places:
-
- Use                                   Example
- ────────────────────────────────────────────────────────────────────────────
- In procedure arguments                proc1 PROC pMsg:PTR BYTE
-
- In prototype arguments                proc2 PROTO pMsg:FAR PTR WORD
-
- With local variables declared inside  LOCAL pMsg:PTR
- procedures
-
- With the LABEL directive              TempMsg LABEL PTR WORD
-
- With the EXTERN and EXTERNDEF         EXTERN pMsg:FAR PTR BYTE
- directives                            EXTERN MyProc:PROTO
-
- With the COMM directive               COMM var1:WORD:3
-
- With the TYPEDEF directive            PPBYTE TYPEDEF PTR PBYTE
-                                       PFUNC TYPEDEF PROTO MyProc
-
-
- Section 3.3.1 shows ways to write a TYPEDEF type for a qualifiedtype.
- Attributes such as NEAR and FAR can also be applied to a qualifiedtype.
-
- You can also determine an accurate definition for TYPEDEF and qualifiedtype
- from the BNF grammar definitions given in Appendix B. The BNF grammar
- defines each component of the syntax for any directive, showing the
- recursive properties of components such as qualifiedtype.
-
-
- 1.2.7 Registers
-
- All the 8086-family processors have the same base set of 16-bit registers.
- Some registers can be accessed as two separate 8-bit registers. In the
- 80386/486, most registers can also be accessed as extended 32-bit registers.
-
- Figure 1.3 shows the registers common to all the 8086-based processors. Each
- register has its own special uses and limitations.
-
- (This figure may be found in the printed book.)
-
- 80386/486 Only - The 80386/486 processors use the same 8-bit and 16-bit
- registers that the rest of the 8086 family uses. All of these registers can
- be further extended to 32 bits, except segment registers, which always
- occupy 16 bits. The extended register names begin with the letter "E." For
- example, the 32-bit extension of AX is EAX. The 80386/486 processors have
- two additional segment registers, FS and GS. Figure 1.4 shows the extended
- registers of the 80386/486.
-
- (This figure may be found in the printed book.)
-
-
- 1.2.7.1 Segment Registers
-
- At run time, all addresses are relative to one of four segment registers:
- CS, DS, SS, or ES. (The 80386/486 processors add two more, FS and GS.) These
- registers, their segments, and their purpose are listed below:
-
- Register and Segment    Purpose
- ────────────────────────────────────────────────────────────────────────────
- CS (Code Segment)       Contains processor instructions and their
-                         immediate operands.
-
- DS (Data Segment)       Normally contains data allocated by the program.
-
- SS (Stack Segment)      Contains the stack used by PUSH, POP, CALL, and
-                         RET.
-
- ES (Extra Segment)      References secondary data segment. Used by string
-                         instructions.
-
- FS, GS                  Provides extra segments on the 80386/486.
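-
- A program usually loads DS itself and can override the default segment
- for a single operand. The fragment below is a sketch that assumes
- simplified segment directives are in use (so that @data is defined) and
- that BX already holds a valid offset:
-
-            mov     ax, @data       ; Segment address of the data segment
-            mov     ds, ax          ; DS is loaded through a register
-            mov     al, [bx]        ; Addressed relative to DS by default
-            mov     al, es:[bx]     ; Override: addressed relative to ES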
-
-
- 1.2.7.2 General-Purpose Registers
-
- Operations on registers are usually faster than operations on memory
- locations.
-
- The AX, DX, CX, BX, BP, DI, and SI registers are 16-bit general-purpose
- registers. They can be used for temporary data storage. Since the processor
- accesses registers more quickly than it can access memory, you can speed up
- execution by keeping the most frequently used data in registers.
-
- The 8086 family of processors does not perform memory-to-memory operations.
- Thus, operations on more than one variable often require the data to be
- moved into registers.
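-
- Copying one variable to another therefore takes two instructions, as in
- this sketch (var1 and var2 are hypothetical word variables):
-
-            mov     ax, var1        ; Memory to register
-            mov     var2, ax        ; Register to memory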
-
- Four of the general registers, AX, DX, CX, and BX, can be accessed either as
- two 8-bit registers or as a single 16-bit register. The AH, DH, CH, and BH
- registers represent the high-order 8 bits of the corresponding registers.
- Similarly, AL, DL, CL, and BL represent the low-order 8 bits of the
- registers. All the general registers can be extended to 32 bits on the
- 80386/486.
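-
- The following sketch shows how the 8-bit halves relate to the full 16-bit
- register; the values are arbitrary:
-
-            mov     ax, 1234h       ; AH = 12h, AL = 34h
-            mov     al, 0FFh        ; AX = 12FFh; AH is unchanged
-            mov     ah, 0           ; AX = 00FFh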
-
-
- 1.2.7.3 Special-Purpose Registers
-
- The 8086 family of processors has two additional registers whose values are
- changed automatically by the processor.
-
- SP (Stack Pointer) - The SP register points to the current location within
- the stack segment. Pushing a value onto the stack decreases the value of SP
- by 2; popping from the stack increases the value of SP by 2. With 32-bit
- operands on 80386/486 processors, SP is increased or decreased by 4 instead
- of 2. Call instructions store the calling address on the stack and decrease
- SP accordingly; return instructions get the stored address and increase SP.
- SP can also be manipulated as a general-purpose register with instructions
- such as ADD.
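-
- The effect on SP can be sketched with 16-bit operands as follows:
-
-            push    ax              ; SP = SP - 2; AX is stored at SS:SP
-            push    bx              ; SP = SP - 2
-            pop     bx              ; BX is restored; SP = SP + 2
-            pop     ax              ; AX is restored; SP = SP + 2
-            sub     sp, 4           ; Reserve 4 bytes of stack space
-            add     sp, 4           ; Release them again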
-
- Only the processor can change IP.
-
- IP (Instruction Pointer) - The IP register always contains the address of
- the next instruction to be executed. You cannot directly access or change
- the instruction pointer. However, instructions that control program flow
- (such as calls, jumps, loops, and interrupts) automatically change the
- instruction pointer.
-
-
- 1.2.7.4 Flags Register
-
- Flags reveal the status of the processor.
-
- The 16 bits in the flags register control the execution of certain
- instructions and reflect the current status of the processor. In 80386/486
- processors, the flags register is extended to 32 bits. Some bits are
- undefined, so there are actually 9 flags for real mode, 11 flags (including
- a 2-bit flag) for 80286 protected mode, 13 for the 80386, and 14 for the
- 80486. The extended flags register of the 80386/486 is sometimes called
- "Eflags."
-
- Figure 1.5 shows the bits of the 32-bit flags register for the 80386/486.
- Only the lower word is used for the other 8086-family processors. The
- unmarked bits are reserved for processor use; do not modify them.
-
- (This figure may be found in the printed book.)
-
- The nine flags common to all 8086-family processors are summarized below,
- starting with the low-order flags. In these descriptions, "set" means the
- bit value is 1, and "cleared" means the bit value is 0.
-
- Flag                    Description
- ────────────────────────────────────────────────────────────────────────────
- Carry                   Set if an operation generates a carry to or a
-                         borrow from a destination operand.
-
- Parity                  Set if the low-order 8 bits of the result of an
-                         operation contain an even number of set bits.
-
- Auxiliary Carry         Set if an operation generates a carry to or a
-                         borrow from the low-order four bits of an operand.
-                         This flag is used for binary coded decimal (BCD)
-                         arithmetic.
-
- Zero                    Set if the result of an operation is 0.
-
- Sign                    Equal to the high-order bit of the result of an
-                         operation (0 is positive, 1 is negative).
-
- Trap                    If set, the processor generates a single-step
-                         interrupt after each instruction. A debugging
-                         program can use this feature to execute a program
-                         one instruction at a time.
-
- Interrupt Enable        If set, interrupts are recognized and acted on as
-                         they are received. The bit can be cleared to turn
-                         off interrupt processing temporarily.
-
- Direction               Set to make string operations process down from
-                         high addresses to low addresses; can be cleared to
-                         make string operations process up from low
-                         addresses to high addresses.
-
- Overflow                Set if the signed result of an operation is too
-                         large or small to fit in the destination operand.
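-
- As an illustration, the sketch below notes which flags two 8-bit
- additions leave set, followed by the instructions that set and clear
- the Direction and Interrupt Enable flags directly:
-
-            mov     al, 0FFh
-            add     al, 1           ; AL = 00h: Carry, Zero, Auxiliary Carry,
-                                    ;   and Parity set; Sign, Overflow clear
-            mov     al, 7Fh
-            add     al, 1           ; AL = 80h: Overflow, Sign, and Auxiliary
-                                    ;   Carry set; Carry, Zero, Parity clear
-            std                     ; Set Direction: strings run downward
-            cld                     ; Clear Direction: strings run upward
-            cli                     ; Clear Interrupt Enable
-            sti                     ; Set Interrupt Enable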
-
-
-
-
- 1.2.8 Statements
-
- Statements are the line-by-line components of source files. Each MASM
- statement specifies an instruction or directive for the assembler.
- Statements have up to four fields. The syntax is shown below:
-
- «name» «operation» «operands» «;comment»
-
- The fields are explained below:
-
- Field               Purpose
- ────────────────────────────────────────────────────────────────────────────
- name                Defines a label that can be accessed from elsewhere in
-                     the program. For example, it can name a variable,
-                     type, segment, or code location.
-
- operation           States the action of the statement. This field
-                     contains either an instruction or an assembler
-                     directive.
-
- operands            Lists one or more items on which the instruction or
-                     directive operates.
-
- comment             Provides a comment for the programmer. Comments are
-                     for documentation only; they are ignored by the
-                     assembler.
-
-
- The following line contains all four fields:
-
- mainlp: mov ax, 7 ; Comments follow the semicolon
-
- Here, mainlp is the label, mov is the operation, and ax and 7 are
- the operands, separated by a comma. The comment follows the semicolon.
-
- All fields are optional, although certain directives and instructions
- require an entry in the name or operand field. Some instructions and
- directives place restrictions on the choice of operands. By default, MASM is
- not case sensitive.
-
- Each field (except the comment field) must be separated from other fields by
- white-space characters (spaces or tabs). MASM also requires code labels to
- be followed by a colon, operands to be separated by commas, and comments to
- be preceded by a semicolon.
-
- The backslash character joins physical lines into one logical line.
-
- A logical line can contain up to 512 characters and occupy one or more
- physical lines. To extend a logical line into two or more physical lines,
- put the backslash character (\) as the last non-whitespace character before
- the comment or end of the line. You can place a comment after the backslash
- as shown in this example:
-
-            .IF     (x > 0)     \   ; X must be positive
-                 && (ax > x)    \   ; Result from function must be > x
-                 && (cx == 0)       ; Check loop counter too
-            mov     dx, 20h
-            .ENDIF
-
- Multiline comments can also be specified with the COMMENT directive. The
- assembler ignores all code between the delimiter character following the
- directive and the line containing the next instance of the delimiter
- character. This example illustrates the use of COMMENT.
-
- COMMENT ^ The assembler
- ignores this text
- ^ mov ax, 1 and this code
-
-
- 1.3 The Assembly Process
-
- Creating and running an executable file involves several processes:
-
-
- ■ Assembling the source code into an object file
-
- ■ Linking the object file with other modules or libraries into an
- executable program
-
- ■ Loading that program into memory
-
- ■ Running the program
-
-
- Once you have written your assembly-language program, MASM provides several
- options for assembling it. The OPTION directive, new to MASM 6.0, has
- several different arguments that let you control the way MASM assembles your
- programs.
-
- You can control assembly behavior with conditional assembly.
-
- Conditional assembly allows you to create one source file that can generate
- a variety of programs, depending on the status of various
- conditional-assembly statements.
-
-
- 1.3.1 Generating and Running Executable Programs
-
- This section briefly lists all the actions that take place during each of
- the assembly steps. You can change the behavior of some of these actions in
- various ways, for example, by using macros instead of procedures, or by
- using the OPTION directive or conditional assembly. The other chapters in
- this book discuss specific programming methods; this list simply gives you
- an overview.
-
-
- 1.3.1.1 Assembling
-
- The ML.EXE program does two things to create an executable program. First,
- it assembles the source code into an intermediate object file. Second, it
- calls the linker, LINK.EXE, which links the object files and libraries into
- an executable program (usually with the .EXE extension).
-
- At assembly time, the assembler
-
-
- ■ Evaluates conditional-assembly directives, assembling if the
- conditions are true.
-
- ■ Expands macros and macro functions.
-
- ■ Evaluates constant expressions such as MYFLAG AND 80H, substituting
- the calculated value for the expression.
-
- ■ Encodes instructions and nonaddress operands. For example, mov cx, 13
- can be encoded at assembly time because the instruction does not
- access memory.
-
- ■ Saves memory offsets as offsets from their segment.
-
- ■ Passes segments and segment attributes to the object file.
-
- ■ Saves placeholders for offsets and segments (relocatable addresses).
-
- ■ Outputs a listing if requested.
-
- ■ Passes messages (such as INCLUDELIB and .DOSSEG) directly to the
- linker.
-
-
- See Section 1.3.3 for information about conditional assembly; see Chapter 9
- for macros. Chapters 2 and 3 give further details about segments and
- offsets, and Appendix C explains listing files.
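-
- The fragment below sketches three of these cases. MYFLAG is given a
- hypothetical value, and the fragment assumes simplified segment
- directives are in use (so that @data is defined):
-
- MYFLAG     EQU     0C3h
-            mov     al, MYFLAG AND 80h  ; Expression evaluated by the
-                                        ;   assembler: mov al, 80h
-            mov     cx, 13              ; No memory access: fully encoded
-            mov     ax, @data           ; Segment value is a placeholder
-                                        ;   that is filled in later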
-
-
- 1.3.1.2 Linking
-
- Once your source code is assembled, the resulting object file is passed to
- the linker. At this point, the linker may combine several object files into
- an executable program.
-
- At link time, the linker
-
-
- ■ Combines segments according to the instructions in the object files,
- rearranging the positions of segments that share the same class or
- group.
-
- ■ Fills in placeholders for offsets (relocatable addresses).
-
- ■ Writes relocations for segments into the header of .EXE files (but not
- .COM files).
-
- ■ Writes an executable image.
-
-
- Section 2.3.4, "Defining Segment Groups," defines classes and groups.
- Chapter 3, "Using Addresses and Pointers," explains segments and offsets.
-
-
- 1.3.1.3 Loading
-
- The operating system loads the file generated by the linker into memory.
- When the executable file is loaded into memory, DOS
-
-
- ■ Reads the header of the executable file into memory.
-
- ■ Allocates memory for the program, based on the values in the header,
- and builds the program segment prefix (PSP) at the start of that memory.
-
- ■ Loads the program.
-
- ■ Calculates the correct values for absolute addresses from the
- relocation table.
-
- ■ Loads the segment registers SS, CS, DS, and ES with values that point
- to the proper areas of memory.
-
- ■ Loads the instruction pointer (IP) to point to the start address in
- the code segment and the stack pointer (SP) to point to the stack.
-
- ■ Begins execution of the program.
-
-
- The process is similar for OS/2.
-
- See Section 1.2.7, "Registers," for information about segment registers, the
- instruction pointer (IP), and the stack pointer (SP). See MASM online help
- or a DOS reference for more information on the PSP.
-
-
- 1.3.1.4 Running
-
- Your program is now ready to run. Some program operations cannot be handled
- until the program runs, such as resolving indirect memory operands. See
- Section 7.1.1.2, "Indirect Operands."
-
-
- 1.3.2 Using the OPTION Directive
-
- The OPTION directive lets you modify global aspects of the assembly process.
- With OPTION, you can change command-line options and default arguments.
- These changes affect only statements that follow the use of OPTION.
-
- For example, you may have MASM code in which the first character of a
- variable, macro, structure, or field name is a dot (.). Since a leading dot
- causes MASM 6.0 to generate an error, you can use this statement in your
- program:
-
- OPTION DOTNAME
-
- This enables the use of the dot for the first character.
-
- Changes made with OPTION override any corresponding command-line option. For
- example, suppose you assemble a module with this command line (which enables
- M510 compatibility):
-
- ML /Zm TEST.ASM
-
- but this statement is in the module:
-
- OPTION NOM510
-
- From this point on in the module, the M510 compatibility options are
- disabled.
-
- The lists below explain each of the arguments for the OPTION directive. You
- can put more than one argument in a single OPTION statement if you separate
- the arguments with commas.
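-
- For instance, the statement below is a sketch that combines two of the
- arguments described in the lists that follow. It has the same effect as
- two separate OPTION statements:
-
-            OPTION  CASEMAP:NONE, NOLJMP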
-
-
- Options for M510 Compatibility
-
- Argument                    Description
- ────────────────────────────────────────────────────────────────────────────
- CASEMAP: maptype            CASEMAP:NONE (or /Cx) causes internal symbol
-                             recognition to be case sensitive and causes
-                             the case of identifiers in the .OBJ file to be
-                             the same as specified in the EXTERNDEF,
-                             PUBLIC, or COMM statement. The default is
-                             CASEMAP:NOTPUBLIC (or /Cp). It specifies case
-                             insensitivity for internal symbol recognition
-                             and the same behavior as CASEMAP:NONE for case
-                             of identifiers in .OBJ files. CASEMAP:ALL (or
-                             /Cu) specifies case insensitivity for
-                             identifiers and converts all identifier names
-                             to uppercase.
-
- DOTNAME | NODOTNAME         Enables the use of the dot (.) as the leading
-                             character in variable, macro, structure,
-                             union, and member names. NODOTNAME is the
-                             default.
-
- M510 | NOM510               Sets all features to be compatible with MASM
-                             version 5.1, disabling the SCOPED argument and
-                             enabling OLDMACROS, DOTNAME, and OLDSTRUCTS.
-                             OPTION M510 conditionally sets other arguments
-                             for the OPTION directive. The default is
-                             NOM510. See Appendix A for more information on
-                             using OPTION M510.
-
- OLDMACROS | NOOLDMACROS     Enables the version 5.1 treatment of macros.
-                             MASM 6.0 treats macros differently. The
-                             default is NOOLDMACROS.
-
- OLDSTRUCTS | NOOLDSTRUCTS   Enables compatibility with MASM 5.1 for
-                             treatment of structure members. See Section
-                             5.2 for information on structures.
-
- SCOPED | NOSCOPED           Guarantees that all labels inside procedures
-                             are local to the procedure when SCOPED (the
-                             default) is enabled.
-
-
-
-
- Options for Procedure Use
-
- Argument                    Description
- ────────────────────────────────────────────────────────────────────────────
- LANGUAGE: langtype          Specifies the default language type (C,
-                             PASCAL, FORTRAN, BASIC, SYSCALL, or STDCALL)
-                             to be used with PROC, EXTERN, and PUBLIC. This
-                             use of the OPTION directive overrides the
-                             .MODEL directive but is normally used when
-                             .MODEL is not given.
-
- EPILOGUE: macroname         Instructs the assembler to call macroname to
-                             generate a user-defined epilogue instead of
-                             the standard epilogue code when a RET
-                             instruction is encountered. See Section 7.3.8.
-
- PROLOGUE: macroname         Instructs the assembler to call macroname to
-                             generate a user-defined prologue instead of
-                             generating the standard prologue code. See
-                             Section 7.3.8.
-
- PROC: visibility            Allows the default visibility to be set
-                             explicitly. The default visibility is PUBLIC.
-                             The visibility can also be either EXPORT or
-                             PRIVATE.
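-
- As a sketch of how two of these arguments might be used together, the
- module below makes procedures private by default and then exposes one of
- them explicitly. The procedure names Helper and DoWork are hypothetical:
-
-            .MODEL  small
-            OPTION  LANGUAGE:C      ; Default langtype for PROC, EXTERN,
-                                    ;   and PUBLIC
-            OPTION  PROC:PRIVATE    ; Procedures default to PRIVATE
-            .CODE
- Helper     PROC                    ; PRIVATE: hidden from other modules
-            ret
- Helper     ENDP
- DoWork     PROC    PUBLIC          ; Explicitly visible to other modules
-            call    Helper
-            ret
- DoWork     ENDP
-            END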
-
-
-
- Other Options
-
- Argument                    Description
- ────────────────────────────────────────────────────────────────────────────
- EXPR16 | EXPR32             Sets the expression word size to 16 or 32
-                             bits. The default is 32 bits. The M510
-                             argument to the OPTION directive sets the word
-                             size to 16 bits. Once set with the OPTION
-                             directive, the expression word size cannot be
-                             changed.
-
- EMULATOR | NOEMULATOR       Controls the generation of floating-point
-                             instructions. The NOEMULATOR option generates
-                             the coprocessor instructions directly. The
-                             EMULATOR option generates instructions with
-                             special fixup records for the linker so that
-                             the Microsoft floating-point emulator,
-                             supplied with other Microsoft languages, can
-                             be used. It produces the same result as
-                             setting the /FPi command-line option. You can
-                             set this option only once per module.
-
- LJMP | NOLJMP               Enables automatic conditional-jump
-                             lengthening. The default is LJMP. See Section
-                             7.1.2 for information about conditional-jump
-                             lengthening.
-
- NOKEYWORD:<keywordlist>     Disables the specified reserved words. See
-                             Section 1.2.1, "Reserved Words," for an
-                             example of the syntax for this argument.
-
- NOSIGNEXTEND                Overrides the default sign-extended opcodes
-                             for the AND, OR, and XOR instructions and
-                             generates the larger non-sign-extended forms
-                             of these instructions. Provided for
-                             compatibility with NEC V25 (R) and NEC V35(tm)
-                             controllers.
-
- OFFSET: offsettype          Determines the result of OFFSET operator
-                             fixups. SEGMENT sets the defaults for fixups
-                             to be segment-relative (compatible with MASM
-                             5.1). GROUP, the default, generates fixups
-                             relative to the group (if the label is in a
-                             group). FLAT causes fixups to be relative to a
-                             flat frame. (The .386 mode must be enabled to
-                             use FLAT.) See Appendix A for more
-                             information.
-
- READONLY | NOREADONLY       Enables checking for instructions that modify
-                             code segments, thereby guaranteeing that
-                             read-only code segments are not modified.
-                             Replaces the /p command-line option of MASM
-                             5.1. It is useful for OS/2, where code
-                             segments are normally read-only.
-
- SEGMENT: segSize            Allows global default segment size to be set.
-                             Also determines the default address size for
-                             external symbols defined outside any segment.
-                             The segSize can be USE16, USE32, or FLAT.
-
-
-
-
- 1.3.3 Conditional Directives
-
- MASM 6.0 provides conditional-assembly directives and conditional-error
- directives. You can use conditional-assembly directives when you want to
- test for a specified condition and assemble a block of statements if the
- condition is true. You can use conditional-error directives when you want to
- test for a specified condition and generate an assembly error if the
- condition is true.
-
- Both kinds of conditional directives test assembly-time conditions, not
- run-time conditions. Only expressions that evaluate to constants during
- assembly can be compared or tested. Predefined symbols are often used in
- conditional assembly. See Section 1.2.3.
-
-
- Conditional-Assembly Directives
-
- The IF and ENDIF directives enclose the statements to be considered for
- conditional assembly. The optional ELSEIF and ELSE blocks follow the IF
- directive. There are many forms of the IF and ELSE directives. Online help
- provides a complete list.
-
- The syntax used for the IF directives is shown below. The syntax for other
- conditional-assembly directives follows the same form.
-
- IF expression1
-    ifstatements
- [[ELSEIF expression2
-    elseifstatements]]
- [[ELSE
-    elsestatements]]
- ENDIF
-
-
-
- The statements following the IF directive can be any valid statements,
- including other conditional blocks, which in turn can contain any number of
- ELSEIF blocks. ENDIF ends the block.
-
- The statements following the IF directive are assembled only if the
- corresponding condition is true. If the condition is not true and an ELSEIF
- directive is used, the assembler checks to see if the corresponding
- condition is true. If so, it assembles the statements following the ELSEIF
- directive. If no IF or ELSEIF conditions are satisfied, the statements
- following the ELSE directive are assembled.
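-
- A sketch of the full form follows. The symbol procLevel is a hypothetical
- build setting; with the value shown, only the ELSEIF block is assembled:
-
- procLevel  =       286             ; Hypothetical build setting
-
-            IF      procLevel GE 386
-            .386                    ; Enable 80386 instructions
-            ELSEIF  procLevel GE 286
-            .286                    ; Enable 80286 instructions
-            ELSE
-            .8086                   ; Enable 8086 instructions only
-            ENDIF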
-
- For example, you may want to assemble a line of code only if a particular
- variable has been defined. In this example,
-
-            IFDEF   buffer
- buff       BYTE    buffer DUP (?)
-            ENDIF
-
- buff is allocated only if buffer has been previously defined.
-
- The following list summarizes the conditional-assembly directives:
-
- Directive            Use
- ────────────────────────────────────────────────────────────────────────────
- IF and IFE           Tests the value of an expression and allows assembly
-                      based on the result.
-
- IFDEF and IFNDEF     Tests whether a symbol has been defined and allows
-                      assembly based on the result.
-
- IFB and IFNB         Tests to see if a specified argument was passed to a
-                      macro and allows assembly based on the result.
-
- IFIDN and IFDIF      Compares two macro arguments and allows assembly
-                      based on the result. (IFDIFI and IFIDNI perform the
-                      same action but are case insensitive.)
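-
- As an illustration of IFB, the macro below is a sketch that supplies a
- default when its argument is omitted. The macro name PutChar and its
- parameter are hypothetical; DOS function 02h displays the character in DL:
-
- PutChar    MACRO   letter
-            IFB     <letter>
-            mov     dl, ' '         ; No argument passed: use a space
-            ELSE
-            mov     dl, letter      ; Argument passed: use it
-            ENDIF
-            mov     ah, 02h         ; DOS Display Character function
-            int     21h
-            ENDM
-
-            PutChar 'A'             ; Assembles the ELSE block
-            PutChar                 ; Assembles the IFB block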
-
-
-
-
-
- Conditional-Error Directives
-
- You can use conditional-error directives to debug programs and check for
- assembly-time errors. By inserting a conditional-error directive at a key
- point in your code, you can test assembly-time conditions at that point. You
- can also use conditional-error directives to test for boundary conditions in
- macros.
-
- Like other severe errors, those generated by conditional-error directives
- cause the assembler to return a nonzero exit code. If a severe error is
- encountered during assembly, MASM does not generate the object module.
-
- For example, the .ERRNDEF directive produces an error if some label has not
- been defined. In this example, .ERRNDEF at the beginning of the conditional
- block makes sure that publevel has actually been defined.
-
-            .ERRNDEF publevel
-            IF      publevel LE 2
-            PUBLIC  var1, var2
-            ELSE
-            PUBLIC  var1, var2, var3
-            ENDIF
-
- These directives use the syntax given in the previous section. The following
- list summarizes the conditional-error directives.
-
- Directive            Use
- ────────────────────────────────────────────────────────────────────────────
- .ERR                 Forces an error where the directive occurs in the
-                      source file. The error is generated unconditionally
-                      when the directive is encountered, but the directive
-                      can be placed within conditional-assembly blocks to
-                      limit the errors to certain situations.
-
- .ERRE and .ERRNZ     Tests the value of an expression and conditionally
-                      generates an error based on the result.
-
- .ERRDEF and          Tests whether a symbol is defined and conditionally
- .ERRNDEF             generates an error based on the result.
-
- .ERRB and .ERRNB     Tests whether a specified argument was passed to a
-                      macro and conditionally generates an error based on
-                      the result.
-
- .ERRIDN and          Compares two macro arguments and conditionally
- .ERRDIF              generates an error based on the result. (.ERRIDNI
-                      and .ERRDIFI perform the same action but are case
-                      insensitive.)
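-
- The macro below sketches how conditional-error directives can check
- boundary conditions on macro arguments. The macro name AllocBuf, its
- parameters, and the 1024-byte limit are hypothetical:
-
- AllocBuf   MACRO   bufName, count
-            .ERRB   <count>             ; Error if count was not passed
-            .ERRNZ  (count) GT 1024     ; Error if count exceeds 1024
- bufName    BYTE    count DUP (?)
-            ENDM
-
-            .DATA
-            AllocBuf inBuf, 512         ; Assembles: inBuf BYTE 512 DUP (?)
- ;          AllocBuf outBuf, 4096       ; Would generate an assembly error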
-
-
-
- 1.4 Related Topics in Online Help
-
- In addition to information covered in this chapter, information on the
- following topics can be found in online help.
-
- Topic                                 Access
- ────────────────────────────────────────────────────────────────────────────
- Predefined symbols                    From the "MASM 6.0 Contents" screen,
-                                       choose "Predefined Symbols"
-
- Operator precedence                   From the list of tables on the "MASM
-                                       6.0 Contents" screen, choose
-                                       "Operator Precedence"
-
- Data types                            Choose "Directives" from the "MASM
-                                       6.0 Contents" screen; then choose
-                                       "Data Allocation" or "Complex Data
-                                       Types" from the resulting screen
-
- Registers                             From the "MASM 6.0 Contents" screen,
-                                       choose "Language Overview"; then
-                                       choose "Processor Register Summary"
-
- Processor directives                  To see a table of directives, choose
-                                       "Processor Selection" from the "MASM
-                                       6.0 Contents" screen
-
- Conditional assembly and conditional  Choose "Directives" from the "MASM
- errors                                6.0 Contents" screen
-
- EVEN, ALIGN, OPTION                   From the "MASM 6.0 Contents" screen,
-                                       choose "Directives," then
-                                       "Miscellaneous"
-
- Radix specifiers                      From the "MASM 6.0 Contents" screen,
-                                       choose "Language Overview"
-
- ML command-line options               From the "Microsoft Advisor Contents"
-                                       screen, choose "Macro Assembler"
-                                       from the "Command Line" list
-
-
-
-
-
-
-
-
- Chapter 2 Organizing MASM Segments
- ────────────────────────────────────────────────────────────────────────────
-
- A segment is a collection of instructions or data whose addresses are all
- relative to the same segment register. The code in your assembly-language
- program defines and organizes its segments.
-
- Segments can be defined by using simplified segment directives or full
- segment definitions. Section 2.2, "Using Simplified Segment Directives,"
- covers the directives you can use to begin, end, and organize segment
- program modules. It also discusses how to access far data and code with
- simplified segment directives.
-
- Section 2.3, "Using Full Segment Definitions," describes how to order,
- combine, and divide segments, as well as how to use the SEGMENT directive to
- define full segments. It also tells you how to create a segment group so
- that you can use just one segment address to access all the data.
-
- Most of the information in this chapter also applies to writing modules to
- be called from other programs. Exceptions are noted when they apply. See
- Chapter 8, "Sharing Data and Procedures among Modules and Libraries," for
- more information about multiple-module programming.
-
-
- 2.1 Overview of Memory Segments
-
- A physical segment is an area of memory in which all locations are
- contiguous and share the same segment address. A segment always begins on a
- 16-byte (paragraph) boundary (unless an alignment attribute is specified
- with ALIGN). While 16-bit segments can occupy up to 64K (kilobytes), 32-bit
- segments can be as large as 4 gigabytes.
-
- Segments reflect the architecture of the original 8086 processor. Prior to
- the 80386 processors and OS/2 2.x, assembly-language programming meant using
- segmented memory. A flat address space is now available on 80386/486
- processors in 32-bit mode. This space is still segmented at the hardware
- level, but it allows you to ignore most segmentation concerns.
-
- Segments provide a means for associating similar kinds of data. Most
- programs have segments for code, data, constant data, and the stack. These
- logical segments are allocated by the assembler at assembly time.
-
- You can define segments in two ways: with simplified segment directives and
- with full segment definitions. You can also use both kinds of segment
- definitions in the same program.
-
- Simplified segment directives are easier to use than full segment
- definitions.
-
- Simplified segment directives hide many of the details of segment definition
- and assume the same conventions used by Microsoft high-level languages. (See
- Section 2.2.) The simplified segment directives generate necessary code,
- specify segment attributes, and arrange segment order.
-
- Full segment definitions require more complex syntax but provide more
- complete control over how the assembler generates segments. (See Section
- 2.3.) If you use full segment definitions, you must write code to handle all
- the tasks performed automatically by the simplified segment directives.
-
-
- 2.2 Using Simplified Segment Directives
-
- Structuring a MASM program using simplified segments requires use of several
- directives to assign standard names, alignment, and attributes to the
- segments in your program. These directives define the segments in such a way
- that linking with Microsoft high-level languages is easy.
-
- The simplified segment directives are .MODEL, .CODE, .CONST, .DATA, .DATA?,
- .FARDATA, .FARDATA?, .STACK, .STARTUP, and .EXIT. These directives and the
- arguments they take are discussed in the following sections.
-
- The main module is where execution begins.
-
- MASM programs consist of modules made up of segments. Every program written
- only in MASM has one main module, where program execution begins. This main
- module can contain code, data, or stack segments defined with all of the
- simplified segment directives. Any additional modules should contain only
- code and data segments. Every module that uses simplified segments must,
- however, begin with the .MODEL directive.
-
-
- The following example shows the structure of a main module using simplified
- segment directives. It uses the default processor (8086), the default
- operating system (OS_DOS), and the default stack distance (NEARSTACK).
- Additional modules linked to this main program would use only the .MODEL,
- .CODE, and .DATA directives and the END statement.
-
- ; This is the structure of a main module
- ; using simplified segment directives
-
- .MODEL small, c ; This statement is required before you
- ; can use other simplified segment
- ; directives
-
- .STACK ; Use default 1-kilobyte stack
-
- .DATA ; Begin data segment
-
- ; Place data declarations here
-
- .CODE ; Begin code segment
- .STARTUP ; Generate start-up code
-
- ; Place instructions here
-
- .EXIT ; Generate exit code
- END
-
- A module must always finish with the END directive.
-
- The .DATA and .CODE statements do not require any separate statements to
- define the end of a segment. They close the preceding segment and then open
- a new segment. The .STACK directive opens and closes the stack segment but
- does not close the current segment. The END statement closes the last
- segment and marks the end of the source code. It must be at the end of every
- module, whether or not it is the main module.
-
-
- 2.2.1 Defining Basic Attributes with .MODEL
-
- The .MODEL directive defines the attributes that affect the entire module:
- memory model, default calling and naming conventions, operating system, and
- stack type. This directive enables use of simplified segments and controls
- the name of the code segment and the default distance for procedures.
-
- You must place .MODEL in your source file before any other simplified
- segment directive. The syntax is
-
- .MODEL memorymodel «, modeloptions »
-
- The memorymodel field is required and must appear immediately after the
- .MODEL directive. The use of modeloptions, which define the other
- attributes, is optional. The modeloptions must be separated by commas. You
- can also use equates passed from the ML command line to define the
- modeloptions.
-
- The list below summarizes the memorymodel field and the modeloptions fields
- (language, operating system, and stack distance):
-
- Field Description
- ────────────────────────────────────────────────────────────────────────────
- Memory model TINY, SMALL, COMPACT, MEDIUM, LARGE,
- HUGE, or FLAT. Determines size of code
- and data pointers. This field is
- required.
-
- Language C, BASIC, FORTRAN, PASCAL, SYSCALL, or
- STDCALL. Sets calling and naming
- conventions for procedures and public
- symbols.
-
- Operating system OS_OS2 or OS_DOS. Determines behavior of
- .STARTUP and .EXIT.
-
- Stack distance NEARSTACK or FARSTACK. Specifying
- NEARSTACK groups the stack segment into
- a single physical segment (DGROUP) along
- with data. SS is assumed to equal DS.
- FARSTACK does not group the stack with
- DGROUP; thus SS does not equal DS.
-
-
- You can use no more than one reserved word from each field. The following
- examples show how you can combine various fields:
-
- .MODEL small ; Small memory model
- .MODEL large, c, farstack ; Large memory model,
- ; C conventions,
- ; separate stack
- .MODEL medium, pascal, os_os2 ; Medium memory model,
- ; Pascal conventions,
- ; OS/2 start-up/exit
-
- The next four sections give more detail on each field.
-
-
- Defining the Memory Model
-
- MASM supports the standard memory models used by Microsoft high-level
- languages─tiny, small, medium, compact, large, huge, and flat. You specify
- the memory model with attributes of the same name placed after the .MODEL
- directive. Your choice of a memory model does not limit the kind of
- instructions you can write. It does, however, control segment defaults and
- determine whether data and code are near or far by default (see Table 2.1).
-
-
- Table 2.1 Attributes of Memory Models
-
- Memory Model   Default Code   Default Data   Operating        Data and Code
-                                              System           Combined
- ────────────────────────────────────────────────────────────────────────────
- Tiny           Near           Near           DOS              Yes
- Small          Near           Near           DOS, OS/2 1.x    No
- Medium         Far            Near           DOS, OS/2 1.x    No
- Compact        Near           Far            DOS, OS/2 1.x    No
- Large          Far            Far            DOS, OS/2 1.x    No
- Huge           Far            Far            DOS, OS/2 1.x    No
- Flat           Near           Near           OS/2 2.x         Yes
- ────────────────────────────────────────────────────────────────────────────
-
-
-
- When writing assembler modules for a high-level language, you should use the
- same memory model as the calling language. Generally, choose the smallest
- memory model available that can contain your data and code, since near
- references are more efficient than far references.
-
- The predefined symbol @Model returns the memory model. It encodes memory
- models as integers 1 through 7. See Section 1.2.3 for more information on
- predefined symbols, and see online help for an example of how to use them.
-
- The seven memory models supported by MASM 6.0 divide into three groups.
-
- Small, Medium, Compact, Large, and Huge Models - The traditional memory
- models recognized by many DOS and OS/2 1.x languages are small, medium,
- compact, large, and huge. Small model supports one data segment and one code
- segment. All data and code are near by default. Large model supports
- multiple code and multiple data segments. All data and code are far by
- default. Medium and compact models are in between. Medium model supports
- multiple code and single data segments; compact model supports multiple data
- segments and a single code segment.
-
- Huge model implies individual data items larger than a single segment, but
- the implementation of huge data items must be coded by the programmer. Since
- the assembler provides no direct support for this feature, huge model is
- essentially the same as large model.
-
- In each of these models, you can override the default. For example, you can
- make large data items far in small model, or internal procedures near in
- large model.
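-
- The two fragments below sketch such overrides. They belong to separate
- modules, and the names bigtable and Helper are hypothetical:
-
- ; Small model: a large data item placed in a far data segment
- .MODEL small
- .FARDATA
- bigtable BYTE 8192 DUP (?)
-
- ; Large model (a different module): an internal procedure declared near
- .MODEL large
- .CODE
- Helper PROC NEAR
-     ret
- Helper ENDP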
-
- Tiny Model - OS/2 does not support tiny model, but DOS does under MASM 6.0.
- This model places all data and code in a single segment. Therefore, the
- total program size can be no more than 64K. The default is near for code and
- static data items; you cannot override this default. However, you can
- allocate far data dynamically at run time using DOS memory allocation
- services.
-
- Tiny model produces DOS .COM files. Specifying .MODEL tiny automatically
- sends /TINY to the linker. Therefore, /AT is not necessary with .MODEL
- tiny. However, /AT does not insert a .MODEL directive. It only verifies that
- there are no base or pointer fixups, and sends /TINY to the linker.
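-
- A minimal tiny-model program might look like the following sketch. The
- message text and the use of DOS function 09h (display string) are
- illustrative assumptions rather than part of this manual's examples:
-
- .MODEL tiny
- .CODE
- .STARTUP
-     mov dx, OFFSET msg      ; DS:DX points to a string ending in "$"
-     mov ah, 09h             ; DOS function: display string
-     int 21h
- .EXIT 0
- msg BYTE "Hello from a .COM file", 13, 10, "$"
- END
-
- Because code and data share one segment, the message can simply follow the
- exit code; no segment register needs to be loaded to reach it.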
-
- Flat Model - The flat memory model is a nonsegmented configuration available
- for 32-bit operating systems. It is similar to tiny model in that all code
- and data go in a single 32-bit segment.
-
- OS/2 2.x uses flat model when you specify the .386 or .486 directive before
- .MODEL FLAT. All data and code (including system resources) are in a single
- 32-bit segment. Segment registers are initialized automatically at load
- time; the programmer needs to modify them only when mixing 16-bit and 32-bit
- segments in a single application. CS, DS, ES, and SS are all assumed to the
- supergroup FLAT. FS and GS are assumed to ERROR, since 32-bit versions of
- OS/2 reserve the use of these registers. Addresses and pointers passed to
- system services are always 32-bit near addresses and pointers. Although the
- theoretical size of the single flat segment is four gigabytes, OS/2 2.0
- actually limits it to 512 megabytes in flat model.
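-
- A flat-model module could be laid out roughly as follows. This is only a
- sketch: the names counter and Bump are hypothetical, and SYSCALL is chosen
- here because it is the convention this chapter associates with OS/2 2.x:
-
- .386                        ; Must precede .MODEL flat
- .MODEL flat, syscall
- .DATA
- counter DWORD 0
- .CODE
- Bump PROC
-     inc counter             ; 32-bit near reference; no segment loads
-     mov eax, counter
-     ret
- Bump ENDP
- END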
-
-
- Choosing the Language Convention
-
- The language type is most important when you write a mixed-language program.
-
-
- The language option facilitates compatibility with high-level languages by
- determining the internal encoding for external and public symbol names, the
- code generated for procedure initialization and cleanup, and the order in
- which arguments are passed to a procedure with INVOKE. The PASCAL, BASIC,
- and FORTRAN conventions are identical. C and SYSCALL have the same calling
- convention but different naming conventions. OS/2 system calls require the
- PASCAL calling convention for OS/2 1.x, but require the SYSCALL convention
- for OS/2 2.x. Specifying STDCALL for the calling convention enables a
- different calling convention and the same naming convention (see Section
- 20.1).
-
- Procedure definitions (PROC) and high-level procedure calls (INVOKE)
- automatically generate code consistent with the calling convention of the
- specified language. The PROC, INVOKE, PUBLIC, and EXTERN directives all use
- the naming convention of the language. These directives follow the default
- language conventions from the .MODEL directive unless you specifically
- override the default. Chapter 7, "Controlling Program Flow," tells how to
- use these directives. You can also use the OPTION directive to set the
- language type. (See Section 1.3.2.) Not specifying a language type in any
- of the .MODEL, OPTION, EXTERN, PROC, INVOKE, or PROTO statements causes the
- assembler to generate an error.
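-
- The sketch below shows one way to override the module default for a single
- procedure. The names SumP and Caller are hypothetical:
-
- .MODEL small, c             ; C is the default for this module
-
- SumP PROTO PASCAL, a:WORD, b:WORD
-
- .CODE
- SumP PROC PASCAL, a:WORD, b:WORD
-     mov ax, a
-     add ax, b
-     ret                     ; Epilogue follows the Pascal convention
- SumP ENDP
-
- Caller PROC                 ; No override: uses the C default
-     INVOKE SumP, 1, 2       ; Arguments passed in Pascal order
-     ret
- Caller ENDP
- END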
-
- The predefined symbol @Interface provides information about the language
- parameters. See online help for a description of the bit flags.
-
- See Chapter 20, "Mixed-Language Programming," for more information on
- calling and naming conventions. See Chapter 7, "Controlling Program Flow,"
- for information about writing procedures and prototypes. See Chapter 8,
- "Sharing Data and Procedures among Modules and Libraries," for information
- on multiple-module programming.
-
-
- Specifying the Operating System
-
- The operating-system options (OS_DOS or OS_OS2) are arguments of .MODEL.
- They specify the start-up and exit code generated by the .STARTUP and .EXIT
- directives. (See Section 2.2.6.) If you do not use .STARTUP and .EXIT, you
- can omit this option. The default is OS_DOS.
-
-
- Setting the Stack Distance
-
- The NEARSTACK setting places the stack segment in a group, DGROUP, shared
- with data. The .STARTUP directive then generates code to adjust SS:SP so
- that SS (Stack Segment register) holds the same address as DS (Data Segment
- register). If you do not use .STARTUP, you must make this adjustment
- yourself or your program may fail to run. (See Section 2.2.6 for information
- about start-up code.) When SS equals DS, you can use DS to access stack
- items (including parameters and local variables) and SS to access near
- data. Furthermore, since stack items share the same segment address as near
- data, you can reliably pass near pointers to stack items.
-
- Having SS equal to DS gives some programming advantages.
-
- The FARSTACK setting gives the stack a segment of its own. That is, SS does
- not equal DS. The default stack type, NEARSTACK, is a convenient setting for
- most programs. Use FARSTACK for special cases such as memory-resident
- programs and dynamic-link libraries (DLLs) when you cannot assume that the
- caller's stack is near.
-
- The stack specification also affects the ASSUME statement generated by
- .MODEL and .STACK. You can use the predefined symbol @Stack to determine if
- the stack location is DGROUP (for near stacks) or STACK (for far stacks).
-
-
- 2.2.2 Specifying a Processor and Coprocessor
-
- MASM supports a set of directives for selecting processors and coprocessors.
- Once you select a processor, you must use only the instruction set available
- for that processor. The default is the 8086 processor. If you always want
- your code to run on this processor, you do not need to add any processor
- directives.
-
- To enable a different processor mode and the additional instructions
- available on that processor, use the directives .186, .286, .386, and .486.
-
-
- The .286P, .386P, and .486P directives enable the instructions available
- only at higher privilege levels in addition to the normal instruction set
- for the given processor. Privileged instructions are not necessary for
- writing applications, even for OS/2. Generally, you don't need privileged
- instructions unless you are writing operating-systems code or device
- drivers.
-
- Processor directives affect availability of various MASM language features.
-
-
- In addition to enabling different instruction sets, the processor directives
- also affect the behavior of extended language features. For example, the
- INVOKE directive pushes arguments onto the stack. If the .286 directive is
- in effect, INVOKE takes advantage of operations possible only on 80286 and
- later processors.
-
- Use the directives .8087 (the default), .287, .387, and .NO87 to select a
- math coprocessor instruction set. The .NO87 directive turns off assembly of
- all coprocessor instructions. Note that .486 also enables assembly of all
- coprocessor instructions because the 80486 processor has a complete set of
- coprocessor registers and instructions built into the chip. Each processor
- directive implies the corresponding coprocessor directive. The coprocessor
- directives are provided to override the defaults.
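-
- As an illustration, the following sketch enables the 80286 instruction set
- so that a shift by an immediate count greater than 1 can be assembled. The
- procedure name Times16 is hypothetical:
-
- .MODEL small
- .286                        ; Enable the 80286 instruction set
- .CODE
- Times16 PROC
-     shl ax, 4               ; Not accepted under the default 8086 mode
-     ret
- Times16 ENDP
- END
-
- Under the default processor, the same operation needs four separate
- shl ax, 1 instructions.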
-
-
- 2.2.3 Creating a Stack
-
- The stack is the section of memory used for pushing or popping registers and
- storing the return address when a subroutine is called. The stack often
- holds temporary and local variables.
-
- If your main module is written in a high-level language, that language
- handles the details of creating a stack. Use the .STACK directive only when
- you write a main module in assembly language.
-
- The .STACK directive creates a stack segment. By default, the assembler
- allocates 1K of memory for the stack. This size is sufficient for most small
- programs.
-
- To create a stack of a size other than the default size, give .STACK a
- single numeric argument indicating stack size in bytes:
-
- .STACK 2048 ; Use 2K stack
-
- For a description of how stack memory is used with procedure calls and local
- variables, see Chapter 7, "Controlling Program Flow."
-
-
- 2.2.4 Creating Data Segments
-
- Programs can contain both near and far data. In general, you should place
- important and frequently used data in the near data area, where data access
- is faster. This area can get crowded, however, because (in 16-bit operating
- systems) the total amount of all near data in all modules cannot exceed 64K.
- Therefore, you may want to place infrequently used or particularly large
- data items in a far data segment.
-
- The .DATA, .DATA?, .CONST, .FARDATA, and .FARDATA? directives create data
- segments. You can access the various segments within DGROUP without
- reloading segment registers (see Section 2.3.4, "Defining Segment Groups").
- These directives also prevent instructions from appearing in data
- segments by assuming CS to ERROR. (See Section 2.3.3 for information about
- ASSUME.)
-
-
- Near Data Segments
-
- The .DATA directive creates a near data segment. This segment contains the
- frequently used data for your program. It can occupy up to 64K in DOS or 512
- megabytes under flat model in OS/2 2.0. It is placed in a special group
- identified as DGROUP, which is also limited to 64K.
-
- Near data pointers always point to DGROUP.
-
- When you use .MODEL, the assembler automatically defines DGROUP for your
- near data segment. The segments in DGROUP form near data, which can normally
- be accessed directly through DS or SS.
-
- You can also define the .DATA? and .CONST segments that go into DGROUP
- unless you are using flat model. Although all of these segments (along with
- the stack) are eventually grouped together and handled as data segments,
- .DATA? and .CONST enhance compatibility with Microsoft high-level languages.
- In Microsoft languages, .CONST is used for defining constant data such as
- strings and floating-point numbers that must be stored in memory. The .DATA?
- segment is used for storing uninitialized variables. You can follow this
- convention if you wish. If you use C start-up code, .DATA? is initialized to
- 0.
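-
- A module that follows this convention might declare its near data as in
- the sketch below (the variable names are hypothetical):
-
- .MODEL small
- .DATA
- count WORD 10                   ; Initialized near data
- .DATA?
- buffer BYTE 256 DUP (?)         ; Uninitialized near data
- .CONST
- prompt BYTE "Enter a value: $"  ; Constant data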
-
- You can use @data to determine the group of the data segment and @DataSize
- to determine the size of the memory model set by the .MODEL directive. The
- predefined symbols @WordSize and @CurSeg return the size attribute and name
- of the current segment, respectively. See Section 1.2.3, "Predefined
- Symbols."
-
-
- Far Data Segments
-
- The compact, large, and huge memory models use far data addresses by
- default. With these memory models, however, you can still use .DATA, .DATA?,
- and .CONST to create data segments. The effect of these directives does not
- change from one memory model to the next. They always contribute segments to
- the default data area, DGROUP, which has a total limit of 64K.
-
-
- When you use .FARDATA or .FARDATA? in the small and medium memory models,
- the assembler creates far data segments FAR_DATA and FAR_BSS, respectively.
- To access a variable in one of these segments, first load its segment
- address into a segment register:
-
- mov ax, SEG farvar2
- mov ds, ax
-
- See Section 3.1.2 for more information on far data.
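-
- A slightly fuller sketch, assuming a hypothetical variable farvar and
- procedure GetFar, loads ES instead of DS so that DS can keep addressing
- DGROUP:
-
- .MODEL small
- .FARDATA
- farvar WORD 1234h
- .CODE
- GetFar PROC
-     mov ax, SEG farvar
-     mov es, ax              ; ES now addresses the far data segment
-     mov ax, es:farvar       ; Explicit override; DS is left unchanged
-     ret
- GetFar ENDP
- END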
-
-
- 2.2.5 Creating Code Segments
-
- Whether you are writing a main module or a module to be called from another
- module, you can have both near and far code segments. This section explains
- how to use near and far code segments and how to use the directives and
- predefined equates that relate to code segments.
-
-
- Near Code Segments
-
- The small memory model is often the best choice for assembly programs that
- are not linked to modules in other languages, especially if you do not need
- more than 64K of code. This memory model defaults to near (two-byte)
- addresses for code and data, which makes the program run faster and use less
- memory.
-
- When you use .MODEL and simplified segment directives, the .CODE directive
- in your program instructs the assembler to start a code segment. The next
- segment directive closes the previous segment; the END directive at the end
- of your program closes remaining segments. The example at the beginning of
- Section 2.2, "Using Simplified Segment Directives," shows how to do this.
-
- You can use the predefined symbol @CodeSize to determine whether code
- pointers default to NEAR or FAR.
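-
- For example, a module meant to assemble under several 16-bit memory models
- could test @CodeSize in a conditional-assembly block. This sketch assumes
- that @CodeSize is 0 for near-code models and nonzero for far-code models,
- and that .MODEL has already appeared; RETSIZE is a hypothetical name:
-
- IF @CodeSize
-     RETSIZE EQU 4           ; Far return address: segment and offset
- ELSE
-     RETSIZE EQU 2           ; Near return address: offset only
- ENDIF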
-
-
- Far Code Segments
-
- When you need more than 64K of code, use the medium, large, or huge memory
- model to create far segments.
-
- The medium, large, and huge memory models use far code addresses by default.
- In the larger memory models, the assembler creates a different code segment
- for each module. If you use multiple code segments in the small, compact, or
- tiny model, the linker combines the .CODE segments for all modules into one
- segment.
-
- The assembler assigns names to code segments.
-
- For far code segments, the assembler names each code segment MODNAME_TEXT,
- in which MODNAME is the name of the module. With near code, the assembler
- names every code segment _TEXT, causing the linker to concatenate these
- segments into one. You can override the default name by providing an
- argument after .CODE. (See Appendix E, "Default Segment Names," for a
- complete list of segment names generated by MASM.)
-
- With far code, a single module can contain multiple code segments. The .CODE
- directive takes an optional text argument that names the segment. For
- instance, the example below creates two distinct code segments, FIRST_TEXT
- and SECOND_TEXT.
-
- .CODE FIRST
- .
- . ; First set of instructions here
- .
- .CODE SECOND
- .
- . ; Second set of instructions here
- .
-
- Whenever the processor executes a far call or jump, it loads CS with the new
- segment address. No special action is necessary other than making sure that
- you use far calls and jumps. See Section 3.1.2, "Near and Far Addresses."
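-
- The following sketch shows a call between two code segments of one
- medium-model module. The procedure names are hypothetical:
-
- .MODEL medium
- .CODE SECOND
- Helper PROC                 ; Far by default in medium model
-     ret                     ; Assembled as a far return
- Helper ENDP
-
- .CODE FIRST
- Main PROC
-     call Helper             ; Assembled as a far call; CS is reloaded
-     ret
- Main ENDP
- END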
-
- ────────────────────────────────────────────────────────────────────────────
- NOTE
- The ASSUME directive is never necessary when you change code segments. In
- MASM 6.0, the assembler always assumes that the CS register contains the
- address of the current code segment or group. See Section 2.3.3 for more
- information about ASSUME used with segment registers.
- ────────────────────────────────────────────────────────────────────────────
-
-
- 2.2.6 Starting and Ending Code with .STARTUP and .EXIT
-
- The easiest way to begin and end a program is to use the .STARTUP and .EXIT
- directives in the main module. The main module contains the starting point
- and usually the termination point. You do not need these directives in a
- module called by another module.
-
- .STARTUP generates the start-up code required by either DOS or OS/2.
-
- These directives make programs easy to maintain. They automatically generate
- code appropriate to the operating system and stack types specified with
- .MODEL. Thus, you can specify the program is for a different operating
- system or stack type by altering keywords in the .MODEL directive.
-
- To start a program, place the .STARTUP directive where you want execution to
- begin. Usually, this location immediately follows the .CODE directive:
-
- .CODE
- .STARTUP
- .
- . ; Place executable code here
- .
- .EXIT
- END
-
- Note that .EXIT generates executable code, while END does not. The END
- directive informs the assembler that it has reached the end of the module.
- All modules must end with the END directive whether you use simplified or
- full segments.
-
- If you do not use .STARTUP, you must give the starting address as an
- argument to the END directive. When .STARTUP is present, the assembler
- ignores any argument to END.
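-
- A main module written without .STARTUP and .EXIT might therefore look like
- the sketch below. It assumes a DOS program with the FARSTACK attribute, so
- that only DS has to be initialized; the label start is hypothetical:
-
- .MODEL small, farstack
- .STACK
- .DATA
- .CODE
- start:
-     mov ax, @data           ; Without .STARTUP, load DS yourself
-     mov ds, ax
-     ; Place instructions here
-     mov ax, 4C00h           ; DOS terminate function, exit code 0
-     int 21h
- END start                   ; Execution begins at start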
-
- The code generated by .STARTUP depends on the operating system specified
- after .MODEL.
-
- If your program uses DOS for its operating system (the default), the
- initialization code sets DS to DGROUP, and adjusts SS:SP so that it is
- relative to the group for near data, DGROUP. To initialize a DOS program
- with the default NEARSTACK attribute, .STARTUP generates the following code:
-
-
- @Startup:
- mov dx, DGROUP
- mov ds, dx
- mov bx, ss
- sub bx, dx
- shl bx, 1 ; If .286 or higher, this is
- shl bx, 1 ; shortened to shl bx, 4
- shl bx, 1
- shl bx, 1
- cli ; Not necessary in .286 or higher
- mov ss, dx
- add sp, bx
- sti ; Not necessary in .286 or higher
- .
- .
- .
- END @Startup
-
- A DOS program with the FARSTACK attribute does not need to adjust SS:SP, so
- it just initializes DS:
-
- @Startup:
- mov dx, DGROUP
- mov ds, dx
- .
- .
- .
- END @Startup
-
- OS/2 initializes DS so that it points to DGROUP and sets SS:SP as desired.
- Thus, when the OS_OS2 attribute is given, .STARTUP generates only a starting
- address. This does not show up in the listing file, however, since the /Sg
- option for listing files shows only the generated instructions.
-
- When the program terminates, you can return an exit code to the operating
- system. Applications that check exit codes usually assume that an exit code
- of 0 means no problem occurred and that 1 means an error terminated the
- program. The .EXIT directive accepts the exit code as its one optional
- argument:
-
- .EXIT 1 ; Return exit code 1
-
- This directive generates a DOS interrupt or OS/2 system call, depending on
- the operating system specified in .MODEL. The code generated under DOS
- depends on the argument provided to .EXIT. One example is
-
- mov al, value
- mov ah, 04Ch
- int 21h
-
- if a return value is specified. The return value can be a constant, a memory
- reference, or a register that can be moved into the AL register. If no
- return value is specified, the first line in the example code above is not
- generated.
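-
- For instance, a program could keep its result in memory and pass it to
- .EXIT, as in this sketch (the variable status is hypothetical):
-
- .MODEL small
- .STACK
- .DATA
- status BYTE 0
- .CODE
- .STARTUP
-     mov status, 2           ; Record a failure code
- .EXIT status                ; Return the byte in status to DOS
- END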
-
- For OS/2, .EXIT invokes DosExit if you provide a prototype for DosExit and
- if you include OS2.LIB. The listing file shows the statements generated by
- INVOKE if the /Sg command-line option is specified. If you specify a return
- value as an expression, the code generated passes the expression instead of
- the register contents to the DosExit function. See Chapter 17 for
- information on writing programs for OS/2.
-
-
- 2.3 Using Full Segment Definitions
-
- If you need complete control over segments, you can fully define the
- segments in your program. This section explains segment definitions,
- including how to order segments and how to define the segment types.
-
- If you write a program under DOS without .MODEL and .STARTUP, you must
- initialize registers yourself and use the END directive to indicate the
- starting address. Under OS/2 you do not have to initialize registers.
- Section 2.3.2, "Controlling the Segment Order," describes typical start-up
- code.
-
-
- 2.3.1 Defining Segments with the SEGMENT Directive
-
- The SEGMENT directive begins a segment, and the ENDS directive ends a
- segment:
-
- name SEGMENT «align» «READONLY»
- «combine» «use» «'class'»
- statements
- name ENDS
-
- The name defines the name of the segment. Within a module, all segment
- definitions with the same name are treated as though they reference the same
- segment. The linker also combines identically named segments from different
- modules unless the combine type is PRIVATE. In addition, segments can be
- nested.
-
- Options used with the SEGMENT directive can be in any order.
-
- The optional types that follow the SEGMENT directive give the linker and the
- assembler instructions on how to set up and combine segments. The list below
- summarizes these types; the following sections explain them in more detail.
-
-
- Type Description
- ────────────────────────────────────────────────────────────────────────────
- align Defines the memory boundary on which a
- new segment begins.
-
- READONLY Tells the assembler to report an error
- if it detects an instruction modifying
- any item in a READONLY segment.
-
- combine Determines how the linker combines
- segments from different modules when
- building executable files.
-
- use (80386/486 only) Determines the size of a segment. USE16
- indicates that offsets in the segment
- are 16 bits wide. USE32 indicates 32-bit
- offsets.
-
- class Provides a class name for the segment.
- The linker automatically groups segments
- of the same class in memory.
-
-
- Types can be specified in any order. You can specify only one attribute from
- each of these fields; for example, you cannot have two different align
- types.
-
- Once you define a segment, you can reopen it later with another SEGMENT
- directive. When you reopen a segment, you need only give the segment name.
-
- ────────────────────────────────────────────────────────────────────────────
- NOTE
-
- The PAGE align type and the PUBLIC combine type are distinct from the PAGE
- and PUBLIC directives. The assembler distinguishes them by means of context.
- ────────────────────────────────────────────────────────────────────────────
-
-
- Aligning Segments
-
- The optional align type in the SEGMENT directive defines the range of memory
- addresses from which a starting address for the segment can be selected. The
- align type can be any one of these:
-
- Align Type     Starting Address
- ────────────────────────────────────────────────────────────────────────────
- BYTE           Next available byte address.
-
- WORD           Next available word address.
-
- DWORD          Next available doubleword address.
-
- PARA           Next available paragraph address (16 bytes per paragraph).
-                Default.
-
- PAGE           Next available page address (256 bytes per page).
-
-
- The linker uses the alignment information to determine the relative starting
- address for each segment. The operating system calculates the actual
- starting address when the program is loaded.
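-
- The following sketch shows how an align type might be written in practice.
- The segment and variable names are hypothetical, chosen only to contrast
- three of the align types listed above:
-
```asm
; Hypothetical names; each segment requests a different alignment.
TBLSEG  SEGMENT PAGE PUBLIC 'DATA'      ; Starts on a 256-byte boundary
lookup  BYTE    256 DUP (0)
TBLSEG  ENDS

FLGSEG  SEGMENT BYTE PUBLIC 'DATA'      ; Starts at the next available byte
ready   BYTE    0
FLGSEG  ENDS

BUFSEG  SEGMENT PUBLIC 'DATA'           ; No align type given: PARA applies
buffer  BYTE    64 DUP (?)
BUFSEG  ENDS
        END
```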
-
-
- Making Segments Read-Only
-
- The optional READONLY attribute is helpful when creating read-only code
- segments for protected mode or when writing code to be placed in read-only
- memory (ROM). It protects against illegal self-modifying code.
-
- The READONLY attribute causes the assembler to check for instructions that
- write directly to an item in the segment and to generate an error if it
- finds any.
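-
- As an illustration only (the names ROMCODE and limit are hypothetical), the
- sketch below marks a code segment READONLY. The commented-out instruction is
- the kind of direct write that the attribute is meant to catch:
-
```asm
; Hypothetical names; READONLY asks the assembler to reject direct writes.
ROMCODE SEGMENT WORD READONLY PUBLIC 'CODE'
        ASSUME  cs:ROMCODE
limit   WORD    1234h
start:  mov     ax, cs:limit            ; Reading from the segment is allowed
;       mov     cs:limit, ax            ; A direct write such as this one is
                                        ; what READONLY is meant to flag
        mov     ax, 4C00h               ; DOS terminate-program function
        int     21h
ROMCODE ENDS
        END     start
```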
-
-
-
- Combining Segments
-
- The optional combine type in the SEGMENT directive defines how the linker
- combines segments having the same name but appearing in different modules.
- The combine type controls linker behavior, not assembler behavior. The
- combine types are described in full detail in online help and are summarized
- below.
-
- Combine Type   Linker Action
- ────────────────────────────────────────────────────────────────────────────
- PRIVATE        Does not combine the segment with segments from other
-                modules, even if they have the same name. Default.
-
- PUBLIC         Concatenates all segments having the same name to form a
-                single, contiguous segment.
-
- STACK          Concatenates all segments having the same name and causes
-                the operating system to set SS:00 to the bottom and SS:SP
-                to the top of the resulting segment. Data initialization is
-                unreliable, as discussed below.
-
- COMMON         Overlaps segments. The length of the resulting area is the
-                length of the largest of the combined segments. Data
-                initialization is unreliable, as discussed below.
-
- MEMORY         Used as a synonym for the PUBLIC combine type.
-
- AT address     Assumes address as the segment location. An AT segment
-                cannot contain any code or initialized data, but it is
-                useful for defining structures or variables that correspond
-                to specific far memory locations, such as a screen buffer
-                or low memory. The AT combine type cannot be used in
-                protected-mode programs.
-
-
-
- Do not place initialized data in STACK or COMMON segments. With these
- combine types, the linker overlays initialized data for each module at the
- beginning of the segment. The last module containing initialized data writes
- over any data from other modules.
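-
- A short sketch of two combine types follows. The name BIOSDAT and the
- variable kbflags are hypothetical, and the AT example assumes a
- PC-compatible machine running in real mode, where the byte at 0040:0017
- holds keyboard status flags. Neither segment contains initialized data:
-
```asm
; Hypothetical names; neither segment contains initialized data.
STACK   SEGMENT PARA STACK 'STACK'
        BYTE    1024 DUP (?)            ; Reserves space only
STACK   ENDS

BIOSDAT SEGMENT AT 40h                  ; Describes memory at segment 0040h
        ORG     17h
kbflags BYTE    ?                       ; Names the byte at 0040:0017
BIOSDAT ENDS
        END
```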
-
- ────────────────────────────────────────────────────────────────────────────
- NOTE
-
- Normally, you should provide at least one stack segment (having STACK
- combine type) in a program. If no stack segment is declared, LINK displays a
- warning message. You can ignore this message if you have a specific reason
- for not declaring a stack segment. For example, you would not have a
- separate stack segment in a DOS tiny model (.COM) program, nor would you
- need a separate stack in a DLL that used the caller's stack.
- ────────────────────────────────────────────────────────────────────────────
-
-
- Setting Segment Word Sizes (80386/486 Only)
-
- The use type in the SEGMENT directive specifies the segment word size on the
- 80386/486 processors. Segment word size determines the default operand and
- address size of all items in a segment.
-
- The 80386/486 can operate in 16-bit or 32-bit mode.
-
- The use type can be USE16, USE32, or FLAT. USE32 specifies that items in
- the segment are addressed with a 32-bit offset rather than a 16-bit offset.
- If you select the 80386 or 80486 processor with the .386 or .486 directive
- before the .MODEL directive, USE32 is the default. If .MODEL precedes the
- .386 or .486 directive, USE16 is the default. You can override the USE32
- default with the USE16 attribute.
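-
- The sketch below states the use type explicitly on each segment, so it does
- not depend on which default applies. The segment names CODE32 and CODE16 are
- hypothetical:
-
```asm
        .386                            ; Enable 80386 instructions
; Hypothetical names; explicit use types override whatever default applies.
CODE32  SEGMENT DWORD USE32 PUBLIC 'CODE'
        mov     eax, [ebx]              ; 32-bit offset taken from EBX
CODE32  ENDS

CODE16  SEGMENT WORD USE16 PUBLIC 'CODE'
        mov     ax, [bx]                ; 16-bit offset taken from BX
CODE16  ENDS
        END
```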
-
-
- ────────────────────────────────────────────────────────────────────────────
- NOTE
- Mixing 16-bit and 32-bit segments in the same program is possible but
- usually is necessary only in systems programming.
- ────────────────────────────────────────────────────────────────────────────
-
-
- Setting Segment Order with Class Type
-
- Segments of the same class are grouped together in the executable file.
-
- The optional class type in the SEGMENT directive helps control segment
- ordering. Two segments with the same name are not combined if their class is
- different. The linker arranges segments so that all segments identified with
- a given class type are next to each other in the executable file. However,
- within a particular class, the linker orders segments in the order
- encountered. The .ALPHA, .SEQ, or .DOSSEG directive determines this order in
- each .OBJ file. The most common application for specifying a class type is
- to place all code segments first in the executable file.
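-
- As a sketch of the effect of class names (all three segment names are
- hypothetical), the module below declares a 'DATA' segment between two
- 'CODE' segments. Because CODE1 and CODE2 share a class, the linker would be
- expected to place them next to each other, with DATA1 outside that run:
-
```asm
; Hypothetical names; CODE1 and CODE2 share the class 'CODE'.
CODE1   SEGMENT WORD PUBLIC 'CODE'
CODE1   ENDS

DATA1   SEGMENT WORD PUBLIC 'DATA'
DATA1   ENDS

CODE2   SEGMENT WORD PUBLIC 'CODE'      ; Declared after DATA1, but grouped
CODE2   ENDS                            ; with CODE1 by class at link time
        END
```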
-
-
- 2.3.2 Controlling the Segment Order
-
- The assembler normally positions segments in the object file in the order in
- which they appear in source code. The linker, in turn, processes object
- files in the order in which they appear on the command line. Within each
- object file, the linker outputs segments in the order they appear, subject
- to any group, class, and .DOSSEG requirements.
-
- You can usually ignore segment ordering. However, it is important whenever
- you want certain segments to appear at the beginning or end of a program or
- when you make assumptions about which segments are next to each other in
- memory. For tiny model (.COM) programs, code segments must appear first in
- the executable file, because execution must start at the address 100h.
-
-
- Segment Order Directives
-
- You can control the order in which segments appear in the executable program
- with three directives. The default, .SEQ, arranges segments in the order in
- which they are declared.
-
- The .ALPHA directive specifies alphabetical segment ordering within a
- module. .ALPHA is provided for compatibility with early versions of the IBM
- assembler. If you have trouble running code from older books on assembly
- language, try using .ALPHA.
-
- The .DOSSEG directive specifies the DOS segment-ordering convention. It
- places segments in the standard order required by Microsoft languages. Do
- not use .DOSSEG in a module to be called from another module.
-
- The .DOSSEG directive orders segments in this order:
-
-
- 1. Code segments
-
- 2. Data segments, in this order:
-
- a. Segments not in class BSS or STACK
-
- b. Class BSS segments
-
- c. Class STACK segments
-
-
-
- When you declare two or more segments to be in the same class, the linker
- automatically makes them contiguous. This rule overrides the
- segment-ordering directives. (See "Setting Segment Order with Class Type" in
- the previous section for more about segment classes.)
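-
- A minimal sketch of a main module that requests the DOS segment order is
- shown below. It assumes simplified segment directives and small model; the
- variable name is hypothetical:
-
```asm
        .DOSSEG                         ; Request the DOS segment order
        .MODEL  small
        .STACK  100h
        .DATA
count   WORD    0                       ; Hypothetical variable
        .CODE
        .STARTUP
        inc     count
        .EXIT   0
        END
```
-
- Because this is the main module, the restriction on using .DOSSEG in a
- module called from another module does not apply.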
-
-
- Linker Control
-
- Most of the segment-ordering techniques (class names, .ALPHA, .SEQ) control
- the order in which the assembler outputs segments. Usually, you are more
- interested in the order in which segments appear in the executable file. The
- linker controls this order.
-
- The linker processes object files in the order in which they appear on the
- command line. Within each module, it then outputs segments in the order
- given in the object file. If the first module defines segments DSEG and
- STACK and the second module defines CSEG, then CSEG is output last. If you
- want to place CSEG first, there are two ways to do so.
-
- .DOSSEG handles segment ordering.
-
- The simpler method is to use .DOSSEG. This directive is output as a special
- record to the object file, and it tells the linker to use the
- Microsoft segment-ordering convention. This convention overrides
- command-line order of object files, and it places all segments of class
- 'CODE' first. (See Section 2.3.1, "Defining Segments with the SEGMENT
- Directive.")
-
- The other method is to define all the segments as early as possible (in an
- include file, for example, or in the first module). These definitions can be
- "dummy segments"─that is, segments with no content. The linker observes the
- segment ordering given, then later combines the empty segments with segments
- in other modules that have the same name.
-
- For example, you might include the following at the start of the first
- module of your program or in an include file:
-
- _TEXT SEGMENT WORD PUBLIC 'CODE'
- _TEXT ENDS
- _DATA SEGMENT WORD PUBLIC 'DATA'
- _DATA ENDS
- CONST SEGMENT WORD PUBLIC 'CONST'
- CONST ENDS
- STACK SEGMENT PARA STACK 'STACK'
- STACK ENDS
-
- Later in the program, the order in which you write _TEXT, _DATA, or other
- segments does not matter because the ultimate order is controlled by the
- segment order defined in the include file.
-
-
- 2.3.3 Setting the ASSUME Directive for Segment Registers
-
- Many of the assembler instructions assume a default segment. For example,
- JMP assumes the segment associated with the CS register, PUSH and POP assume
- the segment associated with the SS register, and MOV instructions assume the
- segment associated with the DS register.
-
-
- The assembler must know the location of segment addresses.
-
- When the assembler needs to reference an address, it must know what segment
- contains the address. It finds this by using the default segment or group
- addresses assigned with the ASSUME directive. The syntax is
-
- ASSUME segregister:seglocation [[,segregister:seglocation]]
- ASSUME dataregister:qualifiedtype [[,dataregister:qualifiedtype]]
- ASSUME register:ERROR [[,register:ERROR]]
- ASSUME [[register:]]NOTHING [[, register:NOTHING]]
-
- The seglocation must be the name of the segment or group that is to be
- associated with segregister. For subsequent instructions whose default
- segment register is segregister, the assembler assumes that any label or
- variable they reference is in seglocation. Beginning with MASM 6.0, the
- assembler automatically sets CS to have the address of the current code
- segment. Therefore, you do not need to include
-
- ASSUME CS : MY_CODE
-
- at the beginning of your program if you want the current segment associated
- with CS.
-
- ────────────────────────────────────────────────────────────────────────────
- NOTE
- Using the ASSUME directive to tell the assembler which segment to associate
- with a segment register is not the same as telling the processor. The ASSUME
- directive affects only assembly-time assumptions. You may need to use
- instructions to change run-time assumptions. Initializing segment registers
- at run time is discussed in Section 3.1.1.2, "Informing the Processor about
- Segment Values."
- ────────────────────────────────────────────────────────────────────────────
-
- The ASSUME directive can define a segment for each of the segment registers.
- The segregister can be CS, DS, ES, or SS (and FS and GS on the 80386/486).
- The seglocation must be one of the following:
-
-
- ■ The name of a segment defined in the source file with the SEGMENT
- directive
-
- ■ The name of a group defined in the source file with the GROUP
- directive
-
- ■ The keyword NOTHING, ERROR, or FLAT
-
- ■ A SEG expression (see Section 3.2.2, "Immediate Operands")
-
- ■ A string equate (text macro) that evaluates to a segment or group name
- (but not a string equate that evaluates to a SEG expression)
-
-
- It is legal to combine assumes to FLAT with assumes to specific segments.
- Combinations might be necessary in operating-system code that handles both
- 16- and 32-bit segments.
-
- The keyword NOTHING cancels the current segment assumptions. For example,
- the statement ASSUME NOTHING cancels all register assumptions made by
- previous ASSUME statements.
-
- The ASSUME directive can be used anywhere in your program.
-
- Usually, a single ASSUME statement defines all four segment registers at the
- start of the source file. However, you can use the ASSUME directive at any
- point to change segment assumptions.
-
- Using the ASSUME directive to change segment assumptions is often equivalent
- to changing assumptions with the segment-override operator (:) (see Section
- 3.2.3, "Direct Memory Operands"). The segment-override operator is more
- convenient for one-time overrides, whereas the ASSUME directive may be more
- convenient if previous assumptions must be overridden for a sequence of
- instructions.
-
- You can also prevent the use of a register with
-
- ASSUME SegRegister : ERROR
-
- The assembler does an ASSUME CS:ERROR when you use simplified directives
- to create data segments, effectively preventing instructions or code labels
- from appearing in a data segment.
-
- See Section 3.3.2 for information on other applications of ASSUME.
-
-
- 2.3.4 Defining Segment Groups
-
- A group is a collection of segments totalling not more than 64K in 16-bit
- mode. Each code or data item in the group can be addressed relative to the
- beginning of the group through DS or SS.
-
- Segments within a group can be treated as if they shared the same segment
- address.
-
- A group lets you develop separate segments for different kinds of data and
- then combine these into one segment (a group) for all the data. Using a
- group can save you from having to continually reload segment registers to
- access different segments. As a result, the program uses fewer instructions
- and runs faster.
-
- The most common example of a group is the specially named group for near
- data, DGROUP. In the Microsoft segment model, several segments (_DATA, _BSS,
- CONST, and STACK) are combined into a single group called DGROUP. Microsoft
- high-level languages place all near data segments in this group. (By
- default, the stack is placed here, too.) The .MODEL directive automatically
- defines DGROUP. The DS register normally points to the beginning of the
- group, giving you relatively fast access to all data in DGROUP.
-
- The syntax of the GROUP directive is
-
- name GROUP segment [[,segment]]...
-
- The name labels the group. It can refer to a group that was previously
- defined. This feature lets you add segments to a group one at a time. For
- example, if MYGROUP was previously defined to include ASEG and BSEG,
- then the statement
-
- MYGROUP GROUP CSEG
-
- is perfectly legal. It simply adds CSEG to the group MYGROUP; ASEG and
- BSEG are not removed.
-
- Each segment can be any valid segment name (including a segment defined
- later in source code), with one restriction: a segment cannot belong to more
- than one group.
-
- The GROUP directive does not affect the order in which segments of a group
- are loaded. You can place any number of 16-bit segments in a group as long
- as the total size does not exceed 65,536 bytes (64K). If the processor is in
- 32-bit mode, the maximum size is 4 gigabytes. Make sure that non-grouped
- segments do not get placed between grouped segments in such a way that the
- size of the group exceeds 64K or 4 gigabytes. Also, you cannot place a
- 16-bit and a 32-bit segment in the same group.
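-
- The MYGROUP example above can be sketched as a complete DOS program. The
- segment names ASEG, BSEG, and CSEG come from that example; the variable
- names, the STACK and MAIN segments, and the DOS exit call are assumptions
- added for illustration:
-
```asm
; Segment names follow the MYGROUP example; variable names are hypothetical.
ASEG    SEGMENT WORD PUBLIC 'DATA'
avar    WORD    1
ASEG    ENDS

BSEG    SEGMENT WORD PUBLIC 'DATA'
bvar    WORD    2
BSEG    ENDS

CSEG    SEGMENT WORD PUBLIC 'DATA'
cvar    WORD    3
CSEG    ENDS

MYGROUP GROUP   ASEG, BSEG              ; Define the group
MYGROUP GROUP   CSEG                    ; Add a segment; ASEG and BSEG remain

STACK   SEGMENT PARA STACK 'STACK'
        BYTE    256 DUP (?)
STACK   ENDS

MAIN    SEGMENT WORD PUBLIC 'CODE'
        ASSUME  ds:MYGROUP              ; Tell the assembler about DS
start:  mov     ax, MYGROUP             ; One load of DS covers all three
        mov     ds, ax                  ; segments in the group
        mov     ax, avar
        add     ax, bvar
        add     ax, cvar                ; AX = 6 (1 + 2 + 3)
        mov     ax, 4C00h               ; DOS terminate-program function
        int     21h
MAIN    ENDS
        END     start
```
-
- DS is loaded once, and all three variables are then addressed relative to
- the start of MYGROUP without further segment-register changes.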
-
-
- 2.4 Related Topics in Online Help
-
- In addition to information covered in this chapter, information on the
- following topics can be found in online help.
-
- Topic                          Access
- ────────────────────────────────────────────────────────────────────────────
- Memory models                  Choose "Memory Models" from the list of
-                                tables on the "MASM 6.0 Contents" screen
-
- @Model, @CodeSize, @DataSize   Choose "Predefined Symbols" from the "MASM
-                                6.0 Contents" screen
-
- Calling conventions            From the MASM Index, choose "Calling
-                                Convention"
-
- Coprocessor Directives         From the "MASM 6.0 Contents" screen, choose
-                                "Directives"; then choose "Processor
-                                Selection"
-
- Simplified and full            From the "MASM 6.0 Contents" screen, choose
- (complete) segment control     "Directives"; then choose "Simplified
-                                Segment Control" or "Complete Segment
-                                Control"
-
-
-
-
-
-
-
- Chapter 3 Using Addresses and Pointers
- ────────────────────────────────────────────────────────────────────────────
-
- Most processor and operating-system modes require the use of segmented
- addresses to access the code and data for MASM applications. The address of
- the code or data in a segment is relative to an address in a segment
- register. You can also use pointers to access data in MASM programs.
-
- The first section of this chapter describes how to initialize default
- segment registers to access near and far addresses. The next section
- describes how to use the available addressing modes to access the code and
- data. It also describes the related operators, syntax, and displacements.
-
- The third section of this chapter explains how to use the TYPEDEF directive
- to declare pointers (variables containing addresses) and the ASSUME
- directive to give the assembler information about registers containing
- pointers. This section also shows you how to do typical pointer operations
- and how to write code that works for pointer variables in any memory model.
-
-
-
- 3.1 Programming Segmented Addresses
-
- Before you use segmented addresses in your programs, you need to initialize
- the segment registers. The initialization process depends on the registers
- used and on your choice of simplified segment directives or full segment
- definitions. The simplified segment directives (introduced in Section 2.2)
- handle most of the initialization process for you. This section explains how
- to inform the assembler and the processor of segment addresses, and how to
- access the near and far code and data in those segments.
-
-
- 3.1.1 Initializing Default Segment Registers
-
- The segmented architecture of the 8086 family of processors does not require
- you to specify two addresses every time you access memory. As Chapter 2,
- "Organizing MASM Segments," explains, the 8086 family of processors uses a
- system of default segment registers to simplify access to the most commonly
- used data and code.
-
- The segment registers DS, SS, and CS are normally initialized to default
- segments at the beginning of a program. If you write the main module in a
- high-level language, the compiler initializes the segment registers. If you
- write the main module in assembly language, you must initialize them
- yourself. Follow these two steps to initialize segments:
-
-
- 1. Tell the assembler which segment is associated with a register. The
- assembler must know the default segments at assembly time.
-
- 2. Tell the processor which segment is associated with a register by
- writing the necessary code to load the correct segment value into the
- segment register on the processor.
-
-
- These steps are discussed separately in the following sections.
-
-
- 3.1.1.1 Informing the Assembler about Segment Values
-
- Use ASSUME to inform the assembler about default segments.
-
- The first step in initializing segments is to tell the assembler which
- segment to associate with a register. You do this with the ASSUME directive.
- If you use simplified segment directives, the assembler generates the
- appropriate ASSUME statements automatically. If you use full segment
- definitions, you must code the ASSUME statements for registers other than CS
- yourself. (ASSUME can also be used on general-purpose registers, as
- explained in Section 3.3.2, "Defining Register Types with ASSUME.")
-
- With simplified segment directives, the .STARTUP directive and the start-up
- code initialize DS to be equal to SS (unless you specify FARSTACK), which
- allows default data to be accessed through either SS or DS. This can improve
- efficiency in the code generated by compilers. The "DS equals SS" convention
- may not work with certain applications, such as memory-resident programs in
- DOS and multithread programs in OS/2. The code generated for .STARTUP is
- shown in Section 2.2.6, "Starting and Ending Code with .STARTUP and .EXIT."
- You can use similar code to set DS equal to SS in programs using full
- segment definitions.
-
- Here is an example using full segment definitions; it is equivalent to the
- ASSUME statement generated with simplified segment directives in small model
- with NEARSTACK:
-
- ASSUME cs:_TEXT, ds:DGROUP, ss:DGROUP
-
- In the example above, DS and SS are part of the same segment group. It is
- also possible to have different segments for data and code, and to use
- ASSUME to set ES, as shown below:
-
- ASSUME cs:MYCODE, ds:MYDATA, ss:MYSTACK, es:OTHER
-
- Correct use of the ASSUME statement can help find addressing errors. With
- .CODE, the assembler assumes CS to the current segment. When you use the
- simplified segment directives .DATA, .DATA?, .CONST, .FARDATA, or .FARDATA?,
- the assembler automatically assumes CS to ERROR. This prevents instructions
- from appearing in these segments. If you use full segment definitions, you
- can accomplish the same by placing ASSUME CS:ERROR in a data segment.
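-
- With full segment definitions, the technique might look like the sketch
- below. The names MYDATA, MYCODE, and total are hypothetical, and the
- explicit ASSUME in the code segment is included as a precaution rather than
- because the text above requires it:
-
```asm
; Hypothetical names; full segment definitions.
MYDATA  SEGMENT WORD PUBLIC 'DATA'
        ASSUME  cs:ERROR                ; Guard against code in this segment
total   WORD    0
;       inc     total                   ; An instruction placed here is what
                                        ; the guard is meant to flag
MYDATA  ENDS

MYCODE  SEGMENT WORD PUBLIC 'CODE'
        ASSUME  cs:MYCODE, ds:MYDATA    ; Code is allowed again
start:  mov     ax, MYDATA
        mov     ds, ax
        inc     total
        mov     ax, 4C00h               ; DOS terminate-program function
        int     21h
MYCODE  ENDS
        END     start
```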
-
- With either simplified or full segment definitions, you can cancel the
- control of an ASSUME statement by assuming NOTHING. Having no assumptions is
- the default condition. For example, you cancel the assumption for ES above
- with the following statement:
-
- ASSUME es:NOTHING
-
- Prior to the .MODEL statement (or in its absence), the assembler sets the
- ASSUME statement for DS, ES, and SS to the current segment.
-
-
- 3.1.1.2 Informing the Processor about Segment Values
-
- The second step in initializing segments is to inform the processor of
- segment values at run time. How segment values are initialized at run time
- differs for each segment register and depends on your use of simplified
- segment directives or full segment definitions and on the operating system.
-
-
- Specifying a Starting Address - The CS segment register and the IP
- (instruction pointer) register are initialized automatically if you use the
- .STARTUP directive with simplified segment directives. If you use full
- segment definitions, you must specifically set a label in the code segment
- at the instruction you want executed first. Then provide that label as an
- argument to the END directive. Both CS and IP are set at load time to the
- start address the linker gets from the END directive:
-
- _TEXT SEGMENT WORD PUBLIC 'CODE'
- ORG 100h ; Use this declaration for .COM files only
- start: ; First instruction here
- .
- .
- .
- _TEXT ENDS
- END start ; Name of starting label
-
- The operating system automatically resolves the value of CS:IP at load time.
- The label specified as the start address becomes the initial value of IP. In
- an executable (.EXE) file, the start address is encoded into the header and
- is initialized by the operating system at load time. In a .COM file, the
- initial IP is always assumed to be 100h. Therefore, you must use the ORG
- directive to set the start address to 100h. CS and IP cannot be directly
- modified except through jump, call, and interrupt instructions.
-
- DS is initialized automatically under OS/2, but you must initialize it for
- DOS.
-
- Initializing DS - The DS register is automatically initialized to the
- correct value (DGROUP) if you use .STARTUP or if you are writing a program
- for OS/2. If you do not use .STARTUP with DOS, you must initialize DS using
- the following instructions:
-
- mov ax, DGROUP
- mov ds, ax
-
- The initialization requires two instructions because the segment name is a
- constant and the assembler does not allow a constant to be loaded directly
- to a segment register. The example above loads DGROUP, but you can load any
- valid segment or group.
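-
- Both initialization steps can be sketched together in a small DOS program
- that uses full segment definitions. The segment names follow the MYCODE,
- MYDATA, and MYSTACK pattern used earlier; the message text and the DOS
- function calls (09h to display a string, 4Ch to exit) are assumptions added
- for illustration:
-
```asm
; Hypothetical names; a DOS .EXE skeleton using full segment definitions.
MYDATA  SEGMENT WORD PUBLIC 'DATA'
msg     BYTE    "DS is ready", 13, 10, "$"
MYDATA  ENDS

MYSTACK SEGMENT PARA STACK 'STACK'
        BYTE    256 DUP (?)
MYSTACK ENDS

MYCODE  SEGMENT WORD PUBLIC 'CODE'
        ; Step 1: tell the assembler which segments the registers address
        ASSUME  cs:MYCODE, ds:MYDATA, ss:MYSTACK
        ; Step 2: tell the processor by loading DS at run time
start:  mov     ax, MYDATA
        mov     ds, ax
        mov     dx, OFFSET msg          ; DS:DX points to the string
        mov     ah, 09h                 ; DOS display-string function
        int     21h
        mov     ax, 4C00h               ; DOS terminate-program function
        int     21h
MYCODE  ENDS
        END     start
```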
-
- SS and SP are initialized automatically.
-
- Initializing SS and SP - The SS and SP registers are initialized
- automatically if you use the .STACK directive with simplified segments or if
- you define a segment that has the STACK combine type with full segment
- definitions. Using the .STACK directive initializes SS to the stack segment.
- If you want SS to be equal to DS, use .STARTUP or its equivalent. (See
- "Combining Segments" in Section 2.3.1.) For an executable file, the values
- are encoded into the executable header and resolved at link time. For a .COM
- file, SS is initialized to the first address of the 64K program segment and
- SP is initialized to 0FFFEh.
-
- If you do not need to access far data in your program, you do not need to
- initialize the ES register, although you can do so. Use the same technique
- as for the DS register. You can initialize SS to a far stack in the same
- way.
-
-
- 3.1.2 Near and Far Addresses
-
- Addresses which have an implied segment name or segment registers associated
- with them are called "near addresses." Addresses which have an explicit
- segment associated with them are called "far addresses." The assembler
- handles near and far code automatically, as described below. You must
- specify how to handle far data.
-
- The Microsoft segment model puts all near data and the stack in a group
- called DGROUP. Near code is put in a segment called _TEXT. Each module's far
- code or far data is placed in a separate segment. This convention is
- described in Section 2.3.2, "Controlling the Segment Order."
-
- The assembler cannot determine the address for some program components,
- which are said to be relocatable. The assembler generates a fixup record and
- the linker provides the address once the location of all segments has been
- determined. Usually a relocatable operand references a label, but there are
- exceptions. Examples in the next two sections include information about the
- relocatability of near and far data.
-
- Near Code - Control transfers within near code do not require changes to
- segment registers. The processor automatically handles changes to the offset
- in the IP register when control-flow instructions such as JMP, CALL, and RET
- are used. The statement
-
- call nearproc ; Change code offset
-
- changes the IP register to the new address but leaves the segment unchanged.
- When the procedure returns, the processor resets IP to the offset of the
- next instruction after the call.
-
- Far Code - The processor automatically handles segment register changes when
- dealing with far code. The statement
-
- call farproc ; Change code segment and offset
-
- automatically moves the segment and offset of the farproc procedure to the
- CS and IP registers. When the procedure returns, the processor sets CS to
- the original code segment and sets IP to the offset of the next instruction
- after the call.
-
- Near Data - Near data can usually be accessed directly. That is, a segment
- register already holds the correct segment for the data item. The term "near
- data" is often used to refer to the data in the DGROUP group.
-
- After the first initialization of the DS and SS registers, these registers
- normally point into DGROUP. If you modify the contents of either of these
- registers during the execution of the program, the register may need to be
- reloaded prior to being used for addressing DGROUP data.
-
- If a stack variable is accessed directly through BP or SP, the SS register
- is the default. Otherwise, the default is DS:
-
- nearvar WORD 0
- .
- .
- .
- mov ax, nearvar ; Access near data through DS or SS
- mov ax, [bp+6] ; Access near data through SS
-
- In this example, nearvar is a relocatable label. The assembler does not
- know where the memory for nearvar will be allocated. The linker provides
- the address at link time. The expression [bp+6] is not relocatable. The
- linker does not need to provide an address for this expression.
-
- Far Data - To read or modify a far address, a segment register must point to
- the segment of the data. This requires two steps. First load the segment
- register (normally either ES or DS) with the correct value, and then
- (optionally) set an assume of the segment register to the segment of the
- address (or to NOTHING).
-
- ────────────────────────────────────────────────────────────────────────────
- NOTE
- In flat model (OS/2 2.x), far addresses are rarely used. By default, all
- addressing is relative to the initial values of the segment registers. Thus,
- this section on far addressing does not apply to most flat model programs.
- ────────────────────────────────────────────────────────────────────────────
-
- You can initialize ES.
-
- One method commonly used to access far data is to initialize the ES segment
- register. This example shows two ways to do this:
-
- ; First method
- mov ax, SEG farvar ; Load segment of the far address
- mov es, ax
- mov ax, es:farvar ; Provide an explicit segment
- ; override on the addressing
- ; Second method
- mov ax, SEG farvar2 ; Load the segment of the
- ; far address
- mov es, ax
- ASSUME ES:SEG farvar2 ; Tell the assembler that ES points
- ; to the segment containing farvar2
- mov ax, farvar2 ; The assembler provides the ES
- ; override since it knows that
- ; the label is addressable
-
- After loading the segment of the address into the ES segment register, you
- can either explicitly override the segment register so that the addressing
- is correct (method 1) or allow the assembler to insert the override for you
- (method 2). The assembler uses ASSUME statements to determine which segment
- register can be used to address a segment of memory. To use the segment
- override operator, the left operand must be a segment register, not a
- segment name. (See Section 3.2.3 for more information on segment overrides.)
-
-
- If an instruction needs a segment override, the resulting code is slightly
- larger and slower, since the override must be encoded into the instruction.
- However, the resulting code may still be smaller than the code for multiple
- loads of the default segment register for the instruction.
-
- The DS, SS, FS, and GS segment registers (FS and GS are available only on
- the 80386/486 processors) may also be used to provide for addressing through
- other segments.
-
- If a program uses ES to access far data, it need not restore ES when
- finished (unless the program uses flat model). Some compilers require that
- you restore ES before returning to a module written in a high-level
- language.
-
- You can reinitialize DS.
-
- For a series of memory accesses to far data, you can reinitialize DS to the
- far data and then restore DS when you are finished. Use the ASSUME directive
- to let the assembler know that DS is no longer associated with the default
- data segment, as shown below:
-
- push ds ; Save original segment
- mov ax, SEG fararray ; Move segment into data register
- mov ds, ax ; Initialize segment register
- ASSUME ds:SEG fararray ; Tell assembler where data is
- mov ax, fararray[0] ; Direct access faster
- mov dx, fararray[2] ; (A relocatable expression)
- .
- .
- .
- pop ds ; Restore segment
- ASSUME ds:@DATA ; and default assumption
-
- The additional overhead of saving and restoring the DS register in this data
- access method may be worthwhile to avoid repeated segment overrides.
-
- If a program changes DS to access far data, it should restore DS when
- finished. This allows procedures to assume that DS is the segment for near
- data. This is a convention used in many compilers, including Microsoft
- compilers.
-
- Relocatable Data - The memory expression es:farvar is a relocatable memory
- expression, since the assembler cannot determine the address at assembly
- time.
-
- Since no label is referenced, you may expect
-
- mov ax, _myseg:0
-
- to be nonrelocatable (in small model). However, in this case, _myseg:0 is
- a location in a local module whose memory location is dependent on the link
- order, so mov ax, _myseg:0 is relocatable.
-
- A group name is also an immediate constant representing the beginning of the
- group. The first three expressions below are relocatable expressions; the
- fourth is not.
-
- mov ax, DGROUP ; Relocatable
- mov ax, @data ; Relocatable
- mov ax, mygroup ; Relocatable
- mov ax, ds:0 ; Not relocatable
-
-
- 3.2 Specifying Addressing Modes
-
- The 8086 family of processors recognizes four kinds of instruction operands:
- register, immediate, direct memory, and indirect memory. Each type of
- operand corresponds to a different addressing mode.
-
- The four types of operands are summarized in the following list and
- described at length in the rest of this section.
-
- Operand Type Addressing Mode
- ────────────────────────────────────────────────────────────────────────────
- Register An 8-bit or 16-bit register on the
- 8086-80486; can also be 32-bit on the
- 80386/486
-
- Immediate A constant value contained in the
- instruction itself
-
- Direct memory A fixed location in memory
-
- Indirect memory A memory location determined at run time
- by using the address stored in one or
- two registers and a constant
-
-
-
- 3.2.1 Register Operands
-
- A register operand specifies that the value in a particular register is an
- operand. Code for the register or registers used in operands is encoded into
- the instruction at assembly time.
-
- Register operands can be used anywhere you need an operand. The following
- examples show typical register operands:
-
- mov bx, 10 ; Load constant to BX
- add ax, bx ; Add AX and BX
- jmp di ; Jump to the address in DI
-
- Register operands have a specific use related to addresses.
-
- An offset stored in a base or index register is often used as a pointer into
- memory. An offset can be stored in one of the base or index registers; the
- register can then be used as an indirect memory operand (see Section 3.2.4).
- For example:
-
- mov [bx], dl ; Store DL in indirect memory operand
- inc bx ; Increment register operand
- mov [bx], dl ; Store DL in new indirect memory operand
-
- This example moves the value in DL to two consecutive bytes of a memory
- location pointed to by BX. Any instruction that changes the register value
- also changes the data item pointed to by the register.
-
-
- 3.2.2 Immediate Operands
-
- An immediate operand is a constant value that is specified at assembly time.
- It can be a constant or the result of a constant expression. Immediate
- values are usually encoded into the internal representation of the
- instruction at assembly time. These are typical examples:
-
- mov cx, 20 ; Load constant to register
- add var, 1Fh ; Add hex constant to variable
- sub bx, 25 * 80 ; Subtract constant expression
-
- The OFFSET Operator - Address constants are a special case of immediate
- operand and consist of an offset or segment value. The OFFSET operator
- specifies the offset of a memory location, as shown below:
-
- mov bx, OFFSET var ; Load offset address
-
- For information on differences between MASM 5.1 behavior and MASM 6.0
- behavior related to OFFSET, see Appendix A.
-
- An OFFSET expression is resolved at link time.
-
- Since segments in different modules may be combined into a single segment,
- the true base of the segment is not known. Thus, the offset cannot be
- resolved until link time and var is a relocatable immediate.
-
- The SEG Operator - The SEG operator specifies the segment of a memory
- location:
-
- mov ax, SEG farvar ; Load segment address
- mov es, ax
-
- A SEG expression is resolved at load time.
-
- The actual value of a particular segment is never known until the program is
- loaded into memory. Constant segments are encoded into the header of the
- executable file at link time. Executable files in the DOS .COM format (tiny
- model) cannot contain relocatable segment expressions.
-
- When you use the SEG operator with a variable that is not external, MASM 6.0
- returns the address of the frame (the segment, group, or segment register)
- if one has been explicitly set. Otherwise, it returns the group if one has
- been specified. In the absence of a defined group, SEG returns the segment
- where the variable is defined.
-
- For external variables that are not defined in a segment, the linker fills
- in the segment portion of the address, which may be a segment or group.
-
- This behavior can be changed with the /Zm command-line option or with the
- OPTION OFFSET:SEGMENT statement (see Appendix A, "Differences between MASM
- 6.0 and 5.1"). Section 1.3.2 introduces the OPTION directive.
-
-
- 3.2.3 Direct Memory Operands
-
- A direct memory operand specifies the data at a given address. The address
- and size of the data are encoded into the internal representation of the
- instruction. However, the instruction acts on the contents of the address,
- not the address itself. You must usually specify the size of these operands
- so that the instruction knows how much memory to operate on.
-
- The offset value of a direct memory operand is not resolved until link time,
- and the segment must always be in a segment register at run time. The
- assembler automatically handles address resolution.
-
- You usually represent a direct memory operand in source code as a symbolic
- name previously declared with a data directive such as BYTE, as illustrated
- below:
-
- .DATA? ; Segment for uninitialized data
- var BYTE ? ; Reserve one byte at current address
- ; and assign this address to var
- .CODE
- .
- .
- .
- mov var, al ; Load contents of byte register into
- ; address specified by var
-
- Any location in memory can be a direct memory operand as long as a size is
- specified and the location is fixed. The data at the address can change, but
- the address cannot. By default, instructions that use direct memory
- addressing use the DS register. You can create an expression that points to
- a memory location using any of the following operators:
-
- Operator Name          Symbol
- ────────────────────────────────────────────────────────────────────────────
- Plus                   +
- Minus                  -
- Index                  [ ]
- Structure member       .
- Segment override       :
-
- These operators are discussed in more detail below.
-
- Several operators can be used in expressions that evaluate to direct memory
- operands.
-
- Plus and Minus - The result of combining a memory operand and a constant
- number with the plus or minus operator is a direct memory operand. However,
- the result of combining two memory operands with the minus operator is an
- immediate operand. For example:
-
- memvar EQU array + 5 ; Address five bytes beyond
- ; array
- immexp EQU mem1 - mem2 ; Distance between addresses
-
- The second expression is legal only if both addresses are in the same
- segment.
-
- The expression mem1 - mem2 is not relocatable, since the reference to the
- two labels represents a difference in addresses (offsets). The linker does
- not need to know about the labels in this statement.
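-
- A common use of this property is computing the length of a data item at
- assembly time. In the sketch below, msg, msgEnd, and msgLen are
- hypothetical names chosen for illustration, and both labels are assumed to
- be in the same segment:
-
- msg BYTE "Hello, world" ; String data
- msgEnd LABEL BYTE ; Address just past the string
- msgLen EQU msgEnd - msg ; Immediate constant, not relocatable
- .
- .
- .
- mov cx, msgLen ; Assembled as an immediate operand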
-
- Index - The index operator (brackets enclosing an index value) specifies the
- register or registers for indirect operands. It should contain a constant
- index when used with direct memory operands. It is equivalent to the plus
- operator. For example, the following statements are the same:
-
- mov ax, array[5]
- mov ax, array+5
-
- Any direct memory operand can be enclosed in the index operator. The
- following are equivalent:
-
- mov ax, var
- mov ax, [var]
-
- Some programmers prefer to enclose the operand in brackets to show that the
- contents, not the address, are used.
-
- Structure Field - The structure operator (a period) accesses elements of a
- structure. A field within a structure variable can be accessed as a direct
- memory operand:
-
- mov bx, structvar.field1
-
- The address of the structure operand is the sum of the offsets of structvar
- and field1. See Section 5.2, "Structures and Unions," for more information
- about structures.
-
- Segment Override - The segment override operator (a colon) specifies a
- segment portion of the address that is different from the default segment.
- When used with instructions, this operator can apply to segment registers or
- segment names:
-
- mov ax, es:farvar ; Use segment override
-
- The assembler will not generate a segment override if the default segment is
- explicitly provided. Thus, the following two statements are equivalent:
-
- mov [bx], ax
- mov ds:[bx], ax
-
- A segment name override or the segment override operator forces the operand
- to be an address expression.
-
- mov WORD PTR FARSEG:0, ax ; Segment name override
- mov WORD PTR es:100h, ax ; Legal and equivalent
- mov WORD PTR es:[100h], ax ; expressions
- ; mov WORD PTR [100h], ax ; Illegal, not an address
-
- As the example shows, a constant expression cannot be an address expression
- unless it has a segment override.
-
-
- 3.2.4 Indirect Memory Operands
-
- Like direct memory operands, indirect memory operands specify the contents
- of a given address. However, the processor calculates the address at run
- time by referring to the contents of registers. Since values in the
- registers can change at run time, indirect memory operands provide dynamic
- access to memory.
-
- Indirect memory operands make possible run-time operations such as pointer
- indirection and dynamic indexing of array elements, including indexing of
- multidimensional arrays.
-
- Strict rules govern which registers can be used for indirect memory operands
- under 16-bit versions of the 8086-based processors. The rules change
- significantly for 32-bit processors starting with the 80386. However, the
- new rules apply only to code that does not need to be backward compatible.
-
- This section first discusses features of indirect operands in either mode.
- Then it explains the specific 16-bit rules and 32-bit rules separately.
-
-
- 3.2.4.1 Indirect Operands with 16- and 32-Bit Registers
-
- Some rules and options for indirect memory operands always apply, regardless
- of the size of the register. For example, you must always specify the
- register and operand size for indirect memory operands. But you can use
- various syntaxes to indicate an indirect memory operand. This section
- describes the rules that apply to both 16-bit and 32-bit register modes.
-
- Certain rules govern the use of base and index registers.
-
- Specifying Indirect Memory Operands - The index operator specifies the
- register or registers for indirect operands. The processor uses the data
- pointed to by the register. For example, the following instruction moves the
- word-sized data at the address contained in DS:BX into AX:
-
- mov ax, WORD PTR [bx]
-
- When you specify more than one register, the processor adds the two
- addresses together to determine the effective address (the address of the
- data to operate on):
-
- mov ax, [bx+si]
-
- An indirect memory operand can have a displacement.
-
- Specifying Displacements - You can specify an address displacement, a
- constant value that is added to the register contents when the effective
- address is calculated. A direct memory specifier is the most common
- displacement:
-
- mov ax, table[si]
-
- In the relocatable expression above, the displacement table is the base
- address of an array; SI holds an index to an array element. The SI value is
- calculated at run time, often in a loop. The element loaded into AX depends
- on the value of SI at the time the instruction is executed.
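-
- A loop of this kind might look like the following sketch, which totals the
- elements of a word array. The array contents and the label next are
- assumptions made for illustration:
-
- table WORD 10, 20, 30, 40
- .
- .
- .
- sub ax, ax ; AX = running total
- sub si, si ; SI = offset of first element
- mov cx, LENGTHOF table ; CX = number of elements
- next: add ax, table[si] ; Add element selected by SI
- add si, TYPE table ; Advance SI by element size
- loop next ; Repeat for each element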
-
- Each displacement can be an address or numeric constant. If there is more
- than one displacement, the assembler adds them together at assembly time and
- encodes the total displacement. For example, in the statement
-
- table WORD 100 DUP (0)
- .
- .
- .
- mov ax, table[bx][di]+6
-
- both table and 6 are displacements. The assembler adds the value of
- table to 6 to get the total displacement. However, this statement is not
- legal:
-
- mov ax, mem1[si] + mem2
-
- Specifying Operand Size - Indirect memory operands must always have a
- specified size. Often the size is specified by the size of the identifier.
- In the example above, the size of the table array determines the operand
- size. If an indirect memory operand is used with a register operand, the
- register size determines the size of the memory object:
-
- mov ax, [bx] ; Size is 2 bytes - same as
- ; AX
- mov table[bx], 0 ; Size is 2 bytes - from size
- ; of table
-
- If there is no address or register operand, the size must be given
- specifically with the PTR operator, as shown below:
-
- inc WORD PTR [bx] ; Word size
- mov BYTE PTR [bp+6], 0 ; Byte size
-
- Syntax Options - The assembler allows a variety of syntaxes for indirect
- memory operands. However, all registers must be inside brackets. You can
- enclose each register in its own pair of brackets, or you can place the
- registers in the same pair of brackets separated by a plus operator (+). All
- the following variations are legal and equivalent:
-
- mov ax, table[bx][di]
- mov ax, table[di][bx]
- mov ax, table[bx+di]
- mov ax, [table+bx+di]
- mov ax, [bx][di]+table
-
- All of these statements move the value in table indexed by BX+DI into
- AX.
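-
- Base-plus-index addressing is a natural fit for two-dimensional arrays: one
- register holds the offset of the row and the other holds the offset within
- the row. The following sketch assumes a hypothetical byte array named grid
- with ROWS rows and COLS columns, numbered from zero:
-
- ROWS EQU 4
- COLS EQU 8
- grid BYTE ROWS * COLS DUP (0)
- .
- .
- .
- mov bx, 2 * COLS ; BX = offset of row 2 (row number
- ; times row size)
- mov di, 5 ; DI = column 5
- mov al, grid[bx][di] ; Load element at row 2, column 5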
-
- Registers pointing into arrays must be zero-based and scaled for the size of
- the array.
-
- Scaling Indexes - The value of index registers pointing into arrays must
- often be adjusted for zero-based arrays and scaled according to the size of
- the array items. For a word array, the item number must be multiplied by two
- (shifted left one place). When you are using 16-bit registers, scaling must
- be done with separate instructions, as shown below:
-
- mov bx, 5 ; Get sixth element (adjust
- ; for 0)
- shl bx, 1 ; Scale by two (word size)
- inc wtable[bx] ; Increment sixth element in table
-
- When using 32-bit registers on the 80386/486 processor, you can include
- scaling in the operand, as described in Section 3.2.4.3, "Indirect Memory
- Operands with 32-Bit Registers."
-
- Accessing Structure Elements - The structure member operator can be used in
- indirect memory operands to access structure elements. In this example, the
- structure member operator loads the year field of the fourth element of
- the students array into AL:
-
- STUDENT STRUCT
- grade WORD ?
- name BYTE 20 DUP (?)
- year BYTE ?
- STUDENT ENDS
-
- students STUDENT < >
- .
- . ; Assume array initialized
- . ; earlier
- mov bx, OFFSET students ; Point to array of students
- mov ax, 4 ; Get fourth element
- mov di, SIZE STUDENT ; Get size of STUDENT
- mul di ; Multiply size times element
- ; number (result in DX:AX)
- mov di, ax ; Point DI to current element
- ; Load field from element:
- mov al, (STUDENT PTR[bx+di]).year
-
- See Section 5.2 for more information on MASM structures.
-
-
- 3.2.4.2 Indirect Memory Operands with 16-Bit Registers
-
- For 8086-based computers and DOS, you must follow the strict indexing rules
- established for the 8086 processor. Only four registers are allowed─BP, BX,
- SI, and DI─and those only in certain combinations.
-
- BP and BX are base registers. SI and DI are index registers. You can use
- either a base or an index register by itself. But if you combine two
- registers, one must be a base and one an index. Here are legal and illegal
- forms:
-
- mov ax, [bx+di] ; Legal
- mov ax, [bx+si] ; Legal
- mov ax, [bp+di] ; Legal
- mov ax, [bp+si] ; Legal
- ; mov ax, [bx+bp] ; Illegal - two base registers
- ; mov ax, [di+si] ; Illegal - two index registers
-
- Table 3.1 shows the modes in which registers can be used to specify indirect
- memory operands.
-
- Table 3.1 Indirect Addressing Modes with 16-Bit Registers
-
- Mode                  Syntax                    Effective Address
- ────────────────────────────────────────────────────────────────────────────
- Register indirect     [BX]                      Contents of register
-                       [BP]
-                       [DI]
-                       [SI]
-
- Base or index         displacement[BX]          Contents of register plus
-                       displacement[BP]          displacement
-                       displacement[DI]
-                       displacement[SI]
-
- Base plus index       [BX][DI]                  Contents of base register
-                       [BP][DI]                  plus contents of index
-                       [BX][SI]                  register
-                       [BP][SI]
-
- Base plus index with  displacement[BX][DI]      Sum of base register, index
- displacement          displacement[BP][DI]      register, and displacement
-                       displacement[BX][SI]
-                       displacement[BP][SI]
- ────────────────────────────────────────────────────────────────────────────
-
-
-
- Different combinations of registers and displacements have different
- timings, as shown in the Macro Assembler Reference.
-
-
- 3.2.4.3 Indirect Memory Operands with 32-Bit Registers
-
- Instructions for the 80386/486 processor can be given in two segment
- modes─16-bit and 32-bit. Indirect memory operands are different in each
- mode. The segment mode is independent of the register size; you can use
- 32-bit registers in either mode.
-
- In 16-bit mode, the 80386/486 operates in the mode used by all other
- 8086-based processors, with one difference: you can use 32-bit registers. If
- the 80386/486 processor is enabled (with the .386 or .486 directive), 32-bit
- general-purpose registers are available in either segment mode. Using them
- eliminates many of the limitations of 16-bit indirect memory operands. Using
- 80386/486 features can make your DOS programs run faster and more
- efficiently if you are willing to sacrifice backward compatibility with
- other processors.
-
- In 32-bit mode, an offset address can be up to four gigabytes. (Segments are
- still represented in 16 bits.) This effectively eliminates size restrictions
- on each segment, since few programs need four gigabytes of memory. OS/2 2.x
- uses 32-bit mode and flat model, which spans all segments. XENIX 386 uses
- 32-bit mode with multiple segments.
-
- 80386/486 Enhancements - On the 80386/486, the processor allows any
- general-purpose 32-bit register to be used as either the base or the index
- register (except ESP, which can be a base but not an index). The same
- register can also be used as both the base and index, but you cannot combine
- 16-bit and 32-bit registers. Several examples are shown below:
-
- add edx, [eax] ; Add double
- mov dl, [esp+10] ; Load byte from stack
- dec WORD PTR [edx][eax] ; Decrement word
- cmp ax, array[ebx][ecx] ; Compare word from array
- jmp FWORD PTR table[ecx] ; Jump into pointer table
-
- Scaling Factors - With 80386/486 registers, the index register can have a
- scaling factor of 1, 2, 4, or 8. Any register except ESP can be the index
- register and can have a scaling factor. Specify the scaling factor by using
- the multiplication operator (*) adjacent to the register.
-
- You can use scaling to index into arrays with different sizes of elements.
- For example, the scaling factor is 1 for byte arrays (no scaling needed), 2
- for word arrays, 4 for doubleword arrays, and 8 for quadword arrays. There
- is no performance penalty for using a scaling factor. Scaling is illustrated
- in the following examples:
-
- mov eax, darray[edx*4] ; Load double of double
- ; array
- mov eax, [esi*8][edi] ; Load double of quad array
- mov ax, wtbl[ecx+2][edx*2] ; Load word of word array
-
- Scaling is also necessary on earlier processors, but it must be done with
- separate instructions before the indirect memory operand is used, as
- described in Section 3.2.4.1, "Indirect Operands with 16- and 32-Bit
- Registers."
-
- The number of registers and the scaling factor affect base and index
- registers.
-
- The default segment register is SS if the base register is EBP or ESP; it is
- DS for all other base registers. If two registers are used, only one can
- have a scaling factor. The register with the scaling factor is defined as
- the index register. The other register is defined as the base. If scaling is
- not used, the first register is the base. If only one register is used, it
- is considered the base for deciding the default segment unless it is scaled.
- The following examples illustrate how to determine the base register:
-
- mov eax, [edx][ebp*4] ; EDX base (not scaled - seg DS)
- mov eax, [edx*1][ebp] ; EBP base (not scaled - seg SS)
- mov eax, [edx][ebp] ; EDX base (first - seg DS)
- mov eax, [ebp][edx] ; EBP base (first - seg SS)
- mov eax, [ebp] ; EBP base (only - seg SS)
- mov eax, [ebp*2] ; EBP*2 index (scaled - seg DS)
-
- Mixing 16-Bit and 32-Bit Registers - Statements can mix 16-bit and 32-bit
- registers if the register use is correct. For example, the following
- statement is legal for either 16-bit or 32-bit segments:
-
- mov eax, [bx]
-
- This statement moves the 32-bit value pointed to by BX into the EAX
- register. Although BX is a 16-bit pointer, it can still point into a 32-bit
- segment.
-
- However, the following statement is never legal, since the CX register
- cannot be used as a 16-bit pointer (although ECX can be used as a 32-bit
- pointer):
-
- ; mov eax, [cx] ; illegal
-
- Operands that mix 16-bit and 32-bit registers are also illegal:
-
- ; mov eax, [ebx+si] ; illegal
-
- The following statement is legal in either mode:
-
- mov bx, [eax]
-
- This statement moves the 16-bit value pointed to by EAX into the BX
- register. This works fine in 32-bit mode. However, in 16-bit mode, moving a
- 32-bit pointer into a 16-bit segment is illegal. If EAX contains a 16-bit
- value (the top half of the 32-bit register is 0), the statement works.
- However, if the top half of the EAX register is not 0, the operand points
- into a part of the segment that doesn't exist, and this generates an error.
- If you use 32-bit registers as indexes in 16-bit mode, you must make sure
- that the index registers contain valid 16-bit addresses.
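-
- One way to meet this requirement is to zero-extend a 16-bit offset into the
- 32-bit register before using it as a pointer. This sketch assumes a 16-bit
- segment with 80386 instructions enabled, and it assumes that SI already
- holds a valid offset in the segment addressed by DS:
-
- movzx eax, si ; Zero-extend offset: top half of EAX = 0
- mov bx, [eax] ; EAX now holds a valid 16-bit address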
-
-
- 3.3 Accessing Data with Pointers and Addresses
-
- In high-level languages, a "pointer" (or pointer variable) is an address
- that is stored in a variable. Assembly language also uses pointer variables,
- but the term "pointer" has a wider use. The indirect memory operands
- discussed in the previous section can be thought of as pointers stored in
- registers.
-
- An address can be stored in a pointer variable for later use. Program
- procedures (including OS/2 system calls) frequently push pointer variables
- onto the stack to transfer data between the calling program and the called
- procedure.
-
- A pointer variable must be transferred to registers before it can be used.
-
- Regardless of the reason for maintaining it, a pointer variable to data
- cannot in itself be directly used in MASM statements. (Pointers to code can
- be used directly.) It must first be loaded into registers as an indirect
- memory operand.
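-
- As a rough illustration, the following sketch loads a near pointer and a
- far pointer into registers before using them. The names count, npCount, and
- fpCount are hypothetical, and count is assumed to be in the segment
- addressed by DS:
-
- count BYTE 0 ; Variable
- npCount WORD count ; Near pointer to variable
- fpCount DWORD count ; Far pointer to variable
- .
- .
- .
- mov bx, npCount ; Load near pointer into BX
- mov al, [bx] ; AL = byte at DS:BX
- les di, fpCount ; Load far pointer into ES:DI
- mov al, es:[di] ; AL = byte at ES:DI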
-
- There is a difference between a far address and a far pointer. A "far
- address" is the address of a variable located in a far data segment. A "far
- pointer" is a variable that can specify both a segment and an offset. Like
- any other variable, a pointer variable can be located in either the default
- (near) data segment or in a far segment.
-
- Previous versions of MASM allow pointer variables but provide little support
- for them. In previous versions, any address loaded into a variable can be
- considered a pointer, as in the following statements:
-
- Var BYTE 0 ; Variable
- npVar WORD Var ; Near pointer to variable
- fpVar DWORD Var ; Far pointer to variable
-
- If a variable is initialized to the name of another variable, the
- initialized variable is a pointer, as shown in the example above. However,
- in previous versions of MASM, the CodeView debugger recognizes npVar and
- fpVar as word and doubleword variables. CodeView does not treat them as
- pointers, nor does it recognize the type of data they point to (bytes, in
- the example).
-
- The new directive TYPEDEF and the new capabilities of ASSUME make it easier
- to manage pointers in registers and variables. These directives are
- discussed in the next two sections. Basic pointer and address operations are
- covered in Section 3.3.3.
-
-
- 3.3.1 Defining Pointer Types with TYPEDEF
-
- Once defined, a TYPEDEF is considered the same as an intrinsic type.
-
- You can define types for pointer variables using the TYPEDEF directive. A
- type so defined is considered the same as the intrinsic types provided by
- the assembler and can be used in the same contexts. The syntax for TYPEDEF
- when used to define pointers is
-
- typename TYPEDEF «distance» PTR qualifiedtype
-
- The typename is the name assigned to the new type. The distance can be NEAR,
- FAR, or any distance modifier. The qualifiedtype can be any intrinsic or
- previously defined MASM type, or a type previously defined with TYPEDEF.
- (See Section 1.2.6, "Data Types," for a full definition of qualifiedtype.)
-
- Here are some examples of user-defined types:
-
- PBYTE TYPEDEF PTR BYTE ; Pointer to bytes
- NPBYTE TYPEDEF NEAR PTR BYTE ; Near pointer to bytes
- FPBYTE TYPEDEF FAR PTR BYTE ; Far pointer to bytes
- PWORD TYPEDEF PTR WORD ; Pointer to words
- NPWORD TYPEDEF NEAR PTR WORD ; Near pointer to words
- FPWORD TYPEDEF FAR PTR WORD ; Far pointer to words
-
- PPBYTE TYPEDEF PTR PBYTE ; Pointer to pointer to bytes
- ; (in C, an array of strings)
- PVOID TYPEDEF PTR ; Pointer to any type of data
-
- PERSON STRUCT ; Structure type
- name BYTE 20 DUP (?)
- num WORD ?
- PERSON ENDS
- PPERSON TYPEDEF PTR PERSON ; Pointer to structure type
-
- The distance of a pointer can either be set specifically or determined
- automatically by the memory model (set by .MODEL) and the segment size (16
- or 32 bits). If you don't use .MODEL, near pointers are the default.
-
- In 16-bit mode, a near pointer is two bytes that contain the offset of the
- object pointed to. A far pointer requires four bytes, and it contains both
- the offset and the segment. In 32-bit mode, a near pointer is four bytes and
- a far pointer is six bytes. If you specify the distance with NEAR or FAR,
- the default distance of the current segment size is used. You can use
- NEAR16, NEAR32, FAR16, and FAR32 to override the defaults set by the current
- segment size. In flat model, NEAR is the default.
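-
- The explicit distance modifiers can be sketched as follows. The type names
- are hypothetical, and the 32-bit forms assume that the 80386/486 processor
- has been enabled. The sizes in the comments follow from the rules above:
-
- NPB16 TYPEDEF NEAR16 PTR BYTE ; 2 bytes: 16-bit offset
- FPB16 TYPEDEF FAR16 PTR BYTE ; 4 bytes: 16-bit offset and segment
- NPB32 TYPEDEF NEAR32 PTR BYTE ; 4 bytes: 32-bit offset
- FPB32 TYPEDEF FAR32 PTR BYTE ; 6 bytes: 32-bit offset and segment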
-
- A pointer type created with TYPEDEF can be used to declare pointer
- variables. Here are some examples using the pointer types defined above:
-
- ; Type declarations
- Array WORD 25 DUP (0)
- Msg BYTE "This is a string", 0
- pMsg PBYTE Msg ; Pointer to string
- pArray PWORD Array ; Pointer to word array
- npMsg NPBYTE Msg ; Near pointer to string
- npArray NPWORD Array ; Near pointer to word array
- fpArray FPWORD Array ; Far pointer to word array
- fpMsg FPBYTE Msg ; Far pointer to string
-
- S1 BYTE "first", 0 ; Some strings
- S2 BYTE "second", 0
- S3 BYTE "third", 0
- pS123 PBYTE S1, S2, S3, 0 ; Array of pointers to strings
- ppS123 PPBYTE pS123 ; A pointer to pointers to strings
-
- Andy PERSON <> ; Structure variable
- pAndy PPERSON Andy ; Pointer to structure variable
-
- ; Procedure prototype
-
- EXTERN ptrArray:PBYTE ; External variable
- Sort PROTO pArray:PBYTE ; Parameter for prototype
-
- ; Parameter for procedure
- Sort PROC pArray:PBYTE
- LOCAL pTmp:PBYTE ; Local variable
- .
- .
- .
- ret
- Sort ENDP
-
- Once defined, pointer types can be used in any context where intrinsic types
- are allowed.
-
-
- 3.3.2 Defining Register Types with ASSUME
-
- Beginning with MASM 6.0, you can use the ASSUME directive with
- general-purpose registers to specify that a register is a pointer to a
- certain size of object. For example:
-
- ASSUME bx:PTR WORD ; BX is word pointer until further
- ; notice
- inc [bx] ; Increment word pointed to by BX
- add bx, 2 ; Point to next word
- mov [bx], 0 ; Word pointed to by BX = 0
- .
- . ; Other pointer operations with BX
- .
- ASSUME bx:NOTHING ; Cancel assumptions
-
- In this example, BX is specified to be a pointer to a word. After a sequence
- of using BX as a pointer, the assumption is cancelled by assuming NOTHING.
-
- Without the assumption to PTR WORD, many instructions need a size specifier.
- The INC and MOV statements from the examples above would have to be written
- like this to specify the sizes of the memory operands:
-
- inc WORD PTR [bx]
- mov WORD PTR [bx], 0
-
- When you have used ASSUME, attempts to use the register for other purposes
- generate assembly errors. In the example above, while the PTR WORD
- assumption is in effect, any use of BX inconsistent with its ASSUME
- declaration generates an error. For example,
-
- ; mov al, [bx] ; Can't move word to byte register
-
- You can also use the PTR operator to override defaults:
-
- mov al, BYTE PTR [bx] ; Legal
-
- Similarly, you can use ASSUME to prevent the use of a register as a pointer
- or even to disable a register:
-
- ASSUME bx:WORD, dx:ERROR
- ; mov al, [bx] ; Error - BX is an integer, not a pointer
- ; mov ax, dx ; Error - DX disabled
-
- See Section 2.3.3 for information on using ASSUME with segment registers.
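-
- The same technique can be applied to structure pointers, so that fields can
- be reached through the register without repeating the structure type. The
- following is a sketch only; the ITEM structure, its fields, and the variable
- widget are hypothetical:
-
- ITEM STRUCT
- qty WORD ?
- cost WORD ?
- ITEM ENDS
-
- widget ITEM <>
- .
- .
- .
- mov bx, OFFSET widget ; BX points to structure variable
- ASSUME bx:PTR ITEM ; BX is a pointer to ITEM
- mov ax, [bx].cost ; Size comes from the field definition
- ASSUME bx:NOTHING ; Cancel assumption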
-
-
- 3.3.3 Basic Pointer and Address Operations
-
- You can do these basic operations with pointers and addresses:
-
-
- ■ Initialize a pointer variable by storing an address in it
-
- ■ Load an address into registers, directly or from a pointer
-
-
- The sections in the rest of this chapter describe variations of these tasks
- with both pointers and addresses. The examples in these sections assume that
- you have previously defined the following pointer types with the TYPEDEF
- directive:
-
- PBYTE TYPEDEF PTR BYTE ; Pointer to bytes
- NPBYTE TYPEDEF NEAR PTR BYTE ; Near pointer to bytes
- FPBYTE TYPEDEF FAR PTR BYTE ; Far pointer to bytes
-
-
- 3.3.3.1 Initializing Pointer Variables
-
- Let the assembler initialize pointer variables when possible.
-
- If the value of a pointer is known at assembly time, the assembler can
- initialize it automatically so that no processing time is wasted on the task
- at run time. The following example illustrates how to do this:
-
- Msg BYTE "String", 0
- pMsg PBYTE Msg
-
- If a pointer variable can be conditionally defined to one of several
- constant addresses, initialization must be delayed until run time. The
- technique is different for near pointers than for far pointers, as shown
- below:
-
- Msg1 BYTE "String1"
- Msg2 BYTE "String2"
- npMsg NPBYTE ?
- fpMsg FPBYTE ?
- .
- .
- .
- mov npMsg, OFFSET Msg1 ; Load near pointer
-
- mov WORD PTR fpMsg[0], OFFSET Msg2 ; Load far offset
- mov WORD PTR fpMsg[2], SEG Msg2 ; Load far segment
-
- If you know that the segment for a far pointer is currently in a register,
- you can load it directly:
-
- mov WORD PTR fpMsg[2], ds ; Load segment of
- ; far pointer
-
- Dynamic Addresses - Often the address to be initialized is dynamic. You know
- the register or registers containing the address, and you want to save them
- in a variable for later use. Typical situations include memory allocated by
- DOS (see interrupt 21h function 48h in online help) and addresses found by
- the SCAS or CMPS instructions (see Section 5.1.3.1). The technique for
- saving dynamic addresses is illustrated below:
-
- ; Dynamically allocated buffer
- fpBuf FPBYTE 0 ; Initialize so offset will be zero
- .
- .
- .
- mov ah, 48h ; Allocate memory
- mov bx, 10h ; Request 16 paragraphs
- int 21h ; Call DOS
- jc error ; Carry set if allocation failed
- mov WORD PTR fpBuf[2], ax ; Load segment returned in AX
- . ; (offset is already 0)
- .
- .
- error: ; Handle error
-
- There are several options for copying pointers.
-
- Copying Pointers - Sometimes one pointer variable must be initialized by
- copying from another. Here are two ways to copy a far pointer:
-
- fpBuf1 FPBYTE ?
- fpBuf2 FPBYTE ?
- .
- .
- .
- ; Copy through registers is faster, but requires a spare register
- mov bx, WORD PTR fpBuf1[0]
- mov WORD PTR fpBuf2[0], bx
- mov bx, WORD PTR fpBuf1[2]
- mov WORD PTR fpBuf2[2], bx
-
- ; Copy through stack is slower, but does not use a register
- push WORD PTR fpBuf1[0]
- push WORD PTR fpBuf1[2]
- pop WORD PTR fpBuf2[2]
- pop WORD PTR fpBuf2[0]
-
- Pointers passed as procedure arguments are pushed onto the stack.
-
- Pointers as Arguments - When a pointer is passed as an argument to a
- procedure, it must be pushed onto the stack. The procedure then sets up a
- stack frame so that it can access the arguments from the stack. This
- technique is discussed in detail in Section 7.3.2, "Passing Arguments on the
- Stack." Pushing a pointer is illustrated below:
-
- ; Push a far pointer (segment always pushed first)
- push WORD PTR fpMsg[2] ; Push segment
- push WORD PTR fpMsg[0] ; Push offset
-
- Pushing an address is somewhat different:
-
- ; Push a far address as a far pointer
- mov ax, SEG fVar ; Load and push segment
- push ax
- mov ax, OFFSET fVar ; Load and push offset
- push ax
-
- On the 80186 and later processors, you can shorten pushing a constant to one
- step:
-
- push SEG fVar ; Push segment
- push OFFSET fVar ; Push offset
-
-
- 3.3.3.2 Loading Addresses into Registers
-
- Loading an address into a pair of registers is one of the most common tasks
- in assembly-language programming. You cannot do processing work with a
- constant address or a pointer variable until the address is loaded into
- registers.
-
- Certain register pairs have standard uses.
-
- You often load addresses into particular segment:offset pairs. The following
- pairs have specific uses:
-
- Segment:Offset Pair Standard Use
- ────────────────────────────────────────────────────────────────────────────
- DS:SI Source for string operations
- ES:DI Destination for string operations
- DS:DX Input for DOS functions
- ES:BX Output from DOS functions
-
- In addition, you can use ES:SI, DS:DI, DS:BX, or any segment:offset pair for
- your own indirect memory operands. You can use SS:BP with a displacement to
- access procedure arguments or local variables in procedures.
-
- Addresses from Data Segments - For near addresses, you need only load the
- offset; the segment is assumed as SS for stack-based data and as DS for
- other data. You must load both segment and offset for far pointers.
-
- Here is an example of loading an address to DS:BX from a near data segment:
-
-
- .DATA
- Msg BYTE "String"
- .
- .
- .
- mov bx, OFFSET Msg ; Load address to BX
- ; (DS already loaded)
-
- If the data is in a far data segment, it is loaded like this:
-
- .FARDATA
- Msg BYTE "String"
- .
- .
- .
- mov ax, SEG Msg ; Load address to ES:BX
- mov es, ax
- mov bx, OFFSET Msg
-
- Stack Variables - The technique for loading the address of a stack variable
- is significantly different from the technique for loading near addresses.
- You may need to put the correct segment value into ES for string operations.
- The following example illustrates how to load the address of a local (stack)
- variable to ES:DI:
-
- Task PROC
- LOCAL Arg[4]:BYTE
-
- push ss ; Since it's stack-based, segment is SS
- pop es ; Copy SS to ES
- lea di, Arg ; Load offset to DI
-
- Use LEA to load the offset of an indirect memory operand.
-
- The local variable in this case actually evaluates to SS:[BP-4]. This is an
- offset from the stack frame (described in Section 7.3.2, "Passing Arguments
- on the Stack"). Since you cannot use the OFFSET operator to get the offset
- of an indirect memory operand, you must use the LEA (Load Effective Address)
- instruction.
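
As a sketch of how the pieces fit together, the following procedure clears a four-byte local buffer with a string instruction. The procedure name ClearLocal and the buffer name Buf are hypothetical, and the code assumes a .MODEL directive with a language type is already in effect so that LOCAL can generate the stack frame.

```asm
ClearLocal PROC USES es di
        LOCAL   Buf[4]:BYTE     ; Four bytes at SS:[BP-4]

        push    ss              ; Buf is stack-based, so its segment is SS
        pop     es              ; Copy SS to ES
        lea     di, Buf         ; Load offset of Buf to DI
        sub     al, al          ; Fill value (0)
        mov     cx, 4           ; Number of bytes in Buf
        cld                     ; Fill upward in memory
        rep     stosb           ; Store AL at ES:[DI], CX times
        ret
ClearLocal ENDP
```

LEA is needed here because the offset of Buf depends on the run-time value of BP, so it cannot be computed at assembly time.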
-
- Use MOV and OFFSET to load the offset of a direct memory operand.
-
- Direct Memory Operands - To get the address of a direct memory operand, you
- can use the MOV instruction with OFFSET or the LEA instruction. MASM 6.0
- automatically optimizes the LEA statement by generating the smaller and
- faster code, as shown in this example:
-
-
- lea si, Msg ; If you code this statement,
- mov si, OFFSET Msg ; MASM 6.0 generates this code
-
- The LEA instruction can be used to determine the address of indirect memory
- operands, as shown below.
-
- lea si, [bx] ; Legal - LEA required for indirect
- ; mov si, OFFSET [bx] ; Illegal - no OFFSET on indirect
-
- Far Pointers - Use the LES and LDS instructions to load far pointers. Use
- the MOV instruction to load a near pointer. The following example shows how
- to load a far pointer to ES:DI and a near pointer to SI (assuming DS as the
- segment):
-
- InBuf BYTE 20 DUP (1)
- OutBuf BYTE 20 DUP (0)
-
- npIn NPBYTE InBuf
- fpOut FPBYTE OutBuf
- .
- .
- .
- les di, fpOut ; Load far pointer to ES:DI
-
- mov si, npIn ; Load near pointer to SI (assume DS)
-
- Copying between Segment Pairs - Copying from one register pair to another is
- complicated by the fact that you cannot copy one segment register directly
- to another. Two methods are shown below. Timings are for the 8088 processor:
-
-
- ; Copy DS:SI to ES:DI, generating smaller code
- push ds ; 1 byte, 14 clocks
- pop es ; 1 byte, 12 clocks
- mov di, si ; 2 bytes, 2 clocks
-
- ; Copy DS:SI to ES:DI, generating faster code
- mov di, ds ; 2 bytes, 2 clocks
- mov es, di ; 2 bytes, 2 clocks
- mov di, si ; 2 bytes, 2 clocks
-
-
- 3.3.3.3 Model-Independent Techniques
-
- Use conditional assembly to write memory-model independent code.
-
- Often you may want to write code that is memory-model independent. If you
- are writing libraries that must be available for different memory models,
- you can use conditional assembly to handle different sizes of pointers. You
- can use the predefined symbols @DataSize and @Model to test the current
- assumptions.
-
- Use conditional assembly to handle pointers that have no specified distance.
-
-
- You can use conditional assembly to write code that works with pointer
- variables that have no specified distance. The predefined symbol @DataSize
- tests the pointer size for the current memory model:
-
- Msg1 BYTE "String1"
- pMsg PBYTE ?
- .
- .
- .
- IF @DataSize
- mov WORD PTR pMsg[0], OFFSET Msg1 ; Load far offset
- mov WORD PTR pMsg[2], SEG Msg1 ; Load far segment
- ELSE
- mov pMsg, OFFSET Msg1 ; Load near pointer
- ENDIF
-
- In the following example, a procedure receives as an argument a pointer to a
- word variable. The code inside the procedure uses @DataSize to determine
- whether the current memory model supports far or near data. It loads and
- processes the data accordingly:
-
- ; Procedure that receives an argument by reference
- mul8 PROC arg:PTR WORD
-
- IF @DataSize
- les bx, arg ; Load far pointer to ES:BX
- mov ax, es:[bx] ; Load the data pointed to
- ELSE
- mov bx, arg ; Load near pointer to BX (assume DS)
- mov ax, [bx] ; Load the data pointed to
- ENDIF
- shl ax, 1 ; Multiply by 8
- shl ax, 1
- shl ax, 1
- ret
- mul8 ENDP
-
- If you have many routines, writing the conditionals for each case can be
- tedious. The following conditional statements generate the proper
- instructions and segment overrides automatically.
-
- ; Equates for conditional handling of pointers
- IF @DataSize
- lesIF TEXTEQU <les>
- ldsIF TEXTEQU <lds>
- esIF TEXTEQU <es:>
- ELSE
- lesIF TEXTEQU <mov>
- ldsIF TEXTEQU <mov>
- esIF TEXTEQU <>
- ENDIF
-
- Once you define these conditionals, you can use them to simplify code that
- must handle several types of pointers. This next example rewrites the above
- mul8 procedure to use conditional code.
-
- mul8 PROC arg:PTR WORD
-
- lesIF bx, arg ; Load pointer to BX or ES:BX
- mov ax, esIF [bx] ; Load the data from [BX] or ES:[BX]
- shl ax, 1 ; Multiply by 8
- shl ax, 1
- shl ax, 1
- ret
- mul8 ENDP
-
- The conditional statements from the examples above can be defined once in an
- include file and used whenever you need to handle pointers.
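
For example, a second routine might reuse the same equates as sketched below. The include-file name ptrequ.inc and the procedure name Negate are hypothetical; the file is assumed to contain the lesIF and esIF definitions shown earlier.

```asm
        INCLUDE ptrequ.inc          ; Hypothetical file defining lesIF, esIF

Negate  PROC    arg:PTR WORD        ; Hypothetical procedure
        lesIF   bx, arg             ; Becomes MOV or LES, depending on model
        neg     WORD PTR esIF [bx]  ; Negate the word at [BX] or ES:[BX]
        ret
Negate  ENDP
```

Because the equates expand at assembly time, the same source assembles to near-pointer or far-pointer code without any conditional blocks inside the procedure.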
-
-
- 3.4 Related Topics in Online Help
-
- In addition to information covered in this chapter, information on the
- following topics can be found in online help.
-
- Topics                                Access
- ────────────────────────────────────────────────────────────────────────────
- LROFFSET, THIS                        From the "MASM 6.0 Contents" screen,
-                                       choose "Operators"; then choose
-                                       "Address"
-
- LFS, LGS, and LSS                     From the "MASM 6.0 Contents" screen,
-                                       choose "Processor Instructions";
-                                       then choose "Data Transfer"
-
- ALIGN, EVEN, ORG                      From the "MASM 6.0 Contents" screen,
-                                       choose "Directives"; then choose
-                                       "Miscellaneous"
-
- NEAR, NEAR16, NEAR32, FAR16, FAR32,   From the "MASM 6.0 Contents" screen,
- and TYPE                              choose "Operators"; then choose
-                                       "Type and Size"
-
- PTR                                   From the "MASM 6.0 Contents" screen,
-                                       choose "Operators"; then choose
-                                       "Miscellaneous"
-
- PUSHCONTEXT and POPCONTEXT            Access from the Macro Assembler
-                                       Index
-
- ASSUME, .MODEL                        From the "MASM 6.0 Contents" screen,
-                                       choose "Directives"; then choose
-                                       "Simplified Segment Control"
-
- @DataSize, @Model                     From the "MASM 6.0 Contents" screen,
-                                       choose "Predefined Symbols"
-
-
-
-
-
-
-
-
- Chapter 4 Defining and Using Integers
- ────────────────────────────────────────────────────────────────────────────
-
- The 8086 family of processors is designed to operate on integer data;
- therefore, most assembler statements are integer operations. Even string
- elements (discussed in Chapter 5, "Defining and Using Complex Data Types")
- are byte-sized integers to the assembler.
-
- This chapter covers the concepts essential for using integer variables in
- assembly-language programs. The first section shows how to declare integer
- variables. The second section describes basic integer operations including
- moving, loading, and sign-extending integers, as well as calculating with
- integers. Finally, the last section describes how to do various operations
- with integers at the bit level, such as using bitwise logical instructions
- and shifting and rotating bits.
-
- The complex data types introduced in the next chapter─arrays, strings,
- structures, unions, and records─use many of the integer operations
- illustrated in this chapter, since the components of complex data types are
- often integers. Floating-point operations require a different set of
- instructions and techniques. These are covered in Chapter 6, "Using
- Floating-Point and Binary Coded Decimal Numbers."
-
-
- 4.1 Declaring Integer Variables
-
- You declare integer variables in the data segment of your program to
- allocate memory for data. The EQU and = directives define integer constants.
- Integer variables allocated with the data allocation directives can be
- initialized in several ways. MASM 6.0 provides new forms of the data
- allocation directives. This section discusses these features and explains
- how to use the SIZEOF and TYPE operators to provide information to the
- assembler about the types in your program. For information on symbolic
- integer constants, see Section 1.2.4, "Integer Constants and Constant
- Expressions."
-
-
- 4.1.1 Allocating Memory for Integer Variables
-
- When you declare an integer variable by assigning a label to a data
- allocation directive, the assembler allocates memory space for the integer.
- The variable's name becomes a label for the memory space. The syntax is
-
- «name» directive initializer
-
- These directives, listed below, indicate the integer's size and value range.
-
-
- Directive                           Description of Initializers
- ────────────────────────────────────────────────────────────────────────────
- BYTE, DB (bytes)                    Allocates unsigned numbers from
-                                     0 to 255.
-
- SBYTE (signed bytes)                Allocates signed numbers from
-                                     -128 to +127.
-
- WORD, DW (words = 2 bytes)          Allocates unsigned numbers from
-                                     0 to 65,535 (64K).
-
- SWORD (signed words)                Allocates signed numbers from
-                                     -32,768 to +32,767.
-
- DWORD, DD (doublewords = 4 bytes)   Allocates unsigned numbers from
-                                     0 to 4,294,967,295 (4 gigabytes).
-
- SDWORD (signed doublewords)         Allocates signed numbers from
-                                     -2,147,483,648 to +2,147,483,647.
-
- FWORD, DF (farwords = 6 bytes)      Allocates 6-byte (48-bit) integers.
-                                     These values are normally used only as
-                                     pointer variables on the 80386/486
-                                     processors.
-
- QWORD, DQ (quadwords = 8 bytes)     Allocates 8-byte integers used with
-                                     8087-family coprocessor instructions.
-
- TBYTE, DT (10 bytes)                Allocates 10-byte (80-bit) integers if
-                                     the initializer has a radix specifying
-                                     the base of the number.
-
-
-
- See Chapter 6 for information on the REAL4, REAL8, and REAL10 directives
- that allocate real numbers.
-
-
-
-
- The assembler enforces only the size of initializers.
-
- MASM does not enforce the range of values assigned to an integer. If the
- value does not fit in the space allocated, however, the assembler generates
- an error.
-
- The SIZEOF and TYPE operators, when applied to a type, return the size of an
- integer of that type. The following list gives the size attribute associated
- with each data type.
-
- Data Type            Bytes
- ────────────────────────────────────────────────────────────────────────────
- BYTE, SBYTE          1
- WORD, SWORD          2
- DWORD, SDWORD        4
- FWORD                6
- QWORD                8
- TBYTE                10
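
A short sketch of the two operators in use follows. The variable names count and total are hypothetical; the sizes in the comments follow from the declared types.

```asm
        .DATA
count   SWORD   -1
total   DWORD   0
        .CODE
        mov     ax, TYPE count      ; AX = 2 (bytes in an SWORD)
        mov     bx, SIZEOF total    ; BX = 4 (bytes in a DWORD)
        mov     cx, SIZEOF QWORD    ; CX = 8 (operator applied to a type)
```

Each operator evaluates to a constant at assembly time, so these instructions assemble as ordinary immediate moves.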
-
- The SBYTE, SWORD, and SDWORD data types are new to MASM 6.0. Use of these
- signed data types tells the assembler to treat the initializers as signed
- data. It is important to use these signed types with high-level constructs
- such as .IF, .WHILE, and .REPEAT (see Section 7.2.1, "Loop-Generating
- Directives"), and with PROTO and INVOKE directives (see Sections 7.3.6,
- "Declaring Procedure Prototypes," and 7.3.7, "Calling Procedures with
- INVOKE").
-
- The assembler stores integers with the least significant bytes lowest in
- memory. Note that assembler listings and most debuggers show the bytes of a
- word in the opposite order─high byte first.
-
- Figure 4.1 illustrates the integer formats.
-
- (This figure may be found in the printed book.)
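
The byte order can be checked by reading a word one byte at a time, as in this sketch (the variable name value is hypothetical):

```asm
        .DATA
value   WORD    1234h                   ; Memory holds 34h, then 12h
        .CODE
        mov     al, BYTE PTR value[0]   ; AL = 34h (least significant byte)
        mov     ah, BYTE PTR value[1]   ; AH = 12h (most significant byte)
                                        ; AX now equals 1234h again
```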
-
- TYPEDEF can define integer aliases.
-
- Although the TYPEDEF directive's primary purpose is to define pointer
- types (see Section 3.3.1), you can also use TYPEDEF to create an alias
- for any integer or real-number type. For example, these declarations
-
- char TYPEDEF SBYTE
- longint TYPEDEF DWORD
- float TYPEDEF REAL4
- double TYPEDEF REAL8
-
- allow you to use char, longint, float, or double in your programs if
- you prefer the C type names.
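
Once defined, an alias can be used wherever the underlying directive could appear. In this sketch the variable names letter and total are hypothetical:

```asm
char    TYPEDEF SBYTE
longint TYPEDEF DWORD

        .DATA
letter  char    'A'             ; Allocates one signed byte
total   longint 100000          ; Allocates one doubleword
        .CODE
        mov     al, letter      ; The alias has the size of SBYTE
```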
-
-
- 4.1.2 Data Initialization
-
- You can initialize variables when you declare them by giving initial
- values─that is, constants or expressions that evaluate to integer constants.
- The assembler generates an error if you specify an initial value too large
- for the specified variable type. Variables can also be initialized with ? if
- there are no initial values.
-
- You can declare and initialize variables in one step with the data
- directives, as these examples show.
-
- integer BYTE 16 ; Initialize byte to 16
- negint SBYTE -16 ; Initialize signed byte to -16
- expression WORD 4*3 ; Initialize word to 12
- signedexp SWORD 4*3 ; Initialize signed word to 12
- empty QWORD ? ; Allocate uninitialized long
- ; integer
- BYTE 1,2,3,4,5,6 ; Initialize six unnamed bytes
- long DWORD 4294967295 ; Initialize doubleword to
- ; 4,294,967,295
- longnum SDWORD -2147433648 ; Initialize signed doubleword
- ; to -2,147,433,648
- tb TBYTE 2345t ; Initialize 10-byte binary
- ; number
-
- See Section 5.1, "Arrays and Strings," for information on arrays and on
- using the DUP operator to allocate initializer lists.
-
- Once you have declared integer variables in your program, you can use them
- in integer operations such as adding, moving, loading, and exchanging. The
- next section describes these operations.
-
-
- 4.2 Integer Operations
-
- You often need to copy, move, exchange, load, and sign-extend integer
- variables in your MASM code. This section shows how to do these operations
- as well as how to add, subtract, multiply, and divide integers; push and pop
- integers onto the stack; and do bit-level manipulations with logical, shift,
- and rotate instructions.
-
- The PTR operator tells the assembler the size of the operand.
-
- Since MASM instructions require operands to be the same size, you may need
- to operate on data in a size other than the size originally declared. The
- PTR operator lets you do this. For example, you can use the PTR operator to
- access the high-order word of a DWORD-size variable. The syntax for the PTR
- operator is
-
- type PTR expression
-
- where the PTR operator forces expression to be treated as having the type
- specified. An example of this use is
-
- .DATA
- num DWORD 0
- .CODE
-
- mov ax, WORD PTR num[0] ; Loads a word-size value from
- mov dx, WORD PTR num[2] ; a doubleword variable
-
- Without the PTR operator, trying to move num[0] into AX generates an error,
- because num is declared as a doubleword and AX holds only a word.
-
-
- 4.2.1 Moving and Loading Integers
-
- The primary instructions for moving integers from operand to operand and
- loading them into registers are MOV (Move), XCHG (Exchange), XLAT
- (Translate), CWD (Convert Word to Double), and CBW (Convert Byte to Word).
-
-
- 4.2.1.1 Moving Integers
-
- The most common method of moving data, the MOV instruction, can be thought
- of as a copy instruction, since it always copies the source operand to the
- destination operand. Immediately after a MOV instruction, both the source
- and destination operands contain the same value.
-
- The statements in the following example illustrate each type of memory move
- that can be performed with a single instruction. Note that you cannot move
- memory operands to memory operands in one operation.
-
- ; Immediate value moves
- mov ax, 7 ; Immediate to register
- mov mem, 7 ; Immediate to memory direct
- mov mem[bx], 7 ; Immediate to memory indirect
- ; Register moves
- mov mem, ax ; Register to memory direct
- mov mem[bx], ax ; Register to memory indirect
- mov ax, bx ; Register to register
- mov ds, ax ; General register to segment
- ; register
-
- ; Direct memory moves
- mov ax, mem ; Memory direct to register
- mov ds, mem ; Memory to segment register
-
- ; Indirect memory moves
- mov ax, mem[bx] ; Memory indirect to register
- mov ds, mem[bx] ; Memory indirect to segment register
-
- ; Segment register moves
- mov mem, ds ; Segment register to memory
- mov mem[bx], ds ; Segment register to memory indirect
- mov ax, ds ; Segment register to general
- ; register
-
- This next example shows several common types of moves that require two
- instructions.
-
- ; Move immediate to segment register
- mov ax, DGROUP ; Load immediate to general register
- mov ds, ax ; Store general register to segment
- ; register
-
- ; Move memory to memory
- mov ax, mem1 ; Load memory to general register
- mov mem2, ax ; Store general register to memory
-
- ; Move segment register to segment register
- mov ax, ds ; Load segment register to general
- ; register
- mov es, ax ; Store general register to segment
- ; register
-
- The MOVSX and MOVZX instructions for the 80386/486 processors extend and
- copy values in one step. See Section 4.2.1.4, "Extending Signed and Unsigned
- Integers."
-
-
- 4.2.1.2 Exchanging Integers
-
- The XCHG (Exchange) instruction exchanges the data in the source and
- destination operands. Data can be exchanged between registers or between
- registers and memory, but not from memory to memory:
-
- xchg ax, bx ; Put AX in BX and BX in AX
- xchg memory, ax ; Put "memory" in AX and AX in "memory"
- ; xchg mem1, mem2 ; Illegal - can't exchange between
- ; memory locations
-
- In some circumstances, register-to-register moves are faster with XCHG than
- with MOV. If speed is important in your programs, check the Reference to
- compare the clock counts for the operand combinations allowed with MOV and
- XCHG.
-
-
- 4.2.1.3 Translating Integers from Tables
-
- The XLAT (Translate) instruction loads a byte from a table in memory into
- the AL register. The instruction is useful for translating bytes from one
- coding system to another. The syntax is
-
- XLAT«B» ««segment:»memory»
-
- XLAT and XLATB are synonyms.
-
- The BX register must contain the address of the start of the table. By
- default, the DS register contains the segment of the table, but you can use
- a segment override to specify a different segment. Also, you need not give
- the operand except when specifying a segment override. (See Section 3.2.3,
- "Direct Memory Operands," for information about the segment override
- operator.)
-
- Before the XLAT instruction executes, the AL register should contain a value
- that points into the table (the start of the table is position 0). After the
- instruction executes, AL contains the table value pointed to. For example,
- if AL contains 7, the processor puts the eighth byte of the table in the AL
- register.
-
- This example, illustrating XLAT, looks up hexadecimal characters in a table
- to convert an eight-bit binary number to a string representing a hexadecimal
- number.
-
- ; Table of hexadecimal digits
- hex BYTE "0123456789ABCDEF"
- convert BYTE "You pressed the key with ASCII code "
- key BYTE ?,?,"h",13,10,"$"
- .CODE
- .
- .
- .
- mov ah, 8 ; Get a key in AL
- int 21h ; Call DOS
- mov bx, OFFSET hex ; Load table address
- mov ah, al ; Save a copy in high byte
- and al, 00001111y ; Mask out top character
- xlat ; Translate
- mov key[1], al ; Store the character
- mov cl, 12 ; Load shift count
- shr ax, cl ; Shift high character into
- ; position
- xlat ; Translate
- mov key, al ; Store the character
- mov dx, OFFSET convert ; Load message
- mov ah, 9 ; Display string
- int 21h ; Call DOS
-
-
- 4.2.1.4 Extending Signed and Unsigned Integers
-
- Since moving data to a different-sized register is illegal, you must
- "sign-extend" integers to convert signed data to a larger register or
- register pair.
-
- Sign-extending means copying the sign bit of the unextended operand to all
- bits of the extended operand. The instructions in the following list
- sign-extend values as shown. They work only on signed values in the
- accumulator register.
-
- Instruction Function
- ────────────────────────────────────────────────────────────────────────────
- CBW Convert byte to word
- CWD Convert word to doubleword
- CWDE Convert word to doubleword extended (80386/486 only)
- CDQ Convert doubleword to quadword (80386/486 only)
-
- On the 80386/486, the CWDE instruction converts a signed 16-bit value in AX
- to a signed 32-bit value in EAX. The CDQ instruction converts a signed
- 32-bit value in EAX to a signed 64-bit value in the EDX:EAX register pair.
-
- This example converts signed integers using CBW, CWD, CWDE, and CDQ.
-
- .DATA
- mem8 SBYTE -5
- mem16 SWORD -5
- mem32 SDWORD -5
- .CODE
- .
- .
- .
- mov al, mem8 ; Load 8-bit -5 (FBh)
- cbw ; Convert to 16-bit -5 (FFFBh) in AX
-
- mov ax, mem16 ; Load 16-bit -5 (FFFBh)
- cwd ; Convert to 32-bit -5 (FFFF:FFFBh)
- ; in DX:AX
- mov ax, mem16 ; Load 16-bit -5 (FFFBh)
- cwde ; Convert to 32-bit -5 (FFFFFFFBh)
- ; in EAX
- mov eax, mem32 ; Load 32-bit -5 (FFFFFFFBh)
- cdq ; Convert to 64-bit -5
- ; (FFFFFFFF:FFFFFFFBh) in EDX:EAX
-
- Conversion instructions do not operate on unsigned numbers.
-
- The procedure is different for unsigned values. Unsigned values are extended
- by filling the upper bits with zeros rather than by sign extension. Because
- the sign-extend instructions do not work on unsigned integers, you must set
- the value of the higher register to zero.
-
- This example shows zero extension for unsigned numbers.
-
- .DATA
- mem8 BYTE 251
- mem16 WORD 251
- .CODE
- .
- .
- .
- mov al, mem8 ; Load 251 (FBh) from 8-bit memory
- sub ah, ah ; Zero upper half (AH)
-
- mov ax, mem16 ; Load 251 (FBh) from 16-bit memory
- sub dx, dx ; Zero upper half (DX)
-
- The 80386/486 processors provide instructions that move and extend a value
- to a larger data size in a single step. MOVSX moves a signed value into a
- register and sign-extends it. MOVZX moves an unsigned value into a register
- and zero-extends it.
-
- ; 80386/486 instructions
- movzx dx, bl ; Load unsigned 8-bit value into
- ; 16-bit register and zero-extend
-
- These special 80386 and 80486 instructions usually execute much faster than
- the equivalent 8086-80286 instructions.
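
The signed counterpart works the same way. The following sketch assumes that 80386 instructions have been enabled (for example, with the .386 directive) and reuses the name mem8 purely for illustration:

```asm
        .DATA
mem8    SBYTE   -5
        .CODE
        movsx   ax, mem8        ; AX = FFFBh (-5 as 16 bits)
        movsx   ebx, mem8       ; EBX = FFFFFFFBh (-5 as 32 bits)
        movzx   ecx, mem8       ; ECX = 000000FBh (251 if read as unsigned)
```

The last line shows why the choice of instruction matters: the same byte, FBh, extends to -5 with MOVSX but to 251 with MOVZX.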
-
-
- 4.2.2 Pushing and Popping Stack Integers
-
- A stack is an area of memory for storing data temporarily. Unlike other
- segments that store data starting from low memory, the stack stores data in
- reverse order─starting from high memory. Data is always pushed or popped
- from the top of the stack. The data on the stack can be the calling
- addresses of procedures or interrupts, procedure arguments, or any operands,
- flags, or registers your program needs to store temporarily.
-
- At first, the stack is an uninitialized segment of a finite size. As data is
- added to the stack at run time, the stack grows downward from high memory to
- low memory. When items are removed from the stack, it shrinks upward from
- low to high memory.
-
-
- 4.2.2.1 Saving Operands on the Stack
-
- PUSH and POP always operate on word-sized data.
-
- The PUSH instruction stores a two-byte operand on the stack. The POP
- instruction retrieves a previously pushed value. When a value is pushed onto
- the stack, the processor decreases the SP (Stack Pointer) register by 2. On
- 8086-based processors, the SP register always points to the top of the
- stack. The PUSH and POP instructions use the SP register to keep track of
- the current position.
-
- When a value is popped off the stack, the processor increases the SP
- register by 2. Although the stack always contains word values, the SP
- register points to byte addresses. Thus, SP changes in multiples of two.
- When a PUSH or POP instruction executes in a 32-bit code segment (one with
- USE32 use type), the processor transfers a four-byte value, and ESP changes
- in multiples of four.
-
- ────────────────────────────────────────────────────────────────────────────
- NOTE
- The 8086 and 8088 processors differ from later Intel processors in how they
- push and pop the SP register. If you give the statement push sp with the
- 8086 or 8088, the word pushed is the word in SP after the push operation.
- ────────────────────────────────────────────────────────────────────────────
-
- Figure 4.2 illustrates how pushes and pops change the SP register.
-
- (This figure may be found in the printed book.)
-
- On the 8086, PUSH and POP take only registers or memory expressions as their
- operands. The other processors allow an immediate value to be an operand for
- PUSH. For example, the following statement is legal on the 80186-80486
- processors:
-
- push 7 ; 3 clocks on 80286
-
- That statement is faster than these equivalent statements, which are
- required on the 8088 or 8086:
-
- mov ax, 7 ; 2 clocks plus
- push ax ; 3 clocks on 80286
-
- There are two ways to clean up the stack.
-
- Words are popped off the stack in reverse order: the last item pushed is the
- first popped. To return the stack to its original status, you can do the
- same number of pops as pushes. Alternatively, if you want to restore the
- stack without using the values on it, you can add the number of bytes pushed
- (two for each word) to the SP register.
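
The two approaches might look like this in practice. This is a sketch only; the registers saved are arbitrary. Because the stack grows downward, discarding pushed words means adding to SP, two bytes for each word.

```asm
        ; Method 1: pop the values back in reverse order
        push    ax              ; Save AX
        push    bx              ; Save BX
        pop     bx              ; Restore BX (last pushed, first popped)
        pop     ax              ; Restore AX

        ; Method 2: discard the values without retrieving them
        push    ax              ; SP decreases by 2
        push    bx              ; SP decreases by 2 again
        add     sp, 4           ; Two words * 2 bytes: SP back to start
```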
-
- To reference operands on the stack, keep in mind that the values pointed to
- by the BP (Base Pointer) and SP registers are relative to the SS (Stack
- Segment) register. The BP register is often used to point to the base of a
- frame of reference (a stack frame) within the stack.
-
- This example shows how you can access values on the stack using indirect
- memory operands with BP as the base register.
-
- push bp ; Save current value of BP
- mov bp, sp ; Set stack frame
- push ax ; Push first; SP = BP - 2
- push bx ; Push second; SP = BP - 4
- push cx ; Push third; SP = BP - 6
- .
- .
- .
- mov ax, [bp-6] ; Put third in AX
- mov bx, [bp-4] ; Put second in BX
- mov cx, [bp-2] ; Put first in CX
- .
- .
- .
- add sp, 6 ; Restore stack pointer
- ; two bytes per push
- pop bp ; Restore BP
-
- Creating labels for stack variables makes code easier to read.
-
- If you use these stack values often in your program, you may want to give
- them labels. For example, you can use TEXTEQU to create a label such as
- count TEXTEQU <[bp-6]>. Now you can replace the mov ax, [bp-6] statement
- in the example above with mov ax, count. Section 9.1, "Text Macros," gives
- more information about the TEXTEQU directive.
-
-
- 4.2.2.2 Saving Flags on the Stack
-
- Flags can be pushed onto and popped from the stack with the PUSHF and POPF
- instructions. You can use these instructions to save the status of flags
- before a procedure call and then to restore the original status after the
- procedure. You can also use them within a procedure to save and restore the
- flag status of the caller. The 32-bit versions of these instructions are
- PUSHFD and POPFD.
-
-
- This example saves the flags register before calling the systask
- procedure:
-
- pushf
- call systask
- popf
-
- If you do not need to store the entire flag register, you can use the LAHF
- instruction to manually load and store the status of the lower byte of the
- flag register in the AH register. (You need to save AH before making a
- procedure call.) SAHF restores the value.
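
A minimal sketch of that sequence, reusing the systask procedure from the example above, could read:

```asm
        lahf                    ; Copy low byte of flags register to AH
        push    ax              ; Save AX so that AH survives the call
        call    systask
        pop     ax              ; Recover the saved flags byte in AH
        sahf                    ; Copy AH back to low byte of flags register
```

Two caveats apply to this sketch. The overflow flag is not in the low byte of the flags register, so it is not preserved, and popping AX discards any value that systask returned in that register.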
-
-
- 4.2.2.3 Saving Registers on the Stack (80186-80486 Only)
-
- Starting with the 80186 processor, the PUSHA and POPA instructions push or
- pop all the general-purpose registers with only one instruction. These
- instructions save the status of all registers before a procedure call and
- then restore them after the return. Using PUSHA and POPA is significantly
- faster and takes fewer bytes of code than pushing and popping each register
- individually.
-
- The processor pushes the registers in the following order: AX, CX, DX, BX,
- SP, BP, SI, and DI. The SP word pushed is the value before the first
- register is pushed.
-
- The processor pops the registers in the opposite order. The 32-bit versions
- of these instructions are PUSHAD and POPAD.
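
For instance, a call can be bracketed as follows. The fragment assumes that 80186 instructions are enabled (for example, with the .186 directive) and reuses the systask procedure name from the earlier example:

```asm
        pusha                   ; Save AX, CX, DX, BX, SP, BP, SI, and DI
        call    systask         ; Procedure may change any general register
        popa                    ; Restore them; the saved SP word is skipped
```

Since POPA reloads every general-purpose register, this pattern suits only procedures that return nothing in registers.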
-
-
- 4.2.3 Adding and Subtracting Integers
-
- You can use the ADD, ADC, INC, SUB, SBB, and DEC instructions for adding,
- incrementing, subtracting, and decrementing values in single registers. You
- can also combine them to handle larger values that require two registers for
- storage.
-
-
- 4.2.3.1 Adding and Subtracting Integers Directly
-
- The ADD, INC (Increment), SUB, and DEC (Decrement) instructions operate on
- 8- and 16-bit values on the 8086-80286 processors, and on 8-, 16-, and
- 32-bit values on the 80386/486 processors. They can be combined with the ADC
- and SBB instructions to work on 32-bit values on the 8086 and 64-bit values
- on the 80386/486 processors (see Section 4.2.3.2).
-
-
- These instructions have two requirements:
-
-
- 1. If there are two operands, only one operand can be a memory operand.
-
- 2. If there are two operands, both must be the same size.
-
-
- PTR allows you to operate on data in sizes different from its declared type.
-
-
- To meet the second requirement, you can use the PTR operator to force an
- operand to the size required (see Section 4.2, "Integer Operations"). For
- example, if Buffer is an array of bytes and BX points to an element of the
- array, you can add a word from Buffer with
-
- add ax, WORD PTR Buffer[bx] ; Adds a word from the
- ; byte variable
-
- The next example shows 8-bit signed and unsigned addition and subtraction.
-
- .DATA
- mem8 BYTE 39
- .CODE
-
- ; Addition
-
-                     ;                      signed   unsigned
- mov al, 26          ; Start with register     26       26
- inc al              ; Increment                1        1
- add al, 76          ; Add immediate           76     + 76
-                     ;                       ----     ----
-                     ;                        103      103
- add al, mem8        ; Add memory              39     + 39
-                     ;                       ----     ----
- mov ah, al          ; Copy to AH            -114      142
-                     ;                  +overflow
- add al, ah          ; Add register                    142
-                     ;                                ----
-                     ;                                  28+carry
-
- ; Subtraction
-
-                     ;                      signed   unsigned
- mov al, 95          ; Load register           95       95
- dec al              ; Decrement               -1       -1
- sub al, 23          ; Subtract immediate     -23      -23
-                     ;                       ----     ----
-                     ;                         71       71
- mov mem8, 122       ; Change mem8 to 122
- sub al, mem8        ; Subtract memory       -122     -122
-                     ;                       ----     ----
-                     ;                        -51      205+sign
-
- mov ah, 119         ; Load register          119
- sub al, ah          ; and subtract from      -51
-                     ;                       ----
-                     ;                         86+overflow
-
- The INC and DEC instructions do not affect the carry flag. They update the
- overflow, sign, zero, auxiliary-carry, and parity flags, so you cannot use
- the carry flag to detect a carry or borrow after INC or DEC.
-
-
- Your programs must include error-recovery for overflows and carries.
-
- When the sum of eight-bit signed operands exceeds 127, the processor sets
- the overflow flag. (The overflow flag is also set if both operands are
- negative and the sum is less than -128.) Placing a JO (Jump on Overflow)
- or INTO (Interrupt on Overflow) instruction in your program at this point
- can transfer control to error-recovery statements. When the sum of
- eight-bit unsigned operands exceeds 255, the processor sets the carry flag.
- A JC (Jump on Carry) instruction at this point can transfer control to
- error-recovery statements.
-
-
- In the subtraction example above, the processor sets the sign flag if the
- result goes below 0. At this point, you can use a JS (Jump on Sign)
- instruction to transfer control to error-recovery statements.
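-
- A minimal sketch of such a check for signed byte addition follows. The
- variable total, the labels, and the choice to clamp the value to 127 are
- assumptions made for illustration; a real program would pick its own
- recovery action.
-
```asm
        .MODEL  small
        .STACK
        .DATA
total   SBYTE   100                 ; Hypothetical running total
        .CODE
        .STARTUP
        mov     al, total           ; AL = 100
        add     al, 50              ; 150 does not fit in a signed byte
        jo      toobig              ; Overflow flag is set, so jump is taken
        mov     total, al           ; Reached only when the sum is valid
        jmp     done
toobig: mov     total, 127          ; Hypothetical recovery: clamp to maximum
done:
        .EXIT
        END
```
-
- For unsigned operands, the same structure applies with JC in place of JO.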
-
-
- 4.2.3.2 Adding and Subtracting in Multiple Registers
-
- You can add and subtract numbers larger than the register size on your
- processor with the ADC (Add with Carry) and SBB (Subtract with Borrow)
- instructions. If the operations prior to an ADC or SBB instruction do not
- set the carry flag, these instructions are identical to ADD and SUB. When
- you operate on large values in more than one register, use ADD and SUB for
- the least significant part of the number and ADC or SBB for the most
- significant part.
-
- The following example illustrates multiple-register addition and
- subtraction. You can also use this technique with 64-bit operands on the
- 80386/486 processors.
-
-
- .DATA
- mem32 DWORD 316423
- mem32a DWORD 316423
- mem32b DWORD 156739
- .CODE
- .
- .
- .
- ; Addition
- mov ax, 43981 ; Load immediate 43981
- sub dx, dx ; into DX:AX
- add ax, WORD PTR mem32[0] ; Add to both + 316423
- adc dx, WORD PTR mem32[2] ; memory words ------
- ; Result in DX:AX 360404
-
- ; Subtraction
- mov ax, WORD PTR mem32a[0] ; Load mem32a 316423
- mov dx, WORD PTR mem32a[2] ; into DX:AX
- sub ax, WORD PTR mem32b[0] ; Subtract low - 156739
- sbb dx, WORD PTR mem32b[2] ; then high ------
- ; Result in DX:AX 159684
-
- For 32-bit registers on the 80386/486, only two steps are necessary. If your
- program needs to be assembled for more than one processor, you can assemble
- the statements conditionally, as shown in this example:
-
- .DATA
- mem32 DWORD 316423
- mem32a DWORD 316423
- mem32b DWORD 156739
- p386 TEXTEQU <(@Cpu AND 08h)>
- .CODE
- .
- .
- .
- ; Addition
- IF p386
- mov eax, 43981 ; Load immediate
- add eax, mem32 ; Result in EAX
- ELSE
- .
- . ; do steps in previous example
- .
- ENDIF
-
- ; Subtraction
- IF p386
- mov eax, mem32a ; Load memory
- sub eax, mem32b ; Result in EAX
- ELSE
- .
- . ; do steps in previous example
- .
- ENDIF
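-
- The ADD and ADC pairing described in this section extends to 64-bit
- operands on the 80386/486. In this sketch, each 64-bit value is stored as
- two doublewords, low doubleword first. The names big1 and big2 and their
- values are hypothetical.
-
```asm
        .MODEL  small
        .386                        ; Enable 32-bit registers
        .STACK
        .DATA
big1    DWORD   0FFFFFFFFh, 00000001h   ; 64-bit value 00000001FFFFFFFFh
big2    DWORD   00000001h,  00000000h   ; 64-bit value 1
        .CODE
        .STARTUP
        mov     eax, big1[0]        ; Load low doubleword
        mov     edx, big1[4]        ; Load high doubleword
        add     eax, big2[0]        ; Low halves: result 0, carry set
        adc     edx, big2[4]        ; High halves plus carry: 1 + 0 + 1 = 2
                                    ; EDX:EAX = 00000002:00000000h
        .EXIT
        END
```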
-
- Since the status of the carry flag affects the results of calculations with
- ADC and SBB, be sure to turn off the carry flag with the CLC (Clear Carry
- Flag) instruction or use ADD or SUB for the first calculation when
- appropriate.
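-
- One situation where CLC is useful is a loop that applies ADC to every
- word of a multiword number, including the first. The following sketch
- adds two 64-bit values stored as four words each, low word first. The
- names num1 and num2 are hypothetical. The index is advanced with INC
- because INC leaves the carry flag unchanged between iterations, whereas
- ADD SI, 2 would overwrite it.
-
```asm
        .MODEL  small
        .STACK
        .DATA
num1    WORD    0FFFFh, 0FFFFh, 0000h, 0000h    ; 00000000FFFFFFFFh
num2    WORD    0001h,  0000h,  0000h, 0000h    ; 1
        .CODE
        .STARTUP
        mov     cx, LENGTHOF num1   ; Four words to add
        sub     si, si              ; SI = offset of the low word
        clc                         ; No carry into the first ADC
again:  mov     ax, num2[si]        ; MOV does not change the flags
        adc     num1[si], ax        ; Add word plus carry from previous word
        inc     si                  ; INC preserves the carry flag
        inc     si
        loop    again               ; LOOP does not change the flags
                                    ; num1 = 0000h, 0000h, 0001h, 0000h
        .EXIT
        END
```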
-
-
- 4.2.4 Multiplying and Dividing Integers
-
- The 8086 family of processors uses different multiplication and division
- instructions for signed and unsigned integers. Multiplication and division
- instructions also have special requirements depending on the size of the
- operands and the processor the code runs on.
-
-
- 4.2.4.1 Using Multiplication Instructions
-
- The MUL instruction multiplies unsigned numbers. IMUL multiplies signed
- numbers. For both instructions, one factor must be in the accumulator
- register (AL for 8-bit numbers, AX for 16-bit numbers, EAX for 32-bit
- numbers). The other factor can be in any single register or memory operand.
- The result overwrites the contents of the accumulator register.
-
- Multiplying two 8-bit numbers produces a 16-bit result returned in AX.
- Multiplying two 16-bit operands yields a 32-bit result in DX:AX. The
- 80386/486 processor handles 64-bit products in the same way in the EDX:EAX
- pair.
-
- This example illustrates multiplication of 8-bit unsigned and 16-bit signed
- integers.
-
- .DATA
- mem16 SWORD -30000
- .CODE
- .
- .
- .
- ; 8-bit unsigned multiply
- mov al, 23          ; Load AL                   23
- mov bl, 24          ; Load BL                 * 24
- mul bl              ; Multiply BL            -----
-                     ; Product in AX            552
-                     ; overflow and carry set
-
- ; 16-bit signed multiply
- mov ax, 50          ; Load AX                   50
-                     ;                       -30000
- imul mem16          ; Multiply memory        -----
-                     ; Product in DX:AX    -1500000
-                     ; overflow and carry set
-
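- For MUL, a nonzero number in the upper half of the result (AH for byte
- operands, DX for word operands, EDX for doubleword operands) sets the
- overflow and carry flags. For IMUL, the flags are set when the upper half
- is not simply the sign extension of the lower half.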
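-
- The flags give a quick way to test whether a product fits in the same
- size as the factors. The sketch below stores an 8-bit product only if
- nothing spilled into AH. The variable result, the labels, and the 0FFh
- marker value are assumptions for illustration.
-
```asm
        .MODEL  small
        .STACK
        .DATA
result  BYTE    ?                   ; Hypothetical destination
        .CODE
        .STARTUP
        mov     al, 12
        mov     bl, 10
        mul     bl                  ; AX = 120; AH = 0, so carry is clear
        jc      toobig              ; Taken only if the product needs AH
        mov     result, al          ; Product fits in a byte
        jmp     done
toobig: mov     result, 0FFh        ; Hypothetical marker for "too large"
done:
        .EXIT
        END
```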
-
- On the 80186-80486 processors, the IMUL instruction supports three different
- operand combinations. The first syntax option allows for 16-bit multipliers
- producing a 16-bit product or 32-bit multipliers for 32-bit products on the
- 80386/486. The result overwrites the destination. The syntax for this
- operation is
-
- IMUL register16, immediate
-
- Multiplication by an immediate operand is possible on the 80186-80486.
-
- The second syntax option specifies three operands for IMUL. The first
- operand must be a 16-bit register operand, the second a 16-bit memory or
- register operand, and the third a 16-bit immediate operand. IMUL multiplies
- the memory (or register) and immediate operands and stores the product in
- the register operand with this syntax:
-
- IMUL register16, {memory16 | register16}, immediate
-
- For the 80386/486 only, a third option for IMUL allows an additional operand
- for multiplication of a register value by a register or memory value. This
- is the syntax:
-
- IMUL register,{register | memory}
-
- The destination can be any 16-bit or 32-bit register. The source must be the
- same size as the destination.
-
- In all of these options, products too large to fit in 16 or 32 bits set the
- overflow and carry flags. The following examples show these three options
- for IMUL.
-
- imul dx, 456 ; Multiply DX times 456 on 80186-80486
- imul ax, [bx],6 ; Multiply the value pointed to by BX
- ; by 6 and put the result in AX
-
- imul dx, ax ; Multiply DX times AX on 80386
- imul ax, [bx] ; Multiply AX by the value pointed to
- ; by BX on 80386
-
- The IMUL instruction with multiple operands can be used for either signed or
- unsigned multiplication, since the 16-bit product is the same in either
- case. To get a 32-bit result, you must use the single-operand version of MUL
- or IMUL.
-
-
- 4.2.4.2 Using Division Instructions
-
- The DIV instruction divides unsigned numbers, and IDIV divides signed
- numbers. Both return a quotient and a remainder.
-
- Table 4.1 summarizes the division operations. The dividend is the number to
- be divided, and the divisor is the number to divide by. The quotient is the
- result. The divisor can be in any register or memory location except the
- registers where the quotient and remainder are returned.
-
- Table 4.1 Division Operations
-
- Size of       Dividend      Size of
- Operand       Register      Divisor       Quotient      Remainder
- ────────────────────────────────────────────────────────────────────────────
- 16 bits       AX            8 bits        AL            AH
-
- 32 bits       DX:AX         16 bits       AX            DX
-
- 64 bits       EDX:EAX       32 bits       EAX           EDX
- (80386
- and 80486)
-
- ────────────────────────────────────────────────────────────────────────────
-
-
- Unsigned division does not require careful attention to flags. The following
- examples illustrate signed division, which can be more complex.
-
- .DATA
- mem16 SWORD -2000
- mem32 SDWORD 500000
- .CODE
- .
- .
- .
- ; Divide 16-bit unsigned by 8-bit
- mov ax, 700 ; Load dividend 700
- mov bl, 36 ; Load divisor DIV 36
- div bl ; Divide BL ------
- ; Quotient in AL 19
- ; Remainder in AH 16
-
- ; Divide 32-bit signed by 16-bit
- mov ax, WORD PTR mem32[0] ; Load into DX:AX
- mov dx, WORD PTR mem32[2] ; 500000
- idiv mem16 ; DIV -2000
- ; Divide memory ------
- ; Quotient in AX -250
- ; Remainder in DX 0
-
- ; Divide 16-bit signed by 16-bit
- mov ax, WORD PTR mem16 ; Load into AX -2000
- cwd ; Extend to DX:AX
- mov bx,-421 ; DIV -421
- idiv bx ; Divide by BX -----
- ; Quotient in AX 4
- ; Remainder in DX -316
-
- If the dividend and divisor are the same size, sign-extend or zero-extend
- the dividend so that it is the length expected by the division instruction.
- See Section 4.2.1.4, "Extending Signed and Unsigned Integers."
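-
- The extension step can be sketched as follows for a 16-bit dividend and a
- 16-bit divisor. The variables uval and sval and the divisor 7 are
- hypothetical values chosen for illustration.
-
```asm
        .MODEL  small
        .STACK
        .DATA
uval    WORD    50000               ; Hypothetical unsigned dividend
sval    SWORD   -20000              ; Hypothetical signed dividend
        .CODE
        .STARTUP
        ; Unsigned 16-bit by 16-bit
        mov     ax, uval
        sub     dx, dx              ; Zero-extend AX into DX:AX
        mov     bx, 7
        div     bx                  ; Quotient AX = 7142, remainder DX = 6

        ; Signed 16-bit by 16-bit
        mov     ax, sval
        cwd                         ; Sign-extend AX into DX:AX
        mov     bx, 7
        idiv    bx                  ; Quotient AX = -2857, remainder DX = -1
        .EXIT
        END
```
-
- Using CWD before DIV, or clearing DX before IDIV, would give the division
- instruction the wrong 32-bit dividend whenever the high bit of AX is set.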
-
-
- 4.3 Manipulating Integers at the Bit Level
-
- The instructions introduced so far in this chapter accessed integers at the
- byte or word level. The logical, shift, and rotate instructions described in
- this section, however, access the individual bits of the integers. You can
- use logical instructions to evaluate characters and do other text and screen
- operations. The shift and rotate instructions do similar tasks by shifting
- and rotating bits through registers. This section discusses some
- applications of these bit-level operations.
-
-
- 4.3.1 Logical Operations
-
- The logical instructions─AND, OR, XOR, and NOT─operate on each bit in one
- operand and on the corresponding bit in the other. The following list shows
- how each instruction works. Except for NOT, these instructions require two
- integers of the same size.
-
-
- Instruction Sets a Bit to 1 under These Conditions
- ────────────────────────────────────────────────────────────────────────────
- AND Both corresponding bits in the operands
- have the value 1.
-
- OR Either of the corresponding bits in the
- operands has the value 1.
-
- XOR Either, but not both, of the
- corresponding bits in the operands has
- the value 1.
-
- NOT The corresponding bit in the operand is
- 0. (This instruction takes only one
- operand.)
-
-
- ────────────────────────────────────────────────────────────────────────────
- NOTE
- Do not confuse logical instructions with the logical operators, which
- perform these operations at assembly time, not run time. Although the names
- are the same, the assembler recognizes the difference from context.
- ────────────────────────────────────────────────────────────────────────────
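-
- As a small sketch of that distinction, the statement below uses both
- forms on one line. The values are arbitrary.
-
```asm
        .MODEL  small
        .STACK
        .CODE
        .STARTUP
        mov     ax, 1234h
        and     ax, 0FF00h OR 00F0h ; OR operator: the assembler computes
                                    ;   the mask 0FFF0h while assembling
                                    ; AND instruction: the processor clears
                                    ;   bits 0-3 at run time, AX = 1230h
        .EXIT
        END
```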
-
- The following example shows the result of the AND, OR, XOR, and NOT
- instructions operating on a value in the AX register and in a mask. A mask
- is a binary or hexadecimal number with appropriate bits set for the intended
- operation.
-
- mov ax, 035h        ; Load value                      00110101
- and ax, 0FBh        ; Clear bit 2                 AND 11111011
-                     ;                                 --------
-                     ; Value is now 31h                00110001
- or ax, 016h         ; Set bits 4,2,1               OR 00010110
-                     ;                                 --------
-                     ; Value is now 37h                00110111
- xor ax, 0ADh        ; Toggle bits 7,5,3,2,0       XOR 10101101
-                     ;                                 --------
-                     ; Value is now 9Ah                10011010
- not ax              ; AL is now 65h (AX is 0FF65h)    01100101
-
- Use AND, OR, and XOR to set or clear specific bits.
-
- You can use the AND instruction to clear the value of specific bits
- regardless of their current settings. To do this, put the target value in
- one operand and a mask of the bits you want to clear in the other. The bits
- of the mask should be 0 for any bit positions you want to clear and 1 for
- any bit positions you want to remain unchanged.
-
- You can use the OR instruction to force specific bits to 1 regardless of
- their current settings. The bits of the mask should be 1 for any bit
- positions you want to set and 0 for any bit positions you want to remain
- unchanged.
-
- You can use the XOR instruction to toggle the value of specific bits
- (reverse them from their current settings). This instruction sets a bit to 1
- if the corresponding bits are different or to 0 if they are the same. The
- bits of the mask should be 1 for any bit positions you want to toggle and 0
- for any bit positions you want to remain unchanged.
-
- The following examples show an application for each of these instructions.
- The code illustrating the AND instruction converts a "y" or "n" read from
- the keyboard to uppercase, since bit 5 is always clear in uppercase letters.
- In the example for OR, the first statement is faster and uses fewer bytes
- than cmp bx, 0. When the operands for XOR are identical, each bit cancels
- itself, producing 0.
-
- ; Converts characters to uppercase
- mov ah, 7 ; Get character without echo
- int 21h
- and al, 11011111y ; Convert to uppercase by clearing
- ; bit 5
- cmp al, 'Y' ; Is it Y?
- je yes ; If so, do Yes actions
- . ; else do No actions
- .
- yes: .
-
- ; Compares operand to 0
- or bx, bx ; Compare to 0
- ; 2 bytes, 2 clocks on 8088
- jg positive ; BX is positive
- jl negative ; BX is negative
- ; else BX is zero
-
- ; Sets a register to 0
- xor cx, cx ; 2 bytes, 3 clocks on 8088
- sub cx, cx ; 2 bytes, 3 clocks on 8088
- mov cx, 0 ; 3 bytes, 4 clocks on 8088
-
- On the 80386 and 80486, the BSF (Bit Scan Forward) and the BSR (Bit Scan
- Reverse) instructions perform operations similar to those of the logical
- instructions. They scan the contents of a register to find the first-set or
- last-set bit. You can use BSF or BSR to find the position of a set bit in a
- mask or to check if a register value is 0.
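-
- A brief sketch of both uses follows. The value loaded into EAX and the
- label nobits are assumptions for illustration. Both instructions set the
- zero flag when the source operand is 0, in which case the destination
- register does not hold a meaningful bit position.
-
```asm
        .MODEL  small
        .386                        ; BSF and BSR need an 80386 or later
        .STACK
        .CODE
        .STARTUP
        mov     eax, 00000050h      ; Bits 4 and 6 are set
        bsf     ecx, eax            ; Scan from bit 0 upward: ECX = 4
        jz      nobits              ; Zero flag set only if EAX is 0
        bsr     edx, eax            ; Scan from bit 31 downward: EDX = 6
nobits:
        .EXIT
        END
```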
-
-
- 4.3.2 Shifting and Rotating Bits
-
- The 8086-based processors provide a complete set of instructions for
- shifting and rotating bits. Shift instructions move bits a specified number
- of places to the right or left. The last bit in the direction of the shift
- goes into the carry flag, and the first bit is filled with 0 or with the
- previous value of the first bit.
-
- Rotate instructions also move bits a specified number of places to the right
- or left. For each bit rotated, the last bit in the direction of the rotate
- operation moves into the first bit position at the other end of the operand.
- With some variations, the carry bit is used as an additional bit of the
- operand. Figure 4.3 illustrates the eight variations of shift and rotate
- instructions for eight-bit operands. Notice that SHL and SAL are identical.
-
-
- (This figure may be found in the printed book.)
-
- All shift instructions use the same format. Before the instruction executes,
- the destination operand contains the value to be shifted; after the
- instruction executes, it contains the shifted operand. The source operand
- contains the number of bits to shift or rotate. It can be the immediate
- value 1 or the CL register. The 8088 and 8086 processors do not accept any
- other values or registers with these instructions.
-
-
- The shift instruction allows you to change masks during program execution.
-
- Masks for logical instructions can be shifted to new bit positions. For
- example, an operand that masks off a bit or group of bits can be shifted to
- move the mask to a different position, allowing you to mask off a different
- bit each time the mask is used. This technique, illustrated in the following
- example, is useful only if the mask value is unknown until run time.
-
-
- .DATA
- masker BYTE 00000010y ; Mask that may change at run time
- .CODE
- .
- .
- .
- mov cl, 2 ; Rotate two at a time
- mov bl, 57h ; Load value to be changed 01010111y
- rol masker, cl ; Rotate two to left 00001000y
- or bl, masker ; Turn on masked values ---------
- ; New value is 05Fh 01011111y
- rol masker, cl ; Rotate two more 00100000y
- or bl, masker ; Turn on masked values ---------
- ; New value is 07Fh 01111111y
-
- Starting with the 80186 processor, you can use eight-bit immediate values
- larger than 1 as the source operand for shift or rotate instructions, as
- shown below:
-
- shr bx, 4 ; 9 clocks, 3 bytes on 80286
-
- The following statements are equivalent if the program must run on the 8088
- or 8086 processor:
-
- mov cl, 4 ; 2 clocks, 2 bytes on 80286
- shr bx, cl ; 9 clocks, 2 bytes on 80286
- ; 11 clocks, 4 bytes
-
-
- 4.3.3 Multiplying and Dividing with Shift Instructions
-
- You can use the shift instructions (SHR, SHL, SAR, and SAL) for
- multiplication and division. Shifting an integer right by one bit has the
- effect of dividing by two; shifting left by one bit has the effect of
- multiplying by two. You can take advantage of shifts to do fast
- multiplication and division by powers of two. For example, shifting left
- twice multiplies by four, shifting left three times multiplies by eight, and
- so on.
-
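- Use SHR (Shift Right) to divide unsigned numbers. You can use SAR (Shift
- Arithmetic Right) to divide signed numbers, but SAR rounds toward negative
- infinity, whereas IDIV truncates toward zero. The results differ for
- negative dividends that are not exact multiples of the divisor: shifting -7
- right one bit with SAR gives -4, but dividing -7 by 2 with IDIV gives -3.
- Division using SAR must adjust for this difference. Multiplication by
- shifting is the same for signed and unsigned numbers, so you can use either
- SAL or SHL.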
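-
- One common adjustment, sketched here as an assumption rather than as the
- method this guide prescribes, is to add the divisor minus 1 to a negative
- dividend before shifting. The example divides -7 by 4 and produces the
- quotient that IDIV would return.
-
```asm
        .MODEL  small
        .STACK
        .CODE
        .STARTUP
        mov     ax, -7              ; Signed dividend
        cwd                         ; DX = 0FFFFh if AX is negative, else 0
        and     dx, 3               ; DX = divisor - 1 for negative dividends
        add     ax, dx              ; -7 + 3 = -4
        sar     ax, 1               ; Two one-bit shifts divide by 4 and
        sar     ax, 1               ;   assemble for any 8086-family CPU
                                    ; AX = -1, as IDIV gives for -7 / 4
        .EXIT
        END
```
-
- Without the adjustment, the two shifts would leave -2 in AX.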
-
- Use shifts instead of MUL or DIV to optimize your code.
-
- Since the multiply and divide instructions are very slow on the 8088 and
- 8086 processors, using shifts instead can often speed operations by a factor
- of 10 or more. For example, on the 8088 or 8086 processor, these statements
- take only four clocks:
-
- sub ah, ah ; Clear AH
- shl ax, 1 ; Multiply byte in AL by 2
-
- The following statements produce the same results, but take between 74 and
- 81 clocks on the 8088 or 8086. The same statements take 15 clocks on the
- 80286 and between 11 and 16 clocks on the 80386.
-
- mov bl, 2 ; Multiply byte in AL by 2
- mul bl
-
- You can put multiplication and division operations in macros so they can be
- changed if the constants in a program change, as shown in the two macros
- below.
-
- mul_10 MACRO factor ; Factor must be unsigned
- mov ax, factor ; Load into AX
- shl ax, 1 ; AX = factor * 2
- mov bx, ax ; Save copy in BX
- shl ax, 1 ; AX = factor * 4
- shl ax, 1 ; AX = factor * 8
- add ax, bx ; AX = (factor * 8) + (factor * 2)
- ENDM ; AX = factor * 10
-
- div_512 MACRO dividend ; Dividend must be unsigned
- mov ax, dividend ; Load into AX
- shr ax, 1 ; AX = dividend / 2 (unsigned)
- xchg al, ah ; xchg is like rotate right 8
- ; AL = (dividend / 2) / 256
- cbw ; Clear upper byte
- ENDM ; AX = (dividend / 512)
-
- Since RCR and RCL use the carry flag, clear it before multiple-register
- shifts.
-
- If you need to shift a value that is too large to fit in one register, you
- can shift each part separately. The RCR (Rotate through Carry Right) and
- RCL (Rotate through Carry Left) instructions carry values from the first
- register to the second by passing the leftmost or rightmost bit through
- the carry flag.
-
-
- This example shifts a multiword value.
-
-
- .DATA
- mem32 DWORD 500000
- .CODE
-
- ; Divide 32-bit unsigned by 16
- mov cx, 4 ; Shift right 4 500000
- again: shr WORD PTR mem32[2], 1 ; Shift into carry DIV 16
- rcr WORD PTR mem32[0], 1 ; Rotate carry in ------
- loop again ; 31250
-
- Since the carry flag is treated as part of the operand (it's like using a
- nine-bit or 17-bit operand), the flag value before the operation is crucial.
- The carry flag can be set by a previous instruction, but you can also set it
- directly by using the CLC (Clear Carry Flag), CMC (Complement Carry Flag),
- and STC (Set Carry Flag) instructions.
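-
- The same pairing works in the other direction. As a sketch, the following
- statements multiply a 32-bit unsigned value by 2, shifting the low word
- first so that its high bit passes through the carry flag into the high
- word. The variable and its value are reused from the example above only
- for illustration.
-
```asm
        .MODEL  small
        .STACK
        .DATA
mem32   DWORD   500000
        .CODE
        .STARTUP
        ; Multiply 32-bit unsigned by 2
        shl     WORD PTR mem32[0], 1    ; High bit of low word goes to carry
        rcl     WORD PTR mem32[2], 1    ; Carry enters low bit of high word
                                        ; mem32 now holds 1000000
        .EXIT
        END
```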
-
-
- On the 80386 and 80486, an alternate method for multiplying quickly by
- constants takes advantage of the LEA (Load Effective Address) instruction
- and the scaling of indirect memory operands. By using a 32-bit value as both
- the index and the base register in an indirect memory operand, you can
- multiply by the constants 2, 3, 4, 5, 8, and 9 more quickly than you can by
- using the MUL instruction. LEA calculates the offset of the source operand
- and stores it into the destination register, EBX, as this example shows:
-
- lea ebx, [eax*2] ; EBX = 2 * EAX
- lea ebx, [eax*2+eax] ; EBX = 3 * EAX
- lea ebx, [eax*4] ; EBX = 4 * EAX
- lea ebx, [eax*4+eax] ; EBX = 5 * EAX
- lea ebx, [eax*8] ; EBX = 8 * EAX
- lea ebx, [eax*8+eax] ; EBX = 9 * EAX
-
- Section 3.2.4.3, "Indirect Memory Operands with 32-Bit Registers," discusses
- scaling of 80386 indirect memory operands, and Section 3.3.3.2, "Loading
- Addresses into Registers," introduces LEA.
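-
- Constants outside this list can sometimes be reached by combining LEA
- with one more instruction. The following sketch, an illustration rather
- than a technique taken from this guide, multiplies by 10 as 5 times 2.
- The starting value 7 is arbitrary.
-
```asm
        .MODEL  small
        .386                        ; Scaled index needs an 80386 or later
        .STACK
        .CODE
        .STARTUP
        mov     eax, 7
        lea     ebx, [eax*4+eax]    ; EBX = 5 * EAX = 35
        add     ebx, ebx            ; EBX = 10 * EAX = 70
        .EXIT
        END
```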
-
- This chapter has covered the integer operations you use in your MASM
- programs. The next chapter looks at more complex data types─arrays, strings,
- structures, unions, and records. Many of the operations presented in this
- chapter can also be applied to the data structures discussed in Chapter 5,
- "Defining and Using Complex Data Types."
-
-
- 4.4 Related Topics in Online Help
-
- Online help features additional information about the topics discussed in
- this chapter. From the "MASM 6.0 Contents" screen for MASM online help,
- select the following topics:
-
- Topic                                 Access
- ────────────────────────────────────────────────────────────────────────────
- BYTE, WORD, ...                       Choose "Directives" and then "Data
-                                       Allocation"
-
- Bitwise logical operations            Choose "Operators" and then from the
-                                       list of operators, choose "Logical
-                                       and Shift"
-
- Location counter                      Choose "Predefined Symbols" for
-                                       information on the $ symbol
-
- BSF, BSR, SHLD, SHRD, and SET         From the "Processor Instructions"
- condition                             categories, choose "Logical and
-                                       Shift"
-
- LES, LFS, LGS                         From the "Processor Instructions"
-                                       categories, choose "Data Transfer"
-
- .RADIX directive                      Choose "Directives" and then choose
-                                       "Miscellaneous"
-
- MOD                                   Choose "Operators," and then
-                                       "Arithmetic"
-
- OPATTR, .TYPE, HIGH, LOW, HIGHWORD,   Choose "Operators," then
- and LOWWORD                           "Miscellaneous"
-
- OPTION EXPR32,                        Choose "Directives," and then
- OPTION EXPR16,                        "OPTION"
-
-
-
-
-
-
-
-
- Chapter 5 Defining and Using Complex Data Types
- ────────────────────────────────────────────────────────────────────────────
-
- With the complex data types available in MASM 6.0─arrays, strings, records,
- structures, and (new to version 6.0) unions─you can access data either as a
- unit or as individual elements that make up the unit. The individual
- elements of complex data types are often the integer types discussed in
- Chapter 4, "Defining and Using Integers."
-
- Section 5.1 first discusses how to declare, reference, and initialize arrays
- and strings. This section summarizes the general steps needed to process
- arrays and strings and describes the MASM instructions for moving,
- comparing, searching, loading, and storing operations.
-
- Section 5.2 covers similar information for structures and unions: how to
- declare structure and union types, how to define structure and union
- variables, and how to reference structures and unions and their fields.
-
- Section 5.3 explains how to declare record types, define record variables,
- and use record operators.
-
- All three sections also describe how to use the LENGTHOF, SIZEOF, and TYPE
- operators with each complex data type.
-
-
- 5.1 Arrays and Strings
-
- An assembly-language array is a sequence of fixed-size variables. A string
- is an array of characters. You can access the elements in an array or string
- relative to the first element.
-
- This section explains and illustrates the essential ways to handle arrays
- and strings in your programs. It covers arrays first, beginning with the two
- ways to declare an array and continuing with how to reference it. The
- section then explains the special requirements for declaring and
- initializing a string. Finally, it describes the processing of arrays and
- strings.
-
-
- 5.1.1 Declaring and Referencing Arrays
-
- You can declare an array in two ways: you can specify a list of array
- elements, or you can use the DUP operator to specify a group of identical
- elements.
-
- To declare an array, you must supply a label name, a type, and a series of
- elements separated by commas. You can access each element of an array
- relative to the first. In the examples below, warray and xarray are
- arrays.
-
- warray WORD 1, 2, 3, 4
- xarray DWORD 0FFFh, 0AAAh
-
- The assembler stores the elements consecutively in memory, with the first
- address referenced by the label name.
-
- Initializer lists can be longer than one line.
-
- Beginning with MASM 6.0, initializer lists of array declarations can span
- multiple lines. The first initializer must appear on the same line as the
- data type, all entries must be initialized, and, if you want the array to
- continue to the new line, the line must end with a comma. These examples
- show legal multiple-line array declarations:
-
- big BYTE 21, 22, 23, 24, 25,
- 26, 27, 28
-
- somelist WORD 10,
- 20,
- 30
-
- An array may also span more than one logical line if you put a separate
- type declaration on each logical line. Use this form only if you do not
- need the new LENGTHOF and SIZEOF operators discussed later in this section
- to describe the whole array, since they count only the initializers on the
- line that carries the label:
-
- var1 BYTE 10, 20, 30
- BYTE 40, 50, 60
- BYTE 70, 80, 90
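-
- The difference between the two declaration styles shows up when LENGTHOF
- is applied, as this sketch suggests. The names var1 and var2 are
- illustrative. In var1 only the line carrying the label is counted, while
- the trailing comma in var2 makes both physical lines one initializer
- list.
-
```asm
        .MODEL  small
        .STACK
        .DATA
var1    BYTE    10, 20, 30          ; Label line: three initializers
        BYTE    40, 50, 60          ; Separate declaration, not counted
var2    BYTE    10, 20, 30,
                40, 50, 60
        .CODE
        .STARTUP
        mov     cx, LENGTHOF var1   ; CX = 3
        mov     dx, LENGTHOF var2   ; DX = 6
        .EXIT
        END
```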
-
-
- The DUP Operator
-
- You can also declare an array with the DUP operator. This operator can be
- used with any of the data allocation directives described in Section 4.1.1.
- In the syntax
-
- count DUP (initialvalue [[,initialvalue]]...)
-
- the count value sets the number of times to repeat the list of initial
- values within the parentheses.
- Each initial value is evaluated only once and can be any expression that
- evaluates to an integer value, a character constant, or another DUP
- operator. The initial value (or values) must always be placed within
- parentheses. For example, the statement
-
- barray BYTE 5 DUP (1)
-
- allocates the integer 1 five times for a total of five bytes.
-
- The following examples show various ways to use the DUP operator to allocate
- data elements.
-
- array DWORD 10 DUP (1) ; 10 doublewords
- ; initialized to 1
- buffer BYTE 256 DUP (?) ; 256-byte buffer
-
- masks BYTE 20 DUP (040h, 020h, 04h, 02h) ; 80-byte buffer
- ; with bit masks
- three_d DWORD 5 DUP (5 DUP (5 DUP (0))) ; 125 doublewords
- ; initialized to 0
-
-
- Referencing Arrays
-
- Once an array is defined, you can refer to its first element by typing the
- array name (no brackets required). The array name refers to the first object
- of the given type in the list of initial values.
-
- If warray has been defined as
-
- warray WORD 2, 4, 6, 8, 10
-
- then referencing warray in your program refers to the first word─the word
- containing 2.
-
- To refer to the next element (in an array of words), use either of these two
- forms, each of which refers to the array element two bytes past the
- beginning of warray:
-
- warray+2
- warray[2]
-
- This element can be used as you would any data item:
-
- mov ax, warray[2]
- push warray+2
-
- When used with a variable name, brackets only add a number to the address.
- If warray refers to the address 2400h, then warray[2] refers to the
- address 2402h. The BOUND instruction (80186-80486 only) can be used to
- verify that an index value is within the bounds of an array.
-
- Array indexes are not scaled. The index is a distance in bytes.
-
- In assembly language, array indexes are zero-based and unscaled. The number
- within brackets always represents an absolute distance in bytes. In
- practical terms, the fact that indexes are unscaled means that if an element
- is larger than one byte, you must multiply the index of the element by its
- size (in the example above, 2), and then add the result to the address of
- the array. Thus, the expression warray[4] represents the third element,
- which is four bytes past the beginning of the array. Similarly, the
- expression warray[6] represents the fourth element.
-
- You can also determine an index at run time:
-
- mov si, cx ; CX holds index value
- shl si, 1 ; Scale for word referencing
- mov ax, warray[si] ; Move element into AX
-
- The offset required to access an array element can be calculated with the
- following formula:
-
- nth element of array = array[(n-1) * size of element]
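-
- As a sketch of the formula in use, the following program loads the fourth
- element of a word array twice: once with the offset computed by the
- assembler, and once with the offset computed at run time. The symbol nth
- is a hypothetical constant introduced for this example.
-
```asm
        .MODEL  small
        .STACK
        .DATA
warray  WORD    2, 4, 6, 8, 10
nth     EQU     4                   ; Hypothetical: the element wanted
        .CODE
        .STARTUP
        ; Offset computed at assembly time: (4 - 1) * 2 = 6
        mov     ax, warray[(nth - 1) * (TYPE warray)]   ; AX = 8

        ; Offset computed at run time
        mov     bx, nth             ; BX = n
        dec     bx                  ; BX = n - 1
        shl     bx, 1               ; Scale by the element size (2 bytes)
        mov     ax, warray[bx]      ; AX = 8
        .EXIT
        END
```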
-
-
- LENGTHOF, SIZEOF, and TYPE for Arrays
-
- When applied to arrays, the LENGTHOF, SIZEOF, and TYPE operators return
- information about the length and size of the array and about the type of the
- initializers.
-
- The LENGTHOF operator returns the number of items in the definition. It can
- be applied only to an integer label. This is useful for determining the
- number of elements you need to process in an array of integers. For an array
- or string label, SIZEOF returns the number of bytes used by the initializers
- in the definition. TYPE returns the size of the elements of the array. These
- examples illustrate these operators:
-
- array WORD 40 DUP (5)
-
- larray EQU LENGTHOF array ; 40 elements
- sarray EQU SIZEOF array ; 80 bytes
- tarray EQU TYPE array ; 2 bytes per element
-
- num DWORD 4, 5, 6, 7,
- 8, 9, 10, 11
-
- lnum EQU LENGTHOF num ; 8 elements
- snum EQU SIZEOF num ; 32 bytes
- tnum EQU TYPE num ; 4 bytes per element
-
- warray WORD 40 DUP (40 DUP (5))
-
- len EQU LENGTHOF warray ; 1600 elements
- siz EQU SIZEOF warray ; 3200 bytes
- typ EQU TYPE warray ; 2 bytes per element
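-
- These operators are convenient for driving a loop over an array without
- hard-coding its dimensions. The sketch below sums a small word array. The
- array contents and the label nextel are assumptions for illustration.
-
```asm
        .MODEL  small
        .STACK
        .DATA
warray  WORD    2, 4, 6, 8, 10
        .CODE
        .STARTUP
        mov     cx, LENGTHOF warray ; CX = 5, the loop count
        sub     si, si              ; SI = byte offset of current element
        sub     ax, ax              ; AX = running total
nextel: add     ax, warray[si]      ; Add the current element
        add     si, TYPE warray     ; Advance 2 bytes to the next word
        loop    nextel              ; AX = 30 when the loop ends
        .EXIT
        END
```
-
- If elements are later added to or removed from the declaration, the loop
- count follows automatically when the program is reassembled.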
-
-
- 5.1.2 Declaring and Initializing Strings
-
- A string is an array of bytes. Initializing a string like "Hello, there"
- allocates and initializes one byte for each character in the string. An
- initialized string can be no longer than 255 characters.
-
- Strings declared with types other than BYTE must fit the memory space
- allocated.
-
- For data directives other than BYTE, a string may initialize only a single
- element. This element must be short enough to fit into the specified size
- and conform to the expression word size in effect (see Section
- 1.2.4,"Integer Constants and Constant Expressions"), as shown in these
- examples:
-
- wstr WORD "OK"
- dstr DWORD "ABCD" ; Legal under EXPR32 only
-
- As with arrays, string initializers can span multiple lines. The line must
- end with a comma if you want the string to continue to the next line.
-
- str1 BYTE "This is a long string that does not ",
- "fit on one line."
-
- You can also have an array of pointers to strings. For example:
-
- PBYTE TYPEDEF PTR BYTE
- .DATA
- msg1 BYTE "Operation completed successfully."
- msg2 BYTE "Unknown command"
- msg3 BYTE "File not found"
- pmsg1 PBYTE msg1
- pmsg2 PBYTE msg2
- pmsg3 PBYTE msg3
-
- errors WORD pmsg1, pmsg2, pmsg3 ; An array of pointers
- ; to strings
-
- Strings must be enclosed in single (') or double (") quotation marks. To put
- a single quotation mark inside a string enclosed by single quotation marks,
- use two single quotation marks. Likewise, if you need quotation marks inside
- a string enclosed by double quotation marks, use two sets. These examples
- show the various uses of quotation marks:
-
- char BYTE 'a'
- message BYTE "That's the message." ; That's the message.
- warn BYTE 'Can''t find file.' ; Can't find file.
- string BYTE "This ""value"" not found." ; This "value" not found.
-
- You can always use single quotation marks inside a string enclosed by double
- quotation marks, as the initialization for message shows, and vice versa.
-
-
-
- The ? Initializer
-
- The actual values stored when you use ? depend on the other data in your
- program.
-
- You do not have to initialize all elements in an array to a value. If there
- is no initial value, you can initialize the array elements with the ?
- operator. The ? operator either is treated as a zero or causes a byte to be
- left unspecified in the object file. Object files contain records for
- initialized data. An unspecified byte left in the object file means that no
- records contain initialized data for that address.
-
- The actual values stored in arrays allocated with ? depend on certain
- conditions. The ? initializer is treated as a zero in a DUP statement that
- contains initializers in addition to the ? initializer. An unspecified byte
- is left in the object file if the ? initializer does not appear in a DUP
- statement, or if the DUP statement, including any nested DUP statements,
- contains only ? initializers.
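-
- As a brief sketch of these rules, the declarations below show which forms
- leave bytes unspecified and which store zeros. The variable names are
- hypothetical and are not part of the guide's own examples.
-
- .MODEL small
- .DATA
- one BYTE ? ; Not in a DUP: byte is left unspecified
- buf BYTE 20 DUP (?) ; Only ? in the DUP: bytes are left unspecified
- grid BYTE 4 DUP (5 DUP (?)) ; Only ? in nested DUPs: left unspecified
- pairs BYTE 10 DUP (1, ?) ; ? mixed with 1: each ? is stored as 0
- END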
-
-
- Length-Specified Strings
-
- Often there are reasons to know the length of a string. To use the DOS
- functions for writing to a file, for example, CX must contain the length of
- the string before the interrupt is called, as shown in this example.
-
- msg BYTE "This is a length-specified string"
- .
- .
- .
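- mov ah, 40h ; DOS Write File or Device function
- mov bx, 1 ; Handle 1 is standard output
- mov cx, LENGTHOF msg ; Number of bytes to write
- mov dx, OFFSET msg ; DS:DX points to the string
- int 21h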
-
- Some high-level languages also expect strings passed to procedures to have a
- certain format. For example, Pascal procedures require the first byte of a
- string passed as a parameter to contain the length of the string. You can
- write this length into the first byte with
-
- msg BYTE LENGTHOF msg - 1, "This is a Pascal string"
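-
- Assembly code can read the length back from the first byte in the same way
- a Pascal procedure would. The program below is a minimal sketch that writes
- only the text of such a string to standard output; the name pmsg is
- hypothetical, and .STARTUP is assumed to load DS.
-
- .MODEL small
- .STACK
- .DATA
- pmsg BYTE LENGTHOF pmsg - 1, "This is a Pascal string"
- .CODE
- .STARTUP ; Startup code loads DS
- mov ah, 40h ; DOS Write File or Device function
- mov bx, 1 ; Handle 1 is standard output
- mov cl, pmsg ; Length byte stored at the start of pmsg
- xor ch, ch ; CX = length of the text
- mov dx, OFFSET pmsg + 1 ; Text begins after the length byte
- int 21h
- .EXIT 0
- END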
-
- Interfacing with high-level languages requires special techniques with
- strings.
-
- Other languages such as Basic have string descriptors, a kind of structure
- containing both the length and the address of the string. For example, this
- structure DESC could be used in a procedure accessed from Basic:
-
- DESC STRUCT
- len WORD ? ; Length of string1
- off WORD ? ; Offset of string1
- DESC ENDS
-
- string1 BYTE "This string goes in a string descriptor"
- msg DESC {LENGTHOF string1, string1}
-
- See Section 5.2, "Structures and Unions."
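-
- Code that receives such a descriptor can take the length and offset from
- its fields. The program below is a sketch only: it reuses the DESC
- declaration above and assumes that string1 lies in the segment addressed
- by DS.
-
- .MODEL small
- .STACK
- DESC STRUCT
- len WORD ? ; Length of string1
- off WORD ? ; Offset of string1
- DESC ENDS
- .DATA
- string1 BYTE "This string goes in a string descriptor"
- msg DESC {LENGTHOF string1, string1}
- .CODE
- .STARTUP ; Startup code loads DS
- mov ah, 40h ; DOS Write File or Device function
- mov bx, 1 ; Handle 1 is standard output
- mov cx, msg.len ; Length taken from the descriptor
- mov dx, msg.off ; Offset taken from the descriptor
- int 21h
- .EXIT 0
- END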
-
-
- Null-Terminated and $-Terminated Strings
-
- Null-terminated and $-terminated strings have a special use with DOS
- functions. Strings in modules shared with C need to end with a null
- character (0).
-
- str1 BYTE "This string ends with a null character", 0
-
- DOS file names also require a null character at the end. This example opens
- a file named "MYFILE.ASM".
-
- name1 BYTE "MYFILE.ASM", 0
- .
- .
- .
- mov ah, 3Dh ; DOS Open File function (access mode goes in AL)
- mov dx, OFFSET name1 ; DS:DX points to the file name
- int 21h
-
- DOS function 9 requires a string to end with a dollar sign ($) so that it
- can recognize the end of the string to write to the screen, as shown in this
- example.
-
- msg BYTE "This is a dollar-terminated string$"
- .
- .
- .
- mov ah, 09h ; DOS Display String function
- mov dx, OFFSET msg ; DS:DX points to the string
- int 21h
-
-
- LENGTHOF, SIZEOF, and TYPE for Strings
-
- Because the assembler considers strings as simply arrays of byte elements,
- the LENGTHOF and SIZEOF operators return the same values for strings as they
- do for arrays, as illustrated in this example. The TYPE operator returns the
- size of one element, which is 1 byte for a string.
-
- msg BYTE "This string extends ",
- "over three ",
- "lines."
-
- lmsg EQU LENGTHOF msg ; 37 elements
- smsg EQU SIZEOF msg ; 37 bytes
- tmsg EQU TYPE msg ; 1 byte per element
-
-
- 5.1.3 Processing Arrays and Strings
-
- The 8086-family instruction set has seven string instructions for fast and
- efficient processing of entire strings and arrays. The term "string" in
- "string instructions" refers to a sequence of elements, not just character
- strings. These instructions work directly only on arrays of bytes and words
- on the 8086-80286 and on arrays of bytes, words, and doublewords on the
- 80386 and 80486. Processing larger elements must be done indirectly with
- loops.
-
-
- The following list gives capsule descriptions of the five instructions
- discussed in this section. Two additional instructions not described here
- are the INS and OUTS instructions that transfer values between memory and
- an I/O port.
-
- Instruction Description
- ────────────────────────────────────────────────────────────────────────────
- MOVS Copies a string from one location to another
- STOS Stores values from the accumulator register to a string
- CMPS Compares values in one string with values in another
- LODS Loads values from a string to the accumulator register
- SCAS Scans a string for a specified value
-
- All of these instructions use registers in a similar way and have a similar
- syntax. Most are used with the repeat instruction prefixes REP, REPE (or
- REPZ), and REPNE (or REPNZ). REPZ is a synonym for REPE (Repeat While Equal)
- and REPNZ is a synonym for REPNE (Repeat While Not Equal).
-
-
- This section first explains the general procedures for using all string
- instructions. It then illustrates each instruction with an example.
-
-
- 5.1.3.1 Overview of String Operations
-
- The string instructions have specific requirements for the location of
- strings and the use of registers. To operate on any string, follow these
- three steps:
-
-
- 1. Set the direction flag to indicate the direction in which you want to
- process the string. The STD instruction sets the flag, while CLD
- clears it.
-
- If the direction flag is clear, the string is processed upward (from
- low addresses to high addresses, which is from left to right through
- the string). If the direction flag is set, the string is processed
- downward (from high addresses to low addresses, or from right to
- left). Under DOS, the direction flag is normally clear if your program
- has not changed it.
-
- 2. Load the number of iterations for the string instruction into the CX
- register.
-
- If you want to process a 100-byte string, move 100 into CX. If you
- wish the string instruction to terminate conditionally (for example,
- during a search when a match is found), load the maximum number of
- iterations that can be performed without an error.
-
- 3. Load the starting offset address of the source string into DS:SI and
- the starting address of the destination string into ES:DI. Some
- string instructions take only a destination or source, not both (see
- Table 5.1).
-
- Normally, the segment address of the source string should be DS, but
- you can use a segment override to specify a different segment for the
- source operand. You cannot override the segment address for the
- destination string. Therefore, you may need to change the value of ES.
- See Section 3.1 for information on changing segment registers.
-
- ────────────────────────────────────────────────────────────────────────────
- NOTE
- Although you can use a segment override on the source operand, a segment
- override combined with a repeat prefix can cause problems in certain
- situations on all processors except the 80386/486. If an interrupt occurs
- during the string operation, the segment override is lost and the rest of
- the string operation processes incorrectly. Segment overrides can be used
- safely when interrupts are turned off or with an 80386/486
- processor.
- ────────────────────────────────────────────────────────────────────────────
-
-
-
- You can adapt these steps to the requirements of any particular string
- operation. The syntax for the string instructions is:
-
- «prefix» CMPS «segmentregister:» source, «ES:» destination
- LODS «segmentregister:» source
- «prefix» MOVS «ES:» destination, «segmentregister:» source
- «prefix» SCAS «ES:» destination
- «prefix» STOS «ES:» destination
-
- Some instructions have special forms for byte, word, or doubleword operands.
- If you use the form of the instruction that ends in B (BYTE), W (WORD), or D
- (DWORD) with LODS, SCAS, and STOS, the assembler knows whether the element
- is in the AL, AX, or EAX register. Therefore, these instruction forms do not
- require operands.
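-
- As an illustrative sketch, the program below uses both the short forms and
- the forms that take operands. The variable info is hypothetical, ES is set
- equal to DS only to keep the example small, and the PTR operands are one
- assumed way of giving the assembler the element size along with a segment
- override.
-
- .MODEL small
- .STACK
- .DATA
- info WORD 1, 2, 3, 4
- .CODE
- .STARTUP ; Startup code loads DS
- push ds
- pop es ; ES = DS
- cld ; Work upward
- mov si, OFFSET info
- lodsw ; Short form: word at DS:SI into AX
- lods WORD PTR es:[si] ; Operand form with a segment override
- mov di, OFFSET info
- stos WORD PTR es:[di] ; Operand form; same action as stosw
- .EXIT 0
- END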
-
- Table 5.1 lists each string instruction with the type of repeat prefix it
- uses and indicates whether the instruction works on a source, a destination,
- or both.
-
- Table 5.1 Requirements for String Instructions
-
- ────────────────────────────────────────────────────────────────────────────
- Instruction Repeat Prefix Source/Destination Register Pair
- ────────────────────────────────────────────────────────────────────────────
- MOVS REP Both DS:SI, ES:DI
- SCAS REPE/REPNE Destination ES:DI
- CMPS REPE/REPNE Both DS:SI, ES:DI
- LODS None Source DS:SI
- STOS REP Destination ES:DI
- INS REP Destination ES:DI
- OUTS REP Source DS:SI
- ────────────────────────────────────────────────────────────────────────────
-
-
- The instruction automatically increments DI or SI.
-
- The repeat prefix causes the instruction that follows it to repeat for the
- number of times specified in the count register or until a condition becomes
- true. After each iteration, the instruction increments or decrements SI and
- DI so that they point to new array elements. The string instructions work on
- these elements. The direction flag determines whether SI and DI are
- incremented (flag clear) or decremented (flag set). The size of the
- instruction determines whether SI and DI are altered by one, two, or four
- bytes each time.
-
- These are the conditions that determine the number of repetitions specified
- by a prefix.
-
- Prefix Description
- ────────────────────────────────────────────────────────────────────────────
- REP Repeats instruction CX times
-
- REPE, REPZ Repeats instruction CX times, or as long
- as elements are equal, whichever is
- fewer
-
- REPNE, REPNZ Repeats instruction CX times, or as long
- as elements are not equal, whichever is
- fewer
-
-
- The prefixes apply to only one string instruction at a time. To repeat a
- block of instructions, use a loop construction (see Section 7.2, "Loops").
-
- At run time, if a string instruction is preceded by a repeat prefix, the
- processor takes the following steps:
-
-
- 1. Checks the CX register and exits if CX is 0.
-
- 2. Performs the string operation once.
-
- 3. Increases SI and/or DI if the direction flag is clear. Decreases SI
- and/or DI if the direction flag is set. The amount of increase or
- decrease is 1 for byte operations, 2 for word operations, and 4 for
- doubleword operations (80386/486 only).
-
- 4. Decrements CX (no flags are modified).
-
- 5. Checks the zero flag at this point if the REPE or REPNE prefix is used
- (for SCAS or CMPS). With REPE, the loop exits if the zero flag is clear
- (the elements were not equal); with REPNE, the loop exits if the zero
- flag is set (the elements were equal). On exit, execution proceeds to
- the next instruction.
-
- 6. Proceeds to the next iteration and repeats from step 1.
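-
- As a rough illustration of these steps, the loop in the program below
- behaves like repne scasb. It is a sketch for study, not a replacement for
- the prefix, and the data and labels are hypothetical. LOOPNE decrements CX
- without changing the flags and jumps only while CX is not 0 and the zero
- flag is clear.
-
- .MODEL small
- .STACK
- .DATA
- phrase BYTE "The lazy dog"
- .CODE
- .STARTUP ; Startup code loads DS
- push ds
- pop es ; ES = DS
- cld ; Work upward
- mov cx, LENGTHOF phrase
- mov di, OFFSET phrase
- mov al, 'z'
- jcxz done ; Step 1: exit at once if CX is 0
- again: scasb ; Steps 2-3: compare AL with ES:[DI], adjust DI
- loopne again ; Steps 4-6: decrement CX, test zero flag, repeat
- done:
- .EXIT 0
- END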
-
-
- At loop end, SI and DI point to the element immediately after the match.
-
- When the repeat loop ends, SI (or DI) points to the position following a
- match (when using SCAS or CMPS), so you need to decrement or increment DI or
- SI to point to the element where the match occurred.
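-
- The adjustment can be sketched as follows for an upward search. The data
- and label names are hypothetical, and ES is set equal to DS for brevity.
- The zero flag is tested before DI is decremented, because DEC changes the
- flags.
-
- .MODEL small
- .STACK
- .DATA
- phrase BYTE "The lazy dog"
- .CODE
- .STARTUP ; Startup code loads DS
- push ds
- pop es ; ES = DS
- cld ; Work upward
- mov cx, LENGTHOF phrase
- mov di, OFFSET phrase
- mov al, 'z'
- repne scasb ; Search for 'z'
- jne nomatch ; Zero flag clear: no 'z' was found
- dec di ; Step back so ES:DI points to the 'z'
- mov bx, di
- sub bx, OFFSET phrase ; BX = position of the match, counting from 0
- nomatch:
- .EXIT 0
- END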
-
- Although string instructions (except LODS) are most often used with repeat
- prefixes, they can also be used by themselves. In this case, the SI and/or
- DI registers are adjusted as specified by the direction flag and the size of
- operands. However, you must decrement the CX register and set up a loop for
- the repeated action.
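-
- A minimal sketch of such a loop follows. It copies a string while
- converting lowercase letters to uppercase, which REP MOVSB alone cannot do
- because each byte must be examined. The data and labels are hypothetical.
-
- .MODEL small
- .STACK
- .DATA
- source BYTE "mixed Case text"
- destin BYTE LENGTHOF source DUP (?)
- .CODE
- .STARTUP ; Startup code loads DS
- push ds
- pop es ; ES = DS
- cld ; Work upward
- mov cx, LENGTHOF source ; Iteration count
- mov si, OFFSET source
- mov di, OFFSET destin
- next: lodsb ; AL = next byte; SI is adjusted
- cmp al, 'a'
- jb store ; Below 'a': not a lowercase letter
- cmp al, 'z'
- ja store ; Above 'z': not a lowercase letter
- sub al, 'a' - 'A' ; Convert to uppercase
- store: stosb ; Store AL; DI is adjusted
- loop next ; Decrement CX and repeat until CX is 0
- .EXIT 0
- END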
-
-
- 5.1.3.2 String Instructions
-
- To use the 8086-family string instructions, apply the steps outlined in the
- previous section. Examples in this section illustrate each instruction.
-
- You can also use the techniques in this section with structures and unions,
- since arrays and strings can be fields in structures and unions (see Section
- 5.2).
-
- Moving Array Data - The MOVS instruction copies data from one area of memory
- to another. To move data, first load the count and the source and
- destination addresses into the appropriate registers. Then use REP with the
- MOVS instruction.
-
- .MODEL small
- .DATA
- source BYTE 10 DUP ('0123456789')
- destin BYTE 100 DUP (?)
- .CODE
- mov ax, @data ; Load same segment
- mov ds, ax ; to both DS
- mov es, ax ; and ES
- .
- .
- .
- cld ; Work upward
- mov cx, LENGTHOF source ; Set iteration count to 100
- mov si, OFFSET source ; Load address of source
- mov di, OFFSET destin ; Load address of destination
- rep movsb ; Move 100 bytes
-
- Storing Data in Arrays - The STOS instruction stores a specified value in
- each position of a string. The string is the destination, so it must be
- pointed to by ES:DI. The value to store must be in the accumulator.
-
- This example stores the character 'a' in each byte of a 100-byte string.
- Notice that it does this by storing 50 words rather than 100 bytes. This
- makes the code faster by reducing the number of iterations. To fill an odd
- number of bytes, you would have to adjust for the last byte.
-
- .MODEL small, C
- .DATA
- destin BYTE 100 DUP (?)
- ldestin EQU (LENGTHOF destin) / 2
- .CODE
- . ; Assume ES = DS
- .
- .
- cld ; Work upward
- mov ax, 'aa' ; Load character to fill
- mov cx, ldestin ; Load length of string
- mov di, OFFSET destin ; Load address of destination
- rep stosw ; Store 'aa' into array
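-
- One way to adjust for an odd number of bytes is sketched below. The count
- is halved with SHR, which leaves the carry flag set when a byte is left
- over, and REP STOSW does not change the flags. The 101-byte array and the
- label are hypothetical.
-
- .MODEL small
- .STACK
- .DATA
- destin BYTE 101 DUP (?)
- .CODE
- .STARTUP ; Startup code loads DS
- push ds
- pop es ; ES = DS
- cld ; Work upward
- mov ax, 'aa' ; Load character to fill
- mov cx, LENGTHOF destin ; Length in bytes
- mov di, OFFSET destin ; Load address of destination
- shr cx, 1 ; CX = words; carry set if length is odd
- rep stosw ; Store the words
- jnc filled ; No carry: nothing left to store
- stosb ; Store the final odd byte
- filled:
- .EXIT 0
- END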
-
- Comparing Arrays - The CMPS instruction compares two strings and points to
- the address after which a match or nonmatch occurs. If the values are the
- same, the zero flag is set. Either string can be considered as the
- destination or the source unless a segment override is used.
-
- This example using CMPSB assumes that the strings are in different segments.
- Both segment registers must be loaded with the appropriate segment addresses.
-
- .MODEL large, C
- .DATA
- string1 BYTE "The quick brown fox jumps over the lazy dog"
- .FARDATA
- string2 BYTE "The quick brown dog jumps over the lazy fox"
- lstring EQU LENGTHOF string2
- .CODE
- mov ax, @data ; Load data segment
- mov ds, ax ; into DS
- mov ax, @fardata ; Load far data segment
- mov es, ax ; into ES
- .
- .
- .
- cld ; Work upward
- mov cx, lstring ; Load length of string
- mov si, OFFSET string1 ; Load offset of string1
- mov di, OFFSET string2 ; Load offset of string2
- repe cmpsb ; Compare
- je allmatch ; Zero flag is set if no nonmatch
- .
- .
- .
- allmatch: ; Special case for all match
-
- Loading Data from Arrays - The LODS instruction loads a value from a string
- into a register. The string is the source; the value is in the accumulator.
- This instruction normally is not used with a repeat instruction prefix,
- since something must be done with each element before going on to the next.
-
- The code in this example loads, processes, and displays each byte in a
- string of bytes.
-
- .DATA
- info BYTE 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
- linfo WORD LENGTHOF info
- .CODE
- .
- .
- .
- cld ; Work upward
- mov cx, linfo ; Load length
- mov si, OFFSET info ; Load offset of source
- mov ah, 2 ; Display character function
-
- get:
- lodsb ; Get a character
- add al, '0' ; Convert to ASCII
- mov dl, al ; Move to DL
- int 21h ; Call DOS to display character
- loop get ; Repeat
-
- Searching Arrays - The SCAS instruction scans a string for a specified
- value. As the loop executes, this instruction compares the value pointed to
- by DI with the value in the accumulator. If values are the same, the zero
- flag is set.
-
- After a REPNE SCAS, the zero flag is cleared if no match was found. After a
- REPE SCAS, the zero flag is set if all values matched.
-
- This example assumes that ES is not the same as DS and that the address of
- the string is stored in a pointer variable. The LES instruction loads the
- far address of the string into ES:DI.
-
- .DATA
- string BYTE "The quick brown fox jumps over the lazy dog"
- pstring PBYTE string ; Far pointer to string
- lstring EQU LENGTHOF string ; Length of string
- .CODE
- .
- .
- .
- cld ; Work upward
- mov cx, lstring ; Load length of string
- les di, pstring ; Load address of string
- mov al, 'z' ; Load character to find
- repne scasb ; Search
- jne notfound ; Zero flag is clear if not found
- . ; ES:DI points to character
- . ; after first 'z'
- .
- notfound: ; Special case for not found
-
-
- 5.2 Structures and Unions
-
- A structure is a group of possibly dissimilar data types and variable
- declarations that can be accessed as a unit or by any of its components. The
- fields within the structure can have different sizes and data types.
-
- Unions are identical to structures, except that the fields of a union
- overlap in memory, which allows you to define different data formats for the
- same memory space. Unions can store different types of data depending on the
- situation. They can also store data as one data type and retrieve it as
- another data type.
-
- Whereas each field in a structure has an offset relative to the first byte
- of the structure, all the fields in a union start at the same offset. The
- size of a structure is the sum of its components, while the size of a union
- is the length of the longest field.
-
- A MASM structure is similar to a struct in the C language, a STRUCTURE in
- FORTRAN, and a RECORD in Pascal. Unions in MASM are similar to unions in C
- and FORTRAN, and to variant records in Pascal.
-
- Follow these steps when using structures and unions:
-
-
- 1. Declare a structure (or union) type.
-
- 2. Define one or more variables having that type.
-
- 3. Reference the fields directly or indirectly with the field (dot)
- operator.
-
-
- You can use the entire structure or union variable or just the individual
- fields as operands in assembler statements. This section explains the
- allocating, initializing, and nesting of structures and unions.
-
- MASM 6.0 extends the functionality of structures and also makes some changes
- to MASM 5.1 behavior. You can still retain MASM 5.1 behavior if you prefer
- by specifying OPTION OLDSTRUCTS in your program. See Section 1.3.2 for
- information about the OPTION directive, and Section 5.2.3 for information
- about referencing structures and unions.
-
-
- 5.2.1 Declaring Structure and Union Types
-
- When you declare a structure or union type, you create a template for data
- that contains the sizes and, optionally, the initial values for fields in
- the structure or union but that allocates no memory.
-
- The STRUCT keyword marks the beginning of a type declaration for a
- structure. (STRUCT and STRUC are synonyms.) STRUCT and UNION type
- declarations have the following format:
-
- name {STRUCT | UNION} «alignment» «, NONUNIQUE»
- fielddeclarations
- name ENDS
-
- The fielddeclarations are a series of one or more variable declarations. You
- can declare default initial values individually or with the DUP operator
- (see Section 5.2.2, "Defining Structure and Union Variables"). Section
- 5.2.3, "Referencing Structures, Unions, and Fields," explains the NONUNIQUE
- keyword. Structures and unions can also be nested in MASM 6.0 (see Section
- 5.2.4).
-
-
- Initializing Fields
-
- If you provide initializers for the fields of a structure or union when you
- declare the type, these initializers become the default value for the fields
- when you define a variable of that type. Section 5.2.2 explains default
- initializers.
-
- When you initialize the fields of a union type, the type and value of the
- first field become the default value and type for the union. In this example
- of an initialized union declaration, the default type for the union is
- DWORD:
-
- DWB UNION
- d DWORD 00FFh
- w WORD ?
- b BYTE ?
- DWB ENDS
-
- If the size of the first member is less than the size of the union, the
- assembler initializes the rest of the union to zeros. When initializing
- strings in a type, make sure the initial values are long enough to
- accommodate the largest possible string.
-
-
- Field Names
-
- Structure and union field names in MASM 6.0 must be unique within a given
- nesting level because they represent the offset from the beginning of the
- structure to the corresponding field.
-
- A nested structure has its own level.
-
- In MASM 6.0, a label and a structure field may have the same name, but not a
- text macro and a field name. Also, field names between structures need not
- be unique. Field names do need to be unique if you place OPTION M510 or
- OPTION OLDSTRUCTS in your code or use the /Zm option from the command line,
- since versions of MASM prior to 6.0 require unique field names (see Appendix
- A).
-
-
- Alignment Value and Offsets for Structures
-
- Data access to structures is faster on aligned fields than on unaligned
- fields. Therefore, alignment gains speed at the cost of space. Alignment
- improves access on 16-bit processors but makes no difference on code
- executing on an 8-bit 8088 processor.
-
- The way the assembler aligns structure fields determines the amount of space
- required to store a variable of that type. Each field in a structure has an
- offset relative to 0. If you specify an alignment in the structure
- declaration (or with the /Zpn command-line option), the offset for each
- field may be modified by the alignment (or n).
-
- The only values accepted for alignment are 1, 2, and 4. The default is 1. If
- the type declaration includes an alignment, the fields are aligned to the
- minimum of the field's size and the alignment. Any padding required to reach
- the correct offset for the field is added prior to allocating the field. The
- padding consists of zeros and always precedes the field.
-
- If the number of bytes in the field is greater than the alignment value, the
- element will be padded such that the offset of the element is divisible by
- the alignment value. If the number of bytes is less than or equal to the
- alignment value, the offset of the element is padded such that it is
- divisible by the element size.
-
- The size of the structure must also be evenly divisible by the structure
- alignment value, so zeros may be added at the end of the structure.
-
- If neither the alignment nor the /Zp command-line option is used, the offset
- is incremented by the size of each data directive. This is the same as a
- default alignment equal to 1. The alignment specified in the type
- declaration overrides the /Zp command-line option.
-
- These examples show how offsets are determined:
-
- STUDENT2 STRUCT 2 ; Alignment value is 2
- score WORD 1 ; Offset is 0
- id BYTE 2 ; Offset is 2
- year DWORD 3 ; Offset is 4; one byte padding added
- sname BYTE 4 ; Offset is 8
- STUDENT2 ENDS
-
- One byte of padding is added after the byte-sized id field. Otherwise the
- offset of the year field would be 3, which is not divisible by the alignment
- value of 2. The size of this structure is now 9 bytes. Since 9 is not evenly
- divisible by 2, one byte of padding is added at the end of STUDENT2.
-
- STUDENT4 STRUCT 4 ; Alignment value is 4
- sname BYTE 1 ; Offset is 0
- score WORD 10 DUP (100) ; Offset is 2
- year BYTE 2 ; Offset is 22; 1 byte padding
- ; added so offset of next field
- ; is divisible by 4
- id DWORD 3 ; Offset is 24
- STUDENT4 ENDS
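-
- As a further sketch, the hypothetical structure below applies the same
- rules with an alignment value of 4. The offsets and size in the comments
- are worked out by hand from the rules in this section, so it is worth
- confirming them against an assembly listing.
-
- .MODEL small
- SAMPLE STRUCT 4 ; Alignment value is 4
- kind BYTE ? ; Offset is 0
- total DWORD ? ; Offset is 4; 3 bytes padding added
- qty WORD ? ; Offset is 8; aligned to field size of 2
- SAMPLE ENDS ; 2 bytes padding added at end
- .DATA
- rec SAMPLE { }
- srec EQU SIZEOF rec ; 12 by the rules above
- END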
-
- The alignment value affects the alignment of structure variables, so adding
- an alignment value affects memory usage. This feature provides compatibility
- with structures in Microsoft C.
-
- With MASM 6.0, C programmers can use the H2INC utility to translate C
- structures to MASM (see Chapter 16).
-
-
- 5.2.2 Defining Structure and Union Variables
-
- Once you have declared a structure or union type, variables of that type can
- be defined. For each variable defined, memory is allocated in the current
- segment in the format declared by the type. The syntax for defining a
- structure or union variable is:
-
- [[name]] typename < [[initializer [[,initializer]]...]] >
-
- [[name]] typename { [[initializer [[,initializer]]...]] }
-
- [[name]] typename constant DUP ({ [[initializer [[,initializer]]...]] })
-
- The name is the label assigned to the variable. If no name is given, the
- assembler allocates space for the variable but does not give it a symbolic
- name. The typename is the name of a previously declared structure or union
- type.
-
- An initializer can be given for each field. The type of each initializer
- must be the type of the corresponding field defined in the type declaration.
- For unions, the type of the initializer must be the same as the type for the
- first field. An initialization list can also be repeated using the DUP
- operator.
-
- The list of initializers can be broken only after a comma unless you use a
- line continuation character (\) at the end of the line. The last curly brace
- or angle bracket must appear on the same line as the last initializer. You
- can also use the line continuation character to extend a line as shown in
- the Item4 declaration below. Angle brackets and curly braces can be
- intermixed in an initialization as long as they match. This example using
- the ITEMS structure illustrates the options for initializing lists:
-
- ITEMS STRUCT
- Iname BYTE 'Item Name'
- Inum WORD ?
- ITYPE UNION
- oldtype BYTE 0
- newtype WORD ?
- ENDS
- ITEMS ENDS
- .
- .
- .
- .DATA
- Item1 ITEMS < > ; Accepts default initializers
- Item2 ITEMS { } ; Accepts default initializers
- Item3 ITEMS <'Bolts', 126> ; Overrides default value of first
- ; 2 fields; use default of
- ; the third field
- Item4 ITEMS { \
- 'Bolts', ; Item name
- 126 \ ; Part number
- }
-
- The angle brackets or curly braces are required even if no initial value is
- given, as in Item1 and Item2 in the example. If initial values are given
- for more than one field, the values must be separated by commas, as shown in
- Item3.
-
- You need not initialize all fields in a structure. If an initial value is
- blank, the assembler automatically uses the default initial value of the
- field, which was originally provided in the structure type declaration. If
- there is no default value, the field is undefined.
-
- For nested structures or unions (see Section 5.2.4), however, these are
- equivalent:
-
- Item5 ITEMS {'Bolts', , }
- Item6 ITEMS {'Bolts', , { } }
-
- A variable and an array of union type WB look like this:
-
- WB UNION
- w WORD ?
- b BYTE ?
- WB ENDS
-
- num WB {0Fh} ; Store 0Fh
- array WB (40 / SIZEOF WB) DUP ({2}) ; Allocates and
- ; initializes 20 unions
-
- (This figure may be found in the printed book.)
-
- In MASM 6.0, control structures (such as IF, macros, and directives) are
- also allowed within structure and union declarations.
-
-
- Arrays as Field Initializers
-
- Default initializers for string or array fields set the size for the field.
-
-
- The length of the array that can override the contents of a field in a
- variable definition is fixed by the size of the initializer. The override
- cannot contain more elements than the default. Specifying fewer override
- array elements changes the first n values of the default where n is the
- number of values in the override. The rest of the array elements take their
- default values from the initializer.
-
-
- Strings as Field Initializers
-
- If the override is shorter, the assembler pads the override with spaces to
- equal the length of the initializer. If the initializer is a string and the
- override value is not a string, the override value must be enclosed in angle
- brackets or curly braces.
-
- A string may be used to override any member of type BYTE (or SBYTE). The
- string does not need to be enclosed in angle brackets or curly braces unless
- mixed with other override methods.
-
- The string fields for structure variables are the length defined by the type
- declaration.
-
- If a structure has an initialized string field or an array of bytes, any new
- string assigned to that field in a variable definition is padded with spaces
- if it is shorter than the default. The assembler adds four spaces at the end
- of 'Bolts' in the variables of type ITEMS above. The Iname field in the
- ITEMS structure cannot contain a field initializer longer than 'Item Name'.
-
-
- Structures as Field Initializers
-
- Initializers for structure variables must be enclosed in curly braces or
- angle brackets, but you can specify overrides with fewer elements than the
- defaults.
-
- This example illustrates the use of default values with structures as field
- initializers:
-
- DISKDRIVES STRUCT
- a1 BYTE ?
- b1 BYTE ?
- c1 BYTE ?
- DISKDRIVES ENDS
-
- INFO STRUCT
- buffer BYTE 100 DUP (?)
- crlf BYTE 13, 10
- query BYTE 'Filename: ' ; String <= can override
- endmark BYTE 36
- drives DISKDRIVES <0, 1, 1>
- INFO ENDS
-
- info1 INFO { , , 'Dir' }
-
- ; Illegal since name in query field is too long
- ; and a string cannot initialize a field defined with DUP:
- ; info2 INFO {"TESTFILE", , "DirectoryName",}
-
- lotsof INFO { , , 'file1', , {0,0,0} },
- { , , 'file2', , {0,0,1} },
- { , , 'file3', , {0,0,2} }
-
- The diagram below shows how the assembler stores info1.
-
- (This figure may be found in the printed book.)
-
- The initialization for drives gives default values for all three fields of
- the structure. The fields left blank in info1 use the default values for
- those fields. The info2 declaration is illegal since "DirectoryName" is
- longer than the initial string for that field, and the "TESTFILE" string
- cannot initialize a field defined with DUP.
-
-
- Arrays of Structures and Unions
-
- You can define an array of structures using the DUP operator (see Section
- 5.1.1, "Declaring and Referencing Arrays") or by creating a list of
- structures. For example, you can define an array of structure variables like
- this:
-
- Item7 ITEMS 30 DUP ({,,{10}})
-
- The Item7 array defined here has 30 elements of type ITEMS, with the
- third field of each element (the union) initialized to 10.
-
- You can also list array elements as shown in this example:
-
- Item8 ITEMS {'Bolts', 126, 10},
- {'Pliers',139, 10},
- {'Saws', 414, 10}
-
-
- Structure Redefinition
-
- The assembler generates an error for a structure redefinition unless all of
- the following are the same:
-
-
- ■ Field names
-
- ■ Offsets of named fields
-
- ■ Initialization lists
-
- ■ Field alignment value
-
-
- Additionally, all fields must be present and at the same offset.
-
-
- LENGTHOF, SIZEOF, and TYPE for Structures
-
- The size of a structure determined by SIZEOF is the offset of the last
- field, plus the size of the last field, plus any padding required for proper
- alignment (see Section 5.2.1 for information about alignment). This example,
- using the data declarations above, shows how to use the LENGTHOF, SIZEOF,
- and TYPE operators with structures:
-
- INFO STRUCT
- buffer BYTE 100 DUP (?)
- crlf BYTE 13, 10
- query BYTE 'Filename: '
- endmark BYTE 36
- drives DISKDRIVES <0, 1, 1>
- INFO ENDS
-
- info1 INFO { , , 'Dir' }
- lotsof INFO { , , 'file1', , {0,0,0} },
- { , , 'file2', , {0,0,1} },
- { , , 'file3', , {0,0,2} }
-
- sinfo1 EQU SIZEOF info1 ; 116 = number of bytes in
- ; initializers
- linfo1 EQU LENGTHOF info1 ; 1 = number of items
- tinfo1 EQU TYPE info1 ; 116 = same as size
-
- slotsof EQU SIZEOF lotsof ; 116 * 3 = number of bytes in
- ; initializers
- llotsof EQU LENGTHOF lotsof ; 3 = number of items
- tlotsof EQU TYPE lotsof ; 116 = same as size for structure
- ; of type INFO
-
-
- LENGTHOF, SIZEOF, and TYPE for Unions
-
- The size of a union determined by SIZEOF is the size of the longest field
- plus any padding required. The length of a union variable determined by
- LENGTHOF equals the number of initializers defined inside angle brackets or
- curly braces. TYPE returns a value indicating the type of the longest field.
-
-
- DWB UNION
- d DWORD ?
- w WORD ?
- b BYTE ?
- DWB ENDS
-
- num DWB {0FFFFh}
- array DWB (100 / SIZEOF DWB) DUP ({0})
-
- snum EQU SIZEOF num ; = 4
- lnum EQU LENGTHOF num ; = 1
- tnum EQU TYPE num ; = 4
- sarray EQU SIZEOF array ; = 100 (4*25)
- larray EQU LENGTHOF array ; = 25
- tarray EQU TYPE array ; = 4
-
-
- 5.2.3 Referencing Structures, Unions, and Fields
-
- Like other variables, structure variables can be accessed by name. You can
- access fields within structure variables with this syntax:
-
- variable.field
-
- In MASM 6.0, references to fields must always be fully qualified, with both
- the structure or union name and the dot operator preceding the field name.
- Also, in MASM 6.0, the dot operator can be used only with structure fields,
- not as an alternative to the plus operator; nor can the plus operator be
- used as an alternative to the dot operator.
-
- This example shows several ways to reference the fields of a structure
- called date.
-
- DATE STRUCT ; Defines structure type
- month BYTE ?
- day BYTE ?
- year WORD ?
- DATE ENDS
-
- yesterday DATE {9, 30, 1987} ; Declare structure
- ; variable
- .
- .
- .
- mov al, yesterday.day ; Use structure variables
- mov bx, OFFSET yesterday ; Load structure address
- mov al, (DATE PTR [bx]).month ; Use as indirect operand
- mov al, [bx].date.month ; The qualifier is necessary
- ; if month is already a
- ; field in a different
- ; structure
-
- Under OPTION M510 or OPTION OLDSTRUCTS, unique structure names do not need
- to be qualified. See Section 1.3.2 for information on the OPTION directive.
-
-
- If the NONUNIQUE keyword appears in a structure definition, all fields of
- the structure must be fully qualified when referenced, even if the OPTION
- OLDSTRUCTS directive appears in the code. Also, in MASM 6.0, all references
- to a field must be qualified.
-
- Even if an initialized union is the size of a WORD or DWORD, members of
- structures or unions are accessible only through their field names.
-
- In the following example, the two MOV statements show how you can access the
- elements of an array of unions.
-
- WB UNION
- w WORD ?
- b BYTE ?
- WB ENDS
-
- array WB (100 / SIZEOF WB) DUP ({0})
-
- mov array[12].w, 40
- mov array[32].b, 2
-
- (This figure may be found in the printed book.)
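-
- The index in brackets is a byte offset from the start of the array, not an
- element number, so array[12] refers to element 6 of this array of 2-byte
- unions. The following sketch reaches the same two locations by scaling an
- element number with SIZEOF. It assumes a small-model program; the element
- numbers and the use of BX are illustrative choices, not part of the
- original example.
-
- .MODEL small
- .STACK
- WB UNION
- w WORD ?
- b BYTE ?
- WB ENDS
-
- .DATA
- array WB (100 / SIZEOF WB) DUP ({0})
-
- .CODE
- .STARTUP
- mov array[6 * SIZEOF WB].w, 40 ; Element 6: same as array[12].w
- mov bx, 16 * SIZEOF WB ; Byte offset of element 16
- mov array[bx].b, 2 ; Same as array[32].b
- .EXIT
- END
-
- Scaling by SIZEOF keeps the code correct if fields are later added to the
- union and its size changes.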
-
- The WB union cannot be used directly as a WORD variable. However, you can
- define a union containing both the structure and a WORD variable and access
- either field. (The next section discusses nested structures and unions.)
-
- You can use unions to access the same data in more than one form. For
- example, one application of structures and unions is to simplify the task of
- reinitializing a far pointer. If you have a far pointer declared as
-
- FPWORD TYPEDEF FAR PTR WORD
-
- .DATA
- BoxB FPWORD ?
- BoxA FPWORD ?
- BoxB2 uptr < >
-
- you must follow these steps to point BoxB to BoxA:
-
- mov bx, OFFSET BoxA
- mov WORD PTR BoxB[2], ds
- mov WORD PTR BoxB, bx
-
- When you do this, you must remember whether the segment or the offset is
- stored first. However, if your program contains this union:
-
- uptr UNION
- dwptr FPWORD 0
- STRUCT
- offs WORD 0
- segm WORD 0
- ENDS
- uptr ENDS
-
- you can initialize a far pointer with these steps:
-
- mov BoxB2.segm, ds
- mov BoxB2.offs, bx
- lds si, BoxB2.dwptr
-
- This code moves the segment and the offset into the pointer and then moves
- the pointer into a register with the other field of the union. Although this
- technique does not reduce the code size, it avoids confusion about the order
- for loading the segment and offset.
-
-
- 5.2.4 Nested Structures and Unions
-
- Structures and unions in MASM 6.0 can be nested in several ways. This
- section explains how to refer to the fields in a nested structure or union.
- The example below illustrates the four techniques for nesting and how to
- reference the fields. Note the syntax for nested structures. The discussion
- of these techniques follows the example.
-
- ITEMS STRUCT
- Inum WORD ?
- Iname BYTE 'Item Name'
- ITEMS ENDS
-
- INVENTORY STRUCT
- UpDate WORD ?
- oldItem ITEMS { \
- ?,
- 'AF8' \ ; Named variable of
- } ; existing structure
- ITEMS { ?, '94C' } ; Unnamed variable of
- ; existing type
- STRUCT ups ; Named nested structure
- source WORD ?
- shipmode BYTE ?
- ENDS
- STRUCT ; Unnamed nested structure
- f1 WORD ?
- f2 WORD ?
- ENDS
- INVENTORY ENDS
-
- .DATA
-
- yearly INVENTORY { }
-
- ; Referencing each type of data in the yearly structure:
-
- mov ax, yearly.oldItem.Inum
- mov yearly.ups.shipmode, 'A'
- mov yearly.Inum, 'C'
- mov ax, yearly.f1
-
- To nest structures and unions, you can use any of these techniques:
-
-
- ■ The field of a structure or union can be a named variable of an
- existing structure or union type, as in the oldItem field. The field
- names in oldItem are not unique, so the full field names must be
- used when referencing those fields in the statement
-
- mov ax, yearly.oldItem.Inum
-
-
- ■ To declare a named structure or union inside another structure or
- union, give the STRUCT or UNION keyword first and then define a label
- for it. Fields of the nested structure or union must always be
- qualified, as shown in this example:
-
- mov yearly.ups.shipmode, 'A'
-
-
- ■ As shown in the Items field of Inventory, you can also use unnamed
- variables of existing structures or unions inside another structure or
- union. In this case you can reference its fields directly, as shown in
- this example:
-
- mov yearly.Inum, 'C'
- mov ax, yearly.f1
-
-
-
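- Offsets of nested structures are relative to the nested structure, not the
- root structure. In the example above, the offset of shipmode is 2 because it
- is relative to the ups structure, not the yearly structure. The address of
- yearly.ups.shipmode is therefore (current address of yearly) + (offset of
- ups) + 2. With byte alignment, the offset of ups is 24: 2 bytes for UpDate
- plus 11 bytes for each of the two ITEMS fields.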
-
-
- 5.3 Records
-
- Records are similar to structures, except that fields in records are bit
- strings. Each bit field in a record variable can be used separately in
- constant operands or expressions. The processor cannot access bits
- individually at run time, but it can access bit fields with instructions
- that manipulate bits.
-
- Record fields are bits, not bytes or words.
-
- Records are bytes, words, or doublewords in which the individual bits or
- groups of bits are considered fields. In general, the three steps for using
- record variables are the same as those for other complex data types:
-
-
- 1. Declare a record type.
-
- 2. Define one or more variables having the record type.
-
- 3. Reference record variables using shifts and masks.
-
-
- Once defined, the record variable can be used as an operand in assembler
- statements.
-
- This section explains the record declaration syntax and the use of the MASK
- and WIDTH operators. It also shows a few applications of record variables
- and constants.
-
-
- 5.3.1 Declaring Record Types
-
- A record type creates a template for data with the sizes and, optionally,
- the initial values for bit fields in the record, but it does not allocate
- memory space for the record.
-
- The RECORD directive declares a record type for an 8-bit, 16-bit, or 32-bit
- record that contains one or more bit fields. The maximum size is based on
- the expression word size. See OPTION EXPR16 and OPTION EXPR32 in Section
- 1.3.2. The syntax is
-
- recordname RECORD field [[,field]]...
-
- The field declares the name, width, and initial value for the field. The
- syntax for each field is:
-
- fieldname:width[[=expression]]
-
- Global labels, macro names, and record field names must all be unique, but
- record field names can have the same names as structure field names or
- global labels. Width is the number of bits in the field, and expression is a
- constant giving the initial (or default) value for the field. Record
- definitions can span more than one line if the continued lines end with
- commas.
-
- If expression is given, it declares the initial value for the field. The
- assembler generates an error message if an initial value is too large for
- the width of its field.
-
- The assembler shifts bits in a record to the right if all bits are not used.
-
-
- The first field in the declaration always goes into the most significant
- bits of the record. Subsequent fields are placed to the right in the
- succeeding bits. If the fields do not total exactly 8, 16, or 32 bits as
- appropriate, the entire record is shifted right, so the last bit of the last
- field is the lowest bit of the record. Unused bits in the high end of the
- record are initialized to 0.
-
- The following example creates a byte record type color having four fields:
- blink, back, intense, and fore. The contents of the record type are
- shown after the example. Since no initial values are given, all bits are set
- to 0. Note that this is only a template maintained by the assembler. No data
- is created.
-
- COLOR RECORD blink:1, back:3, intense:1, fore:3
-
- (This figure may be found in the printed book.)
-
- The next example creates a record type cw having six fields. Each record
- declared with this type occupies 16 bits of memory. Initial (default) values
- are given for each field. They can be used when data is declared for the
- record. The bit diagram after the example shows the contents of the record
- type.
-
- CW RECORD r1:3=0, ic:1=0, rc:2=0, pc:2=3, r2:2=1, masks:6=63
-
- (This figure may be found in the printed book.)
-
-
- 5.3.2 Defining Record Variables
-
- Once you have declared a record type, you can define record variables of
- that type. For each variable, memory is allocated to the object file in the
- format declared by the type. The syntax is
-
- [[name]] recordname <[[initializer [[,initializer]]...]]>
-
- [[name]] recordname {[[initializer [[,initializer]]...]]}
-
- [[name]] recordname constant DUP ([[initializer [[,initializer]]...]])
-
- The recordname is the name of a record type that was previously declared by
- using the RECORD directive.
-
- The initializer for each field in the record can be an integer, a character
- constant, or an expression that evaluates to a value compatible with the
- size of the field. Curly braces or angle brackets are required even if no
- initial value is given.
-
- If you use the DUP operator (see Section 5.1.1, "Declaring and Referencing
- Arrays") to initialize multiple record variables, only the angle brackets
- and initial values, if given, need to be enclosed in parentheses. For
- example, you can define an array of record variables with
-
- xmas COLOR 50 DUP ( <1, 2, 0, 4> )
-
- You do not have to initialize all fields in a record. If an initial value is
- blank, the assembler automatically stores the default initial value of the
- field. If there is no default value, the assembler clears each bit in the
- field.
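-
- The following sketch shows the effect of blank initializers, using the
- COLOR and CW record types declared in Section 5.3.1. The variable names
- (plain, backonly, ctrl, ctrlrc) are hypothetical, and the bit patterns in
- the comments are worked out by hand from the field widths and defaults.
-
- .MODEL small
- COLOR RECORD blink:1, back:3, intense:1, fore:3
- CW RECORD r1:3=0, ic:1=0, rc:2=0, pc:2=3, r2:2=1, masks:6=63
-
- .DATA
- plain COLOR <> ; No defaults declared: 0000 0000 = 00h
- backonly COLOR <, 7> ; Only back given: 0111 0000 = 70h
- ctrl CW <> ; All defaults: 0000 0011 0111 1111 = 037Fh
- ctrlrc CW <, , 1> ; rc set to 1: 0000 0111 0111 1111 = 077Fh
- END
-
- Fields are filled from the most significant bit down, so a blank field
- keeps its place in the list and later initializers still line up with
- their fields.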
-
- The definition in the example below creates a variable named warning whose
- type is given by the record type color. The initial values of the fields in
- the variable are set to the values given in the record definition. The
- initial values override any default record values, had any been given in the
- declaration.
-
- COLOR RECORD blink:1,back:3,intense:1,fore:3 ; Record
- ; declaration
- warning COLOR <1, 0, 1, 4> ; Record
- ; definition
-
- (This figure may be found in the printed book.)
-
-
- LENGTHOF, SIZEOF, and TYPE with Records
-
- The SIZEOF and TYPE operators applied to a record name return the number of
- bytes used by the record. SIZEOF for a record variable returns the number of
- bytes used by the variable. You cannot use LENGTHOF with record types, but
- you can with the variables of that type. LENGTHOF returns the number of
- items in an initializer. The record can be used as an operand. The value of
- the operand is a bit mask of the defined record. This example illustrates
- these points.
-
- ; Record definition
- ; 9 bits stored in 2 bytes
- RGBCOLOR RECORD red:3, green:3, blue:3
-
- mov ax, RGBCOLOR ; Equivalent to "mov ax, 01FFh"
- ; mov ax, LENGTHOF RGBCOLOR ; Illegal since LENGTHOF can
- ; apply only to data label
- mov ax, SIZEOF RGBCOLOR ; Equivalent to "mov ax, 2"
- mov ax, TYPE RGBCOLOR ; Equivalent to "mov ax, 2"
-
- ; Record instance
- ; 8 bits stored in 1 byte
- RGBCOLOR2 RECORD red:3, green:3, blue:2
- rgb RGBCOLOR2 <1, 1, 1> ; Initialize to 025h
-
- mov ax, RGBCOLOR2 ; Equivalent to "mov ax, 00FFh"
- mov ax, LENGTHOF rgb ; Equivalent to "mov ax, 1"
- mov ax, SIZEOF rgb ; Equivalent to "mov ax, 1"
- mov ax, TYPE rgb ; Equivalent to "mov ax, 1"
-
-
- 5.3.3 Record Operators
-
- The WIDTH operator (which is used only with records) returns the width in
- bits of a record or record field. The MASK operator returns a bit mask for
- the bit positions occupied by the given record field. A bit in the mask
- contains a 1 if that bit corresponds to a bit field. The example below shows
- how to use MASK and WIDTH.
-
- .DATA
- COLOR RECORD blink:1, back:3, intense:1, fore:3
- message COLOR <1, 5, 1, 1>
- wblink EQU WIDTH blink ; "wblink" = 1
- wback EQU WIDTH back ; "wback" = 3
- wintense EQU WIDTH intense ; "wintense" = 1
- wfore EQU WIDTH fore ; "wfore" = 3
- wcolor EQU WIDTH color ; "wcolor" = 8
- .CODE
- .
- .
- .
- mov ah, message ; Load initial 1101 1001
- and ah, NOT MASK back ; Turn off AND 1000 1111
- ; "back" ---------
- ; 1000 1001
- or ah, MASK blink ; Turn on OR 1000 0000
- ; "blink" ---------
- ; 1000 1001
- xor ah, MASK intense ; Toggle XOR 0000 1000
- ; "intense" ---------
- ; 1000 0001
- .
- IF (WIDTH color) GT 8 ; If color is 16 bit, load
- mov ax, message ; into 16-bit register
- ELSE ; else
- mov al, message ; load into low 8-bit register
- xor ah, ah ; and clear high 8-bits
- ENDIF
-
- This example illustrates several ways in which record fields can be used as
- operands and in expressions.
-
- ; Rotate "back" of "cursor" without changing other values
-
- mov al, cursor        ; Load value from memory
- mov ah, al            ; Save a copy for work         1101 1001=ah/al
- and al, NOT MASK back ; Mask out old bits        AND 1000 1111=mask
-                       ; to save old cursor           ---------
-                       ;                              1000 1001=al
- mov cl, back          ; Load bit position
- shr ah, cl            ; Shift to right               0000 1101=ah
- inc ah                ; Increment                    0000 1110=ah
- shl ah, cl            ; Shift left again             1110 0000=ah
- and ah, MASK back     ; Mask off extra bits      AND 0111 0000=mask
-                       ; to get new cursor            ---------
-                       ;                              0110 0000=ah
- or ah, al             ; Combine old and new       OR 1000 1001=al
-                       ;                              ---------
- mov cursor, ah        ; Write back to memory         1110 1001=ah
-
- Record variables are often used with the logical operators to perform
- logical operations on the bit fields of the record, as in the previous
- example using the MASK operator.
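-
- A common companion to these operations is reading one field out as an
- ordinary number. The sketch below isolates the back field of the message
- variable used earlier by masking and then shifting right by the field's
- bit position. It assumes a small-model program; the choice of AL and CL is
- illustrative.
-
- .MODEL small
- .STACK
- COLOR RECORD blink:1, back:3, intense:1, fore:3
-
- .DATA
- message COLOR <1, 5, 1, 1> ; 1101 1001
-
- .CODE
- .STARTUP
- mov al, message   ; Load record               1101 1001
- and al, MASK back ; Keep only "back"      AND 0111 0000
-                   ;                           ---------
-                   ;                           0101 0000
- mov cl, back      ; Field name gives the shift count (4)
- shr al, cl        ; Right-justify the field   0000 0101 = 5
- .EXIT
- END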
-
-
- 5.4 Related Topics in Online Help
-
- In addition to information on all the instructions and directives mentioned
- in this chapter, information on the following topics can be found in online
- help, starting at the "MASM 6.0 Contents" screen:
-
- Topic                             Access
- ────────────────────────────────────────────────────────────────────────────
- INS, OUTS                         Choose "Processor Instructions" and then
-                                   "System and I/O Access"
-
- LABEL                             Choose "Directives" and then "Code
-                                   Labels"
-
- RECORD, UNION, STRUCT, MASK,      Choose "Directives" and then choose
- ORG, WIDTH, and ALIGN             "Complex Data Types"
-
- SHRD, SHLD, BSF, and BSR          From "Processor Instructions," choose
-                                   "Logical and Shifts"
-
- BOUND                             From "Processor Instructions," choose
-                                   "Data Transfer"
-
-
-
-
-
-
-
-
- Chapter 6 Using Floating-Point and Binary Coded Decimal Numbers
- ────────────────────────────────────────────────────────────────────────────
-
- MASM requires different techniques for handling floating-point (real)
- numbers and binary coded decimal (BCD) numbers than for handling integers.
- You have two choices for working with real numbers─a math coprocessor or
- emulation routines.
-
- Math coprocessors─the 8087, 80287, and 80387 chips─work with the main
- processor to handle real-number calculations. The 80486 processor performs
- floating-point operations directly. All information in this chapter
- pertaining to the 80387 coprocessor applies to the 80486 processor as well.
-
-
- This chapter begins with a summary of the directives and formats of
- floating-point data; you need to use these to allocate memory storage and
- initialize variables before you can work with floating-point numbers.
-
- The chapter then explains how to use a math coprocessor for floating-point
- operations. It covers these areas:
-
-
- ■ The architecture of the registers
-
- ■ The operands for the coprocessor instruction formats
-
- ■ The coordination of coprocessor and main processor memory access
-
- ■ The basic groups of coprocessor instructions─for loading and storing
- data, doing arithmetic calculations, and controlling program flow
-
-
- The next main section describes emulation libraries. With the emulation
- routines provided with all Microsoft high-level languages, you can use
- coprocessor instructions as though your computer had a math coprocessor.
- However, some coprocessor instructions are not handled by emulation, as this
- section explains.
-
- Finally, because math coprocessor and emulation routines can also operate on
- BCD numbers, this chapter discusses the instruction set for these numbers.
-
-
- 6.1 Using Floating-Point Numbers
-
- Before using floating-point data in your program, you need to allocate the
- memory storage for the data. You can then initialize variables either as
- real numbers in decimal form or as encoded hexadecimals. The assembler
- stores allocated data in IEEE format, using 4, 8, or 10 bytes depending on
- the directive. This section looks at floating-point declarations and
- floating-point data formats.
-
-
- 6.1.1 Declaring Floating-Point Variables and Constants
-
- You can allocate real constants using the REAL4, REAL8, and REAL10
- directives. The list below shows the size of the floating-point number each
- of these directives allocates.
-
- Directive Size
- ────────────────────────────────────────────────────────────────────────────
- REAL4 Short (32-bit) real numbers
- REAL8 Long (64-bit) real numbers
- REAL10 10-byte (80-bit) real numbers and BCD numbers
-
- The possible ranges for floating-point variables are given in Table 6.1.
-
- Table 6.1 Ranges of Floating-Point Variables
-
-                        Significant
- Data Type      Bits    Digits         Approximate Range
- ────────────────────────────────────────────────────────────────────────────
- Short real     32      6-7            ±1.18 x 10^-38 to ±3.40 x 10^38
-
- Long real      64      15-16          ±2.23 x 10^-308 to ±1.79 x 10^308
-
- 10-byte real   80      19             ±3.37 x 10^-4932 to ±1.18 x 10^4932
-
- ────────────────────────────────────────────────────────────────────────────
-
-
- With previous versions of MASM, the DD, DQ, and DT directives could be used
- to allocate real constants. These directives are still supported by MASM
- 6.0, but variables declared with them are treated as integers rather than
- floating-point values. Although this makes no difference in the assembly
- code, CodeView displays the values incorrectly.
-
- There are two forms for specifying floating-point numbers.
-
- You can specify floating-point constants either as decimal constants or as
- encoded hexadecimal constants. You can express decimal real-number constants
- in the form
-
- [[+ | -]] integer.[[fraction]][[E[[+ | -]]exponent]]
-
- For example, the numbers 2.523E1 and -3.6E-2 are written in the correct
- decimal format. These numbers can be used as initializers for real-number
- variables.
-
- Digits of real numbers are always evaluated as base 10. During assembly, the
- assembler converts real-number constants given in decimal format to a binary
- format. The sign, exponent, and mantissa of the real number are encoded as
- bit fields within the number.
-
- You can also specify the encoded format directly with hexadecimal digits
- (0-9 plus A-F). The number must begin with a decimal digit (0-9), so add a
- leading zero if the first hexadecimal digit is a letter. It must end with
- the real-number designator (R) and cannot be signed.
-
- For example, the hexadecimal number 3F800000r can be used as an
- initializer for a doubleword-sized variable.
-
- The maximum range of exponent values and the number of digits required in
- the hexadecimal number depend on the directive. The number of digits for
- encoded numbers used with REAL4, REAL8, and REAL10 must be 8, 16, and 20
- digits, respectively. If the number has a leading zero, the number must be
- 9, 17, or 21 digits.
-
- Examples of decimal constant and hexadecimal specifications are shown here:
-
-
- ; Real numbers
- shortreal REAL4 25.23 ; IEEE format
- double REAL8 2.523E1 ; IEEE format
- tenbyte REAL10 2523.0E-2 ; 10-byte real format
-
- ; Encoded as hexadecimals
- ieeeshort REAL4 3F800000r ; 1.0 as IEEE short
- ieeedouble REAL8 3FF0000000000000r ; 1.0 as IEEE long
- temporary REAL10 3FFF8000000000000000r ; 1.0 as 10-byte
- ; real
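-
- The leading-zero rule matters whenever the first hexadecimal digit is a
- letter, which is the case for every negative number because the sign bit
- is set. The following sketch encodes -1.0 in each format; the variable
- names are hypothetical, and the encodings are the 1.0 values above with
- the sign bit turned on.
-
- .MODEL small
- .DATA
- ; 8 digits plus leading zero = 9 digits
- minus4 REAL4 0BF800000r ; -1.0 as IEEE short
- ; 16 digits plus leading zero = 17 digits
- minus8 REAL8 0BFF0000000000000r ; -1.0 as IEEE long
- ; 20 digits plus leading zero = 21 digits
- minus10 REAL10 0BFFF8000000000000000r ; -1.0 as 10-byte real
- END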
-
- Section 6.1.2, "Storing Numbers in Floating-Point Format," explains the IEEE
- formats--the way the assembler actually stores the data.
-
- Pascal or C programmers may prefer to create language-specific TYPEDEF
- declarations, as illustrated in this example:
-
- ; C-language specific
- float TYPEDEF REAL4
- double TYPEDEF REAL8
- long_double TYPEDEF REAL10
- ; Pascal-language specific
- SINGLE TYPEDEF REAL4
- DOUBLE TYPEDEF REAL8
- EXTENDED TYPEDEF REAL10
-
- For applications of TYPEDEF other than aliasing, see Section 3.3.1,
- "Defining Pointer Types with TYPEDEF."
-
-
- 6.1.2 Storing Numbers in Floating-Point Format
-
- The assembler stores real numbers in the IEEE format.
-
- The assembler stores the floating-point variables in the IEEE format. MASM
- 6.0 does not support .MSFLOAT and Microsoft binary format, which are
- available in previous versions.
-
- Figure 6.1 illustrates the IEEE format for encoding short (four-byte), long
- (eight-byte), and 10-byte real numbers. Although this figure places the
- most-significant bit first for illustration, low bytes actually appear first
- in memory.
-
- (This figure may be found in the printed book.)
-
- This is how the parts of a real number are stored in the IEEE format:
-
-
- 1. Sign bit (0 for positive or 1 for negative) in the upper bit of the
- first byte.
-
- 2. Exponent in the next bits in sequence (8 bits for a short real number,
- 11 bits for a long real number, and 15 bits for a 10-byte real
- number).
-
- 3. Mantissa in the remaining bits. The first bit is always assumed to be
- 1. The length is 23 bits for short real numbers, 52 bits for long real
- numbers, and 63 bits for 10-byte reals.
-
-
- The exponent field represents a multiplier 2^n. To accommodate negative
- exponents (such as 2^-6), the value in the exponent field is biased; that is,
- the actual exponent is determined by subtracting the appropriate bias value
- from the value in the exponent field. For example, the bias for short reals
- is 127. If the value in the exponent field is 130, the exponent represents a
- value of 2^(130-127), or 2^3. The bias for long reals is 1,023. The bias for
- 10-byte reals is 16,383.
-
- Notice that the 10-byte real format stores the integer part of the mantissa.
- This differs from the 4-byte and 8-byte formats, in which the integer part
- is implicit.
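-
- The layout above can be checked by hand against an encoded constant. The
- following sketch works through 8.0 as a short real and repeats the 10-byte
- encoding of 1.0 from Section 6.1.1 to show the stored integer bit. The
- variable names are hypothetical.
-
- .MODEL small
- .DATA
- ; 8.0 = 1.0 x 2^3, so the exponent field holds 3 + 127 = 130
- ;
- ; sign  exponent   mantissa (leading 1 implied)
- ;  0    10000010   000 0000 0000 0000 0000 0000
- ;
- ; Regrouped in fours: 0100 0001 0000 0000 ... = 41000000h
- eight REAL4 41000000r ; Same value as "REAL4 8.0"
-
- ; 1.0 = 1.0 x 2^0, so the exponent field holds 0 + 16,383 = 3FFFh
- ; The mantissa begins with the explicit integer bit: 8000000000000000h
- one10 REAL10 3FFF8000000000000000r ; Same value as "REAL10 1.0"
- END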
-
- Once you have declared floating-point data for your program, you can use
- coprocessor or emulator instructions to access the data. The next section
- focuses on the coprocessor architecture, instructions, and operands required
- for floating-point operations.
-
-
- 6.2 Using a Math Coprocessor
-
- When used with real numbers, packed BCD numbers, or long integers,
- coprocessors (the 8087, 80287, 80387, and 80486) calculate many times faster
- than the 8086-based processors. The coprocessor handles data with its own
- registers. The organization of these registers reflects four possible
- formats for using operands (as explained in Section 6.2.2, "Instruction and
- Operand Formats").
-
- This section also describes how the coprocessor performs various tasks:
- transferring data to and from the coprocessor, coordinating processor and
- coprocessor operations, and controlling program flow.
-
-
- 6.2.1 Coprocessor Architecture
-
- The coprocessor accesses memory as the CPU does, but it has its own data and
- control registers--eight data registers organized as a stack and seven
- control registers similar to the 8086 flag registers. The coprocessor's
- instruction set provides direct access to these registers.
-
- The eight coprocessor data registers form a stack.
-
- The eight 80-bit data registers of the 8087-based coprocessors are organized
- as a stack although they need not be used as a stack. As data items are
- pushed into the top register, previous data items move into higher-numbered
- registers, which are lower on the stack. Register 0 is the top of the stack;
- register 7 is the bottom. The syntax for specifying registers is shown
- below:
-
- ST [[(number)]]
-
- The number must be a digit between 0 and 7 or a constant expression that
- evaluates to a number from 0 to 7. ST is another way to refer to ST(0).
-
- All coprocessor data is stored in registers in the 10-byte real format. The
- registers and the register format are shown in Figure 6.2.
-
- (This figure may be found in the printed book.)
-
- Internally, all calculations are done on numbers of the same type. Since
- 10-byte real numbers have the greatest precision, lower-precision numbers
- are guaranteed not to lose precision as a result of calculations. The
- instructions that transfer values between the main memory and the
- coprocessor automatically convert numbers to and from the 10-byte real
- format.
-
-
- 6.2.2 Instruction and Operand Formats
-
- Because of the stack organization of registers, you can consider registers
- either as elements on a stack or as registers much like 8086-family
- registers. Table 6.2 lists the four main groups of coprocessor instructions
- and the general syntax for each. The names given to the instruction format
- reflect the way the instruction uses the coprocessor registers. The
- instruction operands are placed in the coprocessor data registers before the
- instruction executes.
-
- Table 6.2 Coprocessor Operand Formats
-
- Instruction                               Implied
- Format            Syntax                  Operands     Example
- ────────────────────────────────────────────────────────────────────────────
- Classical stack   Faction                 ST, ST(1)    fadd
-
- Memory            Faction memory          ST           fadd memloc
-
- Register          Faction ST(num), ST     ─            fadd st(5), st
-                   Faction ST, ST(num)                  fadd st, st(3)
-
- Register pop      FactionP ST(num), ST    ─            faddp st(4), st
-
- ────────────────────────────────────────────────────────────────────────────
-
-
- All coprocessor instructions begin with F.
-
- You can easily recognize coprocessor instructions because, unlike all
- 8086-family instruction mnemonics, they start with the letter F. Coprocessor
- instructions can never have immediate operands and, with the exception of
- the FSTSW instruction, they cannot have processor registers as operands.
-
-
- 6.2.2.1 Classical-Stack Format
-
- Instructions in the classical-stack format treat the coprocessor registers
- like items on a stack─thus its name. Items are pushed onto or popped off the
- top elements of the stack. Since only the top item can be accessed on a
- traditional stack, there is no need to specify operands. The first (top)
- register (and the second if the instruction needs two operands) is always
- assumed.
-
- In coprocessor arithmetic operations, the top of the stack (ST) is the
- source operand and the second register [ST(1)] is the destination. The
- result of the operation goes into the destination operand, and the source is
- popped off the stack. The result is left at the top of the stack.
-
- Instructions that load constants are one example of instructions that
- require the classical-stack format. In this case, the constant created by
- the instruction is the implied source, and the top of the stack is the
- destination.
-
- This example illustrates the classical-stack format, and Figure 6.3 shows
- the status of the register stack after each instruction:
-
- fld1 ; Push 1 into first position
- fldpi ; Push pi into first position
- fadd ; Add pi and 1 and pop
-
- (This figure may be found in the printed book.)
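-
- As a stand-in for the figure, the sketch below repeats the three
- instructions with the register contents traced in comments. It assumes a
- small-model program running with a coprocessor or emulator; the result
- variable and the final FSTP are additions made only to leave the stack
- empty.
-
- .MODEL small
- .STACK
- .DATA
- result REAL4 ?
-
- .CODE
- .STARTUP
- fld1        ; ST = 1.0
- fldpi       ; ST = pi, ST(1) = 1.0
- fadd        ; ST(1) = ST(1) + ST, then pop: ST = pi + 1.0
- fstp result ; Store the sum and pop; stack is empty again
- .EXIT
- END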
-
-
- 6.2.2.2 Memory Format
-
- Instructions using the memory format, such as data transfer instructions,
- also treat coprocessor registers like items on a stack. However, with this
- format, items are pushed from memory onto the top element of the stack or
- popped from the top element to memory. You must specify the memory operand.
-
-
- Some coprocessor instructions operate on integers or BCDs.
-
- Some instructions that use the memory format specify how a memory operand is
- to be interpreted─as an integer (I) or as a binary coded decimal (B). The
- letter I or B follows the initial F in the syntax. For example, FILD
- interprets its operand as an integer and FBLD interprets its operand as a
- BCD number. If the instruction name does not include a type letter, the
- instruction works on real numbers.
-
- You can also use memory operands in calculation instructions that operate on
- two values (see Section 6.2.4, "Using Coprocessor Instructions"). The memory
- operand is always the source. The stack top (ST) is always the implied
- destination. The result of the operation replaces the destination without
- changing its stack position, as shown in this example and Figure 6.4:
-
- .DATA
- m1 REAL4 1.0
- m2 REAL4 2.0
- .CODE
- .
- .
- .
- fld m1 ; Push m1 into first position
- fld m2 ; Push m2 into first position
- fadd m1 ; Add m1 to first position
- fstp m1 ; Pop first position into m1
- fst m2 ; Copy first position to m2
-
- (This figure may be found in the printed book.)
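-
- As a stand-in for the figure, the sketch below repeats the example with
- the register and memory contents traced in comments. It assumes a
- small-model program running with a coprocessor or emulator; the final
- FSTP ST(0) is an addition that discards the remaining value.
-
- .MODEL small
- .STACK
- .DATA
- m1 REAL4 1.0
- m2 REAL4 2.0
-
- .CODE
- .STARTUP
- fld m1     ; ST = 1.0
- fld m2     ; ST = 2.0, ST(1) = 1.0
- fadd m1    ; ST = 2.0 + 1.0 = 3.0, ST(1) = 1.0
- fstp m1    ; m1 = 3.0, then pop: ST = 1.0
- fst m2     ; m2 = 1.0, ST = 1.0 (no pop)
- fstp st(0) ; Pop the last value; stack is empty again
- .EXIT
- END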
-
-
- 6.2.2.3 Register Format
-
- Instructions using the register format treat coprocessor registers as
- registers rather than as stack elements. Instructions that use this format
- require two register operands; one of them must be the stack top (ST).
-
- In the register format, specify all operands by name. The first operand is
- the destination; its value is replaced with the result of the operation. The
- second operand is the source; it is not affected by the operation. The stack
- position of the operands does not change.
-
- The only instructions using the register operand format are the FXCH
- instruction and the arithmetic instructions that do calculations on two
- values. With the FXCH instruction, the stack top is implied and need not be
- specified, as shown in this example and Figure 6.5:
-
- fadd st(1), st ; Add first position to second -
- ; result goes in second position
- fadd st, st(2) ; Add third position to first -
- ; result goes in first position
- fxch st(1) ; Exchange first and second positions
-
- (This figure may be found in the printed book.)
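-
- As a stand-in for the figure, the sketch below runs the same three
- instructions on known values and traces the registers in comments. The
- starting values and the variable names (one, two, three) are assumptions
- chosen for illustration; the program assumes a coprocessor or emulator.
-
- .MODEL small
- .STACK
- .DATA
- one REAL4 1.0
- two REAL4 2.0
- three REAL4 3.0
-
- .CODE
- .STARTUP
- fld three      ; ST = 3.0
- fld two        ; ST = 2.0, ST(1) = 3.0
- fld one        ; ST = 1.0, ST(1) = 2.0, ST(2) = 3.0
- fadd st(1), st ; ST(1) = 2.0 + 1.0 = 3.0
- fadd st, st(2) ; ST = 1.0 + 3.0 = 4.0
- fxch st(1)     ; ST = 3.0, ST(1) = 4.0, ST(2) = 3.0
- .EXIT
- END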
-
-
- 6.2.2.4 Register-Pop Format
-
- The register-pop format treats coprocessor registers as a modified stack.
- The source register must always be the stack top. Specify the destination
- with the register's name.
-
- Instructions with this format place the result of the operation into the
- destination operand, and the stack top pops off the stack. The effect is
- that both values being operated on are lost and the result of the operation
- is saved in the specified destination register. The register-pop format is
- used only for instructions that do calculations on two values, as in this
- example and Figure 6.6:
-
- faddp st(2), st ; Add first and third positions and pop -
- ; first position destroyed;
- ; third moves to second and holds result
-
- (This figure may be found in the printed book.)
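-
- As a stand-in for the figure, the sketch below applies the instruction to
- known values and traces the registers in comments. The starting values
- and the variable names (one, two, three) are assumptions chosen for
- illustration; the program assumes a coprocessor or emulator.
-
- .MODEL small
- .STACK
- .DATA
- one REAL4 1.0
- two REAL4 2.0
- three REAL4 3.0
-
- .CODE
- .STARTUP
- fld three       ; ST = 3.0
- fld two         ; ST = 2.0, ST(1) = 3.0
- fld one         ; ST = 1.0, ST(1) = 2.0, ST(2) = 3.0
- faddp st(2), st ; ST(2) = 3.0 + 1.0 = 4.0, then pop:
-                 ; ST = 2.0, ST(1) = 4.0
- .EXIT
- END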
-
-
- 6.2.3 Coordinating Memory Access
-
- The math coprocessor works simultaneously with the main processor. However,
- since the coprocessor cannot handle device input or output, data originates
- in the main processor.
-
- The processor and coprocessor exchange data through memory.
-
- The main processor and the coprocessor have their own registers, which are
- completely separate and inaccessible to each other. They usually exchange
- data through memory, since memory is available to both.
-
- When using the coprocessor, follow these three steps:
-
-
- 1. Load data from memory to coprocessor registers.
-
- 2. Process the data.
-
- 3. Store the data from coprocessor registers back to memory.
-
-
- Step 2, processing the data, can occur while the main processor is handling
- other tasks. Steps 1 and 3 must be coordinated with the main processor so
- that the processor and coprocessor do not try to access the same memory at
- the same time. Since the processor and coprocessor work independently, they
- may not finish working on memory in the order in which you give
- instructions. Two potential timing conflicts can occur; they are handled in
- different ways.
-
- One timing conflict results if a coprocessor instruction follows a processor
- instruction. The processor may have to wait until the coprocessor finishes
- if the next processor instruction requires the result of the coprocessor's
- calculation. You do not have to write your code to avoid this conflict,
- however. The assembler coordinates this timing automatically for the 8088
- and 8086 processors, and the processor coordinates it automatically on the
- 80186-80486 processors. This is the first case shown in the example later in
- this section.
-
- Another conflict results if a processor instruction that accesses memory
- follows a coprocessor instruction that accesses the same memory. The
- processor can try to load a variable that is still being used by the
- coprocessor. You need careful synchronization to control the timing, and
- this synchronization is not automatic on the 8087 coprocessor. For code to
- run correctly on the 8087, you must include the WAIT or FWAIT instruction
- (they are mnemonics for the same instruction) to ensure that the coprocessor
- finishes before the processor begins, as shown in the second example. In
- this situation, the assembler does not generate the FWAIT instruction
- automatically.
-
- ; Processor instruction first - No wait needed
- mov WORD PTR mem32[0], ax ; Load memory
- mov WORD PTR mem32[2], dx
- fild mem32 ; Load to register
-
- ; Coprocessor instruction first - Wait needed (for 8087)
- fist mem32 ; Store to memory
- fwait ; Wait until coprocessor
- ; is done
- mov ax, WORD PTR mem32[0] ; Move to register
- mov dx, WORD PTR mem32[2]
-
- When generating code for the 8087 coprocessor, the assembler automatically
- inserts a WAIT instruction before the coprocessor instruction. However, if
- you use the .286 or .386 directive, the assembler assumes that the
- coprocessor instructions are for the 80287 or 80387 and does not insert the
- WAIT instruction.
-
- If your code does not need to run on an 8086 or 8088 processor, you can make
- your programs shorter and more efficient by using the .286 or .386
- directive.
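-
- The three steps and the 8087 synchronization rule can be combined in a
- short sketch such as the one below. The variables radius and area are
- hypothetical, and the FWAIT matters only if the code may run on an 8087:
-
- .DATA
- radius REAL4 2.5 ; Hypothetical input
- area REAL4 ?
- .CODE
- fld radius ; Step 1: load from memory
- fmul st, st ; Step 2: ST = radius squared
- fldpi ; Push pi
- fmul ; ST = pi * radius squared
- fstp area ; Step 3: store to memory and pop
- fwait ; Wait until the store is complete
- mov ax, WORD PTR area[0] ; Processor can now read the result
- mov dx, WORD PTR area[2]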
-
-
- 6.2.4 Using Coprocessor Instructions
-
- The 8087 family of coprocessors has separate instructions for each of the
- following operations:
-
-
- ■ Loading and storing data
-
- ■ Doing arithmetic calculations
-
- ■ Controlling program flow
-
-
- The following sections explain the available instructions and show how to
- use them for each of the operations listed above. See Section 6.2.2,
- "Instruction and Operand Formats," for general syntax information.
-
-
- 6.2.4.1 Loading and Storing Data
-
- Data-transfer instructions transfer data between main memory and the
- coprocessor registers or between different coprocessor registers. Two
- principles govern data transfers:
-
-
- ■ The choice of instruction determines whether a value in memory is
- considered an integer, a BCD number, or a real number. The value is
- always considered a 10-byte real number once it is transferred to the
- coprocessor.
-
- ■ The size of the operand determines the size of a value in memory.
- Values in the coprocessor always take up 10 bytes.
-
-
- Load commands transfer data, and store commands remove data.
-
- You can transfer data to stack registers using load commands. These commands
- push data onto the stack from memory or from coprocessor registers. Store
- commands remove data. Some store commands pop data off the register stack
- into memory or coprocessor registers; others simply copy the data without
- changing it on the stack.
-
- If you use constants as operands, you cannot load them directly into
- coprocessor registers. You must allocate memory and initialize a variable to
- a constant value. That variable can then be loaded by using one of the load
- instructions listed below.
-
- A few special instructions are provided for loading certain constants. You
- can load 0, 1, pi, and several common logarithmic values directly. Using
- these instructions is faster and often more precise than loading the values
- from initialized variables.
-
- All instructions that load constants have the stack top as the implied
- destination operand. The constant to be loaded is the implied source
- operand.
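-
- A brief sketch of both approaches follows. The variable ten is
- hypothetical; it stands for any constant that has no dedicated load
- instruction:
-
- .DATA
- ten REAL4 10.0 ; Constant kept in memory
- .CODE
- fld ten ; Push 10.0 from its memory variable
- fldpi ; Push pi; no operand is needed
- fmul ; ST = 10 * pi (one value on the stack)
- fldz ; Push 0.0
- fld1 ; Push 1.0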
-
- The coprocessor data area, or parts of it, can also be moved to memory and
- later loaded back. You may want to do this to save the current state of the
- coprocessor before executing a procedure. After the procedure ends, restore
- the previous status. Saving coprocessor data is also useful when you want to
- modify coprocessor behavior by writing certain data to main memory,
- operating on the data with 8086-family instructions, and then loading it
- back to the coprocessor data area.
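-
- For instance, the rounding mode can be changed by editing a copy of the
- control word in memory. The sketch below assumes that the rounding-control
- field occupies bits 10 and 11 of the control word and that setting both
- bits selects truncation; oldcw and newcw are hypothetical variables:
-
- .DATA
- oldcw WORD ?
- newcw WORD ?
- .CODE
- fstcw oldcw ; Save the current control word
- fwait ; Make sure the store is complete
- mov ax, oldcw ; Copy it to a processor register
- or ah, 00001100y ; Set bits 10 and 11 (truncate)
- mov newcw, ax
- fldcw newcw ; Load the modified control word
- .
- .
- .
- fldcw oldcw ; Restore the original control word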
-
- You can use the following instructions for transferring numbers to and from
- registers:
-
- Instruction(s)         Description
- ────────────────────────────────────────────────────────────────────────────
- FLD, FST, FSTP         Loads and stores real numbers
- FILD, FIST, FISTP      Loads and stores binary integers
- FBLD                   Loads BCD
- FBSTP                  Stores BCD
- FXCH                   Exchanges register values
- FLDZ                   Pushes 0 into ST
- FLD1                   Pushes 1 into ST
- FLDPI                  Pushes the value of pi into ST
- FLDCW mem2byte         Loads the control word into the coprocessor
- F«N»STCW mem2byte      Stores the control word in memory
- FLDENV mem14byte       Loads environment from memory
- F«N»STENV mem14byte    Stores environment in memory
- FRSTOR mem94byte       Restores state from memory
- F«N»SAVE mem94byte     Saves state in memory
- FLDL2E                 Pushes the base-2 logarithm of e into ST
- FLDL2T                 Pushes the base-2 logarithm of 10 into ST
- FLDLG2                 Pushes the base-10 logarithm of 2 into ST
- FLDLN2                 Pushes the natural (base-e) logarithm of 2 into ST
-
-
- The following example and Figure 6.7 illustrate some of these instructions:
-
-
- .DATA
- m1 REAL4 1.0
- m2 REAL4 2.0
- .CODE
- fld m1 ; Push m1 into first item
- fld st(2) ; Push third item into first
- fst m2 ; Copy first item to m2
- fxch st(2) ; Exchange first and third items
- fstp m1 ; Pop first item into m1
-
- (This figure may be found in the printed book.)
-
-
- 6.2.4.2 Doing Arithmetic Calculations
-
- Most of the coprocessor instructions for doing arithmetic operations have
- several forms, depending on the operand used. You do not need to specify the
- operand type in the instruction if both operands are stack registers, since
- register values are always 10-byte real numbers. The arithmetic instructions
- are listed below. In most cases, the result replaces the destination
- register.
-
- Instruction    Description
- ────────────────────────────────────────────────────────────────────────────
- FADD           Adds the source and destination
-
- FSUB           Subtracts the source from the destination
-
- FSUBR          Subtracts the destination from the source
-
- FMUL           Multiplies the source and the destination
-
- FDIV           Divides the destination by the source
-
- FDIVR          Divides the source by the destination
-
- FABS           Sets the sign of ST to positive
-
- FCHS           Reverses the sign of ST
-
- FRNDINT        Rounds ST to an integer
-
- FSQRT          Replaces the contents of ST with its square root
-
- FSCALE         Multiplies the stack-top value by 2 to the power
-                contained in ST(1)
-
- FPREM          Calculates the remainder of ST divided by ST(1)
-
-
-
-
- 80387 Only
-
- Instruction    Description
- ────────────────────────────────────────────────────────────────────────────
- FSIN           Calculates the sine of the value in ST
-
- FCOS           Calculates the cosine of the value in ST
-
- FSINCOS        Calculates the sine and cosine of the value in ST
-
- FPREM1         Calculates the partial remainder by performing modulo
-                division on the top two stack registers
-
-
- The remaining instructions are available on all 8087-family coprocessors:
-
- Instruction    Description
- ────────────────────────────────────────────────────────────────────────────
- FXTRACT        Breaks a number down into its exponent and mantissa and
-                pushes the mantissa onto the register stack
-
- F2XM1          Calculates 2^X - 1
-
- FYL2X          Calculates Y * log2(X)
-
- FYL2XP1        Calculates Y * log2(X + 1)
-
- FPTAN          Calculates the tangent of the value in ST
-
- FPATAN         Calculates the arctangent of the ratio Y/X
-
- F«N»INIT       Resets the coprocessor and restores all the default
-                conditions in the control and status words
-
- F«N»CLEX       Clears all exception flags and the busy flag of the
-                status word
-
- FINCSTP        Adds 1 to the stack pointer in the status word
-
- FDECSTP        Subtracts 1 from the stack pointer in the status word
-
- FFREE          Marks the specified register as empty
-
-
-
- The following example illustrating several arithmetic instructions solves
- quadratic equations. It does no error checking and fails for some values
- because it attempts to find the square root of a negative number. You could
- revise the code using the FTST (Test for Zero) instruction to check for a
- negative number or 0 before the square root is calculated. If b^2 - 4ac is
- negative or 0, the code can jump to routines that handle these two special
- cases.
-
- .DATA
- a REAL4 3.0
- b REAL4 7.0
- cc REAL4 2.0
- posx REAL4 0.0
- negx REAL4 0.0
-
- .CODE
- .
- .
- .
- ; Solve quadratic equation - no error checking
- ; The formula is: (-b +/- squareroot(b^2 - 4ac)) / (2a)
- fld1 ; Get constants 2 and 4
- fadd st,st ; 2 at bottom
- fld st ; Copy it
- fmul a ; = 2a
-
- fmul st(1),st ; = 4a
- fxch ; Exchange
- fmul cc ; = 4ac
-
- fld b ; Load b
- fmul st,st ; = b^2
- fsubr ; = b^2 - 4ac
- ; Negative value here produces error
- fsqrt ; = square root(b^2 - 4ac)
- fld b ; Load b
- fchs ; Make it negative
- fxch ; Exchange
-
- fld st ; Copy square root
- fadd st,st(2) ; Plus version = -b + root(b^2 - 4ac)
- fxch ; Exchange
- fsubp st(2),st ; Minus version = -b - root(b^2 - 4ac)
-
- fdiv st,st(2) ; Divide plus version
- fstp posx ; Store it
- fdivr ; Divide minus version
- fstp negx ; Store it
-
- The examples in online help contain an enhanced version of this procedure.
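-
- A minimal sketch of the FTST check suggested above follows. It would stand
- in place of the FSQRT instruction, at the point where the value under the
- square root is at the stack top. The fragment assumes a WORD variable named
- status in the data segment; the labels noreal and oneroot are hypothetical.
- Moving the status word into the processor flags is described in Section
- 6.2.4.3:
-
- ftst ; Compare ST with 0
- fstsw status ; Store status word in memory
- fwait ; Make sure coprocessor is done
- mov ax, status ; Move to AX
- sahf ; C3, C2, C0 go to zero, parity, carry
- jp noreal ; Parity set: value cannot be compared
- jc noreal ; Carry set: negative, no real roots
- jz oneroot ; Zero set: one repeated root
- fsqrt ; Positive: safe to take the square root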
-
-
- 6.2.4.3 Controlling Program Flow
-
- The math coprocessors have several instructions that set control flags in
- the status word. The 8087-family control flags can be used with conditional
- jumps to direct program flow in the same way that 8086-family flags are
- used. Since the coprocessor does not have jump instructions, you must
- transfer the status word to memory so that the flags can be used by
- 8086-family instructions.
-
- An easy way to use the status word with conditional jumps is to move its
- upper byte into the lower byte of the processor flags, as shown in this
- example:
-
- fstsw mem16 ; Store status word in memory
- fwait ; Make sure coprocessor is done
- mov ax, mem16 ; Move to AX
- sahf ; Store AH (upper byte) in flags
-
- The SAHF (Store AH into Flags) instruction in the example above transfers AH
- into the low bits of the flags register.
-
- You can save several steps by storing the status word directly in AX on
- the 80287 and later coprocessors with the FSTSW and FNSTSW instructions.
- This is the only case in which data can be transferred directly between
- processor and coprocessor registers, as shown in this example:
-
- fstsw ax
-
- The coprocessor control flags and their relationship to the status word are
- described in Section 6.2.4.4, "Control Registers."
-
- The 8087-family coprocessors provide several instructions for comparing
- operands and testing control flags. All these instructions compare the stack
- top (ST) to a source operand, which may either be specified or implied as
- ST(1).
-
- The compare instructions affect the C3, C2, and C0 control flags, but not
- the C1 flag. Table 6.3 shows the flags set for each possible result of a
- comparison or test.
-
- Table 6.3 Control-Flag Settings after Comparison or Test
-
- After FCOM        After FTST                          C3    C2    C0
- ────────────────────────────────────────────────────────────────────────────
- ST > source       ST is positive                      0     0     0
- ST < source       ST is negative                      0     0     1
- ST = source       ST is 0                             1     0     0
- Not comparable    ST is NAN or projective infinity    1     1     1
- ────────────────────────────────────────────────────────────────────────────
-
- Variations on the compare instructions allow you to pop the stack once or
- twice and to compare integers and zero. For each instruction, the stack top
- is always the implied destination operand. If you do not give an operand,
- ST(1) is the implied source. With some compare instructions, you can specify
- the source as a memory or register operand.
-
- All instructions summarized in the following list have implied operands:
- either ST as a single-destination operand or ST as the destination and ST(1)
- as the source. These are the instructions for comparing and testing flags.
-
- Some instructions have a wait version and a no-wait version. The no-wait
- versions have N as the second letter.
-
- Instruction                   Description
- ────────────────────────────────────────────────────────────────────────────
- FCOM                          Compares the stack top to the source. The
-                               source and destination are unaffected by
-                               the comparison.
-
- FTST                          Compares ST to 0.
-
- FCOMP                         Compares the stack top to the source and
-                               then pops the stack.
-
- FUCOM, FUCOMP, FUCOMPP        Compare the source to ST and set the
-                               condition codes of the status word
-                               according to the result (80386/486 only).
-
- F«N»STSW mem2byte             Stores the status word in memory.
-
- FXAM                          Sets the value of the control flags
-                               based on the type of the number in ST.
-
- FPREM                         Finds a correct remainder for large
-                               operands. It uses the C2 flag to
-                               indicate whether the remainder returned
-                               is partial (C2 is set) or complete (C2
-                               is clear). If the bit is set, the
-                               operation should be repeated. It also
-                               returns the least-significant three bits
-                               of the quotient in C0, C3, and C1.
-
- FNOP                          Copies the stack top onto itself, thus
-                               padding the executable file and taking
-                               up processing time without having any
-                               effect on registers or memory.
-
- FDISI, FNDISI, FENI, FNENI    Enables or disables interrupts (8087
-                               only).
-
- FSETPM                        Sets protected mode. Requires a .286P or
-                               .386P directive (80287, 80387, and 80486
-                               only).
-
-
-
- The following example illustrates some of these instructions. Notice how
- conditional blocks are used to enhance 80287 code.
-
- .DATA
- down REAL4 10.35 ; Sides of a rectangle
- across REAL4 13.07
- diamtr REAL4 12.93 ; Diameter of a circle
- status WORD ?
- P287 EQU (@Cpu AND 00111y)
- .CODE
- .
- .
- .
- ; Get area of rectangle
- fld across ; Load one side
- fmul down ; Multiply by the other
-
- ; Get area of circle: Area = PI * (D/2)^2
- fld1 ; Load one and
- fadd st, st ; double it to get constant 2
- fdivr diamtr ; Divide diameter to get radius
- fmul st, st ; Square radius
- fldpi ; Load pi
- fmul ; Multiply it
-
- ; Compare area of circle and rectangle
- fcompp ; Compare and throw both away
- IF p287
- fstsw ax ; (For 287+, skip memory)
- ELSE
- fstsw status ; Store status word in memory
- fwait ; Make sure the store is complete (8087)
- mov ax, status ; Transfer memory to register
- ENDIF
- sahf ; Transfer AH to flags register
- jp nocomp ; If parity set, can't compare
- jz same ; If zero set, they're the same
- jc rectangle ; If carry set, rectangle is bigger
- jmp circle ; else circle is bigger
-
- nocomp: ; Error handler
- .
- .
- .
- same: ; Both equal
- .
- .
- .
- rectangle: ; Rectangle bigger
- .
- .
- .
- circle: ; Circle bigger
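-
- The FPREM entry in the table above notes that the operation should be
- repeated while C2 is set. A sketch of such a loop follows. It assumes the
- dividend is in ST and the divisor in ST(1), reuses the status variable from
- the example, and introduces a hypothetical label named again:
-
- again: fprem ; ST = partial remainder of ST / ST(1)
- fstsw status ; Store status word in memory
- fwait ; Make sure coprocessor is done
- mov ax, status ; Transfer memory to register
- sahf ; C2 goes to the parity flag
- jp again ; C2 set: reduction not yet complete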
-
- Additional instructions for the 80387/486 are FLDENVD and FLDENVW for
- loading the environment; FNSTENVD, FNSTENVW, FSTENVD, and FSTENVW for
- storing the environment state; FNSAVED, FNSAVEW, FSAVED, and FSAVEW for
- saving the coprocessor state; and FRSTORD and FRSTORW for restoring the
- coprocessor state.
-
- The size of the code segment, not the operand size, determines the number of
- bytes loaded or stored with these instructions. The instructions ending with
- W store the 16-bit form of the control register data, and the instructions
- ending with D store the 32-bit form. For example, in 16-bit mode FSAVEW
- saves the 16-bit control register data. If you need to store the 32-bit form
- of the control register data, use FSAVED.
-
-
- 6.2.4.4 Control Registers
-
- Some of the flags of the seven 16-bit control registers control coprocessor
- operations, while others maintain the current status of the coprocessor. In
- this sense, they are much like the 8086-family flags registers (see Figure
- 6.8).
-
- (This figure may be found in the printed book.)
-
- Of the control registers, only the status word register is commonly used
- (the others are used mostly by systems programmers). The format of the
- status word register is shown in Figure 6.9, which shows how the coprocessor
- control flags align with the processor flags. C3 overwrites the zero flag,
- C2 overwrites the parity flag, and C0 overwrites the carry flag. C1
- overwrites an undefined bit, so it cannot be used directly with conditional
- jumps, although you can use the TEST instruction to check C1 in memory or in
- a register. The status word register also overwrites the sign and
- auxiliary-carry flags, so you cannot count on their being unchanged after
- the operation.
-
- (This figure may be found in the printed book.)
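-
- As a sketch of testing C1, the fragment below uses FXAM, which reports the
- sign of ST in C1, and then checks the bit with TEST. It assumes that C1 is
- bit 9 of the status word (bit 1 of AH once the word is in AX) and that a
- WORD variable named status exists; the label isneg is hypothetical:
-
- fxam ; Classify ST; C1 = sign of ST
- fstsw status ; Store status word in memory
- fwait ; Make sure coprocessor is done
- mov ax, status ; Move to AX
- test ah, 00000010y ; Isolate C1
- jnz isneg ; C1 set: ST is negative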
-
-
- 6.3 Using Emulator Libraries
-
- If you do not have a math coprocessor or an 80486 processor, you can do most
- floating-point operations by writing assembly-language procedures and
- accessing the emulator from a high-level language. All Microsoft high-level
- languages come with the emulator library.
-
- However, you cannot use a Microsoft emulator library with stand-alone
- assembler programs, since the library depends on the high-level-language
- start-up code.
-
- With emulator libraries, you can use most floating-point instructions.
-
- To use the emulator, first write the procedure using coprocessor
- instructions. Then assemble it using the /FPi option of the assembler.
- Finally, link it with your high-level-language modules. In MASM 6.0 you can
- enter options in the Programmer's WorkBench (PWB) environment, or you can
- use the OPTION EMULATOR directive in your source code.
-
- In emulation mode, the assembler generates instructions for the linker that
- the Microsoft emulator can use. The form of the OPTION directive in the
- example below tells the assembler to use emulation mode. This option
- (introduced in Section 1.3.2) can be defined only once in a module.
-
- OPTION EMULATOR
-
- Emulator libraries do not allow for all of the coprocessor instructions. The
- following floating-point instructions are not emulated:
-
- (This figure may be found in the printed book.)
-
- The set of emulated instructions is different under OS/2 2.x. If you use a
- coprocessor instruction that is not emulated, your program generates a
- run-time error when it tries to execute the unemulated instruction.
-
- See Chapter 20, "Mixed-Language Programming," for information about writing
- assembly-language procedures for high-level languages.
-
-
- 6.4 Using Binary Coded Decimal Numbers
-
- Binary coded decimal (BCD) numbers allow calculations on large numbers
- without rounding errors. The 8087-family coprocessors can do fast
- calculations with packed BCD numbers. See Section 6.4.2.2 for details. The
- 8086-family processors can also do some calculations with packed BCD
- numbers, but the process is slower and more complicated. See Section 6.4.2
- for details.
-
- This section explains how to define BCD numbers and then how to use them in
- calculations.
-
-
- 6.4.1 Defining BCD Constants and Variables
-
- Unpacked BCD numbers are made up of bytes containing a single decimal digit
- in the lower four bits of each byte. Packed BCD numbers are made up of bytes
- containing two decimal digits: one in the upper four bits and one in the
- lower four bits. The leftmost digit holds the sign (0 for positive, 1 for
- negative).
-
- Packed BCD numbers are encoded in the 8087 coprocessor's packed BCD format.
- They can be up to 18 digits long, packed two digits per byte. The assembler
- zero-pads BCDs initialized with fewer than 18 digits. Digit 20 is the sign
- bit, and digit 19 is reserved.
-
- The TBYTE directive allocates packed BCD constants.
-
- When you define an integer constant with the TBYTE directive and the current
- radix is decimal (t), the assembler interprets the number as a packed BCD
- number.
-
- The syntax for specifying packed BCDs is exactly the same as for other
- integers.
-
- pos1 TBYTE 1234567890 ; Encoded as 00000000001234567890h
- neg1 TBYTE -1234567890 ; Encoded as 80000000001234567890h
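-
- Packed BCD variables defined this way can be loaded and stored by the
- coprocessor with the FBLD and FBSTP instructions listed in Section 6.2.4.1.
- A minimal sketch, using hypothetical variables:
-
- .DATA
- price TBYTE 1995 ; Packed BCD
- qty WORD 3 ; Binary integer
- total TBYTE ? ; Receives packed BCD result
- .CODE
- fbld price ; ST = 1995
- fimul qty ; ST = 1995 * 3
- fbstp total ; Store 5985 as packed BCD and pop
- fwait ; Make sure the store is complete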
-
- Unpacked BCD numbers are stored one digit to a byte, with the value in the
- lower four bits. They can be defined using the BYTE directive. For example,
- an unpacked BCD number could be defined and initialized as shown below:
-
- unpackedr BYTE 1,5,8,2,5,2,9 ; Initialized to 9,252,851
- unpackedf BYTE 9,2,5,2,8,5,1 ; Initialized to 9,252,851
-
- Least-significant digits can come either first or last, depending on how you
- write the calculation routines that handle the numbers.
-
-
- 6.4.2 Calculating with BCDs
-
- When you use the processor to calculate with BCDs, the result is not correct
- unless you use the ASCII-adjust or decimal-adjust instructions to convert
- the result into a valid BCD integer.
-
-
- 6.4.2.1 Unpacked BCD Numbers
-
- Instructions for unpacked BCDs allow accurate BCD calculations.
-
- To do processor arithmetic on unpacked BCD numbers, you must do the
- eight-bit arithmetic calculations on each digit separately and assign the
- result to the AL register. After each operation, use the corresponding BCD
- instruction to adjust the result. The ASCII-adjust instructions do not take
- an operand. They always work on the value in the AL register.
-
- When a calculation using two one-digit values produces a two-digit result,
- the AAA, AAS, and AAM instructions put the low-order digit in AL and the
- high-order digit in AH. If the digit in AL needs to carry to or borrow from
- the digit in AH, AAA and AAS set the carry and auxiliary carry flags.
-
- These instructions get their names from Intel mnemonics that use the term
- "ASCII" to refer to unpacked BCD numbers and "decimal" to refer to packed
- BCD numbers. The four ASCII-adjust instructions for unpacked BCDs are
- described below:
-
- Instruction Description
- ────────────────────────────────────────────────────────────────────────────
- AAA Adjusts after an addition operation.
-
- AAS Adjusts after a subtraction operation.
-
- AAM Adjusts after a multiplication operation.
- Always use with MUL, not with IMUL.
-
- AAD Adjusts before a division operation.
- Unlike other BCD instructions, AAD
- converts a BCD value to a binary value
- before the operation. After the
- operation, use AAM to adjust the
- quotient. The remainder is lost. If you
- need the remainder, save it in another
- register before adjusting the quotient.
- Then move it back to AL and adjust if
- necessary.
-
-
- The following examples show how to use each of these instructions in BCD
- addition, subtraction, multiplication, and division.
-
- ; To add 9 and 3 as BCDs:
- mov ax, 9 ; Load 9
- mov bx, 3 ; and 3 as unpacked BCDs
- add al, bl ; Add 09h and 03h to get 0Ch
- aaa ; Adjust 0Ch in AL to 02h,
- ; increment AH to 01h, set carry
- ; Result 12 (unpacked BCD in AX)
-
- ; To subtract 4 from 13:
- mov ax, 103h ; Load 13
- mov bx, 4 ; and 4 as unpacked BCDs
- sub al, bl ; Subtract 4 from 3 to get FFh (-1)
- aas ; Adjust 0FFh in AL to 9,
- ; decrement AH to 0, set carry
- ; Result 9 (unpacked BCD in AX)
-
- ; To multiply 9 times 3:
- mov ax, 903h ; Load 9 and 3 as unpacked BCDs
- mul ah ; Multiply 9 and 3 to get 1Bh
- aam ; Adjust 1Bh in AL
- ; to get 27 (unpacked BCD in AX)
-
- ; To divide 25 by 2:
- mov ax, 205h ; Load 25
- mov bl, 2 ; and 2 as unpacked BCDs
- aad ; Adjust 0205h in AX
- ; to get 19h in AX
- div bl ; Divide by 2 to get
- ; quotient 0Ch in AL
- ; remainder 1 in AH
- aam ; Adjust 0Ch in AL
- ; to 12 (unpacked BCD in AX)
- ; (remainder destroyed)
-
- If you process multidigit BCD numbers in loops, each digit is processed and
- adjusted in turn.
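-
- A minimal sketch of such a loop follows. It adds two four-digit unpacked
- BCD numbers stored least-significant digit first. The variable names and
- the label next are hypothetical, and DS is assumed to address the data
- segment:
-
- .DATA
- num1 BYTE 9,9,2,1 ; 1299, least-significant digit first
- num2 BYTE 3,5,7,0 ; 0753
- sum BYTE 5 DUP (0) ; One extra digit for a final carry
- .CODE
- mov cx, 4 ; Number of digits
- xor si, si ; Index of current digit
- clc ; No carry into the first digit
- next: mov al, num1[si] ; Get digit of first number
- adc al, num2[si] ; Add digit of second number and carry
- aaa ; Adjust; carry set if digit overflowed
- mov sum[si], al ; Store adjusted digit
- inc si ; INC leaves the carry flag unchanged
- loop next ; LOOP does not change flags
- adc sum[si], 0 ; Store final carry as fifth digit
-
- With the values shown, sum ends up holding the digits 2,5,0,2,0, which is
- 2052 when read from the last byte to the first.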
-
-
- 6.4.2.2 Packed BCD Numbers
-
- Packed BCD numbers are made up of bytes containing two decimal digits: one
- in the upper four bits and one in the lower four bits. The 8086-family
- processors provide instructions for adjusting packed BCD numbers after
- addition and subtraction. You must write your own routines to adjust for
- multiplication and division.
-
- To do processor calculations on packed BCD numbers, you must do the
- eight-bit arithmetic calculations on each byte separately. The result should
- always be in the AL register. After each operation, use the corresponding
- BCD instruction to adjust the result. The decimal-adjust instructions do not
- take an operand. They always work on the value in the AL register.
-
- The 8086-family processors provide DAA (Decimal Adjust after Addition) and
- DAS (Decimal Adjust after Subtraction) for adjusting packed BCD numbers
- after addition and subtraction.
-
- These examples show DAA and DAS used for adding and subtracting BCDs.
-
- ;To add 88 and 33:
- mov ax, 8833h ; Load 88 and 33 as packed BCDs
- add al, ah ; Add 88 and 33 to get 0BBh
- daa ; Adjust 0BBh to 121 (packed BCD):
- ; 1 in carry and 21 in AL
-
- ;To subtract 38 from 83:
- mov ax, 3883h ; Load 83 and 38 as packed BCDs
- sub al, ah ; Subtract 38 from 83 to get 04Bh
- das ; Adjust 04Bh to 45 (packed BCD):
- ; 0 in carry and 45 in AL
-
- Unlike the ASCII-adjust instructions, the decimal-adjust instructions never
- affect AH. The processor sets the auxiliary carry flag if the digit in the
- lower four bits carries to or borrows from the digit in the upper four bits,
- and it sets the carry flag if the digit in the upper four bits needs to
- carry to or borrow from another byte.
-
- Multidigit BCD numbers are usually processed in loops. Each byte is
- processed and adjusted in turn.
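-
- A minimal sketch of such a loop follows. It adds two four-digit packed BCD
- numbers stored low-order byte first. The variable names and the label
- nextb are hypothetical, and DS is assumed to address the data segment:
-
- .DATA
- pk1 BYTE 99h, 12h ; 1299, low-order byte first
- pk2 BYTE 53h, 07h ; 0753
- pksum BYTE 3 DUP (0) ; One extra byte for a final carry
- .CODE
- mov cx, 2 ; Number of bytes
- xor si, si ; Index of current byte
- clc ; No carry into the first byte
- nextb: mov al, pk1[si] ; Get two digits of first number
- adc al, pk2[si] ; Add two digits of second number and carry
- daa ; Adjust; carry set if byte overflowed
- mov pksum[si], al ; Store adjusted byte
- inc si ; INC leaves the carry flag unchanged
- loop nextb ; LOOP does not change flags
- adc pksum[si], 0 ; Store final carry in third byte
-
- With the values shown, pksum ends up holding 52h, 20h, 00h, which is 2052
- when read from the last byte to the first.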
-
-
- 6.5 Related Topics in Online Help
-
- In addition to information on the instructions and directives mentioned in
- this chapter, information on the following topics can be found in online
- help, starting from the "MASM 6.0 Contents" screen.
-
- Topic                       Access
- ────────────────────────────────────────────────────────────────────────────
- Control registers           Choose "Language Overview," and then
-                             choose "Coprocessor Status Word,"
-                             "Coprocessor Control Word," or
-                             "Coprocessor Environment"
-
- ML options                  Choose "ML Command Line"
-
- Coprocessor instructions    Choose "Coprocessor Instructions"
-
- MATHDEMO.ASM                Choose "Example Code" and then "Map of
-                             Demos"
-
-
-
-
-
-
-
-
-
- Chapter 7 Controlling Program Flow
- ────────────────────────────────────────────────────────────────────────────
-
- Very few programs actually execute all lines sequentially from .STARTUP to
- .EXIT. Rather, complex program logic and efficiency dictate that you control
- the flow of your program─jumping from one point to another, repeating an
- action until a condition is reached, and passing control to procedures. This
- chapter describes various means for controlling program flow and several
- features that simplify coding program-control constructs.
-
- The first section covers jumps from one point in the program to another. It
- explains how MASM 6.0 optimizes both unconditional and conditional jumps
- under certain circumstances, so that you do not have to specify every
- attribute. The section also describes instructions you can use to test
- the conditions on which conditional jumps depend.
-
- The next section describes loop and decision structures that repeat actions
- or evaluate conditions. It discusses some new MASM directives, such as
- .WHILE and .REPEAT, that generate appropriate compare, loop, and jump
- instructions for you, and the new .IF, .ELSE, and .ELSEIF directives that
- generate jump instructions.
-
- A number of improvements to procedure automation are covered in Section 7.3.
- These include extended functionality for PROC, a PROTO directive that lets
- you write procedure prototypes similar to those used in C, an INVOKE
- directive that automates parameter passing, and new options for the
- stack-frame setup inside procedures.
-
- Finally, the last section explains how to pass control to an interrupt
- routine.
-
-
- 7.1 Jumps
-
- Jumps are the most direct method for changing program control from one
- location to another. At the processor level, jumps work by changing the
- value of the IP (Instruction Pointer) register from the address of the
- current instruction to a target address and, for far jumps, by also
- changing the CS register. The many forms of the jump instructions handle
- jumps based on conditions, flags, and bit settings.
-
-
- This section first describes unconditional jumps, including the new jump
- optimization features of MASM 6.0 and the use of indirect operands to
- specify the jump's destination and to construct jump tables. The section
- then discusses conditional jumps─extending jumps, jumps based on bit or flag
- status, anonymous jumps, labels for jump targets, and decision directives
- that generate conditional jumps.
-
-
- 7.1.1 Unconditional Jumps
-
- Jumps in assembler programs are either conditional or unconditional. The
- processor executes conditional jumps only when the jump condition is true.
- You use the JMP instruction to jump unconditionally to a specified address.
- Its single operand contains the target address, which can be short, near, or
- far.
-
- Unconditional jumps are often used to skip over code that should not be
- executed, as shown in this example.
-
- ; Handle one case
- label1: .
- .
- .
- jmp continue
- ; Handle second case
- label2: .
- .
- .
- jmp continue
- .
- .
- .
- continue:
-
- The distance of the target from the jump instruction and the size of the
- operand determine the assembler's encoding of the instruction. The larger
- the distance, the more bytes the assembler uses to code the instruction. In
- previous versions of MASM, unconditional NEAR jumps sometimes generated
- inefficient code, and unspecified FAR jumps resulted in phase errors.
-
-
- 7.1.1.1 Jump Optimizing
-
- Beginning with MASM 6.0, the assembler determines the smallest encoding
- possible for the direct unconditional jump. You do not need to specify a
- distance operator, so you do not have to determine the correct distance of
- the jump. If you do specify a distance, however, and it is too short, the
- assembler generates an error. A specified distance that is too long causes
- a less efficient jump to be generated than the assembler would generate if
- the distance had not been specified.
-
- MASM 6.0 optimizes jumps if the following conditions are met:
-
-
- ■ You do not specify SHORT, NEAR, FAR, NEAR16, NEAR32, FAR16, FAR32, or
- PROC as the distance of the target.
-
- ■ The target of the jump is not external and is in the same segment as
- the jump instruction. If the target is in a different segment (but in
- the same group), it is treated as if external.
-
-
- If these two conditions are met, MASM uses the instruction, distance, and
- size of the operand to determine how best to optimize the encoding for the
- jump. No syntax changes are necessary.
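-
- The fragment below sketches the three cases as alternatives, for a
- hypothetical label named nearby that is assumed to lie within short-jump
- range (-128 to +127 bytes) of each jump:
-
- jmp nearby ; No distance given: the assembler
- ; chooses the smallest encoding
- jmp SHORT nearby ; Explicit distance: an error results
- ; if the target is out of range
- jmp NEAR PTR nearby ; Explicit distance: forces the longer
- ; near encoding
- .
- .
- .
- nearby: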
-
- ────────────────────────────────────────────────────────────────────────────
- NOTE
- This information about jump optimizing also applies to conditional jumps on
- the 80386/486.
- ────────────────────────────────────────────────────────────────────────────
-
-
- 7.1.1.2 Indirect Operands
-
- Indirect operands specify a register or data memory location that holds the
- address of the jump's destination. Indirect operands differ from the
- operands of direct jumps by being a memory expression instead of an
- immediate expression. For indirect jumps, you can specify the encoding for
- the instruction by giving the size (WORD, DWORD, or FWORD) attributes for
- the operand.
-
- The default rules are based on the .MODEL and the default segment size.
-
- jmp [bx] ; Uses .MODEL and segment size defaults
- jmp WORD PTR [bx] ; A NEAR16 indirect jump
-
- If the indirect operand is a register, the jump is always a NEAR16 jump for
- a 16-bit register, and NEAR32 for a 32-bit register:
-
- jmp bx ; NEAR16 jump
- jmp ebx ; NEAR32 jump
-
- A DWORD indirect operand, however, is an ambiguous case:
-
- jmp DWORD PTR [var] ; A NEAR32 jump in a 32-bit segment;
- ; a FAR16 jump in a 16-bit segment
-
- In this case, you must define a type with TYPEDEF to specify the indirect
- operand.
-
- NFP TYPEDEF PTR NEAR32
- FFP TYPEDEF PTR FAR16
- jmp NFP PTR [var] ; NEAR32 indirect jump
- jmp FFP PTR [var] ; FAR16 indirect jump
-
- You can use an unconditional jump as a form of conditional jump by
- specifying the address in a register or indirect memory operand. Also, you
- can use indirect memory operands to construct jump tables that work like C
- switch statements, Pascal CASE statements, or Basic ON GOTO, ON GOSUB, or
- SELECT CASE statements, as shown in this example:
-
- NPVOID TYPEDEF NEAR PTR VOID
- .DATA
- ctl_tbl NPVOID extended, ; Null key (extended code)
- ctrla, ; Address of CONTROL-A key routine
- ctrlb ; Address of CONTROL-B key routine
- .CODE
- .
- .
- .
- mov ah, 8h ; Get a key
- int 21h
- cbw ; Stretch AL into AX
- mov bx, ax ; Copy
- shl bx, 1 ; Convert to address
- jmp ctl_tbl[bx] ; Jump to key routine
-
- extended:
- mov ah, 8h ; Get second key of extended key
- int 21h
- . ; Use another jump table
- . ; for extended keys
- .
- jmp next
- ctrla: . ; CONTROL-A code here
- .
- .
- jmp next
- ctrlb: . ; CONTROL-B code here
- .
- .
- jmp next
- .
- .
- next: . ; Continue
-
- In this example, the indirect memory operands point to addresses of routines
- for handling different keystrokes.
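-
- The example above indexes ctl_tbl with whatever key code arrives, although
- the table holds only three entries. One way to guard the dispatch is to
- range-check the code first. The following is a sketch under stated
- assumptions, not the guide's own code: it declares the table with plain
- WORD entries instead of the NPVOID type, zero-extends AL instead of using
- CBW, and leaves the handler bodies as placeholders.
-
```asm
        .MODEL  small
        .STACK  100h
        .DATA
ctl_tbl WORD    extended, ctrla, ctrlb  ; Near addresses of the handlers
        .CODE
        .STARTUP
        mov     ah, 8h          ; Get a key without echo
        int     21h
        cmp     al, 2           ; Table has entries for codes 0-2 only
        ja      next            ; Any other key: skip the dispatch
        sub     ah, ah          ; Zero-extend AL into AX
        mov     bx, ax
        shl     bx, 1           ; Scale index by entry size (2 bytes)
        jmp     ctl_tbl[bx]     ; Indirect near jump through the table
extended:
        mov     ah, 8h          ; Read second byte of an extended key
        int     21h
        jmp     next
ctrla:  jmp     next            ; CONTROL-A handling would go here
ctrlb:                          ; CONTROL-B handling would go here
next:
        .EXIT   0
        END
```
-
- Because JA is an unsigned test, every key code above 2 bypasses the table,
- so the indirect jump can only fetch one of the three addresses stored in it.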
-
-
- 7.1.2 Conditional Jumps
-
- The most common way to transfer control in assembly language is with a
- conditional jump. This is a two-step process: first test the condition, and
- then jump if the condition is true or continue if it is false.
-
- The conditional jump instructions check flag status.
-
- Conditional-jump instructions (except JCXZ) use the status of one or more
- flags as their condition. Thus, any statement that sets a flag under
- specified conditions can be the test statement. The most common test
- statements use the CMP or TEST instructions. The jump statement can be any
- one of 31 conditional-jump instructions. Conditional-jump instructions take
- a single operand containing the target address.
-
-
- 7.1.2.1 Jump Extending
-
- In earlier versions of MASM, the NEAR and FAR operators could not be used
- with conditional jumps on the 8086-80286 processors. MASM 6.0 automatically
- expands the jump instruction to include an unconditional jump to the
- destination, as long as a distance or size other than SHORT is specified or
- implicitly required from the operands. That is, MASM now generates the code
- that previously you had to write.
-
- On these processors, conditional jumps cannot refer to labels more than 128
- bytes away. Therefore, in versions of MASM prior to 6.0, they were often
- combined with unconditional jumps, which have no such limitation. For
- example, the following statement is valid as long as target is not far away:
-
- ; Jump to target less than 128 bytes away
- jz target ; If previous operation resulted in
- ; zero, jump to target
-
- However, once target becomes too distant, the following sequence is
- necessary to enable a longer jump. Note that this sequence is logically
- equivalent to the example above:
-
- ; Jumps to distant targets previously required two steps
- jnz skip ; If previous operation result is
- ; NOT zero, jump to "skip"
- jmp target ; Otherwise, jump to target
- skip:
-
- If the instruction is any of the conditional-jump instructions (except JCXZ
- and JECXZ) and the target is more than 128 bytes away or is in a far
- segment, then jump-extending for an instruction such as je target generates
- two instructions to replace it:
-
-
- 1. The logical negation of the jump instruction, with a destination that
- skips over the second line it generates
-
- 2. An unconditional jump to the target destination
-
-
- For example, if target is more than 128 bytes away, MASM generates these
- lines of code for je target:
-
- jne $ + 2 + (length in bytes of the next instruction)
- jmp NEAR PTR target
-
- Now the conditional jump executes correctly.
-
- The assembler generates this same code sequence if you specify the distance
- with NEAR PTR or FAR PTR. Therefore,
-
- jz NEAR PTR target
-
- becomes
-
- jne $ + 5
- jmp NEAR PTR target
-
- even if target is nearby.
-
- When skip is more than 128 bytes away, this example
-
- mov ax, cx
- jz skip ; Skip is more than 128 bytes away
- .
- . ; (additional code here)
- .
- skip:
-
- generates code that looks like this:
-
- 7327:0000 8BC1 MOV AX,CX
- 7327:0002 7503 JNZ 0007
- 7327:0004 E9C000 JMP 00C7
- 7327:0007 (more code here)
-
- MASM 6.0 enables this jump expansion feature by default, but you can turn it
- off with the NOLJMP form of the OPTION directive. See Section 1.3.2 for
- information about the OPTION directive.
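-
- With jump extension turned off, a distant conditional jump has to be
- written as the two-instruction sequence shown earlier. The following
- sketch illustrates this. The label name target, the 200 bytes of NOP
- filler, and the program skeleton are assumptions chosen only to put the
- target out of short range.
-
```asm
        .MODEL  small
        .STACK  100h
        .CODE
        .STARTUP
        OPTION  NOLJMP          ; Disable automatic jump extension
        sub     ax, ax          ; Result is zero, so ZF is set
        jnz     @F              ; Hand-written extension: reverse the
        jmp     target          ; condition and skip over a near jump
@@:
        BYTE    200 DUP (90h)   ; 200 NOPs put target out of short range
target:
        OPTION  LJMP            ; Restore the default behavior
        .EXIT   0
        END
```
-
- Under OPTION NOLJMP, writing jz target in place of the jnz/jmp pair would
- be out of range here, since the assembler no longer extends it for you.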
-
- If the assembler generates code to extend a conditional jump, it issues a
- level 3 warning saying that the conditional jump has been lengthened. You
- can set the warning level to 1 for development and to level 3 for a final
- optimizing pass to see if you can shorten jumps by reorganizing.
-
- If you specify the distance for the jump and the target is out of range for
- that distance, a "Jump out of Range" error results.
-
- Since the JCXZ and JECXZ instructions do not have logical negations,
- expansion of the jump instruction to handle targets with unspecified
- distances cannot be performed for those instructions. Therefore the distance
- must always be short.
-
- The size and distance of the target operand determine the encoding for
- conditional or unconditional jumps to externals or targets in different
- segments. The new jump-extending and optimization features do not apply in
- this case.
-
- ────────────────────────────────────────────────────────────────────────────
- NOTE
-
- Conditional jumps on the 80386 and 80486 processors can be to targets up to
- 32K bytes away, so jump extension occurs only for targets greater than that
- distance.
- ────────────────────────────────────────────────────────────────────────────
-
-
- 7.1.2.2 Jumps Based on Comparisons
-
- The CMP instruction is specifically designed to test for conditional jumps.
- It does not change the destination operand; it compares two values without
- changing either of them. Instructions that change operands (such as SUB or
- AND) can also be used to test conditions.
-
- SUB and CMP set the same flags.
-
- Internally, the CMP instruction is the same as the SUB instruction, except
- that CMP does not change the destination operand. Both set flags according
- to the result that the subtraction generates.
-
- Table 7.1 lists conditional-jump instructions for each comparison
- relationship and shows the flags that are tested to see if the relationship
- is true. Note the difference in instructions depending on the sign of the
- operands. Some of these are equivalent to instructions listed in the
- previous section.
-
- Table 7.1 Conditional-Jump Instructions Used after Compare Instruction
-
- ╓┌──────────────┌──────────────┌──────────────┌──────────────┌───────────────╖
- Jump            Signed         Flags Tested    Unsigned       Flags Tested
- Condition       Compare        (Jump if True)  Compare        (Jump if True)
- ────────────────────────────────────────────────────────────────────────────
- = (Equal)       JE             ZF = 1          JE             ZF = 1
-
- <> (Not equal)  JNE            ZF = 0          JNE            ZF = 0
-
- > (Greater      JG or JNLE     ZF = 0 and      JA or JNBE     CF = 0 and
- than)                          SF = OF                        ZF = 0
-
- <= (Less than   JLE or JNG     ZF = 1 or       JBE or JNA     CF = 1 or
- or equal to)                   SF <> OF                       ZF = 1
-
- < (Less than)   JL or JNGE     SF <> OF        JB or JNAE     CF = 1
-
- >= (Greater     JGE or JNL     SF = OF         JAE or JNB     CF = 0
- than or
- equal to)
-
- ────────────────────────────────────────────────────────────────────────────
-
-
-
- In the CMP instruction, the mnemonic names always refer to the relationship
- of the first operand to the second operand. For instance, in this example JG
- tests whether the first operand is greater than the second.
-
- cmp ax, bx ; Compares ax and bx
- jg contin ; Equivalent to: If ( ax > bx ) goto contin
- jl next ; Equivalent to: If ( ax < bx ) goto next
-
- Several conditional instructions have two names. For example, JG and JNLE
- (Jump if Not Less or Equal) are equivalent. You can use whichever name seems
- more mnemonic in context.
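-
- The choice between the signed and unsigned columns matters whenever the
- high bit of an operand is set. As a small illustration (the label names and
- the displayed characters are arbitrary choices for this sketch), the
- following program compares 0FFFFh with 1 and tests the result both ways:
-
```asm
        .MODEL  small
        .STACK  100h
        .CODE
        .STARTUP
        mov     dl, 'N'         ; Default: neither jump taken
        mov     ax, -1          ; 0FFFFh: -1 if signed, 65,535 if unsigned
        mov     bx, 1
        cmp     ax, bx          ; Sets flags once; both jumps test them
        jg      sgreater        ; Signed: -1 > 1 is false, not taken
        ja      ugreater        ; Unsigned: 65,535 > 1 is true, taken
        jmp     show
sgreater:
        mov     dl, 'S'
        jmp     show
ugreater:
        mov     dl, 'U'
show:
        mov     ah, 02h         ; DOS: display the character in DL
        int     21h
        .EXIT   0
        END
```
-
- The compare leaves CF = 0, ZF = 0, SF = 1, and OF = 0. JG requires SF = OF
- and so falls through, while JA requires only CF = 0 and ZF = 0 and so
- jumps.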
-
-
- 7.1.2.3 Testing Bits and Jumping
-
- Using CMP is not the only way to check a condition prior to a jump. You can
- also check the status of bits in the operands using the TEST instruction.
- This instruction tests for conditions prior to jumps by comparing specific
- bits rather than entire operands. Jump execution depends on whether certain
- bits are on or off.
-
- The two operands of TEST cannot both be memory locations.
-
- The TEST instruction is the same as the AND instruction, except that TEST
- changes neither operand. If the result of the operation is 0, the zero flag
- is set, but the 0 is not actually written to the destination operand. The
- following example shows an application of TEST.
-
- .DATA
- bits BYTE ?
- .CODE
- .
- .
- .
- ; If bit 2 or bit 4 is set, then call task_a
-                         ; Assume "bits" is 0D3h     11010011
- test bits, 10100y       ; If 2 or 4 is set      AND 00010100
- jz skip1                ;                           --------
- call task_a             ; Then call task_a          00010000
- skip1:                  ; Jump not taken
- .
- .
- .
- ; If bits 2 and 4 are clear, then call task_b
-                         ; Assume "bits" is 0E9h     11101001
- test bits, 10100y       ; If 2 and 4 are clear  AND 00010100
- jnz skip2               ;                           --------
- call task_b             ; Then call task_b          00000000
- skip2:                  ; Jump not taken
-
- Generally, when you use TEST, one of the operands is a mask in which the
- bits to be tested are the only bits set. The other operand contains the
- value to be tested. If all the bits set in the mask are clear in the operand
- being tested, the zero flag is set. If any of the bits set in the mask are
- also set in the operand, the zero flag is cleared.
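-
- TEST by itself distinguishes only "none of the masked bits set" from "at
- least one set." To check that every bit in the mask is set, one common
- approach is to AND a copy of the value with the mask and compare the result
- with the mask. The sketch below assumes the same variable name bits and the
- same mask as the example above; the label and displayed characters are
- arbitrary.
-
```asm
        .MODEL  small
        .STACK  100h
        .DATA
bits    BYTE    0D3h            ; 11010011: bit 4 set, bit 2 clear
        .CODE
        .STARTUP
        mov     dl, 'N'         ; Assume not all masked bits are set
        mov     al, bits        ; Work on a copy: AND changes its operand
        and     al, 10100y      ; Keep only bits 2 and 4
        cmp     al, 10100y      ; Equal only if both bits were set
        jne     show            ; Taken here: bit 2 is clear in 0D3h
        mov     dl, 'Y'
show:
        mov     ah, 02h         ; DOS: display the character in DL
        int     21h
        .EXIT   0
        END
```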
-
-
- 7.1.2.4 Jumping Based on Flag Status
-
- Your code can jump based on the condition of flags rather than on the
- relationships of operands. Use the following conditional-jump instructions:
-
-
- ╓┌───────────────────┌───────────────────────────────────────────────────────╖
- Instruction Jumps if
- ────────────────────────────────────────────────────────────────────────────
- JO The overflow flag is set
-
- JNO The overflow flag is clear
-
- JC The carry flag is set (same as JB)
-
- JNC The carry flag is clear (same as JAE)
-
- JZ The zero flag is set (same as JE)
-
- JNZ The zero flag is clear (same as JNE)
-
- JS The sign flag is set
-
- JNS The sign flag is clear
-
- JP The parity flag is set
-
- JNP The parity flag is clear
-
- JPE Parity is even (parity flag set)
-
- JPO Parity is odd (parity flag clear)
-
- JCXZ CX is 0
-
- JECXZ ECX is 0
- (80386/486 only)
-
-
-
- The following example shows two ways to use the instructions from the list
- above:
-
- ; Uses JO to handle overflow condition
- add ax, bx ; Add two values
- jo overflow ; If value too large, adjust
-
- ; Uses JNZ to check for zero as the result of subtraction
- sub ax, bx ; Subtract
- jnz skip ; If the result is not zero, continue
- call zhandler ; Else do special case
-
-
- 7.1.2.5 Anonymous Labels
-
- Anonymous labels are alternatives to named labels.
-
- Coding jumps in assembly language requires that you invent many label names.
- One alternative to continually thinking up new label names is using
- anonymous labels, which you can use anywhere in your program. But because
- anonymous labels do not provide meaningful names, they are best used for
- conditionally testing a few lines of code. You should mark major divisions
- of a program with actual named labels.
-
- Use two at signs (@@) followed by a colon (:) as an anonymous label. To jump
- to the nearest preceding anonymous label, use @B (back) in the jump
- instruction's operand field; to jump to the nearest following anonymous
- label, use @F (forward) in the operand field.
-
- The jump in the example below uses an anonymous label:
-
- ; DX is 20, unless CX is less than -20, then make DX 30
- mov dx, 20
- cmp cx, -20
- jge @F
- mov dx, 30
- @@:
-
- The items @B and @F always refer to the nearest occurrences of @@:, so
- there is never any conflict between different anonymous labels.
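-
- To show how @B and @F resolve when a program contains more than one
- anonymous label, here is a short sketch. The count of 3 and the test value
- 2 are arbitrary values chosen for illustration.
-
```asm
        .MODEL  small
        .STACK  100h
        .CODE
        .STARTUP
        mov     cx, 3           ; Repeat count
        sub     ax, ax
@@:                             ; First anonymous label
        inc     ax
        cmp     ax, 2
        je      @F              ; Forward to the second @@: below
        loop    @B              ; Back to the first @@: above
@@:                             ; Second anonymous label
        .EXIT   0
        END
```
-
- The LOOP instruction lies after the first label and before the second, so
- @B resolves to the first and @F to the second.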
-
-
- 7.1.2.6 Decision Directives
-
- The high-level structures you can use for decision-making are the .IF,
- .ELSEIF, and .ELSE statements. These directives generate conditional jumps.
- The expression following the .IF directive is evaluated, and if true, the
- following instructions are executed until the next .ENDIF, .ELSE, or .ELSEIF
- directive is reached. The .ELSE statements execute if the expression is
- false. Using the .ELSEIF directive puts a new expression to be evaluated
- inside the alternative part of the original .IF statement. The syntax is
-
- .IF condition1
- statements
- «.ELSEIF condition2
- statements»
- «.ELSE
- statements»
- .ENDIF
-
- The decision structure
-
- .IF cx == 20
- mov dx, 20
- .ELSE
- mov dx, 30
- .ENDIF
-
- generates this code:
-
- .IF cx == 20
- 0017 83 F9 14 * cmp cx, 014h
- 001A 75 05 * jne @C0001
- 001C BA 0014 mov dx, 20
- .ELSE
- 001F EB 03 * jmp @C0003
- 0021 *@C0001:
- 0021 BA 001E mov dx, 30
- .ENDIF
- 0024 *@C0003:
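-
- The syntax also allows .ELSEIF, which the listing above does not exercise.
- As a hedged sketch (the values 10, 20, and 30 and the program skeleton are
- illustrative assumptions), a three-way decision could be written like
- this:
-
```asm
        .MODEL  small
        .STACK  100h
        .CODE
        .STARTUP
        mov     cx, 10
        .IF     cx == 20
        mov     dx, 20          ; Skipped: CX is not 20
        .ELSEIF cx == 10
        mov     dx, 10          ; Executed: second test succeeds
        .ELSE
        mov     dx, 30          ; Skipped: an earlier test succeeded
        .ENDIF
        .EXIT   0
        END
```
-
- Assembling with the /Fl and /Sg options writes the compare and jump
- instructions generated for each test to the listing file.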
-
-
- 7.2 Loops
-
- Loops repeat an action until a termination condition is reached. This
- condition can be a counter or the result of an expression's evaluation. MASM
- 6.0 offers many ways to set up loops in your programs. The following list
- compares MASM loop structures.
-
- Instructions                  Action
- ────────────────────────────────────────────────────────────────────────────
- LOOP                          Automatically decrements CX. When CX = 0,
-                               the loop ends. The top of the loop
-                               cannot be more than 128 bytes from
-                               the LOOP instruction. (This is true for
-                               all LOOP instructions.)
-
- LOOPE, LOOPZ, LOOPNE, LOOPNZ  Loops while equal (or not equal). Checks
-                               CX and a condition. The loop ends when
-                               CX reaches 0 or the condition is no
-                               longer true. Set CX to a number out of
-                               range if you don't want a count to
-                               control the loop.
-
- JCXZ, JECXZ                   Branches to a label only if CX = 0 (ECX
-                               on the 80386). Useful for testing
-                               condition of CX before beginning loop.
-                               If CX = 0 before entering the loop, CX
-                               decrements to -1 on the first iteration
-                               and then must be decremented 65,535
-                               times before it reaches 0 again. Unlike
-                               conditional-jump instructions, which can
-                               jump to either a near or a short label
-                               under the 80386 or 80486, the loop
-                               instructions JCXZ and JECXZ always jump
-                               to a short label.
-
- Conditional jumps             Act only if certain conditions are met.
-                               Necessary if several conditions must be
-                               tested. See Section 7.1.2, "Conditional
-                               Jumps."
-
- The following examples illustrate these loop constructions.
-
- ; The LOOP instruction: For 200 to 0 do task
- mov cx, 200 ; Set counter
- next: . ; Do the task here
- .
- .
- loop next ; Do again
- ; Continue after loop
-
- ; The LOOPNE instruction: While AL is not 'Y', do task
- mov cx, 256 ; Set count too high to interfere
- wend: . ; But don't do more than 256 times
- . ; Some statements that change AX
- .
- cmp al, 'Y' ; Is it Y or too many times?
- loopne wend ; No? Repeat
- ; Yes? Continue
-
- ; Using JCXZ: For 0 to CX do task
- ; CX counter set previously
- jcxz done ; Check for 0
- next: . ; Do the task here
- .
- .
- loop next ; Do again
- done: ; Continue after loop
-
-
- 7.2.1 Loop-Generating Directives
-
- These directives are new to MASM 6.0.
-
- The high-level control structures new to MASM 6.0 generate loop structures
- for you. These new directives are similar to the while and repeat loops of C
- or Pascal. They can make your assembly programs less repetitive and easier
- to code, as well as easier to read. The assembler generates the appropriate
- assembly code. The .BREAK and .CONTINUE directives are also implemented to
- interrupt loop execution. These directives are summarized in the following
- list:
-
- Directives                    Action
- ────────────────────────────────────────────────────────────────────────────
- .WHILE, .ENDW                 The statements between .WHILE condition
-                               and .ENDW execute while the condition is
-                               true.
-
- .REPEAT, .UNTIL               The loop executes at least once and
-                               continues until the condition given
-                               after .UNTIL is true. Generates
-                               conditional jumps.
-
- .REPEAT, .UNTILCXZ            The loop executes at least once and
-                               continues until CX reaches 0 or the
-                               optional condition given after .UNTILCXZ
-                               is true. Generates appropriate loop
-                               instructions.
-
-
- These constructs work much as they do in a high-level language such as C or
- Pascal. Keep in mind the following points:
-
-
- ■ These directives generate appropriate processor instructions. They are
- not new instructions.
-
- ■ They require proper use of signed and unsigned data declarations.
-
-
- These directives cause a set of instructions to execute based on the
- evaluation of some condition. This condition can be an expression that
- evaluates to a negative or nonnegative value, an expression using the
- logical operators of C (&&, ||, or !), or the state of a flag. See Section
- 7.2.2.1 for more information about expression operators.
-
- The evaluation of the condition requires the assembler to know if the
- operands in the condition are signed or unsigned. To state explicitly that a
- named memory location contains a signed integer, use the signed data
- allocation directives: SBYTE, SWORD, and SDWORD.
-
-
- 7.2.1.1 .WHILE Loops
-
- As with while loops in C or Pascal, the test condition for .WHILE is checked
- before the statements inside the loop execute. If the test condition is
- false, the loop does not execute. While the condition is true, the
- statements inside the loop repeat.
-
- Use the .ENDW directive to mark the end of the .WHILE loop. When the
- condition becomes false, program execution begins at the first statement
- following the .ENDW directive. The .WHILE directive generates appropriate
- compare and jump statements. The syntax is
-
- .WHILE condition
- statements
- .ENDW
-
- For example, this loop copies one buffer to another until a `$' character
- (marking the end of the string) is found:
-
- .DATA
- buf1 BYTE "This is a string",'$'
- buf2 BYTE 100 DUP (?)
- .CODE
- sub bx, bx ; Zero out bx
- .WHILE (buf1[bx] != '$')
- mov al, buf1[bx] ; Get a character
- mov buf2[bx], al ; Move it to buffer 2
- inc bx ; Count forward
- .ENDW
-
-
- 7.2.1.2 .REPEAT Loops
-
- MASM's .REPEAT directive allows for loop constructions like the do loop of C
- and the REPEAT loop of Pascal. The loop executes until the condition
- following the .UNTIL (or .UNTILCXZ) directive becomes true. Since the
- condition is checked at the end of the loop, the loop always executes at
- least once. The .REPEAT directive generates conditional jumps. The syntax
- is:
-
- .REPEAT
- statements
- .UNTIL condition
-
- .REPEAT
- statements
- .UNTILCXZ «condition»
-
- A condition is optional with .UNTILCXZ. When you supply one, it can take
- the form expr1 == expr2 or expr1 != expr2, where expr2 can be an immediate
- expression, a register, or (if expr1 is a register) a memory location.
-
- For example, the following code fills up a buffer with characters typed at
- the keyboard. The loop ends when the ENTER key (character 13) is pressed:
-
- .DATA
- buffer BYTE 100 DUP (0)
- .CODE
- sub bx, bx ; Zero out bx
- .REPEAT
- mov ah, 01h
- int 21h ; Get a key
- mov buffer[bx], al ; Put it in the buffer
- inc bx ; Increment the count
- .UNTIL (al == 13) ; Continue until al is 13
-
- The .UNTIL directive generates conditional jumps, but the .UNTILCXZ
- directive generates a LOOP instruction, as shown by the listing file code
- for these examples. In a listing file, assembler-generated code is preceded
- by an asterisk.
-
- ASSUME bx:PTR SomeStruct
-
- .REPEAT
- *@C0001:
- inc ax
- .UNTIL ax==6
- * cmp ax, 006h
- * jne @C0001
-
- .REPEAT
- *@C0003:
- mov ax, 1
- .UNTILCXZ
- * loop @C0003
-
- .REPEAT
- *@C0004:
- .UNTILCXZ [bx].field != 6
- * cmp [bx].field, 006h
- * loope @C0004
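-
- For a more complete picture of a counted loop, the following sketch adds
- the elements of a small array with .REPEAT and .UNTILCXZ. The array name
- values, its contents, and the program skeleton are assumptions made for
- this illustration.
-
```asm
        .MODEL  small
        .STACK  100h
        .DATA
values  WORD    1, 2, 3, 4, 5
        .CODE
        .STARTUP
        mov     cx, 5           ; Number of elements to add
        sub     ax, ax          ; Running total
        sub     bx, bx          ; Byte offset into values
        .REPEAT
        add     ax, values[bx]  ; Add the current element
        add     bx, 2           ; Advance one word
        .UNTILCXZ               ; Assembles to a LOOP back to .REPEAT
        .EXIT   0
        END
```
-
- Because the generated LOOP instruction decrements CX before testing it, CX
- must be loaded with a nonzero count before the loop is entered.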
-
-
- 7.2.1.3 .BREAK and .CONTINUE Directives
-
- .BREAK and .CONTINUE interrupt loop execution.
-
- The .BREAK and .CONTINUE directives interrupt the normal flow of a .REPEAT
- or .WHILE loop. Both directives allow an optional .IF clause for
- conditional breaks. The syntax is
-
- .BREAK «.IF condition»
- .CONTINUE «.IF condition»
-
- Note that .ENDIF is not used with the .IF forms of .BREAK and .CONTINUE in
- this context. The .BREAK and .CONTINUE directives work the same way as the
- break and continue statements in C. After .BREAK, execution continues at the
- instruction following the .UNTIL, .UNTILCXZ, or .ENDW of the nearest
- enclosing loop.
-
- Instead of causing the loop execution to end as .BREAK does, .CONTINUE
- causes loop execution to jump directly to the code that evaluates the loop
- condition of the nearest enclosing loop.
-
- The following loop accepts only the keys in the range `0' to `9' and
- terminates when ENTER is pressed.
-
- .WHILE 1 ; Loop forever
- mov ah, 08h ; Get key without echo
- int 21h
- .BREAK .IF al == 13 ; If ENTER, break out of the loop
- .CONTINUE .IF (al < '0') || (al > '9')
- ; If not a digit, continue looping
- mov dl, al ; Save the character for processing
- mov ah, 02h ; Output the character
- int 21h
- .ENDW
-
- If you assemble the source code above with the /Fl and /Sg command-line
- options and then view the results in the listing file, you would see this
- code:
-
- .WHILE 1
- 0017 *@C0001:
- 0017 B4 08 mov ah, 08h
- 0019 CD 21 int 21h
- .BREAK .IF al == 13
- 001B 3C 0D * cmp al, 00Dh
- 001D 74 10 * je @C0002
- .CONTINUE .IF (al < '0') || (al > '9')
- 001F 3C 30 * cmp al, '0'
- 0021 72 F4 * jb @C0001
- 0023 3C 39 * cmp al, '9'
- 0025 77 F0 * ja @C0001
- 0027 8A D0 mov dl, al
- 0029 B4 02 mov ah, 02h
- 002B CD 21 int 21h
- .ENDW
- 002D EB E8 * jmp @C0001
- 002F *@C0002:
-
- The high-level control structures can be nested. That is, .REPEAT or .WHILE
- loops can contain .REPEAT or .WHILE loops as well as .IF statements.
-
- If the code generated by a .WHILE loop, .REPEAT loop, or .IF statement
- generates a conditional or unconditional jump, MASM uses the jump extension
- and jump optimization techniques described in Sections 7.1.1, "Unconditional
- Jumps," and 7.1.2, "Conditional Jumps," to encode the jump appropriately.
-
-
- 7.2.2 Writing Loop Conditions
-
- You can express the conditions of the .IF, .REPEAT, and .WHILE directives
- using relational operators, and you can express the attributes of the
- operand with the PTR operator. To write loop conditions, you also need to
- know how the assembler evaluates the operators and operands in the
- condition. This section explains the operators, attributes, precedence
- level, and expression evaluation order for the conditions used with
- loop-generating directives.
-
-
- 7.2.2.1 Expression Operators
-
- The relational and logical operators in MASM 6.0 high-level control
- structures are listed below. The same operators are used in C. These
- operators generate MASM compare, test, and conditional jump instructions.
-
- ╓┌──────────────────────┌────────────────────────────────────────────────────╖
- Operator Meaning
- ────────────────────────────────────────────────────────────────────────────
- == Equal
- != Not equal
- > Greater than
- >= Greater than or equal to
- < Less than
- <= Less than or equal to
- & Bit test
- ! Logical NOT
- && Logical AND
- || Logical OR
-
-
- A condition without operators (other than !) tests for nonzero as it does in
- C. For example, .WHILE (x) is the same as .WHILE (x != 0), and .WHILE
- (!x) is the same as .WHILE (x == 0).
-
- Flag names can be operands in a condition.
-
- You can also use the flag names (ZERO?, CARRY?, OVERFLOW?, SIGN?, and
- PARITY?) as operands in conditions with the high-level control structures,
- as in .WHILE (CARRY?). The state of the named flag determines the outcome of
- the condition. Use flag names when you want to write the compare or other
- instructions that set the flags yourself.
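-
- As a brief sketch of a flag name used as a condition, the following program
- performs an addition and then tests the carry flag directly. The operand
- values and the use of DX as a marker are assumptions for illustration.
-
```asm
        .MODEL  small
        .STACK  100h
        .CODE
        .STARTUP
        sub     dx, dx          ; Clear the marker before setting flags
        mov     ax, 0FFFFh
        mov     bx, 1
        add     ax, bx          ; 0FFFFh + 1 wraps to 0 and sets carry
        .IF     CARRY?          ; Tests the flag left by the ADD
        inc     dx              ; Executed: the addition carried
        .ENDIF
        .EXIT   0
        END
```
-
- No instruction that changes the carry flag may come between the ADD and
- the .IF, since the condition tests whatever state the flag has at that
- point.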
-
-
- 7.2.2.2 Signed and Unsigned Operands
-
- Registers, constants, and memory locations are unsigned by default.
-
- Expression operators generate unsigned jumps by default. However, if either
- side of the operation is signed, then the entire operation is considered
- signed. The default for the operands in registers, constants, and named
- memory locations is also to be unsigned.
-
- You can use the PTR operator to tell the assembler that a particular operand
- in a register or memory location is a signed number, as in these examples:
-
- .WHILE SWORD PTR [bx] <= 0
- .IF SWORD PTR mem1 > 0
-
- Without the PTR operator, the assembler would treat the word addressed by
- BX as an unsigned value.
-
- You can also declare operands in memory locations as signed with the SBYTE,
- SWORD, and SDWORD directives, for use with .IF, .WHILE, and .REPEAT.
-
- .DATA
- mem1 SBYTE ?
- mem2 WORD ?
- .IF mem1 > 0
- .WHILE mem2 < bx
- .WHILE SWORD PTR ax < count
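-
- To make the effect of the declaration concrete, the sketch below applies
- the same comparison to two variables that hold the same bit pattern but are
- declared differently. The names sval and uval and the use of DL as a marker
- are hypothetical.
-
```asm
        .MODEL  small
        .STACK  100h
        .DATA
sval    SWORD   -5              ; Declared signed
uval    WORD    0FFFBh          ; Same bits, declared unsigned (65,531)
        .CODE
        .STARTUP
        sub     dx, dx
        .IF     sval < 1        ; Signed compare: -5 < 1 is true
        or      dl, 1
        .ENDIF
        .IF     uval < 1        ; Unsigned compare: 65,531 < 1 is false
        or      dl, 2
        .ENDIF
        .EXIT   0               ; Only the first block has executed
        END
```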
-
-
- 7.2.2.3 Precedence Level
-
- As with C, you can concatenate conditions with the && operator for AND, the
- || operator for OR, and the ! operator for negate. The precedence level is
- !, &&, and ||, with ! having the highest precedence. As in high-level
- languages, operators of equal precedence are evaluated left to right.
-
-
- 7.2.2.4 Expression Evaluation
-
- The assembler evaluates conditions created with high-level control
- structures according to short-circuit evaluation. If the evaluation of a
- particular condition automatically determines the final result (such as a
- condition that evaluates to false in a compound statement concatenated with
- AND), the evaluation does not continue.
-
- For example, in this .WHILE statement,
-
- .WHILE (ax > 0) && (WORD PTR [bx] == 0)
-
- the assembler evaluates the first condition. If this condition is false
- (that is, if AX is less than or equal to 0), the evaluation is finished. The
- second condition is not checked and the loop does not execute, because a
- compound condition containing a && requires both expressions to be true for
- the entire condition to be true.
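-
- Short-circuit evaluation is useful when the second test is meaningful only
- if the first succeeds. In the following sketch, a BX value of 0 stands in
- for a null pointer; that convention and the DX marker are assumptions made
- for illustration.
-
```asm
        .MODEL  small
        .STACK  100h
        .CODE
        .STARTUP
        sub     bx, bx          ; BX = 0 stands in for a null pointer
        sub     dx, dx
        .IF     (bx != 0) && (WORD PTR [bx] == 0)
        inc     dx              ; Skipped: the first test already failed
        .ENDIF
        .EXIT   0
        END
```
-
- Because the first condition is false, the comparison that reads memory
- through BX is never executed.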
-
-
-
-
-
- 7.3 Procedures
-
- Organizing your code into procedures that execute specific tasks divides
- large programs into manageable units, allows for separate testing, and makes
- code more efficient for repetitive tasks.
-
- Assembly-language procedures are comparable to functions in C; subprograms,
- functions, and subroutines in Basic; procedures and functions in Pascal; or
- subroutines and functions in FORTRAN.
-
- Two instructions control the use of assembly-language procedures: CALL
- pushes the return address onto the stack and transfers control to a
- procedure, and RET pops the return address off the stack and returns control
- to that location.
-
- The PROC and ENDP directives mark the beginning and end of a procedure.
- Additionally, PROC can automatically
-
-
- ■ Preserve register values that should not change but that the procedure
- might otherwise alter
-
- ■ Set up a local stack pointer, so that you can access parameters and
- local variables placed on the stack
-
- ■ Adjust the stack when the procedure ends
-
-
- Sections 7.3.1 through 7.3.3 give information on techniques for calling
- procedures and accessing parameters. Sections 7.3.4 through 7.3.5 show how
- to allocate and access local variables and parameters.
-
- Sections 7.3.6 and 7.3.7 introduce new directives in MASM 6.0 to further
- automate calling procedures and passing arguments. The PROTO directive
- allows you to declare prototypes for your procedures. INVOKE handles
- procedure calls and stack cleanup. Section 7.3.8 describes the automatic
- stack setup and cleanup generated with PROC.
-
-
- 7.3.1 Defining Procedures
-
- Procedures require a label at the start of the procedure and a return at the
- end. Procedures are normally defined by using the PROC directive at the
- start of the procedure and the ENDP directive at the end. The RET
- instruction is normally placed immediately before the ENDP directive. The
- assembler makes sure that the distance of the RET instruction matches the
- distance defined by the PROC directive. The basic syntax for PROC is
-
- label PROC «NEAR|FAR»
- .
- .
- .
- RET «constant»
- label ENDP
-
- The CALL instruction pushes the address of the next instruction in your code
- onto the stack and passes control to a specified address. The syntax is
-
- CALL {label | register | memory}
-
- The operand contains a value calculated at run time. Since that operand can
- be a register, direct memory operand, or indirect memory operand, you can
- write call tables similar to the jump table illustrated in Section 7.1.1.2.
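-
- A call table might look like the following sketch. The table name tasks,
- the procedure names task0 and task1, and the fixed index are hypothetical;
- a real program would compute the index at run time and check its range.
-
```asm
        .MODEL  small
        .STACK  100h
        .DATA
tasks   WORD    task0, task1    ; Near addresses of the procedures
        .CODE
        .STARTUP
        mov     bx, 1           ; Select entry 1
        shl     bx, 1           ; Scale by entry size (2 bytes)
        call    tasks[bx]       ; Indirect near call through the table
        .EXIT   0               ; task1 returns to here

task0   PROC    NEAR
        mov     dl, '0'
        ret
task0   ENDP

task1   PROC    NEAR
        mov     dl, '1'
        ret
task1   ENDP
        END
```
-
- Unlike the jump table, each routine ends with RET, so control comes back to
- the instruction after the CALL without a shared exit label.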
-
-
- Calls can be near or far. Near calls push only the offset portion of the
- calling address and therefore must be within the same segment or group. You
- can specify the type for the target operand, but if you do not, MASM uses
- the declared distance (NEAR or FAR) for operands that are labels and the
- size of the operand for register or memory operands. Then the assembler
- encodes the call appropriately, as it does with unconditional jumps (see
- Sections 7.1.1, "Unconditional Jumps," and 7.1.2, "Conditional Jumps").
-
- MASM 6.0 optimizes a call to a far label when the label is in the current
- segment by generating the code for a near call, saving one byte.
-
- You can define procedures without PROC and ENDP, but if you do, you must
- make sure that the size of the CALL matches the size of the RET. You can
- specify the RET instruction as RETN (Return Near) or RETF (Return Far) to
- override the default size:
-
- call NEAR PTR task ; Call is declared near
- . ; Return comes to here
- .
- .
- task: ; Procedure begins with near label
- .
- . ; Instructions go here
- .
- retn ; Return declared near
-
- The syntax for RETN and RETF is
-
- label: | label LABEL NEAR
- statements
- RETN «constant»
-
- label LABEL FAR
- statements
- RETF «constant»
-
- The RET instruction (and its RETF and RETN variations) allows an optional
- constant operand that specifies a number of bytes to be added to the value
- of the SP register after the return. This operand adjusts for arguments
- passed to the procedure before the call, as shown in the example in Section
- 7.3.4, "Using Local Variables."
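-
- As a minimal sketch of the constant operand (the procedure name twice and
- its single word argument are assumptions made for illustration), the
- following procedure removes its own argument from the stack when it
- returns:
-
```asm
        .MODEL  small
        .STACK  100h
        .CODE
        .STARTUP
        mov     ax, 5
        push    ax              ; One 2-byte argument
        call    twice           ; No ADD SP, 2 is needed after the call
        .EXIT   0

twice   PROC    NEAR
        push    bp              ; Save base pointer
        mov     bp, sp
        mov     ax, [bp+4]      ; Argument lies above saved BP (2 bytes)
                                ; and the near return address (2 bytes)
        shl     ax, 1           ; AX = argument * 2
        pop     bp
        ret     2               ; Return, then add 2 to SP
twice   ENDP
        END
```
-
- The constant must equal the total size in bytes of the arguments pushed by
- the caller; otherwise SP is left pointing at the wrong location after the
- return.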
-
- Incorrect size for RET can cause your program to fail.
-
- When you define procedures without PROC and ENDP, you must make sure that
- calls have the same size as corresponding returns. For example, RETF pops
- two words off the stack. If a NEAR call is made to a procedure with a far
- return, not only is the popped value meaningless, but the stack status may
- cause the execution to return to a random memory location, resulting in
- program failure.
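-
- The far case follows the same pattern as the near example above. The
- fragment below is a minimal sketch, not part of a complete program, and the
- label name ftask is hypothetical. The call and the return are both declared
- far, so the two words pushed by the call are the two words popped by the
- return.
-
- call FAR PTR ftask ; Far call pushes CS, then IP
- . ; Return comes to here
- .
- .
- ftask LABEL FAR ; Procedure begins with far label
- .
- . ; Instructions go here
- .
- retf ; Far return pops IP, then CS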
-
- There is also an extended PROC syntax that automates many of the details
- of accessing arguments and saving registers. See Section 7.3.3, "Declaring
- Parameters with the PROC Directive."
-
-
- 7.3.2 Passing Arguments on the Stack
-
- Each time you call a procedure, you may want it to operate on different
- data. This data, called "arguments," can be passed in various ways. For
- example, arguments can be passed to a procedure in registers or in
- variables. However, the most common method of passing arguments is to use
- the stack. Microsoft languages have specific conventions for passing
- arguments. Chapter 20, "Mixed-Language Programming," explains these
- conventions for assembly-language modules shared with modules from
- high-level languages.
-
- This section describes how a procedure accesses the arguments passed to it
- on the stack. Each argument is accessed as an offset from BP. However, if
- you use the PROC directive to declare parameters, the assembler calculates
- these offsets for you and lets you refer to parameters by name. The next
- section, "Declaring Parameters with the PROC Directive," explains how to use
- PROC this way.
-
- This example shows how to pass arguments to a procedure. The procedure
- expects to find those arguments on the stack. As this example shows,
- arguments must be accessed as offsets of BP.
-
- ; C-style procedure call and definition
-
- mov ax, 10 ; Load and
- push ax ; push constant as third argument
- push arg2 ; Push memory as second argument
- push cx ; Push register as first argument
- call addup ; Call the procedure
- add sp, 6 ; Destroy the pushed arguments
- . ; (equivalent to three pops)
- .
- .
- addup PROC NEAR ; Return address for near call
- ; takes two bytes
- push bp ; Save base pointer - takes two bytes
- ; so arguments start at fourth byte
- mov bp, sp ; Load stack into base pointer
- mov ax, [bp+4] ; Get first argument from
- ; fourth byte above pointer
- add ax, [bp+6] ; Add second argument from
- ; sixth byte above pointer
- add ax, [bp+8] ; Add third argument from
- ; eighth byte above pointer
- mov sp, bp
- pop bp ; Restore BP
- ret ; Return result in AX
- addup ENDP
-
- Figure 7.1 shows the stack condition at key points in the process.
-
- (This figure may be found in the printed book.)
-
- Starting with the 80186 processor, the ENTER and LEAVE instructions simplify
- the stack setup and restore instructions at the beginning and end of
- procedures.
-
- However, ENTER uses a lot of time. It is necessary only with nested,
- statically scoped procedures. Thus, a Pascal compiler may sometimes generate
- ENTER. The LEAVE instruction, on the other hand, is an efficient way to do
- the stack cleanup. LEAVE reverses the effect of the last ENTER instruction
- by restoring BP and SP to their values before the procedure call.
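-
- As a rough sketch of how this looks in practice, the addup procedure from
- the example above could be written with ENTER and LEAVE. The sketch assumes
- that 80186 instructions have been enabled (here with the .186 directive) and
- is an illustration rather than part of the original example.
-
- .186 ; Enable 80186 instructions
- addup PROC NEAR
- enter 0, 0 ; Same effect as push bp / mov bp, sp
- mov ax, [bp+4] ; Get first argument
- add ax, [bp+6] ; Add second argument
- add ax, [bp+8] ; Add third argument
- leave ; Same effect as mov sp, bp / pop bp
- ret ; Return result in AX
- addup ENDP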
-
-
- 7.3.3 Declaring Parameters with the PROC Directive
-
- With the PROC directive, you can specify registers to be saved, define
- parameters to the procedure, and assign symbol names to parameters (rather
- than as offsets from BP). This section describes how to use the PROC
- directive to automate the parameter-accessing techniques described in the
- last section.
-
- For example, the diagram below shows a valid PROC statement for a procedure
- called from C. It takes two parameters, var1 and arg1, and uses (and must
- save) the DI and SI registers:
-
- (This figure may be found in the printed book.)
-
- The syntax for PROC is
-
- label PROC [[attributes]] [[USES reglist]] [[, parameter[[:tag]]...]]
-
- The following list describes the parts of the PROC directive.
-
- Argument Description
- ────────────────────────────────────────────────────────────────────────────
- label The name of the procedure.
-
- attributes Any of several attributes of the
- procedure, including the distance,
- langtype, and visibility of the
- procedure. The syntax for attributes is
- given in Section 7.3.3.1.
-
- reglist A list of registers following the USES
- keyword that the procedure uses and that
- should be saved on entry. Registers in
- the list must be separated by blanks or
- tabs, not by commas. The assembler
- generates prologue code to push these
- registers onto the stack. When you exit,
- the assembler generates epilogue code to
- pop the saved register values off the
- stack.
-
- parameter The list of parameters passed to the
- procedure on the stack. The list can
- have a variable number of parameters.
- See the discussion below for the syntax
- of parameter. This list can be longer
- than one line if the continued line ends
- with a comma.
-
-
- This diagram shows a valid PROC definition that uses several attributes:
-
- (This figure may be found in the printed book.)
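-
- The printed figure is not reproduced in this text. As a stand-in, the
- following sketch shows a PROC statement that combines distance, langtype,
- and visibility attributes with a USES list and two parameters. The names
- sumw, src, and count are hypothetical, and src is assumed to be a near
- pointer to a WORD array addressed through DS.
-
- sumw PROC FAR C PUBLIC USES bx si,
- src:WORD, count:WORD
- mov bx, src ; BX points to the array
- mov cx, count ; CX holds the number of words
- sub ax, ax ; Clear the total
- sub si, si ; Clear the index
- jcxz done ; Nothing to add
- next: add ax, [bx+si] ; Add the next word
- add si, 2
- loop next
- done: ret ; Total is in AX; BX and SI are restored
- sumw ENDP
-
- The registers in the USES list are separated by a blank, and the PROC
- statement continues onto a second line because the first line ends with a
- comma.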
-
-
- 7.3.3.1 Attributes
-
- The syntax for the attributes field is
-
- [[distance]] [[langtype]] [[visibility]] [[<prologuearg>]]
-
- The list below explains each of these options.
-
- Argument Description
- ────────────────────────────────────────────────────────────────────────────
- distance Controls the form of the RET instruction
- generated. Can be NEAR or FAR. If
- distance is not specified, it is
- determined from the model declared with
- the .MODEL directive. For TINY, SMALL,
- COMPACT, and FLAT, NEAR is assumed. For
- MEDIUM, LARGE, and HUGE, FAR is assumed.
- For 80386/486 programming with 16- and
- 32-bit segments, NEAR16, NEAR32, FAR16,
- or FAR32 can be specified.
-
- langtype Determines the calling convention used
- to access parameters and restore the
- stack. The BASIC, FORTRAN, and PASCAL
- langtypes convert procedure names to
- uppercase, place the last parameter in
- the parameter list lowest on the stack,
- and generate a RET that adjusts the
- stack upward by the number of bytes in
- the argument list.
-
- The C and STDCALL langtypes prefix an
- underscore to the procedure name when
- the procedure's scope is PUBLIC or
- EXPORT and place the first parameter
- lowest on the stack. SYSCALL is
- equivalent to the C calling convention
- with no underscore prefixed to the
- procedure's name. STDCALL uses caller
- stack cleanup when :VARARG is specified;
- otherwise the called routine must clean
- up the stack (see Chapter 20).
-
- visibility Indicates whether the procedure is
- available to other modules. The
- visibility can be PRIVATE, PUBLIC, or
- EXPORT. A procedure name is PUBLIC
- unless it is explicitly declared as
- PRIVATE. If the visibility is EXPORT,
- the linker places the procedure's name
- in the export table for segmented
- executables. EXPORT also enables PUBLIC
- visibility.
-
- You can explicitly set the default
- visibility with the OPTION directive.
- OPTION PROC:PUBLIC sets the default to
- public. See Section 1.3.2 for more
- information.
-
- prologuearg Specifies the arguments that affect the
- generation of prologue and epilogue code
- (the code MASM generates when it
- encounters a PROC directive or the end
- of a procedure). See Section 7.3.8 for
- an explanation of prologue and epilogue
- code.
-
-
-
- 7.3.3.2 Parameters
-
- The parameters are separated from the reglist by a comma if there is a list
- of registers. In the syntax:
-
- parmname [[:tag]]
-
- parmname is the name of the parameter. The tag can be either the
- qualifiedtype or the keyword VARARG. However, only the last parameter in a
- list of parameters can use the VARARG keyword. The qualifiedtype is
- discussed in Section 1.2.6, "Data Types." An example showing how to
- reference VARARG parameters appears later in this section. Procedures can be
- nested if they do not have parameters or USES register lists. This diagram
- shows a procedure definition with one parameter definition.
-
- (This figure may be found in the printed book.)
-
- The following example shows the procedure in Section 7.3.2, "Passing
- Arguments on the Stack," rewritten to use the extended PROC functionality.
- Prior to the procedure call, you must push the arguments onto the stack
- unless you use INVOKE (see Section 7.3.7, "Calling Procedures with INVOKE").
-
-
- addup PROC NEAR C,
- arg1:WORD, arg2:WORD, count:WORD
- mov ax, arg1
- add ax, count
- add ax, arg2
- ret
- addup ENDP
-
- If the arguments for a procedure are pointers, the assembler does not
- generate any code to get the value or values that the pointers reference;
- your program must still explicitly treat the argument as a pointer. (See
- Chapter 3, "Using Addresses and Pointers," for more information about using
- pointers.)
-
- In the example below, even though the procedure declares the parameters as
- near pointers, you still must code two MOV instructions to get the value of
- each parameter: the first MOV loads the pointer passed as the argument, and
- the second MOV loads the value it points to.
-
- ; Call from C as a FUNCTION returning an integer
-
- .MODEL medium, c
- .CODE
- myadd PROC arg1:NEAR PTR WORD, arg2:NEAR PTR WORD
-
- mov bx, arg1 ; Load first argument
- mov ax, [bx]
- mov bx, arg2 ; Add second argument
- add ax, [bx]
-
- ret
-
- myadd ENDP
- END
-
- You can use conditional-assembly directives to make sure that your pointer
- parameters are loaded correctly for the memory model. For example, the
- following version of myadd treats the parameters as FAR parameters if
- necessary:
-
- .MODEL medium, c ; Could be any model
- .CODE
- myadd PROC arg1:PTR WORD, arg2:PTR WORD
-
- IF @DataSize
- les bx, arg1 ; Far parameters
- mov ax, es:[bx]
- les bx, arg2
- add ax, es:[bx]
- ELSE
- mov bx, arg1 ; Near parameters
- mov ax, [bx]
- mov bx, arg2
- add ax, [bx]
- ENDIF
-
- ret
- myadd ENDP
-
- END
-
-
- 7.3.3.3 Using VARARG
-
- In the PROC statement, you can append the :VARARG keyword to the last
- parameter to indicate that a variable number of arguments can be passed if
- you use the C, SYSCALL, or STDCALL calling conventions (see Section 20.1). A
- label must precede :VARARG so that the arguments can be accessed as offsets
- from the variable name given. This example illustrates VARARG:
-
- addup3 PROTO NEAR C, argcount:WORD, arg1:VARARG
-
- invoke addup3, 3, 5, 2, 4
-
- addup3 PROC NEAR C, argcount:WORD, arg1:VARARG
- sub ax, ax ; Clear work register
- sub si, si
-
- .WHILE argcount > 0 ; Argcount has number of arguments
- add ax, arg1[si] ; Arg1 has the first argument
- dec argcount ; One fewer argument left to add
- inc si ; Point to next argument
- inc si
- .ENDW
-
- ret ; Total is in AX
- addup3 ENDP
-
- Passing non-default-sized pointers in the VARARG portion of the parameter
- list can be done by explicitly passing the segment portion and the offset
- portion of the address separately.
-
- ────────────────────────────────────────────────────────────────────────────
- NOTE
-
- When you use the extended PROC features and the assembler encounters a RET
- instruction, it automatically generates instructions to pop saved registers,
- remove local variables from the stack, and, if necessary, remove parameters.
- It generates this code for each RET instruction it encounters. You can
- reduce code size by having only one return and jumping to it from various
- locations.
- ────────────────────────────────────────────────────────────────────────────
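-
- The following sketch shows one way to apply the advice in the note. The
- procedure name clamp and the label done are hypothetical. Both paths leave
- through the single RET, so the instructions that pop DI and SI are generated
- only once.
-
- clamp PROC NEAR C USES si di, value:WORD
- mov ax, value
- cmp ax, 100
- jle done ; In range: go straight to the return
- mov ax, 100 ; Out of range: limit the result to 100
- done: ret ; The only RET, so only one copy of the epilogue
- clamp ENDP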
-
-
- 7.3.4 Using Local Variables
-
- In high-level languages, local variables are visible only within a
- procedure. In Microsoft languages, these variables are usually stored on the
- stack. In assembly-language programs, you can also have local variables.
- These variables should not be confused with labels or variable names that
- are local to a module, as described in Chapter 8, "Sharing Data and
- Procedures among Modules and Libraries."
-
- This section outlines the standard methods for creating local variables. The
- next section shows how to use the LOCAL directive to make the assembler
- automatically generate local variables. When you use this directive, the
- assembler generates the same instructions as those used in this section but
- handles some of the details for you.
-
- If your procedure has relatively few variables, you can usually write the
- most efficient code by placing these values in registers. Local (stack) data
- is more efficient when you have a large amount of local data for the
- procedure.
-
- Local variables are stored on the stack.
-
- To use local variables you must save stack space for the variable at the
- start of the procedure. The variable can then be accessed by its position in
- the stack. At the end of the procedure, you need to restore the stack
- pointer, which restores the memory used by local variables.
-
- This example subtracts two bytes from the SP register to make room for a
- local word variable. This variable can then be accessed as [bp-2].
-
- push ax ; Push one argument
- call task ; Call
- .
- .
- .
-
- task PROC NEAR
- push bp ; Save base pointer
- mov bp, sp ; Load stack into base pointer
- sub sp, 2 ; Save two bytes for local
- ; variable
- .
- .
- .
- mov WORD PTR [bp-2], 3 ; Initialize local variable
- add ax, [bp-2] ; Add local variable to AX
- sub [bp+4], ax ; Subtract local from argument
- . ; Use [bp-2] and [bp+4] in
- . ; other operations
- .
- mov sp, bp ; Clear local variables
- pop bp ; Restore base
- ret 2 ; Return result in AX and pop
- task ENDP ; two bytes to clear parameter
-
- Notice that the instruction mov sp,bp at the end of the procedure restores
- the original value of SP. The statement is required only if the value of SP
- is changed inside the procedure (usually by allocating local variables). The
- argument passed to the procedure is removed with the RET instruction.
- Contrast this to the example in Section 7.3.2, "Passing Arguments on the
- Stack," in which the calling code adjusts the stack for the argument.
-
- Figure 7.2 shows the state of the stack at key points in the process.
-
- (This figure may be found in the printed book.)
-
-
- 7.3.5 Creating Local Variables Automatically
-
- Section 7.3.4 described how to create local variables on the stack. This
- section shows you how to automate the process with the LOCAL directive.
-
- The LOCAL directive generates code to set up the stack for local variables.
-
-
- You can use the LOCAL directive to save time and effort when working with
- local variables. When you use this directive, simply list the variables you
- want to create, giving a type for each one. The assembler calculates how
- much space is required on the stack. It also generates instructions to
- properly decrement SP (as described in the previous section) and to reset SP
- when you return from the procedure.
-
- When you create local variables this way, your source code can then refer to
- each local variable by name rather than as an offset from BP. Moreover, the
- assembler generates debugging information for each local variable.
-
- The procedure in the previous section can be generated more simply with the
- following code:
-
- task PROC NEAR arg:WORD
- LOCAL loc:WORD
- .
- .
- .
- mov loc, 3 ; Initialize local variable
- add ax, loc ; Add local variable to AX
- sub arg, ax ; Subtract local from argument
- . ; Use "loc" and "arg" in other operations
- .
- .
- ret
- task ENDP
-
- The LOCAL directive must be on the line immediately following the PROC
- statement. It cannot be used after the first instruction in a procedure. The
- LOCAL directive has the following syntax:
-
- LOCAL vardef [[, vardef]]...
-
- Each vardef defines a local variable. A local variable definition has this
- form:
-
- label[[ [count] ]][[:qualifiedtype]]
-
- These are the parameters in local variable definitions:
-
- Argument Description
- ────────────────────────────────────────────────────────────────────────────
- label The name given to the local variable.
- You can use this name to access the
- variable.
-
- count The number of elements of this name and
- type to allocate on the stack. You can
- allocate a simple array on the stack
- with count. The brackets around count
- are required. If this field is omitted,
- one data object is assumed.
-
- qualifiedtype A simple MASM type or a type defined
- with other types and attributes. See
- Section 1.2.6, "Data Types," for more
- information.
-
-
- If the number of local variables exceeds one line, you can place a comma at
- the end of the first line and continue the list on the next line. Another
- method is to use several consecutive LOCAL directives.
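-
- Both methods might look like the fragment below, which is a sketch only;
- the procedure and variable names are hypothetical. The first LOCAL
- directive continues onto a second line because its first line ends with a
- comma, and the second LOCAL directive adds one more variable.
-
- tally PROC NEAR C, src:WORD
- LOCAL count:WORD, buffer[16]:BYTE,
- total:DWORD
- LOCAL status:WORD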
-
- You must initialize local variables.
-
- The assembler does not initialize local variables. Your program must include
- code to perform any necessary initializations. For example, the following
- code fragment sets up a local array and initializes it to zero:
-
- arraysz EQU 20
-
- aproc PROC USES di
- LOCAL var1[arraysz]:WORD, var2:WORD
- .
- .
- .
- ; Initialize local array to zero
- push ss
- pop es ; Set ES=SS
- lea di, var1 ; ES:DI now points to array
- mov cx, arraysz ; Load count
- sub ax, ax
- rep stosw ; Store zeros
- ; Use the array...
- .
- .
- .
- ret
- aproc ENDP
-
- Even though you can reference stack variables by name, the assembler treats
- them as offsets from BP, and they are not visible outside the procedure. In
- this procedure, array is a local variable.
-
- index EQU 10
- test PROC NEAR
- LOCAL array[index]:WORD
- .
- .
- .
- mov bx, index
- ; mov array[bx], 5 ; Not legal!
-
- The second MOV statement may appear to be legal, but since array is an
- offset of BP, this statement is the same as
-
- ; mov [bp + bx + arrayoffset], 5 ; Not legal!
-
- BP and BX can be added only to SI and DI. This example would be legal,
- however, if the index value were moved to SI or DI. This type of error in
- your program can be difficult to find unless you keep in mind that local
- variables in procedures are offsets of BP.
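-
- For instance, moving the index to SI gives a form that the processor can
- encode. This sketch continues the procedure above; the offset 8 (the fifth
- word of the array) is chosen only for illustration.
-
- mov si, 8 ; Byte offset of the fifth element
- mov array[si], 5 ; Legal: same as [bp + si + arrayoffset]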
-
-
- 7.3.6 Declaring Procedure Prototypes
-
- MASM 6.0 provides a new directive, INVOKE, to handle many of the details
- important to procedure calls, such as pushing parameters according to the
- correct calling conventions. In order to use INVOKE, the procedure called
- must have previously been declared with a PROC statement, an EXTERNDEF (or
- EXTERN) statement, or a TYPEDEF. You can also place a prototype defined with
- PROTO before the INVOKE if the procedure type does not appear before the
- INVOKE. Procedure prototypes defined with PROTO inform the assembler of
- types and numbers of arguments so the assembler can check for errors and
- provide automatic conversions when INVOKE calls the procedure.
-
-
- Place prototypes after data declarations or in a separate include file.
-
- Prototypes in MASM perform the same function as prototypes in the C language
- and other high-level languages. A procedure prototype includes the procedure
- name, the types, and (optionally) the names of all parameters the procedure
- expects. Prototypes are usually placed at the beginning of an assembly
- program or in a separate include file. They are especially useful for
- procedures called from other modules and other languages, enabling the
- assembler to check for unmatched parameters. If you write routines for a
- library, you may want to put prototypes into an include file for all the
- procedures used in that library. See Chapter 8, "Sharing Data and Procedures
- among Modules and Libraries," for more information about using include
- files.
-
- Declaring procedure prototypes is optional. You can use the PROC directive
- and the CALL instruction, as shown in the previous section.
-
- In MASM 6.0, using the PROTO directive is one way to define procedure
- prototypes. The syntax for a prototype definition is the same as for a
- procedure declaration (see Section 7.3.3, "Declaring Parameters with the
- PROC Directive"), except that you do not include the list of registers,
- prologuearg list, or the scope of the procedure.
-
- Also, the PROTO keyword precedes the langtype and distance attributes. The
- attributes (like C and FAR) are optional, but if not specified, the defaults
- are based on any .MODEL or OPTION LANGUAGE statement. The names of the
- parameters are also optional, but you must list parameter types. A label
- preceding :VARARG is also optional in the prototype but not in the PROC
- statement.
-
- If a PROTO and a PROC for the same function appear in the same module, they
- must match in attribute, number of parameters, and parameter types. The
- easiest way to create prototypes with PROTO for your procedures is to write
- the procedure and then copy the first line (the line that contains the PROC
- keyword) to a location in your program that follows the data declarations.
- Change PROC to PROTO and remove the USES reglist, the prologuearg field, and
- the visibility field. It is important that the prototype follow the
- declarations for any types used in it to avoid any forward references used
- by the parameters in the prototype.
-
- The PROTO statements and the corresponding PROC statements for two
- procedures are given below.
-
- ; Procedure prototypes
-
- addup PROTO NEAR C argcount:WORD, arg2:WORD, arg3:WORD
-
- myproc PROTO FAR C, argcount:WORD, arg2:VARARG
-
- ; Procedure declarations
-
- addup PROC NEAR C, argcount:WORD, arg2:WORD, arg3:WORD
-
- myproc PROC FAR C PUBLIC <callcount> USES di si,
- argcount:WORD,
- arg2:VARARG
-
- When you call a procedure with INVOKE, the assembler checks the arguments
- given by INVOKE against the parameters expected by the procedure. If the
- data types of the arguments do not match, MASM either reports an error or
- converts the type to the expected type. These conversions are explained in
- the next section.
-
-
- 7.3.7 Calling Procedures with INVOKE
-
- INVOKE generates a sequence of instructions that push arguments and call a
- procedure. This makes code easier to maintain if the arguments or langtype
- for a procedure change. INVOKE automatically handles the following tasks:
-
-
- ■ Converts arguments to the expected types
-
- ■ Pushes arguments on the stack in the correct order
-
- ■ Cleans up the stack when the procedure returns
-
-
- If arguments do not match in number or if the type is not one the assembler
- can convert, an error results.
-
- If VARARG is an option in a procedure, INVOKE can pass arguments in addition
- to those in the parameter list without generating an error or warning. The
- extra arguments must be at the end of the INVOKE argument list. All other
- arguments must match in number and type.
-
- The syntax for INVOKE is
-
- INVOKE expression [[, arguments]]
-
- where expression can be the procedure's label or an indirect reference to a
- procedure, and arguments can be an expression, a register pair, or an
- expression preceded with ADDR. (The ADDR operator is discussed below.)
-
- Procedures that have these procedure prototypes
-
- addup PROTO NEAR C argcount:WORD, arg2:WORD, arg3:WORD
-
- myproc PROTO FAR C, argcount:WORD, arg2:VARARG
-
- and these procedure declarations
-
- addup PROC NEAR C, argcount:WORD, arg2:WORD, arg3:WORD
-
- myproc PROC FAR C PUBLIC <callcount> USES di si,
- argcount:WORD,
- arg2:VARARG
-
- may have INVOKE statements that look like this:
-
- INVOKE addup, ax, x, y
- INVOKE myproc, bx, cx, 100, 10
-
- The assembler can convert some arguments and parameter type combinations so
- that the correct type can be passed. The signed or unsigned qualities of the
- arguments in the INVOKE statements determine how the assembler converts them
- to the types expected by the procedure.
-
- The addup procedure, for example, expects parameters of type WORD, but the
- arguments passed by INVOKE to the addup procedure can be any of these
- types:
-
-
- ■ BYTE, SBYTE, WORD, or SWORD
-
- ■ An expression whose type is specified with the PTR operator to be one
- of those types
-
- ■ An 8-bit or 16-bit register
-
- ■ An immediate expression in the range -32K to +64K
-
- ■ A NEAR PTR
-
-
- If the type is smaller than that expected by the procedure, MASM widens the
- argument to match.
-
-
- 7.3.7.1 Widening Arguments
-
- For INVOKE to correctly handle type conversions, you must use the signed
- data types for any signed assignments. This list shows the cases in which
- MASM widens an argument to match the type expected by a procedure's
- parameters.
-
- Type Passed Type Expected
- ────────────────────────────────────────────────────────────────────────────
- BYTE, SBYTE WORD, SWORD, DWORD, SDWORD
-
- WORD, SWORD DWORD, SDWORD
-
- When possible, MASM widens arguments to match parameter types.
-
- The assembler generates instructions such as XOR and CBW to perform the
- conversion. You can see these generated instructions in the listing file by
- using the /Sg command-line option. The assembler can extend a segment if far
- data is expected, and it can convert the type given in the list to the types
- expected. If the assembler cannot convert the type, however, it generates an
- error.
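-
- The sketch below suggests what widening looks like for BYTE and SBYTE
- arguments passed to a WORD parameter. The procedure takew and the variables
- ucount and scount are hypothetical, a .MODEL directive is assumed to be in
- effect, and the expansions described in the comments are approximate; the
- listing produced with /Sg shows the exact instructions.
-
- takew PROTO NEAR C, n:WORD
-
- .DATA
- ucount BYTE 200
- scount SBYTE -56
-
- .CODE
- INVOKE takew, ucount ; Unsigned: AL loaded, AH cleared with XOR
- INVOKE takew, scount ; Signed: AL loaded, AX sign-extended with CBW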
-
-
- 7.3.7.2 Detecting Errors
-
- When the assembler widens arguments, it may require the use of a register
- that could overwrite another argument.
-
- For example, if a procedure with the C calling convention is called with
- this INVOKE statement,
-
- INVOKE myprocA, ax, cx, 100, arg
-
- where arg is a BYTE variable and myprocA expects four arguments of type
- WORD, the assembler widens and then pushes the variable with this code:
-
- mov al, DGROUP:arg
- xor ah, ah
- push ax
-
- This code uses the AX register. Because the C calling convention pushes
- arguments from right to left, arg is widened before the first argument is
- pushed, so the value passed in AX would be overwritten. The assembler
- generates an error in this case, requiring you to rewrite the INVOKE
- statement for this procedure.
-
- The INVOKE directive uses as few registers as possible. However, widening
- arguments or pushing constants on the 8088 and 8086 requires the use of the
- AX register, and sometimes the DX register (or EAX and EDX on the
- 80386/486). This means that the contents of AL, AH, AX, and EAX must
- frequently be overwritten, so you should avoid using these registers to pass
- arguments. As an alternative you can use DL, DH, DX, and EDX, since these
- registers are rarely used.
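-
- One way to rewrite the INVOKE statement shown earlier in this section,
- sketched here under the assumption that DX is free at the point of the call,
- is to move the first argument out of AX beforehand:
-
- mov dx, ax ; Keep the first argument out of AX
- INVOKE myprocA, dx, cx, 100, arg ; Widening arg no longer destroys it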
-
-
- 7.3.7.3 Invoking Far Addresses
-
- You can pass a FAR pointer in a segment::offset pair, as shown below. Note
- the use of double colons to separate the register pair. The registers could
- be any other register pair, including a pair that a DOS call uses to return
- values.
-
- FPWORD TYPEDEF FAR PTR WORD
- SomeProc PROTO var1:DWORD, var2:WORD, var3:WORD
-
- pfaritem FPWORD faritem
- .
- .
- .
- les bx, pfaritem
- INVOKE SomeProc, ES::BX, arg1, arg2
-
- However, you cannot give INVOKE two arguments, one for the segment and one
- for the offset, and have INVOKE combine the two for an address.
-
-
- 7.3.7.4 Passing an Address
-
- You can use the ADDR operator to pass the address of an expression to a
- procedure that is expecting a NEAR or FAR pointer. This example generates
- code to pass a far pointer (to arg1) to the procedure proc1.
-
- PBYTE TYPEDEF FAR PTR BYTE
- arg1 BYTE "This is a string"
- proc1 PROTO NEAR C fparg:PBYTE
- .
- .
- .
- INVOKE proc1, ADDR arg1
-
- See Section 3.3.1 for information on defining pointers with TYPEDEF.
-
-
- 7.3.7.5 Invoking Procedures Indirectly
-
- You can make an indirect procedure call such as call [bx + si] by using a
- pointer to a function prototype with TYPEDEF, as shown in this example:
-
- FUNCPROTO TYPEDEF PROTO NEAR ARG1:WORD, ARG2:WORD
- FUNCPTR TYPEDEF PTR FUNCPROTO
-
- .DATA
- pfunc FUNCPTR OFFSET proc1, OFFSET proc2
-
- .CODE
- mov si, Num ; Num contains 0 or 2
- INVOKE FUNCPTR PTR [si] ; Selects proc1 or proc2
-
- You can also use ASSUME to accomplish the same task. The ASSUME statement
- associates the type FUNCPTR with the BX register.
-
- ASSUME BX:FUNCPTR
- mov si, Num
- INVOKE FUNCPTR PTR [bx+si]
-
-
- 7.3.7.6 Checking the Code Generated
-
- The INVOKE directive generates code that may vary depending on the processor
- mode and calling conventions in effect. You can check your listing files to
- see the code generated by the INVOKE directive if you use the /Sg
- command-line option.
-
-
- 7.3.8 Generating Prologue and Epilogue Code
-
- When you use the PROC directive with its extended syntax and argument list,
- the assembler automatically generates the prologue and epilogue code in your
- procedure. "Prologue code" is generated at the start of the procedure; it
- sets up a stack pointer so you can access parameters from within the
- procedure. It also saves space on the stack for local variables, initializes
- registers such as DS, and pushes registers that the procedure uses.
- Similarly, "epilogue code" is the code at the end of the procedure that pops
- registers and returns from the procedure.
-
- The assembler automatically generates the prologue code when it encounters
- the first instruction after the PROC directive. It generates the epilogue
- code when it encounters a RET or IRET instruction. Using the
- assembler-generated prologue and epilogue code saves you time and decreases
- the number of repetitive lines of code in your procedures.
-
- The generated prologue or epilogue code depends on the
-
-
- ■ Local variables defined
-
- ■ Arguments passed to the procedure
-
- ■ Current processor selected (affects epilogue code only)
-
- ■ Current calling convention
-
- ■ Options passed in the prologuearg of the PROC directive
-
- ■ Registers being saved
-
-
- The prologuearg list contains options specifying how the prologue or
- epilogue code should be generated. The next section explains how to use
- these options, gives the standard prologue and epilogue code, and explains
- the techniques for defining your own prologue and epilogue code.
-
-
- 7.3.8.1 Using Automatic Prologue and Epilogue Code
-
- The standard prologue and epilogue code handles parameters and local
- variables. If a procedure does not have any parameters or local variables,
- the prologue and epilogue code that sets up and restores a stack pointer is
- omitted, unless FORCEFRAME is included in the prologuearg list. (FORCEFRAME
- is discussed later in this section.) Prologue and epilogue code also
- generates a push and pop for each register in the register list unless the
- register list is empty.
-
- RETN and RETF suppress epilogue code generation.
-
- When a RET is used without an operand, the assembler generates the standard
- epilogue code. If you do not want the standard epilogue generated, you can
- use RETN or RETF with or without operands. RET with an integer operand does
- not generate epilogue code, but it does generate the right size of return.
-
- In the examples below showing standard prologue and epilogue code,
- localbytes is a variable name used in this example to represent the number
- of bytes needed on the stack for the locals declared, parmbytes represents
- the number of bytes that the parameters take on the stack, and registers
- represents the list of registers to be pushed or popped.
-
- The standard prologue code is the same in any processor mode:
-
- push bp
- mov bp, sp
- sub sp, localbytes ; if localbytes is not 0
- push registers
-
- The standard epilogue code is:
-
- pop registers
- mov sp, bp ; if localbytes is not 0
- pop bp
- ret parmbytes ; use parmbytes only if lang is not C
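-
- Because RETN and RETF suppress the epilogue, a procedure that ends with one
- of them must undo the generated prologue itself. The sketch below assumes
- the standard prologue for a procedure with one parameter, no local
- variables, and no USES list (push bp followed by mov bp, sp). The procedure
- name quick is hypothetical.
-
- quick PROC NEAR C, value:WORD
- mov ax, value ; Prologue has already set up BP
- pop bp ; Restore BP by hand
- retn ; No epilogue code is generated
- quick ENDP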
-
- The standard prologue and epilogue code recognizes two operands passed in
- the prologuearg list, LOADDS and FORCEFRAME. These operands modify the
- prologue code. Specifying LOADDS saves and initializes DS. Specifying
- FORCEFRAME as an argument generates a stack frame even if no arguments are
- sent to the procedure and no local variables are declared. If your procedure
- has any parameters or locals, you do not need to specify FORCEFRAME.
-
- Specifying LOADDS generates this prologue code:
-
- push bp
- mov bp, sp
- sub sp, localbytes ; if localbytes is not 0
- push ds
- mov ax, DGROUP
- mov ds, ax
- push registers
-
- Specifying LOADDS generates the following epilogue code:
-
- pop registers
- pop ds
- mov sp, bp
- pop bp
- ret parmbytes ; use parmbytes only if lang is not C
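-
- As a hedged illustration of how these options might appear in source code,
- the sketch below assumes that the prologuearg list is written in angle
- brackets between the procedure attributes and the USES clause. The names
- total, AddToTotal, and Marker are hypothetical.
-
```asm
; Sketch only: total, AddToTotal, and Marker are hypothetical names.
        .MODEL  medium, PASCAL
        .DATA
total   WORD    0
        .CODE

; LOADDS: the prologue saves DS and loads it with DGROUP (using AX).
AddToTotal PROC FAR <LOADDS> USES bx, amount:WORD
        mov     bx, amount
        add     total, bx       ; DS already addresses DGROUP here
        ret                     ; epilogue restores BX, DS, and BP,
                                ; then returns and removes 2 parameter bytes
AddToTotal ENDP

; FORCEFRAME: a stack frame is built although there are no
; parameters and no locals.
Marker  PROC    NEAR <FORCEFRAME>
        ret
Marker  ENDP
        END
```
-
- Because the LOADDS prologue shown above loads DGROUP through AX, a procedure
- written this way should not expect AX to survive the prologue.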
-
-
- 7.3.8.2 User-Defined Prologue and Epilogue Code
-
- If you want a different set of instructions for prologue and epilogue code
- in your procedures, you can write macros that are executed instead of the
- standard prologue and epilogue code. For example, while you are debugging
- your procedures, you may want to include a stack check or track the number
- of times a procedure is called. You can write your own prologue code to do
- these things whenever a procedure executes. Different prologue code may also
- be necessary if you are writing applications for Microsoft Windows or for
- another operating environment that runs on DOS. User-defined prologue macros
- will respond correctly if you specify FORCEFRAME in the prologuearg of a
- procedure.
-
- To write your own prologue or epilogue code, the OPTION directive must
- appear in your program. It disables automatic prologue and epilogue code
- generation. When you specify
-
- OPTION PROLOGUE : macroname
-
- OPTION EPILOGUE : macroname
-
- the assembler calls the macro specified in the OPTION directive instead of
- generating the standard prologue and epilogue code. The prologue macro must
- be a macro function, and the epilogue macro must be a macro procedure.
-
- The assembler expects your prologue or epilogue macro to have this form:
-
- macroname MACRO procname, \
- flag, \
- parmbytes, \
- localbytes, \
- <reglist>, \
- userparms
-
- The following list explains the arguments passed to your macro. Your macro
- must have formal parameters to match all the actual arguments passed.
-
- Argument     Description
- ────────────────────────────────────────────────────────────────────────────
- procname     The name of the procedure.
-
- flag         A 16-bit flag containing the following information:
-
-              Bit         Description
-
-              Bits 0-2    Calling convention (000=unspecified language
-                          type, 001=C, 010=SYSCALL, 011=STDCALL,
-                          100=PASCAL, 101=FORTRAN, 110=BASIC)
-
-              Bit 3       Undefined (not necessarily zero)
-
-              Bit 4       Set if the caller restores the stack
-                          (Use RET, not RETn)
-
-              Bit 5       Set if procedure is FAR
-
-              Bit 6       Set if procedure is PRIVATE
-
-              Bit 7       Set if procedure is EXPORT
-
-              Bit 8       Set if the epilogue was generated as a result
-                          of an IRET instruction and cleared if the
-                          epilogue was generated as a result of a RET
-                          instruction
-
-              Bits 9-15   Undefined (not necessarily zero)
-
- parmbytes    The byte count of all the parameters given in the PROC
-              statement.
-
- localbytes   The count in bytes of all locals defined with the LOCAL
-              directive.
-
- reglist      A list of the registers following the USES operator in the
-              procedure declaration. This list is enclosed by angle
-              brackets (< >), and each item is separated by commas. This
-              list is reversed for epilogues.
-
- userparms    Any argument you want to pass to the macro. The prologuearg
-              (if there is one) specified in the PROC directive is passed
-              to this argument.
-
-
-
- Your macro function must return the localbytes parameter. However, if the
- prologue places other values on the stack after pushing BP and these values
- are not referenced by any of the local variables, the exit value must be the
- number of bytes for procedure locals plus any space between BP and the
- locals. Therefore the exit value is not always equal to the bytes occupied
- by the locals.
-
- The following macro is an example of a user-defined prologue that counts the
- number of times a procedure is called.
-
- ProfilePro MACRO procname, \
- flag, \
- bytecount, \
- numlocals, \
- regs, \
- macroargs
-
- .DATA
- procname&count WORD 0
- .CODE
- inc procname&count ; Accumulates count of times the
- ; procedure is called
- push bp
- mov bp, sp
- IF numlocals NE 0
- sub sp, numlocals ; Reserve stack space for the locals
- ENDIF
- ; Other BP operations
- IFNB <regs>
- FOR r, regs
- push r
- ENDM
- ENDIF
- ; Return the number of bytes reserved for locals
- EXITM %numlocals
- ENDM
-
- Your program must also include this statement before the definition of any
- procedure that uses the prologue:
-
- OPTION PROLOGUE:ProfilePro
-
- If you define only a prologue or an epilogue macro, the standard prologue or
- epilogue code is used for the one you do not define. The form of the code
- generated depends on the .MODEL and PROC options used.
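-
- A matching user-defined epilogue might look like the sketch below. It is a
- hypothetical example, not the standard epilogue: the name ProfileEpi is
- invented here, the macro assumes the six arguments described earlier, and it
- ignores the IRET case (bit 8 of the flag) and the LOADDS option. It uses
- RETN and RETF so that the return instruction inside the macro does not
- itself request an epilogue.
-
```asm
; Sketch only: ProfileEpi is a hypothetical epilogue macro (a macro
; procedure, so it has no EXITM return value).
ProfileEpi MACRO procname, \
                 flag, \
                 bytecount, \
                 numlocals, \
                 regs, \
                 macroargs
        IFNB <regs>
            FOR r, regs             ; list arrives already reversed
                pop r
            ENDM
        ENDIF
        mov     sp, bp              ; discard any local variables
        pop     bp
        IF (flag AND 20h)           ; bit 5 set: FAR procedure
            IF (flag AND 10h)       ; bit 4 set: caller restores the stack
                retf
            ELSE
                retf bytecount
            ENDIF
        ELSE
            IF (flag AND 10h)
                retn
            ELSE
                retn bytecount
            ENDIF
        ENDIF
ENDM

OPTION EPILOGUE:ProfileEpi
```
-
- The bit masks 10h and 20h correspond to bits 4 and 5 in the flag table
- above.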
-
- If you want to revert to the standard prologue or epilogue code, use
- PROLOGUEDEF or EPILOGUEDEF as the macroname in the OPTION statement.
-
- OPTION EPILOGUE:EPILOGUEDEF
-
- You can completely suppress prologue or epilogue generation with
-
- OPTION PROLOGUE:None
- OPTION EPILOGUE:None
-
- In this case, no user-defined macro is called, and the assembler does not
- generate a default code sequence. This state remains in effect until the
- next OPTION PROLOGUE or OPTION EPILOGUE is encountered.
-
- See Chapter 9 for additional information about writing macros. The
- PROLOGUE.INC file provided on the MASM 6.0 distribution disks can be used to
- create the prologue and epilogue sequences for the Microsoft C Professional
- Development System, version 6.0.
-
-
- 7.4 DOS Interrupts
-
- Like jumps, loops, and procedures, interrupts alter program execution by
- transferring control to a different location. In this case, control goes to
- an interrupt routine.
-
- You can write your own interrupt routines, either to replace an existing
- routine or to use an undefined interrupt number. You may want to replace the
- processor's divide-overflow (0h) interrupts or DOS interrupts, such as the
- critical-error (24h) and CONTROL+C (23h) handlers. The BOUND instruction
- checks array bounds and calls interrupt 5 when an error occurs. If you use
- this instruction, you need to write an interrupt handler for it.
-
- This section summarizes the following:
-
-
- ■ How to call interrupts
-
- ■ How the processor handles interrupts
-
- ■ How to redefine an existing interrupt routine
-
-
- The example routine in this section handles addition or multiplication
- overflow and illustrates the steps necessary for writing an interrupt
- routine. See Chapter 19, "Writing Memory-Resident Software" for additional
- information about DOS and BIOS interrupts.
-
- ────────────────────────────────────────────────────────────────────────────
- NOTE
- Under OS/2, system access is made through calls to the Applications Program
- Interface (API), not through interrupts. Microsoft Windows applications use
- both interrupts and API calls.
- ────────────────────────────────────────────────────────────────────────────
-
-
- 7.4.1 Calling DOS and ROM-BIOS Interrupts
-
- Interrupts are the only way to access DOS from assembly language. They are
- called with the INT instruction, which takes one operand─an immediate value
- between 0 and 255.
-
- When calling DOS and ROM-BIOS interrupts, you usually need to place a
- function number in the AH register. You can use other registers to pass
- arguments to functions. Some interrupts and functions return values in
- certain registers, although register use varies for each interrupt. This
- code writes the text of msg to the screen.
-
- .DATA
- msg BYTE "This writes to the screen$"
- .CODE
- mov dx, offset msg
- mov ah, 09h
- int 21h
-
- When the INT instruction executes, the processor takes the following six
- steps:
-
-
- 1. Looks up the address of the interrupt routine in the interrupt
- descriptor table (also called the "interrupt vector table"). This table
- starts at the lowest point in memory (segment 0, offset 0) and
- consists of four bytes (a two-byte offset followed by a two-byte
- segment) for each interrupt. Thus, the table entry for an interrupt
- is located at the offset equal to the interrupt number multiplied by 4.
-
- 2. Pushes the flags register, the current code segment (CS), and the
- current instruction pointer (IP).
-
- 3. Clears the trap flag (TF) and interrupt enable flag (IF). The flags
- are saved before they are cleared, so the IRET in step 6 restores the
- original settings.
-
- 4. Jumps to the address of the interrupt routine, as specified in the
- interrupt descriptor table.
-
- 5. Executes the code of the interrupt routine until it encounters an IRET
- instruction.
-
- 6. Pops the instruction pointer, code segment, and flags.
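-
- The table lookup in step 1 can be sketched by reading an entry directly. The
- example below is only an illustration for real-mode DOS; the variable oldvec
- is a hypothetical name, and in practice the DOS Get Interrupt Vector function
- is the safer way to obtain the same address.
-
```asm
; Sketch only: reads the table entry for interrupt 4 from segment 0.
        .MODEL  small
        .STACK
        .DATA
oldvec  DWORD   ?                   ; hypothetical variable for the address
        .CODE
        .STARTUP
        xor     ax, ax
        mov     es, ax              ; ES = segment 0, base of the table
        mov     bx, 4 * 4           ; entry offset = interrupt number * 4
        cli                         ; keep the two reads together
        mov     ax, es:[bx]         ; low word of the entry: offset
        mov     dx, es:[bx+2]       ; high word of the entry: segment
        sti
        mov     WORD PTR oldvec[0], ax
        mov     WORD PTR oldvec[2], dx
        .EXIT   0
        END
```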
-
-
- Figure 7.3 illustrates how interrupts work.
-
- (This figure may be found in the printed book.)
-
- Some DOS interrupts should not normally be called. Some (such as 20h and
- 27h) have been replaced by other DOS interrupts. Others are used internally
- by DOS.
-
-
- 7.4.2 Replacing or Redefining Interrupt Routines
-
- One interrupt routine you may want to redefine is the routine called by
- INTO. The INTO (Interrupt on Overflow) instruction is a variation of the INT
- instruction. It calls interrupt 04h when the overflow flag is set. By
- default, the routine for interrupt 4 simply consists of an IRET, so it
- returns without doing anything. Using INTO is an alternative to using JO
- (Jump on Overflow) to jump to an overflow routine.
-
- To replace or redefine an existing interrupt, your program must
-
-
- ■ Replace the address in the interrupt descriptor table with the address
- of your new routine and save the old address
-
- ■ Provide new instructions to handle the interrupt
-
- ■ Restore the old address before your program ends
-
-
- An interrupt routine can be written like a procedure by using the PROC and
- ENDP directives. The routine should always be defined as FAR and should end
- with an IRET instruction instead of a RET instruction.
-
- ────────────────────────────────────────────────────────────────────────────
- NOTE
-
- Since the assembler doesn't know whether you are going to terminate with
- RET or IRET, you can use the full extended PROC syntax (described in
- Section 7.3.3, "Declaring Parameters with the PROC Directive") to write
- interrupt procedures. However, you should not make interrupt procedures NEAR
- or specify arguments for them. You can use the USES keyword, however, to
- correctly generate code to save and to restore a register list in interrupt
- procedures.
- ────────────────────────────────────────────────────────────────────────────
-
- The STI (Set Interrupt Flag) and CLI (Clear Interrupt Flag) instructions
- turn interrupts on or off. You can use CLI to turn off interrupt processing
- so that an important routine cannot be stopped by a hardware interrupt.
- After the routine has finished, use STI to turn interrupt processing back
- on. Interrupts received while interrupt processing was turned off by CLI are
- saved and executed when STI turns interrupts back on.
-
- MASM 6.0 provides two new forms of the IRET instruction that suppress
- epilogue sequences. This allows an interrupt to have local variables or use
- a user-defined prologue. IRETF pops a FAR16 return address, and IRETFD pops a
- FAR32 return address.
-
- The following example uses DOS functions to save the address of the initial
- interrupt routine in a variable and to put the address of the new interrupt
- routine in the interrupt descriptor table. Once the new address has been
- set, the new routine is called any time the interrupt is called. This new
- routine prints a message and sets AX and DX to 0.
-
- To save the current address from the interrupt descriptor table, load AL
- with 04h and AH with 35h (the Get Interrupt Vector function) and call
- interrupt 21h; DOS returns the address in ES:BX. To install the address of
- your procedure, load DS:DX with that address, AL with 04h, and AH with 25h
- (the Set Interrupt Vector function).
-
- Follow this example to replace an existing interrupt routine. To write an
- interrupt handler for an unused interrupt, see online help for available
- vectors.
-
- .MODEL LARGE, C, OS_DOS
- FPFUNC TYPEDEF FAR PTR
- .DATA
- msg BYTE "Overflow - result set to 0",13,10,"$"
- vector FPFUNC ?
- .CODE
- .STARTUP
-
- mov ax, 3504h ; Load interrupt 4 and call DOS
- int 21h ; Get Interrupt Vector function
- mov WORD PTR vector[2],es ; Save segment
- mov WORD PTR vector[0],bx ; and offset
-
- push ds ; Save DS
- mov ax, cs ; Load segment of new routine
- mov ds, ax
- mov dx, OFFSET ovrflow ; Load offset of new routine
- mov ax, 2504h ; Load interrupt 4 and call DOS
- int 21h ; Set Interrupt Vector function
- pop ds ; Restore
- .
- .
- .
- add ax, bx ; Do addition (or multiplication)
- into ; Call interrupt 4 if overflow
- .
- .
- .
- lds dx, vector ; Load original interrupt address
- mov ax, 2504h ; Restore interrupt number 4
- int 21h ; with DOS set vector function
- mov ax, 4C00h ; Terminate function
- int 21h
-
- ovrflow PROC FAR
- sti ; Enable interrupts
- ; (turned off by INT)
- mov ah, 09h ; Display string function
- mov dx, OFFSET msg ; Load address
- int 21h ; Call DOS
- sub ax, ax ; Set AX to 0
- sub dx, dx ; Set DX to 0
- iret ; Return
- ovrflow ENDP
- END
-
- Before your program ends, you should restore the original address by loading
- DS:DX with the original interrupt address and using the DOS set vector
- function to store the original address at the correct location.
-
-
- 7.5 Related Topics in Online Help
-
- Other information available online which relates to topics in this chapter
- is given in the list below:
-
- Topic Access
- ────────────────────────────────────────────────────────────────────────────
- OPTION directive From the "MASM 6.0 Contents" screen,
- choose "Directives," then choose
- "Miscellaneous"
-
- DOS and ROM-BIOS interrupts From the list of System Resources on the
- "MASM 6.0 Contents" screen, choose "DOS
- Calls" or "BIOS Calls"
-
- BT, BTC, BTR, BTS From the "MASM 6.0 Contents" screen,
- choose "Processor Instructions" and then
- "Logical and Shifts"
-
- Other forms of the LOOP From the "MASM 6.0 Contents" screen,
- instruction choose "Processor Instructions" and then
- "Control Flow"
-
- Processor Flag Summary From the "MASM 6.0 Contents" screen,
- choose "Processor Instructions"
-
-
-
-
-
-
-
- Chapter 8 Sharing Data and Procedures among Modules and Libraries
- ────────────────────────────────────────────────────────────────────────────
-
- To use symbols and procedures in more than one module, the assembler must be
- able to recognize the shared data as global to all the modules where they
- are used. MASM 6.0 provides new techniques to simplify data-sharing and give
- a high-level interface to multiple-module programming. With these
- techniques, you can place shared symbols in include files. This makes the
- data declarations in the file available to all modules that use the include
- file.
-
- After an overview of the data-sharing methods in Section 8.1, Section 8.2
- focuses on organizing modules and using include files to simplify
- data-sharing. This first method lets you create a single include file
- that works in the modules where a symbol is used as well as in the module
- where it is defined.
-
- The other method is to share procedures and data items by using the PUBLIC
- and EXTERN directives in the appropriate modules. Section 8.3 explains how
- to use PUBLIC and EXTERN.
-
- You may also want to place commonly used routines in libraries. Section 8.4
- explains how to create program libraries and access their routines.
-
-
- 8.1 Selecting Data-Sharing Methods
-
- If data defined in one module is to be used in the other modules of a
- multiple-module program, the data must be made public and external. MASM
- provides several methods for doing this.
-
- One method is to declare a symbol public (with the PUBLIC directive) in the
- module where it is defined. This makes the symbol available to other
- modules. Then place an EXTERN statement for that symbol in the rest of the
- modules that use the public symbol. This statement informs the assembler
- that the symbol is external─defined in another module.
-
- As an alternative, you can use the COMM directive instead of PUBLIC and
- EXTERN. However, communal variables have some limitations. You cannot depend
- on their location in memory because they are allocated by the linker, and
- they cannot be initialized.
-
- These two data-sharing methods are still available, but MASM 6.0 introduces
- a new directive, EXTERNDEF, that declares a symbol either public or
- external, as appropriate. EXTERNDEF simplifies the declarations for global
- (public and external) variables and encourages the use of include files.
-
- The next section provides further details on using include files. Section
- 8.3, "Using Alternatives to Include Files," provides more information on
- PUBLIC and EXTERN.
-
-
- 8.2 Sharing Symbols with Include Files
-
- Place statements common to all modules in include files.
-
- Include files can contain any valid MASM statement but typically consist of
- type and symbol declarations. The assembler inserts the contents of the
- include file into a module at the location of the INCLUDE directive. Include
- files can simplify project organization by eliminating the need to
- physically insert common declarations into more than one program or module.
- Include files are always optional. See Section 8.3 for alternatives to using
- include files.
-
- The first part of this section explains how to organize symbol definitions
- and the declarations that make the symbols global (available to all
- modules). It then shows how to make both variables and procedures public
- with EXTERNDEF, PROTO, and COMM. The last part of this section tells where
- to place these directives in the modules and include files.
-
-
- 8.2.1 Organizing Modules
-
- This section summarizes the organization of declarations and definitions in
- modules and include files and the use of the INCLUDE directive.
-
- Include Files - Type declarations that need to be identical in every module
- should be placed in an include file. Doing so ensures consistency and can
- save programming time when updating programs. Include files should contain
- only symbol declarations and any other declarations that are resolved at
- assembly time. (See Section 1.3.1, "Generating and Running Executable
- Programs," for a list of assembly-time operations.) If the include file is
- associated with more than one module, it cannot contain statements that
- define and allocate memory for symbols unless you include the data
- conditionally (see Section 1.3.3).
-
- Modules - Label definitions that cause the assembler to allocate memory
- space must be defined in a module, not in an include file. If any of these
- definitions is located in the include file, it is copied into each file that
- uses the include file, creating an error.
-
- Include files are inserted at the location of the INCLUDE directive.
-
- Once you have placed public symbols in an include file, you need to
- associate that file with the main module. The INCLUDE statement is usually
- placed before data and code segments in your modules. When the assembler
- encounters an INCLUDE directive, it opens the specified file and assembles
- all its statements. The assembler then returns to the original file and
- continues the assembly process.
-
- The INCLUDE directive takes the form
-
- INCLUDE filename
-
- where filename is the full name or fully specified path of the include file.
- For example, the following declaration inserts the contents of the include
- file SCREEN.INC in your program:
-
- INCLUDE SCREEN.INC
-
- You must make sure that the assembler can find include files.
-
- The file name in the INCLUDE directive must be fully specified; no
- extensions are assumed. If a full path name is not given, the assembler
- searches first in the directory of the source file containing the INCLUDE
- directive.
-
- If the include file is not in the source file directory, the assembler
- searches the paths specified in the assembler's command-line option /I, or
- in PWB's Include Paths field in the MASM Option dialog box (accessed from
- the Option menu). The /I option takes this form:
-
- /I path
-
- Multiple /I options can be used to specify that multiple directories be
- searched in the order they appear on the command line. If none of these
- directories contains the desired include file, the assembler finally
- searches in the paths specified in the INCLUDE environment variable. If the
- include file still cannot be found, an assembly error occurs. The related /X
- option tells the assembler to ignore the INCLUDE environment variable for
- all subsequent assemblies.
-
- An include file may specify another include file. The assembler processes
- the second include file before returning to the first. Include files can be
- nested this way as deeply as desired; the only limit is the amount of free
- memory.
-
- Put constants used in more than one module into the include file.
-
- Include Files or Modules - You can use the EQU directive to create named
- constants that cannot be redefined in your program (see Section 1.2.4,
- "Integer Constants and Constant Expressions," for information about the EQU
- directive). Placing a constant defined with EQU in an include file makes it
- available to all modules that use that include file.
-
- Placing TYPEDEF, STRUCT, UNION, and RECORD definitions in an include file
- guarantees consistency in type definitions. If required, the variable
- instances derived from these definitions can be made public among the
- modules with EXTERNDEF declarations (see the next section). Macros
- (including macros defined with TEXTEQU) must be placed in include files to
- make them visible in other modules.
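-
- As a rough illustration, an include file of this kind might contain only
- assembly-time declarations such as the ones below. The file contents and
- every name in it (ROWS, COLS, PBYTE, CURSOR, blank, curpos) are hypothetical.
- The sketch assumes the file is included after the .MODEL directive, so that
- the pointer size of PBYTE follows the memory model.
-
```asm
; SCREEN.INC - hypothetical contents; every name below is illustrative.
; Nothing here allocates memory, so any number of modules can include it.
ROWS        EQU     25              ; named constants
COLS        EQU     80
PBYTE       TYPEDEF PTR BYTE        ; shared pointer type
CURSOR      STRUCT                  ; shared structure layout
  row       BYTE    ?
  col       BYTE    ?
CURSOR      ENDS
blank       TEXTEQU <' '>           ; text macro
EXTERNDEF   curpos:CURSOR           ; variable defined in exactly one module
```
-
- The module that owns curpos would then define it once in its data segment,
- for example with curpos CURSOR <0, 0>.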
-
- If you elect to use full segment definitions (along with, or instead of,
- simplified definitions), you can force a consistent segment order in all
- files by defining segments in an include file. This technique is explained
- in Section 2.3.2, "Controlling the Segment Order."
-
-
- 8.2.2 Declaring Symbols Public and External
-
- It is sometimes useful to make procedures and variables (such as large
- arrays or status flags) global to all program modules. Global variables are
- freely accessible within all routines; you do not have to explicitly pass
- them to the routines that need them.
-
- Variables can be made global to multiple modules in several ways. This
- section describes three ways to make them global by using the EXTERNDEF,
- PROTO, or COMM declarations within include files. Section 8.3.1 explains how
- to use the PUBLIC and EXTERN directives within modules.
-
- External identifiers must be unique.
-
- These methods make symbols global to the modules in which they are used.
- Therefore, symbols must be unique. The linker enforces this requirement.
-
-
- 8.2.2.1 Using EXTERNDEF
-
- EXTERNDEF can appear in the defining or calling modules.
-
- MASM treats EXTERNDEF as a public declaration in the defining module and as
- an external declaration in accessing module(s). You can use the EXTERNDEF
- statement in your include file to make a variable common among two or more
- modules. EXTERNDEF works with all types of variables, including arrays,
- structures, unions, and records. It also works with procedures.
-
- As a result, a single include file can contain an EXTERNDEF declaration that
- works in both the defining module and any accessing module. It is ignored in
- modules that neither define nor access the variable. Therefore, an include
- file for a library which is used in multiple .EXE files does not force the
- definition of a symbol as EXTERN does.
-
- The EXTERNDEF statement takes this form:
-
- EXTERNDEF [[langtype]] name:qualifiedtype
-
- The name is the variable's identifier. The qualifiedtype is explained in
- detail in Section 1.2.6, "Data Types."
-
- The optional langtype specifier sets the naming conventions for the name it
- precedes. It overrides any language specified in the .MODEL directive. The
- specifier can be C, SYSCALL, STDCALL, PASCAL, FORTRAN, or BASIC. See Section
- 20.1, "Naming and Calling Conventions," for information on selecting the
- appropriate langtype.
-
- The diagram below shows the statements that declare an array, make it
- public, and use it in another module.
-
- (This figure may be found in the printed book.)
-
- The file position of EXTERNDEF directives is important. See Section 8.2.3,
- "Positioning External Declarations," for more information.
-
- The assembler does not check parameters when you call EXTERNDEF procedures.
-
-
- You can also make procedures visible by using EXTERNDEF without PROTO inside
- an include file. This method treats the procedure name as a simple
- identifier, without the parameter list, so you forgo the assembler's ability
- to check for the correct parameters during assembly.
-
- The method for using EXTERNDEF for procedures is the same as using it with
- variables. You can also use EXTERNDEF to make code labels global.
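-
- The arrangement described in this section can be sketched with three source
- files, shown here in one listing and separated by comment banners. Each part
- would be a separate file. The file names and the symbols scores and
- FillScores are hypothetical.
-
```asm
; ---- GLOBALS.INC (hypothetical shared include file) ----
EXTERNDEF   scores:WORD             ; data
EXTERNDEF   FillScores:NEAR         ; procedure, no parameter checking

; ---- FILL.ASM: defines both symbols, so EXTERNDEF acts as PUBLIC ----
            .MODEL  small
            INCLUDE GLOBALS.INC
            .DATA
scores      WORD    100 DUP (?)
            .CODE
FillScores  PROC
            mov     cx, 100
            xor     bx, bx
again:      mov     scores[bx], cx  ; store the countdown value
            add     bx, 2
            loop    again
            ret
FillScores  ENDP
            END

; ---- MAIN.ASM: only uses the symbols, so EXTERNDEF acts as EXTERN ----
            .MODEL  small
            INCLUDE GLOBALS.INC
            .STACK
            .CODE
            .STARTUP
            call    FillScores
            mov     ax, scores      ; first element of the shared array
            .EXIT   0
            END
```
-
- Both modules include the same file unchanged, which is the main convenience
- of EXTERNDEF over a PUBLIC and EXTERN pair.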
-
-
- 8.2.2.2 Using PROTO
-
- When a procedure is defined in one module and called from another module, it
- must be declared public in the defining module and external in the calling
- modules; otherwise, assembly or linking errors occur.
-
- You have three methods for declaring a procedure public. Using PUBLIC and
- EXTERN was the only method available prior to MASM 6.0. Section 8.3.1
- explains the use of PUBLIC and EXTERN. The previous section (8.2.2.1)
- explains the use of EXTERNDEF. This section illustrates the use of PROTO.
-
- A PROTO (prototype) declaration in the include file establishes a
- procedure's interface in both the defining and calling modules. The PROTO
- directive automatically generates an EXTERNDEF for the procedure unless the
- procedure has been declared PRIVATE in the PROC statement. Defining a
- prototype enables type-checking for the procedure arguments.
-
- PROTO and INVOKE simplify procedure calls.
-
- Follow these steps to create an interface for a procedure defined in one
- module and called from other modules:
-
-
- 1. Place the PROTO declaration in the include file.
-
- 2. Define the procedure with PROC. The PROC directive declares the
- procedure PUBLIC by default.
-
- 3. Call the procedure with the INVOKE statement (or with CALL).
-
-
- The following example is a PROTO declaration for the far procedure
- CopyFile, which uses the C parameter-passing and naming conventions, and
- takes the arguments filename and numberlines. The diagram following the
- example shows the file placement for these statements. This definition goes
- into the include file:
-
- CopyFile PROTO FAR C filename:BYTE, numberlines:WORD
-
- The procedure definition for CopyFile is
-
- CopyFile PROC FAR C USES cx, filename:BYTE, numberlines:WORD
-
- To call the CopyFile procedure, you can use this INVOKE statement:
-
- INVOKE CopyFile, NameVar, 200
-
- (This figure may be found in the printed book.)
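-
- In place of the figure, the file placement for the three steps can be
- sketched as follows. This is a simplified, hypothetical example (the
- procedure AddWords and the file names are not from the original text), with
- the three source files shown in one listing and separated by comment
- banners.
-
```asm
; ---- MATH.INC (hypothetical include file): step 1, the prototype ----
AddWords    PROTO   NEAR C first:WORD, second:WORD

; ---- MATH.ASM: step 2, the definition (PUBLIC by default) ----
            .MODEL  small, C
            INCLUDE MATH.INC
            .CODE
AddWords    PROC    NEAR C first:WORD, second:WORD
            mov     ax, first
            add     ax, second      ; result is returned in AX
            ret
AddWords    ENDP
            END

; ---- MAIN.ASM: step 3, the call ----
            .MODEL  small, C
            INCLUDE MATH.INC
            .STACK
            .DATA
result      WORD    ?
            .CODE
            .STARTUP
            INVOKE  AddWords, 2, 3  ; arguments are checked against PROTO
            mov     result, ax
            .EXIT   0
            END
```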
-
- See Chapter 7, "Controlling Program Flow," for descriptions, syntax, and
- examples of PROTO, PROC, and INVOKE.
-
-
- 8.2.2.3 Using COMM
-
- Another way to share variables among modules is to add the COMM (communal)
- declaration to your include file. Since communal variables are allocated by
- the linker and cannot be initialized, you cannot depend on their location or
- sequence.
-
- Communal variables are supported by MASM primarily for compatibility with
- communal variables in Microsoft C. Communal variables are not used in any
- other Microsoft language, and they are not compatible with C++ and some
- other languages.
-
- Communal variables can reduce the size of executable files.
-
- COMM declares a variable external but cannot be used with code. COMM also
- instructs the linker to define the variable if it has not been explicitly
- defined in a module. The memory space for communal variables may not be
- assigned until load time, so using communal variables may reduce the size of
- your executable file.
-
- The COMM declaration has the syntax
-
- COMM [[langtype]] [[NEAR
- | FAR]] label:type«:count»
-
- The label is the name of the variable. The langtype sets the naming
- conventions for the name it precedes. It overrides any language specified in
- the .MODEL directive.
-
- If NEAR or FAR is not specified, the default is determined by the current
- memory model (NEAR for TINY, SMALL, COMPACT, and FLAT; FAR for MEDIUM,
- LARGE, and HUGE).
-
- The type can be a constant expression, but it is usually a type such as
- BYTE, WORD, or DWORD, or a structure, union, or record. If you first declare
- the type with TYPEDEF, CodeView can provide type information. The count is
- the number of elements. If no count is given, one element is assumed.
-
- The following example creates the communal far variable DataBlock, which is
- a 1,024-element array of uninitialized signed doublewords:
-
- COMM FAR DataBlock:SDWORD:1024
-
- ────────────────────────────────────────────────────────────────────────────
- NOTE
-
- C variables declared outside functions (except static variables) are
- communal unless explicitly initialized; they are the same as
- assembly-language communal variables. If you are writing assembly-language
- modules for C, you can declare the same communal variables in both C and
- MASM include files. However, communal variables in C do not have to be
- declared communal in assembler. The linker will match the EXTERN, PUBLIC,
- and COMM statements for the variable.
- ────────────────────────────────────────────────────────────────────────────
-
- EXTERNDEF is a flexible alternative to using COMM.
-
- EXTERNDEF (explained in the previous section) is more flexible than COMM
- because you can initialize variables defined with it, and you can use those
- variables in code that depends on the position and sequence of the data.
-
-
- 8.2.3 Positioning External Declarations
-
- Although LINK determines the actual address of an external symbol, the
- assembler assumes a default segment for the symbol, based on the location of
- the external directive in the source code. You should therefore position
- EXTERN and EXTERNDEF directives according to these rules:
-
-
- ■ If you know which segment defines an external symbol, put the EXTERN
- statement in that segment.
-
- ■ If you know the group but not the segment, position the EXTERN
- statement outside any segment and reference the variable with the
- group name. For example, if var1 is in DGROUP, you would reference
- the variable as mov DGROUP:var1, 10.
-
- ■ If you know nothing about the location of an external variable, put
- the EXTERN statement outside any segment. You can use the SEG
- operator to access the external variable like this:
-
- mov ax, SEG var1
- mov es, ax
- mov ax, es:var1
-
-
- ■ If the symbol is an absolute symbol or a far code label, you can
- declare it external anywhere in the source code.
-
-
- Always close opened segments.
-
- Any segments opened in include files should always be closed so that
- external declarations following an include statement are not incorrectly
- placed inside a segment. Any include statements in your program should
- immediately follow the .MODEL, OPTION, and processor directives.
-
- For the same reason, if you want to be certain that an external definition
- is outside a segment, you can use @CurSeg. The @CurSeg predefined symbol
- returns a blank if the definition is not in a segment. For example,
-
- .DATA
- .
- .
- .
- @CurSeg ENDS ; Close segment
- EXTERNDEF var:WORD
-
- See Section 1.2.3, "Predefined Symbols," for information about predefined
- symbols such as @CurSeg.
-
-
- 8.3 Using Alternatives to Include Files
-
- If your project uses only two modules (or if it is written with a version of
- MASM prior to 6.0), you may want to continue using PUBLIC in the defining
- module and EXTERN in the accessing module, and not create an include file
- for the project. The EXTERN directive can be used in an include file, but
- the include file containing EXTERN cannot be added to the module that
- contains the corresponding PUBLIC directive for that symbol. This section
- assumes that you are not using include files.
-
-
- 8.3.1 PUBLIC and EXTERN
-
- The PUBLIC and EXTERN directives are less flexible than EXTERNDEF and PROTO
- because they are module-specific: PUBLIC must appear in the defining module
- and EXTERN must appear in the calling modules. This section shows how to use
- PUBLIC and EXTERN. Information on where to place the external declarations
- in your file is in Section 8.2.3, "Positioning External Declarations."
-
- The PUBLIC directive makes a name visible outside the module in which it is
- defined. This gives other program modules access to that identifier.
-
- The EXTERN directive performs the complementary function. It tells the
- assembler that a name referenced within a particular module is actually
- defined and declared public in another module that will be specified at link
- time.
-
- A PUBLIC directive can appear anywhere in a file. Its syntax is
-
- PUBLIC [[langtype]] name[[, [[langtype]] name]]...
-
- The name must be the name of an identifier defined within the current source
- file. Only code labels, data labels, procedures, and numeric equates can be
- declared public.
-
- If you specify the langtype field here, it overrides the language specified
- by .MODEL. The langtype field can be C, SYSCALL, STDCALL, PASCAL, FORTRAN,
- or BASIC. Section 7.3.3, "Declaring Parameters with the PROC Directive," and
- Section 20.1, "Naming and Calling Conventions," provide more information on
- specifying langtype types.
-
- The EXTERN directive tells the assembler that an identifier is
- external─defined in some other module that will be supplied at link time.
- Its syntax is
-
- EXTERN «langtype» name:{ABS | qualifiedtype}
-
- Section 1.2.6, "Data Types," describes qualifiedtype. The ABS (absolute)
- keyword can be used only with external numeric constants. ABS causes the
- identifier to be imported as a relocatable unsized constant. This identifier
- can then be used anywhere a constant can be used. If the identifier is not
- found in another module at link time, the linker generates an error.
-
- In the following example, the procedure BuildTable and the variable Var
- are declared public. The procedure uses the Pascal naming and data-passing
- conventions:
-
- (This figure may be found in the printed book.)
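-
- The printed figure is not reproduced in this text. As a rough sketch only
- (not the original figure), and assuming a small-model program, the defining
- module might declare the two names public like this:
-
-     .MODEL small, pascal
-     PUBLIC BuildTable, Var   ; Make both names visible to other modules
-
-     .DATA
- Var WORD   0                 ; Public data label
-
-     .CODE
- BuildTable PROC              ; Public procedure, Pascal conventions
-     .
-     .
-     .
-     ret
- BuildTable ENDP
-     END
-
- Under the same assumptions, an accessing module would declare the names
- external before using them:
-
-     .MODEL small, pascal
-     .DATA
-     EXTERN Var:WORD          ; Defined and made public in another module
-     .CODE
-     EXTERN BuildTable:NEAR
-     .
-     .
-     .
-     call   BuildTable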
-
-
- 8.3.2 Other Alternatives
-
- You can also use the directives discussed earlier (EXTERNDEF, PROTO, and
- COMM) without the include file. In this case, place the declarations to make
- a symbol global in the same module where the symbol is defined. You might
- want to use this technique if you are linking only a few modules that have
- very little data in common.
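-
- For example, assuming a hypothetical word variable named count, a single
- EXTERNDEF statement in the defining module makes the symbol global without
- an include file:
-
-     .MODEL small
-     .DATA
-     EXTERNDEF count:WORD     ; Acts like PUBLIC: count is defined below
- count WORD 0
-
- Each accessing module repeats the same EXTERNDEF statement. Because count
- is not defined there, the statement acts like EXTERN in those modules.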
-
-
- 8.4 Developing Libraries
-
- As you create reusable procedures, you can place them in a library file for
- convenient access. Although you can put any routine into a library, each
- library usually contains related routines. For example, you might place
- string-manipulation functions in one library, matrix calculations in
- another, and port communications in another.
-
- A library consists of combined object modules, each created from a single
- source file. The object module is the smallest independent unit in a
- library. If you link with one symbol in a module, you get the entire module,
- but not the entire library.
-
- A library can consist of two files─an include file containing necessary
- declarations and constants and a .LIB file containing procedures already
- assembled into object code.
-
-
- 8.4.1 Associating Libraries with Modules
-
- You can choose either of two methods for associating your libraries with the
- modules that use them: you can use the INCLUDELIB directive inside your
- source files or link the modules from the command line.
-
- Specify library names with INCLUDELIB.
-
- To associate a specified library with your object code, use INCLUDELIB. You
- can add this directive to the source file to specify the libraries you want
- linked, rather than specifying them in the LINK command line. The INCLUDELIB
- syntax is
-
- INCLUDELIB libraryname
-
- The libraryname can be a file name or a complete path specification. If you
- do not specify an extension, .LIB is assumed. The libraryname is placed in
- the comment record of the object file. LINK reads this record and links with
- the specified library file.
-
- For example, the statement INCLUDELIB GRAPHICS passes a message from the
- assembler to the linker telling LINK to use library routines from the file
- GRAPHICS.LIB. If this statement is in the source file DRAW.ASM and
- GRAPHICS.LIB is in the same directory, the program can be assembled and
- linked with the following command line:
-
- ML DRAW.ASM
-
- Link libraries with command-line options.
-
- Without the INCLUDELIB directive, the program DRAW.ASM has to be linked with
- either of the following command lines:
-
- ML DRAW.ASM GRAPHICS.LIB
- ML DRAW /link GRAPHICS
-
- If you want to assemble and link separately, you can use
-
- ML /c DRAW.ASM
- LINK DRAW,,,GRAPHICS
-
- LINK searches in a specific order.
-
- If you do not specify a complete path in the INCLUDELIB statement or at the
- command line, LINK searches for the library file in the following order:
-
-
- 1. In the current directory
-
- 2. In any directories in the library field of the LINK command line
-
- 3. In any directories in the LIB environment variable
-
-
- The LIB utility provided with MASM 6.0 helps you create, organize, and
- maintain run-time libraries.
-
-
- 8.4.2 Using EXTERN with Library Routines
-
- In some cases, EXTERN helps you limit the size of your executable file by
- letting you specify an alternative name for a procedure. Use this form of
- the EXTERN directive when declaring a procedure or symbol that may not need
- to be used.
-
- The syntax looks like this:
-
- EXTERN «langtype» name «(altname)» :qualifiedtype
-
- The addition of the altname to the syntax provides the name of an alternate
- procedure that the linker uses to resolve the external reference if the
- procedure given by name is not needed. Both name and altname must have the
- same qualifiedtype.
-
- When the linker encounters an external definition for a procedure that gives
- an altname, the linker finishes processing that module before it links the
- object module that contains the procedure given by name. If the program does
- not reference any symbols in the object module that defines name from any of
- the linked modules, the linker uses altname to satisfy the external
- reference. This saves space because the library object module is not brought
- in.
-
- For example, assume that the contents of STARTUP.ASM include these
- statements:
-
- EXTERN init(dummy):PROC
- .
- .
- .
- dummy PROC
- .
- .
- . ; A procedure definition containing no
- ret ; executable code
-
- dummy ENDP
- .
- .
- .
- call init ; Defined in FLOAT.OBJ
-
- In this example, the reference to the routine init (defined in FLOAT.OBJ)
- does not force the module FLOAT.OBJ to be linked into the executable file.
- If another reference causes FLOAT.OBJ to be linked into the executable file,
- then init will refer to the init label in FLOAT.OBJ. If no reference forces
- FLOAT.OBJ to be loaded, the linker resolves init to its alternate name,
- dummy.
-
-
- 8.5 Related Topics in Online Help
-
- In addition to information covered in this chapter, information on the
- following topics can be found in online help.
-
- Topic                     Access
- ────────────────────────────────────────────────────────────────────────────
- LIB                       From the "Microsoft Advisor Contents"
-                           screen, choose "LIB" from the list of
-                           Microsoft Utilities
-
- INCLUDE, INCLUDELIB,      From the "MASM 6.0 Contents" screen,
- EXTERNDEF, COMM, and      choose "Directives," then "Scope and
- PUBLIC                    Visibility"
-
- TYPEDEF                   From the "MASM 6.0 Contents" screen,
-                           choose "Directives," then "Complex Data
-                           Types"
-
- PROTO and INVOKE          From the "MASM 6.0 Contents" screen,
-                           choose "Directives," then "Procedures
-                           and Code Labels"
-
- OPTION directive          From the "MASM 6.0 Contents" screen,
-                           choose "Directives," then "Miscellaneous"
-
- @CurSeg                   From the "MASM 6.0 Contents" screen,
-                           choose "Predefined Symbols"
-
- PWB Options menu          From the "Microsoft Advisor Contents"
-                           screen, choose "Programmer's WorkBench"
-
-
-
-
-
-
- Chapter 9 Using Macros
- ────────────────────────────────────────────────────────────────────────────
-
- A "macro" is a symbolic name you give to a series of characters (a text
- macro) or to one or more statements (a macro procedure or function). As the
- assembler evaluates each line of your program, it scans the source code for
- names of previously defined macros. When it finds one, it substitutes the
- macro text for the macro name. In this way, you can avoid writing the same
- code several places in your program.
-
- This chapter describes the following types of macros:
-
-
- ■ Text macros, which expand to text within a source statement
-
- ■ Macro procedures, which expand to one or more complete statements and
- can optionally take parameters
-
- ■ Repeat blocks, which generate a group of statements a specified number
- of times or until a specified condition becomes true
-
- ■ Macro functions, which look like macro procedures and can be used like
- text macros but which also return a value
-
- ■ Predefined macro functions and string directives, which perform string
- operations
-
-
- Macro processing is a text-processing mechanism that is done sequentially at
- assembly time. By the end of assembly, all macros have been expanded and the
- resulting text assembled into object code.
-
- This chapter shows how to use macros for simple code substitutions as well
- as how to write sophisticated macros with parameter lists and repeat loops.
- It also describes how to use these features in conjunction with local
- symbols, macro operators, and predefined macro functions.
-
-
- 9.1 Text Macros
-
- You can give a sequence of characters a symbolic name and then use the name
- in place of the text later in the source code. The named text is called a
- text macro.
-
- The syntax for defining a text macro is
-
- name TEXTEQU <text>
- name TEXTEQU macroId | textmacro
- name TEXTEQU %constExpr
-
-
- where text is a sequence of characters enclosed in angle brackets, macroId
- is a previously defined macro function (see Section 9.6), textmacro is a
- previously defined text macro, and %constExpr is an expression that
- evaluates to text. The use of angle brackets to delimit text is discussed in
- more detail in Section 9.3.1, and the % operator is explained in Section
- 9.3.2.
-
- Here are some examples:
-
- msg TEXTEQU <Some text> ; Text assigned to symbol
- string TEXTEQU msg ; Text macro assigned to symbol
- msg TEXTEQU <Some other text> ; New text assigned to symbol
- value TEXTEQU %(3 + num) ; Text representation of
- ; resolved expression assigned
- ; to symbol
-
- In the first line, text is assigned to the symbol msg. In the second line,
- the text of the msg text macro is assigned to a new text macro called
- string. In the third line, new text is assigned to msg. The result is that
- msg has the new text value, while string has the original text value. The
- fourth line assigns 7 to value if num equals 4. If a text macro
- expands to another text macro (or macro function, which is discussed in
- Section 9.6), the resulting text macro will be recursively expanded.
-
- Text macros are useful for naming strings of text that do not evaluate to
- integers. For example, you might use a text macro to name a floating-point
- constant or a bracketed expression. Here are some practical examples:
-
- pi TEXTEQU <3.1416> ; Floating point constant
- WPT TEXTEQU <WORD PTR> ; Sequence of key words
- arg1 TEXTEQU <[bp+4]> ; Bracketed expression
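-
- Assuming the three definitions above, the names can then be used wherever
- the text itself would be valid. The variable name area in this sketch is
- hypothetical:
-
- area REAL4 pi               ; Same as: area REAL4 3.1416
-     mov  ax, arg1           ; Same as: mov ax, [bp+4]
-     mov  WPT [bx], 0        ; Same as: mov WORD PTR [bx], 0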
-
- ────────────────────────────────────────────────────────────────────────────
- NOTE
-
- Use of the TEXTEQU directive to define text macros is new in MASM 6.0. In
- previous versions, you can use the EQU directive for the same purpose. If
- you have old code that worked under previous versions, it should still work
- under 6.0. However, the more consistent and flexible TEXTEQU is recommended
- for new code.
- ────────────────────────────────────────────────────────────────────────────
-
-
- 9.2 Macro Procedures
-
- If your program needs to perform the same task many times, you can avoid
- having to type the same statements each time by writing a macro procedure.
- Macro procedures (commonly called macros) can be seen as text-processing
- mechanisms that automatically generate repeated text.
-
- The term "macro procedure" rather than macro is used when necessary to
- distinguish between macro procedures and macro functions (a new feature of
- MASM 6.0 described in Section 9.6, "Returning Values with Macro Functions").
-
-
-
- 9.2.1 Creating Macro Procedures
-
- To define a macro procedure without parameters, place the desired statements
- between the MACRO and ENDM directives:
-
- name MACRO
- statements
- ENDM
-
- For example, suppose you want a program to beep when it encounters certain
- errors. A beep macro can be defined as follows:
-
- beep MACRO
- mov ah, 2 ;; Select DOS Print Char function
- mov dl, 7 ;; Select ASCII 7 (bell)
- int 21h ;; Call DOS
- ENDM
-
- Macro comments must start with two semicolons instead of one.
-
- The double semicolons mark the beginning of macro comments. Macro comments
- appear in a listing file only at the macro's initial definition, not at the
- point where it is called and expanded. Listings are usually easier to read
- if the comments aren't always expanded. Regular comments (those with a
- single semicolon) are listed in macro expansions. Appendix C discusses
- listing files and shows examples of how macros are expanded in listings.
-
- Once a macro is defined, you can call it anywhere in the program by using
- the macro's name as a statement. The following example calls the beep
- macro two times if an error flag has been set.
-
- .IF error ; If error flag is true
- beep ; execute macro two times
- beep
- .ENDIF
-
- The instructions in the macro take the place of the macro call when the
- program is assembled. This would be the resulting code (from the listing
- file):
-
- .IF error
- 0017 80 3E 0000 R 00 * cmp error, 000h
- 001C 74 0C * je @C0001
- beep
- 001E B4 02 1 mov ah, 2
- 0020 B2 07 1 mov dl, 7
- 0022 CD 21 1 int 21h
- beep
- 0024 B4 02 1 mov ah, 2
- 0026 B2 07 1 mov dl, 7
- 0028 CD 21 1 int 21h
- .ENDIF
- 002A *@C0001:
-
- Contrast this with the results of defining beep as a procedure using the
- PROC directive and then calling it using the CALL instruction. The
- instructions of the procedure occur only once in the executable file, but
- you would also have the additional overhead of the CALL and RET
- instructions.
-
- Macros are usually faster than run-time procedures.
-
- In some cases the same task can be done with either a macro or a procedure.
- Macros are potentially faster because they have less overhead, but they
- generate the same code multiple times rather than just once.
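-
- For comparison, a procedure version might look like the following sketch.
- The name beepproc is hypothetical and is used only to avoid a clash with
- the beep macro:
-
- beepproc PROC
-     mov  ah, 2              ; Select DOS Print Char function
-     mov  dl, 7              ; Select ASCII 7 (bell)
-     int  21h                ; Call DOS
-     ret                     ; Return to the caller
- beepproc ENDP
-     .
-     .
-     .
-     call beepproc           ; The instructions exist only once, but each
-     call beepproc           ; use costs a CALL and a RET at run time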
-
-
- 9.2.2 Passing Arguments to Macros
-
- Parameters allow macros to execute variations of a general task.
-
- By defining parameters for macros, you can define a general task and then
- execute variations of it by passing different arguments each time you call
- the macro. The complete syntax for a macro procedure includes a parameter
- list:
-
- name MACRO parameterlist
- statements
- ENDM
-
- The parameterlist can contain any number of parameters. Use commas to
- separate each parameter in the list. Parameter names cannot be reserved
- words unless the keyword has been disabled with OPTION NOKEYWORD, the
- compatibility modes have been set by specifying OPTION M510 (see Section
- 1.3.2), or the /Zm command-line option has been set.
-
- To pass arguments to a macro, place the arguments after the macro name when
- you call the macro:
-
- macroname arglist
-
- All text between matching quotation marks in an arglist is considered one
- text item.
-
- The beep macro introduced in the last section used the DOS interrupt to
- write the bell character (ASCII 7). It can be rewritten with a parameter to
- specify any character to write.
-
- writechar MACRO char
- mov ah, 2 ;; Select DOS Print Char function
- mov dl, char ;; Select ASCII char
- int 21h ;; Call DOS
- ENDM
-
- Wherever char appears in the macro definition, the assembler replaces it
- with the argument in the macro call. Each time you call writechar, you can
- print a different value:
-
- writechar 7 ; Causes computer to beep
- writechar 'A' ; Writes A to screen
-
- If you pass more arguments than there are parameters, the additional
- arguments generate a warning (unless you use the VARARG keyword; see Section
- 9.4.3). If you pass fewer arguments than the macro procedure expects,
- remaining parameters are assigned empty strings (unless default values have
- been specified). This may cause errors. For example, if you call the
- writechar macro with no argument, it generates the following:
-
- mov dl,
-
- The assembler generates an error for the expanded statement but not for the
- macro definition or the macro call.
-
- Macros can be made more flexible by leaving off macro arguments or adding
- additional ones. The next section tells some of the ways you can handle
- missing or extra arguments.
-
-
- 9.2.3 Specifying Required and Default Parameters
-
- You can specify required and default parameters for macros.
-
- You can give macro parameters special attributes to make them more flexible
- and improve error handling; you can make them required, give them default
- values, or vary their number. Because variable parameters are used almost
- exclusively with the FOR directive, discussion of them is postponed until
- Section 9.4.3, "FOR Loops and Variable-Length Parameters."
-
- The syntax for a required parameter is
-
- parameter:REQ
-
- For example, you can rewrite the writechar macro to require the char
- parameter:
-
- writechar MACRO char:REQ
- mov ah, 2 ;; Select DOS Print Char function
- mov dl, char ;; Select ASCII char
- int 21h ;; Call DOS
- ENDM
-
- If the call does not include a matching argument, the assembler reports the
- error in the line that contains the macro call. The effect of REQ is to
- improve error reporting.
-
- A default value fills in missing parameters.
-
- Another way to handle missing parameters is to specify a default value. The
- syntax is
-
- parameter:=textvalue
-
- Suppose that you often use writechar to beep by printing ASCII 7. The
- following macro definition uses an equal sign to tell the assembler to
- assume the parameter char is 7 unless you specify otherwise:
-
- writechar MACRO char:=<7>
- mov ah, 2 ;; Select DOS Print Char function
- mov dl, char ;; Select ASCII char
- int 21h ;; Call DOS
- ENDM
-
- In this case, char is not required. If you don't supply a value, the
- assembler fills in the blank with the default value of 7 and the macro
- beeps when called.
-
- The default parameter value is enclosed in angle brackets so that the
- supplied value will be recognized as a text value. Section 9.3.1, "Text
- Delimiters (< >) and the Literal-Character Operator (!)," explains this in
- more detail.
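-
- As a brief illustration, with the default in place both of the following
- calls to writechar are valid:
-
-     writechar               ; No argument: generates mov dl, 7
-     writechar 'A'           ; Generates mov dl, 'A'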
-
- Missing arguments can also be handled with the IFB, IFNB, .ERRB, and .ERRNB
- directives. They are described briefly in Section 1.3.3, "Conditional
- Directives," and in online help. Here is a slightly more complex macro that
- uses some of these techniques.
-
- Scroll MACRO distance:REQ, attrib:=<07h>, tcol, trow, bcol, brow
- IFNB <tcol> ;; Ignore arguments if blank
- mov cl, tcol
- ENDIF
- IFNB <trow>
- mov ch, trow
- ENDIF
- IFNB <bcol>
- mov dl, bcol
- ENDIF
- IFNB <brow>
- mov dh, brow
- ENDIF
- IFDIFI <attrib>, <bh> ;; Don't move BH onto itself
- mov bh, attrib
- ENDIF
- IF distance LE 0 ;; Negative scrolls up, positive down
- mov ax, 0600h + (-(distance) AND 0FFh)
- ELSE
- mov ax, 0700h + (distance AND 0FFh)
- ENDIF
- int 10h
- ENDM
-
- In this macro, the distance parameter is required. The attrib parameter
- has a default value of 07h (white on black), but the macro also tests to
- make sure the corresponding argument isn't BH, since it would be inefficient
- (though legal) to load a register onto itself. The IFNB directive is used to
- test for blank arguments. These are ignored to allow the user to manipulate
- rows and columns directly in registers CX and DX at run time.
-
- The following are two valid ways to call the macro:
-
- ; Assume DL and CL already loaded
- dec dh ; Decrement bottom row
- inc ch ; Increment top row
- Scroll -3 ; Scroll white on black dynamic
- ; window up three lines
- Scroll 5, 17h, 2, 2, 14, 12 ; Scroll white on blue constant
- ; window down five lines
-
- This macro can generate completely different code, depending on its
- arguments. In this sense, it is not comparable to a procedure, which always
- has the same code regardless of arguments.
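-
- As a sketch of this difference, working through the conditional directives
- by hand suggests that the two calls shown above expand to the following
- statements:
-
- ; Scroll -3
-     mov  bh, 07h
-     mov  ax, 0600h + (-(-3) AND 0FFh)
-     int  10h
-
- ; Scroll 5, 17h, 2, 2, 14, 12
-     mov  cl, 2
-     mov  ch, 2
-     mov  dl, 14
-     mov  dh, 12
-     mov  bh, 17h
-     mov  ax, 0700h + (5 AND 0FFh)
-     int  10h
-
- The first call loads none of the row or column registers because the
- corresponding arguments are blank.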
-
-
- 9.2.4 Defining Local Symbols in Macros
-
- You can make a symbol local to a macro by declaring it at the start of the
- macro with the LOCAL directive. Any identifier may be declared local.
-
- You can choose whether you want numeric equates and text macros to be local
- or global. If a symbol will be used only inside a particular macro, you can
- declare it local so that the name will be available for other declarations
- inside other macros or at the global level. On the other hand, it is
- sometimes convenient to define text macros and equates that are not local,
- so that their values can be shared between macros.
-
- If you need to use a label inside a macro, you must declare it local, since
- a label can occur only once in the source. The LOCAL directive makes a
- special instance of the label each time the macro is called. This prevents
- redefinition of the label.
-
- All local symbols must be declared immediately following the MACRO statement
- (although blank lines and comments may precede the local symbol). Separate
- each symbol with a comma. Comments are allowed on the LOCAL statement.
- Multiple LOCAL statements are also permitted. Here is an example macro that
- declares local labels:
-
- power MACRO factor:REQ, exponent:REQ
- LOCAL again, gotzero ;; Local symbols
- sub dx, dx ;; Clear top
- mov ax, 1 ;; Multiply by one on first loop
- mov cx, exponent ;; Load count
- jcxz gotzero ;; Done if zero exponent
- mov bx, factor ;; Load factor
- again:
- mul bx ;; Multiply running product by factor
- loop again ;; Result in AX
- gotzero:
- ENDM
-
- If the labels again and gotzero were not declared local, the macro would
- work the first time it is called, but it would generate redefinition errors
- on subsequent calls. MASM implements local labels by generating different
- names for them each time the macro is called. You can see this in listing
- files. The labels in the power macro might be expanded to ??0000 and
- ??0001 on the first call and to ??0002 and ??0003 on the second.
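-
- For instance, because the labels are local, the macro can be called more
- than once in the same module. In this sketch, the results noted in the
- comments follow from the loop in the macro, which also overwrites BX, CX,
- and DX:
-
-     power 2, 8              ; AX = 256
-     power 3, 4              ; AX = 81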
-
-
- 9.3 Assembly Time Variables and Macro Operators
-
- In writing macros, you will often assign and modify values assigned to
- symbols. These symbols can be thought of as assembly-time variables. Like
- memory variables, they are symbols that represent values. But since macros
- are processed at assembly time, any symbol modified in a macro must be
- resolved as a constant by the end of assembly.
-
- The three kinds of assembly-time variables are:
-
-
- ■ Macro parameters
-
- ■ Text macros
-
- ■ Macro functions
-
-
- When a macro is expanded, the symbols are processed in the order shown
- above. First macro parameters are replaced with the text of their actual
- arguments. Then text macros are expanded.
-
- Macro parameters are similar to procedure parameters in some ways, but they
- also have important differences. In a procedure, a parameter has a type and
- a memory location. Its value can be modified within the procedure. In a
- macro, a parameter is a placeholder for the argument text. The value can
- only be assigned to another symbol or used directly; it cannot be modified.
- The macro may interpret the argument text it receives either as a numeric
- value or as a text value.
-
- It is important to understand the difference between text values and numeric
- values. Numeric values can be processed with arithmetic operators and
- assigned to numeric equates. Text values can be processed with macro
- functions and assigned to text macros.
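-
- The following sketch illustrates the distinction. The names count and cstr
- are hypothetical:
-
- count =       5             ; Numeric value assigned to a numeric equate
- count =       count + 1     ; Arithmetic operator applied: count is now 6
- cstr  TEXTEQU %count        ; Text value "6" assigned to a text macro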
-
- Macro operators are often helpful when processing assembly-time variables.
- Table 9.1 shows the macro operators that MASM provides:
-
- Table 9.1 MASM Macro Operators
-
- Symbol  Name                        Description
- ────────────────────────────────────────────────────────────────────────────
- < >     Text Delimiters             Opens and closes a literal
-                                     string.
-
- !       Literal-Character Operator  Treats the next character as a
-                                     literal character, even if it
-                                     would normally have another
-                                     meaning.
-
- %       Expansion Operator          Causes the assembler to expand a
-                                     constant expression or text
-                                     macro.
-
- &       Substitution Operator       Tells the assembler to replace a
-                                     macro parameter or text macro
-                                     name with its actual value.
-
- ────────────────────────────────────────────────────────────────────────────
-
-
- The next sections explain these operators in detail.
-
-
- 9.3.1 Text Delimiters (< >) and the Literal-Character Operator (!)
-
- The angle brackets (< >) are text delimiters. The most common reason to
- delimit a text value is when assigning a text macro. You can do this with
- TEXTEQU, as previously shown, or with the SUBSTR and CATSTR directives
- discussed in Section 9.5, "String Directives and Predefined Functions."
-
- By delimiting the text of macro arguments, you can pass text that includes
- spaces, commas, semicolons, and other special characters. In the following
- example, assume you have previously defined a macro called work:
-
- work <1, 2, 3, 4, 5> ; Passes one argument
- ; with 13 characters, including
- ; commas and spaces
- work 1, 2, 3, 4, 5 ; Passes five arguments, each
- ; with 1 character
-
- Since angle brackets are delimiters, you can't include them as part of a
- delimited text value. The literal-character operator (!) can be used to
- override this limitation. It forces the assembler to treat the character
- following it literally rather than as a special character.
-
- errstr TEXTEQU <Expression !> 255> ; errstr = "Expression > 255"
-
- Text delimiters also have a special use with the FOR directive, as explained
- in Section 9.4.3.
-
-
- 9.3.2 Expansion Operator (%)
-
- The expansion operator (%) expands text macros or converts constant
- expressions into their text representations. It performs these tasks
- differently in different contexts, as discussed below.
-
-
- 9.3.2.1 The Expansion Operator with Constants
-
- The expansion operator can be used in any context where a text value is
- expected but a numeric value is supplied. In these contexts, it can be
- thought of as a conversion operator to convert numeric values to text
- values.
-
- The expansion operator forces immediate evaluation of a constant expression
- and replaces it with a text value consisting of the digits of the result.
- The digits are generated in the current radix (default decimal).
-
- This application of the expansion operator is useful when defining a text
- macro:
-
- a TEXTEQU <3 + 4> ; a = "3 + 4"
- b TEXTEQU %3 + 4 ; b = "7"
-
- When assigning text macros, numeric equates can be used in the constant
- expressions, but text macros cannot:
-
- num EQU 4 ; num = 4
- numstr TEXTEQU <4> ; numstr = <4>
- a TEXTEQU %3 + num ; a = <7>
- b TEXTEQU %3 + numstr ; b = <7>
-
- The expansion operator can be used when passing macro arguments. If you want
- the value rather than the text of an expression to be passed, use the
- expansion operator. Use of the expansion operator depends on whether you
- want the expression to be evaluated inside the macro on each use, or outside
- the macro once. The following macro
-
- work MACRO arg
- mov ax, arg * 4
- ENDM
-
- can be called with these statements:
-
- work 2 + 3 ; Passes "2 + 3"
- ; Code: mov ax, 2 + 3 * 4 (14)
- work %2 + 3 ; Passes 5
- ; Code: mov ax, 5 * 4 (20)
-
- Notice that because of operator precedence, results can vary depending on
- whether the expansion operator is used. Sometimes parentheses can be used
- inside the macro to force evaluation in a particular order:
-
- work MACRO arg
- mov ax, (arg) * 4
- ENDM
-
- work 2 + 3 ; Code: mov ax, (2 + 3) * 4 (20)
- work %2 + 3 ; Code: mov ax, (5) * 4 (20)
-
- This example generates the same code regardless of whether you pass the
- argument as a value or as text, but in some cases you need to specify how
- the argument is passed.
-
- The value for a default argument must be text, but frequently you need to
- give a constant value. The expansion operator is one way to force the
- conversion. The following statements are equivalent:
-
- work MACRO arg:=<07h>
- work MACRO arg:=%07h
-
- The expansion operator also has several uses with macro functions. See
- Section 9.6.
-
-
- 9.3.2.2 The Expansion Operator with Symbols
-
- When you use the expansion operator on a macro argument, any text macros or
- numeric equates in the argument are expanded:
-
- num EQU 4
- numstr TEXTEQU <4>
-
- work 2 + num ; Passes "2 + num"
- work %2 + num ; Passes "6"
- work 2 + numstr ; Passes "2 + numstr"
- work %2 + numstr ; Passes "6"
-
- The arguments can optionally be enclosed in parentheses. For example, these
- two statements are equivalent:
-
- work %2 + num
- work %(2 + num)
-
-
- 9.3.2.3 The Expansion Operator as the First Character on a Line
-
- The expansion operator has a different meaning when used as the first
- character on a line. In this case, it instructs the assembler to expand any
- text macros and macro functions it finds on the rest of the line.
-
- This feature makes it possible to use text macros with directives such as
- ECHO, TITLE, and SUBTITLE that take an argument consisting of a single text
- value. For instance, ECHO displays its argument to the standard output
- device during assembly. Such expansion can be useful for debugging macros
- and expressions, but the requirement that its argument be a single text
- value may have unexpected results:
-
- ECHO Bytes per element: %(SIZEOF array / LENGTHOF array)
-
- Instead of evaluating the expression, this line just echoes it:
-
- Bytes per element: %(SIZEOF array / LENGTHOF array)
-
- However, you can achieve the desired result by assigning the text of the
- expression to a text macro and then using the expansion operator at the
- beginning of the line to force expansion of the text macro.
-
- temp TEXTEQU %(SIZEOF array / LENGTHOF array)
- % ECHO Bytes per element: temp
-
- Note that you cannot get the same results by simply putting the % at the
- beginning of the first echo line, because % expands only text macros, not
- numeric equates or constant expressions.
-
- Here are more examples of the use of the expansion operator at the start of
- a line:
-
- ; Assume memmod, lang, and os are passed in with /D option
- % SUBTITLE Model: memmod Language: lang Operating System: os
-
- ; Assume num defined earlier
- tnum TEXTEQU %num
- % .ERRE num LE 255, <Failed because tnum !> 255>
-
-
- 9.3.3 Substitution Operator (&)
-
- In MASM 6.0, the substitution operator (&) enables substitution of macro
- parameters, even when the parameter occurs within a larger word or within a
- quoted string. It can also be used to concatenate two macro parameters after
- they have been expanded.
-
- The syntax for the substitution operator looks like this:
-
- &parametername&
-
- The operators delimiting a name always tell the assembler to substitute the
- actual argument for the name. However, the substitution operator is often
- optional. The substitution operator is not necessary when there is a space
- or separation character (comma, tab, or other operator) on that side. In the
- case of a parameter name inside a string, at least one substitution operator
- must appear.
-
- The rules for using the substitution operator have changed significantly
- since MASM 5.1, making macro behavior more consistent and flexible. If you
- have macros written for a previous version of MASM, you can specify the old
- behavior by using OLDMACROS or M510 with the OPTION directive (see Section
- 1.3.2).
-
- In the macro
-
- work MACRO arg
- mov ax, &arg& * 4
- ENDM
-
- the & symbols tell the assembler to replace arg with the value of the
- corresponding argument. However, the characters on both the right and left
- are spaces. Therefore, the operators are unnecessary. The macro would
- normally be written like this:
-
- work MACRO arg
- mov ax, arg * 4
- ENDM
-
- The substitution operator is used for one of the following reasons:
-
-
- ■ To paste together two parameter names or a parameter name and text
-
- ■ To indicate that a parameter name inside double or single quotation
- marks should be expanded rather than be treated as part of the quoted
- string
-
-
- This macro illustrates both uses:
-
- errgen MACRO num, msg
- PUBLIC err&num
- err&num BYTE "Error &num: &msg"
- ENDM
-
- When called with the following arguments,
-
- errgen 5, <Unreadable disk>
-
- the macro generates this code:
-
- PUBLIC err5
- err5 BYTE "Error 5: Unreadable disk"
-
- In the second line of the macro, the left & symbol must be provided because
- it is adjacent to the r character, which is a valid identifier symbol. The
- right & symbol is not needed because there is a space to the right of the
- m. The statement pastes the text err to the argument value 5 to generate
- the symbol err5.
-
- The substitution operator is used again inside quotation marks at the start
- of the parameter names num and msg to indicate that these names should
- be expanded. In this case, no pasting operation is necessary, so either
- operator could be omitted, but not both. The macro line could have been
- written as
-
- err&num BYTE "Error num&: msg&"
-
- or
-
- err&num BYTE "Error &num&: &msg&"
-
- The assembler processes substitution operators from left to right. This can
- have unexpected results when you are pasting together two macro parameters.
- For example, if arg1 has the value var and arg2 has the value 3, you
- could paste them together with this statement:
-
- &arg1&&arg2& BYTE "Text"
-
- Eliminating extra substitution operators, you might expect the following to
- be equivalent:
-
- &arg1&arg2 BYTE "Text"
-
- However, this actually produces the symbol vararg2 because in processing
- from left to right the assembler associates both the first and the second &
- symbols with the first parameter. The assembler replaces &arg1& by var ,
- producing vararg2 . The arg2 is never evaluated. The correct abbreviation
- is
-
- arg1&&arg2 BYTE "Text"
-
- which produces the desired symbol var3. The symbol arg1&&arg2 is replaced
- by var&arg2, which is replaced by var3.
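-
- As a minimal sketch of this rule, assume a hypothetical macro named defvar
- (it is not part of MASM or MACROS.INC) that builds a variable name from its
- first two parameters:
-
- defvar MACRO arg1, arg2, init ;; Hypothetical example macro
- arg1&&arg2 BYTE init ;; Paste the two arguments together
- ENDM
-
- Following the rules above, the call
-
- defvar var, 3, 0
-
- should generate
-
- var3 BYTE 0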
-
- The substitution operator is also necessary if you want a text macro
- substituted inside quotes. For example,
-
- arg TEXTEQU <hello>
- %echo This is a string "&arg" ; Produces: This is a string "hello"
- %echo This is a string "arg" ; Produces: This is a string "arg"
-
- The substitution operator can also be used in lines beginning with the
- expansion operator (%) symbol, even outside macros (see Section 9.3.2.3).
- Text macros are always expanded in such lines, but it may be necessary to
- use the substitution operator to paste text macro names to adjacent
- characters or symbol names, as shown below:
-
- text TEXTEQU <var>
- value TEXTEQU %5
- % ECHO textvalue is text&&value
-
- This echoes the message
-
- textvalue is var5
-
- Bit-test and macro expansion statements can be confused.
-
- The single ampersand (&) is the bit-test operator in MASM, as it is in C.
- This operator is also used in macro expansion as the substitution operator.
- Macro substitution always occurs before evaluation of the high-level control
- structures; therefore, in ambiguous cases, the & operator is treated as a
- macro-expansion character. You can always guarantee the correct use of the
- bit-test operator by enclosing the bit-test operands in parentheses. The
- example below illustrates these two uses.
-
- test MACRO x
- .IF ax==&x ; &x substituted with parameter value
- mov ax, 10
- .ELSEIF ax&(x) ; & is bitwise AND
- mov ax, 20
- .ENDIF
- ENDM
-
-
- 9.4 Defining Repeat Blocks with Loop Directives
-
- A "repeat block" is an unnamed macro defined with a loop directive. It
- generates the statements inside the repeat block a specified number of times
- or for as long as a given condition remains true.
-
- Several loop directives are available, providing different ways of
- specifying the number of iterations. Some loop directives also provide a way
- to specify arguments for each iteration. Although the number of iterations
- is usually specified in the directive, you can use the EXITM directive to
- exit from the loop early.
-
- Repeat blocks can be used outside macros, but they frequently appear inside
- macro definitions to perform some repeated operation in the macro.
-
- This section explains the following four loop directives: REPEAT, WHILE,
- FOR, and FORC. In previous versions of MASM, REPEAT was called REPT, FOR was
- called IRP, and FORC was called IRPC. MASM 6.0 still recognizes the old
- names.
-
- ────────────────────────────────────────────────────────────────────────────
- NOTE
-
- The REPEAT and WHILE directives should not be confused with the .REPEAT and
- .WHILE directives (see Section 7.2.1, "Loop-Generating Directives"), which
- generate loop and jump instructions for run-time program control.
- ────────────────────────────────────────────────────────────────────────────
-
-
- 9.4.1 REPEAT Loops
-
- Repeat loops are expanded at assembly time.
-
- The REPEAT directive is the simplest loop directive. It specifies the number
- of times to generate the statements inside the repeat block. The syntax is
-
- REPEAT constexpr
- statements
- ENDM
-
- The constexpr can be a constant or a constant expression, and must contain
- no forward references. Since the repeat block will be expanded at assembly
- time, the number of iterations must be known then.
-
- Here is an example of a repeat block used to generate data. It initializes
- an array containing sequential ASCII values for all uppercase letters.
-
- alpha LABEL BYTE ; Name the data generated
- letter = 'A' ; Initialize counter
- REPEAT 26 ;; Repeat for each letter
- BYTE letter ;; Allocate ASCII code for letter
- letter = letter + 1 ;; Increment counter
- ENDM
-
- Here is another use of REPEAT, this time inside a macro:
-
- beep MACRO iter:=<3>
- mov ah, 2 ;; Character output function
- mov dl, 7 ;; Bell character
- REPEAT iter ;; Repeat number specified by macro
- int 21h ;; Call DOS
- ENDM
- ENDM
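-
- The counter technique from the first example also works with calculated
- values. The following sketch, in which the names powers and val are
- hypothetical, generates a table of the first eight powers of two:
-
- powers LABEL WORD ; Name the data generated (hypothetical name)
- val = 1 ; Initialize value (hypothetical name)
- REPEAT 8 ;; Repeat for each table entry
- WORD val ;; Allocate current power of two
- val = val * 2 ;; Calculate next power
- ENDM
-
- The block expands to eight WORD definitions holding 1, 2, 4, and so on up
- to 128.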
-
-
- 9.4.2 WHILE Loops
-
- The WHILE directive is similar to REPEAT, but the loop continues as long as
- a given condition is true. The syntax is
-
- WHILE expression
- statements
- ENDM
-
- The expression must be a value that can be calculated at assembly time.
- Normally the expression uses relational operators, but it can be any
- expression that evaluates to zero (false) or nonzero (true). Usually, the
- condition changes during the evaluation of the macro so that the loop won't
- attempt to generate an infinite amount of code. However, you can use the
- EXITM directive to break out of the loop.
-
- Loops are especially useful for generating lookup tables.
-
- The following repeat block uses the WHILE directive to allocate variables
- initialized to calculated values. This is a common technique for generating
- lookup tables. Frequently it is faster to look up a value precalculated by
- the assembler at assembly time than to have the processor calculate the
- value at run time.
-
- cubes LABEL WORD ;; Name the data generated
- root = 1 ;; Initialize root
- cube = root * root * root ;; Calculate first cube
- WHILE cube LE 32767 ;; Repeat until result too large
- WORD cube ;; Allocate cube
- root = root + 1 ;; Calculate next root and cube
- cube = root * root * root
- ENDM
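-
- As a sketch of leaving a loop early with EXITM, the following block allows
- up to 20 iterations but stops as soon as the next square no longer fits in
- a byte. The names squares and n are hypothetical:
-
- squares LABEL BYTE ; Name the data generated (hypothetical name)
- n = 1 ; Initialize root (hypothetical name)
- WHILE n LE 20 ;; Upper limit on iterations
- IF n * n GT 255 ;; Next square too large for a byte?
- EXITM ;; Yes: leave the loop early
- ENDIF
- BYTE n * n ;; Allocate square
- n = n + 1 ;; Calculate next root
- ENDM
-
- Under these assumptions the loop ends when n reaches 16, after allocating
- the squares of 1 through 15.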
-
-
- 9.4.3 FOR Loops and Variable-Length Parameters
-
- With the FOR directive you can iterate through a list of arguments, doing
- some operation on each of them in turn. It has the following syntax:
-
- FOR parameter, <argumentlist>
- statements
- ENDM
-
- The parameter is a placeholder that will be used as the name of each
- argument inside the FOR block. The argument list must be a list of
- comma-separated arguments and must always be enclosed in angle brackets, as
- the following example illustrates:
-
- series LABEL BYTE
- FOR arg, <1,2,3,4,5,6,7,8,9,10>
- BYTE arg DUP (arg)
- ENDM
-
- On the first iteration, the arg parameter is replaced with the first
- argument, the value 1. On the second iteration arg is replaced with 2. The
- result is an array with the first byte initialized to 1, the next two bytes
- initialized to 2, the next three bytes initialized to 3, and so on.
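-
- The arguments need not be numbers. As a brief sketch, the following block
- generates one instruction for each register name in the list (the choice
- of registers is only an illustration):
-
- FOR reg, <ax, bx, cx, dx>
- push reg ;; Generates push ax, push bx, push cx, push dx
- ENDM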
-
- In this example the argument list is given specifically, but in some cases
- the list must be generated as a text macro. The value of the text macro must
- include the angle brackets.
-
- arglist TEXTEQU <!<3,6,9!>> ; Generate list as text macro
- FOR arg, arglist
- . ; Do something to arg
- .
- .
- ENDM
-
- Note the use of the literal character operator (!) to use angle brackets as
- characters, not delimiters (see Section 9.3.1).
-
- Variable parameter lists provide flexibility.
-
- The FOR directive also provides a convenient way to process macros with a
- variable number of arguments. To do this, add VARARG to the last parameter
- to indicate that a single named parameter will have the actual value of all
- additional arguments. For example, the following macro definition includes
- the three possible parameter attributes: required, default, and variable.
-
- work MACRO rarg:REQ, darg:=<5>, varg:VARARG
-
- The variable argument must always come last. If this macro is called with
- the statement
-
- work 5, , 6, 7, a, b
-
- the first argument is received as passed, the second is replaced by the
- default value 5, and the last four are received as the single argument <6,
- 7, a, b>. This is the same format expected by the FOR directive. The FOR
- directive discards leading spaces but recognizes trailing spaces.
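-
- The three attributes can be combined in a complete macro. The following is
- only a sketch; the macro deftab and its parameter names are hypothetical.
- It names a table and allocates count copies of each remaining argument:
-
- deftab MACRO tabname:REQ, count:=<1>, vals:VARARG ;; Hypothetical macro
- tabname LABEL BYTE ;; Name the table
- FOR val, <vals> ;; For each remaining argument
- BYTE count DUP (val) ;; allocate count copies
- ENDM
- ENDM
-
- deftab tab1, , 1, 2, 3 ; count defaults to 1: three bytes
- deftab tab2, 2, 7, 8 ; Two copies each of 7 and 8: four bytes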
-
- The following macro illustrates variable arguments:
-
- show MACRO chr:VARARG
- mov ah, 02h
- FOR arg, <chr>
- mov dl, arg
- int 21h
- ENDM
- ENDM
-
- When called with
-
- show 'O', 'K', 13, 10
-
- the macro displays each of the specified characters one at a time.
-
- The parameter in a FOR loop can have the required or default attribute. The
- show macro can be modified to make blank arguments generate errors:
-
- show MACRO chr:VARARG
- mov ah, 02h
- FOR arg:REQ, <chr>
- mov dl, arg
- int 21h
- ENDM
- ENDM
-
- The macro now generates an error if called with
-
- show 'O',, 'K', 13, 10
-
- Another approach would be to use a default argument:
-
- show MACRO chr:VARARG
- mov ah, 02h
- FOR arg:=<' '>, <chr>
- mov dl, arg
- int 21h
- ENDM
- ENDM
-
- Now if the macro is called with
-
- show 'O',, 'K', 13, 10
-
- it inserts the default character, a space, for the blank argument.
-
-
- 9.4.4 FORC Loops
-
- The FORC directive is similar to FOR but takes a string of text rather than
- a list of arguments. The statements are assembled once for each character
- (including spaces) in the string, substituting a different character for the
- parameter each time through.
-
- The syntax looks like this:
-
- FORC parameter, <text>
- statements
- ENDM
-
- The text must be enclosed in angle brackets. The following example
- illustrates FORC:
-
- FORC arg, <ABCDEFGHIJKLMNOPQRSTUVWXYZ>
- BYTE '&arg' ;; Allocate uppercase letter
- BYTE '&arg' + 20h ;; Allocate lowercase letter
- BYTE '&arg' - 40h ;; Allocate ordinal of letter
- ENDM
-
- Notice that the substitution operator must be used inside the quotation
- marks to make sure that arg is expanded to a character rather than treated
- as a literal string.
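-
- The substitution operator can also paste the current character onto
- adjacent text outside quotation marks. In this sketch (the parameter name
- r is arbitrary), each character is joined to the letter x to form a
- register name:
-
- FORC r, <abcd>
- push r&x ;; Generates push ax, push bx, push cx, push dx
- ENDM
-
- The & is required here because the parameter name is immediately followed
- by a valid identifier character.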
-
- With earlier versions of MASM, FORC was often used for complex parsing tasks.
- A long sentence can be examined character by character. Each character is
- then either thrown away or pasted onto a token string, depending on whether
- it is a separator character. In MASM 6.0, the predefined macro functions and
- string processing directives discussed in Section 9.5 are usually more
- efficient for these tasks.
-
-
- 9.5 String Directives and Predefined Functions
-
- Predefined macro string functions are new to MASM 6.0.
-
- The assembler provides the following directives for manipulating text:
- SUBSTR, INSTR, SIZESTR, and CATSTR. Each of these has a corresponding
- predefined macro function version: @SubStr, @InStr, @SizeStr, and @CatStr.
-
- You use the directive versions to assign a processed value to a text macro
- or numeric equate. For example, CATSTR, which concatenates a list of text
- values, can be used like this:
-
- num = 7
- newstr CATSTR <3 + >, %num, < = > , %3 + num ; "3 + 7 = 10"
-
- Assignment with CATSTR and SUBSTR works like assignment with the TEXTEQU
- directive. Assignment with SIZESTR and INSTR works like assignment with the
- = operator.
-
- The arguments to directives must be text values. Use the expansion operator
- to make sure that constants and numeric equates are expanded to text.
-
- The macro function versions are similar, but their arguments must be
- enclosed in parentheses. Macro functions return text values and can be used
- in any context where text is expected. Section 9.6 tells how to write your
- own macro functions. An equivalent statement to the previous example using
- CATSTR is
-
- num = 7
- newstr TEXTEQU @CatStr( <3 + >, %num, < = > , %3 + num )
-
- Although the directive version is simpler in the example above, the function
- versions are often convenient because they can be used as arguments to
- string directives or to other macro functions.
-
- Unlike the string directives, predefined macro function names are case
- sensitive. Since MASM is not case sensitive by default, the case doesn't
- matter unless you use the /Cp command-line option.
-
- The following sections summarize the syntax for each of the string
- directives and functions. The explanations focus on the directives, but the
- functions work the same except where noted.
-
-
- SUBSTR
-
- name SUBSTR string, start«, length»
- @SubStr( string, start«, length» )
-
- The SUBSTR directive assigns a substring from a given string to a new
- symbol, specified by name. Start specifies the position (1-based) in string
- to start the substring. Length specifies the length of the substring. If
- length is not given, it is assumed to be the remainder of the string
- including the start character. The string in the SUBSTR syntax, as well as
- in the syntax for the other string directives and predefined functions, can
- be any textItem, where textItem can be text enclosed in angle brackets
- (< >), the name of a macro, or a constant expression preceded by %
- (%constExpr).
-
-
- INSTR
-
- name INSTR «start,» string, substring
- @InStr( «start», string, substring )
-
- The INSTR directive searches a specified string for an occurrence of a given
- substring and assigns its position (1-based) to name. The search is case
- sensitive. Start is the position in string to start the search for
- substring. If start is not given, it is assumed to be 1 (the start of the
- string). If substring is not found, the position assigned to name is 0.
-
- If the INSTR directive is used, the position value is assigned to a name as
- if it were a numeric equate. If the @InStr function is used, the value is
- returned as a string of digits in the current radix.
-
- The @InStr function has a slightly different syntax than the INSTR
- directive. You can omit the first argument and its associated comma from the
- directive. You can leave the first argument blank with the function, but a
- blank function argument must still have a comma. For example,
-
- pos INSTR <person>, <son>
-
- is the same as
-
- pos = @InStr( , <person>, <son> )
-
- The return value could also be assigned to a text macro:
-
- strpos TEXTEQU @InStr( , <person>, <son> )
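-
- Because the function form returns text, it can supply the numeric
- arguments of SUBSTR. The following sketch splits a file name at its
- period. The names fname, fbase, and fext are hypothetical, and the example
- assumes the default radix:
-
- fname TEXTEQU <REPORT.ASM> ; Hypothetical file name
- fbase SUBSTR fname, 1, @InStr( , fname, <.> ) - 1 ; Text before period
- fext SUBSTR fname, @InStr( , fname, <.> ) + 1 ; Text after period
- % ECHO Base: fbase Extension: fext
-
- With these assumptions @InStr returns 7, so fbase should receive REPORT
- and fext should receive ASM.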
-
-
- SIZESTR
-
- name SIZESTR string
- @SizeStr( string )
-
- The SIZESTR directive assigns the number of characters in string to name. An
- empty string assigns a length of zero.
-
- If the SIZESTR directive is used, the size value is assigned to a name as if
- it were a numeric equate. If the @SizeStr function is used, the value is
- returned as a string of digits in the current radix.
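-
- The difference between the two forms can be sketched as follows. The names
- greet, glen, and glentxt are hypothetical:
-
- greet TEXTEQU <Hello> ; Hypothetical text macro
- glen SIZESTR greet ; Numeric equate with the value 5
- glentxt TEXTEQU @SizeStr( greet ) ; Text macro holding the digits
- % ECHO Length in characters: glentxt
-
- Here glen can be used wherever a constant is expected, while glentxt holds
- text that the expansion operator can substitute into the ECHO line.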
-
-
- CATSTR
-
- name CATSTR string«, string»...
- @CatStr( string«, string»... )
-
- The CATSTR directive concatenates a list of text values specified by string
- into a single text value and assigns it to name. TEXTEQU is technically a
- synonym for CATSTR. TEXTEQU is normally used for single-string assignments,
- while CATSTR is used for multistring concatenations.
-
- The following example that pushes and pops one set of registers illustrates
- several uses of string directives and functions:
-
- ; SaveRegs - Macro to generate a push instruction for each
- ; register in argument list. Saves each register name in the
- ; regpushed text macro.
- regpushed TEXTEQU <> ;; Initialize empty string
-
- SaveRegs MACRO regs:VARARG
- FOR reg, <regs> ;; Push each register
- push reg ;; and add it to the list
- regpushed CATSTR <reg>, <,>, regpushed
- ENDM ;; Strip off last comma
- regpushed CATSTR <!<>, regpushed ;; Mark start of list with <
- regpushed SUBSTR regpushed, 1, @SizeStr( regpushed )
- regpushed CATSTR regpushed, <!>> ;; Mark end with >
- ENDM
-
- ; RestoreRegs - Macro to generate a pop instruction for registers
- ; saved by the SaveRegs macro. Restores one group of registers.
-
- RestoreRegs MACRO
- LOCAL regs
- %FOR reg, regpushed ;; Pop each register
- pop reg
- ENDM
- ENDM
-
- Notice how the SaveRegs macro saves its result in the regpushed text
- macro for later use by the RestoreRegs macro. In this case, a text macro
- is used as a global variable. By contrast, the regs text macro is used
- only in RestoreRegs. It is declared LOCAL so that it won't take the name
- regs from the global name space. The MACROS.INC file provided with MASM 6.0
- includes expanded versions of these same two macros.
-
-
- 9.6 Returning Values with Macro Functions
-
- A macro function returns a text string.
-
- A macro function is a named group of statements that returns a value. When a
- macro function is called, its argument list must be enclosed in parentheses,
- even if the list is empty. The value returned is always text.
-
- Macro functions are new to MASM 6.0, as are several predefined macro
- functions for common tasks. The predefined macros include @Environ (see
- Section 1.2.3) and the string functions @SizeStr, @CatStr, @SubStr, and
- @InStr (discussed in the preceding section).
-
- Macro functions are defined in exactly the same way as macro procedures,
- except that a value must always be returned using the EXITM directive. Here
- is an example:
-
- DEFINED MACRO symbol:REQ
- IFDEF symbol
- EXITM <-1> ;; True
- ELSE
- EXITM <0> ;; False
- ENDIF
- ENDM
-
- This macro works like the defined operator in the C language. You can use it
- to test the defined state of several different symbols with a single
- statement, as shown below:
-
- IF DEFINED( DOS ) AND NOT DEFINED( XENIX )
- ;; Do something
- ENDIF
-
- Notice that the macro returns integer values as strings of digits, but the
- IF statement evaluates numeric values or expressions. There is no conflict
- because the value returned by the macro function is seen in the statement
- exactly as if the user had typed the values directly into the program:
-
- IF -1 AND NOT 0
-
-
- Returning Values with EXITM
-
- The return value must be text, a text equate name, or the result of another
- macro function. If a function must return a numeric value (such as a
- constant, a numeric equate, or the result of a numeric expression), it must
- first convert the value to text using angle brackets or the expansion
- operator (%). The DEFINED macro, for example, could have returned its value
- as
-
- EXITM %-1
-
- Although macro functions can include any legal statement, they seldom need
- to include instructions. This is because a macro function is expanded and
- its value returned at assembly time, while instructions are executed at run
- time.
-
- Here is another example of a macro function. It uses the WHILE directive to
- calculate factorials:
-
- factorial MACRO num:REQ
- LOCAL i, factor
- factor = num
- i = 1
- WHILE factor GT 1
- i = i * factor
- factor = factor - 1
- ENDM
- EXITM %i
- ENDM
-
- The integer result of the calculation is changed to a text string with the
- expansion operator (%). The factorial macro can be used to define data, as
- shown below:
-
- var WORD factorial( 4 )
-
- The effect of this statement is to initialize var with the number 24 (the
- factorial of 4).
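-
- A macro function can also return one of its own arguments as text. The
- following is a sketch only; @Max is a hypothetical function, not a
- predefined one, and it assumes both arguments are constant expressions:
-
- @Max MACRO first:REQ, second:REQ ;; Hypothetical macro function
- IF (first) GT (second)
- EXITM <first> ;; Return text of first argument
- ELSE
- EXITM <second> ;; Return text of second argument
- ENDIF
- ENDM
-
- bufsize = @Max( 128, 64 ) ; Same as: bufsize = 128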
-
-
- Using Macro Functions with Variable-Length Parameter Lists
-
- Macro functions can enhance FOR loops.
-
- You can use the FOR directive to handle macro parameters with the VARARG
- attribute. Section 9.4.3 explains how to do this in simple cases where the
- variable parameters are handled sequentially, from first to last. However,
- you may sometimes need to process the parameters in reverse order or
- nonsequentially. Macro functions make these techniques possible.
-
- You may need to know the number of arguments in a VARARG parameter. The
- following macro function handles this.
-
- @ArgCount MACRO arglist:VARARG
- LOCAL count
- count = 0
- FOR arg, <arglist>
- count = count + 1 ;; Count the arguments
- ENDM
- EXITM %count
- ENDM
-
- You could use this inside a macro that has a VARARG parameter, as shown
- below:
-
- work MACRO args:VARARG
- % ECHO Number of arguments is: @ArgCount( args )
- ENDM
-
- Another useful task might be to select an item from an argument list using
- an index to indicate which item. The following macro simplifies this.
-
- @ArgI MACRO index:REQ, arglist:VARARG
- LOCAL count, retstr
- retstr TEXTEQU <> ;; Initialize return string
- count = 0 ;; Initialize count
- FOR arg, <arglist>
- count = count + 1
- IF count EQ index ;; Item is found
- retstr TEXTEQU <arg> ;; Set return string
- EXITM ;; and exit the FOR loop
- ENDIF
- ENDM
- EXITM retstr ;; Exit function
- ENDM
-
- This function can be used as shown below:
-
- work MACRO args:VARARG
- % ECHO Third argument is: @ArgI( 3, args )
- ENDM
-
- Finally, you might need to process arguments in reverse order. The following
- macro returns a new argument list in reverse order.
-
- @ArgRev MACRO arglist:REQ
- LOCAL txt, arg
- txt TEXTEQU <>
- % FOR arg, <arglist>
- txt CATSTR <arg>, <,>, txt ;; Paste each onto list
- ENDM
- ;; Remove terminating comma
- txt SUBSTR txt, 1, @SizeStr( %txt ) - 1
- txt CATSTR <!<>, txt, <!>> ;; Add angle brackets
- EXITM txt
- ENDM
-
- You could call this function as shown below:
-
- work MACRO args:VARARG
- % FOR arg, @ArgRev( <args> ) ;; Process in reverse order
- ECHO arg
- ENDM
- ENDM
-
- These three macro functions are provided on the MASM distribution disk in
- the MACROS.INC include file.
-
-
- Macro Operators and Macro Functions
-
- This list summarizes the behavior of the expansion operator with macro
- functions.
-
-
- ■ If a macro function is not preceded by a %, it will be expanded.
- However, if it expands to a text macro or a macro function call, the
- result will not be expanded further.
-
- ■ If you use a macro function call as an argument for another macro
- function call, a % is not needed.
-
- ■ If a macro function expands to a text macro (or another macro
- function), the macro function will be recursively expanded.
-
- ■ If a macro function is called inside angle brackets and is preceded by
- %, it will be expanded.
-
-
-
- 9.7 Advanced Macro Techniques
-
- The concept of replacing macro names with predefined macro text is simple in
- theory, but it has many implications and complications. Here is a brief
- summary of some advanced techniques you can use in macros.
-
-
- 9.7.1 Nesting Macro Definitions
-
- Macros can define other macros or can be redefined. MASM does not process
- nested definitions until the outer macro has been called. Therefore, the
- inner macros cannot be called until the outer macro has been called. The
- nesting of macro definitions is limited only by memory.
-
- shifts MACRO opname ;; Macro generates macros
- opname&s MACRO operand:REQ, rotates:=<1>
- IF rotates LE 2 ;; One at a time is faster
- REPEAT rotates ;; for 2 or less
- opname operand, 1
- ENDM
- ELSE ;; Using CL is faster for
- mov cl, rotates ;; more than 2
- opname operand, cl
- ENDIF
- ENDM
- ENDM
-
- ; Call macro to make new macros
- shifts ror ; Generates rors
- shifts rol ; Generates rols
- shifts shr ; Generates shrs
- shifts shl ; Generates shls
- shifts rcl ; Generates rcls
- shifts rcr ; Generates rcrs
- shifts sal ; Generates sals
- shifts sar ; Generates sars
-
- This macro generates enhanced versions of the shift and rotate instructions.
- The macros could be called like this:
-
- shrs ax, 5
- rols bx, 3
-
- The macro versions handle multiple shifts by generating different code,
- depending on how many shifts are specified. The example above is optimized
- for the 8088 and 8086 processors. If you want to enhance for other
- processors, you can simply change the outer macro; it automatically changes
- all the inner macros. Code that uses the inner macros benefits from the
- enhancements but does not change so long as the macro interface doesn't
- change.
-
-
- 9.7.2 Testing for Argument Type and Environment
-
- Macros can check the type of arguments and generate different code depending
- on what they find. For example, you can use the OPATTR operator to determine
- if an argument is a constant, a register, or a memory operand.
-
- If you discover a constant value, you can often optimize the code. In some
- cases, you can generate better code for 0 or 1 than for other constants. If
- the argument is a memory operand, you know nothing about the value of the
- operand, since it may change at run time. However, you may want to generate
- different code depending on the operand size and on whether it is a pointer.
- Similarly, if the operand is a register, you know nothing of its contents,
- but you may be able to optimize if you can identify a particular register
- with the IFDIFI or IFIDNI directives.
-
- The following example illustrates some of these techniques. It loads a
- specified address into a specified offset register. The segment register is
- assumed to be DS.
-
- load MACRO reg:REQ, adr:REQ
- IF (OPATTR (adr)) AND 00010000y ;; Register
- IFDIFI reg, adr ;; Don't load register
- mov reg, adr ;; onto itself
- ENDIF
- ELSEIF (OPATTR (adr)) AND 00000100y
- mov reg, adr ;; Constant
- ELSEIF (TYPE (adr) EQ BYTE) OR (TYPE (adr) EQ SBYTE)
- mov reg, OFFSET adr ;; Bytes
- ELSEIF (SIZE (TYPE (adr))) EQ 2
- mov reg, adr ;; Near pointer
- ELSEIF (SIZE (TYPE (adr))) EQ 4
- mov reg, WORD PTR adr[0] ;; Far pointer
- mov ds, WORD PTR adr[2]
- ELSE
- .ERR <Illegal argument>
- ENDIF
- ENDM
-
- A macro may also generate different code depending on the assembly
- environment. The predefined text macro @Cpu can be used to test for
- processor type. The following example uses the more efficient constant
- variation of the PUSH instruction if the processor is an 80186 or higher.
-
- IF @Cpu AND 00000010y
- pushc MACRO op ;; 80186 or higher
- push op
- ENDM
- ELSE
- pushc MACRO op ;; 8088/8086
- mov ax, op
- push ax
- ENDM
- ENDIF
-
- Note that the example generates a completely different macro for the two
- cases. This is more efficient than testing the processor inside the macro
- and conditionally generating different code. With this macro, the
- environment is checked only once; if the conditional were inside the macro
- it would be checked every time the macro is called.
-
- You can test the language and operating system using the @Interface text
- macro. The memory model can be tested with the @Model, @DataSize, or
- @CodeSize text macros.
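-
- The same define-once technique can be applied to the memory model. The
- following sketch assumes that a .MODEL directive has already been given and
- that adr names a pointer variable of the model's default data pointer
- size. The macro name ldptr is hypothetical:
-
- IF @DataSize
- ldptr MACRO reg:REQ, adr:REQ ;; Far data (hypothetical macro)
- les reg, adr ;; Load segment into ES, offset into reg
- ENDM
- ELSE
- ldptr MACRO reg:REQ, adr:REQ ;; Near data
- mov reg, adr ;; Load offset only
- ENDM
- ENDIF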
-
- You can save the contexts inside macros with PUSHCONTEXT and POPCONTEXT. The
- options for these keywords are:
-
- Option    Description
- ────────────────────────────────────────────────────────────────────────────
- ASSUMES   Saves segment register information
- RADIX     Saves current default radix
- LISTING   Saves listing and CREF information
- CPU       Saves current CPU and processor
- ALL       All of the above
-
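- As a sketch of how a macro might protect its caller's settings, the
- following hypothetical macro (hexbyte is an illustrative name) switches to
- radix 16 for one data definition and then restores the previous radix:
-
- hexbyte MACRO val ;; Hypothetical example macro
- PUSHCONTEXT RADIX ;; Save the caller's default radix
- .RADIX 16 ;; Interpret val as hexadecimal
- BYTE val
- POPCONTEXT RADIX ;; Restore the saved radix
- ENDM
-
- hexbyte 7F ; Allocates a byte with the value 127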
-
- 9.7.3 Using Recursive Macros
-
- Macros can call themselves. In previous versions of MASM, recursion was an
- important technique for handling variable arguments. With MASM 6.0, you can
- do this much more cleanly using the FOR directive and the VARARG attribute,
- as described in Section 9.4.3. However, recursion is still available and may
- be useful for some macros.
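-
- As a minimal sketch of the recursive style, the following hypothetical
- macro (pushall is an illustrative name) pushes its first argument and then
- calls itself with the remaining arguments until none are left:
-
- pushall MACRO reg1, regs:VARARG ;; Hypothetical example macro
- IFNB <reg1> ;; Stop when no argument remains
- push reg1 ;; Handle the first argument
- pushall regs ;; Recurse on the rest of the list
- ENDIF
- ENDM
-
- pushall ax, bx, cx ; Generates push ax, push bx, push cx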
-
-
- 9.8 Related Topics in Online Help
-
- In addition to information covered in this chapter, information on the
- following topics can be found in online help. From the "MASM 6.0 Contents"
- screen:
-
- Topics                          Access
- ────────────────────────────────────────────────────────────────────────────
- INCLUDE                         Choose "Directives," and then "Scope and
-                                 Visibility"
-
- GOTO, PURGE                     Choose "Directives," and then "Macros and
-                                 Iterative Blocks"
-
- .LISTMACRO                      Choose "Directives," and then "Listing
-                                 Control"
-
- IFB, IFNB, IFDIFI,              Choose "Directives," and then
- and IFIDNI                      "Conditional Assembly"
-
- ECHO                            Choose "Directives," and then
-                                 "Miscellaneous"
-
- OPATTR                          Choose "Operators," and then
-                                 "Miscellaneous"
-
- @Cpu, @Interface, @DataSize,    Choose "Predefined Symbols"
- @Environ, and @CodeSize
-
- PUSHCONTEXT,                    Choose "Directives" and then
- POPCONTEXT                      "Iterative Blocks"
-
-
-
-
-
-
-
-
- Chapter 10 Managing Projects with NMAKE
- ────────────────────────────────────────────────────────────────────────────
-
- The Microsoft Program Maintenance Utility (NMAKE) is a sophisticated command
- processor that saves time and simplifies project management. Once you
- specify which project files depend on others, NMAKE automatically executes
- the commands needed to update your project when any project file has
- changed.
-
- The advantage of using NMAKE instead of simple batch files is that NMAKE
- recompiles only those files that need recompiling. NMAKE doesn't waste time
- with files that haven't changed since the last build. NMAKE also has
- advanced features (such as macros) that simplify managing complex projects.
-
-
- This chapter includes examples that show how each feature of NMAKE works. In
- addition, Section 10.9, "A Sample NMAKE Description File," shows how many of
- these features work together.
-
- If you are using the Microsoft Programmer's WorkBench (PWB) to build your
- project, PWB automatically creates a description file (called a "makefile"
- in the PWB documentation) and calls NMAKE to run the file. You may want to
- read this chapter if you intend to build your program outside of PWB or if
- you want to understand or modify a description file created by PWB.
-
- A utility called NMK allows you to use NMAKE to manage your project under
- DOS (or in a DOS session under OS/2). Section 10.11, "Using NMK," explains
- when and how to use NMK.
-
- If you are familiar with MAKE, the predecessor to NMAKE, be sure to read
- Section 10.10, "Differences between NMAKE and MAKE." These utilities differ
- in several important respects.
-
-
- 10.1 Overview of NMAKE
-
- NMAKE works by looking at the last times and dates of modification for a
- "target" file and its "dependents" and then comparing them. A target is
- usually a file you want to create, such as an executable file. A dependent
- is usually a file from which a target is created, such as a source file. A
- target is "out-of-date" if any of its dependents has changed more recently
- than the target.
-
- ────────────────────────────────────────────────────────────────────────────
- WARNING
-
- For NMAKE to work properly, the date and time setting on your system must be
- consistent relative to previous settings. If you set the date and time each
- time you start the system, be careful to set it accurately. If your system
- stores a setting, be certain that the battery is working.
- ────────────────────────────────────────────────────────────────────────────
-
- When you run NMAKE, it reads a "description file" that you supply. The
- description file consists of one or more description blocks. Each
- description block typically lists a target, the target's dependents, and the
- commands that build the target. NMAKE compares the last time the targets
- changed to the last time the dependents changed. If the modification time of
- any dependent is the same as or later than that of the target, NMAKE
- updates the target by executing the command or commands listed in the
- description block.
-
- NMAKE's main purpose is to help you update applications quickly and simply.
- However, it can execute any DOS or OS/2 command, so it is not limited to
- compiling and linking. NMAKE can also make backups, move files, and perform
- other project-management tasks that you ordinarily do at the
- operating-system prompt.
-
-
- 10.2 Running NMAKE
-
- You invoke NMAKE with the following syntax:
-
- NMAKE [[options]] [[macros]] [[targets]]
-
- The options field lists NMAKE options, which are described in Section 10.4,
- "Command-Line Options."
-
- The macros field lists macro definitions, which allow you to change text in
- the description file. The syntax for macros is described in "User-Defined
- Macros" in Section 10.3.4.1, "Macros."
-
- The targets field lists targets to build. NMAKE rebuilds only the targets
- listed on the command line. If you don't specify any targets, NMAKE builds
- only the first target in the description file. (This behavior departs
- significantly from that of MAKE. See Section 10.10, "Differences between
- NMAKE and MAKE.")
-
- NMAKE follows the instructions you specify in a description file.
-
- NMAKE searches the current directory for the name of a description file you
- specify with the /F option. It halts and displays an error message if the
- file does not exist. If you do not use the /F option to specify a
- description file, NMAKE searches the current directory for a description
- file named MAKEFILE. If MAKEFILE does not exist, NMAKE checks the command
- line for target files and tries to build them using predefined inference
- rules (either default or defined in TOOLS.INI). This feature lets you use
- NMAKE without a description file (as long as NMAKE has a predefined
- inference rule for the target). If the command line does not specify any
- target files, NMAKE halts and displays an error message.
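- 
- As a hedged illustration of these search rules, the two commands below show
- both cases. The filename SAMPLE.MAK is hypothetical, and the second command
- assumes that no file named MAKEFILE exists in the current directory and that
- NMAKE has a predefined inference rule for building an .OBJ file from an .ASM
- file:
- 
- NMAKE /F sample.mak
- NMAKE sample.obj
- 
- The first command reads the description file SAMPLE.MAK and halts if that
- file does not exist. The second command uses no description file and relies
- on an inference rule to build SAMPLE.OBJ.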
-
-
- Example
-
- NMAKE /S "program=sample" sort.exe search.exe
-
- This command supplies four arguments: an option (/S), a macro definition
- ("program=sample"), and two target specifications (sort.exe and
- search.exe).
-
- The command does not specify a description file, so NMAKE looks for the
- default description file, MAKEFILE. The /S option tells NMAKE not to display
- the commands as they are executed. (See Section 10.4, "Command-Line
- Options.") The macro definition performs a text substitution throughout the
- description file, replacing every instance of $(program) with sample. The
- target specifications tell NMAKE to update the targets SORT.EXE and
- SEARCH.EXE.
-
-
- 10.3 NMAKE Description Files
-
- The most important parts of a description file are the description blocks,
- which tell NMAKE how to build your project's target files. A description
- file can also contain comments, macros, inference rules, and directives.
- This section describes the elements of description files.
-
-
- 10.3.1 Description Blocks
-
- Description blocks form the heart of the description file. Figure 10.1
- illustrates a typical NMAKE description block, including the three sections:
- targets, dependents, and commands.
-
- (This figure may be found in the printed book.)
-
-
- 10.3.1.1 Targets
-
- The target is the file that you want to build.
-
- The targets section of the dependency line lists one or more files to build.
- The line that lists targets and dependents is called the "dependency line."
-
-
- The example in Figure 10.1 tells NMAKE how to build a single target,
- MYAPP.EXE, if it is missing or out-of-date. Although single targets are
- common, you can also list multiple targets in a single dependency line; you
- must separate each target name with a space. If the name of the last target
- before the colon (:) is one character long, put a space between the name and
- the colon, so NMAKE won't interpret the character as a drive specification.
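- 
- The two rules above can be sketched as follows. The target and dependent
- names are hypothetical, and the ECHO commands only stand in for real build
- commands:
- 
- # Two targets, separated by a space, share one dependency line
- first.exe second.exe : shared.obj
-     ECHO Updating $@
- 
- # A one-character target name is followed by a space before the colon
- x : x.obj
-     ECHO Updating $@
- 
- Without the space in the second dependency line, NMAKE could read x: as a
- drive specification rather than as a target name.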
-
-
- A target can appear in only one dependency line when specified as shown
- above. To update a target using more than one description block, specify two
- consecutive colons (::) between targets and dependents. For details, see
- Section 10.3.1.8, "Specifying a Target in Multiple Description Blocks."
-
- The target is usually a file, but it can also be a "pseudotarget," a name
- that lets you build groups of files or execute a group of commands. For more
- information, see Section 10.3.2, "Pseudotargets."
-
-
- 10.3.1.2 Dependents
-
- A dependent is a file used to build a target.
-
- The dependents section of the description block lists one or more files from
- which the target is built. A colon (:) separates it from the targets
- section. The example in Figure 10.1 lists three dependents after MYAPP.EXE:
-
-
- myapp.exe : myapp.obj another.obj myapp.def
-
- You can also specify the directories in which NMAKE should search for a
- dependent. Enclose one or more directory names in braces ( { } ). Separate
- multiple directories with a semicolon ( ; ). The syntax for a directory
- specification is
-
- {directory[[;directory...]]}dependent
-
-
- Example
-
- The following dependency line tells NMAKE to search the current directory
- first, then the specified directories:
-
- forward.exe : {\src\alpha;d:\proj}pass.obj
-
- In the line above, the target, FORWARD.EXE, has one dependent, PASS.OBJ. The
- directory list specifies two directories:
-
- {\src\alpha;d:\proj}
-
- NMAKE first searches for PASS.OBJ in the current directory. If PASS.OBJ
- isn't there, NMAKE searches the \SRC\ALPHA directory, then the D:\PROJ
- directory. If NMAKE cannot find a dependent in the current directory or a
- listed directory, it looks for a description block with a dependency line
- containing PASS.OBJ as a target, and uses the commands in that description
- block to create PASS.OBJ. If NMAKE cannot find such a description block, it
- looks for an inference rule that describes how to create the dependent. (See
- Section 10.3.5, "Inference Rules.")
-
-
- 10.3.1.3 Dependency Line
-
- The dependency line in Figure 10.1 tells NMAKE to rebuild the target
- MYAPP.EXE whenever MYAPP.OBJ, ANOTHER.OBJ, or MYAPP.DEF has changed more
- recently than MYAPP.EXE.
-
- The object files in the dependency list above would never be newer than the
- executable file (unless you had recompiled the source code before running
- NMAKE). So NMAKE checks to see if the object files themselves are targets in
- other dependency lists, and if any dependents in those lists are targets
- elsewhere, and so on.
-
- NMAKE continues moving through all dependencies this way to build a
- "dependency tree" that specifies all the steps required to fully update the
- target. If NMAKE then finds any dependents in the tree that are newer than
- the target, NMAKE updates the appropriate files and rebuilds the target.
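- 
- A minimal sketch of such a tree extends the dependency line from Figure
- 10.1. The source filenames MYAPP.ASM and ANOTHER.ASM and the ML /c commands
- are assumptions added for illustration; they do not appear in the figure:
- 
- myapp.exe : myapp.obj another.obj myapp.def
-     link myapp another.obj, , NUL, os2, myapp
- 
- myapp.obj : myapp.asm
-     ml /c myapp.asm
- 
- another.obj : another.asm
-     ml /c another.asm
- 
- If ANOTHER.ASM changes, NMAKE first reassembles ANOTHER.OBJ. ANOTHER.OBJ is
- then newer than MYAPP.EXE, so NMAKE also relinks the executable file.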
-
-
- 10.3.1.4 Commands
-
- The commands section can contain one or more commands.
-
- The commands section of the description block lists the commands that NMAKE
- should use to build the target. You can use any command that can be executed
- from the command line. The example in Figure 10.1 tells NMAKE to build
- MYAPP.EXE using the following LINK command:
-
-     link myapp another.obj, , NUL, os2, myapp
-
- Notice that the line is indented. NMAKE uses indentation to distinguish
- between a dependency line and a command line. A command line must be
- indented at least one space or tab. The dependency line must not be indented
- (it cannot start with a space or tab).
-
- Many targets are built with a single command, but you can place more than
- one command after the dependency line, each on a separate line, as shown in
- Figure 10.1.
-
- A long command can span several lines if each line ends with a backslash ( \
- ). A backslash at the end of a line is equivalent to a space on the command
- line. For example, the command
-
- echo abcd\
- efgh
-
- is equivalent to the command
-
- echo abcd efgh
-
- You can also place a command at the end of a dependency line. Use a
- semicolon (;) to separate the command from the rightmost dependent, as in
-
- project.exe : project.obj ; link project;
-
- OS/2 allows multiple commands on one command line.
-
- OS/2 allows you to combine two or more commands on a single command line
- with an ampersand (&). For example, the following command line is legal in
- an OS/2 description file:
-
- DIR & COPY sample.exe backup.exe
-
- A slight restriction is imposed on the use of the CD, CHDIR, and SET
- commands in OS/2 description files. NMAKE executes these commands itself
- rather than passing them to OS/2. Therefore, if any of these commands is the
- first command on a line, the remaining commands are not executed because
- they aren't passed to OS/2.
-
- The following multiple-command line does not display the directory listing
- because DIR is preceded by a CD command:
-
- CD \mydir & DIR
-
- To use CD, CHDIR, or SET in a description block, place these commands on
- separate lines:
-
- CD \mydir
- DIR
-
- NMAKE interprets a percent symbol (%) within a command line as the start of
- a file specifier. To use a literal percent symbol in a command line, specify
- it as a double percent symbol (%%). (See Section 10.3.8, "Extracting
- Filename Components.")
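- 
- For instance, a command that must pass a percent symbol to the operating
- system might be written as in this sketch. The pseudotarget name REPORT and
- the message text are hypothetical:
- 
- report :
-     ECHO Build is 100%% complete
- 
- NMAKE reduces the double percent symbol to a single literal percent symbol
- before it passes the command line to the operating system.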
-
-
- 10.3.1.5 Wild Cards
-
- You can use DOS and OS/2 wild-card characters (* and ?) to specify target
- and dependent filenames. NMAKE expands the wild cards when analyzing
- dependencies and when building targets. For example, the following
- description block links all files having the .OBJ extension in the current
- directory:
-
- project.exe : *.obj
-     LINK $**;
-
-
- 10.3.1.6 Command Modifiers
-
- Command modifiers are special prefixes attached to the command. They provide
- extra control over the commands in a description block. You can use more
- than one modifier for a single command. Table 10.1 describes the three NMAKE
- command modifiers.
-
- Table 10.1 Command Modifiers
-
- Character     Action
- ────────────────────────────────────────────────────────────────────────────
- @             Prevents NMAKE from displaying the command as it executes.
-               In the example below, the at sign (@) suppresses display of
-               the ECHO command line:
- 
-               sort.exe : sort.obj
-                   @ECHO Now sorting.
- 
-               The output of the ECHO command is not suppressed.
- 
- -[[number]]   Turns off error checking for the command. Spaces and tabs
-               can appear before the command. If the dash is followed by a
-               number, NMAKE checks the exit code returned by the command
-               and stops if the code is greater than the number. No space
-               or tab can appear between the dash and number. (See Section
-               10.12, "Using Exit Codes with NMAKE.")
- 
-               In the following example, if the program sample returns a
-               nonzero exit code, NMAKE does not stop but continues to
-               execute commands; if sort returns an exit code greater than
-               5, NMAKE stops:
- 
-               light.lst : light.txt
-                   -sample light.txt
-                   -5 sort light.txt
- 
- !             Executes the command for each dependent file if the command
-               preceded by the exclamation point uses the predefined macros
-               $** or $?. (See Section 10.3.4, "Macros.") The $** macro
-               refers to all dependent files in the description block. The
-               $? macro refers to all dependent files in the description
-               block that have a more recent modification time than the
-               target. For example,
- 
-               print : one.txt two.txt three.txt
-                   !print $** lpt1:
- 
-               generates the following commands:
- 
-               print one.txt lpt1:
-               print two.txt lpt1:
-               print three.txt lpt1:
- 
- ────────────────────────────────────────────────────────────────────────────
- ────────────────────────────────────────────────────────────────────────────
-
-
-
-
- 10.3.1.7 Using Special Characters as Literals
-
- You may need to specify as a literal character one of the characters that
- NMAKE uses for a special purpose. These characters are
-
- : ; # ( ) $ ^ \ { } ! @ -
-
- To use one of these characters literally, place a caret (^) in front of it.
- For example, suppose you define a macro that ends with a backslash:
-
- exepath=c:\bin\
-
- The line above is intended to define a macro named exepath with the value
- c:\bin\. But the second backslash has an unintended side effect. Since the
- backslash is NMAKE's line-continuation character, the line actually defines
- exepath as c:\bin, followed by whatever appears on the next line of the
- description file. You can avoid this problem by placing a caret in front of
- the second backslash:
-
- exepath=c:\bin^\
-
- You can also use a caret to insert a literal newline character in a string
- or macro:
-
- XYZ=abc^
- def
-
- The caret tells NMAKE to interpret the newline character as part of the
- macro, not a line break. Note that this effect differs from using a
- backslash ( \ ) to continue a line. A newline character that follows a
- backslash is replaced with a space.
-
- NMAKE ignores carets that precede characters other than the special
- characters listed above. The line
-
- ign^ore : these ca^rets
-
- is interpreted as
-
- ignore : these carets
-
- A caret within a quoted string is treated as a literal caret character.
-
-
- 10.3.1.8 Specifying a Target in Multiple Description Blocks
-
- You can specify a target in more than one description block by placing two
- colons (::) after the target. This feature is useful for building a complex
- target, such as a library, that contains components created with different
- commands. For example,
-
- target.lib :: a.asm b.asm c.asm
-     ML a.asm b.asm c.asm
-     LIB target -+a.obj -+b.obj -+c.obj;
- target.lib :: d.c e.c
-     CL /c d.c e.c
-     LIB target -+d.obj -+e.obj;
-
- Both description blocks update the library named TARGET.LIB. If any of the
- assembly-language files have changed more recently than the library, NMAKE
- executes the commands in the first block to assemble the source files and
- update the library. Similarly, if any of the C-language files have changed,
- NMAKE executes the second group of commands to compile the C files and
- update the library.
-
- If you use a single colon in the example above, NMAKE issues an error
- message. It is legal, however, to repeat a target with single colons if
- commands appear in only one of the blocks. In this case, dependency lines
- are cumulative. For example,
-
-
- target : jump.bas
- target : up.c
-     echo Building target...
-
- is equivalent to
-
- target : jump.bas up.c
-     echo Building target...
-
- No commands can appear between cumulative dependency lines, but blank lines,
- comment lines, macro definitions, and directives can appear.
-
-
- 10.3.2 Pseudotargets
-
- A "pseudotarget" is similar to a target, but it is not a file. It is a name
- used as a label for executing a group of commands. In the following example,
- UPDATE is a pseudotarget.
-
- UPDATE : *.*
-     !COPY $** a:\product
-
- NMAKE always considers the pseudotarget to be out-of-date. In the previous
- example, NMAKE copies all the dependent files to the specified drive and
- directory.
-
- Like target names, pseudotarget names are not case sensitive.
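- 
- The UPDATE example above runs a group of commands. A pseudotarget can also
- name a group of files to build, as in the following sketch. The pseudotarget
- name ALL and the filenames are illustrative assumptions:
- 
- all : sort.exe search.exe
- 
- sort.exe : sort.obj
-     LINK sort.obj;
- 
- search.exe : search.obj
-     LINK search.obj;
- 
- Because ALL is the first target in this description file, running NMAKE
- with no targets on the command line brings both executable files up to date.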
-
-
- 10.3.3 Comments
-
- You can place comments in a description file by preceding them with a number
- sign (#):
-
- # Comment on line by itself
- OPTIONS = /MAP # Comment on macro's line
- all.exe : one.obj two.obj # Comment on dependency line
-     link $(OPTIONS) one.obj two.obj;
-
- A comment extends to the end of the line in which it appears. Command lines
- (and dependency lines containing commands) cannot contain comments.
-
- To specify a literal #, precede it with a caret (^ ), as in the following:
-
- DEF=^#define #Macro representing a C preprocessing directive
-
-
- 10.3.4 Macros
-
- Macros offer a convenient way to replace a particular string in the
- description file with another string. Macros are useful for a variety of
- tasks, including the following:
-
-
- ■ Creating a single description file that works for several projects.
- You can define a macro that replaces a dummy filename in the
- description file with the specific filename for a particular project.
-
- ■ Controlling the options NMAKE passes to the compiler or linker. When
- you specify options in a macro, you can change options throughout the
- description file in a single step.
-
-
- You can define your own macros or use predefined macros. This section
- describes user-defined macros first.
-
-
- 10.3.4.1 User-Defined Macros
-
- You can define a macro with this syntax:
-
- macroname=string
-
- The macroname can be any combination of letters, digits, and the underscore
- ( _ ) character. Macro names are case sensitive. NMAKE interprets MyMacro
- and MYMACRO as different macro names.
-
- The string can be any sequence of zero or more characters. (A string of zero
- characters is called a "null string." A string consisting only of spaces,
- tabs, or both is also considered a null string.) For example,
-
- linkcmd=LINK /map
-
- defines a macro named linkcmd and assigns it the string LINK /map.
-
- You can define macros in the description file, on the command line, in a
- command file (see Section 10.5, "NMAKE Command File"), or in TOOLS.INI (see
- Section 10.6, "The TOOLS.INI File"). Each macro defined in the description
- file must appear on a separate line. The line cannot start with a space or
- tab.
-
- When you define a macro in the description file, NMAKE ignores spaces on
- either side of the equal sign. The string itself can contain embedded
- spaces. You do not need to enclose string in quotation marks (if you do,
- they become part of the string).
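- 
- The following sketch illustrates these rules. The macro name banner and its
- value are hypothetical. Either of the first two definitions gives linkcmd
- the same value, so a description file would normally contain only one of
- them:
- 
- # These two definitions are equivalent
- linkcmd=LINK /map
- linkcmd = LINK /map
- 
- # Here the quotation marks become part of the value
- banner = "Sample build"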
-
- Slightly different rules apply when you define a macro on the command line
- or in a command file. The command-line parser treats spaces as argument
- delimiters. Therefore, the string itself, or the entire macro, must be
- enclosed in double quotation marks if it contains embedded spaces. All three
- forms of the following command-line macro are legal and equivalent:
-
- NMAKE program=sample
- NMAKE "program=sample"
- NMAKE "program = sample"
-
- The macro program is passed to NMAKE, with an assigned value of sample.
-
- If the string contains spaces, either the string or the entire macro must
- appear within quotes. Either form of the following command-line macro is
- allowed:
-
- NMAKE linkcmd="LINK /map"
- NMAKE "linkcmd=LINK /map"
-
- However, the following form of the same macro is not allowed. It contains
- spaces that are not enclosed by quotation marks:
-
- NMAKE linkcmd = "LINK /map"
-
- A macro name can be given a null value. Both of the following definitions
- assign a null value to the macro linkoptions:
-
- NMAKE linkoptions=
- NMAKE linkoptions=" "
-
- A macro name can be "undefined" with the !UNDEF preprocessing directive (see
- Section 10.3.7, "Preprocessing Directives"). Assigning a null value to a
- macro name does not undefine it; the name is still defined, but with a null
- value.
-
- A macro can be followed by a comment, using the syntax described in the
- preceding section on comments.
-
-
- 10.3.4.2 Using Macros
-
- Use a macro by enclosing its name in parentheses preceded by a dollar sign
- ($). For example, you can use the linkcmd macro defined above by
- specifying
-
- $(linkcmd)
-
- NMAKE replaces every occurrence of $(linkcmd) with LINK /map.
-
- The following description file defines and uses three macros:
-
- program=sample
- L=LINK
- options=
-
- $(program).exe : $(program).obj
-     $(L) $(options) $(program).obj;
-
- NMAKE interprets the description block as
-
- sample.exe : sample.obj
-     LINK sample.obj;
-
- NMAKE replaces every occurrence of $(program) with sample, every instance
- of $(L) with LINK, and every instance of $(options) with a null string.
-
-
- An undefined macro is replaced by a null string.
-
- If you use as a macro a name that has never been defined, or was undefined,
- NMAKE treats that name as a null string. No error occurs.
-
- To use the dollar sign ($) as a literal character, specify two dollar signs
- ($$).
-
- The parentheses are optional if macroname is a single character. For
- example, $L is equivalent to $(L). However, parentheses are recommended
- for consistency.
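- 
- The short description file below sketches these three rules together. The
- macro name debugopts is hypothetical and is deliberately left undefined, and
- the ECHO text is only an illustration:
- 
- L=LINK
- 
- sample.exe : sample.obj
-     $L $(debugopts) sample.obj;
-     ECHO Charge is $$5
- 
- Here $L expands to LINK, $(debugopts) expands to a null string without
- causing an error, and $$5 passes the literal text $5 to the ECHO command.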
-
-
- 10.3.4.3 Special Macros
-
- NMAKE provides several special macros to represent various filenames and
- commands. One use for these macros is in predefined inference rules. (See
- Section 10.3.5.4.) Like user-defined macro names, special macro names are
- case sensitive. For example, NMAKE interprets CC and cc as different
- macro names.
-
- Tables 10.2 through 10.5 summarize the four categories of special macros.
- The filename macros offer a convenient representation of filenames from a
- dependency line; these are listed in Table 10.2. The recursion macros,
- listed in Table 10.3, allow you to call NMAKE from within your description
- file. Tables 10.4 and 10.5 describe the command macros and options macros
- that make it convenient for you to invoke the Microsoft language compilers.
-
-
- The filename macros conveniently represent filenames from the dependency
- line.
-
- Table 10.2 lists macros that are predefined to represent file names. As with
- all one-character macros, these do not need to be enclosed in parentheses.
- (The $$@ and $** macros are exceptions to the parentheses rule for macros;
- they do not require parentheses even though they contain two characters.)
- Note that the macros in Table 10.2 represent filenames as you have specified
- them in the dependency line, and not the full specification of the filename.
-
-
- Table 10.2 Filename Macros
-
- Macro
- Reference   Meaning
- ────────────────────────────────────────────────────────────────────────────
- $@          The current target's full name, as currently specified. This
-             is not necessarily the full path name.
- 
- $*          The current target's full name minus the file extension.
- 
- $**         The dependents of the current target.
- 
- $?          The dependents that have a more recent modification time than
-             the current target.
- 
- $$@         The target that NMAKE is currently evaluating. You can use
-             this macro only to specify a dependent.
- 
- $<          The dependent file that has a more recent modification time
-             than the current target (evaluated only for inference rules).
- 
- ────────────────────────────────────────────────────────────────────────────
-
-
-
- The example below uses the $? macro, which represents all dependents that
- are more recent than the target. The ! command modifier causes NMAKE to
- execute a command once for each dependent in the list (see Table 10.1). As a
- result, the LIB command is executed up to three times, each time replacing a
- module with a newer version.
-
- trig.lib : sin.obj cos.obj arctan.obj
-     !LIB trig.lib -+$?;
-
- In the next example, NMAKE updates files in another directory by replacing
- them with files of the same name from the current directory. The $@ macro is
- used to represent the current target's full name:
-
- #Files in objects directory depend on versions in current directory
- DIR=c:\objects
- $(DIR)\globals.obj : globals.obj
-     COPY globals.obj $@
- $(DIR)\types.obj : types.obj
-     COPY types.obj $@
- $(DIR)\macros.obj : macros.obj
-     COPY macros.obj $@
-
- Macro modifiers specify parts of the predefined filename macros.
-
- You can append one of the modifiers in the following list to any of the
- filename macros to extract part of a filename. If you add one of these
- modifiers to the macro, you must enclose the macro name and the modifier in
- parentheses.
-
- Modifier   Resulting Filename Part
- ────────────────────────────────────────────────────────────────────────────
- D          Drive plus directory
- 
- B          Base name
- 
- F          Base name plus extension
- 
- R          Drive plus directory plus base name
-
- For example, assume that $@ has the value C:\SOURCE\PROG\SORT.OBJ. The
- following list shows the effect of combining each modifier with $@:
-
- Macro Reference   Value
- ────────────────────────────────────────────────────────────────────────────
- $(@D)             C:\SOURCE\PROG
- 
- $(@F)             SORT.OBJ
- 
- $(@B)             SORT
- 
- $(@R)             C:\SOURCE\PROG\SORT
-
- If $@ has the value SORT.OBJ without a preceding directory, the value of
- $(@R) is just SORT, and the value of $(@D) is a dot (.) to represent the
- current directory.
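- 
- As a sketch of how the modifiers might appear in a description block, the
- commands below display parts of the target name used in the list above. The
- dependent SORT.ASM and the ECHO text are assumptions for illustration:
- 
- C:\SOURCE\PROG\SORT.OBJ : sort.asm
-     ECHO Directory is $(@D)
-     ECHO Filename is $(@F)
-     ECHO Base name is $(@B)
- 
- The three macro references expand to C:\SOURCE\PROG, SORT.OBJ, and SORT,
- matching the values listed above.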
-
- Recursion macros let you use NMAKE to call NMAKE.
-
- Table 10.3 lists three macros that you can use when you want to call NMAKE
- recursively from within a description file.
-
- Table 10.3 Recursion Macros
-
- Macro
- Reference      Meaning
- ────────────────────────────────────────────────────────────────────────────
- $(MAKE)        The name used to call NMAKE recursively. The line on which
-                it appears is executed even if the /N command-line option
-                is specified.
- 
- $(MAKEDIR)     The directory from which NMAKE is called.
- 
- $(MAKEFLAGS)   The NMAKE options currently in effect. This macro is passed
-                automatically when you call NMAKE recursively. You cannot
-                redefine this macro. Use the preprocessing directive
-                !CMDSWITCHES to update the MAKEFLAGS macro. (See Section
-                10.3.7, "Preprocessing Directives.")
- 
- ────────────────────────────────────────────────────────────────────────────
-
-
- To call NMAKE recursively, use the command
-
- $(MAKE) /$(MAKEFLAGS)
-
- The MAKE macro is useful for building different versions of a program. The
- following description file calls NMAKE recursively to build targets in the
- \VERS1 and \VERS2 directories.
-
- all : vers1 vers2
- 
- vers1 :
-     cd \vers1
-     $(MAKE)
-     cd ..
- 
- vers2 :
-     cd \vers2
-     $(MAKE)
-     cd ..
-
- The example changes to the \VERS1 directory and then calls NMAKE
- recursively, causing NMAKE to process the file MAKEFILE in that directory.
- Then it changes to the \VERS2 directory and calls NMAKE again, processing
- the file MAKEFILE in that directory.
-
- You can add options to the ones already in effect for NMAKE by following the
- MAKE macro with the options in the same syntax as you would specify them on
- the command line. You can also pass the name of a description file with the
- /F option instead of using a file named MAKEFILE.
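- 
- For example, the VERS1 block from the description file above could be
- changed as in this sketch. The description filename VERS1.MAK is
- hypothetical; /S and /F are the options introduced in Section 10.2:
- 
- vers1 :
-     cd \vers1
-     $(MAKE) /S /F vers1.mak
-     cd ..
- 
- The recursive call adds the /S option to the options already in effect and
- processes VERS1.MAK instead of a file named MAKEFILE.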
-
- Deeply recursive build procedures can exhaust NMAKE's run-time stack,
- causing an error. If this occurs, use the EXEHDR utility to increase NMAKE's
- run-time stack. The following command, for example, gives NMAKE.EXE a stack
- size of 16,384 (0x4000) bytes:
-
- exehdr /stack:0x4000 nmake.exe
-
- Command macros are shortcut calls to Microsoft compilers.
-
- NMAKE defines several macros to represent commands for Microsoft products.
- (See Table 10.4.) You can use these macros as commands in a description
- block, or invoke them using a predefined inference rule. (See Section
- 10.3.5, "Inference Rules.") You can redefine these macros to represent part
- or all of a command line, including options.
-
- Table 10.4 Command Macros
-
- Macro Reference   Command Action                            Predefined Value
- ─────────────────────────────────────────────────────────────────────────────
- $(AS)             Invokes the Microsoft Macro Assembler     AS=ml
- $(BC)             Invokes the Microsoft Basic Compiler      BC=bc
- $(CC)             Invokes the Microsoft C Compiler          CC=cl
- $(COBOL)          Invokes the Microsoft COBOL Compiler      COBOL=cobol
- $(FOR)            Invokes the Microsoft FORTRAN Compiler    FOR=fl
- $(PASCAL)         Invokes the Microsoft Pascal Compiler     PASCAL=pl
- $(RC)             Invokes the Microsoft Resource Compiler   RC=rc
- ─────────────────────────────────────────────────────────────────────────────
-
-
-
- Options macros pass preset options to Microsoft compilers.
-
- The macros in Table 10.5 are used by NMAKE to represent options to be passed
- to the commands for Microsoft languages. By default, these macros are
- undefined. You can define them to mean the options you want to pass to the
- commands. Whether or not they are defined, the macros are used automatically
- in the predefined inference rules. If the macros are undefined, or if they
- are defined to be null strings, a null string is generated in the command
- line. (See Section 10.3.5.4, "Predefined Inference Rules.")
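- 
- The command and options macros can also be used directly in a description
- block, as in the following sketch. The filenames are hypothetical, and the
- example assumes the ML options /Zi (include debugging information) and /c
- (assemble without linking):
- 
- AFLAGS = /Zi
- 
- sample.obj : sample.asm
-     $(AS) $(AFLAGS) /c sample.asm
- 
- With the predefined value AS=ml, NMAKE expands the command to
- ml /Zi /c sample.asm. Changing the definition of AFLAGS changes the options
- for every command that uses the macro.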
-
- Table 10.5 Options Macros
-
- Macro Reference   Passed to
- ────────────────────────────────────────────────────────────────────────────
- $(AFLAGS)         Microsoft Macro Assembler
- $(BFLAGS)         Microsoft Basic Compiler
- $(CFLAGS)         Microsoft C Compiler
- $(COBFLAGS)       Microsoft COBOL Compiler
- $(FFLAGS)         Microsoft FORTRAN Compiler
- $(PFLAGS)         Microsoft Pascal Compiler
- $(RFLAGS)         Microsoft Resource Compiler
- ────────────────────────────────────────────────────────────────────────────
-
-
-
- 10.3.4.4 Substitution within Macros
-
- You can replace text in a macro as well as in the description file.
-
- Just as macros allow you to substitute text in a description file, you can
- also substitute text within a macro itself. The substitution is temporary;
- it applies only to the current use of the macro and does not modify the
- original macro definition. Use the following form:
-
- $(macroname:string1=string2)
-
- Every occurrence of string1 is replaced by string2 in the macro macroname.
- Do not put any spaces or tabs between macroname and the colon. Spaces
- between the colon and string1 or between string1 and the equal sign are part
- of string1. Spaces between the equal sign and string2 or between string2 and
- the right parenthesis are part of string2. If string2 is a null string, all
- occurrences of string1 are deleted from the macroname macro.
-
- Macro substitution is case sensitive. This means that the case as well as
- the characters in string1 must exactly match the target string in the macro,
- or the substitution is not performed. It also means that the string2
- substitution is exactly as specified.
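-
- The substitution rules above can be modeled in a few lines of Python. This
- is a minimal sketch for illustration, not NMAKE's implementation: the
- function name expand_substitution and the dictionary of macros are
- hypothetical, and the treatment of an empty string1 is an assumption.
-
```python
import re

# Hypothetical model of $(macroname:string1=string2); not NMAKE's own code.
_REFERENCE = re.compile(r"\$\(([^:()]+):([^=]*)=([^)]*)\)")


def expand_substitution(reference, macros):
    """Expand one reference of the form $(macroname:string1=string2)."""
    match = _REFERENCE.fullmatch(reference)
    if match is None:
        raise ValueError("not a macro substitution: " + repr(reference))
    name, string1, string2 = match.groups()
    if not string1:
        # Assumption: an empty string1 leaves the value unchanged.
        return macros[name]
    # Literal, case-sensitive replacement of every occurrence. Spaces in
    # string1 and string2 are significant, and an empty string2 deletes
    # string1. The dictionary entry itself is never modified.
    return macros[name].replace(string1, string2)


macros = {"SOURCES": "project.for one.for two.for"}
objects = expand_substitution("$(SOURCES:.for=.obj)", macros)
print(objects)  # project.obj one.obj two.obj
```
-
- Because the match is case sensitive, a reference such as
- $(SOURCES:.FOR=.obj) returns the value unchanged in this model, and
- macros["SOURCES"] keeps its original definition after every call.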
-
-
- Example 1
-
- The following description file illustrates macro substitution:
-
- SOURCES = project.for one.for two.for
-
- project.exe : $(SOURCES:.for=.obj)
-     LINK $**;
-
- COPY : $(SOURCES)
-     !COPY $** c:\backup
-
- The predefined macro $** stands for the names of all the dependent files
- (see Table 10.2).
-
- If you invoke the example file with a command line that specifies both
- targets,
-
- NMAKE project.exe copy
-
- NMAKE executes the following commands:
-
- LINK project.obj one.obj two.obj;
- COPY project.for c:\backup
- COPY one.for c:\backup
- COPY two.for c:\backup
-
- The macro substitution does not alter the SOURCES macro definition; it
- changes only the text that NMAKE generates for that one reference. When
- NMAKE builds the target PROJECT.EXE, it gets the definition for the
- predefined macro $** (the dependent list) from the dependency line, which
- specifies the macro substitution in SOURCES.
-
- The same is true for the second target, COPY. In this case, however, no
- macro substitution is requested, so SOURCES retains its original value,
- and $** represents the names of the FORTRAN source files. (In the example
- above, the target COPY is a pseudotarget; Section 10.3.2 describes
- pseudotargets.)
-
-
- Example 2
-
- If the macro OBJS is defined as
-
- OBJS=ONE.OBJ TWO.OBJ THREE.OBJ
-
- with exactly one space between each object name, you can replace each space
- in the defined value of OBJS with a space, followed by a plus sign, followed
- by a newline, by using
-
- $(OBJS: = +^
- )
-
- The caret (^) tells NMAKE to treat the end of the line as a literal newline
- character. This example is useful for creating response files.
-
-
- 10.3.4.5 Substitution within Predefined Macros
-
- You can also substitute text in any predefined macro except $$@. The
- principle is the same as for other macros. The command in the following
- description block substitutes within a predefined macro. Note that even
- though $@ is a single-character macro, the substitution makes it a
- multi-character macro invocation, so it must be enclosed in parentheses.
-
- target.abc : depend.xyz
-     echo $(@:targ=blank)
-
- If dependent depend.xyz has a later modification time than target
- target.abc, then NMAKE executes the command
-
- echo blanket.abc
-
- The example uses the predefined macro $@, which equals the full name of the
- current target (target.abc). It substitutes blank for targ in the
- target, resulting in blanket.abc.
-
-
- 10.3.4.6 Inherited Macros
-
- When NMAKE executes, it inherits macro definitions equivalent to every
- environment variable. The inherited macro names are converted to uppercase.
-
-
- Inherited macros can be used like other macros. You can also redefine them.
- The following example redefines the inherited macro PATH:
-
- PATH = c:\tools\bin
-
- sample.exe : sample.obj
-     LINK sample;
-
- No matter what value the environment variable PATH had before, it has the
- value c:\tools\bin when NMAKE executes the LINK command in this
- description block. Redefining the inherited macro does not affect the
- original environment variable; when NMAKE terminates, PATH still has its
- original value.
-
- Inherited macros have one restriction: in a recursive call to NMAKE, the
- only macros that are preserved are those defined on the command line or in
- environment variables. Macros defined in the description file are not
- inherited when NMAKE is called recursively. To pass a macro to a recursive
- call:
-
-
- ■ Use the SET command before the recursive call to set the variable for
- the entire NMAKE session.
-
- ■ Define the macro on the command line for the recursive call.
-
-
- The /E option causes macros inherited from environment variables to override
- any macros with the same name in the description file.
-
-
- 10.3.4.7 Precedence among Macro Definitions
-
- If you define the same macro name in more than one place, NMAKE uses the
- macro with the highest precedence. The precedence from highest to lowest is
- as follows:
-
-
- 1. A macro defined on the command line
-
- 2. A macro defined in a description file or include file
-
- 3. An inherited environment-variable macro
-
- 4. A macro defined in the TOOLS.INI file
-
- 5. A predefined macro such as CC and AS
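-
- The precedence order can be pictured as a stack of lookup tables searched
- from top to bottom. The following Python sketch models that lookup with
- collections.ChainMap. It is an illustration under stated assumptions, not a
- description of NMAKE internals: the function name effective_macros and the
- sample environment values are hypothetical, and the sketch assumes that
- command-line macros still take precedence when /E is in effect.
-
```python
from collections import ChainMap


def effective_macros(command_line, description_file, environment,
                     tools_ini, predefined, env_overrides=False):
    """Layer macro definitions from highest to lowest precedence."""
    # Inherited macro names are converted to uppercase.
    inherited = {name.upper(): value for name, value in environment.items()}
    if env_overrides:
        # /E: inherited macros override description-file macros.
        # Assumption: command-line macros still come first.
        return ChainMap(command_line, inherited, description_file,
                        tools_ini, predefined)
    return ChainMap(command_line, description_file, inherited,
                    tools_ini, predefined)


macros = effective_macros(
    command_line={"CFLAGS": "/W3"},
    description_file={"PATH": r"c:\tools\bin"},
    environment={"path": r"c:\dos", "init": r"c:\init"},  # hypothetical values
    tools_ini={"CC": "qcl", "CFLAGS": "/Gs"},
    predefined={"CC": "cl", "AS": "ml"},
)
print(macros["CFLAGS"], macros["PATH"], macros["CC"], macros["AS"])
```
-
- In this model CFLAGS comes from the command line, PATH from the description
- file, CC from TOOLS.INI, and AS from the predefined macros. Passing
- env_overrides=True moves the inherited PATH ahead of the description-file
- definition, which is the effect described for the /E option.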
-
-
-
- 10.3.5 Inference Rules
-
- Inference rules are templates that define how a file with one extension is
- created from a file with a different extension. When NMAKE encounters a
- description block that has no commands, it searches for an inference rule
- that matches the extensions of the target and dependent files. Similarly, if
- a dependent file doesn't exist, NMAKE looks for an inference rule that shows
- how to create the missing dependent from another file with the same base
- name.
-
- Inference rules tell NMAKE how to create files with a specific extension.
-
- Inference rules provide a convenient shorthand for common operations. For
- instance, you can use an inference rule to avoid repeating the same command
- in several description blocks. You can define your own inference rules or
- use predefined inference rules.
-
- ────────────────────────────────────────────────────────────────────────────
- NOTE
- An inference rule is useful only when a target and dependent have the same
- base name, and have a one-to-one correspondence. For example, you cannot
- define an inference rule that replaces several modules in a library, because
- the modules would have different base names than the target library.
- ────────────────────────────────────────────────────────────────────────────
-
- Inference rules can exist only for dependents with extensions that are
- listed in the .SUFFIXES directive. (For information on the .SUFFIXES
- directive, see Section 10.3.6, "Directives.") NMAKE searches in the current
- or specified directory for a file whose base name matches the target and
- whose extension is listed in the .SUFFIXES list. If it finds such a file, it
- applies the inference rule that matches the extensions of the target and the
- located file.
-
- The .SUFFIXES list specifies an order of priority for NMAKE to use when
- searching for files. If more than one file is found, and thus more than one
- rule matches a dependency line, NMAKE searches the .SUFFIXES list and uses
- the rule whose extension appears earlier in the list. For example, the
- dependency line
-
- project.exe :
-
- can be matched to several predefined inference rules and possibly one or
- more user-defined rules, all of which describe a command for creating an
- .EXE file. NMAKE uses the inference rule corresponding to the first matching
- file it finds.
-
-
- 10.3.5.1 Inference Rule Syntax
-
- An inference rule has the following syntax:
-
- .fromext.toext:
-     commands
-
- The first line lists two extensions: fromext represents the filename
- extension of a dependent file, and toext represents the extension of a
- target file. Extensions are not case sensitive.
-
- The second line of the inference rule gives the command to create a target
- file of toext from a dependent file of fromext. Use the same rules for
- commands in inference rules as in description blocks. (See Section 10.3.1,
- "Description Blocks.")
-
-
- 10.3.5.2 Inference Rule Search Paths
-
- The inference-rule syntax described above tells NMAKE to look for the
- specified files in the current directory. You can also specify directories
- to be searched by NMAKE when it looks for files with the extensions fromext
- and toext. An inference rule that specifies paths has the following syntax:
-
-
- {frompath}.fromext{topath}.toext:
-     commands
-
- NMAKE searches in the frompath directory for files with the fromext
- extension. It uses commands to create files with the toext extension in the
- topath directory, if the fromext file has a later modification time than the
- toext file.
-
- The paths in the inference rule must exactly match the paths explicitly
- specified in the dependency line of a description block.
-
- If you use a path on one element of the inference rule, you must use paths
- on both. You can specify the current directory for either element by using
- the operating system notation for the current directory, which is a dot (.),
- or by specifying an empty pair of braces.
-
- You can specify only one path for each element in an inference rule. To
- specify more than one path, repeat the inference rule with the alternate
- path.
-
-
- 10.3.5.3 User-Defined Inference Rules
-
- You can define inference rules in the description file or in TOOLS.INI (see
- Section 10.6, "The TOOLS.INI File"). An inference rule lists two file
- extensions and one or more commands.
-
-
- Example 1
-
- The following inference rule tells NMAKE how to build a .OBJ file from a .C
- file:
-
- .c.obj:
-     CL /c $<
-
- In this example, the predefined macro $< represents the name of a dependent
- that has a more recent modification time than the target.
-
- NMAKE applies this inference rule to the following description block:
-
- sample.obj :
-
- The description block lists only a target, SAMPLE.OBJ. Both the dependent
- and the command are missing. However, given the target's base name and
- extension, plus the inference rule, NMAKE has enough information to build
- the target.
-
- NMAKE first looks for a file with the same base name as the target and with
- one of the extensions in the .SUFFIXES list. If SAMPLE.C exists (and no
- files with higher-priority extensions exist), NMAKE compares its time to
- that of SAMPLE.OBJ. If SAMPLE.C has changed more recently, NMAKE compiles it
- using the CL command listed in the inference rule:
-
- CL /c sample.c
-
-
- Example 2
-
- The following inference rule compares a .C file in the current directory
- with the corresponding .OBJ file in another directory:
-
- {.}.c{c:\objects}.obj:
-     cl /c $<;
-
- The path for the .C file is represented by a dot. A path for the dependent
- extension is required because one is specified for the target extension.
-
- This inference rule matches a dependency line containing the same
- combination of paths, such as:
-
- c:\objects\test.obj : test.c
-
- This rule does not match a dependency line such as:
-
- test.obj : test.c
-
- In this case, NMAKE uses the predefined inference rule .c.obj when building
- the target.
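-
- The path-matching behavior in Example 2 can be sketched in Python as
- follows. This is only a model of the rule stated above (the paths in the
- rule must match the paths on the dependency line); the function name
- rule_applies is hypothetical, the comparison is assumed to ignore case, and
- the sketch does not check that the base names agree.
-
```python
import ntpath


def _normalize(directory):
    """Treat '' (empty braces) and '.' as the current directory."""
    return "." if directory in ("", ".") else directory.lower()


def rule_applies(from_path, from_ext, to_path, to_ext, target, dependent):
    """True if {from_path}.from_ext{to_path}.to_ext fits 'target : dependent'."""
    target_dir, target_name = ntpath.split(target)
    dependent_dir, dependent_name = ntpath.split(dependent)
    return (ntpath.splitext(target_name)[1].lower() == to_ext
            and ntpath.splitext(dependent_name)[1].lower() == from_ext
            and _normalize(target_dir) == _normalize(to_path)
            and _normalize(dependent_dir) == _normalize(from_path))


# The rule {.}.c{c:\objects}.obj from Example 2.
rule = (".", ".c", r"c:\objects", ".obj")
print(rule_applies(*rule, r"c:\objects\test.obj", "test.c"))  # True
print(rule_applies(*rule, "test.obj", "test.c"))              # False
```
-
- The second call fails because the target on the dependency line carries no
- path, so it does not match the c:\objects path given in the rule.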
-
-
- 10.3.5.4 Predefined Inference Rules
-
- NMAKE provides predefined inference rules containing commands for creating
- object, executable, and resource files. Table 10.6 describes the predefined
- inference rules.
-
- Table 10.6 Predefined Inference Rules
-
- Rule       Command                               Default Action
- ────────────────────────────────────────────────────────────────────────────
- .asm.obj   $(AS) $(AFLAGS) /c $*.asm             ML /c $*.ASM
- .asm.exe   $(AS) $(AFLAGS) $*.asm                ML $*.ASM
- .bas.obj   $(BC) $(BFLAGS) $*.bas;               BC $*.BAS;
- .c.obj     $(CC) $(CFLAGS) /c $*.c               CL /c $*.C
- .c.exe     $(CC) $(CFLAGS) $*.c                  CL $*.C
- .cbl.obj   $(COBOL) $(COBFLAGS) $*.cbl;          COBOL $*.CBL;
- .cbl.exe   $(COBOL) $(COBFLAGS) $*.cbl, $*.exe;  COBOL $*.CBL, $*.EXE;
- .for.obj   $(FOR) /c $(FFLAGS) $*.for            FL /c $*.FOR
- .for.exe   $(FOR) $(FFLAGS) $*.for               FL $*.FOR
- .pas.obj   $(PASCAL) /c $(PFLAGS) $*.pas         PL /c $*.PAS
- .pas.exe   $(PASCAL) $(PFLAGS) $*.pas            PL $*.PAS
- .rc.res    $(RC) $(RFLAGS) /r $*                 RC /r $*
- ────────────────────────────────────────────────────────────────────────────
-
-
- For example, assume you have the following description file:
-
- sample.exe :
-
- This description block lists a target without any dependents or commands.
- NMAKE looks at the target's extension (.EXE) and searches for an inference
- rule that describes how to create an .EXE file. Table 10.6 shows that more
- than one inference rule exists for building an .EXE file. NMAKE looks for a
- file in the current or specified directory that has the same base name as
- the target sample and one of the extensions in the .SUFFIXES list. For
- example, if a file called SAMPLE.FOR exists, NMAKE applies the .for.exe
- inference rule. If more than one file with the base name SAMPLE is found,
- NMAKE applies the inference rule for the extension listed earliest in the
- .SUFFIXES list. In this example, if both SAMPLE.C and SAMPLE.FOR exist,
- NMAKE uses the .c.exe inference rule to compile SAMPLE.C and links the
- resulting file SAMPLE.OBJ to create SAMPLE.EXE.
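-
- The search just described can be sketched in Python. The sketch is a
- simplified model, not NMAKE's algorithm: the function name choose_rule is
- hypothetical, the suffix order is the predefined .SUFFIXES list, the rule
- names are those of Table 10.6, and the files on disk are passed in as a
- plain list instead of being read from a directory.
-
```python
import ntpath

# The predefined .SUFFIXES list and the rule names of Table 10.6.
SUFFIXES = [".exe", ".obj", ".asm", ".c", ".bas",
            ".cbl", ".for", ".pas", ".res", ".rc"]
PREDEFINED_RULES = {".asm.obj", ".asm.exe", ".bas.obj", ".c.obj", ".c.exe",
                    ".cbl.obj", ".cbl.exe", ".for.obj", ".for.exe",
                    ".pas.obj", ".pas.exe", ".rc.res"}


def choose_rule(target, files_on_disk, rules=PREDEFINED_RULES,
                suffixes=SUFFIXES):
    """Return (rule, dependent) for the earliest suffix with a file and a rule."""
    base, to_ext = ntpath.splitext(target.lower())
    present = {name.lower() for name in files_on_disk}
    for from_ext in suffixes:
        dependent = base + from_ext
        rule = from_ext + to_ext
        if dependent in present and rule in rules:
            return rule, dependent
    return None  # no inference rule applies


print(choose_rule("sample.exe", ["SAMPLE.C", "SAMPLE.FOR"]))
# ('.c.exe', 'sample.c')
```
-
- With only SAMPLE.FOR present, the same call selects the .for.exe rule; with
- both files present, .c wins because it appears earlier in the list.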
-
- ────────────────────────────────────────────────────────────────────────────
- NOTE
-
- By default, the options macros such as CFLAGS shown in Table 10.5 are
- undefined. As explained in Section 10.3.4.2, "Using Macros," this causes no
- problem; NMAKE replaces an undefined macro with a null string. Because the
- predefined options macros are included in the inference rules, you can
- define these macros and have their assigned values passed automatically to
- the predefined inference rules. The predefined inference rules are listed in
- Table 10.6.
- ────────────────────────────────────────────────────────────────────────────
-
-
- 10.3.5.5 Precedence among Inference Rules
-
- If the same inference rule is defined in more than one place, NMAKE uses the
- rule with the highest precedence. The precedence from highest to lowest is
-
-
- 1. An inference rule defined in the description file. If more than one,
- the last one applies.
-
- 2. An inference rule defined in the TOOLS.INI file. If more than one, the
- last one applies.
-
- 3. A predefined inference rule.
-
-
- User-defined inference rules always override predefined inference rules.
- NMAKE uses a predefined inference rule only if no user-defined inference
- rule exists for a given target and dependent.
-
- If two inference rules could produce a target with the same extension, NMAKE
- uses the inference rule whose dependent's extension appears first in the
- .SUFFIXES list. See Table 10.7 in the next section, "Directives."
-
-
- 10.3.6 Directives
-
- The directives in Table 10.7 provide additional control of NMAKE operations.
- You can use them in a description file outside of a description block or in
- the TOOLS.INI file. The four directives listed in the table are case
- sensitive and must appear in all uppercase letters. (Preprocessing
- directives are not case sensitive; see Section 10.3.7, "Preprocessing
- Directives.")
-
- Table 10.7 Directives
-
- Directive                 Action
- ────────────────────────────────────────────────────────────────────────────
- .IGNORE :                 Ignores exit codes returned by programs called
-                           from the description file. This directive has
-                           the same effect as invoking NMAKE with the /I
-                           option.
-
- .PRECIOUS : target...     Tells NMAKE not to delete targets if the commands
-                           that build them quit or are interrupted.
-                           Overrides the NMAKE default, which is to delete
-                           the target if building was interrupted by CTRL+C
-                           or CTRL+BREAK.
-
- .SILENT :                 Does not display lines as they are executed. This
-                           directive has the same effect as invoking NMAKE
-                           with the /S option.
-
- .SUFFIXES : list          Lists file suffixes for NMAKE to try when
-                           building a target file for which no dependents
-                           are specified. This list is used together with
-                           inference rules. See Section 10.3.5, "Inference
-                           Rules."
- ────────────────────────────────────────────────────────────────────────────
-
-
- The .IGNORE and .SILENT directives affect the file from their location
- onward. Location within the file does not matter for the .PRECIOUS and
- .SUFFIXES directives; they affect the entire description file.
-
- NMAKE refers to the value of the .SUFFIXES directive when using inference
- rules. When NMAKE finds a target without dependents, it searches the current
- directory for a file with the same base name as the target and a suffix from
- list. If NMAKE finds such a file, and if an inference rule applies to the
- file, then NMAKE treats the file as a dependent of the target. The order of
- the suffixes in the list defines the order in which NMAKE searches for the
- file. The list is predefined as follows:
-
- .SUFFIXES : .exe .obj .asm .c .bas .cbl .for .pas .res .rc
-
- To add additional suffixes to the end of the list, specify .SUFFIXES :
- followed by the additional suffixes. To clear the list, specify .SUFFIXES :
- by itself. To change the list order or to specify an entirely new list,
- clear the list and specify a new .SUFFIXES : setting.
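-
- The effect of successive .SUFFIXES directives can be modeled as operations
- on a list. The following Python sketch is illustrative only; the function
- name apply_suffixes and the extension .mod are hypothetical, and the sketch
- does not model how NMAKE treats a suffix that is already in the list.
-
```python
PREDEFINED = [".exe", ".obj", ".asm", ".c", ".bas",
              ".cbl", ".for", ".pas", ".res", ".rc"]


def apply_suffixes(current, new_suffixes):
    """Apply one '.SUFFIXES : list' directive and return the resulting list."""
    if not new_suffixes:
        return []                         # '.SUFFIXES :' alone clears the list
    return current + list(new_suffixes)   # otherwise append to the end


# Append a hypothetical extension, .mod, to the predefined list.
extended = apply_suffixes(PREDEFINED, [".mod"])

# Change the order: clear the list, then specify an entirely new one.
cleared = apply_suffixes(PREDEFINED, [])
reordered = apply_suffixes(cleared, [".c", ".asm", ".obj"])
print(extended[-1], reordered)
```
-
- The two-step form mirrors the description above: a new order can be
- imposed only after the existing list has been cleared.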
-
-
- 10.3.7 Preprocessing Directives
-
- NMAKE preprocessing directives are similar to compiler preprocessing
- directives. You can use the !IF, !IFDEF, !IFNDEF, !ELSE, and !ENDIF
- directives to conditionally process the description file. With other
- preprocessing directives you can display error messages, include other
- files, undefine a macro, and turn certain options on or off. NMAKE reads and
- executes the preprocessing directives before processing the description file
- as a whole.
-
- Preprocessing directives (listed in Table 10.8) begin with an exclamation
- point (!), which must appear at the beginning of the line. You can place
- spaces between the exclamation point and the directive keyword. These
- directives are not case sensitive.
-
- Table 10.8 Preprocessing Directives
-
- Directive                Description
- ────────────────────────────────────────────────────────────────────────────
- !CMDSWITCHES             Turns on or off NMAKE options /D, /I, /N, and /S.
- {+|-}opt...              (See Section 10.4, "Command-Line Options.") Do
-                          not specify the slash ( / ). If !CMDSWITCHES is
-                          specified with no options, all options are reset
-                          to the values they had when NMAKE was started.
-                          This directive updates the MAKEFLAGS macro. Turn
-                          an option on by preceding it with a plus sign
-                          (+), or turn it off by preceding it with a minus
-                          sign (-).
-
- !ERROR text              Prints text, then stops execution.
-
- !IF constantexpression   Reads the statements between the !IF keyword and
-                          the next !ELSE or !ENDIF keyword if
-                          constantexpression evaluates to a nonzero value.
-
- !IFDEF macroname         Reads the statements between the !IFDEF keyword
-                          and the next !ELSE or !ENDIF keyword if
-                          macroname is defined. NMAKE considers a macro
-                          with a null value to be defined.
-
- !IFNDEF macroname        Reads the statements between the !IFNDEF keyword
-                          and the next !ELSE or !ENDIF keyword if
-                          macroname is not defined.
-
- !ELSE                    Reads the statements between the !ELSE and
-                          !ENDIF keywords if the preceding !IF, !IFDEF, or
-                          !IFNDEF statement evaluated to zero. Anything
-                          following !ELSE on the same line is ignored.
-
- !ENDIF                   Marks the end of an !IF, !IFDEF, or !IFNDEF
-                          block. Anything following !ENDIF on the same
-                          line is ignored.
-
- !INCLUDE filename        Reads and evaluates the description file
-                          filename before continuing with the current
-                          description file. If filename is enclosed by
-                          angle brackets (< >), NMAKE searches for the
-                          file first in the current directory and then in
-                          the directories specified by the INCLUDE macro.
-                          Otherwise, it looks only in the current
-                          directory. The INCLUDE macro is initially set to
-                          the value of the INCLUDE environment variable.
-
- !UNDEF macroname         Marks macroname as undefined in NMAKE's symbol
-                          table.
- ────────────────────────────────────────────────────────────────────────────
-
-
-
-
- 10.3.7.1 Expressions in Preprocessing
-
- The constantexpression used with the !IF directive can consist of integer
- constants, string constants, or program invocations. Integer constants can
- use the unary operators for numerical negation (-), one's complement (~),
- and logical negation (!). They can also use any binary operator listed in
- Table 10.9.
-
- Table 10.9 Preprocessing-Directive Binary Operators
-
- Operator                          Description
- ────────────────────────────────────────────────────────────────────────────
- +                                 Addition
- -                                 Subtraction
- *                                 Multiplication
- /                                 Division
- %                                 Modulus
- &                                 Bitwise AND
- |                                 Bitwise OR
- ^                                 Bitwise XOR
- &&                                Logical AND
- ||                                Logical OR
- <<                                Left shift
- >>                                Right shift
- ==                                Equality
- !=                                Inequality
- <                                 Less than
- >                                 Greater than
- <=                                Less than or equal to
- >=                                Greater than or equal to
- ────────────────────────────────────────────────────────────────────────────
-
-
-
- You can group expressions by enclosing them in parentheses. NMAKE treats
- numbers as decimal unless they start with 0 (octal) or 0x (hexadecimal). Use
- the equality (==) operator to compare two strings for equality, or the
- inequality (!=) operator to compare for inequality. Enclose strings in
- double quotation marks.
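-
- The rule for reading integer constants can be illustrated with a short
- Python function. This is a sketch of the rule as stated above, not NMAKE
- code; the function name parse_integer is hypothetical, and signs are left
- to the unary operators, as in the text.
-
```python
def parse_integer(text):
    """Read an integer constant: 0x means hexadecimal, a leading 0 means octal."""
    digits = text.strip().lower()
    if digits.startswith("0x"):
        return int(digits[2:], 16)
    if digits.startswith("0") and len(digits) > 1:
        return int(digits[1:], 8)
    return int(digits, 10)


print(parse_integer("17"), parse_integer("017"), parse_integer("0x17"))
# 17 15 23
```
-
- The three constants look alike but differ in value, so a leading zero in a
- number used with !IF changes the result of a comparison.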
-
-
- Example
-
- The following example shows how preprocessing directives can be used to
- control whether the linker inserts CodeView information into the .EXE file:
-
-
- !INCLUDE <infrules.txt>
- !CMDSWITCHES +D
- winner.exe : winner.obj
- !IFDEF debug
- ! IF "$(debug)"=="y"
-     LINK /CO winner.obj;
- ! ELSE
-     LINK winner.obj;
- ! ENDIF
- !ELSE
- ! ERROR Macro named debug is not defined.
- !ENDIF
-
- In this example, the !INCLUDE directive inserts the INFRULES.TXT file into
- the description file. The !CMDSWITCHES directive sets the /D option, which
- displays the times of the files as they are checked. The !IFDEF directive
- checks to see if the macro debug is defined. If it is defined, the !IF
- directive checks to see if it is set to y. If it is, NMAKE reads the LINK
- command with the /CO option; otherwise, NMAKE reads the LINK command without
- /CO. If the debug macro is not defined, the !ERROR directive prints the
- specified message and NMAKE stops.
-
-
- 10.3.7.2 Executing a Program in Preprocessing
-
- NMAKE can invoke programs and check their status.
-
- You can invoke any program from within NMAKE by placing the program's name
- or path name within square brackets ( [ ] ). The program is executed during
- preprocessing, and its exit code replaces the program specification in the
- description file. A nonzero exit code usually indicates an error. You can
- use this value to control execution, as in the following example:
-
- !IF [c:\util\checkdsk] != 0
- ! ERROR Not enough disk space; NMAKE terminating.
- !ENDIF
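-
- The same idea, running a program and branching on its exit code, can be
- shown in Python. The sketch below is an analogy, not part of NMAKE: the
- function name exit_code is hypothetical, and two child Python processes
- stand in for a disk-checking utility so that the example is
- self-contained.
-
```python
import subprocess
import sys


def exit_code(command):
    """Run a program and return its exit code, as [program] does in !IF."""
    return subprocess.run(command).returncode


# Stand-ins for a utility: one child process fails with code 3, one succeeds.
failing = [sys.executable, "-c", "import sys; sys.exit(3)"]
passing = [sys.executable, "-c", "pass"]

for command in (failing, passing):
    if exit_code(command) != 0:
        print("Not enough disk space; terminating.")
    else:
        print("Check passed; continuing.")
```
-
- As in the description file, the program runs before anything else is
- processed, and only its numeric exit code takes part in the comparison.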
-
-
- 10.3.8 Extracting Filename Components
-
- "Special Macros," Section 10.3.4.3, showed how qualifiers could be added to
- macros that represented filenames in order to select components of the name
- or path. This feature is especially useful when creating a general-purpose
- description block that works with the name of any dependent.
-
- Besides these macro modifiers, NMAKE offers another feature that allows you
- to extract components of the name of the first dependent file as you have
- specified it in the description file or on the command line (not the full
- filename specification on disk). The components can then be recombined with
- specific paths, extensions, or directories to create the particular name or
- path you need, without having to specify the exact name or path when you
- write the description block.
-
- The first dependent file is the first file listed to the right of the colon
- on a dependency line. If a dependent is implied from an inference rule,
- NMAKE considers it to be the first dependent file. If more than one
- dependent is implied from inference rules, the .SUFFIXES list determines
- which dependent is first.
-
- You can use either of the following syntaxes:
-
- %s
-
- %|«parts»F
-
- where parts can be one or more of the following letters, or can be omitted:
-
-
- Letter Description
- ────────────────────────────────────────────────────────────────────────────
- No letter Complete name
- d Drive
- p Path
- f File base name
- e File extension
-
- You can specify more than one letter. The order of the letters is not
- significant; NMAKE constructs the filename that meets (or comes closest to
- meeting) all the specifications. The letters are case sensitive.
-
- The %s option substitutes the complete name; it is equivalent to both %|F
- and %|dpfeF.
-
- NMAKE interprets any percent symbol (%) within a command line (either in a
- description block or an inference rule) as the start of a file specifier
- using this syntax. Therefore, if you need to use a literal percent symbol
- within a command line, you must specify it as a double percent symbol (%%).
-
-
-
- Example
-
- The following example demonstrates this special syntax:
-
- sample.exe : c:\project\sample.obj
-     LINK %|dpfF, a:%|pfF.exe;
-
- This example represents the following command:
-
- LINK c:\project\sample, a:\project\sample.exe;
-
- In this example, the sequence %|dpfF represents the same drive, path, and
- base name as the dependent on the dependency line, while the sequence %|pfF
- represents only the path and base name of the dependent. The command tells
- the LINK utility to build the executable file on another drive in a
- directory of the same name.
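-
- The selection of filename components can be modeled in Python with the
- standard ntpath module, which splits DOS-style paths. This is a sketch
- under stated assumptions, not NMAKE's implementation: the function name
- file_parts is hypothetical, and unrecognized letters are simply ignored.
-
```python
import ntpath


def file_parts(dependent, parts=""):
    """Model %|partsF: drive (d), path (p), base name (f), extension (e)."""
    if not parts:
        return dependent              # no letters: the complete name
    drive, rest = ntpath.splitdrive(dependent)
    directory, filename = ntpath.split(rest)
    base, extension = ntpath.splitext(filename)
    if directory and not directory.endswith("\\"):
        directory += "\\"             # keep the separator with the path part
    pieces = {"d": drive, "p": directory, "f": base, "e": extension}
    # Letter order is not significant, so always emit d, p, f, e in turn.
    return "".join(pieces[letter] for letter in "dpfe" if letter in parts)


dependent = r"c:\project\sample.obj"
command = f"LINK {file_parts(dependent, 'dpf')}, a:{file_parts(dependent, 'pf')}.exe;"
print(command)  # LINK c:\project\sample, a:\project\sample.exe;
```
-
- The result reproduces the command shown in the example above. Because the
- letters are tested in lowercase only, an uppercase letter selects nothing
- in this model, in keeping with the case sensitivity noted earlier.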
-
-
- 10.4 Command-Line Options
-
- NMAKE accepts a number of options, listed in Table 10.10. You can specify
- options in uppercase or lowercase and use either a slash or dash. For
- example, -A, /A, -a, and /a all represent the same option. This book uses a
- slash and uppercase letters.
-
- Table 10.10 NMAKE Options
-
- Option                            Action
- ────────────────────────────────────────────────────────────────────────────
- /A                                Forces execution of all commands in
-                                   description blocks in the description
-                                   file even if targets are not out-of-date
-                                   with respect to their dependents. Does
-                                   not affect the behavior of incremental
-                                   commands such as ILINK; using /A does
-                                   not force a full link.
-
- /C                                Suppresses nonfatal error or warning
-                                   messages and the NMAKE copyright message.
-
- /D                                Displays the modification time of each
-                                   file.
-
- /E                                Causes environment variables to override
-                                   macro definitions in description files.
-                                   See Section 10.3.4, "Macros."
-
- /F filename                       Specifies filename as the name of the
-                                   description file. If you supply a dash
-                                   (-) instead of a filename, NMAKE gets
-                                   description-file input from the standard
-                                   input device. (Terminate keyboard input
-                                   with either F6 or CTRL+Z.) If you omit
-                                   /F, NMAKE searches the current directory
-                                   for a file called MAKEFILE and uses it
-                                   as the description file. If MAKEFILE
-                                   doesn't exist, NMAKE uses inference
-                                   rules for the command-line targets.
-
- /HELP                             Calls the QuickHelp utility. If NMAKE
-                                   cannot locate the help file or QuickHelp,
-                                   it displays a brief summary of NMAKE
-                                   command-line syntax and exits to the
-                                   operating system.
-
- /I                                Ignores exit codes from commands listed
-                                   in the description file. NMAKE processes
-                                   the whole description file even if
-                                   errors occur.
-
- /N                                Displays but does not execute the
-                                   description file's commands. This option
-                                   is useful for debugging description
-                                   files and checking which targets are
-                                   out-of-date.
-
- /NOLOGO                           Suppresses the NMAKE copyright message.
-
- /P                                Displays all macro definitions,
-                                   inference rules, target descriptions,
-                                   and the .SUFFIXES list on the standard
-                                   output device.
-
- /Q                                Checks modification times for
-                                   command-line targets (or first target in
-                                   description file if no command-line
-                                   targets are specified). NMAKE returns a
-                                   zero exit code if all such targets are
-                                   up-to-date and a nonzero exit code if
-                                   any target is out-of-date. Only
-                                   preprocessing commands in the
-                                   description file are executed. This
-                                   option is useful when running NMAKE from
-                                   a batch file.
-
- /R                                Ignores inference rules and macros that
-                                   are defined in the TOOLS.INI file or
-                                   that are predefined.
-
- /S                                Suppresses the display of commands
-                                   listed in the description file.
-
- /T                                Changes modification times for
-                                   command-line targets (or first target in
-                                   description file if no command-line
-                                   targets are specified). Only
-                                   preprocessing commands in the
-                                   description file are executed. Contents
-                                   of target files are not modified.
-
- /X filename                       Sends all error output to filename,
-                                   which can be a file or a device. If you
-                                   supply a dash (-) instead of a filename,
-                                   error output is sent to the standard
-                                   output device.
-
- /Z                                Used for internal communication between
-                                   NMAKE (or NMK) and PWB.
-
- /?                                Displays a brief summary of NMAKE
-                                   command-line syntax and exits to the
-                                   operating system.
- ────────────────────────────────────────────────────────────────────────────
-
-
-
-
- Example
-
- The following command line specifies two NMAKE options:
-
- NMAKE /F sample.mak /C targ1 targ2
-
- The /F option tells NMAKE to read the description file SAMPLE.MAK. The /C
- option tells NMAKE not to display nonfatal error messages and warnings. The
- command specifies two targets (targ1 and targ2) to update.
-
- In the following example, NMAKE updates the target targ1:
-
- NMAKE /D /N targ1
-
- Since no description file is specified, NMAKE searches the current directory
- for a description file named MAKEFILE. The /D option displays the
- modification time of each file; the /N option displays the commands in
- MAKEFILE without executing them.
-
-
- 10.5 NMAKE Command File
-
- If you find yourself repeatedly using the same sequence of command-line
- arguments, you can place them in a text file and pass the file's name as a
- command-line argument to NMAKE. NMAKE opens the command file and reads the
- arguments. This feature is especially useful if the argument list exceeds
- the maximum length of a command line (128 characters in DOS, 256 in OS/2).
-
- To provide input to NMAKE with a command file, type
-
- NMAKE @commandfile
-
- In the commandfile field, enter the name of a file containing the
- information NMAKE expects on the command line. You can split input between
- the command line and a command file. Use the name of the command file
- (preceded by @) in place of the input information on the command line.
-
-
- Example 1
-
- Assume you have created a file named UPDATE containing this line:
-
- /S "program = sample" sort.exe search.exe
-
- If you start NMAKE with the command
-
- NMAKE @update
-
- then NMAKE reads its command-line arguments from UPDATE. The at sign (@)
- tells NMAKE to read arguments from the file. The effect is the same as if
- you typed the arguments directly on the command line:
-
- NMAKE /S "program = sample" sort.exe search.exe
-
- NMAKE treats the file as if it were a single set of arguments and replaces
- each line break with a space. Macro definitions that contain spaces must be
- enclosed in quotation marks, just as if you had typed them on the command
- line.
-
- The quotation marks that delimit a macro force all characters between them
- to be interpreted literally. Therefore, if you split a macro between lines,
- an unwanted line break is inserted into the macro. Macros that span multiple
- lines must be continued by ending each line except the last with a backslash
- ( \ ):
-
- /S "program \
- = sample" sort.exe search.exe
-
- This file is equivalent to the first example. The backslash allows the macro
- definition ("program = sample") to span two lines.
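-
- The joining rule can be sketched in a few lines of Python. This is only a
- model of the behavior described above, not NMAKE's actual parser; the
- function name expand_command_file is hypothetical, and quoting rules other
- than the backslash continuation are not modeled.
-
```python
def expand_command_file(text):
    """Return the single argument string that a command file stands for."""
    # A backslash that ends a line continues it: drop the backslash and the
    # line break together so that a quoted macro is not split.
    joined = text.replace("\\\n", "")
    # Each remaining line break is replaced with one space.
    return " ".join(joined.splitlines())


one_line = '/S "program = sample" sort.exe search.exe\n'
continued = '/S "program \\\n= sample" sort.exe search.exe\n'

# Both forms of UPDATE expand to the same argument string.
print(expand_command_file(one_line) == expand_command_file(continued))
```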
-
-
- Example 2
-
- If the command-file UPDATE contains this line:
-
- /S "program = sample" sort.exe
-
- you can give NMAKE the same command-line input as in the example above by
- specifying the command
-
- NMAKE @update search.exe
-
-
- 10.6 The TOOLS.INI File
-
- You can customize NMAKE by placing commonly used macros, inference rules,
- and description blocks in the TOOLS.INI initialization file. Settings for
- NMAKE must follow a line that begins with [NMAKE]. This section of the
- initialization file can contain macro definitions, .SUFFIXES lists, and
- inference rules. For example, if TOOLS.INI contains the following section:
-
- [NMAKE]
- CC=qcl
- CFLAGS=/Gc /Gs /W3 /Oat
- .c.obj:
-     $(CC) /c $(CFLAGS) $*.c
-
- NMAKE reads and applies the lines following [NMAKE]. The example redefines
- the macro CC to invoke the Microsoft QuickC (R) Compiler, defines the macro
- CFLAGS, and redefines the inference rule for making .OBJ files from .C
- sources. (Note that macros are case sensitive; a macro called cc is not
- substituted in a rule that uses $(CC).)
-
- NMAKE looks for TOOLS.INI in the current directory. If it isn't there, NMAKE
- searches the directory specified by the INIT environment variable.
-
- Macros and inference rules appearing in TOOLS.INI can be overridden. See
- Section 10.3.4.7, "Precedence among Macro Definitions," and Section
- 10.3.5.5, "Precedence among Inference Rules."
-
-
- 10.7 Inline Files
-
- NMAKE can create "inline files" that contain any text you specify. One use
- of inline files is to write a response file for another utility such as LINK
- or LIB. This eliminates the need to maintain a separate response file and
- removes the limit that the maximum command-line length would impose.
-
- Use this syntax to create an inline file called filename:
-
- target : dependents
-     command <<«filename»
- inlinetext
- <<«KEEP | NOKEEP»
-
- All inlinetext between the two sets of double angle brackets (<<) is placed
- in the inline file. The filename is optional. If you don't supply filename,
- NMAKE gives the inline file a unique name. NMAKE places the inline file in
- the directory specified by the TMP environment variable. If TMP is not
- defined, the inline file is placed in the current directory.
-
- Directives are not allowed in an inline file. NMAKE treats a directive in an
- inline file as literal text.
-
- The inline file can be temporary or permanent. If you don't specify the
- option, or if you specify NOKEEP, the file is temporary. Specify KEEP to
- retain the file after the build ends.
-
-
- Example
-
- The following description block creates a LIB response file named LIB.LRF:
-
- OBJECTS=add.obj sub.obj mul.obj div.obj
- math.lib : $(OBJECTS)
-     LIB @<<lib.lrf
- $*.lib
- -+$(OBJECTS: = &^
- -+)
- listing;
- <<KEEP
-
- The resulting response file tells LIB which library to use, the commands to
- execute, and the name of the listing file to produce:
-
- math.lib
- -+add.obj &
- -+sub.obj &
- -+mul.obj &
- -+div.obj
- listing;
-
- The file MATH.LIB must exist beforehand for this example to work.
-
-
- Multiple Inline Files
-
- The inline file specification can create more than one inline file. For
- instance,
-
- target.abc : depend.xyz
-     cat <<file1 <<file2
- I am the contents of file1.
- <<KEEP
- I am the contents of file2.
- <<KEEP
-
- The example creates the two inline files, FILE1 and FILE2. All inline text
- is written to the files sequentially. Therefore, the text
-
- I am the contents of file1.
-
- goes into FILE1, not FILE2, even though the text is nested between the angle
- brackets for FILE2 and the <<KEEP statement which follows. NMAKE then
- executes the command
-
- cat file1 file2
-
- The KEEP keywords tell NMAKE not to delete FILE1 and FILE2 when done.
-
-
- 10.8 Sequence of NMAKE Operations
-
- When you are writing a complex description file, it can be helpful to know
- the sequence in which NMAKE performs operations. This section describes
- those operations and their order.
-
- NMAKE first looks for a description file.
-
- When you run NMAKE from the command line, NMAKE's first task is to find the
- description file:
-
-
- 1. If the /F option is used, NMAKE searches for the filename specified in
- the option. If NMAKE cannot find that file, it returns an error.
-
- 2. If the /F option is not used, NMAKE looks for a file named MAKEFILE in
- the current directory. If there are targets on the command line, NMAKE
- builds them according to the instructions in MAKEFILE. If there are no
- targets on the command line, NMAKE builds only the first target it
- finds in MAKEFILE.
-
- 3. If NMAKE cannot find MAKEFILE, NMAKE looks for target files on the
- command line and attempts to build them using inference rules (either
- defined by the user in TOOLS.INI or predefined by NMAKE). If no target
- is specified, NMAKE returns an error.
-
-
- Macro definitions follow a priority.
-
- NMAKE then assigns macro definitions with the following precedence (highest
- first):
-
-
- 1. Macros defined on the command line
-
- 2. Macros defined in a description file or include file
-
- 3. Inherited macros
-
- 4. Macros defined in the TOOLS.INI file
-
- 5. Predefined macros (such as CC and RFLAGS)
-
-
- Macro definitions are assigned in order of priority, not in the order in
- which NMAKE encounters them. For example, a macro defined in an include file
- overrides a macro with the same name from the TOOLS.INI file. Note that a
- macro within a description file can be redefined; the most recent definition
- in the description file is used.
-
- Inference rules also follow a priority.
-
- NMAKE also assigns inference rules, using the following precedence (highest
- first):
-
-
- 1. Inference rules defined in a description file or include file
-
- 2. Inference rules defined in the TOOLS.INI file
-
- 3. Predefined inference rules (such as .c.obj)
-
-
- You can use command-line options to change some of these precedences.
-
-
- ■ The /E option allows macros inherited from the environment to override
- macros defined in the description file.
-
- ■ The /R option tells NMAKE to ignore macros and inference rules that
- are defined in TOOLS.INI or are predefined.
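-
- The macro precedence list and the effect of /E and /R can be modeled as a
- lookup over an ordered list of sources. The sketch below is an illustration
- only: the source names, the function resolve_macro, and the sample macro
- values are hypothetical, and it assumes that command-line macros keep the
- highest priority when /E is used.
-
```python
# Sources of macro definitions, highest precedence first (default order).
DEFAULT_ORDER = ["command_line", "description_file", "environment",
                 "tools_ini", "predefined"]


def resolve_macro(name, definitions, env_overrides=False, ignore_defaults=False):
    """Pick the definition of name that wins under the stated precedence.

    definitions maps a source name to a dict of macros from that source.
    env_overrides models /E; ignore_defaults models /R.
    """
    order = list(DEFAULT_ORDER)
    if env_overrides:
        # /E: inherited macros move ahead of description-file macros.
        # Assumption: command-line macros still win.
        order.remove("environment")
        order.insert(order.index("description_file"), "environment")
    if ignore_defaults:
        # /R: TOOLS.INI and predefined definitions are ignored.
        order = [s for s in order if s not in ("tools_ini", "predefined")]
    for source in order:
        if name in definitions.get(source, {}):
            return definitions[source][name]
    return None


# Hypothetical sample values, chosen only to show which source wins.
definitions = {
    "description_file": {"CC": "qcl"},
    "environment": {"CC": "envcc"},
    "tools_ini": {"CC": "inicc", "CFLAGS": "/W3"},
    "predefined": {"CC": "cl"},
}
print(resolve_macro("CC", definitions))                      # description file
print(resolve_macro("CC", definitions, env_overrides=True))  # environment
```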
-
-
- NMAKE preprocesses directives before running the description-file commands.
-
-
- Next, NMAKE evaluates any preprocessing directives. If an expression for
- conditional preprocessing contains a program in square brackets ( [ ] ), the
- program is invoked during preprocessing, and the program's exit code is used
- in the expression. If an !INCLUDE directive is specified for a file, NMAKE
- preprocesses the included file before continuing to preprocess the rest of
- the description file. Preprocessing determines the final description file
- that NMAKE reads.
-
- NMAKE updates targets in the description file.
-
- NMAKE is now ready to update the targets. If you specified targets on the
- command line, NMAKE updates only those targets. If you did not specify
- targets on the command line, NMAKE updates just the first target it finds in
- the description file. (This behavior differs from the MAKE utility's
- default; see Section 10.10, "Differences between NMAKE and MAKE.") If you
- specify a pseudotarget, NMAKE always updates the target. If you use the /A
- option, NMAKE always updates the target, even if the file is not
- out-of-date.
-
- If the dependents of the targets are themselves out-of-date or do not exist
- yet, NMAKE updates them first. If the target has no explicit dependent,
- NMAKE looks in the current directory for one or more files with the same
- base name as the target and whose extensions are in the .SUFFIXES list. (See
- Section 10.3.6, "Directives," for a description of the .SUFFIXES list.) If
- it finds such files, NMAKE treats them as dependents and updates the target
- according to the commands.
-
- Errors usually stop the build.
-
- NMAKE normally stops processing the description file when a command returns
- a nonzero exit code. In addition, if NMAKE cannot tell whether the target
- was built successfully, it deletes the target. If you use the /I
- command-line option, NMAKE ignores error codes and attempts to continue
- processing. The .IGNORE directive has the same effect as the /I option. To
- prevent NMAKE from deleting the partially created target if you interrupt
- the build with CTRL+C or CTRL+BREAK, specify the target name in the
- .PRECIOUS directive.
-
- Alternatively, you can use the dash (-) command modifier to ignore the error
- code for an individual command. An optional number after the dash tells
- NMAKE to continue if the command returns an exit code that is less than or
- equal to the number, and to stop if the exit code is greater than the
- number.
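-
- The rules for stopping or continuing after a command fails can be
- summarized as a small decision function. This is a sketch of the behavior
- described in this section, not NMAKE code; the function and parameter names
- are hypothetical.
-
```python
def build_continues(exit_code, dash=False, limit=None, ignore_all=False):
    """Decide whether the build goes on after a command returns exit_code.

    dash       -- the command carries the dash (-) modifier
    limit      -- the optional number written after the dash
    ignore_all -- /I or .IGNORE is in effect
    """
    if exit_code == 0:
        return True               # success never stops the build
    if ignore_all:
        return True               # all error codes are ignored
    if not dash:
        return False              # a nonzero code normally stops NMAKE
    if limit is None:
        return True               # a bare dash ignores any exit code
    return exit_code <= limit     # -n: continue only up to n


print(build_continues(2))                      # plain command
print(build_continues(2, dash=True))           # -command
print(build_continues(2, dash=True, limit=1))  # -1 command
```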
-
- You can document errors by using the !ERROR directive to print descriptive
- text. The directive causes NMAKE to print some text, then stop, even if you
- use /I, .IGNORE, or the dash (-) modifier.
-
-
- 10.9 A Sample NMAKE Description File
-
- The following example illustrates many of NMAKE's features. The description
- file creates an executable file from C-language source files:
-
- # This description file builds SAMPLE.EXE from SAMPLE.C,
- # ONE.C, and TWO.C, then deletes intermediate files.
-
- CFLAGS = /c /AL /Od $(CODEVIEW) # controls compiler options
- LFLAGS = /CO # controls linker options
- CODEVIEW = /Zi # controls CodeView data
-
- OBJS = sample.obj one.obj two.obj
- all : sample.exe
-
- sample.exe : $(OBJS)
-     link $(LFLAGS) @<<sample.lrf
- $(OBJS: =+^
- )
- sample.exe
- sample.map;
- <<KEEP
-
- sample.obj : sample.c sample.h common.h
-     CL $(CFLAGS) sample.c
-
- one.obj : one.c one.h common.h
-     CL $(CFLAGS) one.c
-
- two.obj : two.c two.h common.h
-     CL $(CFLAGS) two.c
-
- clean :
-     -del *.obj
-     -del *.map
-     -del *.lrf
-
- Assume that this description file is named SAMPLE.MAK. To invoke it, enter
-
- NMAKE /F SAMPLE.MAK all clean
-
- NMAKE then builds SAMPLE.EXE and deletes intermediate files.
-
- Here is how the description file works. The CFLAGS, CODEVIEW, and LFLAGS
- macros define the default options for the compiler, linker, and inclusion of
- CodeView information. You can redefine these options from the command line
- to alter or delete them. For example,
-
- NMAKE /F SAMPLE.MAK CODEVIEW= CFLAGS= all clean
-
- creates an .EXE file that does not contain CodeView information.
-
- The OBJS macro specifies the object files that make up SAMPLE.EXE, so they
- can be reused without having to type them again. Their names are separated
- by exactly one space so that the space can be replaced with a plus sign (+)
- and a carriage return in the link response file. (This is illustrated in the
- second example in Section 10.3.4.4, "Substitution within Macros.")
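-
- To see why exactly one space matters, the substitution can be imitated with
- an ordinary string replacement. The sketch assumes that $(macro:old=new)
- replaces every occurrence of old with new; the helper name substitute is
- hypothetical.
-
```python
def substitute(value, old, new):
    """Model of $(macro:old=new): replace every occurrence of old with new."""
    return value.replace(old, new)


objs = "sample.obj one.obj two.obj"

# $(OBJS: =+^<newline>) replaces each space with a plus sign and a line break.
lines = substitute(objs, " ", "+\n").splitlines()
lines += ["sample.exe", "sample.map;"]
print("\n".join(lines))

# With two spaces between names, an extra line holding only "+" appears.
print(substitute("sample.obj  one.obj", " ", "+\n").splitlines())
```
-
- The first print shows the lines of the generated response file; the second
- shows the stray line that a doubled space would introduce.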
-
- The all pseudotarget points to the real target, SAMPLE.EXE. If you do not
- specify any target on the command line, NMAKE ignores the clean
- pseudotarget but still builds all, since all is the first target in the
- description file.
-
- The dependency line containing the target sample.exe makes the object
- files specified in OBJS the dependents of SAMPLE.EXE. The command section of
- the block contains only link instructions. No compilation instructions are
- given, since they are given explicitly later in the file. (You could also
- define an inference rule to specify how an object file is to be created from
- a C source file.)
-
- The link command is unusual in that the link parameters and options are not
- passed directly to LINK. Rather, an inline response file is created
- containing these elements. This eliminates the need to maintain a separate
- link response file. It also allows the LINK command line to exceed the
- normal limit on the length of a command line (128 characters in DOS, 256
- characters in OS/2).
-
- The next three dependencies define the relationship of the source code to
- the object files. The .H (header or include) files are also dependents,
- since any changes to them would require recompilation.
-
- The clean pseudotarget deletes unneeded files after a build. The dash
- modifier (-) tells NMAKE to ignore errors returned by the deletion commands.
- If you want to save any of these files, don't specify clean on the command
- line; NMAKE then ignores the clean pseudotarget.
-
-
- 10.10 Differences between NMAKE and MAKE
-
- NMAKE replaces the Microsoft MAKE program. NMAKE differs from MAKE in the
- following ways:
-
-
- ■ NMAKE does not evaluate targets sequentially. Instead, NMAKE updates
- the targets you specify when you invoke it, regardless of their
- positions in the description file. If no targets are specified, NMAKE
- updates only the first target in the file.
-
- ■ NMAKE requires a special syntax when specifying a target in more than
- one dependency line. (See Section 10.3.1.8, "Specifying a Target in
- Multiple Description Blocks.")
-
- ■ NMAKE accepts command-line arguments from a file.
-
- ■ NMAKE provides more command-line options.
-
- ■ NMAKE provides more predefined macros.
-
- ■ NMAKE permits substitutions within macros.
-
- ■ NMAKE supports directives placed in the description file.
-
- ■ NMAKE allows you to specify include files in the description file.
-
-
- The first item in the list deserves special emphasis. While MAKE updates
- every target, working from beginning to end of the description file, NMAKE
- expects you to specify targets on the command line. If you do not, NMAKE
- builds only the first target in the description file.
-
- This difference is clear if you run NMAKE using a typical MAKE description
- file, which lists a series of subordinate targets followed by a higher-level
- target that depends on those subordinates:
-
- pmapp.obj : pmapp.c
-     CL /c /G2sw /W3 pmapp.c
-
- pmapp.exe : pmapp.obj pmapp.def
-     LINK pmapp, /align:16, NUL, os2, pmapp
-
- MAKE builds both targets (PMAPP.OBJ and PMAPP.EXE), but NMAKE builds only
- the first target (PMAPP.OBJ).
-
- Because of these behavioral differences, you may want to convert MAKE files
- to NMAKE files. MAKE description files are easy to convert. One way is to
- create a new description block at the beginning of the file. Give this block
- a pseudotarget named all and list the top-level target as a dependent of
- all. To build all, NMAKE must update every file upon which the target all
- depends:
-
- all : pmapp.exe
-
- pmapp.obj : pmapp.c
-     CL /c /G2sw /W3 pmapp.c
-
- pmapp.exe : pmapp.obj pmapp.def
-     LINK pmapp, /align:16, NUL, os2, pmapp
-
- If the above file is named MAKEFILE, you can update the target PMAPP.EXE
- with the command
-
- NMAKE
-
- or the command
-
- NMAKE all
-
- It is not necessary to list PMAPP.OBJ as a dependent of all. NMAKE builds a
- dependency tree for the entire description file and builds whatever files
- are needed to update PMAPP.EXE. If PMAPP.C has a later modification time
- than PMAPP.OBJ, NMAKE compiles PMAPP.C to create PMAPP.OBJ, then links
- PMAPP.OBJ to create PMAPP.EXE.
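-
- The way NMAKE chooses what to update can be sketched as a walk over the
- dependency tree. The model below ignores time stamps (it lists the targets
- NMAKE would consider, not the ones actually rebuilt) and assumes the rules
- contain no cycles; the function name targets_to_update is hypothetical.
-
```python
def targets_to_update(rules, requested=None):
    """List the targets NMAKE considers, in the order it reaches them.

    rules maps each target to its dependents, in description-file order.
    Time stamps are ignored, and the rules are assumed to contain no cycles.
    """
    goals = requested or [next(iter(rules))]   # default: first target only
    order = []

    def visit(target):
        for dependent in rules.get(target, []):
            visit(dependent)                   # dependents come first
        if target in rules and target not in order:
            order.append(target)

    for goal in goals:
        visit(goal)
    return order


make_style = {
    "pmapp.obj": ["pmapp.c"],
    "pmapp.exe": ["pmapp.obj", "pmapp.def"],
}
nmake_style = {"all": ["pmapp.exe"]}
nmake_style.update(make_style)

print(targets_to_update(make_style))    # only the first target
print(targets_to_update(nmake_style))   # everything that all depends on
```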
-
- The same technique is suitable for description files with more than one
- top-level target. List all the top-level targets as dependents of all:
-
- all : pmapp.exe second.exe another.exe
-
- The example updates the targets PMAPP.EXE, SECOND.EXE, and ANOTHER.EXE.
-
- If the description file lists a single, top-level target, you can use an
- even simpler technique. Move the top-level block to the beginning of the
- file:
-
- pmapp.exe : pmapp.obj pmapp.def
-     LINK pmapp, /align:16, NUL, os2, pmapp
-
- pmapp.obj : pmapp.c
-     CL /c /G2sw /W3 pmapp.c
-
- NMAKE updates the second target (PMAPP.OBJ) whenever needed to keep the
- first target (PMAPP.EXE) current.
-
-
- 10.11 Using NMK
-
- When you maintain a project under DOS or in a DOS session under OS/2, you
- will probably need to use the NMK utility. NMK uses only 5K of memory,
- leaving room for the programs called during the build. You run NMK the same
- way you run NMAKE, using the same command-line syntax and the same
- description-file syntax. NMK calls NMAKE to read the description file and
- perform the build.
-
- The behavior of NMK is slightly different from that of NMAKE. The
- fundamental difference is that NMAKE rechecks the update status of all files
- after each build step, whereas NMK checks file status only once, at the
- start of the build process. If your description file simply compiles a
- series of files and then links them, this difference never causes a problem.
- But consider the following example, which uses a pseudotarget to clean up
- old files during the build:
-
- all : clean example.exe
-
- example.exe : example.asm
-     ML example
-
- clean :
-     del example.obj
-     del example.exe
-
- This description file erases EXAMPLE.OBJ and EXAMPLE.EXE, then recompiles.
- Under NMAKE, it works as intended; that is, it
-
-
- 1. Erases files
-
- 2. Checks the status of EXAMPLE.EXE
-
- 3. Rebuilds EXAMPLE.EXE because EXAMPLE.EXE is no longer present
-
-
- However, NMK checks the status of the environment only at the beginning of
- the build. Since EXAMPLE.EXE exists when the build starts, the preceding
- description file
-
-
- 1. Erases files
-
- 2. Stops execution, because EXAMPLE.EXE was present and up-to-date at the
- beginning of the process
-
-
- PWB never generates a description file that requires dynamic status checking
- to run correctly, so you can use PWB-created description files with either
- NMAKE or NMK.
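-
- The difference between the two utilities in the clean example can be
- imitated with a small simulation. This is a simplified model, not a
- description of how either program is implemented: time stamps are not
- modeled, a target is rebuilt only if it is missing, and the name run_build
- is hypothetical.
-
```python
def run_build(files, recheck):
    """Simulate 'all : clean example.exe' from the example above.

    files   -- names of files that exist when the build starts
    recheck -- True models NMAKE (status rechecked after each step);
               False models NMK (status checked once, at the start)
    """
    files = set(files)
    missing_at_start = "example.exe" not in files

    # clean: delete the old object and executable files
    files -= {"example.obj", "example.exe"}

    needs_build = ("example.exe" not in files) if recheck else missing_at_start
    if needs_build:
        files |= {"example.obj", "example.exe"}   # ML example
    return sorted(files)


start = ["example.asm", "example.obj", "example.exe"]
print(run_build(start, recheck=True))    # NMAKE: files are rebuilt
print(run_build(start, recheck=False))   # NMK: only the source remains
```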
-
-
- 10.12 Using Exit Codes with NMAKE
-
- NMAKE stops execution if a program executed by one of the commands in the
- NMAKE description file encounters an error. The exit code returned by the
- program is displayed as part of the error message.
-
- Assume the NMAKE description file TEST contains the following lines:
-
- TEST.OBJ : TEST.FOR
-     FL /c TEST.FOR
-
- If the source code in TEST.FOR causes an error (but not a warning), you
- would see the following message the first time you use NMAKE with the NMAKE
- description file TEST:
-
- NMAKE : fatal error U1077: 'FL /c TEST.FOR' - return code '2'
-
- This error message indicates that the command FL /c TEST.FOR in the NMAKE
- description file returned exit code 2.
-
- You can cause NMAKE to ignore an exit code for a command by preceding the
- command with a dash modifier (-). If you specify a number after the dash
- modifier (-n), NMAKE stops only if the exit code is greater than the
- specified number. (See Table 10.1.) To ignore exit codes for every command
- in the description file, invoke NMAKE with the /I option.
-
- You can also test exit codes in NMAKE description files with the !IF
- preprocessing directive. See Section 10.3.7.2, "Executing a Program in
- Preprocessing."
-
- If you prefer to use DOS batch files instead of NMAKE description files, you
- can test the code returned with the IF command. See a DOS manual for more
- information.
-
- NMAKE returns an exit code to the operating system or the calling program. A
- value of 0 indicates execution of NMAKE with no errors. Warnings return exit
- code 0.
-
- Code    Meaning
- ────────────────────────────────────────────────────────────────────────────
- 0       No error
- 2       Program error
- 4       System error (out of memory)
-
-
- 10.13 Related Topics in Online Help
-
- In addition to information covered in this chapter, information on the
- following topics can be found in online help.
-
- Topics                                Access
- ────────────────────────────────────────────────────────────────────────────
- Syntax and procedural information     From the list of Utilities on the
- on NMAKE                              "Microsoft Advisor Contents" screen,
-                                       choose "NMAKE"
-
- Using TOOLS.INI                       From the "Microsoft Advisor Contents"
-                                       screen, choose "Programmer's
-                                       WorkBench"; then choose "Using
-                                       TOOLS.INI" from the list of topics
-                                       relating to customizing PWB
-
-
-
-
-
-
-
- Chapter 11 Creating Help Files with HELPMAKE
- ────────────────────────────────────────────────────────────────────────────
-
- If you've used the Programmer's WorkBench (PWB) or one of the Microsoft
- Quick languages, you already know the advantages of online help, or the
- Microsoft Advisor. The Microsoft Help File Maintenance utility (HELPMAKE)
- lets you extend these advantages by customizing the help files supplied with
- Microsoft language products, or by creating your own help files for them.
-
- HELPMAKE translates help text files into a help database accessible within
- these environments:
-
-
- ■ Microsoft Programmer's WorkBench (PWB)
-
- ■ Microsoft QuickHelp utility
-
- ■ Microsoft CodeView debugger
-
- ■ Microsoft Editor version 1.02
-
- ■ Microsoft QuickC compiler versions 2.0 and later
-
- ■ Microsoft QuickBasic(tm) versions 4.5 and later
-
- ■ Microsoft QuickPascal(tm) version 1.0
-
- ■ Microsoft Word version 5.5
-
-
- This chapter describes how to create and modify help files using the
- HELPMAKE utility.
-
-
- 11.1 Structure and Contents of a Help Database
-
- HELPMAKE creates a help database from one or more input files that contain
- information formatted for the help system. This section defines some of the
- terms involved in formatting and outlines the formats that HELPMAKE can
- process.
-
-
- 11.1.1 Contents of a Help File
-
- Each help input file consists of one or more help "topics." A topic is the
- fundamental unit of help information. It is usually a screenful of
- information about a particular subject. You identify the subject by one or
- more "context strings," which are the words and phrases for which you want
- to be able to request help. When help is requested on a context string, the
- topic is displayed.
-
- The .context command defines a context string for the topic that follows it.
- In the source file for C help, for example, this line introduces help for
- the #include directive:
-
- .context #include
-
- The .context command and other formatting elements are described in Section
- 11.5, "Help Text Conventions."
-
- Whether a context string contains one word or several words depends on the
- application. For example, because Microsoft QuickBasic considers spaces to
- be delimiters, a context string in QuickBasic help files is limited to a
- single word. Other applications, such as PWB, can handle context strings
- that span several words. In either case, the application hands the context
- string to an internal "help engine" that searches the database for
- information.
-
- Often, especially with library routines, the same information applies to
- more than one subject. For example, the C-language string-to-number
- functions strtod, strtol, and strtoul share the same help text. The help
- file lists all three function names as contexts for one block of topic text.
- The converse, however, is not true. You cannot associate a single context
- string with several blocks of topic text located at different places in the
- help file.
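-
- The relationship between contexts and topics amounts to a many-to-one
- mapping, which the following Python sketch models with a dictionary. It is
- an illustration only: the class HelpIndex is hypothetical, the topic text is
- made up, and how HELPMAKE itself reacts to a duplicate context is not stated
- here (the sketch simply rejects it).
-
```python
class HelpIndex:
    """Toy model of the context-to-topic mapping in a help file."""

    def __init__(self):
        self._topics = {}   # context string -> topic text

    def add_topic(self, contexts, text):
        """Register one block of topic text under one or more contexts."""
        for context in contexts:
            if context in self._topics:
                # One context cannot name several blocks of topic text.
                raise ValueError("context already defined: " + context)
        for context in contexts:
            self._topics[context] = text

    def lookup(self, context):
        """Return the topic text for a context, or None if it is unknown."""
        return self._topics.get(context)


index = HelpIndex()
index.add_topic(["strtod", "strtol", "strtoul"], "Convert strings to numbers.")

# All three contexts lead to the same block of topic text.
print(index.lookup("strtol") == index.lookup("strtoul"))
```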
-
- Cross-references help you navigate a help database.
-
- Cross-references make it possible to view information about related topics,
- including header files and code examples. The help for the C-language open
- function, for example, references the access function. Cross-references can
- point to other contexts in the same help database, to contexts in other help
- databases, or even to ASCII files outside the database.
-
- Help files can have two kinds of cross-references:
-
-
- ■ Implicit
-
- ■ Explicit, or hyperlinks
-
-
- Implicit cross-references are coded with an ordinary .context command.
-
- The word "open" is an implicit cross-reference throughout Microsoft C help,
- and introduces help for the open function. If you select the word "open"
- anywhere in C help, the help system displays information on the open
- function. The context for open begins with an ordinary .context command. As
- a result, anywhere that you select "open," the help system references this
- context.
-
- Hyperlinks are explicit cross-references marked by invisible text.
-
- A "hyperlink" is an explicit cross-reference tied to a word or phrase at a
- specific location in the help file. You create hyperlinks when you write the
- help text. The hyperlink consists of a word or phrase followed by invisible
- text that gives the context to which the hyperlink refers.
-
- For example, to cause an instance of the word "formatting" to display help
- on the printf function, you would create an explicit cross-reference from
- the word "formatting" to the context "printf." Elsewhere in the file,
- "formatting" has no special significance, but at that one position, it
- references the help for printf. For details on how to create hyperlinks, see
- Section 11.5.4.
-
- Formatting flags let you change the appearance of text.
-
- Help text can also include formatting attributes to control the appearance
- of the text on the screen. Using these attributes, you can make certain
- words appear in various colors, inverse video, and so forth, depending on
- the application displaying help and the graphics capabilities of your
- computer.
-
-
- 11.1.2 Help File Formats
-
- You can create sources for help text files in any of three formats:
-
-
- ■ QuickHelp format
-
- ■ Rich Text Format (RTF)
-
- ■ Minimally formatted ASCII
-
-
- In addition, you can reference unformatted ASCII files, such as include
- files, from within a help database.
-
- An entire help system (such as the ones supplied with Microsoft C, FORTRAN,
- MASM, or QuickBasic) can use any combination of files formatted with
- different format types. With C, for example, the README.DOC information file
- is encoded as minimally formatted ASCII; the help files for the PWB, C
- language, and run-time library are written in QuickHelp format before being
- compressed by HELPMAKE. The database also cross-references the header
- (include) files, which are unformatted ASCII files stored outside the
- database.
-
-
- QuickHelp
-
- QuickHelp format is the default format into which HELPMAKE decodes help
- databases. Any text editor can create a QuickHelp-format help text file.
- QuickHelp format also lends itself to a relatively easy automated
- translation from other document formats.
-
- QuickHelp files can contain any kind of cross-reference or formatting
- attribute. Typically, you use QuickHelp format when modifying a
- Microsoft-supplied database.
-
- QuickHelp format makes use of dot commands such as .context (see the
- description of QuickHelp dot commands in Section 11.6.1). To use dot
- commands other than .context and .comment, the /T option is required for
- encoding and decoding. For details, see Section 11.3, "HELPMAKE Options."
-
-
- Rich Text Format
-
- Rich Text Format (RTF) is a Microsoft word-processing format that several
- word processors support, including Microsoft Word version 5.0 and later, and
- Microsoft Word for Windows. You can use RTF as an intermediate format to
- simplify transferring help files from one format to another. Like QuickHelp
- files, RTF files can contain formatting attributes and cross-references.
-
- An RTF word processor provides the easiest way to create an RTF file, but
- you can manually insert RTF codes with an ordinary text editor. There are
- also utility programs that convert text files in other formats to RTF
- format.
-
- See Section 11.6.2, "Rich Text Format," for more information.
-
-
- Minimally Formatted ASCII
-
- Minimally formatted ASCII files define contexts and their topic text; they
- cannot contain screen-formatting commands or explicit cross-references.
- (Implicit cross-references work the same way they do in the other formats.)
- Minimally formatted ASCII files are often used to display text in a
- README.DOC or small help files that do not require compression. See Section
- 11.6.3, "Minimally Formatted ASCII Format," for more information.
-
-
- Unformatted ASCII
-
- Unformatted ASCII files are exactly what their name implies: regular ASCII
- files with no formatting commands, context definitions, or special
- information. HELPMAKE does not process unformatted ASCII files in any
- special way. An unformatted ASCII file does not become part of the help
- database; only its name is used as the object of a cross-reference.
- Unformatted ASCII files are useful for storing program examples. Any word
- that is an implicit cross-reference in other help files is also an implicit
- cross-reference in unformatted ASCII files.
-
-
- 11.2 Invoking HELPMAKE
-
- The HELPMAKE program can encode to create new help files or decode to modify
- existing ones. Encoding converts a text file to a compressed help database.
- HELPMAKE can encode text files written in QuickHelp, RTF, and minimally
- formatted ASCII format. Decoding converts a help database to a text file for
- editing. Regardless of the source format, HELPMAKE always decodes a help
- database into a QuickHelp-format text file.
-
- You invoke HELPMAKE with the following syntax:
-
- HELPMAKE {/E«n» | /D«c» | /H | /?} «options» sourcefiles
-
- The options modify the action of HELPMAKE; they are described in Section
- 11.3, "HELPMAKE Options."
-
- You must supply either the /E (encode) or the /D (decode) option. When
- encoding, you must also use the /O option to specify the file name of the
- database.
-
- The sourcefiles field is required. It specifies the input file(s) for
- HELPMAKE. If you use the /D (decode) option, sourcefiles can be one or more
- help database files (such as PWB.HLP). HELPMAKE decodes the database files
- to the standard output device. If you use the /E (encode) option,
- sourcefiles can be one or more help text files (such as PWB.SRC). File names
- are separated with a space. You can use standard wild-card characters to
- specify a group of related files.
-
- The example below invokes HELPMAKE with the /V, /E, and /O options (see
- Section 11.3.1, "Options for Encoding"). HELPMAKE reads input from the text
- file my.txt and writes the compressed help database in the file my.hlp.
- The /E option, without a compression specification, maximizes compression.
- Note that the DOS or OS/2 redirection symbol (>) sends a log of HELPMAKE
- activity to the file my.log. You may want to redirect the log file because,
- in its verbose mode (given by /V), HELPMAKE can generate a lengthy log.
-
- HELPMAKE /V /E /Omy.hlp my.txt > my.log
-
- The example below invokes HELPMAKE to decode the help database my.hlp into
- the text file my.src, given with the /O option. Once again, the /V option
- results in verbose output, and the output is directed to the log file
- my.log. Section 11.3.2 describes additional options for decoding.
-
- HELPMAKE /V /D /Omy.src my.hlp > my.log
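-
- If you drive HELPMAKE from a script, the encoding command line can be
- assembled from its parts. The Python sketch below only builds the argument
- list and runs nothing; the function name helpmake_encode_args and the
- technique names are hypothetical, while the numeric values are the ones
- listed for the /E option in Section 11.3.1.
-
```python
# Compression technique values listed for the /E option in Section 11.3.1.
TECHNIQUES = {"run_length": 1, "keyword": 2, "extended_keyword": 4, "huffman": 8}


def helpmake_encode_args(source, database, techniques=None, verbose=False):
    """Build the argument list for an encoding run (nothing is executed)."""
    args = ["HELPMAKE"]
    if verbose:
        args.append("/V")
    if techniques is None:
        args.append("/E")    # no number: compress as much as possible
    else:
        # The number is the sum of the values of the chosen techniques.
        args.append("/E%d" % sum(TECHNIQUES[name] for name in techniques))
    args.append("/O" + database)
    args.append(source)
    return args


print(" ".join(helpmake_encode_args("my.txt", "my.hlp", verbose=True)))
print(" ".join(helpmake_encode_args("my.txt", "my.hlp",
                                    techniques=["run_length", "keyword"])))
```
-
- The first call reproduces the encoding example above (without the
- redirection to my.log); the second selects run-length and keyword
- compression.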
-
-
- 11.3 HELPMAKE Options
-
- HELPMAKE accepts the command-line options described below. You can specify
- options in uppercase or lowercase letters and precede them with either a
- forward slash ( / ) or a dash ( - ). Most options apply only to encoding,
- others apply only to decoding, and a few apply to both. The /T option is
- required if you want to use dot commands with the QuickHelp format (which is
- the default format).
-
-
- 11.3.1 Options for Encoding
-
- When you encode a file (that is, when you build a help database), you must
- specify the /E option. HELPMAKE also accepts other options to control
- encoding. The encoding options are listed below:
-
- Option Action
- ────────────────────────────────────────────────────────────────────────────
- /Ac Specifies c as an
- application-specific control
- character for the help
- database file. The character
- marks a line that contains
- special information for
- internal use by the
- application. For example, the
- Microsoft Advisor uses the
- colon (:).
-
- /C Makes context strings for this
- help file case sensitive.
-
- /E«n» Creates (encodes) a help
- database from a specified text
- file. The n specifies the
- type(s) of compression. If n
- is omitted, HELPMAKE
- compresses the file as much as
- possible (about 50%). The
- value of n is in the range
- 0-15. It is the sum of
- successive integral powers of
- 2 representing various
- compression techniques:
-
- Value Technique
-
- 0 No compression
-
- 1 Run-length compression
-
- 2 Keyword compression
-
- 4 Extended keyword compression
-
- 8 Huffman compression
-
- Add values to combine
- compression techniques. For
- example, use /E3 to get
- run-length and keyword
- compression. Use /E0 in the testing
- stages of help database
- creation where you need to
- create the database quickly
- and are not yet concerned with
- size.
-
- /Kfilename Optimizes keyword compression
- by supplying a list of
- characters that act as word
- separators. The filename is a
- file containing your list of
- separator characters.
-
-
- The /E2 and /E3 options tell
- HELPMAKE to identify
- "keywords"─words occurring
- often enough to justify
- replacing them with shorter
- character sequences. A word is
- any series of characters that
- do not appear in the separator
- list. The default separator
- list includes all ASCII
- characters from 0 to 32, ASCII
- character 127, and the
- following characters:
-
- ! " # & ` ' ( ) * + - , / : ;
- < = > ? @ [ \ ] ^ _ { | } ~
-
- You can improve keyword
- compression by designing a
- separator list tailored to a
- specific help file. If your
- help file contains #include
- directives, #include is
- encoded (by default) as
- include. To encode #include as
- a keyword, create a separator
- list that omits the #:
-
- ! " & ` ' ( ) * + - , / : ;
- < = > ? @ [ \ ] ^ _ { | } ~
-
- Characters in the range 0-31
- are always separators, so you
- need not include them. A
- customized list must include
- all other separators, however,
- including the space (which
- follows ! in the list above).
- If you omit the space,
- HELPMAKE encodes sequences of
- words as keywords.
-
- /L Locks the generated file so
- that it cannot later be
- decoded.
-
- /NOLOGO Suppresses the HELPMAKE
- copyright message.
-
- /Ooutfile Specifies outfile as the name
- of the help database.
-
- /Sn Specifies the type of input
- file, according to the
- following n values:
-
- Option File Type
-
- /S1 Rich Text Format (RTF)
-
- /S2 QuickHelp (default)
-
- /S3 Minimally formatted ASCII
-
- /T Translates dot commands into
- internal format. If your help
- file contains dot commands
- other than .context and
- .comment, you must supply this
- option when encoding it. Dot
- commands are described in
- Section 11.6.1, "QuickHelp
- Format," and in later sections.
- The /T option causes the
- option /A: to be assumed.
-
-
- /V«n» Controls verbosity of
- diagnostic and informational
- output. Larger values of n add
- more information. Omitting n
- produces a full listing. The
- values of n are listed below:
-
- Option Output
-
- /V Maximum diagnostic output
-
- /V0 No diagnostic output and no
- banner
-
- /V1 HELPMAKE banner only
-
- /V2 Pass names
-
- /V3 Contexts on first pass
-
- /V4 Contexts on each pass
-
- /V5 Any intermediate steps within
- each pass
-
- /V6 Statistics on help file and
- compression
-
- /Wwidth Indicates the fixed width of
- the resulting help text in
- number of characters. The
- value of width can range from
- 11 to 255. If the /W option is
- omitted, the default is 76.
- When encoding an RTF source
- (/S1), HELPMAKE automatically
- formats the text to width.
- When encoding QuickHelp (/S2)
- or minimally formatted ASCII
- (/S3) files, HELPMAKE
- truncates lines to this width.
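-
- As a hedged illustration of the /E values listed above, the Python sketch
- below treats n as a set of bit flags. It is not part of HELPMAKE; the names
- Compression, encode_option, and describe are hypothetical and exist only for
- this example. It shows how adding values combines techniques and how a value
- of n can be broken back down into its parts.

```python
from enum import IntFlag


class Compression(IntFlag):
    """Compression techniques accepted by /En, as listed in the table."""
    NONE = 0
    RUN_LENGTH = 1
    KEYWORD = 2
    EXTENDED_KEYWORD = 4
    HUFFMAN = 8


def encode_option(techniques):
    """Return the /En option string for a combination of techniques."""
    return f"/E{int(techniques)}"


def describe(n):
    """List the technique names selected by a value of n in the range 0-15."""
    if not 0 <= n <= 15:
        raise ValueError("n must be in the range 0-15")
    if n == 0:
        return ["NONE"]
    parts = (Compression.RUN_LENGTH, Compression.KEYWORD,
             Compression.EXTENDED_KEYWORD, Compression.HUFFMAN)
    # A technique is selected when its bit is set in n.
    return [t.name for t in parts if n & t]
```

- For instance, encode_option(Compression.RUN_LENGTH | Compression.KEYWORD)
- yields the /E3 form mentioned in the table.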
-
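- The /K description defines a word as any series of characters that do not
- appear in the separator list. The Python sketch below models only that
- definition, to show why omitting # from the list keeps #include whole. It
- is an assumption-laden illustration: the names DEFAULT_LIST and split_words
- are hypothetical, and the sketch says nothing about the layout of the file
- that HELPMAKE reads with /K.

```python
# Punctuation from the default separator list shown in the /K description.
DEFAULT_PUNCTUATION = "!\"#&`'()*+-,/:;<=>?@[\\]^_{|}~"

# Characters 0-31 are always separators, whatever list is supplied.
ALWAYS_SEPARATORS = {chr(code) for code in range(32)}

# The default list also contains the space (32) and ASCII character 127.
DEFAULT_LIST = " " + DEFAULT_PUNCTUATION + chr(127)


def split_words(text, separator_list=DEFAULT_LIST):
    """Split text into words, where a word is a run of non-separators."""
    separators = ALWAYS_SEPARATORS | set(separator_list)
    words, current = [], []
    for ch in text:
        if ch in separators:
            if current:
                words.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        words.append("".join(current))
    return words
```

- With the default list, split_words("#include <stdio.h>") separates the #
- from include. Passing DEFAULT_LIST.replace("#", "") keeps #include together,
- and passing a list without the space makes a whole phrase count as one word.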
-
-
-
- 11.3.2 Options for Decoding
-
- The /D option decodes a help database into QuickHelp files. HELPMAKE also
- accepts other options to control decoding. The decoding options are listed
- below:
-
- Option Action
- ────────────────────────────────────────────────────────────────────────────
- /D«c» Decodes the input file into
- its original text or
- component parts. If a
- destination file is not
- specified with the /O option,
- the help file is decoded to
- the standard output device.
- The form of decoding is
- controlled by the form of
- /D«c» specified:
-
- Form Effect
-
- /D Fully decodes the help
- database, leaving all
- cross-references and
- formatting information intact.
-
- /DS Splits a concatenated help
- database into its components
- using their original names. If
- the database was not created
- by concatenation, HELPMAKE
- copies it to a file with its
- original name. The database is
- not decompressed.
-
- /DU Decompresses the database and
- removes all screen formatting
- and cross-references. The output can be
- used later for input and
- recompression, but all screen
- formatting and
- cross-references are lost.
-
- /NOLOGO Suppresses the HELPMAKE
- copyright message.
-
- /O«outfile» Specifies outfile for the
- decoded output from HELPMAKE.
- If outfile is omitted, the
- help database is decoded to
- the standard output device.
- HELPMAKE always decodes help
- database files into QuickHelp
- format.
-
- /T Translates dot commands from
- internal format into
- dot-command format. You must
- always supply this option
- when decoding a help database
- that contains dot commands
- other than .context and
- .comment.
-
-
-
-
- /V«n» Controls verbosity of
- diagnostic and informational
- output. Larger values of n
- add more information.
- Omitting n produces a full
- listing. The values of n are
- listed below:
-
- Option Output
-
- /V Maximum diagnostic output
-
- /V0 No diagnostic output and no
- banner
-
- /V1 HELPMAKE banner only
-
- /V2 Pass names
-
- /V3 Contexts on first pass
-
-
-
-
- 11.3.3 Options for Help
-
- The following are the options for help.
-
- Option Action
- ────────────────────────────────────────────────────────────────────────────
- /? Displays a brief summary of HELPMAKE
- command-line syntax and exits without
- encoding or decoding any files. All
- other information on the command line is
- ignored.
-
-
- / «HELP» Calls the QuickHelp utility and displays
- help about HELPMAKE. If HELPMAKE cannot
- find QuickHelp or the help file, it
- displays the same information as with
- the /? option. No files are encoded or
- decoded. All other information on the
- command line is ignored.
-
-
-
- 11.4 Creating a Help Database
-
- There are two ways to create a Microsoft-compatible help database.
-
- The first method is to decompress an existing help database, modify the
- resulting help text file, and recompress the help text file to form a new
- database.
-
- The second method is to append a new help database to an existing help
- database. This method involves the following steps:
-
-
- 1. Create a help text file in QuickHelp format, RTF, or minimally
- formatted ASCII.
-
- 2. Use HELPMAKE to create a help database file. The example below invokes
- HELPMAKE, using yourhelp.txt as the input file and producing a help
- database file named yourhelp.hlp:
-
- HELPMAKE /V /E /Oyourhelp.hlp yourhelp.txt > yourhelp.log
-
-
- 3. Back up the existing database.
-
- 4. Append the new help database file to the existing database. The
- example below appends the new database yourhelp.hlp to the
- alang.hlp database. (In the example, the /b modifier for the DOS
- COPY command combines the files as binary files.)
-
- COPY alang.hlp /b + yourhelp.hlp /b
-
-
- 5. Test the database. Assume yourhelp.hlp contains the context sample.
- If you type sample in PWB and request help on it, the help window
- should display the text associated with the context sample.
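-
- Steps 3 and 4 above can be sketched in Python for readers who prepare help
- databases on another host. This is a minimal sketch, not a replacement for
- the COPY command: the function name append_database and the .bak backup
- name are hypothetical, and the sketch assumes that appending the raw bytes
- of the new database to the existing file matches the binary COPY shown in
- step 4.

```python
import shutil
import tempfile
from pathlib import Path


def append_database(existing, new):
    """Back up `existing`, then append the bytes of `new` to it."""
    existing, new = Path(existing), Path(new)
    backup = existing.with_name(existing.name + ".bak")  # hypothetical name
    shutil.copy2(existing, backup)                        # step 3: back up
    # Step 4: open both files in binary mode so no byte is translated.
    with existing.open("ab") as out, new.open("rb") as src:
        shutil.copyfileobj(src, out)
    return backup


if __name__ == "__main__":
    # Try the function on throwaway files rather than real help databases.
    with tempfile.TemporaryDirectory() as tmp:
        old_db = Path(tmp) / "alang.hlp"
        new_db = Path(tmp) / "yourhelp.hlp"
        old_db.write_bytes(b"existing")
        new_db.write_bytes(b"appended")
        append_database(old_db, new_db)
        print(old_db.read_bytes())
```

- Keeping the backup until step 5 succeeds makes it easy to restore the
- original database if the combined file does not behave as expected.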
-
-
- ────────────────────────────────────────────────────────────────────────────
- WARNING
-
- The PWB editor truncates lines longer than about 250 characters. Some
- databases contain lines longer than this. To edit or create database files
- with extremely long lines, you must either use an editor (such as Microsoft
- Word) that does not restrict line length, or extend long lines using the
- backslash (\) line-continuation character.
- ────────────────────────────────────────────────────────────────────────────
-
-
- 11.5 Help Text Conventions
-
- The source text that HELPMAKE uses to create Microsoft help databases must
- follow specific organizational conventions. The following sections explain
- these conventions.
-
-
- 11.5.1 Structure of the Help Text File
-
- The Microsoft help system is simply a data-retrieval tool. It imposes no
- restrictions on the content or organization of help data. However, the
- HELPMAKE utility and the data-display routines in the help system expect a
- help file to follow a standard format. This section explains how to create
- correctly formatted help text files.
-
- In all three help text formats, the help text source file is a sequence of
- topics, each preceded by one or more context definitions. The following
- table lists the various formats and the corresponding context definition
- statements:
-
- Format Context Definition
- ────────────────────────────────────────────────────────────────────────────
- QuickHelp .context context
-
- RTF \par >>context \par
-
- Minimally formatted ASCII >>context
-
- Unformatted ASCII None
-
- In QuickHelp format, each topic begins with one or more .context statements.
- These statements link the context string to its topic text. The topic text
- consists of all subsequent lines up to the next .context statement.
-
- In RTF format, each context definition must be in a paragraph of its own
- (denoted by \par), beginning with the help delimiter (>>). As in QuickHelp,
- the topic text consists of all subsequent paragraphs up to the next context
- definition.
-
- In minimally formatted ASCII, each context definition must be on a separate
- line, and each must begin with the help delimiter (>>). As in RTF and
- QuickHelp files, all subsequent lines up to the next context definition
- constitute the topic text.
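-
- The "contexts, then topic text" structure described above can be sketched
- as a small Python parser. It is an illustration only, not HELPMAKE's own
- algorithm: the names parse_topics and duplicate_contexts are hypothetical,
- RTF is left out, and context strings are compared exactly as written.

```python
def _context_of(line, fmt):
    """Return the context string a line defines, or None for topic text."""
    if fmt == "quickhelp" and line.startswith(".context "):
        return line[len(".context "):].strip()
    if fmt == "ascii" and line.startswith(">>"):
        return line[2:].strip()
    return None


def parse_topics(lines, fmt="quickhelp"):
    """Group source lines into (contexts, topic_lines) pairs."""
    topics = []
    contexts, body = [], []
    for line in lines:
        context = _context_of(line, fmt)
        if context is not None:
            if body:
                # A context after topic text starts the next topic.
                topics.append((contexts, body))
                contexts, body = [], []
            contexts.append(context)
        elif contexts:
            body.append(line)
    if contexts:
        topics.append((contexts, body))
    return topics


def duplicate_contexts(topics):
    """Return context strings defined more than once, in first-seen order."""
    seen, duplicates = set(), []
    for contexts, _ in topics:
        for context in contexts:
            if context in seen and context not in duplicates:
                duplicates.append(context)
            seen.add(context)
    return duplicates
```

- Running duplicate_contexts over a parsed file gives an early check for the
- duplicate definitions that HELPMAKE warns about.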
-
- See Section 11.6, "Using Help Database Formats," for detailed information
- about these three formats.
-
- ────────────────────────────────────────────────────────────────────────────
- WARNING
-
- HELPMAKE warns you if it encounters a duplicate context string definition
- within a given help source file. Each context string must be unique.
- ────────────────────────────────────────────────────────────────────────────
-
-
- 11.5.2 Local Contexts
-
- Context strings beginning with the "at" sign (@) are "local." Making a
- context local saves file space and speeds access. However, local contexts
- cannot be cross-referenced with an implicit link, and they have no meaning
- outside the local file.
-
- When you use a local context, HELPMAKE does not generate a global context
- string (a context string that is known throughout the help system). Instead,
- it embeds an encoded cross-reference that has meaning only within the
- current context. For example,
-
- .context normal
- This is a normal topic, accessible by the context string "normal".
- [button\v@local\v] is a cross-reference to the following topic.
-
- .context @local
-
- This topic can be reached only by the explicit cross-reference
- in the previous topic (or by browsing the file sequentially).
-
- In the example above, the text button\v@local\v references local as a
- local context. If the user selects the text button or scrolls through the
- file, the help system displays the topic text that follows the context
- definition for local. Because local is defined with the "at" sign (@), it
- can be accessed only by a hyperlink within the same help file or by
- sequentially browsing the file.
-
- If you want a topic to be accessible in both local and global contexts, you
- simply mark the topic text with both global and local .context statements.
- For example, to make topic both global and local, add the following
- statements:
-
- .context topic
- .context @topic
-
- Naturally, both .context statements must appear immediately before the topic
- text to which they point.
-
- To create a context that begins with a literal @, precede it with a
- backslash ( \ ).
-
-
- 11.5.3 Context Prefixes
-
- Microsoft help databases use several "context prefixes." A context prefix is
- a single letter followed by a period. It appears before a context string
- with a predefined meaning. These contexts may appear in the resulting text
- file when you decode a Microsoft help database.
-
- Except for the h. prefix described below, the context prefixes are used by
- Microsoft to mark environment- or product-specific features. You would not
- normally add them to the help files you write.
-
- You can use the h. prefix to identify standard help-file contexts. For
- instance, h.default identifies the default help screen (the screen that
- normally appears when you select top-level help). Table 11.1 lists the
- standard h. contexts.
-
- Table 11.1 Standard h. Contexts
-
- Context Description
- ────────────────────────────────────────────────────────────────────────────
- h.contents The table of contents for the help file.
- You should also define the string
- "contents" for direct reference to this
- context.
-
- h.default The default help screen, typically
- displayed when the user presses SHIFT+F1
- at the "top level" in some applications.
-
- h.index The index for the help file. You can
- also define the string "index" for
- direct reference to this context.
-
- h.notfound The help text displayed by some
- applications when the help system cannot
- find information about the requested
- context. The text could be an index of
- contexts, a topical list, or general
- information about using help.
-
- h.pg# A specific page within the help file.
- This is used in response to a "go to
- page #" request.
-
- h.pg$ The help text that is logically last in
- the file. This is used by some
- applications in response to a "go to the
- end" request made within the help window.
-
- h.pg1 The help text that is logically first in
- the file. This is used by some
- applications in response to a "go to the
- beginning" request made within the help
- window.
-
- h.title The title of the help database.
-
- ────────────────────────────────────────────────────────────────────────────
-
-
-
- The context prefixes in Table 11.2 are internal to Microsoft products. They
- appear in decompressed databases, but you do not need to use them.
-
- Table 11.2 Microsoft Product Context Prefixes
-
- Prefix Purpose
- ────────────────────────────────────────────────────────────────────────────
- d. Dialog box. Each dialog box is assigned
- a number. Its help context string is d.
- followed by the number (for example,
- d.12).
-
- e. Error number. If a product supports the
- error-numbering scheme used by Microsoft
- languages, it displays help for each
- error using this prefix. For example,
- the context e.P0105 refers to the
- Microsoft QuickPascal Compiler error
- message number P0105.
-
- h. Help item. Prefixes miscellaneous help
- context strings that may be constructed
- or otherwise hidden from the user. For
- example, most applications look for the
- context string h.contents when Contents
- is chosen from the Help menu.
-
- m. Menu item. Contexts that relate to
- product menu items are defined by their
- shortcut keys. For example, the Exit
- selection on the File menu item is
- accessed by ALT+F, X and is referenced
- in help by m.f.x.
-
- n. Message number. Each message box is
- assigned a number. Its help context
- string is n. plus the number (for
- example, n.5).
-
- ────────────────────────────────────────────────────────────────────────────
-
-
-
-
- 11.5.4 Hyperlinks
-
- Explicit cross-references, or hyperlinks, are marked with invisible text in
- the help text file. A hyperlink is a word or phrase followed by invisible
- text that names the context to which the hyperlink refers.
-
- The keystroke that activates the hyperlink depends on the application.
- Consult the documentation for each product for the specific keystroke.
-
- When the user activates the hyperlink, the help system displays the topic
- referenced by the invisible text. The invisible cross-reference text is
- formatted as one of the following:
-
- Hidden Text Action
- ────────────────────────────────────────────────────────────────────────────
- contextstring Displays the topic associated with
- contextstring. For example, exeformat
- displays the topic text for the context
- exeformat.
-
- filename! Treats filename as a single topic to be
- displayed. For example,
- $INCLUDE:stdio.h! searches the
- directories in the INCLUDE environment
- variable for file stdio.h and displays
- it as a single help topic.
-
- filename!contextstring Works the same as contextstring, except
- only the help file filename is searched
- for the context. If the file is not
- already open, the help system finds it
- (by searching either the current path or
- an explicit environment variable) and
- opens it. For example,
- $BIN:readme.doc!patches searches for
- readme.doc in the BIN environment
- variable and displays the topic
- associated with patches.
-
- !command Executes the command specified after the
- exclamation point (!).
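-
- The four forms of hidden text in the table above can be told apart by the
- position of the exclamation point. The Python sketch below is one way to
- classify them. It is not the help system's own parser; the names Target,
- parse_hidden_text, and split_env are hypothetical.

```python
from typing import NamedTuple, Optional


class Target(NamedTuple):
    kind: str                 # "context", "file", "file_context", "command"
    filename: Optional[str]   # set for "file" and "file_context"
    value: Optional[str]      # context string or command text


def parse_hidden_text(hidden):
    """Classify the invisible text of a hyperlink."""
    if hidden.startswith("!"):
        return Target("command", None, hidden[1:])
    filename, bang, context = hidden.partition("!")
    if not bang:
        return Target("context", None, hidden)
    if not context:
        return Target("file", filename, None)
    return Target("file_context", filename, context)


def split_env(filename):
    """Split "$VAR:name" into (VAR, name); return (None, filename) otherwise."""
    if filename.startswith("$") and ":" in filename:
        variable, _, name = filename[1:].partition(":")
        return variable, name
    return None, filename
```

- For example, parse_hidden_text("$BIN:readme.doc!patches") reports a
- file_context target, and split_env on its filename separates the BIN
- environment variable from readme.doc.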
-
- In the following example, the word Example is a hyperlink. The \b, \p, and
- \v formatting flags mark hyperlinks in the help text. (The formatting flags
- are listed later in this chapter, in Table 11.4.)
-
- \bSee also:\p Example\vopen.ex\v
-
- The hyperlink refers to open.ex. If you select any of the letters of
- Example, the help system displays the topic whose context is open.ex. On
- the screen, this line appears as follows:
-
- See also: Example
-
- An application might display See also: and Example in different colors
- or character types, depending on factors such as your default color
- selection and type of monitor.
-
- When a hyperlink needs to cross-reference more than one word, you must use
- an anchor, as in the following example:
-
- \bSee also:\p \uExample\p\vprintf.ex\v, fprintf, scanf, sprintf,
- vfprintf, vprintf, vsprintf
- \aformatting table\vprintf.table\v
-
- This part of the example is an anchored hyperlink:
-
- \aformatting table\vprintf.table\v
-
- The anchor must fit on one line.
-
- The \a flag creates an anchor for the cross-reference. In the example, the
- phrase following the \a flag (formatting table) is the hyperlink. It refers
- to the context printf.table. The first \v flag marks both the end of the
- hyperlink and the beginning of the invisible text. The name printf.table
- is invisible; it does not appear on the screen when the help is displayed.
- The second \v flag ends the invisible text.
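-
- The roles of the \a and \v flags described above can be sketched as a small
- Python scanner that collects (hyperlink, context) pairs from one source
- line. It is a sketch under stated assumptions, not HELPMAKE's algorithm:
- the name extract_hyperlinks is hypothetical, and without an anchor the
- hyperlink is assumed to be the last whitespace-delimited visible word
- before the first \v.

```python
def extract_hyperlinks(line):
    """Return (hyperlink, context) pairs found in one QuickHelp source line."""
    links = []
    visible = []       # visible characters collected so far
    hidden = []        # characters of the current invisible run
    anchor = None      # position in `visible` where \a was seen, if any
    invisible = False  # toggled by \v
    i = 0
    while i < len(line):
        ch = line[i]
        if ch == "\\" and i + 1 < len(line):
            flag = line[i + 1]
            i += 2
            if flag == "\\":
                # \\ stands for one literal backslash.
                (hidden if invisible else visible).append("\\")
            elif flag == "a":
                anchor = len(visible)
            elif flag in "vV":
                if invisible:
                    # Second \v: the invisible run is the context.
                    text = "".join(visible)
                    if anchor is not None:
                        label = text[anchor:].strip()
                    else:
                        words = text.split()
                        label = words[-1] if words else ""
                    links.append((label, "".join(hidden)))
                    hidden, anchor = [], None
                invisible = not invisible
            # \b, \i, \u, and \p only change display attributes.
            continue
        (hidden if invisible else visible).append(ch)
        i += 1
    return links
```

- Applied to the anchored example above, the scanner pairs the phrase
- formatting table with the context printf.table.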
-
-
- 11.6 Using Help Database Formats
-
- A database can be written in any of three text formats. The list below
- briefly describes these types. Sections 11.6.1-11.6.3 describe the
- formatting types in detail.
-
- An entire help system (such as the one supplied with PWB or QuickC) can
- handle any combination of formats. For example, the help files for Microsoft
- C are written in QuickHelp format, and the README.DOC file is unformatted
- ASCII.
-
- Type Characteristics
- ────────────────────────────────────────────────────────────────────────────
- QuickHelp Uses dot commands and embedded
- formatting characters (the default
- formatting type expected by HELPMAKE);
- supports highlighting, color, and
- cross-references. Files in this format
- must be compressed before use.
-
- RTF Uses a subset of standard RTF; supports
- highlighting, color, and
- cross-references; supports some dot
- commands. Files in this format must be
- compressed before use.
-
- Minimally formatted ASCII Uses a help delimiter (>>) to define
- help contexts; does not support
- highlighting, color, or cross-references.
- Files in this format can be compressed,
- but compression is not required.
-
-
- 11.6.1 QuickHelp Format
-
- The QuickHelp format uses dot commands and embedded formatting flags to
- convey information to HELPMAKE.
-
-
- 11.6.1.1 QuickHelp Dot Commands
-
- QuickHelp provides a number of dot commands that identify topics and convey
- other topic-related information to the help system. If your help file
- contains dot commands other than .context or .comment, you must supply the
- /T option when encoding and decoding with HELPMAKE.
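-
- The rule above can be checked mechanically before you run HELPMAKE. The
- Python sketch below is a rough aid, not a HELPMAKE feature: the name
- needs_t_option is hypothetical, command names are matched exactly as
- written, and every line that starts with a period followed by a name is
- assumed to be a dot command.

```python
# Dot commands that do not require the /T option.
PLAIN_COMMANDS = {"context", "comment"}


def needs_t_option(lines):
    """Return True if any dot command other than .context/.comment appears."""
    for line in lines:
        if not line.startswith("."):
            continue              # ordinary topic text
        if line.startswith(".."):
            continue              # ".. string" is the short comment form
        parts = line[1:].split(None, 1)
        if parts and parts[0] not in PLAIN_COMMANDS:
            return True
    return False
```

- A source file that uses only .context and comment lines returns False; one
- that also contains a command such as .freeze returns True.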
-
- The most important dot command is the .context command. Every topic in a
- QuickHelp file begins with one or more .context commands. Each .context
- command defines a context string for the topic text. You can define more
- than one context for a single topic, as long as you do not place any topic
- text between them.
-
- Typical .context commands are shown below. The first defines a context for
- the #include C preprocessor directive. The second set illustrates multiple
- contexts for one block of topic text. In this case, the same topic text
- explains all of the string-to-number conversion routines in C.
-
- .context #include
- .
- . description of #include goes here
- .
- .context strtod
- .context strtol
- .context strtoul
- .
- . description of string-to-number functions goes here
- .
-
- The QuickHelp format includes several other dot commands. Table 11.3 lists
- the dot commands available in QuickHelp format.
-
- Table 11.3 QuickHelp Dot Commands
-
- Command Action
- ────────────────────────────────────────────────────────────────────────────
- .category string Lists the category in which the
- current topic appears and its
- position in the list of topics. The
- category name is used by the
- QuickHelp Categories command, which
- displays the topics list. Supported
- only by QuickHelp.
-
- .command Indicates that the topic text is not
- a displayable help topic. Use this
- command to hide hyperlink topics and
- other internal information.
-
- .comment string The string is a comment that appears
- .. string only in the help source file.
- Comments are not inserted in the
- help database, so they cannot be
- restored when you decompress a help
- file.
-
- .context string The string introduces a topic.
-
- .end Ends a paste section. See the .paste
- command below. Supported only by
- QuickHelp.
-
- .freeze numlines Locks the first numlines lines at
- the top of the screen. This can be
- used to preserve a bar of
- cross-reference buttons for a help
- topic and prevent it from being
- scrolled.
-
- .length topiclength Indicates the default window size,
- in topiclength lines, of the topic
- about to be displayed.
-
- .line number Tells HELPMAKE to reset the line
- number to begin at number for
- subsequent lines of the input file.
- Line numbers appear in HELPMAKE
- error messages. HELPMAKE does not
- put the .line command into the help
- database, so it is not restored
- during decompression. See .source.
-
- .list Indicates that the current topic
- contains a list of topics. QuickHelp
- displays a highlighted line; you can
- choose a topic by moving the
- highlighted line over the desired
- topic and pressing ENTER. Help
- searches for the first word of the
- line. Supported only by QuickHelp.
-
- .mark name «column» Defines a mark immediately preceding
- the following line of text. The
- marked line shows a script command
- where the display of a topic begins.
- The name identifies the mark. The
- column is an integer value
- specifying a column location within
- the marked line. Supported only by
- QuickHelp.
-
-
-
-
- .next context Tells the help system to look up the
- next topic using context instead of
- the topic that physically follows it
- in the file.
- You can use this command to skip
- large blocks of .command or .popup
- topics.
-
- .paste pastename Begins a paste section. The
- pastename appears in the QuickHelp
- Paste menu. Supported only by
- QuickHelp.
-
- .popup Tells the help system to display the
- current topic as a popup instead of
- a normal, scrollable topic.
- Supported only by QuickHelp.
-
- .previous context Tells the help system to look up the
- previous topic using context instead
- of the topic that physically
- precedes it in the file. You can use
- this command to skip large blocks of
- .command or .popup topics.
-
- .raw Turns off special processing of
- certain characters by the
- application.
-
- .ref topic «, topic» ... Tells the help system to display the
- topic in the Reference menu. You can
- list as many topics as needed;
- separate each additional topic with
- a comma. A .ref command is formatted
- without regard to the /W option.
- Supported only by QuickHelp.
-
- If no topic is specified, QuickHelp
- searches the line immediately
- following for a See: or See Also:
- reference; if present, the reference
- must be the first non-white-space
- characters on the line.
-
- .source filename Tells HELPMAKE that subsequent
- topics come from filename. By
- default, when an error occurs, the
- error message contains the name and
- line number of the input file. The
- .source command tells HELPMAKE to
- use filename in the error message
- instead of the name of the input
- file and to reset the line number to
- 1. This is useful when you
- concatenate several sources to form
- the input file. HELPMAKE does not
- put the .source command into the
- help database, so it is not restored
- during decompression. See .line.
-
- .topic text Defines text as the name or title to
- be displayed in place of the context
- string if the application help
- displays a title. This command is
- always the first line in the context
- unless you also use the .length or
- .freeze commands.
-
- ────────────────────────────────────────────────────────────────────────────
-
-
-
-
- 11.6.1.2 QuickHelp Formatting Flags
-
- The QuickHelp format provides a number of formatting flags that are used to
- highlight parts of the help database and to mark hyperlinks in the help
- text.
-
- Each formatting flag consists of a backslash ( \ ) followed by a character.
- Table 11.4 lists the formatting flags.
-
- Table 11.4 QuickHelp Formatting Flags
-
- Formatting Flag Action
- ────────────────────────────────────────────────────────────────────────────
- \a Anchors text for cross-references
-
- \b, \B Turns boldface on or off
-
- \i, \I Turns italics on or off
-
- \p, \P Turns off all attributes
-
- \u, \U Turns underlining on or off
-
- \v, \V Turns invisibility on or off
- (hides cross-references in text)
-
- \\ Inserts a single backslash in text
-
- ────────────────────────────────────────────────────────────────────────────
-
-
-
- On monochrome monitors, text labeled with the bold, italic, and underline
- attributes appears in various ways, depending on the application (for
- example, high intensity and reverse video are commonly displayed). On color
- monitors, these attributes are translated by the application into suitable
- colors, depending on the user's default color selections.
-
- The \b, \i, \u, and \v options are toggles, turning on and off their
- respective attributes. You can use several of these on the same text. Use
- the \p attribute to turn off all attributes. Use the \v attribute to hide
- cross-references and hyperlinks in the text.
-
- HELPMAKE truncates the lines in QuickHelp files to the width specified with
- the /W option. Only visible characters count toward the character-width
- limit. Lines that begin with an application-specific control character are
- truncated to 255 characters regardless of the width specification. See
- Section 11.3.1, "Options for Encoding," for more information on truncation
- and application-specific control characters.
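-
- Because only visible characters count toward the width limit, it can help
- to measure a source line before encoding it. The Python sketch below strips
- formatting flags and invisible text and compares the result with a width.
- It is a sketch, not HELPMAKE's own logic: the names visible_text and
- fits_width are hypothetical, and lines that begin with an
- application-specific control character are not treated specially.

```python
def visible_text(line):
    """Return the characters of a QuickHelp line that appear on the screen."""
    out = []
    invisible = False  # toggled by \v
    i = 0
    while i < len(line):
        ch = line[i]
        if ch == "\\" and i + 1 < len(line):
            flag = line[i + 1]
            i += 2
            if flag == "\\":
                if not invisible:
                    out.append("\\")  # \\ shows as one backslash
            elif flag in "vV":
                invisible = not invisible
            # Other flags change attributes and take no screen space.
            continue
        if not invisible:
            out.append(ch)
        i += 1
    return "".join(out)


def fits_width(line, width=76):
    """Check a line against the /W width (11 to 255, default 76)."""
    if not 11 <= width <= 255:
        raise ValueError("width must be in the range 11-255")
    return len(visible_text(line)) <= width
```

- A line for which fits_width returns False would lose its trailing visible
- characters when HELPMAKE truncates QuickHelp text to the /W width.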
-
- In the example below, the \b flag initiates boldface text for Returns:,
- and the \p flag changes the remaining text to plain text.
-
- \bReturns:\p a handle if successful, or -1 if not.
- errno: EACCES, EEXIST, EMFILE, ENOENT
-
- In the example below, the \a flag anchors text for the hyperlink Example.
- The \v flags define the cross-reference sample_prog and make the text
- between the \v flags invisible. Cross-references are described in the
- following section.
-
- \aExample \vsample_prog\v
-
-
- 11.6.1.3 QuickHelp Cross-References
-
- Help databases contain two types of cross-references, implicit and explicit.
- They are described in Section 11.1.1, "Contents of a Help File."
-
- Any word that appears as a global context is implicitly cross-referenced.
- For example, any time you request help in PWB on close, the help window
- displays information about that function. You do not code implicit
- cross-references into your help text files.
-
- Insert formatting flags to mark explicit cross-references.
-
- Explicit cross-references (hyperlinks) are words or phrases on the screen
- that point to a context. For example, almost every "See:" and "See also:"
- reference in online help has a hyperlink pointing to the appropriate
- context. You can view the cross-referenced material immediately by
- activating the hyperlink, without having to search the help system's menus
- for the topic. You must insert formatting flags in your help text files to
- mark explicit cross-references.
-
- If the hyperlink consists of a single word, you can use invisible text to
- flag it in the source file. The \v formatting flag creates invisible text,
- as follows:
-
- hyperlink\vcontext\v
-
- Put the first \v flag immediately following the word you want to be the
- hyperlink. Following the flag, insert the context that the hyperlink points
- to. The second \v flag marks the end of the context; that is, the end of the
- invisible text. HELPMAKE generates a cross-reference whose context is the
- invisible text and whose hyperlink is the word.
-
- If the hyperlink consists of a phrase, rather than a single word, you must
- use anchored text to create explicit cross-references. Use the \a and \v
- flags to create anchored text as follows:
-
- \ahyperlink-words\vcontext\v
-
- The \a flag marks an anchor for the cross-reference. The text that follows
- the \a flag is the hyperlink. The hyperlink must fit entirely on one line.
- The first \v flag marks both the end of the hyperlink and the beginning of
- the invisible text that contains the cross-reference context. The second \v
- flag marks the end of the invisible text.
-
- The C functions abs, cabs, and fabs in the following example are implicit
- cross-references because they have a global context in the help system.
-
- See also: abs, cabs, fabs
-
- The next example shows the encoding for an explicit cross-reference to an
- example program and a function template from the help database for the
- Microsoft C run-time library:
-
- See also: Example\vopen.ex\v, Template\vopen.tm\v, close
-
- Here, the hyperlinks are Example and Template, which reference the
- contexts open.ex and open.tm. The example also contains an implicit
- cross-reference to the close function.
-
- The final example shows the encoding for an explicit cross-reference to an
- entire family of functions:
-
- See also: \ais... functions\vis_functions\v, atoi
-
- The cross-reference uses anchored text to associate a phrase, rather than
- just a word, with a context. In this example, the hyperlink is the anchored
- phrase is... functions, and it cross-references the context is_functions.
- In addition, the example contains an implicit cross-reference to the
- C-language atoi routine.
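-
- The two encodings described above (word\vcontext\v and
- \aphrase\vcontext\v) can be recognized with a short Python sketch. The
- function name explicit_xrefs is hypothetical, and the sketch simplifies by
- stripping the \b, \i, \u, and \p flags before matching and by ignoring
- escaped backslashes; it is not a description of how HELPMAKE parses text.

```python
import re

# Attribute flags that may sit between a hyperlink word and its \v flag.
_ATTR_FLAGS = re.compile(r"\\[biup]")

# Either \aPHRASE\vCONTEXT\v (anchored) or WORD\vCONTEXT\v (single word).
_XREF = re.compile(
    r"\\a(?P<phrase>[^\\]+)\\v(?P<ctx1>[^\\]+)\\v"
    r"|(?P<word>[^\s\\]+)\\v(?P<ctx2>[^\\]+)\\v"
)


def explicit_xrefs(line):
    """Return (hyperlink, context) pairs found in one QuickHelp source line."""
    plain = _ATTR_FLAGS.sub("", line)   # simplification: drop attribute flags
    refs = []
    for match in _XREF.finditer(plain):
        if match.group("phrase") is not None:
            refs.append((match.group("phrase"), match.group("ctx1")))
        else:
            refs.append((match.group("word"), match.group("ctx2")))
    return refs


if __name__ == "__main__":
    print(explicit_xrefs(r"See also: \ais... functions\vis_functions\v, atoi"))
```

- Implicit cross-references such as atoi are not reported, because they are
- not encoded in the source text.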
-
-
- 11.6.1.4 QuickHelp Example
-
- The code below is an example in QuickHelp format that contains a single
- entry:
-
- .context open
- .length 13
- \bInclude:\p <fcntl.h>, <io.h>, <sys\\types.h>, <sys\\stat.h>
-
- \bPrototype:\p int open(char *path, int oflag[, int pmode]);
- oflag: O_APPEND O_BINARY O_CREAT O_EXCL O_RDONLY
- O_RDWR O_TEXT O_TRUNC O_WRONLY
- (can be joined by |)
- pmode: S_IWRITE S_IREAD S_IREAD | S_IWRITE
-
- \bReturns:\p a handle if successful, or -1 if not.
- errno: EACCES, EEXIST, EMFILE, ENOENT
-
- \bSee also:\p \uExample\p\vopen.ex\v, \uTemplate\p\vopen.tp\v,
- access, chmod, close, creat, dup, dup2, fopen, sopen,
- umask
-
- The .length command near the beginning of the example specifies the size of
- the initial window for the help text. Here, the initial window displays 13
- lines.
-
- The manifest constants (such as O_WRONLY and EEXIST), the C keywords (such
- as int and char), and the other functions (such as access and sopen) are
- implicit cross-references. The words Example and Template are explicit
- cross-references to the example open.ex and to the open template open.tp,
- respectively. Note the use of double backslashes in the include file names.
-
-
-
- 11.6.2 Rich Text Format
-
- Rich Text Format (RTF) is a Microsoft word-processing format supported by
- several word processors, including Microsoft Word 5.0 and Microsoft Word for
- Windows. RTF allows documents to be transferred between applications without
- loss of formatting. The HELPMAKE utility recognizes a subset of the full RTF
- syntax. If your file contains RTF codes that are not part of the subset,
- HELPMAKE discards them.
-
- To create an RTF-formatted file, enter the text and format it as you want it
- to appear: bold, underlined, hidden, italic, and so forth. (You can combine
- attributes.) You can also format paragraphs, selecting body and first-line
- indenting. The only items you need to insert into an RTF file manually are
- the help delimiter (>>) and the context string that start each entry.
-
- When you have entered and formatted the text, save it in RTF format. In
- Microsoft Word 5.0, for example, this means choosing Transfer Save, then
- highlighting RTF in the format: field.
-
- You do not see the RTF formatting codes when you load an RTF file into a
- compatible word processor; the word processor removes them and displays the
- text with the specified attribute(s). However, you can view these codes by
- loading an RTF file into a plain-text word processor.
-
- HELPMAKE recognizes the subset of RTF codes listed in Table 11.5.
-
- Table 11.5 RTF Formatting Codes
-
- RTF Code    Action
- ────────────────────────────────────────────────────────────────────────────
- \b          Boldface. The application decides how to display this; often
-             it is intensified text.
-
- \fin        Paragraph first-line indent, n columns.
-
- \i          Italic. The application decides how to display this; often it
-             is reverse video.
-
- \lin        Paragraph indent from left margin, n columns.
-
- \line       New line (not new paragraph).
-
- \par        End of paragraph.
-
- \pard       Default paragraph formatting.
-
- \plain      Default attributes. On most screens, this is nonblinking
-             normal intensity.
-
- \tab        Tab character.
-
- \ul         Underline. The application decides how to display this; some
-             adapters that do not support underlining display it as blue
-             text.
-
- \v          Hidden text. Hidden text is used for cross-reference
-             information and for some application-specific communications;
-             it is not displayed.
- ────────────────────────────────────────────────────────────────────────────
-
-
-
- When HELPMAKE compresses the file, it formats the text to the width given
- with the /W option, ignoring the paragraph formats.
-
- As with the other text formats, each entry in the database source consists
- of one or more context strings, followed by topic text. An RTF file can
- contain QuickHelp dot commands.
-
- The help delimiter (>>) at the beginning of any paragraph marks the
- beginning of a new help entry. The text that follows on the same line is
- defined as a context for the topic. If the next paragraph also begins with
- the help delimiter, it also defines a context string for the same topic
- text. You can define any number of contexts for a block of topic text. The
- topic text comprises all subsequent paragraphs up to the next paragraph that
- begins with the help delimiter.
-
- The example below is a help database containing a single entry using subset
- RTF text. Note that RTF uses curly braces ( { } ) for nesting. Thus, the
- entire file is enclosed in curly braces, as is each specially formatted text
- item.
-
- {\rtf1
- \pard >>open\par
- {\b Include:} <fcntl.h>, <io.h>, <sys\\types.h>, <sys\\stat.h>\par
- \par
- {\b Syntax:} int open( char * filename, int oflag[, int pmode ] );\par
- oflag: O_APPEND O_BINARY O_CREAT O_EXCL O_RDONLY\par
- O_RDWR O_TEXT O_TRUNC O_WRONLY\par
- (may be joined by |)\par
- pmode: S_IWRITE S_IREAD S_IREAD | S_IWRITE\par
- \par
- {\b Returns:} a handle if successful, or -1 if not.\par
- errno: EACCES, EEXIST, EMFILE, ENOENT\par
- \par
- {\b See also:} Examples{\v open.ex}, access, chmod, close, creat, dup,\par
- dup2, fopen, sopen, umask\par
- >>open.ex\par
- To build this help file, use the following command:\par
- \par
- HELPMAKE /S1 /E15 /OOPEN.HLP OPEN.RTF\par
- \par
-
- < Back >{\v !B}
- }
-
- RTF files normally contain additional information that is not visible to the
- user; HELPMAKE ignores this extra information.
-
-
- 11.6.3 Minimally Formatted ASCII Format
-
- A minimally formatted ASCII text file comprises a sequence of topics, each
- preceded by one or more unique context definitions. Each context definition
- must be on a separate line beginning with a help delimiter (>>). Subsequent
- lines up to the next context definition constitute the topic text.
-
- Minimally formatted ASCII files cannot contain highlighting.
-
- There are two ways to use a minimally formatted ASCII file. You can compress
- it with HELPMAKE, creating a help database, or an application can access the
- uncompressed file directly. Compressing minimally formatted ASCII files
- increases search speed. Uncompressed files are somewhat larger and slower to
- search. Minimally formatted ASCII files have a fixed width, and they cannot
- contain highlighting (or other nondefault attributes) or explicit
- cross-references.
-
- The following example, coded in minimally formatted ASCII, shows the same
- text as the QuickHelp example presented earlier in this section. The first
- line of the example defines open as a context string. The minimally
- formatted ASCII help file must begin with the help delimiter (>>), so that
- HELPMAKE or the application can verify that the file is indeed an ASCII help
- file.
-
- >>open
-
- Include: <fcntl.h>, <io.h>, <sys\types.h>, <sys\stat.h>
-
- Prototype: int open(char *path, int oflag[, int pmode]);
- oflag: O_APPEND O_BINARY O_CREAT O_EXCL O_RDONLY
- O_RDWR O_TEXT O_TRUNC O_WRONLY
- (can be joined by |)
- pmode: S_IWRITE S_IREAD S_IREAD | S_IWRITE
-
- Returns: a handle if successful, or -1 if not.
- errno: EACCES, EEXIST, EMFILE, ENOENT
-
- See also: access, chmod, close, creat, dup, dup2, fopen, sopen,
- umask
-
- When displayed, the help information appears exactly as it is typed into the
- file. Any formatting codes are treated as ASCII text.
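-
- The structure of a minimally formatted ASCII file (one or more context
- lines that begin with the help delimiter, followed by topic text) can be
- sketched in Python as follows. The function name parse_ascii_help and the
- context _open in the usage lines are hypothetical; this is a reading aid
- under those assumptions, not the algorithm HELPMAKE uses.

```python
def parse_ascii_help(text):
    """Split minimally formatted ASCII help text into topics.

    Returns a list of (contexts, topic_lines) tuples. Consecutive lines that
    begin with the help delimiter (>>) all name the same topic.
    """
    topics = []
    contexts, body = [], []
    for line in text.splitlines():
        if line.startswith(">>"):
            # A context line that follows real topic text starts a new entry.
            if any(item.strip() for item in body):
                topics.append((contexts, body))
                contexts = []
            body = []
            contexts.append(line[2:].strip())
        elif contexts:
            body.append(line)       # topic text is kept exactly as typed
    if contexts:
        topics.append((contexts, body))
    return topics


if __name__ == "__main__":
    source = ">>open\n>>_open\nReturns: a handle if successful, or -1 if not.\n"
    for names, lines in parse_ascii_help(source):
        print(names, lines)
```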
-
-
- 11.7 Related Topics in Online Help
-
- Information on the following related topics can be found in online help.
-
- Topic Access
- ────────────────────────────────────────────────────────────────────────────
- HELPMAKE Choose "HELPMAKE" from the "Microsoft Advisor Contents" screen
- QuickHelp Choose "QH" from the "Microsoft Advisor Contents" screen
-
-
-
-
-
-
- Chapter 12 Linking Object Files with LINK
- ────────────────────────────────────────────────────────────────────────────
-
- This chapter describes the Microsoft Segmented-Executable Linker (LINK),
- which combines compiled or assembled object files into an executable file.
- It explains LINK's input syntax and fields and tells how to use options to
- control LINK. It discusses overlays in DOS programs and concludes with
- background information about LINK.
-
-
- 12.1 Overview
-
- LINK combines 80x86 object files into either an executable file or a
- dynamic-link library (DLL). The object-file format is the Microsoft
- Relocatable Object-Module Format (OMF), based on the Intel 8086 OMF. LINK
- uses library files in Microsoft library format.
-
- LINK creates "relocatable" executable files and DLLs; that is, the operating
- system can load and execute these files in any unused section of memory.
- LINK can create DOS executable files with up to 1 megabyte of code and data
- (or up to 16 megabytes when using overlays), or OS/2 and Microsoft Windows
- programs with up to 16 megabytes.
-
- For more information on OMF, executable-file format, and the linking
- process, see the MS-DOS Encyclopedia.
-
- Use BIND to create an OS/2 program that also runs under DOS.
-
- The linker produces programs that run under DOS only or under OS/2 only, but
- not both. However, if an OS/2 program limits its OS/2 function calls to the
- Family API subset, you can use the Microsoft Bind Utility (BIND) to modify
- the OS/2 executable file so that it runs under both OS/2 and DOS. For more
- information, see online help.
-
- Use EXEHDR to examine the finished file.
-
- When the file (either executable or DLL) is created, you can examine the
- information that LINK puts in the file's header by using the Microsoft EXE
- File Header Utility (EXEHDR). For more information, see online help.
-
- Other programs can call LINK automatically.
-
- The Programmer's WorkBench (PWB) invokes LINK to create the final executable
- file or DLL. Therefore, if you develop your software with PWB, you might not
- need to read this chapter. However, the detailed explanations of LINK
- options might be helpful when you use the LINK Options dialog box in PWB.
- This information is also available in online help.
-
- The compiler or assembler driver supplied with your language (CL with C, FL
- with FORTRAN, ML with MASM) also invokes LINK. You can use most of the LINK
- options described in this chapter with these drivers. Online help has more
- information about the compilers and assembler: select help for the
- appropriate language from the Compiler box of the help Contents screen.
-
- ────────────────────────────────────────────────────────────────────────────
- NOTE
-
- Unless otherwise noted, all references to "library" in this chapter refer to
- a static library, either a standard library created by the Microsoft Library
- Manager (LIB) or an import library created by the Microsoft Import Library
- Manager (IMPLIB), and not a DLL.
- ────────────────────────────────────────────────────────────────────────────
-
-
- 12.2 LINK Output Files
-
- LINK is a bound application that runs under both DOS and OS/2 and can create
- executable files for DOS, OS/2, or Windows. You do not have to run LINK
- under OS/2 to create OS/2 applications, or under DOS to create DOS programs.
- The kind of file produced is determined by the way the source code is
- compiled and the information supplied to LINK, not the operating system LINK
- runs under.
-
- A program that runs under DOS is called an executable file or application. A
- program or DLL that runs under Windows or OS/2 is called a segmented
- executable file. LINK creates the appropriate file according to the
- following rules:
-
-
- ■ If a module-definition file or import library is not specified and the
- object files and libraries do not contain export definitions, LINK
- creates an application that runs under DOS.
-
- ■ If a module-definition file containing a LIBRARY statement is
- specified, LINK creates a DLL for Windows or OS/2.
-
- ■ If any other form of module-definition file is specified, or if any of
- the object files contains an exported definition, LINK creates an
- application to run under Windows or OS/2.
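-
- The three rules above can be summarized as a small decision function. The
- Python sketch below is one reading of those rules, with hypothetical
- parameter names; in particular, it assumes that specifying an import
- library has the same effect as finding an export definition.

```python
def link_output_kind(def_file=False, library_statement=False,
                     import_library=False, exports_found=False):
    """Return the kind of main output file the rules above would select.

    def_file          -- a module-definition file was specified
    library_statement -- that file contains a LIBRARY statement
    import_library    -- an import library was specified (assumed to count
                         like an export definition)
    exports_found     -- object files or libraries contain export definitions
    """
    if def_file and library_statement:
        return "DLL"
    if def_file or import_library or exports_found:
        return "segmented application"
    return "DOS application"


if __name__ == "__main__":
    print(link_output_kind())
    print(link_output_kind(def_file=True, library_statement=True))
```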
-
-
- LINK looks for the default run-time libraries named in the object files.
- Default libraries can be real or protected mode. (The mode is usually set
- when the language product is installed.) Protected-mode libraries contain
- export definitions. If LINK finds protected-mode default libraries, the
- output file will be a segmented executable file rather than a DOS file.
-
- The file OS2.LIB is an import library. Linking with OS2.LIB produces an OS/2
- application or DLL. When you use a Microsoft high-level language to compile
- for protected mode, the compiler automatically specifies OS2.LIB as a
- default library.
-
- LINK's output is either an executable file or a DLL. For simplicity, this
- chapter sometimes refers to this output as the "main file" or "main output."
-
-
- Map files list the segments and symbols in a program.
-
- LINK also creates a "map" file, which lists the segments in the executable
- file. The /MAP option adds public symbols to the map file, and the /LINE
- option adds line numbers.
-
- LINK produces other files when certain options are used.
-
- Other options tell LINK to create other kinds of output files. The /INCR
- option creates .ILK and .SYM files for incremental linking with ILINK. LINK
- produces a .COM file instead of an .EXE file when the /TINY option is
- specified. The combination of /CO and /TINY puts debugging information into
- a .DBG file. A Quick library results when the /Q option is specified. For
- more information on these and other options, see Section 12.5, "LINK
- Options."
-
-
- 12.3 LINK Syntax and Input
-
- The LINK command has the following syntax:
-
- LINK objfiles «, «exefile» «, «mapfile» «, «libraries» «, deffile» » » » «;»
-
- The LINK fields perform the following functions:
-
-
- ■ The objfiles field is a list of the object files that are to be linked
- into an executable file or DLL. It is the only required field.
-
- ■ The exefile field lets you change the name of the output file from its
- default.
-
- ■ The mapfile field gives the map file a name other than its default
- name.
-
- ■ The libraries field specifies additional (or replacement) libraries to
- search for unresolved references.
-
- ■ The deffile field gives the name of a description file needed to
- create Windows and OS/2 applications and DLLs.
-
-
- Fields are separated by commas. You can specify all the fields or leave one
- or more fields (including objfiles) blank; LINK will then prompt you for the
- missing input. (For an explanation of how to use LINK prompts, see Section
- 12.4, "Running LINK.") To leave a field blank, enter only the field's
- trailing comma.
-
- Options can be specified in any field. For descriptions of each of LINK's
- options, see Section 12.5, "LINK Options."
-
- The fields must be entered in the order shown, whether they contain input or
- are left blank. A semicolon (;) at the end of the LINK command line
- terminates the command and suppresses prompting for any missing fields. LINK
- then assumes the default values for the missing fields.
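-
- The field rules above (comma separators, fixed order, blank fields, and the
- terminating semicolon) can be illustrated with a short Python sketch. The
- function name parse_link_fields is hypothetical, and the sketch does not
- interpret options or response files; it only shows how the text after the
- LINK command name divides into fields.

```python
FIELDS = ("objfiles", "exefile", "mapfile", "libraries", "deffile")


def parse_link_fields(args):
    """Split the text that follows LINK into its five positional fields.

    Returns (fields, terminated). A value of None means the field was never
    reached; an empty string means the field was left blank with a comma.
    terminated is True when a semicolon ended the command line.
    """
    terminated = ";" in args
    if terminated:
        args = args.split(";", 1)[0]          # ignore anything after ';'
    parts = [part.strip() for part in args.split(",")]
    fields = {name: None for name in FIELDS}
    for name, value in zip(FIELDS, parts):
        fields[name] = value
    return fields, terminated


if __name__ == "__main__":
    fields, terminated = parse_link_fields("FUN+TEXT, , FUNLIST, XLIB.LIB;")
    print(fields, terminated)
```

- Fields that come back as None are the ones LINK would prompt for when the
- line is not terminated with a semicolon.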
-
- If a file is in, or is to be created in, another directory or device, you
- must supply its full pathname. Filenames are not case sensitive.
-
- The next five sections explain how to use each of the LINK fields.
-
-
- 12.3.1 The objfiles Field
-
- The objfiles field specifies one or more object files to be linked. At least
- one filename must be entered. If you do not supply an extension, LINK
- assumes a default .OBJ extension. If the filename has no extension, add a
- period (.) at the end of its name.
-
- If you name more than one object file, separate the names with a plus sign
- (+) or a space. To extend objfiles to the following line, type a plus sign
- (+) as the last character on the current line, press ENTER, and continue. Do
- not split a name across lines.
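-
- As an illustration of the naming rules above, the following Python sketch
- splits an objfiles entry on plus signs or spaces and applies the default
- extension. The function names are hypothetical, and treating a trailing
- period as "no extension" by dropping the period is an assumption about how
- the resulting name is written.

```python
import re


def split_names(field):
    """Split a field on plus signs and/or spaces, dropping empty items."""
    return [item for item in re.split(r"[+\s]+", field.strip()) if item]


def apply_default_extension(name, default_ext=".OBJ"):
    """Apply the default extension unless the name already settles it."""
    last = re.split(r"[\\:]", name)[-1]   # examine only the filename part
    if last.endswith("."):
        return name[:-1]                  # trailing period: no extension
    if "." in last:
        return name                       # explicit extension is kept
    return name + default_ext


if __name__ == "__main__":
    for item in split_names("FUN+TEXT TABLE.OBJ+DATA."):
        print(apply_default_extension(item))
```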
-
-
- 12.3.1.1 Load Libraries
-
- The objfiles field can also specify library files. A library specified this
- way becomes a "load library." You must specify the library's filename
- extension; otherwise, LINK assumes an .OBJ extension.
-
- LINK treats a load library like any other object file: it puts every object
- module from a load library in the executable file, regardless of whether a
- module satisfies an unresolved external reference. The effect is the same as
- if you had specified all the library's object-module names in the objfiles
- field.
-
- Specifying a load library can therefore create an executable file or DLL
- that is larger than it needs to be. (A library named in the libraries field
- adds only those modules required to resolve external references.) However,
- loading an entire library can be useful when
-
-
- ■ Repeatedly specifying the same group of object files
-
- ■ Placing a library in an overlay
-
- ■ Debugging, so you can call library routines that would not be included
- in the release version of the program
-
-
-
- 12.3.1.2 How LINK Searches for Object Files
-
- When searching for object (and load-library) files, LINK looks in the
- following locations in the order specified:
-
-
- 1. The directory specified for the file (if a path is included). If the
- file is not in that directory, the search terminates.
-
- 2. The current directory.
-
- 3. Any directories specified in the LIB environment variable.
-
-
- If LINK cannot find an object file, and a floppy drive is associated with
- that object file, LINK pauses and prompts you to insert a disk containing
- the object file.
-
- If you specify a library in the objfiles field, LINK treats it like any
- other object file. LINK therefore does not search for load libraries in
- directories named in the libraries field.
-
-
- 12.3.1.3 Overlays
-
- A special syntax for the objfiles field lets you create DOS programs that
- use overlay modules. For more information about overlays, see Section 12.7,
- "Using Overlays under DOS."
-
-
- 12.3.2 The exefile Field
-
- The exefile field is used to specify a name for the main output file. If you
- do not supply an extension, LINK assumes a default extension, either .EXE,
- .COM (when using the /TINY option), .DLL (when using a module-definition
- file containing a LIBRARY statement), or .QLB (when using the /Q option).
-
- If you do not specify an exefile, LINK gives the main output a default name.
- This name is the base name of the first file listed in the objfiles field,
- plus the extension appropriate for the type of executable file being
- created.
-
- LINK creates the main file in the current directory unless you specify an
- explicit path with the filename.
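-
- The default naming of the main output file can be sketched in Python. The
- function and parameter names below are hypothetical, and the order in which
- the conditions are tested is an assumption; this section does not say which
- extension applies if more than one condition holds.

```python
import re


def default_main_output(first_objfile, tiny=False, library_stmt=False,
                        quick_lib=False):
    """Return the default main-output name, created in the current directory.

    first_objfile -- first name listed in the objfiles field
    tiny          -- the /TINY option was given
    library_stmt  -- the module-definition file has a LIBRARY statement
    quick_lib     -- the /Q option was given
    """
    last = re.split(r"[\\:]", first_objfile)[-1]   # drop drive and directory
    base = last.split(".", 1)[0]                   # drop any extension
    if library_stmt:
        ext = ".DLL"
    elif quick_lib:
        ext = ".QLB"
    elif tiny:
        ext = ".COM"
    else:
        ext = ".EXE"
    return base + ext


if __name__ == "__main__":
    print(default_main_output("FUN"))
```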
-
-
- 12.3.3 The mapfile Field
-
- The mapfile field is used to specify a filename for the map file or to
- suppress creation of a map file. A map file lists the segments in the
- executable file or DLL.
-
- You can specify a path with the filename. The default extension is .MAP.
- Specify NUL to suppress the creation of a map file. The default for the
- mapfile field is one of the following:
-
-
- ■ If this field is left blank on the command line or in a response file,
- LINK creates a map file with the base name of the exefile (or the
- first object file if no exefile is specified) and the extension .MAP.
-
- ■ When using LINK prompts, LINK assumes either the default described
- above (if an empty mapfile field is specified) or NUL.MAP, which
- suppresses creation of a map file.
-
-
- To add line numbers to the map file, use the /LINE option. To add public
- symbols, use the /MAP option. Both /LINE and /MAP force a map file to be
- created unless NUL is explicitly specified.
-
-
- 12.3.4 The libraries Field
-
- You can specify one or more standard or import libraries (not DLLs) in the
- libraries field. If you name more than one library, separate the names with
- a plus sign (+) or a space. To extend libraries to the following line, type
- a plus sign (+) as the last character on the current line, press ENTER, and
- continue. Do not split a name across lines. If you specify the base name of
- a library without an extension, LINK assumes a default .LIB extension.
-
- If no library is specified, LINK searches only the default libraries named
- in the object files to resolve unresolved references. If one or more
- libraries are specified, LINK searches them in the order named before
- searching the default libraries.
-
- You can tell LINK to search additional directories for specified or default
- libraries by giving a drive name or path specification in the libraries
- field; end the specification with a backslash ( \ ). (If you don't include
- the backslash, LINK assumes the last element of the path is a library file.)
- LINK looks for files ending in .LIB in these directories.
-
- You can specify a total of 32 paths or libraries in the field. If you give
- more than 32 paths or libraries, LINK ignores the additional specifications
- without warning you.
-
- You might need to specify library names when you want to
-
-
- ■ Use a default library that has been renamed.
-
- ■ Specify a library other than the default named in the object file (for
- example, a library that handles floating-point arithmetic differently
- from the default library).
-
- ■ Search additional libraries.
-
- ■ Find a library not in the current directory and not in a directory
- specified by the LIB environment variable.
-
-
-
- 12.3.4.1 Overriding Default-Library Searches
-
- Most compilers insert the names of the required language libraries in the
- object files. LINK searches for these default libraries automatically; you
- do not need to specify them in the libraries field. The libraries must
- already exist with the name expected by LINK. Default-library names usually
- refer to combined libraries built and named during setup; consult your
- compiler documentation for more information about default libraries.
-
- To make LINK ignore the default libraries, use the /NOD option. This leaves
- unresolved references in the object files, so you must use the libraries
- field to specify the alternative libraries that LINK is to search.
-
-
- 12.3.4.2 Import Libraries
-
- You can specify import libraries created by the IMPLIB utility anywhere you
- can specify standard libraries. You can also use the LIB utility to combine
- import libraries and standard libraries. These combined libraries can then
- be specified in the libraries field.
-
-
- 12.3.4.3 How LINK Resolves References
-
- LINK searches static libraries to resolve external references. A static
- library is either a standard library created by the LIB utility or an import
- library created by the IMPLIB utility. The linker searches first in the
- libraries and library directories you specify (in the order you specify
- them), then in the default libraries. If a default library is explicitly
- specified, it is searched in the order it is given.
-
- LINK uses only those library modules needed to resolve external references,
- not the entire library. However, if you enter a library as a load library in
- the objfiles field, all the modules of a load library are added to the main
- output.
-
-
- 12.3.4.4 How LINK Searches for Library Files
-
- When searching for libraries, LINK looks in the following locations in this
- order:
-
-
- 1. The directory specified for the file (if a path is included). If the
- file is not in that directory, the search terminates. (The default
- libraries named in object files by Microsoft compilers do not include
- path specifications.)
-
- 2. The current directory.
-
- 3. Any directories in the libraries field.
-
- 4. Any directories specified in the LIB environment variable.
-
-
- If LINK cannot locate a library file, it prompts you to enter the location.
- The /BATCH option disables this prompting.
-
-
- Example
-
- The following is a specification in the libraries field:
-
- C:\TESTLIB\ NEWLIBV3 C:\MYLIBS\SPECIAL
-
- LINK searches NEWLIBV3.LIB first for unresolved references. Since no
- directory is specified for NEWLIBV3.LIB, LINK searches the following
- locations in this order:
-
-
- 1. The current directory
-
- 2. The C:\TESTLIB\ directory
-
- 3. The directories in the LIB environment variable
-
-
- If LINK still cannot find NEWLIBV3.LIB, it prompts you with the message
-
- Enter new file spec
-
- You can then enter either a path to the library or a full pathname for
- another library.
-
- If unresolved references remain after searching NEWLIBV3.LIB, LINK then
- searches the library C:\MYLIBS\SPECIAL.LIB. If LINK cannot find this
- library, it prompts you as described above for NEWLIBV3.LIB. If there are
- still unresolved references, LINK searches the default libraries.
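-
- The search order in this example can be reproduced with a short Python
- sketch. The function names are hypothetical, C:\LIB in the usage lines is an
- assumed value for the LIB environment variable, and the sketch only lists
- candidate paths; it does not prompt or open files.

```python
import ntpath
import re


def split_libraries_field(field):
    """Separate a libraries field into search directories and library names.

    Entries ending with a backslash are directories; all others are files.
    """
    dirs, libs = [], []
    for item in re.split(r"[+\s]+", field.strip()):
        if item:
            (dirs if item.endswith("\\") else libs).append(item)
    return dirs, libs


def search_candidates(lib, field_dirs, lib_env=""):
    """List the paths tried for one library, in search order."""
    if not ntpath.splitext(lib)[1]:
        lib += ".LIB"                         # default extension
    drive, rest = ntpath.splitdrive(lib)
    if drive or ntpath.dirname(rest):
        return [lib]                          # explicit path: search stops here
    env_dirs = [d for d in lib_env.split(";") if d]
    dirs = ["."] + list(field_dirs) + env_dirs
    return [ntpath.join(d, lib) for d in dirs]


if __name__ == "__main__":
    dirs, libs = split_libraries_field("C:\\TESTLIB\\ NEWLIBV3 C:\\MYLIBS\\SPECIAL")
    for lib in libs:
        print(search_candidates(lib, dirs, lib_env="C:\\LIB"))
```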
-
-
- 12.3.5 The deffile Field
-
- Use the deffile field to specify a module-definition file when you are
- linking a segmented executable file, which is an application or DLL for OS/2
- or Windows. A module-definition file is optional for an application but
- required for a DLL. If you specify a base name with no extension, LINK
- assumes a .DEF extension. If the filename has no extension, put a period (.)
- at the end of the name.
-
- By default, LINK assumes that no deffile needs to be specified. If you are
- linking for DOS, use a semicolon to terminate the command line before the
- deffile field (or accept the default NUL.DEF at the Definitions File
- prompt).
-
-
- 12.3.5.1 How LINK Searches for Module-Definition Files
-
- LINK searches for the module-definition file in the following order:
-
-
- 1. The directory specified for the file (if a path is included). If the
- file is not in that directory, the search terminates.
-
- 2. The current directory.
-
-
- For information on module-definition files, see Chapter 13.
-
-
- 12.3.6 Examples
-
- The following examples illustrate various uses of the LINK command line.
-
-
- Example 1
-
- LINK FUN+TEXT+TABLE+CARE, , FUNLIST, XLIB.LIB;
-
- This command line links the object files FUN.OBJ, TEXT.OBJ, TABLE.OBJ, and
- CARE.OBJ. By default, the executable file is named FUN.EXE, because the base
- name of the first object file is FUN, and no name is specified for the
- executable file. The map file is named FUNLIST.MAP. LINK searches for
- unresolved external references in the library XLIB.LIB before searching in
- the default libraries. LINK does not prompt for a .DEF file because a
- semicolon appears before the deffile field.
-
-
- Example 2
-
- LINK FUN, , ;
-
- This command produces a map file named FUN.MAP because a comma appears as a
- placeholder for the mapfile field on the command line.
-
-
- Example 3
-
- LINK FUN, ;
- LINK FUN;
-
- Neither of these commands produces a map file, because commas do not appear
- as placeholders for the mapfile field. The semicolon (;) terminates the
- command line and accepts all remaining defaults without prompting; the
- prompting default for the map file is not to create one.
-
-
- Example 4
-
- LINK MAIN+GETDATA+PRINTIT, , MAIN ;
-
- This command links the files MAIN.OBJ, GETDATA.OBJ, and PRINTIT.OBJ into a
- DOS executable file because no module-definition file is specified. The map
- file MAIN.MAP is created.
-
-
- Example 5
-
- LINK GETDATA+PRINTIT, , , , MODDEF
-
- This command links GETDATA.OBJ and PRINTIT.OBJ into a DLL if MODDEF.DEF
- contains a LIBRARY statement. Otherwise, it links them into a segmented
- executable file for OS/2 or Windows. LINK creates a map file named
- GETDATA.MAP.
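-
- The map-file behavior in Examples 1 through 5 can be checked with a Python
- sketch. The function name default_map_file is hypothetical, and the sketch
- ignores options such as /MAP and /LINE (which force a map file) as well as
- anything typed at LINK prompts; it covers only the command-line defaults.

```python
import re


def _base_name(name):
    """Return the filename part of a path without its extension."""
    last = re.split(r"[\\:]", name)[-1]
    return last.split(".", 1)[0]


def default_map_file(link_args):
    """Return the map-file name implied by a LINK command line, or None."""
    args = link_args.split(";", 1)[0]
    parts = [part.strip() for part in args.split(",")]
    if len(parts) < 3:
        return None                       # mapfile field never reached
    mapfile = parts[2]
    if not mapfile:
        # Blank field: use the exefile base name, else the first object file.
        first_obj = re.split(r"[+\s]+", parts[0])[0]
        mapfile = _base_name(parts[1] or first_obj)
        if not mapfile:
            return None
    if _base_name(mapfile).upper() == "NUL":
        return None                       # NUL suppresses the map file
    last = re.split(r"[\\:]", mapfile)[-1]
    return mapfile if "." in last else mapfile + ".MAP"


if __name__ == "__main__":
    for line in ("FUN, , ;", "FUN, ;", "GETDATA+PRINTIT, , , , MODDEF"):
        print(line, "->", default_map_file(line))
```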
-
-
- 12.4 Running LINK
-
- The simplest use of LINK is to combine one or more object files with a
- run-time library to create an executable file. You type LINK at the
- command-line prompt, followed by the names of the object files and a
- semicolon (;). LINK combines the object files with language libraries
- specified in the object files to create an executable file. By default, the
- executable file takes the name of the first object file in the list.
-
- To interrupt LINK and return to the operating-system prompt, press CTRL+C at
- any time.
-
- LINK expects you to supply at least one input field (the objfiles field),
- and as many as five. There are several ways to supply the input fields LINK
- expects:
-
-
- ■ Enter all the required input directly on the command line.
-
- ■ Omit one or more of the input fields and respond when LINK prompts for
- the missing fields.
-
- ■ Put the input in a response file and enter the response-file name in
- place of the expected input.
-
-
- These methods can be used in combination. The LINK command line was
- discussed in Section 12.3. The following sections explain the other two
- methods.
-
-
- 12.4.1 Specifying Input with LINK Prompts
-
- If any field is missing from the LINK command line and the line does not end
- with a semicolon, or if any of the supplied fields are invalid, LINK prompts
- you for the missing or incorrect information. LINK displays one prompt at a
- time and waits until you respond:
-
- Object Modules [.OBJ]:
- Run File [basename.EXE]:
- List File [NUL.MAP]:
- Libraries [.LIB]:
- Definitions File [NUL.DEF]:
-
- The LINK prompts correspond to the command-line fields described earlier in
- this chapter. If you want LINK to prompt you for every input field,
- including objfiles, type the command LINK by itself.
-
- Options can be entered anywhere in any field, before the semicolon if
- specified.
-
-
- 12.4.1.1 Defaults
-
- The default values for each field are shown in brackets. Press ENTER to
- accept the default, or type in the filename(s) you want. The basename is the
- base name of the first object file you specified. To select the default
- responses for all the remaining prompts and terminate prompting, type a
- semicolon (;) and press ENTER.
-
- If you specify a filename without giving an extension, LINK adds the
- appropriate default extension. To specify a filename that does not have an
- extension, type a period (.) after the name.
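-
- The extension rule above can be modeled in a few lines of Python. This is
- only an illustrative sketch, not LINK's own logic; the function name
- apply_default_extension and its parameters are assumptions made for the
- example.
-
```python
import ntpath


def apply_default_extension(name, default_ext):
    """Model how a default extension such as '.OBJ' is applied to `name`."""
    base = ntpath.basename(name)      # look only at the final path component
    if base.endswith("."):
        return name[:-1]              # trailing period: no extension wanted
    if "." in base:
        return name                   # an explicit extension is kept as typed
    return name + default_ext         # otherwise add the field's default


for typed in ("FUN", "FUN.", "FUN.O"):
    print(typed, "->", apply_default_extension(typed, ".OBJ"))
```
-
- Only the last path component is examined, so a period in a directory name
- does not count as an extension.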
-
- Use a space or plus sign (+) to separate multiple filenames in the objfiles
- and libraries fields. To extend a long objfiles or libraries response to a
- new line, type a plus sign (+) as the last character on the current line and
- press ENTER. You can continue entering your response when the same prompt
- appears on a new line. Do not split a filename or a pathname across lines.
-
-
- 12.4.2 Specifying Input in a Response File
-
- You can supply input to LINK in a response file. A response file is a text
- file containing the input LINK expects on the command line or in response to
- prompts. Response files can be used to hold frequently used options or
- responses, or to overcome the 128-character limit on the length of a DOS
- command line.
-
-
- 12.4.2.1 Usage
-
- Specify the name of the response file in place of the expected command-line
- input or in response to a prompt. Precede the name with an at sign (@), as
- in @responsefile. You must specify an extension if the response file has
- one; there is no default extension. You can specify a path with the
- filename.
-
- You can specify a response file in any field (either on the command line or
- when responding to prompts) to supply input for one or more consecutive
- fields or all remaining fields. Note that LINK assumes nothing about the
- contents of the response file; LINK simply reads the fields from the file
- and applies them, in order, to the fields for which it has no input. LINK
- ignores any fields in the response file or on the command line after the
- five expected fields are satisfied or a semicolon (;) appears.
-
-
- Example
-
- The following command invokes LINK and supplies all input in a response
- file, except the last input field:
-
- LINK @input.txt, mydefs
-
-
- 12.4.2.2 Contents of the Response File
-
- Each input field must appear on a separate line or be separated from other
- fields on the same line by a comma. You can extend a field to the following
- line by adding a plus sign (+) at the end of the current line. A blank field
- can be represented by either a blank line or a comma.
-
- Options can be entered anywhere in any field, before the semicolon if
- specified.
-
- If a response file does not specify all the fields, LINK prompts you for the
- rest. Use a semicolon (;) to suppress prompting and accept the default
- responses for all remaining fields.
-
-
- Example
-
- FUN TEXT TABLE+
- CARE
- /MAP
- FUNLIST
- GRAF.LIB ;
-
- If the response file above is named FUN.LNK, the command
-
- LINK @FUN.LNK
-
- causes LINK to
-
-
- ■ Link the four object files FUN.OBJ, TEXT.OBJ, TABLE.OBJ, and CARE.OBJ
- into an executable file named FUN.EXE.
-
- ■ Include public symbols and addresses in the map file.
-
- ■ Make the name of the map file FUNLIST.MAP.
-
- ■ Link any needed routines from the library file GRAF.LIB.
-
- ■ Assume no module-definition file.
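-
- The way the fields are read from FUN.LNK can be sketched in Python. This is
- a simplified model written for illustration, not LINK's parser: the
- function names are assumptions, options are not interpreted (they are only
- skipped when listing object files), and a comma at the end of a line is
- not given any special treatment.
-
```python
def split_fields(text):
    """Split response-file text into at most five LINK input fields."""
    body = text.split(";", 1)[0]      # a semicolon ends the input
    logical, carry = [], ""
    for line in body.splitlines():
        line = carry + line.strip()
        if line.endswith("+"):        # plus sign: field continues on next line
            carry = line
            continue
        logical.append(line)
        carry = ""
    if carry:
        logical.append(carry)
    fields = []
    for line in logical:
        # Commas separate fields that share a line.
        fields.extend(part.strip() for part in line.split(","))
    return fields[:5]                 # anything past five fields is ignored


def object_files(objfiles_field):
    """List the object files in a field, adding the default .OBJ extension."""
    names = objfiles_field.replace("+", " ").split()
    return [n if "." in n else n + ".OBJ"
            for n in names if not n.startswith("/")]


FUN_LNK = "FUN TEXT TABLE+\nCARE\n/MAP\nFUNLIST\nGRAF.LIB ;\n"
fields = split_fields(FUN_LNK)
print(fields)
print(object_files(fields[0]))
```
-
- In this model the second field holds only the /MAP option, so the
- executable file keeps its default name, and the semicolon leaves the
- fifth field (the module-definition file) unspecified.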
-
-
-
- 12.5 LINK Options
-
- This section explains how to use options to control LINK's behavior and
- modify LINK's output. It contains a description of each option following a
- brief introduction on how to specify options.
-
-
- 12.5.1 Specifying Options
-
- The following paragraphs discuss rules for using options.
-
-
- 12.5.1.1 Syntax
-
- All options begin with a slash ( / ). You can specify an option by using the
- shortest sequence of characters that uniquely identifies the option. The
- description for each option shows the minimum legal abbreviation with the
- optional part enclosed in double brackets. No gaps or transpositions of
- letters are allowed. For example,
-
- /B«ATCH»
-
- indicates that either /B or /BATCH can be used, as can /BA, /BAT, or /BATC.
- Option names are not case sensitive, so you can also specify /batch or
- /Batch. This chapter uses meaningful yet legal forms of the option names.
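-
- The abbreviation rule can be modeled in a few lines of Python. The sketch
- below is illustrative only and is not LINK's implementation; the function
- name matches_option and its parameters are assumptions made for this
- example. It accepts a typed option if it begins with the minimum
- abbreviation and is a leading part of the full option name, ignoring case.
-
```python
def matches_option(typed, minimum, full):
    """Return True if `typed` is a legal spelling of the option `full`.

    `minimum` is the shortest legal abbreviation (the part outside the
    double brackets); `full` is the complete option name.
    """
    typed, minimum, full = typed.upper(), minimum.upper(), full.upper()
    # No gaps or transpositions: typed must be a prefix of the full name
    # and must contain at least the minimum abbreviation.
    return full.startswith(typed) and typed.startswith(minimum)


for spelling in ("/B", "/bat", "/Batch", "/BTCH", "/BATCHES"):
    print(spelling, matches_option(spelling, "/B", "/BATCH"))
```
-
- The last two spellings are rejected: /BTCH leaves a gap, and /BATCHES runs
- past the end of the option name.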
-
-
- 12.5.1.2 Usage
-
- LINK options can appear on the command line, in response to a prompt, or as
- part of a field in a response file. They can also be specified in the LINK
- environment variable. (For more information, see Section 12.6, "Setting
- Options with the LINK Environment Variable.") Options can appear in any
- field before the last input, except as noted in the descriptions.
-
- If an option appears more than once (for example, on the command line and in
- the LINK variable), the effect is the same as if the option was given only
- once. If two options conflict, the most recently specified option takes
- effect. This means that a command-line option or one given in response to a
- prompt overrides one specified in the LINK environment variable. For
- example, the command-line option /SEG:512 cancels the effect of the
- environment-variable option /SEG:256.
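-
- The precedence rule can be illustrated with a small Python model. It is a
- sketch under stated assumptions, not a description of LINK's internals:
- the function name is invented for the example, each option is assumed to
- be spelled the same way wherever it appears, and opposing pairs such as
- /FARCALL and /NOFARCALL are not modeled.
-
```python
def effective_options(environment, command_line):
    """Resolve options given in the LINK variable and on the command line."""
    settings = {}
    # Environment options are applied first, so a later command-line
    # option with the same name replaces the earlier setting.
    for option in environment.split() + command_line.split():
        name, _, argument = option.partition(":")
        settings[name.upper()] = argument
    return settings


print(effective_options("/SEG:256 /MAP", "/SEG:512 /MAP"))
```
-
- Here /MAP is given twice with no additional effect, and /SEG:512 replaces
- /SEG:256.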
-
-
- 12.5.1.3 Numeric Arguments
-
- Some LINK options take numeric arguments. You can enter numbers either in
- decimal format or in standard C-language notation.
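-
- As an illustration, the Python sketch below converts an argument written
- in either notation. It assumes that "C-language notation" means C integer
- constants (a leading 0x for hexadecimal, a leading 0 for octal), which
- agrees with the examples given for /PADC and /PADD later in this section.
- The function name parse_number is an assumption made for the example.
-
```python
def parse_number(text):
    """Convert a numeric option argument to an integer."""
    text = text.strip()
    if text.lower().startswith("0x"):
        return int(text, 16)          # hexadecimal, as in 0x100
    if len(text) > 1 and text.startswith("0"):
        return int(text, 8)           # octal, as in 0400
    return int(text, 10)              # plain decimal


for argument in ("256", "0400", "0x100"):
    print(argument, "=", parse_number(argument))
```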
-
-
- 12.5.2 The /ALIGN Option
-
-
- Option
-
- /A«LIGNMENT»:size
-
- The /ALIGN option aligns segments in a segmented executable file at the
- boundaries specified by size. The size argument must be an integer power of
- two. For example,
-
- /ALIGN:16
-
- indicates an alignment boundary of 16 bytes. The default alignment is 512
- bytes.
-
- This option reduces the size of the disk file by reducing the size of gaps
- between segments. It has no effect on the size of the file when loaded in
- memory.
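-
- The effect on file size can be estimated with simple arithmetic. The
- Python sketch below is a rough model only: it assumes that each segment
- begins on a multiple of the alignment size and ignores the file header
- and everything else in the file. The segment sizes are hypothetical.
-
```python
def segment_bytes_on_disk(segment_sizes, alignment):
    """Total bytes used when every segment starts on an alignment boundary."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a power of two")
    total = 0
    for size in segment_sizes:
        # Round each segment up to the next multiple of the alignment.
        total += (size + alignment - 1) // alignment * alignment
    return total


sizes = [100, 700, 1030]                    # hypothetical sizes in bytes
print(segment_bytes_on_disk(sizes, 512))    # default alignment
print(segment_bytes_on_disk(sizes, 16))     # /ALIGN:16
```
-
- The smaller the boundary, the less padding follows each segment, which is
- why a small alignment value tends to shrink the file.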
-
-
- 12.5.3 The /BATCH Option
-
-
- Option
-
- /B«ATCH»
-
- The /BATCH option suppresses prompting for libraries or object files that
- LINK cannot find. By default, the linker prompts for a new pathname whenever
- it cannot find a library that it has been directed to use. It also prompts
- you if it cannot find an object file that it expects to find on a floppy
- disk. When /BATCH is used, the linker does not prompt; it generates an error
- or warning message instead (if appropriate). The /BATCH option also
- suppresses the LINK copyright message and echoed input from response files.
-
- Using this option can cause unresolved external references. It is intended
- primarily for users who use batch files or makefiles for linking many
- executable files with a single command and who wish to prevent linker
- operation from halting.
-
- ────────────────────────────────────────────────────────────────────────────
- NOTE
-
- This option does not suppress prompts for input fields. Use a semicolon (;)
- at the end of the LINK input to suppress input prompting.
- ────────────────────────────────────────────────────────────────────────────
-
-
- 12.5.4 The /CO Option
-
-
- Option
-
- /CO«DEVIEW»
-
- The /CO option adds line numbers and symbolic data to the executable file
- for use with the Microsoft CodeView debugger. The /CO option has no effect
- if the object files do not contain CodeView debugging information.
-
- You can run the resulting executable file outside CodeView; the debugging
- data in the file is ignored. However, it increases file size and slows
- execution slightly. You should link a separate release version without the
- /CO option after the program has been debugged.
-
- When /CO is used with the /TINY option, debug information is put in a
- separate file with the same base name as the .COM file and with the .DBG
- extension.
-
- The /CO option is not compatible with the /EXEPACK option for DOS executable
- files.
-
-
- 12.5.5 The /CPARM Option
-
-
- Option
-
- /CP«ARMAXALLOC»:number
-
- The /CPARM option sets the maximum number of 16-byte paragraphs needed by
- the program when it is loaded into memory. The operating system uses this
- value to allocate space for the program before loading it. This option is
- useful when you want to execute another program from within your program and
- you need to reserve memory for the program. The /CPARM option is valid only
- when linking DOS programs.
-
- LINK normally requests the operating system to set the maximum number of
- paragraphs to 65,535. Since this is more memory than DOS can supply, the
- operating system always denies the request and allocates the largest
- contiguous block of memory it can find. If the /CPARM option is used, the
- operating system allocates no more space than the option specified. Any
- memory in excess of that required for the program loaded is free for other
- programs.
-
- The number can be any integer value in the range 1 to 65,535. If number is
- less than the minimum number of paragraphs needed by the program, LINK
- ignores your request and sets the maximum value equal to whatever the
- minimum value happens to be. The minimum number of paragraphs needed by a
- program is never less than the number of paragraphs of code and data in the
- program. To free more memory for programs compiled in the medium and large
- models, link with /CPARM:1. This leaves no space for the near heap.
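-
- The paragraph arithmetic can be checked with a short Python sketch. It
- models only the rule stated above (a request below the program's minimum
- is replaced by the minimum); the function name and the sample minimum of
- 2,000 paragraphs are assumptions made for the example.
-
```python
PARAGRAPH = 16                        # bytes per paragraph


def max_allocation(requested, minimum):
    """Paragraphs recorded as the maximum when linking with /CPARM:requested."""
    if not 1 <= requested <= 65535:
        raise ValueError("number must be in the range 1 to 65,535")
    # A request below the program's minimum is ignored: the maximum is
    # set equal to the minimum instead.
    return max(requested, minimum)


print(65535 * PARAGRAPH)              # bytes in the default request
print(max_allocation(1, 2000))        # /CPARM:1 for a 2,000-paragraph program
```
-
- With /CPARM:1 the maximum collapses to the program's own minimum, so no
- memory beyond the program itself is reserved.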
-
- ────────────────────────────────────────────────────────────────────────────
- NOTE
-
- You can change the maximum allocation after linking by using the EXEHDR
- utility, which modifies the executable-file header.
- ────────────────────────────────────────────────────────────────────────────
-
-
- 12.5.6 The /DOSSEG Option
-
-
- Option
-
- /DO«SSEG»
-
- The /DOSSEG option forces segments to be ordered as follows:
-
-
- 1. All segments with a class name ending in CODE
-
- 2. All other segments outside DGROUP
-
- 3. DGROUP segments, in the following order:
-
- a. Any segments of class BEGDATA. (This class name is reserved for
- Microsoft use.)
-
- b. Any segments not of class BEGDATA, BSS, or STACK.
-
- c. Segments of class BSS.
-
- d. Segments of class STACK.
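-
- The ordering above can be expressed as a sort key. The Python sketch below
- is an illustration, not LINK's algorithm: each segment is represented by a
- hypothetical (name, class name, in-DGROUP flag) tuple, and the segment
- names in the sample list are examples only.
-
```python
def dosseg_rank(segment):
    """Sort key for a (name, class_name, in_dgroup) tuple."""
    _, class_name, in_dgroup = segment
    class_name = class_name.upper()
    if class_name.endswith("CODE"):
        return 0                      # 1. classes ending in CODE
    if not in_dgroup:
        return 1                      # 2. other segments outside DGROUP
    if class_name == "BEGDATA":
        return 2                      # 3a. class BEGDATA
    if class_name == "BSS":
        return 4                      # 3c. class BSS
    if class_name == "STACK":
        return 5                      # 3d. class STACK
    return 3                          # 3b. remaining DGROUP segments


segments = [
    ("STACK", "STACK", True),
    ("_BSS", "BSS", True),
    ("_DATA", "DATA", True),
    ("FAR_DATA", "FAR_DATA", False),
    ("_TEXT", "CODE", False),
]
# sorted() is stable, so segments of equal rank keep their original order.
for name, _, _ in sorted(segments, key=dosseg_rank):
    print(name)
```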
-
-
-
- In addition, the /DOSSEG option defines the following two labels:
-
- _edata = DGROUP : BSS
- _end = DGROUP : STACK
-
- The variables _edata and _end have special meanings for Microsoft
- compilers, so you should not define program variables with these names.
- Assembly-language programs can reference these variables but should not
- change them.
-
- The /DOSSEG option also inserts 16 null bytes at the beginning of the _TEXT
- segment (if this segment is defined). This behavior of the option is
- overridden by the /NONULLS option when both are used; use /NONULLS to
- override the DOSSEG comment record commonly found in standard Microsoft
- libraries.
-
- This option is principally for use with assembly-language programs. When you
- link high-level-language programs, a special object-module record in the
- Microsoft language libraries automatically enables the /DOSSEG option. This
- option is also enabled by assembly modules that use the MASM directive .DOSSEG.
-
-
-
- 12.5.7 The /DSALLOC Option
-
-
- Option
-
- /DS«ALLOCATE»
-
- The /DSALLOC option tells LINK to load all data starting at the high end of
- the data segment. At run time, the data segment (DS) register is set to the
- lowest data-segment address that contains program data.
-
- By default, LINK loads all data starting at the low end of the data segment.
- At run time, the DS register is set to the lowest possible address to allow
- the entire data segment to be used.
-
- The /DSALLOC option is most often used with the /HIGH option to take
- advantage of unused memory within the data segment. These options are valid
- only for assembly-language programs that create DOS .EXE files.
-
-
- 12.5.8 The /EXEPACK Option
-
-
- Option
-
- /E«XEPACK»
-
- The /EXEPACK option directs LINK to remove sequences of repeated bytes
- (usually null characters) and to optimize the load-time relocation table
- before creating the executable file. (The load-time relocation table is a
- table of references relative to the start of the program, each of which
- changes when the executable image is loaded into memory and an actual
- address for the entry point is assigned.)
-
- The /EXEPACK option does not always produce a significant saving in disk
- space and may sometimes actually increase file size. Programs that have a
- large number of load-time relocations (about 500 or more) and long streams
- of repeated characters are usually shorter if packed. LINK notifies you if
- the packed file is larger than the unpacked file. The time required to
- expand a packed file may cause it to load more slowly than a file linked
- without this option.
-
- You cannot debug packed files with CodeView, because the /EXEPACK option
- removes symbolic information. A LINK warning message notifies you of this.
-
- The /EXEPACK option is not compatible with the /INCR option or with Windows
- programs.
-
-
- 12.5.9 The /FARCALL Option
-
-
- Option
-
- /F«ARCALLTRANSLATION»
-
- The /FARCALL option directs the linker to optimize far calls to procedures
- that lie in the same segment as the caller. This can result in slightly
- faster code; the gain in speed is most apparent on 80286-based machines and
- later. The /PACKC option can be used with /FARCALL when linking for OS/2.
- /PACKC is not recommended when linking Windows applications with /FARCALL.
-
- The /FARCALL option is off by default. If an environment variable (such as
- LINK or FL) includes /FARCALL, you can use the /NOFARCALL option to override
- it.
-
- FARCALL optimizes by creating more efficient code.
-
- A program that has multiple code segments may make a far call to a procedure
- in the same segment. Since the segment address is the same (for both the
- code and the procedure it calls), only a near call is necessary. Far calls
- appear in the relocation table; a near call does not require a table entry.
- By converting far calls to near calls in the same segment, the /FARCALL
- option both reduces the size of the relocation table and increases execution
- speed, since only the offset needs to be loaded, not a new segment. The
- /FARCALL option has no effect on programs that make only near calls, since
- there are no far calls to convert.
-
- When /FARCALL is specified, the linker optimizes code by removing the
- instruction call FAR label and substituting the following sequence:
-
- nop
- push cs
- call NEAR label
-
- During execution, the called procedure still returns with a far-return
- instruction. However, because both the code segment and the near address are
- on the stack, the far return is executed correctly. The nop (no-op)
- instruction is added so that exactly five bytes replace the five-byte
- far-call instruction.
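-
- The byte-for-byte substitution can be illustrated with a Python sketch
- that patches a buffer standing in for a code segment. This is a model for
- explanation, not the linker's code. The far-call opcode 9Ah is the one
- named in this section; the encodings 90h (nop), 0Eh (push cs), and E8h
- (near call with a 16-bit displacement) are the standard 8086 encodings.
- The function name, the buffer, and the sample addresses are assumptions,
- and buffer positions are assumed to equal offsets within the segment.
-
```python
def translate_far_call(code, at, target_offset):
    """Replace the 5-byte far call at code[at] with nop / push cs / call NEAR."""
    if code[at] != 0x9A:
        raise ValueError("no far-call opcode at this offset")
    # A near call stores a 16-bit displacement measured from the end of
    # the call instruction, which is also the end of the 5-byte slot.
    displacement = (target_offset - (at + 5)) & 0xFFFF
    code[at:at + 5] = bytes([
        0x90,                         # nop
        0x0E,                         # push cs
        0xE8,                         # call NEAR label
        displacement & 0xFF,          # displacement, low byte
        displacement >> 8,            # displacement, high byte
    ])
    return code


segment = bytearray(0x20)
# Hypothetical far call to offset 1234h in segment 2000h, placed at 10h.
segment[0x10:0x15] = bytes([0x9A, 0x34, 0x12, 0x00, 0x20])
translate_far_call(segment, 0x10, 0x1234)
print(segment[0x10:0x15].hex())
```
-
- The buffer keeps its length, so the offsets of all following instructions
- are unchanged.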
-
- In rare cases, /FARCALL should be used with caution.
-
- There is a small risk with the /FARCALL option. If LINK sees the far-call
- opcode (9A hexadecimal) followed by a far pointer to the current segment,
- and that segment has a class name ending in CODE, it interprets that as a
- far call. This problem can occur when using _based (segname ("CODE")) in a
- C program. If a program linked with /FARCALL fails for no apparent reason,
- try using /NOFARCALL.
-
- Object modules produced by Microsoft high-level languages are safe from this
- problem because little immediate data is stored in code segments.
- Assembly-language programs are generally safe for use with the /FARCALL
- option if they do not involve advanced system-level code, such as might be
- found in operating systems or interrupt handlers.
-
-
- 12.5.10 The /HELP Option
-
-
- Option
-
- /HE«LP»
-
- The /HELP option calls the QuickHelp utility. If LINK cannot find the help
- file or QuickHelp, it displays a brief summary of LINK command-line syntax
- and options. Do not give a filename when using the /HELP option.
-
-
- 12.5.11 The /HIGH Option
-
-
- Option
-
- /HI«GH»
-
- At load time, the executable file can be placed either as low or as high in
- memory as possible. The /HIGH option causes DOS to place the executable file
- as high as possible in memory. Without the /HIGH option, DOS places the
- executable file as low as possible. This option is usually used with the
- /DSALLOC option. These options are valid only for assembly-language programs
- that create DOS .EXE files.
-
-
- 12.5.12 The /INCR Option
-
-
- Option
-
- /INC«REMENTAL»
-
- The /INCR option must be used to prepare for subsequent linking with ILINK.
- This option produces a .SYM file and an .ILK file, each containing
- additional information needed by ILINK.
-
- When /INCR is specified, LINK creates the main output file as a segmented
- executable file. If the main output is a DOS application, LINK adds a stub
- loader so that the program can run under DOS. The file is slightly larger
- than it would be without /INCR.
-
- The /PADC and /PADD options are often used with the /INCR option to increase
- buffer size and thereby increase the likelihood that incremental linking
- will be successful. The /TINY and /EXEPACK options are not compatible with
- /INCR.
-
- You should not use /INCR or ILINK for the release version of a product.
- ILINK is intended to speed linking during development and debugging. In rare
- cases, linking with /INCR causes warning L4001 to be generated. If this
- occurs, do not use this option or ILINK.
-
-
- 12.5.13 The /INFO Option
-
-
- Option
-
- /INF«ORMATION»
-
- The /INFO option writes information about the linking process to the
- standard output, including the phase of linking and the names of the object
- files being linked. This option is a useful way to determine the locations
- of the object files being linked, the number of segments, and the order in
- which they are linked.
-
-
- 12.5.14 The /LINE Option
-
-
- Option
-
- /LI«NENUMBERS»
-
- The /LINE option adds the line numbers and associated addresses from source
- files to the map file. The object file must contain line-number information
- for it to appear in the map file. If the object file has no line-number
- information, the /LINE option has no effect. (Use the /Zd or /Zi option with
- Microsoft compilers such as CL, FL, and ML to add line numbers to the object
- file.) If you also want to add public symbols to the map file, use the /MAP
- option.
-
- The /LINE option causes a map file to be created even if you did not
- explicitly tell the linker to do so. By default, the map file is given the
- same base name as the executable file with the extension .MAP. You can
- override the default name by specifying a new map filename in the mapfile
- field or in response to the List File prompt.
-
-
- 12.5.15 The /MAP Option
-
-
- Option
-
- /M«AP»
-
- The /MAP option adds to the map file all public (global) symbols defined in
- object files. When /MAP is specified, the map file contains a list of all
- the symbols sorted by name and a list of all the symbols sorted by address.
- If you do not use this option, the map file contains only a list of
- segments. If you also want to add line numbers to the map file, use the
- /LINE option.
-
- The /MAP option causes a map file to be created even if you did not
- explicitly tell the linker to do so. By default, the map file is given the
- same base name as the executable file with the extension .MAP. You can
- override the default name by specifying a new map filename in the mapfile
- field or in response to the List File prompt.
-
- Under some circumstances, adding symbols slows the linking process. If this
- is a problem, do not use /MAP.
-
-
- 12.5.16 The /NOD Option
-
-
- Option
-
- /NOD«EFAULTLIBRARYSEARCH»«:libraryname»
-
- The /NOD option tells LINK not to search default libraries named in object
- files. Specifying libraryname tells LINK to search all libraries named in
- the object files except libraryname. If you want LINK to ignore more than
- one library, specify /NOD once for each library. To tell LINK to ignore all
- default libraries, specify /NOD without a libraryname.
-
- High-level-language object files usually must be linked with a run-time
- library to produce an executable file. Therefore, if you use the /NOD
- option, you must also use the libraries field to specify an alternate
- library that resolves the external references in the object files.
-
-
- 12.5.17 The /NOE Option
-
-
- Option
-
- /NOE«XTDICTIONARY»
-
- The /NOE option prevents the linker from searching extended dictionaries,
- which are lists of symbol locations in libraries created with LIB. The
- linker consults extended dictionaries to speed up library searches.
-
- Using /NOE slows the linker. Use this option when you are redefining a
- symbol or function defined in a library and you get the error
-
- L2044 symbol multiply defined, use /NOE
-
-
- 12.5.18 The /NOFARCALL Option
-
-
- Option
-
- /NOF«ARCALLTRANSLATION»
-
- The /NOFARCALL option turns off far-call optimization (translation).
- Far-call optimization is off by default. However, if an environment variable
- (such as LINK or FL) includes the /FARCALL option, you can use /NOFARCALL to
- override /FARCALL.
-
-
- 12.5.19 The /NOGROUP Option
-
-
- Option
-
- /NOG«ROUPASSOCIATION»
-
- The /NOGROUP option ignores group associations when assigning addresses to
- data and code items. It is provided primarily for compatibility with
- previous versions of the linker (2.02 and earlier) and early versions of
- Microsoft compilers. This option is valid only for assembly-language
- programs that create DOS .EXE files.
-
-
- 12.5.20 The /NOI Option
-
-
- Option
-
- /NOI«GNORECASE»
-
- This option preserves case in identifiers. By default, LINK treats uppercase
- and lowercase letters as equivalent. Thus ABC, Abc, and abc are
- considered the same name. When you use the /NOI option, the linker
- distinguishes between uppercase and lowercase, and considers these
- identifiers to be three different names.
-
- In most high-level languages, identifiers are not case sensitive, so this
- option has no effect. However, case is significant in C. It's a good idea to
- use this option with C programs to catch misnamed identifiers.
-
-
- 12.5.21 The /NOLOGO Option
-
-
- Option
-
- /NOL«OGO»
-
- The /NOLOGO option suppresses the copyright message displayed when LINK
- starts. This option has no effect if not specified first on the command line
- or in the LINK environment variable.
-
-
- 12.5.22 The /NONULLS Option
-
-
- Option
-
- /NON«ULLSDOSSEG»
-
- The /NONULLS option arranges segments in the same order they are arranged by
- the /DOSSEG option. The only difference is that the /DOSSEG option inserts
- 16 null bytes at the beginning of the _TEXT segment (if it is defined), but
- /NONULLS does not insert the extra bytes.
-
- If both the /DOSSEG and /NONULLS options are given, the /NONULLS option
- takes precedence. You can therefore use /NONULLS to override the DOSSEG
- comment record found in run-time libraries. This option is for segmented
- executable files.
-
-
- 12.5.23 The /NOPACKC Option
-
-
- Option
-
- /NOP«ACKCODE»
-
- This option turns off code-segment packing. Code-segment packing is off by
- default. However, if an environment variable (such as LINK or FL)
- includes the /PACKC option to turn on code-segment packing, you can use
- /NOPACKC to override /PACKC.
-
-
- 12.5.24 The /OV Option
-
-
- Option
-
- /O«VERLAYINTERRUPT»:number
-
- This option sets an interrupt number for passing control to overlays. By
- default, the interrupt number used for passing control to overlays is 63 (3F
- hexadecimal). The /OV option allows you to select a different interrupt
- number. This option is valid only when linking DOS programs.
-
- The number can be any number from 0 to 255, specified in decimal format or
- in C-language notation. Numbers that conflict with DOS interrupts can be
- used; however, their use is not advised. You should use this option only
- when you want to use overlays with a program that already reserves interrupt
- 63 for some other purpose.
-
-
- 12.5.25 The /PACKC Option
-
-
- Option
-
- /PACKC«ODE»«:number»
-
- The /PACKC option turns on code-segment packing. The linker packs code
- segments by grouping neighboring code segments that have the same
- attributes. Segments in the same group are assigned the same segment
- address; offset addresses are adjusted accordingly. All items have the same
- physical address whether or not the /PACKC option is used. However, /PACKC
- changes the segment and offset addresses so that all items in a group share
- the same segment.
-
- The number specifies the maximum size of groups formed by /PACKC. The linker
- stops adding segments to a group when it cannot add another segment without
- exceeding number; then it starts a new group. The default segment size
- without /PACKC (or when /PACKC is specified without number) is 65,500 bytes
- (64K - 36 bytes).
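-
- The grouping rule can be sketched in Python as a simple greedy pass over
- the segment sizes. This is a simplified illustration rather than LINK's
- algorithm: it ignores segment attributes and any alignment padding
- between segments, and the function name and sample sizes are assumptions.
-
```python
def pack_groups(sizes, limit=65500):
    """Group neighboring segment sizes without letting a group exceed `limit`."""
    groups = []
    for size in sizes:
        if groups and sum(groups[-1]) + size <= limit:
            groups[-1].append(size)   # still fits: add to the current group
        else:
            groups.append([size])     # would exceed the limit: new group
    return groups


# Hypothetical code-segment sizes in bytes, in link order.
print(pack_groups([30000, 30000, 10000, 50000, 10000]))
```
-
- Only neighboring segments are combined; the pass never reorders segments
- to fill a group more tightly.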
-
- The /PACKC option produces slightly faster and more compact code. It affects
- only programs with multiple code segments. This option is off by default
- and, if specified in an environment variable, can be overridden with the
- /NOPACKC option.
-
- Code-segment packing provides more opportunities for far-call optimization
- (which is enabled with the /FARCALL option). The /FARCALL and /PACKC options
- together produce faster and more compact code. However, this combination is
- not recommended for Windows applications.
-
- Use caution when packing assembly-language programs.
-
- Object code created by Microsoft compilers can safely be linked with the
- /PACKC option. This option is unsafe only when used with assembly-language
- programs that make assumptions about the relative order of code segments.
- For example, the following assembly code attempts to calculate the distance
- between CSEG1 and CSEG2. This code produces incorrect results when used
- with /PACKC, because /PACKC causes the two segments to share the same
- segment address. Therefore, the procedure would always return zero.
-
- CSEG1 SEGMENT PUBLIC 'CODE'
- .
- .
- .
- CSEG1 ENDS
-
- CSEG2 SEGMENT PARA PUBLIC 'CODE'
- ASSUME cs:CSEG2
-
- ; Return the length of CSEG1 in AX
-
- codesize PROC NEAR
- mov ax, CSEG2 ; Load para address of CSEG2
- sub ax, CSEG1 ; Subtract para address of CSEG1
- mov cx, 4 ; Load count
- shl ax, cl ; Convert distance from paragraphs
- ; to bytes
- ret ; Return with the result in AX
- codesize ENDP
-
- CSEG2 ENDS
-
-
- 12.5.26 The /PACKD Option
-
-
- Option
-
- /PACKD«ATA»«:number»
-
- The /PACKD option turns on data-segment packing. The linker considers any
- segment definition with a class name that does not end in CODE as a data
- segment. Adjacent data-segment definitions are combined into the same
- physical segment. The linker stops adding segments to a group when it cannot
- add another segment without exceeding number bytes; then it starts a new
- group. The default segment size without /PACKD (or when /PACKD is specified
- without number) is 65,536 bytes (64K).
-
- The /PACKD option produces slightly faster and more compact code. It affects
- only programs with multiple data segments and is valid for OS/2 and Windows
- programs only. It might be necessary to use the /PACKD option to get around
- the limit of 255 physical data segments per executable file imposed by OS/2
- and Windows. Try using /PACKD if you get the following LINK error:
-
- L1073 file-segment limit exceeded
-
- This option may not be safe with other compilers that do not generate fixup
- records for all far data references.
-
-
- 12.5.27 The /PADC Option
-
-
- Option
-
- /PADC«ODE»«:padsize»
-
- The /PADC option adds filler bytes to the end of each code segment for use
- when later linking with ILINK. If you use /PADC, you must also specify the
- /INCR option.
-
- The padsize is optional; the default is 0 bytes. If incremental linking
- fails, you can specify a padsize in decimal format or C-language notation.
- For example, /PADC:256 adds an additional 256 bytes to each code segment.
- (You can also use 0400 or 0x100 to specify 256 bytes.)
-
- The linker recognizes code segments as segment definitions with class names
- that end in CODE. Microsoft high-level languages automatically use this
- declaration for code segments. Code padding is not usually necessary for
- programs with multiple code segments but is recommended for mixed-model
- programs, programs with one code segment, and assembly-language programs in
- which code segments are grouped.
-
-
- 12.5.28 The /PADD Option
-
-
- Option
-
- /PADD«ATA»«:padsize»
-
- The /PADD option adds filler bytes to the end of each data segment to permit
- subsequent linking with ILINK. If you use /PADD, you must also specify the
- /INCR option.
-
- The padsize is optional; the default is 16 bytes. The /INCR option itself
- adds 16 bytes. This default padding is usually sufficient for successful
- incremental linking. If incremental linking fails, you can specify a padsize
- in decimal format or C-language notation. (If you specify too large a
- padsize, you might exceed the 64K limitation on the size of the default data
- segment.) For example, /PADD:32 adds an additional 32 bytes to each data
- segment. (You can also use 040 or 0x20 to specify 32 bytes.)
-
-
- 12.5.29 The /PAUSE Option
-
-
- Option
-
- /PAU«SE»
-
- The /PAUSE option pauses the session before LINK writes the executable file
- or DLL to disk. This option is supplied for compatibility with machines that
- have two floppy drives but no hard disk. It allows you to swap floppy disks
- before LINK writes the executable file.
-
- If you specify the /PAUSE option, LINK displays the following message before
- it creates the main output:
-
- About to generate .EXE file
- Change diskette in drive letter and press <ENTER>
-
- The letter is the current drive. LINK resumes processing when you press
- ENTER.
-
- Do not remove a disk that contains either the map file or the temporary
- file. If LINK creates a temporary file on the disk you plan to remove,
- terminate the LINK session and rearrange your files so that the temporary
- file is on a disk that does not need to be removed. For more information on
- how LINK determines where to put the temporary file, see Section 12.9, "LINK
- Temporary Files."
-
-
- 12.5.30 The /PM Option
-
-
- Option
-
- /PM«TYPE»:type
-
- This option specifies the type of Windows or OS/2 application being
- generated. The /PM option is equivalent to including a type specification in
- the NAME statement in a module-definition file.
-
- The type field can take one of the following values:
-
- Value Description
- ────────────────────────────────────────────────────────────────────────────
- PM Presentation Manager (PM) or Windows
- application. The application uses the
- API provided by PM or Windows and must
- be executed in the PM or Windows
- environment. This is equivalent to NAME
- WINDOWAPI.
-
- VIO Character-mode application to run in a
- text window in the PM or Windows
- session. This is equivalent to NAME
- WINDOWCOMPAT.
-
- NOVIO The default. Character-mode application
- that must run full screen and cannot run
- in a text window in PM or in Windows.
- This is equivalent to NAME
- NOTWINDOWCOMPAT.
-
-
-
- 12.5.31 The /Q Option
-
-
- Option
-
- /Q«UICKLIBRARY»
-
- The /Q option directs the linker to produce a "Quick library" instead of an
- executable file. A Quick library is similar to a standard library in that
- both contain routines that can be called by a program. However, a standard
- library is linked with a program at link time; in contrast, a Quick library
- is linked with a program at run time.
-
- When /Q is specified, the exefile field refers to a Quick library instead of
- an application. The default extension for this field is then .QLB instead of
- .EXE.
-
- Quick libraries can be used only with programs created with Microsoft
- QuickBasic or early versions of Microsoft QuickC. These programs have the
- special code that loads a Quick library at run time.
-
-
- 12.5.32 The /SEG Option
-
-
- Option
-
- /SE«GMENTS»«:number»
-
- The /SEG option sets the maximum number of program segments. The default
- without /SEG or number is 128. You can specify number as any value from 1 to
- 16,384 in decimal format or C-language notation. However, the number of
- segment definitions is constrained by available memory.
-
- LINK must allocate some memory to keep track of information for each
- segment; the larger the number you specify, the less free memory LINK has to
- run in. A relatively low segment limit (such as the 128 default) reduces the
- chance LINK will run out of memory. For programs with fewer than 128
- segments, you can minimize LINK's memory requirements by setting number to
- reflect the actual number of segments in the program. If a program has more
- than 128 segments, however, you must set a higher value.
-
- If the number of segments allocated is too high for the amount of memory
- available while linking, LINK displays the error message
-
- L1054 requested segment limit too high
-
- When this happens, try linking again after setting /SEG to a smaller number.
-
-
-
- 12.5.33 The /STACK Option
-
-
- Option
-
- /ST«ACK»:number
-
- The /STACK option lets you change the stack size from its default value of
- 2,048 bytes. The number is any positive value in decimal or C-language
- notation, up to 64K.
-
- Programs that pass large arrays or structures by value, or that have deeply
- nested subroutine calls, may need additional stack space. In contrast, if
- your program uses the stack very little, you might be able to save space by
- decreasing the stack size. If a program fails with a stack-overflow message,
- try increasing the size of the stack.
-
- ────────────────────────────────────────────────────────────────────────────
- NOTE
-
- You can also use the EXEHDR utility to change the default stack size by
- modifying the executable-file header.
- ────────────────────────────────────────────────────────────────────────────
-
-
- 12.5.34 The /TINY Option
-
-
- Option
-
- /T«INY»
-
- The /TINY option produces a .COM file instead of an .EXE file. The default
- extension of the output file is .COM. When the /CO option is used with
- /TINY, debug information is put in a separate file with the same base name
- as the .COM file and with the .DBG extension.
-
- Not every program can be linked in the .COM format. The following
- restrictions apply:
-
-
- ■ The program must consist of only one physical segment. You can declare
- more than one segment in assembly-language programs; however, the
- segments must be in the same group.
-
- ■ The code must not use far references.
-
- ■ Segment addresses cannot be used as immediate data for instructions.
- For example, you cannot use the following instruction:
-
- mov ax, CODESEG
-
-
- ■ Windows and OS/2 programs cannot be converted to a .COM format.
-
-
-
- 12.5.35 The /W Option
-
-
- Option
-
- /W«ARNFIXUP»
-
- The /W option issues the L4000 warning when LINK uses a displacement from
- the beginning of a group in determining a fixup value. This option is
- provided because early versions of the Windows linker (LINK4) performed
- fixups without this displacement. This option is for linking segmented
- executable files.
-
-
- 12.5.36 The /? Option
-
-
- Option
-
- /?
-
- The /? option displays a brief summary of LINK command-line syntax and
- options.
-
-
- 12.6 Setting Options with the LINK Environment Variable
-
- You can use the LINK environment variable to set options that will be in
- effect each time you link. (Microsoft language drivers such as CL, FL, and ML
- also use the options in the LINK environment variable.)
-
-
- 12.6.1 Setting the LINK Environment Variable
-
- You set the LINK environment variable with the following operating-system
- command:
-
- SET LINK=options
-
- LINK expects to find options listed in the variable exactly as you would
- type them in fields on the command line, in response to a prompt, or in a
- response file. It does not accept input for other fields; filenames in the
- LINK variable cause an error.
-
-
- Example
-
- SET LINK=/NOI /SEG:256 /CO
- LINK TEST;
- LINK /NOD PROG;
-
- In the example above, the commands are specified at the system prompt. The
- file TEST.OBJ is linked using the options /NOI, /SEG:256, and /CO. The
- file PROG.OBJ is then linked with the option /NOD, in addition to /NOI,
- /SEG:256, and /CO.
-
-
- 12.6.2 Behavior of the LINK Environment Variable
-
- You can specify options on the LINK command line or in a response file in
- addition to those in the LINK environment variable. If an option appears
- both in an input field and in the LINK variable, the input-field option
- overrides any environment-variable option it conflicts with. For example,
- the command-line option /SEG:512 overrides the environment-variable option
- /SEG:256.
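-
- The override rule can be sketched in Python as a simple model. This is not
- LINK's own parsing: the function name effective_options is hypothetical, and
- the sketch assumes that two options conflict only when their names match
- exactly, so it does not resolve abbreviations such as /SE versus /SEG.
-
```python
def effective_options(env_value, input_field_options):
    """Merge LINK-variable options with input-field options; later entries win."""
    merged = {}
    # Environment options are read first, so input-field options override them.
    for option in env_value.split() + list(input_field_options):
        # Assumption: the text before ":" is the option name, matched exactly.
        name, _, value = option.upper().partition(":")
        merged[name] = value
    return [name + (":" + value if value else "") for name, value in merged.items()]


# SET LINK=/NOI /SEG:256 /CO, then LINK /SEG:512 /NOD ...
print(effective_options("/NOI /SEG:256 /CO", ["/SEG:512", "/NOD"]))
```
-
- In this model the /SEG:512 given in the input field replaces /SEG:256 from
- the environment variable, while /NOI and /CO remain in effect.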
-
-
- 12.6.3 Clearing the LINK Environment Variable
-
- You must reset the LINK environment variable to prevent LINK from using its
- options. To clear the LINK variable, use the operating-system command
-
- SET LINK=
-
- To see the current setting of the LINK variable, type SET at the
- operating-system prompt.
-
-
- 12.7 Using Overlays under DOS
-
- LINK can create DOS programs with "overlays." Overlays allow sections of a
- program to be loaded into memory only as needed. This permits running a
- program that would otherwise be too large to fit in available memory.
- Overlay programs execute more slowly, however, since the various program
- modules must be swapped into and out of memory.
-
- The CodeView debugger is compatible with overlaid modules. If you use
- CodeView to debug a program that has an overlay containing more than one
- code segment, you will see only the identifiers contained in the first
- segment of the overlay.
-
-
- 12.7.1 Restrictions on Overlays
-
- Not all programs can use overlays. You will probably need to reorganize the
- code to accommodate the limitations explained in this section. Even after
- reorganization, some programs might not be convertible to overlay form or
- might not show a significant reduction in the amount of memory needed to
- execute them.
-
- Consider the following restrictions before trying to overlay a program:
-
-
- ■ You can use overlays only in programs with multiple code segments,
- because separate segment names are needed for overlays. Only code is
- overlaid, not data. The data becomes part of the "root" section of the
- program that is always in memory.
-
- ■ Only 255 overlays can be specified. The program can define only 255
- logical segments (segments with different names). This limits the
- total size of an overlaid program to 16 megabytes.
-
- ■ Only one overlay (in addition to the root) can be in memory at any one
- time. You must structure your program accordingly.
-
- ■ Duplicate names for different overlays are not supported; each module
- can appear only once in a program.
-
- ■ You must use far call/return instructions to transfer control between
- overlaid files. You cannot overlay files containing near routines if
- other overlays call those routines.
-
- ■ You cannot jump out of or into overlaid files using the longjmp
- C-library function. You can, however, use long jumps within an
- overlaid file.
-
- ■ You cannot use a function pointer to call a routine out of or into
- overlaid files. You can, however, use a function pointer to call a
- routine within an overlaid file.
-
- ■ You cannot use the same public name in different overlays.
-
- ■ The code required to manage overlays adds about 2K to 3K to the size
- of the root module.
-
-
- ────────────────────────────────────────────────────────────────────────────
- WARNING
-
- Never rename an executable program file containing overlays if it is to run
- under DOS 2.x and earlier. LINK records the .EXE filename in the program
- file. If you rename the file, the overlay manager may not be able to locate
- the proper file. You can rename an .EXE file that will run under DOS 3.x and
- later.
- ────────────────────────────────────────────────────────────────────────────
-
-
- 12.7.2 Specifying Overlays
-
- Specify overlays by enclosing object-file (and possibly load-library) names
- in parentheses in the objfiles field. Each group of object files bracketed
- by parentheses represents one overlay. Overlays cannot be nested.
-
- The remaining modules (those not in parentheses), and any drawn from the
- run-time libraries, constitute the resident (or root) part of your program.
- The entry point to the program (for example, main() in a C program, or
- PROGRAM in a FORTRAN program) must be in the root.
-
-
- Example
-
- The following list of files contains three overlays:
-
- a + (b+c) + (d+e) + f + (g)
-
- In this example, the groups (b+c), (d+e), and (g) are overlays. The
- remaining files a and f and any modules from libraries in the libraries
- field remain memory-resident throughout the execution of the program.
-
- It is important to remember that whichever object file first defines a
- segment gets all contributions to that segment. In the example above, if
- D.OBJ and F.OBJ both define the same segment, the contribution from F.OBJ to
- that segment goes into the (d+e) overlay rather than into the root.
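-
- As a rough illustration of how the parentheses divide the objfiles field,
- the following Python sketch separates root modules from overlay groups. The
- function name split_overlays is hypothetical, and the sketch assumes simple
- file names separated by plus signs or spaces, with no nested parentheses.
-
```python
import re


def split_overlays(objfiles):
    """Split an objfiles field into root modules and overlay groups."""
    root, overlays = [], []
    # Each match is either a parenthesized group or a single file name.
    for group, name in re.findall(r"\(([^()]*)\)|([^\s+()]+)", objfiles):
        if name:
            root.append(name)
        else:
            overlays.append([part for part in re.split(r"[\s+]+", group) if part])
    return root, overlays


root, overlays = split_overlays("a + (b+c) + (d+e) + f + (g)")
print(root)      # modules that stay resident in the root
print(overlays)  # one list of modules per overlay
```
-
- The sketch models only the grouping of file names. It does not model the
- rule that the first object file to define a segment receives all later
- contributions to that segment.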
-
-
- 12.7.3 How Overlays Work
-
- Programs that use overlays require the overlay-manager code to handle module
- swapping. This code is included as part of the standard libraries for
- Microsoft high-level languages. If you specify overlays during linking, the
- code for the overlay manager is automatically linked with the rest of your
- program.
-
- LINK produces only one .EXE file. The overlay manager searches for this file
- whenever another overlay needs to be loaded. It first searches in the
- current directory. If the file is not there, the manager then searches the
- directories in the PATH environment variable. If the overlay manager still
- cannot find the file, it prompts for the pathname.
-
-
- Example
-
- Assume that an executable program called PAYROLL.EXE uses overlays and does
- not exist in either the current directory or the directories specified by
- PATH. If you run PAYROLL.EXE by entering a complete path specification, the
- overlay manager displays the following message when it attempts to load an
- overlay file:
-
- Cannot find PAYROLL.EXE
- Please enter new program spec:
-
- You can then enter the drive or directory, or both, where PAYROLL.EXE is
- located. For example, if the file is located in directory \EMPLOYEE\DATA\ on
- drive B, enter B:\EMPLOYEE\DATA\; if the current drive is B, you can enter
- just \EMPLOYEE\DATA\.
-
- If you later remove the disk in drive B and the overlay manager needs the
- overlay again, it does not find PAYROLL.EXE and displays the following
- message:
-
- Please insert diskette containing B:\EMPLOYEE\DATA\PAYROLL.EXE
- in drive B: and strike any key when ready.
-
- After the overlay file has been read from the disk, the overlay manager
- displays the following message:
-
- Please restore the original diskette.
- Strike any key when ready.
-
-
- 12.7.4 Overlay Interrupts
-
- LINK replaces far calls to routines in overlays with interrupts (followed by
- the module identifier and offset). By default, the interrupt number is 63
- (3F hexadecimal). You can use the /OV option to change the interrupt number.
-
-
-
- 12.8 Linker Operation under DOS
-
- LINK performs the following steps to produce a DOS executable file:
-
-
- 1. Reads the object modules submitted
-
- 2. Searches the given libraries, if necessary, to resolve external
- references
-
- 3. Assigns addresses to segments
-
- 4. Assigns addresses to public symbols
-
- 5. Reads code and data in the segments
-
- 6. Reads all relocation references in object modules
-
- 7. Performs fixups
-
- 8. Outputs an executable file (executable image and relocation
- information)
-
-
- Steps 5, 6, and 7 are performed iteratively; that is, LINK repeats these
- steps as many times as required before it progresses to step 8.
-
- The "executable image" contains the code and data that constitute the
- executable file. The "relocation information" is a list of references
- relative to the start of the program, each of which changes when the
- executable image is loaded into memory and an actual address for the entry
- point is assigned.
-
- The following sections explain the process LINK uses to concatenate segments
- and resolve references to items in memory.
-
-
- 12.8.1 Segment Alignment
-
- LINK uses each segment's alignment type to set the starting address for the
- segment. The alignment types are BYTE, WORD, DWORD, PARA, and PAGE. These
- correspond to starting addresses at byte, word, doubleword, paragraph, and
- page boundaries, representing addresses that are multiples of 1, 2, 4, 16,
- and 256, respectively. The default alignment is PARA.
-
- When LINK encounters a segment, it checks the alignment type before copying
- the segment to the executable file. If the alignment is WORD, DWORD, PARA,
- or PAGE, LINK checks the executable image to see if the last byte copied
- ends at an appropriate boundary. If not, LINK pads the image with extra null
- bytes.
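-
- The padding calculation can be worked through in a few lines of Python.
- This is an arithmetic sketch only, not LINK's implementation; the names
- ALIGNMENT and padding_needed are hypothetical.
-
```python
# Boundary size in bytes for each alignment type described above.
ALIGNMENT = {"BYTE": 1, "WORD": 2, "DWORD": 4, "PARA": 16, "PAGE": 256}


def padding_needed(image_length, align_type="PARA"):
    """Return how many null bytes bring image_length up to the next boundary."""
    boundary = ALIGNMENT[align_type]
    return (-image_length) % boundary


# An image that currently ends at offset 1236h needs 10 bytes of padding
# before a PARA-aligned segment and none before a WORD-aligned segment.
print(padding_needed(0x1236, "PARA"))
print(padding_needed(0x1236, "WORD"))
```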
-
-
- 12.8.2 Frame Number
-
- LINK computes a starting address for each segment in a program. The starting
- address is based on a segment's alignment and the sizes of the segments
- already copied to the executable file. The address consists of an offset and
- a "canonical frame number." The canonical frame number specifies the address
- of the first paragraph in memory containing one or more bytes of the
- segment. (A paragraph is 16 bytes of memory; therefore, to compute a
- physical location in memory, multiply the frame number by 16 and add the
- offset.) The offset is the number of bytes from the start of the paragraph
- to the first byte in the segment. For BYTE, WORD, and DWORD alignments, the
- offset may be nonzero. The offset is always zero for PARA and PAGE
- alignments. (An offset of zero means that the physical location is an exact
- multiple of 16.)
-
- The frame number of a segment can be obtained from the map file created by
- LINK. The first four digits of the start address give the frame number in
- hexadecimal. For example, a start address of 0C0A6 gives a frame number of
- 0C0A.
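-
- The relationship between a start address, its frame number, and its offset
- can be checked with a short Python sketch. The function names are
- hypothetical; the arithmetic follows the 16-byte paragraph size given above.
-
```python
def split_start_address(start):
    """Split a start address into (canonical frame number, offset in paragraph)."""
    return start >> 4, start & 0xF


def physical_address(frame, offset):
    """A paragraph is 16 bytes, so the frame number is multiplied by 16."""
    return frame * 16 + offset


frame, offset = split_start_address(0x0C0A6)
print(f"frame {frame:04X}, offset {offset:X}")
print(f"physical {physical_address(frame, offset):05X}")
```
-
- For the start address 0C0A6, the sketch yields frame number 0C0A and an
- offset of 6, which shows that the segment is not paragraph aligned.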
-
-
- 12.8.3 Segment Order
-
- LINK copies segments to the executable file in the same order that it
- encounters them in the object files. This order is maintained throughout the
- program unless LINK encounters two or more segments having the same class
- name. Segments having identical class names belong to the same class type
- and are copied as a contiguous block to the executable file.
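-
- The default ordering rule can be modeled in Python as follows. This is a
- sketch under stated assumptions: the function name link_order and the
- segment names are hypothetical, every segment is assumed to have a class
- name, and the effect of /DOSSEG is not modeled.
-
```python
def link_order(segments):
    """Order (segment, class) pairs: first-seen order, each class contiguous."""
    by_class = {}
    for name, class_name in segments:
        # A dict keeps insertion order, so classes stay in first-seen order.
        by_class.setdefault(class_name, []).append(name)
    return [name for names in by_class.values() for name in names]


encountered = [
    ("_TEXT", "CODE"),
    ("_DATA", "DATA"),
    ("MORE_TEXT", "CODE"),
    ("STACK", "STACK"),
]
print(link_order(encountered))
```
-
- Here MORE_TEXT moves up to follow _TEXT because both segments belong to
- class CODE; the remaining segments keep the order in which they were
- encountered.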
-
- The /DOSSEG option might change the way in which segments are ordered.
-
-
- 12.8.4 Combined Segments
-
- LINK uses combine types to determine whether two or more segments sharing
- the same segment name should be combined into one large segment. The valid
- combine types are PUBLIC, STACK, COMMON, and PRIVATE.
-
- If a segment has combine type PUBLIC, LINK automatically combines it with
- any other segments having the same name and belonging to the same class.
- When LINK combines segments, it ensures that the segments are contiguous and
- that all addresses in the segments can be accessed using an offset from the
- same frame address. The result is the same as if the segment were defined as
- a whole in one source file.
-
- LINK preserves each individual segment's alignment type. This means that
- even though the segments belong to a single large segment, the code and data
- in the segments do not lose their original alignment. If the combined
- segments exceed 64K, LINK displays an error message.
-
- If a segment has combine type STACK, LINK carries out the same combine
- operation as for PUBLIC segments. The only exception is that STACK segments
- cause LINK to copy an initial stack-pointer value to the executable file.
- This stack-pointer value is the offset to the end of the first stack segment
- (or combined stack segment) encountered.
-
- If a segment has combine type COMMON, LINK automatically combines it with
- any other segments having the same name and belonging to the same class.
- When LINK combines COMMON segments, however, it places the start of each
- segment at the same address, creating a series of overlapping segments. The
- result is a single segment no larger than the largest segment combined.
-
- A segment has combine type PRIVATE only if no explicit combine type is
- defined for it in the source file. LINK does not combine private segments.
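-
- The difference between concatenating and overlapping can be seen in the
- size of the resulting segment. The following Python sketch is a model only:
- the function name combined_size is hypothetical, and it assumes that the
- combined segment itself starts on a boundary that satisfies every piece.
-
```python
def combined_size(sizes, combine_type, boundaries=None):
    """Size in bytes of the segment built from same-named pieces."""
    if combine_type == "COMMON":
        # Every piece starts at the same address, so the pieces overlap.
        return max(sizes)
    if combine_type in ("PUBLIC", "STACK"):
        boundaries = boundaries or [1] * len(sizes)
        total = 0
        for size, boundary in zip(sizes, boundaries):
            total += (-total) % boundary  # pad so the piece keeps its alignment
            total += size
        if total > 0x10000:
            raise OverflowError("combined segment exceeds 64K")
        return total
    raise ValueError("PRIVATE segments are not combined")


# Three pieces of 100, 30, and 50 bytes with PARA, WORD, and PARA alignment.
print(combined_size([100, 30, 50], "PUBLIC", [16, 2, 16]))
print(combined_size([100, 30, 50], "COMMON"))
```
-
- The PUBLIC result includes 14 bytes of padding before the third piece; the
- COMMON result is simply the size of the largest piece.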
-
-
- 12.8.5 Groups
-
- Groups allow segments to be addressed relative to the same frame address.
- When LINK encounters a group, it adjusts all memory references to items in
- the group so that they are relative to the same frame address.
-
- Segments in a group do not have to be contiguous, belong to the same class,
- or have the same combine type. The only requirement is that all segments in
- the group fit within 64K.
-
- Groups do not affect the order in which the segments are loaded. Unless you
- use class names and enter object files in the right order, there is no
- guarantee the segments will be contiguous. In fact, LINK may place segments
- that do not belong to the group in the same 64K of memory. LINK does not
- explicitly check that all segments in a group fit within 64K of memory;
- however, LINK is likely to encounter a fixup-overflow error if this
- requirement is not met.
-
-
- 12.8.6 Fixups
-
- Once the starting address of each segment in a program is known and all
- segment combinations and groups have been established, LINK can "fix up" any
- unresolved references to labels and variables. To fix up unresolved
- references, LINK computes an appropriate offset and segment address and
- replaces the temporary values generated by the assembler with the new
- values.
-
- LINK carries out fixups for the types of references shown in Table 12.1.
-
- The size of the value to be computed depends on the type of reference. If
- LINK discovers an error in the anticipated size of a reference, it displays
- a fixup-overflow message. This can happen, for example, if a program attempts
- to use a 16-bit offset to reach an instruction that is more than 64K away.
- It can also occur if all segments in a group do not fit within a single 64K
- block of memory.
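-
- The size limits behind a fixup overflow can be expressed as simple range
- checks. The Python sketch below uses hypothetical function names and is not
- LINK's implementation. It relies on the ranges of a signed eight-bit value
- (-128 through 127) and an unsigned 16-bit value (0 through 65,535).
-
```python
def fits_short(displacement):
    """A short fixup is stored as a signed eight-bit number."""
    return -128 <= displacement <= 127


def fits_near(offset):
    """A near fixup is stored as a 16-bit offset within a single frame."""
    return 0 <= offset < 0x10000


print(fits_short(127), fits_short(128))
print(fits_near(0xFFFF), fits_near(0x10000))
```
-
- A reference that fails the relevant check is one for which LINK would be
- expected to report a fixup overflow.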
-
- Table 12.1 LINK Fixups
-
- Type              Location of Reference    LINK Action
- ────────────────────────────────────────────────────────────────────────────
- Short             In JMP instructions      Computes a signed, eight-bit
-                   that attempt to pass     number for the reference and
-                   control to labeled       displays an error message if
-                   instructions in the      the target instruction belongs
-                   same segment or group.   to a different segment or group
-                   The target instruction   (has a different frame address),
-                   must be no more than     or if the target is more than
-                   128 bytes from the       128 bytes away in either
-                   point of reference.      direction.
-
- Near              In instructions that     Computes a 16-bit offset for
- self-relative     access data relative to  the reference and displays an
-                   the same segment or      error if the data are not in
-                   group.                   the same segment or group.
-
- Near              In instructions that     Computes a 16-bit offset for
- segment-relative  attempt to access data   the reference and displays an
-                   in a specified segment   error message if the offset of
-                   or group, or relative    the target within the specified
-                   to a specified segment   frame is greater than 64K or
-                   register.                less than 0, or if the
-                                            beginning of the canonical
-                                            frame of the target is not
-                                            addressable.
-
- Long              In CALL instructions     Computes a 16-bit frame address
-                   that attempt to access   and 16-bit offset for this
-                   an instruction in        reference, and displays an
-                   another segment or       error message if the computed
-                   group.                   offset is greater than 64K or
-                                            less than 0, or if the
-                                            beginning of the canonical
-                                            frame of the target is not
-                                            addressable.
-
- ────────────────────────────────────────────────────────────────────────────
-
-
-
- 12.9 LINK Temporary Files
-
- LINK uses available memory during the linking session. If LINK runs out of
- memory, it creates a temporary disk file to hold intermediate data. LINK
- deletes this file when it finishes.
-
- When the linker creates a temporary disk file, you see the message
-
- Temporary file tempfile has been created.
- Do not change diskette in drive, letter.
-
- In the message displayed above, tempfile is the name of the temporary file
- and letter is the drive containing the temporary file. (The second line
- appears only for a floppy drive.)
-
- After this message appears, do not remove the disk from the drive specified
- by letter until the link session ends. If the disk is removed, the operation
- of LINK is unpredictable, and you might see the following message:
-
- Unexpected end-of-file on scratch file
-
- If this happens, run LINK again.
-
-
- Location of the Temporary File
-
- If the TMP environment variable defines a temporary directory, LINK creates
- temporary files there. If the TMP environment variable is undefined or the
- temporary directory doesn't exist, LINK creates temporary files in the
- current directory.
-
-
- Name of the Temporary File
-
- When running under OS/2 or DOS version 3.0 or later, LINK asks the operating
- system to create a temporary file with a unique name in the temporary-file
- directory.
-
- Under DOS versions earlier than 3.0, LINK creates a temporary file named
- VM.TMP. Do not use this name for your files. LINK generates an error message
- if it encounters an existing file with this name.
-
-
- 12.10 LINK Exit Codes
-
- LINK returns an exit code (also called return code or error code) that you
- can use to control the operation of batch files or makefiles.
-
- Code Meaning
- ────────────────────────────────────────────────────────────────────────────
- 0 No error.
-
- 2 Program error. Commands or files given
- as input to the linker produced the
- error.
-
- 4 System error. The linker
-
- ■ Ran out of space on output files
-
- ■ Was unable to reopen the temporary file
-
- ■ Experienced an internal error
-
- ■ Was interrupted by the user
-
-
-
-
- 12.11 Related Topics in Online Help
-
- In addition to information covered in this chapter, information on the
- following topics can be found in online help.
-
- Topic Access
- ────────────────────────────────────────────────────────────────────────────
- Syntax and procedural information on Choose these topics from the
- LINK, BIND, and LIB "Microsoft Advisor Contents" screen
-
- Syntax and procedural information on Choose "Miscellaneous" from the list
- EXEHDR of utilities on the "Microsoft
- Advisor Contents" screen
-
-
-
-
-
-
-
- Chapter 13 Module-Definition Files
- ────────────────────────────────────────────────────────────────────────────
-
- This chapter describes the contents of a module-definition file. It begins
- with a brief overview of the purpose of module-definition files. The rest of
- the chapter discusses each statement in a module-definition file and
- describes syntax rules, argument fields, attributes, and keywords for each
- statement.
-
-
- 13.1 Overview
-
- A module-definition file is a text file that describes the name, attributes,
- exports, imports, system requirements, and other characteristics of an
- application or dynamic-link library (DLL) for OS/2 or Microsoft Windows.
- This file is required for DLLs and is optional (but desirable) for OS/2 and
- Windows applications.
-
- You use module-definition files in two situations:
-
-
- ■ You can specify a module-definition file in LINK's deffile field. The
- module-definition file gives LINK the information it needs to
- determine how to set up the application or DLL it creates.
-
- ■ You can provide LINK with the needed information when creating an
- application by using the Microsoft Import Library Manager utility
- (IMPLIB) to create an import library from a module-definition file (or
- from the DLL created by a module-definition file). You then specify
- the import library in LINK's libraries field.
-
-
- For more information about IMPLIB, see online help.
-
-
- 13.2 Module Statements
-
- A module-definition file contains one or more "module statements." Each
- module statement defines an attribute of the executable file, such as its
- name, the attributes of program segments, and the number and names of
- exported and imported functions and data. Table 13.1 summarizes the purpose
- of the module statements and shows the order in which they are discussed in
- this chapter.
-
- Table 13.1 Module Statements
-
- Statement Purpose
- ────────────────────────────────────────────────────────────────────────────
- NAME Names the application (no library created)
- LIBRARY Names the DLL (no application created)
- DESCRIPTION Embeds text in the application or DLL
- STUB Adds a DOS executable file to the beginning of the file
- EXETYPE Identifies the target operating system
- PROTMODE Specifies a protected-mode application or DLL
- REALMODE Supported for compatibility
- STACKSIZE Sets stack size in bytes
- HEAPSIZE Sets local heap size in bytes
- CODE Sets default attributes for all code segments
- DATA Sets default attributes for all data segments
- SEGMENTS Sets attributes for specific segments
- OLD Preserves ordinals from a previous DLL
- EXPORTS Defines exported functions
- IMPORTS Defines imported functions
- ────────────────────────────────────────────────────────────────────────────
-
-
-
- 13.2.1 Syntax Rules
-
- The syntax rules in this section apply to all statements in a
- module-definition file. Other rules specific to each statement are described
- in the sections that follow.
-
-
- ■ Statement and attribute keywords are not case sensitive. A statement
- keyword can be preceded by spaces and tabs.
-
- ■ A NAME or LIBRARY statement, if used, must precede all other
- statements.
-
- ■ Most statements appear at most once in a file and accept one
- specification of parameters and attributes. The specification follows
- the statement keyword on the same or subsequent line(s). If repeated
- with a different specification later in the file, the later statement
- overrides the earlier one.
-
- ■ The SEGMENTS, EXPORTS, and IMPORTS statements can appear more than
- once in the file and take multiple specifications, each on its own
- line. The statement keyword must appear once before the first
- specification and can be repeated before each additional
- specification.
-
- ■ Comments in the file are designated by a semicolon (;) at the
- beginning of each comment line. A comment cannot share a line with
- part or all of a statement but can appear between lines of a multiline
- statement.
-
- ■ Numeric arguments can be specified in decimal or in C-language
- notation.
-
- ■ Name arguments cannot match a reserved word.
-
-
-
- Example
-
- The sample module-definition file below gives a description for a DLL. This
- sample file includes one comment and five statements.
-
- ; Sample module-definition file
-
- LIBRARY
-
- DESCRIPTION 'Sample dynamic-link library'
-
- CODE PRELOAD
-
- STACKSIZE 1024
-
- EXPORTS
- Init @1
- Begin @2
- Finish @3
- Load @4
- Print @5
-
-
- 13.2.2 Reserved Words
-
- The following words are reserved by the linker for use in module-definition
- files. These names cannot be used as arguments in module-definition
- statements.
-
- (This figure may be found in the printed book.)
-
- * DOS4 and HUGE are obsolete but are still reserved by the linker.
-
- In addition to the words listed above, the following words are reserved for
- use by future or other versions of the linker and should be avoided.
-
- (This figure may be found in the printed book.)
-
-
- 13.3 The NAME Statement
-
- The NAME statement identifies the executable file as an application (rather
- than a DLL). It can also specify the name and application type. The NAME or
- LIBRARY statement must precede all other statements. If NAME is specified,
- the LIBRARY statement cannot be used. If neither is used, the default is
- NAME and LINK creates an application.
-
-
- Syntax
-
- NAME «appname» «apptype» «NEWFILES»
-
-
- Remarks
-
- The fields can appear in any order.
-
- If appname is specified, it becomes the name of the application as it is
- known by OS/2 or Windows. This name can be any valid filename. If appname
- contains a space, begins with a nonalphabetic character, or is a reserved
- word, surround appname with double quotation marks. The name cannot exceed
- 255 characters (not including surrounding quotation marks). If appname is
- not specified, the base name of the executable file becomes the name of the
- application.
-
- If apptype is specified, it defines the type of application. This
- information is kept in the executable-file header. The apptype field can
- take one of the following values:
-
- Value Description
- ────────────────────────────────────────────────────────────────────────────
- WINDOWAPI Presentation Manager (PM) or Windows
- application. The application uses the
- API provided by PM or Windows and must
- be executed in the PM or Windows
- environment. This is equivalent to the
- LINK option /PM:PM.
-
- WINDOWCOMPAT Character-mode application to run in a
- text window in the PM or Windows session.
- This is equivalent to the LINK option
- /PM:VIO.
-
- NOTWINDOWCOMPAT The default. Character-mode application
- that must run full screen and cannot run
- in a text window in PM or Windows. This
- is equivalent to the LINK option
- /PM:NOVIO.
-
-
- Specify NEWFILES to tell the operating system that the application supports
- long filenames and extended file attributes (available under OS/2 version
- 1.2 and later). The synonym LONGNAMES is supported for compatibility.
-
-
- Example
-
- The example below assigns the name calendar to an application that can run
- in a text window in PM or Windows:
-
- NAME calendar WINDOWCOMPAT
-
-
- 13.4 The LIBRARY Statement
-
- The LIBRARY statement identifies the executable file as a DLL. It can also
- specify the name of the library and the type of library-module
- initialization required. The NAME or LIBRARY statement must precede all
- other statements. If LIBRARY is specified, the NAME statement cannot be
- used. If neither is used, the default is NAME.
-
-
- Syntax
-
- LIBRARY «libraryname» «initialization» «PRIVATELIB»
-
-
- Remarks
-
- The fields can appear in any order.
-
- If libraryname is specified, it becomes the name of the library as it is
- known by OS/2 or Windows. This name can be any valid filename. If
- libraryname contains a space, begins with a nonalphabetic character, or is a
- reserved word, surround the name with double quotation marks. The name
- cannot exceed 255 characters. If libraryname is not given, the base name of
- the DLL file becomes the name of the library.
-
- If initialization is specified, it determines the type of initialization
- required. The initialization field can take one of the following values:
-
- Value Description
- ────────────────────────────────────────────────────────────────────────────
- INITGLOBAL The default. The library-initialization
- routine is called only when the library
- is initially loaded into memory.
-
- INITINSTANCE The library-initialization routine is
- called each time a new process gains
- access to the DLL. This keyword applies
- only to OS/2.
-
-
- If PRIVATELIB is specified, it tells Windows that only one application may
- use the DLL.
-
-
- Example
-
- The following example assigns the name calendar to the DLL being defined
- and specifies that library initialization is performed each time a new
- process gains access to calendar:
-
- LIBRARY calendar INITINSTANCE
-
-
- 13.5 The DESCRIPTION Statement
-
- The DESCRIPTION statement inserts specified text into the application or
- DLL. This statement is useful for embedding source-control or copyright
- information into a file.
-
-
- Syntax
-
- DESCRIPTION 'text'
-
-
- Remarks
-
- The text is a string of up to 255 characters enclosed in single or double
- quotation marks (' or "). To include a literal quotation mark in the text,
- either specify two consecutive quotation marks of the same type or enclose
- the text with the other type of quotation mark. If a DESCRIPTION statement
- is not specified, the default text is the name of the main output file as
- specified in LINK's exefile field. You can view this string by using the
- Microsoft EXE File Header Utility (EXEHDR).
-
- The DESCRIPTION statement is different from a comment. A comment is a line
- that begins with a semicolon (;). Comments are not placed in the application
- or library.
-
-
- Example
-
- The following example inserts the text Tester's Version, Test "A",
- including a literal single quotation mark and a pair of literal double
- quotation marks, into the application or DLL being defined:
-
- DESCRIPTION "Tester's Version, Test ""A"""
-
-
- 13.6 The STUB Statement
-
- The STUB statement adds a DOS executable file to the beginning of an OS/2 or
- Windows application or DLL. The stub is invoked whenever the file is
- executed under DOS. Usually, the stub displays a message and terminates
- execution. By default, LINK adds a standard stub for this purpose. Use the
- STUB statement when creating a dual-mode program.
-
-
- Syntax
-
- STUB {'filename' | NONE}
-
-
- Remarks
-
- The filename specifies the DOS executable file to be added. LINK searches
- for filename first in the current directory and then in directories
- specified with the PATH environment variable. The filename must be
- surrounded by single or double quotation marks (' or ").
-
- The alternate specification NONE prevents LINK from adding a default stub.
- This saves space in the application or DLL, but the resulting file will hang
- the system if loaded in DOS.
-
-
- Example
-
- The following example inserts the DOS executable file STOPIT.EXE at the
- beginning of the application or DLL:
-
- STUB 'STOPIT.EXE'
-
- The file STOPIT.EXE is executed when you attempt to run the application or
- DLL under DOS.
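-
- As a further sketch, the following statement uses the alternate NONE
- specification described in the Remarks above to suppress the default stub.
- It is suitable only when the file is never loaded under DOS:
-
- STUB NONE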
-
-
- 13.7 The EXETYPE Statement
-
- The EXETYPE statement specifies under which operating system the application
- or DLL is to run. This statement is optional and provides an additional
- degree of protection against the program being run under an incorrect
- operating system.
-
-
- Syntax
-
- EXETYPE «OS2 | WINDOWS «version» | UNKNOWN»
-
-
- Remarks
-
- The EXETYPE keyword is followed by a descriptor of the operating system:
- OS2 (for OS/2 applications and DLLs), WINDOWS (for Windows applications
- and DLLs), or UNKNOWN (for other applications). If the descriptor or the
- entire EXETYPE statement is omitted, the default is OS2.
-
- EXETYPE sets bits in the header which identify the operating system.
- Operating-system loaders can check these bits.
-
-
- Windows Programming
-
- The WINDOWS descriptor takes an optional version number. Windows reads this
- number to determine the minimum version of Windows needed to load the
- application or DLL. For example, if 3.0 is specified, the resulting
- application or DLL can run under Windows versions 3.0 and higher. If
- version is not specified, the default is 3.0. The syntax for version is
-
- number«.«number»»
-
- where each number is a decimal integer.
-
- In Windows programming, use the EXETYPE statement with a PROTMODE statement
- to specify an application or DLL that runs only under protected-mode
- Windows.
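-
-
- Example
-
- As a brief illustration of the syntax above, the first statement below
- marks a file for OS/2 and the second marks it for Windows. Because the
- second statement gives no version, it assumes the default of 3.0. The two
- statements are alternatives and are not meant to appear in the same file:
-
- EXETYPE OS2
-
- EXETYPE WINDOWS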
-
-
- 13.8 The PROTMODE Statement
-
- The PROTMODE statement specifies that the application or DLL runs only under
- OS/2 or under Windows 3.0 standard mode and 386 enhanced mode. PROTMODE lets
- LINK optimize to reduce both the size of the file on disk and its loading
- time. However, an OS/2 program created with PROTMODE cannot be bound using
- BIND. Use PROTMODE in combination with an EXETYPE WINDOWS statement to
- define an application or DLL that runs only under protected-mode Windows.
-
-
- Syntax
-
- PROTMODE
-
-
- Example
-
- The following statement combination defines an application that runs only
- under protected-mode (standard or 386 enhanced) Windows version 3.0:
-
- EXETYPE WINDOWS 3.0
- PROTMODE
-
-
-
-
-
- 13.9 The REALMODE Statement
-
- The REALMODE statement specifies that the application runs only in real
- mode. This statement is supported for compatibility with existing
- module-definition files. Use EXETYPE instead.
-
-
- Syntax
-
- REALMODE
-
-
- 13.10 The STACKSIZE Statement
-
- The STACKSIZE statement specifies the size of the stack in bytes. It
- performs the same function as LINK's /STACK option. If both are specified,
- the STACKSIZE statement overrides the /STACK option.
-
-
- Syntax
-
- STACKSIZE number
-
-
- Remarks
-
- The number must be a positive integer, in decimal or C-language notation, up
- to 64K.
-
-
- Example
-
- The following example allocates 4,096 bytes of stack space:
-
- STACKSIZE 4096
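-
- The Remarks above also allow C-language notation. Assuming that this
- includes the C hexadecimal prefix 0x, the following statement is a sketch
- of an equivalent way to request the same 4,096 bytes:
-
- STACKSIZE 0x1000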
-
-
- 13.11 The HEAPSIZE Statement
-
- The HEAPSIZE statement defines the size of the application or DLL's local
- heap in bytes. This value affects the size of the default data segment
- (DGROUP). The default without HEAPSIZE is no local heap.
-
-
- Syntax
-
- HEAPSIZE {bytes | MAXVAL}
-
-
- Remarks
-
- The bytes field accepts a positive integer in decimal or C-language
- notation. The limit is MAXVAL; if bytes exceeds MAXVAL, the excess is not
- allocated.
-
- MAXVAL is a keyword that sets the heap size to 64K minus the size of DGROUP.
- This is useful in bound applications when you want to force a 64K
- requirement for DGROUP for the program in DOS. The bound program fails to
- load if 64K of memory is not available.
-
-
- Example
-
- The following example sets the local heap to 4,000 bytes:
-
- HEAPSIZE 4000
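-
- As a sketch of the alternative form, the following statement uses the
- MAXVAL keyword described above in place of a byte count, so that the heap
- size is 64K minus the size of DGROUP:
-
- HEAPSIZE MAXVAL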
-
-
- 13.12 The CODE Statement
-
- The CODE statement defines the default attributes for all code segments
- within the application or DLL. The SEGMENTS statement can override this
- default for one or more specific segments.
-
-
- Syntax
-
- CODE «attribute...»
-
-
- Remarks
-
- This statement accepts several optional attribute fields: conforming,
- discard, executeonly, iopl, load, movable, and shared. Each can appear once,
- in any order. These fields are described in Section 13.15, "CODE, DATA, and
- SEGMENTS Attributes."
-
-
- Example
-
- The following example sets defaults for the program's code segments. No code
- segments in the program are loaded until accessed, and all require I/O
- hardware privilege.
-
- CODE LOADONCALL IOPL
-
-
- 13.13 The DATA Statement
-
- The DATA statement defines the default attributes for all data segments
- within the application or DLL. The SEGMENTS statement can override this
- default for one or more specific segments.
-
-
- Syntax
-
- DATA «attribute...»
-
-
- Remarks
-
- This statement accepts several optional attribute fields: instance, iopl,
- load, movable, readonly, and shared. Each can appear once, in any order.
- These fields are described in Section 13.15, "CODE, DATA, and SEGMENTS
- Attributes."
-
-
- Example
-
- The example below defines the application's data segment so that it cannot
- be shared by multiple copies of the program and cannot be written to. By
- default, the data segment can be read and written to and a new DGROUP is
- created for each instance of the application.
-
- DATA NONSHARED READONLY
-
-
- 13.14 The SEGMENTS Statement
-
- The SEGMENTS statement defines the attributes of one or more individual
- segments in the application or DLL. The attributes specified for a specific
- segment override the defaults set in the CODE and DATA statements (except as
- noted below). The total number of segment definitions cannot exceed the
- number set using LINK's /SEG option. (The default without /SEG is 128.)
-
- The SEGMENTS keyword marks the beginning of the segment definitions, where
- each definition is on its own line. The SEGMENTS statement must appear once
- before the first specification (on the same or preceding line) and can be
- repeated before each additional specification. SEGMENTS statements can
- appear more than once in the file.
-
-
- Syntax
-
- SEGMENTS
- «'»segmentname«'» «CLASS 'classname'» «attribute...»
-
-
- Remarks
-
- Each segment definition begins with segmentname, optionally enclosed in
- single or double quotation marks (' or "). The quotation marks are required
- if segmentname is a reserved word.
-
- The CLASS keyword optionally specifies the class of the segment. Single or
- double quotation marks (' or ") are required around classname. If you do not
- use the CLASS argument, the linker assumes that the class is CODE.
-
- This statement accepts several optional attribute fields: conforming,
- discard, executeonly, iopl, load, movable, readonly, and shared. Each can
- appear once, in any order. These fields are described in the next section,
- "CODE, DATA, and SEGMENTS Attributes."
-
-
- Example
-
- The following example specifies segments named cseg1, cseg2, and dseg.
- The first segment is assigned the class mycode and the second is assigned
- CODE by default. Each segment is given different attributes.
-
- SEGMENTS
- cseg1 CLASS 'mycode' IOPL
- cseg2 EXECUTEONLY PRELOAD CONFORMING
- dseg CLASS 'data' LOADONCALL READONLY
-
-
- 13.15 CODE, DATA, and SEGMENTS Attributes
-
- The following attribute fields apply to the CODE, DATA, and SEGMENTS
- statements previously described. Refer to "Remarks" in each of the previous
- sections for the attribute fields that are used by each statement. Most
- fields are used by all three statements; others are used as noted. Each
- field can appear once, in any order.
-
- Listed with each attribute field below are keywords that are legal values
- for the field, along with descriptions of the field and values. The defaults
- are noted. If two segments with different attributes are combined into the
- same group, LINK makes decisions to resolve any conflicts and assumes a set
- of attributes.
-
- Attribute Description
- ────────────────────────────────────────────────────────────────────────────
- conforming {CONFORMING | NONCONFORMING}
-
- For CODE and SEGMENTS statements only. Determines
- whether a code segment is an 80286 "conforming"
- segment for device drivers and system-level code. The
- conforming attribute is for OS/2 only.
-
- CONFORMING specifies that the segment executes at the
- caller's privilege level. When IOPL=YES is specified
- in CONFIG.SYS, no call gates are generated for calls
- or jumps.
-
- NONCONFORMING (the default) specifies that the segment
- can be accessed from Ring 2. When IOPL=YES is
- specified in CONFIG.SYS, call gates are generated.
-
- For more information, refer to Intel documentation for
- the 80286 processor and later.
-
- discard {DISCARDABLE | NONDISCARDABLE}
-
- For CODE and SEGMENTS statements only. Determines
- whether a code segment can be discarded from memory
- to fill a different memory request. If the discarded
- segment is accessed later, it is reloaded from disk.
- NONDISCARDABLE is the default. The discard attribute
- is for Windows only.
-
- executeonly {EXECUTEONLY | EXECUTEREAD}
-
- For CODE and SEGMENTS statements only. Determines
- whether a code segment can be read as well as executed.
-
- EXECUTEONLY specifies that the segment can only be
- executed. The keyword EXECUTE-ONLY is an alternate
- spelling.
-
- EXECUTEREAD (the default) specifies that the segment
- is both executable and readable. This attribute is
- necessary for a program to run under the Microsoft
- CodeView debugger.
-
- instance {NONE | SINGLE | MULTIPLE}
-
- For the DATA statement only. Affects the sharing
- attributes of the default data segment (DGROUP). This
- attribute interacts with the shared attribute.
-
- NONE tells the loader not to allocate DGROUP. Use NONE
- when a DLL has no data and uses an application's
- DGROUP.
-
- SINGLE (the default for DLLs) specifies that one
- DGROUP is shared by all instances of the DLL or
- application.
-
- MULTIPLE (the default for applications) specifies that
- DGROUP is copied for each instance of the DLL or
- application.
-
- iopl {IOPL | NOIOPL}
-
- Determines whether a segment has I/O privilege. OS/2
- only.
-
- IOPL specifies that a code segment has I/O privilege
- and that a data segment can be accessed only from an
- IOPL code segment.
-
- NOIOPL (the default) specifies that there is no I/O
- privilege for code and no protection for data.
-
- load {PRELOAD | LOADONCALL}
-
- Determines when a segment is loaded.
-
- PRELOAD specifies that the segment is loaded when the program
- starts.
-
- LOADONCALL (the default) specifies that the segment is
- not loaded until accessed and only if not already
- loaded.
-
- movable {MOVABLE | FIXED}
-
- Determines whether a segment can be moved in memory.
- Windows only. FIXED is the default. An alternative
- spelling for MOVABLE is MOVEABLE.
-
- readonly {READONLY | READWRITE}
-
- For DATA and SEGMENTS statements only. Determines
- access rights to a data segment.
-
- READONLY specifies that the segment can only be read.
-
- READWRITE (the default) specifies that the segment is
- both readable and writeable.
-
- shared {SHARED | NONSHARED}
-
- For real-mode Windows and for READWRITE data segments
- under OS/2 only. Determines whether all instances of
- the program can share EXECUTEREAD and READWRITE
- segments. (Under OS/2, all code segments and READONLY
- data segments are shared.)
-
- SHARED (the default for DLLs) specifies that one copy
- of the segment is loaded and shared among all
- processes accessing the application or DLL. This
- attribute saves memory and can be used for code that
- is not self-modifying. An alternate keyword is PURE.
-
- NONSHARED (the default for applications) specifies
- that the segment must be loaded separately for each
- process. An alternate keyword is IMPURE.
-
- This attribute and the instance attribute interact for
- data segments. The instance attribute has the keywords
- NONE, SINGLE, and MULTIPLE. If DATA SINGLE is
- specified, LINK assumes SHARED; if DATA MULTIPLE is
- specified, LINK assumes NONSHARED. Similarly, DATA
- SHARED forces SINGLE, and DATA NONSHARED forces
- MULTIPLE.
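-
- As an illustration of the interaction between the shared and instance
- attributes in a DATA statement, the sketch below shows two pairs of
- statements. Assuming that all other attributes take their defaults, the
- two statements in each pair should have the same effect. Each line is an
- alternative way of writing one DATA statement:
-
- DATA SINGLE
- DATA SINGLE SHARED
-
- DATA MULTIPLE
- DATA MULTIPLE NONSHARED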
-
-
-
-
- 13.16 The OLD Statement
-
- The OLD statement directs the linker to search another DLL for export
- ordinals. This statement preserves ordinal values used from older versions
- of a DLL. For more information on ordinals, see the sections below on the
- EXPORTS and IMPORTS statements.
-
- Exported names in the current DLL that match exported names in the old DLL
- are assigned ordinal values from the earlier DLL unless
-
-
- ■ The name in the old module has no ordinal value assigned, or
-
- ■ An ordinal value is explicitly assigned in the current DLL.
-
-
- Only one DLL can be specified; ordinals can be preserved from only one DLL.
- The OLD statement has no effect on applications.
-
-
- Syntax
-
- OLD 'filename'
-
-
- Remarks
-
- The filename specifies the DLL to be searched. It must be enclosed in single
- or double quotation marks (' or ").
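-
-
- Example
-
- The following sketch shows how OLD might be combined with the LIBRARY and
- EXPORTS statements. The filename calold.dll and the function names
- ShowMonth and ShowYear are hypothetical. Under the rules given above,
- ShowMonth receives the ordinal it had in calold.dll, provided it was
- exported there with an ordinal. ShowYear keeps the ordinal 3 assigned
- here, regardless of the older DLL:
-
- LIBRARY calendar
- OLD 'calold.dll'
- EXPORTS
- ShowMonth
- ShowYear @3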
-
-
- 13.17 The EXPORTS Statement
-
- The EXPORTS statement defines the names and attributes of the functions and
- data made available to other applications and DLLs, and of the functions
- that run with I/O privilege. By default, functions and data are hidden from
- other programs at run time. A definition is required for each function or
- data item being exported.
-
- The EXPORTS keyword marks the beginning of the export definitions, each on
- its own line. The EXPORTS keyword must appear once before the first
- definition (on the same or preceding line) and can be repeated before each
- additional definition. EXPORTS statements can appear more than once in the
- file.
-
- Some languages offer a way to export without using an EXPORTS statement. For
- example, in C the _export keyword makes a function available from a DLL.
-
-
- Syntax
-
- EXPORTS
- entryname«=internalname» «@ord «RESIDENTNAME»» «NODATA» «pwords»
-
-
- Remarks
-
- The entryname defines the function or data-item name as it is known to other
- programs. The optional internalname defines the actual name of the exported
- function or data item as it appears within the exporting program; by
- default, this name is the same as entryname.
-
- The optional ord field defines a function's ordinal position within the
- module-definition table as an integer from 1 to 65,535. If ord is specified,
- the function can be called by either entryname or ord. Use of ord is faster
- and can save space.
-
- The optional keyword RESIDENTNAME specifies that entryname be kept resident
- in memory at all times. This keyword is applicable only if ord is used. (If
- ord is not used, the name entryname is always kept in memory.)
-
- The optional keyword NODATA specifies that there is no static data in the
- function.
-
- The pwords field specifies the total size of the function's parameters in
- words. This field is required only if the function executes with I/O
- privilege. When a function with I/O privilege is called, OS/2 consults
- pwords to determine how many words to copy from the caller's stack to the
- I/O-privileged function's stack.
-
-
- Example
-
- The following EXPORTS statement defines the three exported functions
- SampleRead, StringIn, and CharTest. The first two functions can be called
- either by their exported names or by an ordinal number. In the application
- or DLL where they are defined, these functions are named read2bin and
- str1, respectively. The first and last functions run with I/O privilege and
- therefore are given with the total size of the parameters.
-
- EXPORTS
- SampleRead = read2bin @8 24
- StringIn = str1 @4 RESIDENTNAME
- CharTest 6
-
-
- 13.18 The IMPORTS Statement
-
- The IMPORTS statement defines the names and locations of functions and data
- items to be imported (usually from a DLL) for use in the application or DLL.
- A definition is required for each function or data item being imported. This
- statement is an alternative to resolving references through an import
- library created by the IMPLIB utility; functions and data items listed in an
- import library do not require an IMPORTS definition.
-
- The IMPORTS keyword marks the beginning of the import definitions, each on
- its own line. The IMPORTS keyword must appear once before the first
- definition on the same or preceding line and can be repeated before each
- additional definition. IMPORTS statements can appear more than once in the
- file.
-
-
- Syntax
-
- IMPORTS
- «internalname=»modulename.entry
-
-
- Remarks
-
- The internalname specifies the function or data-item name as it is used in
- the importing application or DLL. Thus, internalname appears in the source
- code of the importing program, while the function may have a different name
- in the program where it is defined. By default, internalname is the same as
- the entry name. An internalname is required if entry is an ordinal value.
-
- The modulename is the filename of the exporting application or DLL that
- contains the function or data item.
-
- The entry field specifies the name or ordinal value of the function or data
- item as defined in the modulename application or DLL. If entry is an ordinal
- value, internalname must be specified. (Ordinal values are set in an EXPORTS
- statement.)
-
- ────────────────────────────────────────────────────────────────────────────
- NOTE
-
- A given symbol (function or data item) has a name for each of three
- different contexts. The symbol has a name used by the exporting program
- (application or DLL) where it is defined, a name used as an entry point
- between programs, and a name used by the importing program where the symbol
- is used. If neither program uses the optional internalname field, the symbol
- has the same name in all three contexts. If either of the programs uses the
- internalname field, the symbol may have more than one distinct name.
- ────────────────────────────────────────────────────────────────────────────
-
-
- Example
-
- The following IMPORTS statement defines three functions to be imported:
- SampleRead, SampleWrite, and a function that has been assigned an ordinal
- value of 1. The functions are found in the Sample, SampleA, and Read
- applications or DLLs, respectively. The function from Read is referred to
- as ReadChar in the importing application or DLL. The original name of the
- function, as it is defined in Read, may or may not be known and is not
- included in the IMPORTS statement.
-
- IMPORTS
- Sample.SampleRead
- SampleA.SampleWrite
- ReadChar = Read.1
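-
- To illustrate the three names described in the preceding note, the sketch
- below pairs one export definition from the example in Section 13.17 with a
- matching import. It assumes that the exporting DLL is named Sample; the
- name GetString is hypothetical. The function is known as str1 inside
- Sample, as StringIn at the entry point between programs, and as GetString
- in the importing program.
-
- In the module-definition file for Sample:
-
- EXPORTS
- StringIn = str1 @4 RESIDENTNAME
-
- In the module-definition file for the importing program:
-
- IMPORTS
- GetString = Sample.StringIn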
-
-
- 13.19 Related Topics in Online Help
-
- In addition to information covered in this chapter, information on the
- following topics can be found in online help.
-
- Topic Access
- ────────────────────────────────────────────────────────────────────────────
- Syntax and procedural information on Choose "LIB" from the list of
- LIB utilities on the "Microsoft Advisor
- Contents" screen
-
- Module-definition files and IMPLIB Choose "LINK" from the list of
- utilities on the "Microsoft Advisor
- Contents" screen
-
-
-
-
-
-
-
-
-
- Chapter 14 Customizing the Microsoft Programmer's WorkBench
- ────────────────────────────────────────────────────────────────────────────
-
- The Microsoft Programmer's WorkBench (PWB) is not just a text editor, but
- also a full-featured platform for program development. It is both flexible
- (you can customize it to match your working habits) and extensible (you can
- add your own functions and features).
-
- This chapter explains three ways to customize the Programmer's WorkBench:
-
-
- ■ Setting switches
-
- ■ Assigning keystrokes
-
- ■ Writing macros
-
-
- While this chapter explains customizing techniques, it does not document
- every customizable feature. Please consult online help for detailed
- information about these and other PWB features.
-
- This chapter assumes you are familiar with basic PWB operation and
- terminology. If not, please read "Using the Programmer's WorkBench" in
- Installing and Using the Microsoft Macro Assembler Professional Development
- System. The Programmer's WorkBench is supplied with both the Macro Assembler
- and Microsoft C so that you can customize one copy of PWB to work with these
- and other languages.
-
-
- 14.1 Setting Switches
-
- The Programmer's WorkBench has a number of "switches," or user-configurable
- options, that control features such as how many lines the screen scrolls or
- whether you are prompted to save a file when you exit. Each switch has a
- name and can be assigned a value.
-
- There are two ways to set PWB switches. The easiest way is to choose Editor
- Settings from the Options menu. Saving the changes made to Editor Settings
- updates your TOOLS.INI initialization file. You can also directly edit
- TOOLS.INI. Either method can be used for more elaborate customizations, such
- as writing macros.
-
-
- 14.1.1 Changing Current Assignments and Switch Settings
-
- You can change the current editor switches and key assignments. Choose
- Editor Settings or Key Assignments from the Options menu. PWB displays these
- settings in a new window labeled Current Assignments and Switch Settings.
-
- The <ASSIGN> pseudofile is associated with the Current Assignments and
- Switch Settings window. A pseudofile exists only in memory; it has no
- counterpart on disk until you explicitly save it. Saving the <ASSIGN>
- pseudofile automatically saves any changes you make in the Current
- Assignments and Switch Settings window.
-
- To change a switch, edit the line on which it appears. For instance, the
- vscroll switch controls how many lines PWB scrolls vertically; its default
- setting is 1. To change it, move to the corresponding line:
-
- vscroll:1
-
- Change the 1 to 3 and move the cursor to another line. PWB highlights the
- line to indicate that the change has been executed. (If you make an illegal
- change, PWB signals an error.) The change takes effect immediately: PWB now
- scrolls text three lines at a time.
-
- When you save <ASSIGN>, PWB updates your TOOLS.INI file.
-
- PWB discards all changes at the end of a session unless you explicitly save
- them. You save changes by saving <ASSIGN> as you would any other file.
- Select Save from the File menu, or press SHIFT+F2.
-
- You can also use this method for more elaborate customizations, such as
- writing macros (see Section 14.3, "Writing Macros"). Simply insert a few
- blank lines in the Current Assignments and Switch Settings window and enter
- the new information in them.
-
- If you add or modify a line of the Current Assignments and Switch Settings
- window, PWB immediately alters its behavior accordingly; the new or changed
- lines are saved in TOOLS.INI when you save the <ASSIGN> file. However,
- deleting a line has no effect, either on PWB's behavior or the contents of
- TOOLS.INI; you must edit TOOLS.INI to remove an assignment.
-
-
- 14.1.2 Editing the TOOLS.INI Initialization File
-
- Another way to customize PWB is by editing TOOLS.INI, the initialization
- file used by PWB and other Microsoft language utilities. This is the most
- convenient way to perform extensive customizing.
-
- While the Current Assignments and Switch Settings window displays every
- customizable PWB item, the TOOLS.INI file contains lines only for items you
- have customized. PWB sets any items you omit from TOOLS.INI to a default
- value.
-
- Since TOOLS.INI can initialize a number of Microsoft tools, the file is
- divided into sections, one for each tool. Each section begins with a tag
- consisting of the tool's base name enclosed in square brackets: [PWB] for
- PWB.EXE, [NMAKE] for NMAKE.EXE, and so on.
-
- For example, assume you set the vscroll switch to 3 and saved the change,
- but you have not customized PWB in any other way. Your TOOLS.INI file will
- contain this section:
-
- [PWB]
- vscroll:3
-
- PWB reads TOOLS.INI at start-up and loads the settings from the [PWB]
- section.
-
- You can also create sections of TOOLS.INI that configure PWB for specific
- programming languages or operating systems. For instance, your TOOLS.INI
- file could contain a section beginning with the tag
-
- [PWB-.C]
-
- for C source files, and
-
- [PWB-.ASM]
-
- for assembly-language (.ASM) source files. Each time you load a file with
- the designated extension, PWB reads the appropriate section of TOOLS.INI.
- You can have a different set of macros and other customizations for each
- file type.
-
- TOOLS.INI can also contain sections specific to an operating system. The
- following tag introduces a section specific to DOS version 3.31, for
- instance:
-
- [PWB-3.31]
-
- You can combine tags as needed. For example, the tag
-
- [PWB-3.0 PWB-10.10R]
-
- applies to DOS version 3.0 and OS/2 version 1.1 real mode.
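-
- As a sketch of how these sections might fit together, the following
- TOOLS.INI fragment combines the vscroll setting shown earlier with the two
- comment macros from Section 14.3.1, "Macro Syntax." The arrangement is
- illustrative only. It assumes that PWB loads the [PWB] section at start-up
- and then reads the matching extension section each time a file is loaded,
- as described above:
-
- [PWB]
- vscroll:3
-
- [PWB-.C]
- comment:=begline "/* " endline " */"
- comment:alt+c
-
- [PWB-.ASM]
- comment:=begline "; "
- comment:alt+c
-
- With this arrangement, ALT+C should insert the comment style that matches
- the type of file most recently loaded.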
-
-
- 14.2 Assigning Functions to Keystrokes
-
- You can assign any PWB function to almost any keystroke. Keystroke
- assignments, like switches, are displayed in the Current Assignments and
- Switch Settings window (choose Key Assignments from the Options menu) and
- can be changed there. Suppose you want to assign the home cursor function
- to SHIFT+HOME. The default keystroke assignment for home is
-
- home:Goto
-
- If you change the assignment to
-
- home:Shift+Home
-
- SHIFT+HOME moves the cursor to the home (upper left) window position.
-
- You can assign the same function to more than one keystroke. For example,
- many keystrokes invoke the select function, which selects a text region. The
- preceding example adds a new keystroke (SHIFT+HOME) for the home function,
- but it does not remove the previous assignment (GOTO, the 5 key on the
- keypad).
-
- If you aren't sure whether a keystroke is already assigned, select the
- Current Assignments and Switch Settings window and press PGDN until you
- reach the Available Keys table. All unassigned keystrokes are displayed;
- once a keystroke is assigned, it no longer appears in this table.
-
- There are two limitations on keystroke assignments:
-
-
- ■ You should not reassign a keystroke that PWB assigns to a menu. For
- instance, ALT+F displays the File menu; PWB ignores any attempt to
- reassign ALT+F.
-
- ■ You should not reassign the ALT plus number keys 1-6 (ALT+1, ALT+2,
- and so on). These keystrokes are reserved for the file history menu
- items.
-
-
- A keystroke can invoke only one function. If you accidentally assign a
- keystroke to more than one function, PWB uses the most recent assignment.
- For example,
-
- home:Ctrl+A
- setfile:Ctrl+A
-
- assigns the CTRL+A keystroke to two different functions, home and setfile.
- The second assignment overrides the first, assigning CTRL+A to setfile.
-
- You might occasionally want to "unassign," or disable, a keystroke. This is
- done by assigning the unassigned function to the keystroke. For example,
-
- unassigned:Ctrl+A
-
- disables CTRL+A. PWB signals an error when you press any unassigned key.
-
- As the list of assigned keystrokes shows, you can use SHIFT+CTRL as a
- prefix. For PWB to recognize this key combination, SHIFT must come first.
- For example, to use SHIFT+CTRL with M, you must type SHIFT+CTRL+M, not
- CTRL+SHIFT+M.
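-
- For example, assuming that keystroke names are written in an assignment
- the same way as in the examples above, the following line is a sketch of
- an assignment that uses the SHIFT+CTRL prefix:
-
- home:Shift+Ctrl+M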
-
-
- 14.3 Writing Macros
-
- If you need a feature or function that is not a part of PWB, the quickest
- way to create it is by writing a macro in the TOOLS.INI file. A macro can do
- something as simple as inserting a line of text, or it can perform complex
- operations by invoking PWB functions and other macros.
-
-
- 14.3.1 Macro Syntax
-
- A macro can consist of any combination of PWB functions, literal text, and
- calls to previously defined macros. You can define up to 1,024 macros at one
- time.
-
- Anything inside quotation marks is literal text. Within literal text,
- quotation marks are represented by a backslash followed by quotation marks
- (\") and a backslash is represented by two consecutive backslashes (\\).
- Only literal text is case sensitive; PWB ignores the case of everything
- else.
-
- The following macro "comments out" a line of MASM source code:
-
- comment:=begline "; "
- comment:alt+c
-
- The first line names the macro (comment); the macro commands follow the
- assignment operator ( := ). The begline editor function moves the cursor to
- the beginning of the current line. The text inside quotation marks (the MASM
- comment delimiter) is then inserted. The second line assigns a keystroke
- (ALT+C) to the macro.
-
- Macros can extend over more than one line.
-
- If a macro definition takes up more space than you have on one line (about
- 250 characters in PWB), you can use the backslash ( \ ) to continue the
- definition on the next line. Consider, for instance, the following macro,
- which comments out a line of C source code:
-
- comment:=begline "/* " endline " */"
-
- It could be written as
-
- comment:=begline  \
- "/* " endline  \
- " */"
-
- Notice the extra space before each backslash. If you want a space between
- the end of one line and the beginning of the next, you must precede the
- backslash with two spaces.
-
- You can pass arguments to PWB macros.
-
- You can use the arg function to pass arguments to functions. For example,
- the following macro passes the argument 15 to the plines function (which
- scrolls text down):
-
- movedown:=arg "15" plines
-
- Because arg precedes the literal text, the text isn't written to the screen.
- Instead, it is passed as an argument to the next function, plines. The macro
- scrolls the current text down 15 lines.
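-
- The same pattern works with other functions that accept a numeric text
- argument. The sketch below is an assumption made by analogy with movedown:
- it uses mlines, the companion of plines that scrolls in the opposite
- direction, and the macro name moveup is purely illustrative.
-
```ini
; Hypothetical companion to movedown (the name moveup is illustrative)
moveup:=arg "15" mlines
```
-
- As with movedown, arg keeps the quoted text off the screen and hands it to
- the function that follows.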
-
- Arguments can also use regular-expression syntax (regular expressions are
- documented in online help):
-
- endword:=arg arg "([ .,;:()[\\]]!$)" psearch cancel
-
- The arg arg sequence directs the psearch function to treat the text argument
- as a regular-expression search pattern. This search pattern tells PWB to
- search for the next space, period, comma, semicolon, colon, parenthesis, or
- square bracket. (Note that a backslash must precede any character that has
- a special meaning in regular expressions, in this case the right bracket.
- Because the pattern is literal text, that backslash is itself written as
- two backslashes.)
-
- A macro can invoke other macros:
-
- lcomment:= "/* "
- rcomment:= " */"
- commentout:=begline lcomment endline rcomment
- commentout:ctrl+o
-
- The commentout macro invokes the previously defined macros lcomment and
- rcomment.
-
- In addition to standard PWB functions, PWB macros can invoke user-defined
- macro functions. See Section 9.6, "Returning Values with Macro Functions."
-
-
- 14.3.2 Macro Responses
-
- Some PWB functions ask you for confirmation. For example, the meta exit
- (quit without saving) function normally asks if you really want to exit.
- Such questions always take the answer "yes" (Y) or "no" (N).
-
- When you invoke such a function in a macro, the function assumes an answer
- of yes and does not ask for confirmation. For example, the macro definition
-
-
- quit:=meta exit
- quit:alt+x
-
- invokes meta exit when you press ALT+X. (The meta prefix modifies the
- action of a function.) Because the meta exit function is invoked from a
- macro, PWB exits without asking for confirmation.
-
- The following operators allow you to restore normal prompting or change the
- default responses:
-
- Operator     Description
- ────────────────────────────────────────────────────────────────────────────
- <            Asks for confirmation; if not followed by another
-              < operator, prompts for all further questions
-
- <y           Assumes a response of "yes"
-
- <n           Assumes a response of "no"
-
-
- A response operator applies to the function immediately preceding it. For
- example, you can add the < operator to the quit macro definition to
- restore the usual prompt:
-
- quit:=meta exit <
- quit:alt+x
-
- Now the macro prompts for a response before it exits.
-
-
- 14.3.3 Macro Arguments
-
- If you enter an argument in PWB and then invoke a macro, the argument is
- passed to the first function in the macro that takes an argument:
-
- tripleit:=copy paste paste
-
- The tripleit macro invokes the copy and paste editing functions. When you
- highlight a text area and then invoke the macro, your highlighted argument
- is passed to the copy function, which copies the argument to the clipboard.
- The macro then invokes paste twice. The effect is to insert two copies of
- the highlighted text.
-
- You cannot pass more than one argument from PWB to a macro, even if the
- macro invokes more than one function that can accept an argument. The
- argument always goes to the first function in the macro that takes an
- argument.
-
- You can also prompt for input inside a macro and pass the input as an
- argument using the prompt function as shown below:
-
- newfile:=arg "Next file: " prompt setfile <
- newfile:alt+n
-
- The newfile macro prompts for a file name and then switches to the
- specified file. The sequence arg "Next file: " passes the text argument
- "Next file: " to prompt, which displays it in the text-argument dialog box
- and waits for the user to respond. The response is passed as a text
- argument to
- the setfile function, which switches to that file.
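-
- The same technique applies to any function that accepts a text argument.
- As a minimal sketch (the macro name findtext is hypothetical, and the
- assumption is that prompt hands its response to psearch just as it does to
- setfile), the following macro asks for a search string and then searches
- forward for it:
-
```ini
; Hypothetical macro: prompt for text, then search forward for it
findtext:=arg "Search for: " prompt psearch
```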
-
-
- 14.3.4 Macro Conditionals
-
- Macros can take different actions depending on certain conditions. Such
- macros take advantage of the fact that PWB editing functions return values─
- a TRUE (nonzero) value if successful or FALSE (zero) if unsuccessful.
-
- Macros can use four conditional operators:
-
- Operator     Description
- ────────────────────────────────────────────────────────────────────────────
- :>label      Defines a label that can be targeted by other operators
- =>label      Jumps to label
- +>label      Jumps to label if the previous function returns TRUE
- ->label      Jumps to label if the previous function returns FALSE
-
- For example, the leftmarg macro moves the cursor to the left margin of
- the editing window:
-
- leftmarg:=:>leftmore left +>leftmore
-
- The macro above invokes the left function repeatedly (jumping to the label
- leftmore) until it returns FALSE, indicating the cursor has reached the left
- margin.
-
- Macro execution depends on the status of conditionals.
-
- The label must appear immediately after the conditional operator, with no
- intervening spaces. A conditional operator without a label exits the macro
- immediately if the condition is satisfied. If the condition is not
- satisfied, the macro continues execution. The following example demonstrates
- this:
-
- turnon:=insertmode +> insertmode
-
- This macro turns on insert mode regardless of whether insert mode is
- currently on or off. If insert mode is off, the first invocation of
- insertmode toggles the mode on and returns TRUE, causing the +> operator to
- terminate the macro. If insert mode is currently on, the first invocation of
- insertmode turns insert mode off and returns FALSE. The macro then invokes
- insertmode a second time, turning insert mode back on.
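-
- The operators can be combined. The following sketch is an assumption built
- by analogy with leftmarg: it presumes that the up function returns FALSE
- once the cursor is on the first line of the file, and the names totop and
- more are illustrative only.
-
```ini
; Hypothetical macro: move the cursor up until the up function fails
;   ->       (no label) exits the macro when up returns FALSE
;   =>more   otherwise jumps back to the label more
totop:=:>more up -> =>more
```
-
- Written this way, the exit test and the loop are separate steps, which
- leaves room to add other functions between them.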
-
-
- 14.3.5 Recording Macros
-
- You can also create a macro by recording a procedure as you perform it. The
- keystroke sequence is saved and can be replayed, like any other macro. To
- record a macro:
-
-
- 1. Choose Set Record from the Edit menu. The Set Macro Record dialog box
- appears.
-
- 2. Type the name you want the macro to have in the Name text box.
-
- 3. Tab to the Key Assignment text box and press the key to which you are
- assigning the macro. (For example, press ALT+T to assign the macro to
- ALT+T. The name of the keystroke appears in the text box.) If the
- keystroke (such as ENTER, TAB, or ESC) would normally exit the dialog
- box or move to the next field, type in the keystroke's name.
-
- 4. Click the OK button.
-
- 5. Choose Record On from the Edit menu to start the recording.
-
- 6. Type the text or perform the actions you want to record. (You can
- select text or fields with the mouse as well as the keyboard. Mouse
- selections are automatically converted into equivalent keystrokes.)
-
- 7. Choose Record On again to end the recording.
-
-
- You have now created a named macro available through the assigned keystroke.
- Pressing this key replays the actions you recorded.
-
- ────────────────────────────────────────────────────────────────────────────
- WARNING
-
- If you do not select a name for your macro, it is assigned the default name
- recordvalue. Unless you plan to discard the macro when exiting, do not let a
- recorded macro's name default to recordvalue. Any subsequent macro recorded
- with the recordvalue default name will overwrite the first recordvalue
- macro.
- ────────────────────────────────────────────────────────────────────────────
-
- A recorded macro is temporary; PWB discards it when you exit. To save a
- recorded macro:
-
-
- 1. Choose Edit Macro from the Edit menu. This opens the <RECORD>
- pseudofile and displays the macros you recorded.
-
- 2. Make any changes required. For example, you might want to change the
- macro's name or modify the keystroke sequence.
-
- 3. Save the macro using the Save command from the File menu.
-
-
- The macros defined in the <RECORD> pseudofile are added to your TOOLS.INI
- file when you save the <RECORD> file. PWB automatically reloads them at the
- next session.
-
- You can append functions to an existing macro without having to record the
- original steps again:
-
-
- 1. Choose Set Record from the Edit menu. The Set Macro Record dialog box
- appears.
-
- 2. Type the macro's name in the Name text box.
-
- 3. Tab to the Clear First check box and cancel the selection. This causes any
- new actions to be appended to the original macro, rather than
- replacing (clearing) it.
-
- 4. Click the OK button.
-
- 5. Choose Record On from the Edit menu to start the recording.
-
- 6. Perform the actions you want added to the macro.
-
- 7. Choose Record On again to end the recording.
-
-
- Remember to save the modified macro before exiting, or the new version will
- be discarded.
-
- You can record a series of actions without executing them.
-
- You can make a "silent" recording, which records a series of actions without
- executing them. This allows you to create a macro without altering or
- damaging the file. Start the recording with a meta record command (press F9,
- SHIFT+CTRL+R). When the macro is complete, terminate recording with record
- (press SHIFT+CTRL+R).
-
- PWB gives no visual feedback during silent recording. If you need to see the
- macro being created, open the <RECORD> pseudofile in a second window as
- described above. This is an excellent way to get a better understanding of
- macros and editor functions.
-
-
- 14.3.6 Temporary Macros
-
- You can use the assign function to create a macro that lasts only until the
- end of the current session. For example, the following steps create the
- comment macro described above:
-
-
- ■ Press ALT+A
-
- ■ Type comment:=begline "; "
-
- ■ Press ALT+=
-
-
- This key sequence tells PWB to open dialog boxes where the macro and key
- assignments are to be typed. To assign ALT+C to the macro,
-
-
- ■ Press ALT+A
-
- ■ Type comment:alt+c
-
- ■ Press ALT+=
-
-
- The macro is available immediately and is discarded when you exit PWB.
-
-
- 14.4 Related Topics in Online Help
-
- Information on the following related topics can be found in online help. All
- the topics listed below are found by choosing "Programmer's WorkBench" from
- the "Microsoft Advisor's Help System Contents" screen.
-
- Topic                            Access
- ────────────────────────────────────────────────────────────────────────────
- Writing macros                   Choose "Writing and Using Macros"
-
- TOOLS.INI                        Choose "Using TOOLS.INI"
-
- Regular expressions              Choose "Writing and Using Macros"; then
-                                  choose "Regular Expressions" from under
-                                  the "Building Macros" subhead
-
- The prompt and meta functions    Choose "Using PWB Functions," and from
-                                  the next screen, choose "Alphabetical
-                                  List"
-
- Assigning keystrokes             Choose "Setting PWB Switches" and then
-                                  "Assign Function"
-
-
-
-
-
-
- Chapter 15 Debugging Assembly-Language Programs with CodeView
- ────────────────────────────────────────────────────────────────────────────
-
- You can diagnose software problems and locate programming errors quickly
- with the CodeView debugger. This chapter explains how to
-
-
- ■ Display and modify variables and memory
-
- ■ Control the flow of execution
-
- ■ Use advanced CodeView debugging techniques
-
- ■ Modify CodeView's behavior with command-line switches and the
- TOOLS.INI file
-
-
- CodeView supports the Microsoft mouse (or any fully compatible pointing
- device). This chapter first describes CodeView operations with the mouse,
- then with function keys. Command-window commands are not generally
- discussed, except when there is no comparable mouse or function-key command.
- Unless a specific mouse button is named, "clicking" means pressing and
- quickly releasing the left mouse button.
-
-
- 15.1 Understanding Windows in CodeView
-
- CodeView divides the screen into logically separate sections called windows.
- Windows permit a large amount of information to be displayed in an organized
- and easy-to-read fashion.
-
- Each window displays a different type of data.
-
- Each CodeView window has a distinct function and operates independently of
- the others. The name of each window described below appears in the top of
- the window's frame:
-
-
- ■ The Source window displays the source code. You can open a second
- source window to view an include file, another source file, or the
- same source file at a different location. Any ASCII text file can be
- viewed in the Source window.
-
- ■ The Command window accepts debugging commands from the keyboard.
-
- ■ The Watch window displays the current values of selected variables.
-
- ■ The Local window lists the values of all variables local to the
- current procedure.
-
- ■ The Memory window shows the contents of memory. You can open a second
- Memory window to view a different section of memory.
-
- ■ The Register window displays the contents of the microprocessor's
- registers, as well as the processor flags.
-
- ■ The 8087 window displays the registers of the coprocessor or its
- software emulator.
-
-
- Figure 15.1 shows all CodeView windows.
-
- (This figure may be found in the printed book.)
-
- The first time you run CodeView, it displays three windows. The Local window
- is at the top, the Source window fills the middle of the screen, and the
- Command window is at the bottom. CodeView records which windows were open
- and how they were positioned at the time you exit. These settings become the
- default the next time you run CodeView.
-
- There are two ways to open windows. You can choose the desired window from
- the View menu or press its shortcut key. In addition, some operations (such
- as selecting a Watch variable) automatically open the appropriate window if
- it isn't already open.
-
- All displays are updated automatically.
-
- CodeView continually and automatically updates the contents of all windows.
- However, if you want to interact with a particular window (such as entering
- a command, setting a breakpoint, or modifying a variable), you must first
- select that window.
-
- The selected window is called the "active" window. The active window is
- marked in three ways:
-
-
- ■ The window's name is highlighted.
-
- ■ The text cursor appears in the window.
-
- ■ The vertical and horizontal scroll bars move into the window.
-
-
- Figure 15.2 shows the Source window as the active window.
-
- (This figure may be found in the printed book.)
-
- To select a new active window, click that window (position the mouse pointer
- in the window and press the left mouse button). You can also press F6 or
- SHIFT+F6 to move from one window to the next.
-
- Windows often contain more information than can be displayed in the area
- allotted to the window. There are several ways to view these additional
- contents.
-
- To view additional contents with the mouse:
-
-
- ■ Drag the scroll box on the horizontal or vertical scroll bars.
- (Position the mouse pointer on the scroll box and, while holding down
- the left mouse button, move the mouse in the appropriate direction.)
-
- ■ Click the arrows at the top and bottom of the scroll bars.
-
- ■ Click the gray area to either side of the scroll box in a scroll bar.
-
-
- To view additional contents with the keyboard:
-
-
- ■ Press the direction keys (LEFT, RIGHT, UP, DOWN) to move the cursor.
-
- ■ Press PGUP, PGDN, CTRL+PGUP (page left), and CTRL+PGDN (page right) to
- move the cursor to a different page of the window's contents.
-
- ■ Press CTRL+HOME to move the cursor to the beginning of the window's
- contents.
-
- ■ Press CTRL+END to move the cursor to the end of the window's contents.
-
-
- Typing commands when the Source window is active causes CodeView to
- temporarily shift its focus to the Command window. Whatever you type is
- appended to the last line in the Command window. If the Command window is
- closed, CodeView beeps in response to your entry and ignores the input.
-
-
- Adjusting the Windows
-
- Although you can't reposition the windows, you can change their size or
- close them. The Maximize, Size, and Close commands from the View menu
- perform these functions, or you can press CTRL+F10, CTRL+F8, and CTRL+F4,
- respectively. Window manipulation is especially easy with a mouse:
-
-
- ■ To maximize a window (enlarge it so it fills the screen), click the up
- arrow at the right end of the window's top border, or double-click the
- window's title. (Position the mouse pointer anywhere on the title and
- press the left mouse button twice, rapidly.) To restore the window to
- its original size, click the double arrow at the right end of the top
- border or press CTRL+F10.
-
- ■ To change the size of a window, position the mouse pointer anywhere
- along the line at the top of the window. Press and hold down the left
- mouse button, then drag the mouse to enlarge or reduce the window. The
- same action on a vertical border widens or narrows the window.
-
- ■ To close a window, click the dot at the left end of the top border.
- The adjacent windows automatically expand to recover the unused space.
- You can also close any window whose View menu name has a dot next to
- it: choose that window from the menu or press the window's shortcut
- key.
-
-
- CodeView remembers the last debugging session.
-
- CodeView stores session information in a file called CURRENT.STS, which is
- created in the directory pointed to by the INIT environment variable (or in
- the current directory, if there is no INIT variable). The session
- information includes such items as the name of the program being debugged,
- the CodeView windows that were open, breakpoint locations, and other status.
- This information becomes the default status the next time you run CodeView.
-
-
-
- 15.2 Overview of Debugging Techniques
-
- There is no single best approach to debugging. CodeView offers a variety of
- debugging tools that let you select a method appropriate for the program or
- for your work habits. This section describes some approaches to solving
- debugging problems.
-
- Broadly speaking, two things can go wrong in a program:
-
-
- ■ The program doesn't manipulate the data the way you expected it to.
-
- ■ The flow of execution is incorrect.
-
-
- These problems usually overlap. Incorrect execution can corrupt the data,
- and bad data can cause execution to take an unexpected turn. Because
- CodeView allows you to trace program execution while simultaneously
- displaying whatever combination of variables you want, you don't have to
- know ahead of time whether the problem is bad data manipulation, a bad
- execution path, or some combination of both.
-
- CodeView has specific features that deal with the problems of bad data and
- incorrect execution:
-
-
- ■ You can view and modify any program variable, any section of memory,
- or any processor register. These features are explained in Section
- 15.3, "Viewing and Modifying Program Data."
-
- ■ You can monitor the path of execution and precisely control where
- execution pauses. These features are explained in Section 15.4,
- "Controlling Execution."
-
-
-
- 15.3 Viewing and Modifying Program Data
-
- CodeView offers a variety of ways to display the values of program
- variables, processor registers, and memory. You can also modify the values
- of all these items as the program executes. This section shows how to
- display and modify variables, registers, and memory.
-
-
- 15.3.1 Displaying Variables in the Watch Window
-
- To add a variable to the Watch window, position the cursor on the variable's
- name, using the mouse or the direction keys (LEFT, RIGHT, UP, DOWN). Then
- choose the Add Watch command from the Watch menu, or press CTRL+W.
-
- A dialog box appears with the selected variable's name displayed in the
- Expression field. If you don't want to watch the variable shown, type in the
- name of another variable. Click the OK button or press ENTER to add this
- variable to the Watch window.
-
- The Watch window appears at the top of the screen. Selecting a Watch
- variable automatically opens the Watch window if the window isn't already
- open.
-
- A newly added variable may be followed by the message:
-
- <Watch Expression Not in Context>
-
- This message appears when execution has not yet reached the procedure where
- a local variable is defined. Global variables (those declared outside
- procedures) never cause CodeView to display this message; they can be
- watched from anywhere in the program.
-
- To remove a variable from the Watch window, choose the Delete Watch command
- from the Watch menu or press CTRL+U. Then select the variable to be removed
- from the list in the dialog box. You can also position the cursor on any
- line in the Watch window and press CTRL+Y to delete that line.
-
- You can watch an unlimited number of variables.
-
- You can place as many variables as you like in the Watch window; the
- quantity is limited only by available memory. You can scroll the Watch
- window to position it at those variables you want to view. CodeView
- automatically updates all Watch window variables as the program runs,
- including those not currently visible within the Watch window frame.
-
- A variable can be specified by its address as well as its name. You can give
- its address in segment:offset form, where either component can be a register
- name or a number. You can extract a variable's address by prefixing the &
- operator to its name. Prefixing a variable's address (or any address) with
- the BY, WO, or DW operator displays the byte, word, or doubleword value
- starting at that address.
-
- There are several ways to display a variable's value.
-
- By default, CodeView displays variables as decimal values. You can select
- the radix by typing n8, n10, or n16 in the Command window for an octal,
- decimal, or hexadecimal display. CodeView remembers the current radix when
- you exit; it becomes the default radix the next time you run CodeView.
-
-
- 15.3.2 Displaying Expressions in the Watch Window
-
- The Watch window is not limited to variables. You can enter an expression
- (that is, any valid combination of variables, constants, and operators) for
- CodeView to evaluate and display. You can also select the format in which
- CodeView displays the expression.
-
- MASM expressions are evaluated using C rules.
-
- CodeView does not include an expression evaluator specifically for MASM. It
- uses the C expression evaluator instead. This means you must enter MASM
- variables or expressions in a form the C evaluator recognizes, which is not
- always the way they appear in a MASM program. (Online help describes the
- operators and precedence order for C expressions. The last part of this
- section also gives examples of some of the more commonly used expression
- forms.)
-
- The Language command from the Options menu offers a choice of Auto, C,
- Basic, or FORTRAN expression evaluators. However, the Basic and FORTRAN
- expression evaluators do not support address evaluation, pointer
- conversions, type casting, or other operations needed when debugging
- assembly-language code.
-
- Besides arithmetic and memory-reference expressions, CodeView can also
- display Boolean expressions. For example, if a variable is never supposed to
- be larger than 100 or less than 25, the expression
-
- (var < 25 || var > 100)
-
- evaluates to one (TRUE) if var goes out of bounds.
-
-
- Changing Display Format
-
- By default, CodeView displays expression values in decimal form. You can
- change the display radix to octal or hexadecimal with the Radix (N) command
- described at the end of the previous section.
-
- Another way to change the display format is to append a comma and a
- single-letter format specifier to any watched variable, expression, or
- address. For example, to display varname in octal form, type varname,o
- in the Watch expression box. (If varname is already in the Watch window,
- simply append the comma and octal specifier ,o and then move the cursor
- off the line.) The following list describes the use of each specifier:
-
- Specifier    Form Displayed
- ────────────────────────────────────────────────────────────────────────────
- c            Least-significant byte of the variable displayed as a
-              single character
-
- d            Decimal value
-
- e or E       Eight bytes displayed as a double-precision exponential
-              number
-
- f            Four bytes displayed as a single-precision floating-point
-              number
-
- g or G       Eight bytes displayed as a double-precision exponential
-              number
-
- i            Signed integer value
-
- o            Unsigned octal value
-
- s            String; all following bytes displayed as ASCII characters,
-              up to next null character (ASCII 0)
-
- u            Unsigned decimal value
-
- x or X       Hexadecimal value, without leading 0x
-
-
-
-
- Displaying MASM Expressions
-
- Expressions using registers or indexes are more complex. The following
- sections show how to substitute CodeView expressions using the C expression
- evaluator for MASM expressions.
-
- Register Indirection - The C expression evaluator does not recognize
- brackets to indicate the memory location pointed to by a register. Instead,
- use the BY, WO, or DW operator to reference the corresponding byte, word, or
- doubleword value.
-
- MASM Expression          CodeView Equivalent
- ────────────────────────────────────────────────────────────────────────────
- BYTE PTR [bx]            BY bx
- WORD PTR [bp]            WO bp
- DWORD PTR [bp]           DW bp
-
- Register Indirection with Displacement - To perform based, indexed, or
- based-indexed indirection with a displacement, use the BY, WO, or DW
- operator combined with addition.
-
- MASM Expression          CodeView Equivalent
- ────────────────────────────────────────────────────────────────────────────
- BYTE PTR [di+6]          BY di+6
- BYTE PTR Test [bx]       BY &Test+bx
- WORD PTR [si] [bp+6]     WO si+bp+6
- DWORD PTR [bx] [si]      DW bx+si
-
- Address of a Variable - Use the address operator (&) instead of the OFFSET
- operator.
-
- MASM Expression          CodeView Equivalent
- ────────────────────────────────────────────────────────────────────────────
- OFFSET Var               &Var
-
- PTR Operator - Use C type casts, or the BY, WO, and DW operators in
- conjunction with the address operator (&), to replace the PTR operator.
-
- MASM Expression          CodeView Equivalents
- ────────────────────────────────────────────────────────────────────────────
- BYTE PTR Var             BY &Var
-                          *(unsigned char*)&Var
-
- WORD PTR Var             WO &Var
-                          *(unsigned *)&Var
-
- DWORD PTR Var            DW &Var
-                          *(unsigned long*)&Var
-
-
- Strings - Add a comma and the string specifier ,s after the variable name.
-
- MASM Expression          CodeView Equivalent
- ────────────────────────────────────────────────────────────────────────────
- Stringvar                Stringvar,s
-
- Because CodeView uses the C expression evaluator and C strings end with an
- ASCII null (zero), CodeView displays all characters up to the next null in
- memory when you request a string display. If you intend to debug a MASM
- program, you should terminate string variables with a null.
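-
- A minimal sketch of the difference, assuming a small-model program; the
- variable names fileMsg and banner are hypothetical:
-
```asm
        .MODEL  small
        .DATA
; Null-terminated: a watch expression such as fileMsg,s stops at the 0 byte.
fileMsg BYTE    "Enter a file name: ", 0
; Not terminated: banner,s keeps displaying whatever bytes follow in memory
; until a null happens to appear.
banner  BYTE    "MASM 6.0"
        END
```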
-
- Array and Structure Elements - The C expression evaluator equates an array
- name with the address of its first element. Therefore, you should prefix an
- array name with the address operator (&), then add the desired offset. The
- offset can be added directly, or it can appear within parentheses. It can be
- a number, a register name, or a variable.
-
- The following examples (using byte, word, and doubleword arrays) show how
- this is done:
-
- MASM Expression          CodeView Equivalents
- ────────────────────────────────────────────────────────────────────────────
- String[12]               BY &String+12
-                          *(&String+12)
-
- aWords[bx+di]            WO &aWords+bx+di
-                          *(unsigned*)(&aWords+bx+di)
-
- aDWords[bx+4]            DW &aDWords+bx+4
-                          *(unsigned long*)(&aDWords+bx+4)
-
-
- Pointers - MASM 6.0 lets you define pointer-type variables. Since these are
- the same as C pointers, the C expression evaluator works as it does with C
- programs.
-
- You dereference a pointer simply by typing its name in the Watch window. The
- pointer's address is displayed, followed by all the elements of the variable
- to which the pointer refers. Multiple levels of indirection (that is,
- pointers referencing other pointers) can be displayed simultaneously.
-
-
- 15.3.3 Displaying Local Variables
-
- When your program is executing within the scope of a procedure, the Local
- window automatically displays the variables local to that procedure (stack
- variables). This includes arguments declared in PROC directives and
- variables explicitly declared as LOCAL within the procedure.
-
- Note that variables you place on the stack yourself are not displayed in the
- Local window, since CodeView is aware only of the stack variables the
- assembler creates. You can display user-defined stack variables in the Watch
- window by specifying their address in segment:offset form.
-
-
- 15.3.4 Using Pointers to Display Arrays and Strings
-
- Unlike high-level-language compilers, MASM does not provide symbolic
- information for arrays. Consequently, CodeView cannot distinguish between a
- simple variable and an array, and therefore cannot directly display an
- assembly-language array in expanded form. (See Section 15.3.2, "Displaying
- Expressions in the Watch Window," to display individual array elements.)
-
- A user-defined pointer lets you view an expanded array.
-
- For debugging purposes, you can overcome MASM's lack of array information by
- using the TYPEDEF directive to define a pointer type, and from that a
- pointer variable for the array. (Place the directive and pointer definition
- within a conditional-assembly block, so the pointer won't be added to your
- release code.) You can then view the array from CodeView by placing the
- pointer in the Watch window. For example:
-
- array   BYTE    20 DUP (0)      ; array of 20 bytes
-
- IF debug
- PBYTE   TYPEDEF PTR BYTE        ; PBYTE type is pointer to bytes
- parray  PBYTE   array           ; parray points to array
- ENDIF
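-
- The example above relies on a symbol named debug being defined before the
- IF directive is reached. One way to arrange that, shown here as a sketch
- (the value and its placement are assumptions, not part of the original
- example), is to define it with EQU near the top of the module:
-
```asm
        .MODEL  small
debug   EQU     1               ; assumed switch: set to 0 for release builds

        .DATA
array   BYTE    20 DUP (0)      ; array of 20 bytes

IF debug
PBYTE   TYPEDEF PTR BYTE        ; PBYTE type is pointer to bytes
parray  PBYTE   array           ; parray points to array
ENDIF
        END
```
-
- With debug set to 0, the assembler skips the block and parray never reaches
- the release code.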
-
- If you declare multiple levels of pointers (pointers to pointers to
- pointers, and so on), multiple levels of indirection can be displayed
- simultaneously by expanding each subpointer.
-
- If it is inconvenient to view a character array in hexadecimal form, cast
- the variable's name to a character pointer by placing (char *) in front of
- the name. The character array is then displayed as a string delimited by
- apostrophes. You can also append the string-format specifier ,s to the
- expression.
-
- Note that the C expression evaluator expects a string to terminate with the
- ASCII null character (0). If you do not include a terminating null in the
- string's definition, the evaluator continues displaying memory as characters
- until it encounters a null. The Memory window is an effective way to view
- nonterminated strings.
-
-
- 15.3.5 Displaying Structures
-
- MASM adds structure and union information to the debugging table. You can
- display MASM structures in expanded form, just as you would in C, Basic,
- Pascal, or FORTRAN.
-
- Structures contain multiple data values, often of different data types,
- arranged in one or more layers. Therefore, they are often referred to as
- "aggregate" data items. CodeView lets you control how much of a structure is
- shown; that is, whether all, part, or none of its components are displayed.
-
-
- The following example defines a structure and pointer types to implement a
- simple linked list:
-
- PTRLINKEDLIST TYPEDEF PTR LINKEDLIST
- PTRDATAWORD TYPEDEF PTR WORD
-
- LINKEDLIST STRUCT
- ptrNext PTRLINKEDLIST 0
- ptrData PTRDATAWORD 0
- LINKEDLIST ENDS
-
- rootNode linkedList < >
-
- Once rootNode has been defined, the program calls the MALLOC function
- (which is available from the libraries of Microsoft high-level languages) to
- allocate memory for a structure pointer and a data pointer. The addresses of
- each are assigned to the corresponding pointers in rootNode, readying the
- list for its first entry.
-
- The program stores a list item at the memory location specified by the
- preceding pointer, then calls MALLOC to allocate memory for the next list
- item. This process is repeated for each new list item, creating a linked
- list of data structures.
-
- To display the linked list of structures, add rootNode to the Watch
- window. It initially appears in the form:
-
- +rootnode = {...}
-
- The braces indicate that this is an aggregate variable (since it's a
- structure). The plus sign (+) indicates that the structure has not yet been
- expanded to display its components.
-
- To expand rootnode, double-click its display line. (Position the mouse
- pointer anywhere on the line and press the left mouse button twice,
- rapidly.) You can also move the cursor to the line and press ENTER. The
- Watch window display changes to
-
- -rootnode
- +ptrnext = 0F00:1111
- ptrdata = 0x0032 "2"
-
- The address and data values shown here are arbitrary. They depend on the
- data values stored and on the memory location from which MALLOC obtained
- free space. The minus sign (-) indicates that rootnode has been fully
- expanded; no further expansion is possible. The plus sign (+) indicates that
- ptrnext points to another structure that has not been expanded.
-
- Any structure element can be independently expanded or contracted. To expand
- the next structure, double-click ptrnext, or press ENTER when the cursor is
- on that line. The Watch window display changes to
-
- -rootnode
- -ptrnext = 0F00:1111
- +ptrnext = 0F00:2222
- ptrdata = 0x0034 "4"
- ptrdata = 0x0032 "2"
-
- Note that both the data value and its ASCII equivalent are displayed. To
- contract the structure, double-click its line a second time or position the
- cursor on the line and press ENTER.
-
- The process of expanding structures pointed to by ptrnext may be repeated
- indefinitely until you reach the last structure in the list. Its identifier
- will be prefixed with a minus sign, indicating that no more space for
- structures has been allocated.
-
- You can view individual elements instead of the entire structure.
-
- If you want to view only one or two elements of a large structure, indicate
- the specific structure elements in the Expression field of the Add Watch
- dialog box. Structure elements are separated by a dot (.), so you would type
-
-
- rootnode.ptrnext.ptrnext
-
- to view the pointer from the third structure in the list.
-
-
- 15.3.6 Using Quick Watch
-
- Choose the Quick Watch command from the Watch menu (or press SHIFT+F9) to
- display the Quick Watch dialog box. If the cursor is in the Source, Local,
- or Watch window, the variable at the current cursor position appears in the
- dialog box. If it isn't the item you want to display, type in the desired
- expression or variable; then press ENTER. The Quick Watch window immediately
- displays the specified item.
-
- The Quick Watch display automatically expands structures and pointers to
- their first level. You can expand or contract an element just as you would
- in the Watch window: position the cursor on the appropriate line and press
- ENTER. If the item needs more lines than the Quick Watch window can
- display, drag the scroll box with the mouse, or press DOWN or PGDN to view
- the rest of it.
-
- You can add Quick Watch variables to the Watch window.
-
- Choose the Add Watch button to add a Quick Watch item to the Watch window.
- Structures and pointers appear in the Watch window expanded as they were
- displayed in the Quick Watch dialog box.
-
- Quick Watch is a convenient way to take a quick look at a variable or
- expression. Since only one Quick Watch variable can be viewed at a time, you
- would not use Quick Watch for most of the variables you want to view.
-
-
- 15.3.7 Displaying Memory
-
- Choosing the Memory command from the View menu opens a Memory window. Two
- Memory windows can be open at one time.
-
- By default, memory is displayed as hexadecimal byte values, with 16 bytes
- per line. At the end of each line is a second display of the same memory in
- ASCII form. Values that correspond to printable ASCII characters (decimal 32
- to 127) are displayed in that form. Values outside this range are shown as
- dots (.).
-
- You can display memory values in any form.
-
- Byte values are not always the most convenient way to view memory. If the
- area of memory you're examining contains character strings or floating-point
- values, you might prefer to view them in a directly readable form. Choosing
- the Memory Window command from the Options menu displays a dialog box with a
- variety of display options:
-
-
- ■ ASCII characters
-
- ■ Byte, word, or doubleword binary values
-
- ■ Signed or unsigned integer decimal values
-
- ■ Short (32-bit), long (64-bit), or ten-byte (80-bit) floating-point
- values
-
-
- Figures 15.3 and 15.4 show two of these different displays.
-
- (This figure may be found in the printed book.)
-
- (This figure may be found in the printed book.)
-
- Another way to choose a display format is to cycle through the formats by
- repeatedly pressing SHIFT+F3.
-
- Not every four-byte or eight-byte sequence represents a valid floating-point
- number. If a section of memory cannot be displayed in the floating-point
- format you select, the number displayed includes the characters NAN ("not a
- number").
-
- You can change the contents of the memory by simply overtyping new values in
- the Memory window. See Section 15.3.9 for more information on modifying
- values.
-
-
- Displaying Variables with a Live Expression
-
- Section 15.3.2 explained how to display a specific array element by adding
- the appropriate expression to the Watch window. You can also watch a
- particular array element or structure element in the Memory window. This
- CodeView display feature is called a "live expression." The term "live"
- means that CodeView dynamically displays memory starting at the current
- value of the address expression you specify.
-
- To create a live expression, choose the Memory Window command from the
- Options menu; then select the Live Expression check box. Type the element
- you want to view in the Address Expression field. For example, if array is
- a variable whose current value is being indexed by the value in the BX
- register and you wish to view it, type array [bx]. Then choose the OK
- button or press ENTER.
-
- If no memory windows are open, a new Memory window opens. The first memory
- location in the window is the first memory location of the live expression.
- The section of memory displayed changes to the section the live expression
- currently references.
-
- You can use the Memory Window command from the Options menu to display the
- memory in a directly readable form. This is especially convenient when the
- live expression represents strings or floating-point values, which are
- difficult to interpret in hexadecimal form.
-
- It is usually more convenient to view an item in the Watch window than as a
- live expression. However, some items are more easily viewed as live
- expressions. For example, you can examine what is currently on top of the
- stack by entering SS:SP as the live expression. In fact, any legal
- combination of register values (such as ES:DI or DS:SI) can be entered in
- segment:offset form.
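-
- To see why SS:SP is a useful live expression, consider the following
- fragment. It is only a sketch; the register values are arbitrary:
-
-         mov     ax, 1234h
-         push    ax          ; SP drops by 2; the word at SS:SP is 1234h
-         mov     bx, 5678h
-         push    bx          ; SP drops by 2 again; the word at SS:SP is 5678h
-         pop     bx          ; SP rises by 2; 1234h is at SS:SP once more
-         pop     ax          ; the stack is back where it started
-
- Because the live expression is reevaluated as SP changes, the Memory window
- follows the top of the stack through each push and pop. In the default byte
- display, each word appears low byte first, so after the second push the
- first four bytes shown are 78 56 34 12.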
-
-
- 15.3.8 Displaying the Processor Registers
-
- Choosing the Register command from the View menu (or pressing F2) opens a
- window on the right side of the screen. The microprocessor's current
- register values appear in this window. At the bottom of the window is a
- group of mnemonics representing the processor flags. Pressing F2 a second
- time closes the window.
-
- Video intensity shows changed values.
-
- When you first open the Register window, all register and flag values are
- shown in normal text. When you change a register or flag, the changed value
- is highlighted. For example, suppose the overflow flag is not set when the
- Register window is first opened. The corresponding mnemonic is NV and
- appears in light gray. If the overflow flag is subsequently set, the
- mnemonic changes to OV and appears in bright white. If your computer uses an
- 80386/486 processor and you are running the real-mode version of CodeView,
- choosing the 386 Instructions command from the Options menu displays the
- registers as 32-bit values. Choosing this command a second time returns to
- the 16-bit display.
-
- You can also display the registers of an 8087-80387 coprocessor (or the
- built-in coprocessor of the 80486) in a separate window by choosing the 8087
- command from the View menu. If your program uses the coprocessor emulator,
- the emulated registers are displayed instead.
-
- The Register values reveal program status.
-
- The Register window is a valuable debugging tool. Almost every assembly
- instruction alters a register or flag. As each line of code is executed, the
- register values and flags that change are highlighted, so you can see
- whether each instruction does what you intended it to.
-
- Also, when you execute an instruction that has a memory operand
- (such as a variable), the effective address of the operand, as well as the
- value stored at that address, is displayed at the bottom of the Register
- window.
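-
- The two instructions below are a small illustration of what to watch for.
- They are not taken from any program in this guide:
-
-         mov     al, 7Fh     ; AL = 127, the largest signed byte value
-         add     al, 1       ; AL = 80h; the signed result has overflowed
-
- After the MOV, only the AX value is highlighted, since MOV does not affect
- the flags. After the ADD, the AX value changes again and, if the overflow
- flag was clear beforehand, its mnemonic changes from NV to OV.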
-
-
- 15.3.9 Modifying the Values of Variables, Memory, and Registers
-
- You can easily change the values of variables, memory locations, or
- registers displayed in the Watch, Local, Memory, Register, or 8087 windows.
- Simply position the cursor at the value you want to change and edit it to
- the appropriate value. In the Watch and Local windows, the change is
- accepted by CodeView when you move the cursor off the line. If you change
- your mind, press ALT+BKSP to undo the last change you made.
-
- You can also alter expressions in the Watch window by adding an operator or
- changing the variable displayed. When you have altered the expression and
- moved the cursor off the line, CodeView will immediately show the new value
- of the modified expression.
-
- The starting address of each line of memory displayed is shown at the left
- of the Memory window in segment:offset form. Altering the address
- automatically shifts the display to the corresponding section of memory.
- Under OS/2, if your program does not own that section of memory, memory
- values are displayed as double question marks (??).
-
- It's easy to change memory values...
-
- You can also change the values of memory locations by modifying the right
- side of the memory display (where memory values are shown in ASCII form).
- For example, to change a byte from decimal 75 to decimal 85, place the
- cursor over the letter K, which corresponds to the position where the memory
- value is 75 (K is ASCII 75), and type in U (ASCII 85).
-
- ...or flags.
-
- To toggle a processor flag, double-click its mnemonic. You can also position
- the cursor on a mnemonic, then press any key (except ENTER, TAB, or SPACE).
- Press ALT+BKSP (undo) to restore the flag to its previous setting.
-
- Be cautious when modifying memory or a register.
-
- The effect of changing a register, flag, or memory location can vary from no
- effect at all to crashing the operating system. Be cautious when altering
- these values.
-
-
- 15.4 Controlling Execution
-
- There are two forms of program execution under CodeView:
-
-
- ■ Continuous; the program executes until either a previously specified
- breakpoint has been reached or the program terminates.
-
- ■ Single-step; the program pauses after each line of code has been
- executed.
-
-
- Sections 15.4.1 and 15.4.2 explain how each form of execution works and the
- most effective way to use each.
-
- As you are debugging, you can display the program in source-code form or
- assembly form. Section 15.4.3 explains the advantages of each.
-
-
- 15.4.1 Continuous Execution
-
- Continuous execution lets you run quickly through bug-free sections of code
- that would otherwise take a long time to execute one instruction at a time.
-
-
- The simplest form of continuous execution is to use the right mouse button
- to click the line of code you want to debug or examine in more detail. The
- program executes up to the start of this line, then pauses. An alternative
- method is to position the cursor on this line, then press F7.
-
- You can also pause execution at a specific line of code with a "breakpoint."
- There are several types of breakpoints. Breakpoints are explained in the
- following section.
-
-
- Selecting Breakpoint Lines
-
- Breakpoints can be tied to lines of code.
-
- You can skip over those parts of the program that you don't want to examine
- by specifying one or more lines as breakpoints. The program executes up to
- the first breakpoint, then pauses. Pressing F5 continues program execution
- up to the next breakpoint, and so on. (You can halt execution at any time by
- pressing CTRL+C.)
-
- There is no limit to the number of breakpoints.
-
- You can set as many breakpoints as you like (limited only by available
- memory). There are several ways to set breakpoints:
-
-
- ■ Double-click anywhere on the desired breakpoint line. The selected
- line is highlighted to show that it is a breakpoint. To remove the
- breakpoint, double-click the line a second time.
-
- ■ Position the cursor anywhere on the line at which you want execution
- to pause. Press F9 to select the line as a breakpoint and highlight
- it. Press F9 a second time to remove the breakpoint and highlighting.
-
- ■ Display the Set Breakpoint dialog box by choosing Set Breakpoint from
- the Watch menu. Select one of the breakpoint options that permits a
- line ("location") to be specified. The line at the cursor is the
- default breakpoint line in the Location field. If this line is not the
- desired location, enter the line number desired. (You must place a
- period in front of the line number, or CodeView will interpret the
- number as an absolute address.) To remove the breakpoint, use F9 or
- choose Edit Breakpoints from the Watch menu to display the Edit
- Breakpoints dialog box.
-
-
- Not every line can be a breakpoint.
-
- A breakpoint line must be a program line that represents executable code.
- You cannot select a blank line, a comment, or a declaration (such as a
- variable declaration or a segment specifier) as a breakpoint.
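-
- The following fragment sketches which lines qualify. The names count and
- start are hypothetical:
-
-         .DATA               ; segment directive: not a breakpoint line
- count   WORD    10          ; variable declaration: not a breakpoint line
-
-         .CODE               ; segment directive: not a breakpoint line
- ; load the loop counter     ; comment: not a breakpoint line
- start:  mov     cx, count   ; executable code: can be a breakpoint
-         dec     cx          ; executable code: can be a breakpoint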
-
- A breakpoint can also be set at an address. Type the address in
- segment:offset form in the Set Breakpoint dialog box. (Address breakpoints,
- unlike line breakpoints, are not saved in CodeView's status file, and
- therefore are not restored when you restart a debugging session.)
-
- A breakpoint can be set to the name of a procedure if the procedure was
- declared with the PROC directive. If not, the procedure must contain a
- labeled line. Type the procedure's name or the line's label in the Set
- Breakpoint dialog box.
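-
- As an illustration, here are two routines, one declared with PROC and one
- that is only labeled. Both names are hypothetical:
-
- sumlist PROC                ; declared with PROC: type sumlist
-         xor     ax, ax
-         ret
- sumlist ENDP
-
- cleanup:                    ; no PROC directive: type the label cleanup
-         xor     bx, bx
-         ret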
-
- Once execution has paused, you can continue execution by clicking the F5=Go
- button in the display or by pressing F5. Execution continues to the next
- breakpoint. If there are no more breakpoints, execution continues to the end
- of the program, or until a fatal error occurs.
-
- ────────────────────────────────────────────────────────────────────────────
- NOTE
-
- The Set Breakpoint dialog box contains a Commands text box. You can type
- Command-window commands in this box, separated by semicolons. These commands
- are executed when the breakpoint is reached. See the Command Window section
- of CodeView online help for a full description of Command-window commands.
- ────────────────────────────────────────────────────────────────────────────
-
-
- Conditional Breakpoints
-
- Breakpoints are not limited to specific lines of code. CodeView can also
- pause when a variable reaches a particular value or just changes value. This
- is a "conditional breakpoint." In previous versions of CodeView, conditional
- breakpoints were called "watchpoints" and "tracepoints."
-
- You can associate a conditional breakpoint with a specific line of code, so
- that execution pauses at that line only if the variable has simultaneously
- reached a particular value or changed value. The check boxes in the Set
- Breakpoint dialog box select these other breakpoint types.
-
- To pause execution when a variable reaches a particular value, type an
- expression that is usually false in the Expression field of the Set
- Breakpoint dialog box. For example, if you want to pause when the variable
- looptest equals 17, type looptest == 17.
-
- To pause execution when a variable changes value, you need to type only the
- name of the variable in the Expression field. For large variables (such as
- arrays or character strings), you can specify the number of bytes you want
- checked (up to 32K) in the Length field. Execution pauses when any one of
- these values changes.
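-
- A short loop makes the two cases concrete. This is a sketch only; the label
- next is an assumption, and looptest is the variable named above:
-
-         .DATA
- looptest WORD   0           ; counter watched by the conditional breakpoint
-
-         .CODE
- next:   inc     looptest    ; looptest changes on every pass
-         cmp     looptest, 100
-         jb      next        ; repeat until looptest reaches 100
-
- With looptest == 17 in the Expression field, execution should pause once,
- after the INC that stores 17. With only looptest in the Expression field,
- execution should pause after every INC, because each one changes the value.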
-
- ────────────────────────────────────────────────────────────────────────────
- NOTE
-
- CodeView checks every conditional breakpoint after executing each line of
- source code. Unless you have enabled the use of the debug registers with the
- CodeView /R command-line option, this computational overhead greatly slows
- execution. (Execution is even slower if you are executing in Mixed mode or
- Assembly mode, because conditional breakpoints are checked after each
- machine instruction.)
-
- For maximum speed when debugging, either associate conditional breakpoints
- with specific lines, or set conditional breakpoints only after you have
- reached the section of code that needs to be debugged. You can also use the
- Disable button in the Edit Breakpoints dialog box to temporarily suspend
- evaluation of a previously set conditional breakpoint.
- ────────────────────────────────────────────────────────────────────────────
-
-
- Using Breakpoints
-
- One of the most common bugs is a loop that executes too many or too few
- times. If you set a breakpoint on the statement that controls the loop
- statements, the program pauses after each iteration. With the loop variable
- or critical program variables in the Watch or Local windows, it should be
- easy to see what's going wrong in the loop.
-
- You can specify how many times a breakpoint is reached before stopping.
-
- You do not have to pause at a breakpoint the first time execution reaches
- it. CodeView lets you specify the number of times you want to ignore the
- breakpoint condition before pausing. Type the number in the Pass Count field
- of the Set Breakpoint dialog box. This feature can eliminate a lot of
- tedious single-stepping.
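-
- For instance, suppose only the last few passes of the following loop are of
- interest. The fragment is a sketch, and the label addone is hypothetical:
-
-         mov     cx, 100     ; the loop body runs 100 times
-         xor     ax, ax
- addone: add     ax, cx      ; breakpoint on this line, Pass Count of 97
-         loop    addone
-
- With 97 in the Pass Count field, CodeView should ignore the first 97
- arrivals at addone and pause on the 98th, when CX holds 3.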
-
- Another common programming error is assigning a value to a variable that
- should not change. Type the variable's name in the Expression field of the
- Set Breakpoint dialog box. Execution breaks whenever this variable changes,
- even if the change is unintentional.
-
- You can assign new values to variables while execution is paused.
-
- Breakpoints are a convenient way to pause the program so you can assign new
- values to variables. For example, if a limit value is set by a variable, you
- can change the value to see whether program execution is affected.
-
-
- 15.4.2 Single-Stepping
-
- In single-stepping, CodeView pauses after each line of code is executed. The
- next line to be executed is highlighted.
-
- There are two ways to single-step.
-
- You can single-step through a program with the Step and Trace commands. Step
- (executed by pressing F10) steps over procedure calls. All the code in the
- procedure is executed, but it appears to you as if the procedure executed in
- a single step. Trace (executed by pressing F8) traces through every step of
- all procedures. Each line of the procedure is executed as a separate step.
-
- You can alternate between Trace and Step as you like. The method you use
- depends only on whether you want to see what happens within a particular
- procedure. (Note that interrupt calls are always stepped over; you do not
- see individual steps of the execution.)
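-
- The difference shows up in a fragment such as this one, where initbuf is a
- hypothetical procedure defined elsewhere in the program:
-
-         call    initbuf     ; F10 (Step): initbuf runs as a single step
-                             ; F8 (Trace): execution enters initbuf
-         mov     ax, 4C00h   ; DOS terminate function, return code 0
-         int     21h         ; interrupt call: stepped over by either command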
-
- If CodeView cannot locate the source code for a procedure in the current
- directory, it pauses and asks for the name of the file that contains the
- source. If you cannot supply a source file, CodeView disassembles the
- executable code and displays that instead. (If you are executing in Source
- mode, and the source code for a procedure is not available, CodeView steps
- over the procedure, even if you use the Trace command.)
-
- Note that breakpoints are active during both step and trace mode. If the
- procedure you step over contains a breakpoint, execution stops at the
- breakpoint.
-
- You can trace through the program continuously (without having to press F8
- at each step), using the Animate command from the Run menu. The speed of
- execution is controlled by the Trace Speed command from the Options menu.
- You can halt animated execution at any time by pressing any key.
-
-
- 15.4.3 Changing the Program Display Mode
-
- The F3 key switches the display between Source mode, Mixed mode, and
- Assembly mode. You can also switch display modes by choosing the Source
- Window command from the Options menu and then selecting a display mode in
- the Source Window Options dialog box. (If the source-code text file cannot
- be located, CodeView automatically disassembles the executable file and
- displays it in assembly-language form.)
-
- The Source mode shows the program as you wrote it. The Mixed mode and
- Assembly mode each expand macros and code-generating directives (such as
- .STARTUP) into assembly-language instructions. You can execute these
- instructions one at a time (rather than as a single item), and verify that
- the assembler has created the correct instructions from the macro or the
- directive.
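-
- A user-defined macro shows the effect. The macro below is hypothetical and
- is meant only as a sketch:
-
- pushregs MACRO              ; save two registers
-         push    ax
-         push    bx
-         ENDM
-
-         .CODE
-         pushregs            ; a single line in Source mode
-
- In Source mode, the pushregs line executes as one item. In Mixed mode or
- Assembly mode, the two PUSH instructions appear separately, so each can be
- executed and checked on its own.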
-
- Figures 15.5 and 15.6 show Mixed mode and Assembly mode, respectively, for
- the same code.
-
- (This figure may be found in the printed book.)
-
- (This figure may be found in the printed book.)
-
-
- 15.5 Replaying a Debug Session
-
- CodeView can automatically create a "tape" (a disk file) with the debugging
- instructions and input data you entered when testing a program. The tape can
- then be "replayed" to repeat the debugging process. You initiate recording
- by choosing the History On command from the Run menu. Choosing History On a
- second time terminates recording. The recording is saved in the .CVH file in
- the current directory.
-
- Dynamic replay has several uses. The most obvious is repeating a debug
- session for the corrected version of a program. Dynamic replay usually works
- with slightly modified programs. However, the more you change the program,
- the less likely the new version will replay reliably.
-
- You can also use the recording as a bookmark. You can quit after a long
- debugging session, then pick up the session later in the same place.
-
- Dynamic replay makes it easy to correct a mistake.
-
- Most importantly, dynamic replay allows you to back up when you make an
- error or overshoot the section of code with the bug. This feature is
- important because not all bugs appear on the first path of execution you
- try.
-
- For example, you might have to manually execute a procedure many times
- before its bug appears. If you then enter a command that alters the
- machine's or program's status, thereby losing the information you need to
- find the cause of the bug, you would have to restart the program and
- manually repeat every debugging step to return to that point. Even worse, if
- you don't remember the exact sequence of events that exposed the bug, it
- could take hours to reproduce them.
-
- Dynamic replay of a recorded tape eliminates this problem. Choose the Undo
- command from the Run menu to automatically restart the program and
- continuously execute every command up to (but not including) the last one
- you entered. You can repeat this process as many times as you like until you
- return to the desired point in execution.
-
- You can add steps to an existing tape. Choose History On, then
- choose Replay. When replay has completed, perform whatever new debugging
- steps you want, then choose History On a second time to terminate recording.
- The new tape contains both the original and the added commands.
-
- ────────────────────────────────────────────────────────────────────────────
- NOTE
-
- CodeView records only those mouse commands that apply to CodeView. Mouse
- commands recognized by the application being debugged are not recorded.
- ────────────────────────────────────────────────────────────────────────────
-
-
- Replay Limitations under OS/2
-
- There are some limitations to dynamic replay when debugging under OS/2:
-
-
- ■ The program must not respond to asynchronous events. Replay under
- Presentation Manager is not currently supported because of this
- restriction.
-
- ■ Breakpoints must be specified at specific source lines or for specific
- symbols (rather than by absolute addresses), or replay may fail.
-
- ■ Single-thread programs behave normally during replay. However, one of
- the threads in a multithread program may cause an asynchronous event,
- violating the first restriction in this list. Multithread programs are
- therefore more likely to fail during replay.
-
- ■ Multiprocess replay will fail. Each new process invokes a new CodeView
- session. The existence of multiple sessions makes it impractical to
- record the sequence of events if you execute commands in a session
- other than the original session.
-
-
-
- 15.6 Advanced CodeView Techniques
-
- Once you are comfortable displaying and changing variables, stepping through
- the program, and using dynamic replay, you might want to experiment with the
- advanced techniques explained below.
-
-
- Debugging OS/2 Programs
-
- You can debug protected-mode and bound programs under CodeView. See the
- Debug Multiple Processes and Debug Multiple Threads sections of CodeView
- online help for information about executing threads and multiple processes.
-
-
-
- Setting Command-Line Arguments
-
- If your program retrieves command-line arguments, you can specify them with
- the Set Runtime Arguments command from the Run menu. Type the arguments in
- the Command Line field before you begin execution. (Arguments entered after
- execution begins cause an automatic restart.)
-
-
- Opening Multiple Source Windows
-
- You can open two Source windows at the same time. The windows can display
- two different sections of the same program, or one window can show the
- calling program and the other a procedure file. You can move freely between
- the windows, executing lines of code as you like.
-
-
- Calling Procedures
-
- Any procedure in your program (whether user-written or from a library) can
- be called from the Command window or the Watch window. In the Command
- window, use the Display Expression command as follows:
-
- ?procname (arglist)
-
- The procedure procname is evaluated with the arglist arguments and the
- returned value is displayed in the Command window. (Note that CodeView
- cannot evaluate a function that returns an aggregate type.) In the Watch
- window, simply enter the procedure call. If the procedure does not return a
- value, the value displayed is the value of the AX register upon return from
- the procedure.
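-
- As a sketch, assume the program contains the following procedure. The name
- addtwo and its parameters are hypothetical:
-
- addtwo  PROC NEAR C, val1:WORD, val2:WORD
-         mov     ax, val1
-         add     ax, val2    ; the sum is left in AX
-         ret
- addtwo  ENDP
-
- Typing the following in the Command window should then call the procedure
- and display the value it leaves in AX:
-
- ?addtwo (3, 4)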
-
- You can evaluate any procedure, not just those called by your program. All
- object code specified to the linker is linked into the program. Any public
- functions in this code can be evaluated from the Command window.
-
- You can use this feature to call functions from within CodeView that you
- would not normally include in the final version of your program. For
- example, you could include the OS/2 API functions that control semaphores,
- then execute them from the Command window to manipulate the run-time
- environment at any point in the debugging process. (Remember that altering
- the environment during program execution may have unexpected side effects.)
-
-
-
- Executing Faster when Using Breakpoints
-
- Breakpoints can slow execution. You can increase CodeView's speed with the
- /R command-line option if you have an 80386/486-based computer and are
- running CodeView under DOS. This option enables the four debug registers,
- which support breakpoint-checking in hardware rather than in software. (The
- CodeView options are described in Section 15.7.)
-
-
- Printing Selected Items
-
- You can print all or part of the contents of any window with the Print
- command from the File menu. In the Print dialog box, a check box lets you
- print selected text from the window, the material currently displayed in the
- window, or the complete contents of the window. Select text by dragging the
- mouse across it, or by holding down the SHIFT key and pressing the direction
- keys (LEFT, RIGHT, UP, DOWN).
-
- By default, print output is to the file CODEVIEW.LST in the current
- directory. You can choose whether the new material is appended to an
- existing file or overwrites it, using the Append/Overwrite check box. If you
- want print output to go to a different file, type its name in the To File
- Name field. If you want the output to go to a printer, enter the appropriate
- device name such as LPT1 or COM2.
-
-
- Redirecting CodeView Input and Output
-
- The Command window accepts DOS-like commands that redirect input and output.
- These commands can also be included on the command line that invokes
- CodeView. Whatever items follow the /C option on the command line are
- treated as CodeView commands to be immediately executed at start-up.
-
- CV /c "<infile; t>outfile" myprog
-
- In the example above, input is redirected from infile, which can contain
- start-up commands for CodeView. When CodeView exhausts all commands in the
- input file, focus automatically shifts to the Command window. Output is sent
- to outfile and echoed to the Command window. The t must precede the >
- command for output to be sent to the Command window.
-
- Redirection is a useful way to automate CodeView start-up. It also lets you
- keep a viewable record of command-line input and output, a feature not
- available with dynamic replay. No record is kept of mouse operations. Some
- applications (particularly interactive ones) may need modification to allow
- for redirection of input to the application itself.
-
-
- Executing Faster with Additional Memory
-
- If you are running DOS and your computer uses expanded or extended memory,
- you can increase CodeView's execution speed by selecting the /X or /E
- option. CodeView moves as much as it can of itself and the symbolic CodeView
- information to higher memory (above the first megabyte).
-
- The /X option uses extended memory and gives the greatest speed increase.
- This option requires the HIMEM.SYS driver, which is included on your
- distribution disks. Add DEVICE = HIMEM.SYS to your CONFIG.SYS file to load
- HIMEM.SYS at boot time.
-
- The /E option uses expanded memory. The speed increase is not as great as
- that supplied by the /X option. The expanded memory manager (EMM) must be
- LIM 4.0, and no single module's debug information can exceed 48K. If the
- symbol table exceeds this limit, try reducing file-name information by not
- specifying full path names at compile time and by specifying CodeView
- information (/Zi) only with those program modules that need debugging.
-
- If you do not specify either /X or /E (or the /D disk-overlay option),
- CodeView automatically searches for the HIMEM.SYS driver and extended memory
- so it can implement the /X option. If it fails, CodeView searches for
- expanded memory to implement the /E option. If that search fails, CodeView
- uses a default disk overlay of 64K. (See the description of the /D option in
- the next section.)
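-
- The default search order described above can be pictured as a short decision
- function. The following C fragment is only an illustrative sketch: the names
- CvMemoryMode and choose_memory_mode are hypothetical, and CodeView exposes no
- such interface.
-
```c
#include <stdbool.h>

/* Hypothetical names for illustration only; not a CodeView API. */
typedef enum {
    CV_MODE_EXTENDED,      /* behaves as if /X had been specified */
    CV_MODE_EXPANDED,      /* behaves as if /E had been specified */
    CV_MODE_DISK_OVERLAY   /* default 64K disk overlay (see /D)   */
} CvMemoryMode;

/* Mirrors the documented search order when none of /X, /E, or /D is given:
   extended memory first, then expanded memory, then the disk overlay. */
CvMemoryMode choose_memory_mode(bool himem_and_extended_found,
                                bool expanded_found)
{
    if (himem_and_extended_found)
        return CV_MODE_EXTENDED;
    if (expanded_found)
        return CV_MODE_EXPANDED;
    return CV_MODE_DISK_OVERLAY;
}
```
-
- For a machine with expanded memory but no HIMEM.SYS driver, the sketch
- selects CV_MODE_EXPANDED, which corresponds to the /E fallback.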
-
-
- 15.7 CodeView Command-Line Options
-
- The following options can be added to the command line that invokes
- CodeView. The Starting Up CodeView section of CodeView online help contains
- more information about these options.
-
- Option Description
- ────────────────────────────────────────────────────────────────────────────
- /2 Two-monitor debugging. The display
- adapters must be configured for
- different addresses, such as Hercules (R)
- and VGA. The application is displayed on
- the primary monitor (the monitor the
- operating system normally directs output
- to), while CodeView's output appears on
- the secondary monitor.
-
- /25 Display in 25-line mode.
-
- /43 Display in 43-line mode.
-
- /50 Display in 50-line mode.
-
- /B Display in black and white. This assures
- that the display is readable when a
- color display is not used. You should
- also specify this option along with the
- /2 option when the secondary monitor is
- black and white.
-
- /Ccommands Execute commands immediately on start-up.
- The commands must be separated with a
- semicolon. If any commands require a
- space, enclose the entire list in double
- quotation marks.
-
- /D«buffersize» Use disk overlays to increase the size
- of the program that can be debugged,
- where buffersize is the decimal size of
- the overlay buffer, in kilobytes.
- Smaller buffers leave more room for the
- program being debugged, while larger
- buffers increase the speed of execution.
- The acceptable range is 16K to 128K. The
- default size is 64K. (DOS only.)
-
-
- /E Use expanded memory for symbolic
- information and CodeView overlays. (DOS
- only.)
-
- /F Flip screen video pages (rather than
- swap). When your application does not
- use graphics, eight video screen pages
- are available. Switching from CodeView
- to the output screen is accomplished by
- directly selecting the appropriate video
- page. Cannot be used with /S. (DOS only.)
-
- /G Suppress "snow" on a CGA display. (DOS
- only.)
-
- /I«0 | 1» Control trapping of nonmaskable
- interrupts and 8259 interrupts. A value
- of 0 forces interrupt trapping on
- machines CodeView doesn't recognize as
- IBM-compatible. A value of 1 (the default)
- disables interrupt trapping. (DOS only.)
-
- /K Disable keyboard monitors (under OS/2)
- and keyboard interrupts (under DOS).
- This allows you to regain control of the
- computer under deadlock conditions, but
- prevents CodeView from recording
- keyboard entries when recording a debug
- session.
-
- /Ldll Load symbolic information for the
- specified dynamic-link libraries (DLL).
- (OS/2 only.) This option is required
- only for DLLs loaded with DOSLOADMODULE.
- CodeView automatically loads debug
- information for statically linked DLLs.
-
- /M Disable CodeView's use of the mouse.
- This simplifies debugging programs that
- accept mouse commands.
-
- /N«0 | 1» Identical to /I, but applies only to
- nonmaskable interrupts. (DOS only.)
-
- /O Debug child processes ("offspring").
- (OS/2 only.)
-
- /R Use 80386/486 hardware debug registers
- to speed execution. (DOS only.)
-
-
- /S Swap screen in buffers (rather than
- flip). When your program uses graphics,
- all eight video pages must be used.
- Switching from CodeView to the output
- screen is accomplished by saving the
- previous screen in a buffer. Cannot be
- used with /F. (DOS only.)
-
- /TSF Toggle (invert) the sense of the
- Statefileread switch in TOOLS.INI. If
- Statefileread is set to no (do not read
- the status file), the status file is
- read, and vice-versa.
-
- /X Use extended memory for CodeView and
- symbolic information. (DOS only.)
-
-
-
-
- 15.8 Customizing CodeView with the TOOLS.INI File
-
- The TOOLS.INI file customizes the behavior and user interface of several
- Microsoft products. The TOOLS.INI file is a plain ASCII text file. You
- should place it in a directory pointed to by the INIT environment variable. (If
- you do not use the INIT environment variable, CodeView looks for TOOLS.INI
- only in the CodeView source directory.)
-
- The CodeView section of TOOLS.INI is preceded by the following line:
-
- [cv]
-
- If you run the protected-mode version of CodeView, use [cvp] instead. If you
- run both versions, include both: [cv cvp]. You can have separate sections
- for cv and cvp if you want different customizations.
-
- Most of the TOOLS.INI customizations for CodeView control screen colors, but
- you can also specify such things as start-up commands or the default name of
- the file that receives CodeView output. See the Configure CodeView section
- of CodeView online help for full information about all TOOLS.INI switches
- that control CodeView.
-
-
- 15.9 Related Topics in Online Help
-
- In addition to information covered in this chapter, information on the
- following topics can be found in online help.
-
- Topic Access
- ────────────────────────────────────────────────────────────────────────────
- CodeView information Choose "CodeView Debuggers" from the
- "Microsoft Advisor Contents" screen
-
- ML command-line options Choose "Macro Assembler" from the
- "Command Line" section of the "Microsoft
- Advisor Contents" screen
-
-
-
-
-
-
-
-
-
- Chapter 16 Converting C Header Files to MASM Include Files
- ────────────────────────────────────────────────────────────────────────────
-
- The H2INC utility translates C header files into MASM-compatible include
- files. C header files normally have the extension .H; MASM include files
- normally have the extension .INC. This is the origin of the program's name:
- "H to INC."
-
- H2INC simplifies porting data structures from your C programs to MASM
- programs. This is especially useful when you have
-
-
- ■ A program that mixes C code and MASM code with globally accessible
- data structures
-
- ■ A program prototyped in C that you're translating to MASM for
- compactness and fast execution
-
-
- The H2INC program translates data declarations, function prototypes, and
- type definitions. H2INC does not convert C code into MASM code. When H2INC
- encounters a C statement that would compile into executable code, H2INC
- ignores the statement and issues a warning message to the standard output.
-
- H2INC accepts C source code compatible with Microsoft C 6.0 and creates
- include files suitable for MASM 6.0. These include files will not work with
- versions of MASM prior to 6.0.
-
- H2INC is designed to translate project header files that you have written
- specifically for translation to MASM 6.0 include files. It is not designed
- to translate header files such as PM.H and WINDOWS.H.
-
- This chapter explains how H2INC performs the C code translation and how the
- command-line options control the conversions.
-
-
- 16.1 Basic H2INC Operation
-
- H2INC is designed to provide automatic translation of C declarations that
- you need to include in the MASM portions of an application. However, the set
- of C statements processed by H2INC must be those needed by and interpretable
- by MASM. H2INC converts only function prototypes, some preprocessor
- directives, and C declarations outside the scope of procedures. For example,
- H2INC translates the C statement
-
- #define MAX_EMPLOYEES 400
-
- into this MASM statement:
-
- MAX_EMPLOYEES EQU 400t
-
- The t specifies the decimal radix.
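-
- The shape of the generated line can be mimicked with a few lines of C. This
- is a hedged sketch, not H2INC source: format_equ is a hypothetical helper
- that only reproduces the name, the EQU keyword, and the t radix suffix seen
- in the example above.
-
```c
#include <stdio.h>

/* Hypothetical helper: writes "<name> EQU <value>t" into out.
   The trailing t marks the value as decimal, as in the example above. */
int format_equ(char *out, size_t out_size, const char *name, long value)
{
    return snprintf(out, out_size, "%s EQU %ldt", name, value);
}
```
-
- Called with the name MAX_EMPLOYEES and the value 400, the helper produces
- the same text as the MASM statement shown above.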
-
- H2INC does not translate C code into MASM code. Statements such as the
- following are ignored:
-
- printf( "This is an executable statement.\n" );
-
- H2INC translates declarations, not executable code.
-
- By default, H2INC creates a single .INC file. If the C header file includes
- other header files, the statements from the original and nested files are
- translated and combined into one .INC file. This behavior can be changed
- with the /Ni option (see Section 16.2).
-
- The program also preprocesses some statements, just as the C preprocessor
- would. For example, given the following statements, if VERSION is not
- defined, H2INC ignores the #ifdef block.
-
- #ifdef VERSION
- #define BOX_VALUE 4
- #endif
-
- If VERSION is defined, H2INC translates the statements inside the block
- from C syntax to MASM syntax.
-
- H2INC normally discards comments. If you use the /C option, C comments are
- passed to the output file. If the line starts with /* or //, the
- comment specifier is converted to a semicolon (;). If the line is part of a
- multiline comment, a semicolon is prefixed to each line.
-
- H2INC ignores anything that is not a comment or that cannot be translated.
- These items do not appear in the output file. If H2INC encounters an error,
- it stops translating and deletes the resulting .INC file.
-
-
- 16.2 H2INC Syntax and Options
-
- To run H2INC, type H2INC at the command-line prompt, followed by the
- options desired and the names of the .H files you want to convert:
-
- H2INC [[options]] file.H ...
-
- You can specify more than one file.H. File names are separated by a space.
- The contents of each file.H are translated into a single file in the current
- directory with the name file.INC. The original file.H is not altered.
-
- The following lists describe the available options. You can specify more
- than one option. Note that the options are case sensitive except for /HELP.
-
-
- H2INC recognizes /? to display a summary of H2INC syntax, and /HELP to
- invoke QuickHelp for H2INC. If QuickHelp is not available, H2INC displays a
- short list of H2INC options. The /HELP option is not case sensitive.
-
- H2INC recognizes but ignores C 6.0 options that aren't specified in the
- following two lists.
-
-
- Options Directly Affecting H2INC Output
-
- This first list describes the options that directly affect the H2INC output:
-
-
- Option Action
- ────────────────────────────────────────────────────────────────────────────
- /C Passes comments in the .H file to the
- .INC file.
-
- /Fa «filename» Specifies that the output file contain
- only equivalent MASM statements. This is
- the default. If specified, the filename
- overrides the default, keeping the base
- name of the C header files and adding
- the .INC extension.
-
- /Fc «filename» Specifies that the output file contain
- equivalent MASM statements plus original
- C statements converted to comment lines.
-
- /Mn Assumes the .MODEL directive is not
- specified for the MASM source or the
- generated .INC files. Instructs H2INC to
- declare explicitly the distances for all
- pointers and functions.
-
- /Ni Suppresses the expansion of nested
- include files.
-
- /Zu Makes all structure and union tag names
- unique.
-
-
- Options Indirectly Affecting H2INC Output
-
- This second list describes the options that indirectly affect the H2INC
- output:
-
- Option Action
- ────────────────────────────────────────────────────────────────────────────
- /AT Specifies tiny memory model (.COM).
-
- /AS Specifies small memory model, the
- default.
-
- /AC Specifies compact memory model.
-
- /AM Specifies medium memory model.
-
- /AL Specifies large memory model.
-
- /AH Specifies huge memory model.
-
- /D«const«=value» » Defines a constant or macro.
-
- /G0 Enables 8086/8088 instructions (default).
-
- /G1 Enables 80186/80188 instructions.
-
- /G2 Enables 80286 instructions.
-
- /G3 Enables 80386 instructions. Changes the
- default word size to DWORD.
-
- /G4 Enables 80486 instructions. Changes the
- default word size to DWORD.
-
- /Gc Specifies Pascal as the default calling
- convention.
-
- /Gd Specifies C as the default calling
- convention for functions (default).
-
- /Gr Specifies the _fastcall calling
- convention for functions. Generates a
- warning since H2INC does not translate
- _fastcall functions and prototypes.
-
- /Ht Enables generation of text equates. By
- default, text items are not translated.
-
- /Ipaths Searches named paths for include files
- before searching the paths in the
- INCLUDE environment variable. Paths are
- separated with a semicolon (;).
-
- /J Changes default character type from
- signed char to unsigned char.
-
- /nologo Suppresses display of the sign-on banner.
-
- /Tc «filename» Enables the processing of files whose
- name does not end in .H.
-
- /uident "Undefines" one of the predefined
- identifiers. (See Section 16.3.1.)
-
- /U "Undefines" all predefined identifiers.
- (See Section 16.3.1.)
-
- /w Suppresses compiler warning messages;
- same as /W0.
-
- /W0 Suppresses all warning messages.
-
- /W1 Displays level 1 warning messages
- (default).
-
- /W2 Displays level 1 and level 2 warning
- messages.
-
- /W3 Displays level 1, 2, and 3 warning
- messages.
-
- /W4 Displays all warning messages.
-
- /X Excludes search for include files in the
- standard places.
-
- /Za Disables language extensions (allows
- ANSI standard only).
-
- /Zc Causes functions declared as _pascal to
- be case insensitive.
-
- /Ze Enables language extensions (default).
-
- /Zn string Adds string to all names generated by
- H2INC. Used to eliminate name conflicts
- with other H2INC-generated include files.
-
- /Zp{1 | 2 | 4} Packs structure on a 1-, 2-, or 4-byte
- boundary, following C packing rules.
- Default is /Zp2.
-
-
- 16.3 Converting Data and Data Structures
-
- The primary use of H2INC is to convert data automatically from C format into
- MASM format. This section shows how H2INC converts constants, variables,
- pointers, and other C data structures to definitions recognizable to MASM.
-
- Since the names of the items translated by H2INC may be distinguished only
- by the case of the names, you should specify OPTION CASEMAP:NONE in any MASM
- files that include .INC files generated with H2INC.
-
-
- 16.3.1 User-Defined and Predefined Constants
-
- H2INC translates constants from C to MASM format. For example, C symbolic
- constants of the form
-
- #define CORNERS 4
-
- are translated to MASM constants of the form
-
- CORNERS EQU 4t
-
- in cases where CORNERS is an integer constant or is preprocessed to an
- integer constant. See Section 1.2.4, "Integer Constants and Constant
- Expressions," for more information on integer constants in MASM.
-
- TEXTEQU is new to MASM 6.0.
-
- When the defined expression evaluates to a noninteger value, such as a
- floating-point number or a string, H2INC defines the expression with TEXTEQU
- and adds angle brackets to create text macros. By default, however, these
- TEXTEQU expressions are not added to the include file. Set the /Ht option to
- tell H2INC to generate TEXTEQU expressions.
-
- /* #define PI 3.1415 */
- PI TEXTEQU <3.1415>
-
- H2INC uses this form when the expression is anything other than a constant
- integer expression. H2INC does not check the constant or string for
- validity. For example, although the following C definitions are valid, H2INC
- creates invalid MASM equates from them without generating an error.
-
- These C statements
-
- #define INT 6
- #define FOREVER for(;;)
-
- generate these MASM statements:
-
- INT EQU 6t
- FOREVER TEXTEQU <for(;;)>
-
- The first equate is invalid because INT is a MASM instruction; in MASM 6.0,
- instructions are reserved and cannot be used as identifiers. The FOREVER
- equate is invalid because MASM cannot assemble C code.
-
- Predefined constants control the contents of .INC files.
-
- You can make use of the following predefined constants in your C code to
- conditionally generate the code in .INC files. The predefined constants and
- the conditions under which they are defined are
-
- Predefined Constant When Defined
- ────────────────────────────────────────────────────────────────────────────
- _H2INC Always defined
-
- M_I86 Always defined
-
- MSDOS Always defined
-
- _MSC_VER Defined as 600 for this release
-
- M_I8086 Defined if /G0 is specified
-
- M_I286 Defined if /G0 is not specified
-
- NO_EXT_KEYS Defined if /Za is specified
-
- _CHAR_UNSIGNED Defined if /J is specified
-
- M_I86SM Defined if /AS is specified
-
- M_I86MM Defined if /AM is specified
-
- M_I86CM Defined if /AC is specified
-
- M_I86LM Defined if /AL is specified
-
- M_I86HM Defined if /AH is specified
-
- For example, if your C header file includes definitions which are specific
- to the C portion of the program or otherwise are not appropriate for
- translation by H2INC, you can bracket the C-specific code with
-
- #ifndef _H2INC
- /* C-specific code */
- #endif
-
- In this case, only the C compiler processes the bracketed code.
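-
- As a hedged illustration, a project header written with H2INC in mind might
- look like the following. The file name shared.h, the employee structure, and
- the employee_id_valid function are hypothetical; only the _H2INC test comes
- from the text above.
-
```c
/* shared.h: hypothetical project header meant for both C and H2INC. */
#ifndef SHARED_H
#define SHARED_H

#define MAX_EMPLOYEES 400          /* integer constant: translatable */

struct employee {                  /* data declaration: translatable */
    unsigned char dept;
    unsigned int  id;
};

#ifndef _H2INC
/* C-only helper. H2INC always defines _H2INC, so it skips this block;
   the C compiler does not define it, so the function is compiled. */
static int employee_id_valid(const struct employee *e)
{
    return e->id < MAX_EMPLOYEES;
}
#endif

#endif /* SHARED_H */
```
-
- Keeping executable code inside the #ifndef _H2INC block avoids the warnings
- H2INC issues for statements it cannot translate.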
-
- The /u and /U options affect these predefined constants. The /uident option
- undefines the predefined constant named by ident. The /U option disables the
- definition of all predefined constants. Neither /u nor /U affects constants
- defined by the /D option.
-
- H2INC places an OPTION EXPR32 directive in the .INC file so that MASM
- correctly handles long integers within expressions. This means that the .INC
- files as well as all the .ASM files which include .INC files created with
- H2INC will resolve integer expressions in 32 bits instead of 16 bits.
-
-
- 16.3.2 Variables
-
- H2INC translates variables from C to MASM format. For example, this C
- declaration
-
- int my_var;
-
- is translated into the MASM declaration
-
- EXTERNDEF my_var:SWORD
-
- H2INC converts C variable types to MASM types as follows:
-
- C Type MASM Type
- ────────────────────────────────────────────────────────────────────────────
- char BYTE or SBYTE (controlled by /J option)
-
- signed char SBYTE
-
- unsigned char BYTE
-
- short SWORD
-
- unsigned short WORD
-
- int SWORD (SDWORD with /G3 or /G4 option)
-
- unsigned int WORD (DWORD with /G3 or /G4 option)
-
- long SDWORD
-
- unsigned long DWORD
-
- float REAL4
-
- double REAL8
-
- long double REAL10
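-
- The table above can be restated as a small lookup routine. This C fragment
- is an illustrative sketch only, not H2INC source: masm_type and its two flag
- parameters are hypothetical, with is_32bit standing for /G3 or /G4 and
- char_unsigned standing for /J.
-
```c
#include <string.h>

/* Hypothetical lookup that restates the conversion table; returns the MASM
   type name for a C type name, or NULL if the type is not in the table. */
const char *masm_type(const char *c_type, int is_32bit, int char_unsigned)
{
    static const struct { const char *c; const char *masm; } fixed[] = {
        { "signed char",    "SBYTE"  },
        { "unsigned char",  "BYTE"   },
        { "short",          "SWORD"  },
        { "unsigned short", "WORD"   },
        { "long",           "SDWORD" },
        { "unsigned long",  "DWORD"  },
        { "float",          "REAL4"  },
        { "double",         "REAL8"  },
        { "long double",    "REAL10" },
    };
    size_t i;

    if (strcmp(c_type, "char") == 0)           /* /J makes plain char unsigned */
        return char_unsigned ? "BYTE" : "SBYTE";
    if (strcmp(c_type, "int") == 0)            /* /G3 or /G4 widens int */
        return is_32bit ? "SDWORD" : "SWORD";
    if (strcmp(c_type, "unsigned int") == 0)
        return is_32bit ? "DWORD" : "WORD";

    for (i = 0; i < sizeof fixed / sizeof fixed[0]; i++)
        if (strcmp(c_type, fixed[i].c) == 0)
            return fixed[i].masm;
    return NULL;
}
```
-
- Only the char, int, and unsigned int rows depend on command-line options;
- every other row maps to a single MASM type.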
-
- H2INC assumes that a variable is external unless the variable is explicitly
- declared as static. For example, the C declaration
-
- long big_data;
-
- is converted to this MASM declaration:
-
- EXTERNDEF big_data:SDWORD
-
- See Sections 1.2.6, "Data Types," and 4.1.1, "Allocating Memory for Integer
- Variables," for more information on MASM data types, and Section 8.2.2,
- "Declaring Symbols Public and External," for information on EXTERNDEF.
-
- H2INC does not allocate space for arrays since all variables are assumed to
- be external. For example, the C declaration
-
- int two_d[10][20];
-
- translates to
-
- EXTERNDEF two_d:SWORD
-
- H2INC does not translate static variables, since the scope of these
- variables extends only to the file where they are declared.
-
-
- 16.3.3 Pointers
-
- H2INC translates C pointer variables into their MASM equivalents. The C
- declarations
-
- int *ptr_var;
- char NEAR *pCh;
-
- are translated into these MASM statements:
-
- EXTERNDEF ptr_var:PTR SWORD
- EXTERNDEF pCh:NEAR PTR SBYTE
-
- If you set the /Mn option, H2INC specifies all distances explicitly (for
- example, NEAR PTR instead of PTR). If /Mn is not set, the distances are
- generated only when they differ from the default values implied by the
- memory model specified by the /A command-line option.
-
- H2INC converts _segment and _based variables to type WORD in MASM.
-
- See Sections 1.2.6, "Data Types," and 3.3, "Accessing Data with Pointers and
- Addresses," for information about MASM pointers.
-
-
- 16.3.4 Structures and Unions
-
- H2INC translates C structures and unions into their MASM equivalents. H2INC
- modifies the C structure or union definition to account for differences from
- MASM structure and union definitions. This list describes these
- modifications.
-
-
- ■ C allows a structure or union variable to have the same name as the
- type name, but MASM does not. The H2INC /Zu option prevents the
- structure name from matching a variable or instance by prefixing every
- MASM structure name with @tag_.
-
- ■ If a C structure or union definition does not have a name, H2INC
- supplies one for the MASM conversion. These generated structure names
- take the form @tag_n, where n is an integer that starts at zero and
- is incremented for each structure name H2INC generates.
-
- ■ If the /Zn option is specified, H2INC inserts the given string between
- the underscore and the number in the generated structure names. This
- eliminates name conflicts with other H2INC-generated include files.
-
- ■ H2INC adds the alignment value to the converted structure definition.
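-
- The naming rule for unnamed structures and unions in the list above can be
- sketched in C. The helper make_tag_name is hypothetical and is not part of
- H2INC; it assumes the /Zn string is placed immediately after the underscore
- and before the number, as the list describes.
-
```c
#include <stdio.h>

/* Hypothetical sketch of the generated-name rule.
   zn_string is the /Zn argument, or "" when /Zn is not specified.
   n is the running count of names generated so far, starting at zero. */
int make_tag_name(char *out, size_t out_size, const char *zn_string, unsigned n)
{
    return snprintf(out, out_size, "@tag_%s%u", zn_string, n);
}
```
-
- Using a different /Zn string for each header keeps the generated names from
- colliding when several H2INC-generated include files are used together.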
-
-
- The following examples show how these rules are applied when converting
- structures. (Union conversions are not shown; they are handled identically.)
- These examples assume that the C header file defines an alignment value of
- 2. (See Section 5.2.1, "Declaring Structure and Union Types," for
- information on alignment values.)
-
- The following named C structure definition
-
- struct file_info
- {
- unsigned char file_addr;
- unsigned int file_size;
- };
-
- is converted to the following MASM form. Except for explicitly specifying
- the alignment value, the conversion is direct:
-
- file_info STRUCT 2t
- file_addr BYTE ?
- file_size WORD ?
- file_info ENDS
-
- If the same C structure definition is converted using the /Zu option, the
- @tag_ prefix is added to the structure's name so that the name does not
- duplicate the name of a structure component:
-
- @tag_file_info STRUCT 2t
- file_addr BYTE ?
- file_size WORD ?
- @tag_file_info ENDS
-
- If the original C structure definition is modified to be an unnamed-type
- declaration of a specific instance (myfile)
-
- struct
- {
- unsigned char file_addr;
- unsigned int file_size;
- } myfile ;
-
- its MASM conversion looks like the following example. (The specific integer
- added to the @tag_ prefix is determined by the sequence in which H2INC
- creates tag names.)
-
- @tag_7 STRUCT 2t
- file_addr BYTE ?
- file_size WORD ?
- @tag_7 ENDS
- EXTERNDEF C myfile:@tag_7
-
- Nested structures may have as many levels as desired; they are not limited
- to one level. Nested structures are "unnested" (expanded) in the correct
- hierarchical sequence, as shown with the C structure and H2INC-generated
- code in this example.
-
- /* C code: */
- struct phone
- {
- int areacode;
- long number;
- };
-
- struct person
- {
- char name[30];
- char sex;
- int age;
- int weight;
- struct phone;
- } Jim;
-
- ; H2INC generated code:
- phone STRUCT 2t
- areacode SWORD ?
- number SDWORD ?
- phone ENDS
-
- person STRUCT 2t
- name SBYTE 30t DUP (?)
- sex SBYTE ?
- age SWORD ?
- weight SWORD ?
- STRUCT
- areacode SWORD ?
- number SDWORD ?
- ENDS
- person ENDS
-
- EXTERNDEF C Jim:person
-
- See Section 5.2 for information on MASM structures and unions.
-
-
- 16.3.5 Bit Fields
-
- H2INC translates C bit fields into MASM records. H2INC looks at a structure
- definition; if it consists only of bit fields of the same type and if the
- total size of the bit fields does not exceed the size of that type,
- then H2INC outputs a RECORD definition with the name of the structure. All
- bit-field names are modified to include the structure name for uniqueness,
- since record fields have global scope in MASM.
-
- For example,
-
- struct s
- {
- int i:4;
- int j:4;
- int k:4;
- };
-
- becomes:
-
- s RECORD @tag_0:4,
- k@s:4,
- j@s:4,
- i@s:4
-
- The @tag variable pads out the record to the type size of the bit fields
- so alignment of the structures will be correct.
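-
- The width of the padding field is simple arithmetic: the size of the
- bit-field type in bits minus the sum of the field widths. The C fragment
- below is a hedged sketch of that calculation; record_pad_bits is a
- hypothetical helper, and the 16-bit size used for int is the assumption
- made throughout this section.
-
```c
/* Hypothetical helper: returns the width of the padding field needed to fill
   a record out to type_bits, or -1 if the bit fields do not fit in the type. */
int record_pad_bits(int type_bits, const int *widths, int count)
{
    int used = 0;
    int i;

    for (i = 0; i < count; i++)
        used += widths[i];
    return used <= type_bits ? type_bits - used : -1;
}
```
-
- For the structure s above, three 4-bit fields in a 16-bit int leave
- 16 - 12 = 4 bits, which is the width of the @tag_0 field.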
-
- If the bit fields are too large, are not of the same type, or are mixed with
- fields that are not bit fields, H2INC generates a RECORD definition inside
- the structure and then uses the definition.
-
- For example,
-
- struct t
- {
- int i;
- unsigned char a:4;
- int j:9;
- int k:9;
- long l;
- } m;
-
- becomes:
-
- t STRUCT 2t
- i SWORD ?
- rec@t_0 RECORD @tag_1:4,
- a@t:4
- @bit_0 rec@t_0 <>
- rec@t_1 RECORD @tag_2:7,
- j@t:9
- @bit_1 rec@t_1 <>
- rec@t_2 RECORD @tag_3:7,
- k@t:9
- @bit_2 rec@t_2 <>
- l SDWORD ?
- t ENDS
-
- EXTERNDEF C m:t
-
- Notice that j and k are not packed because their total size exceeds the
- 16 bits of an integer in C.
-
- Since the @bit field names are local to the structure, these begin with 0
- for each structure type; the @rec variables have global scope and so
- their number always increases.
-
- The C bit-field declaration
-
- struct SCREENMODE
- {
- unsigned int disp_mode : 4;
- unsigned int fg_color : 3;
- unsigned int bg_color : 3;
- };
-
- is converted into the following MASM record:
-
- SCREENMODE RECORD disp_mode@SCREENMODE:4,
- fg_color@SCREENMODE:3,
- bg_color@SCREENMODE:3
-
- See Section 5.3 for information about MASM records.
-
-
- 16.3.6 Enumerations
-
- H2INC converts C enumeration declarations into MASM EQU definitions that are
- treated as standard integer constants. If an enumerator is not assigned a
- value, H2INC generates an EQU statement that supplies the value C would
- assign: 0 for the first enumerator, otherwise one more than the preceding
- enumerator. For example, the C enumeration declaration
-
- enum tagName
- {
- id1,
- id2,
- id3 = 42,
- id4
- };
-
- is converted into the following EQU statements:
-
- id1 EQU 0t
- id2 EQU 1t
- id3 EQU 42t
- id4 EQU 43t
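-
- The generated values follow the standard C numbering rule, so a C compiler
- assigns the same numbers. The declaration is repeated here with each value
- noted in a comment; the comments are added for illustration and are not
- H2INC output.
-
```c
/* The enumeration from the example above, annotated with its C values. */
enum tagName
{
    id1,        /* no initializer, first in list: 0 */
    id2,        /* previous value plus one:       1 */
    id3 = 42,   /* explicit initializer:         42 */
    id4         /* previous value plus one:      43 */
};
```
-
- Note that id4 receives 43, one more than id3, rather than its ordinal
- position in the list.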
-
- See Section 1.2.4 for information on MASM integer constants.
-
-
- 16.3.7 Type Definitions
-
- All type definitions using C base types are translated directly. For
- example, H2INC converts the C type definitions
-
- typedef int INTEGER;
- typedef float FLOAT;
-
- to these MASM forms:
-
- INTEGER TYPEDEF SWORD
- FLOAT TYPEDEF REAL4
-
- Pointer types are converted in a similar fashion. The following declarations
-
-
- typedef int *PINT;
- typedef int **PINT;
- typedef int far *PINT;
-
- become (respectively)
-
- PINT TYPEDEF PTR SWORD
- PINT TYPEDEF PTR PTR SWORD
- PINT TYPEDEF FAR PTR SWORD
-
- Addressing mode determines pointer size.
-
- The number of bytes allocated for the pointer is set by the addressing mode
- you have selected unless it is specifically overridden in the type
- definition.
-
- A C typedef whose new name is the same as the MASM type it converts to does
- not generate an error, but is not converted. For example, H2INC does not
- convert
-
- typedef int SWORD;
- typedef unsigned char BYTE;
-
- since these typedef statements would generate these MASM statements:
-
- SWORD TYPEDEF SWORD
- BYTE TYPEDEF BYTE
-
- See Section 3.3, "Accessing Data with Pointers and Addresses," for
- information on using TYPEDEF in MASM 6.0.
-
-
- 16.4 Converting Function Prototypes
-
- When H2INC converts C function prototypes into MASM function prototypes, the
- elements of the C syntax are converted into the corresponding elements of
- the MASM syntax.
-
- The syntax of a C function prototype is
-
- [[storage]] [[distance]]
- [[ret_type]] [[langtype]]
- label ( [[parmlist]] )
-
- In C syntax, storage can be STATIC or EXTERN. H2INC does not translate
- static function prototypes because static functions are visible only within
- the current source module, and standard include files do not contain
- executable code.
-
- Procedures for returning values depend on the langtype specified.
-
- In C, the ret_type is the data type of the return value. Because the MASM
- PROTO directive does not specify how to handle return values, H2INC does not
- translate the return type. However, H2INC checks the langtype specified in
- the C prototype to determine how particular languages return the
- value: through the stack or through registers.
-
- For the Pascal, FORTRAN, or Basic langtype specifications, H2INC appends an
- additional parameter to the argument list if the return type is longer than
- four bytes. This parameter is always a near pointer with the type of the
- return value. If the return type is not supported, this parameter is an
- untyped near pointer.
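-
- The condition under which the extra parameter is appended can be written as
- a predicate. This C fragment is a sketch only; needs_return_pointer_param is
- a hypothetical name, and the language type is passed as the MASM keyword
- purely for illustration.
-
```c
#include <string.h>

/* Hypothetical predicate restating the rule above; not H2INC source.
   langtype is the MASM language type; ret_size is the size of the C return
   type in bytes. */
int needs_return_pointer_param(const char *langtype, unsigned ret_size)
{
    int pascal_style = strcmp(langtype, "PASCAL") == 0 ||
                       strcmp(langtype, "FORTRAN") == 0 ||
                       strcmp(langtype, "BASIC") == 0;

    /* Only return types longer than four bytes get the extra near pointer. */
    return pascal_style && ret_size > 4;
}
```
-
- A Pascal function returning a long (four bytes) therefore gets no extra
- parameter, while one returning a double (eight bytes) does.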
-
- For the _cdecl langtype specification in the C prototype, all returned data
- is passed in registers (AX or AX plus DX). There is no restriction on the
- return type. Additional parameters are not necessary.
-
- The langtype represents the naming and passing conventions for a language
- type. H2INC accepts the following C language types and converts them to
- their corresponding MASM language types:
-
- C Language Type MASM Language Type
- ────────────────────────────────────────────────────────────────────────────
- _cdecl C
-
- _fortran FORTRAN
-
- _pascal PASCAL
-
- _stdcall STDCALL
-
- _syscall SYSCALL
-
- H2INC explicitly includes the langtype in every function prototype. If no
- language type is specified in the .H file prototype, the default language is
- _cdecl (unless the default is overridden by the /Gc command-line option).
-
- In the MASM prototype syntax, the label is the name of the function or
- procedure.
-
- If you select the /Mn option, H2INC specifies the distance of the function
- (near or far), whether or not the C prototype specifies the distance. If /Mn
- is not set, H2INC specifies the distance only when it is different from the
- default distance specified by the memory model.
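-
- The rule for writing out a function's distance can be sketched as a
- predicate. This C fragment is illustrative only: the type and function names
- are hypothetical, and the default code distances it assumes (near for tiny,
- small, and compact; far for medium, large, and huge) are the usual
- memory-model conventions rather than something stated in this section.
-
```c
#include <stdbool.h>

typedef enum {
    MODEL_TINY, MODEL_SMALL, MODEL_COMPACT,
    MODEL_MEDIUM, MODEL_LARGE, MODEL_HUGE
} MemoryModel;

/* Default code distance implied by the memory model (the /A options). */
bool model_default_is_far(MemoryModel model)
{
    return model == MODEL_MEDIUM || model == MODEL_LARGE || model == MODEL_HUGE;
}

/* True when the generated prototype should spell out NEAR or FAR:
   always with /Mn, otherwise only when the function differs from the default. */
bool emit_function_distance(bool mn_option, MemoryModel model, bool func_is_far)
{
    return mn_option || func_is_far != model_default_is_far(model);
}
```
-
- In the small model without /Mn, a function declared _far has its distance
- written out, while an ordinary near function does not.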
-
- If the C prototype's parameter list ends with a comma plus an ellipsis (,
- ...), the function can accept a variable number of arguments. H2INC converts
- this to the MASM form: a comma followed by the :VARARG keyword (, :VARARG)
- appended to the last parameter.
-
- H2INC does not translate _fastcall functions. Declaring a function
- _fastcall explicitly, or invoking H2INC with the /Gr option, generates a
- warning indicating that the function declaration has been ignored.
-
- The following examples show how the preceding rules control the conversion
- of C prototypes to MASM prototypes (when the memory model default is small).
- The example function is my_func. The TYPEDEF generated by H2INC for the
- PROTO is given along with the PROTO statement.
-
- /* C prototype */
- my_func (float fNum, unsigned int x);
- ; MASM TYPEDEF
- @proto_0 TYPEDEF PROTO C :REAL4, :WORD
- ; MASM prototype
- my_func PROTO @proto_0
-
- /* C prototype */
- extern my_func1 (char *argv[]);
- ; MASM TYPEDEF
- @proto_1 TYPEDEF PROTO C :PTR PTR SBYTE
- ; MASM prototype
- my_func1 PROTO @proto_1
-
- /* C prototype */
- struct vconfig _far * _far pascal my_func2 (int, scri);
- ; MASM TYPEDEF
- @proto_2 TYPEDEF PROTO FAR PASCAL :SWORD, :scri
- ; MASM prototype
- my_func2 PROTO @proto_2
-
- /* C prototype */
- long pascal my_func3 (double y, struct vconfig vc);
- ; MASM TYPEDEF
- @proto_3 TYPEDEF PROTO PASCAL :REAL8, :vconfig
- ; MASM prototype
- my_func3 PROTO @proto_3
-
- /* C prototype */
- void _far _cdecl myfunc4 ( char _huge *, short);
- ; MASM TYPEDEF
- @proto_4 TYPEDEF PROTO FAR C :FAR PTR SBYTE, :SWORD
- ; MASM prototype
- myfunc4 PROTO @proto_4
-
- /* C prototype */
- short my_func5 (void *);
- ; MASM TYPEDEF
- @proto_5 TYPEDEF PROTO C :PTR
- ; MASM prototype
- my_func5 PROTO @proto_5
-
- /* C prototype */
- char my_func6 (int, ...);
- ; MASM TYPEDEF
- @proto_6 TYPEDEF PROTO C :SWORD, :VARARG
- ; MASM prototype
- my_func6 PROTO @proto_6
-
- /* C prototype */
- typedef char * ptrchar;
- ptrchar _cdecl my_func7 (char *);
- ; MASM TYPEDEF
- @proto_7 TYPEDEF PROTO C :PTR SBYTE
- ; MASM prototype
- my_func7 PROTO @proto_7
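-
- As a hedged illustration of how a generated prototype might be used, the
- fragment below calls my_func6 with INVOKE. It assumes the H2INC output shown
- above has been included in the source file; the argument values are
- arbitrary placeholders chosen for this sketch.
-
-           INVOKE  my_func6, 2, ax, bx    ; 2 fills the :SWORD parameter;
-                                          ; AX and BX go to :VARARG
-
- Because the prototype specifies the C convention, INVOKE pushes the
- arguments from right to left and adjusts the stack after the call.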
-
- See Section 7.3.6, "Declaring Procedure Prototypes," for more information on
- prototypes and Chapter 20, "Mixed-Language Programming," for information on
- calling conventions and mixed-language programs.
-
-
- 16.5 Related Topics in Online Help
-
- In addition to information covered in this chapter, information on the
- following topics can be found in online help.
-
- Topic                             Access
- ────────────────────────────────────────────────────────────────────────────
- INCLUDE Directive                 From the "MASM 6.0 Contents" screen,
-                                   choose "Directives" and then
-                                   "Miscellaneous"
-
- Include files                     From the "MASM 6.0 Contents" screen,
-                                   choose "Example Code"; then choose
-                                   "INCLUDE Files" to see a list of the
-                                   include files provided with MASM 6.0
-
- MASM data types (constants,       From the "MASM 6.0 Contents" screen,
- variables, structures, unions,    choose "Directives"; then choose "Data
- real numbers, records)            Allocation" or "Complex Data Types"
-
- TYPEDEF                           From the "MASM 6.0 Contents" screen,
-                                   choose "Directives" and then "Complex
-                                   Data Types"
-
- Procedures and prototypes         From the "MASM 6.0 Contents" screen,
-                                   choose "Directives"; then choose
-                                   "Procedure and Code Labels"
-
-
-
-
-
-
- Chapter 17 Writing OS/2 Applications
- ────────────────────────────────────────────────────────────────────────────
-
- Microsoft Operating System/2 (OS/2) takes full advantage of 80286 and later
- processors. It supports memory far beyond the DOS 640K limit and offers a
- rich set of multitasking system calls. Although OS/2 is much more powerful
- than DOS, you may ultimately find it easier to program for OS/2.
-
- This chapter shows how to develop an OS/2 application and how to write
- dual-mode programs to run under both OS/2 and DOS.
-
- To write OS/2 applications, you must learn OS/2 system calls. While this
- chapter mentions a few of these calls, you should consult the references
- listed in the book's introduction to learn more about OS/2 system functions.
-
-
- OS/2 supports two modes─real mode, which emulates the DOS environment, and
- protected mode, which supports all the advanced features. For simplicity's
- sake, the rest of this chapter equates OS/2 with protected mode.
-
- ────────────────────────────────────────────────────────────────────────────
- NOTE
-
- Examples in this chapter support OS/2 1.x. Future versions of OS/2 may
- support different calling conventions.
- ────────────────────────────────────────────────────────────────────────────
-
-
- 17.1 OS/2 Overview
-
- There are three steps in developing OS/2 or dual-mode applications:
-
-
- 1. Write the source code, using procedure calls rather than interrupts to
- call system functions.
-
- 2. Assemble and link the program with OS2.LIB.
-
- 3. Optionally, convert the program so that it can run under both OS/2 and
- DOS.
-
-
- This chapter explains each of these steps, first looking at specific
- differences in how you write DOS and OS/2 code. Then it illustrates the
- development of a simple OS/2 program. Finally, the chapter discusses
- register initialization and additional OS/2 utilities.
-
-
- 17.2 Differences between DOS and OS/2
-
- Assembly language is assembly language. Most machine instructions you use in
- a DOS program are the same instructions you use in an OS/2 program. When you
- start making calls to the operating system, however, things change.
-
- You should understand the following differences between the two operating
- systems before attempting to write an OS/2 program.
-
-
- System Calls
-
- System calls control I/O and screen access.
-
- OS/2 is similar to DOS in that it offers a series of system calls that
- perform tasks such as opening or closing a disk file. The OS/2 system calls
- that handle keyboard input (KbdCharIn, for example) correspond to the
- interrupt 16h instructions in DOS. The OS/2 system calls for screen output
- (VioScrollDn, for example) correspond to DOS interrupt 10h calls. And the
- OS/2 disk and operating-system calls (DosGetDateTime, for example)
- correspond to DOS interrupt 21h calls.
-
- The effect is similar, but the way you actually make the calls is different.
- In DOS, you issue an interrupt. In OS/2, you make the system call with the
- INVOKE directive or the CALL instruction.
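-
- The contrast can be sketched with a single write to standard output. The
- fragment is illustrative only: it assumes that message and bytecount are
- defined as in the sample program in Section 17.3 and that OS2.INC supplies
- the DosWrite prototype.
-
-   ; DOS: write to standard output with interrupt 21h, function 40h
-           mov     ah, 40h
-           mov     bx, 1                  ; Handle 1 = standard output
-           mov     cx, LENGTHOF message   ; Number of bytes to write
-           mov     dx, OFFSET message     ; DS:DX points to the buffer
-           int     21h
-
-   ; OS/2: the same request is an ordinary far procedure call
-           INVOKE  DosWrite, 1, ADDR message, LENGTHOF message,
-                   ADDR bytecount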
-
-
- New Instructions
-
- OS/2 is designed for advanced processors, and you may want to write programs
- that take advantage of the new instructions available on the 80286-80486. To
- use the new instructions and still target OS/2 1.x, place a .286 directive
- at the beginning of your source code.
-
- In general, you should avoid the directives that enable privileged
- instructions (.286P, .386P, and .486P), unless you are writing system-level
- code.
-
- Many OS/2 programs can be converted to run under DOS as well. To write
- programs to run on all DOS and OS/2 systems, use the default processor
- setting (.8086).
-
-
- The OS/2 Library
-
- MASM 6.0 provides OS2.INC and OS2.LIB.
-
- OS/2 programs must be linked to the system-call import library, OS2.LIB. The
- best way to perform this task is to use the INCLUDELIB directive, as shown
- in the example in the next section. In addition, you can include the OS2.INC
- file as an alternative to adding the prototypes for the OS/2 functions to
- your file.
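-
- If you prefer not to include OS2.INC, hand-written prototypes might look
- like the following sketch. The parameter names and types here are
- assumptions inferred from the arguments pushed in the sample program in
- Section 17.3 and in its .EXIT listing; OS2.INC remains the authoritative
- source.
-
-   DosWrite  PROTO   FAR PASCAL, hf:WORD, pBuf:FAR PTR BYTE,
-                     cbBuf:WORD, pcbWritten:FAR PTR WORD
-   DosExit   PROTO   FAR PASCAL, fTerminate:WORD, usExitCode:WORD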
-
- The OS2.LIB file makes system calls possible; it contains import definitions
- for all system calls. An import definition specifies the name of a procedure
- and the dynamic-link library (DLL) where the procedure resides. You can
- learn more about DLLs in Chapter 18, "Creating Dynamic-Link Libraries." To
- create an OS/2 application, however, you need to know only that OS2.LIB is
- required.
-
-
- Start-Up Code
-
- Unlike DOS, OS/2 automatically initializes all segment registers as required
- by the standard segment model. No special start-up sequence is required,
- although OS/2 places useful information in AX, BX, and CX (see Section 17.6,
- "Register and Memory Initialization") that you may want to save.
-
-
- Calling Conventions
-
- OS/2 1.x uses the Pascal calling convention.
-
- OS/2 system calls follow the Pascal calling and naming conventions. One way
- to enforce these conventions is to specify PASCAL in the .MODEL directive,
- then use the INVOKE directive to generate the correct code. Another is to
- include the OS2.INC file, whose PROTO statements declare each function with
- the Pascal calling convention. OS/2 functions return a value in AX. A
- nonzero value indicates an error. All registers except AX are preserved.
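-
- A minimal sketch of testing the return value follows. It assumes the
- DosWrite call and data from Section 17.3; exiting with return code 1 is
- only a placeholder policy for this example.
-
-           INVOKE  DosWrite, 1, ADDR message, LENGTHOF message,
-                   ADDR bytecount
-           .IF     ax != 0                ; Nonzero AX is an error code
-               .EXIT 1                    ; Placeholder: give up
-           .ENDIF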
-
- The OS/2 2.x operating system uses different calling conventions. See the
- documentation provided with that product.
-
-
- Exit Code
-
- To exit an OS/2 program, call the OS/2 system function DosExit. If you use
- the .EXIT directive and the OS_OS2 attribute of the .MODEL statement, the
- assembler automatically generates the proper system call if you have a
- prototype for DosExit.
-
-
- Segment Restrictions
-
- Although OS/2 makes some operations easier, it does impose restrictions on
- the programmer. You cannot do segment arithmetic. That is, you cannot
- attempt to measure the distance between segments by subtracting one segment
- from another. In general, you also cannot add values to segment registers.
- Either operation may cause a protection violation, which would immediately
- terminate the program.
-
- Under OS/2, segment registers do not hold physical addresses; they hold
- "segment selectors." A segment selector is an index into the system's
- descriptor tables that hold the actual addresses. You can copy the segment
- selector or use it to access data, but you should not try to modify it.
-
- Huge pointer arithmetic is therefore different under OS/2. Under DOS, you
- can handle huge pointers easily by checking the carry flag (CARRY?) after
- you add to an offset address. If the addition carries (the offset wraps
- past 64K), you advance the segment address by 1000h, which is 64K in
- paragraphs. Under OS/2, manipulation of
- huge pointers requires special techniques. See your OS/2 documentation for
- more information.
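-
- For comparison, the DOS technique can be sketched as shown below. The
- register choices are arbitrary assumptions: ES:BX is the huge pointer and
- CX is the number of bytes to add. The ADD sets the carry flag when the
- offset wraps past 64K. The final three instructions are exactly the kind of
- segment arithmetic that OS/2 protected mode does not allow.
-
-           add     bx, cx                 ; Advance offset; carry set on wrap
-           jnc     @F                     ; No wrap: ES:BX is still valid
-           mov     ax, es
-           add     ax, 1000h              ; Next 64K block (real mode only)
-           mov     es, ax
-   @@: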
-
-
- 17.3 A Sample Program
-
- The following program prints Hello, world. It runs under OS/2 protected
- mode.
-
- ; HELLO.ASM
- ;
- .MODEL small, pascal, OS_OS2
- .286
-
- INCLUDELIB os2.lib
- INCLUDE os2.inc
-
- .STACK
- .DATA
- message BYTE "Hello, world.", 13, 10 ; Message to print
- bytecount DWORD ? ; Holds number of
- ; bytes written
- .CODE
-
- .STARTUP
- push 1 ; Select standard output
- push ds ; Pass address of message
- push OFFSET message
- push LENGTHOF message ; Pass length of message
- push ds ; Pass address of count
- push OFFSET bytecount ; returned by function
- call DosWrite ; Call system write
- ; function
- .EXIT 0 ; Exit with 0 return code
- END
-
- .STARTUP and .EXIT automatically generate code.
-
- The .STARTUP and .EXIT directives are very useful because they automatically
- produce correct code for the operating-system type specified with the .MODEL
- directive (see Section 2.2, "Using Simplified Segment Directives"). As
- described in Section 17.6, OS/2 initializes all segment registers;
- therefore, .STARTUP does nothing but indicate the starting point. To
- correctly exit an OS/2 program, you must call the DosExit function. The
- DosExit prototype is always available to MASM programs.
-
- In the example above, .EXIT automatically generates the following code under
- OS/2:
-
- .EXIT 0
- 0011 6A 01 * push +000000001h ; Action 1 ends all threads
- 0013 6A 00 * push +000000000h ; Pass 0 return code
- 0015 9A ---- 0000 E * call DosExit ; Call system function
- END
-
- Between .STARTUP and .EXIT, the entire program consists of a single call to
- the DosWrite function. The program pushes the parameters on the stack and
- then makes the call. No POP or ADD instructions are needed to restore the
- stack after DosWrite returns; DosWrite observes the Pascal calling
- convention and restores the stack itself before returning.
-
- The .MODEL statement helps ensure that the assembler produces correct code
- for calling DosWrite:
-
- .MODEL small, pascal, OS_OS2
-
- When you run HELLO.EXE, OS/2 looks at the import definitions in the
- executable-file header and makes sure that all needed DLLs are in memory. It
- then loads any needed DLLs not already in memory.
-
- The assembler must be informed that DosWrite and DosExit are far and observe
- the Pascal calling convention. This information is in the prototype.
-
- In the call to DosWrite, note that although OFFSET message is an immediate
- operand, the program pushes it directly onto the stack. This operation is
- legal on 80186-80486 processors but not on the 8086 or 8088:
-
- push OFFSET message
-
- The processors you want to target determine the instructions you should use.
-
-
- Since OS/2 programs can execute only on the 80286 or later processors, it is
- reasonable to use extended operations not supported by the 8086. However, if
- you want to write a program that can be converted to run under both OS/2 and
- DOS (as shown in Section 17.5), then you should write code that can run on
- the 8086. For example,
-
- mov ax, OFFSET message
- push ax
-
- The following revision of the sample program illustrates the usefulness of
- the INVOKE directive. This version does everything the previous example did
- with far fewer statements:
-
- ; HELLO.ASM
-
- .MODEL small, pascal, OS_OS2
-
- INCLUDE os2.inc
- INCLUDELIB os2.lib
-
- .STACK
- .DATA
- message BYTE "Hello, world.", 13, 10 ; Message to print
- bytecount DWORD ? ; Holds number of
- ; bytes written
- .CODE
- .STARTUP
-
- INVOKE DosWrite,
- 1,
- ADDR message,
- LENGTHOF message,
- ADDR bytecount
-
- .EXIT 0 ; Exit with return code 0
- END
-
- The INVOKE directive generates a call to the given procedure after first
- pushing all other arguments on the stack. Like a call statement in a
- high-level language, the INVOKE directive handles types in a sophisticated
- way.
-
-
- 17.4 Building an OS/2 Application
-
- The easiest way to assemble and link the program is from the Programmer's
- WorkBench (PWB). From the Options menu, select Link Options and choose OS/2
- Application. When you select Build from the Make menu, PWB calls ML and
- LINK, passing the proper options.
-
- From the command line, type
-
- ML hello.asm
-
- The next section discusses how to "bind" the program─that is, convert it so
- that it runs under either DOS or OS/2.
-
-
- 17.5 Binding OS/2 MASM Programs
-
- You can convert many OS/2 programs to run under both OS/2 and DOS 3.x. This
- conversion is called "binding" because it binds system calls to the API.LIB
- file provided with MASM 6.0. This file simulates OS/2 functions under DOS.
- The program must use a restricted set of system calls or it cannot be bound.
-
-
- OS/2 function calls are known collectively as the applications program
- interface (API). If you restrict your system calls to a subset of these
- functions known as the Family API, the program can be bound. See the
- Microsoft Operating System/2 Programmer's Reference for a list of the Family
- API functions.
-
- Online help also provides information on these utilities.
-
- If you use PWB, binding is easy. Select Bound Application from the LINK
- Options command in the Options menu. PWB does the rest, calling the BIND.EXE
- utility.
-
- If you want to bind the program to run under either OS/2 or DOS, use this
- command line:
-
- ML /Fb hello.asm
-
- You can use system calls outside the Family API provided that you never use
- them when running under DOS. The program can check the operating system and,
- if running under OS/2, can execute system calls that do not belong to the
- Family API. To follow this strategy, list OS/2-only calls with BIND's /N
- option. It is the program's responsibility to make sure these calls are
- never made under DOS; otherwise, execution is terminated.
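-
- One way to make the run-time check is sketched below. It assumes that
- OS2.INC declares the Family API function DosGetMachineMode, which stores 0
- for real mode (DOS) and 1 for protected mode (OS/2) in the byte you pass;
- the variable name machmode is hypothetical.
-
-           .DATA
-   machmode BYTE    ?                     ; 0 = real mode, 1 = protected
-
-           .CODE
-           INVOKE  DosGetMachineMode, ADDR machmode
-           .IF     (ax == 0) && (machmode != 0)
-               ; Protected mode: calls listed with /N are safe here
-           .ENDIF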
-
-
- 17.6 Register and Memory Initialization
-
- When you execute an OS/2 program, OS/2 stores information about the program
- directly in registers. With DOS programs, the information is kept in a
- separate program segment prefix (PSP). The registers hold these values when
- an OS/2 program begins:
-
- Register                          Contents at Program Start
- ────────────────────────────────────────────────────────────────────────────
- AX                                Segment address of program's environment
-
- BX                                Offset of command-line arguments within
-                                   the environment
-
- CX                                Length of near data area (DGROUP)
-
- SP                                Offset of the top of the stack within
-                                   the stack segment
-
- CS:IP                             Program's entry point
-
- DS                                Segment address of near data area
-                                   (DGROUP)
-
- SS                                Segment address of stack
-
-
-
- Note that OS/2 automatically initializes SS:SP correctly. If the .MODEL
- directive specifies FARSTACK, SS is initialized to its own segment address.
- If the model is NEARSTACK, OS/2 sets SS to DGROUP and SP to the top of the
- stack within DGROUP.
-
- You may want to save the AX, BX, and CX registers at startup.
-
- Upon start-up, AX, BX, and CX all contain information highly useful to some
- programs. If you want to access the program's command-line arguments or know
- the size of DGROUP, you must save the contents of these registers
- immediately:
-
- FPBYTE TYPEDEF FAR PTR BYTE
-
- .DATA
-
- args FPBYTE 0
- cmds FPBYTE 0
-
- .CODE
-
- mov WORD PTR args[0], 0 ; Offset is 0 (low word)
- mov WORD PTR args[2], ax ; Save segment of args (high word)
- mov WORD PTR cmds[0], bx ; Save offset of cmds (low word)
- mov WORD PTR cmds[2], ax ; Save segment of cmds (high word)
-
- The AX register holds the segment address of the program's environment.
- AX:BX points to the starting address of arguments within the
- environment, the first of which is the program name. This name is followed
- by a null (zero) byte and the command-line arguments exactly as typed at the
- command prompt. A second null marks the end of the arguments.
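-
- The layout described above suggests the following sketch for locating the
- argument text. It is meant to run at the entry point, while AX and BX still
- hold their start-up values; the choice of ES:DI is an assumption of this
- example.
-
-           mov     es, ax                 ; ES = environment segment
-           mov     di, bx                 ; ES:DI points to the program name
-           sub     al, al                 ; Search for the null byte
-           mov     cx, 0FFFFh             ; Set maximum length
-           cld                            ; Scan forward
-           repne   scasb                  ; DI stops just past the null
-           ; ES:DI now points to the command-line arguments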
-
- If you use simplified segments, .DATA is equivalent to DGROUP.
-
- Under OS/2, the data segment register, DS, contains the segment of the near
- data area, DGROUP. If you use simplified segment directives, this is the
- .DATA segment. You must place one data segment in a group called DGROUP if
- you do not use the simplified directives:
-
- _DATA SEGMENT WORD PUBLIC 'DATA'
- .
- .
- .
- _DATA ENDS
-
- DGROUP GROUP _DATA
- ASSUME DS:DGROUP
-
- Calling the group anything other than DGROUP, or not having a DGROUP, causes
- an error. Only the memory required by the program is allocated by OS/2. This
- means that the system has space in reserve for later memory requests and for
- other programs.
-
-
- 17.7 Other OS/2 Utilities
-
- In addition to LINK and BIND, MASM 6.0 provides other utilities useful for
- working with OS/2.
-
-
- EXEHDR
-
- The EXEHDR utility examines and can modify a DOS, Windows, or OS/2
- executable file header. In the case of OS/2 and Windows, EXEHDR reports a
- great deal more information: specifically, it displays the contents of
- segment tables and lists the attributes of the individual segments.
-
-
- IMPLIB
-
- The IMPLIB utility creates an import library that you can use when linking
- with a DLL or group of DLLs. Generally, there are three steps in using a
- DLL:
-
-
- 1. Copy the DLL to a directory listed in your CONFIG.SYS LIBPATH setting.
-
- 2. Run IMPLIB on the DLL to create an import library, or write a
- module-definition file.
-
- 3. Link the import library or module-definition file with any application
- that uses the DLL.
-
-
- An import library does not contain executable code but does contain the name
- and location of dynamic-link calls. These calls are resolved during run
- time.
-
- Chapter 18 goes into more detail about how to write DLLs.
-
-
- 17.8 Module-Definition Files
-
- You can create a module-definition file for an application. A
- module-definition file is a text file that contains statements that give
- directions to the linker. These statements can alter the attributes of
- individual segments─for example, whether multiple instances of the program
- share data. Module-definition files are optional. If you use one, begin the
- file with the NAME statement. The following sample module-definition file
- specifies an application, MYPROG, that shares the CONSTDAT segment:
-
- NAME MYPROG
-
- SEGMENTS CONSTDAT SHARED
-
-
- 17.9 Related Topics in Online Help
-
- In addition to information covered in this chapter, information on the
- following topics can be found in online help:
-
- Topic                             Access
- ────────────────────────────────────────────────────────────────────────────
- BIND                              See the "Microsoft Advisor Contents"
-                                   screen
-
- OS/2 Include files                Choose from the "MASM 6.0 Contents"
-                                   screen
-
- PROTO, INVOKE                     From the "MASM 6.0 Contents" screen,
-                                   choose "Directives" and then "Procedure
-                                   and Code Labels"
-
- INCLUDE, INCLUDELIB               From the "MASM 6.0 Contents" screen,
-                                   select "Directives" and then
-                                   "Miscellaneous Language Directives"
-
- EXEHDR                            From the "Microsoft Advisor Contents"
-                                   screen, select "Miscellaneous" under
-                                   "Microsoft Utilities"
-
- INCL_NOCOMMON                     Select "OS/2 Include Files" from the
-                                   "MASM 6.0 Contents" screen; from the
-                                   next screen, select "Category Summary"
-
- CALL                              From the "MASM 6.0 Contents" screen,
-                                   choose "Processor Instruction" and then
-                                   "Control Flow"
-
- SHOW.EXE                          From the "MASM 6.0 Contents" screen,
-                                   choose "Example Code" and then "SHOW
-                                   (Text Viewer)"
-
-
-
-
-
-
-
-
- Chapter 18 Creating Dynamic-Link Libraries
- ────────────────────────────────────────────────────────────────────────────
-
- A "dynamic-link library" (DLL) links to the main program at run time (hence
- the term dynamic link). The program that calls the DLL is known as the
- "client program." One DLL can supply services for several clients
- simultaneously.
-
- The client program can choose to load the DLL into memory at the same time
- the main program loads, or it can choose to load the DLL only when it is
- needed.
-
- DLLs are available only in OS/2 and Windows. In non-Windows DOS programs,
- all object modules are statically linked to the program at link time. This
- chapter discusses DLL programming for OS/2 1.x only.
-
- After an overview of DLLs, this chapter describes the following stages in
- developing a DLL:
-
-
- ■ Understanding general DLL programming considerations
-
- ■ Writing an interface to the DLL's exported procedures and data
-
- ■ Writing initialization and termination code
-
- ■ Building the DLL
-
-
- The last step requires use of a module-definition file and an import
- library.
-
-
- 18.1 DLL Overview
-
- Like a standard (object-code) library, a DLL contains procedures that one or
- more programs can call. Yet unlike standard-library procedures, DLL
- procedures are never copied into an application's executable file. They
- reside only on disk in the DLL file.
-
- DLLs have several advantages:
-
-
- ■ Dynamic-link libraries save significant space since the DLL's code and
- data exist in only one place, no matter how many different programs
- call the DLL. Applications that need a particular DLL can share it.
-
- In contrast, a standard library routine (the printf function in C, for
- example) becomes part of the executable code for each application that
- uses it. For example, if three different programs use the statically
- linked printf function, three copies of the printf code are on disk.
- Furthermore, if all three programs run at once, the printf code occurs
- three times in memory. If the same function were part of a DLL, it
- would exist in only one location on disk and in memory.
-
- ■ Dynamic linking makes applications and libraries more independent, and
- therefore they are easier to maintain. You can update a DLL without
- having to relink any of the programs that use it.
-
- ■ Applications link faster because the executable code for a
- dynamic-link function is not copied into the application's .EXE file. Instead,
- only an import definition is copied.
-
-
- The purpose of a DLL is to supply ("export") procedures and data to client
- programs at run time. Items not exported are visible only within the DLL.
-
- Exported procedures are visible to the client program.
-
- The concept of exporting is analogous to the action of the PUBLIC directive,
- but goes further. A public item is available only to other source modules
- within the same program or DLL. An exported item is available to all
- programs running on the system. In addition to global procedures and data, a
- DLL can contain other procedures and data definitions to support the
- operations of exported procedures.
-
- Finally, a DLL can contain initialization and termination code to allocate
- and release resources needed by the procedures. Resources are typically
- files or dynamic memory. System services for OS/2 and Windows are provided
- through DLLs.
-
-
- 18.2 DLL Programming Requirements
-
- Four programming requirements arise from the nature of DLLs. These
- requirements apply to all code used in a dynamic-link call─both in an
- exported procedure and in any procedure it may call:
-
-
- ■ You cannot assume that the SS and DS registers hold the same value,
- unless you explicitly set SS equal to DS.
-
- ■ You should avoid using the math coprocessor or emulator routines
- unless you are certain a coprocessor or emulator library is available.
-
- ■ The DLL should be "re-entrant," because there is no guarantee that
- only one program will use the DLL. A re-entrant procedure is one that
- can be called by different programs concurrently. This creates
- problems for static data in the DLL, unless you declare data to be
- NONSHARED in the module-definition file.
-
- ■ Be careful how you place data and code in segments. The location of
- data and code in different segments and the contents of the
- module-definition file also determine the content of the executable
- file.
-
-
- This section discusses these requirements.
-
-
- 18.2.1 Separate Stack and Data Requirement
-
- The separate stack and data requirement involves both assembler assumptions
- and coding techniques. If you used the FARSTACK keyword as described in
- Section 18.3.1, "Choosing Module Attributes," the assembler makes correct
- assumptions about the contents of DS and SS.
-
- Do not assume that SS equals DS.
-
- In your own code, avoid any optimizing techniques that use SS to access
- items in the data segment or DS to access stack data. For example, the
- following code uses the ASSUME statement to be sure the correct stack is
- accessed:
-
- ASSUME DS:DGROUP
- .
- .
- .
- push ds
- lds si, sourcead ; Load DS for string ops
- ASSUME DS:NOTHING
- .
- .
- .
- ASSUME SS:STACK
- mov bx, ss:thing ; Access near data thing through SS
- ASSUME SS:NOTHING
-
- Thread-specific variables can be stored on the stack, as shown in the
- example above.
-
-
- 18.2.2 Floating-Point Math Requirement
-
- Don't assume the math coprocessor is available to the DLL.
-
- A stand-alone DLL─that is, a DLL created for general use by many programs─
- can make few assumptions about the calling program. Therefore, the safest
- way to perform floating-point calculations is to use alternate math
- routines. If you link to a Microsoft high-level language, you can access
- these routines through a language library. These routines give the fastest
- results possible without a coprocessor. See Section 6.3, "Using Emulator
- Libraries," for more information.
-
- Floating-point operations in DLLs can use a coprocessor or emulator routines
- if you are certain that a coprocessor or emulator libraries are available.
-
-
- 18.2.3 Re-entrance Requirement
-
- A procedure may be called by any number of different programs concurrently.
- That is, program A may call a DLL procedure while program B is still
- executing the same procedure. The basic problem of re-entrance is how data
- is shared.
-
- Be aware that re-entering the DLL can modify its data.
-
- For example, suppose you have a DLL that contains an accounting package; one
- of the functions adds up an employee's salary for a whole year. First it
- initializes the total to zero; then it increments this total one week at a
- time. While program A is in the middle of this function, program B could
- enter the procedure; its first action would be to initialize the total to
- zero. Control could then pass back to program A, which would then have zero
- total for salary. The problem is that two instances of the DLL share the
- same variable for totals.
-
- A procedure in a DLL must therefore follow this rule: it can access static
- data items but must not alter them. Otherwise, one instance of a procedure
- could corrupt data relied on by another instance of the procedure.
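-
- One way to follow this rule is to keep working values in LOCAL variables,
- which are allocated on the stack of the calling thread rather than in the
- DLL's static data. The procedure below is a hypothetical sketch of the
- salary example: the name YearTotal, the pWeeks parameter, and the
- assumption of 52 word-sized weekly amounts are all inventions of this
- example.
-
-   YearTotal PROC    FAR PASCAL EXPORT USES cx si,
-                     pWeeks:FAR PTR WORD  ; Points to 52 weekly amounts
-             LOCAL   total:WORD           ; Lives on the caller's stack
-
-             mov     total, 0             ; Private to this call
-             les     si, pWeeks           ; Load pointer
-             mov     cx, 52               ; One pass per week
-   nextweek: mov     ax, es:[si]
-             add     total, ax            ; Accumulate on the stack
-             add     si, 2
-             loop    nextweek
-             mov     ax, total            ; Return the sum in AX
-             ret
-   YearTotal ENDP
-
- Because each call has its own copy of total, program B entering the
- procedure cannot reset the sum that program A is accumulating.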
-
- There are several exceptions to this rule. First, if data is declared
- NONSHARED in the module-definition file, each instance has its own copy of
- the data segment, and there is no conflict. Second, you can use semaphores
- to allow mutually exclusive access to data items. Finally, there may be some
- items you deliberately want all instances to alter, such as a global counter
- to keep track of the number of instances.
-
- Section 18.4.1, "Writing the Module-Definition File," explains how to
- declare some data items as SHARED while declaring others to be NONSHARED.
-
-
- 18.2.4 Segment Strategy in a DLL
-
- Be careful how you place different kinds of data and code in different
- segments. When loading the DLL, OS/2 checks to see if the DLL is already in
- memory. If so, it loads only new copies of NONSHARED segments; it does not
- reload SHARED segments. Code segments are always SHARED.
-
- Control of DLL data and code works at the segment level. The DATA statement
- assigns default attributes for all data segments in the DLL, but the
- module-definition SEGMENTS statement overrides these attributes for any given
- segment.
-
- You may want to create a DLL that has some data shared between all programs
- that call the DLL and some data that is private to each instance. The
- following module-definition statement specifies that all data in GLOBDAT
- is shared and all data in PRIVDAT is not:
-
- SEGMENTS
- GLOBDAT SHARED 'data'
- PRIVDAT NONSHARED 'data'
-
- The segments have class 'code' unless you specifically define the class as
- shown in this example. See Section 18.4.1 for more information on
- module-definition files.
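-
- On the assembler side, the segments named in those statements might be
- defined as in the following sketch. The variable names are hypothetical;
- the segment names and the 'data' class match the module-definition
- statements above.
-
-   GLOBDAT   SEGMENT WORD PUBLIC 'data'   ; Shared by all client processes
-   usecount  WORD    0
-   GLOBDAT   ENDS
-
-   PRIVDAT   SEGMENT WORD PUBLIC 'data'   ; One copy per client process
-   callcount WORD    0
-   PRIVDAT   ENDS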
-
-
- 18.3 Writing the DLL Code
-
- When you write the code for the DLL module, you need to select the correct
- module attributes, define the procedures and data in your DLL, and write the
- initialization and termination code. This section discusses these tasks.
-
-
- 18.3.1 Choosing Module Attributes
-
- As noted in Chapter 2, there are four fields for the .MODEL directive:
- memory model, language type, operating system, and stack type. When you
- write a DLL, you can choose the attributes you would normally use for the
- first two fields. OS/2 system calls use the Pascal calling convention, so
- you may find it convenient to make all your modules use this convention as
- well.
-
- DLLs use the OS_OS2 and FARSTACK attributes.
-
- The operating system and stack fields should be OS_OS2 and FARSTACK,
- respectively. You should use the NEARSTACK attribute only if you switch
- execution to your own stack.
-
- A usable declaration is therefore
-
- .MODEL large, pascal, os_os2, farstack
-
- If you are using full segment definitions, remember to generate an ASSUME
- directive for DS but not for SS.
-
- ASSUME DS:DGROUP ; Necessary with full segment definitions
-
-
- 18.3.2 Defining Procedures and Data
-
- Procedures and data in DLLs can be either global (available to the client
- process) or local (used only by the DLL). To create a global data item, make
- sure that it is public:
-
- EXTERNDEF dllvar:WORD
- .DATA
- dllvar WORD 0
-
- The variable must then be exported in a module-definition file, as shown in
- Section 18.4.1, "Writing the Module-Definition File." When executable files
- other than the DLL access the variable, they must treat it as far data, as
- in the following example:
-
- mov ax, SEG dllvar
- mov es, ax
- mov bx, es:dllvar
-
- An exported procedure (often called a dynamic-link procedure) must follow
- these rules:
-
-
- ■ It must be declared far and public. The MASM keyword EXPORT does both
- of these.
-
- ■ The procedure should initialize DS upon entry (unless you are not
- going to be accessing any static near data).
-
- ■ Data pointers in the parameter list should be far.
-
-
- The easiest way to realize most of these requirements is to use the EXPORT
- keyword and LOADDS in the procedure's prologuearg list (see Section 7.3.8).
- LOADDS generates instructions to save DS and load it with the value of the
- DLL's data segment. The EXPORT keyword makes the procedure FAR and PUBLIC,
- overriding the memory model. You may also need to use FORCEFRAME, which
- instructs the assembler to generate a stack frame even if there are no
- parameters or locals.
-
- The example DLL used in the chapter, CSTR.DLL, illustrates how DLLs can be
- shared by several processes. The procedures in the DLL write a string and
- keep track of the number of times the string is written. When more than one
- process uses the DLL, they all increment the global variable GCount, but
- each process increments its own private instance of the PCount variable.
-
- The only initialization this DLL needs is to register its exit procedure.
- The next section shows how to write a module-definition file to create an
- import library and how to create a DLL from this code.
-
- The code for the CSTR.DLL example looks like this:
-
- .MODEL small, pascal, os_os2, farstack
- .286
-
- INCL_NOCOMMON EQU 1
- INCL_DOSPROCESS EQU 1
- INCL_VIO EQU 1
-
- INCLUDE OS2.INC
- INCLUDELIB OS2.LIB
-
- .DOSSEG
-
- VioWrtCStr PROTO FAR PASCAL, pchString:PCH, hv:HVIO
- GetGCount PROTO PASCAL
- GetPCount PROTO PASCAL
- CStrExit PROTO FAR
-
- .STACK
- .DATA ; Default segment is SHARED
-
- GCount WORD 0 ; Count of all calls
-
- @CurSeg ENDS
-
- PRIVDAT SEGMENT WORD ; Private segment is NONSHARED
-
- PCount WORD 0 ; Count of this process's
- ; calls to VioWrtCStr
- PRIVDAT ENDS
-
- .CODE
- .STARTUP
-
- pusha
-
- ; Initialization goes here. In this case, the only
- ; initialization is setting up the exit behavior.
-
- INVOKE DosExitList, EXLST_ADD, CStrExit
- INVOKE DosExitList, EXLST_EXIT,0
-
- popa
- retf
-
- VioWrtCStr PROC FAR PASCAL EXPORT <LOADDS> USES cx di si,
- pchString:PCH,
- hv:HVIO
-
- sub al, al ; Search for zero
- mov cx, 0FFFFh ; Set maximum length
- les di, pchString ; Load pointer
- mov si, di ; Copy it
- repne scasb ; Find null
- .IF zero? ; Continue if found
- sub di, si ; Calculate length
- xchg di, si ; Restore address and save length
-
- INVOKE VioWrtTTy, ; Let OS/2 do output
- es:di, ; Address of string
- si, ; Calculated length
- hv ; Video handle
-
- inc GCount ; Count as one of total calls
-
- ASSUME DS:PRIVDAT
- mov ax, PRIVDAT
- mov ds, ax
- inc PCount ; Count as one of process calls
- ASSUME DS:DGROUP
- sub ax, ax ; Success
- .ELSE
- mov ax, 1 ; Error
- .ENDIF
- ret
- VioWrtCStr ENDP
-
- GetGCount PROC FAR PASCAL EXPORT <LOADDS, FORCEFRAME>
- mov ax, GCount
- ret
- GetGCount ENDP
-
- GetPCount PROC FAR PASCAL EXPORT <LOADDS, FORCEFRAME> USES ds
- ASSUME DS:PRIVDAT
- mov ax, PRIVDAT
- mov ds, ax
- mov ax, PCount
- ASSUME DS:NOTHING
- ret
- GetPCount ENDP
-
- .DATA
- szOut BYTE 13, 10, "Exiting DLL...", 13, 10, 0
- .CODE
-
- CStrExit PROC FAR <LOADDS, FORCEFRAME>
- INVOKE VioWrtCStr,
- ADDR szOut,
- 0
- INVOKE DosExitList, EXLST_EXIT, 0
- CStrExit ENDP
-
- END
-
- The generated code for the VioWrtCStr procedure follows. The code marked
- with asterisks is generated by the assembler.
-
- VioWrtCStr PROC FAR PASCAL EXPORT <LOADDS> USES cx di si,
- pchString:PCH,
- hv:HVIO
- 0000 55 * push bp
- 0001 8B EC * mov bp, sp
- 0003 1E * push ds
- 0004 B8 ---- R * mov ax, DGROUP
- 0007 8E D8 * mov ds, ax
- 0009 51 * push cx
- 000A 57 * push di
- 000B 56 * push si
- .
- . ; Procedure code here
- .
- ret
- 000C 5E * pop si
- 000D 5F * pop di
- 000E 59 * pop cx
- 000F 1F * pop ds
- 0010 C9 * leave
- 0011 CA 0006 * ret 00006h
- 0014 VioWrtCStr ENDP
-
- The DLL should establish its own data segment.
-
- The DLL should change DS in this manner because each client program has its
- own private version of DGROUP. When a program calls your dynamic-link
- procedure, DS points to the program's data area, not yours. The solution is
- to initialize DS so that it points to your own default data area.
-
- However, one side effect of this approach is that it alters DS so that it no
- longer is equal to SS. Consequently, all data pointers in the parameter list
- must be far pointers, even if they point to stack data or near data.
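-
- As a minimal sketch of what this means for a caller, the fragment below
- passes the address of a stack variable to VioWrtCStr. The procedure name
- ShowStatus and the buffer szBuf are hypothetical, and the prototype spells
- out the parameter types instead of using the PCH and HVIO types from
- OS2.INC. Because the parameter is declared as a far pointer, INVOKE passes
- a full segment:offset address (SS plus the offset of the local) rather than
- a near offset.
-
- ```asm
- VioWrtCStr PROTO FAR PASCAL, pchString:FAR PTR BYTE, hv:WORD
-
- ShowStatus PROC NEAR ; Hypothetical client procedure
- LOCAL szBuf[4]:BYTE ; String buffer on the stack
-
- mov szBuf[0], 'O' ; Build "OK" plus null terminator
- mov szBuf[1], 'K'
- mov szBuf[2], 0
- INVOKE VioWrtCStr, ; Far-pointer parameter:
- ADDR szBuf, ; INVOKE passes segment and offset
- 0 ; Video handle
- ret
- ShowStatus ENDP
- ```
-
- INVOKE typically uses AX to compute the address of a local variable, so the
- caller should not keep a live value in AX across the call.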
-
-
- 18.3.3 Creating Initialization and Termination Code
-
- Begin initialization code with the .STARTUP directive.
-
- A DLL can contain procedures that require special resources, such as
- temporary files or dynamic memory blocks. Resources allocated during
- initialization exist for the lifetime of the client program and are removed
- when the client program exits. Usually the best method for managing these
- resources is to write initialization and termination code.
-
- A DLL can have a starting point just as an application does. In the case of
- a DLL, this starting point marks the beginning of the initialization code. A
- DLL does not need a starting point if it has no need for initialization. Do
- not use .EXIT, since .EXIT will terminate the client program.
-
- Attributes of the initialization code are defined in the module-definition
- file (see Section 18.4.1). Initialization code can have the INITGLOBAL or
- INITINSTANCE attribute.
-
- INITGLOBAL specifies that the initialization code executes only once─when
- the DLL is first loaded into memory. INITINSTANCE specifies that
- initialization code should execute once for each program that uses the DLL.
- INITGLOBAL is the default. You should use termination code only for DLLs
- that have been defined with INITINSTANCE unless you know that the first
- process to use the DLL is the last to terminate.
-
- To specify INITINSTANCE, place the following LIBRARY statement in your
- module-definition file:
-
- LIBRARY CSTR INITINSTANCE
-
- In the statement above, CSTR is the name of the DLL.
-
- To include a termination procedure, invoke DosExitList in the initialization
- code. DosExitList is a system function that attaches a termination procedure
- to a program. When the program terminates, OS/2 executes the procedure as
- part of the program exit sequence. In the termination procedure itself,
- release any system resources (such as memory or files) allocated during
- initialization.
-
- This is the termination code for the CSTR.DLL module:
-
- CStrExit PROC FAR <LOADDS, FORCEFRAME>
- INVOKE VioWrtCStr,
- ADDR szOut,
- 0
- INVOKE DosExitList, EXLST_EXIT, 0
- CStrExit ENDP
-
- The termination code in CSTR.DLL uses the INVOKE directive to set up a call
- to the DosExitList function. You can perform a similar operation by simply
- pushing arguments on the stack and observing the correct calling convention.
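-
- As a rough sketch of that alternative, the fragment below makes the same
- call without INVOKE. It assumes the Pascal convention declared for
- DosExitList in OS2.INC: arguments are pushed from left to right, a far
- pointer is pushed segment first, and the called function removes the
- arguments from the stack. Pushing immediate values requires the .286
- directive, which the example already uses.
-
- ```asm
- push EXLST_EXIT ; First argument: function code
- push 0 ; Second argument: null far pointer,
- push 0 ; segment word first, then offset
- call DosExitList ; Callee removes the arguments
- ```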
-
-
- The effect of DosExitList in the initialization code is to make OS/2 call
- the termination procedure when the current process exits. The "current
- process" in this case is the client program, not the DLL or the DLL
- initialization code.
-
-
- 18.4 Building the DLL
-
- To create a DLL, you need to assemble the DLL code, write a
- module-definition file, use LINK to create the DLL, generate an import
- library, and then link the DLL to the client program.
-
-
- 18.4.1 Writing the Module-Definition File
-
- A module-definition file is required for DLLs.
-
- The module-definition file is an ASCII text file that lists attributes of a
- library or application (in the case of an application, this file is
- optional). The module-definition file gives directions to the linker that
- supplement the information on the command line.
-
- This module-definition file tells the linker to create a DLL called CSTR.DLL
- with INITINSTANCE initialization. The library exports the procedures
- VioWrtCStr, GetGCount, and GetPCount, and the data segment PRIVDAT is not
- shared between programs:
-
- LIBRARY CSTR INITINSTANCE
-
- EXPORTS
- VioWrtCStr
- GetGCount
- GetPCount
-
- DATA SINGLE NONSHARED
-
- The LIBRARY statement need not specify a name. If the name is omitted, the
- linker gives the library the base filename of the module-definition file.
- The default file extension is .DLL. The INITINSTANCE attribute is optional
- and is significant only if you have initialization code. If you specify
- INITINSTANCE, then the library initialization is called each time a new
- process gains access to the library. Otherwise, it will be called once only.
-
-
- At least one procedure must be listed after EXPORTS.
-
- The EXPORTS statement lists identifiers (procedures and variables) that can
- be accessed directly by client programs. Note that if you give a procedure
- the EXPORT attribute from within the source code, you do not need to list
- the procedure here: the EXPORT keyword automatically exports the procedure
- by name. However, exported variables must be listed in a module-definition
- file.
-
- The DATA statement lists attributes for data segments (DGROUP) in the DLL.
- The default for DLLs, SINGLE, specifies that one DGROUP is shared by all
- instances of the DLL. NONSHARED specifies that all other data segments are
- not to be shared. See Section 13.15, "CODE, DATA, and SEGMENTS Attributes."
-
-
-
- 18.4.2 Generating an Import Library with IMPLIB
-
- The DLL exports a procedure; the client program imports it.
-
- Just as a procedure is exported by a DLL, it must be imported by an
- application. An application's EXE header must indicate what dynamic-link
- procedures are used and where they reside. The easiest way to specify this
- information is with an "import library," which is a .LIB file that contains
- the import information in object-record form. The IMPLIB utility automates
- this process for you.
-
- To create an import library, run the IMPLIB utility on the module-definition
- file:
-
- IMPLIB MYDYNLIB.LIB MYDYNLIB.DEF
-
- The result is the import library, MYDYNLIB.LIB, which you link to any
- program that calls CSTR.DLL. List MYDYNLIB.LIB in the libraries field (the
- fourth field) of the LINK command. Or, in assembly-language programs, you
- can link to this library automatically by adding the following statement to
- the source code of your program:
-
- INCLUDELIB MYDYNLIB.LIB
-
-
- 18.4.3 Creating and Using the DLL
-
- Now you can use LINK to create the DLL. The LINK utility uses the object
- module of the DLL code and the module-definition file to create CSTR.DLL:
-
- LINK CSTR.OBJ , , , , MYDYNLIB.DEF
-
- If linking is successful, the linker creates a file with a .DLL extension.
-
- You can link several modules together to create a DLL. The following command
- line links several object modules and an object-code library (BIGLIB.LIB) to
- form a DLL. The module-definition file is MYDYNLIB.DEF:
-
- LINK MOD1 MOD2 MOD3,,, BIGLIB, MYDYNLIB
-
- To use the DLL, copy the .DLL file to a directory listed in the LIBPATH
- setting in your CONFIG.SYS file.
-
- To create an executable file using the DLL, link the client program with the
- import library as shown:
-
- LINK CALLDLL.OBJ , , , MYDYNLIB.LIB
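-
- The source for CALLDLL is not shown in this section. As a hedged sketch, a
- minimal client might look like the following. The message text and the
- choice to ignore the return values are assumptions, the include lines
- simply mirror the DLL example, and the prototypes spell out the parameter
- types rather than relying on OS2.INC type names. The procedures are
- declared FAR because they reside in the DLL's code segment, not the
- client's.
-
- ```asm
- .MODEL small, pascal, os_os2
- .286
-
- INCL_NOCOMMON EQU 1
- INCL_DOSPROCESS EQU 1 ; Process functions such as DosExit
- INCLUDE OS2.INC
- INCLUDELIB OS2.LIB
- INCLUDELIB MYDYNLIB.LIB ; Import library made by IMPLIB
-
- VioWrtCStr PROTO FAR PASCAL, pchString:FAR PTR BYTE, hv:WORD
- GetGCount PROTO FAR PASCAL
-
- .STACK
- .DATA
- szMsg BYTE "Hello from the client", 13, 10, 0
-
- .CODE
- .STARTUP
- INVOKE VioWrtCStr, ADDR szMsg, 0 ; Write through the DLL
- INVOKE GetGCount ; AX = calls made by all processes
- .EXIT 0 ; A client may use .EXIT; the DLL may not
- END
- ```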
-
- By running CALLDLL.EXE in two separate OS/2 windows, you can see that both
- client programs access the DLL at the same time. When the last client
- process exits, the DLL is removed from memory.
-
-
- 18.5 Related Topics in Online Help
-
- In addition to information covered in this chapter, information on the
- following topics can be found in online help.
-
- Topic                       Access
- ────────────────────────────────────────────────────────────────────────────
- LINK                        From the "Microsoft Advisor Contents"
-                             screen, select LINK
-
- Module-definition files     Select Module-Definition Files from the
-                             "LINK Contents" screen
-
- EXPORT                      Select from the MASM Language Index
-
- EXTERNDEF                   From the "MASM 6.0 Contents" screen,
-                             select "Directives"; then select "Scope
-                             and Visibility" from the next screen
-
- LOADDS, FORCEFRAME          Choose "Proc" from the MASM Language
-                             Index
-
- IMPLIB                      Select "IMPLIB Summary" from the "LINK
-                             Contents" screen
-
-
-
-
-
-
- Chapter 19 Writing Memory-Resident Software
- ────────────────────────────────────────────────────────────────────────────
-
- Through its memory-management system, DOS allows a program to remain
- resident in memory after terminating. The resident program can later regain
- control of the processor to perform tasks such as background printing or
- "popping up" a calculator on the screen. Such a program is commonly called a
- TSR, from the Terminate-and-Stay-Resident function it uses to return to DOS.
-
-
- This chapter explains the techniques of writing memory-resident software.
- The first two sections present introductory material. Following sections
- describe important DOS and BIOS interrupts and focus on how to write safe,
- compatible, memory-resident software. Two example programs illustrate the
- techniques described in the chapter. These programs are also available as
- sample programs on the MASM 6.0 disks.
-
-
- 19.1 Terminate-and-Stay-Resident Programs
-
- DOS maintains a pointer to the beginning of unused memory. Programs load
- into memory at this position. They terminate execution by returning control
- to DOS. Normally, the pointer remains unchanged, allowing DOS to reuse
- memory when loading other programs.
-
- A terminating program can, however, prevent other programs from loading on
- top of it. It does this by returning to DOS through the
- terminate-and-stay-resident function, which resets the free-memory pointer
- to a higher position. This leaves the program resident in a protected block
- of memory, even though it is no longer running.
-
- The terminate-and-stay-resident function (Function 31h) is one of the DOS
- services invoked through Interrupt 21h. The following fragment shows how a
- TSR program terminates using Function 31h and remains resident in a
- 1000h-byte block of memory:
-
- mov ah, 31h ; Request DOS Function 31h
- mov al, err ; Set return code
- mov dx, 100h ; Reserve 100h paragraphs
- ; (1000h bytes)
- int 21h ; Terminate-and-stay-resident
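-
- The fragment above reserves a fixed 100h paragraphs. As an illustrative
- sketch, a TSR can instead compute the size from a label placed at the end
- of its resident section. This version assumes a tiny-model (.COM) program,
- where offsets are measured from the start of the PSP, and a resident
- section small enough that the rounding step cannot overflow 16 bits. The
- label EndResident is hypothetical.
-
- ```asm
- mov dx, OFFSET EndResident ; First byte past resident section
- add dx, 15 ; Round up to a whole paragraph
- mov cl, 4
- shr dx, cl ; Bytes / 16 = paragraphs
- mov ax, 3100h ; Function 31h, return code 0
- int 21h ; Terminate-and-stay-resident
- ```
-
- Rounding up matters: a resident section of 1001h bytes, for example, needs
- 101h paragraphs, not 100h.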
-
- ────────────────────────────────────────────────────────────────────────────
- NOTE
-
- In current versions of DOS, Interrupt 27h also provides a
- terminate-and-stay-resident service. However, Microsoft cannot guarantee
- future support for Interrupt 27h and does not recommend its use.
- ────────────────────────────────────────────────────────────────────────────
-
-
- 19.1.1 Structure of a TSR
-
- TSRs consist of two distinct parts that execute at different times. The
- first part is the installation section, which executes only once, when DOS
- loads the program. The installation code performs any initialization tasks
- required by the TSR and then exits through the terminate-and-stay-resident
- function.
-
- A TSR consists of an installation section and a resident section.
-
- The second part of the TSR, called the resident section, consists of code
- and data left in memory after termination. Though often identified with the
- TSR itself, the resident section makes up only part of the entire program.
-
- The TSR's resident code must be able to regain control of the processor and
- execute after the program has terminated. Methods of executing a TSR are
- classified as either passive or active.
-
-
- 19.1.2 Passive TSRs
-
- The simplest way to execute a TSR is to transfer control to it explicitly
- from another program. Because the TSR in this case does not solicit
- processor control, it is said to be passive. If the calling program can
- determine the TSR's memory address, it can grant control via a far jump or
- call. More commonly, a program activates a passive TSR through a software
- interrupt. The installation section of the TSR writes the address of its
- resident code to the proper position in the interrupt vector table (see
- Section 7.4, "DOS Interrupts"). Any subsequent program can then execute the
- TSR by calling the interrupt.
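-
- A minimal sketch of this arrangement follows. It assumes the resident
- entry point, here given the hypothetical name PassiveEntry, lies in the
- segment addressed by CS during installation, and it arbitrarily uses
- Interrupt 60h as an example vector number. Interrupt 21h Function 25h (Set
- Interrupt Vector) expects the vector number in AL and the handler address
- in DS:DX.
-
- ```asm
- PassiveEntry PROC FAR ; Resident code (hypothetical)
- ; ... perform the TSR's service here ...
- iret ; Return to the calling program
- PassiveEntry ENDP
-
- ; In the installation section:
- push ds ; Preserve DS
- push cs
- pop ds ; DS = segment of PassiveEntry
- mov dx, OFFSET PassiveEntry ; DS:DX = address of resident code
- mov ax, 2560h ; Function 25h, vector 60h
- int 21h ; Write address to vector table
- pop ds ; Restore DS
- ```
-
- Any later program can then run the resident code with the instruction
- int 60h. A careful installer would first confirm that the chosen vector is
- not already in use.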
-
- Passive TSRs often replace existing software interrupts. For example, a
- passive TSR might replace Interrupt 10h, the BIOS video service. By
- intercepting calls that read or write to the screen, the TSR can access the
- video buffer directly, increasing display speed.
-
- Passive TSRs allow limited access since they can be invoked only from
- another program. They have the advantage of executing within the context of
- the calling program, and thus run no risk of interfering with another
- process, which could happen with active TSRs.
-
-
- 19.1.3 Active TSRs
-
- The second method of executing a TSR involves signaling it through some
- hardware event, such as a predetermined sequence of keystrokes. This type of
- TSR is called active because it must continually search for its start-up
- signal. The advantage of active TSRs lies in their accessibility. They can
- take control from any running application, execute, and return, all on
- demand.
-
- An active TSR, however, must not seize processor control blindly. It must
- contain additional code that determines the proper moment at which to
- execute. The extra code consists of one or more routines called "interrupt
- handlers," described in the following section.
-
-
- 19.2 Interrupt Handlers in Active TSRs
-
- The memory-resident portion of an active TSR consists of two parts. One part
- contains the body of the TSR─the code and data that perform the program's
- main tasks. The other part contains the TSR's interrupt handlers.
-
- An interrupt handler is a routine that takes control when a specific
- interrupt occurs. Although sometimes called an "interrupt service routine,"
- a TSR's handler usually does not service the interrupt. Instead, it passes
- control to the original interrupt routine, which does the actual interrupt
- servicing.
-
- Collectively, interrupt handlers ensure that a TSR operates compatibly with
- the rest of the system. Individually, each handler fulfills at least one of
- the following functions:
-
-
- ■ Auditing hardware events that may signal a request for the TSR
-
- ■ Monitoring system status
-
- ■ Determining whether a request for the TSR should be honored, based on
- current system status
-
-
-
- 19.2.1 Auditing Hardware Events for TSR Requests
-
- Active TSRs commonly use a special keystroke sequence or the timer as a
- request signal. A TSR invoked through one of these channels must be equipped
- with handlers that audit keyboard or timer events.
-
- A keyboard handler receives control at every keystroke. It examines each
- key, searching for the proper signal or "hot key." Generally, a keyboard
- handler should not attempt to call the TSR directly when it detects the hot
- key. If the TSR cannot safely interrupt the current process at that moment,
- the keyboard handler is forced to exit to allow the process to continue.
- Since the handler cannot regain control until the next keystroke, the user
- has to press the hot key repeatedly until the handler can comply with the
- request.
-
- Instead, the handler should merely set a request flag when it detects a
- hot-key signal and then exit normally. Examples in the following paragraphs
- illustrate this technique.
-
- For computers other than the IBM PS/2 (R) series, an active TSR audits
- keystrokes through a handler for Interrupt 09, the keyboard interrupt:
-
- Keybrd PROC FAR
- sti ; Interrupts are okay
- push ax ; Save AX register
- in al, 60h ; AL = scan code of current key
- call CheckHotKey ; Check for hot key
- .IF !carry? ; If hot key:
- ; Hot key pressed. Reset the keyboard to throw away keystroke.
- cli ; Disable interrupts while resetting
- in al, 61h ; Get current port 61h state
- or al, 10000000y ; Turn on bit 7 to signal clear keybrd
- out 61h, al ; Send to port
- and al, 01111111y ; Turn off bit 7 to signal break
- out 61h, al ; Send to port
- mov al, 20h ; Reset interrupt controller
- out 20h, al
- sti ; Reenable interrupts
- pop ax ; Recover AX
- mov cs:TsrRequestFlag, TRUE ; Raise request flag
- iret ; Exit interrupt handler
- .ENDIF ; End hot-key check
-
- ; No hot key was pressed, so let normal Int 09 service
- ; routine take over
-
- pop ax ; Recover AX and fall through
- cli ; Interrupts cleared for service
- KeybrdMonitor LABEL FAR ; Installed as Int 09 handler for
- ; PS/2 or for time-activated TSR
- ; Signal that interrupt is busy
- mov cs:intKeybrd.Flag, TRUE
- pushf ; Simulate interrupt by pushing flags,
- ; far-calling old Int 09 routine
- call cs:intKeybrd.OldHand
- mov cs:intKeybrd.Flag, FALSE
- iret
- Keybrd ENDP
-
- A TSR running on a PS/2 computer cannot reliably read key-scan codes using
- the above method. Instead, the TSR must search for its hot key through a
- handler for Interrupt 15h (Miscellaneous System Services). The handler
- determines the current keypress from the AL register when AH equals 4Fh, as
- shown here:
-
- MiscServ PROC FAR
- sti ; Interrupts okay
- .IF ah == 4Fh ; If Keyboard Intercept Service:
- push ax ; Preserve AX
- call CheckHotKey ; Check for hot key
- pop ax
- .IF !carry? ; If hot key:
- mov cs:TsrRequestFlag, TRUE ; Raise request flag
- clc ; Signal BIOS not to process the key
- ret 2 ; Simulate IRET without popping flags
- .ENDIF ; End carry flag check
- .ENDIF ; End Keyboard Intercept check
- cli ; Disable interrupts and fall through
- SkipMiscServ LABEL FAR ; Interrupt 15h handler if PC/AT
- jmp cs:intMisc.OldHand
- MiscServ ENDP
-
- The example program in Section 19.8 demonstrates how a TSR tests for a PS/2
- machine and then sets up a handler for either Interrupt 09 or Interrupt 15h
- to audit keystrokes.
-
- Setting a request flag in the keyboard handler allows other code, such as
- the timer handler (Interrupt 08), to recognize a request for the TSR. The
- timer handler gains control at every timer interrupt; the interrupts occur
- an average of 18.2 times per second. The following fragment shows how a
- timer handler tests the request flag and continually polls until it can
- safely execute the TSR.
-
- TestFlag PROC FAR
- .
- .
- .
- cmp TsrRequestFlag, FALSE ; Has TSR been requested?
- je exit ; If not, exit
- call CheckSystem ; Can system be interrupted
- ; safely?
- jc exit ; If not, exit
- call ActivateTsr ; If okay, call TSR
-
- Figure 19.1 illustrates the process. It shows a time line for a typical TSR
- signaled from the keyboard. When the keyboard handler detects the proper hot
- key, it sets a request flag called TsrRequestFlag. Thereafter, the timer
- handler continually checks the system status until it can safely call the
- TSR.
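-
- Filling in the fragment above, a complete timer handler might be sketched
- as follows. The names TimerHandler and OldTimer are hypothetical; OldTimer
- is assumed to be a DWORD holding the original Interrupt 08 vector, and
- TsrRequestFlag is assumed to be a byte in the code segment. The sketch
- chains to the original routine first, so that the timer hardware is
- serviced before the TSR is considered, and clears the request flag once the
- request is honored.
-
- ```asm
- TimerHandler PROC FAR
- pushf ; Simulate interrupt by pushing flags,
- call cs:OldTimer ; far-calling old Int 08 routine
- .IF cs:TsrRequestFlag != FALSE ; Has TSR been requested?
- call CheckSystem ; Can system be interrupted safely?
- .IF !carry? ; If so:
- mov cs:TsrRequestFlag, FALSE ; Request is being honored
- call ActivateTsr ; Call TSR
- .ENDIF
- .ENDIF
- iret ; Return from interrupt
- TimerHandler ENDP
- ```
-
- CheckSystem and ActivateTsr are assumed to preserve all registers, since
- the registers still belong to the interrupted program.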
-
- The timer itself can serve as the start-up signal if the TSR executes
- periodically. Screen clocks that continuously show seconds and minutes are
- examples of TSRs that use the timer this way. ALARM.ASM, a program described
- in Section 19.3, shows another example of a timer-driven TSR.
-
- (This figure may be found in the printed book.)
-
-
- 19.2.2 Monitoring System Status
-
- A TSR that uses a hardware device such as the video or disk must not
- interrupt while the device is active. A TSR monitors a device by handling
- the device's interrupt. Each interrupt handler need only set a flag to
- indicate that the device is in use and then clear the flag when the
- interrupt finishes.
-
- The following shows a typical monitor handler:
-
- NewHandler PROC FAR
- mov ActiveFlag, TRUE ; Set active flag
- pushf ; Simulate interrupt by
- ; pushing flags,
- call OldHandler ; then calling original routine
- mov ActiveFlag, FALSE ; Clear active flag
- iret ; Return from interrupt
- NewHandler ENDP
-
- Only hardware used by the TSR requires monitoring. For example, a TSR that
- performs disk input/output (I/O) must monitor disk use through Interrupt
- 13h. The disk handler sets an active flag that prevents the TSR from
- executing during a read or write operation. Otherwise, the TSR's own I/O
- would move the disk head. This would cause the suspended disk operation to
- continue with the head incorrectly positioned when the TSR returned control
- to the interrupted program.
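-
- One detail is worth sketching for a disk monitor. The Interrupt 13h
- services report errors through the carry flag, and IRET would replace the
- flags returned by the BIOS with the caller's saved flags. A monitor can
- return with RET 2 instead, which discards the saved flags and keeps the
- current ones. In the sketch below, DiskMonitor, OldDisk, and DiskActiveFlag
- are illustrative names; OldDisk is assumed to be a DWORD holding the
- original Interrupt 13h vector, and the flag is assumed to be a byte in the
- code segment.
-
- ```asm
- DiskMonitor PROC FAR
- mov cs:DiskActiveFlag, TRUE ; Set active flag
- pushf ; Simulate interrupt by pushing flags,
- call cs:OldDisk ; far-calling old Int 13h routine
- mov cs:DiskActiveFlag, FALSE ; MOV leaves the flags unchanged
- sti ; Saved flags are not restored, so
- ; reenable interrupts explicitly
- ret 2 ; Far return, discarding saved flags
- DiskMonitor ENDP
- ```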
-
- In the same way, an active TSR that displays to the screen must monitor
- calls to Interrupt 10h. The Interrupt 10h BIOS routine does not protect
- critical sections of code that program the video controller. The TSR must
- therefore ensure that it does not interrupt such nonreentrant operations.
-
- The activities of the operating system also affect the system status. With
- few exceptions, DOS functions are not reentrant and must not be interrupted.
- However, monitoring DOS is somewhat more complicated than monitoring
- hardware. Discussion of this subject is deferred until Section 19.4.
-
- The following comments describe the chain of events depicted in Figure 19.1.
- Each comment refers to one of the numbered pointers in the figure.
-
-
- 1. At time = t, the timer handler activates. It finds the flag
- TsrRequestFlag clear, indicating that the TSR has not been requested.
- The handler terminates without taking further action. Notice that
- Interrupt 13h is currently processing a disk I/O operation.
-
- 2. Before the next timer interrupt, the keyboard handler detects the hot
- key, signaling a request for the TSR. The handler sets
- TsrRequestFlag and returns.
-
- 3. At time = t + 1/18 second, the timer handler again activates and finds
- TsrRequestFlag set. The handler checks other active flags to
- determine if the TSR can safely execute. Since Interrupt 13h has not
- yet completed its disk operation, the timer handler finds
- DiskActiveFlag set. The handler therefore terminates without invoking
- the TSR.
-
- 4. At time = t + 2/18 second, the timer handler again finds
- TsrRequestFlag set and repeats its scan of the active flags.
- DiskActiveFlag is now clear, but in the interim, Interrupt 10h has
- activated as indicated by the flag VideoActiveFlag. The timer handler
- accordingly terminates without invoking the TSR.
-
- 5. At time = t + 3/18 second, the timer handler repeats the process. This
- time it finds all active flags clear, indicating that the TSR may
- safely execute. The timer handler calls the TSR, which sets its own
- active flag to ensure that it will not interrupt itself if requested
- again.
-
- 6. The timer and other interrupts continue to function normally while the
- TSR executes.
-
-
-
- 19.2.3 Determining Whether to Invoke the TSR
-
- Once a handler receives a request signal for the TSR, it checks the various
- active flags maintained by the handlers that monitor system status. If any
- of the flags are set, the handler ignores the request and exits. If the
- flags are clear, the handler invokes the TSR, usually through a near or far
- call. Figure 19.1 illustrates how a timer handler detects a request and then
- periodically scans various active flags until all the flags are clear.
-
- A TSR that changes stacks must not interrupt itself. Otherwise, the second
- execution would overwrite the stack data belonging to the first. A TSR
- prevents this by setting its own active flag before executing, as shown in
- Figure 19.1. A handler must check this flag along with the other active
- flags when determining whether the TSR can safely execute.
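-
- The flag scan can be sketched as a small procedure that returns with the
- carry flag set when the TSR must wait, matching the way CheckSystem is used
- in Section 19.2.1. The flag names are assumptions (DiskActiveFlag and
- VideoActiveFlag follow the comments on Figure 19.1; TsrActiveFlag stands
- for the TSR's own flag). Each is assumed to be a byte in the code segment,
- and FALSE is assumed to equal zero.
-
- ```asm
- CheckSystem PROC NEAR
- push ax ; Preserve AX
- mov al, cs:DiskActiveFlag ; Combine all active flags
- or al, cs:VideoActiveFlag
- or al, cs:TsrActiveFlag ; OR clears carry; zero flag is
- ; set only if every flag is clear
- pop ax ; POP leaves the flags unchanged
- .IF !zero? ; If any flag was set:
- stc ; Signal "not safe"
- .ENDIF
- ret ; Carry clear means safe to run
- CheckSystem ENDP
- ```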
-
-
- 19.3 Example of a Simple TSR: ALARM
-
- This section presents a simple alarm clock TSR that demonstrates some of the
- material covered so far. The program accepts an argument from the command
- line that specifies the alarm setting in military form, such as 1635 for
- 4:35 P.M. For the sake of simplicity, the argument must consist of four
- digits, including leading zeros. To set the alarm at 7:45 A.M., for example,
- enter:
-
- ALARM 0745
-
- The installation section of the program begins with the Install procedure.
- Install computes the number of five-second intervals that must elapse
- before the alarm sounds and stores this number in the word CountDown. The
- procedure then obtains the vector for Interrupt 08 (timer) through Interrupt
- 21h Function 35h and stores it in the far pointer OldTimer. Interrupt 21h
- Function 25h replaces the vector with the far address of the new timer
- handler NewTimer. Once installed, the new timer handler executes at every
- timer interrupt. These interrupts occur about 18.2 times per second, or 91
- times every five seconds.
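-
- The conversion to five-second intervals is simple arithmetic: each minute
- contains 60 / 5 = 12 intervals. As a sketch only, and assuming AX already
- holds the number of minutes between the current time and the alarm time
- (how Install obtains that value is not shown here), the count could be
- computed like this:
-
- ```asm
- mov bx, 12 ; 12 five-second intervals per minute
- mul bx ; DX:AX = minutes * 12 (DX is overwritten)
- mov CountDown, ax ; At most 1,439 * 12 = 17,268; fits a word
- ```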
-
- Each time it executes, NewTimer subtracts one from a secondary counter
- called Tick91. By counting 91 timer ticks, Tick91 accurately measures a
- period of five seconds. When Tick91 reaches zero, it's reset to 91 and
- CountDown is decremented by one. When CountDown reaches zero, the alarm
-