Text File | 1997-06-03 | 19.4 KB | 546 lines

80x86 assembly language BCD math
--------------------------------

Contents:
  Copyright and disclaimer
  General considerations
  BCD formats, signs, min & max
  BCD-related CPU instructions
  Programming notes
    - General notes
    - Interfacing and user responsibilities
    - Assembly formalities
    - Procedural formalities
    - Procedure interrelations
    - Algorithms
    - BCDASM equates and macros list
  Credits
  Related material


Copyright and disclaimer
------------------------

This text and the accompanying assembly source files and binary files ("BCDASM") are copyright Morten Elling, 1997.

You're free to use BCDASM for educational purposes and in software (freeware, shareware, or commercial) but use it at your own risk. BCDASM is offered without any guarantee.

You're free to distribute the BCDASM file archive through any media at your disposal on condition that the contents of the file archive are not modified.

June, 1997
Morten Elling
Ellemarksvej 12
DK-8000 Aarhus C
Denmark
<mailto:elling@post1.tele.dk>


General considerations
----------------------

Numbers in business applications software must be large and precise. Accounting books must balance so floating-point math and its potential for rounding errors is insufficient.

Performing business math in the CPU registers is OK if you can make do with numbers in the range +21,474,836.47 to -21,474,836.48 (32 bit). Inflation-ridden nations, large companies, and Italian car dealers need bigger numbers.

BCD math is basic in-memory integer math that lets you handle numbers of considerable sizes, typically 18 digits but with the capacity for several thousand digits. By tradition, BCDs are used for business math and BCD schemes are supported by several computers. All Intel's iAPx86 chips (PCs) execute BCD instructions.

Numbers in BCD format are not as storage-efficient as numbers in binary and they don't process quite as fast as their binary cousins (additional CPU instructions are required to adjust results).
But being the sole base 10 format in an otherwise completely hexadecimal and octal world, BCDs do offer some advantages. Base 10 feels natural to humans (at least until genetic engineering equips us with 16 fingers). BCDs are easy to debug and convert to and from Ascii, and are easily divided or multiplied by powers of 10.

BCDs are integers. Other than supplying a routine that will display any number of decimals, BCDASM supports only integer operations. This doesn't mean that you cannot use BCDs for floating-point operations; it's unusual, though, and it will require extra programming but it may pay off if your primary concern is multi-digit results without rounding errors. With BCDs, keeping track of the decimal point is easier than with binary numbers.

If you are programming for a PC equipped with an FPU (floating-point unit), consider using it. This requires the use of 10-byte packed BCDs which is the only BCD format recognized by the FPU. It also requires a conversion of the BCD numbers to and from the FPU's temporary real format. None of the BCDASM routines use FPU instructions since they all use a caller-supplied BCD size.


BCD formats
-----------

BCD stands for Binary Coded Decimal. BCD numbers are made up from the decimal numbers 0 thru 9 represented in binary:

  Dec  Hex  Binary
   0   00h  0000
   1   01h  0001
   2   02h  0010
   3   03h  0011
   4   04h  0100
   5   05h  0101
   6   06h  0110
   7   07h  0111
   8   08h  1000
   9   09h  1001

BCD numbers can be packed (the usual) storing two digits per byte (the most-significant digit in the high-order 4 bits), or un-packed storing one digit per byte (leaving the high-order 4 bits unused). The low-order 4 bits of a byte are also called the low nibble, the high-order 4 bits the high nibble.

Packed BCD is sometimes called 'packed decimal' or 'decimal integer'. Unpacked BCD is sometimes called 'unpacked decimal' or 'ASCII integer'.

The decimal number 8 is represented as 08h in packed BCD, and 08h in unpacked BCD.
The decimal number 28 is 28h in packed BCD (but 1Ch in binary) and would require two unpacked BCD bytes.

Assemblers usually have limited support for BCD handling: the DT directive which defines a ten-byte portion of storage, and the TBYTE operator which can create a pointer to a ten-byte. In Borland's Turbo Assembler (TASM), the DT directive defaults to packed BCD format but may be used for other data types as well; there is no assembler directive to support unpacked BCD.

Examples (TASM.EXE v4.1):

  Pnum  dt 81659247               ; Packed BCD (10 bytes)
  Unum  db 7,4,2,9,5,6,1,8,0,0
        db 0,0,0,0,0,0,0,0,0,0    ; Unpacked BCD (20 bytes)
  Dnum  dt 81659247d              ; 'd' postfix for decimal number
  Nnum  dt -81659247              ; Negative packed BCD

The numbers above are stored in memory as (hexadecimal):

  Pnum  47 92 65 81 00 00 00 00 00 00
  Unum  07 04 02 09 05 06 01 08 00 00
        00 00 00 00 00 00 00 00 00 00
  Dnum  6F 05 DE 04 00 00 00 00 00 00
  Nnum  47 92 65 81 00 00 00 00 00 80   ; Top bit = 1

i.e. the usual Intel little-endian format (the lower in memory, the less significant).

NOTE: Borland's 32-bit assemblers (TASM32 v4.0, v5.0) do not recognize the unary minus operator when BCDs are initialized with the DT directive, i.e. only non-negative BCD values can be initialized. Borland's 16-bit assemblers do, however, accept negative and zero BCD values with DT.

The term 'most significant byte' is avoided here and in the source files, in the interest of clarity. Logically, which is 'most significant' in the variable Nnum above: the 9th byte, the 8th, or the 3rd? It depends on what you're considering: the BCD format, the numeric range, or the current variable.

The sign bit (bit 79 of a tbyte) is 1 for negative numbers, 0 for positive or zero. The whole top _byte_ is used for the sign, i.e. 00h or 80h (limiting the number range somewhat, but making BCDs easier to program). To change a BCD from positive to negative, only the top bit is changed.
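
The packed, signed layout described here (two digits per byte, least significant byte first, top byte used only for the sign) can be sketched in C. This is an illustration only: packed_bcd_store is a hypothetical helper name, not a BCDASM routine, and it assumes the value fits in a long long.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not a BCDASM routine: store 'value' as a packed,
   signed BCD of 'size' bytes, least significant byte first.
   Bytes 0..size-2 hold two decimal digits each (high nibble = the more
   significant digit); the top byte holds only the sign, 00h or 80h.
   Digits that do not fit in (size-1)*2 positions are silently dropped. */
static void packed_bcd_store(uint8_t *dst, size_t size, long long value)
{
    unsigned long long mag;
    size_t i;

    assert(dst != NULL && size >= 2);

    /* Take the magnitude in unsigned arithmetic so that the most
       negative long long does not overflow. */
    mag = (value < 0) ? 0ULL - (unsigned long long)value
                      : (unsigned long long)value;

    for (i = 0; i + 1 < size; i++) {
        unsigned lo = (unsigned)(mag % 10);   /* less significant digit */
        mag /= 10;
        unsigned hi = (unsigned)(mag % 10);   /* more significant digit */
        mag /= 10;
        dst[i] = (uint8_t)((hi << 4) | lo);
    }
    dst[size - 1] = (value < 0) ? 0x80 : 0x00;  /* sign byte */
}
```

Storing 81659247 into a 10-byte buffer with this helper yields 47 92 65 81 followed by six zero bytes, and storing -81659247 yields the same digit bytes with 80h in the top byte, which matches the Pnum and Nnum memory images listed in the text.
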
Note that the FPU looks only at the sign bit in the top byte of a tbyte; the remaining bits of that byte (bits 72-78) are ignored.

Assuming the top byte is used for the sign, we're left with (10 - 1) * 2 = 18 digits for the number, hence the range of a 10-byte packed signed BCD is:

  minimum  -999,999,999,999,999,999
  maximum  +999,999,999,999,999,999

or - since BCDs are primarily used for business calculations -

  +/-$9,999,999,999,999,999.99

(almost 10 quadrillions in USA and France; 10,000 billions elsewhere. Anyway, it was a lot of money in 1976).

NOTE: The 10-byte packed signed BCD described above is the basic BCD type supported by the floating-point unit (80x87), and it's the one generally used in assembly source files. However, BCDs come in many flavors. Examples include:
- 8-byte packed BCDs which fit better into the 80386's 32-bit registers
- A 6-byte packed BCD with 47 bits used for the number and 1 bit for the sign (more storage-efficient).
- A 12-byte unpacked BCD (easy Ascii-conversion)
- BCDs stored big-endian, even on Intel machines. The numbers are easier to read this way when debugging, but perhaps they originated from a non-PC computer.


BCD-related CPU and FPU instructions    (details in a separate file)
------------------------------------

The following members of the 8086 instruction set were designed to support BCD math:

- Packed BCD (affects the AL and flags registers)
    DAA    Decimal Adjust after Addition
    DAS    Decimal Adjust after Subtraction
- Un-packed BCD (affects the AL, AH, and flags registers)
    AAA    ASCII Adjust after Addition
    AAS    ASCII Adjust after Subtraction
    AAM    ASCII Adjust after Multiplication
    AAD    ASCII Adjust before Division

Two FPU instructions support ten-byte BCD numbers:

- Packed BCD
    FBLD   Load packed decimal
    FBSTP  Store packed decimal and pop


Programming notes
-----------------

--- General notes

BCDASM supports C, (16-bit) Pascal, and assembly language.
The source files include routines that perform basic math, comparison, and conversion of packed, signed BCD numbers (files bcd*.asm). A single module demonstrates operations on unpacked, unsigned BCDs (file bcduu.asm).

The application programming interface (API) of each procedure (function) is described in detail in the procedure's source file header. The headers have been extracted to the bcdapi.txt file which serves as the BCDASM API reference. Similarly, the algorithms used in BCDASM are described in comments embedded in the source files. In this section, therefore, you'll find a description of some general topics related to BCDASM, but not of the API.

--- Interfacing and user responsibilities

All users must ensure that BCDASM routines run on an 80186 processor or compatible. No other initialization or shutdown steps are required. BCDASM code does nothing but move bytes and bits in memory; no operating system calls, interrupts, port or file I/O.

Assembly language users should use a .MODEL statement with a STDCALL, C, or PASCAL language specifier and include the bcd.asi header file in modules that call BCDASM procedures. (Note that a few procedures return carry, sign, or zero flags but the comparison and shift routines return other flags than you may expect from their names.)

C language users must include the bcd.h header file in source files referencing BCDASM functions.

Both assembly language and C language users should link with the appropriate BCDASM library file: bcd????.lib where ???? is a memory model code. Run makelib.bat without parameters to see a list of available memory models. Reassembly of the source code for 16-bit models requires TASM v3.2 or later, whereas the 32-bit flat model source code requires TASM v4.0 or later.

Turbo Pascal users must reference the unit BCD (compiled as bcd.tpu) in a USES statement to access BCDASM procedures and functions (16-bit far code).
Note that strings returned by BCDASM are not Pascal-format but zero-terminated, C-style; for convenience, the WriteZStr function is included in the unit file.

BCDASM code typically adds 2-3 KB to the size of an executable (max. 4 KB if all routines are linked), and the stack usage is max. 100 bytes (size of passed arguments and local variables).

All arguments passed to BCDASM routines must be correct; error checking is minimal. In particular, pointers must be valid and the byte size of BCD variables must be even and obey the following limits. (In short: if you keep your BCD size even, and below 24 KB - enough to hold 49,000+ digits - you're safe.)

  16-bit limits:  Min. 4 bytes
                  Max. 0FFFCh bytes (65,532 decimal)
                  Max. 07FFEh bytes (32,766 decimal) (*)
  32-bit limits:  Min. 4 bytes
                  Max. 0FFFFFFFCh bytes = approx. 4 GigaBytes
                  Max. 07FFFFFFEh bytes = approx. 2 GigaBytes (*)

  (*) Half-size operands required for sign extension, multiplication, and division; bcdSx, bcdImul, and bcdIdiv return results that are twice the size of the source operand.

Although BCDASM can handle the maximum sizes, your operating system (or your patience) may not. Divide the max. size figures by 3 or 4 to get the practical limit imposed by the bcdFmt routine (BCD-to-Ascii).

If necessary, users must devise a method to initialize BCD variables in software, typically by manipulating an array of bytes. Assembly language users have the option of using the DT directive when initializing variables.

All BCDASM modules assume that bit 7 of the top byte of a BCD number holds the sign and that bits 0-6 of the top byte are undefined.

--- Assembly formalities

The assembly source files were written for Borland's Turbo Assembler v4.1 (MASM syntax). In order to keep symbol names case-sensitive, TASM's /ml switch must be used, and include files must be pointed to using the /i switch. No other switches are needed since the source code has been arranged to allow TASM's one-pass assembly.
You can, however, pass the desired memory model and processor to TASM, e.g.

  cd .\src
  tasm /ml/i..\include\ /dMDL=16sp /dCPU=.8086 *.asm, ..\obj\;

Refer to the model.inc file for these definitions; the default model is <large, pascal> and the default processor is <.186>. Except for bcd.asi, no include file contains an 'include' statement.

Three features are used that require a recent version of TASM:
- prototyping (PROCDESC, extended CALL)       (TASM v3.2+)
- enhanced macro facilities (VARARG, REST)    (TASM v3.0+)
- model FLAT                                  (TASM v4.0+)

Unless you want the 32-bit flat model, it is fairly simple to rewrite BCDASM for an earlier version of TASM (and slightly less simple to rewrite it for Microsoft's assembler). After two years with Ideal mode syntax, I'm back to MASM syntax but kept the good habit of bracketing memory references.

TASM's built-in high-level language support (PROC, USES, ARG, LOCAL, RET, ENDP statements) is used extensively throughout the source code. While it isn't necessary to use a .MODEL statement to make use of the HLL or prototyping features, it does make things somewhat easier:
- canned segments, available with .CODE, .DATA etc.
- language and code distance specifiers automatically transferred to PROC, PROCDESC, PUBLIC, EXTRN etc.
- predefined CODEPTR and DATAPTR pointer types as well as several model-related equates
- predefined .STARTUP and .EXIT macros available
- group override with OFFSET operator not needed

Despite TASM's user guide, ARG, LOCAL, and DATAPTR don't work with 32-bit code unless used with a .MODEL FLAT statement; without it, args/locals are equated relative to BP, not EBP (grrrrr...).

--- Procedural formalities

Procedure entry and exit conditions are described in detail in each procedure header (see bcdapi.txt). Most procedures return information in the accumulator (AX/EAX) and -- for the benefit of .asm users -- a few return flags.

In the 16-bit memory models, no assumptions are made about the ES register or direction flag on entry to a procedure. In the models that use near data pointers (tiny, small, and medium), DS is assumed to be = SS; in all other models, DS register contents are unimportant to BCDASM routines (no variables in DGROUP). Registers are preserved according to the requirements of the selected model language (refer to the @uses macro in modelt.inc). BCDASM supports C, STDCALL, and PASCAL. If the direction flag is used, it is always cleared on exit.

In the 32-bit flat model, Win32 rules apply: DS = ES = SS, direction flag clear on entry and on exit. EAX, ECX, and EDX registers may be modified, the rest are preserved. Calling convention is STDCALL.

Assembly language users who want 16-bit 'flat' functionality (near data models) can achieve this by equating @isDSeqESeqSS to one (1); it'll save ES loads and ES overrides. Equating @isDirectionUp to one (1) saves redundant CLD instructions. See modelt.inc for these equates.

--- Procedure interrelations

No 'internal' calls; each module is self-contained.

--- Algorithms

Basic multi-digit number-processing all the way, using LODS, STOS, and LOOP instructions. BCD multiplication is slow because this operation involves heavy use of MUL and AAM instructions which are among the very slowest on an 8086 but part of the BCD scheme. BCD division is similarly slow due to a divide-by-repeated-subtraction approach.

I've emphasized modularity and clarity over speed during the development since multi-digit in-memory calculations are used primarily for their capacity. For much the same reasons, I've used equates (rax, rbx etc.) for the actual register names (uppercase AX, BX etc. is used for explicit 16-bit operations; lowercase eax, ebx etc. for 32-bit). Attempting to write assembly for both 16-bit and 32-bit like this is definitely _not_ to be recommended but I wanted to try it out for this occasion. True 80386+ support is rudimentary.
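
To picture the byte-at-a-time processing mentioned under Algorithms, here is a minimal C sketch of adding the digit bytes of two packed BCD numbers. It is not taken from the BCDASM sources: the function names are hypothetical, signs are ignored (only magnitudes are added), both operands are assumed to hold valid BCD digits and to have the same length, and the per-byte step works digit by digit rather than emulating the ADC/DAA instruction pair exactly. For valid packed BCD input it gives the same byte and carry that add-with-carry followed by a decimal adjust would.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: add two packed BCD bytes plus a carry-in (0 or 1).
   Returns the adjusted packed BCD byte and updates *carry with the
   decimal carry-out. Assumes both nibbles of a and b are 0..9. */
static uint8_t bcd_adc_byte(uint8_t a, uint8_t b, unsigned *carry)
{
    unsigned lo = (a & 0x0Fu) + (b & 0x0Fu) + *carry;  /* low digits  */
    unsigned hi = (a >> 4) + (b >> 4);                  /* high digits */

    if (lo > 9) { lo -= 10; hi++; }      /* carry from low to high digit */
    if (hi > 9) { hi -= 10; *carry = 1; }
    else        { *carry = 0; }

    return (uint8_t)((hi << 4) | lo);
}

/* Hypothetical helper: dst += src over n digit bytes, least significant
   byte first. The sign byte is not part of n and is left untouched.
   Returns the final carry (1 means the sum did not fit in n bytes). */
static unsigned bcd_add_magnitude(uint8_t *dst, const uint8_t *src, size_t n)
{
    unsigned carry = 0;
    size_t i;

    assert(dst != NULL && src != NULL);

    for (i = 0; i < n; i++)
        dst[i] = bcd_adc_byte(dst[i], src[i], &carry);

    return carry;
}
```

For example, adding the five digit bytes 53 07 34 18 00 (18340753) to 47 92 65 81 00 (81659247) leaves 00 00 00 00 01 in the destination, i.e. 100000000, with no carry out.
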

BCDASM leaves room for improvement; grep for "ToDo" to locate a few places.

--- BCDASM equates and macros list

Defined by TASM:                       Defined or redefined by:
- @Model, @Interface, @CodeSize,
  @DataSize, DATAPTR                   .model directive
- @data, @stack                        .model directive, segment opening
- @WordSize                            segment opening, processor directive
- @Startup                             built-in .startup macro

(The equates and macros defined in model.inc and modelt.inc are not redefined by any BCDASM module)

Equates defined in model.inc:
- CPU, MDL, @fartext

Equates defined in modelt.inc:
- @isCodeNear, @isDataFar, @isDataNear, @isStackFar, @isStackNear, @isUse32, @isWin32, @is386
- @isDirectionUp, @isDSeqESeqSS
- @ES, @dui, @uint, @dsaddr, @dsr, @ssr, @nullptr
- @bptr, @uiptr, @wptr, sh
- @fram

Macros defined in modelt.inc:
- @CODESEG, @alignn, @ptype, @proto, @uses
- @LDSEGM, @LDS, @LES, @cld, @shl, @shr
- @enter, @leave, ENTERW, ENTERD, LEAVEW, LEAVED


Credits
-------

BCDASM is my own work but it may not have reached the .LIB state without a glance or two in "intel.zip" from 1990, a collection of assembly source files which unfortunately has no named author and no copyright notice. Specifically, the division and comparison routines in BCDASM were inspired by bcddiv.asm and bcdcmp.asm in "intel.zip" but completely rewritten for this occasion.

The algorithm used in the multiplication routine was taken from Ray Duncan's MPMUL1.ASM which appeared in PC Magazine Nov 28, 1989 (vol. 8 no. 20).


Related material
----------------

HugeCalc by Neil J. Rubenking, includes Turbo Pascal v6.0 source code. HugeCalc performs +-*/ exponential and factorial math on up to 254-digit integers, using Pascal's string type to store the numbers as digits. Implemented as a command-line calculator, it supports unsigned numbers only. Downloadable at <http://www.pcmag.com> as part of VOL10N16.ZIP, the Sept 24, 1991 PC Magazine file archive.

MONEY, a C++ money class by Adolfo di Mare, includes Borland C++ v2.0 source code.
Uses C's double type (floor(double * 100.0)) with no fractional part to store the numbers, allowing 15+ digit precision. Downloadable at Simtel as <ftp://ftp.simtel.net/pub/simtelnet/msdos/cpluspls/money.zip>.