Text File | 1990-08-10 | 46.3 KB | 1,152 lines |
- CHAPTER 26 - SIMPLIFYING THE TEMPLATE
-
-
- By the time you have finished this chapter your assembler files
- will look cleaner. Unfortunately there is some heavy sledding
- before we get there.
-
-
- EXITING A PROGRAM
-
- Till now, we have exited most programs with CTRL-C; otherwise the
- program has done a return. A return to what? It has been
- returning to a section of code that does INT 20h, one of the ways
- of quitting a program when everything is in order. Notice the
- "everything is in order" in the last sentence. What happens if
- you have 2 files open, you are off in some subroutine, and you
- have things so hopelessly confused that you might as well give
- up? Can you call INT 20h? The answer is no for two reasons.
- First, you need CS to point to the PSP (program segment prefix),
- and you don't know where the PSP is. Secondly, you need to close
- files. Now, it is possible to write some code to do this, but why
- bother? We have a special interrupt for this:
-
- INT 21h function 4Ch
- AH = 4Ch
- AL = return code
-
- This will close all files, get you out of the program, and give a
- return code that is usable by the calling program. Here's a small
- program. Use template.asm and call this TEST4CH.ASM.
-
- ; - - - - - - - - - - - - - - - - - - - - - - - - -
- CODESTUFF SEGMENT PUBLIC 'CODE'
-
- ASSUME cs:CODESTUFF, ds:DATASTUFF
- EXTRN get_unsigned_byte:NEAR
-
- main proc far
-
- start:
- mov ax, DATASTUFF ; load ds
- mov ds,ax
-
- call get_unsigned_byte ; value is in al
- mov ah, 4Ch ; int 21h, function 4Ch
- int 21h
-
- main endp
- CODESTUFF ENDS
- ; - - - - - - - - - - - - - - - - - - - - - - - - - -
-
- We have revised the CODESTUFF segment so there is only one EXTRN
- statement and the beginning code:
-
-
- ______________________
-
- The PC Assembler Tutor - Copyright (C) 1990 Chuck Nelson
-
- push ds
- sub ax, ax
- push ax
-
- is gone. Since we will never again do a return to the PSP, there
- is no need for this code anymore.
-
- The program gets a single byte to use as the exit code and then
- exits using int 21h function 4Ch. Get this assembled and linked.
- It should ask for a number and then exit. But where is that
- number? It is available through the operating system. Make the
- following batch file. It runs TEST4CH.EXE and then looks at the
- error code. Unfortunately ERRORLEVEL is not available as an exact
- number to a batch file, so we are checking to see if the return
- code was above a certain level.
-
- ----------------- DO4CH.BAT -----------------------
- test4ch
- ECHO OFF
- IF ERRORLEVEL 1 ECHO The return code was over 0
- IF ERRORLEVEL 51 ECHO The return code was over 50
- IF ERRORLEVEL 101 ECHO The return code was over 100
- IF ERRORLEVEL 151 ECHO The return code was over 150
- IF ERRORLEVEL 201 ECHO The return code was over 200
- ECHO ON
- ----------------------------------------------------
-
- Here's one run of the batch file:
-
- >do4ch
- >test4ch
-
- The PC Assembler Helper Version 1.01
- Copyright (C) 1989 Chuck Nelson All rights reserved.
- Enter a number from 0 to 255 172
-
- >ECHO OFF
- The return code was over 0
- The return code was over 50
- The return code was over 100
- The return code was over 150
-
- This is what happens to DO4CH.BAT with a return code of 172.
-
- From now on, always use INT 21h function 4Ch to exit.
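-
- As a sketch of the "hopelessly confused" case described above, a
- subroutine can bail out directly with function 4Ch instead of
- returning. The procedure name check_input, the label input_ok,
- the limit of 200 and the ERR_CONFUSED value are all made up for
- this illustration; only the INT 21h call comes from the text:
-
- ```asm
- ERR_CONFUSED EQU 3              ; hypothetical return code
-
- check_input proc near           ; expects a value in al
-         cmp  al, 200            ; pretend anything over 200 is hopeless
-         jbe  input_ok
-         mov  al, ERR_CONFUSED   ; return code for the batch file
-         mov  ah, 4Ch            ; int 21h, function 4Ch
-         int  21h                ; does not return; DOS closes the files
- input_ok:
-         ret
- check_input endp
- ```
-
- No matter how deep in subroutine calls the program is, the stack
- does not have to be unwound first.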
-
-
- SEGMENTS
-
- Our major simplification has to do with segment names. Before we
- go on with segment simplification, here are the rules the linker
- uses. If you don't remember them, you should review Chapter 10
- before going on.
-
- During the link process, the linker will combine any segments
- which:
-
- 1) have the same name.
- 2) are declared PUBLIC.
- 3) have the same class name (type).
-
- The linker processes object modules from left to right on the
- command line. Classes are placed in the order in which they are
- first encountered (including the empty class type). Within each
- class, segments are placed in the order in which they are first
- encountered.
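-
- A minimal sketch of the combining rules, assuming two
- hypothetical source files (shown one after the other; each is
- assembled separately, and all the names are invented for the
- example):
-
- ```asm
- ; ---- FILE1.ASM ----
- SHARESTUFF SEGMENT PUBLIC 'DATA'
- first_var   dw 1
- SHARESTUFF ENDS
-         END
-
- ; ---- FILE2.ASM ----
- SHARESTUFF SEGMENT PUBLIC 'DATA'   ; same name, PUBLIC, same class:
- second_var  dw 2                   ; joined to FILE1's SHARESTUFF
- SHARESTUFF ENDS
-
- LONERSTUFF SEGMENT 'DATA'          ; different name: stays separate,
- third_var   dw 3                   ; but is placed with the 'DATA'
- LONERSTUFF ENDS                    ; class, after SHARESTUFF
-         END
- ```
-
- Linked as FILE1 + FILE2, the .MAP file should show one SHARESTUFF
- holding both words, followed by LONERSTUFF.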
-
- If we have all these rules, how do high-level languages manage to
- combine their data and code correctly? The answer is that they
- use standardized segment definitions. Here are the basic ones for
- our data:
-
- ;---------------------------------------------------
- _DATA SEGMENT WORD PUBLIC 'DATA'
- _DATA ENDS
- ;---------------------------------------------------
- ;---------------------------------------------------
- CONST SEGMENT WORD PUBLIC 'CONST'
- CONST ENDS
- ;---------------------------------------------------
- ;---------------------------------------------------
- _BSS SEGMENT WORD PUBLIC 'BSS'
- _BSS ENDS
- ;---------------------------------------------------
- ;---------------------------------------------------
- STACK SEGMENT PARA STACK 'STACK'
- STACK ENDS
- ;---------------------------------------------------
-
- If all the code will fit in one segment we can use a single
- segment name:
-
- ;---------------------------------------------------
- _TEXT SEGMENT WORD PUBLIC 'CODE'
- _TEXT ENDS
- ;---------------------------------------------------
-
- otherwise we can make independent segments, each with an
- independent name:
-
- ;---------------------------------------------------
- name_TEXT SEGMENT WORD PUBLIC 'CODE'
- name_TEXT ENDS
- ;---------------------------------------------------
-
- where the "name" can be anything, but the "_TEXT" remains
- invariable. Any subroutine calls within the segment can be NEAR,
- while any calls to a different segment should be FAR.
-
- The "WORD" in these definitions says that when the linker
- combines segments into a larger segment, each subsegment must
- start at an even address (a word boundary). This has to do with
- the speed of word fetches from memory that we discussed in the
- last chapter. "WORD" is fine for 16 bit data buses, but for an
- 80386 you actually want "DWORD" so things are correctly aligned
- with a 32 bit data bus. "PARA" means paragraph and that means
- aligned with a segment starting address (every 16 bytes).
- Everything will work with "PARA", since a paragraph boundary is
- also a word boundary and a doubleword boundary.
-
- For reasons of convenience, compilers put different types of data
- in different segments. For you, there is no reason to use more
- than one segment, and that is:
-
- ;--------------------------------
- _DATA SEGMENT WORD PUBLIC 'DATA'
- _DATA ENDS
- ;--------------------------------
-
- Compilers use these different segments because they can. If they
- were constrained to use only one segment name, they could do it
- with no problem. What is in these different segments?
-
- _DATA standard initialized data
- CONST data constants
- _BSS uninitialized static data
- STACK room for the SS:SP stack
-
- So what do these things mean?
-
- _DATA
-
- The _DATA segment stores all initialized data which exists from
- the time the program starts till the time that the program ends.
-
- In C:
- static int x = 5;
-
- In Pascal:
- const
- my_salary : real = 52.77;
-
- These variables have a specific value at the start of the
- program, even before the first instruction is executed. This
- value may change during the program. The variable exists during
- the whole program.
-
-
- _BSS
-
- The _BSS segment stores all uninitialized data which exists from
- the time the program starts till the time that the program ends.
-
- In C:
- static int x ;
-
- In Pascal, any variable declared outside a procedure but without
- an initial value will be in _BSS. In compiled BASIC, everything
- except dynamic arrays is in the _BSS. These variables have an
- indeterminate value at the start of the program and exist during
- the whole program.
-
-
- CONST
-
- CONST takes all constants which are longer than 2 bytes. If the
- compiler is on its toes, anything one or two bytes long will be
- coded into the machine instructions since this is much faster.
- What is a constant? It is anything that has a value but doesn't
- have a variable name:
-
- value = 275.29 ;
- printf ( "Mr. Yellow: 'Read my lips - no new taxis!'\n");
- result = value / 27.619 ;
- file_ptr = fopen ( "stuff.doc", "r+") ;
-
- All the numbers and all the text strings need to be stored
- somewhere. They are stored in the CONST segment and given an
- internal name by the compiler so they can be used at the
- appropriate location. They are not available in other parts of
- the program.{1} These constants are sometimes called literals.
-
-
- STACK
-
- BASIC does not use the stack in the same way as Pascal and C. In
- BASIC it is used only for passing variables between subroutines.
- In C and Pascal, most variables are temporary. They come into
- existence at the beginning of the subroutine and they disappear
- upon leaving the subroutine. When you call the subroutine again,
- the values these variables have are indeterminate. These
- variables all exist on the stack relative to BP, the base
- pointer. This is why you can have recursion in C and Pascal but
- not in BASIC.
-
- As I said, you don't need to put your different types of data in
- different segments. It can all go into _DATA.
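-
- As a rough illustration, here is how the C examples above might
- look if you put everything into _DATA yourself. The variable
- names are invented, and a compiler would have spread these over
- _DATA, _BSS and CONST:
-
- ```asm
- _DATA SEGMENT WORD PUBLIC 'DATA'
- x_init      dw 5                ; like  static int x = 5;
- x_uninit    dw ?                ; like  static int x;    (_BSS)
- file_name   db 'stuff.doc', 0   ; a string literal       (CONST)
- _DATA ENDS
- ```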
-
-
-
- GROUPS
-
- We now come to the bizarre. You will notice that when the linker
- links all these object modules together, it will have four
- distinct segments with each segment having a distinct class name.
- We will get:
-
- _DATA 'DATA'
- CONST 'CONST'
- _BSS 'BSS'
- STACK 'STACK'
-
- The problem here is that we want to set DS at the beginning of
- the program so that it will reference all the data. How are we
- going to do this? The warped minds of electrical engineers and
- computer scientists spent hours and hours trying to find the most
- obscure way possible to unify data addressing and they came up
- with GROUPS.
- ____________________
-
- 1. There is an exception. Some compilers check to make sure
- that there are no duplicates of the constant. These compilers
- give all duplicates the same address so there is only one copy of
- any one constant such as 0, 1, etc.
-
- You can tell the linker that you want data from distinct segments
- to be referenced by the offset from the beginning of the lowest
- segment in memory that belongs to the group. Read this about five
- or ten times to get the hang of it. You tell the linker that a
- bunch of different segments belong to a group. It will find the
- segment which is lowest in memory and then whenever you ask for
- the GROUP offset, the linker will calculate the offset from the
- beginning of this first segment.
-
- The way you define a group is with a name, the word "GROUP", and
- then a list of those segments in the file which belong to the
- group:
-
- DGROUP GROUP _DATA, CONST, _BSS, STACK
-
- Note that it is the segment names, not the class names. DGROUP is
- the standard name for the data group. If the assembler gives the
- linker the correct information, the linker will adjust all
- offsets relative to the beginning of the group. The only limit on
- a group is that the distance from the first byte of the group to
- the last byte of the group must be 65535 bytes or less. This is
- because all the group segments must reside in one physical
- segment in memory.
-
- It is not even necessary for all the segments in a block of
- memory to belong to the group. Consider the following ordering of
- segments in memory.
-
- _DATA
- DATASTUFF
- CONST
- CODESTUFF
- _BSS
- EVENMORESTUFF
- STACK
-
- As long as the distance from one end of _DATA to the other end of
- STACK is 65535 bytes or less, the linker will adjust the offsets
- in _DATA, CONST, _BSS and STACK relative to the start of DGROUP
- and the linker will adjust the offsets of DATASTUFF, CODESTUFF
- and EVENMORESTUFF relative to their respective segment starting
- addresses. I didn't say that this was good programming, I only
- said that it was possible.
-
- Thoroughly confused? You're not alone. Just remember, in all
- compiled languages, we are going to combine these four types of
- segments into a single group where offsets are relative to the
- very beginning of the data.
-
- Before getting you even more confused, let's take a look at what
- we have so far. Make sure you actually do all of the following
- examples. Use template.asm and at the very top, put in the
- following segments:
-
- ; - - - - - - - - -
- SEG1 SEGMENT 'STUFF'
- db 100 dup (?)
- seg1_data db ?
- db 899 dup (?)
- SEG1 ENDS
- ; - - - - - - - - -
- SEG3 SEGMENT 'STUFF'
- db 300 dup (?)
- seg3_data db ?
- db 699 dup (?)
- SEG3 ENDS
- ; - - - - - - - - -
- SEG5 SEGMENT 'STUFF'
- db 500 dup (?)
- seg5_data db ?
- db 499 dup (?)
- SEG5 ENDS
- ; - - - - - - - - -
-
- Call this program QGROUP1.ASM. These segments are 1000 bytes
- long, and the data names are 100, 300 and 500 bytes into their
- respective segments. Because these segments will be paragraph
- aligned, the second and third segments will each start 1008
- bytes (16 X 63) after the start of the preceding one. You need to
- tell the assembler that these are in a group and give the proper
- ASSUME statement. We'll call this QGROUP:
-
- QGROUP GROUP SEG1, SEG3, SEG5
- ASSUME cs:CODESTUFF, ds:DATASTUFF, ds:QGROUP
-
- Here's some code:
-
- ; + + + + + + + + + + + + START CODE BELOW THIS LINE
- lea ax, seg1_data
- call print_unsigned
- lea ax, seg3_data
- call print_unsigned
- lea ax, seg5_data
- call print_unsigned
- ; + + + + + + + + + + + + END CODE ABOVE THIS LINE
-
- As you can see, all we are doing is putting the addresses into AX
- and then printing them as unsigned numbers. Here's the output:
-
- 00100
- 01308
- 02516
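-
- The three numbers can be checked by hand. This is only a sketch;
- SEG_LEN and SEG_STEP are names made up for the arithmetic and are
- not part of QGROUP1.ASM:
-
- ```asm
- SEG_LEN   EQU 1000                        ; bytes in each segment
- SEG_STEP  EQU (SEG_LEN + 15) AND 0FFF0h   ; 1008, rounded up to a
-                                           ; paragraph boundary
- ; offsets from the start of QGROUP:
- ;   seg1_data = 0 * SEG_STEP + 100 =  100
- ;   seg3_data = 1 * SEG_STEP + 300 = 1308
- ;   seg5_data = 2 * SEG_STEP + 500 = 2516
-
-         mov  ax, 2 * SEG_STEP + 500       ; same value as seg5_data
-         call print_unsigned
- ```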
-
- Remember, each segment starts 1008 bytes after the start of
- the previous one. Here's the same program with a few extra
- segments thrown in. Call it QGROUP2.ASM:
-
- ; - - - - - - - - -
- SEG1 SEGMENT 'STUFF'
- db 100 dup (?)
- seg1_data db ?
- db 899 dup (?)
- SEG1 ENDS
- ; - - - - - - - - -
- SEG2 SEGMENT 'STUFF'
- db 200 dup (?)
- seg2_data db ?
- db 799 dup (?)
- SEG2 ENDS
- ; - - - - - - - - -
- SEG3 SEGMENT 'STUFF'
- db 300 dup (?)
- seg3_data db ?
- db 699 dup (?)
- SEG3 ENDS
- ; - - - - - - - - -
- SEG4 SEGMENT 'STUFF'
- db 400 dup (?)
- seg4_data db ?
- db 599 dup (?)
- SEG4 ENDS
- ; - - - - - - - - -
- SEG5 SEGMENT 'STUFF'
- db 500 dup (?)
- seg5_data db ?
- db 499 dup (?)
- SEG5 ENDS
- ; - - - - - - - - -
-
- This is almost the same thing but we have added two more
- segments, SEG2 and SEG4. We are NOT going to join these two
- segments into the group. Here are the GROUP and ASSUME statements:
-
- QGROUP GROUP SEG1, SEG3, SEG5
- ASSUME ds:SEG2, ds:SEG4
- ASSUME cs:CODESTUFF, ds:DATASTUFF, ds:QGROUP
-
- Make sure the ASSUME statements are in that order or things may
- get confused. We also add some code:
-
- ; + + + + + + + + + + + + START CODE BELOW THIS LINE
- lea ax, seg1_data
- call print_unsigned
- lea ax, seg2_data
- call print_unsigned
- lea ax, seg3_data
- call print_unsigned
- lea ax, seg4_data
- call print_unsigned
- lea ax, seg5_data
- call print_unsigned
- ; + + + + + + + + + + + + END CODE ABOVE THIS LINE
-
- This shows the addresses of all five variables. Here's the new
- output:
-
- 00100
- 00200
- 02316
- 00400
- 04532
-
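- As you can see, the GROUPed segments have their offsets relative
- to the beginning of the group while the others have their offsets
- relative to the beginning of the segment. SEG3 now starts 2016
- bytes (2 X 1008) into the group, which puts seg3_data at 2316,
- and SEG5 starts 4032 bytes (4 X 1008) in, which puts seg5_data at
- 4532.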
-
- Make a copy of QGROUP1.ASM and call it QGROUP?.ASM. Leave the
- segment definitions, group definitions, ASSUME statements and
- code the same, but add six more lines of code at the end:
-
- ; compare the offsets
- mov ax, offset seg1_data
- call print_unsigned
- mov ax, offset seg3_data
- call print_unsigned
- mov ax, offset seg5_data
- call print_unsigned
- ; + + + + + + + + + + + + + + + END CODE ABOVE THIS LINE
-
- After the three LEAs we now do 3 OFFSETS. Assemble and link this.
- Here's the output:
-
- 00100
- 01308
- 02516
- 00100
- 00300
- 00500
-
- Wait a minute! Those last three numbers should be the same as the
- first three numbers. That's right, folks. This is a known error
- in the MASM assembler. In fact the Turbo Assembler copies this
- mistake when it is in "MASM" mode but does it right when it is in
- "IDEAL" mode. A86 does it right all the time. Here is the output
- from the same source file when assembled by A86:
-
- 00100
- 01308
- 02516
- 00100
- 01308
- 02516
-
- You have told the assembler to calculate all offsets relative to
- the beginning of the group and MASM is ignoring you every time
- you use the OFFSET operator. The code fix for this is to use an
- override when you use OFFSET:
-
- ; compare the offsets
- mov ax, offset QGROUP:seg1_data
- call print_unsigned
- mov ax, offset QGROUP:seg3_data
- call print_unsigned
- mov ax, offset QGROUP:seg5_data
- call print_unsigned
- ; + + + + + + + + + + + + + + + END CODE ABOVE THIS LINE
-
- Better yet, use LEA whenever possible. If you do use OFFSET with
- groups, you need to go through the text file with a word search
- to make sure that all OFFSETs have a group override. This is a
- subtle error and it is very hard to find if you are not looking
- for it.
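-
- One place where LEA cannot help is a table of addresses stored in
- data, since that is built by the assembler rather than by an
- instruction. A sketch, assuming DS already holds DGROUP, and
- using invented names (msg_one, msg_two, msg_table):
-
- ```asm
- DGROUP GROUP _DATA
-         ASSUME ds:DGROUP
-
- _DATA SEGMENT WORD PUBLIC 'DATA'
- msg_one     db 'one$'
- msg_two     db 'two$'
- msg_table   dw offset DGROUP:msg_one    ; override needed here too
-             dw offset DGROUP:msg_two
- _DATA ENDS
-
- ; ... in the code segment:
-         mov  dx, msg_table + 2          ; fetch the address of msg_two
-         mov  ah, 9                      ; int 21h, print string
-         int  21h
- ```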
-
-
- This system is designed so that we can have 64k of data and
- stack, all of which is addressable with DS without changing DS's
- value. What happens if you have more data than that? One thing
- for sure is that you don't have more than 64k of individually
- named variables. Either that or you have some huge calluses on
- your typing fingers.
-
- What you do have is arrays. If you run into space problems, you
- move the least used or the biggest arrays into their own
- segments. You can have one segment per array if you want. The
- standardized high-level language names for these segments are:
-
- ; - - - - - - - - - - - - - - - - - - - - -
- FAR_DATA SEGMENT PARA 'FAR_DATA'
- FAR_DATA ENDS
- ; - - - - - - - - - - - - - - - - - - - - -
- FAR_BSS SEGMENT PARA 'FAR_BSS'
- FAR_BSS ENDS
- ; - - - - - - - - - - - - - - - - - - - - -
-
- Once again, the '_DATA' is for initialized data while the '_BSS'
- is for uninitialized data. Use only the 'FAR_DATA' kind.{2} You
- will notice that these segments are NOT PUBLIC. Although an
- assembler will unify all segments with the same definition that
- are in the same file, the linker will not unify segments from
- different files which are not PUBLIC. If we create 4 different
- .ASM files, each with one segment:
-
- ; FARDATA1.ASM
- PUBLIC data1
- ; - - - - - - - - - - - - - - - - - - - - -
- FAR_DATA SEGMENT PARA 'FAR_DATA'
- data1 db 1A67h dup (0)
- FAR_DATA ENDS
- ; - - - - - - - - - - - - - - - - - - - - -
-
- ; FARDATA2.ASM
- PUBLIC data2
- ; - - - - - - - - - - - - - - - - - - - - -
- FAR_DATA SEGMENT PARA 'FAR_DATA'
- data2 db 0D4A8h dup (0)
- FAR_DATA ENDS
- ; - - - - - - - - - - - - - - - - - - - - -
- ____________________
-
- 2. A high-level language has the right to set all the data of
- a 'BSS' segment to zero as part of its startup routine. Whether
- it does so or not depends on what it has told the linker. If you
- put initialized data into either a '_BSS' or a 'FAR_BSS' segment,
- it might easily wind up zero after startup.
-
- ; FARDATA3.ASM
- PUBLIC data3
- ; - - - - - - - - - - - - - - - - - - - - -
- FAR_DATA SEGMENT PARA 'FAR_DATA'
- data3 db 200h dup (0)
- FAR_DATA ENDS
- ; - - - - - - - - - - - - - - - - - - - - -
-
- ; FARDATA4.ASM
- PUBLIC data4
- ; - - - - - - - - - - - - - - - - - - - - -
- FAR_DATA SEGMENT PARA 'FAR_DATA'
- data4 db 8716h dup (0)
- FAR_DATA ENDS
- ; - - - - - - - - - - - - - - - - - - - - -
-
- and link these with TEMPLATE.OBJ and ASMHELP, we will get the
- following .MAP file:
-
- Start Stop Length Name Class
- 00000H 01A66H 01A67H FAR_DATA FAR_DATA
- 01A70H 0EF17H 0D4A8H FAR_DATA FAR_DATA
- 0EF20H 0F11FH 00200H FAR_DATA FAR_DATA
- 0F120H 17835H 08716H FAR_DATA FAR_DATA
- 17840H 1823FH 00A00H STACKSEG STACK
- 18240H 1875DH 0051EH DATASTUFF DATA
- 18760H 1A02FH 018D0H CODESTUFF CODE
-
- Program entry point at 1876:0000
-
- The numbers in the segment definitions were in hex so you could
- read the .MAP file more easily. We have created four different
- FAR_DATAs - one for each variable.
-
- The idea here is to leave DS alone if possible and use ES:SI or
- ES:DI for your manipulation of the array.
-
- mov ax, seg data1
- mov es, ax
- mov si, offset data1
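-
- For example, adding up the bytes of data1 might look like the
- sketch below. It assumes data1 was declared in this file with
- EXTRN data1:BYTE (outside of any segment), and the label sum_loop
- is invented:
-
- ```asm
-         mov  ax, seg data1
-         mov  es, ax
-         mov  si, offset data1
-         mov  cx, 1A67h          ; length of data1 in FARDATA1.ASM
-         sub  ax, ax             ; ah stays 0 during the loop
-         sub  dx, dx             ; dx = running total (16 bits)
- sum_loop:
-         mov  al, es:[si]        ; ES override, DS is untouched
-         add  dx, ax
-         inc  si
-         loop sum_loop
- ```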
-
- Of course, if you are using two different FAR_DATA arrays from two
- different segments at the same time, you will probably need to
- use DS temporarily. This is the kind of thing you need to plan
- before you start a program which contains large arrays.
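-
- A sketch of that temporary use of DS, copying the 200h bytes of
- data3 over the start of data1. It assumes both names are declared
- EXTRN as bytes, and that nothing between the PUSH and the POP
- touches the program's ordinary variables:
-
- ```asm
-         push ds                 ; save the normal DS value
-         mov  ax, seg data1
-         mov  es, ax
-         mov  di, offset data1   ; ES:DI = destination
-         mov  ax, seg data3
-         mov  ds, ax
-         mov  si, offset data3   ; DS:SI = source
-         mov  cx, 200h           ; length of data3
-         cld                     ; copy upwards
-         rep  movsb
-         pop  ds                 ; DS addresses the normal data again
- ```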
-
- You have now seen all possible segments for any Microsoft
- language and for Turbo C.{3} These are:
-
-
- _DATA SEGMENT WORD PUBLIC 'DATA'
- CONST SEGMENT WORD PUBLIC 'CONST'
- _BSS SEGMENT WORD PUBLIC 'BSS'
- STACK SEGMENT PARA STACK 'STACK'
- _TEXT SEGMENT WORD PUBLIC 'CODE'
-
- name_TEXT SEGMENT WORD PUBLIC 'CODE'
- FAR_DATA SEGMENT PARA 'FAR_DATA'
- FAR_BSS SEGMENT PARA 'FAR_BSS'
-
- DGROUP GROUP _DATA, CONST, _BSS, STACK
-
-
- We have another problem on our road to simplification. We want DS
- to have the address of the start of DGROUP. How do we do it?
- Well, before we had:
-
- mov ax, DATASTUFF
- mov ds, ax
-
- DATASTUFF was a segment. We do the same thing for groups:
-
- mov ax, DGROUP
- mov ds, ax
-
- We use a group name instead of a segment name. This means that
- our ultimate code segment will look like this:
-
- ; - - - - - - - - - -
- _TEXT SEGMENT WORD PUBLIC 'CODE'
-
- DGROUP GROUP _DATA, CONST, _BSS, STACK
- ASSUME cs:_TEXT, ds:DGROUP
-
- start:
- mov ax, DGROUP
- mov ds, ax
-
- ; - - - - - - - - - -
- ; the program goes here
- ; - - - - - - - - - -
-
- mov ah, 4Ch
- mov al, ? ; replace ? with error code
- int 21h
-
- _TEXT ENDS
- ; - - - - - - - - - -
- ____________________
-
- 3. If you are using Turbo PASCAL, then there are only two
- segments possible. They are:
-
- DATA SEGMENT WORD PUBLIC
- CODE SEGMENT BYTE PUBLIC
-
- There is no class name. You can substitute DSEG for DATA and CSEG
- for CODE if you want. Turbo Pascal has no DGROUP.
-
- Say, if all this stuff is standardized text, why are we forced to
- type all this drivel over and over again? The answer is that we
- aren't. All the segment information has a shorthand. Here's how
- it works. Every shorthand symbol starts with a dot. The assembler
- will then generate the desired text.{4} This is from MASM 5.0 on,
- so if you have an earlier assembler you'll have to write the full
- text.
-
- To start out, use the two starting directives DOSSEG (with no
- dot) and .MODEL. MODEL will be explained later.{5}
-
- DOSSEG
- .MODEL Medium
-
- For now, 'medium' is what we want.
-
- From that point, if you want a data segment, you just write
- .DATA, if you want code, you write .CODE. Every time that the
- assembler sees a segment directive it will close any segment that
- is open and start the segment indicated by the directive. (You
- can always reopen a segment). Here is what replaces the
- directives:
-
- DIRECTIVE REPLACEMENT TEXT
-
- .DATA _DATA SEGMENT WORD PUBLIC 'DATA'
- .CONST CONST SEGMENT WORD PUBLIC 'CONST'
- .DATA? _BSS SEGMENT WORD PUBLIC 'BSS'
- .STACK [size] STACK SEGMENT PARA STACK 'STACK'
- .CODE _TEXT SEGMENT WORD PUBLIC 'CODE'
-
- .CODE [name] name_TEXT SEGMENT WORD PUBLIC 'CODE'
- .FARDATA [name] FAR_DATA SEGMENT PARA 'FAR_DATA'
- .FARDATA? [name] FAR_BSS SEGMENT PARA 'FAR_BSS'
-
- The [name] in brackets will be explained in a minute. The [size]
- after the stack declaration allows you to customize the size of
- the stack. Without any size, the declaration
-
- .STACK
-
- will allocate 1k of memory for the stack. A size allocates a
- specific number of bytes:
-
- .STACK 2000h
-
- You can make it anything you want, but make sure it is an even
- number and remember that the limit for all four parts of DGROUP
- is 64k.
- ____________________
-
- 4. It really generates no text. It is just that the assembler
- will generate the same machine code as if that text had been
- generated.
-
- 5. DOSSEG tells the assembler to tell the linker that the .EXE
- file should have the standard segment order. It is not necessary
- but it doesn't hurt.
-
- To see how the names work, we need some text files. Here is a
- complete main file:
-
- ; FARDATA.ASM - driver module
- DOSSEG
- .MODEL medium
- EXTRN data2_routine:FAR, data3_routine:FAR
- .STACK 200h
- .FARDATA
- data1 db 0100h dup (0)
- .CODE
- main:
- mov ax, DGROUP
- mov ds, ax
- call data2_routine
- call data3_routine
- mov ax, 4C00h
- int 21h
- END main
-
- It has some data and some code though it doesn't really do
- anything. We will use this along with two other files for the
- examples. Here is FARDATA2:
-
- ; FARDATA2.ASM
- DOSSEG
- .MODEL medium
- PUBLIC data2_routine
- .FARDATA
- data2 db 0200h dup (0)
- .CODE
- data2_routine proc
- ret
- data2_routine endp
- END
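-
- The third file used in these examples is not listed in the text.
- Going by FARDATA2.ASM and the 00300H length that shows up in the
- .MAP file, it presumably looks something like this
- reconstruction:
-
- ```asm
- ; FARDATA3.ASM (reconstructed, not the original listing)
-         DOSSEG
-         .MODEL medium
-         PUBLIC data3_routine
-         .FARDATA
- data3   db 0300h dup (0)
-         .CODE
- data3_routine proc
-         ret
- data3_routine endp
-         END
- ```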
-
- Notice that data2_routine doesn't have a FAR or NEAR. That's
- being taken care of by the memory model. Data2_routine's type
- does need to be declared EXTRN in the main module. The third
- routine has similar code. Here is the .MAP file when they are
- combined:
-
-
- Start Stop Length Name Class
- 00000H 00013H 00014H FARDATA_TEXT CODE
- 00014H 00014H 00001H FARDATA2_TEXT CODE
- 00016H 00016H 00001H FARDATA3_TEXT CODE
- 00020H 0011FH 00100H FAR_DATA FAR_DATA
- 00120H 0031FH 00200H FAR_DATA FAR_DATA
- 00320H 0061FH 00300H FAR_DATA FAR_DATA
- 00620H 00620H 00000H _DATA DATA
- 00620H 0081FH 00200H STACK STACK
-
-
- You can see the FAR_DATAs there, but where did FARDATA3_TEXT
- come from? The assembler decided that we wanted independent code
- segments and gave each one the name of the assembler file it came
- from. Since all the object files in a program must have unique
- names, these segment names should also be unique. If we change
- the .MODEL from MEDIUM to COMPACT without touching anything else,
- then we get:
-
-
- Start Stop Length Name Class
- 00000H 00022H 00023H _TEXT CODE
- 00030H 0012FH 00100H FAR_DATA FAR_DATA
- 00130H 0032FH 00200H FAR_DATA FAR_DATA
- 00330H 0062FH 00300H FAR_DATA FAR_DATA
- 00630H 00630H 00000H _DATA DATA
- 00630H 0082FH 00200H STACK STACK
-
- If we now put a name after the .FARDATA directive, it will give
- the segment a unique name. Putting:
-
- .FARDATA jake_the_snake
-
- in FARDATA2.ASM, along with name changes in the other modules
- results in the following .MAP file:
-
-
- Start Stop Length Name Class
- 00000H 00022H 00023H _TEXT CODE
- 00030H 0012FH 00100H HACKSAW FAR_DATA
- 00130H 0032FH 00200H JAKE_THE_SNAKE FAR_DATA
- 00330H 0062FH 00300H HULKSTER FAR_DATA
- 00630H 00630H 00000H _DATA DATA
- 00630H 0082FH 00200H STACK STACK
-
-
-
- We are doing a number of interrelated things here, so let's try
- to unify what is going on. You have seen both NEAR and FAR
- routines in the Tutor. A NEAR routine alters IP and restores IP
- on the return. A FAR routine alters both CS and IP and restores
- them on the return.
-
- When we passed addresses of data, we almost always passed
- just the offset of the data. That is because the data has almost
- always been in the DATASTUFF SEGMENT, and the value of DS has
- been known. In Chapter 19 we did "move_pascal_string" which was a
- subroutine where we passed both the segment and offset of the
- data. These are our two choices for passing addresses:
-
- OFFSET 1 word
- SEGMENT:OFFSET 2 words
-
- This gives us four basic possibilities for program structure:
-
- SUBROUTINE CALL DATA ADDRESSES PASSED AS
-
- NEAR OFFSET
- FAR OFFSET
- NEAR SEGMENT:OFFSET
- FAR SEGMENT:OFFSET
-
- Each of these structural possibilities has a name called a MODEL
- name. They are:
-
- SUBROUTINE CALL ADDRESSES PASSED AS MODEL NAME
-
- NEAR OFFSET SMALL
- FAR OFFSET MEDIUM
- NEAR SEGMENT:OFFSET COMPACT
- FAR SEGMENT:OFFSET LARGE
-
-
- You tell the assembler which model you are working with by using
- the .MODEL directive:
-
- .MODEL medium
-
- The assembler will then make either NEAR or FAR the default type
- for procedures. A procedure declared without a type:
-
- my_proc proc
-
- will generate the correct subroutine calls and returns for that
- model. This default can be overridden by giving an explicit NEAR
- or FAR:
-
- my_proc1 proc near
- my_proc2 proc far
-
- and these will remain unaltered whatever the model.
-
- At the assembler level, you need to code the address passing
- yourself, but if you have a MODEL and you are connected to a
- high-level language (with the same .MODEL type), the high-level
- language will pass all addresses as stated above.
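-
- Since the address passing has to be coded by hand, here is a
- sketch of both styles for one call. The names buffer and
- show_buffer are invented, buffer is assumed to be addressable
- through DS, and pushing the segment before the offset is the
- order that lets the subroutine fetch both with a single LES:
-
- ```asm
- ; SMALL or MEDIUM: offset only (1 word)
-         lea  ax, buffer
-         push ax
-         call show_buffer
-
- ; COMPACT or LARGE: segment and offset (2 words)
-         push ds                 ; segment of buffer
-         lea  ax, buffer
-         push ax                 ; offset of buffer
-         call show_buffer
-
- ; inside a NEAR show_buffer, after push bp / mov bp, sp:
- ;       mov  bx, [bp+4]         ; SMALL:   DS:BX -> buffer
- ;       les  bx, [bp+4]         ; COMPACT: ES:BX -> buffer
- ```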
-
- The advantage of this system is that using the .MODEL directive
- and appropriate EQU statements and MACROS (which we have not
- covered), it is possible to write a single subroutine which can
- then be assembled in all four model configurations. Coding this
- is non-trivial, but when you have done more programming you will
- see how to deal with the stack using EQUs and MACROs.
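-
- To give a taste of what that looks like, here is a much reduced
- sketch. FAR_CODE, ARG1 and get_arg are invented names, and
- FAR_CODE is set by hand here (it has to agree with the model)
- rather than being taken from the .MODEL directive:
-
- ```asm
- FAR_CODE EQU 1                  ; 1 = far calls, 0 = near calls
-
- IF FAR_CODE
- ARG1    EQU [bp+6]              ; skip old BP, return IP and CS
- ELSE
- ARG1    EQU [bp+4]              ; skip old BP and return IP
- ENDIF
-
- get_arg proc                    ; NEAR or FAR comes from the model
-         push bp
-         mov  bp, sp
-         mov  ax, ARG1           ; the one word the caller pushed
-         pop  bp
-         ret  2                  ; discard the one-word parameter
- get_arg endp
- ```
-
- The body of the subroutine never mentions the distance directly,
- so the same source can be assembled for near or far calls.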
-
- For now, you want to stay with data addresses which are passed by
- offset only. This is much easier. These are the SMALL and MEDIUM
- models. Whether you choose NEAR or FAR procedures doesn't affect
- much except where parameters are on the stack (because of that
- extra CS).
-
- Are these .MODELS important? They are nice, but not particularly
- vital. What happens is that in a manual you see a sample program
- like this:
-
- DOSSEG
- .MODEL medium
- .STACK
- .DATA
- variable1 dw 25
- .CODE
- sample proc
- mov ax, variable1
- ret
- sample endp
- END
-
- and you start comparing the size of this to the size it would be
- if you used the standard segment definitions. I have news for
- you. This is not a legitimate program. A legitimate program is a
- page or two long.{6} Also, at least to my way of thinking, you
- want visual separation between segments. The above is a
- disordered presentation of segments. We want order in our
- programs and the segment headers provide a visual structure. In
- the text file for ASMHELP, (which is about 3600 lines long), the
- SEGMENT declarations occupy about 20 lines. This is about 0.5% of
- the total length of the file.
-
- If you are going to assemble a file in multiple models, then it
- is worthwhile to use the .MODEL directives, otherwise it is
- optional depending more on your concept of what looks clear than
- any major difference.
-
- ____________________
-
- 6. Perhaps you want to scan \COMMENTS\MISHMASH.DOC which
- contains some real subroutines. They are all long.
-
- SUMMARY
-
- To exit a program, use INT 21h Function 4Ch
-
- mov ah, 4Ch ; exit program
- mov al, ? ; replace ? with error code
- int 21h
-
-
- A GROUP is a group of segments whose data will be referenced by
- the offset from the beginning of the group. You declare a group
- with:
-
- DGROUP GROUP _DATA, CONST, _BSS, STACK
-
- MASM calculates OFFSETs incorrectly with groups, so you should
- either use LEA or the DGROUP override:
-
- lea ax, variable1
- mov ax, offset DGROUP:variable1
-
- To get the address of DGROUP in DS you need:
-
- ASSUME ds:DGROUP
-
- and:
-
- mov ax, DGROUP
- mov ds, ax
-
-
-
- The standardized segment definitions, along with their simplified
- directives are:
-
- DIRECTIVE REPLACEMENT TEXT
-
- .DATA _DATA SEGMENT WORD PUBLIC 'DATA'
- .CONST CONST SEGMENT WORD PUBLIC 'CONST'
- .DATA? _BSS SEGMENT WORD PUBLIC 'BSS'
- .STACK [size] STACK SEGMENT PARA STACK 'STACK'
- .CODE _TEXT SEGMENT WORD PUBLIC 'CODE'
-
- .CODE [name] name_TEXT SEGMENT WORD PUBLIC 'CODE'
- .FARDATA [name] FAR_DATA SEGMENT PARA 'FAR_DATA'
- .FARDATA? [name] FAR_BSS SEGMENT PARA 'FAR_BSS'
-
-
- In addition you have the different model names:
-
- SUBROUTINE CALL ADDRESSES PASSED AS .MODEL NAME
-
- NEAR OFFSET SMALL
- FAR OFFSET MEDIUM
- NEAR SEGMENT:OFFSET COMPACT
- FAR SEGMENT:OFFSET LARGE
-
-