-
- Assembler Tutorial
- ******************
-
- This chapter explains how to use the RISC OS Forthmacs ARM assembler in order
- to create short machine language code sequences. This chapter is a companion
- to the "ARM Assembler" chapter. That chapter describes the syntax of
- individual assembly language instructions. This chapter addresses "higher
- level" issues, such as how to begin and end the assembly process and how to
- communicate arguments and results between Forth and assembly language.
-
-
- Motivation
- ==========
-
- For nearly all debugging jobs, writing assembly language is unnecessary. Test
- loops can usually be written more quickly and easily in high-level Forth,
- and will execute quickly enough to get the job done.
-
- However, in some cases the ultimate in speed is needed for certain critical
- operations, and assembly language may be the best way to go. In other cases,
- very specific combinations of machine instructions may exhibit problem
- behavior, and those combinations may need to be reproduced. Finally, some
- maintainers of the RISC OS Forthmacs system software itself may need to
- understand the assembler.
-
-
- Assumptions
- ===========
-
- The chapter assumes that you already understand the ARM instruction set,
- including such issues as processor modes, interrupts and register sets. If
- not, you should first study an ARM reference, such as the manual published by
- the chip manufacturer.
-
- Please note the syntax of this ARM assembler: like most Forth assemblers, it
- uses operand-first, operator-last syntax.
-
-
- Example: a simple "code word"
- =============================
-
- Here is a very simple assembly language program. It adds "1" to the contents
- of a register then returns to the Forth interpreter. This register r10 holds
- the top of stack value.
-
- code addone ( n -- n+1 )
- r10 r10 1 # add
- c;
-
- To execute it and display the result, you would type, for example,
-
- 5 addone .
-
- Here's what is happening, line by line:
-
- code addone ( n -- n+1 )
-
- code is a "defining word"; it creates a new command which can be executed by
- typing its name. The name of the new command in this case is ADDONE . The
- name could have been anything; I have chosen the name ADDONE because it
- describes the action of the program. You may already be familiar with another
- Forth defining word, ":" (colon). ":" also creates a new command; the
- difference between code and ":" is that ":" creates a new command whose
- behavior is described by a sequence of other Forth commands, whereas code
- creates a new command whose behavior is described by a sequence of assembly
- language instructions. After code creates the new command, it starts the
- assembler so that assembly language instructions may be entered.
-
- The stuff inside the parentheses is a comment; this particular comment
- indicates that the new command expects one argument ("n") on the stack before
- the word is executed, and after the command is executed, one result ("n+1") is
- left on the stack. The comment is optional, but its inclusion is strongly
- recommended.
-
- r10 r10 1 # add
-
- This is the assembly language instruction which defines the action of the new
- command. As you will recall from the "ARM Assembler" chapter, the
- RISC OS Forthmacs assembler syntax has the destination register first,
- followed by the source operand(s), followed by the operation name. So, in
- this case, the source operands are register r10 and the immediate number 1,
- the destination operand is register r10, and the operation is add, i.e. 1 is
- added to the contents of register r10, and the result is placed back in
- register r10.
-
- c;
-
- c; terminates a code definition. At the end of the instructions you have
- assembled, c; automatically appends one machine instruction, whose effect is
- to return to Forth after the user-specified instructions have been executed.
-
- 5 addone .
-
- In order to invoke the new command, we enter the number 5 on the Forth stack,
- type the name of the command ADDONE , and then display the result by typing
- the print command "." .
-
- Perhaps you now wonder how the number got off the Forth stack and into the
- register r10, and afterwards how the number got out of r10 and back onto the
- Forth stack. The answer is simple: the top element of the Forth stack is
- always (!) kept in r10 , so no movement was necessary. That is why I chose
- r10 for the register in this example.
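-
- For comparison, here is a sketch of the same behaviour written as a colon
- definition in high-level Forth. The name addone-forth is hypothetical, chosen
- only so that it does not clash with the code word above; the sketch uses
- nothing but standard Forth words.
-
```forth
\ High-level counterpart of the code word ADDONE.
\ addone-forth is a hypothetical name used only for this illustration.
: addone-forth ( n -- n+1 )
   1 + ;            \ add 1 to the top stack item

5 addone-forth .    \ displays 6
```
-
- Both versions have the same stack effect; the code word simply does the
- addition in a single machine instruction on the register holding the top of
- the stack.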
-
-
- Register Usage in Forth
- =======================
-
- To use the assembler effectively, you need to know which registers are
- available for use, and which of them must be left alone. Here are the rules:
-
- r8, r9, r12, and r14 are used internally by the Forth interpreter or operating
- system; their values must be left alone (otherwise the system will crash).
-
- r10 contains the top of the Forth stack. It is used for passing arguments and
- results back and forth between Forth and assembly language.
-
- r13 contains a pointer to a memory area containing the rest of the Forth stack
- (all elements other than the topmost one). That stack area is used for extra
- arguments and results. The section entitled "Stack Usage" tells you more
- about managing the stack area.
-
- r0 - r6 may be used freely within assembly language code sequences. Forth
- does not depend on the contents of these registers. However, some Forth
- commands do use these registers as scratch registers, so your code should not
- attempt to keep important values in these registers from one time to the next.
- While your code is being executed, Forth will not change the contents of any
- of these registers, so you can depend on them for the duration of your
- assembly language sequence. Once your code has finished and returned to
- Forth, however, the register values may have changed by the time your code
- is next executed.
-
- You can find more information about this subject in the "ARM Assembler" and
- "Forthmacs Implementation" chapters.
-
- While your machine code is executing, it will run at the full speed of the
- system, without any interference or overhead imposed by RISC OS Forthmacs.
- RISC OS Forthmacs does not itself use interrupts, so the processor will
- execute exactly the sequence of instructions which you have coded. It is
- possible that other software in the system may have set up some interrupts,
- but that is beyond the control of RISC OS Forthmacs.
-
-
- Disassembler
- ============
-
- The RISC OS Forthmacs disassembler may be used to review the assembly language
- you have created:
-
- see addone
-
- The result will look something like this:
-
- code addone
- ( 1e878 ) add r10,r10,#1
- ( 1e87c ) ldr pc,[r8],#4
-
- The numbers along the left hand side are the addresses at which the various
- instructions appear. The addresses shown here will almost certainly be
- different from the addresses that you see.
-
- You will notice that even though our example contained only one assembly
- language instruction, the disassembler shows one extra instruction. This
- extra instruction was automatically assembled by the c; command. Its purpose
- is to return control to Forth after the assembly language sequence has
- finished its execution (this is called the "next" instruction).
-
- The see command reads the name of a Forth command (in this case "addone"),
- determines what type of command it is (in this case "code ", meaning that the
- command's behavior was defined by the assembler), and then displays a
- reconstruction of the source code for that command. see also works for
- "colon" definitions, whose behaviour is defined in Forth instead of in
- assembly language. For an example of this, type "see find".
-
- Many of the normal Forth commands are defined in assembly language, and see
- can be used to look at how they are implemented. For example, type "see @" to
- see how the Forth "@" operator works (pronounced fetch, this operator takes an
- address from the top of the stack, reads the 32-bit contents of that address,
- and puts those contents back on top of the stack). You should try this right
- now and make sure you understand how it works. Note that the last
- instruction of "@" is exactly the same as the last instruction of "addone".
- Every code definition in RISC OS Forthmacs ends with this same instruction.
-
- see automatically locates the address where the code for a particular command
- begins. That address was allocated by code when the new command was defined.
- The disassembler can also be used to inspect machine code beginning at
- arbitrary addresses, not only that code which is created by code . Suppose
- that you know there is some code starting at address 100000 and you wish to
- look at it:
-
- 100000 dis
-
- On your system, this example probably won't work exactly as shown because your
- system may not have any code at address 100000 (in fact, it may not even have
- any memory there). The main point, though, is that you type the address of the
- code you wish to disassemble, followed by "dis".
-
- The disassembler will continue until it reaches a "definition ending"
- instruction, or until you stop it by typing the character "q", for "quit". It
- will also pause at the end of a screen and prompt you for a continuation
- character.
-
- After the disassembler has stopped, you can make it continue where it left off
- by typing "+dis".
-
-
- Setting the Starting Address
- ============================
-
- In most cases, you won't need to specify a starting address for the code you
- assemble. When you use the code defining word to begin assembling,
- RISC OS Forthmacs will find some appropriate memory for you and assemble your
- code there (at here). You can then locate the memory RISC OS Forthmacs has
- chosen by using the see command to disassemble the code, looking at the
- addresses displayed alongside the machine instructions.
-
- If you really need to assemble at a specific address, you can do so as follows
- (Note: in nearly all cases, this technique is unnecessary; very rarely does it
- matter where exactly you locate a bit of code, and allowing RISC OS Forthmacs
- to allocate the memory for you is sufficient and convenient).
-
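- Save the current dictionary pointer, set dp to your address, assemble, and
- then restore dp:
-
- here \ leave the current dictionary pointer on the stack
- your-adr dp ! \ assemble at your-adr from now on
- code demo
- ...... c;
- dp ! \ restore the saved dictionary pointer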
-
-
- Conditional branches
- ====================
-
- In order to implement conditional operations and loops, most assemblers
- provide branch instructions and labels. RISC OS Forthmacs has branches and
- labels too, but it also has a much better way, which eliminates most of the
- troublesome aspects of coding conditionals and loops in assembly language.
- The RISC OS Forthmacs way is called "structured conditionals". For example,
- suppose we want to test a condition and execute some code only if the
- condition is true. Specifically, we want to compare r0 and r1, and execute
- some code only if r0 is less than r1 .
-
- Traditional assembler:
-
- cmp r0, r1
- bge temp
- ..some code we want to conditionally execute
- temp:
-
- Forthmacs assembler with structured conditionals:
-
- r0 r1 cmp
- < if
- ..some code we want to conditionally execute..
- then
-
- As you can see, RISC OS Forthmacs eliminates the need to mentally reverse the
- sense of the comparison, eliminates the need to invent and keep track of label
- names, and uses conventional mathematical comparison symbols (e.g. "<"),
- rather than alphabetic mnemonics. The complete set of comparison symbols is
- given in the "ARM Assembler" chapter.
-
- The "if .. then" construct can also include an "else" clause:
-
- r0 r1 s cmp \ the s is optional
- < if
- ..code to execute if r0 < r1..
- else
- ..code to execute if r0 >= r1..
- then
-
- Of course, the assembler actually generates conditional branch instructions
- because that's what the hardware supports directly, but RISC OS Forthmacs
- takes care of the "bookkeeping" for you.
-
- Another way is to use the conditional execution offered by the ARM CPU: each
- instruction can carry its own condition code, so short conditional sequences
- need no branch at all.
-
- r0 r1 cmp
- xx xx lt xxx
- yy yy ge xxx
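-
- The structured words of the assembler have the same names and the same shape
- as the conditionals of high-level Forth. As a point of comparison, here is a
- sketch in standard Forth; the word name smaller is hypothetical. Note the
- difference: in high-level Forth "<" is executed and leaves a flag on the
- stack which "if" consumes, whereas in the assembler "<" only names the
- condition that the generated branch tests against the CPU flags.
-
```forth
\ Hypothetical word: return the smaller of two signed numbers.
: smaller ( n1 n2 -- n )
   2dup < if      \ flag is true when n1 < n2
      drop        \ keep n1
   else
      nip         \ keep n2
   then ;

3 7 smaller .     \ displays 3
```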
-
-
-
- Delayed Branches
- ================
-
- ARM doesn't use delayed branches at all, so you don't need to worry about
- them.
-
-
- Loops
- =====
-
- RISC OS Forthmacs structured conditionals also have features for easily
- creating loops. Here is a loop which executes forever:
-
- Source                  Generates
-
- begin                   Label1:
- r0 top ) ldr            ldr r0,[r10,#0]
- again                   b Label1
-
- This code assumes that the r10 register (top of stack, remember?) contains the
- address of a memory location, and the contents of that memory location are
- continuously read into the r0 register. This is an infinite loop; it won't
- stop until the system is reset, or power cycled, or externally interrupted in
- some way.
-
- Suppose we want the loop to execute 9 times then quit:
-
- r1 9 # mov
- begin
- r0 top ) ldr
- r1 r1 1 # s sub
- <= until
-
-
- We continue to loop "until" the s sub leaves a result <= 0 in r1, which
- happens on the ninth pass through the loop.
-
- Finally, here's an example where we perform a test at the top of the loop
- rather than at the bottom, illustrating "while":
-
- r1 9 # mov
- begin
- r1 r1 1 # s sub
- > while
- r0 top ) ldr
- repeat
-
-
- This loop continues to execute "while" r1 is still > 0 after the
- subtraction, and the "repeat" sends it back to the "begin".
-
- Structured conditionals and loops nest in the expected manner, to an arbitrary
- depth. For instance, a "begin .. until" can be completely contained within
- an "if .. then", which itself may be contained within a "begin .. while ..
- repeat".
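-
- Where the test sits changes how often the loop body runs. The following
- sketch in standard high-level Forth mirrors the two counted loops above,
- starting from the same count of 9; the word names until-passes and
- while-passes are hypothetical, and the sketch assumes that the assembler
- loops behave like their high-level counterparts.
-
```forth
\ Hypothetical words that count how often each loop body runs.

\ Test at the bottom, like "begin .. until": the body runs 9 times.
: until-passes ( -- count )
   0 9                     \ count, loop counter (plays the role of r1)
   begin
      swap 1+ swap         \ loop body: count one pass
      1- dup 1 <           \ decrement, then test for <= 0
   until
   drop ;

\ Test at the top, like "begin .. while .. repeat": the decrement and
\ the test come first, so the body runs only 8 times.
: while-passes ( -- count )
   0 9
   begin
      1- dup 0 >           \ decrement, then test for > 0
   while
      swap 1+ swap         \ loop body: count one pass
   repeat
   drop ;

until-passes .   \ displays 9
while-passes .   \ displays 8
```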
-
-
- Scope Loops - Assembler vs. Forth
- =================================
-
- You can use assembly language for creating scope loops, but it is usually
- preferable to write them in Forth, because the Forth version is usually easier
- to write, easier to read, and easier to debug. The one advantage of an
- assembly language loop is that it is tighter. However, this rarely matters.
- For comparison, suppose that you want to continually read location 1000 so
- that you can observe the action on an oscilloscope. This is how you would do
- it in assembly language:
-
- code test
- r0 th 1000 # mov
- begin
- r1 r0 ) ldr
- again
- c;
-
- Here's how you would do the same thing in Forth:
-
- begin 1000 @ drop again
-
- Additionally, the Forth version may be easily adapted to stop looping as soon
- as a key is typed:
-
- begin 1000 @ drop key? until
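-
- If the same loop is needed for several addresses, it can be wrapped in a
- colon definition that takes the address as an argument. This is only a
- sketch in standard Forth; the word names scope and scope-n are hypothetical.
-
```forth
\ Hypothetical words wrapping the scope loops above in colon definitions.

\ Read the cell at adr repeatedly until a key is typed.
: scope ( adr -- )
   begin  dup @ drop  key? until
   drop ;

\ Read the cell at adr exactly n times, then stop.
: scope-n ( adr n -- )
   0 ?do  dup @ drop  loop
   drop ;
```
-
- With these definitions, "1000 scope" reads location 1000 until a key is
- typed, and "1000 20 scope-n" reads it a fixed number of times.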
-
- More importantly, many of today's complicated chips require fairly extensive
- initialization sequences in order to configure them to the correct operating
- mode. Such code is much easier to write and debug in Forth, because you can
- "try things out" by typing commands at the keyboard, then looking at the
- registers to see what happened.
-
- A set of simple Forth commands sufficient to do most hardware debugging jobs
- can easily be described on a single page, and many engineers and technicians
- have learned enough Forth in 30 minutes to be able to write sophisticated
- diagnostics for complicated hardware.
-
-
- Stack Usage
- ===========
-
- A previous example has shown how to access the top element on the stack which
- is stored in r10. Things get a little more complicated if more than 1 stack
- argument is needed. Remember that the top of the stack is stored in r10, and
- subsequent stack items are stored in a memory area whose address is contained
- in r13. For convenience, the assembler provides alternate names for r10 and
- r13, reflecting the use of these registers for the stack. r10 is also known
- as top (Top of Stack), and r13 is also known as sp (Stack Pointer).
-
- The basic rules for the Forth stack are:
-
- a) Upon entry to a code definition (assembly language), the top of the stack
- is contained in top. The next item on the stack is in the memory location
- whose address is contained in sp. The item after that is in memory at SP+4 ,
- the next at SP+8 , etc. Note that successive stack items are 4 bytes
- (32-bits) apart.
-
- b) A definition may modify the stack contents, and upon exit from the
- definition the new top of the stack should be in top, and the next item should
- be in memory at the address contained in sp.
-
- c) Assembly code should not access memory at negative offsets from sp. This
- restriction safeguards against problems in an interrupt-driven environment, in
- case the same stack happens to be used for interrupt handlers.
-
- If items are removed from the stack by a code definition, care must be taken
- to make sure the correct top of stack value is left in top. Also remember that
- the RISC OS Forthmacs assembler provides macros to assist in managing the
- stack. Here are some examples; study them carefully:
-
- code and (s n1 n2 -- n3 )
- r0 sp pop
- top top r0 and c;
-
- code min (s n1 n2 -- n1|n2 )
- r0 sp pop
- top r0 cmp
- top r0 gt mov c;
-
- code drop (s n1 n2 -- n1 )
- top sp pop c;
-
- code dup (s n1 -- n1 n1 )
- top sp push c;
-
- code 1+ (s n -- n+1 ) top 1 incr c;
-
- code @ (s a_adr -- n )
- top top ) ldr c;
-
- \ a somewhat optimized fill
- code fill (s adr cnt char -- )
- r2 top top 8 #lsl orr
- r0 r1 top 3 sp ia! ldm \ r0-cnt r1-adr r2-data
- r0 4 # cmp
- gt if
- begin r3 r1 3 # s and
- r0 1 ne decr
- r2 r1 byte )+ ne str
- eq until
- r0 8 s decr
- r2 r2 r2 10 #lsl orr
- r3 r2 mov
- begin r2 r3 2 r1 ia! ge stm
- r0 8 ge s decr
- lt until
- r0 4 s incr
- r2 r1 )+ ge str
- r0 4 lt decr
- then
- begin r0 1 s decr
- r2 r1 byte )+ ge str
- lt until c;
-
- code >name \ (s cfa -- nfa )
- top 1 decr \ skip flag byte
- begin r0 top byte -( ldr
- r0 0 # cmp
- ne until
- begin r0 top byte -( ldr
- r0 20 # cmp
- lt until c;
-
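- When writing an optimized code word such as fill, it can help to have a
- plain high-level version with the same stack effect to compare results
- against. The following is a minimal sketch in standard Forth, not the
- RISC OS Forthmacs implementation; the names fill-ref and testbuf are
- hypothetical, and cnt is assumed to be non-negative.
-
```forth
\ Hypothetical reference version of fill in high-level Forth.
\ It stores char into cnt consecutive bytes starting at adr.
: fill-ref ( adr cnt char -- )
   rot rot              \ char adr cnt
   0 ?do                \ run cnt times; skipped when cnt is 0
      2dup i + c!       \ store char at adr+i
   loop
   2drop ;

create testbuf 8 allot        \ 8-byte test buffer
testbuf 8 0 fill-ref          \ clear the whole buffer
testbuf 3 char A fill-ref     \ the first three bytes become "A"
testbuf 3 type                \ displays AAA
```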
-