- <!-- Forthmacs Formatter generated HTML output -->
- <html>
- <head>
- <title>Assembler Tutorial</title>
- </head>
- <body>
- <h1>Assembler Tutorial</h1>
- <hr>
- <p>
- This chapter explains how to use the Risc-OS Forthmacs ARM assembler in order to
- create short machine language code sequences. This chapter is a companion to
- the "ARM Assembler" chapter. That chapter describes the syntax of individual
- assembly language instructions. This chapter addresses "higher level" issues,
- such as how to begin and end the assembly process and how to communicate
- arguments and results between Forth and assembly language.
- <p>
- <p>
- <h2>Motivation</h2>
- <p>
- For nearly all debugging jobs, writing assembly language is unnecessary. Test
- loops can usually be written more quickly and easily in high-level Forth, and
- will execute quickly enough to get the job done.
- <p>
- However, in some cases the ultimate in speed is needed for certain critical
- operations, and assembly language may be the best way to go. In other cases,
- very specific combinations of machine instructions may exhibit problem behavior,
- and those combinations may need to be reproduced. Finally, some maintainers of
- the Risc-OS Forthmacs system software itself may need to understand the
- assembler.
- <p>
- <p>
- <h2>Assumptions</h2>
- <p>
- The chapter assumes that you already understand the ARM instruction set,
- including such issues as processor modes, interrupts and register sets. If
- not, you should first study an ARM reference, such as the manual published by the
- chip manufacturer.
- <p>
- <p>
- <h2>Example: a simple "code word"</h2>
- <p>
- Here is a very simple assembly language program. It adds "1" to the contents of
- a register then returns to the Forth interpreter.
- <p>
- <br><code> code addone ( n -- n+1 )</code><br>
- <br><code> r10 r10 1 # add</code><br>
- <br><code> c;</code><br>
- <p>
- To execute it and display the result, you would type, for example,
- <p>
- <br><code> 5 addone .</code><br>
- <p>
- Here's what is happening, line by line:
- <p>
- <br><code> code addone ( n -- n+1 )</code><br>
- <p>
- <code><A href="_smal_AH#187"> code </A></code> is a "defining word"; it creates
- a new command which can be executed by typing its name. The name of the new
- command in this case is <strong>addone</strong> . The name could have been
- anything; I have chosen the name <strong>addone</strong> because it describes
- the action of the program. You may already be familiar with another Forth
- defining word, "<strong>:</strong>" or <strong>colon</strong>.
- ":" also creates a new command; the difference between <code><A href="_smal_AH#187"> code </A></code>
- and ":" is that ":" creates a new command whose behavior is described by a
- sequence of other Forth commands, whereas <code><A href="_smal_AH#187"> code </A></code>
- creates a new command whose behavior is described by a sequence of assembly
- language instructions. After <code><A href="_smal_AH#187"> code </A></code>
- creates the new command, it starts the assembler so that assembly language
- instructions may be entered.
- <p>
- The stuff inside the parentheses is a comment; this particular comment indicates
- that the new command expects one argument ("n") on the stack before the word is
- executed, and after the command is executed, one result ("n+1") is left on the
- stack. The comment is optional, but its inclusion is strongly recommended.
- <p>
- <br><code> r10 r10 1 # add</code><br>
- <p>
- This is the assembly language instruction which defines the action of the new
- command. As you will recall from the "ARM Assembler" chapter, the
- Risc-OS Forthmacs assembler syntax has the destination register first, followed
- by the source operand(s), followed by the operation name. So, in this case, the
- source operands are the global register r10 and the immediate number 1, the
- destination operand is the global register r10, and the operation is add, i.e.
- 1 is added to the contents of register r10, and the result is placed back in
- register r10.
- <p>
- <br><code> c;</code><br>
- <p>
- <code><A href="_smal_BJ#171"> c; </A></code> terminates a code definition.
- At the end of the instructions you have assembled, <code><A href="_smal_BJ#171"> c; </A></code>
- automatically appends one machine instruction, whose effect is to return to Forth
- after the user-specified instructions have been executed.
- <p>
- <br><code> 5 addone .</code><br>
- <p>
- In order to invoke the new command, we enter the number 5 on the Forth stack,
- type the name of the command <strong>addone</strong> , and then display the
- result by typing the print command <code>"<A href="_smal_AV#d5"> ." </A></code>
- .
- <p>
- Perhaps you now wonder how the number got off the Forth stack and into the
- register r10, and afterwards how the number got out of r10 and back onto the
- Forth stack. The answer is simple: the top element of the Forth stack is always
- (!) kept in r10 , so no movement was necessary. That is why I chose r10 for the
- register in this example.
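- <p>
- As a small variation, here is a sketch of the opposite operation, together with
- a colon definition of the original command for comparison. The names
- <strong>subone</strong> and <strong>addone2</strong> are made up for this
- illustration, and I assume that sub takes its operands in the same order as add:
- <p><pre>
code subone ( n -- n-1 )
   r10 r10 1 # sub     \ r10 (top of stack) = r10 - 1
c;

: addone2 ( n -- n+1 )  1 + ;    \ same effect as addone, written in Forth

5 subone .     \ should display 4
5 addone2 .    \ should display 6
</pre><p>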
- <p>
- <p>
- <h2>Register Usage in Forth</h2>
- <p>
- To use the assembler effectively, you need to know which registers are available
- for use, and which of them must be left alone. Here are the rules:
- <p>
- r8, r9, r12, and r14 are used internally by the Forth interpreter or the operating
- system; their values must be left alone (otherwise the system will crash).
- <p>
- r10 contains the top of the Forth stack. It is used for passing arguments and
- results back and forth between Forth and assembly language.
- <p>
- r13 contains a pointer to a memory area containing the rest of the Forth stack
- (all elements other than the topmost one). That stack area is used for extra
- arguments and results. The section entitled "Stack Usage" tells you more about
- managing the stack area.
- <p>
- r0 - r6 may be used freely within assembly language code sequences. Forth does
- not depend on the contents of these registers. However, some Forth commands do
- use these registers as scratch registers, so your code should not attempt to
- keep important values in these registers from one time to the next. While your
- code is being executed, Forth will not change the contents of any of these
- registers, so you can depend on them for the duration of your assembly language
- sequence. Once your code finishes and returns to Forth, however, the register
- values may have changed by the next time you execute your code.
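- <p>
- For instance, a scratch register can hold an intermediate value for the duration
- of a single code word. This is only a sketch: the name <strong>triple</strong>
- is hypothetical, and I assume that add accepts a register as its second source
- operand in the same way that the and instruction does in the "Stack Usage"
- examples:
- <p><pre>
code triple ( n -- 3n )
   r0 r10 r10 add      \ r0 = n + n; r0 is free for use as scratch
   r10 r10 r0 add      \ r10 = n + 2n, left in the top-of-stack register
c;

2 triple .      \ should display 6
</pre><p>
- <p>
- Nothing is kept in r0 after the word returns, which is consistent with the rule
- that scratch registers cannot carry values from one execution to the next.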
- <p>
- While your machine code is executing, it will run at the full speed of the
- system, without any interference or overhead imposed by Risc-OS Forthmacs.
- Risc-OS Forthmacs does not itself use interrupts, so the processor will execute
- exactly the sequence of instructions which you have coded. It is possible that
- other software in the system may have set up some interrupts, but that is beyond
- the control of Risc-OS Forthmacs.
- <p>
- <p>
- <h2>Disassembler</h2>
- <p>
- The Risc-OS Forthmacs disassembler may be used to review the assembly language
- you have created:
- <p>
- <br><code> see addone</code><br>
- <p>
- The result will look something like this:
- <br><code> code addone</code><br>
- <br><code> ( 1e878 ) add r10,r10,#1</code><br>
- <br><code> ( 1e87c ) ldr pc,[r8],#4</code><br>
- <p>
- <p>
- The numbers along the left hand side are the addresses at which the various
- instructions appear. The addresses shown here will almost certainly be
- different from the addresses that you see.
- <p>
- You will notice that even though our example contained only one assembly
- language instruction, the disassembler shows one extra instruction. This extra
- instruction was automatically assembled by the <code><A href="_smal_BJ#171"> c; </A></code>
- command. Its purpose is to return control to Forth after the assembly
- language sequence has finished its execution (this is called the <code><A href="_smal_AK#3a"> next </A></code>
- instruction).
- <p>
- The <code><A href="_smal_BT#2cb"> see </A></code> command reads the name of a
- Forth command (in this case "addone"), determines what type of command it is (in
- this case "code", meaning that the
- command's behavior was defined by the assembler), and then displays a
- reconstruction of the source code for that command. <code><A href="_smal_BT#2cb"> see </A></code>
- also works for "colon" definitions, whose behaviour is defined in Forth instead
- of in assembly language. For an example of this, type "see find".
- <p>
- Many of the normal Forth commands are defined in assembly language, and <code><A href="_smal_BT#2cb"> see </A></code>
- can be used to look at how they are implemented. For example, type "see @" to
- see how the Forth "@" operator works (pronounced fetch, this operator takes an
- address from the top of the stack, reads the 32-bit contents of that address,
- and puts those contents back on top of the stack). You should try this right
- now and make sure you understand how it works. Note that the last instruction
- of "@" is exactly the same as the last instruction of "addone". Every code
- definition in Risc-OS Forthmacs ends with this same instruction.
- <p>
- <code><A href="_smal_BT#2cb"> see </A></code> automatically locates the address
- where the code for a particular command begins. That address was allocated by <code><A href="_smal_AH#187"> code </A></code>
- when the new command was defined. The disassembler can also be used to inspect
- machine code beginning at arbitrary addresses, not only that code which is
- created by <code><A href="_smal_AH#187"> code </A></code> . Suppose that you
- know there is some code starting at address 100000 and you wish to look at it:
- <p>
- <br><code> 100000 dis</code><br>
- <p>
- On your system, this example probably won't work exactly as shown because your
- system may not have any code at address 100000 (in fact, it may not even have
- any memory there). The main point, though, is that you type the address of the
- code you wish to disassemble, followed by "dis".
- <p>
- The disassembler will continue until it reaches a "definition ending"
- instruction, or until you stop it by typing the character "q", for "quit". It
- will also pause at the end of a screen and prompt you for a continuation
- character.
- <p>
- After the disassembler has stopped, you can make it continue where it left off
- by typing <code><A href="_smal_AL#cb"> +dis </A></code>.
- <p>
- <p>
- <h2>Setting the Starting Address</h2>
- <p>
- In most cases, you won't need to specify a starting address for the code you
- assemble. When you use the <code><A href="_smal_AH#187"> code </A></code>
- defining word to begin assembling, Risc-OS Forthmacs will find some appropriate
- memory for you and assemble your code there ( at <code><A href="_smal_AM#21c"> here) </A>.</code>
- You can then locate the memory Risc-OS Forthmacs has chosen by using the <code><A href="_smal_BT#2cb"> see </A></code>
- command to disassemble the code, looking at the addresses displayed alongside
- the machine instructions.
- <p>
- If you really need to assemble at a specific address, you can do so as follows
- (Note: in nearly all cases, this technique is unnecessary; very rarely does it
- matter where exactly you locate a bit of code, and allowing Risc-OS Forthmacs to
- allocate the memory for you is sufficient and convenient).
- <p>
- Save the current value of <code><A href="_smal_BH#1cf"> dp </A></code>, set it to your address, assemble, and then restore the saved value:
- <br><code> dp @</code><br>
- <br><code> your-adr dp !</code><br>
- <br><code> code demo</code><br>
- <br><code> ...... c;</code><br>
- <br><code> dp !</code><br>
- <p>
- <p>
- <h2>Conditional branches</h2>
- <p>
- In order to implement conditional operations and loops, most assemblers provide
- branch instructions and labels. Risc-OS Forthmacs has branches and labels too,
- but it also has a much better way, which eliminates most of the troublesome
- aspects of coding conditionals and loops in assembly language. The
- Risc-OS Forthmacs way is called "structured conditionals". For example, suppose
- we want to test a condition and execute some code only if the condition is true.
- Specifically, we want to compare r0 and r1, and execute some code only if r0 is
- less than r1 .
- <p>
- <br><code> Traditional assembler:</code><br>
- <br><code> </code><br>
- <br><code> cmp r0, r1</code><br>
- <br><code> bge temp</code><br>
- <br><code> ..some code we want to conditionally execute</code><br>
- <br><code> temp:</code><br>
- <p>
- <br><code> Forthmacs assembler with structured conditionals:</code><br>
- <br><code> </code><br>
- <br><code> r0 r1 cmp</code><br>
- <br><code> < if</code><br>
- <br><code> ..some code we want to conditionally execute..</code><br>
- <br><code> then</code><br>
- <p>
- As you can see, Risc-OS Forthmacs eliminates the need to mentally reverse the
- sense of the comparison, eliminates the need to invent and keep track of label
- names, and uses conventional mathematical comparison symbols (e.g. "<"),
- rather than alphabetic mnemonics. The complete set of comparison symbols is
- given in the "ARM Assembler" chapter.
- <p>
- The "if .. then" construct can also include an "else" clause:
- <p>
- <br><code> r0 r1 s cmp \ the s is optional</code><br>
- <br><code> < if</code><br>
- <br><code> ..code to execute if r0 < r1..</code><br>
- <br><code> else</code><br>
- <br><code> ..code to execute if r0 >= r1..</code><br>
- <br><code> then</code><br>
- <p>
- Of course, the assembler actually generates conditional branch instructions
- because that's what the hardware supports directly, but Risc-OS Forthmacs takes
- care of the "bookkeeping" for you.
- <p>
- Another way would be to use the conditional instructions offered by the ARM cpu.
- <p>
- <br><code> r0 r1 cmp</code><br>
- <br><code> xx xx lt xxx</code><br>
- <br><code> yy yy ge xxx</code><br>
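- <p>
- To make the placeholders above concrete, here is a sketch that copies the
- smaller of r0 and r1 into r2, first with a structured conditional and then with
- conditional instructions. The choice of r2 as the destination is arbitrary, and
- the condition-before-mnemonic form follows the "gt mov" usage in the "Stack
- Usage" examples:
- <p><pre>
\ structured conditional
r0 r1 cmp
< if
   r2 r0 mov        \ r0 < r1: take r0
else
   r2 r1 mov        \ r0 >= r1: take r1
then

\ conditional instructions, no branches generated
r0 r1 cmp
r2 r0 lt mov        \ executed only if r0 < r1
r2 r1 ge mov        \ executed only if r0 >= r1
</pre><p>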
- <p>
- <p>
- <p>
- <h2>Delayed Branches</h2>
- <p>
- ARM doesn't use delayed branches at all, so you don't need to worry about them.
- <p>
- <p>
- <h2>Loops</h2>
- <p>
- Risc-OS Forthmacs structured conditionals also have features for easily creating
- loops. Here is a loop which executes forever:
- <p>
- <br><code> Source Generates</code><br>
- <br><code> </code><br>
- <br><code> begin Label1:</code><br>
- <br><code> r0 top ) ldr ldr r0,[r10,#0]</code><br>
- <br><code> again b Label1</code><br>
- <p>
- This code assumes that the r10 register (top of stack, remember?) contains the
- address of a memory location, and the contents of that memory location is
- continuously read into the r0 register. This is an infinite loop; it won't stop
- until the system is reset, or power cycled, or externally interrupted in some
- way.
- <p>
- Suppose we want the loop to execute 9 times then quit:
- <p>
- <p><pre>
- r1 9 # mov
- begin
- r0 top ) ldr
- r1 r1 1 # s sub
- <= until
- </pre><p>
- <p>
- <p>
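- We continue to loop "until" r1 <code><A href="_smal_BQ#118"> <= </A></code>
- 0 . The "s" makes the sub instruction set the condition flags from its result,
- so the test applies to the decremented count: r1 goes from 9 down to 0, and
- the loop body executes 9 times.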
- <p>
- Finally, here's an example where we perform a test at the top of the loop rather
- than at the bottom, illustrating "while":
- <p>
- <p><pre>
- r1 9 # mov
- begin
r1 r1 1 # s sub
- > while
- r0 top ) ldr
- repeat
- </pre><p>
- <p>
- <p>
- This loop continues to execute "while" r1 > 0, and the "repeat" sends it
- back to the "begin".
- <p>
- Structured conditionals and loops nest in the expected manner, to an arbitrary
- depth. For instance, a "begin .. until" can be completely contained within an
- "if .. then", which itself may be contained within a "begin .. while ..
- repeat".
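- <p>
- As an illustration of nesting, here is a sketch of a complete code word with a
- "begin .. until" loop inside an "if .. then". The name <strong>sum-to</strong>
- is hypothetical, and I assume that add accepts a register as its second source
- operand, as the and instruction does in the "Stack Usage" examples:
- <p><pre>
code sum-to ( n -- sum )    \ sum = n + (n-1) + ... + 1, or 0 if n is not positive
   r0 0 # mov               \ r0 accumulates the sum
   top 0 # cmp
   > if                     \ skip the loop entirely unless n > 0
      begin
         r0 r0 top add      \ add the current count to the sum
         top top 1 # s sub  \ decrement the count and set the flags
      <= until              \ leave the loop once the count reaches 0
   then
   top r0 mov               \ return the sum in the top-of-stack register
c;

3 sum-to .      \ should display 6
</pre><p>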
- <p>
- <p>
- <h2>Scope Loops - Assembler vs. Forth</h2>
- <p>
- You can use assembly language for creating scope loops, but it is usually
- preferable to write them in Forth, because the Forth version is usually easier
- to write, easier to read, and easier to debug. The one advantage of an assembly
- language loop is that it is tighter. However this rarely matters. For
- comparison, suppose that you want to continually read location 1000 so that you
- can observe the action on an oscilloscope. This is how you would do it in
- assembly language:
- <p>
- <br><code> code test</code><br>
- <br><code> r0 th 1000 # mov</code><br>
- <br><code> begin</code><br>
- <br><code> r1 r0 ) ldr</code><br>
- <br><code> again</code><br>
- <br><code> c;</code><br>
- <p>
- Here's how you would do the same thing in Forth:
- <p>
- <br><code> begin 1000 @ drop again</code><br>
- <p>
- Additionally, the Forth version may be easily adapted to stop looping as soon as
- a key is typed:
- <p>
- <br><code> begin 1000 @ drop key? until</code><br>
- <p>
- More importantly, many of today's complicated chips require fairly extensive
- initialization sequences in order to configure them to the correct operating
- mode. Such code is much easier to write and debug in Forth, because you can
- "try things out" by typing commands at the keyboard, then looking at the
- registers to see what happened.
- <p>
- A set of simple Forth commands sufficient to do most hardware debugging jobs can
- easily be described on a single page, and many engineers and technicians have
- learned enough Forth in 30 minutes to be able to write sophisticated diagnostics
- for complicated hardware.
- <p>
- <p>
- <h2>Stack Usage</h2>
- <p>
- A previous example has shown how to access the top element on the stack which is
- stored in r10. Things get a little more complicated if more than 1 stack
- argument is needed. Remember that the top of the stack is stored in r10, and
- subsequent stack items are stored in a memory area whose address is contained in
- r13. For convenience, the assembler provides alternate names for r10 and r13,
- reflecting the use of these registers for the stack. r10 is also known as <code><A href="_smal_BG#1e"> top </A></code>
- (Top of Stack), and r13 is also known as <code><A href="_smal_BB#19"> sp </A></code>
- (Stack Pointer).
- <p>
- The basic rules for the Forth stack are:
- <p>
- a) Upon entry to a <code><A href="_smal_AH#187"> code </A></code> definition
- (assembly language), the top of the stack is contained in <code><A href="_smal_BG#1e"> top </A>.</code>
- The next item on the stack is in the memory location whose address is contained
- in <code><A href="_smal_BB#19"> sp </A>.</code> The item after that is in
- memory at <strong>sp+4</strong> , the next at <strong>sp+8</strong> , etc. Note
- that successive stack items are 4 bytes (32-bits) apart.
- <p>
- b) A definition may modify the stack contents, and upon exit from the definition
- the new top of the stack should be in <code><A href="_smal_BG#1e"> top </A>,</code>
- and the next item should be in memory at the address contained in <code><A href="_smal_BB#19"> sp </A>.</code>
- <p>
- c) Assembly code should not access memory at negative offsets from <code><A href="_smal_BB#19"> sp </A>.</code>
- This restriction safeguards against problems in an interrupt-driven environment,
- in case the same stack happens to be used for interrupt handlers.
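- <p>
- The following sketch applies rules a) and b) to exchange the top two stack
- items without moving the stack pointer. The name <strong>my-swap</strong> is
- made up to avoid clashing with the built-in command, and I assume that the ")"
- addressing form works with str and with <strong>sp</strong> as the base
- register, just as it does with ldr and <strong>top</strong> in the loop examples:
- <p><pre>
code my-swap (s n1 n2 -- n2 n1 )
   r0 sp ) ldr        \ r0 = n1, the second item, read from memory at sp
   top sp ) str       \ store n2 into the slot that held n1
   top r0 mov c;      \ n1 becomes the new top of stack

1 2 my-swap . .       \ should display 1 and then 2
</pre><p>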
- <p>
- If items are removed from the stack by a code definition, care must be taken to
- make sure the correct top of stack value is left in <code><A href="_smal_BG#1e"> top </A>.</code>
- Also remember that the Risc-OS Forthmacs assembler provides macros to assist in
- managing the stack. Here are some examples; study them carefully:
- <p>
- <p><pre>
- code and (s n1 n2 -- n3 )
- r0 sp pop
- top top r0 and c;
- code min (s n1 n2 -- n1|n2 )
- r0 sp pop
- top r0 cmp
- top r0 gt mov c;
- code drop (s n1 n2 -- n1 )
- top sp pop c;
- code dup (s n1 -- n1 n1 )
- top sp push c;
- code 1+ (s n -- n+1 ) top 1 incr c;
- code @ (s a_adr -- n )
- top top ) ldr c;
- \ a somewhat optimized fill
- code fill (s adr cnt char -- )
- r2 top top 8 #lsl orr
- r0 r1 top 3 sp ia! ldm \ r0-cnt r1-adr r2-data
- r0 4 # cmp
- gt if
- begin r3 r1 3 # s and
- r0 1 ne decr
- r2 r1 byte )+ ne str
- eq until
- r0 8 s decr
- r2 r2 r2 10 #lsl orr
- r3 r2 mov
- begin r2 r3 2 r1 ia! ge stm
- r0 8 ge s decr
- lt until
- r0 4 s incr
- r2 r1 )+ ge str
- r0 4 lt decr
- then
- begin r0 1 s decr
- r2 r1 byte )+ ge str
- lt until c;
- code >name \ (s cfa -- nfa )
- top 1 decr \ skip flag byte
- begin r0 top byte -( ldr
- r0 0 # cmp
- ne until
- begin r0 top byte -( ldr
- r0 20 # cmp
- lt until c;
- </pre><p>
- <p>
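- <p>
- Two further sketches in the same style, built only from the pop, cmp and
- conditional mov forms used above. The names <strong>my-nip</strong> and
- <strong>my-max</strong> are hypothetical, chosen so that they do not redefine
- existing commands:
- <p><pre>
code my-nip (s n1 n2 -- n2 )
   r0 sp pop c;               \ pop n1 into a scratch register and discard it
code my-max (s n1 n2 -- n1|n2 )
   r0 sp pop                  \ r0 = n1, top = n2
   top r0 cmp
   top r0 lt mov c;           \ if n2 < n1, replace top with n1

1 2 my-nip .       \ should display 2
3 7 my-max .       \ should display 7
</pre><p>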
- </body>
- </html>
-