ARM code for Beginners

Part 2: Constants, Comparisons, Labels and Loops

Brian Pickard explains how anyone can program in ARM code.

Before I begin on this issue's topics, here are the solutions to the last article's set of problems.

1. Modify the above to subtract the value of R2 from R1, placing the answer in R0.
A: This is just a straight substitution of

SUB R0,R1,R2

in place of the

ADD R0,R1,R2

2. Modify the above to do the following R0=R1+R2+R3.
A: This needs a little more care. Inside the assembler code we need the following:

    [
    ADD R0,R1,R2 :REM as before
    ADD R0,R0,R3 :REM this is allowed: R0 becomes equal to the value R0+R3
    MOV PC,R14   :REM return to BASIC program
    ]
    INPUT"Enter an integer value "B%
    INPUT"Enter another integer value "C%
    INPUT"Enter yet another integer value "D%
    A%=USR(mcode%)
    PRINT"The answer is ";A%
    END

Remember that we require a number in D% to get a value into R3.

3. Modify the above to do the following R0=R1+R2-R3.
A: This is similar to 2 but with the following inside the assembler:

    [
    ADD R0,R1,R2
    SUB R0,R0,R3
    MOV PC,R14
    ]

4. Modify the above to do the following R0=R1-R2-R3.
A: Again just the assembler code has changed from 2:

    
    [
    SUB R0,R1,R2
    SUB R0,R0,R3
    MOV PC,R14
    ]

Constants

In many programs values need to be given which are not changed and so need not be entered by the user at run time. These constants, known as immediate operands, can be given to registers in assembler code in various ways. For example:

    MOV R0,#48
    MOV R0,#ASC("0")  :REM load R0 with ascii code for zero
    MOV R0,#&30       :REM load R0 with hexadecimal number 30
    MOV R0,#(3<<4)    :REM load R0 with value three shifted 4 places left

All of the above leave R0 holding the value 48. The # symbol tells the assembler that an immediate operand (as opposed to a register operand) is to be used, the value of which follows the #. Note the use of the BASIC function ASC: this is one of the advantages of using the assembler that is built into BASIC. When the code is assembled, any BASIC expression is evaluated first, producing the correct result. The last example also shows a left shift operator. Shifts can be useful since there is a limit to the values that a constant can take as an immediate operand. This limitation arises because the whole command has to fit into 32 bits. I have not discussed the structure of ARM code since it is beyond the scope of these articles, but common sense dictates that if the command name, the registers and how they are used all have to fit into 32 bits, then any immediate operand has to fit into fewer than 32 bits. Just 12 bits are used, and the way in which they are used is not straightforward. I will not go into great detail but just give the rules.

Any value from 0 to 255 is allowed.
Any value which is in the range 256 to 1023 is allowed if divisible by 4.
Any value which is in the range 1024 to 4096 is allowed if divisible by 16.
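Larger values are allowed only if they follow the same pattern: a number from 0 to 255 shifted up by an even number of places, for example &FF00 or &10000. Anything else, such as 257 or 4100, is not allowed.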
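
The rules above can be checked mechanically. The sketch below is a Python model, not ARM code, and the function name is_arm_immediate is invented for illustration. It assumes the usual ARM layout, in which the 12 bits hold an 8-bit number plus a 4-bit count that rotates it right by an even number of places.

```python
def is_arm_immediate(value):
    """Return True if value can be written as an ARM immediate operand.

    Assumption: the 12-bit field is an 8-bit number rotated right by an
    even number of bit positions (0, 2, 4, ... 30).
    """
    value &= 0xFFFFFFFF  # treat the value as a 32-bit pattern
    for rot in range(0, 32, 2):
        # Rotating left by rot undoes a rotate right by rot.
        undone = ((value << rot) | (value >> (32 - rot))) & 0xFFFFFFFF
        if undone < 256:  # what is left fits in 8 bits
            return True
    return False


for n in (48, 255, 256, 257, 1020, 1028, 4080):
    print(n, is_arm_immediate(n))
```

Of these test values, 257 and 1028 should be reported as not allowed (257 is not divisible by 4 and 1028 is not divisible by 16), and the rest as allowed.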

This might seem restrictive, but there are easy ways around it using memory locations (see a later article). Here are some examples of the use of immediate operands; see if you can work out the results.

    ADD R0,R1,#4
    ADD R0,R0,#10
    SUB R0,R0,#4

But note that the following are not allowed:

    ADD R0,#4,#4
    SUB R1,#4,R2

An immediate operand can only be placed in the last operand position.

Loops and Labels

One of the most useful constructions in a program is the loop. We can use loops in ARM code quite easily with the aid of labels. Consider the following assembler code:

    [
    MOV R0,#16   :REM place the value 16 into R0
    .alabel%     :REM a label used during assembling.
    SUB R0,R0,#1 :REM take one off the value of R0
    B alabel%    :REM B is the code for branch, this will jump to position alabel%
    ]

Note the use of a dot (period) before a label name as in .alabel%.

During assembly, the memory address of the position marked .alabel% is stored in the BASIC variable alabel%. When the assembler reaches the branch command B alabel%, it works out the number of bytes back (in this case) to position alabel% and fills this value into the branch command.

The assembler does not put the memory address of alabel% itself into the branch command. There are two good reasons for this.

1. If relative jumps are used rather than jumps to actual memory positions, the code produced can be loaded into any position in memory. This type of code is called relocatable code and is far more useful than code that has to run from a definite position in memory.
2. The actual memory address would not fit in the 32 bit code for the branch!

The branch instruction uses 24 bits to store the size of the jump. Since a jump has to be a multiple of 4 bytes (all ARM commands are 4 bytes long), the value is stored as a count of commands rather than bytes, so the range is 4 times as large as a plain 24-bit byte count would allow.
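
The way the relative jump is filled in can be sketched as follows. This is a Python model, not ARM code; the function name encode_branch and the addresses &8000 and &9000 are made up for illustration. It assumes details not covered in this article: that the offset is measured from 8 bytes beyond the branch (the ARM's program counter runs ahead of the command being executed), and that &EA is the top byte of an unconditional B.

```python
def encode_branch(branch_addr, target_addr):
    """Build the 32-bit word for an unconditional B instruction.

    Assumptions: the offset is measured from branch_addr + 8 (the program
    counter runs two instructions ahead), it is stored as a signed 24-bit
    count of 4-byte words, and &EA is the top byte of an always-taken B.
    """
    byte_offset = target_addr - (branch_addr + 8)
    if byte_offset % 4 != 0:
        raise ValueError("branch target must be word aligned")
    word_offset = byte_offset // 4
    if not -(1 << 23) <= word_offset < (1 << 23):
        raise ValueError("target is out of range for a 24-bit offset")
    return 0xEA000000 | (word_offset & 0xFFFFFF)


# The loop from the text, assembled at a made-up address &8000:
#   &8000 MOV R0,#16
#   &8004 .alabel%  SUB R0,R0,#1
#   &8008 B alabel%
print(hex(encode_branch(0x8008, 0x8004)))
# The same code moved to &9000 gives the same word: it is relocatable.
print(hex(encode_branch(0x9008, 0x9004)))
```

Both calls should give the same word, which is the point of reason 1: nothing in the branch depends on where the code sits in memory.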

The ARM code above is not much use since it is an infinite loop! We need to know how to get out of a loop.

Comparisons and Conditions

In ARM code there are two main comparison commands, CMP and CMN. These can, with a modified branch command, test a condition and decide what to do next. Adding extra lines and modifying the branch we can have:

    [
    MOV R0,#16
    .aloop%
    SUB R0,R0,#1
    CMP R0,#0    :REM CMP code for compare
    BGT aloop%   :REM modified branch with GT (greater than) suffix added
    MOV PC,R14
    ]

The CMP command actually calculates R0-0 (known as a notional subtraction, since the answer is thrown away) and sets the status flags depending on the result. If R0 is greater than 0 (R0 is positive), the Negative, Overflow and Zero flags are all cleared. If R0 is 0, the Zero flag is set and the other two are cleared. The modified branch command with its suffix GT (meaning greater than) checks the flags and decides if a branch is required. In our example the branch no longer occurs once R0 becomes zero.
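
The flag settings described above can be followed step by step with a small model. This is a Python sketch, not ARM code; the names cmp_flags and gt are invented for illustration, and it also works out the Carry flag, which the text does not discuss. The GT test assumed here is "Zero clear, and Negative equal to Overflow".

```python
MASK = 0xFFFFFFFF  # registers are 32 bits wide


def cmp_flags(a, b):
    """Model CMP a,b: work out a-b, keep only the flags (N, Z, C, V)."""
    a &= MASK
    b &= MASK
    result = (a - b) & MASK
    n = result >> 31                      # Negative: top bit of the result
    z = int(result == 0)                  # Zero: result is exactly zero
    c = int(a >= b)                       # Carry: set when no borrow occurred
    v = ((a ^ b) & (a ^ result)) >> 31    # oVerflow: signed result wrapped
    return n, z, c, v


def gt(n, z, c, v):
    """The GT suffix: Zero clear, and Negative equal to oVerflow."""
    return z == 0 and n == v


r0 = 16            # MOV R0,#16
passes = 0
while True:        # .aloop%
    r0 -= 1        # SUB R0,R0,#1
    passes += 1
    if not gt(*cmp_flags(r0, 0)):   # CMP R0,#0 then BGT aloop%
        break
print(passes, cmp_flags(r0, 0))
```

The loop body runs 16 times, and on the final pass the flags come out as N clear, Z set, V clear, matching the description above.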

The CMP and CMN commands can also be used to compare the contents of two registers, for example CMN R0,R1. The CMN command makes R1 negative first and then does the subtraction, i.e. R0-(-R1), and sets the status flags.
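
Since R0-(-R1) is the same as R0+R1, CMN can be modelled as an addition whose answer is thrown away. The Python sketch below is not ARM code, and the name cmn_flags is invented for illustration; it shows that the Zero flag is set only when the first value is the negative of the second.

```python
MASK = 0xFFFFFFFF  # registers are 32 bits wide


def cmn_flags(a, b):
    """Model CMN a,b: work out a+b, keep only the flags (N, Z, C, V)."""
    a &= MASK
    b &= MASK
    total = a + b
    result = total & MASK
    n = result >> 31                             # Negative: top bit of result
    z = int(result == 0)                         # Zero: a is exactly -b
    c = int(total > MASK)                        # Carry out of bit 31
    v = (~(a ^ b) & (a ^ result) & MASK) >> 31   # signed result wrapped
    return n, z, c, v


# A register holding -6 compared against the operand 6:
print(cmn_flags(-6, 6))
# A register holding -5 compared against the operand 6:
print(cmn_flags(-5, 6))
```

Only the first call should report the Zero flag set.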

CMN R3,#6 compares R3 with -6. This is useful since an immediate operand cannot include a minus sign.

Suffixes can be added to most ARM commands, resulting in more compact and faster code:

    [
    MOV R0,#16
    .aloop%
    SUBS R0,R0,#1
    MOVEQ PC,R14
    B aloop%
    ]

This example does exactly the same but with no CMP line. The suffix S added to the SUB command makes the command set the status flags depending on the result. The zero flag will be set when R0 becomes zero. The EQ suffix on the MOV command means it will only be executed when the zero flag is set.

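Here is a table of all the suffixes that test conditions, together with the status flag settings they test (N = Negative, Z = Zero, C = Carry, V = Overflow).

    Suffix  Meaning                          Flags tested
    EQ      equal                            Z set
    NE      not equal                        Z clear
    CS      carry set (unsigned >=)          C set
    CC      carry clear (unsigned <)         C clear
    MI      minus (negative)                 N set
    PL      plus (positive or zero)          N clear
    VS      overflow set                     V set
    VC      overflow clear                   V clear
    HI      higher (unsigned >)              C set and Z clear
    LS      lower or same (unsigned <=)      C clear or Z set
    GE      greater than or equal (signed)   N equal to V
    LT      less than (signed)               N not equal to V
    GT      greater than (signed)            Z clear and N equal to V
    LE      less than or equal (signed)      Z set or N not equal to V
    AL      always                           none (always executed)
    NV      never                            none (never executed)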

Nested loops

Consider the following:

    [
    MOV R1,#128
    .outerlp%
    MVN R2,#0   :REM MVN places inverted input into R2
                :REM i.e. 0 (all bits clear) inverted = -1 (all bits set)
    SUB R1,R1,#1
    .innerlp%
    ADD R2,R2,#1
    CMP R2,R1
    BNE innerlp%
    CMP R1,#0
    BNE outerlp%
    MOV PC,R14
    ]
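
To see how the two registers move in the listing above, it can help to step through it away from the machine. The following is a Python sketch, not ARM code, with each line mirroring one ARM command; the function name run_nested and the pass counter are invented for illustration.

```python
MASK = 0xFFFFFFFF  # registers are 32 bits wide


def run_nested(start=128):
    """Step through the nested loop and count passes of the inner loop."""
    r1 = start                      # MOV R1,#128
    inner_passes = 0
    while True:                     # .outerlp%
        r2 = ~0 & MASK              # MVN R2,#0   (all bits set, i.e. -1)
        r1 = (r1 - 1) & MASK        # SUB R1,R1,#1
        while True:                 # .innerlp%
            r2 = (r2 + 1) & MASK    # ADD R2,R2,#1
            inner_passes += 1
            if r2 == r1:            # CMP R2,R1 / BNE innerlp% not taken
                break
        if r1 == 0:                 # CMP R1,#0 / BNE outerlp% not taken
            break
    return inner_passes


print(run_nested(128))
```

Each time round the outer loop, R2 counts up from 0 to the current value of R1, so with a start value of 128 the inner loop runs 128+127+...+1 = 8256 times in all.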

Having one programming structure (in this case loops) contained within another is called nesting. The structure can be repeated as many times as required. Note the following is not a good structure.

    [
    MOV R1,#128
    .outerlp%
    MVN R2,#0
    SUB R1,R1,#1
    .innerlp%
    ADD R2,R2,#1
    CMP R1,#0
    BGT outerlp%
    CMP R2,R1
    BNE innerlp%
    MOV PC,R14
    ]

It is not a good idea to have branches jumping out of loops in such a way, since it makes the flow of the program difficult to follow. With careful thought the nested example can be redesigned using the S suffix.

    [
    MOV R1,#128
    .outerlp%
    SUBS R1,R1,#1 :REM suffix S makes SUB command set status flags
    MOVEQ PC,R14  :REM this command only executed if Zero flag set
    MVN R2,R1     :REM R2 = NOT R1 = -(R1+1), so R1+1 additions reach zero
    .innerlp%
    ADDS R2,R2,#1 :REM again status flags set by this command
    BNE innerlp%  :REM this branch executed if Zero flag clear
    B outerlp%    :REM this branch always executed
    ]

All this is clever, but I suggest you keep the comparisons in your code: it is easier to construct and debug. I include these examples to show how compact ARM code can be.

We now have enough knowledge to produce small routines that mimic the computer's own routines, for example multiplication. This can be done by repeated addition, thus:

    [
    MOV R0,#0
    MOV R3,#0
    .multlp%
    ADD R0,R0,R1
    ADD R3,R3,#1
    CMP R3,R2
    BNE multlp%
    MOV PC,R14
    ]

In the above, R1 and R2 hold the numbers to be multiplied and the answer is returned in R0. For the complete listing see the BASIC program MultEG inside the ArmCode section of DiskWorld. This listing also shows that the ARM code can be used repeatedly without the need to assemble it each time.
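
The multiplication loop can be traced on paper or with a short model like the one below. This is a Python sketch, not the MultEG program, and the function name multiply is invented for illustration. It assumes that R2 is at least 1: with R2 equal to zero the comparison does not succeed on the first pass, and R3 has to count all the way round before the loop ends.

```python
MASK = 0xFFFFFFFF  # registers are 32 bits wide


def multiply(r1, r2):
    """Step through the multlp% listing; r1 and r2 are the two inputs.

    Assumption: r2 is at least 1, as discussed in the text above.
    """
    r0 = 0                      # MOV R0,#0
    r3 = 0                      # MOV R3,#0
    while True:                 # .multlp%
        r0 = (r0 + r1) & MASK   # ADD R0,R0,R1
        r3 = (r3 + 1) & MASK    # ADD R3,R3,#1
        if r3 == (r2 & MASK):   # CMP R3,R2
            break               # BNE multlp% not taken
    return r0


print(multiply(6, 7))
```

With 6 in R1 and 7 in R2, R1 is added to R0 seven times, giving 42.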

That's it for this part. Next time I will show how to include forward branches and use SWIs in ARM code.

Some Problems for solving.

1. Construct a BASIC program which includes an ARM code routine. The program should place user values in registers R3 and R4; the routine should compare them and give zero if R3 = R4, one if R3 < R4 and two if R3 > R4. The answer is to be returned in R0 for the BASIC program to print out.

2. Find the errors in the following:

    [
    MOV R2,#-6
    MOV R1,#257
    lp%
    SUB R1,R1,#4
    CMP R1,R2
    B lp%
    MOV PC,R14
    ]

3. Construct a BASIC program using an ARM routine which will check if the value in R1 is a possible ASCII code for a capital letter. If it is, the routine should place in R0 the corresponding lower case ASCII code, which the BASIC program will print out. Hint: the ASCII codes for A to Z are 65 to 90 inclusive; those for a to z are 97 to 122 inclusive.

4. Using a loop, design an ARM routine which mimics integer division by repeated subtraction. The routine should give the correct answer back for the BASIC program to print out.

Answers again next time. Problem 4 shows how a computer actually divides!

Brian Pickard