Yours recursively

Oooh tricky!

Recursion is undoubtedly one of the great mysteries of programming. Everyone who learns to program is constantly told by others of the many wonders there are to having recursive code inside your programs, yet getting to grips with the theory behind such a thing always seems incredibly difficult. The reason for this is that there is a great deal of confusion attached to the whole idea of a function calling itself. As I hope to prove, it's not really all that bad once you get the hang of it.

Why Recursion

The most obvious question is why bother with recursion, when you can just use normal iteration (like For loops and so on). Iteration is much easier to implement, uses less stack space, and certainly a lot easier to debug. Well, although the second two points might well be true, there are certain things that recursion can do a whole lot better and more importantly, in less code. This factor alone could make debugging a little easier.

There is another very good reason for choosing to program recursively: trees. Many advanced computing algorithms rely on a tree of some description. Things like compression (such as the Huffman algorithm), searches (like those used in route-finding programs such as the ancient GBRoute as well as games such as chess and draughts (or checkers for those of you viewing in American!) using an algorithm known as "MiniMax"), and compilers all require a tree of some description. Simple trees that have only two leaves (binary trees) are probably not quite so difficult to traverse iteratively, but as the size of the tree grows, implementing things this way will get progressively harder. Therefore, recursion is once more ready to help. In the example of a compiler, what might look like a linear sequence of instructions going down the screen does in fact become a huge tree that the compiler sits and crunches through in order to produce your code. It uses a method known as "Recursive Descent Parsing" to pick out all of the meaningful bits in your code, and as the name implies, this relies heavily on recursion. Although the problem doesn't really occur when using functions that Exec provides on the Amiga, if you were using singly-linked lists and wanted to examine the list backwards (an obvious reason for this is to release the memory used by the entire list, which you'd obviously want to do without accessing the memory that has already been freed), you'd have a bit of a job on your hands. Obviously, you only have links to take you from beginning to end in the list but not the other way around, so what do you do ? Well, with some hacking and several loops, you can definitely come up with an iterative solution to the problem, but a recursive implementation is far more elegant. Basically, using recursion in this case allows you to walk the list backwards.

A small example

Everyone who tries to teach recursion always begins with the clich�d factorial program. You've probably seen it a thousand times before and to be perfectly honest, it's not the most exciting reason to program recursively. Instead, I've included a short example of a recursive algorithm to convert a number as held in a memory location into something that you can print on the screen. Obviously, in order to do this we have to take a digit from the left of the number, convert it to it's ASCII equivalent (which is as simple as adding 48 (the ASCII code for zero) to the number) and printing it out. That's fine, but how do we implement it ? If we were talking iteration, we'd need a way of finding out how big a number is (ie. how many digits are in it -that can be done with a simple loop dividing the number by ten until you get zero and counting the number of divisions along the way), and then divide the original number by a progressively smaller power of ten in each iteration. Each time, we'd have to subtract the previous numbers multiplied by an increasing power of ten. So, for the number 5780, we'd divide (using integer division so that we don't get nasty floating point results) it by 1000 to get 5. Then we'd divide 5780 by 100 to get 57. Now subtract 50 (5 x 10) to get 7 and so on. Clearly, this is quite a cumbersome operation. It would be far easier if we could just knock off a digit from the right (and examine it) each time until we get zero, right ? The problem being that the number would come out back to front which generally isn't all that helpful. What we really need is a way of performing this operation and hanging on to the digits as we go so that they can be printed out in the correct order. What we want is for it to work just like a stack: Last In First Out. This can be done by holding the results in an array and then walking through it backwards, but if you were dealing with numbers that you didn't know the magnitude of, you'd have to perform a small loop to work out how many digits the number had, allocate memory for the array, do the operation and then clean up the memory for the array afterwards.

How do they do that ?

Naturally, there is a better way to perform this function. Funnily enough, it involves recursion. In essence, what we do is to actually knock the digit off the right each time, which of course is a doddle as it's merely an integer division by ten, but we don't actually print that digit we knocked off until we know that all the other digits have been knocked off. So, for each recursive call we say "knock this digit off, ask the rest of the functions to knock the rest of the digits off, and when they're finished with that, we print that digit that we knocked off ourselves". It couldn't be simpler! It should also be clear that this very same method is applied in each recursive call, so no code need be replicated.

I've included some source code below for you to study. To give the examples a slightly wider appeal, I have supplied code in ARexx, C, and 68000 assembly language. I appreciate that both C and ARexx already have functions to print numbers in various formats including hexadecimal from memory locations, but if you used them you wouldn't be learning anything now, would you ?

The key to breaking down the barrier of not comprehending recursion is to think of each of the recursive calls as a call to a completely separate function. Let's say that BaseConvert calls BaseConvert A, and that in turn calls BaseConvert B and so on. The reason we can get away with this method of thinking is because the variables in each call are preserved and so calling recursively actually leaves the existing variables completely intact. This proves useful for when those recursive calls start to drop out because we can use the knowledge built up from previous function calls to help us work out a result. Because of this, recursion depends upon stack-based variables. Without it, all the variables get overwritten at each function call, and you'd probably end up recursing forever which would ultimately mean a stack overflow and if you're really unlucky, a crash. That's probably why most people steer clear of recursion because it isn't worth the trouble (or so they think). If you can follow it, take a peek at the assembly language version of the example below. You'll notice that despite the fact that operating with registers is far faster, we still need to rely on the stack to keep track of the variables as we go. The result of this is that we have to push the values from registers onto the stack for when the function is called. Inside the function, we'd have to read (but not pop) the values from the stack back into registers. It sounds a little wasteful, but that's the only way we can do it!

Think of recursion as winding up a clock. When you're making all the recursive calls at the beginning, that's the actual winding bit where you build up towards the result. When the clock actually starts ticking (assuming you haven't busted the hairspring from excessive winding), that's when all the function calls start dropping out and when the main evaluation occurs.

OK, imagine we've got the number 5780. We start by making our first call to BaseConvert (which we'll call "BaseConvert A"). We check to see if 5780 is greater than or equal to 10 and because the answer is "yes", we recurse and go to BaseConvert B, but we pass on 578 instead (5780 / 10). Like I said, it's just like calling any other function so you're not actually modifying the instance of 5780 that you've been given -that bit is important otherwise it wouldn't work. So in BaseConvert B, we check to see if 578 is greater than 10. It is, and so we go to BaseConvert C, this time calling it with 57. The check succeeds and we give 5 to BaseConvert D. Finally, BaseConvert D finds that 5 is not greater than or equal to the base (10) and so it prints out the remainder of the division 5 / 10 (ie. it knocks a digit off the right) and drops out. When it drops out we go back to BaseConvert C, right after the check to see if 57 is greater than or equal to 10. You should be able to see why we don't have an "else" in the source code at this point. If we did, the digit printing part would be skipped after the function returns. So now we print out the remainder of 57 / 10, which is 7 and we drop out again. We're now in BaseConvert B and dealing with the number 578. The next thing we do is to print out 8 (remainder of 578 / 10) and drop back to BaseConvert A. Finally, we print 0 (remainder of 5780 / 10) and finish. So, we've printed out 5, 7, 8, and 0 which just so happens to be what we're looking for. Now think about it: we go right to the end of the number (from right to left) and then we work our way back again, but chucking out a digit onto the screen as we go. Easy! To make things a little clearer, have a quick peek at this diagram. Recursion example

There's more

Actually, the algorithm below has a far wider application that just printing base 10. In fact, you can use it to print pretty much any base you can think of (ASCII table permitting). So, with very little effort (changing one number, in fact) you can make it do hexadecimal (base 16), octal (base 8) or even binary extremely easily. That's why I've made the code deliberately general, otherwise if you were only ever going to work with base ten, you could reduce the calling parameters to pass just the number and hard-code the base. Try changing the number and see what happens. If the base is between 2 and 10, you should see the results instantly. However, if you want to do larger bases, you'll need to check what range the digit you're printing out is in. This is due to the nature of the ASCII table, nothing more. If the letter "A" was right after "9", then you wouldn't have a problem, but it isn't and so if you wanted to do hexadecimal, you'd have to check that the number is between 0 and 9. If it is, print out 48 (ASCII code for "0") plus that number. Otherwise, if it's between 10 and 15, print out 55 (65 (the ASCII code for "A") - 10) plus that number to give the range "A" to "F".

Go functional!

My real advice if you're still not getting the hang of this recursion lark is to get your mits on a functional programming language. Although logicians will harp on about how good they are for writing all sorts of proofs using induction and so on, personally I go for the rather more practical application. The thing about functional languages is that you don't get your usual collection of iterative instructions, and so in order to perform loop-based operations, you have to opt for the recursive method. Effectively, it's like being chucked off a cliff and in an attempt to make you learn how to fly. It works, too! Academics will tell you that programming with a functional language is unlike programming with anything else. Personally, I think that's crap.

There are several languages you can try, but my recommendation would be to go for Miranda. Now, I'm not 100% sure if a complete Amiga port of Miranda exists, but there are derivatives (namely one called "Gopher", which differs only in terms of some of the symbols which is no major hardship) which definitely does exist for the Amiga. You can also get quite a few good books to help you along with this language. Alternatives include LISP (which I've never tried) but is fairly popular, and ML (Meta Langauge, although it has been known as "Malicious Language" and "Mandatory Labotomy" by students that study it for reasons that would soon become apparent if you were ever to attempt to program with it). I'd probably recommend against ML because it can turn even the most seasoned programmers into quivering wrecks. I believe that an Amiga port does exist, but unless you're into writing compilers then I'd suggest you steer clear of it because it is a very messy language that can quite easily turn simple typos into a huge list of errors due to it's "intelligent" attempts to correct your mistakes, and it's huge memory requirements during the type checking phase.

That's all folks

With luck, this simple introduction has at least given you an appreciation for why recursion is an important thing for a good programmer to understand, and hopefully taken away some of the fear associated with it. Happy programming!

Source code

ARexx version

/* ARexx base conversion */ Address Command Number = 5780 Base = 10 Call BaseConvert Number, Base 'Echo' Exit BaseConvert: PROCEDURE Arg Number, Base If Number >= Base Then Do Call BaseConvert (Number % Base), Base End /* Note that there is NO "else" here because we want */ /* the function to do BOTH operations (ie. recurse, */ /* then print) if the condition succeeds, and only */ /* then printing bit if it doesn't. */ 'Echo '||D2C((Number // Base) + 48)||' NOLINE' Return

C version (SAS/C)

#include <stdio.h> void BaseConvert(int, int); void main() { BaseConvert (5780, 10); printf("\n"); } void BaseConvert(int Number, int Base) { if (Number >= Base) { BaseConvert(Number / Base, Base); } /* Note that there is NO "else" here because we want */ /* the function to do BOTH operations (ie. recurse, */ /* then print) if the condition succeeds, and only */ /* then printing bit if it doesn't. */ printf("%c", 48 + (Number % Base)); }

68000 assembly version (Devpac)

Push    MACRO
        Move.L \1,-(A7)
    ENDM

Pop MACRO
    Move.L +(A7),\1
    ENDM

Discard MACRO
    Add.L #\1*4,A7
    ENDM

Start:
    Move.L #Number,A0
    Move.L #5780,D0 ; An example of a number
                    ; ie. 5780 in decimal
    Move.L #10,D1   ; Convert it to base 10 (easy!)
    Push D1         ; Stick 'em both
    Push D0         ; onto the stack
    Bsr Convert     ; Now call the subroutine
                    ; - Bsr NOT Bra!
    Discard 2       ; Clean up the stack
                    ; - exit gracefully!
    Rts             ; Ta da!

Convert:
    Move.L 4(A7),D0 ; Pull Number and Base
                    ; from inside the stack
    Move.L 8(A7),D1 ; Don't forget that when
                    ; a Bsr instruction is
    Cmp.W D0,D1     ; issued, an extra 4 bytes
                    ; are pushed onto the stack!
    Bhi Digit       ; Jump if Base>Number
                    ; (same as) Number>=Base
    Push D1
    Divu.W D1,D0
    Andi.L #$FFFF,D0; Strip the division
                    ; remainder from the result
    Push D0
    Bsr Convert     ; Recursion time!
    Discard 2       ; Clean up the stack
                    ; - VERY important!

Digit:
    Move.L 4(A7),D0 ; It may seem silly, but
                    ; we must copy the data
    Move.L 8(A7),D1 ; for Number and Base back
                    ; into registers again
    Divu.W D1,D0    ; (this is for when other
                    ; recursions return)
    Swap.W D0       ; Swap hi & lo words to get
                    ; the remainder
    Bsr PrintDigit  ; Print that digit
    Rts

PrintDigit:
    Add.B #48,D0    ; ASCII code for '0' is 48,
                    ; so add 48 to the number
    Move.B D0,(A0)+ ; Stick that digit at the
                    ; next position in memory
    Rts

Number: Ds.B 50     ; Reserve a few bytes for
                    ; the result