Recursion is undoubtedly one of the great mysteries of programming.
Everyone who learns to program is constantly told by others of the many
wonders there are to having recursive code inside your programs, yet
getting to grips with the theory behind such a thing always seems
incredibly difficult. The reason for this is that there is a great deal
of confusion attached to the whole idea of a function calling itself. As
I hope to prove, it's not really all that bad once you get the hang of
it.
The most obvious question is why bother with recursion, when you can just use normal iteration (like For loops and so on). Iteration is much easier to implement, uses less stack space, and certainly a lot easier to debug. Well, although the second two points might well be true, there are certain things that recursion can do a whole lot better and more importantly, in less code. This factor alone could make debugging a little easier.
There is another very good reason for choosing to program recursively:
trees. Many advanced computing algorithms rely on a tree of some
description. Things like compression (such as the Huffman algorithm),
searches (like those used in route-finding programs such as the ancient
GBRoute as well as games such as chess and draughts (or checkers for
those of you viewing in American!) using an algorithm known as "MiniMax"),
and compilers all require a tree of some description. Simple trees that
have only two leaves (binary trees) are probably not quite so difficult
to traverse iteratively, but as the size of the tree grows, implementing
things this way will get progressively harder. Therefore, recursion is
once more ready to help. In the example of a compiler, what might look
like a linear sequence of instructions going down the screen does in fact
become a huge tree that the compiler sits and crunches through in order
to produce your code. It uses a method known as "Recursive Descent
Parsing" to pick out all of the meaningful bits in your code, and as the
name implies, this relies heavily on recursion. Although the problem
doesn't really occur when using functions that Exec provides on the Amiga,
if you were using singly-linked lists and wanted to examine the list
backwards (an obvious reason for this is to release the memory used by
the entire list, which you'd obviously want to do without accessing the
memory that has already been freed), you'd have a bit of a job on your
hands. Obviously, you only have links to take you from beginning to end
in the list but not the other way around, so what do you do ? Well, with
some hacking and several loops, you can definitely come up with an
iterative solution to the problem, but a recursive implementation is far
more elegant. Basically, using recursion in this case allows you to walk
the list backwards.
Everyone who tries to teach recursion always begins with the clich�d
factorial program. You've probably seen it a thousand times before and
to be perfectly honest, it's not the most exciting reason to program
recursively. Instead, I've included a short example of a recursive
algorithm to convert a number as held in a memory location into something
that you can print on the screen. Obviously, in order to do this we have
to take a digit from the left of the number, convert it to it's ASCII
equivalent (which is as simple as adding 48 (the ASCII code for zero) to
the number) and printing it out. That's fine, but how do we implement it ?
If we were talking iteration, we'd need a way of finding out how big a
number is (ie. how many digits are in it -that can be done with a simple loop
dividing the number by ten until you get zero and counting the number of
divisions along the way), and then divide the original number by a
progressively smaller power of ten in each iteration. Each time, we'd
have to subtract the previous numbers multiplied by an increasing power
of ten. So, for the number 5780, we'd divide (using integer division so
that we don't get nasty floating point results) it by 1000 to get 5. Then
we'd divide 5780 by 100 to get 57. Now subtract 50 (5 x 10) to get 7 and
so on. Clearly, this is quite a cumbersome operation. It would be far
easier if we could just knock off a digit from the right (and examine it)
each time until we get zero, right ? The problem being that the number
would come out back to front which generally isn't all that helpful. What
we really need is a way of performing this operation and hanging on to the
digits as we go so that they can be printed out in the correct order.
What we want is for it to work just like a stack: Last In First Out. This
can be done by holding the results in an array and then walking through it
backwards, but if you were dealing with numbers that you didn't know the
magnitude of, you'd have to perform a small loop to work out how many
digits the number had, allocate memory for the array, do the operation
and then clean up the memory for the array afterwards.
Naturally, there is a better way to perform this function. Funnily enough, it involves recursion. In essence, what we do is to actually knock the digit off the right each time, which of course is a doddle as it's merely an integer division by ten, but we don't actually print that digit we knocked off until we know that all the other digits have been knocked off. So, for each recursive call we say "knock this digit off, ask the rest of the functions to knock the rest of the digits off, and when they're finished with that, we print that digit that we knocked off ourselves". It couldn't be simpler! It should also be clear that this very same method is applied in each recursive call, so no code need be replicated.
I've included some source code below for you to study. To give the examples a slightly wider appeal, I have supplied code in ARexx, C, and 68000 assembly language. I appreciate that both C and ARexx already have functions to print numbers in various formats including hexadecimal from memory locations, but if you used them you wouldn't be learning anything now, would you ?
The key to breaking down the barrier of not comprehending recursion is to think of each of the recursive calls as a call to a completely separate function. Let's say that BaseConvert calls BaseConvert A, and that in turn calls BaseConvert B and so on. The reason we can get away with this method of thinking is because the variables in each call are preserved and so calling recursively actually leaves the existing variables completely intact. This proves useful for when those recursive calls start to drop out because we can use the knowledge built up from previous function calls to help us work out a result. Because of this, recursion depends upon stack-based variables. Without it, all the variables get overwritten at each function call, and you'd probably end up recursing forever which would ultimately mean a stack overflow and if you're really unlucky, a crash. That's probably why most people steer clear of recursion because it isn't worth the trouble (or so they think). If you can follow it, take a peek at the assembly language version of the example below. You'll notice that despite the fact that operating with registers is far faster, we still need to rely on the stack to keep track of the variables as we go. The result of this is that we have to push the values from registers onto the stack for when the function is called. Inside the function, we'd have to read (but not pop) the values from the stack back into registers. It sounds a little wasteful, but that's the only way we can do it!
Think of recursion as winding up a clock. When you're making all the recursive calls at the beginning, that's the actual winding bit where you build up towards the result. When the clock actually starts ticking (assuming you haven't busted the hairspring from excessive winding), that's when all the function calls start dropping out and when the main evaluation occurs.
OK, imagine we've got the number 5780. We start by making our first call to BaseConvert (which we'll call "BaseConvert A"). We check to see if 5780 is greater than or equal to 10 and because the answer is "yes", we recurse and go to BaseConvert B, but we pass on 578 instead (5780 / 10). Like I said, it's just like calling any other function so you're not actually modifying the instance of 5780 that you've been given -that bit is important otherwise it wouldn't work. So in BaseConvert B, we check to see if 578 is greater than 10. It is, and so we go to BaseConvert C, this time calling it with 57. The check succeeds and we give 5 to BaseConvert D. Finally, BaseConvert D finds that 5 is not greater than or equal to the base (10) and so it prints out the remainder of the division 5 / 10 (ie. it knocks a digit off the right) and drops out. When it drops out we go back to BaseConvert C, right after the check to see if 57 is greater than or equal to 10. You should be able to see why we don't have an "else" in the source code at this point. If we did, the digit printing part would be skipped after the function returns. So now we print out the remainder of 57 / 10, which is 7 and we drop out again. We're now in BaseConvert B and dealing with the number 578. The next thing we do is to print out 8 (remainder of 578 / 10) and drop back to BaseConvert A. Finally, we print 0 (remainder of 5780 / 10) and finish. So, we've printed out 5, 7, 8, and 0 which just so happens to be what we're looking for. Now think about it: we go right to the end of the number (from right to left) and then we work our way back again, but chucking out a digit onto the screen as we go. Easy! To make things a little clearer, have a quick peek at this diagram. |
Actually, the algorithm below has a far wider application that just
printing base 10. In fact, you can use it to print pretty much any base
you can think of (ASCII table permitting). So, with very little effort
(changing one number, in fact) you can make it do hexadecimal (base 16),
octal (base 8) or even binary extremely easily. That's why I've made the
code deliberately general, otherwise if you were only ever going to work
with base ten, you could reduce the calling parameters to pass just the
number and hard-code the base. Try changing the number and see what
happens. If the base is between 2 and 10, you should see the results
instantly. However, if you want to do larger bases, you'll need to check
what range the digit you're printing out is in. This is due to the
nature of the ASCII table, nothing more. If the letter "A" was right
after "9", then you wouldn't have a problem, but it isn't and so if you
wanted to do hexadecimal, you'd have to check that the number is between
0 and 9. If it is, print out 48 (ASCII code for "0") plus that number.
Otherwise, if it's between 10 and 15, print out 55 (65 (the ASCII code
for "A") - 10) plus that number to give the range "A" to "F".
My real advice if you're still not getting the hang of this recursion lark is to get your mits on a functional programming language. Although logicians will harp on about how good they are for writing all sorts of proofs using induction and so on, personally I go for the rather more practical application. The thing about functional languages is that you don't get your usual collection of iterative instructions, and so in order to perform loop-based operations, you have to opt for the recursive method. Effectively, it's like being chucked off a cliff and in an attempt to make you learn how to fly. It works, too! Academics will tell you that programming with a functional language is unlike programming with anything else. Personally, I think that's crap.
There are several languages you can try, but my recommendation would be
to go for Miranda. Now, I'm not 100% sure if a complete Amiga port of
Miranda exists, but there are derivatives (namely one called "Gopher",
which differs only in terms of some of the symbols which is no major
hardship) which definitely does exist for the Amiga. You can also get quite
a few good books to help you along with this language. Alternatives
include LISP (which I've never tried) but is fairly popular, and ML
(Meta Langauge, although it has been known as "Malicious Language" and
"Mandatory Labotomy" by students that study it for reasons that would
soon become apparent if you were ever to attempt to program with it). I'd
probably recommend against ML because it can turn even the most seasoned
programmers into quivering wrecks. I believe that an Amiga port does
exist, but unless you're into writing compilers then I'd suggest you
steer clear of it because it is a very messy language that can quite
easily turn simple typos into a huge list of errors due to it's
"intelligent" attempts to correct your mistakes, and it's huge memory
requirements during the type checking phase.
With luck, this simple introduction has at least given you an appreciation
for why recursion is an important thing for a good programmer to understand,
and hopefully taken away some of the fear associated with it. Happy
programming!
/* ARexx base conversion */ Address Command Number = 5780 Base = 10 Call BaseConvert Number, Base 'Echo' Exit BaseConvert: PROCEDURE Arg Number, Base If Number >= Base Then Do Call BaseConvert (Number % Base), Base End /* Note that there is NO "else" here because we want */ /* the function to do BOTH operations (ie. recurse, */ /* then print) if the condition succeeds, and only */ /* then printing bit if it doesn't. */ 'Echo '||D2C((Number // Base) + 48)||' NOLINE' Return
#include <stdio.h> void BaseConvert(int, int); void main() { BaseConvert (5780, 10); printf("\n"); } void BaseConvert(int Number, int Base) { if (Number >= Base) { BaseConvert(Number / Base, Base); } /* Note that there is NO "else" here because we want */ /* the function to do BOTH operations (ie. recurse, */ /* then print) if the condition succeeds, and only */ /* then printing bit if it doesn't. */ printf("%c", 48 + (Number % Base)); }
Push MACRO Move.L \1,-(A7) ENDM Pop MACRO Move.L +(A7),\1 ENDM Discard MACRO Add.L #\1*4,A7 ENDM Start: Move.L #Number,A0 Move.L #5780,D0 ; An example of a number ; ie. 5780 in decimal Move.L #10,D1 ; Convert it to base 10 (easy!) Push D1 ; Stick 'em both Push D0 ; onto the stack Bsr Convert ; Now call the subroutine ; - Bsr NOT Bra! Discard 2 ; Clean up the stack ; - exit gracefully! Rts ; Ta da! Convert: Move.L 4(A7),D0 ; Pull Number and Base ; from inside the stack Move.L 8(A7),D1 ; Don't forget that when ; a Bsr instruction is Cmp.W D0,D1 ; issued, an extra 4 bytes ; are pushed onto the stack! Bhi Digit ; Jump if Base>Number ; (same as) Number>=Base Push D1 Divu.W D1,D0 Andi.L #$FFFF,D0; Strip the division ; remainder from the result Push D0 Bsr Convert ; Recursion time! Discard 2 ; Clean up the stack ; - VERY important! Digit: Move.L 4(A7),D0 ; It may seem silly, but ; we must copy the data Move.L 8(A7),D1 ; for Number and Base back ; into registers again Divu.W D1,D0 ; (this is for when other ; recursions return) Swap.W D0 ; Swap hi & lo words to get ; the remainder Bsr PrintDigit ; Print that digit Rts PrintDigit: Add.B #48,D0 ; ASCII code for '0' is 48, ; so add 48 to the number Move.B D0,(A0)+ ; Stick that digit at the ; next position in memory Rts Number: Ds.B 50 ; Reserve a few bytes for ; the result