AMIGAWORLD - The Aardappel Programming Language

The Aardappel Programming Language
A gentle introduction

Wouter van Oortmerssen

abstract: Aardappel is a novel graphical programming language which computes by concurrent tree space transformation, a generalisation of tree rewriting. Major contributions are Aardappel's model for sharing and its actual graphical syntax. This paper aims to introduce the reader to the language and discuss the merits of its design.

Introduction

Aardappel could be categorized as a concurrent linear graphical tree space transformation language. Before explaining what that actually involves, this introduction will give a brief overview of each of the sub-components of that phrase. Section "Aardappel" will then discuss how the Aardappel language fits together based on the issues in the previous chapter. Section "Boerenkool" will show how this is all represented and edited graphically.

Tree Rewriting

Tree rewriting is a very simple and orthogonal way of programming: everything is tree-structured. To write a program you define a set of rules, each of which say: "if anywhere you find a (sub)tree that has this shape, then replace it with a tree this shape". If you then give the program a tree, the program will try and apply as many of the rules as possible until it has rewritten the tree to a shape to which no rules apply (this is called normal form). To clarify this lets look at some examples. To represent trees we will use the '(' and ')' syntax to attach a list of children to a root, e.g. a tree a with 1 as its first child, and as a second child a tree b with 2 as its only child would look like a(1,b(2)).

If we have the rule

b(x) = x*x

then our example tree a(1,b(2)) would evaluate to a(1,4). The rule says: if you find a b tree somewhere, replace it with a * tree with two copies of the child (which we have given the temporary name x).

So in between the tree looked like a(1,2*2) where 2*2 is handled by the built-in rules for maths. So how does one compute useful functions with such a system? As you might have already noticed from the previous example, apart from data structures bits of tree can also represent functions, given that there is a rule that shows how to evaluate them. For example we could have a program to compute factorials if we write rules how to rewrite fac trees:

fac(0) = 1
fac(n) = n*fac(n-1)

So given a tree fac(5), this will be rewritten into 5*fac(5-1), right until the final tree 120 appears. There's quite a variety of possible tree rewriting languages, especially if you consider that there are many answers to the question: "given a big tree and more than one applicable rule, which part should be rewritten first?" Different answers give different results, and correspond to eager and lazy (and even non-deterministic) evaluation strategies. Tree rewriting, then, resembles functional programming a lot. But since on the surface tree rewriting doesn't seem to offer anything apart from its nice and simple computational model that functional programming doesn't, it hasn't been used a lot as a basis for a programming language.

Tuple Spaces

A tuple space is a mechanism for implementing concurrency in a language. One can imagine a tuple space as a big bag (multiset) of "things to be done" (the tuples). Then a set of concurrent entities (threads or processes) can take each take a tuple out of the bag whenever they're done with their current job. Depending on the program, processes can look in the bag for any tuple, or look for specific ones according to their abilities. The bag can also be used to drop results back into, or maybe generate more "work" as a result of executing the current job. It may even be used to share information among the processes: I can pick up a tuple which holds some information, do something with it, and then throw it back for other processes to use it. If processes are waiting for a specific tuple to appear which isn't there yet, they either have to block (synchronisation), or they can process a couple of other tuples while waiting.

For example, in a simple master-slave setup, the master might throw a large set of tuples into the bag, each of which represents an item to be looked up in a database. A set of worker processes could then continue to grab tuples out of the bag until all jobs are done. Each time they finish processing a tuple they throw a result tuple back, which are then collected by the master process. The workers don't need to wait for each other, and faster workers can simply process more of the tuples.

The idea of a tuple space was invented by Carriero and Gelernter in their language Linda <> <>. They call Linda a coordination language because it's not really a programming language by itself, but more like a concurrency model that fits on top of a sequential language (in their case initially C). "Coordination" also, because they envisioned it being used for more than just concurrency, i.e. the coordination of software ensembles (read: components).

Linda is a general model for anonymous, asynchronous communication, coordination and synchronisation of program parts with many nice properties. As may be obvious from the above, Linda not only allows you to model 1 to 1 communication like normal message passing allows you to, but also 1 to n, n to 1 and n to n setups. This allows you to build asynchronous tuple streams or the software equivalent of a bus. Many concurrency schemes such as master-slave, replicated worker or specialist parallelism fit very naturally onto Linda. One can build distributed datastructures from tuples, i.e., data structures that can be accessed by many processes at the same time without blocking, furthermore these data structures can be live, i.e. computing towards a result whilst sitting in the bag. Many of these properties stem from the fact that, in Linda, communication is an actual dynamic datastructure, not a mere static protocol belonging to one or the other process, one way only. This loose coupling makes Linda very expressive, if not robust and easily extendible.

Sharing

By sharing we mean the shared use of an entity by two or more other entities, and in this paper we particularly focus on sharing of runtime values, and even more particularly Sharing of mutable runtime values.

The prototypical example of such an entity is a global variable, though in general any modifiable memory location (such as a variable in an object) is such an entity. If I mutate (modify) a global variable and then execute a different bit of code that somehow relies (has a dependancy on) this same global variable, the change I made will automatically be picked up. Very convenient, or?

For those who have ever worked on a large program with many global variables (or other shared datastructures) which were being modified from many unrelated places, you might recognise the situation where with every change to code which involves this global data it becomes harder and harder to keep track of the correct change of state, the code becomes very "fragile" (touch it and it will fall over), and in general development grinds to a halt with opportunities for safely modifying the software being minimised. The reason for this behaviour is that adding bits of state modifying code has an exponential influence on the number of dependancies between program entities. A programmer is eventually going to lose the struggle to keep track of all dependancies, which results in... bugs.

If you reflect on all bugs you ever laid your eyes on, many of them are explainable as caused by some form of sharing mutable data. So is mutation public enemy number one?

A definite "Yes!" is the answer you'll get from people who have adopted the paradigm of (pure) functional programming languages. The mess caused by shared mutable data is cleaned up in functional languages by doing away with any form of mutation altogether. Sharing, then, is of little importance since without mutation you can't tell the difference between a shared object and a copy[This is exactly why functional languages don't have object identity, which is sorely needed for many kinds of programs. Object identity could be defined as: the property of a entity (i.e. an object / value) that if a second entity transforms it, it is possible for a third entity to observe this change transparently.]. This is not the full story however, as no language can live without shared mutable data: if you could never obtain a modified version of an object, you wouldn't be able to implement any algorithm. In a functional language therefore, all situations where an algorithm (or a software architecture in general) requires the new version of an object to be seen by other bits of code, that object has to be passed as an argument to all functions that call those bits of code. As you can imagine this is not only very cumbersome, it also changes the whole structure of your code, as a call graph path from every mutating bit of code to every other probably wasn't even available in the imperative version. Obviously, if you write functional code from scratch, then the implications of the above won't be so grave, and because of the very explicit way it tells the programmer about dependancies (through arguments), functional languages are still to be preferred above imperative (OO inclusive) ones. Many problems though find a conceptually very intuitive mapping onto an object oriented (or even concurrent) software architecture, and mapping them onto a functional language makes programming if not error prone then at least inconvenient.

Sharing of mutable data, then, is at the same time absolutely necessary and also the source of all evil. Of models of sharing that try to find a way to put together these ununifyable properties in a sensible way there are currently only two extremes: imperative and functional, and neither seems to do a particularly optimal job.

Graphical Languages

A graphical programming language is a language whose programs can be expressed fully as a two (or more) dimensional graphical diagram where graphical cues are used to denote relationships between program entities, as opposed to a textual programming language which is represented using a one dimensional ascii-string (a text file) and more heavily relies on identifiers and sequential nesting to denote relationships between entities.

Graphical[I purposefully avoid the use of the term "visual" as certain compiler vendors have blurred that term to mean "language that comes with an integrated environment and/or interface builder".] languages are becoming more popular these days, be it mostly as simplistic special purpose languages. So why is it that general purpose graphical languages have failed to make an impression, especially since there is so much more freedom for developing a clever syntax?

Lets first look at what languages are actually available. One can recognise two classes of languages that have been moderately succesful:

Graphical dataflow languages, e.g. Prograph, Labview, Max. The idea for these languages is simple and intuitive: we have a number of boxes on screen which represent functions and control structures (e.g. +, if, print). If we then need a value computed by one box as the input for another box we simply draw a line between them. Data "flows" over lines, and lines can be interpreted as local variables in, e.g., C. It's possible to write any algorithm using this system, yet it has one flaw which no layout mechanism can overcome: as code gets more complex, lines on screen will have to cross, and the whole turns into an incomprehensable spaghetti. Whereas simple dataflow examples will have programmers surprised over the beauty and the simplicity of the language, real life examples will have them running back to their textual languages.
Icon/graph rewriting languages, e.g. KidSim, Agentsheets, Visulan3d, Chemtrains <>. These languages are much like the system of tree rewriting introduced earlier: they have a world full of entities and relations between them (for Icon Rewriting Languages this relation is a two dimensional grid, for Graph Rewriting this is usually any graphical relation). Then a number of rules describes how each of the entities behaves in what situation. For example in KidSim one can create a simple platform game by having rules that visualize "if the hero has an empty block below him then he falls down one position".

The rewrite paradigm has a good potential for delivering powerful graphical programming languages, yet the current languages have several flaws:

either the language has very artificial placeholders (for patterns in a rule) or it has trouble with distinguishing concrete from abstract examples.
many work with one global world and as such are not suitable for tackling larger problems involving, say, arbitrary recursive data structures
graph representations, though often simpler than the ones used in dataflow, still have the potential to be confusing. Many languages in this category have a grid instead, which is obviously extremely easy to represent, but has problems modelling more complex structures.
all of these languages have semantics that impede efficient evaluation strategies since matching up situations in the world with the correct rule has a worst case of an exhaustive search.

Aardappel

This section looks at what makes Aardappel tick: tree spaces, polymorphism and a novel sharing model.

From Trees to Tree Spaces

Tree rewriting rewrites a single tree to normal form.

Aardappel generalizes this to multiple trees, to be precise: a bag of trees. Just reducing multiple trees wouldn't be interesting at all, if it weren't for the fact that this bag of trees (which is called a Tree Space) has the same functionality as a Linda tuple space. Recall that tuples could be live datastructures, which is exactly what these trees do: they are all live datastructures. Already in tree rewriting, trees could represent both function(calls) and data, in a tree space they can also represent processes and tuples!

Rules are generalized to not only transfrom a tree one step, but take an arbitrary number of "neighbour" trees from the tree space along in the transformation: this corresponds to the taking and putting back tuples from the tuple space. As a simple example, consider the rule:

a(x) { b(y) } = a(x+y)

assume we start out with a tree space { a(1) b(2) b(3) } (the curly brackets denote a tree space). The above rule says: you may rewrite a tree a if you can pick up a tree b from the tree space. Since this is the only rule, there is no rule to rewrite the two b trees, so they will be in normal form directly. The rule for a is then applicable: it rewrites the above example tree space into { a(1+2) b(3) } and after consuming the other b tree[The order of selecting equally suitable trees from a tree space is deterministic (it's FIFO), however if there are multiple processes grabbing the same sort of trees out of the tree space, which tree gets picked up by which process is of course nondeterministic (as in any concurrent system). It doesn't have any backtracking semantics.] to its final value { a(6) }. You can think of a as a process retrieving the two b tuples, but the language doesn't make any such distinction.

Trees which can be picked up from the tree space by other trees have to be in normal form. If a rule needs another tree which is still rewriting itself or which is not available (yet) at all, the rule is not applicable at that point. If no rule can be applied, rewriting the whole tree is blocked until there is one that can be applied.

Lets now look at a slightly more realistic example: an implementation of the master-slave situation in section "Tuple Spaces". To clarify what sort of concurrency is actually involved, lets first look at what that example would look like in C with message passing[We use a fictitious set of concurrency primitives that will never the less look familiar to many people; the code is only meant as an example and will not even compile.]:

char **tofind = { "pomodoro",

"aglio",

"formaggio",

NULL };

MsgPort *rmp;

void search(char *s) {

MsgPort *mp;

if(mp = CreateMsgPort()) {

Message m;

m.replyport = mp;

m.info = finddb(s);

PutMsg(rmp,m);

WaitPort(mp);

DeleteMsgPort(mp);

};

int main() {

if(rmp = CreateMsgPort()) {

int num = 0;

while(*tofind) {

LaunchTask(search,*tofind);

tofind++; num++;

};

int *results = new int[num];

while(--num) {

Message *m = GetMsg(rmp);

results[num] = m->info;

ReplyMsg(m);

;

DeleteMsgPort(rmp);

};

main() launches a number search threads and communicates with them to receive their results. Notice that while for this example the C code is fairly simple, it's also very inflexible, for example it would be a real hassle to extend it with more complex communications. This is mainly due to the communication being point to point: code has to be written to communicate with specific entities.

The same code in Aardappel would look like this:

search(s) = result(finddb(s))
results(l) { result(x) } = results([x|l])

with this as the initial bag:

{ results([])
search('pomodoro')
search('aglio')
search('formaggio') }

This initial bag rewrites to { results([]) result(3) result(6) result(2) }, and eventually to { results([2,6,3]) } (in the C version the array results would contain exactly those values as well, assuming those are the results finddb() gives us). Not only is this code (arguably) simpler, it's also much easier to extend. Implementation-wise the Aardappel version would automatically provide a near optimal dynamic schedule regardless of the parallel hardware setup or the number of searches, whereas the C version could reach a worst case schedule on certain setups, since it directly encodes many of the process and communciation details. The behaviour of the two programs is exactly the same.

Hierarchical Bags as Data Structure

Tree spaces are not just a top-level concurrency mechanism, they may be used hierarchically, better yet, in Aardappel they're first class values. This means that an Aardappel program can create any number of bags at runtime, have trees that have bags as children, and in general can have bags within bags. This gives us a flexible mechanism for structuring concurrent programs: explicitly concurrent functions can be programmed to use a local bag and be completely transparent for the user, or local client-server setups can be created completely shielded from any related communication in other bags. One can think of this as datahiding for concurrency. Since bags are first class values, one can also look at them as data structures in their own right, regardless of their concurrent semantics. If you compare bags with objects (elements of bags being the members), bags come out favourably in many respects. Members often are initialised at certain points in their lifetime, and members often have dependancies among them or invariants. In an OO language these (dynamic) issues are often problematic as they can be structured only statically: in Aardappel certain members can simply be absent if no meaningful value for them is currently available, lookups of members can temporarily block if a dependancy/invariant is pending to be fulfilled (works transparently even if used by multiple processes at once), and in general the set of members can completely change (evolve) during the lifetime of an object. In cases where this is meaningfull, duplicate entries of a member (with different values) can be available (as a stream), since members are just values, not field:value pairs[The structure of members and streams within a bag can be as complex or simple as anything a pattern match can handle].

Sharing in Aardappel

As noted in section "Tree Rewriting", rewriting a tree is semantically close to functional programming, and as we can observe from section "Sharing", that would give Aardappel a functional programming style, in particular when it comes to sharing mutable data. A tiny but significant detail in all of this is that we're not rewriting one tree, but multiple of them, and those trees can actually make use of each other while rewriting. The fact that each tree can potentially have access to any of the others gives them object identity: after all, trees transparently have access to other trees, regardless of how many transformations they've been through. Yet there is no real mutation going on: the trees themselves are non-mutable data structures, and for a tree to be modified, it is consumed completely and removed from the tree space, and then comes back as a transformed value. The key to all of this is that there is one sole value that is being mutated in-place, and that is the tree space itself, but that value is not visible nor available to the values that are in it. Of course this tree space can be part of another tree, but because of the linear (one reference only) semantics the outside view is that of a pure value[Infact, if it weren't for linearity, Aardappel would just be plainly imperative. In a linear system mutation makes no difference to the semantics anymore. For a good read on these issues look at <>]. What this amounts to is that Aardappel has shared state and object identity, but without shared mutable data structures. Aardappel therefore has almost all safety of functional languages except for referential transparency, which can be regarded as a stronger version of the "absense of shared mutable data structures" property which Aardappel has. Aardappel then also has almost all the structuring possibilities and the comfort that comes with imperative, object oriented and concurrent styles; what's missing are exactly those possibilities which make imperative programming too messy to maintain: whereas in an imperative language the set of observers of a particular item of mutable state (i.e. the set of entities that have dependancies on it) can be scattered anywhere throughout a large program, in Aardappel that set is extremely well defined: it is precisely the tree space that state is sitting in, and only that tree space (the tree space doubles as the set of "transparent observers of identity"[If all this has a very quantum mechanical ring to it, then that's exactly what it is: observation influences semantics]). This tremendously reduces the complexity of dependancies[IMHO one of the most important goals in language design]: sharing is now local only, contrary to the "anywhere" approach of imperative languages, and it's very loosely coupled and robust, thanks to the Linda mechanisms for access.

Polymorphism

The application mechanism for rules in Aardappel deviates slightly from what you find in most languages: selection of the most suitable rule is based on specificity only instead of being source-code based[There is no concept of "source code order" in Aardappel anyway.]. In OO terminology this amounts to a form of inclusion polymorphism which is more flexible and extendable than multiple interface inheritance. As an example, imagine creating a hashtable. In most OO languages one would also create an extra class (or interface) Hashable' to be multiple-inherited by every datatype to be used as a key for hashing. But what if you wanted to use this hashtable with an object that you haven't written yourself, say A? You can't make A inherit Hashable, because, apart from retro-editing classes being bad style, you might not even have the source. Subclassing A is not going to help either, because that means other users of your hashtable who already use lots of A objects would be stuck. In Aardappel however, you can simply create any number of rules rewriting hash trees with particular kinds of objects as sub-trees, regardless whether these are language builtins, class-library objects, or completely new ones, and regardless of whether you are the implementor of the class or not. Aardappel generalises the concept of a class to "the set of datastructures that have certain transformations defined on them". The other area where it goes beyond traditional OO is that this system mimics multi-methods dispatch (dynamic binding on all arguments) instead of the single dispatch found in almost all OO languages. Though Aardappels polymorphism is more powerful than most OO languages it shouldn't be classified as an OO language, especially since it doesn't inherit state and multimethod dispatch hasn't got subtyping semantics[Both these characteristics would be overly easy to incorporate, the current system is deemed to be simpler, cleaner, and more in line with the rest of the Aardappel design]. Of course Aardappel has unlimited parametric polymorphism through its dynamic typing, which is complemented in implementation with type inference.

Boerenkool

Boerenkool is the graphical syntax system that Aardappel uses. It is loosely based on the graph/icon rewriting approach presented in section "Graphical Languages", but differs in a few respects:

Instead of being graph or grid structured, Boerenkool uses trees. This makes the graphical layout on screen much easier and less confusing, as, contrary to graphs, trees can easily be easily be layed out in 2d space automatically to a near optimal degree. It is therefore never needed for the user to manually change the layout of his code after changes. Trees also fit very nicely onto the Aardappel semantics.
Rules only use example values instead of variables or other placeholders. The pattern part of a rule is a concrete tree using specific values, the expression part then uses these values. The property of a value of being an example rather than concrete is marked graphically.
Sharing of subexpressions is marked graphically. Though trees are almost ideal to represent information, they cannot represent a subtree that is being used in two different places, since that would make it a graph, and make it subject to all the spaghetti problems. Boerenkool allows you to use a subtree in multiple places, and shows you that it is shared with other trees by giving the whole tree a different colour background. Any actions on such a tree (editing) show up transparently on all its "copies". Dragging trees around will automatically create sharing of code as necessary, and shared subtrees can easily be "decoupled" if the programmer wants to modify only one tree.
Folding is done by example as well. Since trees can get huge, most tree representations have methods of selectively folding a subtree, usually showing just the root of the tree, with a little symbol indicating its status (e.g. a little + / - box). For code this would give too little information. Boerenkool instead replaces the tree on folding with the value it would evaluate to. Remember that every expression has at least an example value, so for every tree it is possible to (pseudo-) statically compute a result.

As an example lets look at the included screenshot, Figure 1[This screenshot is a fake constructed using a paint package, as an actual implementation is not (yet) available. If you're reading this paper with a greyscale printout, it is useful to know that "orange" refers to the relatively dark grey background colour, used for most boxes, e.g.

\tt [5,4,7,2,8]\rm
. "Yellow" refers to the lighter grey boxes, e.g.

\tt [4,2]\rm
. "Light grey" refers to
the lightest grey boxes, in this example only used for
the big
\tt filter \rm
box.].
The factorial at the bottom should be easiest to understand: the thick black line seperates two rules, the dotted line separates the pattern from the expression on each rule. The first rule just says: if you find a tree of the shape fac(0) then you may replace it with 0. The second rule is the recursive case and uses as an example the value 5 (in orange). Note the yellow 4. The yellow indicates that this subtree is a fold, and 4 is the result of the tree it was folded from. If you'd double click on the 4 it would expand to 5-1 (with the 5 being orange). Someone creating this code for the first time would create it as 5-1, then maybe fold it to make the code simpler (or maybe even easier to understand). This simple example already shows quite a nice feature of the system: by folding an unfolding subexpressions one can get immediate feedback as to wether the expression is computing the correct result (debugging code without having to run it explicitly, almost static debugging). Another feature of the tree layout is apparent here: if the structure of an expression tree is very small and simple (as in 5*fac(4)), the editor will switch automatically from a full tree like representation to a linear flat one, to save screen real-estate. Append is only slightly more complex, and should be understandable for anyone that has seen append in a functional language. The funny symbol with the two arrows means cons[I've designed a temporary graphical syntax for many of the icons, the values, the structures and background markers. This is by no means final, and suggestions on how to improve its clarity are welcome.]. This example shows that choosing your example values (in this case the numbers) in a tactical way can improve readability. One can imagine someone editing this code doing a double click on the bottom append([2,3],[4,5,6]) expression to check wether the implementation makes sense. If it shows up a yellow [2,3,4,5,6] box in its place, that would give a good indication that the algorithm might be working ok. Qsort is the most complex of the bunch, and to aid the discussion, this is what it could look like in a hypothetical ASCII version of Aardappel:

qsort([]) = []
qsort([P|T]) = append(qsort(first(L)),
[P|qsort(second(L))])
where L = filter((lambda(X) = X<P),T)

The first(L) and second(L) have been replaced by their example results, [4,2] and [7,8]. This window shows clearly the structure of the qsort algorithm, and examples suggest what the hidden subexpressions do, but it doesn't show all the details. If the programmer now wanted to investigate the code further, he could double click on the [7,8]: this would result in the tree representation in the first qsort window being expanded as shown in the second qsort window. This shows the expression which computes [7,8], which is fairly equivalent to second(filter((lambda(X) = X in the ASCII version (the funny dotted box with the lambda symbol in it denotes a local rule, an argument to filter, the two circles denote second). The light grey background of the filter expression denotes that this expression is shared, and if the programmer would make any changes to this expression, then double clicking on the [4,2] tree would show up exactly the same tree (with any changes incorporated). One of the nicest properties of this environment is that it blurs the distinction between static code editing and runtime debugging of code. As we've already seen, evaluated folds provide for nice debugging. beyond that, you can easily evaluate part of a program, and incorporate a result tree in your code. Or you can evaluate/debug a program partially, and then continue to work on the halfway-run program as your new code(!). Functions for proper partial evaluation an optimisation should ideally also be available (they'd fit perfectly within the system). Reuse is generally very easy as well. ready made abstractions come with example values and can therefore be run directly, and modified incrementally.

Future Work

I hope to demonstrate the usefulness of this language with a good implementation, which I only just started.