If You Can't Reuse, Recycle: A Case Study of a Platform to Platform Port

Jeffrey Allen
Larry Latour

University of Maine
Department of Computer Science
222 Neville Hall
Orono, Maine, 04473

Tel: 207-581-3523
Fax: 207-581-4977
Email: Allen: jca31@maine.maine.edu,
Email: Latour: larry@gandalf.umcs.maine.edu

Abstract:

Our position is that when porting a system from one platform to another, there are many non-code aspects of the system that collectively define it, that may or may not be available in some concrete form, but that nevertheless need to be considered and possibly ``recycled'' in the port. Furthermore, these non-code aspects seem to parallel those aspects that affect the systematic reuse of software subsystems. We have been porting the modeling language/environment Starlogo from the MacOS platform to the Windows platform, and have grappled with a variety of such non-code aspects. We give an overview of some of these in this position paper, along with a few of our more basic suggestions on how to avoid some of our frustrations in the future.

Workshop Goals: To share our experiences with porting a medium scale system, and to better understand how to apply reuse principles to more systematically plan for change.

Working Groups: Software Architecture, Re-engineering and Reverse Engineering, Reuse of Various Software Engineering Artifacts, Mental Models of Software Subsystems.

Background

Our EMERGE System group has been involved in various aspects of software tools for the exploration of emergent phenomena. One such system, Starlogo [1], has been developed by colleagues of ours at the MIT Media Lab [2] for supporting K-12 exploration of such phenomena. Examples of Starlogo program/model behavior include the emergence of cities, the operation of a stock market, the cooperative behavior in insect colonies, traffic jams, etc. Such systems require a highly parallel environment that presents highly visual results in a timely manner. Because of this need for efficient execution, a naively simple implementation design model becomes fairly complex and platform dependent when actually implemented.

We are currently grappling with the above issues in a MacOS to Windows port of Starlogo. The original system was implemented in Lisp and assembler, with each language used for what it does best: Lisp for compilation and code generation, and assembler to optimize execution of the creature engine and visual display. We originally approached the port with the naive view of ``transferring code'' from one platform to the other. But we very quickly realized that the system contained a number of subsystems, each with a unique set of concerns that involved a unique choice of programming language and environment. For example, we wanted to harness the compiler in a black box fashion, and so decided to retain the Lisp front end code. But we decided that it would be easier to implement the execution-independent part of the user interface in Visual C++, since a great deal of the Lisp code for this was both platform dependent and not ``Lisp-critical''. In addition, we decided to initially implement the execution engine in C rather than in assembler, banking on the efficiency of compiled C code to approximate the efficiency of an assembler engine.

In addition to issues related to using and implementing emergent system models, we have also been looking at the problems software engineers have in understanding and applying those generic architecture models often discussed in the reuse literature. Larry Latour has position papers on this topic in both the WISR 7 and WISR 8 proceedings [3, 4]. In a further exploration of the topic, he co-chaired a WISR 7 working group focusing on the Need for Good Mental Models of Software Subsystems [5]. Jeff Allen has worked with high school math/science students in the computer modeling area, and has a long-standing interest in complex emergent system simulations. This joint background gives us the ability to look in depth at both the user domain of emergent systems software and at the architectural domains of the subsystems we are manipulating in order to build/port this software.

Position

When we talk about reuse of software artifacts, we talk about applying these artifacts whole, without change, to a particular instance of a domain. The flexibility of the artifacts is achieved through careful parameterization, taking into account a range of possible domain instances. But what of the artifacts that themselves can't be used, but which contain many design decisions that can be reused when taken together? For example, what if a byte-code interpreter exists that is written in assembler? It is a fairly easy process to hand translate such an interpreter into another assembler or into C. Certainly you might argue that this points to the development of a program generator, but in reality much prior development effort is leveraged even when no code is transferred and translation is by hand. Let's call this, for want of a better term, recycling rather than reuse.
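To make the interpreter example concrete, the following is a minimal sketch in C, assuming a hypothetical four-opcode stack machine. The opcode names and the function run_bytecode are invented for illustration and are not Starlogo's actual byte-code. What a hand translation would carry over from an assembler original is not the code but the decisions visible here: the opcode numbering, the one-byte operand encoding, and the stack discipline.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical opcode set, invented for this sketch; it is NOT the
   actual Starlogo byte-code. */
enum { OP_PUSH, OP_ADD, OP_MUL, OP_HALT };

#define STACK_MAX 64

/* Run a byte-code program and return the value left on top of the stack.
   The recyclable design decisions are the opcode numbering, the operand
   encoding (one byte following OP_PUSH), and the stack discipline. */
int run_bytecode(const unsigned char *code, size_t len)
{
    int stack[STACK_MAX];
    size_t sp = 0;   /* number of values currently on the stack */
    size_t pc = 0;   /* index of the next byte to decode */

    while (pc < len) {
        switch (code[pc++]) {
        case OP_PUSH:                 /* push the byte that follows */
            assert(pc < len && sp < STACK_MAX);
            stack[sp++] = code[pc++];
            break;
        case OP_ADD:                  /* pop two values, push their sum */
            assert(sp >= 2);
            sp--;
            stack[sp - 1] += stack[sp];
            break;
        case OP_MUL:                  /* pop two values, push their product */
            assert(sp >= 2);
            sp--;
            stack[sp - 1] *= stack[sp];
            break;
        case OP_HALT:                 /* stop decoding */
            pc = len;
            break;
        default:
            assert(0 && "unknown opcode");
        }
    }
    assert(sp >= 1);
    return stack[sp - 1];
}
```

A caller would pass an array such as { OP_PUSH, 2, OP_PUSH, 3, OP_ADD, OP_HALT } together with its length. Retargeting this loop to another language or assembler changes every line of syntax while leaving each of the listed decisions intact, which is the sense of recycling intended above.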

Our position is that there are many non-code aspects of a system which define the system, which may be available and which should be considered for recycling in the above sense of the word. When considering these aspects it is important to keep in mind what the goals of the port are. That is, how much of the original functionality are we interested in protecting? How much of the behavior are we interested in protecting? How much of the architecture are we interested in protecting? We use our Starlogo port project to illustrate a number of such aspects.

Availability of Artifacts

Specifically, any of the following ``artifacts'' may be available to a varying degree:

Human/Computer Interaction (HCI) design. We use HCI here to mean the ``look and feel'' of the system. This might be available in the form of a system specification, user reference manual, user tutorial, or the system itself (which can't always be guaranteed!). All of these can be described with varying degrees of formality.
User Cognitive Models. Depending on familiarity with a system, a user may have a rather complex model of the system, independent of the HCI design. Only those models that are well known and shared by many users need be considered in the context of a port. We can't be all things to all people!
Interface Behavior. What the system lets the user do, as opposed to how it lets him/her do it. In this area, the port might or might not have to absolutely duplicate the behaviour of the original.
System Architecture, both at a macro and micro level. For example, macro-level issues include the choice of structural paradigm (client-server, event driven, pipeline, etc.), and micro-level issues include the choice of communication mechanism (shared data, message passing, remote procedure calls, etc.).
Actual Source Code. We may be tempted to reuse code. In this context, perhaps recycle is a better word. That is, there might be many cases in which the temptation to reuse code level design decisions is strong, even though the code might or might not be copied en masse to the new application.
The Platform itself. When porting a system to a target, we often fail to consider the source platform. Yet when we move to a target with a similar architecture, we are in effect reusing most of the design decisions made on the source architecture.

It is interesting to note that the development of evolutionary prototypes, a relatively straightforward process with a fresh application, may become somewhat confounded by the presence of a working system on another platform. Specifically, it may be difficult to compare some aspect of a prototype's functioning to that of the original system. In fact, for selected aspects, the comparison should not even be made, and yet because of the perceived similarity, the human mind immediately jumps to the side-by-side comparison as the most meaningful metric.

Our Example: Starlogo

There are actually two versions of Macintosh Starlogo, one for 68000-based machines, and the other for PowerMacs. It's interesting that for some of the categories of artifacts we can refer to the ``Macintosh version'' of Starlogo, and for other categories we need to distinguish between the two.

Human/Computer Interaction Design. The Macintosh version of Starlogo presents the user with eight windows. We won't describe these windows completely here, but we will point out a few of the more salient features. For example, the interface window contains a number of widgets with which the user manipulates his/her model. One of these widgets is called a button, and there are two types of buttons available, the forever button and the once button. These widgets are used to control the execution of the user's procedures, and operate intuitively. Note that we could have many possible representations of an interface, which would allow the same functionality, but which might ``look and feel'' radically different. In the final version, we've decided that we'd like to preserve the form of the interface as much as possible. The reasoning behind this goal is that, ideally, a user of StarLogo should not have to change mental gears when changing hardware platforms.
User Cognitive Models. Users of Starlogo have developed a variety of views of the environment, most very application domain specific. For example, some users might be biological system modelers, and have well-developed Starlogo implementation models of their systems. Other users might be very familiar with abstract mathematical models such as fractals, and with Starlogo implementation models of these objects. We don't normally think of recycling such human models, but decisions that we make can certainly affect how much of this understanding users carry over to the port.
Interface behaviour. In this area, the final port to Windows absolutely has to duplicate the behaviour found on the Macintosh. In a way, without talking about specific language commands, the specification of this interface describes something which could be called StarLogo, since it describes what you can do with it.
System Architecture. A number of sub-system architectures are involved in our consideration of porting the overall architecture.
- StarLogo itself is built on top of MacOS and Macintosh Common Lisp (MCL). Part may be close to an ideal, platform-independent form, but part is no doubt a relic of the platform. Actually it is a relic of two platforms, the 68000 and PowerMac architectures.
- The x86-based machines can't currently run MacOS (and hence MCL), so we chose Franz's Allegro Common Lisp (ACL) on Windows 3.x and Windows 95. There actually are differences between these two Windows versions also, but not as great as between the 68000 and PowerMac platforms. Windows is notoriously ``idiosyncratic'' in what it imposes on software designers.
It would appear that there are roughly two levels of reuse/recycling here. While PowerMac and 68000 assembly are just so many syntax errors to an Intel assembler, MCL translates into ACL somewhat more gracefully with a bit of adjustment. But in either case the temptation to blindly reuse or recycle (hand translate) the source code, adjust the syntax from the Mac to the PC's way of doing things, and recompile is both insidiously attractive and bone-headedly wrong. The actual source code is merely the result of applying the higher level design goals to a particular environment. This isn't to say that a good deal of the code cannot be reused/recycled, just that it shouldn't be done without considering the higher level, possibly platform specific design decisions that led to the implementation.
System specification. Apart from the user interface, there are other aspects of user-visible Starlogo which are specified. For example: language primitives, allowable file names, maximum numbers of turtles, numbers of variables, allowable ranges for values stored by variables, colour palettes, etc. These can be broken into two roughly distinct categories, which we can call the ``hard'' and the ``soft'' specifications. The hard specs are those central to the behaviour of StarLogo models, and include such things as the language primitives, variable types and ranges, etc. Soft specs would include such items as particular colour choices, world size, patch size and so on.
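The split between hard and soft specifications can be sketched in C as follows. This is only an illustration under stated assumptions: every name and every number below is a placeholder invented for the sketch, not a limit or default taken from Starlogo. Hard specs appear as fixed constants with checks that must hold on every platform, while soft specs appear as a record of defaults that each port is free to adjust.

```c
#include <assert.h>

/* "Hard" specs: limits that model behaviour depends on. The numbers are
   placeholders invented for this sketch, not Starlogo's real limits. */
enum {
    HARD_MAX_TURTLES = 1024,
    HARD_VAR_MIN = -32768,
    HARD_VAR_MAX = 32767
};

/* "Soft" specs: presentation choices that a port may adjust. */
typedef struct {
    int world_size;        /* patches per side */
    int patch_size;        /* pixels per patch */
    int background_color;  /* palette index */
} SoftSpec;

/* Hypothetical defaults for one target; another target may differ. */
SoftSpec soft_defaults(void)
{
    SoftSpec s = { 100, 4, 0 };
    return s;
}

/* Adjusting a soft spec is allowed, as long as the value is sensible. */
SoftSpec soft_with_patch_size(SoftSpec s, int patch_size)
{
    assert(patch_size > 0);
    s.patch_size = patch_size;
    return s;
}

/* Hard check: a stored variable value must fit the specified range on
   every platform. Returns 1 if it does, 0 otherwise. */
int var_in_hard_range(long value)
{
    return value >= HARD_VAR_MIN && value <= HARD_VAR_MAX;
}

/* Hard check: a model may not ask for more turtles than the spec allows.
   Returns 1 if the count is acceptable, 0 otherwise. */
int turtle_count_ok(long count)
{
    return count >= 0 && count <= HARD_MAX_TURTLES;
}
```

Keeping the two kinds of specification in separate places makes it visible which values a port must reproduce exactly and which it may retune for the target's display and memory.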

Porting Concerns and Suggestions

Here are a few quick thoughts on what made the source Starlogo architecture difficult to port, and on what we would like the implementor to have done to remedy the problems. Using 20-20 hindsight they seem to all be based on the application of well known software engineering principles, but the fact that they keep cropping up from system to system (a storm in every port?) is reason to mention them.

Think very carefully about what black box/information hiding design really means. Use it properly, not only all the time, but also at the right time. That is, encapsulate (as carefully as you can) those design decisions that you even remotely suspect will change in a port. Try not to take shortcuts behind the abstraction for the sake of the implementation's efficiency on a particular platform, and if you do, document these shortcuts precisely.
Put off (i.e., localize, encapsulate) references to specific platforms as long as possible. There's no need to get married to a particular platform before you even know what the application is. Ideally, the application should be shaped entirely by the human designers, and not by the peculiarities of specific pieces of hardware or operating systems. There might very well be requirements restrictions on the target platform, but considering these restrictions too early in the design process might unnecessarily restrict the range of design possibilities. The goal of course is to push the nasty non-portable things further down (i.e., localize them, encapsulate them) into the system architecture. Face the fact that the object code isn't going to port nicely (unless you write a 68K emulator on the PC, for example) and we might as well strive to keep the upper levels of organization as portable and free from platform specifics as possible.
At the proper time, define the platform context of your design. When a particular decision is made for platform specific reasons, document that decision. You can provide source code with no comments. You can provide architectural diagrams with no paragraphs on the back of each one. If you can provide only one thing, provide a full list of the decisions that were made in order to accommodate the platform, along with the rationale behind these decisions.
Don't ``Code between the lines'' and don't write ``Hematomae'' to optimize performance. Gregor Kiczales [6] defines code hematomae as code that reimplements functionality already existing in a subsystem, many times for the purposes of either efficiency or architectural ``glue''. Again, this is taking the platform for granted, because a lot of what ends up in the code and in the architectural framework is there to make up for idiosyncrasies and deficiencies on that platform. This sort of design is aimed at a specific target, which, by definition, will be dealt with differently on the target machine during a port. While similar issues will no doubt manifest themselves on the target platform, the silver bullet that was used on the source is not necessarily going to work on that target.
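The first three suggestions above can be illustrated with a small sketch in C. It assumes a hypothetical drawing interface: the names Platform, draw_cell, paint_row and the counting back end are all invented here and do not come from the Starlogo sources. The engine code sees only the interface, platform references are confined to one back end per target, and the one platform-specific decision is documented where it is made.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical platform interface: every name here is invented for
   illustration and is not taken from the Starlogo sources. */
typedef struct {
    const char *name;
    /* Paint one grid cell; how this reaches the screen is hidden. */
    void (*draw_cell)(void *ctx, int x, int y, int color);
    void *ctx;   /* platform-private state, opaque to the engine */
} Platform;

/* Portable engine code: it knows about the grid, never about the
   operating system or the display hardware. */
void paint_row(const Platform *p, int y, int width, int color)
{
    assert(p != NULL && p->draw_cell != NULL);
    for (int x = 0; x < width; x++)
        p->draw_cell(p->ctx, x, y, color);
}

/* A stand-in back end that only counts calls. A real port would supply
   one back end per target platform behind the same interface.
   PLATFORM DECISION (documented, as suggested above): cells are drawn
   one at a time for simplicity; a real back end might batch them for
   speed, and that shortcut would be recorded here with its rationale. */
typedef struct { int cells_drawn; } CountingCtx;

static void counting_draw_cell(void *ctx, int x, int y, int color)
{
    (void)x; (void)y; (void)color;   /* unused by the stub */
    ((CountingCtx *)ctx)->cells_drawn++;
}

Platform make_counting_platform(CountingCtx *ctx)
{
    Platform p = { "counting-stub", counting_draw_cell, ctx };
    return p;
}
```

With this arrangement, porting means writing a new function to place in the draw_cell slot. The tradeoff discussed in this paper remains: an indirect call per cell may be too slow on some targets, and any shortcut taken behind the interface to recover that speed is exactly the kind of decision that needs to be written down.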

There is a fundamental conflict between the goals of re-use and of optimal (i.e., efficient) code. The totality of code in a system must be efficient in the environment in which it is run. So it is naive to think that this totality of code can be ported from one idiosyncratic platform to another. The question then becomes how to properly isolate that part of the code that is ``platform independent''. But in actuality this property of "platform-independence" is elusive. Efficiency concerns are not as easily encapsulated as simple I/O functionality concerns. It may be that the solution to the entire problem is to alter the set of environments in which the typical piece of re-usable code is to be run, with Java and the goal of platform independent subsystems coming immediately to mind. But people who hail this as the second coming tend to forget the legacy of P-code. We haven't solved the problem of efficiency, we've just attempted to package it inside of a platform specific run-time environment. And it is naive to think that we've properly hidden all platform-specific design decisions.

Comparison

Although at first glance a good deal of this work seems more closely connected to the re-engineering/ reverse engineering area, there are a number of important connections to the systematic reuse area. To begin with, there has been a great deal of concern in the literature with the reuse of non-code level artifacts. In the case of our port, most of what we are reusing is not at the code level, and even when we are looking to leverage code we do so more from a design perspective than from a strict code transfer.

We mentioned earlier that we are interested in this port from the perspective of understanding what makes or doesn't make a good mental model of a software subsystem [5]. This port project has allowed us to consider a number of interacting architectural models, and has given us a great deal of material with which to think about the architectural mismatch problem. In this sense it is similar to the ``Reuse is Hard'' work of David Garlan [7]. Garlan's Aesop system is a development environment constructed from COTS (Commercial Off-The-Shelf) subsystems. He describes how the mismatch of these architectures caused a great deal of code to be rewritten or written from scratch.

The port project has also allowed us to look at a user model, a large number of creatures moving on top of a cellular automaton grid, whose implementation model is naively simple in concept but complex in practice. This complexity arises from the efficiency concerns of running such large numbers of creatures in parallel. In this sense our issues are similar to those discussed by Gregor Kiczales in his Open Implementation project [6]. He points out that many systems with similar user interface models have wildly dissimilar architectural concerns, primarily due to behavioral differences.
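The naively simple model mentioned above can be sketched in a few lines of C. This is an illustration only, under stated assumptions: the type Creature, the functions wrap and step_all, the 8 by 8 world, and the wrap-around edges are all choices made for the sketch rather than details taken from the Starlogo implementation. Parallelism is simulated by visiting the creatures one after another.

```c
#include <assert.h>
#include <stddef.h>

#define WORLD_W 8   /* grid width in patches (placeholder value) */
#define WORLD_H 8   /* grid height in patches (placeholder value) */

/* One creature: a position on the grid and a heading as a unit step. */
typedef struct { int x, y, dx, dy; } Creature;

/* Wrap a coordinate so that creatures leaving one edge re-enter at the
   opposite edge (an assumption of this sketch). */
static int wrap(int v, int size)
{
    return ((v % size) + size) % size;
}

/* One "parallel" step, simulated naively by visiting creatures in order:
   each creature moves one cell along its heading and marks its patch. */
void step_all(Creature *c, size_t n, int patches[WORLD_H][WORLD_W])
{
    assert(c != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        c[i].x = wrap(c[i].x + c[i].dx, WORLD_W);
        c[i].y = wrap(c[i].y + c[i].dy, WORLD_H);
        patches[c[i].y][c[i].x]++;   /* leave a mark on the patch */
    }
}
```

Nothing in this loop is platform dependent. The complexity described above enters once the loop must keep a large population moving at interactive display rates, which is where the original system turned to assembler.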

References

1: L. Latour, ``Maine Starlogo Communities,'' url: http://www.asap.um.maine.edu/starlogo, (Orono, ME), 1997.
2: M. Resnick, ``New paradigms for computing, new paradigms for thinking,'' in Computers and Exploratory Learning (A. diSessa, C. Hoyles, and R. Noss, eds.), pp. 31-43, Springer-Verlag, 1995.
3: L. Latour and E. Dusink, ``Controlling Functional Fixedness: The Essence of Successful Reuse,'' in WISR'95: 7th Annual Workshop on Software Reuse, (St. Charles, IL), 1995.
4: L. Latour, ``The Need For A Cognitive Viewpoint on Software Component Understanding,'' in WISR'97: 8th Annual Workshop on Software Reuse, (Ohio State University, Columbus, OH), 1997.
5: S. Edwards and L. Latour, ``The Need For Good Mental Models of Software Subsystems,'' in Working group report, WISR'95, (St. Charles, IL), 1995.
6: G. Kiczales, ``Why Black Boxes are so Hard to Reuse: A New Approach to Abstraction for the Engineering of Software,'' in OOPSLA'94, 1994.
7: D. Garlan, R. Allen, and J. Ockerbloom, ``Architectural mismatch: Why reuse is so hard,'' IEEE Software, pp. 17-26, November 1995.

Biographies

Jeffrey Allen is a graduate research assistant in the University of Maine Department of Computer Science. He is a software engineer whose main area of interest is complex adaptive systems and non-linear dynamics. Currently, he is working with the Computer Science Department's EMERGE group, on both the Starlogo port project and the Maine Starlogo Communities project.

Larry Latour is an Associate Professor of Computer Science at the University of Maine. He was introduced to reuse in 1986, when he and a small group of Tools and Environments working group members began what is currently the National WISR workshop series on software reuse. Along with his reuse interests, Larry is exploring interdisciplinary teaching techniques and K-12 learning environments with the EMERGE group, focusing on the understanding problem.