Some Holes in the Emperor's Reused Clothes
Dewayne E. Perry

Software Production Research Department
Bell Laboratories
600 Mountain Ave
Murray Hill, NJ 07974

Tel: +1.908.582.2529
Fax: +1.908.582.5809
Email: dep@research.bell-labs.com

Abstract:

Reuse continues to be a problem whose potential remains elusive. Each new solution remains full of promise but riddled with what look like insurmountable problems. I examine some of the current trends and suggest some of the realities that stand in the way of success.

Keywords: Problems in Reuse, Retrieval, Domain-Specific Approaches, Product-Line Architectures

Workshop Goals: Explore reuse in the context of software architecture and specifically in the context of product line architectures.

Working Groups: Software Architecture or Product Line Architecture and Reuse; Domain-specific Languages and Reuse.

Background

One of the most promising areas of research in software architecture is that of product line architectures. Existing product architectures are abstracted and generalized to form a product line architecture that covers those existing products as well as new ones to be added to the line. While there are significant problems in how to do that abstraction and generalization [1], and in what the various relationships are between a product line architecture and its instance product architectures, there is also a very interesting question about how to support the building of the various products from the instantiated product architecture.

It is this area, the assets and components from which an individual product is constructed, that comprises a set of interesting problems in its own right (as do the various product line processes and the organizational issues surrounding them).

Position

Language Problems

Let us first take a look at some of the significant examples of reuse. There are several that have been around for quite a while: the Fortran (and other languages') math library and the Unix filters. Aside from being rather low level in size and functionality, they have two other characteristics in common: each has one data type and really only one domain (numbers for the math library, strings for the filters).

To be able to use either one of them well, you have to know the domain and what operations are available in those domains. In the case of the math library, you need to know mathematics. If you don't know what a square root is, no amount of help about the library is going to enable you to understand and use the square root function.

In addition, if you are going to do really sophisticated computations, you need to understand the computational limitations of the underlying representations and algorithms with respect to precision and accuracy. Hopefully the documentation will provide insight into those limitations.
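As a minimal illustration of such limitations, the following sketch uses Python's standard math module (the choice of language and of these particular inputs is an assumption made for this example; the same effects arise in any library built on binary floating point). It shows accumulated rounding error in a simple sum and overflow in an intermediate result. The helper hypot_naive is hypothetical and exists only for the comparison.

```python
import math

# Ten additions of 0.1 do not give exactly 1.0, because 0.1 has no
# exact binary floating-point representation and the error accumulates.
naive_total = sum([0.1] * 10)

# math.fsum tracks intermediate partial sums to avoid that loss of precision.
accurate_total = math.fsum([0.1] * 10)


def hypot_naive(x: float, y: float) -> float:
    """Textbook formula; the squares can overflow before the square root."""
    return math.sqrt(x * x + y * y)


if __name__ == "__main__":
    print(naive_total == 1.0)      # False
    print(accurate_total == 1.0)   # True
    # x * x overflows to infinity even though the true result is representable.
    print(hypot_naive(1e200, 1e200))
    # The library routine is written to avoid the intermediate overflow.
    print(math.hypot(1e200, 1e200))
```

Knowing that a square root function exists is not enough here: the user must also know which formulation of the computation stays within the limits of the representation.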

For the standard Unix library there are a number of operations that can be used in many different contexts - string manipulation, I/O, etc. Here again, you need to know the domains of these operations and their limitations.

In both cases, if you do not know the concepts, current retrieval mechanisms will be unlikely to help you very much. If you do know the concepts, especially in the case of the math library, you don't need a retrieval mechanism, only a mechanism for finding out the names, which are semantically loaded in their respective domains.

In the cases where we would like to capture the benefits of reuse in real systems and products, we do not have, in general, these nice neat simple data types and well understood domains. We often have domains that we create with complex data types and operations. Where we try to find common ground we find a babel of individual dialects and private languages that completely confound our limited automated mechanisms.

We must somehow bridge these differences and establish equivalences among the different utterances that make up our complex systems, which are put together by legions of people whose membership changes over time.

Domain-Specific Language Problems

OK, then all we need to do is to focus on a domain-specific description of our system and all is solved. That way the Tower of Babel goes away and all is uniform. Wrong!

First of all, any reasonably complex system built by groups of people will have more than one domain needed to implement it. Thus we will have multiple languages, separated into orthogonal components at best, identified but not separated at worst.

Second, there are multiple ways to abstract a domain. As Lehman and Belady point out in Program Evolution [2], reality is infinite and our abstractions of that reality are both finite and selective. Given that our domains are selected from an infinite number of observations, it is not surprising that two different selection processes could result in significantly different views of the same domain in reality.

Third, even if we agree on the concepts in the domain, different meanings and interpretations are possible. A specific concept may require different representations for different uses and this results in different denotations as well as different connotations. It is here where evolution often catches us unaware and here where unforeseen use and reuse catch us as well.

And finally, there is the problem of whether you are working in the problem or the solution domain. A good case can be made, for instance, that the architecture of a system ought to be defined in the problem domain [1] -- that is, in the business domain. This is then at odds with our set of reusable assets, which are more likely than not in the solution domain, completely divorced from the problem domain until we make the correspondences between the two.

All of this is compounded by the fact, shown by empirical fault studies [3, 4], that there is a thin spread of domain and system knowledge among the developers constructing and evolving software systems. So even if we had solved all the problems of domain-specific descriptions, we would still have a huge education and training problem.

Product Line Architecture Problems

Now we come to the latest in a long line of reuse solutions: the product line architecture [5]. Just collect the assets from existing products, generalize them, create an asset base for the components in the same product or business domain, and construct the existing and new products from these components with a little glue and a few new components.

The additional advantage here is that we get larger components to reuse, because we are working this problem at the architectural level, not the subroutine level.

Assuming that we have solved the problems mentioned above of domain specific languages, it is still, however, not as easy as it seems. How well this can be done depends on the kind and magnitude of the variance among the different products. If it is a variance in functionality, then that seems pretty easy to solve. If it is a variance in underlying platform (that is, say, the computer platform or the operating system platform) then that also can be handled with some good modularization, encapsulation and abstraction techniques.
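The platform case can be sketched as follows, assuming Python and entirely hypothetical names (Platform, UnixLikePlatform, WindowsLikePlatform, build_log_path). It is only an illustration of the encapsulation idea: the shared asset is written against an abstract platform, and each product supplies its own variant without the asset changing.

```python
from abc import ABC, abstractmethod
from typing import List


class Platform(ABC):
    """Hypothetical abstraction that hides the platform variance."""

    @abstractmethod
    def path_separator(self) -> str:
        """Return the separator this platform uses between path parts."""


class UnixLikePlatform(Platform):
    def path_separator(self) -> str:
        return "/"


class WindowsLikePlatform(Platform):
    def path_separator(self) -> str:
        return "\\"


def build_log_path(platform: Platform, parts: List[str]) -> str:
    """Shared asset: identical for every product, whatever the platform."""
    return platform.path_separator().join(parts)


if __name__ == "__main__":
    parts = ["var", "log", "app.log"]
    print(build_log_path(UnixLikePlatform(), parts))     # var/log/app.log
    print(build_log_path(WindowsLikePlatform(), parts))  # var\log\app.log
```

The variance is confined to the small platform classes, so the reuse factor of build_log_path is unaffected by the number of platforms in the product line.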

However, if it is a variance in performance, then that starts getting harder. If it is a variance in reliability or fault tolerance, then it is not at all clear how you go about solving this problem with a common set of assets. These kinds of problems tend to be integral to the components whereas the preceding cases tend to be compositional.

If each system needs a different reliability characteristic, and that in turn requires a different component for each product, then your reuse factor has been driven down to zero. Now, you may still have a leg up on productivity, since building one system will be just like building another except that the characteristics of the components differ. But your potential for reuse has disappeared.

The Punch Line

The challenge then is to see if we can make what we normally think of as the non-functional qualities of our systems compositional in nature rather than integral [5]. It is here that I think the useful distinction between connecting and data/process components [6] comes into play in a much more interesting way than hitherto considered.

Just as we have now separated coordination from computation via connectors to our advantage, we might well be able to separate these various attributes from computation as well. It is time to explore this avenue to see what can be done.
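One very simple instance of what such a separation might look like is sketched below in Python. Everything here is an assumption made for illustration (the names with_retries and make_flaky, and the choice of retrying as the fault-tolerance policy); it is not a proposed solution to the general problem, and qualities such as performance are unlikely to yield to so simple a device.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def with_retries(component: Callable[[], T], attempts: int) -> Callable[[], T]:
    """Hypothetical connector: wraps a component with a retry policy.

    The component itself is unchanged; each product picks its own
    'attempts' value when composing the system.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")

    def connector() -> T:
        last_error: Optional[Exception] = None
        for _ in range(attempts):
            try:
                return component()
            except Exception as error:  # retry on any ordinary failure
                last_error = error
        assert last_error is not None
        raise last_error

    return connector


def make_flaky(failures_before_success: int) -> Callable[[], str]:
    """Hypothetical component that fails a fixed number of times, then works."""
    state = {"calls": 0}

    def component() -> str:
        state["calls"] += 1
        if state["calls"] <= failures_before_success:
            raise RuntimeError("transient failure")
        return "ok"

    return component


if __name__ == "__main__":
    # Same component code, different reliability chosen at composition time.
    tolerant_product = with_retries(make_flaky(2), attempts=3)
    print(tolerant_product())  # ok
```

In this sketch the reliability characteristic lives in the connector rather than in the component, so two products with different requirements can share the same component and differ only in how it is connected.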

References

[1] Dewayne E. Perry. ``Generic Descriptions for Product Line Architectures''. ARES II Product Line Architecture Workshop, Las Palmas, Gran Canaria, Spain, February 1998.

[2] M. M. Lehman and L. A. Belady. Program Evolution: Processes of Software Change. Academic Press, 1985.

[3] Dewayne E. Perry and Carol S. Stieg. ``Software Faults in Evolving a Large, Real-Time System: a Case Study''. Proceedings of the 1993 European Software Engineering Conference, Garmisch, Germany, September 1993.

[4] Bill Curtis, Herb Krasner and Neil Iscoe. ``A Field Study of the Software Design Process for Large Systems''. Communications of the ACM, 31:11 (November 1988), 1268-1287.

[5] Dewayne E. Perry. ``Software Architecture and Software Engineering''. Coordination 1997, Berlin, Germany, September 1997.

[6] Dewayne E. Perry and Alexander L. Wolf. ``Foundations for the Study of Software Architecture''. ACM SIGSOFT Software Engineering Notes, 17:4 (October 1992).

Biography

Dewayne E. Perry is a Member of Technical Staff in the Software Production Research Department at Bell Laboratories. He draws on a rich background of commercial, industrial and military systems as the basis for his software engineering research. His research encompasses software fault studies, time studies focusing on people and organizational issues, experimental studies of process descriptions, visualization and analysis, process formalisms and process support, software architecture, software evolution, and the use of formal interface specifications in software construction and evolution. He is President of the International Software Process Association, Co-Editor in Chief with Prof. Wilhelm Schaefer of Software Process: Improvement and Practice, an Associate Editor for IEEE Transactions on Software Engineering, and a member of ACM and IEEE Computer Society.