home
***
CD-ROM
|
disk
|
FTP
|
other
***
search
/
ftp.umcs.maine.edu
/
2015-02-07.ftp.umcs.maine.edu.tar
/
ftp.umcs.maine.edu
/
pub
/
WISR
/
wisr6
/
proceedings
/
ascii
/
biggerstaff.ascii
< prev
next >
Wrap
Text File
|
1993-10-19
|
14KB
|
256 lines
Biggerstaff - 1
Biggerstaff - 1
The Limits of Concrete Component Reuse
Ted J. Biggerstaff
Microsoft Research
One Microsoft Way
Redmond, WA 98052-6399
Tel: (206) 936-5867
Email: tedb@microsoft.com
Abstract
This paper takes the position that the formal representation options
available today for expressing reusable components -- notably,
programming languages -- are excessively concrete (even with all of their
advertised abstraction facilities such as classes and generics). And this
concreteness imposes a built-in barrier to widespread and high payoff
reuse of those components. The proposed solution to this problem is
representations that allow improved deferral of implementation details
(i.e., additional levels of abstraction) and allow for the automatic
generation of those details at component reuse time.
Keywords: Concrete components, repositories, limitations, payoff.
Workshop Goals: To explore the notions of new representational abstractions for reuse components
and the concomitant requirement for generative architectures to support those abstractions.
Working Groups: Domain analysis/engineering representations.
1 Background
I have done research on reuse and related topics (e.g., design recovery) since
1983 and this has resulted in a number of books, papers and systems. Most
recently, I have been working in the area of representations that can enhance
the reusability of components and yet allow most of the reuse to happen
automatically. This work has led me to consider the limitations of the current
concrete (i.e., programming language) representations for reusable
components.
2 Position
2.1 Hypothesis
Reuse in its various forms has had a number of notable successes1 and a
number of apparent failures. In a previous article [2], I analyzed the key
factors that lead to reuse success or failure. These included factors such as the
scale of the component, the narrowness of the application domain, the
minimization of the component interconnection architecture (e.g., the data
flow structures) and the standardization of the data items (i.e., the data
structures that flow between components) shared among the components in a
library. But there is another factor that was not discussed in that article and
one that introduces serious limits on reuse. That factor is the level of
concreteness (or lack of abstraction2) of the reusable component.
2.2 What are Concrete Components?
Concrete components are those that express the operational meaning of the
component in enough detail that it can be translated into an implementation
oriented form such as a compiled function or object with little or no
additional information. To the degree that a component needs additional
information and to the degree that that additional information will cause
extension and/or restructuring of the resulting implementation form, we will
consider that component to be abstracted to some degree. For example,
macros, templates and generics represent kinds of components that require
additional information and exhibit extension and/or restructuring in the
process of achieving their final implementation form.
Another test for concreteness is the amount of generation used by the reuse
system in the course of developing the resulting implementations. If their is
very little generation required and components are mostly just plugged
together, then the components are truly concrete.
Generally, concrete components are expresses in conventional programming
languages which by their very definition foster concreteness. In fact, I believe
that the representations available today for representing reusable components
are the central reason for the limitations inherent in today's reusable libraries.
2.3 Limitations of Representation
Why should concreteness have a deleterious effect on reuse? Because concrete
representational forms require implementation oriented details that are
needed by the compiler but could easily be deferred until later in the design
process. Worse, premature introduction of implementation details (which often
represent arbitrary design choices made in the absence of definitive
requirements) precludes many opportunities for reuse. How often have you
heard "I should be able to reuse this component but it is just too slow (or
large, or a variety of other factors) for my application. " More often than not,
such reasons are perfectly valid and a potential reuse is lost because concrete,
implementation details have been introduced too early -- before the
opportunity for reuse, not after.
This premature introduction of implementation details is a direct result of the
representations available for expressing reusable components -- programming
languages. Today's programming languages are largely concrete and therefore,
make abstraction difficult and limit its form and degree. Let us consider the
modes of abstraction that are available to the builder of a reuse library --
classes (e.g., in Smalltalk or C++), macros (e.g., in C) , generics (e.g., in Ada)
and general templates (as described in [5]).
Classes are conventionally thought of as abstractions and that term is even
applied to describe them. But classes are already implementation-oriented
components. Their detailed algorithms are chosen and although hidden, these
show through to the application in terms of their performance, size, error
handling design, memory management schemes, etc. [3] That is, the
information may be hidden but the implementation properties show through
and it is these concrete properties that can have great (generally, negative)
effect on the reusability of the components. So, while object orientedness is
highly beneficial to reuse, it imposes built-in limits on the degree to which
objects may be reused and thereby, on the expected payoff through reuse. OK,
let's look further. What about macros?
Macros certainly have the potential to be powerful tools but they are limited
by the design considerations of the languages in which they are embedded.
That is, programming languages are designed according to principles that are
somewhat antithetical to reuse. Take C for example. The language was
designed to allow the maximum flexibility to the programmer not the
maximum abstract-ability of the code. So macros in languages like C are quite
limited and have little to offer as a representation for reuse. As an example of
the key limitations of macros in languages like C, consider the requirements of
generative reuse architectures. Generative reuse architectures often
conditionally generate alternative code streams based on the type of, or on a
declared property of a data item, and typically, they apply such conditional
generation recursively. C macros allow neither capability let alone allow it to
be applied recursively. Consequently, while quite useful, C-like macros are far
too limited as a reuse representation candidate.
Generics are an improvement on C-like macros. But generics can only vary
based on data types and, in Ada for example, only on the built-in types. This
limitation to built-in types is probably in Ada because the emphasis in the
design of the Ada language was validation based on type consistency which
would mean that the compiler would have to have the type inference rules
built-in so that it could do type inference efficiently. But the downside of this
limitation is that one cannot abstract generics to the degree desired. Even
abstraction based on user defined types would help but it too would fall short
of the needs of abstraction for reuse. Consequently, while generics are a step in
the right direction, they do not go far enough. OK, what about general
templates that vary based on types including user defined types.
Templates add a powerful level of abstraction but they too fall short of what is
needed simply because highly abstract components must vary based on
properties that fall outside of the type system. For example, consider two
coupled design decisions -- the choice of the implementation data structure for
a very long strings (i.e., choosing between an array versus linked list
implementation) and the choice of the substring search algorithm (e.g.,
choosing between a linear search versus Knuth-Morris-Pratt algorithm). The
performance consequences can be onerous if the choice of implementation
data structure and the choice of search algorithm are made independently,
without considering the performance interactions3. Therefore, when I am
choosing the implementation data structure for the string, I would like to be
able to have my generation algorithm test the implementation property of the
string search algorithm (i.e., is it linear search or KMP) and under some
conditions to revise the choice of search algorithm. Types are not a good way
to encode that information but general properties associated with the program
are and this is the way DRACO [4] dealt with this kind of abstraction problem.
2.4 Representational Abstraction
In conclusion, if reuse is to escape the bounds of concrete representations it
must move beyond static, concrete representations. It must include the ability
to develop a rich representation of the target program that goes beyond
today's programming languages (e.g., allows arbitrary extra-linguistic
properties on any structure) and allows general manipulation and
reorganization of the program at the level of DRACO.
If implementation details are truly deferred in reusable components, then the
control skeletons of many low level algorithms within those components (e.g.,
the string search algorithm from the earlier example) may not exist in any
concrete form at the time that the component is entered into a reuse library.
The control structures and their details may only be generated in concrete
form when the component is reused within a specific application context. In
short, abstract representations that can truly defer implementation details
must by necessity be coupled to powerful generation systems that can derive
much of the detail, concrete structure with only a minimal involvement from
the user. And it is only by such abstraction and generation that reuse can
surpass the limitations that currently restrict its payoff.
3 Comparison
This view of reuse falls somewhere between the fully generative view where all
of the componentry is completely built into the generation system and the
fully compositional view where all of the componentry is explicitly and
completely defined by the end users4 of the system. Visual BasicTM is a good
example of a fully generative reuse system and the Foundation Classes for
Microsoft's C compiler, which defines a set of reusable components that
simplify WindowsTM programming, is a good example of the fully
compositional view. The view of representation that is most similar to the view
expressed in this paper is that of Neighbors [4]. Within Neighbor's DRACO
system, there is a set of clearly identifiable componentry (expressed in domain
oriented languages) that the programmer can change and manipulate, but
there is also a generative mechanism (i.e., a transformation engine) that is
driven by some of the components (i.e., those that are expressed as
transformations). This architecture allows the DRACO user to develop
components that are far more abstract than possible with current
programming languages.
While the view of this paper targets roughly the level of component abstraction
in DRACO, it differs from the DRACO view in a couple of ways. It accepts a
larger user involvement in the decision making that controls the generation of
the concrete implementations and therefore, it anticipates a generative system
that is somewhat less general and less automated than the DRACO model. The
details of such a model and the generative architecture that it requires await
the answers to the representation research, which will determine whether or
not this ideal is achievable.
References
[1] Batory, D.S., "Concepts for a Database Compiler", ACM PODS, 1988.
[2] Biggerstaff, T.J., Microelectronics and Computer Technology Corporation,
"An Assessment and Analysis of Software Reuse", in Advances in Computers,
Vol. 34, Academic Press, 1992.
[3] Berlin, L., "When Objects Collide", OOPSLA, 1990.
[4] Neighbors, J., "DRACO: A Method for Engineering Reusable Software
Systems", in Software Reusability, ACM Press, 1989.
[5] Volpano, D. and Kieburtz, R.B., "The Templates Approach to Software
Reuse", in Software Reusability, ACM Press, 1989.
4 Biography
Ted J. Biggerstaff is Research Program Manager of the Intentional Programming
Project at Microsoft Research Laboratories. He is responsible for exploring ways
of enhancing reuse capabilities of application development. Before coming to
Microsoft, he was Director of Design Information at MCC where he led the
research and development of the DESIRE design recovery system. Before that, he
worked for ITT's programming laboratory (on reuse) and Boeing Computer
Service's Advanced Technology and Application Division (on CASE tools). He
received a Ph.D. in Computer Science from the University of Washington in
1976.
1Problem specific application generators such as GENESIS [1], more general
application tool kits such as Visual BasicTM, fourth generation languages such as
dBase, and many others.
2I am using abstraction in its most general form and thus, while our use includes
object-oriented programming as a special case, we intend abstraction to have its
general, non-jargon meaning.
3The advantage of the KMP search algorithm arises out of the ability to avoid
comparisons for many substrings (i.e., the ability to jump over some substrings)
within the long search string. A linked list implementation eliminates most of this
advantage and makes sequencing through the strings an expensive operation.
4 Or at least personnel that are closer to the end user than to the developer of the
reuse system. The author recognizes that the population of the reuse library is
frequently not done by end users (i.e., application developers) but rather by domain
analysts.