
Why Neither Java Components Nor Formal Methods Can Do It Alone

Murali Sitaraman
Computer Science and Electrical Engineering
West Virginia University
Morgantown, WV 26506-6109
Email: murali@csee.wvu.edu
Project URL: http://www.csee.wvu.edu/~resolve

Abstract

Component-based technologies and formal methods both have significant potential to improve the productivity and quality of all systems, including safety-critical ones. To realize this potential, they need to be combined to enable specification of, and reasoning about, practical components. Technical problems in specification and modular reasoning that are routinely deemed to be implementation issues by the formal methods community, but considered limitations of formal methods by practitioners, need to be tackled. Examples of such problems include reference handling, storage-related issues, and inversion of control. Satisfactory solutions to these and other problems involve simultaneous consideration of formal specification and practical component implementation issues.

Keywords: mathematical modeling, practical components, pointers, specification, reasoning, and technology transition.

Workshop Goals: Identify and resolve the technical sub-problems that must be solved to bridge the gap between formal component specification methods and practical components.

Working Groups: Software components, technical aspects of component technology transition.

1 Background

This paper is based upon a conversation among Professor Obvious, Mr. Impossible, and Joan Student. Professor Obvious has a background in formal methods and software engineering. Mr. Impossible is a technical project leader of a major software development effort at Impresso, Inc. Joan Student is majoring in Computer Science.

2 Position

Neither Java components nor formal methods can do it alone.

3 Approach

Joan Student, Professor Obvious, and Mr. Impossible ran into each other in a remote part of Texas.

Professor Obvious: Hi Joan! I am glad to hear you are majoring in Computer Science. Wise choice! Maybe I should quiz you on a thing or two to see what kind of education you are getting. Tell me what the following piece of code does.

Read(x); Read(y);
x := x + y;
y := x - y;
x := x - y;
Write(x); Write(y);

Joan Student: I sure can. This code exchanges the values of the two Integers x and y, under certain conditions.

Professor Obvious: It works under all conditions. Do you not see that it works for negative numbers as well?

Joan Student: It works for negative numbers, but it has an overflow/underflow problem. For example, if x = Max_Int and y = 1, there will be an overflow in the very first statement.
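Joan's objection can be made concrete with a small Java sketch. The class and method names below are hypothetical. Note one assumption: plain Java int arithmetic wraps around silently, and under wrap-around the three assignments happen to produce the swapped values anyway. To model a language in which exceeding Max_Int is an error, the sketch uses the checked operations Math.addExact and Math.subtractExact, which throw ArithmeticException on overflow.

```java
// Illustrative sketch only; SwapOverflowDemo and its methods are hypothetical names.
final class SwapOverflowDemo {

    // The three-assignment swap from the dialogue, using checked arithmetic
    // so that an overflow is reported instead of silently wrapping around.
    static int[] checkedSwap(int x, int y) {
        x = Math.addExact(x, y);       // x := x + y  (may overflow)
        y = Math.subtractExact(x, y);  // y := x - y
        x = Math.subtractExact(x, y);  // x := x - y
        return new int[] {x, y};
    }

    // Returns true when the swap cannot be carried out within int bounds.
    static boolean overflows(int x, int y) {
        try {
            checkedSwap(x, y);
            return false;
        } catch (ArithmeticException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        int[] swapped = checkedSwap(3, -7);
        System.out.println(swapped[0] + " " + swapped[1]);     // negative values are fine
        System.out.println(overflows(Integer.MAX_VALUE, 1));   // Joan's counterexample
    }
}
```

Under these assumptions the swap succeeds for 3 and -7 but is rejected for x = Integer.MAX_VALUE and y = 1, which is the "certain conditions" caveat Joan raises.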

Professor Obvious: I can't believe this. Aren't they teaching you that you should ignore practical and implementation details such as Max_Int?

Mr. Impossible: On the contrary, you should not ignore practical implementation issues. She is right, professor. Your swap code works only under certain conditions. "Toy" code such as this is too easy. Joan, see if you can figure out what the following code does. This is the type of stuff you see in the industry. This code is from a logic database system under development at my company.

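[Code listing not recoverable in this copy. The surviving fragments are the identifiers FactHolder and HashTable and a return statement.]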

Professor Obvious: Obviously, it is impossible to say what this code does.

Joan Student: What made it possible for me to reason and understand the Integer code was that I could associate Integer variables and operations such as + and - with their mathematical counterparts. So it would seem that what I need here are mathematical models for Fact_Holder and HashTable objects, and abstract explanations of the operations in terms of the mathematical models. With abstract interface descriptions, we can reason about code that reuses objects, without understanding details of the objects. Without them, to understand code such as the one above, we have to understand the code for each reused operation.

Mr. Impossible: Most often the only way to really understand operations like the ones we use in the industry is to understand their code. This idea of modular reasoning that is based on specifications of objects is actually good, but the thing is, it is really impossible to pin down precisely what the objects used in the industry do in mathematical terms. They involve pointers and such, and it is just too complicated. For example, look at the following description of the Dictionary object in the Java class library. HashTable, used in the code above, is an implementation of this object. Remember, in Java, every object is represented using a pointer!

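[Excerpt from the Java class library documentation for Dictionary not recoverable in this copy. The surviving fragments are "public", "Methods", "public abstract", "Returns", and "Throws".]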

Professor Obvious: The mathematical modeling and specification of a Dictionary component is straightforward. A dictionary is modeled mathematically as a (partial) function from the domain of key values to range values. The Put operation adds a new key-to-range-value mapping to the Dictionary, without changing any of the existing mappings. For example, see the specification of a symbol table component, which is similar in spirit to the Dictionary component, in Larch, VDM, and Z notations in [Wing 90].
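One way to write down the professor's model is sketched below in LaTeX. The symbols d, d', k, and v and the set names Key and Value are illustrative choices, not notation from the cited specifications. The precondition is an assumption read off the phrase "adds a new key".

```latex
% d is the dictionary before Put, d' the dictionary after Put(k, v).
\[
  d : \mathit{Key} \rightharpoonup \mathit{Value}
\]
\[
  \textbf{requires}\;\; k \notin \mathrm{dom}(d)
  \qquad
  \textbf{ensures}\;\; d' = d \cup \{\, k \mapsto v \,\}
\]
```

Because d' contains every pair of d unchanged, the postcondition expresses "without changing any of the existing mappings" directly.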

Joan Student: But there is a problem. This abstract and simple description, unfortunately, creates a mismatch with the pointer-based interface description in Java. Take a look at the following code, for example. If the HashTable Java implementation and the professor's Dictionary specification are combined, unsound reasoning results.

d.put("dog", dog);
dog.Decorate_Tail();

Here is what actually happens in the implementation:
d = {"dog" -> "Picture of a dog with decorated tail", all other strings -> "empty picture"}

Here is what the specification says happens:
d = {"dog" -> "Picture of a dog", all other strings -> "empty picture"}
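The mismatch Joan describes can be reproduced with java.util.Hashtable. This is a minimal sketch: the Picture class, its description string, and the method names are hypothetical stand-ins for the objects in the dialogue.

```java
import java.util.Hashtable;

// Illustrative sketch only; AliasingDemo and Picture are hypothetical names.
final class AliasingDemo {

    // Stand-in for the dialogue's picture object; its state is mutable.
    static final class Picture {
        private String description;

        Picture(String description) {
            this.description = description;
        }

        void decorateTail() {
            description = description + " with decorated tail";
        }

        String description() {
            return description;
        }
    }

    // Runs the two statements from the dialogue and reports what the table holds afterwards.
    static String storedAfterDecorating() {
        Hashtable<String, Picture> d = new Hashtable<>();
        Picture dog = new Picture("Picture of a dog");
        d.put("dog", dog);     // stores a reference to the same object, not a copy
        dog.decorateTail();    // mutates the object the table also refers to
        return d.get("dog").description();
    }

    public static void main(String[] args) {
        System.out.println(storedAfterDecorating());
    }
}
```

The table reports the decorated picture even though no operation on d was called after put, which is exactly the behavior the value-based specification cannot predict.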

Mr. Impossible: Isn't it yet obvious to Professor Obvious that specification of practical components used in the industry is impossible?

Professor Obvious: The problem is with the implementation. The Put method should not copy pointers to objects. It should copy values.

Mr. Impossible: In industrial practice, efficiency is an important consideration. If objects are complex GIF pictures, copying them is expensive. The problem, of course, is with specification techniques.

Professor Obvious: Not entirely. It is true that my modeling of Dictionary does not match the implementation. Instead, consider this model: a Dictionary is a mathematical mapping of key addresses (integers) to range value addresses.

Dictionary = Key_Object_Address -> Range_Object_Address

Key_Mapping and Range_Value_Mapping are global mappings from mathematical addresses to contents. With this new model for a dictionary, we can capture the essence of the Java Dictionary, which is based on pointers and pointer parameter copying (though I think it is a bad idea).
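The address-based model can be mimicked in Java to show what it commits a reasoner to. This is a sketch under stated assumptions: the class name, the field names, and the addresses 100 and 200 are all hypothetical, and the two contents mappings are ordinary fields here rather than truly global state.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only; all names and addresses are hypothetical.
final class AddressModelDemo {

    // Mappings from abstract addresses to contents (the professor's global mappings).
    final Map<Integer, String> keyMapping = new HashMap<>();
    final Map<Integer, String> rangeValueMapping = new HashMap<>();

    // Dictionary = Key_Object_Address -> Range_Object_Address
    final Map<Integer, Integer> dictionary = new HashMap<>();

    // Contents currently reachable through the dictionary for the key at keyAddress.
    String contentsFor(int keyAddress) {
        Integer valueAddress = dictionary.get(keyAddress);
        return valueAddress == null ? null : rangeValueMapping.get(valueAddress);
    }

    // Returns the contents seen through the dictionary before and after Decorate_Tail.
    static String[] run() {
        AddressModelDemo m = new AddressModelDemo();
        m.keyMapping.put(100, "dog");                       // address 100 holds the key
        m.rangeValueMapping.put(200, "Picture of a dog");   // address 200 holds the picture
        m.dictionary.put(100, 200);                         // Put records addresses only
        String before = m.contentsFor(100);

        // Decorate_Tail updates the contents mapping; the dictionary itself is untouched.
        m.rangeValueMapping.put(200, "Picture of a dog with decorated tail");
        String after = m.contentsFor(100);
        return new String[] {before, after};
    }

    public static void main(String[] args) {
        String[] result = run();
        System.out.println(result[0]);
        System.out.println(result[1]);
    }
}
```

The model now predicts the implementation's behavior, but only because every lookup goes through a mapping that any other code may change.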

Joan Student: While the professor's new specification captures the intended behavior of the Java component, it complicates specification and reasoning considerably by introducing global functions for each object. In reasoning about code such as the one from Mr. Impossible that uses a Dictionary, it becomes essential to "reason" about pointers all over the place. In addition, following this idea, specification and reasoning about every component will involve pointers and related complications!

Mr. Impossible: Thanks Joan. I am ever more convinced of my position -- formal specifications are not suitable for practical components!

Professor Obvious: On the contrary, I am ever more convinced that the problem is with industrial implementations.

Joan Student: Maybe there is a middle ground here that requires compromises from both specifiers and implementers. Specifiers need to be aware of implementation considerations and implementers need to use efficient techniques that do not necessarily complicate behavioral descriptions or reasoning. For the Dictionary Put operation, there is no need to copy pointers or values. The truly intended behavior here is that the new "key - value" mapping should be transferred to the dictionary. According to the RESOLVE folks, this idea of transferring values is conceptually easy to explain in specification/reasoning and it is possible to implement the idea efficiently using pointers [Harms 91]. These and other ideas are a part of the RESOLVE discipline, a key intent of which is to help bridge the gap between specification theory and implementation practice [Hollingsworth 94, Sitaraman 94]. The RESOLVE Software Composition Workbench is an environment to facilitate specification, reasoning, and testing of components and component-based systems [Atkinson 98].
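The transfer idea can be approximated in Java as follows. This is only a sketch and not the RESOLVE design: Owned, TransferDictionary, and their methods are hypothetical names, only the value (not the key) is transferred, and Java does not prevent a client from keeping a second reference to the object, so the scheme relies on the discipline that objects are reached only through their holder.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only; not the RESOLVE design. All names are hypothetical.
final class TransferDemo {

    // Single-owner holder: take() hands over the content and leaves the holder empty.
    static final class Owned<T> {
        private T content;

        Owned(T content) {
            this.content = content;
        }

        boolean isEmpty() {
            return content == null;
        }

        T take() {
            if (content == null) {
                throw new IllegalStateException("content already transferred");
            }
            T result = content;
            content = null;   // the caller's holder no longer reaches the object
            return result;
        }
    }

    // Dictionary whose put moves the value in instead of sharing or copying it.
    static final class TransferDictionary<K, V> {
        private final Map<K, V> entries = new HashMap<>();

        // Only a reference changes hands; the object itself is never copied.
        void put(K key, Owned<V> value) {
            entries.put(key, value.take());
        }

        V get(K key) {
            return entries.get(key);
        }
    }

    public static void main(String[] args) {
        TransferDictionary<String, String> d = new TransferDictionary<>();
        Owned<String> dog = new Owned<>("Picture of a dog");
        d.put("dog", dog);
        System.out.println(dog.isEmpty());   // the caller gave the picture away
        System.out.println(d.get("dog"));
    }
}
```

After put, the caller's holder is empty, so a later attempt to decorate the tail through it cannot silently change what the dictionary holds, and the cost of put stays that of a pointer assignment.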

Professor Obvious: That is interesting. I always thought that RESOLVE research was too implementation-oriented, and they didn't care about writing elegant specifications or ease of reasoning.

Mr. Impossible: I have heard of RESOLVE too. But I had no idea that they thought about practical implementation issues such as use of pointers or efficiency.

References

[Atkinson 98] Atkinson, S., "(Unreasonable Software) Reuse is Unreasonable (Software Reuse)," WISR proceedings, Austin, TX, January 1998.

[Harms 91] Harms, D.E., and Weide, B.W., "Copying and Swapping: Influences on the Design of Reusable Software Components," IEEE Transactions on Software Engineering 17, No. 5, May 1991, pp. 424-435.

[Hollingsworth 94] Hollingsworth, J., Sreerama, S., Weide, B.W., and Zhupanov, S., "RESOLVE Components in Ada and C++," ACM SIGSOFT Software Engineering Notes 19, 4, October 1994, pp. 52-63.

[Java 98] http://www.javasoft.com.

[Leavens 91] Leavens, G., "Modular Specification and Verification of Object-Oriented Programs," IEEE Software 8, No. 4, July 1991, pp. 72-80.

[Sitaraman 94] Sitaraman, M., and Weide, B.W., eds., "Special Feature: Component-based software using RESOLVE," ACM Software Eng. Notes 19, 4 (1994), 21-67.

[Sitaraman 96] Sitaraman, M., "Impact of Performance Considerations on Formal Specification Design," Formal Aspects of Computing, Springer-Verlag International, Vol. 8, No. 6, January 1997, pp. 716-736.

[Szyperski 98] Szyperski, C. Component Software, Beyond Object-Oriented Programming, Addison-Wesley, 1998.

[Wing 90] Wing, J. M., "A Specifier's Introduction to Formal Methods," IEEE Computer 23(9), September 1990, 8-24.

Biography

Murali Sitaraman is an Associate Professor of Computer Science and Electrical Engineering at West Virginia University, Morgantown, WV 26506. He leads the Reusable Software Research Group at WVU. His areas of interest span theoretical, practical, and educational aspects of software engineering. Sitaraman is one of the principal investigators of the RESOLVE research effort. His research is funded by grants from NSF, DARPA, NASA, and the Department of Education.

Sitaraman served as the program chairman of the IEEE Computer Society Fourth International Conference on Software Reuse, held at Orlando, Florida in 1996. He also served as the co-chair of the Workshop on Foundations of Component-Based Systems at Zurich, Switzerland in 1997. He is the panel chair of the ACM SIGSOFT 1999 Symposium on Software Reusability. He has published over 50 papers on a variety of topics in software engineering.