Hollingsworth - WISR9 Position Paper

An Architecture for Doing Pointers Right

Dr. Joe Hollingsworth

Indiana University Southeast

PS105-4201 Grant Line Road

New Albany, Indiana 47150 U.S.A.

(812) 941-2425

Email: jholly@ius.edu

Abstract

Component engineering textbooks [Booch87, Horowitz96, Meyers92] do not normally suggest specific methods of using pointers to build unbounded components, but they imply a method: build each unbounded component by using the raw pointer types supplied by the programming language. This approach produces a library of unbounded components each directly implemented in the programming language’s raw pointer types. There is a better way! Encapsulate the power of raw pointer types in a few well-engineered (foundational) components and then layer all your unbounded components on these foundational components. This paper discusses how this is done for dynamically allocated linearly linked structures.

Keywords: engineering handbook, harnessing raw pointers, layered components, mental models.

Workshop Goals: Identify and address technical sub-problems that address software component engineering and the use of software components when building software systems.

Working Groups: Software components, technical aspects of component software component engineering.

1 Background

The Hardware Insulation Tenet

A RESOLVE [Bucci94, Sitaraman94] tenet is to insulate one’s self from the hardware where possible and feasible. Because of this tenet, RESOLVE requires us to examine programming language pointer types since they permit software developers to gain access to the machine at the hardware level through direct manipulation of memory addresses.

Direct manipulation of memory addresses and the dynamic allocation of storage is a very powerful tool, often times too powerful. We all can think of applications on which we have worked, or commercial applications with which we have used, where the misuse of storage occurred. The hardware insulation tenet is not meant to eliminate the use of these powerful tools, it is meant to help us harness them in the safest way possible.

Harnessing Raw Pointers

In programming languages that permit the creation of parameterized abstract data types (e.g., Ada and C++) we are able to create new components which harness the power of raw the programming language’s pointer types, including dynamic storage allocation.

These foundational components are engineered as follows:

They permit no raw pointers to "leak" through the abstract interface, i.e., pointers are fully encapsulated.

They export operations which permit the construction of unbounded generic components, e.g., queues, stacks, trees, one and two way linked lists, etc.

They provide a mental model used by client programmers for correct reasoning about their behavior.

They clean up after themselves by reclaiming allocated storage when it becomes inaccessible by the client program (traditional garbage collection is not used).

Client programmers are able to leverage off these components because the direct manipulation of raw pointers is done once and for all by these foundational components and does not have to be done over and over again.

2 Position

Your engineering handbook should include the following principles with respect to using pointer types:

A software system’s architecture should be engineered such that all raw pointers and dynamic allocation of storage is harnessed in a handful of foundational components and no where else. For example, a system should have a component exporting functionality for building linearly linked structures, if linearly linked structures are to be used by the system. (RESOLVE has a component in its library for this purpose called Chain_Position.)

These foundational (pointer harnessing) components must be used by the system’s software engineers to build the unbounded components required by the system. For example, an unbounded Queue component must be layered on (implemented using) Chain_Position, and no raw pointer types are allowed to appear in Queue’s implementation.

If you must violate either rules #1 or #2, then there must be a defendable reason. For example, a real-time system that requires squeezing every last unneeded cycle out of the code.

If you faithfully follow these engineering principles then your system will have a very well defined architecture with respect to the use of dynamic storage allocation and pointer types.

3 Approach

Since it is unknown ahead of time how much data is to be processed, many of the container components use dynamic storage allocation to build dynamically linked structures. The traditional approach to implementing this type of component is to directly implement them using raw C++ pointer types. An alternative approach taken by RESOLVE is to use component layering so that all components requiring linked structures are layered on another component called Chain_Position, [Hollingsworth92a, Hollingsworth92b]. Component layering is not novel, but the Chain_Position component is novel, and layering all of these unbounded container components on it is novel. The benefits are: Chain_Position encapsulates raw C++ pointers for building linearly linked structures in one component; and it provides an abstract mental model for reasoning purposes [WISR95]. The implementer of a component layered on Chain_Position uses this abstract mental model for reasoning rather than reasoning in terms of raw C++ pointers. Furthermore, the direct manipulation of raw C++ pointers is localized in the Chain_Position component, where there are no more than 30 to 40 lines of executable code.

We benefit three ways from the Chain_Position component: (1) the direct manipulation of pointers for building linearly-linked structures is captured in one place once and for all, and because it is captured, (2) we can leverage off of that work each time we layer a component on Chain_Position. Finally, (3) this approach supports modular reasoning [Weide94, Weide94] about system behavior since all pointers are hidden "under the hood" [Hollingsworth92b, Hollingsworth93]. In a controlled setting, we have evidence that indicates that the layering approach can lead to improved quality and lower development costs, [Zweben95]. In the development of commercial applications we have anecdotal evidence that indicates that layering on components like Chain_Position leads to systems where we can effectively reason about the system’s correct use of storage.

Some components that we have directly layered on Chain_Position include: One_Way_List, Sequence, Stack, and Queue.

To the right is a code sample illustrating an implementation of Queue’s Enqueue operation. Note:

This is a C++ implementation.

Queue is layered on Chain_Position.

new_pos is a Chain_Position object

Label_New, Swap_With_Target and Apply_Target are exported operations from Chain_Position.

There are no raw pointers in this code.

template <template_parameters>

void Queue_1<class_parameters>::Enqueue (

consumes Item& item)

{

object Item_Chain_Position new_pos;

new_pos.Label_New (item);

rep->rear.Swap_With_Target (new_pos);

rep->rear.Apply_Target ();

rep->size++;

} // Enqueue

PowerScout [Hollingsworth99] a successful commercial application developed by the author was developed for the Microsoft Windows environment following the RESOLVE/C++ discipline including the engineering principles listed in Section 2. All unbounded components in PowerScout are layered on Chain_Position.

3 Comparison

A De Facto Standard

It appears that directly implementing each unbounded component in the programming language’s raw pointer types is a de facto standard dating to the days when programming languages did not support generic components. Without genericity, there is no satisfactory way to effectively create and use components like Chain_Position. Therefore, in the "old days", the only way to go was to implement each unbounded component in the library using raw pointer types.

With the advent of genericity in Ada and C++, authors of data structure books and practitioners did the obvious thing: they transformed their library of unbounded non-generic components into unbounded generic components. It appears that they did not take the time to think of innovative ways to make use of generic components in order to capture the common pointer manipulation routines found throughout their library of unbounded components.

Here are some examples found from various authors:

Author	Internal Representation	Sample Operation
[Booch87]	private type Node; type Stack is access Node; type Node is record The_Item: Item; Next: Stack; end record;	procedure Push ( The_Item: in Item, On_The_Stack: in out Stack) is begin On_The_Stack : new Node’(The_Item The_Item, Next On_the_Stack); … end Push;
[Horowitz96]	private: struct node { Type data; struct node *link; };	bool Stack::Add (const Type item) { struct node temp new* node; if (temp) { temp->data item; temp->link top; top temp; return true; } else { … } }
[Meyers92]	private: struct StackNode { T data; StackNode next; StackNode ( const* T& newData, StackNode *nextNode) : data (newData), next (nextNode) {}; };	void Stack::push (const T& object) { top new StackNode (object, top); … }

References

[Booch87] Booch, G., Software Components with Ada, Benjamin/Cummings, Menlo Park, California, 1987.

[Bucci94] Bucci, P., Hollingsworth, J.E., Krone, J., and Weide, B.W., "Implementing Components in RESOLVE", Software Engineering Notes, Vol. 19, No. 4, pp 40 - 51, October, 1994.

[Hollingsworth92a] Hollingsworth, J.E. and Weide, B.W., "Engineering ‘Unbounded’ Reusable Ada Generics", Proceedings of 10th Annual National Conference on Ada Technology Arlington, VA, February 1992.

[Hollingsworth92b] Hollingsworth, J.E., "Software Component Design-for-Reuse: A Language Independent Discipline Applied to Ada", Ph.D. thesis, Dept. of Computer & Information Science, The Ohio State University, Columbus, OH, 1992. Available electronically at http://www.cis.ohio-state.edu/rsrg

[Hollingsworth93] Hollingsworth, J.E. "Uncontrolled Reference Semantics Thwart Local Certifiability", Proceedings of the Sixth Annual Workshop on Software Reuse, November 1993. Available electronically at: http://www.umcs.maine.edu/~ftp/wisr/wisr.html

[Hollingsworth99] Hollingsworth, J.E., "Experience Report: Applying the RESOLVE/C++ Discipline to the Design and Development a Commercial Software Application", submitted to 21st International Conference on Software Engineering, Los Angles, California, May 1999.

[Horowitz96] Horowitz, E., Sahni, S., Rajasekaran, S., Computer Algorithms/C++, Computer Science Press, New York, New York, 1996.

[Meyers92] Meyers, S., Effective C++/Effective Ways to Improve Your Programs and Designs, Addison-Wesley, Reading, Massachusetts, 1992.

[Sitaraman94] Sitaraman, M., and Weide, B.W., eds., "Component-based software using RESOLVE", ACM Software Eng. Notes 19, 4 (1994), 21-67.

[Weide94] Weide, B.W., and Hollingsworth, J.E., "On Local Certifiability of Software Components," tech report #OSU-CISRC-1/94-TR04, the Dept. of Computer and Information Science, Ohio State Univ., Columbus, OH, January, 1994.

[Weide95] Weide, B.W., Hollingsworth, J.E. and Heym, W.D., "Reverse Engineering of Legacy Code Exposed", Proceedings 17th International Conference on Software Engineering ACM, Apr 1995, pp. 327-331.

[WISR95] Edwards, S.E., Hollingsworth, J.E., Latour, L., Weide, B.W., "Working Group Report: Micro-Architecture of Software Components And The Need For Good Mental Models of Software", Proceedings of the Seventh Annual Workshop on Software Reuse, August 1995. Available electronically at: http://www.umcs.maine.edu/~ftp/wisr/wisr.html

[Zweben95] Zweben, S.H., Edwards, S.E., Hollingsworth, J.E., and Weide, B.W., "The Effects of Layering and Encapsulation on Software Development Cost and Quality," IEEE Transactions on Software Engineering, Vol. 21, No. 3, March 1995.

Biography

Joe Hollingsworth is an associate professor in computer science at Indiana University Southeast. He has a Ph.D. in software engineering from the Reusable Software Research Group (RSRG) at The Ohio State University (1992). His research focuses on various aspects of software engineering and reuse including design-for-reuse, formal methods, testing and education, to name a few. He has authored several technical papers on related topics in software engineering. He also serves as a consultant to Holly Software, Inc. where he has used the RESOLVE and RESOLVE/C++ to develop a number of commercial software packages.