An Architecture for Doing Pointers Right

Dr. Joe Hollingsworth
Indiana University Southeast
PS105-4201 Grant Line Road
New Albany, Indiana 47150 U.S.A.
(812)941-2425
jholly@ius.edu

Abstract

Component engineering textbooks [Booch87, Horowitz96, Meyers92] do = not normally suggest specific methods of using pointers to build unbounded components, but = they imply a method: build each unbounded component by using the raw pointer types supplied = by the programming language. This approach produces a library of unbounded components each = directly implemented in the programming language’s raw pointer types. There = is a better way! Encapsulate the power of raw pointer types in a few well-engineered = (foundational) components and then layer all your unbounded components on these = foundational components. This paper discusses how this is done for dynamically allocated = linearly linked structures.

Keywords: engineering handbook, harnessing = raw pointers, layered components, mental models.

Workshop Goals: Identify and address technical sub-problems that = address software component engineering and the use of software components when building = software systems.

Working Groups: Software components, technical aspects of = component software component engineering.

1. Background

The Hardware Insulation Tenet

A RESOLVE [Bucci94, Sitaraman94] tenet is to insulate one’s = self from the hardware where possible and feasible. Because of this tenet, RESOLVE requires us = to examine programming language pointer types since they permit software = developers to gain access to the machine at the hardware level through direct manipulation of memory = addresses.

Direct manipulation of memory addresses and the dynamic allocation = of storage is a very powerful tool, often times too powerful. We all can think of = applications on which we have worked, or commercial applications with which we have used, where the = misuse of storage occurred. The hardware insulation tenet is not meant to eliminate the = use of these powerful tools, it is meant to help us harness them in the safest way = possible.

Harnessing Raw Pointers

In programming languages that permit the creation of parameterized = abstract data types (e.g., Ada and C++) we are able to create new components which harness = the power of raw the programming language’s pointer types, including dynamic = storage allocation.

These foundational components are engineered as follows:=20

2. Position

Your engineering handbook should include the following principles = with respect to using pointer types:=20

    1. A software system’s architecture should be engineered such = that all raw pointers and dynamic allocation of storage is harnessed in a handful of = foundational components and no where else.

      For example, a system should have a component exporting = functionality for building linearly linked structures, if linearly linked structures are to = be used by the system. (RESOLVE has a component in its library for this purpose called = Chain_Position.)
    2. These foundational (pointer harnessing) components must be used = by the system’s software engineers to build the unbounded components required by = the system.

      For example, an unbounded Queue component must be layered on = (implemented using) Chain_Position, and no raw pointer types are allowed to appear in = Queue’s implementation.
    3. If you must violate either rules #1 or #2, then there must be a = defendable reason.

      For example, a real-time system that requires squeezing every = last unneeded cycle out of the code.

If you faithfully follow these engineering principles then your = system will have a very well defined architecture with respect to the use of dynamic storage = allocation and pointer types.

3. Approach

Since it is unknown ahead of time how much data is to be processed, = many of the container components use dynamic storage allocation to build = dynamically linked structures. The traditional approach to implementing this type of = component is to directly implement them using raw C++ pointer types. An alternative approach = taken by RESOLVE is to use component layering so that all components requiring linked = structures are layered on another component called Chain_Position, [Hollingsworth92a, = Hollingsworth92b]. Component layering is not novel, but the Chain_Position component is novel, and = layering all of these unbounded container components on it is novel. The benefits are: = Chain_Position encapsulates raw C++ pointers for building linearly linked structures = in one component; and it provides an abstract mental model for reasoning purposes = [WISR95]. The implementer of a component layered on Chain_Position uses this abstract mental = model for reasoning rather than reasoning in terms of raw C++ pointers. Furthermore, the = direct manipulation of raw C++ pointers is localized in the Chain_Position component, where = there are no more than 30 to 40 lines of executable code.

We benefit three ways from the Chain_Position component: (1) the = direct manipulation of pointers for building linearly-linked structures is captured in one = place once and for all, and because it is captured, (2) we can leverage off of that work = each time we layer a component on Chain_Position. Finally, (3) this approach supports = modular reasoning [Weide94, Weide94] about system behavior since all pointers are hidden = "under the hood" [Hollingsworth92b, Hollingsworth93]. In a controlled = setting, we have evidence that indicates that the layering approach can lead to improved quality = and lower development costs, [Zweben95]. In the development of commercial = applications we have anecdotal evidence that indicates that layering on components like = Chain_Position leads to systems where we can effectively reason about the system’s correct = use of storage.

Some components that we have directly layered on Chain_Position = include: One_Way_List, Sequence, Stack, and Queue.

To the right is a = code sample illustrating an implementation of Queue’s Enqueue operation. Note:
  • This is a C++ implementation.
  • Queue is layered on Chain_Position.
  • new_pos is a Chain_Position object
  • Label_New, Swap_With_Target and Apply_Target are exported = operations from Chain_Position.
  • There are no raw pointers in this code.
template <template_parameters>
void Queue_1<class_parameters>::Enqueue (
      consumes Item& item
   )
{=20
   object Item_Chain_Position new_pos;

   new_pos.Label_New (item);
   rep->rear.Swap_With_Target (new_pos);=20
   rep->rear.Apply_Target ();
   rep->size++;
} // Enqueue

PowerScout=E4 [Hollingsworth99] a = successful commercial application developed by the author was developed for the = Microsoft=E2 Windows=E4 environment following the = RESOLVE/C++ discipline including the engineering principles listed in Section 2. All unbounded = components in PowerScout are layered on Chain_Position.

4. = Comparison

A De Facto Standard

It appears that directly implementing each unbounded component in = the programming language’s raw pointer types is a de facto standard dating to the = days when programming languages did not support generic components. Without = genericity, there is no satisfactory way to effectively create and use components like = Chain_Position. Therefore, in the "old days", the only way to go was to implement each = unbounded component in the library using raw pointer types.

With the advent of genericity in Ada and C++, authors of data = structure books and practitioners did the obvious thing: they transformed their library of = unbounded non-generic components into unbounded generic components. It appears = that they did not take the time to think of innovative ways to make use of generic = components in order to capture the common pointer manipulation routines found throughout their = library of unbounded components.

Here are some examples found from various authors:

Author

Internal Representation

Sample Operation

[Booch87]
private
   type Node;
   type Stack is access Node;
   type Node is record=20
      The_Item: Item;
      Next: Stack;
   end record;
procedure Push (
      The_Item: in Item, On_The_Stack: in out Stack) =
is
begin
   On_The_Stack :=3D
      new Node’(The_Item =3D> The_Item, Next =3D> =
On_the_Stack);
   …
end Push;
[Horowitz96]
private:
   struct node {
      Type data;
      struct node *link;
   };
bool =
Stack::Add (const Type item)
{
   struct node *temp =3D new node;
   if (temp) {
      temp->data =3D item;=20
      temp->link =3D top;
      top =3D temp;=20
      return true;
   } else { … }
} 
[Meyers92]
private:
struct StackNode {
   T data;
   StackNode *next;
   StackNode (
         const T& newData,
         StackNode *nextNode) :
      data (newData), next (nextNode)
   {};
};
void Stack::push (const T& =
object)
{
   top =3D new StackNode (object, top);
   …
}

References

[Booch87] Booch, G., Software Components with = Ada, Benjamin/Cummings, Menlo Park, California, 1987.

[Bucci94] Bucci, P., Hollingsworth, J.E., Krone, J., = and Weide, B.W., "Implementing Components in RESOLVE", Software Engineering = Notes, Vol. 19, No. 4, pp 40 - 51, October, 1994.

[Hollingsworth92a] Hollingsworth, J.E. and Weide, = B.W., "Engineering ‘Unbounded’ Reusable Ada Generics", Proceedings of = 10th Annual National Conference on Ada Technology Arlington, VA, February 1992.

[Hollingsworth92b] Hollingsworth, J.E., = "Software Component Design-for-Reuse: A Language Independent Discipline Applied to = Ada", Ph.D. thesis, Dept. of Computer & Information Science, The Ohio State University, = Columbus, OH, 1992. Available electronically at http://www.cis.ohio-= state.edu/~weide/rsrg/

[Hollingsworth93] Hollingsworth, J.E. = "Uncontrolled Reference Semantics Thwart Local Certifiability", Proceedings of the = Sixth Annual Workshop on Software Reuse, November 1993. Available electronically at: http://www.umcs.m= aine.edu/~ftp/wisr/wisr.html

[Hollingsworth99] Hollingsworth, J.E., = "Experience Report: Applying the RESOLVE/C++ Discipline to the Design and Development a Commercial = Software Application", submitted to 21st International Conference on = Software Engineering, Los Angles, California, May 1999.

[Horowitz96] Horowitz, E., Sahni, S., Rajasekaran, = S., Computer Algorithms/C++, Computer Science Press, New York, New York, = 1996.

[Meyers92] Meyers, S., Effective C++/Effective = Ways to Improve Your Programs and Designs, Addison-Wesley, Reading, Massachusetts, = 1992.

[Sitaraman94] Sitaraman, M., and Weide, B.W., eds., = "Component-based software using RESOLVE", ACM Software Eng. Notes 19, 4 = (1994), 21-67.

[Weide94] Weide, B.W., and Hollingsworth, J.E., = "On Local Certifiability of Software Components," tech report = #OSU-CISRC-1/94-TR04, the Dept. of Computer and Information Science, Ohio State Univ., Columbus, OH, = January, 1994.

[Weide95] Weide, B.W., Hollingsworth, J.E. and Heym, = W.D., "Reverse Engineering of Legacy Code Exposed", Proceedings 17th = International Conference on Software Engineering ACM, Apr 1995, pp. 327-331.

[WISR95] Edwards, S.E., Hollingsworth, J.E., Latour, = L., Weide, B.W., "Working Group Report: Micro-Architecture of Software Components = And The Need For Good Mental Models of Software", Proceedings of the Seventh = Annual Workshop on Software Reuse, August 1995. Available electronically at: http://www.umcs.m= aine.edu/~ftp/wisr/wisr.html

[Zweben95] Zweben, S.H., Edwards, S.E., Hollingsworth,= J.E., and Weide, B.W., "The Effects of Layering and Encapsulation on Software = Development Cost and Quality," IEEE Transactions on Software Engineering, Vol. = 21, No. 3, March 1995.

Biography

Joe Hollingsworth (jholly@ius.edu)
PS105-4201 Grant Line Rd, New Albany, IN 47150 USA
http://www.ius.indiana.edu/jh= olly

Hollingsworth is an associate professor in computer science at = Indiana University Southeast. He has a Ph.D. in software engineering from the Reusable Software = Research Group (RSRG) at The Ohio State University (1992). His research focuses on various = aspects of software engineering and reuse including design-for-reuse, formal methods, = testing and education, to name a few. He has authored several technical papers on related = topics in software engineering. He also serves as a consultant to Holly Software, Inc. = where he has used the RESOLVE and RESOLVE/C++ to develop a number of commercial software = packages.