The Role of Domain Knowledge

The kind of support for understanding and adaptation that we envision depends on domain knowledge¹: concepts like message acknowledgement, connectionless sockets, and message encryption for a message processing domain. There is empirical evidence that developers learn, remember, and employ chunks of domain knowledge both in constructing new programs and in understanding existing ones [Solo 84]. There is further evidence that expert developers work with chunks that embody domain-specific knowledge, and that it is this knowledge of mappings between domain concepts and programming concepts that is essential to their expert performance [Broo 87, Curt 87].

The key idea is the explicit representation of this domain knowledge as the deeper structure or pattern map behind a program. This pattern map shows in detail how individual parts, whether disparate statements, contiguous statements, procedures and the like, are interrelated. One way to view a pattern map is as a set of instances of pre-defined patterns of programming/domain knowledge that have been composed according to certain rules about data flow, control flow, and other constraints. A number of recent research projects have attempted to account for the structure of software in just this way, and we intend to build upon this work. These projects include the Programmer's Apprentice [Rich 88], PROUST [John 85], and PAT [Hara 90].

To give an example, there is a pattern that describes one way to reassemble messages from individual packets when packets are expected to arrive out of order. This pattern shows how two abstract data types, one for a translation table and one for random access memory (RAM), are used with specific information from the message itself. At the center of this pattern is the strategy for handling temporary storage of incomplete messages. The message id is used in a call to the translation table ``lookup'' operation, giving a starting address for temporary storage of the message until all its packets are received. The message sequence number is multiplied by the packet size to give an offset, which is added to the starting address. This new address, along with the content of the current packet, is used in a call to the ``store'' operation of the RAM data type. This pattern, which shows how certain data flows from the message to calls on operations of the two abstract data types, makes a mapping between two computer science concepts (translation table and RAM) and concepts from the message processing domain.
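The data flow in this pattern can be made concrete with a minimal sketch, written here in Python purely for illustration. Only the ``lookup'' and ``store'' operations come from the description above; the class names, the ``insert'' and ``load'' helpers, the function store_packet, and the fixed PACKET_SIZE of four cells are assumptions introduced for the example, not part of any of the cited systems.

```python
PACKET_SIZE = 4  # assumed: fixed packet payload size, in RAM cells


class TranslationTable:
    """Abstract data type mapping a message id to a starting address."""

    def __init__(self):
        self._start_address = {}

    def insert(self, message_id, start_address):
        # Hypothetical helper: reserve temporary storage for a message.
        self._start_address[message_id] = start_address

    def lookup(self, message_id):
        # The "lookup" operation named in the pattern.
        return self._start_address[message_id]


class RAM:
    """Abstract data type modelling random access memory as a flat list."""

    def __init__(self, size):
        self._cells = [None] * size

    def store(self, address, content):
        # The "store" operation named in the pattern.
        self._cells[address:address + len(content)] = content

    def load(self, address, length):
        # Hypothetical helper used only to read the result back.
        return self._cells[address:address + length]


def store_packet(table, ram, message_id, sequence_number, content):
    """Place one packet at its final position, whatever its arrival order."""
    start = table.lookup(message_id)         # message id -> starting address
    offset = sequence_number * PACKET_SIZE   # sequence number -> offset
    ram.store(start + offset, content)       # new address + packet content


# Usage: three packets of message 7 arrive out of order.
table = TranslationTable()
ram = RAM(32)
table.insert(7, 8)  # message 7 is buffered starting at address 8
for seq, payload in [(2, "rld!"), (0, "hell"), (1, "o wo")]:
    store_packet(table, ram, 7, seq, list(payload))

message = "".join(ram.load(8, 3 * PACKET_SIZE))
```

Because each packet's address is computed from its own sequence number, arrival order does not matter in this sketch. A fuller version would also need to track which packets have been received, a detail the pattern description leaves open.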