Scoping the Task and Application Domain for Knowledge Acquisition

 

 

Patricia Cornwell

Hewlett-Packard Company

1501 Page Mill Road, MS 1U-4

Palo Alto, CA 94304-1100

patricia_cornwell@hp.com

650-857-6821 (office)

650-852-3732 (fax)

 

Abstract

Borrowing from past research in AI-based knowledge modeling, we argue that scoping the domain for knowledge acquisition can be aided by making a clear distinction between the scope of the task and the scope of the application domain in which the task is carried out. Our previous work in defining a domain analysis method integrated the scoping of the task with the scoping of the domain. Enhancing this method with separate consideration of the task scope and the domain scope is especially helpful for knowledge-based software engineering approaches to domain engineering. Nevertheless, in most product line cases there is no business benefit to having a truly generic task model. Therefore, we recommend that the task and application domain be modeled separately but interdependently in the domain analysis.

Keyword List: Domain analysis, domain engineering, task-structured analysis, knowledge modeling, domain modeling, domain ontology, task ontology

Personal Goals for WISR Participation

  • Re-establish participation in the Reuse community
  • Update understanding of state-of-the-practice in software reuse technologies, especially component-based reuse
  • Contribute in-the-trenches experiences of the past few years as a company-internal reuse consultant
Working Groups of Interest

  • Product Line Architectures
  • Component-based Reuse
  • Domain Analysis

    1 Background

    During most of the past decade, I served as a company-internal consultant on software reuse. In that capacity, I developed a domain analysis method [Cornwell96], based on earlier work with Mark Simos [Simos95]. Product design teams in Hewlett-Packard applied the method to produce product-line architectures, to capture domain knowledge, and to evaluate the reusability of existing software work products. As a consultant, I worked closely with the product design teams with the goal that they would be able to apply the HP DA method in the future without the need for a consultant. This involved intense consulting to transfer deeper knowledge about domain analysis: the knowledge that comes from years of applying the method in diverse environments and developing tacit knowledge about what to apply when. During that time, the method was applied in diverse HP businesses, none of which thought of themselves as a "software business," even when the majority of their engineering staff wrote software or firmware. In parallel, I served as an active participant in the development of the STARS Conceptual Framework for Reuse Processes (CFRP) [Stars93]. I co-authored Volume 2 of the CFRP documentation and contributed to Volume 1.

    More recently, I rejoined HP Laboratories to lead research in knowledge representation and reasoning technologies for codified knowledge that is both task- and domain-specific. The targeted research transfer group thinks of itself as a "knowledge management" organization, even though the knowledge is often captured in executable code. This organization has even less traditional software engineering experience and consists of domain experts. Our research focuses on knowledge/software creation and evolution technologies that are easy for the domain experts to use. That is, the domain engineering tasks employ technologies that do not require a degree in computer science or software engineering, but can assume some practical software development experience.

     

    2 Position

    The domain of focus in a domain analysis can be thought of as an intersection of the task being accomplished by the software and the range over which the task operates. Our previous domain analysis work called out specific models for user modeling, context and environment modeling, and capability modeling [Cornwell96]. However, the conceptual and capability models were a blend of the task and the domain of application. Some DA examples in the literature capture feature models [Kang90] or other representations that blur the boundaries between the task and the range of application. Our capability models were guilty of exactly this lack of clarity. When building conceptual models, in some domains we would model the subtasks as the "bubbles" and the objects of the subtasks as the "arcs." For other models, we modeled the objects as the "bubbles" and the subtasks as the "arcs." However, when it came time to do a more detailed capability model, the typically hierarchical decomposition of the domain led to mixed-focus representations that confused the domain experts, the analysts, and the targeted users of the models. More commonly, the domain analysts would settle into the domain expert's comfort zone: talking about either objects or procedures, with the other dimension under-modeled.

    2.1 A Structural Linguistics Interpretation

    Simplistically, we can think of a brief description of the domain as a combination of a verb (the task), a direct object (the application domain), and an indirect object (the user or agent) in some environmental context. Using the HP DA method, we have often elicited the capabilities of the domain by asking the domain experts, "What do systems in the domain do?" and probed for transitive verbs and objects (e.g., "The domain structures outlines..." or "The domain interprets measurements..."). However, the resulting descriptions made it easy to spot the people who were procedural programmers. They defined the domain with transitive verbs, but neglected to state the direct or indirect objects. OO advocates were just as likely to draw pages of object-bubbles, leaving unclear what task and subtasks the overall domain could handle. The use of passive voice further undermined the scoping of subdomains: "Measurements are interpreted" leaves unspecified who (which subdomain) does the interpreting. In our experience this led to many problems. In the extreme, software engineers made inconsistent decisions about which subtasks were handled by which part of the system, and the resulting implementation had interface inconsistencies or implementation inconsistencies. We made significant improvements in the quality of the domain architectures and system architectures by repeatedly asking "Who does what to whom?" Unfortunately, we did not follow through by ensuring that the conceptual and capability models had a clear depiction of the task-subtask model and the application-domain model. In effect, our models committed the graphical equivalent of using passive voice or intransitive verbs.
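
    To make the "Who does what to whom?" discipline concrete, the sketch below (in Python, purely illustrative; the names Capability, agent, and so on are hypothetical and are not part of the HP DA method) records each elicited capability as a complete transitive clause, so that neither the agent nor the object can be silently omitted.

        from dataclasses import dataclass

        @dataclass(frozen=True)
        class Capability:
            """One elicited capability, stated as a complete transitive clause."""
            agent: str          # who: the subdomain or user/agent responsible
            task_verb: str      # does what: a transitive verb naming the subtask
            direct_object: str  # to whom/what: the domain object acted upon

            def as_sentence(self) -> str:
                return f"{self.agent} {self.task_verb} {self.direct_object}."

        # Passive-voice statements such as "measurements are interpreted"
        # cannot be recorded here: every Capability must name its agent.
        capabilities = [
            Capability("the measurement-interpretation subdomain",
                       "interprets", "raw instrument measurements"),
            Capability("the outline-structuring subdomain",
                       "structures", "document outlines"),
        ]

        for c in capabilities:
            print(c.as_sentence())

    Requiring the agent field to be filled in is the data-structure analogue of banning passive voice in capability statements.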

    2.2 Task-Based Modeling

    [Chandra92] describes a knowledge modeling method in which a generic task (e.g., diagnosis, design, control) is decomposed into subtasks. The subtasks identify what needs to be done, but not how. Rather, each subtask may be implemented by one of a set of alternative methods. These methods may in turn be decomposed into sets of subtasks, which in turn can be implemented by one of a set of alternative methods. The task-structured analysis method is intended to be employed at the time of domain analysis (although the authors do not use that term), leaving the system architect to choose an optimal method for each subtask at the time of design of a specific system. The authors envision the benefits of task-structured analysis to include the delineation of a generic task hierarchy that could become widely reused across domains [Chandra92, p. 135]. We have found that such general descriptions of a task like diagnosis or design are trivial to construct, compared with understanding and designing the application-specific aspects of the task ontology or its design equivalent. Rather, we've found that domain experts are most productive with domain analysts if the task analysis is scoped to match their intended range of uses from the beginning of the analysis. For that reason, a practical task-structured analysis will not have as its "root" a general class of task, like diagnosis or design, but rather some narrowed class of task, such as on-line distributed system diagnosis or mixed digital-analog circuit design. There is still plenty of variation to be identified and analyzed, even within this narrower scope, and the domain experts are far better equipped to hold forth on just what the commonality and range of variation are within the narrower task scope.
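
    The task-method decomposition can be read as an AND/OR tree: a task is accomplished by any one of several alternative methods, and a method requires all of its subtasks. The following minimal sketch renders that structure in Python under our narrowed scoping; the class names and example tasks are our own invention, not drawn from [Chandra92].

        from __future__ import annotations
        from dataclasses import dataclass, field

        @dataclass
        class Task:
            """A (sub)task: states what must be done, not how."""
            name: str
            methods: list[Method] = field(default_factory=list)  # alternatives (OR)

        @dataclass
        class Method:
            """One way of accomplishing a task; requires all its subtasks (AND)."""
            name: str
            subtasks: list[Task] = field(default_factory=list)

        # The root is deliberately narrowed, per the position: not generic
        # "diagnosis", but on-line distributed system diagnosis.
        root = Task("on-line distributed system diagnosis", methods=[
            Method("model-based diagnosis", subtasks=[
                Task("predict expected behavior"),
                Task("localize fault to a node"),
            ]),
            Method("heuristic classification", subtasks=[
                Task("match symptoms to known failure classes"),
            ]),
        ])

        def show(task: Task, depth: int = 0) -> None:
            """Print the task-method tree. The system architect later commits
            to one method per subtask when designing a specific system."""
            print("  " * depth + "task: " + task.name)
            for method in task.methods:
                print("  " * (depth + 1) + "method: " + method.name)
                for subtask in method.subtasks:
                    show(subtask, depth + 2)

        show(root)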

    2.3 Application Domain Modeling

    For ease of analysis, we focus the application-domain modeling on those concepts and objects over which the task operates. For example, in on-line distributed system diagnosis, there is a complete domain ontology to be constructed about distributed systems. Just what is the scope over which such on-line diagnostic applications might operate? In the test and measurement field, this is usually referred to as the "System Under Test" and is worthy of a very careful modeling effort that covers the range of devices and connections, plus the behavior of the devices, subsystems, and the system as a whole. In theory, the separation of task model from application domain model would enable us to reuse the domain ontology for diagnosis tasks, design tasks, or perhaps some other task. However, we believe that the domain model needs to describe a task-specific view of the domain. That is, how does the distributed system appear to the on-line diagnosis task? For that targeted task, what can the application "observe" about the distributed system and what is hidden from view or irrelevant? Even on the surface, it seems clear that this view would potentially be quite different from the view required in order to design such a distributed system.
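
    One lightweight way to express a task-specific view of the domain is as a narrow interface that exposes only what the task can observe. The sketch below is illustrative (the names DiagnosableSystem and DesignableSystem are hypothetical): the on-line diagnosis task sees only the run-time observables of the System Under Test, while a design task sees an entirely different surface of the same domain.

        from typing import Protocol

        class DiagnosableSystem(Protocol):
            """The on-line diagnosis task's view of the System Under Test.

            Only run-time observables appear here; design-time facts such
            as internal layout or component tolerances are deliberately
            absent, because the diagnosis task can neither observe nor
            use them.
            """
            def nodes(self) -> list[str]: ...
            def links(self) -> list[tuple[str, str]]: ...
            def read_status(self, node: str) -> str: ...  # e.g. "ok", "degraded"

        class DesignableSystem(Protocol):
            """A design task's view of the same application domain: a quite
            different surface, even though the underlying domain is shared."""
            def component_catalog(self) -> list[str]: ...
            def connect(self, node_a: str, node_b: str) -> None: ...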

    2.4 Advantages

    This position paper contends that giving specific and separate attention to the task and application-domain models supports better clarification of the multi-dimensional scope of the domain. Nevertheless, those models should not be constructed independently of each other. The construction of distinct models for the task and the application domain ensures that the analyst and domain experts give due consideration to what systems in the domain do, as well as clearly identifying the objects on which those systems operate. However, generic task modeling and generic application-domain modeling seem to go too far: the cost of model construction then often includes a great deal of analysis that is irrelevant to the range of targeted uses for the models. In addition, attempting a generic task model or application-domain model often stretches domain experts and domain analysts beyond their areas of competence. Rather, by carefully narrowing the scope of the task model and the application-domain model to the range of targeted uses, we can ensure that the domain models will contain just the needed knowledge about the task and application domain, and that the quality of that modeled knowledge will reflect the expertise of the domain experts and analysts. The HP Domain Analysis Method recommends that the analysis produce a Domain-of-Focus Statement, Megadomain Context Model, Physical Environments Context Model, Users Model, Utilizers Model, and Domain Lexicon. These remain important components of the DA. By modeling the task and the application domain separately, we ensure that both are clearly scoped. This supports higher quality in the domain models and in the derived domain engineering work products.
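
    As a closing illustration of "separate but interdependent" (hypothetical names throughout), the sketch below keeps the task model and the application-domain model as distinct structures, while each subtask declares the domain concepts it operates over; a simple consistency check then keeps the two models from silently drifting apart in scope.

        from dataclasses import dataclass

        @dataclass(frozen=True)
        class DomainConcept:
            """An entry in the application-domain model (cf. the Domain Lexicon)."""
            name: str
            definition: str

        @dataclass(frozen=True)
        class Subtask:
            """An entry in the task model, linked to the concepts it operates on."""
            name: str
            operates_on: tuple[DomainConcept, ...]

        node = DomainConcept("node", "an addressable host in the distributed system")
        link = DomainConcept("link", "a communication channel between two nodes")
        domain_model = {node, link}

        task_model = [
            Subtask("probe node liveness", operates_on=(node,)),
            Subtask("trace message path", operates_on=(node, link)),
        ]

        # Consistency check: every concept the task model references must be
        # defined in the domain model, so the two scopes stay aligned.
        for subtask in task_model:
            undefined = set(subtask.operates_on) - domain_model
            assert not undefined, f"{subtask.name}: undefined concepts {undefined}"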

     

    3 Comparison

    The position statement is written in a comparative style. The recommendation presented here encourages separate clarification of the task model and the application domain model, which was not adequately handled by FODA [Kang90] (or JODA [Holibaugh93], for that matter) or by the published version of the HP DA Method [Cornwell96]. The position further argues for a more practical approach to this separation than has been promoted by generic task architecture or task-structured analysis methods [Chandra92].

    Furthermore, we continue to support the need for domain analysis to include other models, such as user models and context and environment models. Contrasting a task-oriented user model [Hoppe92] with a domain-specific task model is a helpful way of clarifying the boundary between what the user does and what the domain does.

     

    References

    [Chandra92] Chandrasekaran, B. et al., "Task-Structure Analysis for Knowledge Modeling" in Communications of the ACM, 35(9), September 1992, pp. 124–137.

    [Cornwell96] Cornwell, Patricia Collins, "HP Domain Analysis: Producing useful models for reusable software" in Hewlett-Packard Journal, 47(4), 1996, pp. 46–55.

    [Holibaugh93] Holibaugh, Robert, "Joint Integrated Avionics Working Group (JIAWG) Object-Oriented Domain Analysis Method (JODA)", Special Report CMU/SEI-92-SR-3, Version 3.1, November 1993.

    [Hoppe92] Hoppe, Ulrich & Franz Schiele, "Towards Task Models for Embedded Information Retrieval" in Proceedings of CHI '92, Association for Computing Machinery, 1992, pp. 173–180.

    [Kang90] Kang, K.C. et al., "Feature-Oriented Domain Analysis (FODA) Feasibility Study", Technical Report CMU/SEI-90-TR-21, Software Engineering Institute, Pittsburgh, PA, November 1990.

    [Simos95] Simos, Mark A., "Organization Domain Modelling (ODM): Formalizing the core domain modelling life cycle" in Proceedings of SSR '95, Seattle, WA, 1995.

    [Stars93] STARS Conceptual Framework for Reuse Processes (CFRP), Technical Report STARS-VC-A018/001/00, Version 3.0, Vol. 1 & 2, Paramax, 1993.

    [Studer98] Studer, Rudi et al., "Knowledge Engineering: Principles and methods" in Data & Knowledge Engineering, 25 (1998), pp. 161–197.

     

    Biography

    Patricia Collins Cornwell has led the research, development, and testing of the HP domain analysis method. She was awarded a BA degree in mathematics and philosophy from Dickinson College and an MSEE degree from Stanford University. Prior to focusing on software reuse at Hewlett-Packard, Patricia conducted research on speech recognition systems, speech synthesis systems, and object-oriented software engineering environments. She also served as research project manager for work in multithreaded operating systems at HP. Since January 1998, Patricia has served as research project manager for knowledge representation and reasoning in the Data Mining Solutions Department in HP Laboratories.