Porter - WISR9 Position Paper

Building Concept Representations from Reusable Components

Bruce Porter

Computer Science Department

University of Texas

Austin, Texas 78712

Porter@cs.texas.edu

Peter Clark

Research Division

Boeing Company

Seattle, WA 98124

Clarkp@redwood.rt.cs.boeing.com

Abstract

Our goal is the construction of knowledge-based systems capable of answering a wide range of questions, including questions unanticipated when the knowledge base was constructed. Our approach to achieving this goal is to develop ways of building knowledge bases from reusable components, and to develop a computational mechanism for "plugging together" representational components. One of our target domains, which we use for illustration here, is bioremediation: the removal of toxic waste using micro-organisms that convert pollutants into harmless by-products.

1 Background

Our goal is the construction of knowledge-based systems capable of answering a wide range of questions, including questions unanticipated when the knowledge base was constructed. Our earlier research on one large-scale project -- the Botany knowledge-base project -- shows that if detailed, declarative representations of concepts are available, then sophisticated question-answering performance can be achieved. However, manually constructing such representations is laborious, and this proved to be a major bottleneck in the project; moreover, it is simply not possible to anticipate all the concept representations that may be needed for answering questions.

2 Approach

Our approach to achieving this goal is twofold. First, we are developing ways of building knowledge bases from reusable components. Each component encodes the objects and relations that describe a generic concept, such as move, produce, or contain. Second, we are developing a computational mechanism for "plugging together" representational components as they are needed to answer questions and perform tasks.

One of our target domains, which we use for illustration here, is bioremediation: the removal of toxic waste using micro-organisms that convert pollutants into harmless by-products. For a system to answer a variety of questions about bioremediation (requiring tasks such as description, prediction, and explanation), it needs comprehensive knowledge about the process. Although bioremediation is rather specialized, its representation can be built from numerous generic concepts, including conversion (in which pollutant is converted into a fertilizer-like compound), treatment (in which microbes are applied to the pollutant), and digestion (in which microbes digest the pollutant). See Figure 1.

Building representations by combining generic components simplifies both knowledge engineering and knowledge-base maintenance. Rather than encode detailed information for each domain concept, knowledge engineers need only specify the components, and their inter-relationships, that comprise each concept. The resulting knowledge base is relatively easy to maintain because encoded knowledge is localized in the components. By changing the conversion component, for example, the knowledge engineer changes the representation of every domain concept that includes that component.
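The maintenance argument can be illustrated with a minimal sketch. It assumes, purely for illustration, that a component is a named list of facts and that a domain concept stores only the names of its components. The identifiers `COMPONENTS`, `CONCEPTS`, and `expand`, the second concept `composting`, and the fact strings are all hypothetical and are not part of the system described here.

```python
# Hypothetical library of generic components: name -> facts it contributes.
COMPONENTS = {
    "conversion": ["source is consumed", "product is produced"],
    "treatment": ["agent is applied to patient"],
    "digestion": ["eater breaks down food"],
}

# Domain concepts record only which components they are built from.
CONCEPTS = {
    "bioremediation": ["conversion", "treatment", "digestion"],
    "composting": ["conversion", "digestion"],  # hypothetical second concept
}


def expand(concept):
    """Assemble a concept's facts from its components at query time."""
    facts = []
    for name in CONCEPTS[concept]:
        facts.extend(COMPONENTS[name])
    return facts


before = expand("bioremediation")

# One edit to the shared conversion component ...
COMPONENTS["conversion"].append("product differs in kind from source")

# ... shows up in every concept that includes it.
after_bioremediation = expand("bioremediation")
after_composting = expand("composting")
```

Because the concepts hold references rather than copies, neither `bioremediation` nor `composting` has to be edited when the conversion component changes.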

Motivated by earlier work by Batory, Goguen, and others in software engineering, we define a component as a triple <P,A,R> where:

P is a set of participants, denoting the objects involved in the pattern being modeled.

A is a set of axioms, describing relationships among the participants.

R is a set of roles with which the participants can be labeled, each role referring to a different participant. Roles are parameter names, analogous to Lisp's keywords for naming parameters.
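The triple above can be sketched as a small data structure. This is one possible encoding under stated assumptions, not a description of an existing implementation: axioms are reduced to hypothetical (relation, participant, participant) tuples, roles are a mapping from role name to participant, and the class name `Component`, the method `instantiate`, and the relation and role names are invented for the example. Python keyword arguments stand in for the Lisp-style keywords mentioned above.

```python
from dataclasses import dataclass


@dataclass
class Component:
    participants: frozenset  # P: objects involved in the pattern
    axioms: tuple            # A: (relation, participant, participant) tuples
    roles: dict              # R: role name -> the participant it labels

    def __post_init__(self):
        labeled = list(self.roles.values())
        # Every role must label a participant of this component.
        if not set(labeled) <= self.participants:
            raise ValueError("a role labels an unknown participant")
        # Each role refers to a different participant.
        if len(labeled) != len(set(labeled)):
            raise ValueError("two roles label the same participant")
        # Axioms may only relate participants of this component.
        for _relation, first, second in self.axioms:
            if not {first, second} <= self.participants:
                raise ValueError("an axiom mentions an unknown participant")

    def instantiate(self, **fillers):
        """Bind roles by keyword; return the axioms with fillers substituted."""
        unknown = set(fillers) - set(self.roles)
        if unknown:
            raise KeyError(f"unknown roles: {sorted(unknown)}")
        binding = {self.roles[role]: filler for role, filler in fillers.items()}
        return [
            (relation, binding.get(first, first), binding.get(second, second))
            for relation, first, second in self.axioms
        ]


# A hypothetical conversion component, filled in for bioremediation.
conversion = Component(
    participants=frozenset({"Agent", "Source", "Product"}),
    axioms=(("acts-on", "Agent", "Source"), ("becomes", "Source", "Product")),
    roles={"agent": "Agent", "source": "Source", "product": "Product"},
)

facts = conversion.instantiate(
    agent="microbes",
    source="pollutant",
    product="fertilizer-like compound",
)
```

Callers refer to participants only through role names, so the internal participant symbols of a component can change without affecting the concepts that use it.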

Finally, we add a well-defined interface to each component, which enables reference to the objects that participate in the component's axioms. During the workshop, we hope to further explore connections between the application of component technologies to software engineering and knowledge-base construction.

Figure 1: The concept bioremediation is a composition of multiple concepts, including specializations of conversion, treatment, and digestion.