Text File | 1992-04-05 | 10KB | 195 lines

Reuse of generic concepts in information modeling

Haim Kilov
Bellcore, MRE-1F 216
435 South Street
Morristown, NJ 07960
haim@bcr.cc.bellcore.com

Position Statement

Two major issues have existed for some time in system development: (1) the information to be processed should be understood, and (2) the complexity of existing systems precludes understanding. As early as 1972, E.W. Dijkstra explicitly acknowledged this and urged programmers to deal only with intellectually manageable problems. Although important lessons have been learned in programming, all too often other levels of system development, in particular planning and analysis, still remain close to black magic. This need not be the case: ideas from programming methodology -- most importantly, abstraction -- can and should be reused at all levels, and not just in coding.

The sequence understand -- specify -- reuse may provide a reasonable framework for the job to be done. Implementation should start only after a clear, precise, and formal contract specification exists. The specification should be sufficiently abstract: irrelevant details such as implementation considerations should be suppressed, and the declarative approach (i.e., formulating pre- and postconditions for an operation), well known in programming methodology, is appropriate (i.e., should be reused) at the analysis level as well. Intellectual economy (reuse rather than reinvention) is possible only if the construct to be reused is understood, i.e., if its specification exists and is precise and explicit. However, this need not mean top-down waterfall development: a higher-level primitive may be built from lower-level ones that already exist.

The most important difference between a typical contract considered from a programming language viewpoint and a contract considered from an information management viewpoint is the existence of explicit inter-class relationships.
Namely, the pre- and postconditions refer not just to properties of their own class, but also to properties of other related classes visible to the client of the class. An operation for which the contract is specified may span several classes. An invariant for a class may include visible properties of objects belonging to other classes, and in this case it is more appropriate to speak of an inter-class invariant. This approach to contracts may be used both at the generic level (example: "create a dependent entity") and at any application domain-specific level (example: "hire an employee").

Certain generic modeling concepts (entities, relationships, dependencies, etc.) have been used by analysts, often inconsistently. These inconsistencies were due to the absence of formal and implementation-independent definitions of the concepts. As a result, concepts have not been properly understood (a typical remark by a subject matter expert: "we don't know whether this is a dependent entity or a subtype") and therefore not properly (re)used. Moreover, structural properties ("data") and behavioral properties ("processes") have been artificially separated, leading to unnecessary complexity.

We have provided precise and formal definitions of generic information modeling concepts, i.e., defined a reusable class library of entity meta-types (aka object classes), based on pre- and postconditions and invariants. Pre- and postconditions define operations that can be applied to instances of these classes, and invariants are operation-independent conditions associated with collections of classes that must be true at all times outside of any operation on those classes. The invariant and the pre- and postconditions usually refer to the properties of more than one object.
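
As a rough illustration of a contract whose pre- and postconditions refer to more than one class, the generic operation "create a dependent entity" might be sketched as follows. The names `Model` and `create_dependent`, the in-memory representation, and the exact choice of conditions are assumptions made for this sketch; they are not taken from the class library itself.

```python
from dataclasses import dataclass, field


@dataclass
class Model:
    """Hypothetical in-memory store: parent ids, and dependent id -> parent id."""
    parents: set = field(default_factory=set)
    dependents: dict = field(default_factory=dict)


def create_dependent(model: Model, dep_id: str, parent_id: str) -> None:
    """Create a dependent entity instance under an existing parent instance."""
    # Preconditions (assumed): they refer to the parent class as well as
    # to the dependent class itself.
    if parent_id not in model.parents:
        raise ValueError(f"precondition failed: parent {parent_id!r} does not exist")
    if dep_id in model.dependents:
        raise ValueError(f"precondition failed: dependent {dep_id!r} already exists")

    model.dependents[dep_id] = parent_id

    # Postconditions (assumed): the new instance exists, is attached to the
    # requested parent, and that parent still exists.
    assert model.dependents[dep_id] == parent_id
    assert parent_id in model.parents


model = Model(parents={"parent-1"})
create_dependent(model, "dependent-1", "parent-1")
```

The caller sees only the conditions, not how the store is represented, which is the sense in which the specification stays declarative.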
Whereas in considering isolated objects it has been possible -- to a certain extent -- to underestimate the importance of precise and formal specifications of behavior, this is no longer possible for inter-object relationships. The reason is simple: the relationships must be intellectually manageable. As they are substantially more complex than isolated objects, they can be understood only by encapsulating their implementations and providing explicit and precise, i.e., formal, specifications of their behavior.

This approach leads to a clear understanding of concepts, to conceptual simplicity, and also, as an important side-effect, to non-proliferation of different and often shallow definitions for commonly encountered terms. These concepts can therefore be easily understood, and hence reused, both by the customers of the information model (including subject matter experts) and by its implementors. For instance, the definition of a dependent entity includes an invariant: "the existence of a dependent entity instance implies the existence of an appropriate instance of its parent entity". Evidently, without an understanding of the information model it cannot be correctly implemented and used; programmers will have to introduce their own understanding because a program has to be precise (and in this manner a programmer will have to become a modeler, usually without the benefit of reusing the class libraries of information model components [only exceptional programmers can do that; however, they work within a certain application domain, and the problem of redundant and inconsistent data across different application areas cannot be solved in this manner]).

Given such concept definitions, information modelers and their customers reuse common concepts independently of methodologies, CASE tools, implementations, etc., both at domain-independent and domain-specific levels.
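
The dependent-entity invariant quoted above is operation-independent, so it can be written as a predicate over a whole state, not tied to any single operation. The following is a minimal sketch under that reading; the function name `dependent_invariant_holds` and the account and transaction identifiers are illustrative assumptions only.

```python
def dependent_invariant_holds(parent_ids, dependent_to_parent):
    """True when every dependent instance refers to an existing parent instance."""
    return all(parent in parent_ids for parent in dependent_to_parent.values())


# A state in which the invariant holds: each transaction has its account.
accounts = {"account-1", "account-2"}
transactions = {"txn-1": "account-1", "txn-2": "account-2"}
consistent = dependent_invariant_holds(accounts, transactions)

# A state in which a parent is gone while its dependent remains violates it.
orphaned = dependent_invariant_holds(accounts - {"account-2"}, transactions)

print(consistent, orphaned)  # prints: True False
```

Any operation on either class would be expected to leave this predicate true on completion, whatever it does internally.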
The generic class library is extensible: a sufficient number of application domains sharing a common concept leads to the inclusion of an appropriate generic concept into this library. Examples of currently existing -- and reusable -- generic concepts are "regular entities", "dependents", "composites", "reference entities", etc. Concepts currently considered for inclusion into the library are exemplified by "derived entity", "version", etc. Generic concept definitions are based only on primitive Create-Read-Update-Delete (CRUD) operations. Naturally, the signatures and the pre- and postconditions of these operations may refer not only to the entity itself, but also to its associated entities (e.g., to create an instance of a dependent entity, references to its parent entity type and instance are needed).

Note that an application domain-specific model consists of interrelated objects that may be considered as subclasses of the generic object classes. In this manner, generic properties of an object belonging to a particular domain-specific class (e.g., "account transaction") need not be reinvented: they are reused from the definition of its generic object (super)class (e.g., "dependent" with respect to "account").

Our experience with information modeling at Bellcore suggests that the reusable component library of generic meta-types leads to drastically improved understanding of information models. The components of these models become clearly defined and therefore reusable. At the same time, the granularity of these components is appropriate: the size of the models does not preclude their understanding, especially taking into account that one "high-level" entity meta-type can be decomposed into a cluster of interrelated "lower-level" entity meta-types. (For instance, a "document", being a subtype of a "composite entity", may belong, together with its associations, to the high-level model.
Components of a document, e.g., pieces of text, tables, pictures, etc., being subtypes of a "component entity", may be of no interest -- and therefore invisible -- to the high-level model, but will belong, together with their associations, to the lower-level model. In this manner, the high-level model is an abstraction ("suppression of irrelevant detail to establish a simplified model") of the lower-level one.) Naturally, model clustering provides a way of browsing through the model, again, both by its users and implementors.

References

E.W. Dijkstra. The humble programmer. Communications of the ACM, Vol. 15 (1972), No. 10, pp. 859-866.

B. Meyer. Object-oriented software construction. Prentice-Hall, 1988.

H. Kilov. Generic information modeling concepts: a reusable component library. In: TOOLS '91 (Proceedings of the Fourth International Conference on Technology of Object-Oriented Languages and Systems, Paris, 1991), pp. 187-201. Prentice-Hall, 1991.

A Reference Model for Object Data Management. Final Revision. (ANSI Accredited Standards Committee X3, Information Processing Systems.) Document Number OODB 89-01R8. August 10, 1991.

E.W. Dijkstra. A discipline of programming. Prentice-Hall, 1976.

Basic Reference Model for Open Distributed Processing -- Part 2: Descriptive Model. Committee Draft ISO/IEC CD 10746-2. ISO/IEC JTC1/SC 21 N 6079. 1991-07-24.

About the Author

Haim Kilov has worked through all stages of the design and development of various software (compilers, preprocessors, post-relational DBMS, etc.) -- from initial conception to actual implementation and release -- and has also been engaged in research, development, and consulting in advanced information modeling. He is currently involved in information modeling as a Member of Technical Staff at Bellcore (Morristown, NJ). His approach to creating a reusable library of generic object classes for this purpose has been widely used in actual modeling activities within Bellcore.
He is also a member of the ANSI X3 Database System Study Group and its subgroups (Object Database Task Group and OSI/Database Task Group); he is one of the Editors of the reports of the Object Database Task Group and one of the active contributors to the Object Data Management Reference Model. He is a member of the Editorial Board of "Computer Standards and Interfaces" and has been a Program Committee member of several domestic and international conferences and workshops on information management. He has a significant number of published papers and reviews in the database and information modeling areas. His current interests and experience are in the areas of information modeling and programming methodology.