Reusability of formal specifications in programming language description

Marjan Mernik, Viljem Zumer

University of Maribor
Faculty of Electrical Engineering and Computer Science
Smetanova 17, 2000 Maribor, Slovenia
Email: marjan.mernik@uni-mb.si
zumer@uni-mb.si

Abstract:

Compiler construction is often mentioned as one of the few truly systematically managed disciplines. There is a long tradition of producing compilers, the underlying theories are well understood, and many application generators exist that automatically produce compilers or interpreters from programming language specifications. In spite of this, the formal methods currently used for programming language description are not modular, extensible, or reusable. Since programming language design is an iterative process, designers need partial descriptions that can easily be extended and modified. This paper describes our approach to reusable semantic specifications. Semantic specification reuse for programming language design is a promising field of study that should contribute to better modularity, clarity, and reusability, and to wider use of formal methods. In addition, experience gained in reusing programming language specifications can also be applied to other software systems.

Keywords: Reuse, formal semantic specification, programming language design

Workshop Goals: Discuss reuse of specifications, learn more about current reuse technology from academia and industry, meet other researchers and practitioners of software reuse

Working Groups: Rigorous Behavioral Specification as an Aid to Reuse, Domain Engineering Tools

Background

We have been using formal methods for programming language description since 1989 in designing new industrial languages, in teaching computer science courses, and in compiler and interpreter generation. More recently we have become interested in the reusability of formal methods for programming language description because, in our opinion, it is a great opportunity for formal methods finally to come into general use.

Position

The advantages of formal methods for programming language description are well known: they describe syntax and semantics in a precise and unambiguous manner, they offer the possibility of automatically generating compilers or interpreters, and they serve as a tool for programming language development and design. Programming languages that have been designed with one of the various formal methods have better syntax and semantics, have fewer exceptions, and are easier to learn. However, despite these obvious advantages, none of the most widely used formal methods for programming language description, such as attribute grammars, axiomatic semantics, operational semantics, and denotational semantics, has gained wide popularity and general use. Some of the reasons are that semantics is much more difficult to describe than syntax and that semantic descriptions are not easy to read. However, we are convinced that the real problem with these formal methods is their lack of modularity, extensibility, and reusability. There is a need for modularizing the semantics. Moreover, when we observe the semantics of different languages, even those based on different paradigms, there are various commonalities between them that become apparent once we get beyond the syntax level. A framework that addresses this issue and facilitates the reuse of such common features is needed. Because none of the above-mentioned formal methods for programming language description supports modularity, extensibility, and reusability, they need to be adapted towards better reusability. At the workshop, our approach to reusability in attribute grammars will be presented.

An attribute grammar is a generalization of a context-free grammar in which each symbol has an associated set of attributes that carry semantic information, and each production has an associated set of semantic rules. We have developed a tool called LISA [1] which automatically produces a compiler or interpreter in C++ from an attribute grammar. LISA can be classified as a kind of application generator [2], where the complete software system design is reused and algorithms and data structures are selected automatically, so that software development can concentrate on what the system should do rather than on how it is done. The knowledge of lexical, syntax, and semantic analysis is embedded in the compiler/interpreter generator, which is reused each time a compiler is generated. However, we went a step further and also reused the semantic specification. We have applied the object-oriented approach to attribute grammars in a unique manner. Attributes representing semantic information can also be objects, which can be inherited and specialized depending on the semantic needs of the language. Some of the typical object attributes that can be reused are the environment, store, configuration, type, and parameter passing mechanism of the language. Another possibility for reuse in LISA that has not yet been exploited is the reuse of semantic rules, which are associated with the production rules of context-free grammars.
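The idea of an object-valued attribute that is specialized per language can be illustrated with a minimal sketch. It is written in Python for brevity and is not LISA's specification notation or its generated C++ code; the names Environment, CaseInsensitiveEnvironment and evaluate, and the tuple encoding of syntax trees, are assumptions made only for this illustration. The environment is passed down the tree as an inherited attribute, the value of an expression is passed up as a synthesized attribute, and the semantic rules are written once against the environment's operations.

```python
# Hypothetical sketch: all names here are illustrative, not taken from LISA.

class Environment:
    """Object-valued attribute: maps identifiers to values."""

    def __init__(self, bindings=None):
        self._bindings = dict(bindings or {})

    def lookup(self, name):
        return self._bindings[name]

    def bind(self, name, value):
        # Return a new environment of the same class; the old one is unchanged.
        extended = dict(self._bindings)
        extended[name] = value
        return type(self)(extended)


class CaseInsensitiveEnvironment(Environment):
    """Specialization for a language whose identifiers ignore case."""

    def __init__(self, bindings=None):
        super().__init__({k.lower(): v for k, v in (bindings or {}).items()})

    def lookup(self, name):
        return super().lookup(name.lower())

    def bind(self, name, value):
        return super().bind(name.lower(), value)


def evaluate(node, env):
    """Semantic rules, one branch per production.

    `env` is the inherited attribute passed down the tree; the return
    value is the synthesized attribute passed up.
    """
    kind = node[0]
    if kind == "num":      # Expr -> NUMBER
        return node[1]
    if kind == "var":      # Expr -> IDENT
        return env.lookup(node[1])
    if kind == "add":      # Expr -> Expr '+' Expr
        return evaluate(node[1], env) + evaluate(node[2], env)
    if kind == "let":      # Expr -> 'let' IDENT '=' Expr 'in' Expr
        _, name, bound, body = node
        return evaluate(body, env.bind(name, evaluate(bound, env)))
    raise ValueError(f"unknown production: {kind}")


# let X = 2 in x + 40, evaluated with the specialized environment
tree = ("let", "X", ("num", 2), ("add", ("var", "x"), ("num", 40)))
result = evaluate(tree, CaseInsensitiveEnvironment())
```

Here the same evaluate rules serve both languages; only the environment class differs. With the base Environment the lookup of x fails, because X and x are distinct identifiers there.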

Our position is therefore that formal methods for programming language description are not generally used because of their lack of reusability. Fortunately, techniques and methods exist that can make current formal methods reusable. This can lead to a general use of formal methods, and to more languages and compilers being developed from formal definitions rather than from informal descriptions. The project started by the developers of denotational semantics in the early 1970s to provide formal semantic descriptions of all major programming languages is still not finished. We believe that reusable semantic specification is a step in the right direction and that it will allow the project to be finished at last. Moreover, experience gained in the reusability of programming language specifications can also be applied to other software systems, since current reuse approaches provide very limited support for specification reuse.

Comparison

Recently, many researchers have been working on reusable semantic specifications from which compilers or interpreters can be automatically produced. Some work has also been done on reusing the components of a compiler for one language to develop a compiler for another language [3], but that work achieves reuse only at the source-code level. We agree with [4] that the reuse of specifications is an area with large potential benefits that are also realizable. Since most of the widely used formal methods for programming language description are hard to modularize, there are two possibilities: either to adapt an existing formal method to attain better modularity, extensibility, and reusability, or to design a new formal method that is modular, extensible, and reusable. It seems that many researchers, including us, have chosen the first solution. An interesting approach to modularity and reusability of denotational semantics is described in [5], where the various semantic domains are properly encapsulated in semantic algebras. The semantic functions and equations can therefore be described without concern for the internal structure of the different semantic domains. With this approach, the specification of one language can be inherited and specialized in order to define the semantics of another language, which is useful in designing new languages [6]. In [7] another approach to reusability of semantic specifications is presented. The authors use formal specification languages such as VDM and Z and apply the object-oriented approach to the Z language. With the Object-Z language, the abstract syntax and the static and dynamic semantics of an individual language construct are typically defined in one class, so that the semantic representation is structured. Not only does this help the readability of the semantics, but if the language is enhanced, the corresponding semantic modifications can be made with minimal disruption of the existing semantics. Furthermore, with this approach it is also possible to reuse parts of the semantic specification of one programming language to define another. A modular framework for attribute grammars is also presented in [8], where, with remote attribute access and inheritance, an attribution module is defined and can be reused in a variety of applications.
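The encapsulation of a semantic domain behind a semantic algebra, as discussed above for denotational semantics, can be sketched as follows. This is only an illustration in Python under stated assumptions, not the notation or implementation of the cited work: the names Store, TracingStore, expr and command, the rule that unset variables read as 0, and the tuple encoding of commands are all hypothetical. The semantic functions use only the operations access and update, so a specialized store can be substituted without changing them.

```python
# Hypothetical sketch: names are illustrative, not taken from the cited work.

class Store:
    """Semantic algebra for the store domain; equations use only its operations."""

    def __init__(self):
        self._cells = {}  # internal representation, hidden from the equations

    def access(self, name):
        return self._cells.get(name, 0)  # assumption: unset variables read as 0

    def update(self, name, value):
        new = type(self)()
        new._cells = {**self._cells, name: value}
        return new


class TracingStore(Store):
    """Specialization for another language: also records which names were assigned."""

    def __init__(self):
        super().__init__()
        self.history = ()

    def update(self, name, value):
        new = super().update(name, value)
        new.history = self.history + (name,)
        return new


def expr(e, store):
    """E[[e]]: Store -> Value, for e = int | identifier | ("+", e1, e2)."""
    if isinstance(e, int):
        return e
    if isinstance(e, str):
        return store.access(e)
    _, left, right = e
    return expr(left, store) + expr(right, store)


def command(c, store):
    """C[[c]]: Store -> Store, for c = ("assign", name, e) | ("seq", c1, c2)."""
    if c[0] == "assign":
        return store.update(c[1], expr(c[2], store))
    if c[0] == "seq":
        return command(c[2], command(c[1], store))
    raise ValueError(f"unknown command: {c[0]}")


# x := 5; y := x + 1
program = ("seq", ("assign", "x", 5), ("assign", "y", ("+", "x", 1)))
final = command(program, TracingStore())
```

The functions expr and command never touch the internal representation of the store, which is what allows TracingStore to be used in place of Store without modifying them.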

On the other hand, action semantics [9] was invented as a direct response to the pragmatic disadvantages of denotational semantics, such as its lack of modularity, extensibility, and reusability. An action semantic description is extremely modular, providing a high degree of extensibility and reusability. However, action semantics is still evolving. The theory for reasoning about actions is still rather weak and needs further development.

References

[1] M. Mernik, N. Korbar, and V. Zumer, "LISA: A Tool for Automatic Language Implementation," ACM SIGPLAN Notices, vol. 30, no. 4, pp. 71-79, 1995.

[2] C. W. Krueger, "Software Reuse," ACM Computing Surveys, vol. 24, no. 2, pp. 131-183, 1992.

[3] M. Ancona, G. Dodero, and G. Clematis, "Reusing a Compiler," in Proceedings of the ACM Symposium on Applied Computing, pp. 82-87, 1994.

[4] J. C. Knight and D. M. Kienzle, "Reuse of Specifications," in WISR-5, pp. 767-775, 1992.

[5] V. Vaidyanathan, "Modular Semantic Specifications to Interpreter Implementation in C++," in Proceedings of the 34th Annual ACM SouthEast Conference, pp. 236-241, 1996.

[6] V. Vaidyanathan and B. R. Bryant, "Formal Semantics Reuse for Different Programming Languages," Tech. Rep. CIS-TR-96-010, Department of Computer and Information Sciences, University of Alabama at Birmingham, 1996.

[7] J. S. Dong, R. Duke, and G. Rose, "An Object-Oriented Approach to the Semantics of Programming Languages," in Proceedings of the 17th Australian Computer Science Conference, pp. 767-775, 1994.

[8] U. Kastens and W. M. Waite, "Modularity and Reusability in Attribute Grammars," Acta Informatica, vol. 31, pp. 601-627, 1994.

[9] P. D. Mosses, Action Semantics. Cambridge Tracts in Theoretical Computer Science, Cambridge University Press, 1992.

Biography

Marjan Mernik is a teaching assistant at the Faculty of Electrical Engineering and Computer Science, University of Maribor. He received an M.Sc. in Computer Science from the University of Maribor and is currently a Ph.D. candidate at the same university. His research interests are in formal methods for programming language description and their reusability, and in the principles and implementation of programming languages.

Viljem Zumer is a professor of Computer Science at the Faculty of Electrical Engineering and Computer Science, University of Maribor. He is a founder of the Computer Science Department at the University of Maribor and a leader of many projects. His research interests are in computer architecture and programming languages.