-
- THE CALC PLUS CLASS LIBRARY
-
- by Vladimir Schipunov,
-
- Copyright (c) 1996.
-
- Version 1.0, April 2, 1996
-
-
-
- CONTENTS
-
- 1. What is CalcPlus library.
- 2. Distribution and warranty notice.
- 3. Installation and running examples.
- 4. How it works.
- 4.1 Lexical analyzer.
- 4.2 Class CType hierarchy.
- 4.3 Class Expression hierarchy.
- 4.4 Language and YACC rules.
- 4.5 Interface with C++.
- 4.6 Modifying source code.
- 5. Known bugs and problems.
- 6. Appendix. Language description.
-
-
- 1. What is CalcPlus library.
- ----------------------------
-
- CalcPlus is a C++ class library that lets you embed your own programming
- language in a C++ project. Almost any complex C++ application needs to be
- tuned by some external description, e.g. an INI file. CalcPlus generalizes
- this approach: any algorithm or constant needed by the application can be
- moved into a separate text file. When the process reaches a key point, it
- calls a function or procedure stored in that file; the interpreter runs it
- and control returns to the C++ code. The library contains an interpreter
- for a simple, unnamed procedural language. Communication between C++ and
- the interpreted code works in both directions.
-
- If you develop a C++ project and want to give users access to the
- application's algorithms, constants or schemes, then you can probably make
- use of the CalcPlus class library.
-
- The version of the language that comes with the library supports functions,
- procedures, blocks, a preprocessor, global and local variables and constants,
- and if/for/while statements. Each variable can hold a value of type nil,
- bool, long, float, string or date. Type definitions and arrays are allowed.
- Functions and procedures may be recursive. New functions written in C++ can
- easily be added to the language. The syntax of the language can be modified
- by changing the YACC rules. The interpreter is fast enough to be useful for
- many tasks.
-
- The interpreter was successfully used in the server application of
- Btrieve-based financial software. Parts of the C++ code that changed often
- at customers' requests were moved into a separate file, which was
- interpreted by CalcPlus.
-
- Another application of the library was emulation of the Clipper machine
- (except "code blocks"); this made it possible to debug C extensions written
- for Clipper in a normal C++ environment. (In fact, the interpreter has a lot
- in common with Clipper and runs at the same speed on a 16-bit platform; on
- a 32-bit platform it is faster.)
-
- The interpreter was written in 1995 in approximately 3 months. Parts of
- older C++ and YACC code were used. This is the first freeware release.
- The library aims to be compiler and OS independent: you should be able to
- compile it on any OS with any C++ compiler, provided YACC is available.
- Templates, exception handling and RTTI were not used, for compatibility
- with older compilers.
-
-
- 2. Distribution and warranty notice.
- ------------------------------------
-
- The CalcPlus Class Library is freeware, you may use, modify and redistribute
- it under the condition that copyright notice is not removed from the source
- code.
-
- NO WARRANTY OF ANY KIND, YOU USE THIS SOFTWARE ON YOUR OWN RISK.
-
- Author:
-
- Vladimir Schipunov, 25 years,
-
- email: vschipun@cammail1.attmail.com
- phone: 1-908-2716881
-
- Any comments, suggestions, extensions and bug reports are very welcome.
-
-
- 3. Installation and running examples.
- -------------------------------------
-
- To install the class library, unzip the file calcplus.zip.
- On UNIX platforms you will probably need the options -a -L for unzip:
- -L converts file names to lower case and -a converts DOS text with
- Carriage-Return-Line-Feed line endings to UNIX text with Line-Feed.
-
- DOS: pkunzip calcplus.zip
- UNIX: unzip -a -L calcplus.zip
-
- This version of archive contains files:
-
- calcexpr.h |
- calclex.h |
- calctype.h |
- yycalc.h |
- calcexpr.cpp | C++ source code
- calclex.cpp |
- calctype.cpp |
- calcplus.cpp |
- calclib.cpp |
- yycalc.yac YACC source code
- calc.mak makefile for DOS
- gcalc.mak makefile for UNIX
- readme short description of the archive
- calcplus.txt this file
- hello Hello, world!
- example selfcheck and example of program
- prime example of program
- pi example of program
-
-
- To build the interpreter you will need YACC and a C++ compiler.
-
- DOS: Command line options for some widely used C++ compilers for DOS are
- written in the file calc.mak. Uncomment or change CC and LINK to your
- C++ compiler and specify your version of YACC if necessary.
- Run make utility:
-
- make -f calc.mak
-
- UNIX: gcalc.mak is the makefile for GNU C++, simply run:
-
- make -f gcalc.mak
-
- The library was carefully tested with many versions of popular compilers:
- Borland, Watcom, Microsoft, Zortech, GNU. Generally, you should not have
- problems building the interpreter. If you do, please contact me.
-
- The first thing you should do after building the interpreter is to check it:
-
- a) calc example
-
- This is the primary check that the interpreter works correctly.
- File 'example' contains most of the syntax constructions the interpreter
- understands. If this version works correctly, the output should be:
-
- {0,1,4,9}
- {0,1,4,9,{1,2,3}}
- {0,1,9,{1,2,3}}
- 1 7 255
- {a,b,c}
- {a,{d,e,f},c}
- {a,{d,TRUE,f},c}
- 3 ab 1
- exiting...
-
- b) calc pi [Number of iterations]
-
- If the interpreter works correctly, you should see
- an approximation of the number pi.
-
- c) calc prime [Upper limit]
-
- This test calculates prime numbers below
- the upper limit, 1000 by default.
-
- Command line switch /d can be added to trace the program, e.g.:
-
- calc prime 100 /d
-
- Note that the interpreter uses recursive algorithms and requires enough
- space on the stack. If you get the runtime error 'stack overflow',
- increase the stack size.
-
- If the interpreter works correctly, you can begin adapting it to your own
- tasks. Here is a general description of the library.
-
-
- 4. How it works.
- ----------------
-
- First of all, the interpreter is designed around YACC (Yet Another
- Compiler-Compiler). A hand-written lexical analyzer takes input from the
- file(s); the yyparse() function processes the input and builds the program
- as a tree of instructions. Each node of the tree has its own value, and
- execution goes from the child nodes to the root.
-
- So, CalcPlus consists of:
-
- lexical analyzer,
- yacc parser,
- basic types hierarchy,
- hierarchy of language instructions.
-
-
- 4.1. Lexical analyzer.
- ----------------------
-
- In general, the function YLex::yylex() is a traditional translator of the
- input stream into tokens for the yacc parser. Class YLex takes care of token
- analysis and performs simple preprocessing. However, it has some specific
- features, listed below.
-
- Tokens are divided into two groups: the first contains keywords, signs of
- arithmetic operations, etc.; the second contains tokens returned by
- descendants of the YLex class. The overridable method YLex::__name() may
- find the word in the lists of already defined symbols: functions,
- procedures, structures, variables or constants. In the first case
- a token lx*** is used, in the second yy***.
-
- Several simple container classes provide storage and linear search
- of objects and of references to them. There is also a stack container;
- objects stored in the stack are destroyed when they are popped, so often
- only references to the objects are pushed onto the stack.
-
- Preprocessing is performed by pushing onto a stack of input streams.
- Only the 'include', 'define', 'ifdef' and 'endif' directives are supported.
- An input stream is an 'ifstream' for file input and an 'istrstream' for
- the preprocessor.
-
- The symbol '->' is translated as implication. When the interpreter
- finds the statement 'A -> B', it treats it as 'if A then B end'.
-
- Strings delimited by either '...' or "..." are allowed.
-
- Comments are of C++ style:
-
- // This is comment
- /* Another comment */
-
- Case of letters is insignificant.
-
- In all other respects the lexical analyzer acts like any other yylex function.
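-
- The conventions listed in this section can be illustrated by a small
- stand-alone tokenizer. This is only a sketch and not the code of YLex:
- the names TokKind, Tok and tokenize are hypothetical, and the real
- analyzer also handles preprocessing and symbol lookup.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical token kinds; the real library uses lx*** / yy*** tokens.
enum TokKind { tkName, tkNumber, tkString, tkImplies, tkSymbol };

struct Tok {
    TokKind kind;
    std::string text;
};

// Splits source text into tokens: C++ style comments are skipped,
// '...' and "..." are strings, '->' is one token, and names are folded
// to upper case because the case of letters is insignificant.
std::vector<Tok> tokenize(const std::string& src)
{
    std::vector<Tok> out;
    std::size_t i = 0, n = src.size();
    while (i < n) {
        unsigned char c = src[i];
        if (std::isspace(c)) { ++i; continue; }
        if (c == '/' && i + 1 < n && src[i + 1] == '/') {   // line comment
            while (i < n && src[i] != '\n') ++i;
            continue;
        }
        if (c == '/' && i + 1 < n && src[i + 1] == '*') {   // block comment
            std::size_t end = src.find("*/", i + 2);
            i = (end == std::string::npos) ? n : end + 2;
            continue;
        }
        if (c == '\'' || c == '"') {                        // string literal
            std::size_t end = src.find(static_cast<char>(c), i + 1);
            if (end == std::string::npos) end = n;          // unterminated
            out.push_back({tkString, src.substr(i + 1, end - i - 1)});
            i = (end < n) ? end + 1 : n;
            continue;
        }
        if (c == '-' && i + 1 < n && src[i + 1] == '>') {   // implication
            out.push_back({tkImplies, "->"});
            i += 2;
            continue;
        }
        if (std::isalpha(c) || c == '_') {                  // name or keyword
            std::string name;
            while (i < n && (std::isalnum((unsigned char)src[i]) || src[i] == '_'))
                name += (char)std::toupper((unsigned char)src[i++]);
            out.push_back({tkName, name});
            continue;
        }
        if (std::isdigit(c)) {                              // integer constant
            std::string num;
            while (i < n && std::isdigit((unsigned char)src[i])) num += src[i++];
            out.push_back({tkNumber, num});
            continue;
        }
        out.push_back({tkSymbol, std::string(1, (char)c)}); // anything else
        ++i;
    }
    return out;
}
```

- Folding names to upper case in the tokenizer is one simple way to make
- the language case-insensitive; the library may do this differently.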
-
-
- 4.2. Class CType hierarchy.
- ---------------------------
-
- Class CType is the base class for all classes corresponding to the CalcPlus
- types: CNil, CBool, CLong, CDouble, CString, CDate and CArray.
- These types may be used when writing a program for the interpreter. When
- yylex() finds an immediate value in the program, it allocates an object of
- the appropriate type. For instance, the following constants correspond to
- these types:
-
- 2 CLong
- 1.2 CDouble
- true CBool
- 'aaa' CString
- {1,2,3} CArray
-
- New types may be added simply by adding a new class inherited from the
- abstract base class CType. Values of new types may be returned by C++
- functions, or yyparse() may be changed to translate input tokens into
- instances of the new class. Class CType has a good number of pure virtual
- methods; some of them can actually be given dummy implementations.
-
- First of all, the class should provide its identification. A unique numeric
- identifier is taken from the enumeration in file calctype.h. This identifier
- should be passed to the constructor of CType and returned by method type().
- Method name() is the symbolic identification of the class.
-
- Method copy() returns a pointer to a copy of the object allocated
- by operator new. Usually this is a call of the copy constructor:
-
- CType* copy() const { return new CNewClass( *this ); }
-
- Input and output methods should be overridden as well. They are:
-
- void print( ostream& ) const;
- void get( istream& );
-
- Other important methods are comparison and assignment:
-
- CType& operator =( const CType& t );
- compare( const CType& ) const;
-
- To provide a standard implementation of these operations for simple
- objects represented by a sequence of bytes, methods data(), size() and
- ptr() are used. These methods give access to the binary data of an object.
- The only difference between data() and ptr() is that data() is a const
- method; it should be used in most cases in preference to ptr().
-
- const void *data() const;
- void *ptr();
- size() const;
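-
- A minimal sketch of how a byte-wise default comparison and assignment
- can be built on data(), ptr() and size(). The classes Value and
- LongValue are hypothetical stand-ins, not the CType and CLong of the
- library, whose actual implementation is in calctype.cpp.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for the base class; only the members
// discussed above are modelled.
class Value {
public:
    virtual ~Value() {}
    virtual const void* data() const = 0;   // read-only view of the bytes
    virtual void* ptr() = 0;                // writable view of the bytes
    virtual std::size_t size() const = 0;   // number of bytes

    // Default comparison for plain byte-sequence objects: returns 0 when
    // both objects hold identical bytes. A non-zero result reflects
    // byte order, which is not necessarily numeric order.
    virtual int compare(const Value& t) const
    {
        if (size() != t.size()) return size() < t.size() ? -1 : 1;
        return std::memcmp(data(), t.data(), size());
    }

    // Default assignment: copy the raw bytes when the sizes agree.
    virtual Value& operator=(const Value& t)
    {
        if (this != &t && size() == t.size())
            std::memcpy(ptr(), t.data(), size());
        return *this;
    }
};

// A long value is a simple object represented by a sequence of bytes.
class LongValue : public Value {
    long n;
public:
    explicit LongValue(long x = 0) : n(x) {}
    const void* data() const { return &n; }
    void* ptr() { return &n; }
    std::size_t size() const { return sizeof n; }
    long get() const { return n; }
};
```

- A type such as an array, which owns other objects, cannot use these
- defaults and has to provide its own comparison and assignment.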
-
- Class CArray is the only non-primitive data type. It implements the
- data() and ptr() methods to return a null pointer and overrides the methods
- of comparison and assignment. Arrays may be indexed by strings. Field
- CArray::structure points to an array of CStrings; operator[](const char*)
- looks for the given string in that index array and returns a reference
- to the array element at the same position as the matching element of
- CArray::structure. String indexes are used with structure types:
-
- struct abc {a,b,c};
- abc a;
- echo a.b;
-
- This will be translated as a CArray indexed by the CArray {'a','b','c'}
- and echo a['b']. Such definitions may be useful, though they are not
- strict. No optimization is made for the speed of the index search.
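-
- The string indexing described above can be sketched as follows. The
- class NamedArray is hypothetical and stores only long values; it shows
- the parallel list of names and the linear search, not the real CArray.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical model of an array whose elements can be addressed by name.
// 'structure' plays the role of CArray::structure: a parallel list of names.
struct NamedArray {
    std::vector<std::string> structure;  // field names, e.g. a, b, c
    std::vector<long> items;             // element values (long only, for brevity)

    explicit NamedArray(const std::vector<std::string>& names)
        : structure(names), items(names.size(), 0) {}

    // Linear search of the name list, then access by the matching position.
    long& operator[](const std::string& name)
    {
        for (std::size_t i = 0; i < structure.size(); ++i)
            if (structure[i] == name) return items[i];
        throw std::out_of_range("unknown field name: " + name);
    }
};
```

- The sketch reports an unknown name with an exception for brevity; the
- library itself avoids exceptions and signals errors through flags.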
-
- Classes of the CType hierarchy are relatively simple. For details of the
- implementation refer to the source code in file calctype.cpp.
-
-
- 4.3. Class Expression hierarchy.
- --------------------------------
-
- Classes of the Expression hierarchy are the main part of the interpreter.
- After parsing, all lexical input is converted into a tree whose nodes are
- instances of classes derived from Expression.
-
- Each node, as it is the instance of Expression, has:
-
- 1) field 'flags' which shows the current state of the node
- 2) field 'v' which is the pointer to the value of the node
- 3) method 'Calc' which is called every time when node is being
- calculated
-
- Let us see what actually happens during interpretation. The recursive
- method Expression::Calculate runs first on the child nodes; if the execution
- of the child nodes was not interrupted by setting non-zero flags, then method
- Calc() of the node itself is called. Calc() refers to the values of its child
- nodes. For instance, if we have derived class Addition from the abstract
- class Expression, then method Addition::Calc() could look like:
-
- void Addition::Calc()
- {
- *v = *child[0]->v + *child[1]->v;
- }
-
- More precisely, we should check the types of the arguments, because this is
- an interpreter and there is no type checking at compile time.
- So the method should rather be:
-
- void Addition::Calc()
- {
- if( child[0]->type()!=idLong ||
- child[1]->type()!=idLong )
- {
- flags = exError;
- return;
- }
- delete v;
- v = new CLong;
- (CLong&)(*v) = (CLong&)(*child[0]->v) + (CLong&)(*child[1]->v);
- }
-
- The process is controlled by the state of the bit flags of the nodes. There
- are normal flags, like the indication of a function return or of an exit
- from a while/for statement, and flags showing that a runtime error occurred.
- Flags are copied from children to parents. Analysis of the state of the
- flags can show the location of the node where the error occurred.
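-
- A minimal sketch of the children-first evaluation with flag propagation.
- The classes Node, Const, Add2 and Div2 and the flag value exOk are
- hypothetical, and the value is a plain long instead of a CType pointer;
- the real code is in calcexpr.cpp.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical flag values; the real library defines its own set.
enum { exOk = 0, exError = 1 };

// Simplified node: the value is a plain long rather than a CType pointer.
struct Node {
    int flags;
    long v;
    std::vector<Node*> child;

    Node() : flags(exOk), v(0) {}
    virtual ~Node()
    {
        for (std::size_t i = 0; i < child.size(); ++i) delete child[i];
    }

    Node* Add(Node* c) { child.push_back(c); return this; }

    // Children first; a non-zero flag from a child stops the walk and is
    // copied to the parent, so Calc() of this node is skipped.
    void Calculate()
    {
        flags = exOk;
        for (std::size_t i = 0; i < child.size(); ++i) {
            child[i]->Calculate();
            if (child[i]->flags) { flags = child[i]->flags; return; }
        }
        Calc();
    }

    virtual void Calc() = 0;
};

struct Const : Node {
    explicit Const(long x) { v = x; }
    void Calc() {}                        // the value never changes
};

struct Add2 : Node {
    void Calc() { v = child[0]->v + child[1]->v; }
};

struct Div2 : Node {
    void Calc()
    {
        if (child[1]->v == 0) { flags = exError; return; }  // runtime error
        v = child[0]->v / child[1]->v;
    }
};
```

- With such a tree, a division by zero deep inside an expression sets
- the error flag there, and every ancestor up to the root inherits it.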
-
- Every statement written in input language is translated to the instance
- of corresponding class of Expression hierarchy. This is the picture of
- the hierarchy.
-
-
- Expression base class
- |
- |-------XImmediate immediate value
- | |
- | +-------XEndl line-feed constant
- |
- |-------XBreak exit from for/while
- |
- |-------XAr1 unary arithmetic
- | |
- | +-------XBool1 unary boolean
- |
- |-------XAr2 binary arithmetic
- | |
- | +-------XComparison comparison
- | |
- | +-------XBool2 binary boolean
- |
- |-------XVariable variable
- |
- |-------XEcho output on cout
- |
- |-------XConditional if ... then ... else ... end
- |
- |-------XLoop like 'continue' in C
- |
- |-------XWhile while/for
- |
- |-------XBlock begin ... end
- | |
- | +-------XFunction func/proc ... end
- | |
- | +-------XUserFunction external C++ functions
- |
- |-------XCall function call: f(1,2,3)
- | |
- | +-------XDynamic function by name: &('f')(1,2,3)
- |
- |-------XReturn return expr
- |
- +-------XSet arrays: {1,2,{1,2,3}}
-
-
- Though C++ is a much easier language than English, and a full description
- of the algorithms and methods used can be found in file calcexpr.cpp, the
- next paragraphs discuss some non-obvious points of the architecture of
- the Expression hierarchy.
-
- Class Var is used to store the values of variables. There may be many
- references (nodes of the tree) to a variable in different expressions;
- each reference is an instance of class XVariable. Class XVariable has fields:
-
- PrintObj* obj;
- CType** ptr;
- int ref;
-
- Field obj is used for debugging output, ptr for setting a new value of
- the variable. Field ref is used as a flag for passing an argument to a
- function by reference. This field is set temporarily by class XCall while
- the function runs.
-
- Method XVariable::Calc() acts differently depending on the number of child
- nodes. If the number of children is zero, this is a use of the variable
- inside another expression, e.g. (x+y). If the XVariable node has exactly
- one child node, this is an assignment: x := expr. When the number of child
- nodes is two, this is an array element inside an expression: (x[i]+y[j]).
- The only case left is three child nodes, which is assignment to an array
- element: x[i] := expr. Here child[0] is considered to be the array, child[1]
- the index in the array, and child[2] the expression being assigned.
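-
- The four cases can be sketched as a dispatch on the number of children.
- The types Var and VarRef here are hypothetical simplifications: child
- nodes are given as already computed long values, a scalar is kept in
- cells[0], and child[0] of the array cases is ignored because the array
- is the variable itself.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical storage for one variable: a scalar uses cells[0],
// an array uses all cells.
struct Var {
    std::vector<long> cells;
    Var() : cells(1, 0) {}
};

// Hypothetical reference node. The children are passed in as values
// that have already been calculated, to keep the sketch short.
struct VarRef {
    Var* var;
    long v;   // value of the node after Calc()

    explicit VarRef(Var* p) : var(p), v(0) {}

    void Calc(const std::vector<long>& child)
    {
        switch (child.size()) {
        case 0:  v = var->cells[0];                      break;  // x
        case 1:  v = var->cells[0] = child[0];           break;  // x := expr
        case 2:  v = var->cells.at(child[1]);            break;  // x[i]
        case 3:  v = var->cells.at(child[1]) = child[2]; break;  // x[i] := expr
        default: assert(!"unexpected number of child nodes");
        }
    }
};
```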
-
- Class XBlock is the composite expression consisting of a number of
- subexpressions. In the input language corresponding statement is:
-
- begin
- expr1
- expr2
- ...
- exprn
- end
-
- Every block bounds the visibility of the variables defined inside it.
- That is why class XBlock has the fields vars, funcs and structs.
- In fact, the list of functions is used only in the global context.
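-
- Block-bounded visibility can be sketched with a stack of symbol tables,
- one per open block. The class Scopes is hypothetical, and the rule that
- an inner definition hides an outer one with the same name is an
- assumption of the sketch, not a documented property of XBlock.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical scope stack: one name -> value table per open block.
class Scopes {
    std::vector<std::map<std::string, long> > blocks;
public:
    void Begin() { blocks.push_back(std::map<std::string, long>()); }
    void End()   { assert(!blocks.empty()); blocks.pop_back(); }

    // A new variable always goes into the innermost (current) block.
    void Define(const std::string& name, long value)
    {
        assert(!blocks.empty());
        blocks.back()[name] = value;
    }

    // Search from the innermost block outwards; returns 0 if the name
    // is not visible from the current block.
    const long* Find(const std::string& name) const
    {
        for (std::size_t i = blocks.size(); i-- > 0; ) {
            std::map<std::string, long>::const_iterator it = blocks[i].find(name);
            if (it != blocks[i].end()) return &it->second;
        }
        return 0;
    }
};
```

- Closing a block with End() discards its table, so its variables stop
- being visible exactly at the matching 'end'.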
-
- A function is a block whose arguments are local variables. A function
- has only one child node of class XBlock, which is the body of the function.
-
- Just as XVariable refers to Var, XCall refers to XFunction.
- The arguments of the function are child nodes of XCall. Method
- XCall::TieArgs() is called twice: first to assign the values of the
- arguments to the local variables of XFunction, and then to assign values
- back to the arguments passed by reference.
-
- It is easy to see that keeping temporary results of calculation in the
- nodes of the tree does not by itself allow recursive function calls.
- To solve this problem, method Expression::Recursion is used. There is a
- stack of pointers to the values. On a signal, all subnodes of the function
- put their values onto the stack. This action is synchronized with the
- passing of arguments in method XCall::TieArgs().
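-
- The idea of saving node values around a nested call can be sketched as
- below. The names Node, Save and Restore are hypothetical, and the sketch
- pushes the values themselves, whereas the library keeps a stack of
- pointers to the values.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical node that keeps its temporary result in the field v.
struct Node {
    long v;
    std::vector<Node*> child;
    Node() : v(0) {}
};

// Before a nested call: push the value of the node and of all its subnodes.
void Save(const Node* n, std::vector<long>& stack)
{
    stack.push_back(n->v);
    for (std::size_t i = 0; i < n->child.size(); ++i)
        Save(n->child[i], stack);
}

// After the nested call: pop the values back in exactly the reverse order.
void Restore(Node* n, std::vector<long>& stack)
{
    for (std::size_t i = n->child.size(); i-- > 0; )
        Restore(n->child[i], stack);
    assert(!stack.empty());
    n->v = stack.back();
    stack.pop_back();
}
```

- A recursive call then becomes: Save the body of the function, run the
- inner call (which overwrites the temporaries), Restore the body.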
-
- There is no separate class XFor. Class XWhile provides both types of
- iterations: for and while.
-
- Most of other classes are obvious and intuitively clear.
-
-
- 4.4. Language and YACC rules.
- -----------------------------
-
- Class CalcPlus is derived from class YLex. It overrides method yyparse()
- for YACC parsing of the tokens from the input stream. The language is
- described by a set of YACC rules; generally, every rule simply translates
- its arguments into the appropriate instance of a class of the Expression
- hierarchy. A stack mechanism is used for correct context handling. Each
- recursive syntax construction has a corresponding stack container in class
- CalcPlus:
-
- LexStack Blocks; // blocks
- LexStack Calls; // function calls
- LexStack Cond1; // 'if' part of the condition
- LexStack Cond2; // 'else' part of the condition
- LexStack Sets; // sets
- LexStack Idx; // array indexes
-
- The frequently used definitions XBEG, XEND and XSEQ handle the current
- block context. When a new variable is defined, it is stored in the
- list of variables of the current block. if/else/for/while statements
- have implicit blocks inside them:
-
- if a then
- a:=b;
- end;
-
- This is actually translated as:
-
- if a then
- begin
- // local variables may be defined here
- a:=b;
- end;
- end;
-
- Class CalcPlus overrides method __name and searches for symbols that are
- already defined. This makes syntax analysis easier. Method Link uses a
- recursive tree search to connect XCall nodes with the corresponding
- XFunction. Simple diagnostics are performed.
-
- Method UserSym() is not implemented. It was originally added as a hook for
- extensions. For example, we can change method __name to translate symbols
- beginning with the character '@' as lxUser:
-
- Token CalcPlus::__name()
- {
- Token t = YLex::__name();
- if( t == lxName && *Lex == '@' )
- {
- Expression* e = new XImmediate( new CString( Lex+1 ));
- YYLVAL( e );
- return lxUser;
- }
- ...
- }
-
- When function yyparse() gets such a token, it calls method UserSym().
- There are two possible calls, with one or two arguments: one argument
- if the token is found on the right side of an assignment, two arguments
- if the token is on the left. A possible implementation of UserSym is:
-
- Expression* CalcPlus::UserSym( Expression *e1, Expression *e2 )
- {
- Expression *e = new XEcho;
- if( e1 ) e->Add( e1 );
- if( e2 ) e->Add( e2 );
- return e;
- }
-
- So, the statement "@Hello := ' World!';" will print: Hello World!
-
-
- 4.5. Interface with C++.
- ------------------------
-
- It is possible both to call C++ code from interpreter code
- and to call interpreter functions and procedures from C++.
-
- There are a number of definitions at the end of file calcexpr.h
- that help in writing C++ functions visible from the interpreter.
- Let's look at the implementation of functions EMPTY and GETENV:
-
- USER_FUNC( Empty ) // Is value empty?
- DEF_ARGX( 0, x ) // We don't know the type of argument
- RETURNS( Bool ) // Function returns TRUE or FALSE
- ret = x.empty(); // Getting the result
- USER_END // Done
-
- USER_FUNC( Getenv ) // Reading environment variable
- DEF_ARGV( 0, var, String ) // Expecting string argument
- RETURNS( String ) // Result will be the string also
- const char* s = getenv( var ); // Calling C function
- ret = s ? s : ""; // Check if var is not in env.
- USER_END // Done
-
- Functions and procedures must be registered before running
- the interpreter to make them visible from the program. Function
- UserLib defined in module calclib.cpp performs the registration:
-
- void UserLib()
- {
- RegFunc( "EMPTY", Empty );
- RegFunc( "GETENV", Getenv );
-
- //
- // Other functions
- //
- }
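-
- The registration step can be pictured as a table that maps names to
- C++ function pointers. The sketch below is not the implementation of
- RegFunc: the names UserFn, Registry, Reg, Call and Sum are hypothetical,
- and arguments are plain long values instead of CType objects.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical signature of a function callable from the interpreter.
typedef long (*UserFn)(const std::vector<long>& args);

// Hypothetical registry keyed by upper-cased name, since the language
// ignores the case of letters.
class Registry {
    std::map<std::string, UserFn> table;

    static std::string Upper(std::string s)
    {
        for (std::size_t i = 0; i < s.size(); ++i)
            s[i] = (char)std::toupper((unsigned char)s[i]);
        return s;
    }
public:
    void Reg(const std::string& name, UserFn fn) { table[Upper(name)] = fn; }

    // Returns false if no function with that name was registered.
    bool Call(const std::string& name, const std::vector<long>& args,
              long& ret) const
    {
        std::map<std::string, UserFn>::const_iterator it =
            table.find(Upper(name));
        if (it == table.end()) return false;
        ret = it->second(args);
        return true;
    }
};

// Example user function: the sum of all arguments.
long Sum(const std::vector<long>& args)
{
    long s = 0;
    for (std::size_t i = 0; i < args.size(); ++i) s += args[i];
    return s;
}
```

- Registering before the interpreter runs matters because the parser
- resolves names while it reads the program.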
-
- If the number of arguments exceeds one, that number should be passed as
- the third parameter. For procedures, DEF_PROC and RegProc are used.
-
- Calling an interpreter function from C++ code is illustrated in file
- calcplus.cpp, where the function 'atexit' is called when the program
- finishes. Method CalcPlus::Call() takes as arguments a pointer to the
- function name, the number of arguments, and pointers to the CType arguments:
-
- if( calc.Global->funcs( "atexit" ))
- {
- CString s("exiting... ");
- calc.Call( "atexit", 1, &s );
- }
-
-
- 4.6. Modifying source code.
- ---------------------------
-
- If you are going to use the library in your project, you will most
- likely have to change its source code for your own needs.
- There are different ways of modifying the source code, and which one
- is best depends on how serious the changes you need are.
-
- The simplest way to extend the library is to add new functions visible
- from the interpreter. This can be done by modifying file calclib.cpp.
-
- Another way of easy modification is the change of language syntax,
- see file yycalc.yac.
-
- A more difficult solution may require changes to the CType and Expression
- hierarchies. In this case you should override the necessary methods and
- probably change the YACC rules. In fact, the whole CType hierarchy can be
- replaced by your own hierarchy, if you already have something similar in
- your project.
-
- As an example of changes in the source code, let us consider the steps
- necessary to implement big number arithmetic:
-
- a) We need a class derived from CType which will store, print, and
- calculate very big numbers (hundreds of significant digits).
-
- b) Method Calc() of classes XAr1, XAr2 and XComparison should be changed
- to be able to handle big numbers.
-
- c) YLex::yylex() must detect a big number in the input stream and
- return the corresponding token. This can also be done by adding a
- special conversion function.
-
- d) CalcPlus::yyparse() must generate new XImmediate( new BigNumber )
- when such a token is detected.
-
- After we have done these steps, hopefully, big numbers arithmetic
- will be available from the interpreter's programs.
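-
- A minimal sketch of the value class needed for step a). It is shown
- stand-alone: in the library it would derive from CType and implement
- its virtual methods. The class layout and the method str() are
- assumptions, only non-negative integers and addition are covered, and
- the input string is not validated.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical big number: a non-negative integer kept as decimal digits,
// most significant digit first.
class BigNumber {
    std::string digits;
public:
    explicit BigNumber(const std::string& s = "0") : digits(s) {}

    const std::string& str() const { return digits; }

    // Schoolbook addition, digit by digit from the right with a carry.
    BigNumber operator+(const BigNumber& t) const
    {
        std::string sum;
        std::size_t i = digits.size(), j = t.digits.size();
        int carry = 0;
        while (i > 0 || j > 0 || carry) {
            int d = carry;
            if (i > 0) d += digits[--i] - '0';
            if (j > 0) d += t.digits[--j] - '0';
            sum.insert(sum.begin(), (char)('0' + d % 10));
            carry = d / 10;
        }
        return BigNumber(sum);
    }
};
```

- The arithmetic nodes of step b) would then call such an operator when
- both child values are big numbers.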
-
-
- 5. Known bugs and problems.
- ---------------------------
-
- The biggest known problem is obvious: error diagnostics are too simple.
- A user with little programming experience may therefore have a lot of
- trouble trying to write a program for the interpreter.
-
- The compiler reports line number 0 when EOF is detected inside an unclosed
- block. So line number 0 means the last line of the file.
-
- Complex recursive define directives may not work properly.
- There is no real reason yet to develop a full built-in preprocessor.
-
- Passing arguments to a function by reference is not entirely correct.
- There were problems when operator throw was used in C++ code called
- from inside such a function. However, this problem can easily be avoided
- by adding a flag to the variable which says that the variable has passed
- its value to the function, so operator delete must not be used on the
- pointer to the value.
-
- If you have found more errors, bugs, problems - please, let me know.
-
-
- 6. Appendix. Language description.
- ----------------------------------
-
- This is an informal description of the CalcPlus interpreter's language.
- Most of the syntax looks and works like the same syntax in other languages.
- The language has a lot in common with C and Clipper.
-
- Like C:
-
- Module is the unit of compilation.
- Program can consist of more than one module.
- Start symbol is MAIN if not redefined.
- File can include other files by #include 'filename' directive.
- Global variables may be defined in the module context.
- Semicolon ';' is the separator between statements.
- Sign '!' is the logical NOT.
-
- Not like C:
-
- Case is insignificant.
- Preprocessor has only 'define', 'ifdef', 'endif' directives.
- No logical operations available for preprocessor.
- There are no static variables.
- Assignment is ':='.
- Strings are declared with both (') and (") separator: 'str1', "str2".
- A single '=' sign is used for comparison: if a=b then ... end;
- EXIT and LOOP keywords are the same as 'break' and 'continue' in C.
- OR, AND keywords used instead of ||, &&.
- ARRAY, ADEL, AADD should be used for array access.
- ARGC, ARGV functions provide access to the command line arguments.
- '<>' used instead of '!=' for logical not_equal.
-
- Functions and procedures must be declared before they are called; a
- description of the arguments is not required. Functions and procedures may
- contain a return statement. The default return value is NIL.
-
- func a;
- proc b;
-
- ...
- var x:=a(1,2,3);
- b('abc');
- ...
-
- func a(x,y,z)
- return x+y+z;
- end;
-
- proc b(x)
- echo x,endl;
- end;
-
- Arguments preceded by the sign '*' are passed to the function by reference.
- Arrays are always passed by reference. The output of the program below is
- 1 2 3 3:
-
- func a(x)
- x:=x+1;
- return x;
- end;
- proc main()
- var x:=0;
- echo a(*x),' ',a(*x),' ';
- echo a( x),' ',a( x);
- end;
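-
- For comparison, the same four calls written in C++. The functions
- ByRef, ByVal and Run are hypothetical names used only for this
- illustration: a(*x) corresponds to a reference parameter, a(x) to a
- value parameter.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// 'a(*x)' in the interpreter: the argument is passed by reference.
long ByRef(long& x) { x = x + 1; return x; }

// 'a(x)' in the interpreter: the function works on its own copy.
long ByVal(long x) { x = x + 1; return x; }

// Reproduces the four calls of the program above, one statement per call
// so that the order of evaluation is unambiguous.
std::string Run()
{
    std::ostringstream out;
    long x = 0;
    out << ByRef(x) << ' ';   // x becomes 1
    out << ByRef(x) << ' ';   // x becomes 2
    out << ByVal(x) << ' ';   // returns 3, x stays 2
    out << ByVal(x);          // returns 3 again
    return out.str();
}
```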
-
-
- Blocks are allowed in any place inside of function or procedure:
-
- begin
- expr1;
- expr2;
- ...
- exprn;
- end;
-
- Variables, constants and structures may be declared anywhere in the
- program; they are visible only inside the block where they were defined.
- There is no type checking. Structures are actually arrays indexed by arrays
- of strings.
-
- var count := 1;
- const pi := 3.14;
- struct abc {a,b,c};
- abc test;
-
- Output is performed by the ECHO statement followed by a list of expressions.
- ENDL is the Line-Feed constant.
-
- echo '2*2=',2*2,endl;
-
- All control structures (begin, if, for, while, func, proc) must be
- closed by the keyword 'end'. Variables defined inside such structures
- are local to them. The STEP keyword may be omitted in a FOR statement;
- the step is 1 by default.
-
- if a=b then
- f1();
- else
- f2();
- end;
-
- for i:=1 to 10 step 2 do
- echo i*i,' ';
- end;
-
- while a>b do
- b := b+1;
- end;
-
- All the arithmetic operations are usual: ((2+3)-1)*6/2.
-
-
- <EOF>
- -----
-