- A Technique for Automatically Porting
- Dialects of Pascal to Each Other
-
- QCAD Systems, Inc.
- 1164 Hyde Avenue
- San Jose CA 95125
-
- Michael J. Sorens
-
-
- In the course of programming events, some have found it necessary to
- write Pascal code that could be ported between dialects of Pascal
- which differ from machine to machine. This task cannot be
- accomplished just by writing "vanilla" code to a maximal degree,
- as there are inevitably syntactical and semantic differences from
- one dialect to another. We have developed a technique which,
- in conjunction with vanilla code, allows translation from
- one dialect to another to be performed automatically.
- Since it is generally the case that when a file of program text needs
- to be translated to a new dialect it also needs to be transmitted
- to a physically separate computer, our translation utility exists in
- two forms: a stand-alone translation utility, and a combination
- translation/transmission utility.
-
- FINDING THE DENOMINATOR
-
- The primary task is to reduce functions and procedures to common
- denominators.
- That is, if a function does not exist in one dialect, it must be
- written for that dialect using the dialect's own primitives.
- This could be a simple renaming. For example, consider the two
- functions shown below:
-
- Turbo Pascal: copy(string, location, length)
-
- VAX Pascal: substr(string, location, length)
-
- As it happens, these functions are semantically identical; only the
- names have been changed to confuse the unwary.
- If we have written code in Turbo Pascal, then we simply create a
- copy function for VAX Pascal:
-
- function copy(STR: string255;
- WHERE, LEN: integer): string255;
- begin
- copy := substr(str, where, len);
- end;
-
- Thereafter, we can use the copy function in either Turbo Pascal
- or VAX Pascal.
- Of course, we only want the above definition of copy to exist
- in the VAX environment, as it will generate compilation errors in Turbo;
- we will see how to selectively compile this kind of entity shortly.
-
- A second case might be this: in evaluating a logical expression,
- HP Pascal, for example, has a compiler option to do only partial
- evaluation (a la C, where it is known as short-circuit evaluation),
- in which an expression is evaluated only until the
- point where its result is determinable. C programmers use this
- in establishing and avoiding side effects. (It is quite a useful thing
- to have, but, alas, having it in only one Pascal dialect makes life difficult.)
-
- For example, if the function
- foo changes some global variable y, then (false and foo(x)) will
- not change y if the dialect uses partial evaluation.
- This is due to the way the boolean operator "and" works: both terms
- must be true for the conjunction to be true. Scanning from left to right,
- since the first term is false, we do not need to even look at
- foo(x) to determine the result. Without the partial evaluation feature,
- foo(x) will always be evaluated, potentially yielding differing results.
-
- A second example might be a test such as
-
- while (count <= length(str)) and (str[count] = 'X') do...
-
- which is a valid statement with partial evaluation but can cause
- a runtime error without it. Why? Well, as long as count is less than
- or equal to the length of the string str there is no problem.
- But when count becomes one larger than the length of str
- referencing str[count] may cause a runtime error, depending on
- whether range checking is enabled for a particular compiler.
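-
- The difference can be sketched outside Pascal altogether. The Python
- fragment below is only an illustration under stated assumptions: the names
- full_and, char_at, scan_partial, and scan_full are hypothetical, char_at
- stands in for a range-checked str[count], and full_and mimics a compiler
- that evaluates both operands before combining them.
-
```python
def char_at(s, i):
    """1-based indexing with range checking, standing in for str[count]."""
    if not 1 <= i <= len(s):
        raise IndexError(f"index {i} out of range 1..{len(s)}")
    return s[i - 1]


def full_and(left, right):
    """Both arguments are already evaluated by the time we get here,
    which models a dialect WITHOUT partial evaluation."""
    return left and right


def scan_partial(s):
    """Count leading 'X' characters; Python's 'and' stops at the first
    false operand, like a dialect WITH partial evaluation."""
    count = 1
    while count <= len(s) and char_at(s, count) == 'X':
        count += 1
    return count - 1


def scan_full(s):
    """Same loop, but the second operand is always evaluated."""
    count = 1
    while full_and(count <= len(s), char_at(s, count) == 'X'):
        count += 1
    return count - 1
```
-
- With these definitions, scan_partial("XXX") returns 3, whereas
- scan_full("XXX") raises an IndexError once count reaches 4, which is the
- runtime error described above.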
-
- Therefore, we must modify the code in which results might vary depending on
- whether partial evaluation is available. This could be done by introducing
- a two-step evaluation. That is, rather than "if (false and foo(x)) then..."
- we substitute "if false then if foo(x) then...", which guarantees an
- equivalent partial evaluation as long as the statement has no else clause.
- The while loop requires introducing some
- ugliness. We need a boolean variable, call it done, which we initialize
- to false. Then, the code might be
-
- while (count <= length(str)) and (not done) do begin
- done := (str[count] = 'X');
- if not done then ...
-
- The simple elegance and power of partial evaluation begin to shine through,
- no?
-
- A more cumbersome reduction involves string comparisons.
- In HP Pascal or Turbo Pascal one could compare
- strings s1 and s2 by merely
- interposing a relational operator, e.g., (s1 < s2) or (s1 = s2).
- In VAX Pascal, however, only strings of the same length can be compared
- (not strings of the same maximal length, but strings of the same length
- at the moment of comparison). Thus, we need to introduce an added
- functional level in all three Pascals so that the code can be identical.
- We create the function StrCmp:
-
- function StrCmp(S, T: string80): integer;
- { return -1 if s<t; 0 if s=t; 1 if s>t }
- var Result: integer;
- begin
- if length(s) < length(t) then begin
- result := StrCmp(s, copy(t, 1, length(s)));
- if result = 0 then StrCmp := -1 else StrCmp := result
- end
- else if length(s) > length(t) then begin
- result := StrCmp(copy(s, 1, length(t)), t);
- if result = 0 then StrCmp := 1 else StrCmp := result
- end
- else if s = t then StrCmp := 0
- else if s < t then StrCmp := -1
- else if s > t then StrCmp := 1
- end;
-
- Then, if we have existing code which needs to be retrofitted,
- we must change occurrences of (s1 = s2) to (strcmp(s1, s2) = 0),
- and likewise for the other relational operators.
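-
- As a sanity check on the reasoning behind StrCmp, here is a small Python
- model of it. This is a sketch, not a port: same_length_compare is a
- hypothetical stand-in for a comparison that, like the VAX restriction
- described above, refuses strings of different current lengths, and str_cmp
- mirrors the recursion of the Pascal function.
-
```python
def same_length_compare(s, t):
    """Stand-in for a comparison that only accepts equal-length strings.
    Returns -1, 0, or 1."""
    if len(s) != len(t):
        raise ValueError("strings must have the same current length")
    return (s > t) - (s < t)


def str_cmp(s, t):
    """Return -1 if s<t, 0 if s=t, 1 if s>t, using only equal-length
    comparisons, in the manner of the Pascal StrCmp."""
    if len(s) < len(t):
        # Compare s against the prefix of t of the same length.
        result = str_cmp(s, t[:len(s)])
        return -1 if result == 0 else result   # a proper prefix sorts first
    if len(s) > len(t):
        result = str_cmp(s[:len(t)], t)
        return 1 if result == 0 else result
    return same_length_compare(s, t)
```
-
- For ordinary character strings this agrees with plain lexicographic
- ordering, e.g. str_cmp("ab", "abc") is -1 and str_cmp("b", "abc") is 1,
- without ever comparing strings of unequal length.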
-
-
- MAKE THAT CODE DISAPPEAR
-
- At some point, there will be pieces of code that must be seen
- by one compiler but must be hidden from another.
- Enter DIALATE. DIALATE -- a combination of the words dialect
- and translate -- converts one dialect of Pascal to another provided
- a program text file has been set up according to the guidelines about
- to be discussed.
-
- We introduce a meta-notation to be used in any Pascal we are interested
- in. This is just a notation that is in some sense "above" the Pascal code
- in that our translator will be looking only at the meta-notation and not
- the Pascal text itself. In our meta-notation we have
- meta-brackets which are the only constructs that DIALATE looks for:
-
- {@x} opening meta-bracket for dialect x
- {@} closing meta-bracket
-
- We use the curly braces "{" and "}" as the basis of our dialect
- notation, since they already have the capability of hiding text from
- a Pascal compiler -- the simple comment. DIALATE scans for comments
- that immediately begin with an "@" symbol, indicating that the comment
- is a special, dialectic comment. Immediately following the "@" can
- be one or more dialect designations.
- We currently use the following conventions:
-
- T - Turbo Pascal (IBM PC)
- V - VAX Pascal (VMS)
- H - HP Pascal
- A - Apple Pascal (Macintosh)
-
- Any piece of code that cannot run on all relevant machines
- must be surrounded by meta-brackets. DIALATE inserts
- and removes the right curly brace of the opening meta-bracket in a judicious
- manner so that the compiler "sees" only valid language constructs.
-
- Let's look at an example. Consider opening a file for input
- in Turbo Pascal and HP Pascal:
-
- Turbo Pascal: assign(MyFile, FileName);
- reset(MyFile);
-
- HP Pascal: reset(MyFile, FileName);
-
- Since there are significant differences, it is perhaps wise to
- create a procedure OpenInputFile which can be used in both Pascal
- dialects. Here is what the procedure will look like in Turbo Pascal:
-
- (1) procedure OpenInputFile(var F: text; NAME: string80);
- (2) begin
- (3) {@T}
- (4) assign(f, name);
- (5) reset(f);
- (6) {@}
- (7) {@H
- (8) reset(f, name);
- (9) {@}
- (10) end;
-
- Carefully examine the position of each of the curly braces.
- Notice in line 3 that there is a "}" but that in line 7 there is not.
- Thus, lines 4 and 5 are active code while line 8 is passive code.
- Lines 6 and 9 are closing meta-brackets which never change. The "{"
- in line 6 begins a comment which hides the "@" character, while the "{"
- in line 9 is ignored since it is already inside a comment.
- Now let's run the procedure through DIALATE, converting it to
- HP Pascal code:
-
- (1) procedure OpenInputFile(var F: text; NAME: string80);
- (2) begin
- (3) {@T
- (4) assign(f, name);
- (5) reset(f);
- (6) {@}
- (7) {@H}
- (8) reset(f, name);
- (9) {@}
- (10) end;
-
- The only difference is that the "}" in line 3 is gone, and there is
- a new "}" in line 7. This has reversed the active and passive sections
- of code.
- Hence, to write a piece of
- automatically translatable code, decide on which dialect you wish
- to write in, and use the appropriate meta-brackets.
-
- The technique is extensible to multiple machines with a concise
- notation. Take, for example, a function to find the location of a pattern
- string within some other string. In Turbo Pascal and HP Pascal this
- function is called pos, while in VAX Pascal it is called index.
- We could then write a compatibility function called index to be used
- in Turbo or HP Pascal, or one called pos to be used in VAX Pascal.
- A third alternative, though, is to write a function with a new name,
- perhaps LocateSubString, to be used in all three dialects.
- Stylistically, it might be better to use a very different name so that
- there is no chance of confusing the name with some other valid language
- construct.
- As it happens, the functions pos and index do precisely the
- same thing, though their parameter order is different, so we do not
- have to write much code.
-
- (1) function LocateSubString(Object,
- Target: string80): integer;
- (2) begin
- (3) {@HT
- (4) LocateSubString := pos(object, target);
- (5) {@}
- (6) {@V}
- (7) LocateSubString := index(target, object);
- (8) {@}
- (9) end;
-
- We can see that the above function is written in VAX Pascal since
- line 7, the VAX code, is active. Line 4 is passive code which is
- for both the HP and the Turbo dialects, since line 3 has both a "T" and
- an "H". When translated to either of these dialects, line 4 will
- become active while line 7 becomes passive.
-
-
- CASE IN POINT
-
- There is one final twist which has precipitated out of the Babel-like
- differences in Pascals, and that is the else clause of a case
- statement. Some Pascals, such as Turbo, use the keyword else to
- indicate any cases not explicitly enumerated. Other Pascals,
- such as VAX, HP, and Macintosh, use the keyword otherwise.
- We can certainly handle this discrepancy using our standard meta-notation
- described in the previous section. A typical case statement might
- look like this:
-
- case SelectionChar of
- 'R': RunIt;
- 'P': ProcessIt;
- 'E': EditIt;
- {@T} else {@}
- {@HVA otherwise {@}
- writeln('Invalid selection character');
- end { case } ;
-
- But this can quickly become an annoyance. Since there is no
- way to make some kind of generic function which handles all case
- statements (as we did with OpenInputFile),
- we must use the meta-brackets every time we have a case statement.
- Or rather, we would have to, if DIALATE didn't have a
- better solution.
-
- What we would like to do is have the translator automatically convert
- an else to an otherwise if we are going from Turbo to either
- HP, VAX, or Macintosh Pascal, and convert an otherwise to an else
- if we are going in the converse direction.
- But wait, what about the oft-found if... then... else... statement?
- How will we know if an else belongs to an if or to a case?
-
- What we have chosen to do is have a special ELSE-OTHERWISE convention.
- In Turbo Pascal, we write "ELSE" to denote an else of a case
- statement, and "else" to denote an else of an if statement.
- In our other Pascals, we write "OTHERWISE" to denote an otherwise
- of a case statement, and "else" to denote an else of an if
- statement. That is, the case clause -- whatever it is called --
- must be in uppercase letters, while the if clause must be in
- lowercase letters.
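-
- The convention makes the conversion a purely textual one. The Python
- sketch below shows one way it could be done; it is an assumption about the
- approach rather than DIALATE's actual code, and the name
- convert_case_default is hypothetical. It also ignores one complication: an
- uppercase ELSE or OTHERWISE inside a string literal or comment would be
- rewritten as well.
-
```python
import re

# Case-sensitive on purpose: only the uppercase keywords mark a case clause.
CASE_DEFAULT = re.compile(r"\b(?:ELSE|OTHERWISE)\b")


def convert_case_default(text, dialect):
    """Rewrite the default clause of case statements for `dialect`.
    'T' (Turbo) gets ELSE; every other dialect letter gets OTHERWISE."""
    wanted = "ELSE" if dialect == "T" else "OTHERWISE"
    return CASE_DEFAULT.sub(wanted, text)


TURBO = """case SelectionChar of
  'R': RunIt;
  ELSE writeln('Invalid selection character');
end;
if ok then RunIt else EditIt;"""

VAX = convert_case_default(TURBO, "V")
```
-
- Here VAX contains OTHERWISE in the case statement, while the lowercase
- else of the if statement is left alone.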
-
-
- DISCUSSION
-
- There are minor disadvantages to this dialect translation technique.
- The foremost is that you must have all meta-brackets balanced
- and correct; otherwise you might hide too little or too much code from
- the compiler. This might cause compilation errors, but it might not,
- creating possibly subtle bugs.
- For example, if we accidentally made the VAX code above into passive code,
- then the LocateSubString function would never be assigned a value.
- This could cause random or unpredictable results in the program.
- On the other hand, if we made both assignments active, a compiler
- should complain about the unknown function index or pos,
- depending on the machine on which the code is compiled.
- It takes a little getting used to, but typing correct meta-brackets
- is, after all, no more than typing correct syntax in a programming
- language.
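-
- Balance, at least, can be checked mechanically. The Python sketch below
- is not a feature of DIALATE described in this paper; it is a hypothetical
- helper (check_meta_brackets) that assumes every opening meta-bracket must
- be followed by a closing "{@}" before the next opening one.
-
```python
import re

# Matches "{@", optional dialect letters, and an optional "}".
META = re.compile(r"\{@([A-Z]*)(\}?)")


def check_meta_brackets(text):
    """Return a list of human-readable problems; empty if brackets balance."""
    problems = []
    open_line = None  # line number of the currently open region, if any
    for number, line in enumerate(text.splitlines(), start=1):
        for match in META.finditer(line):
            letters, brace = match.groups()
            if letters:                      # opening meta-bracket
                if open_line is not None:
                    problems.append(
                        f"line {number}: opening meta-bracket inside "
                        f"region opened on line {open_line}")
                open_line = number
            elif brace:                      # closing meta-bracket "{@}"
                if open_line is None:
                    problems.append(
                        f"line {number}: closing meta-bracket "
                        f"without an opening one")
                open_line = None
            else:                            # bare "{@" with nothing after it
                problems.append(f"line {number}: malformed meta-bracket")
    if open_line is not None:
        problems.append(f"line {open_line}: meta-bracket never closed")
    return problems
```
-
- A check of this kind catches a forgotten "{@}", though it cannot tell
- whether the right code was placed in the right region.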
-
- Second, it is not possible to put actual comments inside a region
- that is meta-bracketed. This may cause compilation errors when the
- entire region is supposed to be hidden, since the closing comment bracket
- could inadvertently reactivate a portion of the hidden code.
- This is most insidious, however, when it does not cause compilation
- errors, as, for example:
-
- procedure DUMMY;
- begin
- {@H
- a := 10;
- b := 5;
- {@}
- {@T}
- a := 5; { some global variables }
- b := 10;
- {@}
- end;
-
- The above code, as written, will work fine with the Turbo compiler.
- However, when we translate it to run with HP Pascal, we will get the
- wrong value for b, since the closing "}" of the "real" comment will
- prematurely close the comment created by the meta-brackets.
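-
- To see exactly what the HP compiler is left with, we can strip comments
- the way a compiler would. The Python sketch below assumes that "{" "}"
- comments do not nest, so a comment ends at the first "}" encountered; the
- name strip_comments and the constant HP_VERSION (the body of DUMMY after
- translation to HP Pascal) are hypothetical.
-
```python
def strip_comments(text):
    """Remove { ... } comments, assuming they do not nest."""
    kept = []
    in_comment = False
    for ch in text:
        if in_comment:
            if ch == "}":        # the first "}" ends the comment
                in_comment = False
        elif ch == "{":
            in_comment = True
        else:
            kept.append(ch)
    return "".join(kept)


# Body of DUMMY after translation to HP: "{@H}" active, "{@T" passive.
HP_VERSION = """{@H}
a := 10;
b := 5;
{@}
{@T
a := 5; { some global variables }
b := 10;
{@}
"""

# Collapse whitespace so the surviving statements read on one line.
visible = " ".join(strip_comments(HP_VERSION).split())
```
-
- Under that assumption, visible is "a := 10; b := 5; b := 10;": the
- Turbo assignment to b has escaped from its passive region.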
-
- The dialect translation technique discussed in this paper is an
- effective and rapid way to work in several different Pascals with virtually
- the same program text. It may seem awkward at first, but one can readily
- get used to the style. And for those who need to work with more than
- one dialect or more than one machine, the tool may prove invaluable.
- DIALATE was developed due to a perceived need here at QCAD.
- We manufacture a large software package (a parser generator) which runs
- on the machines discussed above; DIALATE allows us to keep the same
- code on all of the machines.
-
- Should any readers be so amazed by the technique revealed
- in this paper (for the first time anywhere) that they want a copy, I will gladly supply both
- object and source code for DIALATE for a mere $10.24 (a kilo-penny)
- to cover diskette, postage, copying, etc. I will also be happy to
- tell you about some of the neat software tools that QCAD sells for
- real money.