- STAGE2 INTRODUCTION
-
- COPYRIGHT:
- Written: 06/15/79
- Updated: 06/17/79
-
- This introductory material is the property of:
- Dick Curtiss
- 843 NW 54th
- Seattle, Washington 98107
-
- Permission is granted to copy for personal use only.
-
- This material may NOT be used for publication without written
- permission of the author.
-
- STAGE2 PROGRAMMING TECHNIQUE:
- STAGE2 is unlike conventional languages and requires getting
- used to a different way of approaching a problem. By its nature,
- STAGE2 forces a top down approach to problem solving using stepwise
- refinement and recursive descent. Expect to read this material
- several times before it sinks in. The best way to learn STAGE2 is
- to study examples and experiment. Good luck!
-
- EXAMPLE PROBLEM:
- Suppose it is desired to recognize the following WHILE statement
- and translate it into assembler type language.
-
- WHILE ( X < Y ) PRINT X*Y : X = X + 1 ; Comment
-
- A top level macro would be used to recognize the "WHILE" as a keyword.
- As part of the recognition process the statement would be broken into
- three parts, "WHILE", "X < Y", and the rest of the line. The first
- step in the translation process is to generate a looping label. Next,
- "X < Y" would be passed on to a set of macros designed to generate a
- sequence of assembler language instructions which evaluate the
- conditional expression. Then a "jump if false" instruction would be
- generated to branch to the loop exit label.
-
- Next, the remainder of the line would be passed on to macros designed
- to recursively break apart multiple statement lines and process each
- single statement with still other specialized macros. After
- decomposition of the "WHILE" statement is complete, a jump instruction
- is generated to branch to the looping label. Finally, "WHILE"
- statement processing is completed by generating a loop exit label.
-
- EXAMPLE MACRO EXPANSION:
- LOOP:
- LOAD X
- CMP-LT Y
- J-FALSE EXIT
-
- LOAD X
- MULTIPLY Y
- CALL PRINT-RESULT
-
- LOAD X
- ADD #1
- STORE X
-
- JUMP LOOP
- EXIT:
-
- EXAMPLE MACROS:
- WHILE$($)$# To recognize WHILE statement
- LOOP:!F13% output looping label
- CONDITION !20% macro call to parse conditional expression
- J-FALSE EXIT!F13% output exit jump
- STATEMENT !30% macro call to parse remainder of line
- JUMP LOOP!F13% output loop jump
- EXIT:!F13% output loop exit label
- % ------------------------- end of macro
- CONDITION $<$# To recognize less than compare
- LOAD !10!F13% output load X instruction
- CMP-LT !20!F13% output compare Y instruction
- % ------------------------- end of macro
- STATEMENT $:$# To recognize and split multiple statement
- PROCESS !10% macro call to process single statement
- STATEMENT !20% recursive macro call to parse rest of line
- % ------------------------- end of macro
- STATEMENT $;$# To recognize and split statement with comment
- PROCESS !10% macro call to process single statement
- % ------------------------- end of macro
- STATEMENT $# To recognize single statement
- PROCESS !10% macro call to process single statement
- % ------------------------- end of macro
- PROCESS PRINT$# To recognize and process PRINT statement
- *** macro code not shown
- % ------------------------- end of macro
- PROCESS $=$# To recognize and process assignment stmt.
- *** macro code not shown
- % ------------------------- end of macro
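-
- The three STATEMENT macros above split a line by recursion. As a
- rough illustration only, the same decomposition can be written in
- Python. The name split_statements is hypothetical. The sketch assumes
- the ":" template is tried before the ";" template, and it trims
- surrounding blanks, which the macros themselves do not do.

```python
def split_statements(rest):
    """Mimic the STATEMENT macros: return the single statements in a line."""
    if ":" in rest:
        # STATEMENT $:$  -> process the head, recurse on the tail
        head, tail = rest.split(":", 1)
        return [head.strip()] + split_statements(tail)
    if ";" in rest:
        # STATEMENT $;$  -> process the statement, drop the comment
        head, _comment = rest.split(";", 1)
        return [head.strip()]
    # STATEMENT $      -> a single statement with nothing after it
    return [rest.strip()]


print(split_statements(" PRINT X*Y : X = X + 1 ; Comment"))
# prints ['PRINT X*Y', 'X = X + 1']
```

- Each returned string corresponds to one "PROCESS" macro call.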
-
- READING MACROS:
- Macros consist of a template line followed by one or more code
- body lines. The macro is terminated by an empty code body line.
-
- Macro templates, which are terminated by a special template end
- character, consist of character strings with special parameter
- flag characters interspersed.
-
- Code body lines are terminated by a special code body line end
- character. An empty code body line (macro terminator) has the
- code body line end character in column 1. A special escape
- character is used in code body lines for parameter reference
- and invocation of processor functions.
-
- Characters in a line following the special end characters are
- taken as comment only.
-
- SPECIAL CHARACTERS:
- # Template end of line
- $ Template parameter flag
- % Code body end of line
- ! Code body escape
- ( Left bracket
- ) Right bracket
-
- These special characters are user selectable on the first line
- of input to STAGE2. The particular characters shown above were
- arbitrarily chosen for the examples which follow.
-
- MACRO EXAMPLE:
- This macro may be used to store information into the STAGE2
- built in memory.
-
- MEM[$]=$# 1. Template line
- !F3% 2. Store into memory
- % 3. End of macro
-
- The template in line 1 contains two parameter flags (maximum
- of nine allowed). For a string to match the template it must
- contain the literal characters in the order shown in the
- template line. The parameter flag characters, "$", will match
- any balanced strings including a null string. A balanced string
- is one containing equal numbers of left and right bracketing
- characters, usually "(" and ")".
-
- Line 2 contains a processor function request. The escape
- character "!" followed by "F" followed by a digit specifies one
- of ten possible functions. The "F" can actually be any
- non-numeric or special character. The function "3" shown in the
- example instructs STAGE2 to store parameter string 2 into the
- memory using parameter string 1 for access to memory. In other words,
- the string in parameter 1 is given a value in memory and that value
- is the string in parameter 2.
-
- Parameter 1 is the string segment represented by the first "$"
- in the template and parameter 2 is the string segment represented
- by the second "$" in the template. The maximum number of
- parameters is nine.
-
- Line 3 is an empty code body line or macro terminator.
-
- Strings for a successful match:
- MEM[25]=TWENTY FIVE#
- Parameter 1 = "25"
- Parameter 2 = "TWENTY FIVE"
- Parameters 3-9 = ""
-
- MEM[ABC]=HELLO#
- P1 = "ABC"
- P2 = "HELLO"
-
- MEM[EQUATION]=A=2*(B+C)#
- P1 = "EQUATION"
- P2 = "A=2*(B+C)"
-
- MEM[X=Y]=Z#
- P1 = "X=Y"
- P2 = "Z"
-
- Strings for a match failure:
- MM[12]="E" MISSING#
-
- MEM [XYZ]=SPACE AFTER "MEM"#
-
- MEM[ABC)]=UNBALANCED STRING#
-
- MEM[ABC]=UNBALANCED (STRING#
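-
- The successes and failures listed above can be reproduced with a
- small Python sketch of balanced-string matching. It is not the
- STAGE2 algorithm (which builds the templates into a tree). The names
- is_balanced and match_template are hypothetical, and trying the
- shortest candidate for each parameter first is an assumption. Lines
- are passed without the trailing "#".

```python
def is_balanced(text, left="(", right=")"):
    """True when brackets in text pair up and never close before opening."""
    depth = 0
    for ch in text:
        if ch == left:
            depth += 1
        elif ch == right:
            depth -= 1
            if depth < 0:
                return False
    return depth == 0


def match_template(template, line, flag="$"):
    """Return the list of parameter strings, or None on a match failure."""
    literals = template.split(flag)      # literal text around each flag
    if len(literals) == 1:               # no parameters: exact match only
        return [] if line == template else None
    if not line.startswith(literals[0]):
        return None

    def rest_matches(index, pos):
        literal = literals[index]
        is_last = index == len(literals) - 1
        if is_last and literal == "":    # a trailing "$" takes the rest
            rest = line[pos:]
            return [rest] if is_balanced(rest) else None
        start = pos
        while True:
            found = line.find(literal, start)
            if found < 0:
                return None
            candidate = line[pos:found]
            end = found + len(literal)
            if is_balanced(candidate):
                if is_last:
                    if end == len(line):
                        return [candidate]
                else:
                    tail = rest_matches(index + 1, end)
                    if tail is not None:
                        return [candidate] + tail
            start = found + 1            # try a longer candidate

    return rest_matches(1, len(literals[0]))


print(match_template("MEM[$]=$", "MEM[X=Y]=Z"))
# prints ['X=Y', 'Z']
print(match_template("MEM[$]=$", "MEM[ABC)]=UNBALANCED STRING"))
# prints None
```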
-
- MACRO EXAMPLE:
- This macro may be used to print information stored in the
- memory.
-
- PRINT MEM[$]# 4. Template line
- !10=!11!F14% 5. Extract info and output
- % 6. End of macro
-
- The template in line 4 contains 1 parameter flag which
- represents the string which will be used to access the memory.
-
- The first escape character in line 5 is followed by a non-zero
- digit, "1", which is taken to be a reference to parameter string
- 1. The digit, "0", following the parameter reference is a
- conversion code (0-8 allowed). Conversion "0" copies the
- parameter string unchanged to the constructed line. The
- constructed line can be thought of as a scratch string which is
- empty at the start of a code body line scan. In summary the
- three characters "!10" instruct STAGE2 to append parameter 1 to
- the constructed line.
-
- The next character in line 5 is a literal "=" which is appended
- to the constructed line. Next is another escape character
- followed by the digit "1". This is another reference to
- parameter string 1. This time, however, the conversion digit
- is a "1" which instructs STAGE2 to append information from the
- memory to the constructed line using the specified
- parameter string for access.
-
- At this point 3 items have been appended to the constructed line:
- parameter string 1, "=", and a string from the memory. The
- next character in line 5 is another escape. This time, however,
- the following character is non-numeric indicating a processor
- function request. The next character, the digit "1", specifies
- the output function. The following digit, "4", specifies the
- output channel. "!F14" causes output of the constructed line to
- channel 4. When the channel number is omitted, output is to
- channel 3 by default. Processing of line 5 is now complete
- and line 6 terminates the macro.
-
- Strings for successful match:
- PRINT MEM[25]#
- Channel 4 output = "25=TWENTY FIVE"
-
- PRINT MEM[ABC]#
- Ch4 = "ABC=HELLO"
-
- PRINT MEM[PDQ]#
- Ch4 = "PDQ=" nothing stored previously
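-
- The left-to-right scan of a code body line can be sketched in Python
- as follows. This is an illustration only: run_body_line is a
- hypothetical name, the body is passed without its "%" terminator,
- memory is modelled as a dictionary, and only conversions 0 and 1 and
- functions 1 and 3 are handled.

```python
def run_body_line(body, params, memory, channels, escape="!"):
    """Scan one code body line and return what is left in the constructed line."""
    cl = ""                                  # the constructed line
    i = 0
    while i < len(body):
        if body[i] != escape:
            cl += body[i]                    # literal character
            i += 1
        elif body[i + 1].isdigit():          # parameter reference
            value = params[int(body[i + 1]) - 1]
            conversion = body[i + 2]
            if conversion == "0":
                cl += value                  # copy the parameter unchanged
            elif conversion == "1":
                cl += memory.get(value, "")  # fetch from memory, null if undefined
            else:
                raise NotImplementedError(conversion)
            i += 3
        else:                                # processor function request
            function = body[i + 2]
            i += 3
            if function == "3":
                memory[params[0]] = params[1]
            elif function == "1":
                channel = 3                  # default output channel
                if i < len(body) and body[i].isdigit():
                    channel = int(body[i])
                    i += 1
                channels.setdefault(channel, []).append(cl)
                cl = ""
            else:
                raise NotImplementedError(function)
    return cl


memory, channels = {}, {}
run_body_line("!F3", ["25", "TWENTY FIVE"], memory, channels)
run_body_line("!10=!11!F14", ["25"], memory, channels)
run_body_line("!10=!11!F14", ["PDQ"], memory, channels)
print(channels[4])
# prints ['25=TWENTY FIVE', 'PDQ=']
```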
-
- MACRO EXAMPLE:
- This macro will also display information stored in the memory
- but formatted into fields.
-
- FORMAT MEM[$]# 7. Template line
- !11!26% 8. Extract info
- !F14% 9. Output
- 1111111= 22222222222222222% 10. Format
- % 11. End of macro
-
- As in line 5, the "!11" in line 8 appends information from the
- memory to the constructed line using parameter 1 for
- access. Then "!2" is a reference to parameter 2. The
- following conversion digit, "6", instructs STAGE2 to copy the
- constructed line into the specified parameter. This also
- results in clearing the constructed line to null. Processing
- of line 8 is complete as the end of line character comes next.
- It is possible, however, to have more operations appear in that
- same code body line (e.g. "!11!26 !F14%"). The character after
- the 6 in this alternative is ignored, so a space is shown.
-
- At this point parameter 1 still has the string resulting from
- the template match and parameter 2 contains the string extracted
- from the memory using parameter 1 for access.
-
- The output request in line 9 is like that of line 5 except that
- the constructed line is empty (null). This condition instructs
- STAGE2 to use the following code body line as a formatting
- template which should not be confused with macro templates.
- Fields of numeric characters refer to corresponding parameter
- strings. Non-numerics in the formatting template appear in the
- output line as is. Parameter strings are inserted into the fields
- left justified (leading blanks are not suppressed) and blank
- filled or truncated on the right depending on parameter length
- and field width.
-
- Strings for successful match:
- FORMAT MEM[25]#
- Ch4 = "25     = TWENTY FIVE      "
-
- FORMAT MEM[ABC]#
- Ch4 = "ABC    = HELLO            "
-
- FORMAT MEM[PDQ]#
- Ch4 = "PDQ    =                  "
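-
- The field rule described above (runs of identical digits, left
- justified, blank filled or truncated on the right) can be sketched
- in Python. The name format_line is hypothetical, and the sketch
- assumes the blank is the fill character and that fields refer to
- parameters 1 through 9 only.

```python
import re


def format_line(fmt, params):
    """Fill each run of identical digits with the matching parameter."""
    padded = list(params) + [""] * 9         # missing parameters are null

    def fill(match):
        width = len(match.group(0))          # field width = run length
        text = padded[int(match.group(1)) - 1]
        return text[:width].ljust(width)     # truncate or blank fill

    return re.sub(r"([1-9])\1*", fill, fmt)


fmt = "1" * 7 + "= " + "2" * 17              # same shape as format line 10
print(repr(format_line(fmt, ["25", "TWENTY FIVE"])))
print(repr(format_line("111", ["ABCDE"])))
# the second line prints 'ABC'
```

- Non-numeric characters such as "=" pass through unchanged, so the
- output always has exactly as many characters as the format line.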
-
- EXAMPLE RUN:
- FILE "MEMORY.INP"
- #$%!0 (+-*/) 0. Special character selection
- MEM[$]=$# 1. Template line
- !F3% 2. Store into memory
- % 3. End of macro
- PRINT MEM[$]# 4. Template line
- !10=!11!F14% 5. Extract info and output
- % 6. End of macro
- FORMAT MEM[$]# 7. Template line
- !11!26% 8. Extract info
- !F14% 9. Output
- 1111111= 22222222222222222% 10. Format
- % 11. End of macro
- END# 12. Template line
- !F0% 13. Terminate processing
- %% 14. End of macros
- MEM[25]=TWENTY FIVE#
- MEM[ABC]=HELLO#
- MEM[EQUATION]=A=2*(B+C)#
- MEM[X=Y]=Z#
- MM[12]="E" MISSING#
- MEM [XYZ]=SPACE AFTER "MEM"#
- MEM[ABC)]=UNBALANCED STRING#
- MEM[ABC]=UNBALANCED (STRING#
- PRINT MEM[25]#
- PRINT MEM[ABC]#
- PRINT MEM[PDQ]#
- FORMAT MEM[25]#
- FORMAT MEM[ABC]#
- FORMAT MEM[PDQ]#
- END#
-
- Note: The "#" shown at the end of the input lines is
- optional in the CP/M implementation, as a carriage
- return is sufficient for an end of line condition.
- When used, the special character is the same
- terminator used for macro template end of line. It
- can be used if it is desired to allow comment
- information in the source input stream.
-
- COMMAND LINES:
- STAGE2 CH3,CH4=MEMORY.INP
- TYPE CH3
- TYPE CH4
-
- FLAG LINE:
- The first line read by STAGE2 is used to specify the user's special
- symbol selections.
-
- column description
- 1 End of template and source input lines
- 2 Template parameter flag
- 3 End of code body line
- 4 Escape character for code body (parameter or function ref.)
- 5 The character for zero
- 6 Space character for formatted output
- 7 Open bracket (arithmetic expressions and balanced strings)
- 8 Addition operator
- 9 Subtraction operator
- 10 Multiplication operator
- 11 Division operator
- 12 Closing bracket (to match #7)
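-
- As an illustration of the column layout above, a flag line can be
- unpacked in Python by position. The key names and the function
- parse_flag_line are hypothetical; only the column order comes from
- the table.

```python
FLAG_COLUMNS = [
    "source_end", "parameter_flag", "body_end", "escape",
    "zero", "space", "open_bracket",
    "add", "subtract", "multiply", "divide", "close_bracket",
]


def parse_flag_line(line):
    """Map the first twelve columns of the flag line to named roles."""
    if len(line) < len(FLAG_COLUMNS):
        raise ValueError("flag line needs at least 12 characters")
    # zip stops after twelve columns, so trailing comment text is ignored
    return dict(zip(FLAG_COLUMNS, line))


flags = parse_flag_line("#$%!0 (+-*/) 0. Special character selection")
print(flags["escape"], flags["open_bracket"], flags["close_bracket"])
# prints ! ( )
```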
-
-
- PARAMETER CONVERSIONS:
-
- There are a maximum of ten parameters, numbered 0 through 9.
- Parameter 0 is a special case.
- There are nine possible parameter conversions, numbered 0 through
- 8. Most of this discussion will refer to specific parameters but
- the remarks apply generally to parameters 1 through 9.
-
- !10 Append parameter string 1 to the constructed line.
-
- !20 Append P2 to the CL.
-
- !11 Append MEM(P1) to the CL. Using P1 for access, append the
- value of the symbol to the CL.
- CASE
- ( P1 = null ) Generate error message and trace back
- ( P1 undefined ) Append null to the CL
- ( otherwise ) Append MEM(P1) to the CL
- fin
-
- !12 Similar to conversion 1 except when P1 is undefined.
- CASE
- ( P1 = null ) Generate error message and trace back
- ( P1 undefined )
- MEM(P1) = S1 ; define from symbol generator
- S1 = S1 + 1 ; increment symbol generator
- Append MEM(P1)
- fin
- ( otherwise ) Append MEM(P1) to the CL
- fin
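-
- The two CASE tables for conversions 1 and 2 can be restated as a
- Python sketch. The class SymbolMemory and its method names are
- hypothetical, and the starting value of the symbol generator (100
- here) is an arbitrary assumption.

```python
class SymbolMemory:
    """Dictionary-backed stand-in for the STAGE2 memory."""

    def __init__(self, first_symbol=100):
        self.values = {}
        self.next_symbol = first_symbol      # hypothetical starting value

    def fetch(self, name):
        """Conversion 1: undefined symbols give a null string."""
        if name == "":
            raise ValueError("null parameter: error message and traceback")
        return self.values.get(name, "")

    def fetch_or_define(self, name):
        """Conversion 2: undefined symbols are defined from the generator."""
        if name == "":
            raise ValueError("null parameter: error message and traceback")
        if name not in self.values:
            self.values[name] = str(self.next_symbol)
            self.next_symbol += 1
        return self.values[name]


mem = SymbolMemory()
print(mem.fetch_or_define("LOOP"), mem.fetch_or_define("EXIT"), mem.fetch_or_define("LOOP"))
# prints 100 101 100
```

- A repeated reference to the same name returns the same generated
- value, which is what makes conversion 2 useful for labels.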
-
- !13 Useful only in conjunction with context-controlled iteration.
- (Described after !17)
-
- !14 Append EVAL(P1) ; Evaluate the parameter string as an
- arithmetic expression and append a string of digits to the CL
- to represent the result. Non-numeric items in the expression
- will be taken as symbols for memory reference. An undefined
- symbol is treated as zero. If null, P1 will be treated as
- zero. A symbol with a non-numeric value will cause an error
- message and traceback.
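-
- The rules for conversion 4 can be sketched as a small evaluator in
- Python. This is an illustration only: the name evaluate is
- hypothetical, and conventional operator precedence and division
- truncating toward zero are assumptions not stated in this material.

```python
import re


def evaluate(expression, memory):
    """Return the value of an arithmetic expression as a string of digits."""
    tokens = re.findall(r"\d+|\w+|[-+*/()]", expression)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        pos += 1
        return tokens[pos - 1]

    def primary():
        token = take()
        if token == "(":
            value = total()
            take()                           # the matching ")"
            return value
        if token == "-":
            return -primary()
        if token.isdigit():
            return int(token)
        stored = memory.get(token, "0")      # undefined symbol counts as zero
        if not re.fullmatch(r"-?\d+", stored):
            raise ValueError("non-numeric value for symbol " + token)
        return int(stored)

    def term():
        value = primary()
        while peek() in ("*", "/"):
            if take() == "*":
                value *= primary()
            else:
                divisor = primary()
                quotient = abs(value) // abs(divisor)
                value = quotient if (value < 0) == (divisor < 0) else -quotient
        return value

    def total():
        value = term()
        while peek() in ("+", "-"):
            value = value + term() if take() == "+" else value - term()
        return value

    return str(total()) if tokens else "0"   # a null string counts as zero


print(evaluate("2*(B+C)", {"B": "3", "C": "4"}))
# prints 14
```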
-
- !15 Append LEN(P1) to the CL ; Append a string of digits to the CL
- to represent the length of the parameter string. A null string
- results in a single zero digit.
-
- !16 P1 = CL, CL = null ; Copies the CL into parameter 1, replacing
- whatever might have been there before. Also, the CL is
- cleared. The character immediately following "!16" will be
- ignored. If it is the end of code body line character
- processing will continue on the following line. Otherwise,
- processing will continue with the next character. When used
- inside of an iteration loop the string placed in the specified
- parameter is not retained from one iteration to the next or
- after exit from the loop.
-
- !17 This starts a context-controlled iteration loop. The current
- value of the specified parameter is saved, as the iteration
- process will supply new values for the parameter. The
- original value will be restored after exit from the loop.
-
- The CL is scanned for break characters which are specified
- following the digit "7". All of the characters up to the end
- of line character will be used as break characters. If no
- break characters are specified the CL scan is broken on each
- character. When a break character or the end of the CL is
- reached scanning stops and the scanned string (excluding the
- break character) is copied into the specified parameter. The
- scanned string and break character are deleted from the CL.
- Break characters enclosed in brackets will not be recognized, as
- the scanned string would not be balanced. After scanning stops,
- code body lines are expanded within the loop which ends at an
- "!F8". After all lines within the loop have been processed,
- scanning of the CL continues unless the CL is null. When the
- CL is null the iteration loop is terminated.
-
- !F8 Processor function to define the scope of an iteration loop.
-
- !13 Append BREAK(P1) to the CL ; The break character is the
- single character immediately following the specified
- parameter which represents a substring of the line being
- scanned. When the end of line is reached, the break
- character is null.
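-
- The scan performed by conversion 7, together with the break
- character reported by conversion 3, can be sketched in Python. The
- name scan_pieces is hypothetical. The sketch covers only the case
- where break characters are given and returns every piece at once
- rather than expanding code body lines between scans.

```python
def scan_pieces(cl, breaks, left="(", right=")"):
    """Return (piece, break_character) pairs taken from the constructed line."""
    pieces = []
    while cl:                                # loop ends when the CL is null
        depth = 0
        cut = len(cl)                        # default: end of the CL
        for i, ch in enumerate(cl):
            if ch == left:
                depth += 1
            elif ch == right:
                depth -= 1
            elif depth == 0 and ch in breaks:
                cut = i                      # break outside any brackets
                break
        pieces.append((cl[:cut], cl[cut:cut + 1]))
        cl = cl[cut + 1:]                    # delete piece and break character
    return pieces


print(scan_pieces("A,F(B,C),D", ","))
# prints [('A', ','), ('F(B,C)', ','), ('D', '')]
```

- The comma inside "F(B,C)" is not treated as a break because it lies
- between brackets, and the last piece has a null break character.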
-
- !18 Append a string of digits to the CL to represent the internal
- storage code for the character in P1. Unless P1 contains
- exactly one character an error message and traceback will
- result.
-
- SYMBOL GENERATOR:
-
- !0 Parameter "0" is a reference to an internal symbol generator.
- Within a given macro expansion up to ten unique symbols
- (actually integers or strings of digits) are available; "!00"
- through "!09". After the macro expansion is complete the
- symbol generator is incremented so that future macro expansions
- will get different symbols.
-
- PROCESSOR FUNCTIONS:
-
- There are eleven processor functions, numbered 0 through 9 and E.
- Some processor functions assume use of specific parameters.
-
- !F0 Terminate processing.
-
- !F1 Output request. The output request must appear at the end of
- a code body line. The CL is output if it is not null. If it
- is null the following code body line will be used as a format
- specification for output. A format line specifies exactly
- the number of characters in the line to be output. Non-numeric
- characters in the format specification are output exactly as
- they are. Parameter fields in the format specification are
- denoted by strings of identical digits. "22222" is a five
- character field into which parameter string 2 will be inserted,
- left justified with blank fill or truncation on the right as
- required. A given parameter may be referenced for more than
- one field in the formatted line.
-
- !F14 Output the CL to channel 4.
- !F1 Output to channel 3 as default channel.
- !F12R Output to channel 2 after rewind.
-
- !F2 Change I/O channels and copy text from the specified input
- channel to the specified output channel. If P1 is null no
- copying takes place. Copying continues up to an input line
- whose initial substring matches P1. The line which matches
- P1 is ignored, copying stops and the input channel is
- positioned to the line following the matched line. If no
- match line is found end of file terminates the copy.
-
- WHEN ( input channel specified )
- make it the new current input (CI) channel
- fin
- ELSE the current input channel number is unchanged
-
- WHEN ( output channel specified )
- copy to the specified channel
- fin
- ELSE copy to channel 3
-
- 5!F2 CI = 5 , out is 3
-
- 2R!F2 CI = 2 , Rewind 2 , out is 3
-
- !F24 CI unchanged , out is 4
-
- 2!F23R CI = 2 , out is 3 , Rewind 3 before copy
-
- In all cases no copy takes place if P1 is null.
-
-
- !F3 MEM(P1) = P2 ; Using parameter string 1 for access, store
- parameter string 2 into memory. (i.e. the value of P1 is
- defined to be P2). If P1 is null an error message and
- traceback result.
-
- !F4 Set the skip counter unconditionally. The skip counter applies
- to macro code body lines. The skip feature allows conditional
- expansion of portions of a macro code body. Parameter string 1
- is evaluated as an arithmetic expression (see conversion 4
- description) and the result is placed in the skip counter.
-
- SKIP = EVAL(P1)
-
-
- !F5 Set skip counter based on string compare for equality. The
- test condition is specified by a character following "!F5".
-
- !F50 IF ( P1 == P2 ) SKIP = EVAL(P3)
- !F51 IF ( P1 <> P2 ) SKIP = EVAL(P3)
-
-
- !F6 Set skip counter based on the relative values of 2 arithmetic
- expressions. The test condition is specified by a character
- following "!F6".
-
- !F6- IF ( EVAL(P1) < EVAL(P2) ) SKIP = EVAL(P3)
- !F60 IF ( EVAL(P1) == EVAL(P2) ) SKIP = EVAL(P3)
- !F61 IF ( EVAL(P1) <> EVAL(P2) ) SKIP = EVAL(P3)
- !F6+ IF ( EVAL(P1) > EVAL(P2) ) SKIP = EVAL(P3)
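-
- The difference between the string tests of "!F5" and the arithmetic
- tests of "!F6" can be shown with a Python sketch. The name
- skip_count is hypothetical. Parameters are assumed to be plain
- integers (int() stands in for EVAL), and a failed test is assumed
- to mean that no lines are skipped.

```python
def skip_count(request, p1, p2="", p3="0"):
    """Return the skip count for the characters that follow "!F"."""
    function, test = request[0], request[1:]
    if function == "4":
        return int(p1)                       # unconditional
    if function == "5":                      # compare as strings
        passed = {"0": p1 == p2, "1": p1 != p2}[test]
    elif function == "6":                    # compare as numbers
        a, b = int(p1), int(p2)
        passed = {"-": a < b, "0": a == b, "1": a != b, "+": a > b}[test]
    else:
        raise ValueError(request)
    return int(p3) if passed else 0          # assumption: 0 when the test fails


print(skip_count("50", "007", "7", "2"), skip_count("60", "007", "7", "2"))
# prints 0 2
```

- "007" and "7" differ as strings but are equal as numbers, so only
- the "!F60" test sets the skip counter.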
-
-
- !F7 Count-controlled iteration. The CL is evaluated as an
- arithmetic expression (see conversion 4 description) and the
- resulting value is placed in an iteration counter. The loop,
- which ends at an "!F8", is repeated with the iteration counter
- decremented for each iteration. The loop terminates when the
- counter reaches zero.
-
- !F8 Defines the scope of count-controlled loops and context-
- controlled loops. Loop nesting is permitted. Skipping out of
- loops is permitted. Skipping over entire loops is tricky
- business (see Waite's book, page 398).
-
- !F9 Terminates expansion of the current macro.
-
- !FE Force an error message and traceback. All macro calls to the
- current level will be output in reverse order to channel 4.
- The last traceback line is the current input line.
-
- -----------------------------------------------------------------------
-
- STAGE2 PROGRAM:
- This is a highly simplified description of the STAGE2 algorithm.
-
- PROGRAM;
-
- PROCEDURE MATCH ( STRING );
- BEGIN
-
- attempt to match STRING against macro templates
-
- IF MATCH_SUCCESSFUL THEN
-
- FOR each line of macro code body DO
- BEGIN
- scan code body line and perform operations
- IF CONSTRUCTED_LINE <> NULL THEN
- MATCH ( CONSTRUCTED_LINE ); {note recursive call}
- END
-
- ELSE output STRING to channel 3
-
- END;
-
- BEGIN { ---------------- program starts here ------------ }
-
- INPUT_FLAG_LINE {gets special character definitions from
- the first line input from channel 1}
-
- INPUT_MACROS {reads macro code bodies into memory from
- channel 1 and builds templates into tree
- structure for the template matching
- algorithm}
-
- INPUT_NEXT_LINE {gets first source line from input file
- (channel 1) - this is the first line
- following the last macro}
-
- WHILE NOT END_OF_FILE DO
-
- BEGIN
- MATCH ( LINE ); {attempt to match the line against all
- macro templates}
-
- INPUT_NEXT_LINE
-
- END;
-
- END.
-
-
- Except for switching channels or rewinding channels, the STAGE2
- user has no control over input. The processor has a built in loop
- for input as can be seen in the above program. Output, however,
- is under user control through macro body processing. If STAGE2
- fails to match an input line against a macro template the line
- will be output to channel 3 as is and the processor will go on
- to the next input line.
-