home *** CD-ROM | disk | FTP | other *** search
-
- More on the infix parser.
- =========================
-
-
- N. Wirth helped me with this one. I was leafing through his book:
- Algorithm+DataStructures=Programs one day and found some charting, I
- forget the name of it. Anyhow, they seemed neat at the time and give
- almost a direct correspondence to code. Something other people using
- other types of flowcharts have also noticed. (Notably: Wil Baden in
- Charting Forth, 1986 FORML Conference Procedings, a superb paper).
-
- I know some people are probably horrified. Using Wirth for Forth? Well,
- I believe that Wirth one day will write (if he didn't already) a
- threaded interpreted language.
-
- I also wanted to write a recursive utility that did a little more then
- calculate a bunch of numbers in a special sequence. (fibo...? a heck.)
- If you don't like it recursive, look through the back issues of Forth
- Dimensions for a operator-stack type parser.
-
- What follows are an explanation of the precedence, the charts on which
- it is based (I hope they are clear enough) and a small description of
- the routines which is in addition to the shadow screens.
-
-
- Precedence.
- -----------
-
- Every infix parser takes precedence of operators into consideration. The
- following are defined from the highest to lowest precedence:
-
- operator explanation
-
- ( ) nesting
- - ~ negate not
- << >> shift left shift right
- & ! and or
- * / multiply divide
- + - add subtract
-
-
- These operators work on numbers, which have a 32 bit limit in my system.
- Numbers can also start with an $ sign. This indicates an hexadecimal
- radix, and for the number following only. In addition to numbers the
- system can use identifiers. These identifiers are normal Forth
- constants. Forth variables would use the address of that variable,
- simply what that identifier leaves on the stack.
-
- Identifiers must start with a _ (underscore) . (period) or an alpha
- character. The rest can be alpha characters, numbers or _ and .
- The length of the identifier is limited by the underlying Forth, usually
- 31 characters.
-
-
- Charts. ( From the top down )
- -------
-
- expression
-
- ------------> simple-ex >----+--+------>
- ^ | |
- | v v
- |
- | + -
- |
- | | |
- +--------------------+--+
-
-
- simple-ex
-
- ------------> term >----+--+------>
- ^ | |
- | v v
- |
- | * /
- |
- | | |
- +--------------------+--+
-
-
- term
-
- ------------> factor >----+--+------>
- ^ | |
- | v v
- |
- | & !
- |
- | | |
- +--------------------+--+
-
-
- factor
-
- ------------> factor1 >----+--+------>
- ^ | |
- | v v
- |
- | << >>
- |
- | | |
- +--------------------+--+
-
-
- factor1
-
- ---+-------+-----> factor2 >--------->
- | ^
- | |
- +-> - >-+
- | |
- +-> ~ >-+
-
-
- factor2
-
- ---+--------> getconstant >---------+---->
- | ^
- | |
- +-> ( >--> expression >---> ) >-+
-
-
- getconstant
-
- ---+--------> getnumber >---------------->
- | ^
- | |
- +-------> identifier >-------+
- | |
- | |
- +--> ' >--> getchar >--> ' >--+
-
-
- getnumber
-
- ---+---------------> digit >-+----------->
- | ^ ^ |
- | | | |
- +--> $ >--+ +-----------+
-
-
- identifier
-
- ---+-> letter >-------+----------------+-->
- | ^ ^ | ^ |
- | | | | | |
- +--> _ >---+ | +--> letter >--+ |
- | | | | | |
- | | | | | |
- +--> . >---+ | +--> digit >--+ |
- | |
- +------------------+
-
-
-
- Routines.
- ---------
-
- TEST
- Small word to test the evaluation process. Uncomment it before loading.
- To use it, type from the Forth prompt (e.g.):
- 5 constant testcon
- test 2*8+_testcon
- which should leave the answer on the stack. Reverse the order and the
- result should be the same, precedence.
-
- EVALUATE (s addr len -- n )
- The topmost word of the parser. It returns the value of the
- expression, using the string non destructively. It initializes the
- pointer and counter to keep track of the current character to be
- examined, ie. the symbol.
-
- (EXPRESSION)
- The default for expression. This word corresponds to the first graph.
- Note that it is indirectly recursive, by calling the deferred word
- expression, which is itself.
- Also I have omitted stack pictures, in a recursive context it varies
- with the input. Doesn't matter as long as the end result is not
- variable.
- Program flow is almost a direct correspondence to the graph.
-
- SIMPLE-EX
- The next graph translates into this word. Also recursive, as the arrow
- around the graph implies. The order of operator precedence is bottom
- up; this is the second level from the bottom, the characters * and /
- are scanned for here. If not present, the routine is exited, with the
- result from TERM.
-
- TERM
- Recursive routine, looks for the logical & and ! symbols. It is
- expected that the expressions presented to this parser are not to
- deeply nested, or the stack might overflow? Normally, that will not be
- the case. Most calculators allow appr. 10 levels of nesting, and I
- think this one will outdo them.
-
- FACTOR
- Recursive also. Here I cheated a little, I don't parse the entire >>
- and <<. I simply ignore the second character. The second 'readsymbol'
- takes care of that.
-
- FACTOR1
- The negate and not level. It is not recursive, simply decides on the
- character in symbol, what to do with factor2. The graph doesn't have
- an arrow back to itself.
-
- FACTOR2
- Looks at the next symbol, and if it is a (, bracket that is, will call
- expression, and cause indirect recursion, by calling the entire
- sequence. Note how the missing ), closing bracket, is handled.
- If it is not a nesting, getconstant gets called.
-
- GETCONSTANT
- A constant can be a character, a number or an identifier. Each one has
- a special starting character, allowing for this parser not to have to
- backtrack. Letter? is in this version [ . | _ | a..z | A..Z ]
-
- IDENTIFIER
- Places the substring from the expression at (Forth's) 'here', appends a
- null character after it (required in my system) and finds it in the
- Forth context dictionary. If not found aborts with an error message,
- displaying the unknown identifier. If found, it will execute the word.
- Therefore if the word is a constant it will return a value on the
- stack. Any word returning a value could be used, even one with side
- effects.
-
- GETNUMBER
- Also places a substring at 'here', but calls 'number' to convert the
- string into a value. The number is in hex base if preceded by $.
- Again: ' 0 here count + c! ' is particular to my system, most other
- F83's will require a blank instead of a zero.
-
- GETCHAR
- Read the next symbol and uses it as a character constant. Since Forth
- already translates it for me, it is easy. It does expect a closing
- single quote and aborts complaining if it is missing.
-
- The first screen of the parser is a set of normal Forth words the
- explanation on the shadow screen is explicit enough. The only strange
- trick used here is not to point to the beginning of the string, but to
- one character past the end. This allows the length of the string to be
- decremented to zero, at which time the string is exhausted. This makes
- it unnecessary to keep the length in a variable and comparing a counter,
- which is incremented, to it. The current symbol is pointed to by
- symbol> - insymbol or the end of the string less the counter.
- On this screen I have also defined what a digit and what a letter should
- be.
-
- This parser can be extended, modified, to allow more levels and
- operators, or to change the meaning of identifiers. An example of how
- techniques from other languages can be used in Forth. And this parser
- can be used for people who cannot get the hang of RPN. I wrote it to
- input a bunch of text files written in another language.
-
- Peter Appelman.
-
-
-