home
***
CD-ROM
|
disk
|
FTP
|
other
***
search
/
Club Amiga de Montreal - CAM
/
CAM_CD_1.iso
/
files
/
224b.lha
/
infix.txt
< prev
next >
Wrap
Text File
|
1989-04-08
|
9KB
|
259 lines
More on the infix parser.
=========================
N. Wirth helped me with this one. I was leafing through his book:
Algorithm+DataStructures=Programs one day and found some charting, I
forget the name of it. Anyhow, they seemed neat at the time and give
almost a direct correspondence to code. Something other people using
other types of flowcharts have also noticed. (Notably: Wil Baden in
Charting Forth, 1986 FORML Conference Procedings, a superb paper).
I know some people are probably horrified. Using Wirth for Forth? Well,
I believe that Wirth one day will write (if he didn't already) a
threaded interpreted language.
I also wanted to write a recursive utility that did a little more then
calculate a bunch of numbers in a special sequence. (fibo...? a heck.)
If you don't like it recursive, look through the back issues of Forth
Dimensions for a operator-stack type parser.
What follows are an explanation of the precedence, the charts on which
it is based (I hope they are clear enough) and a small description of
the routines which is in addition to the shadow screens.
Precedence.
-----------
Every infix parser takes precedence of operators into consideration. The
following are defined from the highest to lowest precedence:
operator explanation
( ) nesting
- ~ negate not
<< >> shift left shift right
& ! and or
* / multiply divide
+ - add subtract
These operators work on numbers, which have a 32 bit limit in my system.
Numbers can also start with an $ sign. This indicates an hexadecimal
radix, and for the number following only. In addition to numbers the
system can use identifiers. These identifiers are normal Forth
constants. Forth variables would use the address of that variable,
simply what that identifier leaves on the stack.
Identifiers must start with a _ (underscore) . (period) or an alpha
character. The rest can be alpha characters, numbers or _ and .
The length of the identifier is limited by the underlying Forth, usually
31 characters.
Charts. ( From the top down )
-------
expression
------------> simple-ex >----+--+------>
^ | |
| v v
|
| + -
|
| | |
+--------------------+--+
simple-ex
------------> term >----+--+------>
^ | |
| v v
|
| * /
|
| | |
+--------------------+--+
term
------------> factor >----+--+------>
^ | |
| v v
|
| & !
|
| | |
+--------------------+--+
factor
------------> factor1 >----+--+------>
^ | |
| v v
|
| << >>
|
| | |
+--------------------+--+
factor1
---+-------+-----> factor2 >--------->
| ^
| |
+-> - >-+
| |
+-> ~ >-+
factor2
---+--------> getconstant >---------+---->
| ^
| |
+-> ( >--> expression >---> ) >-+
getconstant
---+--------> getnumber >---------------->
| ^
| |
+-------> identifier >-------+
| |
| |
+--> ' >--> getchar >--> ' >--+
getnumber
---+---------------> digit >-+----------->
| ^ ^ |
| | | |
+--> $ >--+ +-----------+
identifier
---+-> letter >-------+----------------+-->
| ^ ^ | ^ |
| | | | | |
+--> _ >---+ | +--> letter >--+ |
| | | | | |
| | | | | |
+--> . >---+ | +--> digit >--+ |
| |
+------------------+
Routines.
---------
TEST
Small word to test the evaluation process. Uncomment it before loading.
To use it, type from the Forth prompt (e.g.):
5 constant testcon
test 2*8+_testcon
which should leave the answer on the stack. Reverse the order and the
result should be the same, precedence.
EVALUATE (s addr len -- n )
The topmost word of the parser. It returns the value of the
expression, using the string non destructively. It initializes the
pointer and counter to keep track of the current character to be
examined, ie. the symbol.
(EXPRESSION)
The default for expression. This word corresponds to the first graph.
Note that it is indirectly recursive, by calling the deferred word
expression, which is itself.
Also I have omitted stack pictures, in a recursive context it varies
with the input. Doesn't matter as long as the end result is not
variable.
Program flow is almost a direct correspondence to the graph.
SIMPLE-EX
The next graph translates into this word. Also recursive, as the arrow
around the graph implies. The order of operator precedence is bottom
up; this is the second level from the bottom, the characters * and /
are scanned for here. If not present, the routine is exited, with the
result from TERM.
TERM
Recursive routine, looks for the logical & and ! symbols. It is
expected that the expressions presented to this parser are not to
deeply nested, or the stack might overflow? Normally, that will not be
the case. Most calculators allow appr. 10 levels of nesting, and I
think this one will outdo them.
FACTOR
Recursive also. Here I cheated a little, I don't parse the entire >>
and <<. I simply ignore the second character. The second 'readsymbol'
takes care of that.
FACTOR1
The negate and not level. It is not recursive, simply decides on the
character in symbol, what to do with factor2. The graph doesn't have
an arrow back to itself.
FACTOR2
Looks at the next symbol, and if it is a (, bracket that is, will call
expression, and cause indirect recursion, by calling the entire
sequence. Note how the missing ), closing bracket, is handled.
If it is not a nesting, getconstant gets called.
GETCONSTANT
A constant can be a character, a number or an identifier. Each one has
a special starting character, allowing for this parser not to have to
backtrack. Letter? is in this version [ . | _ | a..z | A..Z ]
IDENTIFIER
Places the substring from the expression at (Forth's) 'here', appends a
null character after it (required in my system) and finds it in the
Forth context dictionary. If not found aborts with an error message,
displaying the unknown identifier. If found, it will execute the word.
Therefore if the word is a constant it will return a value on the
stack. Any word returning a value could be used, even one with side
effects.
GETNUMBER
Also places a substring at 'here', but calls 'number' to convert the
string into a value. The number is in hex base if preceded by $.
Again: ' 0 here count + c! ' is particular to my system, most other
F83's will require a blank instead of a zero.
GETCHAR
Read the next symbol and uses it as a character constant. Since Forth
already translates it for me, it is easy. It does expect a closing
single quote and aborts complaining if it is missing.
The first screen of the parser is a set of normal Forth words the
explanation on the shadow screen is explicit enough. The only strange
trick used here is not to point to the beginning of the string, but to
one character past the end. This allows the length of the string to be
decremented to zero, at which time the string is exhausted. This makes
it unnecessary to keep the length in a variable and comparing a counter,
which is incremented, to it. The current symbol is pointed to by
symbol> - insymbol or the end of the string less the counter.
On this screen I have also defined what a digit and what a letter should
be.
This parser can be extended, modified, to allow more levels and
operators, or to change the meaning of identifiers. An example of how
techniques from other languages can be used in Forth. And this parser
can be used for people who cannot get the hang of RPN. I wrote it to
input a bunch of text files written in another language.
Peter Appelman.