home *** CD-ROM | disk | FTP | other *** search
Text File | 1986-11-30 | 43.5 KB | 1,336 lines |
- Newsgroups: mod.sources
- Subject: TRC - expert system building tool (part 2 of 8)
- Approved: jpn@panda.UUCP
-
- Mod.sources: Volume 3, Issue 110
- Submitted by: ihnp4!dicomed!ndsuvax!nckary (Daniel D. Kary)
-
-
-
- This is NOT a shell archive. Simply delete everything up to and including
- the cut mark and save the result as tutorial.doc.
-
- Dan Kary
- ihnp4!dicomed!ndsuvax!nckary
-
- -------------- cut here ---------------
-
-
-
-
-
-
-
-
- An Introduction to TRC
-
-
- Daniel D. Kary
-
- North Dakota State University
- Computer Science Department
- 300 Minard Hall
- Fargo, ND 58102
-
-
- _A_B_S_T_R_A_C_T
-
-
- TRC is a compiler that is useful in building
- expert systems. The input is a rule based system
- whose input is syntactically similar to the input
- to YACC[1]. The output is a set of C language
- procedures.
-
- While not all features of TRC are discussed,
- the major features of the language are presented.
- Example code is used to illustrate the features
- and references to more detailed documentation are
- included. This may be the best starting point for
- first time users.
-
-
-
- _1. _I_N_T_R_O_D_U_C_T_I_O_N
-
- The fundamental notion that virtually the entire field
- of expert systems is built upon is the situation/action rule
- paradigm for problem solving. This paradigm is on the one
- hand the embodyment of simplicity and on the other hand a
- tool that is stunningly powerful.
-
- The situation/action rule paradigm is a way of embody-
- ing both information about a problem and the way the infor-
- mation is applied to solving the problem in a single struc-
- ture. Consider this trivial problem, you have a pile of
- coins and wish to reduce it to the smallest number of coins
- of equal value. One of the rules in a system to solve this
- problem could be:
-
- IF: there are five pennys in the pile
-
- THEN: substitute a nickel for the five pennys.
-
- A situation/action rule has the form of an IF...THEN...
- statement, common to virtually every progamming language.
-
-
-
-
-
-
-
-
-
- - 2 -
-
-
- The IF part is the situation, the rule checks to see if this
- situation exists, if it does the THEN or action part is exe-
- cuted. In addition to the rules there is a pattern matcher,
- which is invoked in the situation part to determine if the
- situation exists. This pattern matcher is typically more
- powerful that what is available in an IF statement in a pro-
- gramming language. The pattern matcher searches a data
- base. The data base contains all the information that is
- specific to this instance of the problem, and in some cases
- information that is germain to all or many instances of the
- problem. In the trivial example given here, the data base
- would contain the pile of coins. Finally there is a stra-
- tegy for deciding which rule to test next. Usually the
- rules are tested in a pre-specified order. When a rule
- whose situation part is found to be true, its action part is
- executed. The strategy for testing rules usually continues
- with the first rule each time any rule fires.
-
- This simple paradigm has been used to build systems
- with expert levels of problem solving ability in areas as
- diverse as elucidating the structure of hydrocarbons[2] to
- diagnosing blood diseases[3] or pulmonary function
- disorder[4]. In each case a system of rules is built up,
- each rule embodying a 'rule of thumb' or a piece of 'common
- wisdom' or 'accepted practice' specific to the problem
- domain and used by a human expert in solving the problem.
- The system of rules is referred to as a 'rule based system'
- or an 'expert system'. The expert system attempts to solve
- a problem by emulating the problem solving behavior of a
- human expert. Expert systems are often easy to modify or
- extend. Just as a human expert gains expertise, rules
- representing new knowledge can be added to an expert system.
-
- Since the situation/action rule is basically an
- IF...THEN... construct, why is a special language needed?
- Writing these rules in a traditional programming language
- can be tedious, a single situation part may require multiple
- conditional tests. In a system with a large number of
- rules, the structure of the system may be difficult to see
- and difficult to modify because the many details of the pro-
- gramming language tend to hide the structure. Writing the
- code that maintains and searches the data base that is
- referred to in the situation part is tedious and repiti-
- tious, making it an ideal subject for automation.
-
- The rest of this tutorial is devoted to describing TRC,
- a tool for building expert systems. The next section, sec-
- tion 2, describes the overall format of the input to TRC.
- Section 3 presents a sample set of rules to give an initial
- overview of how TRC works. The remaining sections present
- semi-formal descriptions of the syntax of TRC.
-
- This document does not contain all the information
- needed by advanced users. _T_h_e _T_R_C _L_a_n_g_u_a_g_e _R_e_f_e_r_e_n_c_e
-
-
-
-
-
-
-
-
-
- - 3 -
-
-
- _M_a_n_u_a_l[_5] contains a complete formal description of the
- features of TRC.
-
- _2. _B_A_S_I_C _S_P_E_C_I_F_I_C_A_T_I_O_N_S
-
- Every TRC program consists of five sections, the
- header, definitions, short term memory (data base), long
- term memory (rules) and the trailer. These sections are
- separated by double percent "%%" marks as in YACC specifica-
- tions. The form of a full specification is illustrated in
- figure 1. The header, short term memory and trailer are
- optional so the minimum specification would contain only the
- definitions and long term memory. All of the %% marks must
- be present in each specification file.
-
-
- header
- %%
- definitions
- %%
- short term memory
- %%
- long term memory
- %%
- trailer
-
- Figure 1: TRC Specification
-
-
- The purpose of the header and trailer sections is to
- permit the inclusion of C language code in the program, much
- as in YACC. One of the common features of compiled pro-
- cedural languages is that data objects and data types used
- in the program must be declared or defined before they are
- used. This is also true for TRC. While TRC is not a pro-
- cedural language, declaring objects before using them sim-
- plifys the process of translating to C which is a procedural
- language.
-
- The remaining components of the TRC grammar are tradi-
- tional components of expert systems. The short term memory
- section, herinafter abbreviated STM, is sometimes called the
- data base. It contains the data that is searched in the
- situation part of a rule. Expert systems are usually
- modeled after the problem solving behavior of a single
- expert or a group of experts in solving a single problem or
- class of problems. The information specific to the current
- instance of the problem is usually gathered by the expert,
- manipulated in the solving of the problem and then forgot-
- ten. The way this information is remembered and then for-
- gotten in the human brain and the area of the human brain
- where it is stored is called short term memory by psycholo-
- gists. Using that same name here refers to the similarity
- of purpose.
-
-
-
-
-
-
-
-
-
- - 4 -
-
-
- The long term memory, herinafter referred to as LTM,
- contains the rules and again is a reference to human brain
- processes. This name is a reference to the part of the
- human brain that remembers things for a long time. The
- expert usually has a set of formal procedures and informal
- rules of thumb that are used in solving the problem. The
- data changes with each instance of the problem, but the
- rules for solving it remain the same. Human experts have
- the ability to gather more expertise, to learn more about
- the problem. The experts learning behavior is imitated by
- adding new rules or modifying existing rules in LTM.
-
- _3. _W_R_I_T_I_N_G _R_U_L_E_S
-
- The coin problem mentioned in the introduction will be
- used to illustrate the syntax of writing a rule. This
- presentation is made only as an illustration. A complete
- description of the syntax of a rule will be given in section
- [?]. A set of four rules that will reduce the pile of coins
- will be given. The rule mentioned in the introduction
- searched the data base for five pennys and replaced them
- with a nickel. To express this as a TRC rule, write:
-
- R1:
- 5 PENNY
- =>
- MARK 5 PENNY
- ADD NICKEL
- ;
-
-
- All rules begin with a label, in this case 'R1:'. A
- label is a token followed by a colon, and a token is a
- string of characters (upper case or lower case), digits
- and/or the underscore character. The first character of a
- token must be a character. Labels are used to refer to the
- rule by name in optimizing and user specified control.
- These issues are discussed in section 6.
-
- The label is followed by the situation part, which
- specifies what to search for in the data base (STM). In
- this case we are specifying a search for five items of type
- PENNY. The arrow symbol ( => ) marks the end of the situa-
- tion part and the beginning of the action part. If the
- situation part evaluates to true (there are 5 pennies in
- STM) the action part will be executed. There are two state-
- ments in the action part of this rule. The statement 'MARK 5
- PENNY' specifys that the 5 pennies that were found in STM
- should now be removed. The statement 'ADD NICKEL' specifys
- that one more nickel should be added to STM (the pile of
- coins). So this rule solves a small part of the problem by
- performing a simple transaction.
-
- The remaining three rules needed to reduce the pile of
-
-
-
-
-
-
-
-
-
- - 5 -
-
-
- coins to the minimum number of equal value (assuming only
- pennies, nickels, dimes and quarters are used) are listed in
- figure 2.
-
- R2:
- 2 NICKEL
- =>
- MARK 2 NICKEL
- ADD DIME
- ;
-
- R3:
- 2 DIME
- NICKEL
- =>
- MARK 2 DIME
- NICKEL
- ADD QUARTER
- ;
-
- R4:
- 3 DIME
- =>
- MARK 3 DIME
- ADD QUARTER
- NICKEL
- ;
-
- Figure 2: Coin Rules
-
-
-
- This simple set of rules illustrates both the indepen-
- dence of each rule and the interaction of the rules. Each
- rule describes a transaction that will reduce the number of
- coins in the pile without changing the total value of all
- the coins in the pile. Suppose the pile of coins consists
- of three dimes and one nickel. Initially R4 is the only
- rule whose situation part is true. After the action part of
- R4 is executed, R2 becomes true by virtue of the fact that
- R4 added a nickel. After R2 is executed the pile is reduced
- to a quarter and a dime, the minimum number of coins to make
- thirty-five cents.
-
- As was mentioned earlier, each of these rules is basi-
- cally an IF...THEN.. construct. The meaning and purpose of
- each of these transactions is quite evident when expressed
- as a situation/action rule. The same may not be true of an
- equivalent program written in a procedural language that
- included IF...THEN.. statements, procedure calls for search-
- ing the data base (STM) and procedure calls to remove coins
- from the data base or add coins to the data base.
-
- If we want to include other coins, a half dollar or
-
-
-
-
-
-
-
-
-
- - 6 -
-
-
- dollar coin, we can easily add another rule or two. Unusual
- coins, perhaps a twenty cent piece and a thirty-five cent
- piece, may force us to rewrite a previous rule or reorder
- the rules. These changes are easily acomplished in a rule
- based language like TRC, they may not be so easily accom-
- plished in a procedural language.
-
- All upper case letters were used for all the tokens, or
- words, in this set of rules. All reserved words in TRC are
- all upper case. MARK and ADD are reserved words that can be
- used in the action part. The rule labels and the names of
- the things that are being searched for in STM (PENNY, DIME
- etc.) were also expressed in upper case. This is a sug-
- gested convention. Either upper or lower or both may be
- used. Later we will see how C language code can be embedded
- in both the situation and action parts. Most of the C
- language is written in lower case, writing TRC in upper case
- will make it easier to distinguish the two.
-
- _4. _D_A_T_A _D_E_F_I_N_I_T_I_O_N
-
- Now that the basic idea of a rule based system has been
- presented a more formal look at the syntax of TRC is in
- order. TRC rules will request that a data base be searched,
- or that things be added to or removed from the data base.
- The code for searching the data base or adding new things to
- the data base or removing things from the data base is gen-
- erated by TRC. In order for TRC to generate this code it
- must know what kinds of things are going to be in the data
- base (STM).
-
- Suppose an expert system that dealt with the real value
- of coins, rather than just their face value was needed.
- Information about each coin might include not only it's dom-
- ination, but also the year and site of it's minting, it's
- condition and it's numismatic value. The types of things
- that are referred to in the rules (coins, in this case) will
- be called objects. The attributes or values that are asso-
- ciated with each object will be referred to as 'elements'.
- All objects (and their elements) that will be referred to in
- the rules must first be defined in the definition section of
- the code.
-
- The definition section of the code consists of a list-
- ing of the definitions of each of the object types. A YACC
- grammar for a single definition is given in figure 3.
- Strong type checking is enforced by the compiler, the type
- (INT, FLOAT or STRING) of items to be compared must be
- identical. Each definition which contains an item_list
- results in the declaration of a structure in the output
- code. STM consists of lists for each of the types declared
- and a count of the number of items in each list. For
- objects which contain no elements, TRC generates no list and
- no searching code, checking only the count.
-
-
-
-
-
-
-
-
-
- - 7 -
-
-
-
- definition : TOKEN
- | TOKEN '(' item_list ')'
- ;
-
- item_list : item_list item
- ;
-
- item : TOKEN ':' type
- ;
-
- type : INT
- | FLOAT
- | STRING
- | POINTER
- ;
-
- Figure 3: YACC Grammar for a Definition
-
-
- The purpose of the POINTER type is to generate a
- pointer to a structure of the type of the list. This is a
- 'hook' that permits building arbitrary structures in user
- code. There is no direct support for this type.
-
- Though it is not necessary in any sense, it may be a
- good idea to use all upper case character for object and
- element definitions. This will make these items much more
- visible in the code output by TRC should it become necessary
- to review that code. Some correct definitions include:
-
- A (A1 : STRING
- A2 : INT)
- B (B1 : FLOAT B2 : INT)
- C
-
- These definitions create three classes of objects.
- Objects in class A contain a string and an integer, those in
- B contain a double precision floating point value and an
- integer and those in object class C contain no elements.
-
- _5. _S_H_O_R_T _T_E_R_M _M_E_M_O_R_Y
-
- The purpose of the STM section of the code is to permit
- the initial contents of STM to be specified. TRC will pro-
- duce a single procedure, _i_n_i_t, which adds the listed objects
- to STM. Objects are added to STM by insertion at the head
- of the list. The objects listed in the STM section of the
- code are inserted in the opposite order that they are listed
- so the final result is that STM is initialized just as
- listed.
-
- This section of the code is intended to serve two pur-
- poses. When an expert system is being developed it provides
-
-
-
-
-
-
-
-
-
- - 8 -
-
-
- a way to place data in STM without having to write input
- routines. After the rules are written and debugged and a
- separate input routine has been written, this section can be
- used to specify an initial condition that may be needed for
- every instance of the problem. Figure 5 gives a YACC gram-
- mar for a single entry in STM.
-
- entry : count TOKEN
- | count TOKEN '(' init_list ')'
- ;
-
- count : /* empty */
- | INTEGER
- ;
-
- init_list : /* empty */
- | init_list init_item
- ;
-
- init_item : TOKEN ARROW INTEGER
- | TOKEN ARROW DOUBLE
- | TOKEN ARROW STR
- ;
-
- Figure 5: YACC Grammar for a STM Entry
-
-
-
- The objects to be added are just listed, along with the
- initial value of any elements the object may have. String
- elements that are not explicitly initialized are set equal
- to the null string, numeric elements that are not explicitly
- initialized are set equal to zero. Suppose the following
- objects were declared in a definition part:
-
- A (A1 : STRING A2 : INT)
- B (B1 : FLOAT B2 : INT)
- C
-
- Some correct entries would include:
-
- 10 A (A2 => 9)
- B (B1 => 1.1)
- 2 B (B1 => 2.2
- B2 => 6)
- C
- A
-
- Some incorrect entries are:
-
- 10 (A2 => 9) /* the object name is missing */
- B (B1 => 9) /* FLOAT literals MUST contain
- a decimal point */
- C (C2 => 1) /* object C does not include
-
-
-
-
-
-
-
-
-
- - 9 -
-
-
- element C2 */
-
- _6. _L_O_N_G _T_E_R_M _M_E_M_O_R_Y
-
- LTM is the section where the rules are enumerated. TRC
- generates a loop which tests the situation part of each rule
- in the order they are listed. When a rule is found who's
- situation part is true, that rule's action part is executed.
- Testing will normally resume at the beginning of the list of
- rules. The grammar, or syntax, for a single rule will now
- be presented. Code examples will illustrate the syntax.
-
- LTM may begin with optional switches to turn on trac-
- ing, profiling or backtracking. These options, which may
- also be turned on with command line options, are discussed
- in section 8. Following the option part is a listing of the
- rules. Figure 5 gives a YACC grammar for a single rule.
-
- Each rule begins with a label. This label is copied
- unmodified to the output source code and is used as a label
- in the main loop. It will aid the readability of the output
- if labels are all upper case and as descriptive as possible.
- Following the label is the left hand side or situation part
- of the rule. This part of the rule is where the search
- strategy is specified and the items to search for are
- enumerated.
-
- There are two search strategies, linear and recursive.
- The linear strategy is the default. In the linear search
- strategy each match causes a linear search from the begin-
- ning of STM. The first object that matches the test is
- marked as in use. Subsequent searches will ignore the pre-
- viously marked item. As soon as any single match fails the
- entire rule fails. Consider the following example:
-
- A (A.A1 == "TEST")
-
- This specification requests that two objects be
- searched for, one where the elements of the object A can
- have any value and one where element A1 of object A is the
- string "TEST". The code generated by TRC for this rule will
- first check that the list of A objects has at least two
- objects, little sense in searching a list when it is known
- that the search will fail. If the list has at least two
- objects, the first object will match the first A, since no
- values were specified for the elements any object will
- match. The list of A objects is then searched for one in
- which element A1 is equal to the string "TEST". The success
- or failure of the rule depends on the success or failure of
- this search.
-
- Clearly is is possible that only one object in list A
- had element A1 equal to "TEST". It is also possible that the
- object whose element A1 was equal to "TEST" was the first
-
-
-
-
-
-
-
-
-
- - 10 -
-
-
-
- production : label lhs '=>' rhs ';'
- ;
- label : TOKEN ':'
- ;
- lhs : /* empty */
- | lhs match
- ;
- match : RECURS
- | count TOKEN
- | NOT TOKEN
- | count '(' free_variable match_list ')'
- ;
- free_variable : /* empty */
- | HAT TOKEN TOKEN
- ;
- match_list : /* empty */
- | match_list match_element
- ;
- match_element : TOKEN '.' TOKEN relop INTEGER
- | TOKEN '.' TOKEN relop DOUBLE
- | TOKEN '.' TOKEN relop STR
- | TOKEN '.' TOKEN relop TOKEN '.' TOKEN
- ;
- rhs : optional_part pass_part
- ;
- optional_part : /* empty */
- | optional_part option
- ;
- option : MARK mark_list
- | ADD add_list
- | OPTIMIZE TOKEN
- ;
- mark_list : mark_item
- | mark_list mark_item
- ;
- mark_item : count TOKEN
- ;
- add_list : entry
- | add_list entry
- ;
- pass_part : /* empty */
- | C_COCE
- ;
- relop : "=="
- | "!="
- | "<="
- | '<'
- | ">="
- | '>'
- ;
-
- Figure 5: YACC Grammar for a Rule
-
-
-
-
-
-
-
-
-
-
- - 11 -
-
-
- object in the list. If that were in fact the case then the
- rule would fail even though it could have been true with a
- different search strategy. In this simple case the problem
- could be avoided by simply reordering the situation part so
- that the object with an element A1 equal to "TEST" was
- searched for first.
-
- Not all examples are this simple, and that is the rea-
- son for the recursive search strategy. In the recursive
- strategy if a given test fails, the previous test is undone
- and redone from the point where the selection was made.
- This process continues until a match is found or it is found
- that a match is not possible. The recursive search is a
- more powerful pattern matching tool, but it is much more
- expensive in execution time. The search time for the linear
- search is order N while for the recursive strategy it is
- order N squared. For large N this is a very substantial
- difference.
-
- To select a recursive search, the reserved word RECURS
- is included in the situation part. The clearest code will
- result if RECURS immediately follows the rule's label. If a
- rule is declared RECURS, the recursive search will apply to
- all objects in the situation part. There is no way to
- search recursively for some objects and linearly for others
- in a single rule. The scope of the RECURS declaration is
- one rule. Many expert system development tools use only the
- powerful but time consuming recursive search technique.
- Making this facility optional enables the user to exercise
- some control on the search time. It is also possible that
- the order that the objects occur in the list is important,
- in this case the linear search would be required. TRC
- always inserts new objects at the front of the list and
- never reorders a list or drops an element from a list,
- unless specifically directed to.
-
- It is sometimes necessary to compare an element of one
- object with an element of some previously found object,
- rather than to some literal value. To do this a name for
- the previously found object is needed. A name that is
- assigned to an object is referred to as a free_variable.
- The scope of a free_variable is the current rule. Using the
- previous definitions some examples are:
-
- (^A FOO)
- (A.A1 == FOO.A1)
- (^A BAR
- A.A1 != "TEST")
- (B.B2 != BAR.A2)
-
- The first line in this example picks the first object
- of the A list and assigns the free_variable FOO to that
- object. In the second line the A list is searched for an
- object whose element A1 is equal to the element A1 found in
-
-
-
-
-
-
-
-
-
- - 12 -
-
-
- the first line. The third line searches the A list for an
- object whose element A1 is not equal to "TEST" and assigns
- the free_variable BAR to this object. The final line
- searches the B list for an object with an element B2 not
- equal to the element A2 found in the previous search.
- Notice what is happening here, elements from different lists
- are being compared. This comparison is permitted because
- both elements are integers, so the types of the elements
- match. In complex matches like this it is frequently neces-
- sary to use the recursive search.
-
- A new definition is needed to consider yet another
- case:
-
- C (C1 : INT C2 : INT)
-
- The final case to be considered is the case where two
- elements of the same object are compared. There are two
- equivalent ways to specify this:
-
- (^C FOO
- C.C1 == FOO.C2)
- (C.C1 == C.C2)
-
- TRC will generate identical code for either of these
- statements. In each line a specification is made that the C
- list be searched for an object where elements C1 and C2 are
- equal. There is a subtle but important difference between
- these similar examples and all previous examples. In all
- previous cases the right hand side of the relational expres-
- sion evaluated to an absolute value before the search began.
- In this example the absolute value of the right hand side of
- the relation changes with every object that is tested.
- There is a small code overhead for this type of testing,
- which is noticeable only if used on many different elements
- of many different types of objects.
-
- Finally the form NOT TOKEN, where TOKEN is some object
- is an explicit test for an empty list. The case of search-
- ing a list for the absence of some element is discussed
- later.
-
- The situation and action parts are separated by the
- ARROW symbol, "=>". The action part can contain MARK, ADD
- and OPTIMIZE statements. The MARK statement lists the
- number and type of objects to remove from STM. The only
- items that may be removed from STM by a MARK statement are
- objects that were enumerated in the situation part. This
- restriction is necessary because those objects are the only
- ones that definitely are in STM.
-
- The ADD statement lists objects that are to be added to
- STM. Objects are inserted at the head of their respective
- lists in the order they are listed in the action part.
-
-
-
-
-
-
-
-
-
- - 13 -
-
-
- Objects are always ADDed to STM before objects are removed,
- regardless of the order of ADD and MARK statements. This is
- necessary because ADD statements can refer to elements that
- are about to be removed. Assume the previous definitions of
- A and B for this example rule:
-
- RULE:
- (A.A1 == "BAR")
- (^A FOO
- A.A1 == "FOO")
- =>
- MARK 2 A
- ADD B (B1 => 3.14159
- B2 => FOO.A2)
- {
- printf("RULE is firing\n");
- }
- ;
-
- TRC will generate code that will search the A list
- first for an object with element A1 equal to "BAR" and then
- for an object with element A1 equal to "FOO". If both of
- these searches succeed the action part will execute. The
- MARK statement specifies that both A objects are to be
- removed from STM. This could also have been specified:
-
- MARK A A
- or
- MARK A
- MARK A
-
- The MARK statement causes objects to be removed in the
- order they were found. It is possible for a situation to
- exist where it is not desirable to remove the objects in the
- order they were found. In the example above it may be
- desirable to remove the second type A object, but not the
- first. Objects may be MARKed based on their free_variable
- name. The following statement will cause only the second
- type A object from the sample rule to be MARKed:
-
- MARK FOO
-
- The example ADD statement adds an object of type B to
- list B copying a value out of list A. The code generated by
- TRC will do the ADD first then the MARK since the ADD state-
- ment refers to a MARKed element.
-
- The code section is executed after the ADD and MARK
- code and simply prints a message. It is included here to
- demonstrate what a code section looks like. Information on
- techniques used to refer to objects in the code section is
- presented in _T_h_e _T_R_C _L_a_n_g_u_a_g_e _R_e_f_e_r_e_n_c_e _M_a_n_u_a_l[_5]. The
- final semicolon is included in the TRC syntax to give the
- parser a point to sync on in case of syntax errors in the
-
-
-
-
-
-
-
-
-
- - 14 -
-
-
- source.
-
- The OPTIMIZE statement is used to tell TRC that after
- the current rule executes it is not necessary to search the
- list of rules from the very beginning of LTM, rather the
- search can begin with the named rule. The naming of the
- OPTIMIZE statement refers to its primary, but not neces-
- sarily only purpose. The OPTIMIZE statement can be used to
- impose a control structure on LTM. For convenience the
- label "Start" alway precedes the first rule and the label
- "End" always follows the last rule.
-
- The TRC grammar does not include a way to specify a
- search for the absence of some element. This can be accom-
- plished using the OPTIMIZE statement and a side effect of
- the search strategy. The LTM section in Figure 7 demon-
- strates this possibility.
-
- RULE1:
- (A.A1 == "FOO")
- =>
- OPTIMIZE RULE3
- ;
- RULE2:
- =>
- {
- /* do your thing here */
- }
- ;
- RULE3:
- /* system continues here */
-
- Figure 7: Testing for the Absence of a Pattern
-
-
-
- Figure 7 illustrates a technique for testing for the
- absence of some element. RULE1 tests for the presence of
- the element and uses the OPTIMIZE statement to branch around
- RULE2 if it is found. The situation part of RULE2 is empty.
- An empty situation part always evaluates to true. RULE2
- will always fire when RULE1 fails and never be tested when
- RULE1 fires. The combination of RULE1 and RULE2 is a rule
- that tests for the absence of an element.
-
- _7. _H_E_A_D_E_R _a_n_d _T_R_A_I_L_E_R
-
- The header and trailer are lexically identical and
- serve similar purposes; the inclusion of C code that is not
- related to a specific rule but is of a more global nature.
- The syntax is identical for the header, trailer and code
- section of long term memory. The code section must begin
- with an open brace '{' and end with a closed brace '}'. A
- code section is recognized by the input scanner using a very
-
-
-
-
-
-
-
-
-
- - 15 -
-
-
- trivial algorithm; when an open brace is encountered a code
- section begins. The 'brace count' is set to one and each
- time an open brace is encountered the 'brace count' is
- incremented. Each time a closed brace is encountered the
- 'brace count' is decremented. When the 'brace count' is
- zero the code section is presumed to have ended. All text
- in the code section, except the initial open brace and final
- closed brace, is passed through untouched.
-
- This simple algorithm avoids the potential problem of
- having to parse the C language within TRC, but it is very
- easy to defeat. If the number of braces in a code section
- is not balanced, the end of the section will not be deter-
- mined correctly - this includes braces embedded in comments.
- A single missing brace may cause the entire compilation to
- be aborted. Worse yet would be two complimentary missing
- braces in separate sections of the specification. Very
- large pieces of the specification may be passed. These
- problems are common in programming languages and not diffi-
- cult to avoid. Sample valid headers and trailers are illus-
- trated in figure 6.
-
-
- {
- /* this is a sample valid header section */
-
- struct mystruct{
- int a, b, c;
- struct mystruct *next;
- };
- }
- %%
- definitions
- %%
- short term memory
- %%
- long term memory
- %%
- {
- /* this is a sample valid trailer section */
- myprocedure()
- {
- /* do my thing in here */
- }
- }
-
- Figure 6: Sample Header and Trailer
-
-
- On the somewhat related subject of comments, C style
- comments may be included anywhere in the specification that
- a space may occur. Comments that are not part of a code
- section are recognized by the scanner and are discarded.
- Nested comments outside of the code section are not
-
-
-
-
-
-
-
-
-
- - 16 -
-
-
- supported, comments occuring in a code section are passed
- through.
-
- The code in the header section is written on the output
- file _h_e_a_d_e_r._h. This file is included in all output files.
- The header section should be used to declare structures and
- variables which may be used in the action part of a rule.
- Since this code will be included in several files it should
- not contain initialized data or procedures, which would
- cause duplicate definition errors at compile time.
-
- The code in the trailer section is written on the out-
- put file _l_o_o_p._c. This is the code file which contains the
- main loop, and includes the definitions of all the struc-
- tures and global variables manipulated by the inference
- engine. Including procedures in the trailer is a convenient
- way of gaining visibility of those objects.
-
-
- _8. _O_P_T_I_O_N_S
-
- The option section occurs at the beginning of LTM. The
- option section may be empty, but any options must precede
- the first rule. Options may also be specified with command
- line flags. There are several options; TRACE, PROFILE,
- BACKTRACK, DUMP, RECURS, ZERO and SAVE.
-
- The TRACE option causes a runtime trace to be created.
- The primary purpose of this trace is to facilitate generat-
- ing explanations. People seldom take the advice of an
- expert without a satisfactory explanation of why the advice
- should be followed. It may not be reasonable to expect peo-
- ple to take advice of an expert system that can not explain
- itself. The trace is a list of rules that were fired, or
- inferences that were drawn. Having the trace facilitates
- explanations of the type, "I found that a, b and c were true
- and therefore concluded that d should be pursued". The gen-
- eration of explanations is left to user code, only the trace
- is provided.
-
- Turning the TRACE option causes the procedure
- _a_p_p_e_n_d__t_r_a_c_e and the structure _t_r_a_c_e to be generated. Each
- time a rule is fired the procedure append_trace is called
- with the number of the rule that fired. This number is
- appended to the list of rules. The list structure is
- defined:
-
- struct trace{
- int rule;
- struct trace *next;
- } *trace_front, *trace_back;
-
- The pointer _t_r_a_c_e__f_r_o_n_t points to the head of the list
- and _t_r_a_c_e__b_a_c_k points to the last item in the list. Rules
-
-
-
-
-
-
-
-
-
- - 17 -
-
-
- are numbered beginning with 1 from the start of LTM. if the
- label of the rule would be more convenient it can be
- obtained from the rule_names array, e.g. the label of rule
- one is:
-
- rule_names[1]
-
- The PROFILE option generates two arrays in which counts
- of the number of times each rule executed and each match was
- searched are stored. The procedure _p_r_i_n_t__p_r_o_f_i_l_e will print
- a summary of the execution of the inference engine.
-
- The BACKTRACK option generates a structure and pro-
- cedures needed to implement backtracking. These structures
- and procedures are detailed in _T_h_e _T_R_C _R_e_f_e_r_e_n_c_e _M_a_n_u_a_l[_5].
- Backtracking is a technique for searching a problem space.
- When a dead end is reached, the last decision is undone and
- the search continues. In the inference engine generated by
- TRC one backtracking step is taken each time all the rules
- in LTM fail. When backtracking is enabled, objects marked
- for deletion from STM are saved in a back track structure
- along with their former position. The number of items added
- by the rule is saved in the same structure. To undo a rule
- the formerly added objects are deleted and the formerly
- deleted objects are restored to their original position in
- the STM.
-
- The backtrack procedures assume that the ADD and MARK
- statements are the only way that STM is modified. If user
- code modifies STM the backtrack save and restore procedures
- will have to be modified to be cognizant of user code
- changes to STM.
-
- In a system with backtracking it is essential that some
- rule recognizes when the problem is solved and returns to
- the calling procedure. If no rule does this, the system
- will perform every manipulation on STM that it can in every
- order that it can and finally will return with STM fully
- restored to it's original state. Thus vast resources can be
- consumed to to obtain the same results that not calling the
- inference engine at all would produce.
-
- The DUMP option causes code to print the contents of
- STM on the standard output to be generated. The procedure
- _d_u_m_p__s_t_m will print out the contents of each list of objects
- in STM on the standard output. There is also a procedure
- _d_u_m_p_%_s__s_t_r_u_c_t generated for each object defined, where "%s"
- is replaced by the object name. Calls to generate dumps of
- the entire STM or specific lists may be embedded in the
- C_Code part. TRC itself never generates calls to the dump
- procedures.
-
- The RECURS option is the only option that may be placed
- in the option part of LTM or in the situation part of a
-
-
-
-
-
-
-
-
-
- - 18 -
-
-
- rule. If RECURS is used in the option part all rules will
- default to the recursive search strategy. This option can
- be turned off for a given rule by including NORECURS in the
- situation part of the rule.
-
- The ZERO option will generate a single procedure, _z_e_r_o.
- This procedure will free all the structures that are dynami-
- cally allocated by the TRC generated code. The structures
- that are allocated dynamically include STM, the backtracking
- stack (if backtracking is enabled), the profiling arrays (if
- profiling is enabled) and the trace list (if tracing is
- enabled). TRC will generate code for any combination of
- options. This is useful in situations where the expert sys-
- tem is called more than once. The zero procedure will clean
- up anything left by a previous invocation.
-
- The SAVE option will generate procedures to write all
- dynamically allocated structures to a file and procedures to
- restore those structures from a previously written file.
- This option makes it easy to write expert systems which
- checkpoint their own execution. It is then possible to res-
- tart execution in the case of a crash without having to redo
- all the work that has already been done. It is necessary to
- save all dynamically allocated structures including STM, the
- backtracking stack, profiling arrays and the trace list.
- Separate procedures are generated for saving and reloading
- each of these structures.
-
- _9. _E_N_V_I_R_O_N_M_E_N_T
-
- TRC is a compiler that translates a rule based system
- to a set of C language procedures. It is useful in develop-
- ing expert systems. TRC produces only an inference engine
- and supporting structures, input and output processing must
- be added with additional code, presumably in C. The minimum
- external code is a main procedure that will initialize STM
- and call the inference engine. Figure 8 is a minimal main
- procedure that includes examples of calls to procedures to
- dump STM, print the trace list and print the execution pro-
- file. This assumes that the DUMP, TRACE and PROFILE options
- were turned on.
-
- The output of TRC is written on several files in the
- current directory. The file names generated are; add.c,
- dump.c, free.c, loop.c, misc.c, search.c, relink.c,
- backtrack.c, profile.c, zero.c and save.c. A sample
- Makefile is given in Figure 8. The reference to main.c
- refers to user supplied code.
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
- - 19 -
-
-
-
- #include <stdio.h>
- #include "loop.h"
- extern char *rule_names[];
-
- main()
- {
- struct trace *temp;
-
- /* initialize STM */
- init();
-
- printf("Initial STM");
- dump_stm();
-
- /* call the inference engine */
- loop();
-
- printf("Final STM");
- dump_stm();
-
- /* dump the contents of the trace structure */
- temp = trace_front;
- while(temp){
- printf("%s",rule_names[temp->rule]);
- temp = temp->next;
- }
-
- print_profile();
- }
-
- Figure 7: Sample Main Procedure
-
-
- # Makefile for expert systems generated by TRC
-
- PROG = loop
- OBJS = add.o backtrack.o dump.o free.o loop.o \
- misc.o profile.o relink.o save.o \
- search.o zero.o main.o
- INCS = loop.h
- CC = cc
-
- all: $(PROG)
- $(CC) -c -O $<
-
- $(OBJS): $(INCS)
-
- $(PROG): $(OBJS)
- $(CC) -o $@ $(OBJS) $(LIBS)
-
- Figure 8: Sample Makefile
-
-
-
-
-
-
-
-
-
-
-
- - 20 -
-
-
- BIBLIOGRAPHY
-
-
- 1. Johnson, S. C. (1975), "YACC: Yet Another Compiler Com-
- piler", Computer Science Technical Report No. 32, Bell
- Laboratories, Murray Hill, NJ.
-
- 2. Lindsay, Robert, et.al (1980), _A_p_p_l_i_c_a_t_i_o_n_s _o_f _A_r_t_i_f_i_c_i_a_l
- _I_n_t_e_l_l_i_g_e_n_c_e _f_o_r _C_h_e_m_i_c_a_l _I_n_f_e_r_e_n_c_e: _T_h_e _D_E_N_D_R_A_L _P_r_o_j_e_c_t,
- McGraw Hill, New York.
-
- 3. Davis, R., B.G. Buchanan and E.H. Shortliffe (1977),
- "Production Rules as a Representation for a Knowledge-Based
- Consultation Program", Artificial Intelligence, Volume 8,
- Issue 1, February 1977.
-
- 4. Feigenbaum, E.A., (1978), "The Art of Artificial Intelli-
- gence: Themes and case studies of knowledge engineering",
- IJCAI.
-
- 5. Kary, Daniel D. (1985), "The TRC Reference Manual", North
- Dakota State University, Division of Mathematical Sciences,
- Fargo, ND.
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-