home *** CD-ROM | disk | FTP | other *** search
Text File | 1986-11-30 | 45.1 KB | 1,466 lines |
- Newsgroups: mod.sources
- Subject: TRC - expert system building tool (part 3 of 8)
- Approved: jpn@panda.UUCP
-
- Mod.sources: Volume 3, Issue 111
- Submitted by: ihnp4!dicomed!ndsuvax!nckary (Daniel D. Kary)
-
- This is NOT a shell archive. Simply delete everything up to and including
- the cut mark and save the result as reference.1.doc.
-
- Dan Kary
- ihnp4!dicomed!ndsuvax!nckary
-
- -------------- cut here ---------------
-
-
-
-
-
-
-
-
- The TRC Reference Manual
-
-
- Daniel D. Kary
-
- North Dakota State University
- Computer Science Department
- 300 Minard Hall
- Fargo, ND 58102
-
-
- _A_B_S_T_R_A_C_T
-
-
- The syntax of TRC is formally defined. The
- output of TRC is elucidated.
-
-
-
- TABLE OF CONTENTS
-
- PART ONE - INPUT
- 1. INTRODUCTION
- 2. OVERVIEW
- 3. LEXICAL ELEMENTS
- 4. DEFINITIONS
- 5. SHORT TERM MEMORY
- 6. LONG TERM MEMORY
- 7. OPTIMIZER
-
- PART TWO - OUTPUT
- 8. OVERVIEW
- 9. COMMON PROCEDURES
- 10. DATA OBJECTS
- 11. MANIPULATING THE DATA
- 12. TRANSLATING RULES
- 13. OPTIONS
-
- APPENDICES
- A. TRC GRAMMAR
- B. ERROR MESSAGES
- C. STYLE NOTES
- D. SAMPLE PROGRAM
-
-
- _1. _I_N_T_R_O_D_U_C_T_I_O_N
-
- TRC is a programming language that is useful for build-
- ing expert systems. It is presumed that the reader is fami-
- liar with expert systems in general and has used at least
- one expert system building tool. Some terms that are widely
-
-
-
-
-
-
-
-
-
- - 2 -
-
-
- used in describing expert systems have specific meanings
- when used to describe TRC and will be defined now.
-
- The set of situation-action rules that embody the
- knowledge an expert uses to solve a problem are referred to
- as Long Term Memory (LTM). The information that may vary
- with each instance of the problem is referred to as Short
- Term Memory (STM). The code which determines if the situa-
- tion part of a rule is true will be called a pattern matcher
- or a matcher. The code which determines which rule to
- activate will be called an inference engine and includes
- both the matcher and the LTM. The input to the TRC compiler
- is called a specification.
-
- The input to the TRC compiler is a rule based language.
- The output is a set of C language files. The procedures in
- the C language files output by the TRC compiler collectively
- implement the inference engine. An inference engine is to
- an expert system as a parser is to a compiler: it is of cen-
- tral importance but it does not comprise a complete imple-
- mentation. TRC does not provide code for interaction with
- the user, but does permit the programmer to easily add this
- code.
-
- This document is divided into two parts and a set of
- appendices. The first part presents a formal definition of
- the input language with examples of each language feature.
- The second part describes the output of the TRC compiler and
- includes some important insight on integrating TRC generated
- code with other C language code. The appendices include the
- complete TRC grammar, a listing and explanation of all the
- error messages that TRC might produce and a sample specifi-
- cation.
-
-
- PART ONE - INPUT
-
-
- _2. _O_V_E_R_V_I_E_W
-
- Every specification file consists of five sections, the
- header, definitions, short term memory (data base), long
- term memory (rules) and the trailer. These sections are
- separated by double percent characters. The form of a full
- specification is illustrated below. The header, STM and
- trailer are optional so the minimum specification would con-
- tain only the definitions and LTM. All of the "%%" marks
- must be present in each specification file.
-
-
- header
- %%
- definitions
- %%
-
-
-
-
-
-
-
-
-
- - 3 -
-
-
- STM
- %%
- LTM
- %%
- trailer
-
-
- The purpose of the header and trailer sections is to
- permit the inclusion of C language code in the specifica-
- tion. The header and trailer are each composed of a single
- lexical element called a c-code which is defined in section
- 3. Separate sections are devoted to each of the remaining
- parts of a TRC specification.
-
- _3. _L_E_X_I_C_A_L _E_L_E_M_E_N_T_S
-
- A program consists of a single file. A file is a
- sequence of lexical elements composed of characters. Char-
- acters may be one of these classes; (1) upper-case-letters,
- (2) lower-case-letters, (3) digits, (4) special characters
- (5) separators (6) embedded characters and (7) other charac-
- ters.
-
- (1) upper-case-letters
- A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
-
- (2) lower-case-letters
- a b c d e f g h i j k l m n o p q r s t u v w x y z
-
- (3) digits
- 0 1 2 3 4 5 6 7 8 9
-
- (4) special characters
- " ( ) : ; = < > ! ^ _ $ . { } / * %
-
- (5) separators
- space tab newline
-
- (6) embedded characters
- \n embedded-newline
- \t embedded-tab
- \b embedded-backspace
- \r embedded-carriage-return
- \f embedded-form-feed
- \\ embedded-back-slash
- \" embedded-quote
-
- (7) other characters
- @ # & + [ ] ` ~ ' | , ?
-
- The following names are used when referring to special characters:
-
- Symbol Name Symbol Name
-
-
-
-
-
-
-
-
-
-
- - 4 -
-
-
- " quote : colon
- ; semicolon = equal
- ! exclamation ^ hat
- _ underscore $ dollar
- < less-than > greater-than
- ( left-paren ) right-paren
- { left-bracket } right-bracket
- * asterisk % percent
- - minus
-
-
- A lexical element is a either a delimiter, an identif-
- ier, an integer literal, a floating point literal, a string
- literal, a comment or a c_code. In some cases a separator
- is required between lexical elements, specifically when
- adjacent lexical elements could be interpreted as a single
- lexical element. A separator is any of space, tab or new-
- line. One or more separators are permitted between any two
- lexical element, before the first lexical element and after
- the last lexical element.
-
- _3._1. _D_E_L_I_M_I_T_E_R_S
-
- A delimiter may be one of the following special charac-
- ters:
-
- : ; " . ^ - ( ) { } < >
-
- A delimiter may be one of the following compound delim-
- iters. Each compound delimiter is composed of two adjacent
- special characters.
-
- => %% != == >= <=
-
- The following names are used when referring to compound
- delimiters:
-
- Delimiter Name
-
- => arrow
- %% delim
- != not-equal
- == equality
- >= greater-than-or-equal
- >= less-than-or-equal
-
- _3._2. _I_D_E_N_T_I_F_I_E_R_S
-
- Identifiers are used as tokens and as reserved words.
- Separators are not allowed in an identifier. The underscore
- character is the only special character that may be part of
- an identifier.
-
- identifier ::= letter { underscore | letter | digit}
-
-
-
-
-
-
-
-
-
- - 5 -
-
-
- letter ::= upper-case-letter | lower-case-letter
-
- Examples:
-
- PENNY Get_Stuff x1
- ThisOne WrZ_123 etc_
-
- The identifiers that are reserved words are:
-
- ADD MARK PROFILE
- BACKTRACK NORECURS RECURS
- DUMP NOT SAVE
- EMPTY OPTIMIZE STRING
- FLOAT POINTER TRACE
- INT PREFIX ZERO
-
- _3._3. _N_U_M_E_R_I_C _L_I_T_E_R_A_L_S
-
- There are two classes of numeric literals, integer
- literals and floating point literals. The presence of a
- decimal point distinguishes floating point literals from
- integer literals.
-
- floating-point-literal ::= [ minus ] digits dot digits
- integer-literal ::= [ minus ] digits
- digits ::= digit { digit }
-
- Example integer literals:
-
- 1 -33 187
-
- Example floating point literals:
-
- 0.5 -3.14159 6.0
-
- _3._4. _S_T_R_I_N_G _L_I_T_E_R_A_L_S
-
- A string literal is formed by a sequence of characters
- (possibly zero) enclosed between quote characters. Any of
- the six classes of characters may be embedded in a string.
-
- string-literal ::= quote { [ character ] } quote
-
- Examples:
- ""
- "\"recursion\""
- "these characters can be in a string $, ! => etc_\n"
-
- _3._5. _C_O_M_M_E_N_T_S
-
- Comments may be included anywhere in the input file
- that separators or delimiters may occur. Comments follow
- the style of comments in the C language. Comments may not
- be nested within comments. Any of the six classes of
-
-
-
-
-
-
-
-
-
- - 6 -
-
-
- characters may be embedded in a comment.
-
- comment ::= slash asterisk { [ character ] } asterisk slash
-
- Examples:
- /* a simple comment */
-
- /*
- a multi-line
- comment
- */
-
- /*******************
- * A Fancy Comment *
- *******************/
-
- _3._6. _C__C_O_D_E
-
- A c_code is a fragment of C language code that is
- embedded in the input file. A c_code is recognized by the
- scanner as a single lexical item. The C language itself is
- not parsed by TRC. A c-code may not contain a procedure or
- function.
-
- c_code ::= left-bracket { [character] | [c_code] } right-bracket
-
- Example:
-
- {
- if(condition){
- action(argument);
- another_action();
- }
- }
-
- _4. _D_E_F_I_N_I_T_I_O_N_S
-
- Each entity that can be referred to by a TRC rule must
- be defined in the definition section. Each entity that is
- defined is called an _o_b_j_e_c_t. Objects may have numeric or
- string values associated with them. These associated values
- are called _e_l_e_m_e_n_t_s of the object. There are two forms of
- definitions. There is a simple form for objects which have
- no elements and an extended form for objects that have asso-
- ciated elements.
-
- definition ::= identifier
- definition ::= identifier left-paren item-list right-paren
- item-list ::= { [ item ] }
- item ::= identifier colon type
- type ::= INT | FLOAT | STRING | POINTER
-
- For each object there will be an associated object
- count in the output code, which represents the number of
-
-
-
-
-
-
-
-
-
- - 7 -
-
-
- objects of that type that exist at any point in time. For
- each object with at least one element, there will be an
- associated structure and list of objects of that type in the
- output code. The set of object counts and object lists
- defined in the definition section represent the STM that the
- system of rules may refer to.
-
- Each element that is defined for a given object must
- have a data type specified. Strong type checking is
- enforced throughout TRC. Comparisons and assignments of
- elements must involve elements or literals of the same type.
- There is no coercion of types. The INT, FLOAT, STRING and
- POINTER types are described in section 10. The POINTER type
- is a pointer to a structure of the type of the object that
- contains it.
-
- Examples:
-
- A
- b_b (This : INT)
- CAST ( THAT : INT
- The_Other : FLOAT)
-
- _5. _S_H_O_R_T _T_E_R_M _M_E_M_O_R_Y
-
- The short term memory (stm) section of the input is
- where the initial state of the working memory is specified.
- The intention of this section is to permit the working
- memory to be initialized to some state that may be required
- for each invocation of the expert system. It is also
- intended to serve as a way of entering test data while the
- expert system is being developed, before data entry pro-
- cedures are developed.
-
- stm ::= { [ entry ] }
- entry ::= [ integer-literal ] identifier
- entry ::= [ integer-literal ] identifier
- left-paren { [ init-item ] } right-paren
- init-item ::= identifier arrow value
- value ::= integer-literal
- value ::= floating-point-literal
- value ::= string-literal
-
- The short term memory section is a list of objects that
- are to be entered into the working memory. If an object has
- one or more elements, those elements can be initialized to
- user specified values. Numeric values that are not speci-
- fied are initialized to zero and string values that are not
- specified are initialized to an empty string. The integer-
- literal that may precede the name (identifier) of the object
- specifies how many objects of that type are to be added to
- working memory with the given element values, e.g.:
-
- /* definition part */
-
-
-
-
-
-
-
-
-
- - 8 -
-
-
- A (A1 : STRING
- A2 : FLOAT)
- B (B1 : STRING
- B2 : FLOAT)
- %%
- /* short term memory */
- A ( A1 => "string") /* It is not necessary to
- initialize all the elements
- in an object.
- */
- 2 A ( A2 => 3.34
- A1 => "thing") /* Nor is it necessary to
- initialize elements in the
- order they were declared.
- */
- 3 B /* It is not necessary to
- initialize the elements
- at all.
- */
- %%
-
-
-
-
- _6. _L_O_N_G _T_E_R_M _M_E_M_O_R_Y
-
- Long term memory is the section where the
- situation/action rules are enumerated. This section may
- begin with a listing of options that are to be turned on.
- All options in this section can also be specified by command
- line flags. Since the syntax for the long term memory sec-
- tion is more complex than for the other sections, it will be
- presented in several parts.
-
- _6._1. _O_P_T_I_O_N_S
-
- The long term memory is composed of two sections, the
- options and the rules.
-
- ltm ::= { [option] } { rule }
- option ::= ZERO | PROFILE | BACKTRACK
- | DUMP | RECURS | NORECURS
- | SAVE | TRACE | PREFIX identifier
-
- _6._1._1. _Z_E_R_O
-
- The ZERO option directs the compiler to generate a pro-
- cedure that will free all the dynamic structures allocated
- by TRC generated code. This feature is useful when develop-
- ing inference engines that will be entered more than once.
- It is often necessary to remove the 'leftovers' from a pre-
- vious execution before beginning a new execution.
-
-
-
-
-
-
-
-
-
-
-
- - 9 -
-
-
- _6._1._2. _P_R_O_F_I_L_E
-
- The PROFILE option directs the compiler to generate
- code to profile the execution of the inference engine and a
- procedure to print a summary of that profile. The profiling
- code counts the number of times that each rule searches some
- part of STM and how many times each rule is fired.
-
- _6._1._3. _B_A_C_K_T_R_A_C_K
-
- The BACKTRACK option directs the compiler to generate
- an inference engine that will backtrack when no rule is
- true. Backtracking is accomplished by undoing the actions
- of the last rule that fired and continuing to test rules as
- if the undone rule had never fired.
-
- _6._1._4. _D_U_M_P
-
- The DUMP option directs the compiler to generate pro-
- cedures that will print the contents of STM on the standard
- output. The intention of this option is to simplify the
- process of developing and debugging rules. By having the
- DUMP procedures generated automatically, the knowledge
- engineer is freed of the mundane task of writing procedures
- to display the current state of the STM. The DUMP pro-
- cedures are not intended to serve as the output of an expert
- system. Appropriate output routines will have to be
- developed by the knowledge engineer after the rules have
- been written.
-
- _6._1._5. _R_E_C_U_R_S
-
- TRC will generate code that uses one of two strategies
- for searching STM. These strategies (detailed in section
- 6.3.3) are called LINEAR and RECURSIVE. The LINEAR strategy
- is the default. The RECURS directive in the option part
- directs the compiler to use the RECURSIVE strategy as the
- default. It is possible to override the default on a per
- rule basis. Overriding the default is discussed in section
- 6.3.3.
-
- _6._1._6. _N_O_R_E_C_U_R_S
-
- The NORECURS option directs the compiler to use the
- LINEAR search strategy in all rules, unless otherwise
- directed. Since this is the default condition, it is not
- necessary to use this option.
-
- _6._1._7. _S_A_V_E
-
- The SAVE option directs the compiler to generate pro-
- cedures to save all objects which are dynamically allocated
- by TRC code on a file. The compiler will also generate pro-
- cedures which can restore the dynamically allocated
-
-
-
-
-
-
-
-
-
- - 10 -
-
-
- structures from the previously written files. The intention
- of this option is to simplify the development of expert sys-
- tems with checkpointing and restarting capabilities. The
- procedures generated by this option and the use of those
- procedures is described in section 13.6 and Appendix C.
-
- _6._1._8. _T_R_A_C_E
-
- The TRACE option directs the compiler to trace the exe-
- cution of the inference engine by maintaining a list of the
- rules that have been fired in the order they were fired.
- This list can be used to produce an explaination of the
- actions taken by the expert system.
-
- _6._1._9. _P_R_E_F_I_X
-
- The PREFIX option directs the compiler to use the iden-
- tifier that follows the reserved word 'PREFIX' as a prefix
- for all data objects and procedures generated by TRC. The
- intention of this option is to facilitate building expert
- systems that have more than one inference engine. Supplying
- different prefixes for each inference engine insures that
- there will be no name conflicts between separate inference
- engines, e.g.:
-
- PREFIX X_
-
-
- _6._2. _R_U_L_E_S
-
- The second section if LTM is the list of rules. Each
- rule has a label, which can be supplied or automatically
- generated by TRC. The label is used whenever it is neces-
- sary or convenient to refer to the rule by name. The label
- is followed by the situation part (described in the next
- section). The situation part is a list of statements fol-
- lowed by the arrow delimiter. The action part (described in
- the section following the description of the situation part)
- follows the arrow delimiter and is itself a list of state-
- ments. The action part is followed by a semicolon which
- terminates the rule.
-
- rule ::= label situation arrow action semicolon
- label ::= identifier colon | colon
-
- _6._3. _S_I_T_U_A_T_I_O_N
-
- The situation part specifies how STM is to be searched
- and what must be present in STM for the situation part to be
- true.
-
-
- situation ::= { [ s-option ] } { [ match ] }
- s-option ::= EMPTY identifier identifier
-
-
-
-
-
-
-
-
-
- - 11 -
-
-
- s-option ::= RECURS | NORECURS
- match ::= [ integer-literal ] identifier
- match ::= NOT identifier
- match ::= [ integer-literal ] left-paren name
- match-list right-paren
- match ::= c-code
- name ::= hat identifier identifier
- match-list ::= { match-item }
- match-item ::= identifier dot identifier relop literal
- match-item ::= identifier dot identifier relop
- identifier dot identifier
- relop ::= equality | not-equal | less-than
- relop ::= greater-than | greater-than-or-equal
- relop ::= less-than-or-equal
-
- _6._3._1. _M_A_T_C_H_I_N_G
-
- It is necessary to understand how matching is specified
- before the s-option part can be explained. A match, which
- will also be referred to as a test, is a statement of what
- the inference engine is to search for in STM. Assume the
- following objects were defined in the definition section:
-
- %%
- A (A1 : INT
- A3 : INT
- A2 : STRING)
- B (B1 : INT
- B2 : STRING)
- %%
-
- The simplest match specifies only the object that must
- be present. A search for one object of type A and one
- object of type B can be specified as follows:
-
- A B
-
- A search for two objects of type A and two objects of
- type B can be specified in many ways, including these four
- equivalent ways:
-
- A A B B
-
- 2A 2 B
-
- A B A B
-
- A A 2B
-
- The objects can be listed in any order and may be pre-
- ceded by an integer literal. The integer literal specifys
- how many objects of the named type are to be search for. In
- one of the examples there is a space between the count and
- the object name and in other examples there is no space
-
-
-
-
-
-
-
-
-
- - 12 -
-
-
- between the count and the object. Spaces are required only
- when there would be a conflict without a space. Since the
- string "2A" (for example) begins with a digit, it is
- presumed to be a numeric literal. Since "A" is not a digit,
- the numeric literal ends at that point. Since the numeric
- literal contained no decimal point, it is an integer-
- literal. The string is therefore lexically equivalent to "2
- A".
-
- The reserved word NOT is used to explicitly test for an
- empty list. The following match statement will be true if
- there are no objects of type A in STM:
-
- NOT A
-
- Any rule which contains a search for an object and a
- test for that same list being empty can never be true. TRC
- generates an error message in this situation because even
- though it is syntactically correct, it is in fact meaning-
- less, e.g.:
-
- A
- NOT A
-
- Very often it is necessary to search STM based not only
- on the type of object, but also based on the values of the
- elements of the object. This is specified by placing a list
- of the element values after the element name.
-
- (A.A1 == 2)
- (B.B1 != 3
- B.B2 <= "THIS")
- (A.A2 == "HERE" A.A1 > 6)
-
- These three statements can be translated as follows:
- First search the A list for an object whose element A1 is
- equal to two, then search the B list for an object whose
- element B1 is not equal to three and whose element B2 is
- less than or equal to "THIS", finally search the A list for
- an object whose element A2 is equal to "HERE" and whose ele-
- ment A1 is greater than six. This situation part would be
- true if all three objects were found in STM, otherwise it
- would be false. In the first match only the value of A1 is
- specified. Only the elements that are specified are tested,
- the values of any other elements that the object may contain
- are not considered. Association of parameters is by name,
- so it is not necessary to list elements in the order they
- were declared. The third match statement in the example
- above lists the value of element A2 first, even though ele-
- ment A1 was declared first.
-
- The final case that must be considered is the case
- where it is necessary to search STM for an object whose ele-
- ments are to be tested against the result of some previous
-
-
-
-
-
-
-
-
-
- - 13 -
-
-
- search. To do this it is first necessary to name the object
- that is being searched for so that it may later be referred
- to, e.g.:
-
- (^A FIRST
- A.A1 == 2)
- (B.B1 == FIRST.A3)
- (A.A1 != A.A3)
-
- The first statement begins with a hat character. This
- indicates that this object is to be named. The hat charac-
- ter is followed by the object type and it's name. The name
- is followed by a list of the elements to search for, in this
- case a search for element A1 equal to two is specified.
- This statement can be translated as follows: Search the A
- list for an object whose element A1 is equal to two and name
- that object "FIRST". A name that is applied to an object is
- called a free variable. The scope of a free variable is the
- current rule. Free variable names can be reused in subse-
- quent rules.
-
- The second statement specifys that the B list is to be
- searched for an element B1 whose value is equal to the value
- of the element A3 found in the previous statement. The free
- variable name makes it possible to refer to previously found
- elements.
-
- The third statement, while looking innocent enough, is
- radically different from all previous examples. In all the
- previous examples the exact value that was being searched
- for was known before the search began. That value was
- expressed as either a literal, or the value of some element
- that was found in a previous test. In the third statement,
- the A list is being searched for an object whose elements A1
- and A3 are not equal to one another. The values these ele-
- ments are to have are not specified, only their relationship
- to one another. This can be further complicated:
-
- (^A Second
- A.A1 == 3)
- (A.A1 < Second.A3
- A.A3 < A.A1)
-
- In the second match statement of this example A1 is
- being compared to the value of A3 in the object named
- "Second" and it is being compared to the value of the ele-
- ment A3 in the object that contains it. An element may
- appear on the left hand side of the relational operator only
- once in a given match statement. It is now possible to con-
- sider the effects of the options.
-
-
-
-
-
-
-
-
-
-
-
-
-
- - 14 -
-
-
- _6._3._2. _O_P_T_I_O_N_S
-
- The situation part begins with a (possibly empty) list
- of options. The reserved words RECURS or NORECURS may
- appear in the option part of the situation. The appearance
- of one of these words causes the named strategy to be used
- rather than the current default strategy. It is not an error
- to explicitly specify the default strategy, but it is
- unnecessary. The option part of the situation may also
- include EMPTY statements. An EMPTY statement is a static
- object declaration. The intention of the EMPTY statement is
- to provide a means of passing data from STM to embedded c-
- code and from embedded c-code to STM. Examples in section
- 12 will illustrate these actions.
-
- _6._3._3. _S_E_A_R_C_H _S_T_R_A_T_E_G_I_E_S
-
- A small example provides an easy way to illustrate the
- two search strategys. This example is a complete TRC
- specification, though not useful for anything other than as
- an example.
-
- %%
- PENNY (MINT : STRING DATE : INT)
- %%
- PENNY (MINT => "DENVER"
- DATE => 1964)
- PENNY (DATE => 1966)
- %%
- R1:
- (^PENNY First)
- (PENNY.DATE <= First.DATE)
- =>
- MARK PENNY
- ;
- %%
-
- STM will be initialized to contain two objects of type
- PENNY, the first minted in Denver in 1964, the second minted
- in an unspecified location in 1966. Since the reserved word
- RECURS does not appear in either option section, the default
- LINEAR search strategy will be used.
-
- In the LINEAR strategy, STM is searched in a linear
- fashion for each object specified in the situation part.
- Objects are searched for in the order they are listed. In
- this example, the object named "First" will be associated
- with the first object in the list. Since the values of the
- elements are not specified, any object of type PENNY will
- match. This object is then temporarily marked as being "in
- use" and can not be used to match any subsequent tests. The
- list will then be searched for an object of type PENNY whose
- DATE element is less than or equal to the DATE element of
- the "First" object. The only other object in the list has a
-
-
-
-
-
-
-
-
-
- - 15 -
-
-
- DATE element of 1966 which is not less than or equal to
- 1964, so the rule fails. In the LINEAR strategy, when any
- test in the situation part fails, the entire rule fails
- immediately, no further tests are made.
-
- It should be obvious that this rule would have been
- true if "First" had been associated with the second object
- in the PENNY list. This is precisely the purpose of the
- RECURSIVE search strategy. In the RECURSIVE search stra-
- tegy, when a test fails, the previous test is redone. To
- redo a test, the object that was marked as "in use" is
- unmarked, and the list is searched from that point for an
- object that will match the test. The RECURSIVE search fails
- when a single test fails and it is no longer possible to
- undo the previous test (this occurs when there is no previ-
- ous test). The RECURSIVE search strategy is a powerful pat-
- tern matching tool, but it can be expensive in terms of exe-
- cution time.
-
- _6._3._4. _E_M_B_E_D_D_E_D _C_O_D_E
-
- Arbitrary C language code may be embedded in the situa-
- tion part anywhere a match may occur. Recall that embedded
- code (c-code) is recognized as a single lexical element by
- the scanner, the C language itself is not parsed by TRC.
- Errors in embedded code will not be detected by TRC. The
- intention of permitting embedded code in the situation part
- is to make it possible to include tests that may not fit the
- context of a match against STM.
-
- In order to integrate an embedded code test with the
- existing match statements, it is necessary to have a way to
- refer to objects in embedded code. In order for embedded
- code to have the same functionality as a match statement, it
- is necessary to have a way to cause a rule to fail in the
- embedded code. Each of these facilities are provided.
-
- _6._3._5. _E_M_P_T_Y _O_B_J_E_C_T_S
-
- The purpose of the EMPTY statement is to create a named
- object that can be referred to by match statements and
- embedded code, without having to exist in STM. One of the
- capabilities that results is the ability to have STM
- searched on the basis of the result of some external pro-
- cedure, e.g.:
-
- R1:
- EMPTY PENNY SPARE /* this creates an object of
- type PENNY that is named
- SPARE. This object exists
- separately from STM and it's
- elements are not initialized. */
- {
- /* this embedded C code precedes
-
-
-
-
-
-
-
-
-
- - 16 -
-
-
- any search of STM
- */
- if(($SPARE.DATE = external-procedure()) <= 1920){
- $FAIL.
- }
- }
- (PENNY.DATE == SPARE.DATE)
- =>
- MARK PENNY
- ;
-
- Several things are happening in this example. First an
- object of type PENNY is created and given the name SPARE.
- This object exists separately from STM and will exist only
- during the current rule. It is useful only as something
- that can be referred to in other statements. A section of
- embedded code precedes the only match statement in this
- rule. When the code produced by TRC is compiled and run,
- that embedded code will be executed before STM is searched,
- by virtue of the fact that it precedes the match statement.
-
- The embedded code contains an "if" statement which con-
- tains an embedded assignment and function call as part of
- it's condition. The left-hand-side of the embedded assign-
- ment, "$SPARE.DATE" is not syntactically correct C language
- code. The dollar character is a flag to TRC that indicates
- a reference to a named object. The identifier that follows
- the dollar character will be translated by TRC during the
- output phase. This translation is described in section 12.
- The statement "$FAIL." is translated by TRC into whatever
- statements are required to make this rule fail. The defini-
- tion of failure depends on the search strategy. If the
- LINEAR strategy is being used, "$FAIL." will cause the rule
- to stop searching STM and continue with the next rule. If
- the RECURSIVE strategy is being used, "$FAIL." will cause
- the rule to undo and then redo the previous match statement.
-
- An object name preceded by the dollar character may
- occur in the embedded code anywhere a variable name may
- occur, since that is what it will actually be translated to.
- Embedded code may also refer to objects that exist in STM
- using the same dollar character translation technique:
-
- R1:
- RECURS
- (^PENNY NEW
- PENNY.MINT == "DENVER)
- {
- if(some-function($NEW.DATE))
- $FAIL.
- else
- $NEW.DATE = 0;
- }
- (PENNY.DATE == 1921)
-
-
-
-
-
-
-
-
-
- - 17 -
-
-
- =>
- MARK PENNY
- ;
-
- The "else" part of the embedded code illustrates an
- assignment to an element of the object named NEW. The
- object that is being called NEW exists in STM and this
- assignment to it's DATE element is permanent. Since this
- rule is recursive, it is possible that this embedded code
- will set the DATE element of every object in the PENNY list
- to zero. These modifications are made before it is even
- known that the situation part is true. Modifying STM in the
- situation part of a rule would be a major departure from
- traditional expert system implementation techniques. Furth-
- ermore, the BACKTRACKing option is unaware of changes made
- in STM by embedded code. The BACKTRACKing option is unable
- to correctly undo this rule.
-
- _6._4. _A_C_T_I_O_N
-
- The ACTION part specifies what is to be done if the
- situation part is true. The actions that can be taken pri-
- marily involve adding objects to STM or deleting objects
- from STM. Recall that the non-terminal 'entry' was defined
- in section 4.
-
- action ::= statements c-code
- statements ::= { [statement] }
- statement ::= MARK mark-list
- statement ::= ADD add-list
- statement ::= OPTIMIZE identifier
- mark-list ::= { [ mark-item ] }
- mark-item ::= [ integer-literal ] identifier
- add-list ::= { [ entry ] }
-
- _6._4._1. _M_A_R_K
-
- The MARK statement is used to delete objects from STM.
- Only objects that were found in the situation part may be
- deleted. The reason for this constraint is that only the
- objects found in the situation part are definitely known to
- exist in STM. STM is searched only in the situation part,
- there is no searching in the action part. Objects may be
- deleted by name or in the order they were found, e.g. (using
- the definitions from section 6.3.1):
-
- R1:
- (A.A1 != A.A3)
- (^A FIRST
- A.A1 == 2)
- (B.B1 == FIRST.A3)
- =>
- MARK A
- ;
-
-
-
-
-
-
-
-
-
- - 18 -
-
-
- This MARK statement will delete the object in the A
- list that met the test (A.A1 != A.A3). In some instances it
- may be desirable to delete an object that was not the first
- object that was found, e.g.:
-
- R1:
- (A.A1 != A.A3)
- (^A FIRST
- A.A1 == 2)
- (B.B1 == FIRST.A3)
- =>
- MARK FIRST
- ;
-
- The A list object named 'FIRST' is the second object of
- type A to be found. It is specified as the object to delete
- by using it's free variable name. A MARK statement can
- specify a count of how many objects of a given type are to
- be deleted. A MARK statement may list any number of objects
- to delete, and each object to be deleted can have a separate
- MARK statement if desired. In no case can more objects be
- deleted than were found in the situation part. Each of the
- following examples is equivalent:
-
- R1:
- (A.A1 != A.A3)
- (^A FIRST
- A.A1 == 2)
- (B.B1 == FIRST.A3)
- =>
- MARK B FIRST A
- ;
-
- R1:
- (A.A1 != A.A3)
- (^A FIRST
- A.A1 == 2)
- (B.B1 == FIRST.A3)
- =>
- MARK 2A
- MARK B
- ;
-
- R1:
- (A.A1 != A.A3)
- (^A FIRST
- A.A1 == 2)
- (B.B1 == FIRST.A3)
- =>
- MARK FIRST
- MARK A
- MARK B
- ;
-
-
-
-
-
-
-
-
-
-
- - 19 -
-
-
- _6._4._2. _A_D_D
-
- The ADD statement is used to add new objects to STM.
- As in the MARK statement, an ADD statement can specify one
- or several objects to add to STM. The value of each element
- of each object can be specified as in the STM section of the
- specification. Each object is inserted at the head of the
- appropriate list. The insertions are actually made in the
- opposite order that they are listed, the net effect is that
- the objects appear at the head of the list in the order they
- are specified. ADD and MARK statements may be intermixed in
- any order, e.g.:
-
- R1:
- (A.A1 != A.A3)
- (^A FIRST
- A.A1 == 2)
- (B.B1 == FIRST.A3)
- =>
- MARK FIRST
- ADD A (A.A1 => 6
- A.A3 => FIRST.A3)
- ADD B (B.B2 => FIRST.A2
- B.B1 => 9)
- MARK B
- ;
-
- All the ADD statements will be executed before any MARK
- statements are executed regardless of the order of the
- statements in the action part. The statements are ordered
- by the compiler to insure that an ADD statement does not
- refer to an object that has already been MARKed. In the
- example above, the first ADD statement refers to the object
- named 'FIRST'. The object named 'FIRST' is MARKed in the
- previous statement. If the code were executed in the speci-
- fied order, the element 'FIRST.A3' would not exist when the
- ADD statement was executed.
-
- _6._4._3. _O_P_T_I_M_I_Z_E
-
- The OPTIMIZE statement is named for it's primary func-
- tion, hand optimization of LTM. There is also a built in
- optimizer that can be invoked. Optimization is discussed in
- detail in section 7. The OPTIMIZE statement can be thought
- of as an unconditional GOTO statement. In normal execution,
- after a rule fires the rules are tested from the beginning
- of LTM for the next rule that will fire. The OPTIMIZE
- statement can specify a point other than the start of LTM to
- begin testing rules. In addition to optimization, it can be
- used to impose a customized control structure on the set of
- rules.
-
- One example of the use of the OPTIMIZE statement is to
- implement a search for the absence of some object(s) in STM,
-
-
-
-
-
-
-
-
-
- - 20 -
-
-
- which is not otherwise supported by the TRC language. To
- search for the absence of some object(s), use two rules.
- The first rule searches for the presence of the object(s) in
- question, if the rule is true then the object(s) are not
- absent. If the rule fails, the object(s) are absent, e.g.:
-
- R1:
- /* effectively search for the
- absence of an object A with
- element A1 == 2 */
-
- (A.A1 == 2)
- =>
- /* If this rule is true, branch
- around the next rule */
- OPTIMIZE R3
- ;
-
- R2:
- /* If R1 fails, then there is
- no object A with element
- A1 == 2. An empty situation
- part such as this always
- evaluates to true */
- =>
- /* whatever you wish to do in response
- to the absence of A1 == 2 */
- ;
-
- R3:
- /* continue here if R1 is true */
- . . .
-
-
- _6._4._4. _C-_C_O_D_E
-
- A c-code may follow the MARK, ADD and OPTIMIZE state-
- ments. This is user code that is to be executed when a rule
- fires. C-code may not appear between MARK, ADD or OPTIMIZE
- statements. If it is necessary to refer to an object that
- is being MARKed in c-code, it should be done in the situa-
- tion part. A c-code may precede the arrow symbol that
- separates the situation and action parts. C-code in this
- position is equivalent to c-code in the situation part
- preceding the MARK, etc. statements, e.g.:
-
- R1:
- (^A FIRST)
- {
- /* this c-code follows all
- situation tests. It is
- effectively in the action
- part since it will execute
- only if the situation is
-
-
-
-
-
-
-
-
-
- - 21 -
-
-
- true */
-
- some_procedure($FIRST.A1);
- }
- =>
- MARK FIRST
- ;
-
-
- _7. _O_P_T_I_M_I_Z_E_R
-
- The optimizer does not produce code that is optimum in
- any sense. What it does is to perform a single, very useful
- code modification that can have a very positive impact on
- execution time.
-
- Consider the execution of an inference engine. Each
- rule is tested until one who's situation part is true is
- found. This rule's action part is then executed. When
- rules are being tested the problem space is being searched.
- When an action part is executed a step is taken in the solu-
- tion of the problem. Searching the problem space is clearly
- part of the solution, but the action part is where the the
- results occur.
-
- The goal, which is not attained, is to reduce the
- search time to zero. To attain this goal it would be neces-
- sary to know each time a rule fires which rule will fire
- next. This is generally not known. In particular when the
- inference engine begins execution, the contents of STM are
- not known, any rule can be the first rule to fire. Once a
- rule has fired and each time any rule fires a great deal of
- implicit knowledge about the contents of STM is obtained.
- It is known that no rule previous to the current rule is
- true and no rule previous to the current rule can be true
- after the execution of the current rule unless the current
- rule modifies STM in such a way as to make some previous
- rule true. This simple fact is the entire basis of the
- optimizer, which attempts to reduce the number of rules that
- are tested by deducing which rules can not possibly fire.
-
- Three tests must be performed to determine a candidate
- next rule, which is the first rule in LTM that can possibly
- fire after the current rule fires. The three tests are
- called the NOT test, the ADD test and the MARK test.
-
- The first case to be considered is the case of a rule
- which contains a NOT statement in the situation part. A NOT
- test is an explicit test for an empty list. When a rule
- that fires contains an ADD statement it will not be possible
- for any previous rule with a NOT statement referring to that
- list to be the next rule to fire. Likewise, if a rule that
- fires contains a MARK statement and no ADD statement refer-
- ring to that same list, it is possible that the list will
-
-
-
-
-
-
-
-
-
- - 22 -
-
-
- become empty making it possible for the rule with the NOT
- statement that previously failed to become true. If it is
- determined that it is possible for a rule to fire after the
- NOT test, that rule becomes the candidate rule and no
- further testing is done.
-
- Consider the case of a rule with no NOT statements that
- recursively searches STM for a situation. If this rule
- fails, it will continue to fail until something is added to
- STM to make it true. If all rules searched STM recursively
- it would be known when a rule fires that of the rules that
- precede the current rule, only those rules that search for
- something added to STM by the current rule can possibly fire
- in the next pass.
-
- If the current rule adds something to STM, control
- could continue with the first rule that searches for that
- something rather than the first rule in LTM. If no rule
- prior to the current rule searches for those things added to
- STM by the current rule or if the current rule adds nothing
- to STM then no prior rule can execute. Control could con-
- tinue with the current rule rather than at the beginning of
- LTM. By causing control to continue with a rule later than
- the first rule the amount of searching is reduced.
-
- The case of a rule that performs only a linear search
- on STM must also be considered. The previous conclusion
- about items being added to STM is still true; a rule that
- adds something to STM can cause a linear search rule to
- become true. With linear search it is also possible that a
- rule will become true if something is removed from STM. If
- a linear rule searches for several similar items which are
- present but not correctly ordered it is possible for this
- linear search to fail where a recursive search would not
- have failed. If there were excess items, removing one or
- more may cause a different linear assignment which could
- make a linear rule true. This is the MARK test. Examples
- of this situation are non-trivial, but where correctness is
- an issue these cases can not be overlooked.
-
- The TRC optimizer selects a continuation point for each
- rule based on what the rule adds to or deletes from STM
- rather than testing each rule from the beginning of LTM.
- The continuation point is the first rule that could fire
- based on the NOT and ADD tests for all rules, and the MARK
- test for linear rules. The TRC optimizer is somewhat naive
- in that it considers only items added or deleted with the
- ADD and MARK statements. The optimizer is unaware of any
- changes that may have been made to STM by user code. The
- caveat is if STM is modified in user code the optimizer may
- produce incorrect code. The optimizer, which can be invoked
- with a command line option (-O), tests each rule individu-
- ally and ignores those rules that were hand optimized in the
- specification.
-
-
-
-
-
-
-