home
***
CD-ROM
|
disk
|
FTP
|
other
***
search
/
Club Amiga de Montreal - CAM
/
CAM_CD_1.iso
/
files
/
406.lha
/
OPS5c
/
ops5c.doc.pp
/
ops5c.doc
Wrap
Text File
|
1990-08-07
|
117KB
|
3,037 lines
OPS5c User's Guide
Version 1.08a
For the Amiga Personal Computer
Bernie J. Lofaso, Jr.
Table of Contents
1. Introduction.............................................4
1.1 What is OPS5c?........................................4
1.2 OPS5 History..........................................4
1.3 Production Systems Overview...........................5
2. Working Memory...........................................9
2.1 Class Declarations....................................9
2.2 Working Memory Element Structure......................9
2.3 Vector Attributes....................................10
3. Rules...................................................12
3.1 Rule Structure.......................................12
4. The LHS.................................................13
4.1 Condition Elements...................................13
4.2 Variables............................................13
4.3 Quoting..............................................13
4.4 Predicates...........................................14
4.5 Conjuctions..........................................14
4.6 Disjunctions.........................................15
4.7 Negated Condition Elements...........................15
4.8 Element Variables....................................16
5. The RHS.................................................17
5.1 Element Designators..................................17
5.2 RHS Patterns.........................................17
5.3 Pre-defined Functions................................18
5.3.1 genatom.........................................18
5.3.2 litval..........................................18
5.3.3 substr..........................................18
5.3.4 compute.........................................18
5.3.5 accept..........................................19
5.3.6 acceptline......................................19
5.4 RHS Actions..........................................19
5.4.1 make............................................20
5.4.2 remove..........................................20
5.4.3 modify..........................................20
5.4.4 write...........................................21
5.4.5 bind............................................21
5.4.6 cbind...........................................21
5.4.7 call............................................22
5.4.8 halt............................................22
5.4.9 openfile........................................22
5.4.10 closefile......................................22
5.4.11 default........................................22
6. Conflict Resolution.....................................24
7. OPS5 Programming........................................26
8. Top Level Commands......................................29
8.1 include..............................................29
8.2 run..................................................30
8.4 ppwm.................................................30
8.5 wm...................................................30
8.6 cs...................................................31
8.7 strategy.............................................31
8.8 pbreak...............................................31
8.9 dump.................................................31
8.8 size.................................................31
9. The OPS5c System........................................32
9.1 Installing OPS5c.....................................32
9.2 Differences from OPS5................................32
9.3 Compiler Switches....................................34
9.4 Run-Time Switches....................................35
9.5 Interfacing With External Code.......................35
9.5.1 Data Types......................................36
9.5.2 The Result Element..............................37
9.5.3 Interning and String Spaces.....................37
9.5.4 Functions Versus Actions........................39
10. References..............................................40
Appendix A - Dollar Library Description.....................41
Appendix B - String Library Description.....................44
Index.......................................................45
OPS5c User's Guide
1. Introduction
1.1 What is OPS5c?
OPS5c is a compiled version of the OPS5 expert system
language originally developed at Carnegie-Mellon University. It
belongs to a class of languages which are referred to as rule-
based (or production-based) expert systems. In such systems,
knowledge is expressed as a set of rules (also called
productions). These rules are meant to express some deductive
statements which may be used as applicable in solving a problem
requiring an expert level of performance. If you have never
programmed in a rule-based language, expect to spend a little
time "getting up to speed". The methodologies used in
programming such systems are considerably different from
programming in conventional programming languages. This manual
should serve as a brief overview of the operation of production
systems (PS's) in general and the OPS5c programming language in
particular. There are several good texts that you may want to
consult in order to get better acquainted with production
systems and several are specific to the OPS5 language.
1.2 OPS5 History
OPS5 is part of a family of languages developed at
Carnegie-Mellon University by Charles Forgy, John McDermott,
Allen Newell, and Michael Rychener in the mid-70's. The
inference engine used by OPS5 is based on the RETE pattern
matching algorithm developed by Charles Forgy. The original
systems were written in Lisp. They read in OPS5 source code and
produced an intermediate pseudo-code which could then be
interpreted by the OPS5 inference engine. Much of the OPS5
language semantics indicate the underlying Lisp language used in
coding the first system. Another version of OPS5 was written in
a language called BLISS. The BLISS system runs considerably
faster than the original interpreted Lisp but had some
limitations not present in the Lisp version. More recent
versions have been written in Common Lisp which run many times
faster than the original Lisp version, partially due to better
Lisp coding and probably to some extent due to greater variety
and power of the data structures and functions available in
Common Lisp. The most recent version of OPS5 from CMU is a
system called ParaOps which compiles OPS5 source directly to
machine language. While being considerably faster than previous
OPS5 versions the ParaOps system is very machine specific as it
produces assembly language as its output.
The OPS5c system was designed to accept a language as close
to the OPS5 language definition as possible. As with the BLISS
version of OPS5, some language features were not implemented.
OPS5c is, however, approximately 98% true to the language
definition and the features missing are not likely to be missed
by most OPS5 programmers. Unlike previous systems, OPS5c uses
the TREAT pattern matching algorithm developed by Dr. D. P.
- 4 -
OPS5c User's Guide
Miranker. In general, the TREAT algorithm has shown to be the
superior algorithm in most cases. The OPS5c compiler takes as
input an OPS5 source program and produces code in the C
programming language as output. This code is then compiled and
linked with a run-time library module to produce an executable
of the OPS5 program. The main emphasis of this system, besides
OPS5 source code compatibility, is speed. Whenever a speed
versus space trade-off was encountered during system design, the
option producing the fastest execution was almost always
selected (within reason of course). A limited amount of testing
on small programs done in December 1988 showed that the OPS5c
system showed a 30-120% speed advantage on benchmark programs
over the then fastest known OPS5 system. Additional speed and
space improvements have been made to the system since that
preliminary testing occurred. Additional benefits of OPS5c are
that it may be interfaced with other languages (primarily C at
the moment). The main disadvantage to OPS5c is that it often
produces large executables. It is not recommended for machines
with less than 1 Meg. of RAM unless only small rule systems are
used.
1.3 Production Systems Overview
In describing production systems in general, the
descriptions will be of systems which have architectures similar
to OPS5. OPS5 terminology will be used as much as possible so
that any reference to external materials will be more
understandable to the reader.
As previously mentioned, production systems consist of a
set of rules, each of which consists of two parts. These are
the antecedent, more commonly known as the left-hand side (LHS),
and the consequent, more commonly known as the right-hand side
(RHS). The LHS can be thought of as a series of conditions
which when all are true simultaneously, make the rule eligible
to "fire". A rule may fire when all its LHS parts, called
condition elements (CEs), are true and it has been selected as
the correct rule to fire. In firing, the rule instructs the
system to perform one or more actions as indicated by the RHS.
The determination of when a rule may fire as well as what effect
it's actions have when it does fire is dictated by working
memory (WM).
Working memory is best thought of as a set of facts
maintained by OPS5. These facts take the form of discrete
records similar to structs in the C language. Each such
structure belongs to a 'class'. Each class may have attributes,
which act as data slots, defined for it. Declarations of valid
classes and their attributes must precede the rules definitions
in the OPS5 source file. A typical declaration might be:
(literalize person name age height weight)
We could then begin creating instances of this class to
represent individual facts. Even with only one such class
- 5 -
OPS5c User's Guide
definition we might want to represent a set of facts such as:
(person ^name John ^age 23 ^height 69 ^weight 165)
(person ^name Mary ^age 18 ^height 62)
(person ^name Mark ^age 18)
Each such instance is called a working memory element (WME). It
is these elements which are matched against the patterns of the
LHS of each rule. When a rule is fired, statements in the RHS
may add to, remove from, or modify the contents of working
memory.
In general, a production system executes a cycle of phases.
The phases of a PS cycle are MATCH, CONFLICT RESOLUTION, and
ACT. The MATCH and ACT phases we have mentioned earlier. They
corresponding to matching patterns of the LHS to working memory
elements and performing the actions of the RHS when a rule is
selected to fire. The intermediate phase, CONFLICT RESOLUTION,
determines how a rule is selected amongst many possibilities for
firing. After the system has performed MATCH against the LHSs
of all rules, there may either be more than one rule which has a
LHS match or a rule could have one or more ways of producing a
match. The first case should be very obvious. It simply means
that the facts supporting more than one rule firing exist. It
is up to CONFLICT RESOLUTION to decide which rule will fire on
the current cycle. In the second case, we may find that there
exists more than one way of matching WMEs to the LHS of one or
more rules. As an example, given the three WMEs above, the LHS
of a rule such as
(person ^age > 20)
(person ^age < 20)
could be matched in two ways. We can form a pair using the
first and second WMEs or we can form a pair using the first and
third WMEs. Each combination of valid mapping of WMEs to the
CEs of a rule constitutes what is called an instance. OPS5 will
calculate all the instances it can, then select one instance,
and hence the rule that it corresponds to, to fire on the
current cycle. As we shall later see, instances may be created,
but are never guaranteed to be selected to be fired. Other more
appropriate instances may be created during the MATCH phase or
it is possible for instances to be destroyed. At any moment in
time, if we were to stop the system and examine all the
instances we would refer to this set of instances as the
conflict set. The CONFLICT RESOLUTION phase examines the
conflict set after completing the matching of WMEs added to the
system in the ACT phase of the previous cycle. Note that the
MATCH phase performs an incremental matching. It needs only
perform matching for newly added WMEs since the result of
matching produces instances and all instances produced from
previously added WMEs have already been calculated and retained
in the conflict set.
The cycle that we have just described is initiated by
adding WMEs to working memory, thus causing matching with LHSs
- 6 -
OPS5c User's Guide
of the rules in the system. The cycle will continue until there
are no more instances in the conflict set, hence no rules can
fire, or a rule fires which contains an explicit halt
instruction in its RHS. Hopefully in either case the system has
arrived at a solution to the problem at hand.
It should be noted that in our brief description of
CONFLICT RESOLUTION, we indicated that the system selects a
specific instance to fire. We speak of instances firing rather
than rules for several reasons. First, it is important to know
exactly what WMEs match particular CEs. This is because is it
common practice for an action in the RHS of a rule to reference
values in the WMEs which match the LHS CEs. Thus we must know
exactly which elements match which CEs in order to know the
appropriate value to use in the actions. Second, we associate
an instance with a specific rule, so specifying the instance
specifies the rule implicitly. As we shall see later when we
describe how CONFLICT RESOLUTION is performed, the set of WMEs
that form a particular instance determines its "priority" that
is calculated by CONFLICT RESOLUTION. All instances are ordered
by priority at the end of the MATCH phase and the instance with
highest priority is selected to fire.
Additionally, we should address the OPS5c matching
algorithm at a slightly higher level of detail. These details
apply specifically to the TREAT match algorithm used by OPS5c.
The traditional match algorithm for OPS5 is the RETE match.
Refer to [Miranker87] or other reference materials for
additional details on the TREAT and RETE algorithms. Associated
with each CE is a data structure called an alpha memory (amem).
Such a structure also exists in the RETE algorithm. As we add
new WMEs to working memory, we must first decide which CEs the
new WME might be associated with in the process of creating
instances. A WME can be associated with some CE for the purpose
of creating an instance only if it passes all the constant tests
as well as all variable binding tests. Constant tests are those
whose values are not variables but either numeric or symbolic
constants. Variable tests make comparisons relative to other
WMEs bound to CEs in the rule. While the variable binding tests
depend on what WMEs are currently bound to the other CEs in a
rule, the constant tests are independent of other WMEs. Thus it
makes sense to perform these tests only once for each CE/WME
combination and if the WME passed the constant tests for the CE,
we store the WME in the alpha memory of the CE. In that fashion
we some other WME is added to another CE of the rule, we can
scan the alpha memory structure of each CE to look for
candidates for satisfying variable bindings rather than having
to scan all of working memory. Usually a WME will match the
constant tests of many CEs and therefore will be stored in many
alpha memories. We sometimes refer to the MATCH phase as
containing two sub-phases, SELECT and JOIN (these terms come
from database terminology). The SELECT sub-phase corresponds to
performing constant tests for a newly added WME so it can be
added to the appropriate alpha memories. The JOIN sub-phase
corresponds to the variable binding tests that are performed in
order to try to create instances to be added to the conflict
- 7 -
OPS5c User's Guide
set.
As a recap, we may describe the incremental matching as
such. When a new WME is matched, we examine all CEs which refer
to the same class as the WME being added. If all constant tests
of a particular CE are satisfied by the WME, we add the WME to
the alpha memory of the CE. At this point we may stop and look
for correct variable bindings in the rule that the CE
participates in. We do this by scanning all the alpha memories
of the other CEs to find consistent variable bindings. Note
that if we find consistent bindings for all CEs of the rule, we
create an instance. Also note that the instance MUST contain
the new WME bound to the CE that we have just added the WME to.
Actually we add to the alpha memory of a CE, not the CE, but
this is mearly terminology and we often use the terms alpha
memory, amem, and CE to mean the same thing - namely a storage
structure for WMEs which satisfy the constant tests of the CE.
- 8 -
OPS5c User's Guide
2. Working Memory
2.1 Class Declarations
As mentioned earlier, classes and the attributes they use
are usually declared before being used in any rules. It is
usually best to declare all classes and specify all attributes
to be used with each class although neither of these practices
are enforced by OPS5. OPS5 will allow the user to reference
class names that have not been defined. It will also allow
attributes which were defined in one class to be referenced in
another class. This is due to the simplistic symbol table kept
by OPS5 systems. No record is kept of which classes declared a
particular attribute name. Unfortunately this "feature" is one
that has been exploited by OPS5 programmers in the past and thus
is handled similarly by OPS5c to maintain compatibility.
Classes and their attributes are declared using the
LITERALIZE statement. As seen in an earlier example, the
keyword LITERALIZE is followed by a class name which may be
followed by zero or more attribute names. The entire declara-
tion must be enclosed in parentheses. It is acceptable to use
the same attribute name in different classes but a class name
may appear in only one LITERALIZE statement.
2.2 Working Memory Element Structure
A working memory element (WME) consists of a time tag and
an array of attributes. The time tag is a unique number
assigned to each WME at the time of its creation. The more
recent the WME the larger its time tag will be. The time tag is
used not only as a unique identifier, but is also the key
feature in the algorithms used by CONFLICT RESOLUTION. In most
OPS5 implementations there are 127 attribute "slots" defined in
each WME. Regardless of the number of attributes defined for a
class, each WME contains room for 127 attribute values. When
the declarations section of an OPS5 program has been parsed by
an OPS5 interpreter or compiler, it assigns an offset in this
array to each of the attribute names that have been declared.
The offset assigned to a particular attribute is the same
regardless of the class (or classes) the attribute has been
declared in. The algorithm must consider which classes the
attribute has been declared in so that it can assign a single
offset to an attribute name while not assigning the same offset
to attributes which appear in the same class. Because of this
it is not valid to assume that the order of attributes in a
class declaration has anything to do with the assignment of
offsets to those attributes. Assignments may be in a different
order and may not be sequential. While most classes will have
offsets not used by any attributes declared in that class, the
unused offsets may have assigned offsets that are either higher
numbered or lower numbered. If a programmer chooses to use
attribute names in classes for which they were not declared when
writing rules, there is the risk that the offset assigned to two
- 9 -
OPS5c User's Guide
or more attributes used in the same class will be the same.
This will usually cause over-writing of data and incorrect
program behavior. This is the main reason that declaration of
all attributes to be used in a class is a recommended practice.
OPS5 also has a mechanism for allowing the programmer to
control the offset assignments to certain attributes. This is
accomplished with the LITERAL statement. This statement simply
assigns offsets to attributes blindly. An example LITERAL
declaration follows.
(literal name = 2 age = 4 address = 45)
Attribute names appearing in a LITERAL statement do not have to
appear in any LITERALIZE, though they may. Since LITERAL state-
ments are processed before LITERALIZEs (although they may appear
intermixed in the source code), the system may find that it
cannot provide assignments for LITERALIZEs which allow no
duplicate offsets in a class, in which case the system will
abort with an error message.
Attributes in a WME are similar to Lisp atoms. For those
not familiar with Lisp, an atom is a structure with a field
indicating the atom type (symbol, integer, float, etc.) and the
data value. Because the type is associated with the data value,
not with the attribute name or location, we may assign a value
with one data type (say an integer) to some attribute, then
reassign another value with a different data type (say a
string). Certain functions in the OPS5 language, namely calcu-
lations and relational comparison predicates, are the only parts
of the language that operate on specific types. Attribute types
will also be a concern to the programmer who wishes to interface
OPS5c programs to external C code. It should be noted that for
comparison purposes symbols (strings) must be identical and are
case sensitive, i.e. 'dog' is not equal to 'DOG'. Normally OPS5
will consider an integer and floating point value to be equal if
their difference, when the integer is converted to float, is
zero. In OPS5c numeric attributes must be the same type to be
equal regardless of their values. By defining the preprocessor
symbol COERCE while compiling the C output of the OPS5c
compiler, the system will perform coercion of incompatible data
types whenever possible.
Whenever a WME is created and assigned values, the
attribute positions which are not assigned values explicitly are
given a value of "nil" (a symbolic atom). Rules may test for
unassigned attribute values by comparing to "nil" in a CE in
their LHS.
2.3 Vector Attributes
Normally we associate a single value with each attribute.
There are cases where we wish to store a set of values that are
associated with a single attribute name. OPS5 provides methods
for finding the offset of an attribute and referencing attribute
slots a certain relative distance past the offset. Thus we can
- 10 -
OPS5c User's Guide
say that the set of values associated with some attribute is all
values with an offset equal to or greater than the attribute's
offset. In order to may sure that we don't over-write other
attribute slots, we may ensure that there are no other
attributes (at least in a specific class) that has a greater
offset. This is done by declaring the "multi-valued" attribute
in the class (or classes) it will be used and additionally
declaring it as a vector attribute. One or more attributes may
be declared as vector attributes using one or more VECTOR-
ATTRIBUTE statements. An example would be:
(vector-attribute classmates teachers)
It should be noted that only one vector attribute is allowed per
class and any attributes declared to be vector attributes will
be treated as vector attributes in all classes that they are
declared in.
The user should be aware that the OPS5c implementation
normally will not treat all WMEs as being of equal length.
Instead it sizes each class according to the maximum offset of
the attributes declared in that class. If the class declaration
contains an attribute that has been declared to be a vector
attribute, then the class will be sized with the maximum number
of attributes that a WME can contain (128 in this
implementation). There are ways, however, of forcing all WMEs
or WMEs of certain classes to be of specific lengths. See the
discussion of the '-w' compiler or run-time switch. This action
might be desirable (in fact necessary for correct program
execution) if the program being run references attributes in a
class for which the attribute was not declared. This is a bad
programming practice but the above mentioned override mechanism
can compensate.
- 11 -
OPS5c User's Guide
3. Rules
3.1 Rule Structure
Below is a template for the syntactic structure of an OPS5
rule:
(p rule-name
(class-a ^attr-1 value-1 ^attr-2 value-2 ...)
(class-b ^attr-x value-x ^attr-y value-y ...)
...
-->
(action-a parameters-for-a)
(action-b parameters-for-b)
...
)
The indentation is optional but is used to emphasize the
different parts of a rule. A more terse template for a rule
might be:
(p rule-name LHS-CEs --> RHS-actions)
Rule names may be any string which does not contain any blanks.
- 12 -
OPS5c User's Guide
4. The LHS
4.1 Condition Elements
As previously mentioned, the LHS of a rule is composed of
one or more condition elements (CEs). A CE is a set of
attribute-value pairs. This is usually an attribute name
followed immediately by a value. In OPS5 an attribute name in a
CE is distinguished from a value by prefacing a string
(representing a declared attribute name) with a caret ('^').
Spaces between the caret and the attribute name are allowed. If
a value is not immediately preceded by an attribute name, then
the value pertains to the attribute slot immediately following
the previously referenced attribute slot. If there are no
previously referenced attributes then the value pertains to the
first WME attribute, which is always the class name of the WME.
Another alternate method of specifying an attribute position is
to follow the caret with a numeric value. The numeric value
indicates an attribute offset for the WME.
4.2 Variables
Instead of specifying an attribute value after an attribute
name, an attribute variable may be specified. An attribute
value is indicated by enclosing a string in angled brackets.
The string may not contain any blanks. An example of an
attribute variable is <age>. The first appearance of a variable
tells the system to create a variable binding by assigning the
attribute value of whatever WME is bound to that CE during the
matching process to the indicated variable. Subsequent
occurrences of the variable in the rule will substitute the
variable value literally for the variable occurrence. The
variable may appear in subsequent CEs or in actions in the right
hand side.
4.3 Quoting
Quoting symbols is a method of using characters that are
not normally allowed in symbols or have some other meaning to
the system. For example, normally a string with one or more
imbedded spaces would be interpreted by the system as multiple
symbols rather than a single symbol. Perhaps we wish to include
the caret character as the first character of a symbol.
Normally the system would interpret any symbol starting with a
caret to be an attribute name rather than a value.
One method of quoting which is supported both in rules and
in the entering of commands from the top level (see section 8)
is the use of quote characters to surround the string to be
treated as a single symbol. The three quote characters
supported are the single quote, the double quote, and the
vertical bar. Another method of quoting which is only supported
- 13 -
OPS5c User's Guide
in CEs is by using the OPS5 quote operator which is two
consecutive slashes. Using this method, a space must separate
the quote operator and the symbol to be quoted. Some examples
of quoted symbols are:
"Hello world!"
'"hi," she said'
|That's correct|
// ^dog
4.4 Predicates
Predicates are relational tests that the user may want
performed between the attributes of WMEs and the CE pattern
during matching. Predicates are specified by placing the
predicate preceding the test value where one would normally
specify a value to match against. For example, if we want to
test for children under the age of twelve, we might specify
'^age < 12' as our attribute-value pair. We may also use
variables instead of constants to test the attribute against,
thus we could have written '^age < <my-age>'. Seven predicates
are defined, equal '=', not equal '<>', less than '<', greater
than '>', less than or equal '<=', greater than or equal '>=',
and same '<=>'. The same predicate test to see if the attribute
and test value have the same type, i.e. if both are numeric or
both are symbolic. If the variable has been previously bound,
i.e. it either appears in some earlier CE or appears earlier in
the current CE, then in the absence of a predicate, equivalence
testing is performed with the variable binding rather than
creating a new variable binding.
4.5 Conjuctions
We use braces ('{','}') when we wish to indicate that there
are several tests that we wish a single attribute value to hold
true for. For example if we are looking for an age group we
might specify '^age {> 20 < 25}'. If we also wish to bind the
attribute value to an unbound variable, we should include that
variable without specifying a test predicate. Additionally, of
course, the variable should not have appeared in a previous CE
of that rule (i.e. the variable is unbound). For example: '^age
{<her-age> > 20 < 25}'.
Another use for braces is as a place holder in a pattern.
The pattern (cat {} black) will match WMEs of class 'cat' whose
second attribute may be anything and whose third attribute is
'black'. Note that when we say second or third attribute we are
referring to the offset numbers assigned to attributes. This
may or may not have anything to do with the order that
attributes appear in the literalize statements.
- 14 -
OPS5c User's Guide
4.6 Disjunctions
Disjunctions allow the matching of one of a selection of
choices. Disjunctions are indicated by enclosing a set of
choices in double angled brackets ('<<','>>'). An example would
be the CE
(cat << black tan yellow >>)
which would match all WMEs of class 'cat' whose second attribute
is either 'black' or 'tan' or 'yellow'. It should be noted that
there is an implicit quoting mechanism in the use of
disjunctions. A CE such as
(cat << black tan <x> >>)
would specify the literal '<x>' as an alternate value for the
second attribute rather than match the current binding of the
variable x. Hence, variables are not allowed to be bound or
tested in a disjunction.
4.7 Negated Condition Elements
Condition elements may be negated by placing a minus sign
before the rest of the CE. This has the meaning that in order
for a rule to fire, we must be able to find a WME that matches
each of the positive CEs and there must not exist any WMEs that
match the negated CEs. As with positive CEs, negated CEs may
contain tests against previously bound variables. In order for
an instance to be created, there must exist a WME for each
positive CE which satisfies all tests whether they are constants
tests or variable bindings that test values between different
CEs. When negated CEs are present in a rule we have the
additional restriction that for any potential instance which
might be created when we consider only positive CEs, we must
also not be able to find any WMEs matching the negated CEs. Any
WMEs which match a negated CE (for a particular instance) is
said to "block" the creation of that instance for the rule.
Since negated CEs may contain variable bindings, we must speak
of WMEs that match negated CEs with respect to the instance.
The instance (considering positive CEs only) defines the
variable bindings which can decide which WMEs do or don't match
negated CEs. This also implies that variables cannot be bound
in negated CEs which might be intuitively expected since the
meaning of a negated CE is that something of this pattern
couldn't be found.
If all WMEs which block an instance are removed from
working memory, and those WMEs which match the positive elements
remain, then one or more instances will be formed once the last
blocking WME has been deleted. (Remember that this does not
mean that the instances from this rule will necessarily fire, it
just means that its instances are added to the conflict set and
are allowed to compete in the selection of the next instance to
fire.) Similarly, if a WME is added to working memory and it
matches a negated CE for some instance which was previously
added to the conflict set but not yet fired, that instance will
be removed from the conflict set. Should the blocking WME later
- 15 -
OPS5c User's Guide
be removed and all the WMEs matching positive CEs remain, the
instance will be recreated and added to the conflict set. This
is the only set of circumstances which allows a unique instance
to be created more than once (i.e. it must be blocked between
creations). This may also occur even if the instance had been
fired. The removal of the blocking WME might still have
resulted in a second (or subsequent) recreation of the instance.
This is usually not a concern but in some cases the programmer
should take into account that such a situation can occur.
For convenience, we will use the abbreviations +CE and -CE
to stand for positive condition element and negative condition
element, respectively.
4.8 Element Variables
Just as we might wish to bind an attribute value to a
variable for use in RHS actions, we may also wish to specify
that some action be taken with respect to the WME matching a
particular CE of the rule. We may designate such a WME in one
of two ways. The first is simply a reference number, between 1
and the maximum number of CEs per rule (64 in the current
implementation), which specifies the CEs appearence position in
the LHS. Another method is by associating a name with certain
CEs that we may wish to reference. Such a name is called an
element variable. This is the preferred method of referencing
WMEs from the RHS since deletion and addition of new CEs in the
rule can cause position shifts which would affect the numerical
element designator, but not an element variable. We specify an
element variable by placing a variable name (a symbol enclosed
in angled brackets) either before the CE or after the CE and
enclosing both the variable and CE in braces. Below are
examples of element variables in the LHS.
{ <THE-CAR> (car ^manufacturer buick ^make <m-name>) }
{ (person ^hobby tennis) <THE-PLAYER> }
We shall see how to use such references when we discuss those
RHS actions that operate on WME references.
- 16 -
OPS5c User's Guide
5. The RHS
5.1 Element Designators
Some RHS actions require an element designator to specify a
particular WME of the instance being fired that the action is to
be performed on. There are two ways of making this
specification - an element variable, or a numeric offset. The
element variable specification is simply the element variable
name (with enclosing angled brackets). The numeric offset
specification indicates the number of the WME appearing in the
instance. Note that these numbers start at one and -CEs are not
counted. Thus if a LHS has a +CE followed by a -CE followed by
another +CE, the numeric element designator for the last CE
would be two.
5.2 RHS Patterns
Patterns in the RHS are similar in form to those of the LHS
but have a somewhat different meaning. RHS patterns consist of
constants, variables, attribute names, the quote operator
('//'), and function calls. Unlike LHS patterns, which are
evaluated at compile time, the evaluation of RHS patterns occurs
at run time. The evaluation of RHS patterns has the effect of
altering a global buffer area called the result element. The
result element (RE) is simply an array of attributes. One can
think of it as a global WME which is not actually part of
working memory. Associated with the RE is a pointer which keeps
track of the current position in the buffer. Terms in the RHS
pattern which evaluate to attribute references (names and
attribute offsets) have the effect of moving the RE pointer,
while terms which evaluate to values are placed in the locations
indicated by the RE pointer. When multiple values are placed in
the RE without an intervening attribute name or offset
reference, the values are placed in consecutive locations
starting at the location indicated by the RE pointer. Each time
a different pattern is evaluated, the result element is filled
with the atom special 'nil' prior to evaluation. As we shall
see later, the RE is used not only for pattern evaluation, but
for argument passing and value reception with external user-
defined functions and actions.
All RHS actions consist of an action keyword, followed by a
pattern. The pattern is evaluated and the resulting list of
instantiated values is used as arguments for the action.
Pattern instantiation is quite similar to that in the LHS with
the effects of constants, attribute names, and the quote
operator being identical. With the exception of the BIND and
CBIND actions, variables in RHS patterns are not bound, but are
mearly instantiated. Functions do not exist in LHS patterns.
In RHS patterns they are indicated by enclosing a list of one or
more terms in parentheses. The term following the opening
parenthesis must be a constant specifying the function name to
- 17 -
OPS5c User's Guide
be called. The remainder of the list is evaluated and passed to
the function as arguments. Function arguments may only be
constants or variables, thus no nesting of function calls may be
used. Refer to section 5.3 for a discussion of predefined
functions and to section 9.5 for a discussion of user-defined
functions.
Another difference between LHS patterns and RHS patterns is
the use of variable attribute names. The term '^<aname>'
appearing in the RHS would be interpreted as the attribute name
or offset number specified in the variable 'aname'. This syntax
is not allowed in LHS patterns.
5.3 Pre-defined Functions
All RHS functions return one or more values at consecutive
locations in the result element. The values returned start at
the current position in the result element.
5.3.1 genatom
The genatom function is used to generate a unique symbolic
value. This value is placed at the current position in the
result element.
5.3.2 litval
This function takes a single symbol or variable which
represents an attribute name. It returns the offset assigned to
it in all WMEs. If the symbol or variable value is not an
attribute name, then the function returns zero.
5.3.3 substr
This function extracts a sequence of one or more values
from a WME and places them in sequential locations in the RE
starting at the current position. The first argument to the
function is the element designator which indicates the WME to
extract from. The second and third arguments indicate attribute
offsets which specify the range of attribute values to copy to
the RE. Either of these two arguments may be specified as an
integer, an attribute name (which is converted to its assigned
offset), or a variable that is bound to an integer or attribute
name. The order of the range specification arguments does not
matter.
5.3.4 compute
This function evaluates arithmetic expressions. Expres-
sions may contain the operators '+', '-', '*', '//', and '\\'.
The latter two are used for division and modulus. Operands may
- 18 -
OPS5c User's Guide
be numeric constants or variables bound to numeric values. Note
that compute has no operator precedence and operators are
evaluated from right to left (not left to right as one might
expect). This may be overriden by the use of parentheses.
5.3.5 accept
This function reads input from some input stream. An input
stream is usually keyboard input typed by the user, or a file.
If called without arguments, accept reads input from the current
default input stream. A single optional argument may be
supplied which indicates an alternate input stream to read from.
The optional argument must be either a symbol or a variable
bound to a symbolic value. This string indicates the file
handle of the input stream (see section 5.4 for descriptions of
accessing file streams using RHS actions or top level commands).
If the input begins with an open parenthesis, then a list
of values will be read and placed at sequential locations in the
RE starting at the current position. Input is read until a
closing parenthesis is found. If the first character of the
input is not an open parenthesis, then a single input value is
read into the RE.
If the input function tries to read past the end of a file,
then the atom 'end-of-file' is the value returned.
5.3.6 acceptline
This function also reads input from a stream, but differs
from the accept function in that it reads exactly one line of
input, strips out all parentheses, and then places the values
read into the RE. Parameters are optional. If any parameters
are supplied, they are used to specify the input stream and/or
affect the behavior of the function when there are no values in
the line input or an end of file condition exists. If the first
parameter evaluates to a symbol associated with an input stream,
then that parameter acts as a specification of the input stream
to read from. If the first parameter is not associated with an
input stream, then it is treated the same as subsequent
parameters. The remaining parameters, if present, will be
placed in the RE if an empty input line is read.
5.4 RHS Actions
The actions supported in the RHS for OPS5c are 'make',
'remove', 'modify', 'write', 'bind', 'cbind', 'call', 'halt',
'openfile', 'closefile', and 'default'. These are the same
actions supported by OPS5 except that OPS5 also supports a
'build' command that may be used to dynamically add rules to the
system at run-time. This is not supported in the OPS5c compiled
environment.
- 19 -
OPS5c User's Guide
5.4.1 make
The make command is used to create new WMEs to be added to
WM during the MATCH phase of the next cycle. The pattern
following the make keyword is instantiated and a new WME is
created. Minimally a class name must be given and optional
attribute-value pairs or simply values may be supplied. For
example, the action:
(make person ^name Cindy ^age 25)
creates a new WME for the class person with the value 'Cindy' in
the attribute slot for 'name' and the value 25 in the attribute
slot for age. Similarly to patterns appearing in the LHS,
attribute names may be omitted and values are assigned to
sequential slot offsets after the last referenced offset.
5.4.2 remove
The remove action indicates one or more WME elements to
remove. These WMEs are indicates by a list of element
designators, which as previously mentioned can consist of either
a numeric CE offset (starting at 1) or an element variable name
(including the angled brackets surrounding the variable name).
It is not an error to remove the same CE twice, but removals
after the first are ignored.
5.4.3 modify
The modify action can be thought of as a make followed by a
remove. The first parameter to modify must be a single element
designator. This copies the WME specified to a temporary work
buffer where the remaining pattern elements are interpreted as
attribute-value pairs that alter the buffer before executing a
make command using the buffer values. Once a WME has been
modified, the old WME is removed. Since multiple removes are
allowed without ill side effects, and removes are not actually
processed until the RHS actions have been executed, it is
possible to have multiple modifications of a single WME create
several new WMEs. For example, a rule such as:
(p split-path
{ <SEGMENT> (segment ^start <s> ^end <e>) }
(new-point ^value <p>)
-->
(modify <SEGMENT> ^end <p>)
(modify <SEGMENT> ^start <p>)
)
could be used to make two line segments from one by adding a new
point in between the lines current start and end points. Either
- 20 -
OPS5c User's Guide
modify command will operate using the original contents of the
WME bound to <SEGMENT> and the <SEGMENT> WME will be removed at
the end of the ACT phase.
5.4.4 write
The write function evaluates a pattern and writes this
output to an output stream. If the first argument to write is a
symbol (or variable with a symbolic value) that is associated
with an output file handle, then the first argument acts as an
output stream specifier, otherwise the first argument is treated
the same as the rest of the arguments. Each evaluated argument
is written to the output stream. There are three special
"functions" which can modify the action of write. These are
'tabto', 'rjust', and 'crlf'. The tabto function takes one
argument which must evaluate to a numeric quantity. This
quantity is the column number that the output cursor is advanced
to by printing space characters. If the current output line is
already past the column number specified, then a newline
character is sent first. The rjust function also takes a single
numeric parameter. This indicates a field width. The value
following the rjust function will be printed, right justified,
in a field whose width is x characters, where x is the value of
the numeric parameter. The crlf function takes no arguments.
It sends a newline character to the output stream. These three
special functions are not really functions at all. They only
have meaning with the write action and are referred to as
functions because they follow function call syntax. Example:
(write (crlf) "The age of" <name> "is" <age> (crlf))
It should be noted that printing of subsequent values always has
a space between the values. Thus (write "hello" !) will produce
the output 'hello !', not 'hello!'.
5.4.5 bind
The bind action is used to bind a functions return value to
a variable. Example:
(bind <x> (compute <x> * 2))
5.4.6 cbind
The cbind action is used to bind an element variable to the
last element that was added to working memory. The addition is
usually due to a previous make or modify action, though WMEs may
be added by calling external functions. The cbind action takes
a single argument, the variable to be bound to.
- 21 -
OPS5c User's Guide
5.4.7 call
The call action is used to invole a user-defined action.
The first parameter to call is the name of the user routine.
The remaining parameters are pattern elements which are
evaluated before being passed as parameters to the routine.
Section 9.5 and the Appendicies describe interfacing with user-
written C routines and library routines that may be called from
these routines. Names of user-defined routines should be
declared (with class declarations) by using the external
declaration. Example:
/* In declaration section */
(external DrawLine)
...
/* Some rule RHS */
(call DrawLine <x> <y>)
5.4.8 halt
The halt action terminates program execution irregardless
of the state of working memory or the conflict set.
5.4.9 openfile
The openfile action opens a file for reading or writing and
assigns a file handle to a symbol for future reference to that
file. The first parameter supplied to openfile is the name to
be used as a file handle in subsequent references. The second
parameter is the file name, usually in quotation marks. The
third and final parameter is one of two keywords, either 'in' or
'out'. These indicate if the file is being opened in read or
write mode, respectively.
5.4.10 closefile
The closefile action is used to close files previously
opened with the openfile action. It takes the file's handle as
a single argument.
5.4.11 default
The default action determines where various output streams
created by OPS5c are directed or where the input stream is read
from. The three streams used by OPS5c are 'trace', 'write', and
'accept'. The trace stream is where any trace information, such
as which rule fired on each cycle, is directed. Various levels
of trace information, including no trace information, can be
provided as the system executes. See the description of the top
level 'watch' command in section 8 for information on trace
levels. The write stream is where output from the write action
is directed. The accept stream is the input stream where the
- 22 -
OPS5c User's Guide
accept and acceptline functions read input from. One of the
more common uses of default is to redirect trace information to
a file.
The default action takes two parameters, a file handle and
a stream designator which must be one of 'trace', 'write', or
'accept'. Example:
(openfile trace-output "foobar.trace" out)
(default trace-output trace)
/* run program */
(closefile trace-output)
- 23 -
OPS5c User's Guide
6. Conflict Resolution
The purpose of conflict resolution is to choose one
instantiation from the conflict set which is to fire on the
current cycle. There are many criterion which may drive
conflict resolution in a system like OPS5, but the two major
driving forces in OPS5 conflict resolution strategies are
recency and specificity. These are discussed in more detail
below.
The conflict set may be viewed as a list of possible
assertions that may be made on a particular cycle. A single
instance can be thought of as a set of facts and the
justification (rule) which allows new facts to be asserted (the
RHS actions). Once a particular instance has been selected for
firing by conflict resolution, we must remove that instance from
the conflict set, otherwise it may fire again and cause a
condition known as trivial cycling where the same rule fires
repeatedly.
When a new assertion has been made, say adding a new WME to
the working memory, it is frequently the case that this new
element will be the basis for further inferences. Most systems
will have a large search space of possible solutions to the
problem being solved and the most common system behavior is to
find one of many possible solutions rather than all solutions.
this implies that in scanning parts of the solution space, we
wish to traverse the tree of selection possibilities in a depth-
first fashion as opposed to a breadth-first fashion. This is
done by assuming that given two facts to reason on, the most
recent fact is probably the better one to base assertions on.
This heuristic is known as recency and is a strong driving force
in the OPS5 conflict resolution strategies. To support recency,
each WME in the system is assigned a time tag. This is a unique
integer number given to the element at the time of its creation.
Part of the conflict resolution strategy is to look at the time
tags of all the WMEs that make up each instance, and order the
instances according to the recency of the WME time tags
associated with it.
Another heuristic used in conflict resolution is that of
specificity. Specificity assumes that if two rules are very
similar, but one either contains more condition elements in the
LHS or more tests for the same number of condition elements in
the LHS, then the rule with more CEs/tests must be a special
case of the other, more general, rule. Specificity demands that
we execute the rule with more CEs/tests first.
There are two conflict resolution strategies in OPS5, LEX
and MEA. They are fairly simple, with MEA being a modification
of LEX. We shall describe the LEX strategy first. The criteria
for comparing two instances is used to order the conflict set.
First, the time tags of the positive CEs are ordered in
descending order. The first time tag in each series is
compared. The instance with the greater tag is the more recent
and is thus preferred over the other one. If the time tags are
the same, then the second tags in each list are compared. This
continues until a difference in the lists is noted. If the
- 24 -
OPS5c User's Guide
lists are of unequal length, then the longer list is preferred,
provided the pair-wise comparisons of tags do not favor the
shorter list before its elements are exhausted. This provides
for selection of the more specific rule (the one with more CEs
has the longer list). If two lists are the same length and have
the same elements, then the number of tests (both constant tests
and variable binding tests) are compared and the rule containing
the most tests is selected. This also favors the more specific
rule. If the number of tests are equal then an arbitrary
selection if made.
The MEA strategy has the same rules as the LEX strategy
with one change. We first compare the time tags associated with
the first CE of each instance's rule. If that comparison is
equivalent, then we order the remaining time tags and perform
the pair-wise comparisons. This modification exists because the
first CE of a rule is often a context clause which can define
when a rule is eligible to fire. More will be said about
context clauses in section 7.
- 25 -
OPS5c User's Guide
7. OPS5 Programming
This section is not intended to teach you all the tricks of
programming in OPS5. There are books devoted to this subject.
Instead, this section is intended to give you an overview of the
type of structures you can expect to see in typical OPS5
programs and some of the issues that should be of concern when
you begin writing your own OPS5 programs. The novice programmer
will find this section important to understanding rule-based
programming in OPS5 while those who have some knowledge of OPS5
will probably already know most of the points outlined here.
By its nature, rule based programming is data driven.
Input data in the form of assertions active certain rules, which
may make new assertions activating different rules and so on. A
program execution is a constant cycle of modifying facts and
making and retracting assertions via rule firing. This is quite
a bit different from the declarative programming languages that
most of use use such as C, Fortran, Pascal, etc. In
conventional languages we specify an algorithm or a sequence of
steps to perform. We know that A will be done before B because
we have written the program such that the function that performs
A will be called before the function that performs B. In
contrast, in a rule based system we make statements that when
certain facts hold true, it is appropriate to assert that the
addition or deletion of facts in the database are appropriate
should a given rule be selected to fire. Usually we don't know
when a rule will fire, the number of times it should fire, or
even if it will fire. If our logic manipulating the facts is
correct then we shouldn't care.
While this seems like a desirable trait, there are
pitfalls. First, the methods that we use to solve a problem can
almost never be totally data driven. Each method we use can be
broken up into parts or sections which solve subtasks of the
larger problem. Many times it is necessary to solve one subtask
before considering the solutions to other subtasks. For an
efficiency standpoint it is not reasonable to incur the
execution overhead involved in maintaining a state of reasoning
for some rule if we know that rule does not pertain to the
current subtask the system is trying to solve. Not only is it
inefficient to perform such unnecessary calculation, it may be
incorrect to do so. If we have no mechanism to disallow a rule
from firing, it may activate and fire prematurely making an
incorrect assertion based on incomplete data. To prevent both
undesirable actions, we have certain implementation dependent
rule evaluation procedures and also allow the programmer to use
such features to control the flow of the program execution
somewhat.
We wish the system to do the minimum amount of work
necessary to correctly match rules that may fire with facts. It
should be obvious that if we do not have at least one fact in
each alpha memory for the +CEs in a rule, then we cannot create
an instantiation for that rule. If the rule does not have at
least one WME in each of its +CEs, then the rule is considered
inactive and the only computation done is to place new WMEs
- 26 -
OPS5c User's Guide
being added to working memory into the appropriate CEs for that
rule. We do not attempt to calculate variable bindings because
we know that at least one CE cannot have anything bound to it,
thus such calculations would be useless. This simple
implementation rule allows a programmer to greatly increase both
program efficiency as well as control. This is accomplished by
the programmer's use of context clauses. Normally, a context
clause is a class declared by the programmer which describes the
current goal or focus of the program. Normally such a class is
given a name like 'goal' or 'context'. Each rule in the system
(or at least most of them) will contain one CE specifying the
context in which that rule is eligible to fire. If the context
is currently set to something else, and we have already
specified that there is only one context element in the system
at a time, then the first CE of that rule will have nothing in
it and we will not attempt variable binding computations for the
rule when WMEs are added to the other CEs of the rule. The
context specifies that the current goal, and nothing but the
current goal, will be worked on by the system. Rules not
pertaining to solving the current goal are given a minimal
amount of attention by the system. Note that when we recognize
that it is appropriate to change the system goal, we must modify
the existing goal element. Frequently, this will cause a large
amount or work to be done, because we will destroy all the
instance that the old goal element participated in and there are
often many rules that the creation of a new goal element will
activate and perform variable binding calculations for. For
this reason, we can see that it would be very inefficient to
code a rule based program with a large number of rules
pertaining to a few goals and switch between the goals
frequently. Each goal switch could force a large amount of work
to be done.
One trick that is common in OPS5 programming pertains to
switching contexts or goals. Often we may desire to execute
some task (a specific goal or context) until no more rules in
that context can fire. Then we wish to change the goal. To do
this we rely on the OPS5 conflict resolution strategy. Remember
that in resolving which instance is to fire on the current
cycle, we use time ordered tag lists. If we exhaust one list
while doing the comparison between two lists, then the longer
list wins. This is the same as saying that when comparing
instances between two rules, given identical time tags
participating in both rules, the rule with the most CEs will win
the conflict resolution. We can therefore write rules to switch
context by relying on the fact that such a rule will never fire
as long as a more specific rule (one with more CEs) can fire.
As an example, suppose that some part of our algorithm has
produced a set of guesses, we select one guess to take action
on, then we wish to delete the non-selected guesses, perhaps
with the intent of creating a new set later. We might see rules
such as:
- 27 -
OPS5c User's Guide
(p remove-guesses
(goal ^value delete-guesses)
{ <THE-GUESS> (guess) }
-->
(remove <THE-GUESS>)
)
(p process-guess-start
{ <THE-GOAL> (goal ^value delete-guesses) }
-->
(modify <THE-GOAL> ^value process-guess)
)
We know that although the second rule will be active through
this subtask, it cannot fire unless the is no possibility of
firing the first rule. Hence we will achieve our goal of
deleting the old guesses before moving on the the task of
processing the one selected.
- 28 -
OPS5c User's Guide
8. Top Level Commands
Top level commands are the interpreted commands that the
user types in. The top level commands used in OPS5c are very
similar to those used in the lisp-based OPS5. Some commands are
unsupported while other commands are added to OPS5c to enhance
its usability.
When a user executes an OPS5c program, a few lines of
system identification are printed followed by the prompt 'Top
Level> '. At this point the user can type any of the commands
listed in this section. Minimally, the user must do something
to initiate processing by the OPS5c system. Initially there is
no information in the working memory. In general there are
three ways that a user can initiate program execution once the
initial top level prompt is given. He can created WMEs using
the top level make command. He may use the make command to
create a start symbol. Or he may use the include command to add
many WMEs to working memory.
The first two methods are the same techniques used in OPS5.
If the user knows that certain WMEs need to be created at the
start of every run, he can create an initialization rule whose
RHS makes the necessary elements. The LHS of such a rule
usually depends on the existance of a single element of some
class such as 'start'. The last method allows the user to
specify a path name to a file containing make statements. This
file will be interpreted to create the WMEs needed. Note that
unlike OPS5, the act of making a WME mearly means that we have
created storage representing the WME and correctly initialized
the WME's attribute values. The WME will not actually be
inserted into working memory until a cycle is run. Thus if you
type make commands from the top level prompt, do not expect to
see any instances created or insertions into alpha memories
until you have issued the run command. The same effect will be
noticed when using the remove command. Instances that may be
created as a result of removing a WME will not be observed until
one or more cycles of the system are executed following the
issuing of the remove command.
Many RHS actions are also supported as top level commands
with nearly identical syntax. The main limitations imposed by
using the top level command is that you cannot use variable
references, the quote operator ('//'), and cannot use functions
as arguments. Commands with these characteristics include make,
remove, openfile, closefile, default, and call. For the remove
command, instead of supplying a list of element designators to
delete, the user supplies a list of time tag elements to be
deleted. If called with an asterisk as the only argument, the
command will delete all of working memory. The following sub-
sections describe the remaining top level commands. Those
specific to OPS5c are indicated as such.
8.1 include (OPS5c only)
This command is used to read a text file containing make
- 29 -
OPS5c User's Guide
commands. Appropriate WMEs are created but are not added to
working memory until at least one cycle has been run after the
include command. The include command takes a file name as the
single parameter. The file name may be quoted using double
quotes, single quotes, or vertical bars. Example:
(include "auto-diagnose.input")
8.2 run
This command starts executing the match/resolve/act cycles.
Optionally an integer parameter may be given to run to indicate
the maximum number of cycles to run before returning to the top
level. This is particularly useful for debugging where you may
want to examine the state of the system after a certain number
of cycles and before the program has completed executing.
8.3 watch
This command tells the system how much trace information to
print out as the system executes. Watch takes a single numeric
parameter which must be 0, 1, 2, or 3. A watch level of zero
indicates that no trace information is to be output. The only
output from the program will be that generated by write
statements in the RHS of rules that fire or from externally
called procedures. The default watch level of one will print a
cycle number, rule number, rule name and time tag list for each
instance that is fired. A watch level two provides the watch
level one information in addition to indicating when working
memory elements are made or removed. The WME itself is printed
preceded by '>>' if the element is being added to working memory
or '<<' if the element is being removed from working memory. A
watch level of three will indicate watch level two information
in addition to showing instances being added and removed from
the conflict set.
8.4 ppwm
The ppwm command is used to print WMEs. The command is
followed by a pattern which can only consist of constants and
attribute names. The system will scan all of working memory for
the specified pattern and will print out each WME (including its
assigned time tag) that matches the pattern. When no arguments
are given to ppwm, all of working memory is printed.
8.5 wm
The wm command is similar to the ppwm command except that
instead of specifying a pattern as the arguments of the command,
a list of time tags is specified. Similar to ppwm the command
prints all of working memory if it is called with no arguments.
- 30 -
OPS5c User's Guide
8.6 cs
This command prints out the contents of the conflict set.
The dominant instance (the instance to fire next) is indicated
as such.
8.7 strategy
This command selects the conflict resolution strategy. It
takes either 'lex' or 'mea' as a single argument, or if called
with no arguments it prints the current conflict resolution
strategy.
8.8 pbreak
This command can set break points on specific rules. When
called with a rule name it will set a break point on that rule
if one is currently unset or remove a break point for that rule
if one is currently set. Having a break point set for a
particular rule means that the system will return to the top
level immediately after firing an instance of that rule.
Issuing the command with no arguments prints the names of all
rules which currently have break points set.
8.9 dump (OPS5c only)
This command prints out the contents (WME time tags) of
each of the alpha memories.
8.10 size (OPS5c only)
This command prints out the number of WMEs in the working
memory. The number of WMEs in each class is listed separately.
- 31 -
OPS5c User's Guide
9. The OPS5c System
9.1 Installing OPS5c
Each C compiler has directories that it searches for
standard include files when compiling and standard libraries
when linking. By placing the appropriate OPS5c files in these
places, you may use the OPS5c compiler just as you would your C
compiler. The compiler itself is called 'ops5c'. You may place
this somewhere in your execution path if you desire (C: or
elsewhere). The header file needed to compile the output from
the OPS5c compiler is called 'o5c.h'. Place this file with
other standard headers such as 'stdio.h'. The OPS5c run-time
library is called 'ops5c.lib'. Place this with your standard
libraries such as 'c.lib'. The commands for compiling the
sample program 'tourney.ops' using the Aztec C compiler is given
below:
ops5c test_files/tourney.ops
cc tourney1.c
ln -o tourney tourney1.o -lops5c -lm -lc
Note that to produce the very fastest executables you may turn
on optimization when compiling the C output file, but you will
probably find that this has little effect other than making your
compile take longer. It is far more important to optimize the
library code. The code provided in the distribution was created
using Aztec C 5.0 with complete optimization turned on (-so).
9.2 Differences from OPS5
OPS5c was designed to be as compatible as possible with
existing OPS5 programs. Complete compatibility is virtually
unobtainable since OPS5c is compiled and typically OPS5 is
interpreted. Other differences may be observed due to
implementation details of underspecified features of the
language. The following list is a summary of the most important
differences that you may encounter when running OPS5c.
o The pm, matches, excise, and back top level commands of
OPS5 are not implemented in OPS5c.
o The top level command dump is used to examine the alpha
memory contents. It takes no arguments and prints out
the contents of all alpha memories.
o The top level command size is used to count the number
of elements in each class and add them to determine the
entire size of working memory.
o The top level include command works similarly to the
- 32 -
OPS5c User's Guide
load command found in some Lisp implementations of OPS5.
It accepts a file name as a parameter. The indicated
file must contain only make statements which will add
new elements to working memory.
o Several of the '$' functions used by OPS5 do not make
sense in a non-Lisp environment. Additions have also
been made to the $functions (see Interfacing With
External Code).
o It may or may not be possible to provide execution of
OPS5c which is identical to that of OPS5 for programs
that rely on arbitrary selection of instances. This
occurs when two are more instances are considered
equivalent by the conflict resolution rules and they
exist simultaneously in the conflict set. See Run-Time
Switches for a possible fix to this problem.
o OPS5c does not normally perform coercion between numeric
data types. You can make the system perform coercion by
defining the preprocessor symbol COERCE when compiling
the <program>1.c file. This will substitute a function
call to perform type checking and coercion for the
equals and not-equals tests rather than use the macros
that are normally used.
o OPS5 specifies that all WMEs should contain the same
number of attributes, usually 128 in most
implementations. In order to conserve space on small
systems, OPS5c sizes all WMEs of a particular class
according to the maximum offset of the attributes
declared in the class. (Note that this is not the same
as the number of attributes in the class since one
attribute may have a large offset assigned to it.) This
may be overriden with the -w switch at either compile-
time or run-time.
o OPS5c supports modular compilation. A large system may
be compiled into multiple C source code files, then
linked together to form an executable. This is useful
for systems with small amounts of RAM or compilers that
cannot handle large symbol tables. It often speeds the
compilation process as well.
o It should be noted that in this implementation the
maximum number of CEs per rule is 64. It is STRONGLY
recommended that you limit the number of CEs per rule.
(No more that 10 CEs per rule is suggested for
performance reasons.) Large rules should be broken up
into smaller ones as necessary.
- 33 -
OPS5c User's Guide
9.3 Compiler Switches
There are five switches accepted by the OPS5c compiler.
These are -t, -s, -c, -o and -w. The most commonly used switch
is -s. This switch controls the ordering method used by the
compiler. A separate join code sequence is generated for each
CE of each rule. There are two methods used for ordering the
scan order of the join. Seed ordering uses the CE being added
to as the first CE in the join ordering with the remaining CEs
being examined in lexical order. Heuristic ordering is based on
seed ordering but tries to examine CE dependencies to produce a
more efficient join order of the non-seed CEs (those remaining
after the seed CE has been selected). Normally heuristic
ordering is used since it normally produces results as good or
better than seed ordering. The -s switch indicates that seed
ordering should be used. One item of concern is that the
heuristic ordering algorithm will perform exhaustive search of
all orderings in the worst case. This only happens for rules
with highly connected CEs (CEs which have common variable
bindings). If worst case behavior is observed, the number of
orderings examined is proportional to the square of the number
of CEs in a rule, therefore the compilation time for rules with
a large number of connected CEs may be very high. The compiler
prints an asterisk to the screen upon completion of each rule
compiled. If compilation appears to be very slow then the user
should consider aborting the compile and recompiling with the -s
switch.
The -c and -o switches are used in conjunction with
creating multiple output modules. Multiple output modules may
be necessary on some systems to overcome certain software or
hardware limitations. When multiple output modules become
necessary, the compiler produces a series of files named
'<file>#.c', where the # sign is a numeric sequence number. It
also produces a script file called 'comp.sh' which can be used
to compile and link the multiple modules. Multiple modules may
be advantageous for systems with limited memory which may not be
able to compile an entire large rule system. They may also be
necessary for linkers which have hard set limits on the number
of symbols a single object module may have. Normally the
compiler counts functions produced for join operations and
functions produced for executing right-hand-side actions. These
functions will constitute symbols present in object modules.
The compiler has a cut-off setting which indicates when a
module has too many symbols. If compiling an addition rule into
the current module will exceed the symbol cut-off, then the
current module is terminated and a new module is started. The
default symbol-cutoff value is 3000. The user may specify a new
cut-off value by using the -c switch with a numeric value for
the symbol cut-off immediately following. To suppress module
creation and produce only one module, use the -o switch.
The -t switch is used to specify tracefile execution.
Tracefile execution is used to run benchmarks comparing OPS5c
with other systems. A tracefile of some expert system run is
produced with the other system using a watch level 2 setting
- 34 -
OPS5c User's Guide
(with the watch command). This tracefile can be read by OPS5c
to determine appropriate RHS actions after each conflict
resolution step. In this way we do not have to worry about
recoding external functions used in RHS actions in order to run
OPS5c, the tracefile will determine RHS actions. Note that this
method may give poor results since the cost of interpreting the
tracefile may be considerably more or considerably less than
that observed had external function calls to C routines been
used. To use tracefile execution, follow the -t switch with the
path name of the tracefile.
The -w switch allows the user to specify the size of the
WMEs created at run-time. It applies to all WMEs rather than to
specific classes. Its use is primarily for programs using
undeclared classes or attribute specifications in classes that
the attributes were not declared in. The switch is followed by
an integer number from 1 to 128 which indicates the number of
attribute slots the WMEs will contain. This number must be as
large as the largest offset of any class or the program will not
execute correctly. Programs whose class declarations contain
all attributes used by references of that class do not have to
worry about setting the -w flag as the system will correctly
size each class.
9.4 Run-Time Switches
The executable produced by compiling and linking the OPS5c
output files will accept three switches at run-time, -r0, -r1
and -w. The -r switches are used to reverse the sense of
resolving arbitrary comparisons in the conflict resolution step.
When two equivalent instances are compared the system can either
retain its current best choice, or swap the best choice with the
newly encountered equivalent one. Whether or not the system
performs the swap when this equivalence arises is determined by
these switches. The -r0 switch controls intra-rule comparisons,
i.e. comparisons between instances produced by the same rule.
The -r1 switch controls inter-rule comparisons - comparisons
between instances produced by different rules.
The -w switch works identically to its compile-time
counterpart. If used, it overrides variable sizing of classes
and will also override any setting created by use of the -w
compile-time switch.
9.5 Interfacing With External Code
OPS5c has the capability of interfacing with externally
written C code. Actually the type of code that is interfaced to
is irrelevant. It may be FORTRAN or assembler as long as the
translators output a common object format understood by the
linker. All external programs must use the interface defined by
OPS5c for passing parameters and results between the expert
system and the external routines. Additionally, the external
routines must have some understanding of the data types and
- 35 -
OPS5c User's Guide
symbol tables used by the OPS5c system in order to correctly
interface with the expert system.
9.5.1 Data Types
The OPS5c system has the attribute as its simplest
unstructured data type. An attribute is actually synonymous
with the term atom as used in the Lisp programming language and
both terms will be used interchangably. An atom is actually
many different data types. An atom is a structure containing
two parts, the storage location for a data type, and a flag
indicating what type of data is actually being stored there. An
atom is similar to the concept of a union in the C language, and
in fact is implemented as such: (as defined in o5c.h)
typedef union {
float fval;
int ival;
StrID string;
} Data;
typedef struct attribute {
unsigned char type;
Data data;
} Attr;
The types currently defined are:
AT_FLOAT : a floating point value
AT_INTEGER : an integer value
AT_SYMBOL : an interned string (see interning below)
If the user wished to create an attribute to represent the
constant integer 5, he might code it as:
Attr five;
five.type = AT_INTEGER;
five.data.ival = 5;
When coding external functions, the programmer must keep in mind
that attributes may be of any type and change their type
dynamically as new values are assigned. No assumptions should
be made about attribute types. Library functions exist for
comparing attribute values. If these are not used, the
programmer is responsible for type checking and coercion. Most
functions that interface with OPS5c will require pointers to
attributes (typedef'd as AttrID) as parameters. It will usually
be necessary for external routines to declare attributes,
initialize them correctly, and use the appropriate functions for
comparison, assignment, and retrieval of attribute values.
- 36 -
OPS5c User's Guide
9.5.2 The Result Element
The result element is a global buffer used to pass
parameters between OPS5c and external routines. The result
element (RE) is an array of attributes whose size is that of the
largest WME supported by the implementation (128 attributes).
Values of attributes may be copied to the RE or copied from the
RE using a library of functions known as the dollar library.
This name arises from the fact that the Lisp implementation
names functions which operate on the RE using names that start
with a dollar sign ($reset, $value, etc.). Many of these names
have been kept, where appropriate, but the functions are named
dollar_reset, dollar_value, etc. since C function names cannot
contain a dollar sign. Appendix A describes the functions in
the dollar library.
9.5.3 Interning and String Spaces
The term interning comes from the Lisp environment and is
the primary difference between strings and symbols. A string,
as defined by the C programming language, is simply a sequence
of characters terminated by a null character. We often refer to
a string or consider its value to be the pointer to the first
character of the string. Such a pointer is used to perform
comparisons between two strings on a character by character
basis to determine equality, inequality, or other relationships.
In OPS5c we are interested in the relationships of equality and
inequality between attributes with non-numeric data types. If
we were to store non-numeric information as strings and perform
character by character matching, the testing process would be
very slow. Instead, we form a data structure known as a symbol
table. Using such a structure we can store strings (known as
symbols) and look up their pointers by a process known as
interning. To intern a string, we call the intern function with
the string as an argument. The intern function will scan the
symbol table looking for an identical string (i.e. symbol). If
such a symbol is found then the address of the symbol is
returned as the value of the function. If no such symbol is
found, the string is copied into the symbol table, thus creating
a new symbol, and its address is returned as the value of the
interning function. This is done for all non-numeric attribute
values in the system. The main reason for this is that once a
string has been interned, i.e. entered into the symbol table, we
may compare two symbols by comparing their addresses, since if
the original strings were identical, then the interning process
will return identical addresses. This greatly speeds up
comparison of non-numeric data and also results in less storage
requirements since we store the address as the value of an
interned symbol.
As a result, if an external routine wishes to manipulate
symbolic data such that the OPS5c inference engine can match
properly, attribute values going into and coming out of the RE
must be represented as interned symbols. The dollar library
- 37 -
OPS5c User's Guide
contains a very useful function, dollar_intern, for converting a
string into an interned symbol.
To fully interface with the system, the programmer must be
aware of a more generic library known as the strings library.
Functions in the strings library aid in manipulating values
interned as symbols. The strings library operates on string
spaces which are equivalent to symbol tables. The distinction
and hence reason for different names, is that the OPS5c system
keeps several different string spaces (symbol tables). The most
often used string space is that for attribute values. It is
called 'attr_values' and should be referenced as such when
calling functions in the string library.
The functions in the string library are listed in Appendix
B. The most commonly used functions are AddString() and
StringAvail(). The AddString() function performs interning in a
string space. The StringAvail() function simply tests for the
existance of a symbol in a string space. Unlike AddString(),
StringAvail() will not add a new symbol if it is not already
there. Note that the string library functions return pointers
of type String Entry. The String Entry data type is defined
below.
typedef struct {
HT_Entry_Header link;
StrID str_id;
} String_HT_Header;
typedef String_HT_Header *StringEntry;
Frequently the user will wish to dereference the StringEntry to
obtain the StrID of the returned value. The StrID is not a
string pointer and should be treated as a handle to the symbol.
The GetString() macro can be used to return the string
associated with a StrID.
Another string space of interest is 'attr_names'. It
contains symbols for all the declared attribute names known by
the system. The programmer should not attempt to add to the
attr_names string space, only to read from it. StringEntry's
retrieved from this name space should be cast to (struct
attr_off *) where:
struct attr_off {
String_HT_Header link;
int offset;
};
Such a pointer can be dereferenced by ptr->offset to retrieve
the offset (slot position in the RE) associated with a
particular attribute name. This is a common occurrance when we
wish to access values by attribute name which is the most common
method.
- 38 -
OPS5c User's Guide
9.5.4 Functions Versus Actions
There are two types of external code that the user can
interface to an OPS5c program, functions and actions. The
distinction between these two forms of interface in the OPS5
language is vague. Functions must return a single value in the
result element. This is accomplished with the dollar_value()
call. One and only one call to dollar_value() must be made by a
function. The OPS5 language does not specify how argument
passing to functions are implemented. For predefined functions
in-line code or calls to routines in the run-time library are
produced. For user-defined functions, arguments are passed via
a global array args[] which is an array of attribute values
(atoms). A global counter, fargc, is used to indicate the
number of valid arguments in args[]. This is similar to the
argc and argv[] method of passing command line arguments to C
programs. Once parameters are loaded into the args[] array, the
external function is called. A C function integer argument,
which is the number of actual parameters, is passed to the
function.
Unlike functions, actions may utilize all the functions
defined in the dollar library. An action may return zero or
more values. These values are usually returned in the result
element using the dollar_value() function. Actions are passed
arguments through the result element. Before calling the action
the result element is cleared (filled with nil atoms) and then
it is filled with the evaluated pattern that comprises the
parameters. No modification of the result element is performed
after the action has finished with it.
Note that the OPS5 language specifies that all external
functions and actions must be declared with the external
statement. In OPS5c this is not a requirement, however, actions
which are not declared using external cannot be called from the
top-level environment.
- 39 -
OPS5c User's Guide
10. References
Forgy81
Forgy, Charles L. , OPS5 User's Manual, Department of
Computer Science, Carnegie-Mellon University. CMU-CS-81-135
Miranker87
Miranker, D.P., TREAT: A Better Match Algorithm for AI
Production Systems: Long Version, Artificial Intelligence
Laboratory, University of Texas at Austin, Technical Report
AI TR-87-58
- 40 -
OPS5c User's Guide
Appendix A - Dollar Library Description
/* Possible return value of dollar_litbind() */
#define NOT_DEFINED -1
/************************************************************
This routine creates a new wme from the result element.
************************************************************/
void
dollar_assert(class_index)
int class_index;
/************************************************************
This function takes an atom (attribute id) and returns its
integer value.
************************************************************/
int
dollar_cvan(atom)
AttrID atom;
/************************************************************
This function takes an integer and returns a numeric atom
(interned).
************************************************************/
Attr
dollar_cvna(num)
int num;
/************************************************************
This function is used to test the equality of two atoms
(attributes).
************************************************************/
BOOLEAN
dollar_eql(atom1,atom2)
AttrID atom1;
AttrID atom2;
/************************************************************
This function places a float value in the result element. The
value of the float is returned.
************************************************************/
float
dollar_flt_val(value)
float value;
- 41 -
OPS5c User's Guide
/************************************************************
This function determines if a file of specified name has been
opened for input.
************************************************************/
FILE *
dollar_ifile(attr)
AttrID attr;
/************************************************************
This function places an integer value in the result element.
The value of the integer is returned.
************************************************************/
int
dollar_int_val(value)
int value;
/************************************************************
This function is used to intern a string into the attr_values
string table.
************************************************************/
StrID
dollar_intern(str)
char *str;
/************************************************************
This function determines if its string argument is an attribute
name that has been assigned an offset at compile time. If so,
the offset is returned. If not NOT_DEFINED is returned.
************************************************************/
int
dollar_litbind(str)
char *str;
/************************************************************
This function returns an attibute (Attr, not an AttrID) when
given an index into the result element
************************************************************/
Attr
dollar_parameter(index)
int index;
/************************************************************
This function returns a count of the number of elements in the
result element.
************************************************************/
int
dollar_parametercount()
- 42 -
OPS5c User's Guide
/************************************************************
This routine clears the result element (fills it with nil) and
resets the result element index.
************************************************************/
void dollar_reset()
/************************************************************
This function places a string value in the result element. A
StrID must be passed. (See strings module.) The StrID is
returned as the result.
************************************************************/
StrID
dollar_str_val(value)
StrID value;
/************************************************************
This predicate determines if an atom (attribute) is symbolic.
************************************************************/
BOOLEAN
dollar_symbol(x)
Attr x;
/************************************************************
This routine sets the index of the result element. Indicies are
numbered starting at 1. The argument passed must be integer.
************************************************************/
dollar_tab1(num)
int num;
/************************************************************
Similar to dollar_tab1() above except that the argument passed
is an AttrID. The attribute may be numeric (integer or float)
or a numeric string.
************************************************************/
void
dollar_tab2(attr_id)
AttrID attr_id;
/************************************************************
This routine takes an attribute structure (not an AttrID) and
places it in the result element at the current position. The
pointer is then incremented to the next position.
************************************************************/
dollar_value(attr)
Attr attr;
- 43 -
OPS5c User's Guide
Appendix B - String Library Description
/************************************************************
* This function "interns" a string and returns its ID code. In
* this implementation version, the ID is a memory address.
************************************************************/
StringEntry
AddString(str,string_table,struct_size)
char *str;
HashTable string_table;
int struct_size;
/************************************************************
* This function checks to see if the indicated string already
* exists in the specified string table. If so, the StringEntry
* of the string is returned, else NULL is returned.
************************************************************/
StringEntry
StringAvail(str,string_table)
char *str;
HashTable string_table;
/************************************************************
* This is a macro which can be used with a StrID to return the
* pointer to the C string representation of the interned
* string.
************************************************************/
char *
GetString(str_id)
StrID str_id;
- 44 -
OPS5c User's Guide
Index
// 14, 17-18, 29
accept stream 22
ACT 2-3, 5-13, 16-24, 26-30, 34-37, 39
actions 2-3, 5-7, 12-13, 16-17, 19-20, 24, 26, 29, 34-35, 39
AddString() 38
alpha memory 7-8, 26, 32
amem 7-8
atom 2, 10, 17-19, 36, 39, 41, 43
Attr 2, 5, 9-18, 20, 29-30, 33, 35-39, 41-43
attribute 2, 5, 9-11, 13-18, 20, 29-30, 33, 35-39, 41-43
attribute variable 13
attribute-value pair 13-14, 20
AttrID 36, 41-43
attr_names 38
attr_values 38, 42
call 2, 4-7, 16-19, 21-22, 26, 29-39
case 5-7, 10, 16, 24, 34
CE 2-43
class 2, 4-5, 8-15, 20, 22, 27, 29, 31-33, 35, 41
closefile 2, 19, 22-23, 29
coercion 10, 33, 36
comp.sh 34
condition element 2, 5, 13, 15-16, 24
conflict resolution 2, 6-7, 9, 24, 27, 31, 33, 35
conflict set 6-7, 15-16, 22, 24, 30-31, 33
conjuctions 2, 14
constant tests 7-8, 25
context clause 25, 27
contexts 27
crlf 21
default 2, 19, 22-23, 29-30, 34
disjunctions 2, 15
dominant instance 31
element designator 2, 16-18, 20, 29
element variable 2, 16-17, 20-21
functions 2-4, 10, 17-18, 21, 23, 29, 33-39
heuristic ordering 34
incremental matching 6, 8
instance 5-8, 15-17, 24-25, 27, 29-31, 33, 35
interning 3, 36-38
JOIN 7, 34
join code 34
LEX 24-25, 31, 34
LHS 2, 5-7, 10, 12-13, 16-18, 20, 24, 29
literal 5, 9-10, 13-15
literalize 5, 9-10, 14
make 2, 5, 7, 16, 19-21, 24, 26, 29, 33
MATCH 4, 6-8, 13-16, 20, 26, 30, 32, 37, 40
MEA 4, 6, 8, 13, 15, 17, 21, 24-25, 29, 31
modular compilation 33
modules 34
nil 10, 17, 39, 43
- 45 -
OPS5c User's Guide
o5c.h 32, 36
offsets 9-10, 17-18, 20
openfile 2, 19, 22-23, 29
ops5c.lib 32
predicates 2, 10, 14, 43
quoting 2, 13, 15
RE 2-44
recency 24
remove 2, 6, 15-16, 19-21, 24, 28-31
result element 3, 17-18, 37, 39, 41-43
RETE 4-5, 7, 9, 13, 18, 20, 29, 32
RHS 2, 5-7, 12, 16-20, 22, 24, 29-30, 35
rjust 21
rule 2, 4-10, 12-16, 19-20, 22, 24-31, 33-35
seed ordering 34
SELECT 5-7, 15, 24-28, 31, 33-34
specificity 24
StrID 36, 38, 42-44
string space 3, 37-38
StringAvail() 38
StringEntry 38, 44
strings 10, 37-38, 43
subtasks 26
symbol table 9, 33, 36-38
symbols 10, 13-14, 34, 37-38
tabto 21
time tags 9, 24-25, 27, 29-31
top level 2, 13, 19, 22, 29-32
trace stream 22
TREAT 4-5, 7, 11, 13, 19, 21, 38, 40
variable binding tests 7, 25
vector attributes 2, 10-11
WM 2, 5-11, 13-18, 20-21, 24, 26-27, 29-31, 33, 35, 37, 41
WME 6-11, 13-18, 20-21, 24, 26-27, 29-31, 33, 35, 37, 41
Working memory 2, 5-7, 9, 15, 17, 21-22, 24, 27, 29-33
working memory element 2, 6, 9, 30
write stream 22
\\ 18
- 46 -