The KB contains constants that denote collections of other concepts, such as #$AnimalWalkingProcess (the set of all actions in which some animal walks) or #$Typewriter (the set of all typewriters). It also contains constants that denote individual things, some of which are more-or-less permanently in the KB, like #$InternalRevenueService, and some of which might get created only when reasoning about some state of affairs, like #$Walking00036 (a particular case of walking). Some of the individuals represented in the KB are predicates, such as #$isa or #$likesAsFriend, that allow one to express relationships among constants. Others are functions, such as #$GovernmentFn, which can take constants or other things as arguments, and generate new concepts (e.g., (#$GovernmentFn #$Canada)).
Each constant has its own data structure in the KB, consisting of the constant and the assertions which describe it.
CYC® constants are referred to with the prefix "#$" (read "hash-dollar"). These characters are sometimes omitted in documents describing CycL, and they may be omitted by certain interface tools. But in these CYC® Documentation pages, the policy will be to use the "#$" prefix when referring to CYC® constants.
Constant names can include any uppercase or lowercase letter, any digit, and the symbols "-" (dash), "_" (underscore), and "?" (question mark). No other characters, such as "!", "&", or "@" are allowed. This policy is enforced in the CYC® Functional Interface and in the CYC® Web Interface.
CYC® constant names are case-sensitive: #$foo is not the same as #$Foo. However, distinguishing two constant names solely on the basis of capitalization is prohibited by the system.
All CYC® predicate names must begin with a lowercase character. (This does not include all the things that are presently instances of #$Predicate in CYC®. Some of these latter things are more like functions, and their names begin with uppercase letters.)
All non-predicate constant names must begin with an uppercase character. Non-predicate constant names may also begin with a numeric character (e.g., #$3MCorporation). We may also allow predicates to begin with numeric characters, if someone makes a compelling argument for why this should be allowed.
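The character and initial-case rules above can be sketched as a small checker. This is a minimal illustration in Python, not the check actually performed by the CYC® Functional Interface or Web Interface. The function name check_constant_name is hypothetical, and names are passed without the "#$" prefix. The sketch does not attempt the word-capitalization conventions or the ban on names that differ only in capitalization.

```python
import re

# Characters permitted in a constant name, per the rules above:
# letters, digits, "-", "_" and "?".
_ALLOWED = re.compile(r"[A-Za-z0-9_?-]+")


def check_constant_name(name, is_predicate):
    """Return a list of rule violations for a name (empty list if none)."""
    problems = []
    if not _ALLOWED.fullmatch(name):
        problems.append("only letters, digits, '-', '_' and '?' are allowed")
    first = name[:1]
    if is_predicate:
        # Predicate names must begin with a lowercase character.
        if not (first.isascii() and first.islower()):
            problems.append("predicate names must begin with a lowercase letter")
    else:
        # Other constants begin with an uppercase letter or a digit.
        if not (first.isascii() and (first.isupper() or first.isdigit())):
            problems.append("non-predicate names must begin with an uppercase letter or digit")
    return problems


print(check_constant_name("likesAsFriend", True))   # []
print(check_constant_name("3MCorporation", False))  # []
print(check_constant_name("Foo!", False))           # one violation reported
```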
All CYC® constant names should be composed of one or more meaningful "words" in sequence, with no breaks except for dashes or underlines (e.g. #$isa and #$SportsCar). A sequence of numeric characters may count as a "word" (e.g., #$FrontOfficeOf123Corp). With the exception noted above for predicate names, each (non-numeric) "word" in a sequence must begin with a capital letter. An acronym may count as a "word", but all its characters will be the same case (e.g., lower case if the acronym begins the name of a predicate constant; otherwise uppercase).
Hyphens are used to set off parts of names which restrict or refine the meaning of the name, as in #$Fruit-TheWord or #$Horse-Domesticated.
When naming a constant, it's important to assign a name that distinguishes the denoted concept from other concepts it might get confused with. So "Bow" would be a terrible name for a constant. Instead, names like "Bow-BoatPart", "BowTheWeapon", "Bowing-BodyMovement" should be used, depending on the underlying concept denoted.
Sometimes it is possible to take this principle of specificity in names to an extreme, and attempt to embody the whole meaning of the constant in its name. This is discouraged. For example, one might be tempted to give the constant #$physicalParts the name "distinctIdentifiablePhysicalParts", but it is better to leave the name a bit terser since it isn't easily confused with some other concept, and put the additional information in the constant documentation.
It's also very important never to assume that you, the observer of the CYC® KB, can know with certainty what a constant denotes to the system, just from seeing its name and nothing else.
The meaning of a constant in CycL is determined by the assertions in the KB that use that constant. For example, from the following assertions, it is easy to tell what the hypothetical constant #$EMRG means:
(#$isa #$EMRG #$Color)
(#$colorOfObject #$Grass37 #$EMRG)
(#$forAll ?O (#$implies (#$isa ?O #$Okra) (#$colorOfObject ?O #$EMRG)))
For convenience, we choose names for CYC® constants that will indicate to human users what the constant is intended to mean. (For example, #$Red or #$RedColor.) But remember, CYC® doesn't understand those strings. Don't be misled by evocative names like #$LittleRedHairedGirlLikedByCharlieBrown. Unless that constant is appropriately related to other CYC® constants such as #$FemaleChild, #$hairColor, #$RedHairColor, #$CharlieBrown, and #$likesAsFriend, it is meaningless to CYC®.
(#$implies (#$and (#$isa ?TRANSFER #$TransferringPossession) (#$fromPossessor ?TRANSFER ?FROM)) (#$isa ?FROM #$SocialBeing))
"The initial possessor in a possession transfer is a social being."
Every formula has the structure of a Lisp list. It is enclosed in parentheses, and consists of a list of objects which are commonly designated ARG0, ARG1, ARG2, etc. The object in the ARG0 position may be a predicate, a logical connective, or a quantifier. The remaining arguments may be atomic constants, non-atomic terms, variables, numbers, English strings delimited by double quotes ("), or other formulas.
(#$likesAsFriend #$DougLenat #$KeithGoolsbey)
(#$skillCapableOf #$LinusVanPelt #$PlayingAMusicalInstrument #$performedBy)
(#$colorOfObject ?CAR ?COLOR)
The first two of the atomic formulas above are ground atomic formulas (GAFs), since none of the terms filling the argument positions ARG1, ARG2, etc. are variables.
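The GAF test described above is easy to state in code. The following Python sketch represents a formula as a nested tuple of strings with the "#$" prefixes dropped. That representation, and the helper names contains_variable and is_gaf, are assumptions made for illustration only. The input is assumed to be an atomic formula already.

```python
def contains_variable(term):
    """True if a term (string, number, or nested tuple) contains a ?VARIABLE."""
    if isinstance(term, str):
        return term.startswith("?")
    if isinstance(term, tuple):
        return any(contains_variable(t) for t in term)
    return False


def is_gaf(atomic_formula):
    """An atomic formula is ground if no argument (ARG1, ARG2, ...) has a variable."""
    _predicate, *args = atomic_formula
    return not any(contains_variable(a) for a in args)


print(is_gaf(("likesAsFriend", "DougLenat", "KeithGoolsbey")))  # True
print(is_gaf(("colorOfObject", "?CAR", "?COLOR")))              # False
```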
The number of arguments a predicate takes is determined by its arity. A predicate is described as unary, binary, ternary, quaternary, or quintary, according to whether it takes 1, 2, 3, 4, or 5 arguments. Currently, no CycL predicate takes more than 5 arguments; however, if some representation required a predicate to take more arguments, CycL would be changed to allow this.
To be well-formed, an atomic formula must have the right number of arguments for the predicate filling the ARG0 position. So,
(#$likesAsFriend #$DougLenat #$KeithGoolsbey #$Fido)
is not well-formed, since the arity of #$likesAsFriend is 2, but this formula gives 3 arguments to #$likesAsFriend.
(#$isa #$residesInDwelling #$BinaryPredicate)
(#$arg1Isa #$residesInDwelling #$Animal)
(#$arg2Isa #$residesInDwelling #$ShelterConstruction)
To be well-formed, every formula which has #$residesInDwelling in the ARG0 position must have a term which is an instance of #$Animal in the ARG1 position, and a term which is an instance of #$ShelterConstruction in the ARG2 position. So,
(#$residesInDwelling #$PottedPlant37 #$KarensHouse)
is probably not well-formed. Though we can never be absolutely certain just from the names, #$KarensHouse could be an instance of #$ShelterConstruction, but #$PottedPlant37 is probably not an instance of #$Animal.
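The arity and argument-type checks just described can be sketched over a toy knowledge base. Everything in the tables below is an assumption made for illustration: the #$isa facts are invented (including the hypothetical constants Karen and Plant), #$likesAsFriend is given no argument constraints, and the lookup ignores #$genls inheritance, which a real check would have to follow.

```python
# Toy stand-ins for KB content (assumptions, not actual KB assertions).
ARITY = {"likesAsFriend": 2, "residesInDwelling": 2}
ARG_ISA = {
    ("residesInDwelling", 1): "Animal",
    ("residesInDwelling", 2): "ShelterConstruction",
}
ISA = {
    ("Karen", "Animal"),                    # hypothetical constant
    ("KarensHouse", "ShelterConstruction"),
    ("PottedPlant37", "Plant"),             # hypothetical collection
}


def well_formed(formula):
    """Check the argument count and the argNIsa constraints of an atomic formula."""
    predicate, *args = formula
    if len(args) != ARITY[predicate]:
        return False
    for position, arg in enumerate(args, start=1):
        required = ARG_ISA.get((predicate, position))
        if required is not None and (arg, required) not in ISA:
            return False
    return True


print(well_formed(("likesAsFriend", "DougLenat", "KeithGoolsbey", "Fido")))  # False
print(well_formed(("residesInDwelling", "PottedPlant37", "KarensHouse")))    # False
print(well_formed(("residesInDwelling", "Karen", "KarensHouse")))            # True
```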
(#$not (#$colorOfObject #$FredsBike #$RedColor))
will be true if and only if (#$colorOfObject #$FredsBike #$RedColor) is false. Likewise,
(#$not (#$not (#$colorOfObject #$FredsBike #$RedColor)))
will have the same truth value as (#$colorOfObject #$FredsBike #$RedColor).
(#$and (#$colorOfObject #$FredsBike #$RedColor) (#$objectFoundInLocation #$FredsBike #$FredsGarage))
This formula states that Fred's bike is red and that it is located in Fred's garage. If both of those things are true then the whole formula is true, but if one or both are false, then the whole formula is false.
(#$or (#$colorOfObject #$FredsBike #$RedColor) (#$objectFoundInLocation #$FredsBike #$FredsGarage) (#$owns #$Fred #$FredsBike))
This assertion states that either Fred's bike is red, or it is located in Fred's garage, or Fred owns it, or all three. (The word "or" in English is sometimes taken to imply that one alternative or the other is true, but not both. That is not the case with #$or.) If any or all of these three statements is true, then the whole formula is true. All would have to be false for the formula as a whole to be false.
(#$implies (#$owns #$Bike001 #$Fred) (#$colorOfObject #$Bike001 #$RedColor))
This assertion states that if #$Bike001 is owned by #$Fred, then it is red. Newcomers to formal logic may misinterpret #$implies as implying a causal relationship. But, strictly speaking, a #$implies assertion says only that either the first argument is false, or the second argument is true. So, for example, the assertion
(#$implies (#$isa #$RichardNixon #$Fruit) (#$colorOfObject #$BillJ #$PastelMintGreen))
is true, because the first argument is false.
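The truth-functional reading of the four connectives can be written out directly. The Python sketch below treats each argument as an already-known truth value. The function names cyc_not, cyc_and, cyc_or, and cyc_implies are hypothetical and are not part of any CYC® interface.

```python
def cyc_not(a):
    return not a


def cyc_and(first, *rest):
    # True only if every argument is true.
    return all((first, *rest))


def cyc_or(first, *rest):
    # Inclusive "or": true if at least one argument is true.
    return any((first, *rest))


def cyc_implies(antecedent, consequent):
    # Material conditional: false antecedent or true consequent.
    return (not antecedent) or consequent


# Double negation preserves the truth value.
print(cyc_not(cyc_not(True)))                        # True
# A false first argument makes the conditional true whatever the second is.
print(cyc_implies(False, True), cyc_implies(False, False))  # True True
# The only false case: true antecedent, false consequent.
print(cyc_implies(True, False))                      # False
```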
Assertions involving #$implies are very common in the CYC® KB. We also call them conditionals or rules, and we often refer to the first argument as the antecedent and the second argument as the consequent. Note, however, that the particular formula above is not representative of assertions likely to be found in the CYC® KB. We will come to some more representative examples in a moment.
Suppose A and B are syntactically legal, and C is not. Then,
(#$not A)
(#$and A)
(#$and A B)
(#$or A)
(#$or A B)
(#$implies A B)
would all be CycFormulas. But
(#$not A B)
(#$and)
(#$and A C)
(#$implies A)
would NOT be CycFormulas. Why? (#$not A B) violates the requirement that #$not take only one formula as an argument. (#$and) and (#$implies A) also violate restrictions on the number of formulas these connectives take as arguments. (#$and A C) is not well-formed because C is not; any complex formula that contained C would be syntactically bad for the same reason.
It should also be noted that #$and and #$or are elements of #$VariableArityRelation: hence, if A, B, C, and D are well-formed CycL expressions,
(#$and A B C D)
would be well-formed and also
(#$or A B C D)
would be well-formed.
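The argument-count restrictions on the connectives can be summarized as a recursive check. The Python sketch below is an illustration under stated assumptions. Formulas are nested tuples, a bare string stands for a subformula whose legality is looked up in the set legal_atoms, and the function name is_cyc_formula is hypothetical. The counts it uses (#$not takes exactly one formula, #$implies exactly two, #$and and #$or at least one) are read off the examples above.

```python
def is_cyc_formula(formula, legal_atoms):
    """Check connective argument counts, recursively, for a nested-tuple formula."""
    if isinstance(formula, str):
        return formula in legal_atoms
    if not isinstance(formula, tuple) or not formula:
        return False
    operator, *args = formula
    # A formula containing an ill-formed subformula is itself ill-formed.
    if not all(is_cyc_formula(arg, legal_atoms) for arg in args):
        return False
    if operator == "not":
        return len(args) == 1
    if operator in ("and", "or"):
        return len(args) >= 1  # variable arity
    if operator == "implies":
        return len(args) == 2
    return False


legal = {"A", "B"}  # A and B are legal, C is not
print(is_cyc_formula(("and", "A", "B"), legal))  # True
print(is_cyc_formula(("not", "A", "B"), legal))  # False
print(is_cyc_formula(("and", "A", "C"), legal))  # False
```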
(#$forAll ?X (#$implies (#$owns #$Fred ?X) (#$objectFoundInLocation ?X #$FredsHouse)))
This formula states that it is true, concerning every object in the CYC® ontology, that if #$Fred owns that object, then that object is located in #$FredsHouse. In other words, all Fred's stuff is in his house.
(#$forAll ?X (#$forAll ?Y (#$implies (#$and (#$owns #$Fred ?X) (#$owns #$Fred ?Y)) (#$near ?X ?Y))))
which says that any two things owned by Fred are near each other. Note that each quantifier introduces a new variable, and that each variable must have a different name.
However, if an unbound variable appears in a CycL formula, it is always assumed to be universally quantified, with the result that
(#$implies (#$owns #$Fred ?X) (#$objectFoundInLocation ?X #$FredsHouse))
is exactly equivalent to
(#$forAll ?X (#$implies (#$owns #$Fred ?X) (#$objectFoundInLocation ?X #$FredsHouse)))
Since the former is easier to write and read, it is almost always preferred in practice, and you will rarely see a #$forAll while browsing the CYC® KB. Note, however, that unbound variables which appear only in the consequent of a conditional, and not in the antecedent, may have drastic and undesired consequences. Take, for example, the following:
(#$implies (#$owns #$Fred ?WHATEBER) (#$objectFoundInLocation ?WHATEVER #$FredsHouse))
Because of the typo in the antecedent (?WHATEBER for ?WHATEVER), the variable ?WHATEVER appears only in the consequent and will range over the entire CYC® ontology. In other words, the assertion above states that as long as Fred owns one thing, everything is located in #$FredsHouse--probably not what we wanted.
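A simple lint can catch this kind of mistake before a rule is asserted. The Python sketch below collects the variables of a rule written as a nested tuple and reports those that occur only in the consequent. The tuple representation and the function names are assumptions made for illustration; this is not a description of any check CYC® itself performs.

```python
def variables(term):
    """Collect every ?VARIABLE occurring in a nested-tuple term."""
    if isinstance(term, str):
        return {term} if term.startswith("?") else set()
    if isinstance(term, tuple):
        found = set()
        for sub in term:
            found |= variables(sub)
        return found
    return set()


def consequent_only_variables(rule):
    """Variables of an (implies ANTECEDENT CONSEQUENT) rule missing from the antecedent."""
    operator, antecedent, consequent = rule
    if operator != "implies":
        raise ValueError("expected an implies rule")
    return variables(consequent) - variables(antecedent)


typo_rule = ("implies",
             ("owns", "Fred", "?WHATEBER"),
             ("objectFoundInLocation", "?WHATEVER", "FredsHouse"))
print(consequent_only_variables(typo_rule))  # {'?WHATEVER'}
```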
(#$implies (#$isa ?A #$Animal) (#$thereExists ?M (#$mother ?A ?M)))
This assertion states that, for every animal, there exists at least one object which is that animal's mother. The object which is the animal's mother may be an object which is already represented by a CYC® constant, or it may be a new object of which CYC® has no knowledge. But unless and until it is told otherwise, CYC® will assume that the object is a new one not identical with any "known" object.
(#$implies (#$isa ?P #$Person) (#$thereExistExactly 2 ?LEG (#$and (#$isa ?LEG #$Leg) (#$anatomicalParts ?P ?LEG))))
(#$implies (#$isa ?T #$Table) (#$thereExistAtLeast 3 ?LEG (#$and (#$isa ?LEG #$Leg) (#$anatomicalParts ?T ?LEG))))
(#$implies (#$isa ?P #$Person) (#$thereExistAtMost 1 ?SPOUSE (#$spouse ?P ?SPOUSE)))
(#$implies (#$isa ?A #$Animal) (#$thereExists ?M (#$and (#$mother ?A ?M) (#$isa ?M #$FemaleAnimal))))
will appear in the KB as 4 different assertions:
(#$isa #$SKF-8675309 #$SkolemFunction)
(#$arity #$SKF-8675309 1)
(#$implies (#$isa ?A #$Animal) (#$mother ?A (#$SKF-8675309 ?A)))
(#$implies (#$isa ?A #$Animal) (#$isa (#$SKF-8675309 ?A) #$FemaleAnimal))
For more details, look at "An Introduction to CYC® Inferencing".
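The rewriting shown above can be reproduced with a short transformation. The Python sketch below is a simplified illustration only, not the procedure CYC® actually uses. It handles just the shape (implies ANTECEDENT (thereExists VAR BODY)), it assumes the Skolem function takes the antecedent's variables as its arguments, and the caller supplies the Skolem function's name. The helper names are hypothetical.

```python
def variables(term):
    """Collect every ?VARIABLE occurring in a nested-tuple term."""
    if isinstance(term, str):
        return {term} if term.startswith("?") else set()
    if isinstance(term, tuple):
        found = set()
        for sub in term:
            found |= variables(sub)
        return found
    return set()


def substitute(term, var, replacement):
    """Replace every occurrence of var in term with replacement."""
    if term == var:
        return replacement
    if isinstance(term, tuple):
        return tuple(substitute(sub, var, replacement) for sub in term)
    return term


def skolemize(rule, skolem_name):
    """Turn (implies ANT (thereExists VAR BODY)) into Skolem-function assertions."""
    _implies, antecedent, (_quantifier, var, body) = rule
    args = sorted(variables(antecedent))      # simplifying assumption
    skolem_term = (skolem_name, *args)
    # One rule per conjunct of the existential's body.
    conjuncts = body[1:] if body[0] == "and" else (body,)
    result = [("isa", skolem_name, "SkolemFunction"),
              ("arity", skolem_name, len(args))]
    for conjunct in conjuncts:
        result.append(("implies", antecedent,
                       substitute(conjunct, var, skolem_term)))
    return result


rule = ("implies", ("isa", "?A", "Animal"),
        ("thereExists", "?M",
         ("and", ("mother", "?A", "?M"), ("isa", "?M", "FemaleAnimal"))))
for assertion in skolemize(rule, "SKF-8675309"):
    print(assertion)
```

Run on the rule above, the sketch yields four tuples corresponding to the four assertions listed in the text.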
Consider, for example, the function #$FruitFn, which takes as an argument a type of plant and returns the collection of the fruits of that type of plant. This function can be used to build the following NATs:
(#$FruitFn #$AppleTree)
(#$FruitFn #$PearTree)
(#$FruitFn #$WatermelonPlant)
. . .
Note that there may or may not be a named CYC® constant corresponding to the collection of apples (that is, a constant called #$Apple). The NAT (#$FruitFn #$AppleTree) provides a way of talking about this collection even if the corresponding constant does not exist.
NATs can be used anywhere a constant can be used. One could write, for example:
(#$implies (#$isa ?APPLE (#$FruitFn #$AppleTree)) (#$colorOfObject ?APPLE #$RedColor))
A few functions do not have a fixed arity, but can take a variable number of arguments. Mathematical functions like #$PlusFn are one example. And in Cyc-10, IBQEs are now treated as NATs in which the units of measure are functions which can take either one or two arguments, according to whether they are intended to denote a single value or a range.
Functions with no fixed arity are defined using the predicate #$argsIsa, which specifies a single type of which every argument must be an instance.
Functions differ from predicates in that they return a CYC® term as a result. Accordingly, function definitions must also describe the type of the result to be returned, using the predicate #$resultIsa. Consider, for example, the function #$GovernmentFn:
(#$arity #$GovernmentFn 1)
(#$arg1Isa #$GovernmentFn #$GeopoliticalEntity)
(#$resultIsa #$GovernmentFn #$RegionalGovernment)
The argument to #$GovernmentFn must always be an instance of #$GeopoliticalEntity, and a NAT created using #$GovernmentFn will always be an instance of #$RegionalGovernment. So, for instance,
(#$isa (#$GovernmentFn #$UnitedStatesOfAmerica) #$RegionalGovernment)
The distinction between individuals and collections is an important one in CycL. For more on this topic, look at the constants #$Individual and #$Collection.
The definition of an instance of #$CollectionDenotingFunction should specify, not only its argument types and result type, but also the collection that the result will have as genls. This is done using the predicate #$resultGenl. For example, if the function #$LeftPairMemberFn is defined by:
(#$isa #$LeftPairMemberFn #$CollectionDenotingFunction)
(#$arity #$LeftPairMemberFn 1)
(#$arg1Isa #$LeftPairMemberFn #$SymmetricalPartType)
(#$resultIsa #$LeftPairMemberFn #$ExistingObjectType)
(#$resultGenl #$LeftPairMemberFn #$LeftObject)
then the following must be true concerning a NAT constructed from #$LeftPairMemberFn and #$Shoe:
(#$isa (#$LeftPairMemberFn #$Shoe) #$ExistingObjectType)
(#$genls (#$LeftPairMemberFn #$Shoe) #$LeftObject)
In other words, the set of left shoes is an instance of #$ExistingObjectType and a subset of #$LeftObject.
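How #$resultIsa and #$resultGenl determine what must hold of a NAT can be sketched as a lookup. In the Python illustration below, the two tables copy the function definitions given in the text. The tuple representation and the function name implied_assertions are assumptions, and argument-type checking is left out.

```python
# Function definitions taken from the text above.
RESULT_ISA = {
    "GovernmentFn": "RegionalGovernment",
    "LeftPairMemberFn": "ExistingObjectType",
}
RESULT_GENL = {
    "LeftPairMemberFn": "LeftObject",
}


def implied_assertions(nat):
    """Return the isa and genls assertions a NAT's function definition entails."""
    function = nat[0]
    assertions = []
    if function in RESULT_ISA:
        assertions.append(("isa", nat, RESULT_ISA[function]))
    if function in RESULT_GENL:
        assertions.append(("genls", nat, RESULT_GENL[function]))
    return assertions


for assertion in implied_assertions(("LeftPairMemberFn", "Shoe")):
    print(assertion)
# ('isa', ('LeftPairMemberFn', 'Shoe'), 'ExistingObjectType')
# ('genls', ('LeftPairMemberFn', 'Shoe'), 'LeftObject')
```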
When a new reified NAT-constant is first created, CYC® automatically sets up the correspondence
(#$termOfUnit NAT-CONSTANT NAT-EXPRESSION)
where NAT-CONSTANT is the automatically created constant and NAT-EXPRESSION is the non-atomic term that can be used to refer to it. In the CYC® web interface, such an assertion might look like this:
(#$termOfUnit (#$GovernmentFn #$Canada) (#$GovernmentFn #$Canada))
It looks like the first and second arguments are the same, but that's because, in lieu of a proper constant name for the reified NAT-constant, the system uses the NAT expression as a print name. If you look very carefully at any #$termOfUnit assertion in the web interface, you will see that the opening parenthesis of ARG1 is a followable link (depending on the web-browser you use, it may be underlined, or a different color), but the opening parenthesis of ARG2 is just opaque text. Clicking on the opening paren of a NAT expression will display the page for the reified NAT-constant the expression denotes.
Moreover, a reified NAT can be explicitly identified with an existing constant using the predicate #$termOfUnit:
(#$termOfUnit #$TheYear1996 (#$YearFn 1996))
(#$termOfUnit #$Apple (#$FruitFn #$AppleTree))
When a NAT is identified with a constant using #$termOfUnit, the two are asserted to be de dicto equivalent.
Skolem functions are reifiable.
Non-reifiable functions include mathematical functions like #$PlusFn. Just because we use a NAT like (#$PlusFn 59 64) doesn't mean we want to add to the KB a unit denoting the number 123. If we want to talk about the number 123, we'll just refer to it directly.
Also, #$UnitOfMeasure is not a subset of #$ReifiableFunction, so IBQEs such as (#$Inch 37) and (#$Meter 500) are not reified when they are referred to.
For example:
(#$implies (#$and (#$equals ?U (#$PreviouslyOwnedFn ?ARG1)) (#$isa ?X ?U)) (#$hasAttributes ?X #$Used))
For quantifying into NATs you should always use #$equals, never #$termOfUnit. The predicate #$termOfUnit is used in the automatic mapping between the system generated data structure and the original non-atomic-term and should figure in the hand-entered assertions of human Cyclists very rarely, if at all.
Note that you cannot quantify into a NAT unless it is built on an instance of #$ReifiableFunction.
The CYC® KB consists of a large number of assertions. When a formula is successfully asserted into the KB, it is stored as one of these. Each assertion is composed of a number of elements:
Microtheories are covered in more detail elsewhere in this documentation, as well as in the constant vocabulary, under #$Microtheory. Where does the microtheory information on assertions come from? That depends on the origin of the assertion. If an assertion is added to the KB by the inference engine as the result of firing a rule, the inference engine code decides what microtheory the conclusion should be added in and records it at add time. If an assertion is the result of a person or external program asserting a formula into the KB, at that time the asserter must specify which microtheory the formula is to go in. Some interfaces for knowledge entry may not require the user to specify a microtheory for new assertions, and will then either try to choose the right one or will use #$BaseKB as a default. If you use such an interface, make sure you know what the default behavior is.
Assertions that are monotonically true are held to be true in every case, that is, for every possible set of bindings to the universally quantified variables (if any) in the assertion, and cannot be overridden. In the case of a monotonically true assertion with universally quantified variables in its formula, if an object is found for which the assertion is not true, an error is signalled. In the case of a ground assertion that is monotonically true, if the negation of that formula is ever asserted or arrived at during inference (in the same microtheory), an error is signalled.
Assertions that are default true, in contrast, are held to be true in most cases, and can be overridden. If the negation of an existing ground, default assertion is asserted in the same microtheory, or is arrived at through inference, no error is signalled. Instead, the argumentation mechanism is invoked to decide what the final truth value of the assertion will be.
By default, GAFs which begin with the predicates #$isa and #$genls are monotonically true, while all other assertions (including rules) are default true.
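The default just stated amounts to a two-way classification. The Python sketch below assumes formulas are nested tuples with the "#$" prefixes dropped and that a formula headed by isa or genls is atomic. The function names and the strings "monotonic" and "default" are illustrative labels, not CYC® identifiers.

```python
def contains_variable(term):
    """True if a nested-tuple term contains a ?VARIABLE."""
    if isinstance(term, str):
        return term.startswith("?")
    if isinstance(term, tuple):
        return any(contains_variable(sub) for sub in term)
    return False


def default_truth_value(formula):
    """Ground isa/genls formulas default to monotonic; everything else to default."""
    head, *args = formula
    is_ground = not any(contains_variable(arg) for arg in args)
    if head in ("isa", "genls") and is_ground:
        return "monotonic"
    return "default"


print(default_truth_value(("isa", "EMRG", "Color")))                    # monotonic
print(default_truth_value(("likesAsFriend", "DougLenat", "KeithGoolsbey")))  # default
```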
Assertions with direction code are not used in normal inference at all; instead, special HL modules have been written to supplant the need for inference using the assertion itself. Code assertions cannot be edited via the HTML interface.
One way of viewing directions is as a hierarchy of "when it gets used" :
Older CYC® documentation refers to "access levels" rather than to direction. Access level 0 is equivalent to direction forward, while access level 4 or higher is equivalent to direction backward.
The support is the one element of an assertion which need not be specified by the KEer when performing knowledge entry. It is created and updated automatically. However, supports can be displayed by KB browsing tools.