home *** CD-ROM | disk | FTP | other *** search
Text File | 1992-04-26 | 55.1 KB | 1,031 lines |
- Please let me know if you receive this.
-
- The enclosed manuscript is a draft, which contains more than any
- published paper. Papers that have been published or are in press are:
-
- Ray, T. S. 1991. ``Is it alive, or is it GA?''
- Proceedings of the 1991 International Conference on Genetic Algorithms,
- Eds. Belew, R. K., and L. B. Booker, San Mateo, CA: Morgan Kaufmann, 527--534.
-
- Ray, T. S. 1991. ``An approach to the synthesis of life.''
- Artificial Life II, Santa Fe Institute Studies in the Sciences of
- Complexity, vol. XI, Eds. C. Langton, C. Taylor, J. D. Farmer, & S. Rasmussen,
- Redwood City, CA: Addison-Wesley, 371--408.
-
- Ray, T. S. 1991. ``Population dynamics of digital organisms.''
- Artificial Life II Video Proceedings, Ed. C. G. Langton,
- Redwood City, CA: Addison Wesley.
-
- Ray, T. S. 1991. ``Evolution and optimization of digital organisms.''
- Scientific Excellence in Supercomputing: The IBM 1990 Contest Prize Papers,
- Eds. Keith R. Billingsley, Ed Derohanes, Hilton Brown, III.
- Athens, GA, 30602, The Baldwin Press, The University of Georgia.
- Publication date: December 1991.
-
- I don't have a paper in it, but you might be interested in the
- following book:
-
- Langton, Christopher G. [ed.]. 1989. Artificial life: proceedings of an
- interdisciplinary workshop on the synthesis and simulation of living systems.
- Vol. VI in the series: Santa Fe Institute studies in the sciences of
- complexity. Addison-Wesley.
-
- I will be at the Santa Fe Institute Feb. 1 thru Aug. 31, 1992.
- This work will also be presented in the following upcoming seminars:
-
- National Science Foundation, Washington, September 20, 1991
- University of Rochester, Biology, October 18, 1991
- University of Illinois, Complex Systems Colloquia, October 25, 1991
- University of Maryland, Zoology, October 29, 1991
- University of Kentucky, Lexington, Biology, October 31, 1991
- University of Delaware, Entomology, November 5, 1991
- Stony Brook, Department of Ecology and Evolution, November 6, 1991
- Drexel University, Electrical Engineering, November 8, 1991
- The University of the Arts, Philadelphia, Design in Cyberspace lectures,
- November 12, 1991
- IBM, T. J. Watson Research Center, Yorktown Heights, NY, November 13, 1991
- Thinking Machines Corp., Cambridge, November 14, 1991
- Digital Equipment Corp., Hudson, MA, November 15, 1991
- American Society of Information Science, New Jersey, November 19, 1991
- Texas Instruments, Dallas, November 21, 1991
- Harvard University, Biology (Lewontin's lab), December 2, 1991
- Boston University, Computational Sciences Center, December 3, 1991
- MIT Nanotechnology Study Group, December 3, 1991
- University of Massachusetts Boston, Biology, December 5, 1991
- Yale University, Biology, December 6, 1991
- University of Arizona, Ecology & Evolutionary Biology, March 10, 1992
- Cornell University, Mathematical Sciences Institute, CA Workshop, May 1992
- Gordon Conference on Theoretical Biology, New Hampshire, June 8-12, 1992
-
- A DOS version of the Tierra software with a decent front end will be ready
- for distribution by December. The source code is available by ftp at:
- tierra.slhs.udel.edu or life.slhs.udel.edu in a directory named tierra
-
- The source code in the ftp site will compile and run on DOS (tested with
- Turbo C) but does not have a nice front end).
-
- There is an Artificial Life mailing list. It is distributed as a digest
- to avoid chaos. You may subscribe to the list by sending a message to:
-
- alife-request@cognet.ucla.edu
-
- and you may post to the list by sending a message to:
-
- alife@cognet.ucla.edu
-
- Tom Ray
- University of Delaware
- School of Life & Health Sciences
- Newark, Delaware 19716
- ray@tierra.slhs.udel.edu
- ray@life.slhs.udel.edu
- ray@brahms.udel.edu
- 302-451-2281 (FAX)
- 302-451-2753
-
-
- The figures for this paper are in available on request in Generic CADD
- batch file format. This paper is being sent in two parts. You will have
- to cut the headers of the two parts at the scissors marks, then cat them
- together.
-
- O /
- ================================= x -------------------------------------
- o \
- \documentstyle[12pt]{article}
-
- \flushbottom
- \textheight 9in
- \textwidth 6.5in
- \textfloatsep 30pt plus 3pt minus 6pt
- \parskip 7.5pt plus 1pt minus 1pt
- \oddsidemargin 0in
- \evensidemargin 0in
- \topmargin 0in
- \headheight 0in
- \headsep 0in
-
- % Hanging Paragraph
- \def\XP{\par\begingroup\parindent 0in\everypar{\hangindent .3in}}
- \def\eXP{\par\endgroup}
-
- % Left Justified Paragraph
- \def\LP{\par\begingroup\parindent 0in\everypar{\hangindent 0in}}
- \def\eLP{\par\endgroup}
-
- \begin{document}
- \thispagestyle{empty}
-
- \LP
- \bf Thomas S. Ray\rm \\
- School of Life \& Health Sciences, University of Delaware, Newark, Delaware
- 19716,\\
- ray@brahms.udel.edu\\
- \rule[6pt]{6.5in}{1pt}
- \Large \bf Evolution and Optimization of Digital Organisms\rm \normalsize\\
- \rule[6pt]{6.5in}{2pt}
- \eLP
-
- Digital organisms have been synthesized based on a computer metaphor of
- organic life in which CPU time is the ``energy'' resource and memory is
- the ``material'' resource. Memory is organized into informational ``genetic''
- patterns that exploit CPU time for self-replication. Mutation generates
- new forms, and evolution proceeds by natural selection as different
- ``genotypes'' compete for CPU time and memory space. In addition, new
- genotypes appear which exploit other ``creatures'' for informational or
- energetic resources.
-
- The digital organisms are self-replicating computer programs, however,
- they can not escape because they run exclusively on a virtual computer
- in its unique machine language. From a single ancestral ``creature''
- there have evolved tens of thousands of self-replicating genotypes of
- hundreds of genome size classes. Parasites evolved, then creatures that
- were immune to parasites, and then parasites that could circumvent the
- immunity. Hyper-parasites evolved which subvert parasites to their own
- reproduction and drive them to extinction. The resulting genetically
- uniform communities evolve sociality in the sense of creatures that can
- only reproduce in cooperative aggregations, and these aggregations are
- then invaded by cheating hyper-hyper-parasites.
-
- Diverse ecological communities have emerged. These digital communities
- have been used to experimentally study ecological and evolutionary processes:
- e.g., competitive exclusion and coexistance, symbiosis, host/parasite density
- dependent population regulation, the effect of parasites in enhancing
- community diversity, evolutionary arms races, punctuated equilibrium, and
- the role of chance and historical factors in evolution. It is possible to
- extract information on any aspect of the system without disturbing it, from
- phylogeny or community structure through time to the ``genetic makeup'' and
- ``metabolic processes'' of individuals. Digital life demonstrates the
- power of the computational approach to science as a complement to the
- traditional approaches of experiment, and theory based on analysis through
- calculus and differential equations.
-
- Optimization experiments have shown that freely evolving digital organisms
- can optimize their algorithms by a factor of 5.75 in a few hours of real
- time. In addition, evolution discovered the optimization technique of
- ``unrolling the loop''. Evolution may provide a new method for the
- optimization or generation of application programs. This method may
- prove particularly useful for programming massively parallel machines.
-
- \LP
- \rule[6pt]{6.5in}{1pt}
- evolution, ecology, artificial life, synthetic life, emergence,
- self-replication, diversity, adaptation, coevolution, optimization
- \eLP
-
- \newpage
-
- % \baselineskip = 20pt
-
- \LP
- \bf Thomas S. Ray\rm \\
- School of Life \& Health Sciences, University of Delaware, Newark, Delaware
- 19716,\\
- ray@brahms.udel.edu\\
- \rule[6pt]{6.5in}{1pt}
- \Large \bf Evolution and Optimization of Digital Organisms\rm \normalsize\\
- \rule[6pt]{6.5in}{2pt}
- \vspace{6cm}
-
- \begin{quote}
- Marcel, a mechanical chessplayer... his exquisite 19th-century brainwork
- --- the human art it took to build which has been flat lost, lost as the
- dodo bird ... But where inside Marcel is the midget Grandmaster, the
- little Johann Allgeier? where's the pantograph, and the magnets? Nowhere.
- Marcel really is a mechanical chessplayer. No fakery inside to give
- him any touch of humanity at all.\\
- \hspace*{2in}--- Thomas Pynchon, \it Gravity's Rainbow\rm .
- \end{quote}
-
- \large \bf 1 INTRODUCTION\rm \normalsize
- \eLP
-
- Ideally, the science of biology should embrace all forms of life. However
- in practice, it has been restricted to the study of a single instance of
- life, life on earth. Life on earth is very diverse, but it is presumably
- all part of a single phylogeny. Because biology is based on a sample size
- of one, we can not know what features of life are peculiar to earth, and
- what features are general, characteristic of all life. A truly comparative
- natural biology would require inter-planetary travel, which is light
- years away. The ideal experimental evolutionary biology would involve
- creation of multiple planetary systems, some essentially identical,
- others varying by a parameter of interest, and observing them for billions
- of years.
-
- A practical alternative to an inter-planetary or mythical biology is to
- create synthetic life in a computer. The objective is not necessarily
- to create life forms that would serve as models for the study of natural
- life, but rather to create radically different life forms, based on a
- completely different physics and chemistry, and let these life forms
- evolve their own phylogeny, leading to whatever forms are natural to their
- unique physical basis. These truly independent instances of life may
- then serve as a basis for comparison, to gain some insight into what is
- general and what is peculiar in biology. Those aspects of life that prove
- to be general enough to occur in both natural and synthetic systems can then
- be studied more easily in the synthetic system. ``Evolution in a bottle''
- provides a valuable tool for the experimental study of evolution and ecology.
-
- The intent of this work is to synthesize rather than simulate life. This
- approach starts with hand crafted organisms already capable of replication
- and open-ended evolution, and aims to generate increasing diversity and
- complexity in a parallel to the Cambrian explosion. To state such a goal
- leads to semantic problems, because life must be defined in a way that does
- not restrict it to carbon based forms. It is unlikely that there could be
- general agreement on such a definition, or even on the proposition that life
- need not be carbon based. Therefore, I will simply state my conception of
- life in its most general sense. I would consider a system to be living if
- it is self-replicating, and capable of open-ended evolution. Synthetic life
- should self-replicate, and evolve structures or processes that were not
- designed-in or pre-conceived by the creator (Pattee [30]; Cariani [5]).
-
- Core Wars programs, computer viruses, and worms (Cohen [6]; Dewdney [10,
- 11, 13, 14]; Denning [9]; Rheingold [32]; Spafford et al. [33]) are capable
- of self-replication, but fortunately, not evolution. It is unlikely that
- such programs will ever become fully living, because they are not likely
- to be able to evolve.
-
- Most evolutionary simulations are not open-ended. Their potential is limited
- by the structure of the model, which generally endows each individual with a
- genome consisting of a set of pre-defined genes, each of which may exist
- in a pre-defined set of allelic forms (Holland [20]; Dewdney [12]; Dawkins
- [7, 8]; Packard [29]; Ackley \& Littman [1]). The object being evolved
- is generally a data structure representing the genome, which the simulator
- program mutates and/or recombines, selects, and replicates according to
- criteria designed into the simulator. The data structures do not contain the
- mechanism for replication, they are simply copied by the simulator if they
- survive the selection phase.
-
- Self-replication is critical to synthetic life because without it, the
- mechanisms of selection must also be pre-determined by the simulator. Such
- artificial selection can never be as creative as natural selection. The
- organisms are not free to invent their own fitness functions. Freely
- evolving creatures will discover means of mutual exploitation and
- associated implicit fitness functions that we would never think of.
- Simulations constrained to evolve with pre-defined genes, alleles and fitness
- functions are dead ended, not alive.
-
- The approach presented here does not have such constraints. Although the
- model is limited to the evolution of creatures based on sequences of machine
- instructions, this may have a potential comparable to evolution based on
- sequences of organic molecules. Sets of machine instructions similar to
- those used in the Tierra Simulator have been shown to be capable of
- ``universal computation'' (Aho et al. [2]; Minsky [26]; Langton [24]).
- This suggests that evolving machine codes should be able to generate any
- level of complexity.
-
- Other examples of the synthetic approach to life can be seen in the work of
- Holland [21], Farmer et al.\ [16], Langton [22], Rasmussen et al.\
- [31], and Bagley et al.\ [3]. A characteristic these efforts
- generally have in common is that they parallel the origin of life event by
- attempting to create prebiotic conditions from which life may emerge
- spontaneously and evolve in an open ended fashion.
-
- While the origin of life is generally recognized as an event of the first
- order, there is another event in the history of life that is less well known
- but of comparable significance: the origin of biological diversity and
- macroscopic multicellular life during the Cambrian explosion 600 million
- years ago. This event involved a riotous diversification of life forms.
- Dozens of phyla appeared suddenly, many existing only fleetingly, as
- diverse and sometimes bizarre ways of life were explored in a relative
- ecological void (Gould [18]; Morris [27]).
-
- The work presented here aims to parallel the second major event in the
- history of life, the origin of diversity. Rather than attempting to create
- prebiotic conditions from which life may emerge, this approach involves
- engineering over the early history of life to design complex evolvable
- organisms, and then attempting to create the conditions that will set off
- a spontaneous evolutionary process of increasing diversity and complexity
- of organisms. This work represents a first step in this direction, creating
- an artificial world which may roughly parallel the RNA world of
- self-replicating molecules (still falling far short of the Cambrian explosion).
-
- The approach has generated rapidly diversifying communities of self-replicating
- organisms exhibiting open-ended evolution by natural selection. From a single
- rudimentary ancestral creature containing only the code for self-replication,
- interactions such as parasitism, immunity, hyper-parasitism, sociality and
- cheating have emerged spontaneously. This paper presents a methodology and
- some first results.
-
- Apart from its value as a tool for the study or teaching of ecology and
- evolution, synthetic life may have commercial applications. Evolution of
- machine code provides a new approach to the design and optimization of
- computer programs. In an analogy to genetic engineering, pieces of
- application code may be inserted into the genomes of digital organisms,
- and then evolved to new functionality or greater efficiency.
-
- \LP
- \rule[6pt]{6.5in}{1pt}
-
- \begin{quote}
- Here was a world of simplicity and certainty...
- a world based on the one and zero of life and
- death. Minimal, beautiful. The patterns of lives and deaths....
- weightless, invisible chains of electronic presence or absence. If
- patterns of ones and zeros were ``like'' patterns of human lives and
- deaths, if everything about an individual could be represented in a
- computer record by a long string of ones and zeros, then what kind of
- creature would be represented by a long string of lives and deaths?
- It would have to be up one level at least --- an angel, a minor god,
- something in a UFO.\\
- \hspace*{2in} --- Thomas Pynchon, \it Vineland\rm .
- \end{quote}
-
- \large \bf 2 METHODS\rm \normalsize
-
- \bf 2.1 THE METAPHOR\rm
- \eLP
-
- Organic life is viewed as utilizing energy, mostly derived
- from the sun, to organize matter. By analogy, digital life can be
- viewed as using CPU (central processing unit) time, to organize memory.
- Organic life evolves through natural selection as individuals compete for
- resources (light, food, space, etc.) such that genotypes which leave the
- most descendants increase in frequency. Digital life evolves through the
- same process, as replicating algorithms compete for CPU time and memory
- space, and organisms evolve strategies to exploit one another. CPU time is
- thought of as the analog of the energy resource, and memory as the analog
- of the spatial resource.
-
- The memory, the CPU and the computer's operating system are viewed as elements
- of the ``abiotic'' (physical) environment. A ``creature'' is then designed
- to be specifically adapted to the features of the computational environment.
- The creature consists of a self-replicating assembler language program.
- Assembler languages are merely mnemonics for the machine codes that are
- directly executed by the CPU. These machine codes have the characteristic
- that they directly invoke the instruction set of the CPU and services provided
- by the operating system.
-
- All programs, regardless of the language they are written in, are converted
- into machine code before they are executed. Machine code is the natural
- language of the machine, and machine instructions are viewed by this
- author as the ``atomic units'' of computing. It is felt that machine
- instructions provide the most natural basis for an artificial chemistry
- of creatures designed to live in the computer.
-
- In the biological analogy, the machine instructions are considered to be
- more like the amino acids than the nucleic acids, because they are
- ``chemically active''. They actively manipulate bits, bytes, CPU registers,
- and the movements of the instruction pointer (see below). The digital
- creatures discussed here are entirely constructed of machine instructions.
- They are considered analogous to creatures of the RNA world, because the
- same structures bear the ``genetic'' information and carry out the
- ``metabolic'' activity.
-
- A block of RAM memory (random access memory, also known as ``main'' or
- ``core'' memory) in the computer is designated as a ``soup'' which can
- be inoculated with creatures. The ``genome'' of the creatures consists of
- the sequence of machine instructions that make up the creature's
- self-replicating algorithm. The prototype creature consists of 80 machine
- instructions, thus the size of the genome of this creature is 80 instructions,
- and its ``genotype'' is the specific sequence of those 80 instructions
- (Appendix C).
-
- \LP
- \bf 2.2 THE VIRTUAL COMPUTER --- TIERRA SIMULATOR\rm
- \eLP
-
- The computers we use are general purpose computers, which means, among
- other things, that they are capable of emulating through software, the
- behavior of any other computer that ever has been built or that could be
- built (Aho et al. [2]; Minsky [26]; Langton [24]). We can utilize
- this flexibility to design a computer that would be especially hospitable
- to synthetic life.
-
- There are several good reasons why it is not wise to attempt to synthesize
- digital organisms that exploit the machine codes and operating systems of
- real computers. The most urgent is the potential threat of natural evolution
- of machine codes leading to virus or worm type of programs that could
- be difficult to eradicate due to their changing ``genotypes''. This potential
- argues strongly for creating evolution exclusively in programs that run only
- on virtual computers and their virtual operating systems. Such programs
- would be nothing more than data on a real computer, and therefore would
- present no more threat than the data in a data base or the text file of a
- word processor.
-
- Another reason to avoid developing digital organisms in the machine code of
- a real computer is that the artificial system would be tied to the hardware
- and would become obsolete as quickly as the particular machine it was
- developed on. In contrast, an artificial system developed on a virtual
- machine could be easily ported to new real machines as they become available.
-
- A third issue, which potentially makes the first two moot, is that
- the machine languages of real machines are not designed to be evolvable,
- and in fact might not support significant evolution. Von Neuman type
- machine languages are considered to be ``brittle'', meaning that the
- ratio of viable programs to possible programs is virtually zero. Any
- mutation or recombination event in a real machine code is almost certain
- to produce a non-functional program. The problem of brittleness can be
- mitigated by designing a virtual computer whose machine code is designed
- with evolution in mind. Farmer \& Belin [17] have suggested that
- overcoming this brittleness and ``Discovering how to make such self-replicating
- patterns more robust so that they evolve to increasingly more complex states
- is probably the central problem in the study of artificial life.''
-
- The work described here takes place on a virtual computer known as Tierra
- (Spanish for Earth). Tierra is a parallel computer of the MIMD (multiple
- instruction, multiple data) type, with a processor (CPU) for each creature.
- Parallelism is imperfectly emulated by allowing each CPU to execute a small
- time slice in turn. Each CPU of this virtual computer contains two address
- registers, two numeric registers, a flags register to indicate error
- conditions, a stack pointer, a ten word stack, and an instruction pointer.
- Each virtual CPU is implemented via the C structure listed in Appendix A.
- Computations performed by the Tierran CPUs are probabilistic due to flaws
- that occur at a low frequency (see Mutation below).
-
- The instruction set of a CPU typically performs simple arithmetic
- operations or bit manipulations, within the small set of registers contained
- in the CPU. Some instructions move data between the registers in the CPU,
- or between the CPU registers and the RAM (main) memory. Other instructions
- control the location and movement of an ``instruction pointer'' (IP). The
- IP indicates an address in RAM, where the machine code
- of the executing program (in this case a digital organism) is located.
-
- The CPU perpetually performs a fetch-decode-execute-increment\_IP
- cycle: The machine code instruction currently addressed by the IP
- is fetched into the CPU, its bit pattern is decoded to determine which
- instruction it corresponds to, and the instruction is executed. Then
- the IP is incremented to point sequentially to the next position in RAM,
- from which the next instruction will be fetched. However, some instructions
- like JMP, CALL and RET directly manipulate the IP, causing execution to
- jump to some other sequence of instructions in the RAM. In the Tierra
- Simulator this CPU cycle is implemented through the time\_slice routine
- listed in Appendix B.
-
- \LP
- \bf 2.3 THE TIERRAN LANGUAGE\rm
- \eLP
-
- Before attempting to set up a synthetic life system, careful thought must
- be given to how the representation of a programming language affects its
- adaptability in the sense of being robust to genetic operations such as
- mutation and recombination. The nature of the virtual computer is defined
- in large part by the instruction set of its machine language. The approach
- in this study has been to loosen up the machine code in a ``virtual
- bio-computer'', in order to create a computational system based on a hybrid
- between biological and classical von Neumann processes.
-
- In developing this new virtual language, which is called ``Tierran'', close
- attention has been paid to the structural and functional properties of the
- informational system of biological molecules: DNA, RNA and proteins. Two
- features have been borrowed from the biological world which are considered
- to be critical to the evolvability of the Tierran language.
-
- First, the instruction set of the Tierran language has been defined to be
- of a size that is the same order of magnitude as the genetic
- code. Information is encoded into DNA through 64 codons, which are
- translated into 20 amino acids. In its present manifestation, the Tierran
- language consists of 32 instructions, which can be represented by five bits,
- \it operands included\rm.
-
- Emphasis is placed on this last point because some instruction sets are
- deceptively small. Some versions of the redcode language of Core Wars
- (Dewdney [10, 13]; Rasmussen et al. [31]) for example are defined to have
- ten operation codes. It might appear on the surface then that the
- instruction set is of size ten. However, most of the ten instructions have
- one or two operands. Each operand has four addressing modes, and then an
- integer. When we consider that these operands are embedded into the machine
- code, we realize that they are in fact a part of the instruction set, and
- this set works out to be about $10^{11}$ in size. Similarly, RISC machines
- may have only a few opcodes, but they probably all use 32 bit instructions,
- so from a mutational point of view, they really have $2^{32}$ instructions.
- Inclusion of numeric operands will make any instruction set extremely large
- in comparison to the genetic code.
-
- In order to make a machine code with a truly small instruction set, we must
- eliminate numeric operands. This can be accomplished by allowing the CPU
- registers and the stack to be the only operands of the instructions. When
- we need to encode an integer for some purpose, we can create it in a numeric
- register through bit manipulations: flipping the low order bit and shifting
- left. The program can contain the proper sequence of bit flipping and shifting
- instructions to synthesize the desired number, and the instruction set need
- not include all possible integers.
-
- A second feature that has been borrowed from molecular biology in the design
- of the Tierran language is the addressing mode, which is called ``address
- by template''. In most machine codes, when a piece of data is addressed, or
- the IP jumps to another piece of code, the exact numeric address of the data
- or target code is specified in the machine code. Consider that in the
- biological system by contrast, in order for protein molecule A in the cytoplasm
- of a cell to interact with protein molecule B, it does not
- specify the exact coordinates where B is located. Instead, molecule A
- presents a template on its surface which is complementary to some surface on
- B. Diffusion brings the two together, and the complementary conformations
- allow them to interact.
-
- Addressing by template is illustrated by the Tierran JMP (jump) instruction.
- Each JMP instruction is followed by a sequence of NOP (no-operation)
- instructions, of which there are two kinds: NOP\_0 and NOP\_1. Suppose we
- have a piece of code with five instruction in the following order:
- JMP NOP\_0 NOP\_0 NOP\_0 NOP\_1. The system will search outward in both
- directions from the JMP instruction looking for the nearest occurrence of the
- complementary pattern: NOP\_1 NOP\_1 NOP\_1 NOP\_0. If the pattern is found,
- the instruction pointer will move to the end of the complementary pattern
- and resume execution. If the pattern is not found, an error condition (flag)
- will be set and the JMP instruction will be ignored (in practice, a limit
- is placed on how far the system may search for the pattern).
-
- The Tierran language is characterized by two unique features: a truly small
- instruction set without numeric operands, and addressing by template.
- Otherwise, the language consists of familiar instructions typical of most
- machine languages, e.g., MOV, CALL, RET, POP, PUSH etc. The complete
- instruction set is listed in Appendix B.
-
- \LP
- \bf 2.4 THE TIERRAN OPERATING SYSTEM\rm
- \eLP
-
- The Tierran virtual computer needs a virtual operating system that will be
- hospitable to digital organisms. The operating system will determine the
- mechanisms of interprocess communication, memory allocation, and the
- allocation of CPU time among competing processes. Algorithms will
- evolve so as to exploit these features to their advantage. More than being
- a mere aspect of the environment, the operating system together with the
- instruction set will determine the
- topology of possible interactions between individuals, such as the ability
- of pairs of individuals to exhibit predator-prey, parasite-host or
- mutualistic relationships.
-
- \LP
- \bf 2.4.1 Memory Allocation --- Cellularity\rm
- \eLP
-
- The Tierran computer operates on a block of RAM of the real computer which
- is set aside for the purpose. This block of RAM is referred to as the
- ``soup''. In most of the work described here the soup consisted of about
- 60,000 bytes, which can hold the same number of Tierran machine instructions.
- Each ``creature'' occupies some block of memory in this soup.
-
- Cellularity is one of the fundamental properties of organic life, and can
- be recognized in the fossil record as far back as 3.6 billion years (Barbieri
- [4]). The cell is the original individual, with the cell membrane defining
- its limits and preserving its chemical integrity. An analog to the cell
- membrane is needed in digital organisms in order to preserve the integrity of
- the informational structure from being disrupted easily by the activity of
- other organisms. The need for this can be seen in Artificial Life models
- such as cellular automata where virtual state machines pass through one
- another (Langton [22, 23]), or in core wars type simulations where
- coherent structures demolish one another when they come into contact
- (Dewdney [10, 13]; Rasmussen et al. [31]).
-
- Tierran creatures are considered to be cellular in the sense that they are
- protected by a ``semi-permeable membrane'' of memory allocation. The Tierran
- operating system provides memory allocation services. Each creature has
- exclusive write privileges within its allocated block of memory. The ``size''
- of a creature is just the size of its allocated block (e.g., 80 instructions).
- This usually corresponds to the size of the genome. This ``membrane'' is
- described as ``semi-permeable'' because
- while write privileges are protected, read and execute privileges are not.
- A creature may examine the code of another creature, and even execute it,
- but it can not write over it. Each creature may have exclusive write
- privileges in at most two blocks of memory: the one that it is born with
- which is referred to as the ``mother cell'', and a second block which it
- may obtain through the execution of the MAL (memory allocation) instruction.
- The second block, referred to as the ``daughter cell'', may be used to grow
- or reproduce into.
-
- When Tierran creatures ``divide'', the mother cell loses write privileges
- on the space of the daughter cell, but is then free to allocate another
- block of memory. At the moment of division, the daughter cell is given
- its own instruction pointer, and is free to allocate its own second block of
- memory.
-
- \LP
- \bf 2.4.2 Time Sharing --- The Slicer\rm
- \eLP
-
- The Tierran operating system must be multi-tasking (or parallel) in order for
- a community of individual creatures to live in the soup simultaneously. The
- system doles out small slices of CPU time to each creature in the soup in turn.
- The system maintains a circular queue called the ``slicer queue''. As each
- creature is born, a virtual CPU is created for it, and it enters the slicer
- queue just ahead of its mother, which is the active creature at that time.
- Thus the newborn will be the last creature in the soup to get another time
- slice after the mother, and the mother will get the next slice after its
- daughter. As long as the slice size is small relative to the generation
- time of the creatures, the time sharing system causes the world to approximate
- parallelism. In actuality, we have a population of virtual CPUs, each of
- which gets a slice of the real CPU's time as it comes up in the queue.
-
- The number of instructions to be executed in each time slice may be set
- proportional to the size of the genome of the creature being executed, raised
- to a power. If the ``slicer power'' is equal to one, then the slicer is size
- neutral, the probability of an instruction being executed does not depend on
- the size of the creature in which it occurs. If the power is greater than one,
- large creatures get more CPU cycles per instruction than small creatures.
- If the power is less than one, small creatures get more CPU cycles per
- instruction. The power determines if selection favors large or small
- creatures, or is size neutral. A constant slice size selects for small
- creatures.
-
- \LP
- \bf 2.4.3 Mortality --- The Reaper\rm
- \eLP
-
- Self-replicating creatures in a fixed size soup would rapidly fill the
- soup and lock up the system. To prevent this from occurring, it is
- necessary to include mortality. The Tierran operating system includes a
- ``reaper'' which begins ``killing'' creatures from a queue when the memory
- fills to some specified level (e.g., 80\%). Creatures are killed by
- deallocating their memory, and removing them from both the reaper and
- slicer queues. Their ``dead'' code is not removed from the soup.
-
- In the present system, the reaper uses a linear queue. When a creature is
- born it enters the bottom of the queue. The reaper always kills the creature
- at the top of the queue. However, individuals may move up or down in the
- reaper queue according to their success or failure at executing certain
- instructions. When a creature executes an instruction that generates an
- error condition, it moves one position up the queue, as long as the
- individual ahead of it in the queue has not accumulated a greater number
- of errors. Two of the instructions are somewhat difficult to execute
- without generating an error, therefore successful execution of these
- instructions moves the creature down the reaper queue one position, as long
- as it has not accumulated more errors than the creature below it.
-
- The effect of the reaper queue is to cause algorithms which are fundamentally
- flawed to rise to the top of the queue and die. Vigorous algorithms have a
- greater longevity, but in general, the probability of death increases with age.
-
- \LP
- \bf 2.4.4 Mutation\rm
- \eLP
-
- In order for evolution to occur, there must be some change in the genome
- of the creatures. This may occur within the lifespan of an individual,
- or there may be errors in passing along the genome to offspring. In order
- to insure that there is genetic change, the operating system randomly flips
- bits in the soup, and the instructions of the Tierran language are
- imperfectly executed.
-
- Mutations occur in two circumstances. At some background rate, bits are
- randomly selected from the entire soup (e.g., 60,000 instructions totaling
- 300,000 bits) and flipped. This is analogous to mutations caused by
- cosmic rays, and has the effect of preventing any creature from being
- immortal, as it will eventually mutate to death. The background mutation
- rate has generally been set at about one bit flipped for every 10,000
- Tierran instructions executed by the system.
-
- In addition, while copying instructions during the replication
- of creatures, bits are randomly flipped at some rate in the copies. The copy
- mutation rate is the higher of the two, and results in replication errors.
- The copy mutation rate has generally been set at about one bit flipped for
- every 1,000 to 2,500 instructions moved. In both classes of mutation,
- the interval between mutations varies randomly within a certain range to
- avoid possible periodic effects.
-
- In addition to mutations, the execution of Tierran instructions is flawed
- at a low rate. For most of the 32 instructions, the result is off by plus
- or minus one at some low frequency. For example, the increment instruction
- normally adds one to its register, but it sometimes adds two or zero. The
- bit flipping instruction normally flips the low order bit, but it sometimes
- flips the next higher bit or no bit. The shift left instruction normally
- shifts all bits one bit to the left, but it sometimes shifts left by two
- bits, or not at all. In this way, the behavior of the Tierran instructions
- is probabilistic, not fully deterministic.
-
- It turns out that bit flipping mutations and flaws in instructions are not
- necessary to generate genetic change and evolution, once the community
- reaches a certain state of complexity. Genetic parasites evolve which are
- sloppy replicators, and have the effect of moving pieces of code around
- between creatures, causing rather massive rearrangements of the genomes.
- The mechanism of this ad hoc sexuality has not been worked out, but is
- likely due to the parasites' inability to discriminate between live, dead
- or embryonic code.
-
- Mutations result in the appearance of new genotypes, which are watched
- by an automated genebank manager. In one implementation of the manager,
- when new genotypes replicate twice, producing a genetically identical
- offspring at least once, they are given a unique name and saved to disk.
- Each genotype name contains two parts, a number and a three letter code.
- The number represents the number of instructions in the genome. The three
- letter code is used as a base 26 numbering system for assigning a unique
- label to each genotype in a size class. The first genotype to appear in
- a size class is assigned the label aaa, the second is assigned the label
- aab, and so on. Thus the ancestor is named 80aaa, and the first mutant
- of size 80 is named 80aab. The first creature of size 45 is named 45aaa.
-
- The genebanker saves some additional information with each genome: the
- genotype name of its immediate ancestor which makes possible the
- reconstruction of the entire phylogeny; the time and date of origin;
- ``metabolic'' data including the number of instructions executed in the
- first and second reproduction, the number of errors generated in the first
- and second reproduction, and the number of instructions copied into the
- daughter cell in the first and second reproductions (see Appendix C, D); some
- environmental parameters at the time of origin including the search limit
- for addressing, and the slicer power, both of which affect selection for size.
-
- \LP
- \bf 2.5 THE TIERRAN ANCESTOR\rm
- \eLP
-
- I have used the Tierran language to write a single self-replicating program
- which is 80 instructions long. This program is referred to as the
- ``ancestor'', or alternatively as genotype 0080aaa (Fig.\ 1). The ancestor
- is a minimal self-replicating algorithm which was originally written for use
- during the debugging of the simulator. No functionality was designed into
- the ancestor beyond the ability to self-replicate, nor was any specific
- evolutionary potential designed in. The commented Tierran assembler and
- machine code for this program is presented in Appendix C.
-
- The ancestor examines itself to determine where in memory it begins and ends.
- The ancestor's beginning is marked with the four no-operation template:
- 1 1 1 1, and its ending is marked with 1 1 1 0. The ancestor locates its
- beginning with the five instructions: ADRB, NOP\_0, NOP\_0, NOP\_0, NOP\_0.
- This series of instructions causes the system to search backwards
- from the ADRB instruction for a template complementary to the four NOP\_0
- instructions, and to place the address of the complementary template
- (the beginning) in the ax register of the CPU (see Appendix A). A similar
- method is used to locate the end.
-
- Having determined the address of its beginning and its end, it subtracts
- the two to calculate its size, and allocates a block of memory of this size
- for a daughter cell. It then calls the copy procedure which copies the entire
- genome into the daughter cell memory, one instruction at a time.
- The beginning of the copy procedure is marked by the four no-operation
- template: 1 1 0 0. Therefore the call to the copy procedure is accomplished
- with the five instructions: CALL, NOP\_0, NOP\_0, NOP\_1, NOP\_1.
-
- When the genome has been copied, it executes the DIVIDE instruction, which
- causes the creature to lose write privileges on the daughter cell memory,
- and gives an instruction pointer to the daughter cell (it also enters the
- daughter cell into the slicer and reaper queues). After this first
- replication, the mother cell does not examine itself again; it proceeds
- directly to the allocation of another daughter cell, then the copy procedure
- is followed by cell division, in an endless loop.
-
- Fourty-eight of the eighty instructions in the ancestor are no-operations.
- Groups of four no-operation instructions are used as complementary templates
- to mark twelve sites for internal addressing, so that the creature can locate
- its beginning and end, call the copy procedure, and mark addresses for loops
- and jumps in the code, etc. The functions of these templates are commented
- in the listing in Appendix C.
-
- \pagebreak
-
- \LP
- \rule[6pt]{6.5in}{1pt}
-
- \large \bf 3 RESULTS\rm \normalsize
-
- \bf 3.1 EVOLUTION\rm
- \eLP
-
- Evolutionary runs of the simulator are begun by inoculating the soup of about
- 60,000 instructions with a single individual of the 80 instruction ancestral
- genotype. The passage of time in a run is measured in terms of how many
- Tierran instructions have been executed by the simulator. The original
- ancestral cell executes 839 instructions in its first replication, and 813 for
- each additional replication. The initial cell and its replicating daughters
- rapidly fill the soup memory to the threshold level of 80\% which starts the
- reaper. Typically, the system executes about 400,000 instructions in filling
- up the soup with about 375 individuals of size 80 (and their gestating
- daughter cells). Once the reaper begins, the memory remains roughly 80\%
- filled with creatures for the remainder of the run.
-
- \LP
- \bf 3.1.1 Micro-Evolution\rm
- \eLP
-
- If there were no mutations at the outset of the run, there would be no
- evolution. However, the bits flipped as a result of copy errors or background
- mutations result in creatures whose list of 80 instructions (genotype) differs
- from the ancestor, usually by a single bit difference in a single instruction.
-
- Mutations in and of themselves, can not result in a change in the size of
- a creature, they can only alter the instructions in its genome. However,
- by altering the genotype, mutations may affect the process whereby the
- creature examines itself and calculates its size, potentially causing it
- to produce an offspring that differs in size from itself.
-
- Four out of the five possible mutations in a no-operation instruction convert
- it into another kind of instruction, while one out of five converts it into
- the complementary no-operation. Therefore 80\% of mutations in templates
- destroy or change the size of the template, while one in five alters the
- template pattern. An altered template may cause the creature to make
- mistakes in self examination, procedure calls, or looping or jumps of the
- instruction pointer, all of which use templates for addressing.
-
- \LP
- \bf 3.1.1.1 parasites\rm
- \eLP
-
- An example of the kind of error that can result from a mutation in a
- template is a mutation of the low order bit of instruction 42 of the
- ancestor (Appendix C). Instruction 42 is a NOP\_0, the third component
- of the copy procedure template. A mutation in the low order bit would
- convert it into NOP\_1, thus changing the template from 1 1 0 0 to: 1 1 1 0.
- This would then be recognized as the template used to mark the end of the
- creature, rather than the copy procedure.
-
- A creature born with a mutation in the low order bit of instruction 42 would
- calculate its size as 45. It would allocate a daughter cell of size 45 and
- copy only instructions 0 through 44 into the daughter cell. The daughter
- cell then, would not include the copy procedure. This daughter genotype,
- consisting of 45 instructions, is named 0045aaa.
-
- Genotype 0045aaa (Fig.\ 1) is not able to self-replicate in isolated culture.
- However, the semi-permeable membrane of memory allocation only protects write
- privileges. Creatures may match templates with code in the allocated memory
- of other creatures, and may even execute that code. Therefore, if creature
- 0045aaa is grown in mixed culture with 0080aaa, when it attempts to call the
- copy procedure, it will not find the template within its own genome, but if
- it is within the search limit (generally set at 200--400 instructions) of the
- copy procedure of a creature of genotype 0080aaa, it will match templates, and
- send its instruction pointer to the copy code of 0080aaa. Thus a parasitic
- relationship is established (see ECOLOGY below). Typically, parasites begin
- to emerge within the first few million instructions of elapsed time in a run.
-
- \LP
- \bf 3.1.1.2 immunity to parasites\rm
- \eLP
-
- At least some of the size 79 genotypes demonstrate some measure of
- resistance to parasites. If genotype 45aaa is introduced into a soup,
- flanked on each side with one individual of genotype 0079aab, 0045aaa will
- initially reproduce somewhat, but will be quickly eliminated from the soup.
- When the same experiment is conducted with 0045aaa and the ancestor, they
- enter a stable cycle in which both genotypes coexist indefinitely. Freely
- evolving systems have been observed to become dominated by size 79 genotypes
- for long periods, during which parasitic genotypes repeatedly appear, but
- fail to invade.
-
- \LP
- \bf 3.1.1.3 circumvention of immunity to parasites\rm
- \eLP
-
- Occasionally these evolving systems dominated by size 79 were successfully
- invaded by parasites of size 51. When the immune genotype 0079aab was tested
- with 0051aao (a direct, one step, descendant of 0045aaa in which instruction
- 39 is replaced by an insertion of seven instructions of unknown origin), they
- were found to enter a stable cycle. Evidently 0051aao has evolved some way to
- circumvent the immunity to parasites possessed by 0079aab. The fourteen
- genotypes 0051aaa through 0051aan were also tested with 0079aab, and none were
- able to invade.
-
- \LP
- \bf 3.1.1.4 hyper-parasites\rm
- \eLP
-
- Hyper-parasite have been discovered, (e.g., 0080gai, which differs by 19
- instructions from the ancestor, Fig.\ 1). Their ability to subvert the energy
- metabolism of parasites is based on two changes. The copy procedure does not
- return, but jumps back directly to the proper address of the reproduction loop.
- In this way it effectively seizes the instruction pointer from the parasite.
- However it is another change which delivers the coup de gr\^{a}ce: after
- each reproduction, the hyper-parasite re-examines itself, resetting the bx
- register with its location and the cx register with its size. After the
- instruction pointer of the parasite passes through this code, the CPU of the
- parasite contains the location and size of the hyper-parasite and the
- parasite thereafter replicates the hyper-parasite genome.
-
- \LP
- \bf 3.1.1.5 social hyper-parasites\rm
- \eLP
-
- Hyper-parasites drive the parasites to extinction. This results in a
- community with a relatively high level of genetic uniformity, and therefore
- high genetic relationship between individuals in the community. These are
- the conditions that support the evolution of sociality, and social
- hyper-parasites soon dominate the community. Social hyper-parasites (Fig.\ 2)
- appear in the 61 instruction size class. For example, 0061acg is social in
- the sense that it can only self-replicate when it occurs in aggregations.
- When it jumps back to the code for self-examination, it jumps to a template
- that occurs at the end rather than the beginning of its genome. If the
- creature is flanked by a similar genome, the jump will find the target
- template in the tail of the neighbor, and execution will then pass into
- the beginning of the active creature's genome. The algorithm will fail unless
- a similar genome occurs just before the active creature in memory. Neighboring
- creatures cooperate by catching and passing on jumps of the instruction
- pointer.
-
- It appears that the selection pressure for the evolution of sociality is that
- it facilitates size reduction. The social species are 24\% smaller than the
- ancestor. They have achieved this size reduction in part by shrinking their
- templates from four instructions to three instructions. This means that there
- are only eight templates available to them, and catching each others jumps
- allows them to deal with some of the consequences of this limitation as well
- as to make dual use of some templates.
-
- \LP
- \bf 3.1.1.6 cheaters: hyper-hyper-parasites\rm
- \eLP
-
- The cooperative social system of hyper-parasites is subject to cheating,
- and is eventually invaded by hyper-hyper-parasites (Fig.\ 2). These cheaters
- (e.g., 0027aab) position themselves between aggregating hyper-parasites so
- that when the instruction pointer is passed between them, they capture it.
-
- \LP
- \bf 3.1.1.7 a novel self-examination\rm
- \eLP
-
- All creatures discussed thus far mark their beginning and end with templates.
- They then locate the addresses of the two templates and determine their genome
- size by subtracting them. In one run, creatures evolved without a template
- marking their end. These creatures located the address of the template
- marking their beginning, and then the address of a template in the middle of
- their genome. These two addresses were then subtracted to calculate half of
- their size, and this value was multiplied by two (by shifting left) to
- calculate their full size.
-
- \LP
- \bf 3.1.1.8 an intricate adaptation\rm
- \eLP
-
- The arms race described in the paragraphs above took place over a period of
- a billion instructions executed by the system. Another run was allowed to
- continue for fifteen billion instructions, but was not examined in detail.
- A creature present at the end of the run was examined and found to have
- evolved an intricate adaptation. The adaptation is an optimization technique
- known as ``unrolling the loop''.
-
- The central loop of the copy procedure performs the following operations:
- 1) copies an instruction from the mother to the daughter, 2) decrements the
- cx register which initially contains the size of the parent genome, 3) tests
- to see if cx is equal to zero, if so it exits the loop, if not it remains
- in the loop, 4) increment the ax register which contains the address in the
- daughter where the next instruction will be copied to, 5) increment the
- bx register which contains the address in the mother where the next instruction
- will be copied from, 6) jump back to the top of the loop.
-
- The work of the loop is contained in steps 1, 2, 4 and 5. Steps 3 and 6 are
- overhead. The efficiency of the loop can be increased by duplicating the
- work steps within the loop, thereby saving on overhead. The creature from
- the end of the long run had repeated the work steps three times within the
- loop, as illustrated in Appendix E, which compares the copy loop of the
- ancestor with that of its decendant.
-
- \LP
- \bf 3.1.2 Macro-Evolution\rm
- \eLP
-
- When the simulator is run over long periods of time, hundreds of millions or
- billions of instructions, various patterns emerge. Under selection for small
- sizes there is a proliferation of small parasites and a rather interesting
- ecology (see below). Selection for large creatures has usually lead to
- continuous incrementally increasing sizes (but not to a trivial concatenation
- of creatures end-to-end) until a plateau in the upper hundreds is reached.
- In one run, selection for large size lead to apparently open ended size
- increase, evolving genomes larger than 23,000 instructions in length.
- These evolutionary patterns might be described as phyletic gradualism.
-
- The most thoroughly studied case for long runs is where selection, as
- determined by the slicer function, is size neutral. The longest runs to date
- (as much as 2.86 billion Tierran instructions) have been in a size neutral
- environment, with a search limit of 10,000, which would allow large creatures
- to evolve if there were some algorithmic advantage to be gained from larger
- size. These long runs illustrate a pattern which could be described as
- periods of stasis punctuated by periods of rapid evolutionary change, which
- appears to parallel the pattern of punctuated equilibrium described by
- Eldredge \& Gould [15] and Gould \& Eldredge [19].
-
- Initially these communities are dominated by creatures with genome sizes
- in the eighties. This represents a period of relative stasis, which has
- lasted from 178 million to 1.44 billion instructions in the several long
- runs conducted to date. The systems then very abruptly (in a span of 1 or
- 2 million instructions) evolve into communities dominated by sizes ranging
- from about 400 to about 800. These communities have not yet been seen to
- evolve into communities dominated by either smaller or substantially larger
- size ranges.
-
- The communities of creatures in the 400 to 800 size range also show a
- long-term pattern of punctuated equilibrium. These communities regularly come
- to be dominated by one or two size classes, and remain in that condition for
- long periods of time. However, they inevitably break out of that stasis
- and enter a period where no size class dominates. These periods of rapid
- evolutionary change may be very chaotic. Close observations indicate that
- at least at some of these times, no genotypes breed true. Many
- self-replicating genotypes will coexist in the soup at these times, but at
- the most chaotic times, none will produce offspring which are even their same
- size. Eventually the system will settle down to another period of stasis
- dominated by one or a few size classes which breed true.
-
- \LP
- \bf 3.2 DIVERSITY\rm
- \eLP
-
- Most observations on the diversity of Tierran creatures have been based on
- the diversity of size classes. Creatures of different sizes are clearly
- genetically different, as their genomes are of different sizes. Different
- sized creatures would have some difficulty engaging in recombination if they
- were sexual, thus it is likely that they would be different species.
- In a run of 526 million instructions, 366 size classes were generated, 93
- of which achieved abundances of five or more individuals. In a run of
- 2.56 billion instructions, 1180 size classes were generated, 367 of which
- achieved abundances of five or more.
-
- Each size class consists of a number of distinct genotypes which also vary
- over time. There exists the potential for great genetic diversity within a
- size class. There are 32$^{80}$ distinct genotypes of size 80, but how many
- of those are viable self-replicating creatures? This question remains
- unanswered, however some information has been gathered through the use
- of the automated genebank manager.
-
- In several days of running the genebanker, over 29,000 self-replicating
- genotypes of over 300 size classes accumulated. The size classes and
- the number of unique genotypes banked for each size are listed in Table 1.
- The genotypes saved to disk can be used to inoculate new soups individually,
- or collections of these banked genotypes may be used to assemble ``ecological
- communities''. In ``ecological'' runs, the mutation rates can be set to zero
- in order to inhibit evolution.
-
-