/* TIERRA.DOC 9-9-92 documentation for the Tierra Simulator */
/* Tierra Simulator V4.0: Copyright (c) 1991, 1992 Tom Ray & Virtual Life */
This file contains the following sections:
1) LICENSE AGREEMENT
2) WHAT THIS PROGRAM IS, PUBLICATIONS, NEWS
3) RELATED SOFTWARE (IMPORTANT)
4) QUICK START <-------------- You might want to start here!
4.1) DOS QUICK START
4.2) UNIX QUICK START
5) RUNNING TIERRA
5.1) Startup
5.2) The Assembler/Disassembler
5.3) The Birth-Death Output
5.4) The Genebank Output
5.5) Restarting an Old Run
5.6) The User Interface
5.6.1) The Basic Interface
5.6.1.1) The Basic Screen
5.6.1.2) The Menu Options
5.6.1.2.1) The Size Histogram
5.6.1.2.2) The Memory Histogram
5.6.1.2.3) The Genotype Histogram
5.6.1.2.4) The Size Class Information Display
5.6.1.2.5) The Virtual Debugger
5.6.1.2.6) The Genome Injector
5.6.2) The Standard Output
5.6.3) The tierra.log file
6) LISTING OF DISTRIBUTION FILES
7) SOUP_IN PARAMETERS (IMPORTANT)
8) THE INSTRUCTION SETS
8.1) Synopsis of the Four Sets
8.2) Details of Set 1
8.3) New Features in Sets 2 through 4
9) THE ANCESTOR & WRITING A CREATURE
9.1) The Ancestor
9.2) Writing a Creature
10) IF YOU WANT TO MODIFY THE SOURCE CODE
10.1) Creating a Frontend
10.2) Creating New Instruction Sets
10.3) Creating New Slicer Mechanisms
10.4) Creating a Sexual Model
10.5) Creating a Multi-cellular Model
11) CONFIGURATION AT COMPILE TIME (configur.h)
12) KNOWN BUGS
13) IF YOU HAVE PROBLEMS
13.1) Problems with Installation
13.2) Problems Running Tierra
14) REGISTRATION & MAILING LISTS
1) LICENSE AGREEMENT
/*
* Tierra Simulator V4.0: Copyright (c) 1990, 1991, 1992 Thomas S. Ray
*
* by Tom Ray, ray@brahms.udel.edu ray@santafe.edu (the bulk of the code)
* Dan Pirone, cocteau@life.slhs.udel.edu (frontend, overhaul, sex)
* Tom Uffner, tom@genie.slhs.udel.edu (rework of genebanker & assembler)
*
* If you purchased this program on disk, thank you for your support. If
* you obtained the source code through the net or friends, we invite you to
* contribute an amount that represents the program's worth to you. You may
* make a check in US dollars payable to Virtual Life, and mail the check to
* one of the two addresses listed below.
*
* This license agreement has two parts:
*
* 1) The source code, documentation, and the beagle.exe file can be freely
* distributed.
*
* 2) The executables (the .exe files in DOS) are for sale and can not be
* freely distributed (with the exception of the beagle.exe file).
* Executables (binaries) on any platform (Unix, Mac, Amiga, DOS, etc.)
* can not be freely distributed.
*
* These two points are elaborated below:
*
* 1) The source code, documentation, and the beagle.exe file can be freely
* distributed.
*
* The source code and documentation is copyrighted, all rights reserved.
* The source code, documentation, and the beagle.exe file may be freely
* copied and distributed without fees (contributions welcome), subject to
* the following restrictions:
*
* - This notice may not be removed or altered.
*
* - You may not try to make money by distributing the package or by using the
* process that the code creates.
*
* - You may not prevent others from copying it freely.
*
* - You may not distribute modified versions without clearly documenting your
* changes and notifying the principal author.
*
* - The origin of this software must not be misrepresented, either by
* explicit claim or by omission. Since few users ever read sources,
* credits must appear in the documentation.
*
* - Altered versions must be plainly marked as such, and must not be
* misrepresented as being the original software. Since few users ever read
* sources, credits must appear in the documentation.
*
* 2) The executables (the .exe files in DOS) are for sale and can not be
* freely distributed (with the exception of the beagle.exe file).
* Executables (binaries) on any platform (Unix, Mac, Amiga, DOS, etc.)
* can not be freely distributed.
*
* The executables (the .exe files in DOS) are copyrighted, all rights
* reserved. You should treat this software just like a book. This means
* that this software (the executables) may be used by any number of people
* and may be freely moved from one computer to another so long as the program
* is not used by more than one person at a time. This applies to binaries on
* any platform.
*
* The following provisions also apply in both cases 1 and 2:
*
* - Virtual Life and the authors are not responsible for the consequences of
* use of this software, no matter how awful, even if they arise from flaws
* in it.
*
* - Neither the name of Virtual Life, nor the authors of the code may be used
* to endorse or promote products derived from this software without
* specific prior written permission.
*
* - The provision of support and software updates is at our discretion.
*
* Please contact Tom Ray (full address below) if you have questions or would
* like an exception to any of the above restrictions.
*
* If you make changes to the code, or have suggestions for changes,
* let us know! If we use your suggestion, you will receive full credit
* of course.
*
* THIS SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR IMPLIED
* WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
* MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
*
* Tom Ray
*
* Virtual Life (September through December)
* P.O. Box 625
* Newark, Delaware 19715
*
* School of Life & Health Sciences
* University of Delaware
* Newark, DE 19716
*
* ray@udel.edu (email)
* 302-831-2753 (Phone)
* 302-831-2281 (Fax)
*
* or
*
* Santa Fe Institute (January through August)
* 1660 Old Pecos Trail
* Suite A
* Santa Fe, NM 87501
*
* ray@santafe.edu (email)
* 505-984-8800 (Phone)
* 505-982-0565 (Fax)
*/
2) WHAT THIS PROGRAM IS, PUBLICATIONS, NEWS
The C source code creates a virtual computer and its operating system,
whose architecture has been designed in such a way that the executable
machine codes are evolvable. This means that the machine code can be mutated
(by flipping bits at random) or recombined (by swapping segments of code
between algorithms), and the resulting code remains functional enough of the
time for natural (or presumably artificial) selection to be able to improve
the code over time.
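As a rough illustration of the two operations just described, the following
Python sketch flips one random bit in a genome and swaps a segment between two
genomes. It is not taken from the Tierra source: representing a genome as a
list of small integers, the 5-bit instruction width, and the function names
are all assumptions made for the example.

```python
import random

def flip_random_bit(genome, bits_per_instruction, rng):
    """Return a copy of genome with one randomly chosen bit flipped."""
    mutated = list(genome)
    site = rng.randrange(len(mutated))         # which instruction is hit
    bit = rng.randrange(bits_per_instruction)  # which bit inside it
    mutated[site] ^= 1 << bit                  # XOR toggles exactly one bit
    return mutated

def swap_segment(a, b, start, end):
    """Recombine two genomes by exchanging the slice [start:end]."""
    child_a = a[:start] + b[start:end] + a[end:]
    child_b = b[:start] + a[start:end] + b[end:]
    return child_a, child_b

rng = random.Random(0)                 # seeded so the run is repeatable
parent = [1, 2, 3, 4, 5, 6, 7, 8]      # hypothetical 5-bit opcodes
mutant = flip_random_bit(parent, 5, rng)
```

Whether a mutant like this still functions is exactly what selection then
tests; the sketch only shows how cheap the variation step itself is.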
Along with the C source code which generates the virtual computer, we
provide several programs written in the assembler code of the virtual
computer. One of these was written by a human and does nothing more than
make copies of itself in the RAM of the virtual computer. The others evolved
from the first, and are included to illustrate the power of natural selection.
The virtual machine is an emulation of a MIMD (multiple instruction
stream, multiple data stream) computer. This is a massively parallel computer
in which each processor is capable of executing a sequence of operations
distinct from the other processors. The parallelism is only emulated by
time slicing, but there really are numerous virtual CPUs. One CPU will be
created and assigned to each ``creature'' (self-replicating algorithm)
living in the RAM of the virtual computer. The RAM of the virtual computer
is known as the ``soup''.
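The idea of emulating parallelism by time slicing can be sketched as follows.
This is only a toy model, not the Tierra slicer: the VirtualCPU class, the
fixed slice size, and the strict round-robin order are assumptions chosen for
simplicity. In Tierra itself the operating system controls how CPU time is
allocated to each creature.

```python
from collections import deque

class VirtualCPU:
    """Hypothetical stand-in for the CPU assigned to one creature."""
    def __init__(self, name):
        self.name = name
        self.executed = 0          # instructions executed so far

    def step(self):
        self.executed += 1         # a real CPU would decode and execute here

def run_time_slices(cpus, slice_size, total_slices):
    """Give each CPU in turn a slice of slice_size instructions."""
    queue = deque(cpus)
    if not queue:
        return cpus
    for _ in range(total_slices):
        cpu = queue[0]             # CPU at the head of the queue runs next
        for _ in range(slice_size):
            cpu.step()
        queue.rotate(-1)           # move it to the back of the queue
    return cpus

creatures = [VirtualCPU("0080aaa"), VirtualCPU("0080aaa"), VirtualCPU("0080aaa")]
run_time_slices(creatures, slice_size=10, total_slices=6)
```

Only one virtual CPU executes at any moment, but each keeps its own state
between slices, which is what makes the CPUs distinct.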
The operating system of the virtual computer provides memory management
and timesharing services. It also provides control for a variety of factors
that affect the course of evolution: three kinds of mutation rates,
disturbances, the allocation of CPU time to each creature, the size of the
soup, the spatial distribution of creatures, etc. In addition, the operating
system provides a very elaborate observational system that keeps a record of
births and deaths, sequences the code of every creature, and maintains a
genebank of successful genomes. The operating system also provides facilities
for automating the ecological analysis, that is, for recording the kinds of
interactions taking place between creatures.
The version of the software currently being distributed is considered
to be a research grade implementation. This means two things: 1) It is
under very rapid development, and may not be completely bug free. 2) We have
chosen to go with modifiability and modularity over speed of execution.
If you find bugs in the code, please report them to us. By the time
you find them and report them, we may have eliminated them, and would be
able to provide you with a fixed version. If not, we will be able to fix
the bug, and would like to make the fix available to other users.
We have chosen modifiability over speed primarily because we know that
the original version of the virtual computer was very poorly designed, except
with respect to the features that make it evolvable. Specifically, consider
that one third of the original instruction set is taken up by pushing and
popping from the stack; there are only two inter-register moves, ax to bx and
cx to dx; dx isn't used for anything (in the initial version, dx was used to
set the template size, but that has been abandoned); there are no moves
between CPU registers and RAM; there is no I/O; and there is no way of
addressing a data segment.
In August 1991, 100% of the original virtual CPU code was replaced with
new code that does exactly the same thing. However, the new code is written
in a generalized way, a meta-virtual computer, that makes it trivial to
alter the machine architecture. With the new implementation of the virtual
computer, it is possible for anyone to painlessly swap in their favorite
CPU architecture and instruction set, and their innovation will be seamlessly
embedded within the heart of the elaborate observational software. Knowing
how bad the original design was, there was a temptation to fix it when the
virtual computer was reworked, but the original implementation was retained
for historical reasons. In spite of its shortcomings, life proliferated in
the environment that it created. Things should get interesting as we improve
the architecture. The new organization of the code should make that easy.
In August of 1992, the modifications of the previous August bore fruit
as three new instruction sets were implemented. These are documented in some
detail below, and in general, eliminate most of the problems discussed
above.
The bulk of the code and documentation was written by Tom Ray, whose
address is listed at the end of this file. Substantial contributions have
been made by the following people. Dan Pirone, cocteau@santafe.edu, has been
involved in the Tierra project since the Fall of 1990, and began working full
time on Tierra in February 1992. Dan has written the user interface,
and has been working on a phylogeny tool that should be completed by
October of 1992. Tom Uffner, tom@genie.slhs.udel.edu, reworked the genebanker
and assembler/disassembler in the Fall of 1991. Marc Cygnus,
cygnus@udel.edu, developed the ALmond monitor, a separate piece
of software that displays activity in a running Tierra (see below).
The behavior of this software is described in the following publications:
Ray, T. S. 1991. ``Is it alive, or is it GA?''
Proceedings of the 1991 International Conference on Genetic Algorithms,
Eds. Belew, R. K., and L. B. Booker, San Mateo, CA: Morgan Kaufmann, 527-534.
Ray, T. S. 1991. ``An approach to the synthesis of life.''
Artificial Life II, Santa Fe Institute Studies in the Sciences of
Complexity, vol. XI, Eds. Farmer, J. D., C. Langton, S. Rasmussen, &
C. Taylor, Redwood City, CA: Addison-Wesley, 371-408.
Ray, T. S. 1991. ``Population dynamics of digital organisms.''
Artificial Life II Video Proceedings, Ed. C.G. Langton,
Redwood City, CA: Addison Wesley.
Ray, T. S. 1991. ``Evolution and optimization of digital organisms.''
Scientific Excellence in Supercomputing: The IBM 1990 Contest Prize
Papers, Eds. Keith R. Billingsley, Ed Derohanes, Hilton Brown, III.
Athens, GA, 30602, The Baldwin Press, The University of Georgia.
Ray, T. S. 1992. ``Evolution ecology and optimization of digital organisms.''
Santa Fe Institute working paper (available in the ftp site as tierra.tex
and tierra.ps).
The Tierra Simulator has been widely reported in the media. Below is a
list of most of the national or international reports that I am aware of.
If you know of some news report not on this list, please send me a hard
copy.
Nature (John Maynard Smith, UK) February 27, 1992: ``Byte-sized evolution.
...we badly need a comparative biology. So far, we have been able to study
only one evolving system and we cannot wait for interstellar flight to
provide us with a second. If we want to discover generalizations about
evolving systems, we will have to look at artificial ones. Ray's study is a
good start.''
Science (Mitchell Waldrop, USA) August 21, 1992: ``Artificial Life's Rich
Harvest, Startlingly realistic simulations of organisms, ecosystems, and
evolution are unfolding on computer screens as researchers try to recreate
the dynamics of living things.''
New York Times (Malcolm Browne, USA) August 27, 1991: ``Lively Computer
Creation Blurs Definition of Life. Software forms, obeying Darwin's rules,
vie to avoid the `reaper'.''
Whole Earth Review (Steven Levy, USA) Fall 1992: ``Artificial Life, The
Quest for a New Creation.'' Reprinted from a book of the same name published
by Pantheon Books.
Science News (John Travis, USA) August 10, 1991: ``Digital Darwinism:
Electronic Ecosystem. Evolving `life' flourishes and surprises in a
novel electronic world''.
Nature (Laurence Hurst & Richard Dawkins, UK) May 21, 1992:
``Life in a test tube.''
Scientific American (John Rennie, USA) January 1992: ``Cybernetic Parasites...
Tierra... has been hailed as the most sophisticated artificial-life program
yet developed...''
New Scientist (Roger Lewin, UK) February 22, 1992: ``Life and death in a
digital world. No one can turn back the evolutionary clock, but we can
follow the fate of a rich menagerie of artificial organisms as they evolve
in a model world.''
The Economist (Anon, UK) January 4, 1992: ``The meaning of `life'.
In order to understand the origin of life, scientists are switching from the
chemistry set to the computer. In the process, they are beginning to
understand what it means to be alive.''
Release 1.0 (Esther Dyson, US) April 28, 1992: ``Artificial Worlds: A
Field Scientist in Tierra Cognita.''
Guardian (Jocelyn Paine, UK) January 9, 1992: ``Unravelling the loop in the
primordial soup. Tierran machine code is so adaptable it survives. Jocelyn
Paine charts the evolution of artificial life within the computer.''
Asahi (Katsura Hattori, Japan) September 15, 1992: Title in Japanese
characters.
Actuel (Ariel Kyrou, France) April 1992: ``Visite Guidee Aux Extremes De
La Science: La Vie Artificielle. Etes-vous prêts à entrer dans
l'univers vertigineux de la vie artificielle? Un champ scientifique tout neuf
sur lequel se penchent les grosses têtes et les Nobel de labos
américains.''
The Chronicle of Higher Education (David Wilson, USA) December 4, 1991:
``Approaching Artificial Life on a Computer. Survival-of-the-fittest
electronic organisms dramatically illustrate Darwinian principles.''
Mikrobitti (Pekka Tolonen, Finland) November 1991: ``Olemmeko humanoiden
biologinen koe? Tierra simuloi elämää.''
Europeo (Giovanni Caprara, Italy) September 1991: ``Anche il computer ha
fatto un figlio. Un biologo americano ha creato un software capace di
elaborare programmi che si evolvono da soli.''
GenteMoney (Riccardo Orizio, Italy) November 1991: ``Così ho dato
la vita al software.''
Computerworld (Michael Alexander, USA) September 30, 1991: ``Tierra adds to
evolutionary studies. A computerized world created on an IBM PC could
have real-world benefits for scientists.''
Sueddeutsche Zeitung (Konrad Peters, Germany) October 21, 1991:
``Die Evolution im Computer. `Künstliches Leben' hilft Biologen und
Informatikern auf die Sprünge.''
Super Interessante (Anon, Brazil) November 1991: ``A vida dentro do
computador.''
Technology Review (Susan Scheck, USA) April 14, 1991: ``Is It Live Or Is
It Memory?''
Corriere Della Sera (Giovanni Caprara, Italy) August 28, 1991: ``Pronto in
USA il programma che si riproduce. Il computer `padre' crea vita
informatica.''
Fakta (Tom Ottmar, Norway) March 1992: ``Den Lever! En `skabning', der
består af nuller og énere, er vokset ud af indamaden på en
computer og er blevet en videnskabelig sensation i USA.''
Associated Press (Theresa Humphrey, USA) October 1991: ``Bringing life to
computer. U of D biologist's program is self-replicating, shows evolution.''
Hovedområdet (Jakob Skipper, Denmark) December 6, 1990: ``Kunstigt liv.
Nu kommer det kunstige liv. En voksende gruppe af dataloger, biologer,
fysikere, psykologer og mange andre forskere efterlinger på computer
det naturlige liv.''
3) RELATED SOFTWARE (IMPORTANT)
The Tierra simulator is the central piece of a growing set of programs.
The accessory programs aid in observing the results of Tierra runs. You will
want to select one or both of the following:
3.1) The Beagle Explorer which graphically displays the output that Tierra
saves to disk. Beagle runs only on DOS systems, and while the heart of
the source code is available, Beagle uses the Greenleaf DataWindows
interface, and that source can not be distributed (available from
Greenleaf Software, 16479 Dallas Parkway, Bent Tree Tower Two, Suite 570,
Dallas, Texas, 75248, phone: 214-248-2561). Beagle would normally be
distributed in the executable form. Once you have the executables, you
will be able to run them on your PC, and you will not need Greenleaf.
You only need Greenleaf if you plan to modify the source code. Beagle is
currently configured only for the CGA or VGA graphics modes, but can be
extended on request, in time. Beagle executables are now available on disk
from Virtual Life, or in the same ftp site as the Tierra source code, in
the directory: /beagle. If you pick up the Beagle executables from
the ftp site, remember to transfer the files in binary mode.
3.2) The ALmond Monitor which displays activity in a running Tierra
simulator. ALmond runs as a simultaneous but independent process (from
Tierra) on the same or on a separate machine, and establishes network
socket communication with Tierra. ALmond can be attached to and detached
from a running Tierra simulator without interrupting the simulation.
ALmond runs only on unix systems supporting X Windows. The entire ALmond
source code is available. ALmond was developed and tested on Sun 3 and Sun
4 machines. It was developed under X11R4 by Marc Cygnus. Dan Pirone
has now taken over ALmond development.
4) QUICK START
The steps required to run the system on DOS and UNIX are slightly
different, so there are two sets of instructions listed below.
4.1) DOS QUICK START
If you obtained the Tierra software on disk, the installation program
will take care of steps 1 - 3, so you can skip to step 4. If you
obtained the software over the net, start with step 1.
step 1) You should have a directory containing the executables and
source code and five subdirectories: td, gb1, gb2, gb3, and gb4. The td
directory is where a record of births and deaths will be written. The gb
directories contain the initial genomes used to inoculate the soup and the
opcode maps. The genebanker will save new genomes to the gb directories.
There is a gb directory for each of the four instruction sets.
step 2) You must compile the assembler/disassembler, arg, and the
simulator, tierra. We include the two Turbo C V 2.0 project files:
tierra.prj and arg.prj. If you are using a more recent version of the
compiler, such as Borland C++, you must use the Borland project tool to
create a binary project file. Just list the files listed in the two ascii
project files that are provided. Compile these projects using the large
memory model. Put the executables in the path.
step 3) You must assemble the initial genomes, as binaries are not
portable. To do this, go into the gb1 directory and type:
arg c 0080.gen 80 0080aaa.tie
This will create the binary file 0080.gen which contains a creature that you
can use to inoculate the soup, the ancestor 0080aaa. You can check to see if
this worked by disassembling the genome, by typing:
arg x 0080.gen aaa
This will create the ascii file 0080aaa. Compare it to the original,
0080aaa.tie (it will not be exactly the same). Before you start a run, copy
0080.gen to 0080gen.vir, in order to have a virgin copy for use later when
you start another run.
copy 0080.gen 0080gen.vir
You can do the same for each of the four gb directories (gb1, gb2, gb3,
and gb4). Be sure to assemble the genomes listed at the ends of the
corresponding soup_in files (soup_in1, soup_in2, soup_in3, soup_in4).
step 4) Go back to the source code directory and examine the file
soup_in1. This file contains all of the parameters that control the run. It
is currently set up to inoculate the soup with one cell of genotype 0080aaa,
and to run for 500 million instructions in a soup of 50,000 instructions. You
will need a text editor if you want to modify this file. If you use a regular
word processor, be sure that you write the file back out as a plain ASCII text
file.
step 5) Run the simulator by typing: tierra
step 6) When the run is over, if you want to start a new run, you
should clean up the genebank, because the simulator will read in all genomes
in the genebank at startup. The best way to do this is to use the batch files
that are provided for this purpose: clr1.bat, clr2.bat, clr3.bat and clr4.bat.
Since we are discussing a run of instruction set 1, use clr1.bat, by typing
clr1 to the DOS prompt. The batch file will take care of the cleanup.
If you wish to use a cumulative genebank in successive runs, use the
corresponding cumulative clear batch files: cclr1.bat, cclr2.bat, cclr3.bat,
and cclr4.bat.
4.2) UNIX QUICK START
step 1) You should have a directory containing the source code and five
subdirectories: td and gb1, gb2, gb3 and gb4. The td (tiedat) directory is
where a record of births and deaths will be written. The gb (genebank)
directories contain the initial genomes used to inoculate the soup and the
opcode map, and the genebanker will save new genomes to these directories.
step 2) You must compile the assembler/disassembler, arg, and the
simulator, tierra. There is a Makefile included to perform the compilation.
This Makefile needs to be edited to comment in the lines for your particular
hardware. It has been tested on Sun 3, Sun 4, IBM RS6000, Silicon Graphics
Personal Iris and Indigo, DEC DS5000, and NeXT. If you can use the Makefile,
type: make, and follow instructions.
step 3) You must assemble the initial genome, as binaries are not
portable. To do this, go into the gb1 directory and type:
../arg c 0080.gen 80 0080aaa.tie
This will create the binary file 0080.gen which contains a creature that you
can use to inoculate the soup, the ancestor 0080aaa. You can check to
see if this worked by disassembling the genome, by typing:
../arg x 0080.gen aaa
This will create the ascii file 0080aaa. Compare it to the original,
0080aaa.tie (they will not be exactly the same). Before you start a run,
copy 0080.gen to 0080gen.vir, in order to have virgin copies for use later
when you start another run.
cp 0080.gen 0080gen.vir
You can do the same for each of the four gb directories (gb1, gb2, gb3,
and gb4). Be sure to assemble the genomes listed at the ends of the
corresponding soup_in files (soup_in1, soup_in2, soup_in3, soup_in4).
step 4) Go back to the source code directory and examine the file
soup_in1. This file contains all of the parameters that control the run. It
is currently set up to inoculate the soup with one cell of genotype 0080aaa,
and to run for 500 million instructions in a soup of 50,000 instructions.
step 5) Run the simulator by typing:
tierra
or: tierra > /dev/null &
(this runs it in the background; a log file can be created by setting
the soup_in variable Log = 1)
In order to run tierra in the background, you must compile it with:
#define FRONTEND STDIO
If you will run Tierra in the foreground, we recommend that you use:
#define FRONTEND BASIC
These definitions are made in the configur.h file.
step 6) When the run is over, if you want to start a new run, you
should clean up the genebank. The best way to do this is to use the
Unix script files that have been provided for this purpose (clr1, clr2,
clr3 and clr4). You must make the clr# files executable by changing their
protection:
chmod +x clr1
Then all you have to do is type ``clr1'' to the prompt, and the
shell script will take care of the cleanup.
If you wish to use a cumulative genebank in successive runs, use the
cumulative clear files (cclr1, cclr2, cclr3, cclr4). You must also make
sure that they are executable:
chmod +x cclr1
5) RUNNING TIERRA
This section has the following sub-sections:
5.1) Startup
5.2) The Assembler/Disassembler
5.3) The Birth-Death Output
5.4) The Genebank Output
5.5) Restarting an Old Run
5.6) The User Interface
5.6.1) The Basic Interface
5.6.1.1) The Basic Screen
5.6.1.2) The Menu Options
5.6.1.2.1) The Size Histogram
5.6.1.2.2) The Memory Histogram
5.6.1.2.3) The Genotype Histogram
5.6.1.2.4) The Size Class Information Display
5.6.1.2.5) The Virtual Debugger
5.6.1.2.6) The Genome Injector
5.6.2) The Standard Output
5.6.3) The tierra.log file
5.1) Startup
The first steps in running Tierra are described briefly above. One must
place the genomes and the opcode map in the gb directory, and one must have
created the td directory to receive the output of birth and death data. The
genome files are supplied in the form of ASCII assembler code files. These
must be assembled into binary form to be able to execute on the virtual
machine. If you type arg, the assembler will give you a brief listing of
assembler options. More complete documentation of the assembler follows:
5.2) The Assembler/Disassembler
This documentation was written by Tom Uffner.
Arg(1) USER COMMANDS Arg(1)
NAME
arg - genebank archive utility
SYNOPSIS
arg c|r[v12...9] afile size file1 [file2...]
arg x|t[v12...9] afile [gen1 [gen2...]]
DESCRIPTION
The arg utility is used to manipulate the genebank archives
that are used by tierra(1). It is used to assemble or
disassemble tierra code, list the genomes contained in a file,
and also to convert between the old and new file formats.
The arg commands are:
c - create afile and add genomes in file1...filen
r - replace in afile (or add to end) genomes in
file1...filen
x - extract entire contents or specified genomes from
afile
t - list entire contents or specified genomes in afile
The optional modifiers are:
v - verbose output
1,2...9 - instruction set (defaults to Format: parameter
of ascii source or to INST = 1)
Where filen and afile are any legal filenames, genn is a 3
character genome label, and size is a decimal integer. (Note
that tierra(1) expects archives to have names consisting of
4 digits, and an extension of .gen or .tmp.)
FILES
GenebankPath/nnnn.gen permanently saved genomes
GenebankPath/nnnn.tmp genomes from periodic saves
SEE ALSO
An Approach to the Synthesis of Life
tierra(1), ov(1X), beagle(1DOS), genio(3), arg(5)
BUGS
Genome extraction and internal search functions could be
faster, and will be in the next release.
Tierra, V 4.0 Last change: 8 September 1992
Please remember that this new form of arg needs the file opcode.map
to be in the current working directory. Arg ALWAYS reads the file
opcode.map for the mappings for assembling/disassembling genebank
archives.
5.3) The Birth-Death Output
During a run, if the DiskOut parameter is non-zero, a record of births
and deaths will be written to disk in the path specified by OutPath, to
files whose names depend on the BrkupSiz parameter. The format of this
file is a bit cryptic, so it will be explained here. The file has either
three or four columns of output, depending on whether the GeneBnker parameter
is set. Three of the columns remain the same either way: 1) elapsed time
since last birth or death event in instructions, output in hexadecimal format.
2) a `b' or a `d' depending on whether this is a birth or a death. 3) the size
of the creature in instructions, output in decimal format. If the genebanker
is on, then there will be a fourth column containing the three letter code
identifying the genotype of the creature. Mutations appear in the
birth-death record as the death of one genotype followed by the birth of
another, with an elapsed time of zero between the two events.
What makes the file cryptic, and also compact, is that columns are
implied to be identical in successive records unless otherwise indicated.
Only the first column, elapsed time since last record, must be printed on
every line, and only the first record must have all three or four columns.
Therefore, if there is a series of successive births, only the first birth
record in the series will contain the b. Notice that at the beginning of
the file, there will generally be many lines with just one column, because
at the outset, all records are of births of the same size and genotype.
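The compression scheme described above can be undone with a few lines of
Python. This is a hedged sketch, not the bread.c reader from Beagle: it assumes
that an omitted column is simply left out of the line, and that the remaining
tokens can be told apart by their shape (a lone b or d is the event, an
all-digit token is the size, anything else is the genotype label). The sample
lines are invented for the example.

```python
def expand_birth_death(lines):
    """Expand compact birth-death lines into full 4-field records."""
    event, size, genotype = None, None, None   # carried over between lines
    records = []
    for line in lines:
        tokens = line.split()
        if not tokens:
            continue                           # skip blank lines
        elapsed = int(tokens[0], 16)           # column 1 is hexadecimal
        for tok in tokens[1:]:
            if tok in ("b", "d"):
                event = tok                    # birth or death
            elif tok.isdigit():
                size = int(tok)                # creature size, decimal
            else:
                genotype = tok                 # three letter label
        records.append((elapsed, event, size, genotype))
    return records

# Hypothetical input: three births of 0080aaa, one death, then a new genotype.
sample = ["0 b 80 aaa", "1f", "a3", "2c d", "0 b 79 aab"]
records = expand_birth_death(sample)
```

With the genebanker off, the genotype field in each record would simply stay
None under these assumptions.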
The record of births and deaths is read by the Beagle program, and
converted into a variety of graphic displays: frequency distributions over
time, phase diagrams of the interactions between pairs of sizes or genotypes,
or diversity and related measures over time. The source code for reading and
interpreting the record of births and deaths is in the bread.c module of the
Beagle source code.
5.4) The Genebank Output
If the GeneBnker parameter is set to a non-zero value, then as each
creature is born, its genome will be sequenced and compared to that of its
mother. If they are identical, the daughter will be assigned the same name
as the mother. If they are different, the genome of the daughter will be
compared to the same size genomes held in the RAM genebank. If the daughter
genome is found in the bank, it will be given the same name as the matching
genome in the bank. If the daughter genome is not found in the RAM genebank,
it will be compared to any same size genomes stored on the disk that are not
in the RAM genebank. If the daughter genome is found in the disk genebank,
it will be given the same name as the matching genome in the disk genebank,
and that genome will be brought back into RAM from the disk. If the
daughter genome does not match the mother or any genome in either the RAM
or disk banks, then it will be assigned an arbitrary but unique three letter
code for identification.
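The naming procedure in the paragraph above amounts to a cascade of lookups,
which might be sketched like this. The data layout (a dictionary per size
class mapping labels to genomes), the function names, and the choice to hand
out new labels in alphabetical order are assumptions for illustration; the
text only says the new code is arbitrary but unique.

```python
from itertools import product
from string import ascii_lowercase

def label_sequence():
    """Yield candidate labels 'aaa', 'aab', ... 'zzz'."""
    for letters in product(ascii_lowercase, repeat=3):
        yield "".join(letters)

def name_daughter(daughter, mother, mother_name, ram_bank, disk_bank):
    """Return the three letter label for a newborn genome.

    ram_bank and disk_bank map size -> {label: genome}.
    """
    if daughter == mother:
        return mother_name                    # identical to the mother
    size = len(daughter)
    ram = ram_bank.setdefault(size, {})
    for label, genome in ram.items():
        if genome == daughter:
            return label                      # found in the RAM genebank
    disk = disk_bank.get(size, {})
    for label, genome in disk.items():
        if genome == daughter:
            ram[label] = genome               # bring it back into RAM
            return label
    used = set(ram) | set(disk)
    label = next(l for l in label_sequence() if l not in used)
    ram[label] = daughter                     # assumption: new genomes enter RAM
    return label

ram_bank = {3: {"aaa": (1, 2, 3)}}            # hypothetical tiny genomes
disk_bank = {3: {"aab": (1, 2, 4)}}
```

Labels are unique only within a size class, which matches the way full names
such as 0080aaa combine the size with the label.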
The genebanker keeps track of the frequency of each size class and each
genotype in the soup. If a genotype exceeds one of the two genotype frequency
thresholds, SavThrMem or SavThrPop, its assigned name will be made permanent,
and it will be saved to disk in a .gen file. Genotypes are grouped into
individual files on the basis of size. For example, all permanent genotypes
of size 80 will be stored together in binary form in a file called 0080.gen.
When the simulator comes down, or when the state of the simulator is
saved periodically during a run, all genotypes present in the soup which have
not been assigned permanent names will be stored in files with a .tmp
extension. For example, all temporary genotype names of size 45 would be
stored in binary form in a file called 0045.tmp.
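The file naming convention can be stated compactly in code. The helper names
below are hypothetical; the four-digit zero-padded size and the .gen and .tmp
extensions follow the examples given in the text.

```python
def genebank_filename(size, permanent):
    """File that holds genotypes of this size: nnnn.gen or nnnn.tmp."""
    extension = "gen" if permanent else "tmp"
    return f"{size:04d}.{extension}"

def genotype_name(size, label):
    """Full genotype name, e.g. size 80 with label 'aaa' gives '0080aaa'."""
    return f"{size:04d}{label}"
```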
The parameter RamBankSiz places a limit on how many genotypes will be
stored in RAM. If there are more than RamBankSiz genotypes in the RAM bank,
any of them that have permanent genotype names but have gone extinct in the
soup will be swapped out to disk into a .gen file. Genotypes without
permanent names are deleted when they go extinct. If the number
of living genotypes in the soup is greater than RamBankSiz, then the value
of RamBankSiz is essentially ignored (actually it is treated as if the
value were equal to the number of living genotypes). Living genotypes are
never swapped out to disk, only extinct ones. Under DOS, the parameter
RamBankSiz is of little importance, because it is raised if there are more
living genotypes and if the simulator uses all 640K of RAM, RamBankSiz is
lowered so that genotypes will be swapped out to avoid running out of
memory. Under Unix, this parameter does determine how many genotypes
will be held in the rambank, as long as there are not more than RamBankSiz
living genotypes.
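The rules in this paragraph can be summarized in a small C sketch. This is
an illustration of the policy as described here, not the genebanker's actual
code; the function and enum names are hypothetical, and the DOS-specific
lowering of RamBankSiz when memory runs short is left out.

```c
#include <assert.h>

/* What happens to a genotype that is currently held in the RAM bank. */
enum bank_action { KEEP_IN_RAM, SWAP_TO_DISK, DELETE_GENOTYPE };

/* RamBankSiz is treated as at least the number of living genotypes. */
int effective_bank_limit(int ram_bank_siz, int living_genotypes)
{
    assert(ram_bank_siz >= 0 && living_genotypes >= 0);
    return living_genotypes > ram_bank_siz ? living_genotypes : ram_bank_siz;
}

/* Decide the fate of one genotype, given how many genotypes the RAM
 * bank currently holds (bank_count). */
enum bank_action bank_decision(int is_living, int has_permanent_name,
                               int bank_count, int ram_bank_siz,
                               int living_genotypes)
{
    if (is_living)
        return KEEP_IN_RAM;        /* living genotypes are never swapped out */
    if (!has_permanent_name)
        return DELETE_GENOTYPE;    /* extinct temporary names are deleted */
    if (bank_count > effective_bank_limit(ram_bank_siz, living_genotypes))
        return SWAP_TO_DISK;       /* extinct, permanent, and bank over limit */
    return KEEP_IN_RAM;
}
```

For example, with RamBankSiz set to 100 and 250 living genotypes, the
effective limit in this sketch is 250, so nothing is swapped out until the
bank holds more than 250 genotypes.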
The binary genebank files can be examined with the assembler-disassembler
arg (see the relevant documentation, section 5.2 above). Also, the Beagle
Explorer program contains utilities for examining the structure of genomes.
One tool condenses the code by pulling out all instructions using templates,
which can reveal the pattern of control flow of the algorithm. Another
function allows one genome to be used as a probe of another, to compare
the similarities and differences between genomes, or to look for the presence
of a certain sequence in a genome. A completely separate tool called probe
will scan the genebank pulling out any genomes that meet a variety of
criteria that the user may define.
5.5) Restarting an Old Run
When you start Tierra by typing tierra at the prompt, you may provide
an optional command line argument, which is the name of the file to use as
input. This is the file that contains the startup parameters. The default
file name is soup_in. When a simulator comes down, and periodically during
a run, the complete state of the machine is saved so that the simulator can
start up again where it left off. In order to do this you must have the
simulator read the file soup_out on startup. This means that you must type:
tierra soup_out
That is all there is to it.
5.6) The User Interface
There are two interfaces available for Tierra. The source code and
executables are shipped in a form that is configured for the nicer of the
two, the Basic interface. However, if you make the appropriate modifications
to the configur.h file, you can recompile with the Standard Output interface
(useful for running Tierra in the background). Documentation for each of
these interfaces follows.
5.6.1) The Basic Interface
5.6.1.1) The Basic Screen
The Basic frontend features a dynamic interface to the simulation.
The screen area is divided into five basic areas:
The STATS area consists of the top two lines of the screen. This area
displays several important variables, whose values are updated after every
birth:
> InstExec = 75,529993 Cells = 387 Genotypes = 191 Sizes = 23
> Extracted = 0080aad @ 8
InstExec = 75,529993 tells us that the simulation has executed a total of
75,529,993 instructions. Cells = 387 tells us that there are presently
387 adult cells living in the soup. Genotypes = 191 tells us that there
are presently 191 distinct genotypes of adult cells living in the soup.
Sizes = 23 tells us that there are presently 23 distinct sizes of adult
cells living in the soup. Extracted = 0080aad @ 8 tells us that the last
genotype to cross one of the frequency thresholds (SavThrMem or SavThrPop)
and get saved to disk and get a permanent name was a creature of size 80
with the name aad, which had a population of 8 adult cells when it crossed
the threshold.
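The InstExec display can be confusing at first, because the comma separates
millions from the remaining six digits rather than groups of three. The
following C sketch shows one way to format and read back such a value. It
assumes the counter is kept as two numbers (millions and a remainder below
one million), which is suggested by the display but not confirmed here; the
function names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Format an instruction count as "M,RRRRRR", e.g. 75 and 529993 give
 * "75,529993" (75,529,993 instructions). Returns 0 on success, -1 on
 * bad input or a buffer that is too small. (Hypothetical helper.) */
int format_inst_exec(char *buf, size_t buflen, long millions, long remainder)
{
    int n;

    assert(buf != NULL);
    if (millions < 0 || remainder < 0 || remainder >= 1000000L)
        return -1;
    n = snprintf(buf, buflen, "%ld,%06ld", millions, remainder);
    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}

/* Read a value in the same "M,RRRRRR" form back into its two parts. */
int parse_inst_exec(const char *text, long *millions, long *remainder)
{
    assert(text != NULL && millions != NULL && remainder != NULL);
    if (sscanf(text, "%ld,%6ld", millions, remainder) != 2)
        return -1;
    return (*millions < 0 || *remainder < 0) ? -1 : 0;
}
```

The zero padding in the format matters: a remainder of 46072 must be shown
as 046072, or the displayed number would be misread.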
The PLAN area displays the values of several variables, whose values are
updated every million instructions:
> InstExeC = 75 Generations = 188 Thu Apr 30 11:49:46 1992
> NumCells = 356 NumGenotypes = 178 NumSizes = 21
> AvgSize = 76 NumGenDG = 171 NumGenRQ = 239
> AvgPop = 376 Births = 1007 Deaths = 1034
> RateMut = 10966 RateMovMut = 1216 RateFlaw = 97280
> MaxGenPop = 21 (0078aak) MaxGenMem = 21 (0078aak)
InstExeC = 75 tells us that this set of variables was written when
the simulation had executed 75 million instructions. Generations = 188
tells us that the simulation had run for about 188 generations at this time.
Thu Apr 30 11:49:46 1992 tells us the actual time and date that this data
was printed.
NumCells = 356 tells us that there were 356 adult cells living in the soup.
NumGenotypes = 178 tells us that there were 178 distinct genotypes of adult
cells living in the soup. NumSizes = 21 tells us that there were 21 distinct
sizes of adult cells living in the soup.
AvgSize = 76 tells us that during the last million instructions, the
average size was 76. NumGenDG = 171 tells us that there are 171 genotypes
that have received permanent names and been saved to disk as .gen files in
the genebank directory gb. NumGenRQ = 239 tells us that at this time there
were 239 distinct genotypes being held in the genebank in RAM (the ramqueue).
AvgPop = 376 tells us that during the last million instructions, the
average population was 376 adult cells. Births = 1007 tells us that during the last
million instructions, there were 1007 births. Deaths = 1034 tells us that
during the last million instructions, there were 1034 deaths.
RateMut = 10966 tells us that the actual average background (cosmic ray)
mutation rate for the upcoming million instructions will be one mutation per
10966/2 instructions executed. RateMovMut = 1216 tells us that the actual
average move mutation rate (copy error) for the upcoming million instructions
will be one mutation for every 1216/2 instructions copied. RateFlaw = 97280
tells us that the actual average flaw rate for the upcoming million
instructions will be one flaw for every 97280/2 instructions executed. The
reason that these numbers represent twice the average mutation rates is that
they are used to set the range of a uniform random variate determining the
interval between mutations.
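The factor of two can be seen in a short C sketch: if the interval to the
next mutation is drawn uniformly between 0 and the displayed rate, the mean
interval is half the rate. The sketch assumes the range is [0, rate]
inclusive and uses a small stand-in random number generator; neither detail
is taken from the Tierra source, and the function names are hypothetical.

```c
#include <assert.h>

/* Stand-in 64-bit linear congruential generator (Knuth's MMIX constants).
 * The simulator's own generator is not reproduced here. */
static unsigned long long lcg_state = 1ULL;

static unsigned long lcg_next(void)
{
    lcg_state = lcg_state * 6364136223846793005ULL + 1442695040888963407ULL;
    return (unsigned long)(lcg_state >> 33);   /* keep the upper 31 bits */
}

/* Draw the number of instructions until the next mutation,
 * uniformly distributed on [0, rate] (assumed range). */
long next_mutation_interval(long rate)
{
    assert(rate >= 0);
    return (long)(lcg_next() % (unsigned long)(rate + 1));
}

/* Mean of a uniform variate on [0, rate]: half the displayed rate. */
double mean_mutation_interval(long rate)
{
    return (double)rate / 2.0;
}
```

With RateMut = 10966, mean_mutation_interval() gives 5483, which matches the
"one mutation per 10966/2 instructions" reading given above.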
MaxGenPop = 21 (0078aak) tells us that at this time, the genotype
with the largest population is 0078aak, and that it has a population of 21
adult cells. MaxGenMem = 21 (0078aak) tells us that the genotype whose
adult cells occupy the largest share of space in the soup is 0078aak, and
that it has a population of 21 adult cells.
The MESSAGE area displays state changes and Genebank data. This area serves
many purposes: memory reallocation messages, Genebank information displays,
and large interface prompts (e.g., changing a soup_in variable).
The ERROR area is the second to last line of the screen.
All simulation errors and exit conditions are passed through this area.
The HELP area is the last line of the screen. This area provides
suggestions for keystroke navigation. Under DOS it will usually look
like this:
> Press any Key for menu ...
Under Unix it will generally look like this:
> Press Interupt Key for menu ...
Under DOS, pressing any key will get you the menu; under Unix, pressing the
interrupt key (usually Ctrl C) will get you the menu, described in the
next section.
5.6.1.2) The Menu Options
Please note that if the Tierra simulator is started with two arguments,
it will come up with the menu system activated. The first argument must be
the name of the soup_in file, the second argument is a dummy and anything will
do: tierra soup_in1 junk
The frontend menu looks like this:
> TIERRA | i-info v-var s-save q-save&quit Q-quit m-misc c-continue |->
These options allow for rapid IO from the simulation, in a user friendly
format. The interface allows transition between any data display mode. The
features are activated by a single keypress ("Enter/Return" is usually not
needed).
The options are:
i-info v-var s-save q-save&quit Q-quit m-misc c-continue |->
i-info: will display some information about creatures stored in the Genebank.
See below for details.
v-var: allows you to alter the value of any of the variables in the
soup_in file at any point during a run.
s-save: will cause the state of the system to be saved at this point, and
then continue the run.
q-save&quit: will cause the state of the system to be saved at this point,
and then exit from the run.
Q-quit: will exit immediately without saving the state of the system.
m-misc: pulls up the miscellaneous sub-menu
c-continue: continue the run.
If you press 'i' for info, from the main TIERRA menu, you are able to select
one of the Genebank (GeneBnker = 1) data display modes:
> INFO | p-plan s-size_histo g-gen_histo m-mem_histo z-size_query |->
The second line from the top of the screen will change, providing additional
information on the operating system memory use, or the Tierran memory use.
At the bottom of the screen, a new list of Genebank data display options is
available. All of these modes will try to provide as much data as can fit on
the screen. These modes are detailed below.
Hit the c key to continue (to get out of this level of the menu).
The options are:
p-plan s-size_histo g-gen_histo m-mem_histo z-size_query
If you press 'p' from the info menu you get the Plan Display mode:
>Now in Plan Display mode, updated every million time steps
This provides the normal statistics every million virtual time steps
(this is also an easy way to clear the message area):
> InstExeC = 141 Generations = 363 Thu Mar 26 15:20:28 1992
> NumCells = 483 NumGenotypes = 202 NumSizes = 27
> AvgSize = 53 AvgPop = 471
> RateMut = 5426 RateMovMut = 848 RateFlaw = 16960
> NumGenRQ = 579 NumGenDM = 0 NumGenDG = 400
> births = 1210 deaths = 1125
> MaxGenPop = 48 (0032aah) MaxGenMem = 48 (0032aah)
The meaning of this information is detailed above.
5.6.1.2.1) The Size Histogram
If you press 's' from the info menu, the message area will show a histogram
of frequency distributions of the currently living size classes:
13 1 2 | *
18 3 9 | **
19 3 4 | *
20 2 4 | *
21 1 3 | *
22 1 4 | *
27 3 5 | *
34 2 4 | *
35 1 4 | *
36 34 84 | ****************
37 70 198 | *************************************
38 33 92 | ******************
39 7 9 | **
40 13 25 | *****
44 1 2 | *
45 1 2 | *
46 2 2 | *
47 1 4 | *
49 1 2 | *
54 1 1 | *
56 2 3 | *
70 10 13 | ***
71 21 42 | ********
72 24 42 | ********
73 6 6 | **
74 7 12 | ***
75 3 3 | *
77 2 2 | *
78 6 7 | **
109 1 2 | *
In the above histogram, the left column of numbers is the size class,
the middle column is the number of genotypes of that size class, and the
right column is the number of living adult cells of that size class.
5.6.1.2.2) The Memory Histogram
If you press 'm' from the info menu, the message area will show a histogram
of frequency distributions of the currently living size classes, by memory
use:
Size Memory use (size * freq.)
18 3 180 | *
19 3 76 | *
20 2 60 | *
21 1 63 | *
22 1 110 | *
27 3 162 | *
31 1 31 | *
33 1 33 | *
34 2 136 | *
35 1 105 | *
36 37 3096 | *****************
37 69 7141 | *************************************
38 34 3610 | *******************
39 7 351 | **
40 13 960 | *****
41 1 41 | *
42 1 42 | *
43 1 43 | *
44 1 44 | *
45 1 90 | *
46 2 92 | *
47 1 235 | **
48 1 48 | *
49 1 98 | *
50 1 50 | *
54 1 54 | *
56 2 168 | *
66 1 66 | *
71 22 42 | *
72 25 42 | *
In the above histogram, the left column of numbers is the size class,
the middle column is the number of genotypes of that size class, and the
right column is the amount of memory occupied by living adult cells of that
size class.
5.6.1.2.3) The Genotype Histogram
If you press 'g' from the info menu, the message area will show a histogram
of genotype sizes, by specific genotypes:
Size Frequency
18aab 3 | ***
aac 6 | ******
27aab 2 | **
aad 3 | ***
35aaa 3 | ***
36aam 7 | *******
aat 39 | **************************************
abb 3 | ***
37abb 26 | *************************
abp 5 | *****
abt 6 | ******
acm 23 | **********************
acn 4 | ****
38abb 5 | *****
abc 3 | ***
40aaj 3 | ***
aal 2 | **
47aaa 5 | *****
49aaa 2 | **
56aac 2 | **
70aae 2 | **
aaf 2 | **
71aac 3 | ***
aaj 4 | ****
aat 3 | ***
aav 6 | ******
72aai 2 | **
aau 3 | ***
aax 4 | ****
74aac 4 | ****
In the above histogram, the left column of numbers is the size and genotype
class, and the right column is the number of living adult cells of that
genotype class.
5.6.1.2.4) The Size Class Information Display
If you press 'z' from the info menu, you will be prompted for a
specific size class to examine, then you will get a list of the most
common genotypes of that size, with some statistics on them:
Gene: # Mem Errs Move Bits
aae 4 0 1 39 EX TC TP MF MT MB
aaj 1 0 0 0 EX TC TP MF MT MB
aak 1 0 0 0 EX TC TP MF MT MB
aan 1 0 0 0 EX TC TP MF MT MB
aap 1 0 0 0 EX TC TP MF MT MB
aaq 1 0 0 0 EX TC TP MF MT MB
aat 10 0 2 38 EXsofh TCsofh TPs MF MT MBsof
aau 1 0 0 0 EX TC TP MF MT MB
aaw 1 0 0 0 EX TC TP MF MT MB
aax 2 0 0 0 EX TC TP MF MT MB
aay 1 0 0 0 EX TC TP MF MT MB
aba 1 0 0 0 EX TC TP MF MT MB
abb 1 0 0 0 EX TC TP MF MT MB
abc 1 0 0 0 EX TC TP MF MT MB
abe 1 0 7 146 EX TC TP MF MT MB
abi 61 3 2 38 EXsdofh TCsofh TPs MFsof MTsf MBsdofh
abk 1 0 2 38 EX TC TP MF MT MB
abl 1 0 0 0 EX TC TP MF MT MB
abm 1 0 0 0 EX TC TP MF MT MB
abw 17 1 2 38 EXsdofh TCsofh TPs MF MT MBsdofh
aby 6 0 1 38 EXsdofh TCsof TPs MF MT MBsdofh
acb 2 0 0 0 EX TC TP MF MT MB
acd 1 0 0 0 EX TC TP MF MT MB
ace 1 0 0 0 EX TC TP MF MT MB
acf 1 0 0 0 EX TC TP MF MT MB
acg 4 0 2 38 EX TC TP MF MT MB
ach 13 0 2 38 EXsdofh TCsdofh TPs MFsdofh MTsdf MBsdofh
aci 2 0 2 38 EX TC TP MF MT MB
acj 2 0 0 0 EX TC TP MF MT MB
# = actual count, the population of adult cells of this genotype
Mem = the percent of the soup occupied by adult cells of this genotype
Errs = the number of error flags committed
Move = the number of instructions moved to the daughter cell
Bits = Watch bits, defined in section: 7) SOUP_IN PARAMETERS
0 values usually represent cases of insufficient data.
If you press 'm' for misc, from the main TIERRA menu, you will get the
miscellaneous sub-menu, which looks like the following in Unix:
> MISC | H-Histo Logging I-Inject Gene M-Micro Toggle |->
and the following in DOS:
> MISC | H-Histo Logging S-DOS Shell I-Inject Gene M-Micro Toggle |->
When this sub-menu is selected, some information is displayed at the top
of the screen about what variables are #defined:
VER=4.00 INST=1 PLOIDY=1 ERROR
This tells us that this is Version 4.00 of the Tierra program, we are using
Instruction Set 1, that it is a haploid model (Ploidy = 1), and that error
checking code is included.
From this menu, you have the following options:
H-Histo Logging: if you press the H key, it will toggle the logging of any
histograms you create to the tierra.log file.
S-DOS shell: (DOS only) this causes you to exit to the DOS prompt, while the
simulator remains in the RAM in suspended animation. You can now do
anything that doesn't require a lot of memory. Be careful that if you
change directories, you come back to the original directory before you
exit from the shell. Also, be careful not to do anything that changes
the graphics mode.
I-Inject Gene: inject a genome of your choice from the genebank into the
soup of the running simulator.
M-Micro Toggle: turn on the virtual debugger. The debugger has three states:
delay, keypress, and off. In delay mode, the debugger will execute one
Tierran instruction per second. In the keypress mode, the debugger will
execute one Tierran instruction per keypress.
c-continue: exit from this sub-menu.
5.6.1.2.5) The Virtual Debugger
The debugger has three main states: delay, keypress, and off. In delay mode,
the debugger will execute one Tierran instruction per second. In the keypress
mode, the debugger will execute one Tierran instruction per keypress. Once
you have selected your mode, press c to start the simulator again. You can
use this tool to step through the genome of a creature, either to see what
it does, or to debug a creature that you are writing.
In keypress mode, there are two additional modes. Given that there is
usually a population of cells in the soup, the virtual debugger will swap
from cell to cell as each one gets its time slice. However, this can be
disconcerting if one is trying to study the behavior of a particular cell.
The debugger can be made to follow a single creature using the track option
as specified at the bottom of the screen in keypress mode:
MICRO | T-Track cell ESC-Main Menu n-Next Step
If you hit the T key, you will go into track mode, and the menu will change
to:
MICRO | t-Untrack cell ESC-Main Menu n-Next Step
Now hitting the t key will return to the mode that swaps between cells.
If you wish to start debugging from the very beginning of a run, you will
want to start the simulator fresh with the menu activated, so that you can
start the debugger before the creature has started to run. This is done by
giving two arguments when starting Tierra, one must be the name of the
soup_in file, the second is a dummy argument. So for example, you should type:
tierra soup_in1 junk
The debugger display in keypress mode looks like:
InstExec = 11,346072 Cells = 679 Genotypes = 274 Sizes = 26
VER=4.0 INST=1 PLOIDY=1 MICRO
MICRO STEP Mode = keypress
Cell 1: 55 0045aaa @ 12595 Slice= 19 Stack[ 12628]
IP [ 12572]( -23 ) = 0x01 nop1 [ 12639]
AX [ 30236] [ 12595]
BX [ 12632] [ 45] <
CX [ 7] [ 0]
DX [ 4] [ 0]
Flag = 0 [ 0]
Daughter @ 30199 + 45 [ 0]
[ 0]
[ 0]
MICRO | T-Track cell ESC-Main Menu n-Next Step
The various components of the display are documented in the following:
Cell 1: 55 0045aaa @ 12595 Slice= 19
Each creature has a cell structure associated with it. The cell
structures are organized in a two dimensional array. Cell 1: 55 tells us
that the structure for the currently active creature is at location 1,55 in
the cells array.
The currently active creature is 0045aaa whose cell starts at address
12595 in the soup. This creature currently has 19 CPU cycles left in its
time slice. Note that when the Slice counts down to zero, the slicer will
swap in the next creature in the slicer queue, and the debugger will display
the CPU of the next creature. Therefore when there is more than one
creature in the soup, it is hard to follow the activity of a single creature,
since the debugger will be swapped from creature to creature. This problem
can be eliminated by activating the track mode (see above).
Daughter @ 30199 + 45
The currently active creature has allocated a space of 45 instructions
at address 30199 in the soup, which it presumably will use to make a daughter.
IP [ 12572]( -23 ) = 0x01 nop1
The instruction pointer (IP) of the currently active creature is
presently located at the soup address 12572. Notice that this address is
located 23 instructions BEFORE the start of its own cell. The number in
parentheses (-23) is the offset of the IP from the start of the cell. As
long as the IP is inside the cell, this offset will be displayed without a
sign. If the IP is outside of the cell, the offset will be displayed with
a + or a - sign. The current offset of -23 means that the instruction pointer
of this creature is executing the code of some other creature. In fact
0045aaa is a parasite. The instruction currently being executed is
represented in the soup by the hex value 0x01. The assembler mnemonic of this
instruction is nop1, it is a no-operation.
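The sign convention for the IP offset can be sketched in C as follows. This
is only an illustration of the display rule described above; the function
name is hypothetical, and the sketch uses plain subtraction without
considering any wrap-around of soup addresses.

```c
#include <assert.h>
#include <stdio.h>

/* Format the offset of the instruction pointer from the start of the
 * cell. Inside the cell the offset is written without a sign; outside
 * the cell it is written with an explicit + or -.
 * Returns 0 on success, -1 if buf is too small. (Hypothetical helper.) */
int format_ip_offset(char *buf, size_t buflen,
                     long ip, long cell_start, long cell_size)
{
    long offset = ip - cell_start;
    int n;

    assert(buf != NULL && cell_size > 0);
    if (offset >= 0 && offset < cell_size)
        n = snprintf(buf, buflen, "%ld", offset);    /* inside: no sign */
    else
        n = snprintf(buf, buflen, "%+ld", offset);   /* outside: signed */
    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

For the display shown above, the IP is at 12572 and the cell starts at
12595, so the offset is 12572 - 12595 = -23, which is written with its sign
because it lies outside the 45-instruction cell.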
AX [ 30236]
BX [ 12632]
CX [ 7]
DX [ 4]
The four CPU registers of the currently active creature contain the
values indicated above. Probably, the AX value is the address where the
creature is writing in its daughter cell, the BX value is the address that
it is copying from in its own cell, the CX value is the number of
instructions of the genome remaining to be copied, and the DX value is the
size of the last template used.
Flag = 0
This shows the status of the flag register. This can be used to
recognize when an instruction has failed, generating an error condition.
Stack[ 12628]
[ 12639]
[ 12595]
[ 45] <
[ 0]
[ 0]
[ 0]
[ 0]
[ 0]
[ 0]
The stack of the currently active creature contains the values indicated
above. The stack pointer points at the location indicated by the < sign
(the top of the stack). The values on the stack probably include the
address of the instruction pointer (12628) which was pushed on the stack when
the parasite called the copy procedure of its host, the address of the end
of the creature (12639), the address of the beginning of the creature
(12595), and the size of the creature (45).
5.6.1.2.6) The Genome Injector
The menu system provides a mechanism for injecting genomes into a
running simulation. This tool allows a genome from the genebank to be
injected into the run at the user's command. The genome must be in a .gen
file in the path indicated by the GenebankPath soup_in variable.
However, there is a function Inject(), in the genebank.c module, which
takes a pointer to a genome as an argument. This function can be used to
inject genomes from any source. An interesting use of this function would be
to facilitate migration of genomes between simulations running on separate
machines, creating an archipelago (this is the basis of the CM5 port
currently underway).
5.6.2) The Standard Output & Interrupt Handler
When Tierra is compiled with: #define FRONTEND STDIO, while running it
produces output to the console that looks something like this:
Using instruction set (INST) = 1
TIERRA: LOG = on, Histogram Logging = off
sizeof(Instruction) = 1
sizeof(Cell) = 180
sizeof(MemFr) = 16
60000 bytes allocated for soup
16380 bytes allocated for cells
8000 bytes allocated for MemFr
tsetup: arrays allocated without error
beginning of GetNewSoup
seed = 1
init of soup complete
GetNewSoup: loading 0080aaa into cell 0,2
InstExeC = 0 Generations = 0 Wed Apr 29 17:00:06 1992
NumCells = 1 NumGenotypes = 1 NumSizes = 1
AvgSize = 80 NumGenDG = 1 NumGenRQ = 1
RateMut = 6382 RateMovMut = 1280 RateFlaw = 102400
tsetup: soup gotten
extract: 0080aaa @ 276 v
InstExeC = 1 Generations = 3 Wed Apr 29 17:01:04 1992
NumCells = 376 NumGenotypes = 131 NumSizes = 8
AvgSize = 79 NumGenDG = 1 NumGenRQ = 131
AvgPop = 292 Births = 995 Deaths = 620
MaxGenPop = 208 (0080aaa) MaxGenMem = 208 (0080aaa)
RateMut = 12515 RateMovMut = 1264 RateFlaw = 101120
extract: 0080abl @ 8
InstExeC = 2 Generations = 5 Wed Apr 29 17:02:05 1992
NumCells = 380 NumGenotypes = 159 NumSizes = 12
AvgSize = 79 NumGenDG = 2 NumGenRQ = 159
AvgPop = 378 Births = 847 Deaths = 843
MaxGenPop = 169 (0080aaa) MaxGenMem = 169 (0080aaa)
RateMut = 12648 RateMovMut = 1264 RateFlaw = 101120
extract: 0080aan @ 8
extract: 0045aab @ 8
InstExeC = 3 Generations = 8 Wed Apr 29 17:03:17 1992
NumCells = 373 NumGenotypes = 141 NumSizes = 15
AvgSize = 77 NumGenDG = 4 NumGenRQ = 141
AvgPop = 368 Births = 918 Deaths = 925
MaxGenPop = 144 (0080aaa) MaxGenMem = 144 (0080aaa)
RateMut = 11794 RateMovMut = 1232 RateFlaw = 98560
extract: 0080adj @ 7
extract: 0080acu @ 7
extract: 0080aek @ 8
extract: 0045aad @ 8
InstExeC = 4 Generations = 10 Wed Apr 29 17:05:11 1992
NumCells = 383 NumGenotypes = 144 NumSizes = 19
AvgSize = 74 NumGenDG = 8 NumGenRQ = 144
AvgPop = 376 Births = 1071 Deaths = 1061
MaxGenPop = 118 (0080aaa) MaxGenMem = 118 (0080aaa)
RateMut = 11185 RateMovMut = 1184 RateFlaw = 94720
extract: 0080aad @ 8
The meaning of each different kind of information is described below:
> Using instruction set (INST) = 1
Because we are likely to proliferate instruction sets in the near
future, the system lets us know which one it is using.
> TIERRA: LOG = on, Histogram Logging = off
If the soup_in variable ``Log'' is non-zero, most of the information
shown in the standard output listing above will be written to the file
``tierra.log'' on disk. Histogram Logging = off indicates that histograms
viewed through the menu system will not be saved to the log. This option
can be toggled so that static histograms are saved to the log.
> sizeof(Instruction) = 1
> sizeof(Cell) = 180
> sizeof(MemFr) = 16
The size in bytes of each of the main structures, of which the system
will allocate large arrays at startup.
> 60000 bytes allocated for soup
> 16380 bytes allocated for cells
> 8000 bytes allocated for MemFr
The total number of bytes used for each of the three main arrays of
structures.
> tsetup: arrays allocated without error
Statement indicating that the arrays were allocated without error.
> beginning of GetNewSoup
Statement indicating that the program is entering the GetNewSoup()
function.
> seed = 690349257
A record of the seed of the random number generator used in this run.
This can be used to repeat the run if desired.
> init of soup complete
A statement indicating that the soup has been initialized.
> GetNewSoup: loading 0080aaa into cell 0,2
A statement indicating that the system is inoculating the soup with
a creature of size 80. There will be a comparable line for every creature
used in inoculating the soup at startup. The first creature goes into cell
2 of array 0, because cells 0 and 1 are used for other purposes.
> InstExeC = 0 Generations = 0 Wed Apr 29 17:00:06 1992
> NumCells = 1 NumGenotypes = 1 NumSizes = 1
> AvgSize = 80 NumGenDG = 1 NumGenRQ = 1
> RateMut = 6382 RateMovMut = 1280 RateFlaw = 102400
> tsetup: soup gotten
These lines indicate the starting conditions of several variables which
will be explained below.
> extract: 0080aaa @ 276 v
This line indicates that the genotype 0080aaa crossed one of the
frequency thresholds set in the soup_in file, SavThrMem or SavThrPop,
and that there were 276 adult creatures of this genotype in the soup
when this was noted. However, no creatures are extracted until the
reaper is activated when the soup becomes full. This means that 0080aaa
was not actually extracted at the time that it crossed a threshold, but
actually much later, when it had a relatively large population. The v
after 276 indicates that this was a ``virtual extraction'', which means
that the genome was not actually saved to disk, since it already has been
saved to disk. Anytime a permanent genotype goes extinct, then reappears
and crosses a threshold, it will experience a virtual extraction, which
just means that the crossing of the threshold will be reported as an extract
in standard out and in the tierra.log file (this information can be put
to good use by the tieout tool: tieout tierra.log ie ex)
> InstExeC = 1 Generations = 3 Wed Apr 29 17:01:04 1992
> NumCells = 376 NumGenotypes = 131 NumSizes = 8
> AvgSize = 79 NumGenDG = 1 NumGenRQ = 131
> AvgPop = 292 Births = 995 Deaths = 620
> MaxGenPop = 208 (0080aaa) MaxGenMem = 208 (0080aaa)
> RateMut = 12515 RateMovMut = 1264 RateFlaw = 101120
A statement of this form is printed after every million instructions
executed by the system. See the plan() function in the bookeep.c module
for more details on this.
> InstExeC = 1 Generations = 3 Wed Apr 29 17:01:04 1992
InstExeC = 1 tells us that one million instructions have been executed
in this run. Generations = 3 tells us that roughly three generations of
creatures have passed so far during this run. Wed Apr 29 17:01:04 1992
tells us the time and date of this record.
> NumCells = 376 NumGenotypes = 131 NumSizes = 8
NumCells = 376 tells us that there were 376 adult cells (and a roughly
equal number of daughter cells) at this point in the run. NumGenotypes = 131
tells us that there were 131 distinct adult genotypes (code sequences) living
in the soup at the time of this record. NumSizes = 8 tells us that there
were eight distinct adult genome sizes (creature code lengths) living in the
soup at this time.
> AvgSize = 79 NumGenDG = 1 NumGenRQ = 131
AvgSize = 79 tells us that the average size of all the adult creatures
living in the soup at this time was 79 instructions. NumGenDG = 1 tells us
that there is one genotype that has received a permanent name and been saved
to disk in a .gen file in the genebank directory gb. NumGenRQ = 131 tells us
that at this time there were 131 distinct genotypes being held in the genebank
in RAM (the ramqueue).
> AvgPop = 292 Births = 995 Deaths = 620
AvgPop = 292 tells us that during the last million instructions the average
population was 292 adult cells. Births = 995 tells us that during the last
million instructions, there were 995 births. Deaths = 620 tells us that
during the last million instructions, there were 620 deaths.
> MaxGenPop = 208 (0080aaa) MaxGenMem = 208 (0080aaa)
MaxGenPop = 208 (0080aaa) tells us that at this time, the genotype
with the largest population is 0080aaa, and that it has a population of 208
adult cells. MaxGenMem = 208 (0080aaa) tells us that the genotype whose
adult cells occupy the largest share of space in the soup is 0080aaa, and that
it has a population of 208 adult cells.
> RateMut = 12515 RateMovMut = 1264 RateFlaw = 101120
RateMut = 12515 tells us that the actual average background (cosmic ray)
mutation rate for the upcoming million instructions will be one mutation per
12515/2 instructions executed. RateMovMut = 1264 tells us that the actual
average move mutation rate (copy error) for the upcoming million instructions
will be one mutation for every 1264/2 instructions copied. RateFlaw = 101120
tells us that the actual average flaw rate for the upcoming million
instructions will be one flaw for every 101120/2 instructions executed. The
reason that these numbers represent twice the average mutation rates is that
they are used to set the range of a uniform random variate determining the
interval between mutations.
> extract: 0080abl @ 8
This is a real extraction, as indicated by the absence of a v after
the 8. This genotype, 0080abl, crossed the threshold frequency with a
population of 8 adult creatures, its name was made permanent, and its genome
was saved to disk.
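If you want to process extract lines from the standard output yourself, the
following C sketch shows one way to pull them apart. The struct and function
names are hypothetical, and the layout (a four-digit size, a three-letter
label, the population, and an optional trailing v) is inferred from the
sample lines in this section rather than from the source code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One parsed "extract:" line from the standard output. */
struct extract_record {
    int  size;         /* creature size, e.g. 80 */
    char label[4];     /* three-letter genotype label, e.g. "abl" */
    int  population;   /* adult cells when the threshold was crossed */
    int  is_virtual;   /* 1 if the line ends in "v" (virtual extraction) */
};

/* Parse a line such as "extract: 0080aaa @ 276 v".
 * Returns 0 on success, -1 if the line is not an extract line. */
int parse_extract_line(const char *line, struct extract_record *rec)
{
    char flag[2] = "";
    int fields;

    assert(line != NULL && rec != NULL);
    fields = sscanf(line, " extract: %4d%3[a-z] @ %d %1[v]",
                    &rec->size, rec->label, &rec->population, flag);
    if (fields < 3)
        return -1;
    rec->is_virtual = (fields == 4 && strcmp(flag, "v") == 0);
    return 0;
}
```

Applied to the two lines discussed above, this sketch reports 0080aaa with a
population of 276 as a virtual extraction, and 0080abl with a population of
8 as a real one.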
When Tierra is running in the foreground it is possible to interrupt
it on either DOS or UNIX, usually by typing Ctrl C (^C). When you do this
you will get a brief message listing your options, which looks something
like this:
==========================================================
TIERRA: SIGINT Handler
---------------------------------------------------------
InstExe.m = 6 InstExec.i = 478414 NumCells = 362
NumGenotypes = 145 NumSizes = 20
---------------------------------------------------------
Key Function
i Information on simulation
v Change a soup_in variable
S Execute a system Shell
s Save the soup
q Save the soup & quit
Q Quit/Abort simulation
c To Continue simulation
---------------------------------------------------------
TIERRA | i-info v-var s-save S-shell q-save&quit Q-quit c-continue |->
DOS only: Coreleft is the amount of free memory above the highest block of
allocated memory, within the first 640K of system memory. This is a clue to
how much memory remains for use by the simulator.
Unix only: If ALCOMM has been defined at compile time, VPORT is the value of
the socket for ALmond data communication.
You must now choose one of the options by typing one of the corresponding
letters: ivSsqQc. When you type the letter, the simulator will either prompt
you for more input or do the requested operation.
The options are:
i-info v-var s-save S-shell q-save&quit Q-quit c-continue |->
i-info: will display some information about creatures stored in the Genebank.
See below for details.
v-var: allows you to alter the value of any of the variables in the
soup_in file at any point during a run.
s-save soup: will cause the state of the system to be saved at this point, and
then continue the run.
S-Shell: (DOS only) this causes you to exit to the DOS prompt, while the
simulator remains in the RAM in suspended animation. You can now do
anything that doesn't require a lot of memory. Be careful that if you
change directories, you come back to the original directory before you
exit from the shell. Also, be careful not to do anything that changes
the graphics mode.
q-save&quit: will cause the state of the system to be saved at this point,
and then exit from the run.
Q-quit: will exit immediately without saving the state of the system.
c-continue: continue the run.
i-info: Choosing this option produces a sub-menu:
---------------------------------------------------------
s Spectrum of all Size Classes
m Spectrum of Size Classes, by memory use
g Spectrum of Size Classes, by geneotype
z Break down of a specific Size Class
Any other key to main menu ...
---------------------------------------------------------
These options are documented under the BASIC frontend above.
5.6.3) The tierra.log file
If the soup_in variable:
Log = 1 0 = no log file, 1 = write log file
is set to a non-zero value, a file named tierra.log will be written to
the current directory. This file contains, in abbreviated form, much the
same information that is contained in the Standard Output frontend. An
example of the output to this file follows:
TIERRA: LOG = on, Histogram Logging = off
sizeof(Instruction) = 1
sizeof(Cell) = 180
sizeof(MemFr) = 16
60000 bytes allocated for soup
16380 bytes allocated for cells
8000 bytes allocated for MemFr
tsetup: arrays allocated without error
beginning of GetNewSoup
seed = 1
init of soup complete
GetNewSoup: loading 0080aaa into cell 0,2
ie0 gn0 Wed Apr 29 16:12:08 1992
nc1 ng1 ns1
as80 rq1 dg1
rm6382 mm1280 rf102400
tsetup: soup gotten
ex = 0080aaa @ 276 v
ie1 gn3 Wed Apr 29 16:13:14 1992
nc376 ng131 ns8
as79 rq131 dg1
bi995 de620 ap292
mp208 @ 0080aaa mg208 @ 0080aaa
rm12515 mm1264 rf101120
ex = 0080abl @ 8
ie2 gn5 Wed Apr 29 16:14:17 1992
nc380 ng159 ns12
as79 rq159 dg2
bi847 de843 ap378
mp169 @ 0080aaa mg169 @ 0080aaa
rm12648 mm1264 rf101120
ex = 0080aan @ 8
ex = 0045aab @ 8
ie3 gn8 Wed Apr 29 16:15:32 1992
nc373 ng141 ns15
as77 rq141 dg4
bi918 de925 ap368
mp144 @ 0080aaa mg144 @ 0080aaa
rm11794 mm1232 rf98560
ex = 0080adj @ 7
ex = 0080acu @ 7
ex = 0080aek @ 8
ex = 0045aad @ 8
ie4 gn10 Wed Apr 29 16:17:27 1992
nc383 ng144 ns19
as74 rq144 dg8
bi1071 de1061 ap376
mp118 @ 0080aaa mg118 @ 0080aaa
rm11185 mm1184 rf94720
Because this file is of essentially the same form as the standard
output, only the abbreviations will be documented here. Refer to the
documentation of the Standard Output to interpret the meaning of this file.
ie = InstExeC; gn = Generations; nc = NumCells; ng = NumGenotypes;
ns = NumSizes; as = AvgSize; rq = NumGenRQ; dg = NumGenDG;
bi = Births; de = Deaths; ap = AvgPop; mp = MaxGenPop; mg = MaxGenMem;
rm = RateMut; mm = RateMovMut; rf = RateFlaw.
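The abbreviations above can be expanded mechanically. The following is a
minimal sketch in Python, which is not part of the distribution (see the
tieout tool for the supported way to examine this file). The function name
expand_log_line is hypothetical, and the sketch assumes that every
abbreviated field is a two-letter code followed directly by an integer, as
in the example output above. Everything else on a line (dates, genotype
names, "ex = ..." lines) is ignored.

```python
import re

# Abbreviations exactly as documented for tierra.log.
ABBREVIATIONS = {
    "ie": "InstExeC", "gn": "Generations", "nc": "NumCells",
    "ng": "NumGenotypes", "ns": "NumSizes", "as": "AvgSize",
    "rq": "NumGenRQ", "dg": "NumGenDG", "bi": "Births",
    "de": "Deaths", "ap": "AvgPop", "mp": "MaxGenPop",
    "mg": "MaxGenMem", "rm": "RateMut", "mm": "RateMovMut",
    "rf": "RateFlaw",
}

# Assumption: a field is two lowercase letters followed by digits only.
TOKEN = re.compile(r"^([a-z]{2})(\d+)$")


def expand_log_line(line):
    """Return {full_name: int} for each abbreviated field on one log line."""
    fields = {}
    for word in line.split():
        match = TOKEN.match(word)
        if match and match.group(1) in ABBREVIATIONS:
            fields[ABBREVIATIONS[match.group(1)]] = int(match.group(2))
    return fields
```

Applied to each line of the file in turn, this gives one small dictionary
per line; lines with no abbreviated fields give an empty dictionary.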
6) LISTING OF DISTRIBUTION FILES
The distribution includes the following files:
arg.c - the main module for the assembler/disassembler, originally written by
Tom Uffner. This program converts ascii assembler files into binary
files which can be executed by the Tierran virtual computer.
arg.prj - the Turbo C V 2.0 project file for compiling the
assembler/disassembler. If you are using a more recent Borland compiler,
you must use the Borland tool to build the binary project file. Include
the modules listed in this ascii version of the file.
arg_inc.c - a file used to include some of the files used in compiling the
assembler (arg). They are included in this way because arg_inc.c
defines ARG, which alters the included modules to be compatible with
arg rather than tierra.
arginst.h - a file containing some variables used by the assembler.
bookeep.c - source code for bookkeeping routines, which keep track of how many
of what kind of creatures are in the soup, and other stuff like that.
cclr1, cclr2, cclr3, cclr4, cclr1.bat, cclr2.bat, cclr3.bat, cclr4.bat -
Unix shell scripts, and DOS batch files for cleaning up an old run in
preparation for a new run with a cumulative genebank (or you can just
use it to remove files generated by a run, to get your disk space back).
You must change the protection on the unix files to make them executable:
chmod +x cclr1
There is a different file for each instruction set.
clr1, clr2, clr3, clr4, clr1.bat, clr2.bat, clr3.bat, clr4.bat -
Unix shell scripts and DOS batch files for cleaning up an old run in
preparation for a new run (or you can just use it to remove files
generated by a run, to get your disk space back).
You must change the protection on the unix files to make them executable:
chmod +x clr1
There is a different file for each instruction set.
configur.h - a file for configuring Tierra. You probably won't need to
touch this unless you get into advanced stuff.
declare.h - all global variables are declared in this file, except those
whose values are set by soup_in. Those globals are declared in
soup_in.h. declare.h is included by tierra.c which contains the main
function.
diskbank.c - some functions that deal with writing to disk.
extern.h - all global variables are declared as extern in this file, and this
file is included by all *.c files except tierra.c which includes
declare.h instead.
frontend.c - functions for handling input/output for Tierra. This stuff was
written by Dan Pirone. This module contains basic data reporting
functions, and includes one of the following:
tstdio.c - low level IO for standard IO, simple, and good for debugging.
tcurses.c - advanced IO for unix, based on "standard" curses calls,
Please see the manual pages on curses for the nitty gritty details.
tturbo.c - advanced IO for the Turbo C / DOS environment.
On machines with EGA or VGA hardware, 43/50 line modes are used.
genio.c - functions for input/output of creatures. This stuff is also used
by arg.c, the assembler/disassembler. This module has benefited from
a lot of work by Tom Uffner.
instruct.c - this module contains generalized executable functions. These
generalized functions are mapped to specific functions by the parsing
functions in the parse.c module.
license.h - a file stating the terms of the license agreement, which is
included in all the source code modules.
Makefile - the make file, for compiling tierra and arg under Unix.
memalloc.c - functions for handling memory allocation in the soup, the stuff
that ``cell membranes'' are made of.
memtree.c - additional memory allocation routines, written by
Chris Stephenson of the IBM T. J. Watson Research Center.
parse.c - the parsing functions interpret the executable code of the
creatures, and map it onto the executable functions contained in the
instruct.c module. This file includes one of the following four files,
depending on which instruction set is being compiled:
parse1.c
parse2.c
parse3.c
parse4.c
portable.c - functions for portability between operating systems.
portable.h - definitions for portability between operating systems and
architectures.
probe.c - a separate tool for observing the genebank, documented in probe.doc.
prototyp.h - all functions in Tierra are prototyped here.
queues.c - queue management functions for the slicer and reaper queues.
rambank.c - functions for managing the genebank. This module has benefited
from a lot of work by Tom Uffner.
rnd_inst.c - a separate tool for randomizing the mapping of opcodes to
instructions, documented in rnd_inst.doc
slicers.c - interchangeable slicer functions. This file contains some
experiments in the allocation of cpu time to creatures. This is
an interesting thing to play with.
soup_in1, soup_in2, soup_in3, soup_in4 - the ascii files read by Tierra on
startup, which contain all the global parameters that determine the
environment. There is a different file for each instruction set.
soup_in.h - this file defines the default values of all the soup_in variables,
and defines the instruction set by mapping the assembler mnemonics to the
opcodes, parser functions, and executables.
tcurses.c - see under frontend.c above.
tcolors.cfg - an ascii file that maps the colors of the Tierra display.
This makes it easy to remap the colors. This is useful for odd displays
such as LCD where the colors do not map nicely.
tieout.c - a separate tool for observing the contents of the tierra.log file
documented in tieout.doc.
tieout.doc - documentation of the tieout tool.
tieout.exe - the DOS distribution also includes the executable version of
the tieout tool, so that it does not have to be compiled.
See tieout.doc.
tierra.c - this file contains the main function, and the central code
driving the virtual computer.
tierra.doc - this file, on line documentation.
tierra.exe - the DOS distribution includes the executable version of the
Tierra program, so that it does not have to be compiled.
tierra.h - this file contains all the structure definitions. It is a good
source of documentation for anyone trying to understand the code.
tierra.prj - the Turbo C V 2.0 project file for compiling Tierra.
If you are using a more recent Borland compiler, you must use the
Borland tool to build the binary project file. Include
the modules listed in this ascii version of the file.
tmonitor.c - this file contains support calls for the ALmond tool.
trand.c - random number generation routines from Numerical Recipes in C.
tsetup.c - routines called when Tierra starts up and comes down. Tom Uffner
has put some work into this module as well.
ttools.c - routines used in generating the histograms displayed by the
frontend.
gb1: - a subdirectory containing the genomes of the creatures saved
during a run of instruction set 1.
gb1/0080aaa.tie - the ancestor, written by a human, mother of all other
creatures.
gb1/0073aaa.tie - a new ancestor, written by a human, like 0080aaa, but
with some junk code removed.
gb1/0022aaa.tie - the smallest non-parasitic self-replicating creature to
evolve.
gb1/0045aaa.tie - the archetypal parasite.
gb1/0046aaa.tie - a symbiont with 0064aaa. These two were created by hand,
by splitting 0080aaa into two parts.
gb1/0064aaa.tie - a symbiont with 0046aaa. These two were created by hand,
by splitting 0080aaa into two parts.
gb1/0072aaa.tie - a phenomenal example of optimization through evolution,
involving the unrolling of the copy loop.
At the time of this writing, only a single run, of a billion
instructions, has been analyzed at the level of detail to reveal the specifics
of the ecological arms race that took place. That information is presented
in the manuscripts cited above. This run can be seen on the video, available
from Media Magic, P.O. Box 507, Nicasio, CA 94946. Some of the major players
in that run are listed below, and included in the software distribution,
in the chronological order of their appearance:
gb1/0079aab.tie - this creature dominated the soup at the time that a major
optimization took place, resulting in 0069aaa and its relatives.
gb1/0069aaa.tie - this fully self-replicating (non-parasitic) creature
appeared at a time that the dominant self-replicating size class was 79.
Thus this creature was a hopeful monster. It managed to shave 10
instructions off of the genome in a single genetic change. Actually this
creature claims that its immediate ancestor (parental genotype) was
0085aal. If this is true, then it actually shaved 16 instructions off
its size in a single genetic event. Unfortunately, 0085aal was not
preserved in the fossil record.
gb1/0069aab.tie - this host creature co-dominates the soup for a time, with
the parasite 0031aaa.
gb1/0031aaa.tie - this parasite co-dominates the soup for a time, with the
host 0069aab.
gb1/0070aaw.tie - while 0069aab and 0031aaa co-dominate the soup, this
creature, which is the first hyper-parasite, surges up. It and its
relatives drive the parasites to extinction.
gb1/0061aag.tie - this hyper-parasite dominates at the time that the first
social creatures appear.
gb1/0061aai.tie - this is the first social hyper-parasite to surge up. It is
social by virtue of using a template in its tail to jump back to its
head. This only works when it occurs in close aggregations of creatures
of the same kind.
gb1/0061aab.tie - this is the second social hyper-parasite to surge up. It
operates on the basis of a completely different social mechanism from
0061aai. In this creature the basis of cooperation is that the
template used to search for its tail, does not match to the tail
template. However, if two of these creatures abut in memory, the
union of the tail template of one with the head template of the next,
forms the template that is used to identify the tail. This algorithm
also only works when the creatures occur in aggregations. However, this
mechanism is more fragile than that used by 0061aai. For 0061aab, the
two cooperating creatures must exactly abut. For 0061aai, the two
cooperating creatures may have some space between them. Therefore
it is not surprising that the social mechanism of 0061aai is the one
that prevailed.
gb1/0061aaa.tie - this is the social hyper-parasite that dominated at the time
that cheaters invaded.
gb1/0027aab.tie - this is the dominant cheater that invaded against the social
hyper-parasites.
gb1/0080.gen - the DOS distribution also includes the assembled binary version
of the 0080aaa.tie file, so that Tierra can be run without having to
assemble the genome file first.
gb1/arg.exe - the DOS distribution includes the executable version of the
assembler/disassembler.
gb1/opcode.map - a file read by Tierra, Beagle, and arg upon startup. This
file determines the correspondence between opcodes and the associated
machine instructions. This is a simple way to play with the definition
of the physics and chemistry of the simulator.
gb1/opcode.vir - a virginal copy of opcode.map. You may use the rnd_inst tool
to create new opcode maps, or you may create them by hand. You will
have to name the new versions opcode.map in order to assemble the
genomes based on the new mapping. You can keep opcode.vir in reserve.
gb1/opcode.scr - an opcode map where all pairs of most similar instructions
are mapped with a Hamming distance of five, including the two nops.
gb1/probe.doc - documentation of the probe tool.
gb1/probe.exe - the DOS distribution includes the executable version of the
probe program, so that it does not have to be compiled.
gb1/rnd_inst.doc - documentation of the rnd_inst tool.
gb1/rnd_inst.exe - the DOS distribution includes the executable version of the
rnd_inst program, so that it does not have to be compiled.
gb2: - a subdirectory containing the genomes of the creatures written for
use with instruction set 2.
gb2/0005aaa.tie - non-replicating program written to test get().
gb2/0010aaa.tie - non-replicating program written to test put.
gb2/0011aaa.tie - non-replicating program written to test get with template.
gb2/0013aaa.tie - non-replicating program to test put with template.
gb2/0016aaa.tie - non-replicating program to test put.
gb2/0060aaa.tie - a parasite hand-made from the ancestor.
gb2/0095aaa.tie - the ancestor for use in instruction set 2.
gb2/opcode.map - the opcode map for use in instruction set 2.
gb2/opcode.vir - a virginal copy of the opcode map for use in
instruction set 2.
gb3: - a subdirectory containing the genomes of the creatures written for
use with instruction set 3.
gb3/0050aaa.tie - a parasite hand-made from the ancestor.
gb3/0093aaa.tie - the ancestor for use in instruction set 3.
gb3/opcode.map - the opcode map for use in instruction set 3.
gb3/opcode.vir - a virginal copy of the opcode map for use in
instruction set 3.
gb4: - a subdirectory containing the genomes of the creatures written for
use with instruction set 4.
gb4/0082aaa.tie - the ancestor for use in instruction set 4.
gb4/opcode.map - the opcode map for use in instruction set 4.
gb4/opcode.vir - a virginal copy of the opcode map for use in
instruction set 4.
td: - a subdirectory where a complete record of births and deaths will
be written.
td/break.1 - a file containing a record of births and deaths. A dummy file is
provided to hold the space.
td/diverse.c - a separate tool for observing the diversity of genotypes and
size classes in a run, documented in diverse.doc.
td/diverse.doc - documentation of the diversity tool.
td/diverse.exe - the DOS distribution also includes the executable version of
the diversity index tool, so that it does not have to be compiled.
See diverse.doc.
td/fragment.c - a separate tool for preparing fragments of a run for
observation by the beagle tool, documented in beagle.doc.
td/fragment.exe - the DOS distribution also includes the executable version of
the fragment tool. See beagle.doc.
td/run_info.c - a separate tool for preparing the output of a run for
observation by the beagle tool, documented in beagle.doc.
td/run_info.exe - the DOS distribution also includes the executable version of
the run_info tool. See beagle.doc.
7) SOUP_IN PARAMETERS
A typical soup_in file looks like the following:
/* begin soup_in file */
# tierra core: 9-9-92
# observational parameters:
BrkupSiz = 1024 size of output file in K, named break.1, break.2 ...
CumGeneBnk = 1 Use cumulative gene files, or overwrite
debug = 0 0 = off, 1 = on, printf statements for debugging
DiskOut = 1 output data to disk (1 = on, 0 = off)
GeneBnker = 1 turn genebanker on and off
GenebankPath = gb1/ path for genebanker output
hangup = 0 0 = exit on error, 1 = hangup on error for debugging
Log = 1 0 = no log file, 1 = write log file
MaxFreeBlocks = 700 initial number of structures for memory allocation
OutPath = td/ path for data output
RamBankSiz = 1000 array size for genotypes in ram, use with genebanker
SaveFreq = 50 frequency of saving core_out, soup_out and list
SavMinNum = 2 minimum number of individuals to save genotype
SavThrMem = .04 threshold memory occupancy to save genotype
SavThrPop = .04 threshold population proportion to save genotype
WatchExe = 0 mark executed instructions in genome in genebank
WatchMov = 0 set mov bits in genome in genebank
WatchTem = 0 set template bits in genome in genebank
# environmental variables:
alive = 500 how many generations will we run
DistFreq = -.3 frequency of disturbance, factor of recovery time
DistProp = .2 proportion of population affected by disturbance
DivSameGen = 0 cells must produce offspring of same genotype, to stop evolution
DivSameSiz = 0 cells must produce offspring of same size, to stop size change
DropDead = 5 stop system if no reproduction in the last x million instructions
GenPerBkgMut = 16 mutation rate control by generations ("cosmic ray")
GenPerFlaw = 64 flaw control by generations
GenPerMovMut = 8 mutation rate control by generations (copy mutation)
IMapFile = opcode.map map of opcodes to instructions, file in GenebankPath
MalMode = 1 0 = first fit, 1 = better fit, 2 = random preference,
# 3 = near mother's address, 4 = near dx address, 5 = near top of stack address
MalReapTol = 1 0 = reap by queue, 1 = reap oldest creature within MalTol
MalTol = 20 multiple of avgsize to search for free block
MateProb = 0.2 probability of mating at each mal
MateSearchL = 5 multiple of avgsize to search 0 = no limit
MateSizeEp = 2 size epsilon for potential mate
MateXoverProp = 1.0 proportion of gene to select for crossover point
MaxMalMult = 3 multiple of cell size allowed for mal()
MemModeFree = 0 read, write, execute protection for free memory
MemModeProt = 2 rwx protect mem: 1 bit = execute, 2 bit = write, 4 bit = read
MinCellSize = 8 minimum size for cells
MinTemplSize = 1 minimum size for templates
MovPropThrDiv = .7 minimum proportion of daughter cell filled by mov
new_soup = 1 1 = this is a new soup, 0 = restarting an old run
NumCells = 6 number of creatures and gaps used to inoculate new soup
PhotonPow = 1.5 power for photon match slice size
PhotonWidth = 8 amount by which photons slide to find best fit
PhotonWord = chlorophill word used to define photon
ReapRndProp = .3 top prop of reaper queue to reap from
SearchLimit = 5
seed = 0 seed for random number generator, 0 uses time to set seed
SizDepSlice = 0 set slice size by size of creature
SlicePow = 1 set power for slice size, use when SizDepSlice = 1
SliceSize = 25 slice size when SizDepSlice = 0
SliceStyle = 2 choose style of determining slice size
SlicFixFrac = 0 fixed fraction of slice size
SlicRanFrac = 2 random fraction of slice size
SoupSize = 50000 size of soup in instructions
space 10000
0080aaa
space 10000
0045aaa
space 10000
0080aaa
/* end soup_in file */
The ordering of parameters is not important, except that the creatures
used to inoculate the soup must be listed at the end of the file, with no
blank lines between the creatures (if there is more than one). Parameters
that are not listed in the soup_in or soup_out files default to the values
set in the soup_in.h file. The parameter names must begin at the first
character of the line (no spaces or other characters may appear at the
beginning of the line before the parameter name).
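The layout rules above can be illustrated with a short Python sketch that
splits the text of a soup_in file into its parameters and its inoculation
list. This is not the reader used by Tierra itself. The function name
parse_soup_in is hypothetical, and the sketch assumes that every parameter
line has the form "Name = value", with spaces around the equals sign and
the value being the first word after it, as in the typical file shown
above. It skips blank lines and lines beginning with the # character, and
it does not check that parameter names start in the first column.

```python
def parse_soup_in(text):
    """Split soup_in text into (parameters, inoculum).

    parameters: dict mapping each parameter name to its value, as a string.
    inoculum:   list of the remaining lines, e.g. "space 10000", "0080aaa".
    """
    params = {}
    inoculum = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # blank lines and comment lines are skipped
        words = line.split()
        if len(words) >= 3 and words[1] == "=":
            # Assumption: the value is the single word after "=";
            # the rest of the line is the descriptive comment.
            params[words[0]] = words[2]
        else:
            inoculum.append(line.strip())
    return params, inoculum
```

Values are left as strings because the file mixes integers, fractions
such as .04, and path names such as gb1/.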
The meaning of each of these parameters is explained below:
# tierra core: 9-9-92
# observational parameters:
These lines are comments. The code that reads the soup_in and soup_out
files skips all blank lines, and all lines beginning with the # character.
The input parameters have been divided into two groups, the ``observational
parameters'' and the ``environmental variables''. The observational
parameters are listed first, and they do not affect the course of events
in the run. They only affect what kind of information we extract, and how
and where it is stored.
BrkupSiz = 1024 size of output file in K, named break.1, break.2 ...
If this value is set to zero (0) the record of births and deaths will
be written to a single file named tierra.run. However, if BrkupSiz has a
non-zero value, birth and death records will be written to a series of files
with the names break.1, break.2, etc. Each of these files will have the
size specified, in K (1024 bytes). The value 1024 indicates that the
break files will each be one megabyte in size. The output file(s) will
be in the path specified by OutPath (see below). See also DiskOut.
Please note that these files can get quite large.
CumGeneBnk = 0 0 = start new genebank, 1 = use cumulative genebank
This parameter gives us the option of starting a fresh genebank, or
of working with a cumulative genebank. If you are planning to compare the
results of a series of runs, you should consider using a cumulative
genebank. With a cumulative genebank, if the identical genotypes appear in
successive runs, they will have identical names.
debug = 0 0 = off, 1 = on, printf statements for debugging
This is used during code development, to turn on and off print statements
for debugging purposes.
DiskOut = 1 output data to disk (1 = on, 0 = off)
If this parameter is set to zero (0), no birth and death records will
be saved. Any other value will cause birth and death records to be saved
to a file whose name is discussed under BrkupSiz above, in the path discussed
under OutPath below.
GeneBnker = 1 turn genebanker on and off
This parameter turns the genebanker on and off. The value zero turns
the genebanker off, any other value turns it on. With the genebanker off,
the record of births and deaths will contain the sizes of the creatures,
but not their genotypes. Also no genomes will be saved in the genebank.
When the genebanker is turned on, the record of births and deaths will
contain a three letter unique name for each genotype, as well as the size
of the creatures. Also, any genome whose frequency exceeds the thresholds
SavThrMem and SavThrPop (see below) will be saved to the genebank, in
the path indicated by GenebankPath (see below).
GenebankPath = gb1/ path for genebanker output
This is a string variable which describes the path to the genebank
where the genomes will be saved. The path name should be terminated by
a forward slash.
hangup = 0 0 = exit on error, 1 = hangup on error for debugging
If an error occurs which is serious enough to bring down the system,
having hangup set to 1 will prevent the program from exiting. In this case,
the program will hang in a simple loop so that it remains active for
debugging purposes. Set this parameter only if you are running the
simulator under a debugger.
Log = 1 0 = no log file, 1 = write log file
If Log is non zero, a disk file called tierra.log will be written
mirroring all major simulation states. If disk space is a problem,
please set Log = 0. This can be changed at any point in the run.
MaxFreeBlocks = 700 initial number of structures for memory allocation
There is an array of structures used for the virtual memory allocator.
This parameter sets the initial size of the allocated array, at startup.
OutPath = td/ path for data output
The record of births and deaths will be written to files in a directory
specified by OutPath. See BrkupSiz above for a discussion of the name of
the file(s) containing the birth and death records.
RamBankSiz = 1000 array size for genotypes in ram, use with genebanker
The parameter RamBankSiz places a limit on how many genotypes will be
stored in RAM. When the number of genotypes in the genebank exceeds the
RamBankSiz, if there are genotypes with permanent names in the rambank that
are extinct in the soup, they will be swapped out to disk as .gen files.
However, if the number of living genotypes in the soup is greater than
RamBankSiz, then the value of RamBankSiz is essentially ignored (actually it
is treated as if the value were equal to the number of living genotypes).
Living genotypes are never swapped out to disk, only extinct ones. Under DOS,
the parameter RamBankSiz is of little importance, because it is raised if
there are more living genotypes, and if the simulator uses all 640K of RAM,
RamBankSiz is lowered so that genotypes will be swapped out to avoid running
out of memory. Under DOS, with a SoupSize of 60,000, the rambank can hold
somewhere between 600 and 800 genotypes. If there are more than that, the
system will thrash terribly, swapping genomes in and out from disk to RAM
and back. Under Unix, this parameter does determine how many genotypes
will be held in the rambank, as long as there are not more than RamBankSiz
living genotypes.
SaveFreq = 50 frequency of saving core_out, soup_out and list
Every SaveFreq million instructions, the complete state of the
virtual machine is saved. This is a useful feature for long runs, so that
the system can be restarted if it is interrupted for some reason.
SavMinNum = 2 minimum number of individuals to save genotype
A genotype will not be given a permanent name and saved to disk in
a .gen file unless it has a population of SavMinNum or more adult individuals.
SavThrMem = .04 threshold memory occupancy to save genotype
SavThrPop = .04 threshold population proportion to save genotype
These two variables affect the rate at which genomes are assigned
permanent names and saved to disk. In a soup of 60,000 instructions
(SoupSize = 60000), thresholds of .05 will save about one genome per two
million instructions, thresholds of .04 will save about one genome per
million instructions, thresholds of .03 will save about 1.2 genomes per
million instructions, thresholds of .02 will save about three genomes per
million instructions. On DOS systems, care should be taken to avoid
saving too many genomes as they will clog the memory and bring the system
down (See KNOWN BUGS below).
SavThrMem = .04 threshold memory occupancy to save genotype
If a particular genotype fills SavThrMem of the total space available
in the soup, it will be assigned a permanent unique name, and saved to disk.
Note that an adjustment is made because only adult cells are counted, and
embryos generally fill half the soup. Therefore adult cells of a particular
genotype need only occupy SavThrMem * 0.5 of the space to be saved.
SavThrPop = .04 threshold population proportion to save genotype
If a particular genotype amounts to SavThrPop of the total population
of (adult) cells in the soup, it will be assigned a permanent unique name,
and saved to disk.
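The interaction of SavMinNum, SavThrMem and SavThrPop can be sketched as
follows in Python. This is an illustration of the rules as described
above, not the code used by the genebanker. The function and argument
names are hypothetical. Two points are assumptions: that reaching either
threshold is sufficient (the two parameters are described separately
above), and that a genotype exactly at a threshold qualifies.

```python
def qualifies_for_saving(adult_count, adult_memory, soup_size, total_adults,
                         sav_min_num=2, sav_thr_mem=0.04, sav_thr_pop=0.04):
    """Return True if a genotype would get a permanent name in this sketch.

    adult_count  - adult cells of this genotype
    adult_memory - instructions of soup occupied by those adult cells
    soup_size    - SoupSize, in instructions
    total_adults - adult cells of all genotypes in the soup
    """
    if adult_count < sav_min_num:
        return False  # SavMinNum must be met before any threshold applies
    memory_share = adult_memory / soup_size
    population_share = adult_count / total_adults
    # Only adults are counted and embryos fill about half the soup,
    # so the memory threshold is halved (SavThrMem * 0.5).
    # Assumption: either threshold alone is enough.
    return (memory_share >= sav_thr_mem * 0.5
            or population_share >= sav_thr_pop)
```

For example, with SoupSize = 60000 and the thresholds at .04, twenty
adults of size 80 occupy 1600 instructions, which is more than
60000 * .04 * 0.5 = 1200, so the genotype qualifies on memory occupancy
even if it is a small fraction of the population.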
WatchExe = 0 mark executed instructions in genome in genebank
WatchMov = 0 set mov bits in genome in genebank
WatchTem = 0 set template bits in genome in genebank
WARNING: setting any of these three watch parameters will engage cpu
intensive observational software that will substantially slow down any
simulation. Engage these tools only if you plan to use the data that
they generate.
WatchExe = 0 mark executed instructions in genome in genebank
If the genebank is on, setting this parameter to a non-zero value
will turn on a watch of which instructions are being executed in each
permanent genotype (this helps to distinguish junk code from code that is
executed), and also, who is executing whose instructions. There is a bit
field in the GList structure (bit definitions are defined in the tierra.h
module) that keeps track of whether a creature executes its own instructions,
those of another creature, if another creature executes this creature's
instructions, etc:
bit 2 EXs = executes own instructions (self)
bit 3 EXd = executes daughter's instructions
bit 4 EXo = executes other cell's instructions
bit 5 EXf = executes instructions in free memory
bit 6 EXh = own instructions are executed by other creature (host)
WatchMov = 0 set mov bits in genome in genebank
If the genebank is on, setting this parameter to a non-zero value
will turn on a watch of who moves whose instructions and where. This
information is recorded in the bit field in the GList structure:
bit 17 MFs = moves instruction from self
bit 18 MFd = moves instruction from daughter
bit 19 MFo = moves instruction from other cell
bit 20 MFf = moves instruction from free memory
bit 21 MFh = own instructions are moved by other creature (host)
bit 22 MTs = moves instruction to self
bit 23 MTd = moves instruction to daughter
bit 24 MTo = moves instruction to other cell
bit 25 MTf = moves instruction to free memory
bit 26 MTh = is written on by another creature (host)
bit 27 MBs = executing other creature's code, moves inst from self
bit 28 MBd = executing other creature's code, moves inst from daughter
bit 29 MBo = executing other creature's code, moves inst from other cell
bit 30 MBf = executing other creature's code, moves inst from free memory
bit 31 MBh = other creature uses another cpu to move your instructions
WatchTem = 0 set template bits in genome in genebank
If the genebank is on, setting this parameter to a non-zero value
will turn on a watch of whose templates are matched by whom. This
information is recorded in the bit field in the GList structure:
bit 7 TCs = matches template complement of self
bit 8 TCd = matches template complement of daughter
bit 9 TCo = matches template complement of other
bit 10 TCf = matches template complement of free memory
bit 11 TCh = own template complement is matched by other creature (host)
bit 12 TPs = uses template pattern of self
bit 13 TPd = uses template pattern of daughter
bit 14 TPo = uses template pattern of other
bit 15 TPf = uses template pattern of free memory
bit 16 TPh = own template pattern is used by other creature (host)
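The three lists of watch bits above together cover bits 2 through 31 of
the bit field. As an aid to reading a raw value, the following Python
sketch turns an integer into the list of flag names that are set. It is
not part of the distribution, and the names WATCH_BIT_NAMES and
decode_watch_bits are hypothetical. It assumes that "bit n" means the bit
with value 2 to the power n (counting from zero); if the definitions in
tierra.h count differently, FIRST_WATCH_BIT must be adjusted. Bits 0 and 1
are not described in this section and are ignored.

```python
# Flag names in bit order, starting at bit 2, as listed under
# WatchExe (EX), WatchTem (TC, TP) and WatchMov (MF, MT, MB).
WATCH_BIT_NAMES = (
    "EXs EXd EXo EXf EXh "
    "TCs TCd TCo TCf TCh TPs TPd TPo TPf TPh "
    "MFs MFd MFo MFf MFh MTs MTd MTo MTf MTh "
    "MBs MBd MBo MBf MBh"
).split()

# Assumption: bits are numbered from zero, so bit 2 has value 1 << 2.
FIRST_WATCH_BIT = 2


def decode_watch_bits(bits):
    """Return the names of all watch flags set in the integer bits."""
    return [name
            for offset, name in enumerate(WATCH_BIT_NAMES)
            if bits & (1 << (FIRST_WATCH_BIT + offset))]
```

Under these assumptions, a creature that executes its own instructions
and moves instructions to its daughter would have at least the flags EXs
and MTd set.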
# environmental variables:
This is a comment indicating the beginning of the environmental
parameters. These parameters affect the course of evolution and
ecology of the run.
alive = 500 how many generations will we run
This tells the simulator how long to run, in generations.
DistFreq = -.3 frequency of disturbance, factor of recovery time
The frequency of disturbance, as a factor of recovery time. This and
the next option control the pattern of disturbance. If you do not want the
system to be disturbed, set DistFreq to a negative value. If DistFreq has
a non-negative value, when the soup fills up the reaper will be invoked to
kill cells until it has freed a proportion DistProp of the soup. The system
will then keep track of the time it takes for the creatures to recover from
the disturbance by filling the soup again. Let's call this recovery time:
rtime. The next disturbance will occur: (rtime X DistFreq) after recovery
is complete. Therefore, if DistFreq = 0, each disturbance will occur
immediately after recovery is complete. If DistFreq = 1, the time between
disturbances will be twice the recovery time, that is, the soup will remain
full for a period equal to the recovery time, before another disturbance hits.
DistProp = .2 proportion of population affected by disturbance
The proportion of the soup that is freed of cells by each disturbance.
The disturbance occurs by invoking the reaper to kill cells until the total
amount of free memory is greater than or equal to: (DistProp X SoupSize).
Note that cells are not killed at random, they are killed off the top of the
reaper queue, except as modified by the ReapRndProp variable.
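The arithmetic of DistFreq and DistProp can be sketched in Python as
follows. This is only a worked illustration of the description above, not
the simulator's code. The function names are hypothetical, and time is
measured in whatever units rtime is measured in.

```python
def next_disturbance_time(recovery_end, rtime, dist_freq):
    """Time at which the next disturbance occurs.

    recovery_end - time at which the soup became full again
    rtime        - time the creatures took to refill the soup
    dist_freq    - the DistFreq parameter
    Returns None when DistFreq is negative (no disturbances).
    """
    if dist_freq < 0:
        return None
    return recovery_end + rtime * dist_freq


def memory_to_free(dist_prop, soup_size):
    """Free memory the reaper must reach: DistProp X SoupSize."""
    return dist_prop * soup_size
```

If a disturbance hits at time 0 and recovery takes 100 time units, then
with DistFreq = 1 the next disturbance occurs at time 200, so the interval
between disturbances is twice the recovery time. With DistProp = .2 and
SoupSize = 50000, the reaper kills cells until about 10000 instructions
of memory are free.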
DivSameGen = 0 cells must produce offspring of same genotype, to stop evolution
This causes attempts at cell division to abort if the offspring is of
a genotype different from the parent. This can be used when the mutation
rates are set to zero, to prevent sex from causing evolution.
DivSameSiz = 0 cells must produce offspring of same size, to stop size change
Like DivSameGen, but cell division aborts only if the offspring is of
a different size than the parent. Changes in genotype are not prevented,
only changes in size are prevented.
DropDead = 5 stop system if no reproduction in the last x million instructions
Sometimes the soup dies, such as when mutation rates are too high.
This parameter watches the time elapsed since the last cell division, and
brings the system down if it is greater than DropDead million instructions.
GenPerBkgMut = 16 mutation rate control by generations ("cosmic ray")
Control of the background mutation rate ("cosmic ray"). The value 16
indicates that in each generation, roughly one in sixteen cells will be hit
by a mutation. These mutations occur completely at random, and also affect
free space where there are no cells. If the value of GenPerBkgMut were 0.5,
it would mean that in each generation, each cell would be hit by roughly
two mutations.
GenPerFlaw = 64 flaw control by generations
Control of the flaw rate. The value 64 means that in each generation,
roughly one in sixty-four individuals will experience a flaw. Flaws cause
instructions to produce results that are in error by plus or minus one,
in some sense. If the value of GenPerFlaw were 0.5, it would mean that in
each generation, each cell would be hit by roughly two flaws. This parameter
has a profound effect on the rate at which creatures shrink in size under
selection for small size.
GenPerMovMut = 8 mutation rate control by generations (copy mutation)
Control of the move mutation rate (copy mutation). The value 8
indicates that in each generation, roughly one in eight cells will be hit
by a mutation. These mutations only affect copies of instructions made
during replication (by the double indirect mov instruction). When an
instruction is affected by a mutation, one of its five bits is selected
at random and flipped. If the value of GenPerMovMut were 0.5, it would
mean that in each generation, each cell would be hit by roughly two
move mutations.
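A minimal sketch of the bit flip described above, assuming a five bit
instruction held in an unsigned char. The function name is hypothetical, and
rand() merely stands in for the random number generator used by Tierra.

```c
#include <assert.h>
#include <stdlib.h>

#define INSTBITNUM 5    /* bits per instruction */

/* Hypothetical helper: flip one randomly selected bit of an opcode.
 * rand() is a stand-in for Tierra's own random number generator. */
unsigned char flip_random_bit(unsigned char opcode)
{
    assert(opcode < (1u << INSTBITNUM));
    int bit = rand() % INSTBITNUM;              /* choose bit 0..4 */
    return (unsigned char)(opcode ^ (1u << bit));
}
```

The result always differs from the original opcode in exactly one bit, so a
mutated instruction is still one of the 32 legal opcodes.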
IMapFile = opcode.map map of opcodes to instructions, file in GenebankPath
Names the file containing the Instruction Set Map. The simulator will
look for this file in the GenebankPath. Please remember to only use
genebank files created with compatible map files. If you specify the
name of the map file as: IMapFile = -.map
then the file will not be read, and the default opcode mapping will be used.
MalMode = 1 0 = first fit, 1 = better fit, 2 = random preference,
# 3 = near mother's address, 4 = near dx address, 5 = near top of stack address
MalReapTol = 1 0 = reap by queue, 1 = reap oldest creature within MalTol
MalTol = 20 multiple of avgsize to search for free block
Memory allocation control:
Chris Stephenson of the IBM T. J. Watson Research Center
(cjs@yktem.vnet.ibm.com) has contributed a new memory allocator to Tierra,
based on a cartesian tree scheme that he developed. The new allocator is
more efficient than the old and provides a variety of options for controlling
where offspring are placed. Free blocks are maintained in a two dimensional
binary tree, in which daughter blocks can not be larger than their parents,
and blocks are ordered from left to right according to their position in
memory.
MalMode = 1 0 = first fit, 1 = better fit, 2 = random preference,
# 3 = near mother's address, 4 = near dx address, 5 = near top of stack address
This variable provides six options for controlling where offspring are
placed in the soup. 0 = first fit: this is the method used in versions of
Tierra before V4.0. In this method, free blocks are checked starting with
the leftmost block in memory, and the first block large enough to accommodate
the request is used.
1 = better fit: this method starts from the root of the tree, and if the
free block is larger than the request, it moves to the daughter free block
that is closest to the size of the request. This provides a better fit of the
request to the free block, but not necessarily the best fit.
2 = random preference: blocks of memory are allocated at positions
selected at random from the soup.
3 = near mother's address: blocks of memory will be placed near the
creature making the request (daughters will be placed near mothers), within
the specified tolerance (see MalReapTol and MalTol).
4 = near dx address: blocks of memory will be placed near the address
specified in the dx register, within the specified tolerance (see MalReapTol
and MalTol).
5 = near top of stack address: blocks of memory will be placed near the
address specified at the top of the stack, within the specified tolerance
(see MalReapTol and MalTol).
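The following sketch illustrates first fit (mode 0) and the tolerance test
used by modes 3 to 5. It is an assumption-laden illustration: Tierra keeps
free blocks in a cartesian tree, while the sketch uses a flat array sorted by
address, and the FreeBlock type, both function names, and the choice of
measuring distance from the start of the block are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical free block record (the real allocator uses a tree). */
typedef struct {
    long start;     /* address of the first free instruction */
    long size;      /* number of free instructions */
} FreeBlock;

/* First fit: scan from the leftmost block and return the index of the
 * first block large enough for the request, or -1 if none fits. */
int first_fit(const FreeBlock *blocks, size_t n, long request)
{
    assert(request > 0);
    for (size_t i = 0; i < n; i++)
        if (blocks[i].size >= request)
            return (int)i;
    return -1;
}

/* Assumed tolerance test: is the block start within
 * MalTol * average creature size of the preferred address? */
int within_tolerance(const FreeBlock *b, long preferred,
                     long mal_tol, long avg_size)
{
    long dist = b->start > preferred ? b->start - preferred
                                     : preferred - b->start;
    return dist <= mal_tol * avg_size;
}
```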
MalReapTol = 1 0 = reap by queue, 1 = reap oldest creature within MalTol
When a position and tolerance are specified for the daughter cell, the
reaper will probably have to free memory in that area to service the request.
If the reaper kills strictly by the reaper queue, without regard to location,
it may kill many cells before killing in the vicinity of the request. This
creates situations in which it is difficult for populations to grow, because
the reaper kills much more than is necessary. To resolve this problem, an
option is added in which the reaper still uses the queue, but it only kills
cells that fall within the tolerance region of the requested location.
The variable MalReapTol turns this option on.
MalTol = 20 multiple of avgsize to search for free block
This determines a region of tolerance within which a request for a
specific location of memory can be serviced. It is specified as a multiple
of the average creature size.
MateProb = 0.2 probability of mating at each mal
MateSearchL = 5 multiple of avgsize to search 0 = no limit
MateSizeEp = 2 size epsilon for potential mate
MateXoverProp = 1.0 proportion of gene to select for crossover point
Haploid Sex (aka crossover):
Haploid sex has been implemented in mimicry of Walter Tackett's
(tackett@ipld01.hac.com) implementation
in his private version of Tierra. This form of sex is more in the spirit
of the genetic algorithm than Tierra, in that the creatures don't have any
choice in the matter, they are forced to have sex at god's whim. However,
it does have the virtue of creating something like a biological species, in
that individuals close in size exchange genes creating an evolving gene pool.
The Tierran reproduction mechanism of repeatedly copying data from an offset
in the mother cell to an offset (presumably) in a daughter cell, can be
modified to easily allow ``crossover'' of genetic data. Crossover seems to
provide a means of increasing the rate of evolution. It can be thought of as
a higher mutation rate which can ``favor good genes''. This effect will be
made clear by elucidating how haploid crossover ``sex'' is done in the
Tierran model.
At the time a creature allocates a block of memory for a daughter (mal),
Tierra decides if this creature will mate, using the soup_in variable
MateProb (0.0 = no mating at all, 100.0 = creatures will try to mate 100% of
the time). If the creature is not selected for mating, Tierra continues
as normal until the next time this creature executes a memory allocate.
If this creature does mate, a point in the creature's genome, selected
randomly from the proportion MateXoverProp of the genome size, is chosen
as the ``cross over'' point. Whether we use the first part or the last
part of the genome is also chosen (chances are 1 in 2 for each "half").
No further actions are taken until the creature begins moving data
(presumably from herself, to the daughter cell).
At the time that the creature first copies an instruction of its genome to
a new location (presumably in the daughter cell) using the moviab instruction,
a mate is chosen. The mate is chosen at this point because the chances
of the mate dying before the cross over is complete are smaller now than
at the time of the memory allocation. A potential mate must be similar
in size to the creature trying to mate (creature's size +/- MateSizeEp),
and may not be of the exact same genotype (eg. 0080aaa can't mate
with another 0080aaa, there would be no point in the operation). The
search radius from the creature to the mate is determined by MateSearchL.
MateSearchL is multiplied by the Average Creature Size to determine the
number of bytes to search. NOTE - in release 3.13, only a FORWARD search
( towards higher addresses ) is done. This will become bidirectional in future
releases. Even if a creature is selected for mating at the time of execution
of the mal instruction, it may not actually mate, because it may not be
able to find a mate of a different genotype, in the correct size range, and
within the search radius.
If we have chosen to contribute our first section to the daughter, our mate
will contribute the second section. In that case, the copying of the genome
will proceed normally until we reach the crossover point. After that, each
time we execute the moviab instruction, we will copy an instruction from
the genome of our mate rather than from our genome. The creature is
``unaware'' of this process. For the sake of clarity an example is given:
Example A - Mate contributes second half :
creature ( 10 bytes long ) & mate ( 9 bytes long )
[ step 0 ] mal - selects Xover point of 5, second half
[ step 1 ] move bytes at offset 0-4 from self
[ step 2 ] move byte at offset 5-9 from mate
[ step 3 ] move byte after last byte of mate ( 10th byte )
Example B - Mate contributes first half :
creature ( 10 bytes long ) & mate ( 11 bytes long )
[ step 0 ] mal - selects Xover point of 5, first half
[ step 1 ] move bytes at offset 0-4 from mate
[ step 2 ] move byte at offset 5-10 from self
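The choice of source genome in the two examples can be sketched as a single
test on the offset being copied. The function below is hypothetical and only
restates the rule given above; it does not show how Tierra locates the mate.

```c
#include <assert.h>

/* Hypothetical sketch: which genome supplies the instruction copied at
 * a given offset once a mate has been found.
 * xover:            crossover point selected at mal
 * mate_gives_first: 1 if the mate contributes the first section
 * Returns 1 if the instruction comes from the mate, 0 if it comes
 * from the creature itself. */
int copy_from_mate(int offset, int xover, int mate_gives_first)
{
    assert(offset >= 0 && xover >= 0);
    if (mate_gives_first)
        return offset < xover;      /* Example B */
    return offset >= xover;         /* Example A */
}
```

With a crossover point of 5, offsets 0-4 come from the creature itself in
Example A and from the mate in Example B.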
MaxMalMult = 3 multiple of cell size allowed for mal()
When a cell attempts to allocate a second block of memory (presumably
to copy its genome into), this parameter is checked. If the amount of memory
requested is greater than MaxMalMult times the size of the mother cell, the
request will fail. This prevents mutants from requesting the entire soup,
which would invoke the reaper to cause a massive kill off.
MemModeFree = 0 read, write, execute protection for free memory
When memory is free, not owned by any creature, it will have this
protection. 0 means that anybody can read, write, or execute the memory.
MemModeProt = 2 rwx protect mem: 1 bit = execute, 2 bit = write, 4 bit = read
When memory is owned by a creature, it will have this protection.
2 means that write privilege is protected, so only the owner can write on
the memory. The owner always has read, write, and execute privileges on owned
memory. However, other creatures must obey the MemModeProt protection,
they are excluded from the activities whose bits are set. Be aware that
each of the three forms of memory protection is turned on at compile time
by the appropriate definitions:
#define READPROT /* define to implement read protection of soup */
#define WRITEPROT /* define to implement write protection of soup */
#define EXECPROT /* define to implement execute protection of soup */
The version distributed on disk only has write protection turned on.
To implement the other forms of protection you will need to comment in
the appropriate line in the configur.h file, and recompile. Once the
relevant form of protection is defined at compile time, the MemModeProt
variable still determines whether it is actually used. We do this because
each form of memory protection is costly of cpu time, and write protection
is the only one likely to be used.
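The bit values of MemModeProt and MemModeFree can be tested as in the sketch
below. The macro and function names are hypothetical and are not taken from
the Tierra source; only the bit assignments (1 = execute, 2 = write,
4 = read) come from the description above.

```c
#include <assert.h>

/* Bit values as documented for MemModeProt and MemModeFree. */
#define MEM_EXEC  1
#define MEM_WRITE 2
#define MEM_READ  4

/* Hypothetical check: may a creature perform 'access' (one of the bits
 * above) on memory whose protection is 'mode'?  The owner is never
 * restricted; other creatures are excluded from every activity whose
 * bit is set in mode. */
int access_allowed(int mode, int access, int is_owner)
{
    assert(access == MEM_EXEC || access == MEM_WRITE || access == MEM_READ);
    if (is_owner)
        return 1;
    return (mode & access) == 0;
}
```

With MemModeProt = 2, a creature that does not own the memory may read or
execute it but may not write to it.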
MinCellSize = 8 minimum size for cells
When a cell attempts to divide, this parameter is checked. If the
daughter cell would be smaller than MinCellSize instructions, divide will
fail. The reason this is needed is that with no lower limit, there is a
tendency for some mutants to spawn large numbers of very small cells.
MinTemplSize = 1 minimum size for templates
When an instruction (like jump) attempts to use a template, this
parameter is checked. If the actual template is smaller than MinTemplSize
instructions, the instruction will fail. This is a matter of taste.
MovPropThrDiv = .7 minimum proportion of daughter cell filled by mov
When a cell attempts to divide, this parameter is checked. If the
mother cell has moved less than MovPropThrDiv times the mother cell size, of
instructions into the daughter cell, cell division will abort. A value of .7
means that the mother must at least fill the daughter 70% with instructions
(though all these instructions could have been moved to the same spot in
the daughter cell). The reason this parameter exists is that without it,
mutants will attempt to spew out large numbers of empty cells.
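The two checks made at divide (MinCellSize and MovPropThrDiv) can be
sketched together as follows. The function and argument names are
hypothetical. The sketch follows the wording above in comparing the number
of moved instructions against the mother cell size.

```c
#include <assert.h>

/* Hypothetical sketch of the size checks made when a cell divides.
 * daughter_size: size of the daughter cell, in instructions
 * mother_size:   size of the mother cell, in instructions
 * moved:         instructions the mother has moved into the daughter
 * Returns 1 if division may proceed, 0 if it aborts. */
int divide_ok(long daughter_size, long mother_size, long moved,
              long min_cell_size, double mov_prop_thr_div)
{
    assert(mother_size > 0);
    if (daughter_size < min_cell_size)
        return 0;       /* daughter smaller than MinCellSize */
    if ((double)moved < mov_prop_thr_div * (double)mother_size)
        return 0;       /* too few instructions were moved */
    return 1;
}
```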
new_soup = 1 1 = this is a new soup, 0 = restarting an old run
This value is checked on startup, to determine if this is a new soup,
or if this is restarting an old run where it left off. When the system
comes down, all soup_in parameters (and many other global variables) are
saved in a file called soup_out. The value of new_soup is set to 0 in
soup_out. In order to restart an old run, just use soup_out as the input
file rather than soup_in. This is done by using soup_out as a command line
parameter at startup: tierra soup_out
NumCells = 3 number of creatures and gaps used to inoculate new soup
This parameter is checked at startup, and the system will look for a
list of NumCells creatures at the end of the soup_in file. The value 3
indicates that the soup will initially be inoculated by three cells.
However, NumCells also counts gaps that are placed between cells (without
gaps, all cells are packed together at the bottom of the soup at startup).
The gap control feature does not work at present, so don't use it. Notice
that after the list of parameters in the soup_in file, there is a blank
line, followed by a list of genotypes. The system will read the first
NumCells genotypes from the list, and place them in the soup in the same
order that they occur in the list.
PhotonPow = 1.5 power for photon match slice size
If SliceStyle (see below) is set to the value 1, then the allocation
of CPU cycles to creatures is based on a photon - chlorophyll metaphor.
Imagine that photons are raining down on the soup at random. The cell hit
by the photon gets a time slice that is proportional to the goodness of fit
between the pattern of instructions that are hit, and an arbitrary pattern
(defined by PhotonWord, see below).
The template of instructions defined by PhotonWord is laid over the
sequence of instructions at the site hit by the photon. The number of
instructions that match between the two is used to determine the slice
size. However, the number of matching instructions is raised to the power
PhotonPow, to calculate the slice size.
PhotonWidth = 8 amount by which photons slide to find best fit
When a photon hits the soup, it slides a distance PhotonWidth, counting
the number of matching characters at each position, and the slice size will
be equal to the number of characters in the best match (raised to the power
PhotonPow, see above). If PhotonWidth equals 8, the center of the template
will start 4 instructions to the left of the site hit by the photon, and
slide to 4 instructions to the right of the site hit.
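The sliding match can be sketched as below. This is an illustration under
several assumptions, not the SlicerPhoton() code: the soup and the photon
word are both represented as strings with one character per instruction,
addresses are assumed to wrap around the ends of the soup, and the function
name is hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical sketch: slide the template across the site hit by the
 * photon and return the largest number of matching instructions. */
long best_photon_match(const char *soup, long soup_size, long hit,
                       const char *word, int photon_width)
{
    long wlen = (long)strlen(word);
    long best = 0;
    assert(soup_size > 0 && wlen > 0 && photon_width >= 0);
    for (long shift = -(photon_width / 2); shift <= photon_width / 2; shift++) {
        long first = hit + shift - wlen / 2;    /* left edge of template */
        long matches = 0;
        for (long i = 0; i < wlen; i++) {
            /* assumed: addresses wrap so they stay inside the soup */
            long addr = ((first + i) % soup_size + soup_size) % soup_size;
            if (soup[addr] == word[i])
                matches++;
        }
        if (matches > best)
            best = matches;
    }
    return best;
}
```

The slice size would then be the returned count raised to the power
PhotonPow, for example with pow() from math.h.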
PhotonWord = chlorophill word used to define photon
This string determines the arbitrary pattern that absorbs the photon.
It uses a base 32 numbering system: the digits 0-9 followed by the characters
a-v. The characters w, x, y and z are not allowed (that is why chlorophyll
is misspelled). The string may be any length up to 79 characters.
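The base 32 digits can be decoded as in the following sketch. Both function
names are hypothetical, and the code assumes an ASCII character set in which
the letters a-v are contiguous.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: convert one PhotonWord character to its base 32
 * value (0-9, then a-v).  Returns -1 for characters that are not
 * allowed, such as w, x, y and z. */
int photon_digit(char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    if (c >= 'a' && c <= 'v')
        return c - 'a' + 10;
    return -1;
}

/* Returns 1 if every character is a legal digit and the word is no
 * longer than 79 characters, otherwise 0. */
int photon_word_ok(const char *word)
{
    size_t n = 0;
    assert(word != NULL);
    for (; word[n] != '\0'; n++)
        if (photon_digit(word[n]) < 0)
            return 0;
    return n <= 79;
}
```

Under this check "chlorophill" is accepted, while the correct spelling is
rejected because of the letter y.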
ReapRndProp = .3 top prop of reaper queue to reap from
This parameter determines the degree to which mortality is random.
If ReapRndProp is set to zero, the reaper always kills the creature at the
top of the reaper queue. If ReapRndProp is set to one, the reaper kills
at random. If ReapRndProp is set to 0.3, the reaper will kill a cell selected
at random from the top 30% of the reaper queue.
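A minimal sketch of this selection rule, assuming that index 0 is the top of
the reaper queue. The function name is hypothetical and rand() stands in for
the random number generator used by Tierra.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical sketch: choose which entry of the reaper queue to kill.
 * Index 0 is assumed to be the top of the queue. */
long reap_index(long queue_len, double reap_rnd_prop)
{
    assert(queue_len > 0);
    assert(reap_rnd_prop >= 0.0 && reap_rnd_prop <= 1.0);
    long window = (long)(reap_rnd_prop * (double)queue_len);
    if (window < 1)
        window = 1;         /* ReapRndProp = 0: always kill the top */
    return rand() % window; /* rand() is a stand-in generator */
}
```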
SearchLimit = 5
This parameter controls how far instructions may search to match
templates. The value five means that search is limited to five times the
average adult cell size. The actual distance is updated every million
instructions.
seed = 0 seed for random number generator, 0 uses time to set seed
The seed for the random number generator. If you use the value zero,
the system clock combined with the random number generator is used to set the
seed. If you use any other value, it will be the seed. The starting seed
(even when provided by the clock) will be written to standard output, the
tierra.log file, and also saved in the soup_out file when the simulator comes
down. By using the original seed and all the same initial parameter settings
in soup_in, a run may be repeated exactly.
SizDepSlice = 0 set slice size by size of creature
This determines a major slicer option. If this parameter is set to
zero, the slice size will either be a constant set by SliceSize (see below)
or a uniform random variate, or a mix of the two. The mix is determined by
the relative values of SlicFixFrac and SlicRanFrac (see below). The actual
slice size will be:
(SlicFixFrac * SliceSize) + (tlrand() % (I32s) ((SlicRanFrac * SliceSize) + 1))
If SizDepSlice is set to a non-zero value, the slice size will be
proportional to the size of the genome. In this case, the base slice size
will be the genome size raised to the power SlicePow (see below). To clarify
let slic_siz = genome_size ^ SlicePow, the actual slice size will be:
(SlicFixFrac * slic_siz) + (tlrand() % (I32s) ((SlicRanFrac * slic_siz) + 1))
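Both formulas have the same shape, and can be restated as one C function.
This is a sketch and not the code from slicers.c: the function name is
hypothetical, the argument base stands for either SliceSize or slic_siz, and
the argument rnd stands for the value returned by tlrand().

```c
#include <assert.h>

/* Hypothetical restatement of the slice size formula.
 * base: SliceSize when SizDepSlice = 0, otherwise genome_size ^ SlicePow
 * rnd:  a non-negative random integer, standing in for tlrand() */
long slice_size(double slic_fix_frac, double slic_ran_frac,
                double base, long rnd)
{
    assert(rnd >= 0 && base >= 0.0 && slic_ran_frac >= 0.0);
    long span = (long)(slic_ran_frac * base + 1.0);   /* the (I32s) cast */
    return (long)(slic_fix_frac * base) + rnd % span;
}
```

With SlicFixFrac = 0, SlicRanFrac = 2 and SliceSize = 25, the random term is
taken modulo 51, so the slice size lies between 0 and 50.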
SlicePow = 1 set power for slice size, use when SizDepSlice = 1
This parameter is only used when SizDepSlice = 1. In this case, the
genome size is raised to the power SlicePow to determine the slice size
(see algorithm under SizDepSlice above). If SlicePow = 1, the run will be
size neutral, selection will not be biased toward either large or small
creatures (the probability of an instruction being executed is not dependent
on the size of the genome it is located in). If SlicePow > 1, selection will
favor larger genomes. If SlicePow < 1, selection will favor small genomes.
SliceSize = 25 slice size when SizDepSlice = 0
This parameter determines the base slice size when SizDepSlice = 0.
The actual slice size in this case depends on the values of SlicFixFrac
and SlicRanFrac (see below). The way the slice size is actually calculated
is explained under SizDepSlice above.
SliceStyle = 2 choose style of determining slice size
The slicer is a pointer to function, and the function actually used
is determined by this parameter. At present there are three choices (0-2).
The pointer to function is assigned in the setup.c module, and the slicer
functions themselves are contained in the slicers.c module.
0 = SlicerQueue() - slice sizes without a random component
1 = SlicerPhoton() - slice size based on photon interception metaphor
2 = RanSlicerQueue() - slice size with a fixed and a random component
SlicFixFrac = 0 fixed fraction of slice size
When SliceStyle = 2, the slice size has a fixed component and a random
component. This parameter determines the fixed component as a multiple
of SliceSize, or genome_size ^ SlicePow.
SlicRanFrac = 2 random fraction of slice size
When SliceStyle = 2, the slice size has a fixed component and a random
component. This parameter determines the random component as a multiple
of SliceSize, or genome_size ^ SlicePow.
SoupSize = 50000 size of soup in instructions
This variable sets the size of the soup, measured in instructions.
space 10000
0080aaa
space 10000
0045aaa
space 10000
0080aaa
This is the list of cells that will be loaded into the soup when
the simulator starts up. This example indicates that three cells will
be loaded at startup, the ancestor 0080aaa alternating with the parasite
0045aaa. The line:
space 10000
indicates that a space of ten thousand instructions will be left before the
cell that follows. Therefore the first cell will start after 10,000 blank
instructions, and there will be 10,000 instructions between each of the
three creatures.
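The resulting starting addresses can be worked out with the sketch below. It
is only an illustration: the InocEntry type and the function name are
hypothetical, and it assumes that the leading digits of a genotype name
(0080, 0045) give the size of the creature in instructions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record for one line of the inoculation list. */
typedef struct {
    int  is_space;  /* 1 for a "space N" line, 0 for a genotype */
    long size;      /* gap length, or genome size in instructions */
} InocEntry;

/* Fill starts[] with the soup address of each genotype, in list order,
 * and return the number of cells placed. */
size_t layout_cells(const InocEntry *list, size_t n, long *starts)
{
    long addr = 0;
    size_t cells = 0;
    assert(list != NULL && starts != NULL);
    for (size_t i = 0; i < n; i++) {
        if (!list[i].is_space)
            starts[cells++] = addr;
        addr += list[i].size;
    }
    return cells;
}
```

For the list shown above, and under the size assumption, the three cells
would start at addresses 10000, 20080 and 30125.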
8) THE INSTRUCTION SETS
8.1) Synopsis of the Four Sets
8.2) Details of Set 1
8.3) New Features in Sets 2 through 4
8.1) Synopsis of the Four Sets
At present, four instruction sets have been implemented. They are
INST == 1, INST == 2, INST == 3, and INST == 4. The associations of opcodes,
mnemonics, parsing functions and executables that define each instruction set
are determined by an array of the following structures (defined in the
tierra.h file):
typedef struct { /* structure for instruction set definitions */
I8s op; /* op code */
I8s mn[9]; /* assembler mnemonic */
void (*execute) P_((Cell *)); /* pointer to execute function */
void (*parse) P_((Cell *)); /* pointer to parse function */
} InstDef; /* this structure is defined in tierra.h */
InstDef id[INSTNUM];
The parsing functions are contained in the parse1.c, parse2.c, parse3.c,
and parse4.c files, which are conditionally included by the parse.c file.
The executables are contained in the instruct.c file.
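The way such an array associates mnemonics with opcodes can be illustrated
with a reduced table. This sketch is not taken from the Tierra source: the
MiniInstDef type leaves out the two function pointers so that it compiles on
its own, the table holds only the first four entries of instruction set 1,
and the lookup function is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Reduced, hypothetical version of InstDef without function pointers. */
typedef struct {
    signed char op;     /* op code */
    char        mn[9];  /* assembler mnemonic */
} MiniInstDef;

/* First four entries of instruction set 1, for illustration only. */
static const MiniInstDef mini_id[] = {
    {0x00, "nop0"}, {0x01, "nop1"}, {0x02, "not0"}, {0x03, "shl"}
};

/* Return the opcode for a mnemonic, or -1 if it is not in the table. */
int opcode_of(const char *mnemonic)
{
    assert(mnemonic != NULL);
    for (size_t i = 0; i < sizeof mini_id / sizeof mini_id[0]; i++)
        if (strcmp(mini_id[i].mn, mnemonic) == 0)
            return mini_id[i].op;
    return -1;
}
```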
******************************************************************************
Instset #1 the original instruction set, designed and implemented by
Tom Ray.
This instruction set operates on a cpu based on the following definitions:
#define INSTBITNUM 5
#define INSTNUM 32 /* INSTNUM = 2 ^ INSTBITNUM */
#define INST 1
#define STACK_SIZE 10
#define ALOC_REG 4
#define NUMREG 4 /* NUMREG = ALOC_REG */
typedef struct { /* structure for registers of virtual cpu */
Reg re[ALOC_REG]; /* array of registers */
Reg sp; /* stack pointer */
Reg st[STACK_SIZE]; /* stack */
Reg ip; /* instruction pointer */
I8s fl; /* flag */
} Cpu;
The four registers in the re[] array are referred to as follows:
AX = re[0], BX = re[1], CX = re[2], DX = re[3].
No Operations: 2
nop0
nop1
Memory Movement: 11
pushax (push AX onto stack)
pushbx (push BX onto stack)
pushcx (push CX onto stack)
pushdx (push DX onto stack)
popax (pop from stack into AX)
popbx (pop from stack into BX)
popcx (pop from stack into CX)
popdx (pop from stack into DX)
movcd (DX = CX)
movab (BX = AX)
movii (move from ram [BX] to ram [AX])
Calculation: 9
sub_ab (CX = AX - BX)
sub_ac (AX = AX - CX)
inc_a (increment AX)
inc_b (increment BX)
inc_c (increment CX)
dec_c (decrement CX)
zero (zero CX)
not0 (flip low order bit of CX)
shl (shift left all bits of CX)
Instruction Pointer Manipulation: 5
ifz (if CX == 0 execute next instruction, otherwise, skip it)
jmp (jump to template)
jmpb (jump backwards to template)
call (push IP onto the stack, jump to template)
ret (pop the stack into the IP)
Biological and Sensory: 5
adr (search outward for template, put address in AX, template size in CX)
adrb (search backward for template, put address in AX, template size in CX)
adrf (search forward for template, put address in AX, template size in CX)
mal (allocate amount of space specified in CX)
divide (cell division)
Total: 32 instructions
The following array defines the associations of opcodes, mnemonics,
executable functions, and parsing functions (in that order) used to
implement instruction set #1.
{0x00, "nop0", nop, pnop},
{0x01, "nop1", nop, pnop},
{0x02, "not0", not0, pnot0},
{0x03, "shl", shl, pshl},
{0x04, "zero", movdd, pzero},
{0x05, "ifz", ifz, pifz},
{0x06, "sub_ab", math, psub_ab},
{0x07, "sub_ac", math, psub_ac},
{0x08, "inc_a", math, pinc_a},
{0x09, "inc_b", math, pinc_b},
{0x0a, "dec_c", math, pdec_c},
{0x0b, "inc_c", math, pinc_c},
{0x0c, "pushax", push, ppushax},
{0x0d, "pushbx", push, ppushbx},
{0x0e, "pushcx", push, ppushcx},
{0x0f, "pushdx", push, ppushdx},
{0x10, "popax", pop, ppopax},
{0x11, "popbx", pop, ppopbx},
{0x12, "popcx", pop, ppopcx},
{0x13, "popdx", pop, ppopdx},
{0x14, "jmp", adr, ptjmp},
{0x15, "jmpb", adr, ptjmpb},
{0x16, "call", tcall, ptcall},
{0x17, "ret", pop, pret},
{0x18, "movcd", movdd, pmov_dc},
{0x19, "movab", movdd, pmov_ba},
{0x1a, "movii", movii, pmovii},
{0x1b, "adr", adr, padr},
{0x1c, "adrb", adr, padrb},
{0x1d, "adrf", adr, padrf},
{0x1e, "mal", malchm, pmal},
{0x1f, "divide", divide, pdivide}
******************************************************************************
Instset #2 Based on a design suggested by Kurt Thearling of Thinking Machines,
and implemented by Tom Ray. The novel feature of this instruction set
is the ability to reorder the relative positions of the registers, using
the AX, BX, CX and DX instructions. There are in essence, two sets of
registers, the first set contains the values that the instruction set
operates on, the second set points to the first set, in order to determine
which registers any operation will act on.
Let the four registers containing values be called AX, BX, CX and DX.
Let the four registers pointing to these registers be called R0, R1, R2
and R3. When a virtual cpu is initialized, R0 points to AX, R1 to BX,
R2 to CX and R3 to DX. The instruction "add" does the following:
(R2 = R1 + R0). Therefore CX = BX + AX. However, if we execute the DX
instruction, then R0 points to DX, R1 to AX, R2 to BX and R3 to CX. Now
if we execute the add instruction, we will perform: BX = AX + DX. If we
execute the DX instruction again, R0 points to DX, R1 to DX, R2 to AX,
and R3 to BX. Now the add instruction would perform: AX = DX + DX.
Now the registers can be returned to their original configuration by
executing the following three instructions in order: cx, bx, ax.
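The reordering just described can be traced with a small sketch. The
RegOrder type and the three functions are hypothetical and are not the
regorder() and math() functions of instruct.c. They only model the pointer
registers R0 to R3 as indices into an array of four value registers.

```c
#include <assert.h>
#include <stddef.h>

enum { AX, BX, CX, DX };    /* indices of the value registers */

/* Hypothetical model of R0..R3: r[i] names a value register. */
typedef struct {
    int r[4];
} RegOrder;

void reg_init(RegOrder *o)
{
    assert(o != NULL);
    o->r[0] = AX; o->r[1] = BX; o->r[2] = CX; o->r[3] = DX;
}

/* The ax, bx, cx or dx instruction: the named register becomes R0, the
 * old R0, R1 and R2 each move up one place, and the old R3 is lost. */
void reg_order(RegOrder *o, int which)
{
    assert(which >= AX && which <= DX);
    o->r[3] = o->r[2];
    o->r[2] = o->r[1];
    o->r[1] = o->r[0];
    o->r[0] = which;
}

/* add: R2 = R1 + R0, applied to the value registers in val[]. */
void op_add(const RegOrder *o, long val[4])
{
    val[o->r[2]] = val[o->r[1]] + val[o->r[0]];
}
```

Starting from the initial configuration, calling reg_order() with DX twice
and then with CX, BX and AX restores the original order, as stated above.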
This instruction set operates on a cpu based on the following definitions:
#define INSTBITNUM 5
#define INSTNUM 32 /* INSTNUM = 2 ^ INSTBITNUM */
#define INST 2
#define STACK_SIZE 10
#define ALOC_REG 8
#define NUMREG 4 /* NUMREG = ALOC_REG / 2 */
#define GETBUFSIZ 3
#define PUTBUFSIZ 3
typedef struct { /* structure for registers of virtual cpu */
Reg re[ALOC_REG]; /* array of registers */
Reg sp; /* stack pointer */
Reg st[STACK_SIZE]; /* stack */
Reg gb[GETBUFSIZ+3]; /* input buffer */
Reg pb[PUTBUFSIZ+3]; /* output buffer */
Reg ip; /* instruction pointer */
I8s fl; /* flag */
} Cpu;
The eight registers in the re[] array are referred to as follows:
AX = re[0], BX = re[1], CX = re[2], DX = re[3],
R0 = re[4], R1 = re[5], R2 = re[6], R3 = re[7].
No Operations: 2
nop0
nop1
Memory Movement: 12
ax (make AX R0, R1 = R0, R2 = R1, R3 = R2, R3 is lost)
bx (make BX R0, R1 = R0, R2 = R1, R3 = R2, R3 is lost)
cx (make CX R0, R1 = R0, R2 = R1, R3 = R2, R3 is lost)
dx (make DX R0, R1 = R0, R2 = R1, R3 = R2, R3 is lost)
movdd (move R1 to R0)
movdi (move from R1 to ram [R0])
movid (move from ram [R1] to R0)
movii (move from ram [R1] to ram [R0])
push (push R0 onto stack)
pop (pop from stack into R0)
put (write R0 to output buffer, three modes:
#ifndef ICC: write R0 to own output buffer
#ifdef ICC: write R0 to input buffer of cell at address R1,
or, if template, write R0 to input buffers of all creatures within
PutLimit who have the complementary get template)
get (read R0 from input buffer)
Calculation: 8
inc (increment R0)
dec (decrement R0)
add (R2 = R1 + R0)
sub (R2 = R1 - R0)
zero (zero R0)
not0 (flip low order bit of R0)
shl (shift left all bits of R0)
not (flip all bits of R0)
Instruction Pointer Manipulation: 5
ifz (if R1 == 0 execute next instruction, otherwise, skip it)
iffl (if flag == 1 execute next instruction, otherwise, skip it)
jmp (jump to template, or if no template jump to address in R0)
jmpb (jump back to template, or if no template jump back to address in R0)
call (push IP + 1 onto the stack; if template, jump to complementary templ)
Biological and Sensory: 5
adr (search outward for template, put address in R0, template size in R1,
and offset in R2, start search at offset +- R0)
adrb (search backward for template, put address in R0, template size in R1,
and offset in R2, start search at offset - R0)
adrf (search forward for template, put address in R0, template size in R1,
and offset in R2, start search at offset + R0)
mal (allocate amount of space specified in R0, prefer address at R1,
if R1 < 0 use better fit, place address of allocated block in R0)
divide (cell division, the IP is offset by R0 into the daughter cell, the
values in the four CPU registers are transferred from mother to
daughter, but not the stack. If !R1, eject genome from soup)
Total: 32 instructions
The following array defines the associations of opcodes, mnemonics,
executable functions, and parsing functions (in that order) used to
implement instruction set #2.
{0x00, "nop0", nop, pnop},
{0x01, "nop1", nop, pnop},
{0x02, "ax", regorder, pax},
{0x03, "bx", regorder, pbx},
{0x04, "cx", regorder, pcx},
{0x05, "dx", regorder, pdx},
{0x06, "movdd", movdd, pmovdd},
{0x07, "movdi", movdi, pmovdi},
{0x08, "movid", movid, pmovid},
{0x09, "movii", movii, pmovii},
{0x0a, "push", push, ppush},
{0x0b, "pop", pop, ppop},
{0x0c, "put", put, pput},
{0x0d, "get", get, pget},
{0x0e, "inc", math, pinc},
{0x0f, "dec", math, pdec},
{0x10, "add", math, padd},
{0x11, "sub", math, psub},
{0x12, "zero", movdd, pzero},
{0x13, "shl", shl, pshl},
{0x14, "not0", not0, pnot0},
{0x15, "not", not, pnot},
{0x16, "ifz", ifz, pifz},
{0x17, "iffl", ifz, piffl},
{0x18, "jmp", adr, ptjmp},
{0x19, "jmpb", adr, ptjmpb},
{0x1a, "call", tcall, ptcall},
{0x1b, "adr", adr, padr},
{0x1c, "adrb", adr, padrb},
{0x1d, "adrf", adr, padrf},
{0x1e, "mal", malchm, pmal},
{0x1f, "divide", divide, pdivide}
******************************************************************************
Instset #3 Based on a design suggested and implemented by Tom Ray. This
includes certain features of the RPN Hewlett-Packard calculator.
This instruction set operates on a cpu based on the following definitions:
#define INSTBITNUM 5
#define INSTNUM 32 /* INSTNUM = 2 ^ INSTBITNUM */
#define INST 3
#define STACK_SIZE 10
#define ALOC_REG 4
#define NUMREG 4 /* NUMREG = ALOC_REG */
#define GETBUFSIZ 3
#define PUTBUFSIZ 3
typedef struct { /* structure for registers of virtual cpu */
Reg re[ALOC_REG]; /* array of registers */
Reg sp; /* stack pointer */
Reg st[STACK_SIZE]; /* stack */
Reg gb[GETBUFSIZ+3]; /* input buffer */
Reg pb[PUTBUFSIZ+3]; /* output buffer */
Reg ip; /* instruction pointer */
I8s fl; /* flag */
} Cpu;
The four registers in the re[] array are referred to as follows:
AX = re[0], BX = re[1], CX = re[2], DX = re[3].
No Operations: 2
nop0
nop1
Memory Movement: 11
rollu (roll registers up: AX = DX, BX = AX, CX = BX, DX = CX)
rolld (roll registers down: AX = BX, BX = CX, CX = DX, DX = AX)
enter (AX = AX, BX = AX, CX = BX, DX = CX, DX is lost)
exch (AX = BX, BX = AX)
movdi (move from BX to ram [AX])
movid (move from ram [BX] to AX)
movii (move from ram [BX] to ram [AX])
push (push AX onto stack)
pop (pop from stack into AX)
put (write AX to output buffer, three modes:
#ifndef ICC: write AX to own output buffer
#ifdef ICC: write AX to input buffer of cell at address BX,
or, if template, write AX to input buffers of all creatures within
PutLimit who have the complementary get template)
get (read AX from input buffer)
Calculation: 9
inc (increment AX)
dec (decrement AX)
add (AX = BX + AX, BX = CX, CX = DX)
sub (AX = BX - AX, BX = CX, CX = DX)
zero (zero AX)
not0 (flip low order bit of AX)
not (flip all bits of AX)
shl (shift left all bits of AX)
rand (place random number in AX)
Instruction Pointer Manipulation: 5
ifz (if AX == 0 execute next instruction, otherwise, skip it)
iffl (if flag == 1 execute next instruction, otherwise, skip it)
jmp (jump to template, or if no template jump to address in AX)
jmpb (jump back to template, or if no template jump back to address in AX)
call (push IP + 1 onto the stack; if template, jump to complementary templ)
Biological and Sensory: 5
adr (search outward for template, put address in AX, template size in BX,
and offset in CX, start search at offset +- BX)
adrb (search backward for template, put address in AX, template size in BX,
and offset in CX, start search at offset - BX)
adrf (search forward for template, put address in AX, template size in BX,
and offset in CX, start search at offset + BX)
mal (allocate amount of space specified in BX, prefer address at AX,
if AX < 0 use better fit, place address of allocated block in AX)
divide (cell division, the IP is offset by AX into the daughter cell, the
values in the four CPU registers are transferred from mother to
daughter, but not the stack. If !CX genome will be ejected from
the simulator)
Total: 32 instructions
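The register shuffling of this set may be easier to follow as code. What
follows is only a sketch on a bare four element array, not the code found
in the simulator: the names sk_rollu(), sk_rolld(), sk_enter(), sk_exch()
and sk_add() are hypothetical, the Reg typedef is a stand-in, and flaws and
range checks are left out. The listing above does not say what add leaves
in DX, so the sketch assumes DX is left unchanged.

```c
typedef long Reg;   /* assumption: stand-in for the simulator's Reg type */

/* re[0] = AX, re[1] = BX, re[2] = CX, re[3] = DX */

void sk_rollu(Reg re[4])        /* AX = DX, BX = AX, CX = BX, DX = CX */
{   Reg t = re[3];
    re[3] = re[2]; re[2] = re[1]; re[1] = re[0]; re[0] = t;
}

void sk_rolld(Reg re[4])        /* AX = BX, BX = CX, CX = DX, DX = AX */
{   Reg t = re[0];
    re[0] = re[1]; re[1] = re[2]; re[2] = re[3]; re[3] = t;
}

void sk_enter(Reg re[4])        /* AX kept, BX = AX, CX = BX, DX = CX */
{   re[3] = re[2]; re[2] = re[1]; re[1] = re[0];   /* old DX is lost */
}

void sk_exch(Reg re[4])         /* AX = BX, BX = AX */
{   Reg t = re[0];
    re[0] = re[1]; re[1] = t;
}

void sk_add(Reg re[4])          /* AX = BX + AX, BX = CX, CX = DX */
{   re[0] = re[1] + re[0];
    re[1] = re[2];
    re[2] = re[3];              /* assumption: DX itself is left as it was */
}
```

Starting from AX..DX = 1, 2, 3, 4, a call to sk_rollu() gives 4, 1, 2, 3,
and a following sk_rolld() restores the original order.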
The following array defines the associations of opcodes, mnemonics,
executable functions, and parsing functions (in that order) used to
implement instruction set #3.
{0x00, "nop0", nop, pnop},
{0x01, "nop1", nop, pnop},
{0x02, "rollu", rollu, pnop},
{0x03, "rolld", rolld, pnop},
{0x04, "enter", enter, pnop},
{0x05, "exch", exch, pnop},
{0x06, "movdi", movdi, pmovdi},
{0x07, "movid", movid, pmovid},
{0x08, "movii", movii, pmovii},
{0x09, "push", push, ppush},
{0x0a, "pop", pop3, ppop},
{0x0b, "put", put, pput},
{0x0c, "get", get, pget},
{0x0d, "inc", math, pinc},
{0x0e, "dec", math, pdec},
{0x0f, "add", math3, padd},
{0x10, "sub", math3, psub},
{0x11, "zero", movdd3, pzero},
{0x12, "shl", shl, pshl},
{0x13, "not0", not0, pnot0},
{0x14, "not", not, pnot},
{0x15, "rand", movdd3, prand},
{0x16, "ifz", ifz, pifz},
{0x17, "iffl", ifz, piffl},
{0x18, "jmp", adr, ptjmp},
{0x19, "jmpb", adr, ptjmpb},
{0x1a, "call", tcall, ptcall},
{0x1b, "adr", adr3, padr},
{0x1c, "adrb", adr3, padrb},
{0x1d, "adrf", adr3, padrf},
{0x1e, "mal", malchm3, pmal},
{0x1f, "divide", divide, pdivide}
******************************************************************************
Instset #4 Based on a design suggested by Walter Tackett of Hughes Aircraft,
and implemented by Tom Ray. The special features of this instruction
set are that all movement between registers of the cpu takes place via
push and pop through the stack. Also, all indirect addressing involves
an offset from the address in the CX register. Also, the CX register
is where most calculations take place.
This instruction set operates on a cpu based on the following definitions:
#define INSTBITNUM 5
#define INSTNUM 32 /* INSTNUM = 2 ^ INSTBITNUM */
#define INST 4
#define STACK_SIZE 10
#define ALOC_REG 4
#define NUMREG 4 /* NUMREG = ALOC_REG */
#define GETBUFSIZ 3
#define PUTBUFSIZ 3
typedef struct { /* structure for registers of virtual cpu */
Reg re[ALOC_REG]; /* array of registers */
Reg sp; /* stack pointer */
Reg st[STACK_SIZE]; /* stack */
Reg gb[GETBUFSIZ+3]; /* input buffer */
Reg pb[PUTBUFSIZ+3]; /* output buffer */
Reg ip; /* instruction pointer */
I8s fl; /* flag */
} Cpu;
The four registers in the re[] array are referred to as follows:
AX = re[0], BX = re[1], CX = re[2], DX = re[3].
No Operations: 2
nop0
nop1
Memory Movement: 13
movdi (move from BX to ram [AX + CX])
movid (move from ram [BX + CX] to AX)
movii (move from ram [BX + CX] to ram [AX + CX])
pushax (push AX onto stack)
pushbx (push BX onto stack)
pushcx (push CX onto stack)
pushdx (push DX onto stack)
popax (pop from stack into AX)
popbx (pop from stack into BX)
popcx (pop from stack into CX)
popdx (pop from stack into DX)
put (write DX to output buffer, three modes:
#ifndef ICC: write DX to own output buffer
#ifdef ICC: write DX to input buffer of cell at address CX,
or, if template, write DX to input buffers of all creatures within
PutLimit who have the complementary get template)
get (read DX from input buffer)
Calculation: 7
inc (increment CX)
dec (decrement CX)
add (CX = CX + DX)
sub (CX = CX - DX)
zero (zero CX)
not0 (flip low order bit of CX)
shl (shift left all bits of CX)
Instruction Pointer Manipulation: 5
ifz (if CX == 0 execute next instruction, otherwise, skip it)
iffl (if flag == 1 execute next instruction, otherwise, skip it)
jmp (jump to template, or if no template jump to address in AX)
jmpb (jump back to template, or if no template jump back to address in AX)
call (push IP + 1 onto the stack; if template, jump to complementary templ)
Biological and Sensory: 5
adr (search outward for template, put address in AX, template size in DX,
and offset in CX, start search at offset +- CX)
adrb (search backward for template, put address in AX, template size in DX,
and offset in CX, start search at offset - CX)
adrf (search forward for template, put address in AX, template size in DX,
and offset in CX, start search at offset + CX)
mal (allocate amount of space specified in CX, prefer address at AX,
if AX < 0 use better fit, place address of allocated block in AX)
divide (cell division, the IP is offset by CX into the daughter cell, the
values in the four CPU registers are transferred from mother to
daughter, but not the stack. If !DX genome will be ejected from
the simulator)
Total: 32 instructions
The following array defines the associations of opcodes, mnemonics,
executable functions, and parsing functions (in that order) used to
implement instruction set #4.
{0x00, "nop0", nop, pnop},
{0x01, "nop1", nop, pnop},
{0x02, "movdi", movdi, pmovdi},
{0x03, "movid", movid, pmovid},
{0x04, "movii", movii, pmovii},
{0x05, "pushax", push, ppushax},
{0x06, "pushbx", push, ppushbx},
{0x07, "pushcx", push, ppushcx},
{0x08, "pushdx", push, ppushdx},
{0x09, "popax", pop, ppopax},
{0x0a, "popbx", pop, ppopbx},
{0x0b, "popcx", pop, ppopcx},
{0x0c, "popdx", pop, ppopdx},
{0x0d, "put", put, pput},
{0x0e, "get", get, pget},
{0x0f, "inc", math, pinc},
{0x10, "dec", math, pdec},
{0x11, "add", math, padd},
{0x12, "sub", math, psub},
{0x13, "zero", movdd, pzero},
{0x14, "shl", shl, pshl},
{0x15, "not0", not0, pnot0},
{0x16, "ifz", ifz, pifz},
{0x17, "iffl", ifz, piffl},
{0x18, "jmp", adr, ptjmp},
{0x19, "jmpb", adr, ptjmpb},
{0x1a, "call", tcall, ptcall},
{0x1b, "adr", adr, padr},
{0x1c, "adrb", adr, padrb},
{0x1d, "adrf", adr, padrf},
{0x1e, "mal", malchm, pmal},
{0x1f, "divide", divide, pdivide}
******************************************************************************
8.2) Details of Set 1
What follows is a more detailed documentation of instruction set 1:
In general, the ax and bx registers are used to hold addresses, which
refer to locations in the soup. Values contained in the ax and bx registers
are maintained as positive integers, modulus SoupSize. The cx and dx
registers are generally used to hold numbers, which may be either positive or
negative. Values contained in the cx and dx registers are forced to remain
in the range -SoupSize to SoupSize. Any operation which causes the values
in the cx or dx registers to stray out of this range will cause the
instruction to fail and a flag to be set (in this case, the value in the
cx or dx register is reset to zero).
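The two register conventions just described can be sketched as follows.
This is an illustration under stated assumptions, not the simulator's code:
the function names are hypothetical, Reg is a stand-in type, and the range
-SoupSize to SoupSize is assumed to include both end points.

```c
typedef long Reg;   /* assumption: stand-in for the simulator's Reg type */

/* ax, bx: keep an address as a non-negative value modulo soup_size */
Reg sk_wrap_addr(Reg v, Reg soup_size)
{   Reg m = v % soup_size;
    return m < 0 ? m + soup_size : m;
}

/* cx, dx: store v if it is in range, otherwise reset the register to
   zero.  The return value is the flag: 0 = success, 1 = failure. */
int sk_store_num(Reg *reg, Reg v, Reg soup_size)
{   if (v < -soup_size || v > soup_size) {
        *reg = 0;
        return 1;
    }
    *reg = v;
    return 0;
}
```

With a soup of 100 cells, sk_wrap_addr(-1, 100) yields 99, while an
attempt to store 101 in cx leaves cx at zero and returns a flag of 1.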
Many of the instructions will fail under certain conditions, which are
specified for each instruction in the following discussion. Any time an
instruction fails, it does nothing, other than to increment the instruction
pointer, decrement the instruction bank, and set the flag register (to the
value 1). Any time an instruction does not fail, it clears the flag register
(to the value 0). Every instruction decrements the instruction bank by one,
regardless of success or failure.
What follows is a list of the thirty-two instructions of INST == 1,
along with details of their behavior:
{0x00, "nop0", nop, pnop}, /* do nothing */
{0x01, "nop1", nop, pnop},
These two instructions are no-ops; they do nothing, other than to
increment the instruction pointer one place. There are no conditions under
which these instructions fail.
{0x02, "not0", not0, pnot0}, /* flip low order bit of cx */
This instruction flips the low order bit of the cx register. The only
condition under which it fails is if this operation causes the value in
the register to stray out of the range -SoupSize to SoupSize. In this
case cx is set to zero and the flag is set. In every case, the
instruction pointer is incremented by one.
{0x03, "shl", shl, pshl}, /* shift left all bits of cx */
This instruction shifts all bits of the cx register one position to
the left, replacing the rightmost (low order) bit with a zero (this is a
binary multiply by two). The only condition under which it fails is if this
operation causes the value in the register to stray out of the range -SoupSize
to SoupSize. In this case cx is set to zero and the flag is set.
In every case, the instruction pointer is incremented by one.
{0x04, "zero", movdd, pzero}, /* cx = 0 */
This instruction sets the cx register to zero. It also increments the
instruction pointer one place. There are no conditions under which this
instruction fails.
{0x05, "ifz", ifz, pifz}, /* execute next instruction only if cx == 0 */
This instruction will increment the instruction pointer by one if the
value in the cx register is zero, or by two if the cx register contains
any other value (this means that the instruction following this one will be
executed only if the value in the cx register is zero). There are no
conditions under which this instruction fails.
{0x06, "sub_ab", math, psub_ab}, /* cx = ax - bx */
This instruction subtracts the value in the bx register from the value
in the ax register, placing the result in the cx register. This instruction
can fail only if the resultant value in the cx register would be outside of
the range -SoupSize to SoupSize, in which case the value in the cx register
is set to zero, and a flag is set. In any case, the instruction pointer
is incremented by one.
{0x07, "sub_ac", math, psub_ac}, /* ax = ax - cx */
This instruction subtracts the value in the cx register from the value
in the ax register, placing the result in the ax register. It increments the
instruction pointer by one. There are no conditions under which this
instruction fails.
{0x08, "inc_a", math, pinc_a}, /* ax++ */
This instruction increments (adds one to) the value in the ax register,
placing the result in the ax register. It increments the instruction pointer
by one. There are no conditions under which this instruction fails.
{0x09, "inc_b", math, pinc_b}, /* bx++ */
This instruction increments (adds one to) the value in the bx register,
placing the result in the bx register. It increments the instruction pointer
by one. There are no conditions under which this instruction fails.
{0x0a, "dec_c", math, pdec_c}, /* cx-- */
This instruction decrements (subtracts one from) the value in the cx
register, placing the result in the cx register. It increments the
instruction pointer by one. This instruction can fail only if the resultant
value in the cx register would be outside of the range -SoupSize to SoupSize,
in which case the cx register is set to zero, and a flag is set.
{0x0b, "inc_c", math, pinc_c}, /* cx++ */
This instruction increments (adds one to) the value in the cx register,
placing the result in the cx register. It increments the instruction pointer
by one. This instruction can fail only if the resultant value in the cx
register would be outside of the range -SoupSize to SoupSize, in which case
the cx register is set to zero, and a flag is set.
{0x0c, "pushax", push, ppushax}, /* push ax onto stack */
This instruction causes the value in the ax register to be pushed onto
the stack, and the stack pointer to be incremented (modulus STACK_SIZE).
It increments the instruction pointer by one. There are no conditions under
which this instruction fails.
{0x0d, "pushbx", push, ppushbx}, /* push bx onto stack */
This instruction causes the value in the bx register to be pushed onto
the stack, and the stack pointer to be incremented (modulus STACK_SIZE).
It increments the instruction pointer by one. There are no conditions under
which this instruction fails.
{0x0e, "pushcx", push, ppushcx}, /* push cx onto stack */
This instruction causes the value in the cx register to be pushed onto
the stack, and the stack pointer to be incremented (modulus STACK_SIZE).
It increments the instruction pointer by one. There are no conditions under
which this instruction fails.
{0x0f, "pushdx", push, ppushdx}, /* push dx onto stack */
This instruction causes the value in the dx register to be pushed onto
the stack, and the stack pointer to be incremented (modulus STACK_SIZE).
It increments the instruction pointer by one. There are no conditions under
which this instruction fails.
{0x10, "popax", pop, ppopax}, /* pop ax off of stack */
This instruction causes the value at the top of the stack to be popped
into the ax register and the stack pointer to be decremented (modulus
STACK_SIZE). It increments the instruction pointer by one. There are no
conditions under which this instruction fails.
{0x11, "popbx", pop, ppopbx}, /* pop bx off of stack */
This instruction causes the value at the top of the stack to be popped
into the bx register and the stack pointer to be decremented (modulus
STACK_SIZE). It increments the instruction pointer by one. There are no
conditions under which this instruction fails.
{0x12, "popcx", pop, ppopcx}, /* pop cx off of stack */
This instruction causes the value at the top of the stack to be popped
into the cx register and the stack pointer to be decremented (modulus
STACK_SIZE). It increments the instruction pointer by one. This instruction
can fail only if the resultant value in the cx register would be outside of
the range -SoupSize to SoupSize, in which case the cx register is set to zero,
and a flag is set.
{0x13, "popdx", pop, ppopdx}, /* pop dx off of stack */
This instruction causes the value at the top of the stack to be popped
into the dx register and the stack pointer to be decremented (modulus
STACK_SIZE). It increments the instruction pointer by one. This instruction
can fail only if the resultant value in the dx register would be outside of
the range -SoupSize to SoupSize, in which case the dx register is set to zero,
and a flag is set.
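Because the stack pointer moves modulus STACK_SIZE, the stack behaves as a
ring: a push onto a full stack overwrites the oldest value rather than
failing. The sketch below shows one way this could work. The names are
hypothetical, and the order of operations (move the pointer, then store on
a push; read, then move the pointer on a pop) is an assumption, not taken
from the source code.

```c
#define STACK_SIZE 10

typedef long Reg;               /* assumption: stand-in for Reg */

typedef struct {
    Reg sp;                     /* stack pointer */
    Reg st[STACK_SIZE];         /* stack */
} SkStack;

void sk_push(SkStack *s, Reg v)
{   s->sp = (s->sp + 1) % STACK_SIZE;   /* wraps around, never fails */
    s->st[s->sp] = v;
}

Reg sk_pop(SkStack *s)
{   Reg v = s->st[s->sp];
    s->sp = (s->sp + STACK_SIZE - 1) % STACK_SIZE;
    return v;
}
```

If twelve values are pushed onto an empty SkStack, the first two are
overwritten, and the eleventh pop wraps around to the most recently pushed
value again.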
{0x14, "jmp", adr, ptjmp}, /* outward template jump */
This instruction causes the instruction pointer to be redirected to the
instruction following the nearest occurrence of the template complementary
to the template that follows the jmp instruction. The template is defined
as the group of consecutive nops (nop0 or nop1) that immediately follow the
jmp instruction. The size of the template is just the number of consecutive
nops that follow the jmp instruction. The size of the template is placed in
the dx register.
This instruction is a bi-directional jump, which means that it will
search both forward and backward in memory for the complementary template.
Let a equal the address of the first instruction of the template, and let
s equal the size of the template. The forward search starts at the address
a + s + 1, and the backward search starts at the address a - s - 1. The
search looks in the forward direction first, then alternately looks ahead
or back one additional step until the complementary template is found, or
the search limit (Search_limit) is reached. The template searches wrap around
the ends of the soup.
The instruction will fail if the size of the template following the jmp
instruction is less than MinTemplSize, or greater than SoupSize, or if a
complementary template is not found within the search radius, Search_limit.
On failure, the instruction pointer is moved past the template used by the
jmp instruction: ce->c.ip = ce->c.ip + s + 1.
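The outward search described above can be sketched as follows. This is a
simplified illustration, not the adr() function of the simulator: the
function names are hypothetical, the soup is a plain array of opcodes, the
search limit is counted in steps from the two starting addresses (an
assumption), and memory protection and flaws are ignored. The caller is
expected to pass a template made up only of nops.

```c
enum { NOP0 = 0x00, NOP1 = 0x01 };

static long sk_mod(long v, long n)      /* wrap around the ends of the soup */
{   long m = v % n;
    return m < 0 ? m + n : m;
}

/* does the complement of the template (address a, size s) begin at pos? */
static int sk_match(const unsigned char *soup, long n,
                    long a, long s, long pos)
{   long i;
    for (i = 0; i < s; i++) {
        unsigned char t = soup[sk_mod(a + i, n)];
        unsigned char c = soup[sk_mod(pos + i, n)];
        if (c > NOP1 || c == t)         /* must be the opposite nop */
            return 0;
    }
    return 1;
}

/* returns the address of the instruction following the nearest
   complementary template, or -1 if none is found within limit steps */
long sk_search_outward(const unsigned char *soup, long n,
                       long a, long s, long limit)
{   long d;
    for (d = 0; d <= limit; d++) {
        long f = sk_mod(a + s + 1 + d, n);      /* forward is tried first */
        long b = sk_mod(a - s - 1 - d, n);
        if (sk_match(soup, n, a, s, f)) return sk_mod(f + s, n);
        if (sk_match(soup, n, a, s, b)) return sk_mod(b + s, n);
    }
    return -1;
}
```

A return value of -1 corresponds to the failure case, in which the
instruction pointer would simply be moved past the template.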
{0x15, "jmpb", adr, ptjmpb}, /* backward template jump */
This instruction causes the instruction pointer to be redirected to the
instruction following the nearest occurrence (backwards in the soup) of the
template complementary to the template that follows the jmpb instruction. The
template is defined as the group of consecutive nops (nop0 or nop1) that
immediately follow the jmpb instruction. The size of the template is just the
number of consecutive nops that follow the jmpb instruction. The size of the
template is placed in the dx register.
This instruction is a uni-directional jump, which means that it will
search in only one direction (in this case, only backward) in memory for the
complementary template. Let a equal the address of the first instruction of
the template, and let s equal the size of the template. The backward search
starts at the address a - s - 1. The search looks in successive steps
backwards until the complementary template is found, or the search limit
(Search_limit) is reached. The template searches wrap around the ends of the
soup.
The instruction will fail if the size of the template following the jmpb
instruction is less than MinTemplSize, or greater than SoupSize, or if a
complementary template is not found within the search radius, Search_limit.
On failure, the instruction pointer is moved past the template used by the
jmpb instruction: ce->c.ip = ce->c.ip + s + 1.
{0x16, "call", tcall, ptcall}, /* push ip to stack, outward template jump */
This instruction behaves identically to the jmp instruction, with the
one difference that the address following the template following the call
instruction is also pushed onto the stack (this is essentially the address
of the instruction pointer plus one meaningful instruction). On failure, the
instruction pointer is moved past the template used by the call instruction:
ce->c.ip = ce->c.ip + s + 1, and the ip will not be pushed onto the stack.
If the call instruction is not followed by a template, it will push the
address of the instruction pointer + 1 onto the stack.
{0x17, "ret", pop, pret}, /* pop ip from stack */
This instruction causes the value at the top of the stack to be popped
into the instruction pointer and the stack pointer to be decremented (modulus
STACK_SIZE). There are no conditions under which this instruction fails.
{0x18, "movcd", movdd, pmov_dc}, /* dx = cx */
This instruction copies the contents of the cx register into the dx
register, leaving the value in cx intact. The flaw may cause the source
register to actually be bx or dx and the destination register to actually
be cx or ax. This instruction could fail if the value placed in the dx
register would be outside of the range -SoupSize to SoupSize, in which case
the dx register is set to zero, and a flag is set.
{0x19, "movab", movdd, pmov_ba}, /* bx = ax */
This instruction copies the contents of the ax register into the bx
register, leaving the value in ax intact. The flaw may cause the source
register to actually be bx or dx and the destination register to actually
be cx or ax. This instruction could fail if the value placed in the cx
register would be outside of the range -SoupSize to SoupSize, in which case
the cx register is set to zero, and a flag is set.
{0x1a, "movii", movii, pmovii},
This instruction copies one instruction in the soup to another location
in the soup. The source instruction is at the address contained in the
bx register, and the destination is the address contained in the ax register.
This instruction could fail under the following circumstances: a) if the
source and destination addresses are the same, b) if the destination address
is not owned by this creature and is write protected, c) if the source
address is not owned by this creature and is read protected, d) if either the
source or destination addresses are outside the soup.
Also, if the destination address is in the daughter, it will perform the
following operation: ce->d.mov_daught++. If the destination address is not
in the daughter cell, it will make a call to MutBookeep(is.dval), since this
could conceivably make a genetic change in another creature (if the other
creature were not write protected, or if the ``other'' creature were actually
the one doing the writing).
{0x1b, "adr", adr, padr}, /* search outward for template, return adr in ax */
This instruction causes the ax register to contain the address of the
instruction following the nearest occurrence of the template complementary
to the template that follows the adr instruction. The template is defined
as the group of consecutive nops (nop0 or nop1) that immediately follow the
adr instruction. The size of the template is just the number of consecutive
nops that follow the adr instruction. The size of the template is placed in
the cx register.
This instruction is a bi-directional search, which means that it will
search both forward and backward in memory for the complementary template.
Let a equal the address of the first instruction of the template, and let
s equal the size of the template. The forward search starts at the address
a + s + 1, and the backward search starts at the address a - s - 1. The
search looks in the forward direction first, then alternately looks ahead
or back one additional step until the complementary template is found, or
the search limit (Search_limit) is reached. The template searches wrap around
the ends of the soup. The instruction pointer is moved past the template used
by the adr instruction: ce->c.ip = ce->c.ip + s + 1.
The instruction will fail if the size of the template following the adr
instruction is less than MinTemplSize, or greater than SoupSize, or if a
complementary template is not found within the search radius, Search_limit.
On failure the cx register is not altered.
{0x1c, "adrb", adr, padrb}, /* search backward for template, rtrn adr in ax */
This instruction causes the ax register to contain the address of the
instruction following the nearest occurrence (backwards in the soup) of the
template complementary to the template that follows the adrb instruction.
The template is defined as the group of consecutive nops (nop0 or nop1) that
immediately follow the adrb instruction. The size of the template is just the
number of consecutive nops that follow the adrb instruction. The size of the
template is placed in the cx register.
This instruction is a uni-directional search, which means that it will
search in only one direction (in this case, only backward) in memory for the
complementary template. Let a equal the address of the first instruction of
the template, and let s equal the size of the template. The backward search
starts at the address a - s - 1. The search looks in successive steps
backwards until the complementary template is found, or the search limit
(Search_limit) is reached. The template searches wrap around the ends of the
soup. The instruction pointer is moved past the template used by the adrb
instruction: ce->c.ip = ce->c.ip + s + 1.
The instruction will fail if the size of the template following the adrb
instruction is less than MinTemplSize, or greater than SoupSize, or if a
complementary template is not found within the search radius, Search_limit.
On failure the cx register is not altered.
{0x1d, "adrf", adr, padrf}, /* search forward for template, rtrn adr in ax */
This instruction causes the ax register to contain the address of the
instruction following the nearest occurrence (forwards in the soup) of the
template complementary to the template that follows the adrf instruction.
The template is defined as the group of consecutive nops (nop0 or nop1) that
immediately follow the adrf instruction. The size of the template is just the
number of consecutive nops that follow the adrf instruction. The size of the
template is placed in the cx register.
This instruction is a uni-directional search, which means that it will
search in only one direction (in this case, only forward) in memory for the
complementary template. Let a equal the address of the first instruction of
the template, and let s equal the size of the template. The forward search
starts at the address a + s + 1. The search looks in successive steps
forwards until the complementary template is found, or the search limit
(Search_limit) is reached. The template searches wrap around the ends of the
soup. The instruction pointer is moved past the template used by the adrf
instruction: ce->c.ip = ce->c.ip + s + 1.
The instruction will fail if the size of the template following the adrf
instruction is less than MinTemplSize, or greater than SoupSize, or if a
complementary template is not found within the search radius, Search_limit.
On failure the cx register is not altered.
{0x1e, "mal", malchm, pmal}, /* allocate & chmod space for a new cell */
This instruction requests memory space in the soup from the operating
system. The amount of space requested is specified by the value in the cx
register. The address of the allocated block is returned in the ax register.
The allocated block is chmoded (its memory is protected) according to the
condition specified by the soup_in variable MemModeProt, which has a default
value of 2 (write privilege protected, but read and execute privileges open).
In any case the owner of the space retains all privileges. This instruction
will increment the instruction pointer by one. If this instruction is
executed successfully, this creature will move down the reaper queue one
position.
The operating system searches for a block of free memory of the size
requested, and will invoke the reaper if such a block is not available.
The block of memory may be located on a first fit basis by beginning the
search at the beginning of memory, or using a better fit method, or the
block may be located at a preferred location to within a tolerance (see
the MalMode, MalReapTol, and MalTol variables described in the section on
SOUP_IN PARAMETERS). Note that the execution of this instruction usually
causes the reaper to kill in order to free space to meet the request.
However, the reaper will never kill the creature making the request.
The allocated block may not wrap around the end of the soup.
This instruction can fail under the following circumstances: a) if the
amount of memory requested is less than or equal to zero (note that a request
of 1 may be converted into a request of 0 by a flaw), b) if the creature
already owns a second block of memory whose size is exactly the amount of
memory requested (in other words, if the creature makes two successive calls
to mal(), requesting the same amount of memory, without an intervening call
to divide()), c) if the amount of memory requested is more than MaxMalMult
times the size of the creature making the request (the default value of
MaxMalMult is 3).
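The three failure conditions of mal can be written out as a single test.
This is only a sketch: the SkCell structure and its field names are
hypothetical, MaxMalMult is passed in as a double (its actual type is an
assumption), and the effect of flaws on the requested size is ignored.

```c
typedef struct {
    long size;          /* size of the creature making the request */
    long daught_size;   /* size of a second block already owned, 0 if none */
} SkCell;

/* returns 1 if the request would fail, 0 otherwise */
int sk_mal_would_fail(const SkCell *c, long request, double max_mal_mult)
{   if (request <= 0)                                   /* condition a */
        return 1;
    if (c->daught_size == request)                      /* condition b */
        return 1;
    if ((double) request > max_mal_mult * (double) c->size)
        return 1;                                       /* condition c */
    return 0;
}
```

For a creature of size 80 with MaxMalMult at 3, a request of 240 passes
this test and a request of 241 does not.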
{0x1f, "divide", divide, pdivide} /* give life to new cell, put in queues */
This instruction causes the mother cell to lose her write privileges
on the space of the daughter cell, and causes the daughter cell to be
entered into both the reaper and slicer queues. The daughter enters the
bottom of the reaper queue, and enters the slicer queue just behind the mother
(so the daughter will be the last cell to be reaped and to get another slice).
This instruction will increment the instruction pointer by one. If this
instruction is executed successfully, this creature will move down the reaper
queue one position.
This instruction can fail under the following circumstances: a) if the
size of the daughter cell is less than soup_in variable MinCellSize (default
value of 8), b) if the number of instructions written into the space of the
daughter cell by the mother cell (ce->d.mov_daught) is less than the soup_in
variable MovPropThrDiv times the size of the daughter cell, c) if the soup_in
variable DivSameSize is non-zero, and the daughter cell is not the same
size as the mother cell, d) if the soup_in variable DivSameGen is non-zero
and the daughter cell is not the same size, or does not have exactly the same
sequence as the mother cell.
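The four failure conditions of divide can likewise be sketched as a single
test. The structures and field names below are hypothetical, chosen only to
mirror the soup_in variables named above; the genomes are represented as
plain arrays of opcodes, and MovPropThrDiv is assumed to be a floating
point value.

```c
#include <string.h>

typedef struct {                /* hypothetical holder for soup_in values */
    long   MinCellSize;
    double MovPropThrDiv;
    int    DivSameSize;
    int    DivSameGen;
} SkParams;

typedef struct {
    long mother_size;
    long daught_size;
    long mov_daught;            /* instructions written into the daughter */
    const unsigned char *mother_code;
    const unsigned char *daught_code;
} SkPair;

/* returns 1 if divide would fail, 0 otherwise */
int sk_divide_would_fail(const SkParams *p, const SkPair *c)
{   if (c->daught_size < p->MinCellSize)                        /* a */
        return 1;
    if ((double) c->mov_daught <
        p->MovPropThrDiv * (double) c->daught_size)             /* b */
        return 1;
    if (p->DivSameSize && c->daught_size != c->mother_size)     /* c */
        return 1;
    if (p->DivSameGen &&
        (c->daught_size != c->mother_size ||
         memcmp(c->mother_code, c->daught_code,
                (size_t) c->mother_size) != 0))                 /* d */
        return 1;
    return 0;
}
```

The sizes are compared before the sequences in condition d, so memcmp() is
only ever called on two genomes of equal length.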
8.3) New Features in Sets 2 through 4
Instruction sets two through four are essentially identical except for
the methods used to move data between the registers of the cpu. Set two
allows the order of the registers to be freely rearranged, and provides
an instruction, movdd(), to move data directly between two registers. Set
three uses the reverse Polish notation method, in which the registers can
be rolled up or down, and the values in the lower two registers can be
exchanged. Set four requires all inter-register moves to pass through the
stack.
Each of the new sets includes a movdi() instruction to move data from
a register into the soup, and a movid() instruction to move data from the soup
into a register.
Each new set provides I/O functions in the form of put() and get()
instructions which write to or read from buffers. The get() instruction
reads a value from the input buffer and moves it into a cpu register. The
input buffer maintains a pointer to the next input value to be read from the
buffer, a pointer to the next input value to be written to the buffer,
and a count of unread input values:
ce->c.gb[GETBUFSIZ] == pointer to next input value to be read
ce->c.gb[GETBUFSIZ + 1] == pointer to next input value to be written
ce->c.gb[GETBUFSIZ + 2] == number of unread input values
The output buffer maintains a pointer to the next output value to be
written to the buffer, a pointer to the next output value to be read from the
buffer, and a count of unread output values:
ce->c.pb[PUTBUFSIZ] == pointer to next output value to be written
ce->c.pb[PUTBUFSIZ + 1] == pointer to next output value to be read
ce->c.pb[PUTBUFSIZ + 2] == number of unread output values
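The three bookkeeping slots at the end of each buffer make it a small ring
buffer. The sketch below shows how the input buffer slots could be used.
It is not the code in instruct.c: the function names are hypothetical, Reg
is a stand-in type, and the choice to refuse a write into a full buffer
(rather than overwrite unread values) is an assumption.

```c
#define GETBUFSIZ 3

typedef long Reg;               /* assumption: stand-in for Reg */

/* gb[GETBUFSIZ]     == pointer to next input value to be read
   gb[GETBUFSIZ + 1] == pointer to next input value to be written
   gb[GETBUFSIZ + 2] == number of unread input values */

/* place value in the input buffer; returns 1 on success, 0 if full */
int sk_write2get(Reg gb[GETBUFSIZ + 3], Reg value)
{   if (gb[GETBUFSIZ + 2] >= GETBUFSIZ)
        return 0;
    gb[gb[GETBUFSIZ + 1]] = value;
    gb[GETBUFSIZ + 1] = (gb[GETBUFSIZ + 1] + 1) % GETBUFSIZ;
    gb[GETBUFSIZ + 2]++;
    return 1;
}

/* read the next unread value into *dest; returns 1 on success, 0 if empty */
int sk_get(Reg gb[GETBUFSIZ + 3], Reg *dest)
{   if (gb[GETBUFSIZ + 2] <= 0)
        return 0;
    *dest = gb[gb[GETBUFSIZ]];
    gb[GETBUFSIZ] = (gb[GETBUFSIZ] + 1) % GETBUFSIZ;
    gb[GETBUFSIZ + 2]--;
    return 1;
}
```

Values come out in the order in which they were written. The output buffer
could be handled the same way, with the read and write slots exchanged as
shown in the pb[] layout.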
The put() instruction operates in two modes. If ICC is defined
(inter-cellular communication), put() writes to the input buffer(s) of other
cell(s). In this case, if the put() instruction is followed by a template,
put() will write to the input buffer(s) of any cells within the put radius,
PutLimit, which have a get instruction followed by a complementary template
(this is an analog to hormonal communication). If put() is not followed by
a template, then it writes to the input buffer of the cell located at the
address specified in the relevant register of the cell executing the put().
If ICC is not defined, put() writes to its own output buffer. This
mode is provided so that the user can communicate with the cells. This could
be used for example to get them to do useful work by giving them data by
writing to their input buffer, letting them get the data and compute on it,
then letting them write their results back to you in the output buffer (the
creatures can also use this facility to pray). In order to assist with
communication between the user and the creatures, three additional functions
are provided in the instruct.c module:
I8s ReadFPut(ce, value) /* for god to read the data in the output buffer */
void Write2Get(ce, value) /* place value in input buffer of cell ce */
void Broad2Get(value) /* broadcast value to input buffer of all cells */
Each new instruction set includes the basic calculations: inc(), dec(),
add() and sub(). The zero() instruction sets a register to zero. The not0()
instruction flips the low order bit of a register, and the not() instruction
flips all bits of a register. The shl() instruction shifts all bits of a
register one bit to the left, placing a zero in the low order bit (multiply
by two). Set three also provides rand() which places a random number in
the AX register.
In addition to ifz() which tests if a register is zero, the new sets
provide iffl() which tests if a flag is set. This latter can be used to
test if an instruction has failed.
In sets two through four, the jmp() and jmpb() instructions will
jump to a complementary template, if the jump instruction is followed by
a template, but if there is no template, they will jump to the address
specified in a register. Similarly, the call instruction will push the
instruction pointer + 1 onto the stack, and if a template is present, will
jump to the nearest complementary template.
In sets two through four, the address instructions adr(), adrb() and
adrf() have two new features: 1) they return the offset distance from the
source template to where they found the target template. 2) they start the
search at an offset from the source template, specified in a register. This
provides the facility to repeatedly call the address instructions in order
to find target templates beyond the first matching one.
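The use of the returned offset to step from one matching template to the
next can be sketched as follows. This is a forward-only illustration with
hypothetical names. It assumes that offsets are measured from the first
instruction of the source template and that the search limit is counted in
the same units; neither detail is taken from the source code.

```c
enum { NOP0 = 0x00, NOP1 = 0x01 };

static long sk_mod(long v, long n)      /* wrap around the ends of the soup */
{   long m = v % n;
    return m < 0 ? m + n : m;
}

/* does the complement of the template (address a, size s) begin at pos? */
static int sk_is_complement(const unsigned char *soup, long n,
                            long a, long s, long pos)
{   long i;
    for (i = 0; i < s; i++) {
        unsigned char t = soup[sk_mod(a + i, n)];
        unsigned char c = soup[sk_mod(pos + i, n)];
        if (c > NOP1 || c == t)
            return 0;
    }
    return 1;
}

/* forward search beginning at offset start; returns the offset of the
   target template from a, or -1 if none is found up to offset limit */
long sk_adrf_from(const unsigned char *soup, long n,
                  long a, long s, long start, long limit)
{   long off;
    for (off = start; off <= limit; off++)
        if (sk_is_complement(soup, n, a, s, sk_mod(a + off, n)))
            return off;
    return -1;
}

/* repeated calls: count every target template within limit */
int sk_count_targets(const unsigned char *soup, long n,
                     long a, long s, long limit)
{   int  count = 0;
    long off = s + 1;
    while ((off = sk_adrf_from(soup, n, a, s, off, limit)) >= 0) {
        count++;
        off++;          /* resume one cell past the previous hit */
    }
    return count;
}
```

Each call resumes just beyond the offset returned by the previous one, so
successive calls step through the matching templates in order of distance.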
In sets two through four, the mal() instruction has the option of
specifying the preferred address in a register, or of using a better fit
mode.
In sets two through four, the divide instruction causes the contents
of the mother's cpu registers to be transferred to the daughter's cpu
registers. Also, the instruction pointer of the daughter will start with
an offset into the daughter cell that is specified in a register of the
mother's cpu. This allows the mother to force the daughter to express
a different sub-set of the genome from that which was expressed by the
mother (the mother can force the daughter cell to differentiate). If a
certain register of the mother is not zero, the genome of the daughter cell
will be ejected from the soup. This makes it possible for the daughter cell
to emigrate to another soup, if there is one. When ejected from the soup,
the space occupied by the daughter cell is freed, and the code is set to
zero in that space.
9) THE ANCESTOR & WRITING A CREATURE
9.1) The Ancestor
The ASCII assembler code file with comments, for the ancestor, is listed
below. Below the listing I have some explanatory material.
**** begin genome file (note blank line at head of file)
format: 2 bits: 2156009669 EXsh TCsh TPs MFs MTd MBh
genotype: 0080aaa parent genotype: 0666god
1st_daughter: flags: 0 inst: 827 mov_daught: 80 breed_true: 1
2nd_daughter: flags: 0 inst: 809 mov_daught: 80 breed_true: 1
Origin: InstExe: 0,0 clock: 0 Thu Jan 01 -5:00:00 1970
MaxPropPop: 0.8306 MaxPropInst: 0.4239 mpp_time: 0,0
ploidy: 1 track: 0
track 0: prot
xwr
nop1 ; 010 110 01 0 beginning marker
nop1 ; 010 110 01 1 beginning marker
nop1 ; 010 110 01 2 beginning marker
nop1 ; 010 110 01 3 beginning marker
zero ; 010 110 04 4 put zero in cx
not0 ; 010 110 02 5 put 1 in first bit of cx
shl ; 010 110 03 6 shift left cx (cx = 2)
shl ; 010 110 03 7 shift left cx (cx = 4)
movcd ; 010 110 18 8 move cx to dx (dx = 4)
adrb ; 010 110 1c 9 get (backward) address of beginning marker -> ax
nop0 ; 010 100 00 10 complement to beginning marker
nop0 ; 010 100 00 11 complement to beginning marker
nop0 ; 010 100 00 12 complement to beginning marker
nop0 ; 010 100 00 13 complement to beginning marker
sub_ac ; 010 110 07 14 subtract cx from ax, result in ax
movab ; 010 110 19 15 move ax to bx, bx now contains start address of mother
adrf ; 010 110 1d 16 get (forward) address of end marker -> ax
nop0 ; 010 100 00 17 complement to end marker
nop0 ; 010 100 00 18 complement to end marker
nop0 ; 010 100 00 19 complement to end marker
nop1 ; 010 100 01 20 complement to end marker
inc_a ; 010 110 08 21 increment ax, to include dummy instruction at end
sub_ab ; 010 110 06 22 subtract bx from ax to get size, result in cx
nop1 ; 010 110 01 23 reproduction loop marker
nop1 ; 010 110 01 24 reproduction loop marker
nop0 ; 010 110 00 25 reproduction loop marker
nop1 ; 010 110 01 26 reproduction loop marker
mal ; 010 110 1e 27 allocate space (cx) for daughter, address to ax
call ; 010 110 16 28 call template below (copy procedure)
nop0 ; 010 100 00 29 copy procedure complement
nop0 ; 010 100 00 30 copy procedure complement
nop1 ; 010 100 01 31 copy procedure complement
nop1 ; 010 100 01 32 copy procedure complement
divide ; 010 110 1f 33 create independent daughter cell
jmp ; 010 110 14 34 jump to template below (reproduction loop)
nop0 ; 010 100 00 35 reproduction loop complement
nop0 ; 010 100 00 36 reproduction loop complement
nop1 ; 010 100 01 37 reproduction loop complement
nop0 ; 010 100 00 38 reproduction loop complement
ifz ; 010 000 05 39 dummy instruction to separate templates
nop1 ; 010 110 01 40 copy procedure template
nop1 ; 010 110 01 41 copy procedure template
nop0 ; 010 110 00 42 copy procedure template
nop0 ; 010 110 00 43 copy procedure template
pushax ; 010 110 0c 44 push ax onto stack
pushbx ; 010 110 0d 45 push bx onto stack
pushcx ; 010 110 0e 46 push cx onto stack
nop1 ; 010 110 01 47 copy loop template
nop0 ; 010 110 00 48 copy loop template
nop1 ; 010 110 01 49 copy loop template
nop0 ; 010 110 00 50 copy loop template
moviab ; 010 110 1a 51 move contents of [bx] to [ax] (copy one instruction)
dec_c ; 010 110 0a 52 decrement cx (size)
ifz ; 010 110 05 53 if cx == 0 perform next instruction, otherwise skip it
jmp ; 010 110 14 54 jump to template below (copy procedure exit)
nop0 ; 010 110 00 55 copy procedure exit complement
nop1 ; 010 110 01 56 copy procedure exit complement
nop0 ; 010 110 00 57 copy procedure exit complement
nop0 ; 010 110 00 58 copy procedure exit complement
inc_a ; 010 110 08 59 increment ax (address in daughter to copy to)
inc_b ; 010 110 09 60 increment bx (address in mother to copy from)
jmp ; 010 110 14 61 bidirectional jump to template below (copy loop)
nop0 ; 010 100 00 62 copy loop complement
nop1 ; 010 100 01 63 copy loop complement
nop0 ; 010 100 00 64 copy loop complement
nop1 ; 010 100 01 65 copy loop complement
ifz ; 010 000 05 66 this is a dummy instruction to separate templates
nop1 ; 010 110 01 67 copy procedure exit template
nop0 ; 010 110 00 68 copy procedure exit template
nop1 ; 010 110 01 69 copy procedure exit template
nop1 ; 010 110 01 70 copy procedure exit template
popcx ; 010 110 12 71 pop cx off stack (size)
popbx ; 010 110 11 72 pop bx off stack (start address of mother)
popax ; 010 110 10 73 pop ax off stack (start address of daughter)
ret ; 010 110 17 74 return from copy procedure
nop1 ; 010 100 01 75 end template
nop1 ; 010 100 01 76 end template
nop1 ; 010 100 01 77 end template
nop0 ; 010 100 00 78 end template
ifz ; 010 000 05 79 dummy instruction to separate creature
**** end genome file
Each genome file begins with some header information. Let me explain
each item:
format: 2 because we occasionally change the format of the genome files,
this parameter is included for backwards compatibility. It is used by the
assembler/disassembler to know how to read and write the files. This variable
is now defunct.
bits: 2156009669 this is the bit field associated with each genome in
the genebank. If the genebanker is on and if any of the parameters: WatchExe,
WatchMov, or WatchTem are set to a non-zero value, then bits in this field
will be set to characterize the ecological characteristics of the genotype.
The definitions of the bits in the field are given in the tierra.h module,
and above in the description of the soup_in parameters. For more specific
details, follow the Watch variables in the source modules to see exactly what
they are doing.
EXsh TCsh TPs MFs MTd MBh this is an ASCII summary of the meaning of
the bits that are set in the bit field. The meanings of these abbreviations
are given in the tierra.h file and above in the description of the soup_in
parameters.
genotype: 0080aaa This is the name of this genotype. The name has two
parts. The first part is numeric and must be equal to the size of the cell
of this creature (the size of its allocated block of memory). The cell size
usually, but not always, corresponds to the size of the genome. The second
part is a unique (and arbitrary) three letter code to distinguish this
particular genotype from others of the same size.
parent genotype: 0666god This is the name of the genotype of the
immediate ancestor of this genotype. The immediate ancestor is the creature
whose cpu gave rise to the first individual of this genotype. The original
creature, 0080aaa was created by god and the devil. This information is
deficient in three respects. First, the creature whose cpu created the
offspring is not necessarily the genetic parent of the creature.
Hyper-parasites for example, force other creatures to replicate their genomes.
Second, the immediate ancestor of a creature may have gone extinct before it
crossed the threshold, so its genotype may not appear in the genebank.
Third, in the case where the immediate ancestor went extinct without being
saved in the genebank, its name becomes available for reuse. This means
that even if you find another creature with the right name in the genebank,
there is no certainty that it is actually the ancestor you are looking for.
In short, this information is essentially useless. This is the problem
that is most actively being worked on at the moment. A seamless phylogeny
tracker is under development.
1st_daughter: flags: 0 inst: 827 mov_daught: 80 breed_true: 1
This is a set of metabolic data about what transpired during the production
of the first daughter by this genotype. flags: 0 This tells us how many
errors (flags) were generated during the first reproduction. The generation
of errors indicates invalid execution of instructions and causes the creature
to move up the reaper queue, closer to death. inst: 827 This tells us how
many instructions were executed during the first reproduction, this is an
indication of metabolic costs and efficiency. mov_daught: 80 This tells us
how many instructions were copied from the mother to the daughter during the
first reproduction. breed_true: 1 This tells us if the first daughter ever
has the same genotype as the mother.
2nd_daughter: flags: 0 inst: 809 mov_daught: 80 breed_true: 1
This is a set of metabolic data about what transpired during the production
of the second daughter by this genotype. The data are the same as those
from the first daughter. The second daughter and those that follow generally
have the same metabolic data, but they also generally differ from the first
daughter, because the second time through, the parent often does not examine
itself again, and it does not start the algorithm from the same place.
Origin: InstExe: 0,0 clock: 0 Thu Jan 01 -5:00:00 1970
InstExe: 0,0 At the time this genotype first appeared, the system had
executed this many instructions. The number to the left of the comma
is the number of millions of instructions, and the number to the right of the
comma is the remainder. clock: 0 This is the system clock time at the first
origin of this genotype. Thu Jan 01 -5:00:00 1970 This is the same system
clock time, written out as a date.
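The "millions,remainder" notation used for InstExe (and for mpp_time) can be
turned into a single instruction count with a little arithmetic. The helper
below is a hypothetical illustration of that arithmetic, not a function from
the Tierra source.

```c
#include <assert.h>

/* Hypothetical helper: combine "millions,remainder" into one count. */
unsigned long long inst_total(unsigned long millions,
                              unsigned long remainder)
{
    /* the remainder is what is left over after the whole millions */
    assert(remainder < 1000000UL);
    return (unsigned long long)millions * 1000000ULL + remainder;
}
```

For example, an entry of 3,250000 stands for 3250000 executed instructions.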
MaxPropPop: 0.8306 MaxPropInst: 0.4239 mpp_time: 0,0
MaxPropPop: 0.8306 The maximum proportion of the population of adult
cells in the soup, attained by this genotype. MaxPropInst: 0.4239 The
maximum proportion of space in the soup attained by adults of this genotype.
mpp_time: 0,0 The time at which this genotype achieved its maximum proportion
of the population of cells (MaxPropPop). Time here is measured in millions
of instructions (to the left of the comma) and the remainder (to the right
of the comma).
ploidy: 1 track: 0
ploidy: 1 The ploidy level of this genotype (i.e., this genotype
is haploid). track: 0 Which copy of the genome will start executing at
birth. This is only used when the ploidy level is greater than one
(i.e., diploid).
track 0: prot
xwr
nop1 ; 010 110 01 0 beginning marker
track 0: prot This tells us that the assembler code that follows is
track 0 (the first track). If the genotype has a ploidy of 2, a second
assembler listing will follow, and it will be labeled track 1. The word prot
refers to the protection bits: xwr, or x = execute, w = write, r = read.
nop1 ; 010 110 01 0 beginning marker
This is the first line of the actual genome. The first word, nop1 is
the assembler mnemonic for one of the two no-operation instructions. The
semicolon indicates the beginning of comments.
The digits 010 tell us what protection this instruction will have at
birth. Only the write bit is set, so this instruction will be write
protected, but open to reading or execution at birth.
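As a sketch of how such a field could be decoded, the C function below turns
the three digits into a bit mask. It assumes that the digits line up with
the x, w, r columns of the header line, and it borrows the bit values that
section 11 gives for MemModeProt (1 = execute, 2 = write, 4 = read). The
function name and the constants PROT_X, PROT_W and PROT_R are invented for
the example.

```c
#include <assert.h>
#include <string.h>

enum { PROT_X = 1, PROT_W = 2, PROT_R = 4 };

/* Convert a three-digit protection field such as "010" (columns x, w, r)
 * into a bit mask.  Returns -1 if the field is malformed. */
int prot_mask(const char *field)
{
    const int bit[3] = { PROT_X, PROT_W, PROT_R };
    int mask = 0;

    assert(field != NULL);
    if (strlen(field) != 3)
        return -1;
    for (int i = 0; i < 3; i++) {
        if (field[i] == '1')
            mask |= bit[i];
        else if (field[i] != '0')
            return -1;
    }
    return mask;
}
```

Under these assumptions the field 010 decodes to a mask with only the write
bit set.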
The digits 110 are a record of whether this instruction was executed by
this creature's own CPU (first digit) and by the CPUs of other creatures
(second digit); the third digit is not used at present. These bits are set
when the WatchExe parameter is set. That the first two digits are set to one
indicates that this instruction was executed both by its own CPU and by the
CPU of another creature (perhaps a parasite, or a lost instruction pointer).
The digits 01 are the actual hexadecimal op code of the instruction. It
is this value that will actually be stored in the soup.
The digit 0 just before the words ``beginning marker'' is a count of
the Nth instruction in the genome. This is the first instruction, so it is
numbered zero.
The words ``beginning marker'' are a comment describing the intended
purpose of this instruction.
If you study the code of the ancestor, you may be perplexed by the
reason for including the following instructions:
zero ; 010 110 04 4 put zero in cx
not0 ; 010 110 02 5 put 1 in first bit of cx
shl ; 010 110 03 6 shift left cx (cx = 2)
shl ; 010 110 03 7 shift left cx (cx = 4)
movcd ; 010 110 18 8 move cx to dx (dx = 4)
In the original version of the simulator, the size of the templates
was determined by the value in the dx register. These five instructions
loaded the dx register with the value 4, which is the size of the templates
in this creature. Later, it was decided that this was a stupid way to
determine template sizes. Now the parser just looks to see how many nops
follow any instruction using them, and the number of consecutive nops
determines the template size. Therefore, these five instructions don't do any
useful work in the present model, but they have been left in place because
the code still works.
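The nop counting rule can be sketched in a few lines of C. The function
below is an illustration only, not the routine in parse.c; its name is
hypothetical, it treats the code as a plain array that does not wrap around,
and it does not impose any upper limit on the template size.

```c
#include <assert.h>
#include <stddef.h>

enum { NOP0 = 0x00, NOP1 = 0x01 };   /* opcodes from the ancestor listing */

/* Count the consecutive nops that immediately follow the instruction
 * at position ip. */
size_t template_size(const unsigned char *code, size_t len, size_t ip)
{
    size_t n = 0;

    assert(ip < len);
    while (ip + 1 + n < len && code[ip + 1 + n] <= NOP1)
        n++;
    return n;
}
```

Applied to the adrb at position 9 of the ancestor, which is followed by four
nop0 instructions and then sub_ac, this count is 4.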
9.2) Writing a Creature
If you write your own creature, you must obey the following conventions:
**** begin genome file (note blank line at top of file)
format: 2 bits: 3
genotype: 0080aaa parent genotype: 0666god
track 0: prot
xwr
nop1 ; 010
nop1 ; 010
**** end genome file
Yank the above lines into the file you are going to write, to use as
a template. You must have the following:
1) a blank line at the top of the file.
2) a line declaring the format and bits, just use the line given.
3) a line stating the genome size and three letter name, and that of
the parent genotype. The genome size must match the actual number
of instructions in the genome. The three letter name is arbitrary,
you can make up any name, but I advise using a low letter name like
aaa because these names are used in a base 26 numbering system by
the genebanker, and the genebanker must allocate an array as big
as the largest of these numbers. You may make up the parent genotype
size and name, it won't be used for anything, so its details don't
matter, but it should have the format of four numeric digits followed
by three letters.
4) a blank line
5) the line: track 0: prot, just use the line provided
6) the line: xwr, just use the line provided
7) the listing of assembler mnemonics, followed by a semicolon and a
three digit code indicating the protection at birth. I recommend that
you use the protection indicated. The listing of the 32 assembler
mnemonics can be found at the end of the soup_in.h file or in the
opcode.map file. For a description of what they actually do, study
the comments on the code of the ancestor listed above, and study the
corresponding parser and execute functions in the two modules in
parse.c and instruct.c.
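To see why a low letter name keeps the genebanker's array small, here is a
hedged C sketch that splits a name such as 0080aaa into its size and a base
26 number for the label. The mapping (a = 0, leftmost letter most
significant) and the function name are assumptions made for this example;
the genebanker's own conversion may differ in detail.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Split a name such as "0080aaa" into its size and a base 26 index for
 * the three letter label.  Returns 0 on success, -1 if malformed. */
int parse_genotype(const char *name, int *size, int *index)
{
    assert(name != NULL && size != NULL && index != NULL);
    if (strlen(name) != 7)
        return -1;
    *size = 0;
    for (int i = 0; i < 4; i++) {            /* four numeric digits */
        if (!isdigit((unsigned char)name[i]))
            return -1;
        *size = *size * 10 + (name[i] - '0');
    }
    *index = 0;
    for (int i = 4; i < 7; i++) {            /* three lower case letters */
        if (name[i] < 'a' || name[i] > 'z')
            return -1;
        *index = *index * 26 + (name[i] - 'a');
    }
    return 0;
}
```

With this mapping aaa is number 0 and zzz is number 17575, so a creature
named zzz would force a much larger array than one named aaa.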
10) IF YOU WANT TO MODIFY THE SOURCE CODE
If you make some significant improvements to Tierra, we would welcome
receiving the source code, so that we may integrate it into our version, and
then make it available to others. We will credit you for your contributions.
All lines of source code should be 78 characters or less, or it will
mess up the formatting of the code for distribution by mail.
The simulator has been designed so that it can be brought down, and then
brought back up where it left off. This means that there can be no static
local variables. Any variables that hang around must be global. They
are declared and defined in soup_in.h if they are also soup_in parameters.
Otherwise they are declared in declare.h, and all global variables are
declared as externals in extern.h.
The code for bringing the simulator up and down is in the tsetup.c
module. The system is brought up by GetSoup(), which calls GetAVar()
to read soup_in. All soup_in variables are read by the GetAVar() function.
If a new simulation is being started, GetSoup() calls GetNewSoup(). If an
old simulation is being restarted, GetSoup() calls GetOldSoup(). GetOldSoup()
will read all global variables not contained in soup_in, and will also read
in all arrays, such as the soup, the cells array, and the free_mem array.
When the simulator goes down, and periodically during a run, all global
variables are written to a file soup_out, and all global arrays such as
soup, the cells array, the free_mem array, and the random number generator
array, and some structures, are written to a binary file called core_out.
Thus if you create any new global variables or arrays, be sure they are read
by GetOldSoup(), and written by WriteSoup().
There are several obvious projects that I would like to comment on:
10.1) Creating a Frontend
All I/O to the console is routed through the frontend.c module, so that
it can be handled by a variety of frontends now under development. The
simplest of these just uses printf to write to standard out. If you
compile with #define FRONTEND BASIC, you will get the new frontend created
by Dan Pirone. The Basic frontend consists of five data areas:
The STATS area at the top two lines of the screen.
The PLAN area next, for displaying simulation variables every million virtual
instructions.
The MESSAGE area, for state changes, and Genebank data.
The ERROR area at the second to the last line of the screen.
The HELP area at the last line of the screen.
If you are going to work on the frontend, please observe the Basic code
as a template for changes, or get back to us for an updated version of
the frontend.c module.
10.2) Creating New Instruction Sets
If you want to create a new instruction set, more power to you. The
relevant modules to study are: instruct.c, parse.c, soup_in.h, arginst.h,
configur.h, and opcode.map. You will also need to study the definitions of
struct cpu, struct InstDef, and struct inst, all in the tierra.h module.
Note that the cpu structure includes an array of registers. The idea is that
you may change the size of this array to make just about any changes you might
want to the CPU architecture. You should avoid actually having to alter the
structure definitions in the tierra.h file. You may alter the correspondence
between opcodes and instructions by editing the opcode.map file.
10.3) Creating New Slicer Mechanisms
If you want to experiment with artificial rather than natural selection,
consider that selection is both a carrot and a stick. The carrot in this
model is CPU time which is allocated by the slicers. The stick is the reaper.
If you want to try to evolve algorithms that do useful work, your evaluation
functions should be embedded into the slicer, and should allocate more CPU
time to creatures who rank high. You can also do the same with the reaper.
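One possible shape for such an evaluation-driven slicer is sketched below.
Everything in it is hypothetical: the function name, the idea of a score
between 0 and 1, and the linear bonus are assumptions made for illustration,
and none of it is taken from the existing slicer code.

```c
#include <assert.h>

/* Hypothetical slicer rule: every creature gets base_slice instructions,
 * plus a bonus proportional to its evaluation score in [0, 1]. */
int slice_size(int base_slice, int max_bonus, double score)
{
    assert(base_slice > 0 && max_bonus >= 0);
    if (score < 0.0)
        score = 0.0;               /* clamp out-of-range scores */
    if (score > 1.0)
        score = 1.0;
    return base_slice + (int)(max_bonus * score);
}
```

Keeping a non-zero base slice means that low-ranking creatures still run, so
the reaper, rather than starvation, remains the stick.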
10.4) Creating a Sexual Model
Sex emerges spontaneously in runs whenever parasites appear. However,
this sex is primitive and disorganized. I believe that the easiest way to
engineer organized sex is to work with diploid creatures. The infrastructure
to allow multiple ploidy levels is already in place. Notice that the
definition of Instruction, the type of which the soup is composed is:
typedef struct Inst Instruction[PLOIDY];
This means that if PLOIDY is defined as two, there are two parallel
tracks for genomes. The instruction pointer will run down the track
specified by the ce->c.tr variable in the cpu structure. We have not
implemented any other controls over the tracking of the instruction pointer
in diploid or higher models. This is future work.
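The effect of the typedef can be seen in the following C sketch, in which a
soup address selects a cell and a track number selects one of the parallel
genomes within it. The one-field struct Inst and the fetch() function are
stand-ins invented for the example; the real definitions are in tierra.h.

```c
#include <assert.h>

#define PLOIDY 2   /* diploid for this example; Tierra's default is 1 */

/* Hypothetical stand-in for struct Inst; the real one is in tierra.h. */
struct Inst { unsigned char inst; };

typedef struct Inst Instruction[PLOIDY];

/* Fetch the opcode at soup address ip on track tr (cf. ce->c.tr). */
unsigned char fetch(Instruction *soup, long ip, int tr)
{
    assert(ip >= 0);
    assert(tr >= 0 && tr < PLOIDY);
    return soup[ip][tr].inst;
}
```

With this layout the two tracks of one soup address sit side by side in
memory, and switching tracks only changes the second index.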
10.5) Creating a Multi-cellular Model
Multi-cellularity was the hallmark of the Cambrian explosion of
diversity, and thus is likely a biological feature worth including in Tierra.
Also, it is likely that a multi-cellular model is the appropriate one for
evolving large application programs on massively parallel machines. How
can we implement multi-cellularity? What does it mean in the context of
Tierran creatures?
Consider that at the conceptual core, multi-cellularity means that the
mother cell determines what portion of the genome its daughter cell will
express. For many daughter cells, the mother cell narrows their options
by preventing them from expressing (executing) large portions of their
genome (code). In the organic world this is done by loading the daughter
cell with regulatory proteins which determine which genes will be expressed.
In the Tierran world, the same result can be achieved by allowing the
mother cell to set the position of the instruction pointer in the daughter
cell, and also the initial values of the CPU registers. These acts can
place the daughter cell into a portion of its code from which it may never
be able to reach certain other parts of its code. In this way the mother
cell determines what parts of the code are executed by the daughter.
To facilitate this process, the divide instruction has been broken into
three steps: 1) Create and initialize a CPU for the daughter. 2) Start the
daughter CPU running. 3) Become independent from the daughter by losing
write privileges on the daughter space. Now, between steps 1 and 2, the
mother can place values into the CPU registers and instruction pointer of
the daughter. This will require an inter-CPU move instruction. The divide
instruction takes an argument that determines which of the three steps is
being performed. The divide instruction in the new instruction sets does
all three steps at once, but it also transfers the values from the mother's
cpu registers to the daughter's cpu registers, and it starts the daughter's
instruction pointer at some offset into the daughter's genome, as specified
by a value in one of the mother's cpu registers (this can cause
differentiation).
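The register transfer and the offset start can be modeled roughly as in the
C sketch below. It is a much reduced illustration: struct MiniCpu, its
fields, and divide_sketch() are hypothetical, the offset is passed as a
plain argument rather than read from a particular register, and the sketch
simply requires the offset to fall inside the daughter cell.

```c
#include <assert.h>
#include <string.h>

#define NUMREG 4

/* Hypothetical, much reduced cpu; the real struct cpu is in tierra.h. */
struct MiniCpu {
    long re[NUMREG];   /* registers */
    long ip;           /* instruction pointer */
};

/* Sketch of the all-in-one divide: copy the mother's registers to the
 * daughter and start the daughter at daughter_start + offset. */
void divide_sketch(const struct MiniCpu *mother, struct MiniCpu *daughter,
                   long daughter_start, long daughter_size, long offset)
{
    assert(daughter_size > 0);
    assert(offset >= 0 && offset < daughter_size);
    memcpy(daughter->re, mother->re, sizeof daughter->re);
    daughter->ip = daughter_start + offset;
}
```

A mother that passes a non-zero offset starts her daughter part way into the
genome, which is the differentiation mechanism described above.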
11) CONFIGURATION AT COMPILE TIME (configur.h)
The file configur.h is included by the file tierra.h, and thus is
included in all tierra source code modules. Many definitions are made in
this file that control options that require recompilation to change. The
number of these options has been growing, therefore the following is a
brief documentation of some of the parameters defined in this file.
#define INSTBITNUM 5
#define INSTNUM 32 /* INSTNUM = 2 ^ INSTBITNUM */
INSTBITNUM defines the number of bits used by instructions in the soup.
All of the instruction sets that have been implemented use five bits.
INSTNUM is the number of distinct opcodes in the instruction set which
is always 2 ^ INSTBITNUM.
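Note that 2 ^ INSTBITNUM here means two raised to the power INSTBITNUM. In
C the ^ operator is exclusive or, so the literal expression 2 ^ 5 evaluates
to 7, not 32. If you would rather derive INSTNUM than type it in, a shift
does the job, as in this sketch (configur.h itself defines INSTNUM as the
literal 32; the to_opcode() helper is hypothetical):

```c
#include <assert.h>

#define INSTBITNUM 5
#define INSTNUM (1 << INSTBITNUM)   /* two to the power INSTBITNUM: 32 */

/* Mask an arbitrary value down to a legal opcode (0 .. INSTNUM - 1). */
unsigned int to_opcode(unsigned int value)
{
    return value & (INSTNUM - 1u);
}
```

Because INSTNUM is a power of two, masking with INSTNUM - 1 keeps exactly
the low INSTBITNUM bits of a value.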
#define PLOIDY 1
PLOIDY defines the number of parallel instruction tracks contained in
the soup. Tierra is normally run as a haploid model, meaning that each
creature has one copy of the genome. It is possible for creatures to
contain two or more copies of the genome (humans have two copies of each
chromosome), by defining ploidy to a number larger than one. However,
it has been some time since this option has been used, and it may not work
at present. The option will likely be exercised in the near future.
#ifndef INST
#define INST 1 /* 1, 2, 3 or 4 */
#endif
INST defines which of the four instruction sets will be implemented.
This variable may be defined in the makefile.
#define STACK_SIZE 10
Each of the four instruction sets includes a stack, whose size is defined
by this variable.
#define ALOC_REG 4
#define NUMREG 4
Each of the four instruction sets uses an array of registers, whose size
is controlled with these variables. The number of actual registers is defined
by NUMREG, but there may be additional virtual registers (as in INST == 2),
which can be allocated by setting ALOC_REG larger than NUMREG.
#define GETBUFSIZ 3
#define PUTBUFSIZ 3
Instruction sets two through four include communication functions,
get() and put(), which read from the get buffer and write to the put buffer.
GETBUFSIZ and PUTBUFSIZ control the size of these buffers.
#ifndef FRONTEND
#define FRONTEND BASIC /* BASIC or STDIO */
#endif
There are two frontends. The BASIC frontend provides the best
interface, but requires the use of the console. In order to run Tierra in
the background (on unix systems) Tierra should be compiled with the STDIO
frontend whose output can be redirected to /dev/null.
/* #define CM5 */ /* CM5 the version for the CM5 */
CM5 controls variations specific to the MIMD implementation of Tierra
on the CM5, the Connection Machine manufactured by Thinking Machines
Corporation.
#define MICRO /* define for micro step debugging */
MICRO implements the code required to use the virtual debugger.
/* #define HSEX */ /* define for haploid, crossover sex */
HSEX implements the code used to support haploid sex among creatures.
#define ICC /* define for inter-cellular communication, != I/O */
Instruction sets two through four have two communications options.
If ICC is defined, the put() and get() instructions cause communications
between cells. In this case the put() function of one cell writes to the get
buffer(s) of other cell(s). If ICC is not defined, put() and get() write and
read to the buffers of the active cell, and these communications are intended
for the user.
/* #define READPROT */ /* define to implement read protection of soup */
#define WRITEPROT /* define to implement write protection of soup */
/* #define EXECPROT */ /* define to implement execute protection of soup */
Instructions in the soup can be protected using the mal() instruction.
The status of protected and unprotected memory is determined by the two
soup_in variables:
MemModeFree = 0 read, write, execute protection for free memory
MemModeProt = 2 rwx protect mem: 1 bit = execute, 2 bit = write, 4 bit = read
However, each of the three memory protection modes costs cpu time, therefore
they should not be compiled unless they are going to be used. The
READPROT, WRITEPROT, and EXECPROT definitions control which protection modes
are implemented at compile time.
/* #define ERROR */ /* use to include error checking code */
ERROR turns on a large body of error checking and verification code.
It should only be used while debugging Tierra.
/* #define ALCOMM */ /* define for socket communications */
ALCOMM turns on the code used to communicate with the ALmond monitor
program (unix only).
/* #define MEM_PROF */ /* profile dynamic memory usage */
MEM_PROF turns on code that provides a summary of how dynamically
allocated memory is being used. This is basically for debugging purposes.
12) KNOWN BUGS
Memory management for the DOS version is very tricky, and is not yet
air-tight. During any run using the genebank, the demand for memory grows
as genomes accumulate in RAM. Eventually the memory fills up, and the
memory management routines swap extinct genomes out to disk. When a lot
of genomes are swapped to disk, the system thrashes in the swapping of
genomes between RAM and disk. Eventually some memory request will fail due
to memory fragmentation or the accumulation of too many living genomes in
RAM, and the system will come down. At this point just bring the system
back up by typing: tierra soup_out. Bringing the system down and back up
should eliminate the memory fragmentation, and effect some garbage collection,
freeing up more memory. It is actually a good idea to bring the system
up and down on DOS machines before they run out of memory, just to clear
the memory. You can tell if you are running out of memory by selecting the
i-info option from the menu. At the upper left of the screen you will then
see a message like: Coreleft = 4400. This tells you that the system is out
of memory (Tierra is not allowed to use the last 4K of memory).
13) IF YOU HAVE PROBLEMS
13.1) Problems with Installation
Read the installation instructions carefully. Where we present
alternative methods, try them all. If it still doesn't work, please
let us know. We would like to help people with installation, and make
sure that our instructions are clear. If you are using a compiler or
machine we have not tested, we may not be able to help you, but we want
to accommodate these additional conditions. We would like to help you
find a solution and incorporate the solution in future releases.
13.2) Problems Running Tierra
Read all of the .doc files. You may find the answer there. If a
problem still persists, and you have ftp access, get a new copy of the
source code out of the ftp site. It is likely that the source code will be
updated on a roughly monthly or bimonthly basis as we continue to improve
the program. By the time you are sure there is a problem, we may already
have fixed it and placed a fix in the ftp site.
If the problem still persists after you have tested the latest version
of the software, let us know about the problem. We would like to fix it.
If you do not have ftp access, and you identify a bug, we will fix it and
send you a free upgrade if you return your original distribution disk in
a disk mailer.
14) REGISTRATION & MAILING LISTS
The reason you might want to register your copy of the software is so
that we can contact you if we discover a bug, or we can let you know when
new versions of the program are ready for distribution. If you obtained
Tierra on disk, send your name, address and the serial number of your
distribution disk to Virtual Life at the address listed below. If you
obtained Tierra from the ftp site, send your email address to:
tierra-request@life.slhs.udel.edu
There are two mailing lists for Tierra users, but you only want to be on
one of them. The first list is for people who only want to get the
official announcements, updates and bug-fixes. The second carries the
official postings as well, and is intended for discussion of Tierra by users.
The second list is distributed in digest form, when there is enough material.
The lists are:
tierra-announce official updates, patches and announcements only
tierra-digest discussion, updates, etc. (digest form)
The addresses are:
tierra-request@life.slhs.udel.edu the list administrator. to be added,
removed, or complain about problems with
any of these lists.
tierra-digest@life.slhs.udel.edu to post to the list.
tierra-bug@life.slhs.udel.edu for bug-reports or questions about the
code or installation.
You may also be interested in the Artificial Life mailing list.
Subscribe to the list by sending a message to:
alife-request@cognet.ucla.edu
Post to the list by sending a message to:
alife@cognet.ucla.edu
Tom Ray
Virtual Life
P.O. Box 625
Newark, DE 19715
From January through August, we will be at:
Tom Ray
Santa Fe Institute
1660 Old Pecos Trail, Suite A
Santa Fe, NM 87501
ray@santafe.edu
505-984-8800
505-982-0565 (FAX)
From September through December, we will be at:
Tom Ray
University of Delaware
School of Life & Health Sciences
Newark, Delaware 19716
ray@udel.edu
302-831-2753
302-831-2281 (FAX)