ftp.pasteur.org/FAQ/

home *** CD-ROM | disk | FTP | other *** search

/ ftp.pasteur.org/FAQ/ / ftp-pasteur-org-FAQ.zip / FAQ / ai-faq / genetic / part3 < prev next >

Wrap

Internet Message Format | 2001-04-12 | 56.7 KB

Path: senator-bedfellow.mit.edu!bloom-beacon.mit.edu!news.bu.edu!micro-heart-of-gold.mit.edu!nntp.club.cc.cmu.edu!usenet01.sei.cmu.edu!logbridge.uoregon.edu!server3.netnews.ja.net!server4.netnews.ja.net!fafnir.cf.ac.uk!scmdb From: David.Beasley@cs.cf.ac.uk (David Beasley) Newsgroups: comp.ai.genetic,comp.answers,news.answers Subject: FAQ: comp.ai.genetic part 3/6 (A Guide to Frequently Asked Questions) Supersedes: <part3_969480833@cs.cf.ac.uk> Followup-To: comp.ai.genetic Date: 11 Apr 2001 20:23:23 GMT Organization: Posted through the Joint Cardiff Computing Service, Wales, UK Lines: 1075 Approved: news-answers-request@MIT.Edu Expires: 25 May 2001 20:22:54 GMT Message-ID: <part3_987020574@cs.cf.ac.uk> References: <part2_987020574@cs.cf.ac.uk> NNTP-Posting-Host: thrall.cs.cf.ac.uk X-Trace: fafnir.cf.ac.uk 987020603 29441 131.251.42.22 (11 Apr 2001 20:23:23 GMT) X-Complaints-To: abuse@cf.ac.uk NNTP-Posting-Date: 11 Apr 2001 20:23:23 GMT Summary: This is part 3 of a <trilogy> entitled "The Hitch-Hiker's Guide to Evolutionary Computation". A periodically published list of Frequently Asked Questions (and their answers) about Evolutionary Algorithms, Life and Everything. It should be read by anyone who whishes to post to the comp.ai.genetic newsgroup, preferably *before* posting. Originator: scmdb@thrall.cs.cf.ac.uk Xref: senator-bedfellow.mit.edu comp.ai.genetic:21405 comp.answers:44994 news.answers:205230 Archive-name: ai-faq/genetic/part3 Last-Modified: 4/12/01 Issue: 9.1 Important note: Do NOT send email to the cs.cf.ac.uk address above: it will be ignored. Corrections and other correspondence should be sent to david.beasley@iee.org TABLE OF CONTENTS OF PART 3 Q2: What applications of EAs are there? Q3: Who is concerned with EAs? Q4: How many EAs exist? Which? Q4.1: What about Alife systems, like Tierra and VENUS? Q5: What about all this Optimization stuff? ---------------------------------------------------------------------- Subject: Q2: What applications of EAs are there? In principle, EAs can compute any computable function, i.e. everything a normal digital computer can do. But EAs are especially badly suited for problems where efficient ways of solving them are already known, (unless these problems are intended to serve as benchmarks). Special purpose algorithms, i.e. algorithms that have a certain amount of problem domain knowledge hard coded into them, will usually outperform EAs, so there is no black magic in EC. EAs should be used when there is no other known problem solving strategy, and the problem domain is NP-complete. That's where EAs come into play: heuristically finding solutions where all else fails. Following is an incomplete (sic!) list of successful EA applications: ARTIFICIAL LIFE ARTIFICIAL LIFE (ALIFE) systems attempt to simulate the kind of behaviour exhibited by real, living creatures. Not all Artificial Life systems employ EVOLUTIONARY ALGORITHMs (see Q4.1). Framsticks Framsticks is a three-dimensional life SIMULATION project. Both the physical structure of creatures and their control systems are evolved. Evolutionary algorithms are used with SELECTION, CROSSOVER and MUTATION. Finite element methods are used for simulation. Both spontaneous and directed EVOLUTIONs are possible. This system uses the standard EA framework to evolve 3D agents equipped with neural networks. It has proved to be an attractive tool for people who want to learn about the way evolutionary OPTIMIZATION techniques work. This is shareware, but all the evolutionary features are available free. The project is open, and developers can take part in it, and also conduct their own experiments (i.e. using their own GENETIC OPERATORs). There are links to the scientific papers on the web page, as well as the detailed program documentation. The software is quite general and can be used to study a range of problems, including coevolution of body and brain. For more details, see: http://www.frams.poznan.pl/ BIOCOMPUTING Biocomputing, or Bioinformatics, is the field of biology dedicated to the automatic analysis of experimental data (mostly sequencing data). Several approaches to specific biocomputing problems have been described that involve the use of GA, GP and simulated annealing. General information about biocomputing (software, databases, misc.) can be found on the server of the European Bioinformatics Institute: http://www.ebi.ac.uk/ebi_home.html ENCORE has a good selection of pointers related to this subject. VSCN provides a detailed online course on bioinformatics: http://www.techfak.uni- bielefeld.de/bcd/Curric/welcome.html There are three main domains to which GA have been applied in Bioinformatics: protein folding, RNA folding, sequence alignment. Protein Folding Proteins are one of the essential components of any form of life. They are made of twenty different types of amino acid. These amino acids are chained together in order to form the protein that can contain from a few to several thousands residues. In most of the cases, the properties and the function of a protein are a result of its three dimensional structure. It seems that in many cases this structure is a direct consequence of the sequence. Unfortunately, it is still very difficult/impossible to deduce the three dimensional structure, knowing only the sequence. A part of the VSCN on-line bioinformatics course is dedicated to the use of GAs in Protein Folding Prediction. It contains an extensive bibliography and a detailed presentation of the subject with LOTS of explanations and on-line papers. The URL is: http://www.techfak.uni- bielefeld.de/bcd/Curric/ProtEn/contents.html Koza [KOZA92] gives one example of GP applied to Protein Folding. Davis [DAVIS91] gives an example of DNA conformation prediction (a closely related problem) in his Handbook of GAs. RNA Folding Describing the tertiary structure of an RNA molecule, is about as hard as for a protein, but describing the intermediate structure (secondary structure) is somehow easier because RNA molecules are using the same pairing rules as DNA, (Watson and Crick base pairing). There exist deterministic algorithms that given a set of constraints (rules), compute the more stable structure, but: (a) their time and memory requirement increase quadratically or more with the length of the sequences, and (b) they require simplified rules. Lots of effort has recently been put into applying GAs to this problem, and several papers can be found (on-line if your institute subscribes to these journals): A genetic Algorithm Based Molecular Modelling Technique For RNA Stem- loop Structures H. Ogata, Y. Akiyama and M Kanehisa, Nucleic Acid Research, 1995, vol 23,3 419-426 An Annealing Mutation Operator in the GA for RNA folding B.A Shapiro and J. C. Wu, CABIOS, 1996, vol 12, 3, 171-180 The computer Simulation of RNA Folding Pathway Using a Genetic Algorithm A.P. Gultyaev, F.D.H van Batenburg and C. W. A. Pleij in Journal of Molecular Biology, 1995, vol 250 37-51 Simulated Annealing has also been applied successfully to this problem: Description of RNA folding by SA M. Schmitz and G. Steger in Journal of Molecular Biology, 1995, 255, 245-266 Sequence Alignments Sequence Alignment is another important problem of Bioinformatics. The aim is to align together several related sequences (from two to hundreds) given a cost function. For the most widely used cost functions, the problem has been shown to be NP-complete. Several attempts have been made using SA: Multiple Sequence Alignment Using SA J. Kim, Sakti Pramanik and M.J. Chung, CABIOS, 1994, vol 10, 4, 419-426 Multiple Sequence Alignment by Parallel SA M. Isshikawa, T. Koya and al, CABIOS, 1993,vol 9, 3, 267-273 SAM, software which uses Hidden Markov Models for Multiple Sequence Alignment, can use SA to train the model. Several papers have been published on SAM. The software, documentation and an extensive bibliography can be found in: http://www.cse.ucsc.edu/research/compbio/sam.html More recently, various software using different methods like Gibbs sampling or GAs has been developed: A Gibbs Sampling Strategy for Multiple Alignment C.E. Lawrence, S. F. Altschull and al, Science, October 1993, vol 262, 208-214 SAGA: Sequence Alignment by Genetic Algorithm C. Notredame and D.G. Higgins, Nucleic Acid Research, 1995, vol 24, 8, 1515-1524 A beta release of SAGA (along with the paper) is available on the European Bioinformatics Institute anonymous FTP server: ftp.ebi.ac.uk/pub/software/unix/saga.tar.Z CELLULAR PROGRAMMING: Evolution of Parallel Cellular Machines Nature abounds in systems involving the actions of simple, locally- interacting components, that give rise to coordinated global behavior. These collective systems have evolved by means of natural SELECTION to exhibit striking problem-solving capacities, while functioning within a complex, dynamic ENVIRONMENT. Employing simple yet versatile parallel cellular models, coupled with EVOLUTIONARY COMPUTATION techniques, cellular programming is an approach for constructing man-made systems that exhibit characteristics such as those manifest by their natural counterparts. Parallel cellular machines hold potential both scientifically, as vehicles for studying phenomena of interest in areas such as complex adaptive systems and ARTIFICIAL LIFE, as well as practically, enabling the construction of novel systems, endowed with evolutionary, reproductive, regenerative, and learning capabilities. Web site: http://lslwww.epfl.ch/~moshes/cp.html References: Sipper, M. (1997) "Evolution of Parallel Cellular Machines: The Cellular Programming Approach", Springer-Verlag, Heidelberg. Sipper, M. (1996) "Co-evolving Non-Uniform Cellular Automata to Perform Computations", Physica D, 92, 193-208. Sipper, M. and Ruppin, E. (1997) "Co-evolving architectures for cellular machines", Physica D, 99, 428-441. Sipper, M. and Tomassini, M. (1996) "Generating Parallel Random Number Generators By Cellular Programming", International Journal of Modern Physics C, 7(2), 181-190. Sipper, M. (1997) "Evolving Uniform and Non-uniform Cellular Automata Networks", in Annual Reviews of Computational Physics, D. Stauffer (ed) Evolvable Hardware The idea of evolving machines, whose origins can be traced to the cybernetics movement of the 1940s and the 1950s, has recently resurged in the form of the nascent field of bio-inspired systems and evolvable hardware. The field draws on ideas from the EVOLUTIONARY COMPUTATION domain as well as on novel hardware innovations. Recently, the term evolware has been used to describe such evolving ware, with current implementations centering on hardware, while raising the possibility of using other forms in the future, such as bioware. The inaugural workshop, Towards Evolvable Hardware, took place in Lausanne, in October 1995, followed by the First International Conference on Evolvable Systems: From Biology to Hardware (ICES96) held in Japan, in October 1996. Another major event in the field, ICES98, was held in Lausanne, Switzerland, in September 1998. References: Sipper, M. et al (1997) "A Phylogenetic, Ontogenetic, and Epigenetic View of Bio-Inspired Hardware Systems", IEEE Transactions on Evolutionary Computation, 1(1). Sanchez, E. and Tomassini, M. (eds) (1996) "Towards Evolvable Hardware", Springer-Verlag, Lecture Notes in Computer Science, 1062. Higuchi, T. et al (1997) "Proceedings of First International Conference on Evolvable Systems: From Biology to Hardware (ICES96)", Springer-Verlag, Lecture Notes in Computer Science. GAME PLAYING GAs can be used to evolve behaviors for playing games. Work in evolutionary GAME THEORY typically surrounds the EVOLUTION of a POPULATION of players who meet randomly to play a game in which they each must adopt one of a limited number of moves. (Maynard-Smith 1982). Let's suppose it is just two moves, X and Y. The players receive a reward, analogous to Darwinian FITNESS, depending on which combination of moves occurs and which move they adopted. In more complicated models there may be several players and several moves. The players iterate such a game a series of times, and then move on to a new partner. At the end of all such moves, the players will have a cumulative payoff, their fitness. This fitness can then be used to generate a new population. The real key in using a GA is to come up with an encoding to represent player's strategies, one that is amenable to CROSSOVER and to MUTATION. Possibilities are to suppose at each iteration a player adopts X with some probability (and Y with one minus such). A player can thus be represented as a real number, or a bit-string suitably interpreted as a probability An alternative characterisation is to model the players as Finite State Machines, or Finite Automata (FA). These can be though of as a simple flow chart governing behaviour in the "next" play of the game depending upon previous plays. For example: 100 Play X 110 If opponent plays X go to 100 120 Play Y 130 If opponent plays X go to 100 else go to 120 represents a strategy that does whatever its opponent did last, and begins by playing X, known as "Tit-For-Tat." (Axelrod 1982). Such machines can readily be encoded as bit-strings. Consider the encoding "1 0 1 0 0 1" to represent TFT. The first three bits, "1 0 1" are state 0. The first bit, "1" is interpreted as "Play X." The second bit, "0" is interpreted as "if opponent plays X go to state 1," the third bit, "1", is interpreted as "if the opponent plays Y, go to state 1." State 1 has a similar interpretation. Crossing over such bit-strings always yields valid strategies. SIMULATIONs in the Prisoner's dilemma have been undertaken (Axelrod 1987, Fogel 1993, Miller 1989) of these machines. Alternative representations of game players include CLASSIFIER SYSTEMs (Marimon, McGrattan and Sargent 1990, [GOLD89]), and Neural- networks (Fogel and Harrald 1994), though not necessarily with a GA. (Fogel 1993), and Fogel and Harrald 1994 use an Evolutionary Program). Chellapilla and Fogel (1999) have evolved a neural network which can play checkers (draughts) at near expert level. Other methods of evolving a population can be found in Lindgren 1991, Glance and Huberman 1993 and elsewhere. A GA for playing the game "Mastermind" has been produced. See http://kal-el.ugr.es/mastermind References. Axelrod, R. (1987) ``The Evolution of Strategies in the Repeated Prisoner's Dilemma,'' in [DAVIS91] Axelrod, R (?) ``The Complexity of Cooperation'' (See the web site, which includes code to implement tournaments: http://pscs.physics.lsa.umich.edu/Software/ComplexCoop.html ) Chellapilla, K. and Fogel, D.B. (1999) ``Evolution, neural networks, games, and intelligence'' , Proc. IEEE, Sept., pp. 1471-1496. Miller, J.H. (1989) ``The Coevolution of Automata in the Repeated Prisoner's Dilemma'' Santa Fe Institute Working Paper 89-003. Marimon, Ramon, Ellen McGrattan and Thomas J. Sargent (1990) ``Money as a Medium of Exchange in an Economy with Artificially Intelligent Agents'' Journal of Economic Dynamics and Control 14, pp. 329--373. Maynard-Smith, (1982) Evolution and the Theory of Games, CUP. Lindgren, K. (1991) ``Evolutionary Phenomena in Simple Dynamics,'' in [ALIFEI]. Holland, J.H and John Miller (1990) ``Artificially Adaptive Agents in Economic Theory,'' American Economic Review: Papers and Proceedings of the 103rd Annual Meeting of the American Economics Association: 365--370. Huberman, Bernado, and Natalie S. Glance (1993) "Diversity and Collective Action" in H. Haken and A. Mikhailov (eds.) Interdisciplinary Approaches to Nonlinear Systems, Springer. Fogel (1993) "Evolving Behavior in the Iterated Prisoner's Dilemma" Evolutionary Computation 1:1, 77-97 Fogel, D.B. and Harrald, P. (1994) ``Evolving Complex Behaviour in the Iterated Prisoner's Dilemma,'' Proceedings of the Fourth Annual Meetings of the Evolutionary Programming Society, L.J. Fogel and A.W. Sebald eds., World Science Press. Lindgren, K. and Nordahl, M.G. "Cooperation and Community Structure in Artificial Ecosystems", Artificial Life, vol 1:1&2, 15-38 Stanley, E.A., Ashlock, D. and Tesfatsion, L. (1994) "Iterated Prisoners Dilemma with Choice and Refusal of Partners in [ALIFEIII] 131-178 JOB-SHOP SCHEDULING The Job-Shop Scheduling Problem (JSSP) is a very difficult NP- complete problem which, so far, seems best addressed by sophisticated branch and bound search techniques. GA researchers, however, are continuing to make progress on it. (Davis 85) started off GA research on the JSSP, (Whitley 89) reports on using the edge RECOMBINATION operator (designed initially for the TSP) on JSSPs too. More recent work includes (Nakano 91),(Yamada & Nakano 92), (Fang et al. 93). The latter three report increasingly better results on using GAs on fairly large benchmark JSSPs (from Muth & Thompson 63); neither consistently outperform branch & bound search yet, but seem well on the way. A crucial aspect of such work (as with any GA application) is the method used to encode schedules. An important aspect of some of the recent work on this is that better results have been obtained by rejecting the conventional wisdom of using binary representations (as in (Nakano 91)) in favor of more direct encodings. In (Yamada & Nakano 92), for example, a GENOME directly encodes operation completion times, while in (Fang et al. 93) genomes represent implicit instructions for building a schedule. The success of these latter techniques, especially since their applications are very important in industry, should eventually spawn advances in GA theory. Concerning the point of using GAs at all on hard job-shop scheduling problems, the same goes here as suggested above for `Timetabling': The GA approach enables relatively arbitrary constraints and objectives to be incorporated painlessly into a single OPTIMIZATION method. It is unlikely that GAs will outperform specialized knowledge-based and/or conventional OR-based approaches to such problems in terms of raw solution quality, however GAs offer much greater simplicity and flexibility, and so, for example, may be the best method for quick high-quality solutions, rather than finding the best possible solution at any cost. Also, of course, hybrid methods will have a lot to offer, and GAs are far easier to parallelize than typical knowledge-based/OR methods. Similar to the JSSP is the Open Shop Scheduling Problem (OSSP). (Fang et al. 93) reports an initial attempt at using GAs for this. Ongoing results from the same source shows reliable achievement of results within less than 0.23% of optimal on moderately large OSSPs (so far, up to 20x20), including an improvement on the previously best known solution for a benchmark 10x10 OSSP. A simpler form of job shop problem is the Flow-Shop Sequencing problem; recent successful work on applying GAs to this includes (Reeves 93)." Other scheduling problems In contrast to job shop scheduling some maintenance scheduling problems consider which activities to schedule within a planned maintenance period, rather than seeking to minimise the total time taken by the activities. The constraints on which parts may be taken out of service for maintenance at particular times may be very complex, particularly as they will in general interact. Some initial work is given in (Langdon, 1995). References Davis, L. (1985) "Job-Shop Scheduling with Genetic Algorithms", [ICGA85], 136-140. Muth, J.F. & Thompson, G.L. (1963) "Industrial Scheduling". Prentice Hall, Englewood Cliffs, NJ, 1963. Nakano, R. (1991) "Conventional Genetic Algorithms for Job-Shop Problems", [ICGA91], 474-479. Reeves, C.R. (1993) "A Genetic Algorithm for Flowshop Sequencing", Coventry Polytechnic Working Paper, Coventry, UK. Whitley, D., Starkweather, T. & D'Ann Fuquay (1989) "Scheduling Problems and Traveling Salesmen: The Genetic Edge Recombination Operator", [ICGA89], 133-140. Fang, H.-L., Ross, P., & Corne D. (1993) "A Promising Genetic Algorithm Approach to Job-Shop Scheduling, Rescheduling & Open-Shop Scheduling Problems", [ICGA93], 375-382. Yamada, T. & Nakano, R. (1992) "A Genetic Algorithm Applicable to Large-Scale Job-Shop Problems", [PPSN92], 281-290. Langdon, W.B. (1995) "Scheduling Planned Maintenance of the (UK) National Grid", cs.ucl.ac.uk/genetic/papers/grid_aisb-95.ps MANAGEMENT SCIENCES "Applications of EA in management science and closely related fields like organizational ecology is a domain that has been covered by some EA researchers - with considerable bias towards scheduling problems. Since I believe that EA have considerable potential for applications outside the rather narrow domain of scheduling and related combinatorial problems, I started collecting references about the status quo of EA-applications in management science. This report intends to make available my findings to other researchers in the field. It is a short overview and lists some 230 references to current as well as finished research projects. [..] "At the end of the paper, a questionnaire has been incorporated that may be used for this purpose. Other comments are also appreciated." --- from the Introduction of (Nissen 93) References Nissen, V. (1993) "Evolutionary Algorithms in Management Science: An Overview and List of References", Papers on Economics and Evolution, edited by the European Study Group for Evolutionary Economics. This report is also avail. via anon. FTP from ftp.gwdg.de/pub/msdos/reports/wi/earef.eps Boulding, K.E. (1991) "What is evolutionary economics?", Journal of Evolutionary Economics, 1, 9-17. NON-LINEAR FILTERING New connections between GENETIC ALGORITHMs and Non Linear Filtering Theory have been established. GAs have already been successfully applied to a large class of non-linear filtering problems such as RADAR / SONAR/ GPS signal processing. This relatively new branch of GA application has also lead to new results on the convergence of GAs: large deviations, fluctuations... Some preprints and references on this topic are available in the web page: http://www-sv.cict.fr/lsp/Delmoral/index.html The new results also points out some natural connections between: genetic type algorithms, information theory, non-linear filtering theory, interacting and branching particle systems. TIMETABLING This has been addressed quite successfully with GAs. A very common manifestation of this kind of problem is the timetabling of exams or classes in Universities, etc. The first application of GAs to the timetabling problem was to build the schedule of the teachers in an Italian high school. The research, conducted at the Department of Electronics and Information of Politecnico di Milano, Italy, showed that a GA was as good as Tabu Search, and better than simulated annealing, at finding teacher schedules satisfying a number of hard and soft constraints. The software package developed is now in current use in some high schools in Milano. (Colorni et al 1990) At the Department of Artificial Intelligence, University of Edinburgh, timetabling the MSc exams is now done using a GA (Corne et al. 93, Fang 92). An example of the use of GAs for timetabling classes is (Abramson & Abela 1991). In the exam timetabling case, the FITNESS function for a GENOME representing a timetable involves computing degrees of punishment for various problems with the timetable, such as clashes, instances of students having to take consecutive exams, instances of students having (eg) three or more exams in one day, the degree to which heavily-subscribed exams occur late in the timetable (which makes marking harder), overall length of timetable, etc. The modular nature of the fitness function has the key to the main potential strength of using GAs for this sort of thing as opposed to using conventional search and/or constraint programming methods. The power of the GA approach is the ease with which it can handle arbitrary kinds of constraints and objectives; all such things can be handled as weighted components of the fitness function, making it easy to adapt the GA to the particular requirements of a very wide range of possible overall objectives . Very few other timetabling methods, for example, deal with such objectives at all, which shows how difficult it is (without GAs) to graft the capacity to handle arbitrary objectives onto the basic "find shortest- length timetable with no clashes" requirement. The proper way to weight/handle different objectives in the fitness function in relation to the general GA dynamics remains, however, an important research problem! GAs thus offer a combination of simplicity, flexibility & speed which competes very favorably with other approaches, but are unlikely to outperform knowledge-based (etc) methods if the best possible solution is required at any cost. Even then, however, hybridisation may yield the best of both worlds; also, the ease (if the hardware is available!) of implementing GAs in parallel enhances the possibility of using them for good, fast solutions to very hard timetabling and similar problems. References Abramson & Abela (1991) "A Parallel Genetic Algorithm for Solving the School Timetabling Problem", Technical Report, Division of I.T., C.S.I.R.O, April 1991. (Division of Information Technology, C.S.I.R.O., c/o Dept. of Communication & Electronic Engineering, Royal Melbourne Institute of Technology, PO BOX 2476V, Melbourne 3001, Australia) Colorni A., M. Dorigo & V. Maniezzo (1990). Genetic Algorithms And Highly Constrained Problems: The Time-Table Case. Proceedings of the First International Workshop on Parallel Problem Solving from Nature, Dortmund, Germany, Lecture Notes in Computer Science 496, Springer- Verlag, 55-59. http://iridia.ulb.ac.be/dorigo/dorigo/conferences/IC.01-PPSN1.ps.gz Colorni A., M. Dorigo & V. Maniezzo (1990). Genetic Algorithms: A New Approach to the Time-Table Problem. NATO ASI Series, Vol.F 82, COMBINATORIAL OPTIMIZATION, (Ed. M.Akguel and others), Springer- Verlag, 235-239. http://iridia.ulb.ac.be/dorigo/dorigo/conferences/IC.02-NATOASI90.ps.gz Colorni A., M. Dorigo & V. Maniezzo (1990). A Genetic Algorithm to Solve the Timetable Problem. Technical Report No. 90-060, Politecnico di Milano, Italy. http://iridia.ulb.ac.be/dorigo/dorigo/tec.reps/TR.01-TTP.ps.gz Corne, D. Fang, H.-L. & Mellish, C. (1993) "Solving the Modular Exam Scheduling Problem with Genetic Algorithms". Proc. of 6th Int'l Conf. on Industrial and Engineering Applications of Artificial Intelligence & Expert Systems, ISAI. Fang, H.-L. (1992) "Investigating GAs for scheduling", MSc Dissertation, University of Edinburgh Dept. of Artificial Intelligence, Edinburgh, UK. ------------------------------ Subject: Q3: Who is concerned with EAs? EVOLUTIONARY COMPUTATION attracts researchers and people of quite dissimilar disciplines, i.e. EC is a interdisciplinary research field: Computer scientists Want to find out about the properties of sub-symbolic information processing with EAs and about learning, i.e. adaptive systems in general. They also build the hardware necessary to enable future EAs (precursors are already beginning to emerge) to huge real world problems, i.e. the term "massively parallel computation" [HILLIS92], springs to mind. Engineers Of many kinds want to exploit the capabilities of EAs on many areas to solve their application, esp. OPTIMIZATION problems. Roboticists Want to build MOBOTs (MOBile ROBOTs, i.e. R2D2's and #5's cousins) that navigate through uncertain ENVIRONMENTs, without using built-in "maps". The MOBOTS thus have to adapt to their surroundings, and learn what they can do "move-through-door" and what they can't "move- through-wall" on their own by "trial-and-error". Cognitive scientists Might view CFS as a possible apparatus to describe models of thinking and cognitive systems. Physicists Use EC hardware, e.g. Hillis' (Thinking Machine Corp.'s) Connection Machine to model real world problems which include thousands of variables, that run "naturally" in parallel, and thus can be modelled more easily and esp. "faster" on a parallel machine, than on a serial "PC" one. Biologists Are finding EAs useful when it comes to protein folding and other such bio-computational problems (see Q2). EAs can also be used to model the behaviour of real POPULATIONs of organisms. Some biologists are hostile to modeling, but an entire community of Population Biologists arose with the 'evolutionary synthesis' of the 1930's created by the polymaths R.A. Fisher, J.B.S. Haldane, and S. Wright. Wright's SELECTION in small populations, thereby avoiding local optima) is of current interest to both biologists and ECers -- populations are naturally parallel. A good exposition of current population Biology modeling is J. Maynard Smith's text Evolutionary Genetics. Richard Dawkin's Selfish Gene and Extended Phenotype are unparalleled (sic!) prose expositions of evolutionary processes. Rob Collins' papers are excellent parallel GA models of evolutionary processes (available in [ICGA91] and by FTP from ftp.cognet.ucla.edu/pub/alife/papers/ ). As fundamental motivation, consider Fisher's comment: "No practical biologist interested in (e.g.) sexual REPRODUCTION would be led to work out the detailed consequences experienced by organisms having three or more sexes; yet what else should [s/]he do if [s/]he wishes to understand why the sexes are, in fact, always two?" (Three sexes would make for even weirder grammar, [s/]he said...) Chemists And in particular biochemists and molecular chemists, are interested in problems such as the conformational analysis of molecular clusters and related problems in molecular sciences. The application of GAs to molecular systems has opened an interesting area of research and the number of chemists involved in it increases day-by-day. Some typical research topics include: o protein folding; o conformational analysis and energy minimization; o docking algorithms for drug-design; o solvent site prediction in macromolecules; Several papers have been published in journals such as Journal of Computational Chemistry and Journal of Computer-Aided Design. Some interesting WWW sites related to the applications of GAs to chemistry (or molecular science in general) include: o http://garage.cps.msu.edu/projects/biochem/biochem.html about GAs in biochemistry (water site prediction, drug-design and protein folding); o http://www.tc.cornell.edu/Edu/SPUR/SPUR94/Main/John.html about the application of GAs to the search of conformational energy minima; o http://cmp.ameslab.gov/cmp/CMP_Theory/gsa/gen2.html By using a GA in combiation with a Tight-binding model, David Deaven and Kai- Ming Ho founded fullerene cages (including C60) starting from random coordinates. See also Q2 for applications in biocomputing. Philosophers and some other really curious people may also be interested in EC for various reasons. ------------------------------ Subject: Q4: How many EAs exist? Which? The All Stars There are currently 3 main paradigms in EA research: GENETIC ALGORITHMs, EVOLUTIONARY PROGRAMMING, and EVOLUTION STRATEGIEs. CLASSIFIER SYSTEMs and GENETIC PROGRAMMING are OFFSPRING of the GA community. Besides this leading crop, there are numerous other different approaches, alongside hybrid experiments, i.e. there exist pieces of software residing in some researchers computers, that have been described in papers in conference proceedings, and may someday prove useful on certain tasks. To stay in EA slang, we should think of these evolving strands as BUILDING BLOCKs, that when recombined someday, will produce new offspring and give birth to new EA paradigm(s). One such interesting offspring is the Memetic Algorithm. This is a hybrid evolutionary algorithm, which makes use of local search operators. For details, see http://www.densis.fee.unicamp.br/~moscato/memetic_home.html Promising Rookies As far as "solving complex function and COMBINATORIAL OPTIMIZATION tasks" is concerned, Davis' work on real-valued representations and adaptive operators should be mentioned (Davis 89). Moreover Whitley's Genitor system incorporating ranking and "steady state" mechanism (Whitley 89), Goldberg's "messy GAs", involves adaptive representations (Goldberg 91), and Eshelman's CHC algorithm (Eshelman 91). For real FUNCTION OPTIMIZATION, Differential EVOLUTION seems hard to beat in terms of convergence speed as well as simplicity: With just three control variables, tuning is particularly easy to do. For "the design of robust learning systems", i.e. the field characterized by CFS, Holland's (1986) CLASSIFIER SYSTEM, with its state-of-the-art implementation CFS-C (Riolo 88), we should note developments in SAMUEL (Grefenstette 89), GABIL (De Jong & Spears 91), and GIL (Janikow 91). References Davis, L. (1989) "Adapting operator probabilities in genetic algorithms", [ICGA89], 60-69. De Jong K.A. & Spears W. (1991) "Learning concept classification rules using genetic algorithms". Proc. 12th IJCAI, 651-656, Sydney, Australia: Morgan Kaufmann. Dorigo M. & E. Sirtori (1991)."ALECSYS: A Parallel Laboratory for Learning Classifier Systems". Proceedings of the Fourth International Conference on Genetic Algorithms, San Diego, California, R.K.Belew and L.B.Booker (Eds.), Morgan Kaufmann, 296-302. Dorigo M. (1995). "ALECSYS and the AutonoMouse: Learning to Control a Real Robot by Distributed Classifier Systems". Machine Learning, 19, 3, 209-240. Eshelman, L.J. et al. (1991) "Preventing premature convergence in genetic algorithms by preventing incest", [ICGA91], 115-122. Goldberg, D. et al. (1991) "Don't worry, be messy", [ICGA91], 24-30. Grefenstette, J.J. (1989) "A system for learning control strategies with genetic algorithms", [ICGA89], 183-190. Holland, J.H. (1986) "Escaping brittleness: The possibilities of general-purpose learning algorithms applied to parallel rule-based systems". In R. Michalski, J. Carbonell, T. Mitchell (eds), Machine Learning: An Artificial Intelligence Approach. Los Altos: Morgan Kaufmann. Janikow C. (1991) "Inductive learning of decision rules from attribute-based examples: A knowledge-intensive Genetic Algorithm approach". TR91-030, The University of North Carolina at Chapel Hill, Dept. of Computer Science, Chapel Hill, NC. Riolo, R.L. (1988) "CFS-C: A package of domain independent subroutines for implementing classifier systems in arbitrary, user- defined environments". Logic of computers group, Division of computer science and engineering, University of Michigan. Whitley, D. et al. (1989) "The GENITOR algorithm and selection pressure: why rank-based allocation of reproductive trials is best", [ICGA89], 116-121. ------------------------------ Subject: Q4.1: What about Alife systems, like Tierra and VENUS? None of these are EVOLUTIONARY ALGORITHMs, but all of them use the evolutionary metaphor as their "playing field". Tierra Synthetic organisms have been created based on a computer metaphor of organic life in which CPU time is the ``energy'' resource and memory is the ``material'' resource. Memory is organized into informational patterns that exploit CPU time for self-replication. MUTATION generates new forms, and EVOLUTION proceeds by natural SELECTION as different GENOTYPEs compete for CPU time and memory space. Observation of nature shows that evolution by natural selection is capable of both OPTIMIZATION and creativity. Artificial models of evolution have demonstrated the optimizing ability of evolution, as exemplified by the field of GENETIC ALGORITHMs. The creative aspects of evolution have been more elusive to model. The difficulty derives in part from a tendency of models to specify the meaning of the ``genome'' of the evolving entities, precluding new meanings from emerging. I will present a natural model of evolution demonstrating both optimization and creativity, in which the GENOME consists of sequences of executable machine code. From a single rudimentary ancestral ``creature'', very quickly there evolve parasites, which are not able to replicate in isolation because they lack a large portion of the genome. However, these parasites search for the missing information, and if they locate it in a nearby creature, parasitize the information from the neighboring genome, thereby effecting their own replication. In some runs, hosts evolve immunity to attack by parasites. When immune hosts appear, they often increase in frequency, devastating the parasite POPULATIONs. In some runs where the community comes to be dominated by immune hosts, parasites evolve that are resistant to immunity. Hosts sometimes evolve a response to parasites that goes beyond immunity, to actual (facultative) hyper-parasitism. The hyper- parasite deceives the parasite causing the parasite to devote its energetic resources to replication of the hyper-parastie genome. This drives the parasites to extinction. Evolving in the absence of parasites, hyper-parasites completely dominate the community, resulting in a relatively uniform community characterized by a high degree of relationship between INDIVIDUALs. Under these circumstances, sociality evolves, in the form of creatures which can only replicate in aggregations. The cooperative behavior of the social hyper-parasites makes them vulnerable to a new class of parasites. These cheaters, hyper-hyper- parasites, insert themselves between cooperating social individuals, deceiving the social creatures, causing them to replicate the genomes of the cheaters. The only genetic change imposed on the simulator is random bit flips in the machine code of the creatures. However, it turns out that parasites are very sloppy replicators. They cause significant RECOMBINATION and rearrangement of the genomes. This spontaneous sexuality is a powerful force for evolutionary change in the system. One of the most interesting aspects of this instance of life is that the bulk of the evolution is based on adaptation to the biotic ENVIRONMENT rather than the physical environment. It is co-evolution that drives the system. --- "Tierra announcement" by Tom Ray (1991) How to get Tierra? Tierra is available (source and executables, for Unix and NT) from alife.santafe.edu/pub/SOFTWARE/Tierra . Related work David Bennett <dmb@pfxcorp.com> reported in March 2000: Much new work has been done in Tierra since 1993. Thomas Ray <tray@mail.nhn.ou.edu> is now working in Japan. I have been using another similar system called Avida. It has some advantages, and a significant body of research results. The contact for Avida is <avida@krl.caltech.edu>. References Ray, T. S. (1991) "Is it alive, or is it GA?" in [ICGA91], 527--534. Ray, T. S. (1991) "An approach to the synthesis of life." in [ALIFEII], 371--408. Ray, T. S. (1991) "Population dynamics of digital organisms." in [ALIFEII]. Ray, T. S. (1991) "Evolution and optimization of digital organisms." Scientific Excellence in Supercomputing: The IBM 1990 Contest Prize Papers, Eds. Keith R. Billingsley, Ed Derohanes, Hilton Brown, III. Athens, GA, 30602, The Baldwin Press, The University of Georgia. Ray, T. S. (1992) "Evolution, ecology and optimization of digital organisms." Santa Fe Institute working paper 92-08-042. Ray, T. S. "Evolution, complexity, entropy, and artificial reality." submitted Physica D. Ray, T. S. (1993) "An evolutionary approach to synthetic biology, Zen and the art of creating life. Artificial Life 1(1). VENUS Steen Rasmussen's (et al.) VENUS I+II "coreworlds" as described in [ALIFEII] and [LEVY92], are inspired by A.K. Dewdney's well-known article (Dewdney 1984). Dewdney proposed a game called "Core Wars", in which hackers create computer programs that battle for control of a computer's "core" memory (Strack 93). Since computer programs are just patterns of information, a successful program in core wars is one that replicates its pattern within the memory, so that eventually most of the memory contains its pattern rather than that of the competing program. VENUS is a modification of Core Wars in which the Computer programs can mutate, thus the pseudo assembler code creatures of VENUS evolve steadily. Furthermore each memory location is endowed with "resources" which, like sunshine are added at a steady state. A program must have sufficient resources in the regions of memory it occupies in order to execute. The input of resources determines whether the VENUS ecosystem is a "jungle" or a "desert." In jungle ENVIRONMENTs, Rasmussen et al. observe the spontaneous emergence of primitive "copy/split" organisms starting from (structured) random initial conditions. --- [ALIFEII], p.821 Dewdney, A.K. (1984) "Computer Recreations: In the Game called Core War Hostile Programs Engage in a Battle of Bits", Sci. Amer. 250(5), 14-22. Farmer & Belin (1992) "Artificial Life: The Coming Evolution", [ALIFEII], 815-840. Rasmussen, et al. (1990) "The Coreworld: Emergence and Evolution of Cooperative Structures in a Computational Chemistry", [FORREST90], 111-134. Rasmussen, et al. (1992) "Dynamics of Programmable Matter", [ALIFEII], 211-254. Strack (1993) "Core War Frequently Asked Questions ( rec.games.corewar FAQ)" Avail. by anon. FTP from rtfm.mit.edu/pub/usenet/news.answers/games/corewar-faq.Z PolyWorld Larry Yaeger's PolyWorld as described in [ALIFEIII] and [LEVY92] is available via anonymous FTP from alife.santafe.edu/pub/SOFTWARE/Polyworld/ "The subdirectories in this "polyworld" area contain the source code for the PolyWorld ecological simulator, designed and written by Larry Yaeger, and Copyright 1990, 1991, 1992 by Apple Computer. PostScript versions of my ARTIFICIAL LIFE III technical paper have now been added to the directory. These should be directly printable from most machines. Because some unix systems' "lpr" commands cannot handle very large files (ours at least), I have split the paper into Yaeger.ALife3.1.ps and Yaeger.ALife3.2.ps. These files can be ftp-ed in "ascii" mode. For unix users I have also included compressed versions of both these files (indicated by the .Z suffix), but have left the uncompressed versions around for people connecting from non- unix systems. I have not generated PostScript versions of the images, because they are color and the resulting files are much too large to store, retrieve, or print. Accordingly, though I have removed a Word-formatted version of the textual body of the paper that used to be here, I have left a Word-formatted version of the color images. If you wish to acquire it, you will need to use the binary transfer mode to move it to first your unix host and then to a Macintosh (unless Word on a PC can read it - I don't know), and you may need to do something nasty like use ResEdit to set the file type and creator to match those of a standard Word document (Type = WDBN, Creator = MSWD). [..]" --- from the README by Larry Yaeger <larryy@apple.com> General Alife repositories? Also, all of the following FTP sites carry ALIFE related info: ftp.cognet.ucla.edu/pub/alife/ , life.anu.edu.au/pub/complex_systems/alife/ , ftp.cogs.susx.ac.uk/pub/reports/csrp/ , xyz.lanl.gov/nlin-sys/ , alife.santafe.edu/pub/ . ------------------------------ Subject: Q5: What about all this Optimization stuff? Just think of an OPTIMIZATION problem as a black box. A large black box. As large as, for example, a Coca-Cola vending machine. Now, we don't know anything about the inner workings of this box, but see, that there are some regulators to play with, and of course we know, that we want to have a bottle of the real thing... Putting this everyday problem into a mathematical model, we proceed as follows: (1) we label all the regulators with x and a number starting from 1; the result is a vector x, i.e. (x_1,...,x_n), where n is the number of visible regulators. (2) we must find an objective function, in this case it's obvious, we want to get k bottles of the real thing, where k is equal to 1. [some might want a "greater or equal" here, but we restricted ourselves to the visible regulators (we all know that sometimes a "kick in the right place" gets use more than 1, but I have no idea how to put this mathematically...)] (3) thus, in the language some mathematicians prefer to speak in: f(x) = k = 1. So, what we have here is a maximization problem presented in a form we know from some boring calculus lessons, and we also know that there at least a dozen utterly uninteresting techniques to solve problems presented this way... What can we do in order to solve this problem? We can either try to gain more knowledge or exploit what we already know about the interior of the black box. If the objective function turns out to be smooth and differentiable, analytical methods will produce the exact solution. If this turns out to be impossible, we might resort to the brute force method of enumerating the entire SEARCH SPACE. But with the number of possibilities growing exponentially in n, the number of dimensions (inputs), this method becomes infeasible even for low- dimensional spaces. Consequently, mathematicians have developed theories for certain kinds of problems leading to specialized OPTIMIZATION procedures. These algorithms perform well if the black box fulfils their respective prerequisites. For example, Dantzig's simplex algorithm (Dantzig 66) probably represents the best known multidimensional method capable of efficiently finding the global optimum of a linear, hence convex, objective function in a search space limited by linear constraints. (A USENET FAQ on linear programming is maintained by Professor Robert Fourer <4er@iems.nwu.edu> (and "nonlinear- programming-faq") that is posted monthly to sci.op-research and is mostly interesting to read. It is also available from http://www- unix.mcs.anl.gov/otc/Guide/faq/linear-programming-faq.html ) Gradient strategies are no longer tied to these linear worlds, but they smooth their world by exploiting the objective function's first partial derivatives one has to supply in advance. Therefore, these algorithms rely on a locally linear internal model of the black box. Newton strategies additionally require the second partial derivatives, thus building a quadratic internal model. Quasi-Newton, conjugate gradient and variable metric strategies approximate this information during the search. The deterministic strategies mentioned so far cannot cope with deteriorations, so the search will stop if anticipated improvements no longer occur. In a multimodal ENVIRONMENT these algorithms move "uphill" from their respective starting points. Hence, they can only converge to the next local optimum. Newton-Raphson-methods might even diverge if a discrepancy between their internal assumptions and reality occurs. But of course, these methods turn out to be superior if a given task matches their requirements. Not relying on derivatives, polyeder strategy, pattern search and rotating coordinate search should also be mentioned here because they represent robust non-linear optimization algorithms (Schwefel 81). Dealing with technical optimization problems, one will rarely be able to write down the objective function in a closed form. We often need a SIMULATION model in order to grasp reality. In general, one cannot even expect these models to behave smoothly. Consequently, derivatives do not exist. That is why optimization algorithms that can successfully deal with black box-type situations have been developed. The increasing applicability is of course paid for by a loss of "convergence velocity," compared to algorithms specially designed for the given problem. Furthermore, the guarantee to find the global optimum no longer exists! But why turn to nature when looking for more powerful algorithms? In the attempt to create tools for various purposes, mankind has copied, more often instinctively than geniously, solutions invented by nature. Nowadays, one can prove in some cases that certain forms or structures are not only well adapted to their ENVIRONMENT but have even reached the optimum (Rosen 67). This is due to the fact that the laws of nature have remained stable during the last 3.5 billion years. For instance, at branching points the measured ratio of the diameters in a system of blood-vessels comes close to the theoretical optimum provided by the laws of fluid dynamics (2^-1/3). This, of course, only represents a limited, engineering point of view on nature. In general, nature performs adaptation, not optimization. The idea to imitate basic principles of natural processes for optimum seeking procedures emerged more than three decades ago (cf Q10.3). Although these algorithms have proven to be robust and direct OPTIMIZATION tools, it is only in the last five years that they have caught the researchers' attention. This is due to the fact that many people still look at organic EVOLUTION as a giantsized game of dice, thus ignoring the fact that this model of evolution cannot have worked: a human germ-cell comprises approximately 50,000 GENEs, each of which consists of about 300 triplets of nucleic bases. Although the four existing bases only encode 20 different amino acids, 20^15,000,000, ie circa 10^19,500,000 different GENOTYPEs had to be tested in only circa 10^17 seconds, the age of our planet. So, simply rolling the dice could not have produced the diversity of today's complex living systems. Accordingly, taking random samples from the high-dimensional parameter space of an objective function in order to hit the global optimum must fail (Monte-Carlo search). But by looking at organic evolution as a cumulative, highly parallel sieving process, the results of which pass on slightly modified into the next sieve, the amazing diversity and efficiency on earth no longer appears miraculous. When building a model, the point is to isolate the main mechanisms which have led to today's world and which have been subjected to evolution themselves. Inevitably, nature has come up with a mechanism allowing INDIVIDUALs of one SPECIES to exchange parts of their genetic information (RECOMBINATION or CROSSOVER), thus being able to meet changing environmental conditions in a better way. Dantzig, G.B. (1966) "Lineare Programmierung und Erweiterungen", Berlin: Springer. (Linear programming and extensions) Kursawe, F. (1994) " Evolution strategies: Simple models of natural processes?", Revue Internationale de Systemique, France (to appear). Rosen, R. (1967) "Optimality Principles in Biologie", London: Butterworth. Schwefel, H.-P. (1981) "Numerical Optimization of Computer Models", Chichester: Wiley. ------------------------------ Copyright (c) 1993-2000 by J. Heitkoetter and D. Beasley, all rights reserved. This FAQ may be posted to any USENET newsgroup, on-line service, or BBS as long as it is posted in its entirety and includes this copyright statement. This FAQ may not be distributed for financial gain. This FAQ may not be included in commercial collections or compilations without express permission from the author. End of ai-faq/genetic/part3 *************************** --