(from a presentation given to \\ ITP's [[http://www.shiffman.net/teaching/the-nature-of-code/|Nature of Code]] class in Nov 2006) === INTRO === In my part of the presentation I would like to start out with a compilation of [[http://www.spore.com/|Spore]] footage. In succession I will speculate where and how genetic algorithms are applied in this game...at which point it does not really matter if I am right or wrong as long as this illustrates an interesting application. Then I will pull back from this special case...and the way GAs are used in Spore is a special case. Even the darwinistic model is a special case...I will pull back and talk about the general concept of GAs. In fact I will pull back two steps and also talk about the branch of computer science of which GAs are a subfield. This is the field of heuristics. Heuristics is interesting as it also demarcates a philosophical division within CS. [[http://file.stefanix.net/spore-compilation.mov|video]] === Genetic Algorithms in Spore === I assume that you are all familiar with the darwinistic model. In its original form it is utterly simply. In fact so simple that it is hard to see all its implications. This is a key quality of complex systems so that they actually need to be simulated to unveil their nature. Firstly, we need something that proscribes the features of the creature. This can be random mutation, a kind of recombination of features from different creatures, some kind of intelligence design agent, or any combination of the above. In Spore the player is the intelligent agent who designs a creature in the creature editor. Secondly, creatures need to be tested for fitness and in some form this fitness needs to be quantifiable. This fitness test is often done in form of a simulation but can also be a simple rating algorithm. In Spore a creature is released into the environment and the fitness test is how long the creature survives and how often it mates which is probably also dependent on how well it gathers food. Thirdly, fitness of the creatures needs to have a direct proportional influence on how well they procreate. In spore the creatures mate and instantly lay eggs. If they die they lay no more eggs. In a more abstracted way the creatures with higher fitness are duplicated various times according to their fitness score. In an idealistic system where the test for fitness does not change this increases the average fitness of the system's creatures over time. === Genetic Algorithms in General === In its general form GAs are an iterative way of coming up with solution candidates for a complex problem. (Complexity theory has its own classification scheme for complex problems--NP-hard, NP-complete, etc.) In each iteration/generation the best previous candidates are used to derive new, potentially better candidates until a satisficing solution is reached. The way new candidates are derived and how they are evaluated is up to the imagination of the author. Typically these problems have conflicting objectives which have to be balanced or they require a series of decisions where each decision changes the problem for the next decision (e.g. [[http://en.wikipedia.org/wiki/Traveling_Salesman_Problem|salesman problem]]). **Example Problem:** Design a spore carnivore that can eat as much food as possible. **Parameters:** //head-size// and //claw-size//. **Evaluation:** The bigger the head the more it can swallow. The bigger the claws the more it can catch. The lighter it is the faster it can run and the more it can catch. == claw-size problem domain == {{:nature-of-code:claw-size.png?200}} == head-size prooblem domain == {{:nature-of-code:head-size.png?200}} == combined solution domain == {{:nature-of-code:problem-domain.png?200}} Any point on this graph is a solution candidate. While it is easy to see the optimum solution if all solution candidates are generated this is often not feasible in a practical amount of time. GA is a way to generate solution candidates that converge with the peaks of the graph over many generations. As such this is a path finding problem. There are many other approaches for finding a path to the peak. Some of the more common ones are: * Ant colony optimization (ACO) * Memetic algorithm (MA) === IN CONTEXT === When I think of genetic algorithms I do not think of them as being just another computer science gimmick. They are rather part of an entirely separate branch of CS. I like to think of the practitioners in CS as the //accountants// and the //hackers//. The former think of CS as a discipline of formal language and hardware components the latter as a way of framing thought and executing magic spells. "Magic spell" sounds much better than "formal language" but depending on the problem domain throwing around with magic can be rather unpractical. There is of course a whole list of pros, cons and differences which distinguish these two philosophies: == hacking == * vast interest in simulations * and non-deterministic behavior * considers Goedel backlesh * sees complexity sciences as an answer to it * the solution avenue changes as the code is written == accounting == * in denial of Goedel's Incompleteness Theorem * writes code based on clear specs * finds unanticipated behavior undesirable * design and implementation are separate phases * predictable behavior There is a fundamental distinction between these two camps which brings out most of the listed attributes. Being on one side or the other depends on whether an **analytical** or a **heuristic** approach is taken. Analytical strategies have a substantial mathematical heritage. As such they are strictly deductive. The solution is a proposition that can be directly derived from a simple set of axioms and deduction rules. If a solution can be found and how long it takes can be estimated very well (o-notaion). Heuristic or search strategies are different in many regards. Instructing a computer how to search is just as vague as explaining this to a person. Explaining how to find the most authentic Cuban restaurant in New York is difficult in that there is no standard way of doing this. Finding something is a complex problem and in this sense something what many people would describe as an art. These philosophical difference can also be identified in the way people write code. Some implement from specifications others massage the code until they like what it does. Implementing from specs strips out most of the design from the development process. Massaging the code on the other hand is clearly a design process and in fact a search problem. At any state in its development cycle, the code body represents a solution candidate. Hacking on it is a kind of informed mutation and recombination of features. Liking the code less or more corresponds to the the evaluation of fitness test. I wish all the hackers out there a happy evolution of their sources. Thank you. === Bibliography === * [[http://en.wikipedia.org/wiki/Genetic_algorithm|Genetic Algorithm]] on Wikipedia * [[http://en.wikipedia.org/wiki/Combinatorial_optimization|Combinatorial Optimization]] on Wikipedia * [[http://en.wikipedia.org/wiki/John_Henry_Holland|John Holland]] on Wikipedia * [[http://cscs.umich.edu/~crshalizi/notebooks/john-holland.html|John Holland]] on Cosma's Notebooks * [[http://www.amazon.com/Adaptation-Natural-Artificial-Systems-Intelligence/dp/0262581116/|Adaptation in Natural and Artificial Systems]] on Amazon * [[http://cscs.umich.edu/~crshalizi/notebooks/evol-comp.html|Evolutionary Computation]] on Cosma's Notebooks * [[http://hobbiton.thisside.net/genetic/|genealgy]] a python tutorial * [[http://en.wikipedia.org/wiki/List_of_NP-complete_problems| NP-Complete Problems]] on Wikipedia Thank you Wikipedia. You are the best resource for refreshing my memory.