Genetic algorithms are a mysterious-sounding technique in a
mysterious-sounding field--artificial intelligence.
This is the problem with naming things appropriately.
When the field was labeled artificial intelligence,
it meant using mathematics to artificially create the
semblance of intelligence, but self-aggrandizing researchers
and Isaac Asimov redefined it as robots.
The name genetic algorithms does sound complex and
has a faintly magical ring to it, but it turns out that
they are one of the simplest and most intuitive concepts
you'll encounter in A.I.
What Can Genetic Algorithms Do?
In a word, genetic algorithms optimize. They can find better answers to
a question, but not solve new questions. Given the definition of a car,
they might create a better car,
but they'll never give you an airplane.
I Like The Term Genetic Programming More!
Me too--it has a much nicer ring to it--but unfortunately using
one or the other isn't a personal preference; they refer to different
techniques. Genetic algorithms are about optimization,
while genetic programming is about using the techniques from
genetic algorithms to build computer programs from primordial
programming language soup.
Imagine, if you will, that you have spent the last decade trying
to write a "Hello World" program in Scheme, but simply cannot
overcome that tremendous hurdle. But imagine that you were smart
enough to put together a tremendous list of commands and strings,
such that some combination could print "Hello World" to standard
out. Then you could mix and match the commands and strings and
eventually you might generate "Hello World", much like monkeys
typing randomly will--given an infinitely long period of time--recreate Shakespeare. Sort of.
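The monkeys-typing idea above can be sketched in a few lines.
This is purely a toy illustration (written in Python 3 syntax,
unlike the article's later listings); the fragment list and
function names are mine, not part of any real genetic
programming system:

```python
import random

random.seed(0)

# A hypothetical "primordial soup" of fragments; only one
# ordering of three fragments joins into the target string.
FRAGMENTS = ["World", "Goodbye", "Hello", " ", "!"]
TARGET = "Hello World"

def random_program(length):
    """Glue together a random sequence of fragments."""
    return "".join(random.choice(FRAGMENTS) for _ in range(length))

def search(target, length=3, max_tries=100000):
    """Randomly recombine fragments until the target appears.

    Returns the number of tries it took, or None on failure.
    """
    for tries in range(1, max_tries + 1):
        if random_program(length) == target:
            return tries
    return None

print(search(TARGET) is not None)
```

With only 125 possible three-fragment combinations, blind
random search stumbles onto "Hello World" quickly; genetic
programming's contribution is making that search vastly less
blind.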
So, where genetic algorithms are a collection of optimization
techniques, genetic programming is a novel approach for solving
problems for which you already know the answer. Ok, ok, this isn't
a fair description. We'll take a more in-depth look in another
article; for the time being, let's just remember they aren't the same.
Defining a Problem to Optimize
Now we're going to put together a simple example of using a genetic
algorithm in Python. We're going to optimize a very simple problem:
trying to create a list of N numbers that equal X when summed together.
If we set N = 5 and X = 200, then these would all be appropriate
solutions: [40, 40, 40, 40, 40], [50, 50, 50, 25, 25], and
[200, 0, 0, 0, 0].
(I've been told that inserting "Pretty easy, right?" at random
junctions of articles is not helpful. Otherwise I would have
uselessly added one here.)
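Before optimizing anything, it helps to pin down what counts as
a solution. A minimal checker for this problem (the function
name is mine, and it uses Python 3 syntax):

```python
def is_solution(candidate, n, x):
    """A candidate solves the problem if it is a list of
    n numbers whose sum equals x."""
    return len(candidate) == n and sum(candidate) == x

# With N = 5 and X = 200:
print(is_solution([40, 40, 40, 40, 40], 5, 200))  # True
print(is_solution([50, 50, 50, 25, 25], 5, 200))  # True
print(is_solution([1, 2, 3, 4, 5], 5, 200))       # False
```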
Ingredients of The Solution
Each suggested solution for a genetic algorithm is referred to
as an individual. In our current problem, each list of N
numbers is an individual.
>>> from random import randint
>>> def individual(length, min, max):
...     'Create a member of the population.'
...     return [ randint(min, max) for x in xrange(length) ]
...
>>> individual(5, 0, 100)
[79, 0, 20, 47, 40]
>>> individual(5, 0, 100)
[64, 1, 25, 84, 87]
The collection of all individuals is referred to as our population.
>>> def population(count, length, min, max):
...     """
...     Create a number of individuals (i.e. a population).
...
...     count: the number of individuals in the population
...     length: the number of values per individual
...     min: the min possible value in an individual's list of values
...     max: the max possible value in an individual's list of values
...
...     """
...     return [ individual(length, min, max) for x in xrange(count) ]
...
>>> population(3, 5, 0, 100)
[[51, 55, 73, 0, 80], [3, 47, 18, 65, 55], [17, 64, 77, 43, 48]]
Next we need a way to judge how effective each solution is;
that is, to judge the fitness of each individual. Predictably
enough, we call this the fitness function. For our problem, we
want the fitness to be a function of the distance between the
sum of an individual's numbers and the target number X.
We can implement the fitness function as follows:
>>> from operator import add
>>> def fitness(individual, target):
...     """
...     Determine the fitness of an individual. Lower is better.
...
...     individual: the individual to evaluate
...     target: the sum of numbers that individuals are aiming for
...     """
...     sum = reduce(add, individual, 0)
...     return abs(target - sum)
...
>>> x = individual(5, 0, 100)
>>> fitness(x, 200)
165
Personally, I'd prefer to have a high fitness score correlate to
a fit individual rather than the current implementation, where a
perfectly fit individual has a fitness of 0, and the higher the
score, the worse the individual. Ah well; regardless, keep that
detail in mind while following this code.
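If you do prefer higher-is-better, one common trick (not part of
this article's code, just a sketch in Python 3 syntax) is to map
the distance into (0, 1] so a perfect individual scores 1.0:

```python
def fitness_higher_is_better(individual, target):
    """Map distance-to-target into (0, 1]; 1.0 is perfect."""
    distance = abs(target - sum(individual))
    return 1.0 / (1.0 + distance)

print(fitness_higher_is_better([40, 40, 40, 40, 40], 200))  # 1.0
print(fitness_higher_is_better([50, 50, 50, 50, 50], 200))  # 1.0 / 51
```

The rest of the article sticks with the lower-is-better version.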
It's also helpful to create a function that will determine a
population's average fitness, which we'll call its grade.
>>> def grade(pop, target):
...     'Find average fitness for a population.'
...     summed = reduce(add, (fitness(x, target) for x in pop), 0)
...     return summed / (len(pop) * 1.0)
...
>>> x = population(3, 5, 0, 100)
>>> target = 200
>>> grade(x, target)
116
Now we just need a way to evolve our population; to advance the
population from one generation to the next.
This is the secret sauce of genetic algorithms, where secret
means fairly obvious, and sauce means sauce. Consider a population
of elk which are ruthlessly hunted by a pack of wolves.
With each generation the weakest are eaten by the wolves, and
then the strongest elk reproduce and have children.
Abstract those ideas a bit, and we can implement the evolution
mechanism.
For each generation we'll take a portion of the best
performing individuals as judged by our fitness function.
These high-performers will be the parents of the next generation.
We'll also randomly select some lesser performing individuals
to be parents, because we want to promote genetic diversity.
Abandoning the metaphor, one of the dangers of optimization
algorithms is getting stuck at a local maximum and consequently
being unable to find the real maximum. By including some
individuals who are not performing as well, we decrease our
likelihood of getting stuck.
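The selection step just described can be sketched on its own.
This is a standalone illustration in Python 3 syntax (the helper
names are mine; the final module uses slightly different code):

```python
import random

def select_parents(population, fitness, retain=0.2, random_select=0.05):
    """Keep the best `retain` fraction of the population, plus
    a few randomly chosen lesser performers for diversity."""
    ranked = sorted(population, key=fitness)   # best (lowest) first
    retain_length = int(len(ranked) * retain)
    parents = ranked[:retain_length]
    for loser in ranked[retain_length:]:
        if random_select > random.random():
            parents.append(loser)              # promote genetic diversity
    return parents

random.seed(1)
pop = [[random.randint(0, 100) for _ in range(5)] for _ in range(100)]
fit = lambda ind: abs(200 - sum(ind))
parents = select_parents(pop, fit)
print(len(parents) >= 20)  # the top 20%, plus any lucky stragglers
```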
Breed together parents to repopulate the population to its
desired size (if you take the top 20 individuals in a population
of 100, then you'd need to create 80 new children via breeding).
In our case, breeding is pretty basic: take the first N/2 digits
from the father and the last N/2 digits from the mother. Finally,
we'll mutate a small, random portion of each generation, which
further protects us against settling on a local maximum.
With 20% survival (plus an additional 5% of other individuals)
and 1% mutation, it only took nine generations to reach a
perfect solution. Then the algorithm joyfully runs in circles
for as long as you'll let the mutations continue.

But this is a good feeling, right? If it only took us half an
hour to solve a problem of this magnitude, imagine what we could
do with a day. A genetic algorithm for optimizing your Apache2
configuration file for the number of child processes? Easy as pie.
There is a nearly endless variety of techniques for, and
variations of, genetic algorithms, but all of them rest on this
straightforward foundation. We'll look more at those sometime in
the future, but for now you know enough to go out and throw
together something interesting.
"""# Example usagefrom genetic import *target = 371p_count = 100i_length = 6i_min = 0i_max = 100p = population(p_count, i_length, i_min, i_max)fitness_history = [grade(p, target),]for i in xrange(100): p = evolve(p, target) fitness_history.append(grade(p, target))
for datum in fitness_history: print datum"""fromrandomimportrandint,randomfromoperatorimportadd
defindividual(length,min,max):'Create a member of the population.'return[randint(min,max)forxinxrange(length)]
defpopulation(count,length,min,max):""" Create a number of individuals (i.e. a population).
count: the number of individuals in the population length: the number of values per individual min: the minimum possible value in an individual's list of values max: the maximum possible value in an individual's list of values
deffitness(individual,target):""" Determine the fitness of an individual. Higher is better.
individual: the individual to evaluate target: the target number individuals are aiming for """sum=reduce(add,individual,0)returnabs(target-sum)
defgrade(pop,target):'Find average fitness for a population.'summed=reduce(add,(fitness(x,target)forxinpop))returnsummed/(len(pop)*1.0)
defevolve(pop,target,retain=0.2,random_select=0.05,mutate=0.01):graded=[(fitness(x,target),x)forxinpop]graded=[xforxinsorted(graded)]retain_length=int(len(graded)*retain)parents=graded[:retain_length]# randomly add other individuals to# promote genetic diversityforindividualingraded[retain_length:]:ifrandom_select>random():parents.append(individual)# mutate some individualsforindividualinparents:ifmutate>random():pos_to_mutate=randint(0,len(individual)-1)# this mutation is not ideal, because it# restricts the range of possible values,# but the function is unaware of the min/max# values used to create the individuals,individual[pos_to_mutate]=randint(min(individual),max(individual))# crossover parents to create childrenparents_length=len(parents)desired_length=len(pop)-parents_lengthchildren=whilelen(children)<desired_length:male=randint(0,parents_length-1)female=randint(0,parents_length-1)ifmale!=female:male=parents[male]female=parents[female]half=len(male)/2child=male[:half]+female[half:]children.append(child) parents.extend(children)returnparents
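For readers on Python 3, here is a compact port of the listing
above as a single runnable script, with an early-stopping check
added (my addition, not in the original): the loop breaks once
any individual's fitness reaches 0.

```python
import random

def individual(length, i_min, i_max):
    # One candidate solution: a list of `length` random numbers.
    return [random.randint(i_min, i_max) for _ in range(length)]

def population(count, length, i_min, i_max):
    return [individual(length, i_min, i_max) for _ in range(count)]

def fitness(ind, target):
    # Lower is better; 0 means the individual sums exactly to target.
    return abs(target - sum(ind))

def grade(pop, target):
    # Average fitness across the whole population.
    return sum(fitness(x, target) for x in pop) / len(pop)

def evolve(pop, target, retain=0.2, random_select=0.05, mutate=0.01):
    # Rank individuals from best (lowest fitness) to worst.
    graded = [x[1] for x in sorted((fitness(x, target), x) for x in pop)]
    retain_length = int(len(graded) * retain)
    parents = graded[:retain_length]
    # Randomly keep some lesser performers for genetic diversity.
    for ind in graded[retain_length:]:
        if random_select > random.random():
            parents.append(ind)
    # Mutate a small fraction of the parents.
    for ind in parents:
        if mutate > random.random():
            pos = random.randint(0, len(ind) - 1)
            ind[pos] = random.randint(min(ind), max(ind))
    # Crossover: first half from one parent, second from another.
    children = []
    while len(parents) + len(children) < len(pop):
        male, female = random.sample(parents, 2)
        half = len(male) // 2
        children.append(male[:half] + female[half:])
    return parents + children

random.seed(42)
target = 371
p = population(100, 6, 0, 100)
history = [grade(p, target)]
for _ in range(100):
    p = evolve(p, target)
    history.append(grade(p, target))
    if min(fitness(x, target) for x in p) == 0:
        break  # a perfect individual has appeared

print(history[0], "->", history[-1])
```

`random.sample(parents, 2)` guarantees two distinct parents, which
replaces the original's `male != female` retry check.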