Genetic algorithms (GA) work by simulating the logic of Darwinian selection, where only the best are selected for replication. Over many generations, natural populations evolve according to the principles of natural selection and stated by Charles Darwin in The Origin of Species. Only the most suited elements in a population are likely to survive and generate offspring, thus transmitting their biological heredity to new generations.

Genetic algorithms are able to address complicated problems with many variables and a large number of possible outcomes by simulating the evolutionary process of “survival of the fittest” to reach a defined goal. They operate by generating many random answers to a problem, eliminating the worst and cross-pollinating better answers. Repeating this elimination and regeneration process gradually improves the quality of the answers to an optimal or near-optimal condition.

In computing terms, a genetic algorithm implements the model of computation by having arrays of bits or characters (binary string) to represent the chromosomes.  Each string represents a potential solution.   The genetic algorithm then manipulates the most promising chromosomes searching for improved solutions. A genetic algorithm operates through a cycle of three stages:

  1. Build and maintain a population of solutions to a problem
  2. Choose the better solutions for recombination with each other
  3. Use their offspring to replace poorer solutions.

GA Coding

Each individual of a population represents a possible solution to a given problem. Each individual is assigned a “fitness score” according to how good a solution to the problem it is.

A potential solution to a problem may be represented as a set of parameters. For example, if our problem is to maximize a function of three variables, F(x; y; z), we might represent each variable by a 10-bit binary number (suitably scaled). Our chromosome would therefore contain three genes, and consist of 30 binary digits.

Fitness function

A Fitness function must be specific for each problem to be solved. Given a particular chromosome, the fitness function returns a single numerical merit proportional to the utility of the individual that chromosome represents.

Reproduction

During the reproductive phase of the GA, individuals are selected from the population and recombined. Parents are selected randomly from the population using a scheme which favors individuals with higher fitness scores.

Having selected two parents, their chromosomes are recombined, typically using the mechanisms of crossover and mutation:

Crossover takes two individuals, and cuts their chromosome strings at some randomly chosen position, to produce two “head” segments, and two “tail” segments. The tail segments are then swapped over to produce two new full length chromosomes. The two individual each inherit some genes from each parent.

Mutation is applied to each child individually after crossover. It randomly alters each gene with a small probability (typically 0.001).

If the GA has been correctly implemented, the population will evolve over successive generations so that the fitness of the best and the average individual in each generation increases towards the global optimum.

The basic Genetic algorithm is :

Initialize a random population of individuals
Compute fitness of each individual
WHILE NOT finished DO
BEGIN /* produce new generation */
FOR population_size DO
BEGIN /* reproductive cycle */
Select two individuals from old generation, recombine the
 two individuals to give two offspring
Make a mutation for selected individuals by altering a
 random bit in a string
Create a new generation (new populations)
END
IF population has converged THEN
finished := TRUE
END

A typical application of this technology would be to try to determine the best route for delivering product from point A to point B, when the conditions may periodically change. These conditions could be related to weather, road conditions, traffic flow, rush hour, etc. Many times the best route could be the fastest, shortest, most scenic, most cost effective, or a combination thereof. No one answer to the question will always be perfectly correct. It may get you from point A to B, but not necessarily to everyone’s liking, or every time of the day or week. In fact, typically the more options you provide, the more accurate the answer will be.

Genetic algorithms provide various benefits to existing machine learning technologies such as being able to be used by data mining for the field/attribute selection, and can be combined with neural networks to determine optimal weights and architecture.