Evolutionary algorithms can find optimal solutions to problems. Aurora Ramírez and Chris Simons give us an overview.
Within artificial intelligence, there is a category of programming problems where although the outcomes are known in terms of maximising or minimising some desired objective, the corresponding input values required to achieve this outcome are unknown. Such problems are known as optimisation problems, because solving the problem requires discovering an optimal solution in terms of either best quality (i.e. maximisation), or least cost (minimisation). Optimisation problems are found across a wide range of domains, e.g. from maximising energy outputs by optimising wind farm or generator placement, to minimising fuel costs by optimising delivery routes.
To address such optimisation problems in computing, a widely used approach involves firstly defining the space of all possible solutions to the problem. Then, with a programmatic encoding of all possible solutions representing a search space, it’s possible to travel through the space, searching for optimal solutions.
Perhaps the simplest way to travel through the search space is to enumerate each possible solution. Such an approach is straightforward and uninformed insofar as it doesn’t use any information about the problem domain to steer the search. For small scale search spaces, this approach can be highly effective. However, some search spaces can get very large, very quickly.
For example, in the ‘travelling salesman problem’, a number of cities are to be visited by a salesman. The salesman sets out to visit the cities in turn, returning to the starting point at the end of the journey, visiting each city once only. The goal of the salesman is to minimise travelling costs and CO2 emissions by locating the path of least distance, i.e. the optimal path, around the cities. The number of possible solution paths, of course, depends on the number of cities to be visited. The greater the number of cities, the greater the number of possible solution paths. It just so happens that the number of possible solution paths is n!, where n is the number of cities. For a few cities, e.g. 4, the number of possible solution paths is 24. However, for 100 cities, the number rises to approximately 10^157, while for 1000 cities, the number is roughly 10^2567. In fact, there are many other optimisation problems that show such scaling characteristics. Examples include many well-known allocation problems such as course timetabling, nurse rostering, process scheduling, network routing, vehicle delivery scheduling, and load balancing. For problems of such increasing scale, exhaustively examining each possible solution path is beyond reasonable computation time, and so alternative approaches are required. One alternative approach is inspired by the biology of the natural world around us, i.e. natural evolution.
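To get a feel for this growth, the number of candidate tours can be computed exactly with arbitrary-precision arithmetic. The following is a small illustrative Java sketch (the class name is ours, not part of any framework):

```java
import java.math.BigInteger;

// Counts the candidate tours for n cities as n!, using exact arithmetic
// so the result doesn't overflow for large n.
public class TourCount {
    public static BigInteger factorial(int n) {
        BigInteger result = BigInteger.ONE;
        for (int i = 2; i <= n; i++) {
            result = result.multiply(BigInteger.valueOf(i));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(factorial(4));                       // 24 paths for 4 cities
        System.out.println(factorial(100).toString().length()); // 100! has 158 digits, i.e. ~10^157
    }
}
```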
Evolutionary algorithms
In nature, there are two biological aspects at the heart of natural evolution. Firstly, selection of individuals for reproduction according to their fitness ensures that superior characteristics present in a population are more likely to survive in future generations. Secondly, sexual reproduction recombines the genetic information encoding parents’ characteristics, resulting in offspring that are in some ways different to parents. In addition, gene mutation can also occasionally occur in nature. A mixture of recombination and mutation thus ensures a degree of variety in the population. Of course, all living things exist in environments that are prone to change. A combination of mechanisms for both selection and variety promotion enables a population of individuals, over time, to adapt to any changes in its environment.
Taking inspiration from natural biology, the notion of an evolutionary computing algorithm was suggested some time ago. Although the origin of the idea is not known for certain, Turing [ Turing52 ] mentions the possibility of ‘genetical programming’ in his 1950 article when considering the question “can machines think?” Perhaps the first implementation of an evolutionary algorithm was developed by Fraser in 1957 [ Fogel02 ], although this was an attempt to simulate the characteristics of species in natural evolution, rather than investigating computational optimisation. An early attempt at optimisation of the performance of finite state machines was described by Fogel et al. in 1966 [ Fogel66 ] as ‘evolutionary programming’. Subsequently, many proposals for evolutionary search and optimisation algorithms have emerged, and a recent article in Overload describes an evolutionary algorithm approach on how to ‘program your way out of a paper bag’ [ Buontempo13 ].
The first task in applying an evolutionary algorithm for optimisation is to programmatically encode a representation of solution individuals. In this regard, evolutionary algorithms are quite flexible – virtually any representation may be used (although overly complex encodings can impact algorithm performance and impede understandability). Example representations include vectors of ‘genes’ coding for solution characteristics or, where a better match to the problem domain can be found, genes encoded in graphs or tree structures. Because the solution representation codes for the ‘genes’ of the individual, the complete set of genes is referred to, using the biological term, as a ‘genotype’. It’s important to draw on the characteristics of the problem domain to help formulate a solution representation.
The second task is to implement the evolution of a population of solution individuals over many generations. Listing 1 is a typical evolutionary outline approach, taken from [ Eiben15 ].
initialise population at random
while( not done )
  evaluate each individual
  select parents
  recombine pairs of parents
  mutate new candidate individuals
  select candidates for next generation
end while

Listing 1
As can be seen, there are a number of algorithm aspects to customise to the needs of the problem domain. For example, the termination condition while(not done) for the evolutionary loop could be the point at which individuals of sufficient fitness have been found, or when a certain computational budget has been exhausted. The evaluation of each solution individual is highly problem specific, and relates to the optimisation being conducted. For maximisation problems, measures of fitness are typically related to the required quality measures of the problem. For minimisation problems, measures of fitness typically relate to the cost or expense of solution individuals. Often, solution fitness depends on more than one quality/cost measure, in which case aggregation of individual fitness values to arrive at an overall evaluation of fitness may be necessary. The algorithm also includes a degree of randomness in that only a proportion of the population may be selected as parents for the next generation, and likewise for recombination and mutation.
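As an illustration of Listing 1 in runnable form, the sketch below evolves bit strings to maximise the number of ones (the classic ‘OneMax’ problem). All class names, parameter values and operator choices here are illustrative, not taken from any framework:

```java
import java.util.Arrays;
import java.util.Random;

// A minimal, self-contained rendering of Listing 1 for the 'OneMax' problem:
// maximise the number of true genes in a boolean genotype.
public class OneMaxEA {
    static final Random RNG = new Random(42);

    // Fitness: the number of 'true' genes; the optimum equals the genotype length.
    static int fitness(boolean[] genotype) {
        int ones = 0;
        for (boolean gene : genotype) if (gene) ones++;
        return ones;
    }

    // Binary tournament: sample two individuals at random, keep the fitter one.
    static boolean[] tournament(boolean[][] population) {
        boolean[] a = population[RNG.nextInt(population.length)];
        boolean[] b = population[RNG.nextInt(population.length)];
        return fitness(a) >= fitness(b) ? a : b;
    }

    // The loop of Listing 1: evaluate, select parents, recombine, mutate, replace.
    public static boolean[] evolve(int populationSize, int genotypeLength,
                                   int generations) {
        boolean[][] population = new boolean[populationSize][genotypeLength];
        for (boolean[] individual : population)
            for (int i = 0; i < genotypeLength; i++)
                individual[i] = RNG.nextBoolean();

        for (int g = 0; g < generations; g++) {
            boolean[][] next = new boolean[populationSize][];
            for (int i = 0; i < populationSize; i++) {
                boolean[] p1 = tournament(population);
                boolean[] p2 = tournament(population);
                boolean[] child = new boolean[genotypeLength];
                int cut = RNG.nextInt(genotypeLength);      // one-point recombination
                for (int j = 0; j < genotypeLength; j++)
                    child[j] = (j < cut) ? p1[j] : p2[j];
                child[RNG.nextInt(genotypeLength)] ^= true; // mutate one gene
                next[i] = child;
            }
            population = next;
        }
        // Return the fittest individual of the final generation.
        return Arrays.stream(population)
                     .max((x, y) -> Integer.compare(fitness(x), fitness(y)))
                     .get();
    }
}
```

Here the termination condition is simply a fixed number of generations, i.e. a computational budget; swapping in a fitness threshold is a one-line change.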
Population variety is achieved by recombining and mutating genetic information. Recombination and mutation ensure variety in the offspring, which may be particularly necessary in complex optimisation problems where exploration of the search space is crucial.
Evolutionary algorithms have been applied in a wide variety of optimisation problem domains. Further information on the application of evolutionary computing is widely available, although [ Eiben15 ] is a compact and readable introduction. Given the relative maturity of evolutionary computing, it’s perhaps not surprising that a number of frameworks have emerged to provide programmers with reusable components and interfaces for customisation to a variety of problem domain applications.
Evolutionary frameworks for optimisation
Evolutionary frameworks for optimisation allow rapid, exploratory prototyping with evolutionary algorithms by means of generic ‘building blocks’ (i.e. components and interfaces) [ Parejo12 ] that, when properly configured or extended, address a domain-specific problem (such as the travelling salesman problem). These building blocks usually correspond to the steps of an evolutionary algorithm. For example, there may be a component to create a population of solution individuals, another to select the best ones, others to produce new solution individuals via recombination and mutation, etc. [ Gagné06 ]. In addition, frameworks often facilitate the execution and monitoring of the algorithm with general capabilities such as loading problem-domain specific information from configuration files, monitoring progress and generating reports.
In short, characteristics of evolutionary computing frameworks for optimisation include:
- adaptable search components to create customised implementations;
- mechanisms for the integration of problem-specific knowledge, such as problem constraints and fitness function(s);
- components to configure and monitor the execution, thus allowing the user to set any required execution parameter and visualise intermediate results;
- general utilities to conduct experiments, including batch processing and parallel execution; and
- designed with best practices, such as design patterns, in mind.
The use of such frameworks can bring many advantages for the programmer seeking to implement optimisation programs. Firstly, coding effort greatly decreases and, to a certain extent, quality and correctness of code are both ensured. Additional utilities, such as benchmarks and graphic environments, might also be important to some users. In general, the communities behind the development of these frameworks provide us with complete, open-source programming environments that can also be easily integrated in external tools. Subject to the learning curve of the framework, it’s possible to get up and running with evolutionary optimisation programs very quickly, especially when compared with programming an evolutionary algorithm from scratch.
However, there are also some challenges to using such a framework. One is that a considerable number are currently available, making it difficult to know which one best fits the needs of the problem domain. As is typical in optimisation problems, a unique global optimal solution may or may not exist, and so it’s recommended that programmers trial a shortlist of frameworks and evaluate their performance for a specific problem. This leads to a second challenge, the learning curve. Some frameworks have been developed by various open source initiatives, including academics as part of their on-going research activities. Because of this, a few evolutionary frameworks may not be regularly maintained, and their documentation may not always be readily available and up-to-date. Moreover, some knowledge about the specific algorithm variants and the influence of setting their execution parameters might be also required to make the most of the evolutionary techniques. Finally, regarding the integration of these frameworks with existing tools, there may be some restrictions in terms of programming languages and platforms available.
Having said that, evolutionary algorithms have been applied to a wide range of problem domains, so implementations for many widely used programming languages are available. Table 1 lists some mature and popular frameworks written in C++, Java, C#, Python and Matlab, together with their latest available version, and the date of release for the latest version. Links to further details on each framework are available in the references section at the end of the paper.
Table 1
C++ libraries such as Evolving Objects and OpenBeagle were early attempts to popularise the use of evolutionary algorithms, and precursors of more up-to-date developments. OptFrame and Evolutionary Computation Framework (ECF) are good active examples of this, both supporting parallelism via MPI (Message Passing Interface). OptFrame also provides implementations of the MapReduce paradigm, whereas the particular strength of ECF is straightforward configuration with few parameters. jMetal is a popular framework, and is the only framework available in three different programming languages. Originally coded in Java, it is focused on implementations of evolutionary algorithms to solve multi-objective problems. MOEA Framework is another recent library to solve this type of problem, with well-documented and tested code. The most mature Java library is the Java-based Evolutionary Computation Research System (ECJ), which offers a great variety of search algorithms. Although perhaps less known, Java Class Library for Evolutionary Computation (JCLEC) provides extensible modules for more advanced techniques including machine learning, whereas Evolutionary Algorithms Workbench (EvA) and Opt4J include basic graphical user interfaces. However, the framework with the most complete graphical environment is probably HeuristicLab, developed under the .NET framework but also compatible with Linux systems. Finally, new developments are appearing for Matlab and Python, and among those worthy of mention is the PyGMO/PaGMO project, initially developed by the European Space Agency. PaGMO has been recently rebuilt to comply with language features of C++14 and 17.
Application example
The high-level programming interface of the Java Class Library for Evolutionary Computation (JCLEC) [ Ventura08 ] is a representative example of an evolutionary optimisation framework. This open source library also offers extension modules to develop multi-objective algorithms and machine learning approaches, as well as a visual environment to run experiments.
Every evolutionary algorithm in JCLEC extends the class PopulationAlgorithm, which defines the main steps of the evolutionary process as shown in Listing 2.
public abstract class PopulationAlgorithm
    extends AbstractAlgorithm {
  doInit(): void
  doSelection(): void
  doGeneration(): void
  doReplacement(): void
  doControl(): void
}

Listing 2
To initialise the population, a component named Provider is invoked with the number of solutions to be created as a parameter. Classes extending the interfaces IIndividual and ISpecies are those defining the specific characteristics of the optimisation problem. On the one hand, each solution individual should contain a fitness object representing its quality. For the travelling salesman problem, an individual represents a possible route, while the fitness quality measure is the total distance to be minimised. General methods to copy and compare solutions are also required throughout the search process. The species provides additional information about the problem and how it is encoded. In our example, this element will contain the number of cities. Therefore, the species can create random routes, which are stored as permutations of the cities. (See Listing 3.)
public interface IIndividual {
  getFitness(): IFitness
  setFitness(IFitness): void
  copy(): IIndividual
  equals(Object): boolean
}

public interface IProvider {
  provide(int): List<IIndividual>
}

public interface ISpecies {
  createIndividual(T[]): IIndividual
}

Listing 3
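To make the roles of these interfaces concrete, a standalone sketch (illustrative code, not taken from JCLEC) of a travelling-salesman individual might look like this, holding a permutation of city indices as its genotype together with its fitness once evaluated:

```java
import java.util.Arrays;

// Illustrative sketch of a TSP solution individual: the genotype is a
// permutation of city indices; the fitness (total route distance) is set
// later by an evaluator component.
public class RouteIndividual {
    private final int[] genotype;        // visiting order of the cities
    private double fitness = Double.NaN; // NaN until evaluated

    public RouteIndividual(int[] genotype) { this.genotype = genotype.clone(); }

    public int[] getGenotype() { return genotype.clone(); }
    public double getFitness() { return fitness; }
    public void setFitness(double fitness) { this.fitness = fitness; }

    // Deep copy, preserving any fitness already computed.
    public RouteIndividual copy() {
        RouteIndividual clone = new RouteIndividual(genotype);
        clone.fitness = fitness;
        return clone;
    }

    @Override public boolean equals(Object other) {
        return other instanceof RouteIndividual
            && Arrays.equals(genotype, ((RouteIndividual) other).genotype);
    }
    @Override public int hashCode() { return Arrays.hashCode(genotype); }
}
```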
A further component evaluates the quality of random solutions after their creation. The interface IEvaluator defines methods to evaluate a list of individuals, count the number of evaluations (which might be a stopping criterion) and adapt the comparator to the specific problem. This way, individuals can be compared in terms of their fitness values for both maximisation and minimisation problems. The evaluation of the travelling salesman problem will assess the distance between the consecutive cities on the route, as defined by the permutation representing the solution. The total distance is to be minimised.
For clearly defined optimisation problems such as the travelling salesman problem, the quality of the solution is quantified using a double floating-point value. In this case, the interface IValueFitness provides the necessary methods to keep and retrieve the computed value, as well as additional methods to check if the value ‘is good enough’ against some criteria, and make copies. These two latter methods are inherited from the more general interface IFitness. (See Listing 4.)
public interface IEvaluator {
  evaluate(List<IIndividual>): void
  getNumberOfEvaluations(): int
  getComparator(): Comparator<IFitness>
}

public interface IValueFitness extends IFitness {
  getValue(): double
  setValue(double): void
  isAcceptable(): boolean
  copy(): IFitness
}

Listing 4
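The fitness computation described above can be sketched independently of the framework. The helper below assumes cities are given as 2D coordinates; the class and method names are ours, not JCLEC’s:

```java
// Illustrative sketch of the TSP evaluation: sum the distances between
// consecutive cities on the route, closing the tour back to the start.
public class RouteEvaluator {
    // cities[i] = {x, y} coordinates; route is a permutation of city indices.
    public static double totalDistance(double[][] cities, int[] route) {
        double total = 0.0;
        for (int i = 0; i < route.length; i++) {
            double[] from = cities[route[i]];
            double[] to   = cities[route[(i + 1) % route.length]]; // wrap to start
            total += Math.hypot(from[0] - to[0], from[1] - to[1]);
        }
        return total;
    }
}
```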
Once the population has been created, each generation of the evolution is performed by sequentially invoking the following methods (see Listing 2): doSelection(), doGeneration(), doReplacement() and doControl(). For selection, the most common approach is to delegate the functionality to a ‘selector’, a class implementing the interface ISelector. In JCLEC, diverse selection methods can be implemented via polymorphism, as shown in Listing 5. More specifically, the first method just picks as many solutions as there are in the received list (repetition is allowed by default), the second specifies the number of solutions to be returned, and the third also allows the programmer to indicate whether repetition is permitted. One classic example of a selection mechanism is a tournament between solutions selected at random, wherein the best one wins the competition based on fitness.
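Tournament selection itself is simple enough to sketch outside any framework. The generic method below samples a fixed number of candidates at random and keeps the fittest; the names and signature are illustrative, not JCLEC’s:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Random;

// Illustrative sketch of tournament selection: for each slot, sample
// tournamentSize candidates at random and keep the one the comparator
// ranks highest (i.e. the fittest).
public class TournamentSelection {
    public static <T> List<T> select(List<T> population, int howMany,
                                     int tournamentSize, Comparator<T> byFitness,
                                     Random rng) {
        List<T> chosen = new ArrayList<>();
        for (int i = 0; i < howMany; i++) {
            T best = population.get(rng.nextInt(population.size()));
            for (int j = 1; j < tournamentSize; j++) {
                T rival = population.get(rng.nextInt(population.size()));
                if (byFitness.compare(rival, best) > 0) best = rival; // rival is fitter
            }
            chosen.add(best);
        }
        return chosen;
    }
}
```

Note that repetition is allowed, matching the default described above: the same individual can win several tournaments and be selected more than once.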
The next step is to generate new solutions by recombining and mutating the chosen ones. Again, independent components are coded to perform each task, using the interfaces IRecombinator and IMutator as reference. In their simplest form, both genetic operators receive a list of individuals and return another one with the modified solutions. Specific classes in JCLEC extend this idea to support the association of probabilities and transform the solutions depending on how the solutions are encoded. For the travelling salesman problem, a recombinator can interchange subroutes of two solutions to produce new ones. Mutation might just swap two cities for a given route. (See Listing 5.)
public interface ISelector {
  select(List<IIndividual>): List<IIndividual>
  select(List<IIndividual>, int): List<IIndividual>
  select(List<IIndividual>, int, boolean): List<IIndividual>
}

public interface IRecombinator {
  recombine(List<IIndividual>): List<IIndividual>
}

public interface IMutator {
  mutate(List<IIndividual>): List<IIndividual>
}

Listing 5
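The swap mutation mentioned above can be sketched as follows. This is an illustrative operator, not the JCLEC one; the key property is that exchanging two cities always leaves a valid permutation:

```java
import java.util.Random;

// Illustrative swap mutation for a TSP route: copy the route, then
// exchange two randomly chosen positions. The parent is left untouched
// and the child is still a permutation of the same cities.
public class SwapMutation {
    public static int[] mutate(int[] route, Random rng) {
        int[] child = route.clone();
        int i = rng.nextInt(child.length);
        int j = rng.nextInt(child.length);
        int tmp = child[i];
        child[i] = child[j];
        child[j] = tmp;
        return child;
    }
}
```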
Relationships between the various interfaces offered by JCLEC are as shown in Figure 1.
Figure 1
Both user-defined and existing classes can be combined in JCLEC to run an evolutionary algorithm. To put all the components together, a configuration file in XML format specifies which class implements each component. JCLEC uses Java Reflection to instantiate these classes ‘on the fly’ during the configuration process, and then launches the execution. Listeners may be added to generate intermediate reports at a desired frequency, for example, at a specified number of evolutionary generations.
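The ‘instantiate by name’ idea behind this configuration step can be sketched with core Java reflection. The factory below is illustrative, not JCLEC’s implementation; it simply turns a class name read from configuration into an object via the class’s no-argument constructor:

```java
// Illustrative sketch of reflective instantiation: the class name would come
// from the XML configuration file, and reflection builds the component.
public class ComponentFactory {
    public static Object create(String className)
            throws ReflectiveOperationException {
        return Class.forName(className)
                    .getDeclaredConstructor()
                    .newInstance();
    }
}
```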
Taking the travelling salesman problem as an example, one possible resulting configuration is shown below. In this experiment, the evolving population comprises 100 solution individuals, and the termination criterion is 10000 fitness evaluations. The problem to be optimised in this experiment relates to 52 locations in the city of Berlin, and the specification of the 52 locations is held in a file berlin52.tsp . The genes of the solution genotype are encoded as an ordered array, containing 52 ordered elements – each element identifying a single location. The evaluation of each solution individual involves computing the sum of the distance between each location in the order specified in the genotype. In evolutionary computing terms, the total distance is the ‘fitness’ of the individual. In this experiment, optimisation involves minimising the distance to arrive at the least cost solution.
Parent individuals are selected by ‘tournament’ selection, i.e. two individuals are placed in a tournament where the single fittest wins. The probability that two selected parents will be recombined is 0.9, while the probability that recombined individuals will be further mutated is 0.2. Finally, intermediate reports are generated at every 10 generations of evolution (see Listing 6).
<experiment>
  <process algorithm-type="net.sf.jclec.algorithm.classic.SGE">
    <rand-gen-factory type="net.sf.jclec.util.random.RanecuFactory" seed="123"/>
    <population-size>100</population-size>
    <max-of-evaluations>10000</max-of-evaluations>
    <species type="net.sf.jclec.orderarray.OrderArrayIndividualSpecies" genotype-length="52"/>
    <evaluator type="tutorial.TSP" file-name="berlin52.tsp" number-cities="52"/>
    <provider type="net.sf.jclec.orderarray.OrderArrayCreator"/>
    <parents-selector type="net.sf.jclec.selector.TournamentSelector">
      <tournament-size>2</tournament-size>
    </parents-selector>
    <recombinator type="net.sf.jclec.orderarray.rec.OrderPMXCrossover" rec-prob="0.9" />
    <mutator type="net.sf.jclec.orderarray.mut.Order2OptMutator" mut-prob="0.2" />
    <listener type="net.sf.jclec.listener.PopulationReporter">
      <report-frequency>10</report-frequency>
      <report-on-file>true</report-on-file>
    </listener>
  </process>
</experiment>

Listing 6
After running the evolutionary optimisation experiment for the travelling salesman problem with the above configuration, the JCLEC framework provides information of the optimisation, as shown in Figure 2.
Figure 2
After 1000 generations of evolution, the best and worst solution individuals in the population are shown. The best solution individual has a fitness value of 9483.023, i.e. the distance of the path in metres. The genotype reveals the travelling order of the locations (expressed as integer identifiers) resulting in this distance. The worst solution individual has a fitness value of 12737.531. Because the population of individuals retains some diversity (even after 1000 generations of evolution), additional information about the population is also provided. Information related to the median individual in the population is also made available, as is the average fitness of the population, and population fitness variance.
Practicalities and final thoughts
There are a number of practicalities relating to the application of evolutionary optimisation, including:
- Domain specific information relating to the optimisation problem is essential. For example, some notion of maximisation or minimisation is necessary to implement fitness-based selection. Also, an understanding of appropriate solution characteristics is necessary to encode a satisfactory representation of the genetic information.
- Evolutionary optimisation is only appropriate for larger scale problems. If the scale of the solution search space is such that exhaustive enumeration of all individuals is possible, then this is preferable because we can be sure that the optimal solution has been discovered.
- Since evolutionary optimisation incorporates a degree of randomness (in selection and diversity preservation), it’s not possible to prove that the ‘best’ solution(s) discovered are, in fact, optimal. However, sometimes just getting something that’s ‘good enough’ can be great, improving the quality of the problem solution in ways otherwise not possible.
- Benchmarking evolutionary optimisation performance can be tricky. Firstly, although some comparative studies of optimisation frameworks exist (e.g. [ Parejo12 ]), standard benchmark problem instances are less readily available. Secondly, customisation of optimisation components and parameters must be consistent across different problem domains and frameworks to ensure a fair comparison. Thirdly, because a degree of randomness is present in evolutionary frameworks, results can differ over different evolutionary ‘runs’ for the same optimisation problem. In this case, it’s useful to execute many evolutionary optimisation ‘runs’ for the same problem, and perform statistical analysis if appropriate.
- What happens after a population has evolved? It’s possible for complex optimisation problems that at algorithm termination, a population of equally optimal solutions has been discovered, but many individuals are dissimilar. At this point, programmer preference and judgement might be required to choose among the available solutions.
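The multi-run advice above amounts to simple descriptive statistics over the best fitness collected from each seeded run. A minimal sketch (the helper is ours, purely illustrative):

```java
// Illustrative helper for summarising repeated stochastic runs: given the
// best fitness from each run, report the mean and (population) variance.
public class MultiRunStats {
    public static double[] meanAndVariance(double[] results) {
        double mean = 0.0;
        for (double r : results) mean += r;
        mean /= results.length;
        double variance = 0.0;
        for (double r : results) variance += (r - mean) * (r - mean);
        variance /= results.length;
        return new double[] { mean, variance };
    }
}
```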
However, even with these practicalities in mind, evolutionary optimisation frameworks offer significant ‘off-the-shelf’ optimisation capabilities for programmers. Given their on-going development and increasing maturity, they can be an attractive option for programmers working with large scale optimisation problems.
References and resources
[Buontempo13] ‘How to Program Your Way out of a Paper Bag using Genetic Algorithms’, F. Buontempo, Overload , December 2013, https://accu.org/index.php/journals/1825
[DEAP] DEAP: Distributed Evolutionary Algorithms in Python. https://github.com/DEAP (Last accessed: 23/10/17).
[ECF] ECF – Evolutionary Computation Framework. http://ecf.zemris.fer.hr/ (Last accessed: 23/10/17).
[ECJ] ECJ – A Java-based Evolutionary Computation Research System. https://cs.gmu.edu/~eclab/projects/ecj/ (Last accessed: 23/10/17).
[Eiben15] Introduction to Evolutionary Computing , 2nd Ed., A.E. Eiben, and J.E. Smith, Springer, 2015.
[EO] Evolving Objects (EO): an Evolutionary Computation Framework. http://eodev.sourceforge.net/ (Last accessed: 23/10/17).
[EvA] EvA2 – A Java based framework for Evolutionary Algorithms. http://www.ra.cs.uni-tuebingen.de/software/eva2/ (Last accessed: 23/10/17).
[Fogel02] ‘In Memoriam: Alex S. Fraser’, D. Fogel, IEEE Transactions on Evolutionary Computation , vol. 6, no. 2, pp. 429–430, 2002.
[Fogel66] Artificial Intelligence through Simulated Evolution , L.J. Fogel, A.J. Owens, and M.J. Walsh, Wiley, 1966.
[Gagné06] ‘Genericity in evolutionary computation software tools: principles and case-study’, C. Gagné, M. Parizeau, International Journal of Artificial Intelligent Tools , vol. 15, no. 2, pp. 173–194. 2006.
[GEATbx] GEATbx – The Genetic and Evolutionary Algorithm Toolbox for Matlab. http://www.geatbx.com/ (Last accessed: 23/10/17).
[GenSharp] GeneticSharp. https://github.com/giacomelli/GeneticSharp (Last accessed: 23/10/17).
[GOT] Global Optimisation Toolbox. https://www.mathworks.com/products/global-optimization.html (Last accessed: 23/10/17).
[HL] HeuristicLab – A Paradigm-Independent and Extensible Environment for Heuristic Optimization. https://dev.heuristiclab.com/trac.fcgi/ (Last accessed: 23/10/17).
[JCLEC] JCLEC – A Java Class Library for Evolutionary Computation. http://jclec.sourceforge.net/ (Last accessed: 23/10/17).
[jMetal] jMetal: a framework for multi-objective optimization with metaheuristics. https://github.com/jMetal (Last accessed: 23/10/17).
[Mallba] MALLBA Library. http://neo.lcc.uma.es/mallba/easy-mallba/index.html (Last accessed: 23/10/17).
[MOEAFram] MOEA Framework – A Free and Open Source Java Framework for Multiobjective Optimization. http://moeaframework.org/ (Last accessed: 23/10/17).
[OB] OpenBeagle – A generic C++ framework for evolutionary computation. https://github.com/chgagne/beagle (Last accessed: 23/10/17).
[Opt4J] Opt4J – A Modular Framework for Meta-heuristic Optimization. http://opt4j.sourceforge.net/ (Last accessed: 23/10/17).
[OptFrame] OptFrame. https://sourceforge.net/projects/optframe/ (Last accessed: 23/10/17).
[Pagmo] Pagmo – The C++ Scientific Library for Massively Parallel Optimization. https://esa.github.io/pagmo2/ (Last accessed: 23/10/17)
[Paradiseo] Paradiseo – A Software Framework for Metaheuristics. http://paradiseo.gforge.inria.fr/ (Last accessed: 23/10/17).
[Parejo12] ‘Metaheuristic Optimization Frameworks: A Survey and Benchmarking’, J.A. Parejo, A. Ruiz-Cortés, S. Lozano, and P. Fernandez, Soft Computing , vol. 16, no. 3, pp. 527–561, 2012.
[PlatEMO] PlatEMO – Evolutionary multi-objective optimization platform. http://bimk.ahu.edu.cn/index.php?s=/Index/Software/index.html (Last accessed: 23/10/17)
[Pyevolve] Pyevolve. http://pyevolve.sourceforge.net/ (Last accessed: 23/10/17)
[Pygmo] PyGMO – The Python Scientific Library for Massively Parallel Optimization. https://esa.github.io/pagmo2/ (Last accessed: 23/10/17)
[Pyvolution] Pyvolution – A Pure Python Evolutionary Algorithms Framework. https://pypi.python.org/pypi/Pyvolution (Last accessed: 23/10/17)
[Turing52] ‘Computing Machinery and Intelligence’, A.M. Turing, Mind , vol. 59, no. 236, pp. 433–460, 1950.
[Ventura08] ‘JCLEC: a Java framework for evolutionary computation’, S. Ventura, C. Romero, A. Zafra, J.A. Delgado, C. Hervás, Soft Computing , vol. 12, no.4, pp. 381–392, 2008.