Carlos M. Fonseca† and Peter J. Fleming‡
Dept. Automatic Control and Systems Eng.
University of Sheffield
Sheffield S1 4DU, U.K.

Abstract

The paper describes a rank-based fitness assignment method for Multiple Objective Genetic Algorithms (MOGAs). Conventional niche formation methods are extended to this class of multimodal problems and theory for setting the niche size is presented. The fitness assignment method is then modified to allow direct intervention of an external decision maker (DM). Finally, the MOGA is generalised further: the genetic algorithm is seen as the optimizing element of a multiobjective optimization loop, which also comprises the DM. It is the interaction between the two that leads to the determination of a satisfactory solution to the problem. Illustrative results of how the DM can interact with the genetic algorithm are presented. They also show the ability of the MOGA to uniformly sample regions of the trade-off surface.

1 INTRODUCTION

Whilst most real world problems require the simultaneous optimization of multiple, often competing, criteria (or objectives), the solution to such problems is usually computed by combining them into a single criterion to be optimized, according to some utility function. In many cases, however, the utility function is not well known prior to the optimization process. The whole problem should then be treated as a multiobjective problem with non-commensurable objectives. In this way, a number of solutions can be found which provide the decision maker (DM) with insight into the characteristics of the problem before a final solution is chosen.

2 VECTOR EVALUATED GENETIC ALGORITHMS

Being aware of the potential GAs have in multiobjective optimization, Schaffer (1985) proposed an extension of the simple GA (SGA) to accommodate vector-valued fitness measures, which he called the Vector Evaluated Genetic Algorithm (VEGA). The selection step was modified so that, at each generation, a number of sub-populations was generated by performing proportional selection according to
each objective function in turn. Thus, for a problem with q objectives, q sub-populations of size N/q each would be generated, assuming a population size of N. These would then be shuffled together to obtain a new population of size N, in order for the algorithm to proceed with the application of crossover and mutation in the usual way.

However, as noted by Richardson et al. (1989), shuffling all the individuals in the sub-populations together to obtain the new population is equivalent to linearly combining the fitness vector components to obtain a single-valued fitness function. The weighting coefficients, however, depend on the current population. This means that, in the general case, not only will two non-dominated individuals be sampled at different rates, but also, in the case of a concave trade-off surface, the population will tend to split into different species, each of them particularly strong in one of the objectives. Schaffer anticipated this property of VEGA and called it speciation. Speciation is undesirable in that it is opposed to the aim of finding a compromise solution.

To avoid combining objectives in any way requires a different approach to selection. The next section describes how the concept of inferiority alone can be used to perform selection.

3 A RANK-BASED FITNESS ASSIGNMENT METHOD FOR MOGAs

Consider an individual $x_i$ at generation $t$ which is dominated by $p_i^{(t)}$ individuals in the current population. Its current position in the individuals' rank can be given by

\[ \mathrm{rank}(x_i, t) = 1 + p_i^{(t)}. \]

All non-dominated individuals are assigned rank 1, see Figure 1. This is not unlike a class of selection methods proposed by Fourman (1985) for constrained optimization, and correctly establishes that the individual labelled 3 in the figure is worse than individual labelled 2, as the latter lies in a region of the trade-off which is less well described by the remaining individuals.

Figure 1: Multiobjective Ranking (axes f1 and f2; each point is labelled with its rank)

The method proposed by Goldberg (1989, p. 201) would treat these two individuals
indifferently.

Concerning fitness assignment, one should note that not all ranks will necessarily be represented in the population at a particular generation. This is also shown in the example in Figure 1, where rank 4 is absent. The traditional assignment of fitness according to rank may be extended as follows:

1. Sort population according to rank.
2. Assign fitnesses to individuals by interpolating from the best (rank 1) to the worst (rank $n^* \le N$) in the usual way, according to some function, usually linear but not necessarily.
3. Average the fitnesses of individuals with the same rank, so that all of them will be sampled at the same rate. Note that this procedure keeps the global population fitness constant while maintaining appropriate selective pressure, as defined by the function used.

The fitness assignment method just described appears as an extension of the standard assignment of fitness according to rank, to which it maps back in the case of a single objective, or that of non-competing objectives.

4 NICHE-FORMATION METHODS FOR MOGAs

Conventional fitness sharing techniques (Goldberg and Richardson, 1987; Deb and Goldberg, 1989) have been shown to be effective in preventing genetic drift in multimodal function optimization. However, they introduce another GA parameter, the niche size $\sigma_{share}$, which needs to be set carefully. The existing theory for setting the value of $\sigma_{share}$ assumes that the solution set is composed of an a priori known finite number of peaks and uniform niche placement. Upon convergence, local optima are occupied by a number of individuals proportional to their fitness values.

On the contrary, the global solution of an MO problem is flat in terms of individual fitness, and there is no way of knowing the size of the solution set beforehand, in terms of a phenotypic metric. Also, local optima are generally not interesting to the designer, who will be more concerned with obtaining a set of globally non-dominated solutions, possibly uniformly spaced and illustrative of the global
trade-off surface. The use of ranking already forces the search to concentrate only on global optima. By implementing fitness sharing in the objective value domain rather than the decision variable domain, and only between pairwise non-dominated individuals, one can expect to be able to evolve a uniformly distributed representation of the global trade-off surface.

Niche counts can be consistently incorporated into the extended fitness assignment method described in the previous section by using them to scale individual fitnesses within each rank. The proportion of fitness allocated to the set of currently non-dominated individuals as a whole will then be independent of their sharing coefficients.

4.1 CHOOSING THE PARAMETER $\sigma_{share}$

The sharing parameter $\sigma_{share}$ establishes how far apart two individuals must be in order for them to decrease each other's fitness. The exact value which would allow a number of points to sample a trade-off surface only tangentially interfering with one another obviously depends on the area of such a surface.

As noted above in this section, the size of the set of solutions to a MO problem expressed in the decision variable domain is not known, since it depends on the objective function mappings. However, when expressed in the objective value domain, and due to the definition of non-dominance, an upper limit for the size of the solution set can be calculated from the minimum and maximum values each objective assumes within that set. Let S be the solution set in the decision variable domain, f(S) the solution set in the objective domain and $y = (y_1, \ldots, y_q)$ any objective vector in f(S). Also, let

\[ m = (\min_y y_1, \ldots, \min_y y_q) = (m_1, \ldots, m_q) \]
\[ M = (\max_y y_1, \ldots, \max_y y_q) = (M_1, \ldots, M_q) \]

as illustrated in Figure 2. The definition of trade-off surface implies that any line parallel to any of the axes will have not more than one of its points in f(S), which eliminates the possibility of it being rugged, i.e., each objective is a single-valued function of the remaining objectives. Therefore, the
true area of f(S) will be less than the sum of the areas of its projections according to each of the axes. Since the maximum area of each projection will be at most the area of the corresponding face of the hyperparallelogram defined by m and M, the hyperarea of f(S) will be less than

\[ A = \sum_{i=1}^{q} \prod_{\substack{j=1 \\ j \ne i}}^{q} (M_j - m_j) \]

which is the sum of the areas of each different face of a hyperparallelogram of edges $(M_j - m_j)$ (Figure 3).

Figure 2: An Example of a Trade-off Surface in 3-Dimensional Space

In accordance with the objectives being non-commensurable, the use of the ∞-norm for measuring the distance between individuals seems to be the most natural one, while also being the simplest to compute. In this case, the user is still required to specify an individual $\sigma_{share}$ for each of the objectives. However, the metric itself does not combine objective values in any way.

Assuming that objectives are normalized so that all sharing parameters are the same, the maximum number of points that can sample area A without interfering with each other can be computed as the number of hypercubes of volume $\sigma_{share}^{q}$ that can be placed over the hyperparallelogram defined by A (Figure 4). This can be computed as the difference in volume between two hyperparallelograms, one with edges $(M_i - m_i + \sigma_{share})$ and the other with edges $(M_i - m_i)$, divided by the volume of a hypercube of edge $\sigma_{share}$, i.e.

\[ N = \frac{\prod_{i=1}^{q} (M_i - m_i + \sigma_{share}) - \prod_{i=1}^{q} (M_i - m_i)}{\sigma_{share}^{q}} \]

Figure 3: Upper Bound for the Area of a Trade-off Surface limited by the Parallelogram defined by $(m_1, m_2, m_3)$ and $(M_1, M_2, M_3)$

[...] (q−1)-order polynomial equation

\[ N \sigma_{share}^{q-1} - \frac{\prod_{i=1}^{q} (M_i - m_i + \sigma_{share}) - \prod_{i=1}^{q} (M_i - m_i)}{\sigma_{share}} = 0 \]

[...] Pareto set of interest to the DM by providing external information to the selection algorithm. The fitness assignment method described earlier was modified in order to accept such information in the form of goals to be attained, in a similar way to that used by the conventional goal attainment method (Gembicki, 1974), which will now be briefly introduced.

5.1 THE GOAL ATTAINMENT METHOD

The goal attainment method solves the
multiobjective optimization problem defined as

\[ \min_{x \in \Omega} f(x) \]

where x is the design parameter vector, Ω the feasible parameter space and f the vector objective function, by converting it into the following nonlinear programming problem:

\[ \min_{\lambda,\, x \in \Omega} \lambda \]

such that

\[ f_i - w_i \lambda \le g_i \]

Here, $g_i$ are goals for the design objectives $f_i$, and $w_i \ge 0$ are weights, all of them specified by the designer beforehand. The minimization of the scalar λ leads to the finding of a non-dominated solution which under- or over-attains the specified goals to a degree represented by the quantities $w_i \lambda$.

5.2 A MODIFIED MO RANKING SCHEME TO INCLUDE GOAL INFORMATION

The MO ranking procedure previously described was extended to accommodate goal information by altering the way in which individuals are compared with one another. In fact, degradation in vector components which meet their goals is now acceptable provided it results in the improvement of other components which do not satisfy their goals and it does not go beyond the goal boundaries. This makes it possible for one to prefer one individual to another even though they are both non-dominated. The algorithm will then identify and evolve the relevant region of the trade-off surface.
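As a small illustration of the goal attainment formulation of Section 5.1: for a fixed design x with all weights positive, the smallest λ satisfying $f_i - w_i \lambda \le g_i$ for every i is $\max_i (f_i - g_i)/w_i$. The Python sketch below evaluates that quantity over a finite list of candidate objective vectors and picks the one with the smallest λ. It is only a sketch under stated assumptions, not the nonlinear programming procedure itself: the function names `attainment_level` and `best_candidate`, the treatment of a zero weight as a hard constraint, and all numerical data are hypothetical and chosen for illustration.

```python
import math
from typing import List, Sequence, Tuple


def attainment_level(f: Sequence[float], g: Sequence[float],
                     w: Sequence[float]) -> float:
    """Smallest lambda such that f[i] - w[i] * lambda <= g[i] for every i."""
    if not (len(f) == len(g) == len(w)):
        raise ValueError("f, g and w must have the same length")
    level = -math.inf
    for fi, gi, wi in zip(f, g, w):
        if wi < 0:
            raise ValueError("weights must be non-negative")
        if wi == 0:
            # Assumption: a zero weight makes the goal a hard constraint.
            if fi > gi:
                return math.inf  # no lambda can satisfy this constraint
        else:
            level = max(level, (fi - gi) / wi)
    return level


def best_candidate(candidates: List[Sequence[float]], g: Sequence[float],
                   w: Sequence[float]) -> Tuple[int, float]:
    """Index and attainment level of the candidate with the smallest lambda."""
    levels = [attainment_level(f, g, w) for f in candidates]
    best = min(range(len(levels)), key=levels.__getitem__)
    return best, levels[best]


# Made-up objective vectors for a two-objective minimization problem.
goals = (0.6, 1.0)
weights = (1.0, 1.0)
candidates = [(0.5, 1.2), (0.55, 0.95), (0.7, 0.9)]

index, level = best_candidate(candidates, goals, weights)
print(index, level)
```

A negative level means that every goal is over-attained, while a positive level measures the worst weighted shortfall with respect to the goals.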
Still assuming a minimization problem, consider two q-dimensional objective vectors, $y_a = (y_{a,1}, \ldots, y_{a,q})$ and $y_b = (y_{b,1}, \ldots, y_{b,q})$, and the goal vector $g = (g_1, \ldots, g_q)$. Also consider that $y_a$ is such that it meets a number, $q - k$, of the specified goals. Without loss of generality, one can write

\[ \exists\, k = 1, \ldots, q-1 : \forall\, i = 1, \ldots, k,\ \forall\, j = k+1, \ldots, q,\quad (y_{a,i} > g_i) \wedge (y_{a,j} \le g_j) \tag{A} \]

which assumes a convenient permutation of the objectives. It may also be that $y_a$ meets none of the goals, i.e.,

\[ \forall\, i = 1, \ldots, q,\quad (y_{a,i} > g_i) \tag{B} \]

or even all of them, in which case one can write

\[ \forall\, j = 1, \ldots, q,\quad (y_{a,j} \le g_j) \tag{C} \]

In the first case (A), $y_a$ meets goals $k+1, \ldots, q$ and, therefore, will be preferable to $y_b$ simply if it dominates $y_b$ with respect to its first k components. For the case where all of the first k components of $y_a$ are equal to those of $y_b$, $y_a$ will still be preferable to $y_b$ if it dominates $y_b$ with respect to the remaining components, or if the remaining components of $y_b$ do not meet all their goals. Formally, $y_a$ will be preferable to $y_b$ if and only if

\[ \left( y_{a,(1,\ldots,k)} \mathrel{p\!<} y_{b,(1,\ldots,k)} \right) \vee \left\{ \left( y_{a,(1,\ldots,k)} = y_{b,(1,\ldots,k)} \right) \wedge \left[ \left( y_{a,(k+1,\ldots,q)} \mathrel{p\!<} y_{b,(k+1,\ldots,q)} \right) \vee \sim\!\left( y_{b,(k+1,\ldots,q)} \le g_{(k+1,\ldots,q)} \right) \right] \right\} \]

In the second case (B), $y_a$ satisfies none of the goals.
Then, $y_a$ is preferable to $y_b$ if and only if it dominates $y_b$, i.e.,

\[ y_a \mathrel{p\!<} y_b \]

Finally, in the third case (C), $y_a$ meets all of the goals, which means that it is a satisfactory, though not necessarily optimal, solution. In this case, $y_a$ is preferable to $y_b$ if and only if it dominates $y_b$ or $y_b$ is not satisfactory, i.e.,

\[ (y_a \mathrel{p\!<} y_b) \vee \sim\!(y_b \le g) \]

The use of the relation preferable to as just described, instead of the simpler relation partially less than, implies that the solution set is delimited by those non-dominated points which tangentially achieve one or more goals. Setting all the goals to ±∞ will make the algorithm try to evolve a discretized description of the whole Pareto set.

Such a description, inaccurate though it may be, can guide the DM in refining its requirements. When goals can be supplied interactively at each GA generation, the decision maker can reduce the size of the solution set gradually while learning about the trade-off between objectives. The variability of the goals acts as a changing environment to the GA, and does not impose any constraints on the search space. Note that appropriate sharing coefficients can still be calculated as before, since the size of the solution set changes in a way which is known to the DM.

This strategy of progressively articulating the DM preferences, while the algorithm runs, to guide the search, is not new in operations research. The main disadvantage of the method is that it demands a higher effort from the DM. On the other hand, it potentially reduces the number of function evaluations required when compared to a method for a posteriori articulation of preferences, as well as providing fewer alternative points at each iteration, which are certainly easier for the DM to discriminate between than the whole Pareto set at once.

Figure 5: A General Multiobjective Genetic Optimizer (blocks: DM and GA; signals: a priori knowledge, objective function values (acquired knowledge), fitnesses, results)

6 THE MOGA AS A METHOD FOR PROGRESSIVE ARTICULATION OF PREFERENCES

The MOGA can be
generalized one step further. The DM action can be described as the consecutive evaluation of some not necessarily well defined utility function. The utility function expresses the way in which the DM combines objectives in order to prefer one point to another and, ultimately, is the function which establishes the basis for the GA population to evolve. Linearly combining objectives to obtain a scalar fitness, on the one hand, and simply ranking individuals according to non-dominance, on the other, correspond to two different attitudes of the DM. In the first case, it is assumed that the DM knows exactly what to optimize, for example, financial cost. In the second case, the DM is making no decision at all apart from letting the optimizer use the broadest definition of MO optimality. Providing goal information, or using sharing techniques, simply means a more elaborate attitude of the DM, that is, a less straightforward utility function, which may even vary during the GA process, but still just another utility function.

A multiobjective genetic optimizer would, in general, consist of a standard genetic algorithm presenting the DM at each generation with a set of points to be assessed. The DM makes use of the concept of Pareto optimality and of any a priori information available to express its preferences, and communicates them to the GA, which in turn replies with the next generation. At the same time, the DM learns from the data it is presented with and eventually refines its requirements until a suitable solution has been found (Figure 5).

In the case of a human DM, such a set-up may require reasonable interaction times for it to become attractive. The natural solution would consist of speeding up the process by running the GA on a parallel architecture. The most appealing of all, however, would be the use of an automated DM, such as an expert system.

7 INITIAL RESULTS

The MOGA is currently being applied to the step response optimization of a Pegasus gas turbine engine. A full non-linear
model of the engine (Hancock, 1992), implemented in Simulink (MathWorks, 1992b), is used to simulate the system, given a number of initial conditions and the controller parameter settings. The GA is implemented in Matlab (MathWorks, 1992a; Fleming et al., 1993), which means that all the code actually runs in the same computation environment.

The logarithm of each controller parameter was Gray encoded as a 14-bit string, leading to 70-bit long chromosomes. A random initial population of size 80 and standard two-point reduced surrogate crossover and binary mutation were used. The initial goal values were set according to a number of performance requirements for the engine. Four objectives were used:

$t_r$: The time taken to reach 70% of the final output change. Goal: $t_r \le 0.59$ s.
$t_s$: The time taken to settle within ±10% of the final output change. Goal: $t_s \le 1.08$ s.
os: Overshoot, measured relative to the final output change. Goal: os ≤ 10%.
err: A measure of the output error 4 seconds after the step, relative to the final output change. Goal: err ≤ 10%.

During the GA run, the DM stores all non-dominated points evaluated up to the current generation. This constitutes acquired knowledge about the trade-offs available in the problem. From these, the relevant points are identified, the size of the trade-off surface estimated and $\sigma_{share}$ set. At any time in the optimization process, the goal values can be changed, in order to zoom in on the region of interest.

Figure 6: Trade-off Graph for the Pegasus Gas Turbine Engine after 40 Generations (Initial Goals). Horizontal axis: objective functions ($t_r$, $t_s$, ov, err); vertical axis: normalized objective values; goals shown: 0.59 s, 1.08 s, 10%, 10%.

A typical trade-off graph, obtained after 40 generations with the initial goals, is presented in Figure 6 and represents the accumulated set of satisfactory non-dominated points. At this stage, the setting of a much tighter goal for the output error (err ≤ 0.1%) reveals the graph in Figure 7, which contains a subset of the points in Figure 6. Continuing to run the GA, more
definition can be obtained in this area (Figure 8). Figure 9 presents an alternative view of these solutions, illustrating the arising step responses.

8 CONCLUDING REMARKS

Genetic algorithms, searching from a population of points, seem particularly suited to multiobjective optimization. Their ability to find global optima while being able to cope with discontinuous and noisy functions has motivated an increasing number of applications in engineering and related fields. The development of the MOGA is one expression of our wish to bring decision making into engineering design, in general, and control system design, in particular.

An important problem arising from the simple Pareto-based fitness assignment method is that of the global size of the solution set. Complex problems can be expected to exhibit a large and complex trade-off surface which, to be sampled accurately, would ultimately overload the DM with virtually useless information. Small regions of the trade-off surface, however, can still be sampled in a Pareto-based fashion, while the decision maker learns and refines its requirements. Niche formation methods are transferred to the objective value domain in order to take advantage of the properties of the Pareto set.

Figure 7: Trade-off Graph for the Pegasus Gas Turbine Engine after 40 Generations (New Goals)

Figure 8: Trade-off Graph for the Pegasus Gas Turbine Engine after 60 Generations (New Goals)

Figure 9: Satisfactory Step Responses after 60 Generations (New Goals)

Initial results, obtained from a real world engineering problem, show the ability of the MOGA to evolve uniformly sampled versions of trade-off surface regions. They also illustrate how the goals can be changed during the GA run.

Chromosome coding, and the genetic operators themselves, constitute areas for further study. Redundant codings would eventually allow the selection of the appropriate representation while evolving the trade-off surface, as suggested in (Chipperfield et al., 1992).
The direct use of real variables to represent an individual together with correlated mutations (Bäck et al., 1991) and some clever recombination operator(s) may also be interesting. In fact, correlated mutations should be able to identify how decision variables relate to each other within the Pareto set.

Acknowledgements

The first author gratefully acknowledges support by Programa CIENCIA, Junta Nacional de Investigação Científica e Tecnológica, Portugal.

References

Bäck, T., Hoffmeister, F., and Schwefel, H.-P. (1991). A survey of evolution strategies. In Belew, R., editor, Proc. Fourth Int. Conf. on Genetic Algorithms, pp. 2–9. Morgan Kaufmann.

Chipperfield, A. J., Fonseca, C. M., and Fleming, P. J. (1992). Development of genetic optimization tools for multi-objective optimization problems in CACSD. In IEE Colloq. on Genetic Algorithms for Control Systems Engineering, pp. 3/1–3/6. The Institution of Electrical Engineers. Digest No. 1992/106.

Deb, K. and Goldberg, D. E. (1989). An investigation of niche and species formation in genetic function optimization. In Schaffer, J. D., editor, Proc. Third Int. Conf. on Genetic Algorithms, pp. 42–50. Morgan Kaufmann.

Farshadnia, R. (1991). CACSD using Multi-Objective Optimization. PhD thesis, University of Wales, Bangor, UK.

Fleming, P. J. (1985). Computer aided design of regulators using multiobjective optimization. In Proc. 5th IFAC Workshop on Control Applications of Nonlinear Programming and Optimization, pp. 47–52, Capri. Pergamon Press.

Fleming, P. J., Crummey, T. P., and Chipperfield, A. J. (1992). Computer assisted control system design and multiobjective optimization. In Proc. ISA Conf. on Industrial Automation, pp. 7.23–7.26, Montreal, Canada.

Fleming, P. J., Fonseca, C. M., and Crummey, T. P. (1993). Matlab: Its toolboxes and open structure. In Linkens, D. A., editor, CAD for Control Systems, chapter 11, pp. 271–286. Marcel-Dekker.

Fourman, M. P. (1985). Compaction of symbolic layout using genetic algorithms. In Grefenstette, J. J., editor, Proc. First Int. Conf. on Genetic Algorithms, pp. 141–[...]. Lawrence Erlbaum.

Gembicki, F. W. (1974). Vector Optimization for Control with Performance and Parameter Sensitivity Indices. PhD thesis, Case Western Reserve University, Cleveland, Ohio, USA.

Goldberg, D. E. (1989). Genetic Algorithms in Search, Optimization and Machine Learning. Addison-Wesley, Reading, Massachusetts.

Goldberg, D. E. and Richardson, J. (1987). Genetic algorithms with sharing for multimodal function optimization. In Grefenstette, J. J., editor, Proc. Second Int. Conf. on Genetic Algorithms, pp. 41–[...]. Lawrence Erlbaum.

Hancock, S. D. (1992). Gas Turbine Engine Controller Design Using Multi-Objective Optimization Techniques. PhD thesis, University of Wales, Bangor, UK.

MathWorks (1992a). Matlab Reference Guide. The MathWorks, Inc.

MathWorks (1992b). Simulink User's Guide. The MathWorks, Inc.

Richardson, J. T., Palmer, M. R., Liepins, G., and Hilliard, M. (1989). Some guidelines for genetic algorithms with penalty functions. In Schaffer, J. D., editor, Proc. Third Int. Conf. on Genetic Algorithms, pp. 191–197. Morgan Kaufmann.

Schaffer, J. D. (1985). Multiple objective optimization with vector evaluated genetic algorithms. In Grefenstette, J. J., editor, Proc. First Int. Conf. on Genetic Algorithms, pp. 93–[...]. Lawrence Erlbaum.

Wienke, D., Lucasius, C., and Kateman, G. (1992). Multicriteria target vector optimization of analytical procedures using a genetic algorithm. Part I. Theory, numerical simulations and application to atomic emission spectroscopy. Analytica Chimica Acta, 265(2):211–225.

The genetic algorithm attempts to find a very good (or best) solution to the problem by genetically breeding the population of individuals over a series of generations.

Before applying the genetic algorithm to the problem, the user designs an artificial chromosome of a certain fixed size and then defines a mapping (encoding) between the points in the search space of the problem and instances of the artificial chromosome. For example, in applying the genetic algorithm to a multidimensional optimization problem (where the goal is to find the global optimum of an unknown multidimensional function), the artificial chromosome may be a linear character string (modeled directly after the linear string of information found in DNA). A specific location (a gene) along this artificial chromosome is associated with each of the variables of the problem. Character(s) appearing at a particular location along the chromosome denote the value of a particular variable (i.e., the gene value or allele). Each individual in the population has a fitness value (which, for a multidimensional optimization problem, is the value of the unknown function). The genetic algorithm then manipulates a population of such artificial chromosomes (usually starting from a randomly-created initial population of strings) using the operations of reproduction, crossover, and mutation. Individuals are probabilistically selected to participate in these genetic operations based on their fitness.
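The representation described above can be sketched in a few lines of Python. The sketch assumes a binary alphabet, a fixed number of bits per gene, and a linear mapping of each gene onto a real interval; it also uses fitness-proportionate (roulette-wheel) sampling as one possible form of fitness-based probabilistic selection. The function names `decode` and `select`, the interval bounds, and the example strings are all hypothetical and serve only as illustration.

```python
import random
from typing import List, Sequence


def decode(chromosome: str, bits_per_gene: int, lo: float, hi: float) -> List[float]:
    """Map a fixed-length binary string to one real value per gene."""
    if bits_per_gene <= 0 or len(chromosome) % bits_per_gene != 0:
        raise ValueError("chromosome length must be a multiple of bits_per_gene")
    max_int = 2 ** bits_per_gene - 1
    values = []
    for start in range(0, len(chromosome), bits_per_gene):
        allele = chromosome[start:start + bits_per_gene]  # gene value
        values.append(lo + (hi - lo) * int(allele, 2) / max_int)
    return values


def select(population: Sequence[str], fitnesses: Sequence[float],
           rng: random.Random) -> str:
    """Pick one individual with probability proportional to its fitness.

    Assumes non-negative fitness values that are not all zero.
    """
    return rng.choices(population, weights=fitnesses, k=1)[0]


# Two 4-bit genes mapped onto the interval [0, 1].
print(decode("00001111", bits_per_gene=4, lo=0.0, hi=1.0))

# With all the fitness on the second string, it is always selected.
rng = random.Random(0)
print(select(["0000", "1111"], [0.0, 1.0], rng))
```

Other selection schemes (for example, rank-based or tournament selection) fit the same description; the proportional scheme is used here only because it is short to write down.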
The goal of the genetic algorithm in a multidimensional optimization problem is to find an artificial chromosome which, when decoded and mapped back into the search space of the problem, corresponds to a globally optimum (or near-optimum) point in the original search space of the problem.

In preparing to use the conventional genetic algorithm operating on fixed-length character strings to solve a problem, the user must

(1) determine the representation scheme,
(2) determine the fitness measure,
(3) determine the parameters and variables for controlling the algorithm, and
(4) determine a way of designating the result and a criterion for terminating a run.

In the conventional genetic algorithm, the individuals in the population are usually fixed-length character strings patterned after chromosome strings. Thus, specification of the representation scheme in the conventional genetic algorithm starts with a selection of the string length L and the alphabet size K. Often the alphabet is binary, so K equals 2. The most important part of the representation scheme is the mapping that expresses each possible point in the search space of the problem as a fixed-length character string (i.e., as a chromosome) and each chromosome as a point in the search space of the problem. Selecting a representation scheme that facilitates solution of the problem by the genetic algorithm often requires considerable insight into the problem and good judgment.

The evolutionary process is driven by the fitness measure. The fitness measure assigns a fitness value to each possible fixed-length character string in the population.

The primary parameters for controlling the genetic algorithm are the population size, M, and the maximum number of generations to be run, G. Populations can consist of hundreds, thousands, tens of thousands or more individuals.
There can be dozens, hundreds, thousands, or more generations in a run of the genetic algorithm.

Each run of the genetic algorithm requires specification of a termination criterion for deciding when to terminate a run and a method of result designation. One frequently used method of result designation for a run of the genetic algorithm is to designate the best individual obtained in any generation of the population during the run (i.e., the best-so-far individual) as the result of the run.

Once the four preparatory steps for setting up the genetic algorithm have been completed, the genetic algorithm can be run.

The evolutionary process described above indicates how a globally optimum combination of alleles (gene values) within a fixed-size chromosome can be evolved.

The three steps in executing the genetic algorithm operating on fixed-length character strings are as follows:

(1) Randomly create an initial population of individual fixed-length character strings.
(2) Iteratively perform the following substeps on the population of strings until the termination criterion has been satisfied:
    (A) Assign a fitness value to each individual in the population using the fitness measure.
    (B) Create a new population of strings by applying the following three genetic operations. The genetic operations are applied to individual string(s) in the population chosen with a probability based on fitness.
        (i) Reproduce an existing individual string by copying it into the new population.
        (ii) Create two new strings from two existing strings by genetically recombining substrings using the crossover operation (described below) at a randomly chosen crossover point.
        (iii) Create a new string from an existing string by randomly mutating the character at one randomly chosen position in the string.
(3) The string that is identified by the method of result designation (e.g., the best-so-far individual) is designated as the result of the genetic algorithm for the run.
This result may represent a solution (or an approximate solution) to the problem.

The genetic operation of reproduction is based on the Darwinian principle of reproduction and survival of the fittest. In the reproduction operation, an individual is probabilistically selected from the population based on its fitness (with reselection allowed) and then the individual is copied, without change, into the next generation of the population. The selection is done in such a way that the better an individual's fitness, the more likely it is to be selected. An important aspect of this probabilistic selection is that every individual, however poor its fitness, has some probability of selection.

The genetic operation of crossover (sexual recombination) allows new individuals (i.e., new points in the search space) to be created and tested. The operation of crossover starts with two parents independently selected probabilistically from the population based on their fitness (with reselection allowed). As before, the selection is done in such a way that the better an individual's fitness, the more likely it is to be selected. The crossover operation produces two offspring. Each offspring contains some genetic material from each of its parents.

Suppose that the crossover operation is to be applied to the two parental strings 10110 and 01101 of length L = 5 over an alphabet of size K = 2. The crossover operation begins by randomly selecting a number between 1 and L–1 using a uniform probability distribution. Suppose that the third interstitial location is selected. This location becomes the crossover point. Each parent is then split at this crossover point into a crossover fragment and a remainder. The crossover operation then recombines remainder 1 (i.e., – – – 1 0) with crossover fragment 2 (i.e., 011 – –) to create offspring 2 (i.e., 01110).
The crossover operation similarly recombines remainder 2 (i.e., – – – 01) with crossover fragment 1 (i.e., 101 – –) to create offspring 1 (i.e., 10101).

The operation of mutation allows new individuals to be created. It begins by selecting an individual from the population based on its fitness (with reselection allowed). A point along the string is selected at random and the character at that point is randomly changed. The altered individual is then copied into the next generation of the population. Mutation is used very sparingly in genetic algorithm work.

The genetic algorithm works in a domain-independent way on the fixed-length character strings in the population. The genetic algorithm searches the space of possible character strings in an attempt to find high-fitness strings. The fitness landscape may be very rugged and nonlinear. To guide this search, the genetic algorithm uses only the numerical fitness values associated with the explicitly tested strings in the population. Regardless of the particular problem domain, the genetic algorithm carries out its search by performing the same disarmingly simple operations of copying, recombining, and occasionally randomly mutating the strings.

In practice, the genetic algorithm is surprisingly rapid in effectively searching complex, highly nonlinear, multidimensional search spaces. This is all the more surprising because the genetic algorithm does not know anything about the problem domain or the internal workings of the fitness measure being used.

1.1 Sources of Additional Information

David Goldberg's Genetic Algorithms in Search, Optimization, and Machine Learning (1989) is the leading textbook and best single source of additional information about the field of genetic algorithms.

Additional information on genetic algorithms can be found in Davis (1987, 1991), Michalewicz (1992), and Buckles and Petry (1992).
The proceedings of the International Conference on Genetic Algorithms provide an overview of research activity in the genetic algorithms field. See Eshelman (1995), Forrest (1993), Belew and Booker (1991), Schaffer (1989), and Grefenstette (1985, 1987). Also see the proceedings of the IEEE International Conference on Evolutionary Computation (IEEE 1994, 1995).

The proceedings of the Foundations of Genetic Algorithms workshops cover theoretical aspects of the field. See Whitley and Vose (1995), Whitley (1992), and Rawlins (1991).

Fogel and Atmar (1992, 1993), Sebald and Fogel (1994), and Sebald and Fogel (1995) emphasize recent work on evolutionary programming (EP).

The proceedings of the Parallel Problem Solving from Nature conferences emphasize work on evolution strategies (ES). See Schwefel and Maenner (1991), Maenner and Manderick (1992), and Davidor, Schwefel, and Maenner (1994).

Stender (1993) describes parallelization of genetic algorithms. Also see Koza and Andre (1995). Davidor (1992) describes application of genetic algorithms to robotics. Schaffer and Whitley (1992) and Albrecht, Reeves, and Steele (1993) describe work on combinations of genetic algorithms and neural networks. Forrest (1991) describes application of genetic classifier systems to semantic nets.

Additional information about genetic algorithms may be obtained from the GA-LIST electronic mailing list to which you may subscribe, at no charge, by sending a subscription request to GA-List-Request@. Issues of the GA-LIST provide instructions for accessing the genetic algorithms archive, which contains software that may be obtained over the Internet. The archive may be accessed over the World Wide Web at /galist/ or through anonymous ftp at ( /pub/galist.

2. GENETIC PROGRAMMING

Genetic programming is an attempt to deal with one of the central questions in computer science (posed by Arthur Samuel in 1959), namely

How can computers learn to solve problems without being explicitly programmed?
In other words, how can computers be made to do what needs to be done, without being told exactly how to do it?

All computer programs – whether they are written in FORTRAN, PASCAL, C, assembly code, or any other programming language – can be viewed as a sequence of applications of functions (operations) to arguments (values). Compilers use this fact by first internally translating a given program into a parse tree and then converting the parse tree into the more elementary assembly code instructions that actually run on the computer. However, this important commonality underlying all computer programs is usually obscured by the large variety of different types of statements, operations, instructions, syntactic constructions, and grammatical restrictions found in most popular programming languages.

Any computer program can be graphically depicted as a rooted point-labeled tree with ordered branches.

Genetic programming is an extension of the conventional genetic algorithm in which each individual in the population is a computer program.

The search space in genetic programming is the space of all possible computer programs composed of functions and terminals appropriate to the problem domain. The functions may be standard arithmetic operations, standard programming operations, standard mathematical functions, logical functions, or domain-specific functions.

The book Genetic Programming: On the Programming of Computers by Means of Natural Selection (Koza 1992) demonstrated a result that many found surprising and counterintuitive, namely that an automatic, domain-independent method can genetically breed computer programs capable of solving, or approximately solving, a wide variety of problems from a wide variety of fields.

In applying genetic programming to a problem, there are five major preparatory steps.
These five steps involve determining
(1) the set of terminals,
(2) the set of primitive functions,
(3) the fitness measure,
(4) the parameters for controlling the run, and
(5) the method for designating a result and the criterion for terminating a run.

The first major step in preparing to use genetic programming is to identify the set of terminals. The terminals can be viewed as the inputs to the as-yet-undiscovered computer program. The set of terminals (along with the set of functions) are the ingredients from which genetic programming attempts to construct a computer program to solve, or approximately solve, the problem.

The second major step in preparing to use genetic programming is to identify the set of functions that are to be used to generate the mathematical expression that attempts to fit the given finite sample of data.

Each computer program (i.e., mathematical expression, LISP S-expression, parse tree) is a composition of functions from the function set F and terminals from the terminal set T.

Each of the functions in the function set should be able to accept, as its arguments, any value and data type that may possibly be returned by any function in the function set and any value and data type that may possibly be assumed by any terminal in the terminal set. That is, the function set and terminal set selected should have the closure property.

These first two major steps correspond to the step of specifying the representation scheme for the conventional genetic algorithm. The remaining three major steps for genetic programming correspond to the last three major preparatory steps for the conventional genetic algorithm.

In genetic programming, populations of hundreds, thousands, or millions of computer programs are genetically bred.
This breeding is done using the Darwinian principle of survival and reproduction of the fittest along with a genetic crossover operation appropriate for mating computer programs.

A computer program that solves (or approximately solves) a given problem often emerges from this combination of Darwinian natural selection and genetic operations.

Genetic programming starts with an initial population (generation 0) of randomly generated computer programs composed of functions and terminals appropriate to the problem domain. The creation of this initial random population is, in effect, a blind random search of the search space of the problem represented as computer programs.

Each individual computer program in the population is measured in terms of how well it performs in the particular problem environment. This measure is called the fitness measure. The nature of the fitness measure varies with the problem.

For many problems, fitness is naturally measured by the error produced by the computer program. The closer this error is to zero, the better the computer program. In a problem of optimal control, the fitness of a computer program may be the amount of time (or fuel, or money, etc.) it takes to bring the system to a desired target state. The smaller the amount of time (or fuel, or money, etc.), the better. If one is trying to recognize patterns or classify examples, the fitness of a particular program may be measured by some combination of the number of instances handled correctly (i.e., true positives and true negatives) and the number of instances handled incorrectly (i.e., false positives and false negatives). Correlation is often used as a fitness measure. On the other hand, if one is trying to find a good randomizer, the fitness of a given computer program might be measured by means of entropy, satisfaction of the gap test, satisfaction of the run test, or some combination of these factors.
For electronic circuit design problems, the fitness measure may involve a convolution. For some problems, it may be appropriate to use a multiobjective fitness measure incorporating a combination of factors such as correctness, parsimony (smallness of the evolved program), or efficiency (of execution).

Typically, each computer program in the population is run over a number of different fitness cases so that its fitness is measured as a sum or an average over a variety of representative different situations. These fitness cases sometimes represent a sampling of different values of an independent variable or a sampling of different initial conditions of a system. For example, the fitness of an individual computer program in the population may be measured in terms of the sum of the absolute value of the differences between the output produced by the program and the correct answer to the problem (i.e., the Minkowski distance) or the square root of the sum of the squares (i.e., Euclidean distance). These sums are taken over a sampling of different inputs (fitness cases) to the program. The fitness cases may be chosen at random or may be chosen in some structured way (e.g., at regular intervals or over a regular grid). It is also common for fitness cases to represent initial conditions of a system (as in a control problem). In economic forecasting problems, the fitness cases may be the daily closing price of some financial instrument.

The computer programs in generation 0 of a run of genetic programming will almost always have exceedingly poor fitness. Nonetheless, some individuals in the population will turn out to be somewhat more fit than others.
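The error-based fitness measures described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: candidate programs are stood in for by ordinary Python callables, the fitness cases are (input, correct answer) pairs sampled at regular intervals, and the target function x² + x, along with the names `sum_abs_error` and `euclidean_error`, is hypothetical.

```python
import math


def sum_abs_error(program, fitness_cases):
    """Sum of absolute differences between program output and correct answer."""
    return sum(abs(program(x) - target) for x, target in fitness_cases)


def euclidean_error(program, fitness_cases):
    """Square root of the sum of squared differences over the fitness cases."""
    return math.sqrt(sum((program(x) - target) ** 2 for x, target in fitness_cases))


# Hypothetical fitness cases: the target x*x + x sampled at x = -2, -1, 0, 1, 2.
fitness_cases = [(x, x * x + x) for x in range(-2, 3)]


def candidate_a(x):
    return x * x          # misses the linear term


def candidate_b(x):
    return x * x + x      # reproduces the target exactly


for name, program in [("a", candidate_a), ("b", candidate_b)]:
    print(name, sum_abs_error(program, fitness_cases),
          euclidean_error(program, fitness_cases))
```

Under both measures an error closer to zero is better, so `candidate_b` would be favored by selection over `candidate_a`.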
These differences in performance are then exploited.

The Darwinian principle of reproduction and survival of the fittest and the genetic operation of crossover are used to create a new offspring population of individual computer programs from the current population of programs.

The reproduction operation involves selecting a computer program from the current population of programs based on fitness (i.e., the better the fitness, the more likely the individual is to be selected) and allowing it to survive by copying it into the new population.

The crossover operation is used to create new offspring computer programs from two parental programs selected based on fitness. The parental programs in genetic programming are typically of different sizes and shapes. The offspring programs are composed of subexpressions (subtrees, subprograms, subroutines, building blocks) from their parents. These offspring programs are typically of different sizes and shapes than their parents.

The mutation operation may also be used in genetic programming.

After the genetic operations are performed on the current population, the population of offspring (i.e., the new generation) replaces the old population (i.e., the old generation). Each individual in the new population of programs is then measured for fitness, and the process is repeated over many generations.

At each stage of this highly parallel, locally controlled, decentralized process, the state of the process will consist only of the current population of individuals.

The force driving this process consists only of the observed fitness of the individuals in the current population in grappling with the problem environment.

As will be seen, this algorithm will produce populations of programs which, over many generations, tend to exhibit increasing average fitness in dealing with their environment.
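The generational cycle described above (fitness-based reproduction, crossover, and replacement of the old population by the offspring) can be sketched generically in Python. Several things here are assumptions made for illustration only: fixed-length bit strings stand in for programs, fitness is a non-negative score where larger is better (the count of 1s), and the names `next_generation`, `one_point`, and `run`, as well as the 0.9 crossover probability, are hypothetical.

```python
import random


def one_point(a, b, rng):
    """Stand-in crossover for equal-length strings; returns two offspring."""
    point = rng.randint(1, len(a) - 1)
    return a[:point] + b[point:], b[:point] + a[point:]


def next_generation(population, fitness, crossover, rng, p_crossover=0.9):
    """Create an offspring population of the same size as the current one."""
    # Fitness-proportionate selection with reselection allowed. The small
    # offset gives every individual, however poor, some chance of selection.
    weights = [fitness(ind) + 1e-9 for ind in population]
    offspring = []
    while len(offspring) < len(population):
        if rng.random() < p_crossover:
            mother, father = rng.choices(population, weights=weights, k=2)
            offspring.extend(crossover(mother, father, rng))
        else:
            # Reproduction: copy a selected individual unchanged.
            offspring.append(rng.choices(population, weights=weights, k=1)[0])
    return offspring[:len(population)]


def run(generations=20, size=30, length=12, seed=0):
    rng = random.Random(seed)

    def fitness(s):
        return s.count("1")

    population = ["".join(rng.choice("01") for _ in range(length))
                  for _ in range(size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        population = next_generation(population, fitness, one_point, rng)
        best = max(population + [best], key=fitness)  # best individual so far
    return population, best


final_population, best_so_far = run()
print(best_so_far)
```

The state carried from one generation to the next is only the population itself; `best_so_far` is tracked separately so that a good individual lost through selection can still be reported at the end of the run.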
In addition, these populations of computer programs can rapidly and effectively adapt to changes in the environment.

The best individual appearing in any generation of a run (i.e., the best-so-far individual) is typically designated as the result produced by the run of genetic programming.

The hierarchical character of the computer programs that are produced is an important feature of genetic programming. The results of genetic programming are inherently hierarchical. In many cases the results produced by genetic programming are default hierarchies, prioritized hierarchies of tasks, or hierarchies in which one behavior subsumes or suppresses another.

The dynamic variability of the computer programs that are developed along the way to a solution is also an important feature of genetic programming. It is often difficult and unnatural to try to specify or restrict the size and shape of the eventual solution in advance. Moreover, advance specification or restriction of the size and shape of the solution to a problem narrows the window by which the system views the world and might well preclude finding the solution to the problem at all.

Another important feature of genetic programming is the absence or relatively minor role of preprocessing of inputs and postprocessing of outputs. The inputs, intermediate results, and outputs are typically expressed directly in terms of the natural terminology of the problem domain. The programs produced by genetic programming consist of functions that are natural for the problem domain. The postprocessing of the output of a program, if any, is done by a wrapper (output interface).

Finally, another important feature of genetic programming is that the structures undergoing adaptation in genetic programming are active. They are not passive encodings (i.e., chromosomes) of the solution to the problem.
Instead, given a computer on which to run, the structures in genetic programming are active structures that are capable of being executed in their current form.

The genetic crossover (sexual recombination) operation operates on two parental computer programs selected with a probability based on fitness and produces two new offspring programs consisting of parts of each parent.

For example, consider the following computer program (presented here as a LISP S-expression):

(+ (* 0.234 Z) (- X 0.789)),

which we would ordinarily write as

0.234 Z + X – 0.789.

This program takes two inputs (X and Z) and produces a floating point output.

Also, consider a second program:

(* (* Z Y) (+ Y (* 0.314 Z))).

Suppose that the crossover points are the * in the first parent and the + in the second parent. These two crossover fragments correspond to the underlined sub-programs (sub-lists) in the two parental computer programs.

The two offspring resulting from crossover are as follows:

(+ (+ Y (* 0.314 Z)) (- X 0.789))

(* (* Z Y) (* 0.234 Z)).

Thus, crossover creates new computer programs using parts of existing parental programs. Because entire sub-trees are swapped, the crossover operation always produces syntactically and semantically valid programs as offspring regardless of the choice of the two crossover points.
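The subtree swap in this example can be reproduced with a minimal Python sketch. It assumes, purely for illustration, that S-expressions are held as nested Python lists and that a crossover point is named by a path of list indices; the helper names `get_subtree`, `replace_subtree`, `subtree_crossover`, and `to_sexpr` are hypothetical. In a real run the two crossover points would be chosen at random rather than fixed.

```python
import copy


def get_subtree(tree, path):
    """Follow a path of list indices down from the root of the tree."""
    for index in path:
        tree = tree[index]
    return tree


def replace_subtree(tree, path, new_subtree):
    """Return a copy of `tree` with the subtree at `path` replaced."""
    if not path:
        return copy.deepcopy(new_subtree)
    result = copy.deepcopy(tree)
    node = result
    for index in path[:-1]:
        node = node[index]
    node[path[-1]] = copy.deepcopy(new_subtree)
    return result


def subtree_crossover(parent1, path1, parent2, path2):
    """Swap the subtrees rooted at the two crossover points."""
    fragment1 = get_subtree(parent1, path1)
    fragment2 = get_subtree(parent2, path2)
    return (replace_subtree(parent1, path1, fragment2),
            replace_subtree(parent2, path2, fragment1))


def to_sexpr(tree):
    """Render a nested list as a LISP-style S-expression string."""
    if isinstance(tree, list):
        return "(" + " ".join(to_sexpr(t) for t in tree) + ")"
    return str(tree)


parent1 = ["+", ["*", 0.234, "Z"], ["-", "X", 0.789]]
parent2 = ["*", ["*", "Z", "Y"], ["+", "Y", ["*", 0.314, "Z"]]]

# Path (1,) is the subtree rooted at * in the first parent;
# path (2,) is the subtree rooted at + in the second parent.
child1, child2 = subtree_crossover(parent1, (1,), parent2, (2,))
print(to_sexpr(child1))
print(to_sexpr(child2))
```

Because whole subtrees are exchanged and the parents are copied rather than modified, each offspring is again a well-formed nested list, which mirrors the validity property stated above.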
Because programs are selected to participate in the crossover operation with a probability based on fitness, crossover allocates future trials to regions of the search space whose programs contain parts from promising programs.

The videotape Genetic Programming: The Movie (Koza and Rice 1992) provides a visualization of the genetic programming process and of solutions to various problems.

2.2 Automatically Defined Functions

I believe that no approach to automated programming is likely to be successful on non-trivial problems unless it provides some hierarchical mechanism to exploit, by reuse and parameterization, the regularities, symmetries, homogeneities, similarities, patterns, and modularities inherent in problem environments. Subroutines do this in ordinary computer programs.

Accordingly, Genetic Programming II: Automatic Discovery of Reusable Programs (Koza 1994) describes how to evolve multi-part programs consisting of a main program and one or more reusable, parameterized, hierarchically-called subprograms (called automatically defined functions or ADFs). A visualization of the solution to numerous example problems using automatically defined functions can be found in the videotape Genetic Programming II Videotape: The Next Generation (Koza 1994).

Automatically defined functions can be implemented within the context of genetic programming by establishing a constrained syntactic structure for the individual programs in the population. Each multi-part program in the population contains one (or more) function-defining branches and one (or more) main result-producing branches. The result-producing branch usually has the ability to call one or more of the automatically defined functions.
A function-defining branch may have the ability to refer hierarchically to other already-defined automatically defined functions.

Genetic programming evolves a population of programs, each consisting of an automatically defined function in the function-defining branch and a result-producing branch. The structures of both the function-defining branches and the result-producing branch are determined by the combined effect, over many generations, of the selective pressure exerted by the fitness measure and by the effects of the operations of Darwinian fitness-based reproduction and crossover. The function defined by the function-defining branch is available for use by the result-producing branch. Whether or not the defined function will be actually called is not predetermined, but instead, determined by the evolutionary process.

Since each individual program in the population of this example consists of function-defining branch(es) and result-producing branch(es), the initial random generation must be created so that every individual program in the population has this particular constrained syntactic structure. Since a constrained syntactic structure is involved, crossover must be performed so as to preserve this syntactic structure in all offspring.

Genetic programming with automatically defined functions has been shown to be capable of solving numerous problems (Koza 1994a). More importantly, the evidence so far indicates that, for many problems, genetic programming requires less computational effort (i.e., fewer fitness evaluations to yield a solution with, say, a 99% probability) with automatically defined functions than without them (provided the difficulty of the problem is above a certain relatively low break-even point).

Also, genetic programming usually yields solutions with smaller average overall size with automatically defined functions than without them (provided, again, that the problem is not too simple).
That is, both learning efficiency and parsimony appear to be properties of genetic programming with automatically defined functions.

Moreover, there is evidence that genetic programming with automatically defined functions is scalable. For several problems for which a progression of scaled-up versions was studied, the computational effort increases as a function of problem size at a slower rate with automatically defined functions than without them. Also, the average size of solutions similarly increases as a function of problem size at a slower rate with automatically defined functions than without them. This observed scalability results from the profitable reuse of hierarchically-callable, parameterized subprograms within the overall program.

When single-part programs are involved, genetic programming automatically determines the size and shape of the solution (i.e., the size and shape of the program tree) as well as the sequence of work-performing primitive functions that can solve the problem. However, when multi-part programs and automatically defined functions are being used, the question arises as to how to determine the architecture of the programs that are being evolved. The architecture of a multi-part program consists of the number of function-defining branches (automatically defined functions) and the number of arguments (if any) possessed by each function-defining branch.

2.3 Evolutionary Selection of Architecture

One technique for creating the architecture of the overall program for solving a problem during the course of a run of genetic programming is to evolutionarily select the architecture dynamically during a run of genetic programming. This technique is described in chapters 21 – 25 of Genetic Programming II: Automatic Discovery of Reusable Programs (Koza 1994a). The technique of evolutionary selection starts with an architecturally diverse initial random population.
As the evolutionary process proceeds, individuals with certain architectures may prove to be more fit than others at solving the problem. The more fit architectures will tend to prosper, while the less fit architectures will tend to wither away.

The architecturally diverse populations used with the technique of evolutionary selection require a modification of both the method of creating the initial random population and the two-offspring subtree-swapping crossover operation previously used in genetic programming. Specifically, the architecturally diverse population is created at generation 0 so as to contain randomly-created representatives of a broad range of different architectures. Structure-preserving crossover with point typing is a one-offspring crossover operation that permits robust recombination while guaranteeing that any pair of architecturally different parents will produce syntactically and semantically valid offspring.

2.4 Architecture-Altering Operations

A second technique for creating the architecture of the overall program for solving a problem during the course of a run of genetic programming is to evolve the architecture using architecture-altering operations (Koza 1995).

2.8 Sources of Additional Information

In addition to the author's books (Koza 1992, 1994) and accompanying videotapes (Koza and Rice 1992, Koza 1994), the first Advances in Genetic Programming book (Kinnear 1994) and the upcoming second book in this series (Angeline and Kinnear 1996) contain about two dozen articles each on various applications and aspects of genetic programming.

In addition to the conferences mentioned in the earlier section on genetic algorithms, the conferences on artificial life (Brooks and Maes 1994) and simulation of adaptive behavior (Cliff et al.
1994) have articles on genetic programming.

Additional information about genetic programming may be obtained from the GP-LIST electronic mailing list to which you may subscribe, at no charge, by sending a subscription request to genetic-programming-request@.

Information about obtaining software in C, C++, LISP, and other programming languages for genetic programming, information about upcoming conferences, and links to various