Genetic Algorithms for Multiobjective Optimization: Formulation ...


Multiobjective Genetic Algorithms
A more appropriate approach to deal with multiple objectives is to use techniques that were originally designed for that purpose in the field of Operations Research. Work in that area started a century ago, and many approaches have been refined and commonly applied in economics and control theory.
Abstract: In this paper we propose the use of the genetic algorithm (GA) as a tool to solve multiobjective optimization problems in structures. Using the concept of min-max optimum, a new GA-based multiobjective optimization technique is proposed and two truss design problems are solved using it. The results produced by this new approach are compared to those produced by other mathematical programming techniques and GA-based approaches, proving that this technique generates better trade-offs and that the genetic algorithm can be used as a reliable numerical optimization tool.

Advantages and Disadvantages of Direct and Indirect Methods for Solving Multiobjective Optimization Problems

A multiobjective optimization problem is one in which several conflicting objective functions coexist in the same problem; the goal is to find a set of solutions that represents the best attainable trade-offs among the objectives.

Two families of methods can be used to solve such problems: direct methods and indirect methods.

This section introduces both in detail and analyzes their respective advantages and disadvantages.

Direct methods: The direct method, also known as the weighting or aggregation method, converts the multiobjective problem into a single-objective one: each objective function is assigned a weight, and a single composite objective is optimized.

Its basic idea is to linearly combine the objective functions into one composite objective and then solve the resulting single-objective optimization problem.
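As a concrete illustration, here is a minimal Python sketch of this weighted-sum scalarization; the two objective functions and the weight values are placeholder assumptions of this edit, not taken from the original text.

```python
import numpy as np
from scipy.optimize import minimize

# Two hypothetical conflicting objectives of a decision vector x.
def f1(x):
    return x[0] ** 2 + x[1] ** 2                  # e.g. a cost objective

def f2(x):
    return (x[0] - 2) ** 2 + (x[1] - 2) ** 2      # e.g. a time objective

def composite(x, w1=0.7, w2=0.3):
    # Linear combination of the objectives; the weights encode the
    # decision maker's preferences and are normalized to sum to 1.
    return w1 * f1(x) + w2 * f2(x)

result = minimize(composite, x0=np.zeros(2))
print(result.x)  # one trade-off solution; other weights give other solutions
```

Re-running with different weight pairs traces out different trade-off solutions, which is exactly the weight sensitivity criticized in the list of disadvantages below.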

Advantages: 1. Simple and intuitive: by reducing the multiobjective problem to a single-objective one, the direct method is more straightforward and easier to understand than indirect methods.

2. Simplified mathematical model: the linear combination merges several objective functions into one composite objective, simplifying the model and reducing computational difficulty.

3. Reflects subjective preferences: the decision maker sets the weight of each objective, so the trade-off between objectives can be steered to match human intent.

Disadvantages: 1. Strong subjectivity: the weights depend on expert experience or the decision maker's judgment, so the result can be biased by subjective factors.

2. Sensitivity to the weights: the method is very sensitive to the weight settings; different weight choices can yield different solutions, and the choice strongly influences the final result.

3. Possible non-optimal results: because a single composite objective is optimized, some Pareto-optimal solutions may be missed, and the full set of optimal trade-offs cannot be recovered.

Indirect methods: A representative indirect method is the Non-dominated Sorting Genetic Algorithm (NSGA), which applies non-dominated sorting within a genetic algorithm to solve multiobjective optimization problems directly in terms of Pareto dominance.

The population is ranked by non-domination; selection, crossover, and mutation then generate new populations, and the process iterates until a set of non-dominated solutions is obtained.

Advantages: 1. Efficiency: by combining a genetic algorithm with non-dominated sorting, the indirect method converges quickly to a set of non-dominated solutions and handles multiobjective problems effectively.

2. Diversity: selection, crossover, and mutation maintain population diversity, so the method returns not just one optimum but a variety of good trade-off solutions for the decision maker to choose from. A minimal sketch of the non-dominated sorting step follows.
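The core of NSGA is partitioning a population into fronts of mutually non-dominated solutions. The Python sketch below shows that sorting step for a minimization problem; the population data is hypothetical, and this is a simple quadratic-time version, not the optimized algorithm from the NSGA papers.

```python
def dominates(a, b):
    # a dominates b (minimization): no worse in every objective, better in one.
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def non_dominated_fronts(objectives):
    # Repeatedly peel off the solutions not dominated by any remaining one.
    remaining = list(range(len(objectives)))
    fronts = []
    while remaining:
        front = [i for i in remaining
                 if not any(dominates(objectives[j], objectives[i])
                            for j in remaining if j != i)]
        fronts.append(front)
        remaining = [i for i in remaining if i not in front]
    return fronts

# Hypothetical two-objective population: (cost, time) pairs to minimize.
pop = [(1.0, 5.0), (2.0, 3.0), (4.0, 1.0), (3.0, 4.0), (5.0, 5.0)]
print(non_dominated_fronts(pop))  # the first list is the current Pareto front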

A Genetic Algorithm Approach for the Time-Cost Trade-off in PERT Networks

A genetic algorithm approach for the time-cost trade-off in PERT networks
Amir Azaron, Cahit Perkgoz, Masatoshi Sakawa
Department of Artificial Complex Systems Engineering, Graduate School of Engineering, Hiroshima University, Kagamiyama 1-4-1, Higashi-Hiroshima, Hiroshima 739-8527, Japan

Abstract: We develop a multi-objective model for the time-cost trade-off problem in PERT networks with generalized Erlang distributions of activity durations, using a genetic algorithm. The mean duration of each activity is assumed to be a non-increasing function, and the direct cost of each activity a non-decreasing function, of the amount of resource allocated to it. The decision variables of the model are the allocated resource quantities. The problem is formulated as a multi-objective optimal control problem that involves four conflicting objective functions: the project direct cost (to be minimized), the mean of the project completion time (min), the variance of the project completion time (min), and the probability that the project completion time does not exceed a certain threshold (max). It is impossible to solve this problem optimally. Therefore, we apply a "Genetic Algorithm for Numerical Optimizations of Constrained Problems" (GENOCOP) to solve this multi-objective problem using a goal attainment technique. Several factorial experiments are performed to identify genetic algorithm parameters that produce the best results within a given execution time in three typical cases with different configurations. Finally, we compare the genetic algorithm results against the results of a discrete-time approximation method for solving the original optimal control problem.

Keywords: Project management and scheduling; Genetic algorithm; Multiple objective programming; Optimal control; Design of experiments

1. Introduction

Since the late 1950s, critical path method (CPM) techniques have become widely recognized as valuable tools for the planning and scheduling of large projects. In a traditional CPM analysis, the major objective is to schedule a project assuming deterministic durations. However, project activities must be scheduled under available resources, such as crew sizes, equipment and materials. The activity duration can be looked upon as a function of resource availability.
Moreover, different resource combinations have their own costs. Ultimately, the schedule needs to take account of the trade-off between project direct cost and project completion time. For example, using more productive equipment or hiring more workers may save time, but the project direct cost could increase.

In CPM networks, activity duration is viewed either as a function of cost or as a function of resources committed to it. The well-known time-cost trade-off problem (TCTP) in CPM networks takes the former view. In the TCTP, the objective is to determine the duration of each activity in order to achieve the minimum total direct and indirect costs of the project. Studies on TCTP have been done using various kinds of cost functions, such as linear [1,2], discrete [3], convex [4,5], and concave [6].

When the cost functions are arbitrary (still non-increasing), the dynamic programming (DP) approach was suggested by Robinson [7] and Elmaghraby [8]. Tavares [9] has presented a general model based on the decomposition of the project into a sequence of stages; the optimal solution can be easily computed for each practical problem, as shown for a real case study. Weglarz [10] studied this problem using optimal control theory and assumed that the processing speed of each activity at time t is a continuous, non-decreasing function of the amount of resource allocated to the activity at that instant of time. This means that time is considered as a continuous variable. Unfortunately, it seems that this approach is not applicable to networks of a reasonable size (>10).

Recently, some researchers have adopted computational optimization techniques, such as genetic algorithms and simulated annealing, to solve TCTP. Feng et al. [11] and Chua et al. [12] proposed models using genetic algorithms and the Pareto front approach to solve construction time-cost trade-off problems. These models mainly focus on deterministic situations. However, during project implementation, many uncertain variables dynamically affect activity durations, and the costs could also change accordingly. Examples of these variables are weather, space congestion, and productivity level. To deal with such uncertainty in the project completion time, PERT has been developed.

PERT does not take into account the time-cost trade-off. Therefore, combining the aforementioned concepts to develop a time-cost trade-off model under uncertainty would be beneficial to scheduling engineers in forecasting a more realistic project completion time and cost.

In this paper, we develop a multi-objective model for the time-cost trade-off problem in PERT networks, using a genetic algorithm. It is assumed that the activity durations are independent random variables with generalized Erlang distributions. It is also assumed that the amount of resource allocated to each activity is controllable, where the mean duration of each activity is a non-increasing function of this control variable. The direct cost of each activity is also assumed to be a non-decreasing function of the amount of resource allocated to it.

The problem is formulated as a multi-objective optimal control problem, where the objective functions are the project direct cost (to be minimized), the mean of the project completion time (min), its variance (min), and the probability that the project completion time does not exceed a given level (max).
Then, we apply the goal attainment technique, which is a variation of the goal programming technique, to solve this multi-objective problem. As a general-purpose solution method for non-linear programming problems, in order to handle the non-linearity of the problem and to cope with large-scale instances, we apply the revised GENOCOP V, developed by Suzuki [13], which is a direct extension of the genetic algorithm for numerical optimizations of constrained problems (GENOCOP) proposed by Koziel and Michalewicz [14].

Three factorial experiments are performed to identify appropriate genetic algorithm parameters that produce the best results within a given execution time in the three typical cases with different configurations. Moreover, an experiment in randomized block design is conducted to study the effects of three different methods of solving this problem, including the GA, on the objective function value and on the computational time.

The remainder of this paper is organized in the following way. In Section 2, we extend the method of Kulkarni and Adlakha [15] to analytically compute the project completion time distribution in PERT networks with generalized Erlang distributions of activity durations. Section 3 presents the multi-objective resource allocation formulation. In Section 4, we explain the revised GENOCOP V. Section 5 presents the computational experiments, and finally we draw conclusions from these experiments in Section 6.

2. Project completion time distribution in PERT networks

In this section, we present an analytical method to compute the distribution function of the project completion time in PERT networks, or in fact the distribution function of the longest path from the source to the sink node of a directed acyclic stochastic network, where the arc lengths or activity durations are mutually independent random variables with generalized Erlang distributions. To do this, we extend the technique of Kulkarni and Adlakha [15], because this method is analytical, simple, easy to implement on a computer, and computationally stable.

Let $G = (V, A)$ be a PERT network with set of nodes $V = \{v_1, v_2, \ldots, v_m\}$ and set of activities $A = \{a_1, a_2, \ldots, a_n\}$. The duration $T_a$ of activity $a \in A$ exhibits a generalized Erlang distribution of order $n_a$ with infinitesimal generator matrix

$$G_a = \begin{bmatrix} -\lambda_{a1} & \lambda_{a1} & 0 & \cdots & 0 & 0 \\ 0 & -\lambda_{a2} & \lambda_{a2} & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & -\lambda_{a n_a} & \lambda_{a n_a} \\ 0 & 0 & 0 & \cdots & 0 & 0 \end{bmatrix}.$$

In this case, $T_a$ is the time until absorption in the absorbing state. An Erlang distribution of order $n_a$ is a generalized Erlang distribution with $\lambda_{a1} = \lambda_{a2} = \cdots = \lambda_{a n_a}$. When $n_a = 1$, the underlying distribution becomes exponential with parameter $\lambda_{a1}$.

First, we transform the original PERT network into a new one in which all activity durations have exponential distributions. For constructing this network, we use the idea that if the duration of activity $a$ is distributed according to a generalized Erlang distribution of order $n_a$ with infinitesimal generator matrix $G_a$, it can be decomposed into $n_a$ exponential arcs in series with parameters $\lambda_{a1}, \lambda_{a2}, \ldots, \lambda_{a n_a}$. We therefore substitute each generalized Erlang activity with $n_a$ exponential activities in series with these parameters. Now, let $G' = (V', A')$ be the transformed network, in which $V'$ and $A'$ represent its sets of nodes and arcs, respectively, and the duration of each activity $a \in A'$ is exponential with parameter $\lambda_a$.
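To make the phase-type construction concrete, the following minimal Python sketch (an assumption of this edit, not code from the paper) samples a generalized Erlang duration as the sum of independent exponential phases, which is exactly the series-of-exponential-arcs decomposition described above.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_generalized_erlang(rates, size=10_000):
    # A generalized Erlang variable is the sum of independent exponential
    # phases with (possibly distinct) rates lambda_{a1}, ..., lambda_{a n_a}.
    samples = np.zeros(size)
    for lam in rates:
        samples += rng.exponential(scale=1.0 / lam, size=size)
    return samples

durations = sample_generalized_erlang([2.0, 3.0, 5.0])
# The mean of the phase-type duration is the sum of the phase means.
print(durations.mean(), 1/2 + 1/3 + 1/5)
```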
The source and sink nodes are denoted by $s$ and $t$, respectively. For $a \in A'$, let $\alpha(a)$ be the starting node of arc $a$ and $\beta(a)$ the ending node of arc $a$.

Definition 1. Let $I(v)$ and $O(v)$ be the sets of arcs ending and starting at node $v$, respectively, defined as

$$I(v) = \{a \in A' : \beta(a) = v\} \quad (v \in V'), \qquad (1)$$

$$O(v) = \{a \in A' : \alpha(a) = v\} \quad (v \in V'). \qquad (2)$$

Definition 2. If $X \subset V'$ is such that $s \in X$ and $t \in \bar X = V' - X$, then an $(s,t)$ cut is defined as

$$(X, \bar X) = \{a \in A' : \alpha(a) \in X,\ \beta(a) \in \bar X\}. \qquad (3)$$

An $(s,t)$ cut $(X, \bar X)$ is called a uniformly directed cut (UDC) if $(\bar X, X)$ is empty.

Example 1. Before proceeding, we illustrate the material by an example; consider the network shown in Fig. 1. Clearly, (1,2) is a uniformly directed cut (UDC), because $V'$ is divided into two disjoint subsets $X$ and $\bar X$, where $s \in X$ and $t \in \bar X$. The other UDCs of this network are (2,3), (1,4,6), (3,4,6) and (5,6).

Definition 3. Let $D = E \cup F$ be a uniformly directed cut (UDC) of a network. It is called an admissible 2-partition if $I(\beta(a)) \not\subseteq F$ for every $a \in F$.

To illustrate this definition, consider Example 1 again. As mentioned, (3,4,6) is a UDC. This cut can be divided into two subsets $E$ and $F$, for example $E = \{4\}$ and $F = \{3,6\}$. In this case the cut is an admissible 2-partition, because $I(\beta(3)) = \{3,4\} \not\subseteq F$ and also $I(\beta(6)) = \{5,6\} \not\subseteq F$. However, if $E = \{6\}$ and $F = \{3,4\}$, then the cut is not an admissible 2-partition, because $I(\beta(3)) = \{3,4\} \subseteq F = \{3,4\}$.

Definition 4. During project execution, at time $t$ each activity can be in one of the active, dormant or idle states, defined as follows:

(i) Active. An activity is active at time $t$ if it is being executed at time $t$.
(ii) Dormant. An activity $a$ is dormant at time $t$ if it has finished but there is at least one unfinished activity in $I(\beta(a))$. If an activity is dormant at time $t$, then its successor activities in $O(\beta(a))$ cannot begin.
(iii) Idle. An activity is idle at time $t$ if it is neither active nor dormant at time $t$.

The sets of active and dormant activities at time $t$ are denoted by $Y(t)$ and $Z(t)$, respectively, and $X(t) = (Y(t), Z(t))$.

Consider Example 1 again. If activity 3 is dormant, it has finished but the next activity, i.e. 5, cannot begin because activity 4 is still active. Table 1 presents all admissible 2-partition cuts of this network; a superscript star denotes a dormant activity, and all others are active. $E$ contains all active activities, while $F$ includes all dormant ones.

Let $S$ denote the set of all admissible 2-partition cuts of the network, and $\bar S = S \cup \{(\emptyset, \emptyset)\}$. Note that $X(t) = (\emptyset, \emptyset)$ implies that $Y(t) = \emptyset$ and $Z(t) = \emptyset$, i.e. all activities are idle at time $t$ and hence the project is completed by time $t$. It is proven that $\{X(t), t \ge 0\}$ is a continuous-time Markov process with state space $\bar S$; refer to [15] for details.

As mentioned, $E$ and $F$ contain the active and dormant activities of a UDC, respectively. When activity $a$ finishes (with rate $\lambda_a$) and there is at least one unfinished activity in $I(\beta(a))$, it moves from $E$ to a new dormant activity set $F'$. Furthermore, if by finishing this activity its succeeding activities $O(\beta(a))$ become active, this set is included in the new $E'$, while the elements of $I(\beta(a))$ (one of which belongs to $E$ and the others to $F$) are deleted from the respective sets. Thus, the elements of the infinitesimal generator matrix $Q = [q\{(E,F),(E',F')\}]$, with $(E,F)$ and $(E',F') \in \bar S$, are calculated as

$$q\{(E,F),(E',F')\} = \begin{cases} \lambda_a & \text{if } a \in E,\ I(\beta(a)) \not\subseteq F \cup \{a\},\ E' = E \setminus \{a\},\ F' = F \cup \{a\}, & (4) \\ \lambda_a & \text{if } a \in E,\ I(\beta(a)) \subseteq F \cup \{a\},\ E' = (E \setminus \{a\}) \cup O(\beta(a)),\ F' = F \setminus I(\beta(a)), & (5) \\ -\sum_{a \in E} \lambda_a & \text{if } E' = E,\ F' = F, & (6) \\ 0 & \text{otherwise.} & (7) \end{cases}$$

In Example 1, if we consider $E = \{1,2\}$, $F = \emptyset$, $E' = \{2,3\}$ and $F' = \emptyset$, then $E' = (E \setminus \{1\}) \cup O(\beta(1))$, and thus from (5), $q\{(E,F),(E',F')\} = \lambda_1$.

$\{X(t), t \ge 0\}$ is a finite-state absorbing continuous-time Markov process. Since $q\{(\emptyset,\emptyset),(\emptyset,\emptyset)\} = 0$, this state is absorbing and the other states are transient. Furthermore, we number the states in $\bar S$ such that the $Q$ matrix is upper triangular. We assume that the states are numbered $1, 2, \ldots, N = |\bar S|$; state 1 is the initial state, namely $(O(s), \emptyset)$, and state $N$ is the absorbing state, namely $(\emptyset, \emptyset)$.

Table 1. All admissible 2-partition cuts of the example network:

1. (1,2)     5. (1,4*,6)    9. (3*,4,6)    13. (3,4*,6*)   17. (∅,∅)
2. (2,3)     6. (1,4,6*)   10. (3,4*,6)    14. (5,6)
3. (2,3*)    7. (1,4*,6*)  11. (3,4,6*)    15. (5*,6)
4. (1,4,6)   8. (3,4,6)    12. (3*,4,6*)   16. (5,6*)

Let $T$ represent the length of the longest path in the network, i.e. the project completion time. Clearly, $T = \min\{t > 0 : X(t) = N \mid X(0) = 1\}$; thus $T$ is the time until $\{X(t), t \ge 0\}$ gets absorbed in the final state starting from state 1. Chapman-Kolmogorov backward equations can be applied to compute $F(t) = P\{T \le t\}$. If we define

$$P_i(t) = P\{X(t) = N \mid X(0) = i\}, \quad i = 1, 2, \ldots, N, \qquad (8)$$

then $F(t) = P_1(t)$. The system of differential equations for the vector $P(t) = [P_1(t), P_2(t), \ldots, P_N(t)]^T$ is given by

$$P'(t) = Q P(t), \quad P(0) = [0, 0, \ldots, 1]^T. \qquad (9)$$

3. Multi-objective resource allocation problem

In this section, we develop a multi-objective model to optimally control the resources allocated to the activities in a PERT network whose activity durations exhibit generalized Erlang distributions, where the mean duration of each activity is a non-increasing function, and the direct cost a non-decreasing function, of the amount of resource allocated to it. We may decrease the project direct cost by decreasing the amount of resource allocated to the activities; clearly, however, this increases the mean project completion time, because these objectives conflict with each other. Consequently, an appropriate trade-off between the total direct costs and the mean project completion time is required. The variance of the project completion time should also be considered in the model, because if we focus only on the mean, the resource quantities may be non-optimal when the completion time varies substantially because of randomness. The probability that the project completion time does not exceed a certain threshold is also important in many cases and is considered as well.

Therefore, we have a multi-objective stochastic programming problem. The objective functions are the project direct cost (to be minimized), the mean of the project completion time (min), the variance of the project completion time (min), and the probability that the project completion time does not exceed a certain threshold (max).

The direct cost of activity $a \in A$ is assumed to be a non-decreasing function $d_a(x_a)$ of the amount of resource $x_a$ allocated to it; the project direct cost is therefore $\sum_{a \in A} d_a(x_a)$. The mean duration of activity $a \in A$, which equals $\sum_{j=1}^{n_a} 1/\lambda_{aj}$, is assumed to be a non-increasing function $g_a(x_a)$ of the amount of resource $x_a$ allocated to it. Let $U_a$ represent the amount of resource available to be allocated to activity $a$, and $L_a$ the minimum amount of resource required to carry out activity $a$. In reality, $d_a(x_a)$ and $g_a(x_a)$ can be estimated using linear regression.
We can collect sample paired data of $d_a(x_a)$ and $g_a(x_a)$ as the dependent variables, for different values of $x_a$ as the independent variables, from previous similar activities or from the judgments of experts in this area; the parameters of the relevant linear regression model can then be estimated.

The mean and the variance of the project completion time are given by

$$E(T) = \int_0^\infty (1 - P_1(t))\, dt, \qquad (10)$$

$$\operatorname{Var}(T) = \int_0^\infty t^2 P_1'(t)\, dt - \left( \int_0^\infty t P_1'(t)\, dt \right)^2, \qquad (11)$$

where $P_1'(t)$ is the density function of the project completion time. The probability that the project completion time does not exceed the given threshold $u$ is

$$P(T \le u) = P_1(u). \qquad (12)$$

The infinitesimal generator matrix $Q$ is a function of the vector $\lambda = [\lambda_{aj};\ a \in A,\ j = 1, 2, \ldots, n_a]^T$ in the optimal control problem. Therefore, the non-linear dynamic model is

$$P'(t) = Q(\lambda) P(t), \quad P_i(0) = 0 \ \forall i = 1, \ldots, N-1, \quad P_N(t) = 1. \qquad (13)$$

Accordingly, the appropriate multi-objective optimal control problem is

$$\begin{aligned}
\min\ & f_1(x, \lambda) = \sum_{a \in A} d_a(x_a), \\
\min\ & f_2(x, \lambda) = \int_0^\infty (1 - P_1(t))\, dt, \\
\min\ & f_3(x, \lambda) = \int_0^\infty t^2 P_1'(t)\, dt - \left( \int_0^\infty t P_1'(t)\, dt \right)^2, \\
\max\ & f_4(x, \lambda) = P_1(u) \\
\text{s.t. } & P'(t) = Q(\lambda) P(t), \quad P_i(0) = 0 \ \forall i = 1, \ldots, N-1, \quad P_N(t) = 1, \\
& g_a(x_a) = \sum_{j=1}^{n_a} \frac{1}{\lambda_{aj}}, \quad a \in A, \\
& L_a \le x_a \le U_a, \quad a \in A, \\
& \lambda_{aj} \ge 0, \quad a \in A,\ j = 1, 2, \ldots, n_a. \qquad (14)
\end{aligned}$$

A possible approach to solving (14) to optimality is to use the Maximum Principle (see [16] for details). For simplicity, consider solving the problem with only one of the objective functions, $f_2(x, \lambda) = \int_0^\infty (1 - P_1(t))\, dt$. Clearly, $x_a = g_a^{-1}\!\left(\sum_{j=1}^{n_a} 1/\lambda_{aj}\right)$ for $a \in A$. Therefore, we can consider $\lambda$ as the unique control vector of the problem and ignore the role of $x = [x_1, x_2, \ldots, x_n]^T$ as the other independent decision vector. Consider $K$ as the set of allowable controls consisting of all constraints except those representing the dynamic model ($\lambda \in K$), and the $N$-vector $\mu(t)$ as the adjoint vector function. Then the Hamiltonian function is

$$H(\mu(t), P(t), \lambda) = \mu(t)^T Q(\lambda) P(t) + 1 - P_1(t). \qquad (15)$$

Now we write the adjoint equations and terminal conditions:

$$-\mu'(t)^T = \mu(t)^T Q(\lambda) + [-1, 0, \ldots, 0], \qquad \mu(T)^T = 0,\ T \to \infty. \qquad (16)$$

If we could compute $\mu(t)$ from (16), then we would be able to minimize the Hamiltonian function subject to $\lambda \in K$ in order to get the optimal control $\lambda^*$, and solve the problem optimally. Unfortunately, the adjoint equations (16) depend on the unknown control vector $\lambda$, and therefore they cannot be solved directly. If we could also minimize the Hamiltonian (15), subject to $\lambda \in K$, for an optimal control function in closed form $\lambda^* = f(P^*(t), \mu^*(t))$, then we could substitute this into the state equations $P'(t) = Q(\lambda) P(t)$, $P(0) = [0, 0, \ldots, 1]^T$, and the adjoint equations (16) to obtain a set of differential equations forming a two-point boundary value problem. Unfortunately, we cannot obtain $\lambda^*$ by differentiating $H$ with respect to $\lambda$, because the minimum of $H$ occurs on the boundary of $K$, and consequently $\lambda^*$ cannot be obtained in closed form.

According to these points, it is impossible to solve the optimal control problem (14) optimally, even in the restricted case of a single-objective problem; relatively few optimal control problems can be solved optimally. Therefore, we apply a genetic algorithm for numerical optimizations of constrained problems (revised GENOCOP V), fully described in Section 4, to solve this problem using a goal attainment method.
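Equations (10)-(12) reduce the first moments of the completion time to integrals of the state probability $P_1(t)$. A minimal Python sketch of that numerical step follows; the closed-form CDF used here (a simple two-phase example) is an assumption for illustration, not the paper's network.

```python
import numpy as np

def trapz(y, x):
    # Trapezoidal rule; avoids depending on a specific NumPy version's name.
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)

# Hypothetical completion-time CDF P1(t): absorption time of a two-phase
# chain with rates 1.0 and 2.0 (a generalized Erlang / hypoexponential law).
def P1(t):
    return 1.0 - 2.0 * np.exp(-t) + np.exp(-2.0 * t)

t = np.linspace(0.0, 40.0, 100_001)          # truncate the infinite integrals
mean = trapz(1.0 - P1(t), t)                 # Eq. (10): E(T) = integral of 1 - P1(t)
pdf = np.gradient(P1(t), t)                  # P1'(t), the completion-time density
var = trapz(t**2 * pdf, t) - trapz(t * pdf, t) ** 2   # Eq. (11)
print(mean, var)                             # exact values here: 1.5 and 1.25
```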
3.1. Goal attainment method

This method requires setting up a goal $b_j$ and weight $c_j$ ($c_j \ge 0$) for $j = 1, 2, 3, 4$, for the four indicated objective functions. The weight $c_j$ relates the relative under-attainment of the goal $b_j$; for under-attainment of the goals, a smaller $c_j$ is associated with the more important objectives. The $c_j$, $j = 1, 2, 3, 4$, are generally normalized so that $\sum_{j=1}^{4} c_j = 1$. The appropriate goal attainment formulation to obtain $x^*$ is

$$\begin{aligned}
\min\ & z \\
\text{s.t. } & \sum_{a \in A} d_a(x_a) - c_1 z \le b_1, \\
& \int_0^\infty (1 - P_1(t))\, dt - c_2 z \le b_2, \\
& \int_0^\infty t^2 P_1'(t)\, dt - \left( \int_0^\infty t P_1'(t)\, dt \right)^2 - c_3 z \le b_3, \\
& P_1(u) + c_4 z \ge b_4, \\
& P'(t) = Q(\lambda) P(t), \quad P_i(0) = 0 \ \forall i = 1, \ldots, N-1, \quad P_N(t) = 1, \\
& g_a(x_a) = \sum_{j=1}^{n_a} \frac{1}{\lambda_{aj}}, \quad a \in A, \\
& L_a \le x_a \le U_a, \quad a \in A, \\
& \lambda_{aj} \ge 0, \quad a \in A,\ j = 1, \ldots, n_a, \\
& z \ge 0. \qquad (17)
\end{aligned}$$

Lemma 1. If $x^*$ is Pareto-optimal, then there exists a $(c, b)$ pair such that $x^*$ is an optimal solution to the optimization problem (17).

4. A genetic algorithm for numerical optimizations of constrained problems (revised GENOCOP V)

In this section, we use the revised GENOCOP V, proposed as a general-purpose method for solving non-linear programming problems of the form

$$\begin{aligned}
\min\ & f(\lambda) \\
\text{s.t. } & g_r(\lambda) = 0, \quad r = 1, 2, \ldots, k_1, \\
& h_r(\lambda) \le 0, \quad r = k_1+1, k_1+2, \ldots, k, \\
& L_j \le \lambda_j \le U_j, \quad j = 1, 2, \ldots, l, \qquad (18)
\end{aligned}$$

where $\lambda$ is an $l$-dimensional decision vector, $g_r(\lambda) = 0$, $r = 1, \ldots, k_1$, are $k_1$ equality constraints, and $h_r(\lambda) \le 0$, $r = k_1+1, \ldots, k$, are $k - k_1$ inequality constraints. These are assumed to be either linear or non-linear real-valued functions. Moreover, $L_j$ and $U_j$, $j = 1, \ldots, l$, are the lower and upper bounds of the decision variables, respectively.

In order to obtain the form given in (18), we reformulate problem (17) by combining the objective functions and the state equations. We also consider a new decision vector $\lambda = [\lambda_j;\ j = 1, 2, \ldots, m]^T$, where $m = n + \sum_{i=1}^{n} n_i$, instead of the original decision vectors $x$ and $\lambda$, in the reformulated problem (19). The appropriate min-max problem is

$$\begin{aligned}
\min\ & f(\lambda) = \max\{z_1(\lambda), z_2(\lambda), z_3(\lambda), z_4(\lambda)\} \\
\text{s.t. } & g_r(\lambda) = 0, \quad r = 1, 2, \ldots, n, \\
& L_j \le \lambda_j \le U_j, \quad j = 1, 2, \ldots, n, \qquad (19)
\end{aligned}$$

where

$$z_1(\lambda) = \frac{f_1(x,\lambda) - b_1}{c_1}, \quad z_2(\lambda) = \frac{f_2(x,\lambda) - b_2}{c_2}, \quad z_3(\lambda) = \frac{f_3(x,\lambda) - b_3}{c_3}, \quad z_4(\lambda) = \frac{b_4 - f_4(x,\lambda)}{c_4},$$

$$g_r(\lambda) = g_r(x_r) - \sum_{j=1}^{n_r} \frac{1}{\lambda_{rj}} = 0, \quad r = 1, 2, \ldots, n,$$

and

$$P'(t) = Q(\lambda) P(t), \quad P(0) = [0, 0, \ldots, 1]^T. \qquad (20)$$

It should be noted that, in our computer program, $P_1(t)$ is obtained by solving the system of differential equations (20) analytically, and the mean and variance of the project completion time are then computed numerically. Problem (19) does not have the inequality constraints ($h_r(\lambda) \le 0$) of problem (18); the only restriction is that the elements of the decision vector $\lambda$ are selected between the given lower and upper bounds.

We apply the revised GENOCOP V, developed by Suzuki [13], which is a direct extension of the genetic algorithm for numerical optimizations of constrained problems (GENOCOP) proposed by Koziel and Michalewicz [14].
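As an illustration of the min-max reformulation (19), the sketch below folds four hypothetical objective values into the single scalar $f(\lambda) = \max_j z_j(\lambda)$; the goals, weights, and objective values are placeholder assumptions, not values from the paper.

```python
def goal_attainment_scalar(objs, goals, weights, senses):
    # objs[j]: objective value f_j; goals[j]: goal b_j; weights[j]: c_j.
    # senses[j] is "min" or "max"; a maximized objective enters as (b_j - f_j)/c_j.
    zs = []
    for f, b, c, sense in zip(objs, goals, weights, senses):
        zs.append((f - b) / c if sense == "min" else (b - f) / c)
    return max(zs)  # the GA minimizes this single fitness value

# Hypothetical evaluation of one chromosome: (direct cost, E(T), Var(T), P(T<=u)).
objs = (120.0, 14.0, 9.0, 0.80)
goals = (100.0, 12.0, 8.0, 0.90)
weights = (0.4, 0.3, 0.2, 0.1)
print(goal_attainment_scalar(objs, goals, weights, ("min", "min", "min", "max")))
```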
In GENOCOP V, an initial reference point is generated randomly from individuals satisfying the lower and upper bounds, which is quite difficult in practice. Furthermore, because a new search point is randomly generated on the line segment between a search point and a reference point, the effectiveness and speed of the search may be quite low. The revised GENOCOP V overcomes these drawbacks by generating an initial reference point through minimizing the sum of squares of the violated non-linear constraints, and by using a bisection method for generating a new feasible point on the line segment between a search point and a reference point.

To be more explicit about finding the initial reference point: for some $\bar\lambda$ with $L_j \le \bar\lambda_j \le U_j$, $j = 1, \ldots, l$, we use the set of violated non-linear equality constraints

$$I_g = \{w \mid g_w(\bar\lambda) \ne 0,\ w = 1, \ldots, k_1\} \qquad (21a)$$

and the set of violated non-linear inequality constraints

$$I_h = \{w \mid h_w(\bar\lambda) > 0,\ w = k_1+1, \ldots, k\}. \qquad (21b)$$

An unconstrained optimization problem is formulated to minimize the sum of squares of the violated non-linear constraints,

$$\min \sum_{w \in I_g} (g_w(\lambda))^2 + \sum_{w \in I_h} (h_w(\lambda))^2, \qquad (21c)$$

and problem (21) is solved to obtain one initial reference point.

In the bisection method for generating a new search point, two cases are considered, in which the search points are either feasible or infeasible individuals. If the search points are feasible, a new search point is generated on the line segment between a search point and a reference point. If the search points are infeasible, a boundary point is found and a new point is generated on the line segment between the boundary point and a reference point. If the feasible space is not convex, the new point could be infeasible; in this case the generation of a new point is repeated until it becomes feasible.

4.1. Computational procedures of revised GENOCOP V

In this section, the genetic algorithm for numerical optimizations of constrained problems (revised GENOCOP V) is summarized step by step.

Step 0. Determine the values of the population size P, the total number of generations G, the probability of mutation P_m, and the probability of crossover P_c.
Step 1. Generate one or more initial reference points by minimizing the sum of squares of violated non-linear constraints.
Step 2. Generate the initial population consisting of P individuals.
Step 3. Solve the system of differential equations in (20) and compute P_1(t) for each individual. The solution is found as follows: first the eigenvalues, and then the related eigenvectors, of the constant coefficient matrix Q are computed, and from these the solution is assembled for each individual.
Step 4. Decode each individual (genotype) in the current population and calculate its fitness (phenotype).
Step 5. Apply the mutation and crossover operations with the probabilities provided in Step 0.
Step 6. Generate the new population by applying the reproduction operator, based on ranking selection.
Step 7. When the maximum number of iterations is reached, go to Step 8; otherwise, increase the generation number by 1 and go to Step 3.
Step 8. Stop.

5. Computational experiments

To investigate the performance of the proposed genetic algorithm method (revised GENOCOP V) for the time-cost trade-off problem in PERT networks, we consider 3 typical small, medium and large cases with different configurations.

Discovering Accurate and Interesting Classification Rules Using Genetic Algorithm

Discovering Accurate and Interesting Classification Rules Using Genetic Algorithm
Janaki Gopalan, Reda Alhajj, Ken Barker

Abstract: Discovering accurate and interesting classification rules is a significant task in the post-processing stage of a data mining (DM) process. Therefore, an optimization problem exists between the accuracy and the interestingness metrics for post-processing rule sets. To achieve a balance, in this paper, we propose two major post-processing tasks. In the first task, we use a genetic algorithm (GA) to find the best combination of rules that maximizes the predictive accuracy on the sample training set; thus we obtain the maximized accuracy. In the second task, we rank the rules by assigning objective rule interestingness (RI) measures (or weights) to the rules in the rule set. We then propose a pruning strategy using a GA to find the best combination of interesting rules with the maximized (or greater) accuracy. We tested our implementation on three data sets. The results are very encouraging; they demonstrate the applicability and effectiveness of our approach.

Keywords: post-processing, data mining, classification rules, rule interestingness, genetic algorithms.

1. Introduction

Data mining is generally defined as the process of extracting previously unknown knowledge from a given database. A DM process is divided into three stages, namely the pre-processing, mining, and post-processing stages [1,20]. The post-processing stage of the DM process involves interpretation of the discovered knowledge or some post-processing of this knowledge. An example of [...] boosted hypothesis rule [4]. Therefore, the accuracy of the obtained results is biased by the accuracy with which these weights are obtained. Moreover, these weights are based on one metric, which is the classification accuracy of the classifier. In this paper, we propose a pruning strategy by extending the idea proposed by Thompson. In our strategy, we use a GA with objective rule interestingness measures (based on Freitas [7]) to find the most interesting subset with a performance accuracy of at least the maximized accuracy on the sample set (problem space). These measures are based on several objective metrics (including the accuracy metric) to derive interesting as well as accurate rules. Therefore, the resulting rule set from the solution of our GA method is the best combination of accurate, interesting classification rules. These rules are then tested for their accuracy on the unknown validation set (the solution space).

The rest of the paper is organized as follows. Section 2 describes the related work in rule-set refinement for classification rules. Section 3 discusses the implementation using a GA for this problem. In Section 4, we give the experimental results using the GA method. Section 5 is conclusions and future work.

2. Related Work

In this section, we discuss the related work in rule-set refinement for classification rules, namely: 1) the rule interestingness (RI) principles proposed for classification rules; and 2) finding the best set (or subset) of rules from the discovered rule set. The task of assigning an RI measure is discussed first, followed by the discussion of deriving the best combination of accurate rules.

Methods for the selection of interesting classification rules can be divided into subjective and objective methods. Subjective methods are user-driven and domain-dependent.
By contrast, objective methods are data-driven and domain-independent. A comprehensive review of subjective aspects of RI is available [10]. Piatetsky-Shapiro [11], Major and Mangano [12], and Kamber and Shinghal [13] propose objective principles for RI that include the rule quality factors of coverage, completeness, and a confidence factor. Freitas (1999) [7] extended the objective RI principles [11,12,13] to include additional factors such as the disjunct size, imbalance of class distributions, attribute interestingness, misclassification costs, and the asymmetric nature of classification rules.

We consider next the problem of pruning rule sets. Prodromidis et al. [14] present methods for pruning classifiers in a distributed meta-learning system: a pre-training pruning is used to select a subset of classifiers from an ensemble, which are then combined by a meta-learned combiner. Margineantu and Dietterich [2] use a backfitting algorithm for pruning classifier sets. This involves choosing an additional classifier to add to a set of classifiers by a greedy search and then checking that each of the other classifiers in the set cannot be replaced by another to produce a better ensemble. Thompson [4] proposes a GA to prune a classifier ensemble to find the right combination of classifiers without over-fitting the training set. The proposed GA uses a real-valued encoding: each chromosome has one real-valued gene per classifier in the ensemble, and each gene represents the voting weight of its corresponding classifier, calculated using a boosted-hypothesis (WVBH) rule. The fitness function consists of measuring the predictive accuracy of the classifier ensemble, with the weights proposed by the chromosome, on a hold-out set (different from the training set). Two major conclusions are drawn: 1) in a majority of the experiments performed with the classifier sets, a subset of classifiers from the original ensemble had better classification accuracy; and 2) the GA method was very efficient in finding the right set of pruned classifiers. Moreover, the pruned classifier sets from the GA method have better classification accuracy than the pruned classifier sets from earlier work [4].

In this paper, we propose a pruning strategy using a GA to find the best set of interesting and accurate rules by extending the idea proposed by Thompson. In the rest of the paper, we first discuss our proposed approach that employs a GA to address this problem; finally, we present the experimental validations along with future work.

3. Post-processing Rule Sets Using Genetic Algorithms

GAs were introduced by Holland [8] as a general model of adaptive processes, but were subsequently widely exploited as optimizers [8]. Basically, a GA can be used for solving problems for which it is possible to construct an objective function (also known as a fitness function) to estimate how well a given representative (solution) fits the considered environment (problem). In general, the main motivation for using GAs in any data mining process is that they perform a global search and cope better with interaction than the greedy rule induction algorithms often used in data mining.

Genetic algorithms can be used in the post-processing stage of the DM process; very little work has been reported in the literature in this area. As reviewed in earlier sections, Thompson [4] proposes a GA to prune a classifier ensemble efficiently. We implemented GAKPER, a GA-based Knowledge Discovery algorithm for deriving Efficient Rules, to achieve our goal.
In our implementation, the original dataset is divided into a sample set (to train) and a validation set (to test). The task is divided into two parts. In the first part, a binary-encoded GA is used to find the most accurate subset of rules, i.e. the one with the best classification accuracy on the sample set (problem space). In the second part, a binary-encoded GA is used to find the most interesting subset with accuracy of at least the maximized accuracy on the sample set. Finally, the derived accurate interesting rules are tested on the unknown validation set (solution space) for their accuracy. Each of these parts is described in the subsections below.

3.1. GAKPER Algorithm - Part I

A binary-encoded GA is used to search for the best combination of accurate rules. Each chromosome in the population is a subset of the classification rules. The length of the chromosome is the number of rules in the rule set, and each gene represents the corresponding classification rule. For example, if there are ten rules in the original rule set, then "1001100111" is a possible chromosome, where the first, fourth, fifth, eighth, ninth, and tenth rules from the rule set are chosen to represent the set of accurate rules. A solution in the phenotype space is represented by a single chromosome and all possible chromosomes are valid; hence a 1-1 mapping exists between the genotype and phenotype spaces.

The fitness function first measures the predictive accuracy of the rules (represented by the chromosome) on the entire sample set. This is achieved as follows. The true class values for all instances in the sample set are stored prior to running the GA. The classes predicted by the rule set representing the chromosome are known. Therefore, to classify the test instances from the sample set, the fitness function takes a vote of the rules from the rule set. Thus, to calculate the class of a single test instance, the class predicted by the rules representing the chromosome (based on majority vote) is matched against the original stored class value of the test instance. This process is repeated for all the test instances in the sample set. Finally, the best combination of rules that maximizes the predictive accuracy on the sample set (i.e., the number of correctly predicted test instances) is obtained.
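The following minimal Python sketch (an assumption of this edit, with a toy rule representation) illustrates the fitness idea just described: a binary chromosome selects a subset of rules, and fitness is the number of sample instances the selected rules classify correctly by majority vote.

```python
from collections import Counter

# Toy rules: each maps an instance (a dict of attributes) to a predicted
# class, or None when the rule does not fire. These rules are illustrative.
rules = [
    lambda inst: "yes" if inst["x"] > 5 else None,
    lambda inst: "no" if inst["y"] == 0 else None,
    lambda inst: "yes" if inst["y"] > 2 else None,
]

sample = [({"x": 7, "y": 3}, "yes"), ({"x": 1, "y": 0}, "no"), ({"x": 6, "y": 0}, "yes")]

def fitness(chromosome):
    # chromosome: one bit per rule; a 1 keeps the rule in the subset.
    correct = 0
    for inst, true_class in sample:
        votes = Counter(p for bit, rule in zip(chromosome, rules)
                        if bit and (p := rule(inst)) is not None)
        if votes and votes.most_common(1)[0][0] == true_class:
            correct += 1
    return correct  # number of correctly predicted sample instances

print(fitness([1, 1, 0]), fitness([1, 0, 1]))
```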
This maximum accuracy is recorded, and the fitness value of a chromosome is defined in terms of the number of instances it predicts correctly (Eq. (1)). All these aspects are precisely encoded into the GA, and all chromosomes (potential solutions) are rewarded or punished according to the criteria stated above during the process of evolution. The outcome of several evolutions modeled by this GA generates the right set of accurate rules.

The GAKPER algorithm (Part I) is presented below:

ALGORITHM: GAKPER-I
Input: Classification rule set
Output: Set of accurate rules
Method:
1) Search for the right set of accurate rules using the GA method as follows:
   1a) Randomly create an initial population of potential accurate rules.
   1b) Iteratively perform the following sub-steps on the population until the termination criterion of the GA is satisfied:
       a) FITNESS: Evaluate fitness f(x) of each chromosome in the population
       b) NEW POPULATION
          b1) SELECTION: Based on f(x)
          b2) RECOMBINATION: 2-point crossover of chromosomes
          b3) MUTATION: Mutate chromosomes
          b4) ACCEPTATION: Reject or accept new one
       c) REPLACE: Replace old with new population in new generation
       d) TEST: Test problem criterion (number of generations is > 100)
2) After the termination criterion is satisfied, the best chromosome in the population produced during the run is designated as the required combination of the accurate rules.

3.2. GAKPER Algorithm - Part II

A binary-encoded GA is used to search for the best combination of interesting accurate rules. The representation of the chromosome is the same as described in Part I: each chromosome in the population is a subset of the classification rules, the length of the chromosome is the number of rules in the rule set, and each gene represents the corresponding classification rule.

The maximum predictive accuracy is known from the result of the GAKPER algorithm Part I. The RI measure (or weight) proposed by Freitas is assigned to all the rules. The fitness function optimizes the weights to find the best combination of interesting rules with a classification accuracy of at least that maximum. It is important to note that the accuracy of the rules, when tested on the test instances (from the sample set), is based on the weighted majority vote; thus, a rule with a higher RI measure ranks higher in classification than a rule with a lower RI measure. If the accuracy of the rules is greater than (or equal to) the maximum, then the fitness value of the chromosome is the sum of the weights of the rules it represents; otherwise the chromosome is given a default fitness value. The intuitive idea behind this default value is: 1) the value has to be greater than zero to continue the GA runs in subsequent generations, and 2) since the fitness criterion is not satisfied by the respective chromosome, an arbitrary value is chosen to punish the least-fit chromosome.
The fitness value of the chromosome is thus defined accordingly: if the accuracy condition holds, the fitness is the sum of the weights; otherwise it is the default value (Eq. (2)). All these aspects are precisely encoded into the GA, and all chromosomes (potential solutions) are rewarded or punished according to the criteria stated above during the process of evolution. The outcome of several evolutions modeled by this GA generates the right set of accurate interesting rules.

The GAKPER algorithm (Part II) is presented below:

ALGORITHM: GAKPER-II
Input: Classification rule set
Output: Set of accurate interesting rules
Method:
(1) Assign weights for the rules in the input rule set.
(2) Search for the right set of accurate interesting rules using the GA method as follows:
    2a) Randomly create an initial population of potential accurate interesting rules.
    2b) Iteratively perform the following sub-steps on the population until the termination criterion of the GA is satisfied:
        a) FITNESS: Evaluate fitness f(x) of each chromosome in the population
        b) NEW POPULATION
           b1) SELECTION: Based on f(x)
           b2) RECOMBINATION: 2-point crossover of chromosomes
           b3) MUTATION: Mutate chromosomes
           b4) ACCEPTATION: Reject or accept new one
        c) REPLACE: Replace old with new population: the new generation
        d) TEST: Test problem criterion (number of generations is > 100)
(3) After the termination criterion is satisfied, the best chromosome in the population produced during the run is designated as the right combination of the accurate interesting rules.

The approach is implemented as a standard GA written in C, similar to Grefenstette's GENESIS program; Baker's SUS selection algorithm [21] is employed; 2-point crossover is maintained at 60% and mutation is very low; and selection is based on proportional fitness. It is important to note that this approach optimizes the predictive accuracy and the interestingness measures of the rule set on the entire sample set, which is the problem space.

4. Results and Discussions

All the tests have been conducted on a single-processor Intel(R) Xeon(TM) UNIX machine with a 2.80 GHz CPU and a 512 KB cache. The GAKPER algorithm implementation is tested on five datasets. The data splitting, that is, dividing the dataset into the sample set and the validation set, is performed using a random sampling technique. The classification rules are obtained using the sample set. Recall that, to prune the rule set and derive the set of interesting accurate rules, the following tests are performed. GAKPER (Part I) is used to derive the subset of rules that maximizes the classification accuracy on the sample set. To derive the most interesting subset, an RI measure or weight (based on Freitas) is assigned to the rules, and GAKPER (Part II) is used to maximize the interestingness measure of the rules whose classification accuracy on the sample set is at least the maximum from Part I. The accuracy of the derived interesting accurate rules (on the validation set) from this approach is compared with: 1) pruning the rule set using a GA without assigning initial weights to the rules, and 2) using the entire rule set without any pruning. The results for the five datasets are presented below.

Data Set 1: Breast Cancer Data Set

This is a real data set obtained from the Tom Baker Cancer Centre, Calgary, Alberta, Canada. The original dataset consists of follow-up records, each described by a number of attributes; each record represents follow-up data for one breast cancer case. Breast cancer "recurred" in some of these patients after the initial occurrence; hence, each patient is classified as "recurrent" or "non-recurrent" depending on his or her status. With respect to classification, the dataset has 2 classes: 1) recurrent patients, and 2) non-recurrent patients.
The original dataset is divided into a sample set and a validation set using the random sampling technique. The GA-based approaches, i.e., GAKPER-I and GAKPER-II, are used (on the sample set) to find the best combination of accurate interesting rules; these rules are tested on the unknown validation set for accuracy. The GA parameters are as described in Section 3. Thirty experiments were performed with the GA approaches, and each GA experiment was run for over 100 generations in both approaches. The post-processing results on the sample set and the validation set are presented in Table 1 and Table 2, respectively.

Table 1. Post-processing results using different approaches on the sample set for Dataset 1:

                              Without Weights
Correctly Predicted (in %)    80
Incorrectly Predicted (in %)  10
Unknown (in %)                10

Table 2. Post-processing results using different approaches on the validation set for Dataset 1:

                              Without Weights
Correctly Predicted (in %)    77
Incorrectly Predicted (in %)  8
Unknown (in %)                15

Table 3. Post-processing results using different approaches on the sample set for Dataset 2:

                              With Weights   Entire Rule Set
Correctly Predicted (in %)    94             94
Incorrectly Predicted (in %)  0              0
Unknown (in %)                6              6

Table 4. Post-processing results using different approaches on the validation set for Dataset 2:

                              Without Weights
Correctly Predicted (in %)    91
Incorrectly Predicted (in %)  3
Unknown (in %)                6

In Table 3, the accuracy of the rule sets, while pruning, using the different approaches on the sample set (the problem space) for Dataset 2 is presented. Table 4 presents the accuracy results of the discovered rule sets (using the different approaches) when tested on the validation set (the solution space) for Dataset 2. For this dataset, we found that the accuracy of the result on the unknown validation set using the different approaches is the same (as presented in the first, second and third columns of Table 4). This is also depicted graphically in Fig. 2.

Data Set 3: US-CENSUS-DATASET

This data is the USCensus1990 raw data set obtained from the UCI repository [19]. The data was collected as part of the 1990 census. The dataset used here consists of instances and attributes derived from the original USCensus1990 raw dataset. It contains classes named iClass=0, iClass=5, and iClass=1; each class refers to the native country of the candidate under consideration.

The original dataset is divided into a sample set and a validation set using the random sampling technique. The GA-based approaches, i.e., GAKPER-I and GAKPER-II, are used (on the sample set) to find the best combination of accurate interesting rules; these rules are tested on the unknown validation set for accuracy. The same GA parameters enumerated earlier for Dataset 1 are used here. The post-processing results are presented in Table 5 and Table 6.

In Table 5, the accuracy of the rule sets, while pruning, using the different approaches on the sample set (the problem space) for Dataset 3 is presented. Table 6 presents the accuracy results of the discovered rule sets (using the different approaches) when tested on the validation set (the solution space) for Dataset 3. It is very important to observe that, for this dataset, the accuracy of the result on the unknown validation set using our approach (first column in Table 6) is much greater than the accuracy of the results using the traditional approaches (second and third columns in Table 6). This is also depicted graphically in Fig. 3.

Table 5. Post-processing results using different approaches on the sample set for the US census data:

                              Without Weights   With Weights   Entire Rule Set
Correctly Predicted (in %)    66                75             60
Incorrectly Predicted (in %)  34                25             40
Unknown (in %)                0                 0              0

From the results, it can be observed that: 1) in the majority of the tests performed, a subset of rules pruned from the original set has better performance accuracy; and 2) it is possible to derive the most interesting subset with a higher classification accuracy as compared to the original rule set. Therefore, the GA, with its inherent robust search strategies, is well suited to the post-processing problem considered in this paper.

5. Conclusions and Future Work

In this paper, we propose and implement a GA-based methodology to derive interesting and accurate classification rules from a dataset. The fundamental goal of any data mining model is to derive interesting rules; at the same time, accuracy is a key issue. Therefore, in the post-processing component, the problem of deriving interesting accurate rules is addressed. Earlier works in this area addressed the following two problems independently: 1) finding the accurate set (or subset) of rules by applying pruning strategies to the original rule set [4]; and 2) assigning subjective or objective RI measures to the rules to determine their interestingness [7,11,12,13]. In our work, a new methodology is proposed by first assigning an objective RI measure based on Freitas [7] to the rules and then using pruning strategies with a GA to search for the right set of interesting accurate rules.

An alternative approach worth investigating is to find the interesting accurate rules using multi-objective genetic algorithms, the goal being to optimize two parameters, namely the interestingness and the accuracy metrics of the classification rules, simultaneously.

References

[1] D. Pyle, "Data Preparation for Data Mining". Morgan Kaufmann, 1999.
[2] D. D. Margineantu and T. G. Dietterich, "Pruning Adaptive Boosting". Proceedings of the 14th International Conference on Machine Learning, San Francisco, CA, pp. 211-218, 1997.
[3] J. R. Quinlan, "Boosting First Order Learning". Proceedings of the 14th International Conference on Machine Learning, 1997.
[4] S. Thompson, "Genetic Algorithms as Postprocessors for Data Mining". Data Mining with Evolutionary Algorithms: Research Directions - Papers from the AAAI Workshop, pp. 18-22, 1999.
[5] R. E. Schapire, Y. Freund, P. Bartlett and W. S. Lee, "Boosting the Margin: a New Explanation for the Effectiveness of Voting Methods". Machine Learning: Proceedings of the 14th International Conference, pp. 322-330, 1997.
[6] P. Domingos, "Knowledge Acquisition from Examples via Multiple Models". Machine Learning: Proceedings of the 14th International Conference, pp. 98-106, 1997.
[7] A. A. Freitas, "On Rule Interestingness Measures". Knowledge-Based Systems, 12, 1999.
[8] D. E. Goldberg, "Genetic Algorithms in Search, Optimization and Machine Learning". Addison Wesley, Boston, MA, 1989.
[9] A. A. Freitas, "A Survey of Evolutionary Algorithms for Data Mining and Knowledge Discovery". Advances in Evolutionary Computation, Springer-Verlag, 2001.
[10] B. Liu, W. Hsu, and Y. Ma, "Integrating Classification and Association Rule Mining". Proc. of KDD, pp. 80-86, 1998.
[11] G. Piatetsky-Shapiro, "Discovery, Analysis, and Presentation of Strong Rules". Knowledge Discovery in Databases, pp. 229-248, 1991.
[12] J. A. Major and J. J. Mangano, "Selecting Among Rules Induced from a Hurricane Database". Proceedings of the AAAI-93 Workshop on Knowledge Discovery in Databases, pp. 28-44, 1993.
[13] M. Kamber and R. Shinghal, "Evaluating the Interestingness of Characteristic Rules". Proceedings of the 2nd International Conference on KDD, pp. 28-44, 1993.
[14] A. L. Prodromidis and S. Stolfo, "Pruning Classifiers in a Distributed Meta-Learning System". Proceedings of KDD, pp. 151-160, 1998.
[15] C. M. Fonseca and P. J. Fleming, "Genetic Algorithms for Multi-Objective Optimization: Formulation, Discussion and Generalization". Proceedings of the Fifth International Conference on Genetic Algorithms, pp. 93-100, 1993.
[16] R. J. Bayardo, "Brute-Force Mining of High-Confidence Classification Rules". Proc. of KDD, pp. 123-126, 1997.
[17] W. Li, J. Han, and J. Pei, "CMAR: Accurate and Efficient Classification Based on Multiple Class-Association Rules". Proc. of IEEE-ICDM, pp. 369-376, 2001.
[18] R. Agrawal and R. Srikant, "Fast Algorithms for Mining Association Rules". Proc. of VLDB, Santiago, Chile, September 1994.
[19] C. L. Blake and C. J. Merz, "UCI Repository of Machine Learning Databases". University of California, Department of Information and Computer Science, Irvine, CA, 1998.
[20] I. H. Witten and E. Frank, "Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations". Morgan Kaufmann, October 1999.
[21] J. E. Baker, "Reducing Bias and Inefficiency in the Selection Algorithm". Proc. of the International Conference on Genetic Algorithms, pp. 14-21, 1987.

Genetic Algorithm

Selection
The selection (reproduction) operator copies chromosomes of the current population into the new population with probability proportional to their fitness values. Main idea: chromosomes with higher fitness have a greater chance of being selected (copied). Implementation 1: roulette wheel selection:
- Sum the fitness values of all chromosomes in the population to obtain the total; each chromosome's fitness, as a proportion of that total, becomes its selection probability Ps.
- Generate a random number m between 0 and the total.
- Starting from chromosome 1, accumulate the fitness values of successive chromosomes until the running sum is greater than or equal to m; the chromosome at which this happens is selected.
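A minimal Python sketch of this cumulative-sum procedure follows; the fitness values are taken from the example table further below.

```python
import random

def roulette_select(fitnesses):
    # Draw m uniformly in [0, total fitness), then walk the cumulative
    # sums until they reach m; higher fitness => a larger slice of the wheel.
    total = sum(fitnesses)
    m = random.uniform(0.0, total)
    cumulative = 0.0
    for i, f in enumerate(fitnesses):
        cumulative += f
        if cumulative >= m:
            return i
    return len(fitnesses) - 1  # numerical safety net

fitnesses = [8, 15, 2, 5]  # the fitness column of the table below
counts = [0] * len(fitnesses)
for _ in range(10_000):
    counts[roulette_select(fitnesses)] += 1
print(counts)  # selection frequencies are roughly proportional to fitness
```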
Selection
Fitness values of the chromosomes and their proportions
Roulette wheel selection
Selection
Selection probability of each chromosome:

Chromosome no.   Chromosome   Fitness   Selection probability   Cumulative fitness
1                01110        8         0.16                    8
2                11000        15        0.30                    23
3                00100        2         0.04                    25
4                10010        5         0.10                    30
5                —            —         —                       —
6                —            —         —                       —
Survival of the Fittest
The main evolutionary rule adopted by the GA is "survival of the fittest": better solutions are retained, while worse solutions are eliminated.
Correspondence between biological evolution and the genetic algorithm:

Biological evolution       Genetic algorithm
Environment                Fitness function
Survival of the fittest    The solution with the highest fitness value has the highest probability of being retained
Individual                 One solution of the problem
Chromosome                 Encoding of a solution
Gene                       An element of the encoding
Group                      A selected set of solutions
Population                 A set of solutions selected according to the fitness function
Crossover                  The process of producing offspring from two parents in a prescribed way
Mutation                   The process in which some components of an encoding change
Basic operations of the genetic algorithm
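The table above maps each biological notion onto a GA component. A compact Python skeleton assembling those components into the basic GA loop is sketched below; the bit-string encoding, the one-max fitness function, and the parameter values are illustrative assumptions.

```python
import random

GENES, POP, GENERATIONS, P_CROSS, P_MUT = 10, 20, 50, 0.6, 0.01

def fitness(chrom):
    # Illustrative fitness: number of 1-bits (the "one-max" toy problem).
    return sum(chrom)

def select(pop):
    # Fitness-proportional (roulette wheel) selection, as described earlier.
    return random.choices(pop, weights=[fitness(c) + 1e-9 for c in pop])[0]

def crossover(a, b):
    # Single-point crossover: the offspring inherits a prefix from one parent.
    point = random.randrange(1, GENES)
    return a[:point] + b[point:]

def mutate(chrom):
    # Flip each bit (gene) with a small probability.
    return [g ^ 1 if random.random() < P_MUT else g for g in chrom]

pop = [[random.randint(0, 1) for _ in range(GENES)] for _ in range(POP)]
for _ in range(GENERATIONS):
    offspring = []
    for _ in range(POP):
        a, b = select(pop), select(pop)
        child = crossover(a, b) if random.random() < P_CROSS else a[:]
        offspring.append(mutate(child))
    pop = offspring
print(max(pop, key=fitness))  # the best surviving encoding
```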

Genetic Algorithm + Custom Composite Sharding Algorithm

The genetic algorithm (Genetic Algorithm) is a heuristic search algorithm inspired by the theory of natural evolution.

It simulates the process of biological evolution, searching for the optimal solution of a problem through generation-by-generation evolution.

The custom composite sharding algorithm, in turn, is a specific partitioning scheme that decomposes a problem into smaller subproblems for processing.

The basic principle of the genetic algorithm is to encode candidate solutions (individuals) as genes and then produce new candidate solutions through operations such as selection, crossover, and mutation.

These operations mimic natural selection, crossover, and mutation, in the hope of finding better solutions; a sketch of the encoding step mentioned above follows.
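As one concrete (assumed) encoding, a real-valued parameter can be mapped to a fixed-length bit string and back, which is the "gene encoding" step just described.

```python
def encode(value, lo, hi, bits=16):
    # Quantize value in [lo, hi] to an integer level and render it as bits.
    level = round((value - lo) / (hi - lo) * (2**bits - 1))
    return [(level >> i) & 1 for i in reversed(range(bits))]

def decode(chrom, lo, hi):
    # Inverse mapping: bits -> integer level -> real value in [lo, hi].
    level = int("".join(map(str, chrom)), 2)
    return lo + level * (hi - lo) / (2**len(chrom) - 1)

bits = encode(3.7, 0.0, 10.0)
print(bits, decode(bits, 0.0, 10.0))  # round-trips to approximately 3.7
```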

The strength of the genetic algorithm is that it can handle complex problems and perform a global search over the solution space.

The custom composite sharding algorithm is a method for decomposing a problem into multiple subproblems.

Based on the characteristics and requirements of the problem, it splits the problem into smaller, simpler subproblems and obtains the solution of the original problem by solving them.

This decomposition can improve solving efficiency and exploit the advantages of parallel computation.

Combining the genetic algorithm with the custom composite sharding algorithm yields a more powerful problem-solving method.

First, the genetic algorithm can be used to tune the parameters and strategies of the custom composite sharding algorithm in order to obtain a better partitioning scheme.

Second, the genetic algorithm can be applied within the solution of each subproblem to further optimize the subproblem solutions.

In this way, the overall solving process can approach a globally optimal solution; a sketch of the first idea follows.
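The following Python sketch is a hypothetical illustration of the first idea: a chromosome encodes shard boundaries over a sequence of keyed items, and fitness rewards balanced shard loads. All names, the load data, and the balance fitness are assumptions of this edit, not part of the original text; for brevity it uses a mutation-driven variant without crossover.

```python
import random

random.seed(1)
loads = [random.randint(1, 100) for _ in range(200)]  # hypothetical per-key load
SHARDS = 4

def fitness(boundaries):
    # boundaries: SHARDS-1 sorted cut points splitting indices into shards.
    cuts = [0] + list(boundaries) + [len(loads)]
    return -max(sum(loads[a:b]) for a, b in zip(cuts, cuts[1:]))  # balance shards

def random_individual():
    return sorted(random.sample(range(1, len(loads)), SHARDS - 1))

def mutate(ind):
    # Nudge one boundary by a small random step, keeping the cuts valid.
    ind = list(ind)
    i = random.randrange(len(ind))
    ind[i] = min(len(loads) - 1, max(1, ind[i] + random.randint(-5, 5)))
    return sorted(set(ind)) if len(set(ind)) == SHARDS - 1 else random_individual()

pop = [random_individual() for _ in range(30)]
for _ in range(100):
    pop.sort(key=fitness, reverse=True)
    parents = pop[:10]                                  # truncation selection
    pop = parents + [mutate(random.choice(parents)) for _ in range(20)]
best = max(pop, key=fitness)
print(best, -fitness(best))  # best boundaries and their heaviest-shard load
```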

The advantage of the combined genetic algorithm and custom composite sharding approach is that it can handle complex problems while searching and optimizing adaptively.

At the same time, the problem can be decomposed and solved flexibly according to its characteristics, improving solving efficiency.

However, the method also faces challenges, such as parameter tuning, algorithm design, and computational resource requirements.

In short, the genetic algorithm combined with the custom composite sharding algorithm is a powerful problem-solving method that unites the global search of the former with the decomposition and optimization capability of the latter.

With careful design and application, it can address a wide range of problems comprehensively.

Comparison of Multiobjective Evolutionary Algorithms: Empirical Results

Abstract

In this paper, we provide a systematic comparison of various evolutionary approaches to multiobjective optimization using six carefully chosen test functions. Each test function involves a particular feature that is known to cause difficulty in the evolutionary optimization process, mainly in converging to the Pareto-optimal front (e.g., multimodality and deception). By investigating these different problem features separately, it is possible to predict the kind of problems to which a certain technique is or is not well suited. However, in contrast to what was suspected beforehand, the experimental results indicate a hierarchy of the algorithms under consideration. Furthermore, the emerging effects are evidence that the suggested test functions provide sufficient complexity to compare multiobjective optimizers. Finally, elitism is shown to be an important factor for improving evolutionary multiobjective search.

Keywords: Evolutionary algorithms, multiobjective optimization, Pareto optimality, test functions, elitism.

1. Motivation

Evolutionary algorithms (EAs) have become established as the method at hand for exploring the Pareto-optimal front in multiobjective optimization problems that are too complex to be solved by exact methods, such as linear programming and gradient search. This is not only because there are few alternatives for searching intractably large spaces for multiple Pareto-optimal solutions: due to their inherent parallelism and their capability to exploit similarities of solutions by recombination, they are able to approximate the Pareto-optimal front in a single optimization run. The numerous applications and the rapidly growing interest in the area of multiobjective EAs take this fact into account.

After the first pioneering studies on evolutionary multiobjective optimization appeared in the mid-eighties (Schaffer, 1984, 1985; Fourman, 1985), several different EA implementations were proposed in the years 1991-1994 (Kursawe, 1991; Hajela and Lin, 1992; Fonseca and Fleming, 1993; Horn et al., 1994; Srinivas and Deb, 1994). Later, these approaches (and variations of them) were successfully applied to various multiobjective optimization problems (Ishibuchi and Murata, 1996; Cunha et al., 1997; Valenzuela-Rendón and Uresti-Charre, 1997; Fonseca and Fleming, 1998; Parks and Miller, 1998). In recent years, some researchers have investigated particular topics of evolutionary multiobjective search, such as convergence to the Pareto-optimal front (Van Veldhuizen and Lamont, 1998a; Rudolph, 1998), niching (Obayashi et al., 1998), and elitism (Parks and Miller, 1998; Obayashi et al., 1998), while others have concentrated on developing new evolutionary techniques (Laumanns et al., 1998; Zitzler and Thiele, 1999). For a thorough discussion of evolutionary algorithms for multiobjective optimization, the interested reader is referred to Fonseca and Fleming (1995), Horn (1997), Van Veldhuizen and Lamont (1998b), and Coello (1999).

In spite of this variety, there is a lack of studies that compare the performance and different aspects of these approaches. Consequently, the question arises: which implementations are suited to which sort of problem, and what are the specific advantages and drawbacks of different techniques? First steps in this direction have been made in both theory and practice. On the theoretical side, Fonseca and Fleming (1995) discussed the influence of different fitness assignment strategies on the selection process. On the practical side,
Zitzler and Thiele (1998, 1999) used an NP-hard 0/1 knapsack problem to compare several multiobjective EAs.

In this paper, we provide a systematic comparison of six multiobjective EAs, including a random search strategy as well as a single-objective EA using objective aggregation. The basis of this empirical study is formed by a set of well-defined, domain-independent test functions that allow the investigation of independent problem features. We thereby draw upon results presented in Deb (1999), where problem features that may make convergence of EAs to the Pareto-optimal front difficult are identified and, furthermore, methods of constructing appropriate test functions are suggested. The functions considered here cover the range of convexity, nonconvexity, discrete Pareto fronts, multimodality, deception, and biased search spaces. Hence, we are able to systematically compare the approaches based on different kinds of difficulty and to determine more exactly where certain techniques are advantageous or have trouble. In this context, we also examine further factors such as population size and elitism.

The paper is structured as follows: Section 2 introduces key concepts of multiobjective optimization and defines the terminology used in this paper mathematically. We then give a brief overview of the multiobjective EAs under consideration with special emphasis on the differences between them. The test functions, their construction, and their choice are the subject of Section 4, which is followed by a discussion about performance metrics to assess the quality of trade-off fronts. Afterwards, we present the experimental results in Section 6 and investigate further aspects like elitism (Section 7) and population size (Section 8) separately. A discussion of the results as well as future perspectives are given in Section 9.

2 Definitions

Optimization problems involving multiple, conflicting objectives are often approached by aggregating the objectives into a scalar function and solving the resulting single-objective optimization problem. In contrast, in this study, we are concerned with finding a set of optimal trade-offs, the so-called Pareto-optimal set. In the following, we formalize this well-known concept and also define the difference between local and global Pareto-optimal sets.

A multiobjective search space is partially ordered in the sense that two arbitrary solutions are related to each other in two possible ways: either one dominates the other or neither dominates.

DEFINITION 1: Let us consider, without loss of generality, a multiobjective minimization problem with $m$ decision variables (parameters) and $n$ objectives:

Minimize $y = f(x) = (f_1(x), \ldots, f_n(x))$, where $x = (x_1, \ldots, x_m) \in X$ and $y = (y_1, \ldots, y_n) \in Y$  (1)

and where $x$ is called decision vector, $X$ parameter space, $y$ objective vector, and $Y$ objective space. A decision vector $a \in X$ is said to dominate a decision vector $b \in X$ (also written as $a \succ b$) if and only if

$\forall i \in \{1, \ldots, n\}: f_i(a) \le f_i(b) \;\wedge\; \exists j \in \{1, \ldots, n\}: f_j(a) < f_j(b)$  (2)

Additionally, in this study, we say $a$ covers $b$ ($a \succeq b$) if and only if $a \succ b$ or $f(a) = f(b)$.

Based on the above relation, we can define nondominated and Pareto-optimal solutions:

DEFINITION 2: Let $a \in X$ be an arbitrary decision vector.

1. The decision vector $a$ is said to be nondominated regarding a set $A \subseteq X$ if and only if there is no vector in $A$ which dominates $a$; formally

$\nexists a' \in A: a' \succ a$  (3)

If it is clear within the context which set $A$ is meant, we simply leave it out.

2. The decision vector $a$ is Pareto-optimal if and only if $a$ is nondominated regarding $X$.

Pareto-optimal decision vectors cannot be improved in any objective without causing a degradation in at least one other objective; they represent, in our terminology, globally optimal solutions.
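To make Definitions 1 and 2 concrete, the following minimal Python sketch (ours, not from the paper; all names are invented) checks the dominance relation between two objective vectors and filters a small population down to its nondominated members, assuming all objectives are minimized:

def dominates(a, b):
    """Return True if objective vector a dominates b (minimization):
    a is no worse in every objective and strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def nondominated(objs):
    """Return the subset of objective vectors not dominated by any other."""
    return [a for i, a in enumerate(objs)
            if not any(dominates(b, a) for j, b in enumerate(objs) if j != i)]

# Example: (1, 3) and (2, 1) are mutually nondominated, while (2, 4)
# is dominated by (1, 3).
print(nondominated([(1, 3), (2, 4), (2, 1)]))   # -> [(1, 3), (2, 1)]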
However, analogous to single-objective optimization problems, there may also be local optima which constitute a nondominated set within a certain neighborhood. This corresponds to the concepts of global and local Pareto-optimal sets introduced by Deb (1999):

DEFINITION 3: Consider a set of decision vectors $A \subseteq X$.

1. The set $A$ is denoted as a local Pareto-optimal set if and only if

$\forall a \in A: \nexists x \in X: x \succ a \;\wedge\; \lVert x - a \rVert < \epsilon \;\wedge\; \lVert f(x) - f(a) \rVert < \delta$  (4)

where $\lVert \cdot \rVert$ is a corresponding distance metric and $\epsilon > 0$, $\delta > 0$.

2. The set $A$ is called a global Pareto-optimal set if and only if

$\forall a \in A: \nexists x \in X: x \succ a$  (5)

Note that a global Pareto-optimal set does not necessarily contain all Pareto-optimal solutions. If we refer to the entirety of the Pareto-optimal solutions, we simply write "Pareto-optimal set"; the corresponding set of objective vectors is denoted as "Pareto-optimal front".

3 Evolutionary Multiobjective Optimization

Two major problems must be addressed when an evolutionary algorithm is applied to multiobjective optimization:

1. How to accomplish fitness assignment and selection, respectively, in order to guide the search towards the Pareto-optimal set.

2. How to maintain a diverse population in order to prevent premature convergence and achieve a well distributed trade-off front.

Often, different approaches are classified with regard to the first issue, where one can distinguish between criterion selection, aggregation selection, and Pareto selection (Horn, 1997). Methods performing criterion selection switch between the objectives during the selection phase. Each time an individual is chosen for reproduction, potentially a different objective will decide which member of the population will be copied into the mating pool. Aggregation selection is based on the traditional approaches to multiobjective optimization where the multiple objectives are combined into a parameterized single objective function.
The parameters of the resulting function are systematically varied during the same run in order to find a set of Pareto-optimal solutions. Finally, Pareto selection makes direct use of the dominance relation from Definition 1; Goldberg (1989) was the first to suggest a Pareto-based fitness assignment strategy.

In this study, six of the most salient multiobjective EAs are considered, where for each of the above categories, at least one representative was chosen. Nevertheless, there are many other methods that may be considered for the comparison (cf. Van Veldhuizen and Lamont (1998b) and Coello (1999) for an overview of different evolutionary techniques):

• Among the class of criterion selection approaches, the Vector Evaluated Genetic Algorithm (VEGA) (Schaffer, 1984, 1985) has been chosen. Although some serious drawbacks are known (Schaffer, 1985; Fonseca and Fleming, 1995; Horn, 1997), this algorithm has been a strong point of reference up to now. Therefore, it has been included in this investigation.

• The EA proposed by Hajela and Lin (1992) is based on aggregation selection in combination with fitness sharing (Goldberg and Richardson, 1987), where an individual is assessed by summing up the weighted objective values. As weighted-sum aggregation appears still to be widespread due to its simplicity, Hajela and Lin's technique has been selected to represent this class of multiobjective EAs.

• Pareto-based techniques seem to be most popular in the field of evolutionary multiobjective optimization (Van Veldhuizen and Lamont, 1998b). In particular, the algorithm presented by Fonseca and Fleming (1993), the Niched Pareto Genetic Algorithm (NPGA) (Horn and Nafpliotis, 1993; Horn et al., 1994), and the Nondominated Sorting Genetic Algorithm (NSGA) (Srinivas and Deb, 1994) appear to have achieved the most attention in the EA literature and have been used in various studies. Thus, they are also considered here. Furthermore, a recent elitist Pareto-based strategy, the Strength Pareto Evolutionary Algorithm (SPEA) (Zitzler and Thiele, 1999), which outperformed four other multiobjective EAs on an extended 0/1 knapsack problem, is included in the comparison.

4 Test Functions for Multiobjective Optimizers

Deb (1999) has identified several features that may cause difficulties for multiobjective EAs in 1) converging to the Pareto-optimal front and 2) maintaining diversity within the population. Concerning the first issue, multimodality, deception, and isolated optima are well-known problem areas in single-objective evolutionary optimization. The second issue is important in order to achieve a well distributed nondominated front. However, certain characteristics of the Pareto-optimal front may prevent an EA from finding diverse Pareto-optimal solutions: convexity or nonconvexity, discreteness, and nonuniformity. For each of the six problem features mentioned, a corresponding test function is constructed following the guidelines in Deb (1999). We thereby restrict ourselves to only two objectives in order to investigate the simplest case first. In our opinion, two objectives are sufficient to reflect essential aspects of multiobjective optimization. Moreover, we do not consider maximization or mixed minimization/maximization problems.

Each of the test functions defined below is structured in the same manner and consists itself of three functions $f_1$, $g$, $h$ (Deb, 1999, 216):

Minimize $T(x) = (f_1(x_1), f_2(x))$
subject to $f_2(x) = g(x_2, \ldots, x_m) \cdot h(f_1(x_1), g(x_2, \ldots, x_m))$  (6)
where $x = (x_1, \ldots, x_m)$

The function $f_1$ is a function of the first decision variable only, $g$ is a function of the remaining variables, and the parameters of $h$ are the function values of $f_1$ and $g$.
The test functions differ in these three functions as well as in the number of variables $m$ and in the values the variables may take.

DEFINITION 4: We introduce six test functions $T_1, \ldots, T_6$ that follow the scheme given in Equation 6:

The test function $T_1$ has a convex Pareto-optimal front:

$f_1(x_1) = x_1$, $g(x_2, \ldots, x_m) = 1 + 9 \cdot \big(\textstyle\sum_{i=2}^{m} x_i\big)/(m-1)$, $h(f_1, g) = 1 - \sqrt{f_1/g}$  (7)

where $m = 30$ and $x_i \in [0, 1]$. The Pareto-optimal front is formed with $g(x) = 1$.

The test function $T_2$ is the nonconvex counterpart to $T_1$:

$f_1(x_1) = x_1$, $g$ as in Equation 7, $h(f_1, g) = 1 - (f_1/g)^2$  (8)

where $m = 30$ and $x_i \in [0, 1]$. The Pareto-optimal front is formed with $g(x) = 1$.

The test function $T_3$ represents the discreteness feature; its Pareto-optimal front consists of several noncontiguous convex parts:

$f_1(x_1) = x_1$, $g$ as in Equation 7, $h(f_1, g) = 1 - \sqrt{f_1/g} - (f_1/g)\sin(10\pi f_1)$  (9)

where $m = 30$ and $x_i \in [0, 1]$. The Pareto-optimal front is formed with $g(x) = 1$.

The test function $T_4$ tests for the ability to deal with multimodality:

$f_1(x_1) = x_1$, $g(x_2, \ldots, x_m) = 1 + 10(m-1) + \textstyle\sum_{i=2}^{m}\big(x_i^2 - 10\cos(4\pi x_i)\big)$, $h(f_1, g) = 1 - \sqrt{f_1/g}$  (10)

where $m = 10$, $x_1 \in [0, 1]$, and $x_2, \ldots, x_m \in [-5, 5]$. The global Pareto-optimal front is formed with $g(x) = 1$, the best local Pareto-optimal front with $g(x) = 1.25$. Note that not all local Pareto-optimal sets are distinguishable in the objective space.

The test function $T_5$ describes a deceptive problem and distinguishes itself from the other test functions in that $x_i$ represents a binary string:

$f_1(x_1) = 1 + u(x_1)$, $g(x_2, \ldots, x_m) = \textstyle\sum_{i=2}^{m} v(u(x_i))$, $h(f_1, g) = 1/f_1$  (11)

where $u(x_i)$ gives the number of ones in the bit vector $x_i$ (unitation), $v(u(x_i)) = 2 + u(x_i)$ if $u(x_i) < 5$ and $v(u(x_i)) = 1$ if $u(x_i) = 5$, and $m = 11$, $x_1 \in \{0, 1\}^{30}$, $x_2, \ldots, x_m \in \{0, 1\}^{5}$. The true Pareto-optimal front is formed with $g(x) = 10$, while the deceptive Pareto-optimal fronts are represented by solutions for which the substrings are at the deceptive attractors ($u(x_i) = 0$). The global Pareto-optimal front as well as the local ones are convex.

The test function $T_6$ includes two difficulties caused by the nonuniformity of the search space: first, the Pareto-optimal solutions are nonuniformly distributed along the global Pareto front (the front is biased for solutions for which $f_1(x_1)$ is near one); second, the density of the solutions is lowest near the Pareto-optimal front and highest away from the front:

$f_1(x_1) = 1 - \exp(-4x_1)\sin^6(6\pi x_1)$, $g(x_2, \ldots, x_m) = 1 + 9\cdot\big(\big(\textstyle\sum_{i=2}^{m} x_i\big)/(m-1)\big)^{0.25}$, $h(f_1, g) = 1 - (f_1/g)^2$  (12)

where $m = 10$ and $x_i \in [0, 1]$. The Pareto-optimal front is formed with $g(x) = 1$ and is nonconvex.

We will discuss each function in more detail in Section 6, where the corresponding Pareto-optimal fronts are visualized as well (Figures 1-6).
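As an illustration of the construction scheme, the convex test function $T_1$ (widely known as ZDT1) can be implemented directly from its definition. This is a sketch (ours, not from the paper) using $m = 30$ and $x_i \in [0, 1]$ as stated above:

import math

def zdt1(x):
    """Convex test function T1 (ZDT1): returns (f1, f2) for x in [0, 1]^m."""
    m = len(x)                                  # m = 30 in the paper
    f1 = x[0]
    g = 1.0 + 9.0 * sum(x[1:]) / (m - 1)
    h = 1.0 - math.sqrt(f1 / g)
    return f1, g * h

# On the Pareto-optimal front g = 1, i.e., x2 = ... = xm = 0:
print(zdt1([0.25] + [0.0] * 29))               # -> (0.25, 0.5)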
5 Metrics of Performance

Comparing different optimization techniques experimentally always involves the notion of performance. In the case of multiobjective optimization, the definition of quality is substantially more complex than for single-objective optimization problems, because the optimization goal itself consists of multiple objectives:

• The distance of the resulting nondominated set to the Pareto-optimal front should be minimized.

• A good (in most cases uniform) distribution of the solutions found is desirable. The assessment of this criterion might be based on a certain distance metric.

• The extent of the obtained nondominated front should be maximized, i.e., for each objective, a wide range of values should be covered by the nondominated solutions.

In the literature, some attempts can be found to formalize the above definition (or parts of it) by means of quantitative metrics. Performance assessment by means of weighted-sum aggregation was introduced by Esbensen and Kuh (1996). Thereby, a set of decision vectors is evaluated regarding a given linear combination by determining the minimum weighted-sum of all corresponding objective vectors. Based on this concept, a sample of linear combinations is chosen at random (with respect to a certain probability distribution), and the minimum weighted-sums for all linear combinations are summed up and averaged. The resulting value is taken as a measure of quality. A drawback of this metric is that only the "worst" solution determines the quality value per linear combination. Although several weight combinations are used, nonconvex regions of the trade-off surface contribute to the quality more than convex parts and may, as a consequence, dominate the performance assessment. Finally, the distribution, as well as the extent of the nondominated front, is not considered.

Another interesting means of performance assessment was proposed by Fonseca and Fleming (1996). Given a set of nondominated solutions, a boundary function divides the objective space into two regions: the objective vectors for which the corresponding solutions are not covered by the set and the objective vectors for which the associated solutions are covered by it. They call this particular function, which can also be seen as the locus of the family of tightest goal vectors known to be attainable, the attainment surface. Taking multiple optimization runs into account, a method is described to compute a median attainment surface by using auxiliary straight lines and sampling their intersections with the attainment surfaces obtained. As a result, the samples represented by the median attainment surface can be relatively assessed by means of statistical tests and, therefore, allow comparison of the performance of two or more multiobjective optimizers. A drawback of this approach is that it remains unclear how the quality difference can be expressed, i.e., how much better one algorithm is than another. However, Fonseca and Fleming describe ways of meaningful statistical interpretation in contrast to the other studies considered here, and furthermore, their methodology seems to be well suited to visualization of the outcomes of several runs.

In the context of investigations on convergence to the Pareto-optimal front, some authors (Rudolph, 1998; Van Veldhuizen and Lamont, 1998a) have considered the distance of a given set to the Pareto-optimal set in the same way as the first function defined below. The distribution was not taken into account, because the focus was not on this matter. However, in comparative studies, distance alone is not sufficient for performance evaluation, since extremely differently distributed fronts may have the same distance to the Pareto-optimal front.

Two complementary metrics of performance were presented in Zitzler and Thiele (1998, 1999). On one hand, the size of the dominated area in the objective space is taken under consideration; on the other hand, a pair of nondominated sets is compared by calculating the fraction of each set that is covered by the other set. The area combines all three criteria (distance, distribution, and extent) into one, and therefore, sets differing in more than one criterion may not be distinguished. The second metric is in some way similar to the comparison methodology proposed in Fonseca and Fleming (1996). It can be used to show that the outcomes of an algorithm dominate the outcomes of another algorithm, although it does not tell how much better it is. We give its definition here, because it is used in the remainder of this paper.

DEFINITION 5: Let $A, B \subseteq X$ be two sets of decision vectors. The function $C$ maps the ordered pair $(A, B)$ to the interval $[0, 1]$:

$C(A, B) := \lvert\{\, b \in B \mid \exists\, a \in A: a \succeq b \,\}\rvert \,/\, \lvert B \rvert$  (14)

The value $C(A, B) = 1$ means that all decision vectors in $B$ are covered by decision vectors in $A$; the value $C(A, B) = 0$ represents the situation when none of the points in $B$ are covered by $A$.
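The coverage function $C$ of Definition 5 is straightforward to compute on objective vectors. Below is a minimal sketch under the assumption of minimization (function and variable names are ours):

def covers(a, b):
    """a covers b (weak dominance): a is no worse than b in every objective."""
    return all(x <= y for x, y in zip(a, b))

def coverage(A, B):
    """C(A, B): fraction of objective vectors in B covered by some member of A."""
    covered = [b for b in B if any(covers(a, b) for a in A)]
    return len(covered) / len(B)

A = [(1, 3), (2, 1)]
B = [(2, 4), (2, 1), (0, 5)]
print(coverage(A, B))   # (2, 4) and (2, 1) are covered -> 0.666...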
DEFINITION 6: Let $A' \subseteq X$ be a set of pairwise nondominated decision vectors and $\bar{X} \subseteq X$ the Pareto-optimal set. We introduce three metrics on the parameter space:

1. The function $M_1$ gives the average distance to the Pareto-optimal set $\bar{X}$:

$M_1(A') := \frac{1}{\lvert A' \rvert} \sum_{a' \in A'} \min\{\, \lVert a' - \bar{x} \rVert \mid \bar{x} \in \bar{X} \,\}$

2. The function $M_2$ takes the distribution in combination with the number of nondominated solutions found into account:

$M_2(A') := \frac{1}{\lvert A' \rvert - 1} \sum_{a' \in A'} \lvert\{\, b' \in A' \mid \lVert a' - b' \rVert > \sigma \,\}\rvert$

3. The function $M_3$ considers the extent of the front described by $A'$:

$M_3(A') := \sqrt{\sum_{i=1}^{m} \max\{\, \lVert a'_i - b'_i \rVert \mid a', b' \in A' \,\}}$

Analogously, we define three metrics $M_1^*$, $M_2^*$, and $M_3^*$ on the objective space. Let $A^* \subseteq Y$ be the set of objective vectors corresponding to $A'$, and $\bar{Y}$ and $\sigma^*$ defined as before:

$M_1^*(A^*) := \frac{1}{\lvert A^* \rvert} \sum_{a^* \in A^*} \min\{\, \lVert a^* - \bar{y} \rVert \mid \bar{y} \in \bar{Y} \,\}$

$M_2^*(A^*) := \frac{1}{\lvert A^* \rvert - 1} \sum_{a^* \in A^*} \lvert\{\, b^* \in A^* \mid \lVert a^* - b^* \rVert > \sigma^* \,\}\rvert$, $\quad M_3^*(A^*) := \sqrt{\sum_{i=1}^{n} \max\{\, \lVert a^*_i - b^*_i \rVert \mid a^*, b^* \in A^* \,\}}$

While $M_1$ and $M_1^*$ are intuitive, $M_2$ and $M_3$ (respectively $M_2^*$ and $M_3^*$) need further explanation. The distribution metrics give a value within the interval $[0, \lvert A' \rvert]$ ($[0, \lvert A^* \rvert]$) that reflects the number of $\sigma$-niches ($\sigma^*$-niches) in $A'$ ($A^*$). Obviously, the higher the value, the better the distribution for an appropriate neighborhood parameter (e.g., $M_2^*(A^*) = \lvert A^* \rvert$ means that for each objective vector there is no other objective vector within $\sigma^*$-distance to it). The functions $M_3$ and $M_3^*$ use the maximum extent in each dimension to estimate the range to which the front spreads out. In the case of two objectives, this equals the distance of the two outer solutions.
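One possible reading of the distribution and extent metrics as code, sketched on the objective space (the choice of distance function, the neighborhood parameter sigma, and all names are our assumptions):

import math

def m2_distribution(front, sigma, dist):
    """M2*: (1/(n-1)) * sum over points of how many other points lie farther
    than sigma away; values near n indicate a well spread front."""
    n = len(front)
    return sum(sum(1 for j in range(n) if j != i and dist(front[i], front[j]) > sigma)
               for i in range(n)) / (n - 1)

def m3_extent(front):
    """M3*: square root of the summed maximal extents per objective dimension."""
    dims = len(front[0])
    return sum(max(abs(a[i] - b[i]) for a in front for b in front)
               for i in range(dims)) ** 0.5

front = [(0.0, 1.0), (0.5, 0.5), (1.0, 0.0)]
print(m2_distribution(front, 0.4, math.dist), m3_extent(front))   # -> 3.0 1.414...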
6 Comparison of Different Evolutionary Approaches

6.1 Methodology

We compare eight algorithms on the six proposed test functions:

1. A random search algorithm (RAND).
2. Fonseca and Fleming's multiobjective EA (FFGA).
3. The Niched Pareto Genetic Algorithm (NPGA).
4. Hajela and Lin's weighted-sum based approach (HLGA).
5. The Vector Evaluated Genetic Algorithm (VEGA).
6. The Nondominated Sorting Genetic Algorithm (NSGA).
7. A single-objective evolutionary algorithm using weighted-sum aggregation (SOEA).
8. The Strength Pareto Evolutionary Algorithm (SPEA).

The multiobjective EAs, as well as RAND, were executed a fixed number of times on each test problem, where the population was monitored for nondominated solutions, and the resulting nondominated set was taken as the outcome of one optimization run. Here, RAND serves as an additional point of reference and randomly generates a certain number of individuals per generation according to the rate of crossover and mutation (but neither crossover and mutation nor selection are performed). Hence, the number of fitness evaluations was the same as for the EAs. In contrast, a larger number of simulation runs was considered in the case of SOEA, each run optimizing towards another randomly chosen linear combination of the objectives. The nondominated solutions among all solutions generated in those runs form the trade-off front achieved by SOEA on a particular test function.

Independent of the algorithm and the test function, each simulation run was carried out using the following parameters:

Number of generations: 250
Population size: 100
Crossover rate: 0.8
Mutation rate: 0.01
Niching parameter sigma_share: 0.48862
Domination pressure t_dom: 10

The niching parameter was calculated using the guidelines given in Deb and Goldberg (1989) assuming the formation of ten independent niches. Since NSGA uses genotypic fitness sharing on T5, a different value of sigma_share was chosen for this particular case. Concerning NPGA, the recommended value for t_dom of 10% of the population size was taken (Horn and Nafpliotis, 1993). Furthermore, for reasons of fairness, SPEA ran with a reduced population size, where the external nondominated set was restricted accordingly so that the total number of stored individuals was the same as for the other algorithms.

Regarding the implementations of the algorithms, one chromosome was used to encode the parameters of the corresponding test problem. Each parameter is represented by 30 bits; the parameters x2, ..., xm only comprise 5 bits each for the deceptive function T5. Moreover, all approaches except SOEA were realized using binary tournament selection with replacement in order to avoid effects caused by different selection schemes. Furthermore, since fitness sharing may produce chaotic behavior in combination with tournament selection, a slightly modified method is incorporated here, named continuously updated sharing (Oei et al., 1991). As SOEA requires a generational selection mechanism, stochastic universal sampling was used in the implementation.

6.2 Simulation Results

In Figures 1-6, the nondominated fronts achieved by the different algorithms are visualized. Per algorithm and test function, the outcomes of the first five runs were unified, and then the dominated solutions were removed from the union set; the remaining points are plotted in the figures. Also shown are the Pareto-optimal fronts (lower curves), as well as additional reference curves (upper curves). The latter curves allow a more precise evaluation of the obtained trade-off fronts and were calculated by adding an offset proportional to the range max{f2} - min{f2} to the f2 values of the Pareto-optimal points. However, the curve resulting from the deceptive function T5 is not appropriate for our purposes, since it lies above the fronts produced by the random search algorithm. Instead, we consider all solutions for which the parameters x2, ..., xm are set to the deceptive attractors (u(xi) = 0).

(Figure 1: Test function T1 (convex). Figure 2: Test function T2 (nonconvex). Figure 3: Test function T3 (discrete). Figure 4: Test function T4 (multimodal). Figure 5: Test function T5 (deceptive). Figure 6: Test function T6 (nonuniform). Each figure plots, in the f1-f2 objective space, the fronts obtained by RAND, FFGA, NPGA, HLGA, VEGA, NSGA, SOEA, and SPEA.)

In addition to the graphical presentation, the different algorithms were assessed in pairs using the C metric from Definition 5. For an ordered algorithm pair, there is a sample of C values according to the runs performed; each value is computed on the basis of the nondominated sets achieved by the two algorithms when started with the same initial population. Here, box plots are used to visualize the distribution of these samples (Figure 7). A box plot consists of a box summarizing 50% of the data. The upper and lower ends of the box are the upper and lower quartiles, while a thick line within the box encodes the median. Dashed appendages summarize the spread and shape of the distribution. Furthermore, the shortcut REFS in Figure 7 stands for "reference set" and represents, for each test function, a set of equidistant points that are uniformly distributed on the corresponding reference curve.

Generally, the simulation results prove that all multiobjective EAs do better than the random search algorithm. However, the box plots reveal that FFGA, NPGA, and HLGA do not always cover the randomly created trade-off front completely. Furthermore, it can be observed that NSGA clearly outperforms the other nonelitist multiobjective EAs regarding both distance to the Pareto-optimal front and distribution of the nondominated solutions. This confirms the results presented in Zitzler and Thiele (1998). Furthermore, it is remarkable that VEGA performs well compared to the other nonelitist approaches, although some serious drawbacks of this approach are known (Fonseca and Fleming, 1995). The reason for this might be that we consider the off-line performance here in contrast to other studies that examine the on-line performance (Horn and Nafpliotis, 1993; Srinivas and Deb, 1994). On-line performance means that only the nondominated solutions in the final population are considered as the outcome, while off-line performance takes the solutions nondominated among all solutions generated during the entire optimization run into account. Finally, the best performance is provided by SPEA, which makes explicit use of the concept of elitism.
Apart from the deceptive problem, SPEA even outperforms SOEA in spite of substantially lower computational effort and although SOEA uses an elitist strategy as well. This observation leads to the question of whether elitism would increase the performance of the other multiobjective EAs. We will investigate this matter in the next section.

Considering the different problem features separately, convexity seems to cause the least amount of difficulty for the multiobjective EAs. All algorithms evolved reasonably distributed fronts, although there was a difference in the distance to the Pareto-optimal set. On the nonconvex test function T2, however, the aggregation-based approaches HLGA, VEGA, and SOEA have difficulties finding intermediate solutions, as linear combinations of the objectives tend to prefer solutions strong in at least one objective (Fonseca and Fleming, 1995, 4). Pareto-based algorithms have advantages here, but only NSGA and SPEA evolved a sufficient number of nondominated solutions. In the case of T3 (discreteness), NSGA and SPEA are superior to HLGA and VEGA: while the fronts achieved by the former cover a substantially larger fraction of the reference set on average, the latter achieve only low coverage. Among the considered test functions, T4 (multimodality) and T5 (deception) seem to be the hardest problems, since none of the algorithms was able to evolve a global Pareto-optimal set. The results on the multimodal problem indicate

[Management by Objectives] Zhang Zhai: Applications of Genetic Algorithms in Multiobjective Optimization


(3) Development period (1990s onward). In the 1990s, genetic algorithms continued to develop in both breadth and depth.
• In 1991, Lawrence Davis published the Handbook of Genetic Algorithms, which describes the working details of genetic algorithms in depth.
• In 1996, Z. Michalewicz's monograph Genetic Algorithms + Data Structures = Evolution Programs discussed various specialized problems of genetic algorithms in detail. In the same year, T. Baeck's monograph Evolutionary Algorithms in Theory and Practice: Evolution Strategies, Evolutionary Programming, Genetic Algorithms clarified many theoretical issues of evolutionary algorithms.
• In 1992, Koza published the monograph Genetic Programming: On the Programming of Computers by Means of Natural Selection, which comprehensively introduced the principles of genetic programming along with application examples, showing that genetic programming had become an important branch of evolutionary algorithms.
• In 1994, Koza published a second monograph, Genetic Programming II: Automatic Discovery of Reusable Programs, proposing the new concept of automatically defined functions and introducing the technique of subroutines into genetic programming. In the same year, K. E. Kinnear edited Advances in Genetic Programming, which collects the experience and techniques of many researchers in applying genetic programming.
(1) All of an organism's genetic information is contained in its chromosomes, and the chromosomes determine the organism's traits;
(2) Chromosomes are composed of genes arranged in a regular pattern; heredity and evolution take place on the chromosomes;
(3) An organism's reproduction is accomplished through the replication of its genes; (4) Crossover between homologous chromosomes, or mutation of a chromosome, produces new species and gives organisms new traits; (5) Genes or chromosomes that are well adapted to the environment usually have more opportunities to be passed on to the next generation than poorly adapted ones.
1.1 The Biological Basis of Genetic Algorithms

A Fast and Elitist Multiobjective Genetic Algorithm: NSGA-II


Kalyanmoy Deb, Associate Member, IEEE, Amrit Pratap, Sameer Agarwal, and T. Meyarivan

Abstract—Multiobjective evolutionary algorithms (EAs) that use nondominated sorting and sharing have been criticized mainly for their: 1) O(MN^3) computational complexity (where M is the number of objectives and N is the population size); 2) nonelitism approach; and 3) the need for specifying a sharing parameter. In this paper, we suggest a nondominated sorting-based multiobjective EA (MOEA), called nondominated sorting genetic algorithm II (NSGA-II), which alleviates all the above three difficulties. Specifically, a fast nondominated sorting approach with O(MN^2) computational complexity is presented, together with a selection operator that combines the parent and offspring populations and selects the best N solutions with respect to fitness and spread. Simulation results on difficult test problems show that the proposed NSGA-II, in most problems, is able to find much better spread of solutions and better convergence near the true Pareto-optimal front compared to Pareto-archived evolution strategy and strength-Pareto EA, two other elitist MOEAs that pay special attention to creating a diverse Pareto-optimal front. Moreover, we modify the definition of dominance in order to solve constrained multiobjective problems efficiently. Simulation results of the constrained NSGA-II on a number of test problems, including a five-objective seven-constraint nonlinear problem, are compared with another constrained multiobjective optimizer, and much better performance of NSGA-II is observed.

Index Terms—Constraint handling, elitism, genetic algorithms, multicriterion decision making, multiobjective optimization, Pareto-optimal solutions.

I. INTRODUCTION

THE PRESENCE of multiple objectives in a problem, in principle, gives rise to a set of optimal solutions (largely known as Pareto-optimal solutions), instead of a single optimal solution. In the absence of any further information, one of these Pareto-optimal solutions cannot be said to be better than the others. This demands a user to find as many Pareto-optimal solutions as possible. Classical optimization methods (including the multicriterion decision-making methods) suggest converting the multiobjective optimization problem to a single-objective optimization problem by emphasizing one particular Pareto-optimal solution at a time. When such a method is to be used to find multiple solutions, it has to be applied many times, hopefully finding a different solution at each simulation run.

Over the past decade, a number of multiobjective evolutionary algorithms (MOEAs) have been suggested [1], [7], [13], [20], [26]. The primary reason for this is their ability to find multiple Pareto-optimal solutions in one single simulation run. Since evolutionary algorithms (EAs) work with a population of solutions, a simple EA can be extended to maintain a diverse set of solutions. With an emphasis for moving toward the true Pareto-optimal region, an EA can be used to find multiple Pareto-optimal solutions in one single simulation run.

The nondominated sorting genetic algorithm (NSGA) proposed in [20] was one of the first such EAs. Over the years, the main criticisms of the NSGA approach have been as follows.

1) High computational complexity of nondominated sorting: The currently-used nondominated sorting algorithm has a computational complexity of O(MN^3) (where M is the number of objectives and N is the population size). This makes NSGA computationally expensive for large population sizes. This large complexity arises because of the complexity involved in the nondominated sorting procedure in every generation.

2) Lack of elitism: Recent results [25], [18] show that elitism can speed up the performance of the GA significantly, which also can help preventing the loss of good solutions once they are found.

3) Need for specifying the sharing parameter: Traditional mechanisms for ensuring diversity in a population, so as to get a wide variety of equivalent solutions, rely mostly on the sharing concept, which requires a user-specified sharing parameter.

In this paper, we address all of these issues and propose an improved version of NSGA, which we call NSGA-II. In the remainder of the paper, we first review related elitist approaches; thereafter, we describe the proposed NSGA-II algorithm in
details. Section IV presents simulation results of NSGA-II and compares them with two other elitist MOEAs (PAES and SPEA). In Section V, we highlight the issue of parameter interactions, a matter that is important in evolutionary computation research. The next section extends NSGA-II for handling constraints and compares the results with another recently proposed constraint-handling method. Finally, we outline the conclusions of this paper.

II. ELITIST MULTIOBJECTIVE EVOLUTIONARY ALGORITHMS

During 1993-1995, a number of different EAs were suggested to solve multiobjective optimization problems. Of them, Fonseca and Fleming's MOGA [7], Srinivas and Deb's NSGA [20], and Horn et al.'s NPGA [13] enjoyed more attention. These algorithms demonstrated the necessary additional operators for converting a simple EA to a MOEA. Two common features of all three operators were the following: i) assigning fitness to population members based on nondominated sorting and ii) preserving diversity among solutions of the same nondominated front. Although they have been shown to find multiple nondominated solutions on many test problems and a number of engineering design problems, researchers realized the need of introducing more useful operators (which have been found useful in single-objective EAs) so as to solve multiobjective optimization problems better. Particularly, the interest has been to introduce elitism to enhance the convergence properties of a MOEA. Reference [25] showed that elitism helps in achieving better convergence in MOEAs. Among the existing elitist MOEAs, Zitzler and Thiele's SPEA [26], Knowles and Corne's Pareto-archived evolution strategy (PAES) [14], and Rudolph's elitist GA [18] are well studied. We describe these approaches in brief. For details, readers are encouraged to refer to the original studies.

Zitzler and Thiele [26] suggested an elitist multicriterion EA with the concept of nondomination in their SPEA. They suggested maintaining an external population at every generation storing all nondominated solutions discovered so far beginning from the initial population. This external population participates in all genetic operations. At each generation, a combined population with the external and the current population is first constructed. All nondominated solutions in the combined population are assigned a fitness based on the number of solutions they dominate, and dominated solutions are assigned fitness worse than the worst fitness of any nondominated solution. This assignment of fitness makes sure that the search is directed toward the nondominated solutions. A deterministic clustering technique is used to ensure diversity among nondominated solutions. Although the implementation suggested in [26] is O(MN^3), with proper bookkeeping the complexity of SPEA can be reduced to O(MN^2).

Knowles and Corne [14] suggested a simple MOEA using a single-parent single-offspring EA similar to a (1+1)-evolution strategy. Instead of using real parameters, binary strings were used and bitwise mutations were employed to create offsprings. In their PAES, with one parent and one offspring, the offspring is compared with respect to the parent. If the offspring dominates the parent, the offspring is accepted as the next parent and the iteration continues. On the other hand, if the parent dominates the offspring, the offspring is discarded and a new mutated solution (a new offspring) is found. However, if the offspring and the parent do not dominate each other, the choice between the offspring and the parent is made by comparing them with an archive of best solutions found so far. The offspring is compared with the archive to check if it dominates any member of the archive. If it does, the offspring is accepted as the new parent and
all the dominated solutions are eliminated from the archive. If the offspring does not dominate any member of the archive, both parent and offspring are checked for their nearness with the solutions of the archive. If the offspring resides in a least crowded region in the objective space among the members of the archive, it is accepted as a parent and a copy is added to the archive. Crowding is maintained by dividing the entire search space deterministically into subspaces, whose number depends on a depth parameter and the number of objectives, and by updating the subspaces dynamically. The worst case complexity of PAES for N evaluations is O(aMN), where a is the archive length; since the archive size is usually chosen proportional to the population size N, the overall complexity of the algorithm is O(MN^2).

III. ELITIST NONDOMINATED SORTING GENETIC ALGORITHM

A. Fast Nondominated Sorting Approach

In a naive approach, each solution can be compared with every other solution in the population to find if it is dominated. This requires O(MN) comparisons for each solution, where M is the number of objectives. When this process is continued to find all members of the first nondominated level in the population, the total complexity is O(MN^2). In order to find the individuals in the next nondominated front, the solutions of the first front are discounted temporarily and the above procedure is repeated. In the worst case, the task of finding the second front also requires O(MN^2) computations, and the worst case arises when there are N fronts and there exists only one solution in each front. This requires an overall O(MN^3) computations.

The fast nondominated sorting approach reduces this complexity. First, for each solution p we calculate two entities: 1) the domination count n_p, the number of solutions which dominate the solution p, and 2) S_p, the set of solutions that the solution p dominates. This requires O(MN^2) comparisons. All solutions in the first nondominated front have their domination count as zero. Now, for each solution p with n_p = 0, we visit each member q of its set S_p and reduce its domination count by one. In doing so, if for any member q the domination count becomes zero, we put it in a separate list: these members belong to the second nondominated front. The above procedure is continued with each member of that list and the third front is identified. This process continues until all fronts are identified. For each solution p in the second or higher level of nondomination, the domination count can be at most N - 1; thus, each such solution is visited at most N - 1 times before its domination count becomes zero. At this point, the solution is assigned a nondomination level and will never be visited again. Since there are at most N - 1 such solutions, the total complexity of this phase is O(N^2). Thus, the overall complexity of the procedure is O(MN^2), governed by the domination comparisons. (Pseudocode of the fast-non-dominated-sort procedure omitted.)
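The bookkeeping scheme just described can be written down compactly. The following runnable sketch (ours, with invented names) returns the fronts as lists of population indices, assuming minimization:

def dominates(a, b):
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def fast_nondominated_sort(objs):
    """Fronts via domination counts n_p and dominated sets S_p (O(MN^2))."""
    n = len(objs)
    S = [[] for _ in range(n)]          # S[p]: indices of solutions p dominates
    count = [0] * n                     # count[p]: how many solutions dominate p
    fronts = [[]]
    for p in range(n):
        for q in range(n):
            if p == q:
                continue
            if dominates(objs[p], objs[q]):
                S[p].append(q)
            elif dominates(objs[q], objs[p]):
                count[p] += 1
        if count[p] == 0:
            fronts[0].append(p)         # p belongs to the first front
    i = 0
    while fronts[i]:
        nxt = []
        for p in fronts[i]:
            for q in S[p]:
                count[q] -= 1
                if count[q] == 0:       # q belongs to the next front
                    nxt.append(q)
        i += 1
        fronts.append(nxt)
    return fronts[:-1]                  # drop the trailing empty front

print(fast_nondominated_sort([(1, 3), (2, 1), (2, 4), (3, 2)]))   # -> [[0, 1], [2, 3]]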
B. Diversity Preservation

We mentioned earlier that, along with convergence to the Pareto-optimal set, it is also desired that an EA maintains a good spread of solutions in the obtained set of solutions. The original NSGA used the well-known sharing function approach, which has been found to maintain sustainable diversity in a population with appropriate setting of its associated parameters. The sharing function method involves a sharing parameter sigma_share, which sets the extent of sharing desired in a problem. There are two difficulties with this approach: 1) the performance of the sharing function method depends largely on the chosen sigma_share value, and 2) since each solution must be compared with all other solutions in the population, the overall complexity of the sharing function approach is O(N^2).

In the proposed NSGA-II, we replace the sharing function approach with a crowded-comparison approach that eliminates both the above difficulties to some extent. The new approach does not require any user-defined parameter for maintaining diversity among population members. Also, the suggested approach has a better computational complexity. To describe this approach, we first define a density-estimation metric and then present the crowded-comparison operator.

1) Density Estimation: To get an estimate of the density of solutions surrounding a particular solution in the population, we calculate the average distance of two points on either side of this point along each of the objectives. This quantity serves as an estimate of the perimeter of the cuboid formed by using the nearest neighbors as the vertices; we call it the crowding distance. (Fig. 1. Crowding-distance calculation. Points marked in filled circles are solutions of the same nondominated front; the crowding distance of the i-th solution in its front is the average side length of the cuboid, shown with a dashed box.)

The crowding-distance computation requires sorting the population according to each objective function value in ascending order of magnitude. Thereafter, for each objective function, the boundary solutions (solutions with smallest and largest function values) are assigned an infinite distance value. All other intermediate solutions are assigned a distance value equal to the absolute normalized difference in the function values of two adjacent solutions. This calculation is continued with other objective functions. The overall crowding-distance value is calculated as the sum of individual distance values corresponding to each objective. Each objective function is normalized before calculating the crowding distance. Since M independent sortings of at most N solutions are involved, this computation has O(MN log N) complexity.

After all population members are assigned a distance metric, we can compare two solutions for their extent of proximity with other solutions. A solution with a smaller value of this distance measure is, in some sense, more crowded by other solutions. This is exactly what we compare in the proposed crowded-comparison operator, described below. Although Fig. 1 illustrates the crowding-distance computation for two objectives, the procedure is applicable to more than two objectives as well.

2) Crowded-Comparison Operator: The crowded-comparison operator guides the selection process toward a uniformly spread-out Pareto-optimal front. Every individual i in the population has two attributes: 1) nondomination rank (i_rank); 2) crowding distance (i_distance). We define a partial order: i is preferred to j if (i_rank < j_rank), or if (i_rank = j_rank) and (i_distance > j_distance). That is, between two solutions with differing nondomination ranks, we prefer the solution with the lower (better) rank. Otherwise, if both solutions belong to the same front, then we prefer the solution that is located in a lesser crowded region.
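A sketch of the crowding-distance assignment and the crowded-comparison order for one nondominated front (ours; objective vectors are tuples, and the guard against a zero objective range is our own addition):

def crowding_distance(front):
    """Per-solution crowding distance: for each objective, sort by that
    objective, give boundary solutions infinity, and add the normalized gap
    between each solution's two neighbors."""
    n, m = len(front), len(front[0])
    d = [0.0] * n
    for k in range(m):
        order = sorted(range(n), key=lambda i: front[i][k])
        d[order[0]] = d[order[-1]] = float("inf")
        span = front[order[-1]][k] - front[order[0]][k] or 1.0   # avoid /0
        for j in range(1, n - 1):
            d[order[j]] += (front[order[j + 1]][k] - front[order[j - 1]][k]) / span
    return d

def crowded_less(rank_i, dist_i, rank_j, dist_j):
    """Crowded-comparison: lower rank wins; within a rank, larger distance wins."""
    return rank_i < rank_j or (rank_i == rank_j and dist_i > dist_j)

front = [(0.0, 1.0), (0.4, 0.6), (0.5, 0.5), (1.0, 0.0)]
print(crowding_distance(front))   # -> [inf, 1.0, 1.2, inf]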
With these three new innovations (a fast nondominated sorting procedure, a fast crowded distance estimation procedure, and a simple crowded comparison operator), we are now ready to describe the NSGA-II algorithm.

C. Main Loop

Initially, a random parent population P_0 is created. The population is sorted based on nondomination, and binary tournament selection, recombination, and mutation are used to create an offspring population Q_0 of size N. Since elitism is introduced by comparing the current population with the previously found best nondominated solutions, the procedure is different after the initial generation. First, a combined population R_t = P_t + Q_t of size 2N is formed and sorted according to nondomination. If the size of the first front F_1 is smaller than N, we definitely choose all members of the set F_1 for the new population P_{t+1}; the remaining population members are chosen from subsequent nondominated fronts in the order of their ranking. To choose exactly N population members, we sort the solutions of the last accepted front using the crowded-comparison operator in descending order and choose the best solutions needed to fill all population slots. The new population P_{t+1} of size N is now used for selection, crossover, and mutation to create a new population Q_{t+1}. The per-generation complexity is governed by: 1) nondominated sorting, which is O(M(2N)^2); 2) crowding-distance assignment, which is O(M(2N)log(2N)); and 3) sorting on the crowded-comparison operator, which is O(2N log(2N)). The overall complexity of the algorithm is therefore O(MN^2).

IV. SIMULATION RESULTS

(Table I. Test problems used in this study. All objective functions are to be minimized.)

A. Test Problems

We first describe the test problems used to compare different MOEAs. Test problems are chosen from a number of significant past studies in this area. Veldhuizen [22] cited a number of test problems that have been used in the past. Of them, we choose four problems: Schaffer's study (SCH) [19], Fonseca and Fleming's study (FON) [10], Poloni's study (POL) [16], and Kursawe's study (KUR) [15]. In 1999, the first author suggested a systematic way of developing test problems for multiobjective optimization [3]. Zitzler et al. [25] followed those guidelines and suggested six test problems. We choose five of those six problems here and call them ZDT1, ZDT2, ZDT3, ZDT4, and ZDT6. All problems have two objective functions. None of these problems have any constraint. We describe these problems in Table I. The table also shows the number of variables, their bounds, the Pareto-optimal solutions, and the nature of the Pareto-optimal front for each problem.

All approaches are run for a maximum of 25 000 function evaluations. We use the single-point crossover and bitwise mutation for binary-coded GAs and the simulated binary crossover (SBX) operator and polynomial mutation [6] for real-coded GAs. A crossover probability of 0.9 is used, together with a mutation probability of 1/l (where l is the string length) for binary-coded GAs and 1/n (where n is the number of decision variables) for real-coded GAs. For PAES, a depth value equal to four and a fixed archive size are used; for SPEA, we use the nondominated solutions of the combined GA and external populations at the final generation to calculate the performance metrics used in this study. For PAES, SPEA, and binary-coded NSGA-II, we have used 30 bits to code each decision variable.

B. Performance Measures

Unlike in single-objective optimization, there are two goals in a multiobjective optimization: 1) convergence to the Pareto-optimal set and 2) maintenance of diversity in solutions of the Pareto-optimal set. These two tasks cannot be measured adequately with one performance metric. Many performance metrics have been suggested [1], [8], [24]. Here, we define two performance metrics that are more direct in evaluating each of the above two goals in a solution set obtained by a multiobjective optimization algorithm.

The first metric Y (the convergence metric) measures the extent of convergence to a known set of Pareto-optimal solutions. We first find a set of uniformly spaced solutions on the true Pareto-optimal front in the objective space. For each solution obtained with an algorithm, we compute the minimum Euclidean distance of it from the chosen solutions on the Pareto-optimal front. The average of these distances is used as the first metric. (Fig. 3. The convergence metric Y: chosen solutions on the Pareto-optimal front are used for the calculation, and solutions marked with dark circles are solutions obtained by an algorithm.) The smaller the value of this metric, the better the convergence toward the Pareto-optimal front. In all simulations, we report the mean and variance of this metric calculated for solution sets obtained in multiple runs. Even when all solutions converge to the Pareto-optimal front, the above convergence metric does not have a value of zero; the metric will yield zero only when each obtained solution lies exactly on each of the chosen solutions.

Although this metric alone can provide some information about the spread in obtained solutions, we define a different metric to measure the spread in solutions obtained by an algorithm directly. (Fig. 4. The diversity metric D.) The second metric D measures the extent of spread achieved among the obtained solutions. We compute the Euclidean distances d_i between consecutive solutions in the obtained nondominated set and the average d_bar of these distances. Thereafter, from the obtained set of nondominated solutions, we first calculate the extreme solutions (in the objective space) by fitting a curve parallel to that of the true Pareto-optimal front. Then, we use the following metric to calculate the nonuniformity in the distribution:

$\Delta = \dfrac{d_f + d_l + \sum_{i=1}^{N-1} \lvert d_i - \bar{d} \rvert}{d_f + d_l + (N-1)\,\bar{d}}$

where d_f and d_l are the Euclidean distances between the extreme solutions and the boundary solutions of the obtained nondominated set, and d_bar is the average of all d_i, assuming that there are N solutions (and hence N - 1 consecutive distances) on the best nondominated front. The most widely and uniformly spread-out set of nondominated solutions would make all d_i equal to d_bar and d_f = d_l = 0, making the metric take a value of zero; for any other distribution, the value of the metric would be greater than zero. It is interesting to note that this is not the worst case spread of solutions possible: we can have a scenario in which there is a large variance in the d_i even within the extreme solutions. For two distributions having identical values of d_f and d_l, the metric takes a higher value with worse distributions of solutions within the extreme solutions. Note that the above diversity metric can be used on any nondominated set of solutions, including one that is not the Pareto-optimal set. Using a triangularization technique or a Voronoi diagram approach [1] to calculate the distances, the metric can also be applied to problems with more than two objectives.
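The diversity metric can be sketched for the bi-objective case as follows (ours; we assume the front can be ordered by the first objective and that the true extreme solutions are supplied by the caller):

import math

def delta_metric(front, true_extremes):
    """Diversity metric: (d_f + d_l + sum |d_i - d_mean|) /
    (d_f + d_l + (N-1) * d_mean), with d_i the consecutive distances."""
    pts = sorted(front)                          # order along the first objective
    d = [math.dist(pts[i], pts[i + 1]) for i in range(len(pts) - 1)]
    d_mean = sum(d) / len(d)
    d_f = math.dist(true_extremes[0], pts[0])    # distance to first true extreme
    d_l = math.dist(true_extremes[1], pts[-1])   # distance to last true extreme
    num = d_f + d_l + sum(abs(di - d_mean) for di in d)
    den = d_f + d_l + (len(pts) - 1) * d_mean
    return num / den

front = [(0.0, 1.0), (0.5, 0.5), (1.0, 0.0)]
print(delta_metric(front, [(0.0, 1.0), (1.0, 0.0)]))   # perfectly uniform -> 0.0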
(Table II. Mean (first rows) and variance (second rows) of the convergence metric Y. Table III. Mean (first rows) and variance (second rows) of the diversity metric D.)

Tables II and III show the metrics obtained using four algorithms: NSGA-II (real-coded), NSGA-II (binary-coded), SPEA, and PAES. NSGA-II (real coded or binary coded) is able to converge better in all problems except in ZDT3 and ZDT6, where PAES found better convergence. In all cases with NSGA-II, the variance in ten runs is also small, except in ZDT4 with NSGA-II (binary coded). The fixed archive strategy of PAES allows better convergence to be achieved in two out of nine problems. Table III shows the mean and variance of the diversity metric obtained with all algorithms.

(Fig. 7. Nondominated solutions with SPEA on KUR. Fig. 8. Nondominated solutions with NSGA-II (binary-coded) on ZDT2.)

In both aspects of convergence and distribution of solutions, NSGA-II performed better than SPEA in this problem. Since SPEA could not maintain enough nondominated solutions in the final GA population, the overall number of nondominated solutions is much less compared to that obtained in the final population of NSGA-II. Next, we show the nondominated solutions on the problem ZDT2 in Figs. 8 and 9. This problem has a nonconvex Pareto-optimal front, and we show the performance of binary-coded NSGA-II and SPEA on it. Although the convergence is not a difficulty here with both of these algorithms, both real- and binary-coded NSGA-II have found a better spread and more solutions in the entire Pareto-optimal region than SPEA (the next-best algorithm observed for this problem).

The problem ZDT4 has a very large number of different local Pareto-optimal fronts. An earlier study clearly showed that a population of size of about at least 500 is needed for single-objective binary-coded GAs (with tournament selection, single-point crossover, and bitwise mutation) to find the global optimum solution in more than 50% of the simulation runs. Since we have used a population of size 100, it is not expected that a multiobjective GA would find the global Pareto-optimal solution, but NSGA-II is able to find a good spread of solutions even at a local Pareto-optimal front. Since SPEA converges poorly on this problem (see Table II), we do not show SPEA results on this figure. Finally, Fig. 11 shows that SPEA finds a better converged set of nondominated solutions in ZDT6 compared to any other algorithm. However, the distribution in solutions is better with real-coded NSGA-II.

(Fig. 11. Real-coded NSGA-II finds better spread of solutions than SPEA on ZDT6, but SPEA has a better convergence. Table IV. Mean and variance of the convergence and diversity metrics up to 500 generations.)

D. Different Parameter Settings

In this study, we do not make any serious attempt to find the best parameter setting for NSGA-II, but in this section, we perform additional experiments to show the effect of a couple of different parameter settings on the performance of NSGA-II. First, we keep all other parameters as before, but increase the number of maximum generations to 500 (instead of 250 used before). Table IV shows the convergence and diversity metrics for problems POL, KUR, ZDT3, ZDT4, and ZDT6. Now, we achieve a convergence very close to the true Pareto-optimal front and with a much better distribution. The table shows that in all these difficult problems, the real-coded NSGA-II has converged very close to the true optimal front, except in ZDT6, which probably requires a different parameter setting with NSGA-II. Particularly, the results on ZDT3 and ZDT4 improve with generation number.

The problem ZDT4 has a number of local Pareto-optimal fronts, each corresponding to a particular value of g(x). A large change in the decision vector is needed to get out of a local optimum. Unless mutation or crossover operators are capable of creating solutions in the basin of another better attractor, the improvement in the convergence toward the true Pareto-optimal front is not possible. We use NSGA-II (real-coded) with a smaller distribution index for mutation, thereby allowing larger changes in offspring solutions, and observe improved convergence and diversity measures. (Fig. 12. Obtained nondominated solutions with NSGA-II on problem
ZDT4.) These results are much better than PAES and SPEA, as shown in Table II. To demonstrate the convergence and spread of solutions, we plot the nondominated solutions of one of the runs after 250 generations in Fig. 12. The figure shows that NSGA-II is able to find solutions on the true Pareto-optimal front, with g(x) = 1.

V. ROTATED PROBLEMS

It has been discussed in an earlier study [3] that interactions among decision variables can introduce another level of difficulty to any multiobjective optimization algorithm, including EAs. In this section, we create one such problem and investigate the working of the previously described MOEAs on an epistatic problem in which the two objectives to be minimized are defined in terms of a variable vector obtained from the decision vector by a fixed rotation matrix. By restricting the feasible region within the prescribed variable bounds, we discourage solutions with large values of the resulting rotated variables. (Fig. 13. Obtained nondominated solutions with NSGA-II, PAES, and SPEA on the rotated problem.) This example problem demonstrates that one of the known difficulties (the linkage problem [11], [12]) of single-objective optimization algorithms can also cause difficulties in a multiobjective problem. However, more systematic studies are needed to amply address the linkage issue in multiobjective optimization.

VI. CONSTRAINT HANDLING

In the past, the first author and his students implemented a penalty-parameterless constraint-handling approach for single-objective optimization. Those studies [2], [6] have shown how a tournament selection based algorithm can be used to handle constraints in a population approach much better than a number of other existing constraint-handling approaches. A similar approach can be introduced with the above NSGA-II for solving constrained multiobjective optimization problems.

A. Proposed Constraint-Handling Approach (Constrained NSGA-II)

This constraint-handling method uses the binary tournament selection, where two solutions are picked from the population and the better solution is chosen. In the presence of constraints, each solution can be either feasible or infeasible. Thus, there may be at most three situations: 1) both solutions are feasible; 2) one is feasible and other is not; and 3) both are infeasible. For single objective optimization, we used a simple rule for each case.

Case 1) Choose the solution with better objective function value.
Case 2) Choose the feasible solution.
Case 3) Choose the solution with smaller overall constraint violation.

Since in no case constraints and objective function values are compared with each other, there is no need of having any penalty parameter, a matter that makes the proposed constraint-handling approach useful and attractive. In the context of multiobjective optimization, the latter two cases can be used as they are, and the first case can be resolved by using the crowded-comparison operator as before. To maintain the modularity in the procedures of NSGA-II, we simply modify the definition of domination between two solutions.

Definition 1: A solution i is said to constrained-dominate a solution j if any of the following conditions is true:

1) Solution i is feasible and solution j is not.
2) Solutions i and j are both infeasible, but solution i has a smaller overall constraint violation.
3) Solutions i and j are feasible and solution i dominates solution j.
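The constrained-domination rule of Definition 1 maps onto a small predicate. Below is a sketch with assumed conventions (a solution is feasible when its overall constraint violation is zero; names are ours):

def dominates(a, b):
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def constrained_dominates(obj_i, viol_i, obj_j, viol_j):
    """Feasible beats infeasible; among infeasible, smaller violation wins;
    among feasible, ordinary dominance decides."""
    feas_i, feas_j = viol_i == 0, viol_j == 0
    if feas_i and not feas_j:
        return True
    if not feas_i and not feas_j:
        return viol_i < viol_j
    if feas_i and feas_j:
        return dominates(obj_i, obj_j)
    return False

print(constrained_dominates((1, 2), 0.0, (0, 0), 0.3))   # True: feasibility first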
(Table V. Constrained test problems used in this study. All objective functions are to be minimized.)

We compare the constrained NSGA-II with Ray-Tai-Seow's constraint-handling algorithm. In that approach, three different nondominated rankings of the population are first performed: the first ranking is performed using the objective vector alone, the second using only the constraint violation values of all constraints, and the third using a combination of objectives and constraint violations.

(Fig. 14. Obtained nondominated solutions with NSGA-II on the constrained problem CONSTR. Fig. 15. Obtained nondominated solutions with Ray-Tai-Seow's algorithm on the constrained problem CONSTR.)

A good spread of solutions can be maintained for a large number of generations; in each case, we obtain a reasonably good spread of solutions as early as 200 generations. Crossover and mutation probabilities are the same as before. Fig. 14 shows the obtained set of 100 nondominated solutions after 500 generations using NSGA-II. The figure shows that NSGA-II is able to uniformly maintain solutions in both parts of the Pareto-optimal region. It is important to note that in order to maintain a spread of solutions on the constraint boundary, the solutions must be modified in a particular manner dictated by the constraint function. This becomes a difficult task for any search operator. Fig. 15 shows the obtained solutions using Ray-Tai-Seow's algorithm after 500 generations. It is clear that NSGA-II performs better than Ray-Tai-Seow's algorithm in terms of converging to the true Pareto-optimal front and also in terms of maintaining a diverse population of nondominated solutions.

Next, we consider the test problem SRN. Fig. 16 shows the nondominated solutions after 500 generations using NSGA-II. (Fig. 16. Obtained nondominated solutions with NSGA-II on the constrained problem SRN. Fig. 17. Obtained nondominated solutions with Ray-Tai-Seow's algorithm on the constrained problem SRN.) The figure shows how NSGA-II can bring a random population onto the Pareto-optimal front. Ray-Tai-Seow's algorithm is also able to come close to the front on this test problem (Fig. 17).

Figs. 18 and 19 show the feasible objective space and the obtained nondominated solutions with NSGA-II and Ray-Tai-Seow's algorithm on the problem TNK. Here, the Pareto-optimal region is discontinuous, and NSGA-II does not have any difficulty in finding a wide spread of solutions over the true Pareto-optimal region. Although Ray-Tai-Seow's algorithm found a number of solutions on the Pareto-optimal front, there exist many infeasible solutions even after 500 generations. In order to demonstrate the working of Fonseca-Fleming's constraint-handling strategy, we implement it with NSGA-II and apply it on TNK. Fig. 20 shows 100 population members at the end of 500 generations and with identical parameter setting as used in Fig. 18. Both these figures demonstrate that the proposed and Fonseca-Fleming's constraint-handling strategies work well with NSGA-II.

A Decomposition-Based Multiobjective Evolutionary Algorithm Based on Estimation of Distribution


Abstract: Decomposition-based multiobjective evolutionary algorithms achieve a good distribution of solutions, but the population size grows sharply as the number of objectives increases, which severely affects the algorithm's efficiency.

This paper proposes a decomposition-based multiobjective evolutionary algorithm based on estimation of distribution. The basic idea is as follows: first, decompose the multiobjective problem into a number of single-objective subproblems; then, following the idea of estimation of distribution, build a probability model for each subproblem and generate solutions by sampling from the models.

Numerical analysis and experiments show that the solutions obtained by the new algorithm not only have better diversity and uniformity, but that its computational complexity is also clearly lower than that of the decomposition-based multiobjective evolutionary algorithm, especially for three-objective optimization problems.

Keywords: multiobjective optimization; evolutionary algorithm; decomposition strategy; estimation of distribution

0 Introduction

At present, new dominance mechanisms, new evolutionary mechanisms, many-objective optimization problems, and multiobjective test problems are the main research topics in evolutionary multiobjective optimization.

Based on the principle of estimation of distribution and the characteristics of decomposition-based multiobjective evolutionary algorithms, this paper improves the decomposition-based multiobjective evolutionary algorithm, proposes a decomposition-based multiobjective evolutionary algorithm based on estimation of distribution, and analyzes the performance of the improved algorithm through numerical simulations.

1 Decomposition-Based Multiobjective Evolutionary Algorithms and Estimation of Distribution Algorithms

The basic approach of traditional optimization algorithms to multiobjective problems is to convert the subobjectives into a single-objective optimization problem through a weighted combination.

Multiobjective evolutionary algorithms treat all objectives as a whole and use suitable evolutionary methods to search for as many representative, evenly distributed Pareto-optimal solutions as possible.

Zhang and Li introduced the ideas of traditional multiobjective optimization into multiobjective evolutionary algorithms and proposed the decomposition-based multiobjective evolutionary algorithm MOEA/D.

The decomposition-based multiobjective evolutionary algorithm decomposes a multiobjective optimization problem into a number of single-objective optimization subproblems and evolves them simultaneously as a single population; each generation of the population consists of the current best solutions of the individual subproblems.

In MOEA/D, the optimization of each subproblem uses only information from its neighboring individuals; the neighborhood relations between subproblems are determined by the distances between the weight vectors of the subproblems' objective functions.

Two subproblems whose weight vectors are close to each other necessarily have similar solutions.

It follows that whether the weight vectors of the objective functions can fill the whole space, and whether they are evenly distributed, are key issues in MOEA/D.

Estimation of distribution algorithms are an emerging branch of evolutionary computation; they are an organic combination of evolutionary algorithms and statistical learning.

These algorithms use statistical learning to build a probability model of the distribution of individuals in the solution space, and then evolve that model using evolutionary ideas.

Estimation of distribution algorithms have no crossover or mutation operations; instead, they estimate a probability model of the solution space and sample from that model to generate a new population.
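To make the two ingredients concrete, here is a minimal sketch (ours, not from the paper) of a Tchebycheff subproblem evaluation of the kind used in MOEA/D, together with a univariate-Gaussian estimation-of-distribution step of the model-and-sample kind described above; all names and the choice of a univariate Gaussian model are assumptions:

import random

def tchebycheff(obj, weight, z_star):
    """Tchebycheff decomposition: g(x | w, z*) = max_k w_k * |f_k(x) - z*_k|."""
    return max(w * abs(f - z) for w, f, z in zip(weight, obj, z_star))

def eda_sample(parents, n_children, sigma_floor=1e-6):
    """Fit an independent Gaussian per variable to the selected parents and
    sample new candidate solutions from it (one univariate EDA step)."""
    dims = len(parents[0])
    mu = [sum(p[d] for p in parents) / len(parents) for d in range(dims)]
    var = [sum((p[d] - mu[d]) ** 2 for p in parents) / len(parents)
           for d in range(dims)]
    sd = [max(v ** 0.5, sigma_floor) for v in var]   # keep sampling alive
    return [[random.gauss(mu[d], sd[d]) for d in range(dims)]
            for _ in range(n_children)]

# A subproblem with weight (0.5, 0.5) and ideal point z* = (0, 0):
print(tchebycheff((0.2, 0.6), (0.5, 0.5), (0.0, 0.0)))   # -> 0.3
print(eda_sample([[0.1, 0.9], [0.2, 0.8], [0.15, 0.85]], 2))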

Multi-objective optimization using genetic algorithms: A tutorial

Multi-Objective Optimization Using Genetic Algorithms: A Tutorial
Abdullah Konak (1), David W. Coit (2), Alice E. Smith (3)
(1) Information Sciences and Technology, Penn State Berks-Lehigh Valley; (2) Department of Industrial and Systems Engineering, Rutgers University; (3) Department of Industrial and Systems Engineering, Auburn University
objective is possible with methods such as utility theory, the weighted sum method, etc., but the problem lies in the correct selection of the weights or utility functions to characterize the decision-maker's preferences. In practice, it can be very difficult to precisely and accurately select these weights, even for someone very familiar with the problem domain. Unfortunately, small perturbations in the weights can lead to very different solutions. For this reason and others, decision-makers often prefer a set of promising solutions given the multiple objectives. The second general approach is to determine an entire Pareto optimal solution set or a representative subset. A Pareto optimal set is a set of solutions that are nondominated with respect to each other. While moving from one Pareto solution to another, there is always a certain amount of sacrifice in one objective to achieve a certain amount of gain in the other. Pareto optimal solution sets are often preferred to single solutions because they can be practical when considering real-life problems, since the final solution of the decision maker is always a trade-off between crucial parameters. Pareto optimal sets can be of varied sizes, but the size of the Pareto set increases with the increase in the number of objectives.

2. Multi-Objective Optimization Formulation

A multi-objective decision problem is defined as follows: Given an n-dimensional decision variable vector x = {x1, ..., xn} in the solution space X, find a vector x* that minimizes a given set of K objective functions z(x*) = {z1(x*), ..., zK(x*)}. The solution space X is generally restricted by a series of constraints, such as gj(x*) = bj for j = 1, ..., m, and bounds on the decision variables. In many real-life problems, objectives under consideration conflict with each other. Hence, optimizing x with respect to a single objective often results in unacceptable results with respect to the other objectives. Therefore, a perfect multi-objective solution that simultaneously optimizes each objective function is almost impossible. A reasonable solution to a multi-objective problem is to investigate a set of solutions, each of which satisfies the objectives at an acceptable level without being dominated by any other solution. If all objective functions are for minimization, a feasible solution x is said to dominate another feasible solution y (x > y), if and only if zi(x) <= zi(y) for i = 1, ..., K and zj(x) < zj(y) for at least one objective function j. A solution is said to be Pareto optimal if it is not dominated by any other solution in the solution space. A Pareto optimal solution cannot be improved with respect to any objective without worsening at least one other objective. The set of all feasible nondominated solutions in X is referred to as the Pareto optimal set, and the corresponding objective function values in the objective space are called the Pareto front.
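The weighted-sum approach discussed above is easy to sketch, and a tiny example shows how sensitive the selected solution is to the weights; the candidate set and weight values below are invented purely for illustration:

def weighted_best(solutions, weights):
    """Pick the solution minimizing the weighted sum of its objectives."""
    return min(solutions, key=lambda z: sum(w * zi for w, zi in zip(weights, z)))

# Three bi-objective candidates; note the balanced trade-off (5.0, 5.0)
# is never strictly preferred, and a small weight shift flips the choice
# between the two extremes.
candidates = [(1.0, 9.0), (5.0, 5.0), (9.0, 1.0)]
for w1 in (0.45, 0.50, 0.55):
    print(w1, weighted_best(candidates, (w1, 1.0 - w1)))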

A Brief Introduction to Several Multiobjective Evolutionary Algorithms


NPGA II-Ranking
NPGA II-Niche Count
NPGA II – Summary
• Selection: individuals are selected into the next generation by rank; ties are resolved using the sharing mechanism.
• Each individual's niche count is computed, and individuals with smaller NC values are selected into the next generation.
• Relatively efficient (compared with SGA and ERS), but not outstanding.
• Uses no external population; its elitism mechanism is similar to that of NSGA II.
NSGA II-Sorting
Crowded Comparison
NSGA II-Main Loop
NSGA II – Performance Evaluation
a. One of the best multiobjective evolutionary algorithms available.
The basic idea of the niching technique is to apply the biological concept of the niche to evolutionary computation: the individuals of each generation are divided into several classes; from each class, a number of individuals with higher fitness are selected as representatives of that class to form a group; and new individuals are then produced by crossover and mutation within and between groups.
Niche
Niche count: used to estimate how crowded the neighborhood of individual i (within its niche) is.
The dominance relation between individuals
Suppose x and y are two distinct individuals in the population P. We define:
x dominates y if the following conditions are satisfied:
(1) For every subobjective, x is no worse than y, i.e.,
$f_k(x) \le f_k(y), \quad k = 1, 2, \ldots, r$
(2) There is at least one subobjective on which x is strictly better than y, i.e., $\exists\, l \in \{1, 2, \ldots, r\}$ such that $f_l(x) < f_l(y)$.
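A sketch of the niche count based on the common triangular sharing function (parameter names are ours; the text above does not fix a particular sharing function, so this is one standard choice):

import math

def niche_count(i, population, sigma_share):
    """NC(i) = sum over j of sh(d_ij), with the triangular sharing function
    sh(d) = 1 - d / sigma_share for d < sigma_share, and 0 otherwise."""
    total = 0.0
    for j in population:
        d = math.dist(i, j)
        if d < sigma_share:
            total += 1.0 - d / sigma_share
    return total

pop = [(0.0, 0.0), (0.1, 0.0), (2.0, 2.0)]
print(niche_count(pop[0], pop, sigma_share=0.5))   # crowded neighborhood -> 1.8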
Multiobjective Evolutionary Algorithms
1. Introduction
2. The main multiobjective evolutionary algorithms
3. Performance evaluation of multiobjective evolutionary algorithms and test problem suites

Genetic Algorithms for multiple objective vehicle routing


arXiv:0809.0416v1 [cs.AI] 2 Sep 2008

M. J. Geiger
Production and Operations Management, Institute 510 - Business Administration, University of Hohenheim
Email: mail@martingeiger.de

Abstract: The talk describes a general approach of a genetic algorithm for multiple objective optimization problems. A particular dominance relation between the individuals of the population is used to define a fitness operator, enabling the genetic algorithm to address even problems with efficient, but convex-dominated alternatives. The algorithm is implemented in a multilingual computer program, solving vehicle routing problems with time windows under multiple objectives. The graphical user interface of the program shows the progress of the genetic algorithm, and the main parameters of the approach can be easily modified. In addition to that, the program provides powerful decision support to the decision maker. The software has proved its excellence at the finals of the European Academic Software Award EASA, held at Keble College, University of Oxford, Great Britain.

1 The Genetic Algorithm for multiple objective optimization problems

Based on a single objective genetic algorithm, different extensions for multiple objective optimization problems have been proposed in the literature [1, 4, 8, 10]. All of them tackle the multiple objective elements by modifying the evaluation and selection operators of the genetic algorithm. Compared to a single objective problem, more than one evaluation function is considered, and the fitness of the individuals cannot be directly calculated from the (one) objective value.

Efficient but convex-dominated alternatives are difficult to obtain by integrating the considered objectives into a weighted sum (Figure 1). To overcome this problem, an approach of a selection operator is presented, using only little information and providing an underlying self-adaptation technique. In this approach, we use dominance information of the individuals of the population by calculating, for each individual i, the number of alternatives $\xi_i$ by which this individual is dominated. For a population consisting of $n_{pop}$ alternatives we get values of:

$0 \le \xi_i \le n_{pop} - 1$  (1)

Individuals that are not being dominated by others should receive a higher fitness value than individuals that are being dominated, i.e.:

$\xi_i < \xi_j \Rightarrow f(i) > f(j) \quad \forall\, i, j = 1, \ldots, n_{pop}$  (2)

$\xi_i = \xi_j \Rightarrow f(i) = f(j) \quad \forall\, i, j = 1, \ldots, n_{pop}$  (3)

The fitness values $f(i)$ are then derived from the domination counts by a normalization with the maximum value $\xi_{max}$ in the population.

2 The implementation [7]

The approach of the genetic algorithm is implemented in a computer program which solves vehicle routing problems with time windows under multiple objectives [6]. The examined objectives are:

• Minimizing the total distances traveled by the vehicles.
• Minimizing the number of vehicles used.
• Minimizing the time window violation.
• Minimizing the number of violated time windows.

The program illustrates the progress of the genetic algorithm, and the parameters of the approach can simply be controlled by the graphical user interface (Figure 2). In addition to the necessary calculations, the obtained alternatives of the vehicle routing problem can easily be compared, as shown in Figure 3. For example, the alternative with the shortest routes is compared to the alternative having the lowest time window violations. The windows show the routes travelled by the vehicles from the depot to the customers. The time window violations are visualized with vertical bars at each customer: red, the vehicle is too late; green, the truck arrives too early. For a more detailed comparison, inverse radar charts and 3D views are
2 The implementation [7]

The approach of the genetic algorithm is implemented in a computer program which solves vehicle routing problems with time windows under multiple objectives [6]. The examined objectives are:

• Minimizing the total distance traveled by the vehicles.
• Minimizing the number of vehicles used.
• Minimizing the time window violation.
• Minimizing the number of violated time windows.

The program illustrates the progress of the genetic algorithm, and the parameters of the approach can simply be controlled by the graphical user interface (Figure 2). In addition to the necessary calculations, the obtained alternatives of the vehicle routing problem can easily be compared, as shown in Figure 3. For example, the alternative with the shortest routes is compared to the alternative having the lowest time window violations. The windows show the routes traveled by the vehicles from the depot to the customers. The time window violations are visualized with vertical bars at each customer: red means the vehicle is too late, green means the truck arrives too early. For a more detailed comparison, inverse radar charts and 3D views are available, showing the trade-off between the objective values of the selected alternatives (Figure 4).

Porto, Portugal, July 16-20, 2001.

Multiobjective optimization using non-dominated sorting in genetic algorithms

One way to solve multiobjective problems is to scalarize the vector of objectives into one objective by averaging the objectives with a weight vector. This process allows a simpler optimization algorithm to be used, but the obtained solution largely depends on the weight vector used in the scalarization process. Moreover, if available, a decision maker may be interested in knowing alternate solutions. Since genetic algorithms (GAs) work with a population of points, a number of Pareto-optimal solutions may be captured using GAs. An early GA application to multiobjective optimization by Schaffer (1984) opened a new avenue of research in this field. Though his algorithm, VEGA, gave encouraging results, it suffered from bias towards some Pareto-optimal solutions. A new algorithm, the Nondominated Sorting Genetic Algorithm (NSGA), is presented in this paper based on Goldberg's suggestion (Goldberg 1989). This algorithm eliminates the bias in VEGA and thereby distributes the population over the entire Pareto-optimal region. Although there exist two other implementations (Fonseca and Fleming 1993; Horn, Nafpliotis, and Goldberg 1994) based on this idea, NSGA differs from them in its working principles, as explained below.
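As a rough sketch of the non-dominated sorting idea NSGA builds on (not the authors' exact procedure, which additionally assigns dummy fitness values and applies sharing within each front), the population can be peeled into successive non-dominated fronts:

def non_dominated_sort(objs):
    # Partition indices of objective vectors (minimization) into fronts:
    # front 0 = non-dominated, front 1 = non-dominated once front 0 is removed, ...
    def dominates(a, b):
        return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

    remaining = set(range(len(objs)))
    fronts = []
    while remaining:
        front = {i for i in remaining
                 if not any(dominates(objs[j], objs[i]) for j in remaining if j != i)}
        fronts.append(sorted(front))
        remaining -= front
    return fronts

# Example: three points of a two-objective minimization problem.
print(non_dominated_sort([(1, 5), (2, 2), (3, 3)]))  # [[0, 1], [2]]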

Carlos M. Fonseca† and Peter J. Fleming‡
Dept. Automatic Control and Systems Eng., University of Sheffield, Sheffield S1 4DU, U.K.

Abstract

The paper describes a rank-based fitness assignment method for Multiple Objective Genetic Algorithms (MOGAs). Conventional niche formation methods are extended to this class of multimodal problems and theory for setting the niche size is presented. The fitness assignment method is then modified to allow direct intervention of an external decision maker (DM). Finally, the MOGA is generalised further: the genetic algorithm is seen as the optimizing element of a multiobjective optimization loop, which also comprises the DM. It is the interaction between the two that leads to the determination of a satisfactory solution to the problem. Illustrative results of how the DM can interact with the genetic algorithm are presented. They also show the ability of the MOGA to uniformly sample regions of the trade-off surface.

1 INTRODUCTION

Whilst most real world problems require the simultaneous optimization of multiple, often competing, criteria (or objectives), the solution to such problems is usually computed by combining them into a single criterion to be optimized, according to some utility function. In many cases, however, the utility function is not well known prior to the optimization process. The whole problem should then be treated as a multiobjective problem with non-commensurable objectives. In this way, a number of solutions can be found which provide the decision maker (DM) with insight into the characteristics of the problem before a final solution is chosen.

2 VECTOR EVALUATED GENETIC ALGORITHMS

Being aware of the potential GAs have in multiobjective optimization, Schaffer (1985) proposed an extension of the simple GA (SGA) to accommodate vector-valued fitness measures, which he called the Vector Evaluated Genetic Algorithm (VEGA). The selection step was modified so that, at each generation, a number of sub-populations was generated by performing proportional selection according to each objective function in turn. Thus, for a problem with q objectives, q sub-populations of size N/q each would be generated, assuming a population size of N. These would then be shuffled together to obtain a new population of size N, in order for the algorithm to proceed with the application of crossover and mutation in the usual way.

However, as noted by Richardson et al. (1989), shuffling all the individuals in the sub-populations together to obtain the new population is equivalent to linearly combining the fitness vector components to obtain a single-valued fitness function. The weighting coefficients, however, depend on the current population.
This means that, in the general case, not only will two non-dominated individuals be sampled at different rates, but also, in the case of a concave trade-off surface, the population will tend to split into different species, each of them particularly strong in one of the objectives. Schaffer anticipated this property of VEGA and called it speciation. Speciation is undesirable in that it is opposed to the aim of finding a compromise solution.

To avoid combining objectives in any way requires a different approach to selection. The next section describes how the concept of inferiority alone can be used to perform selection.

3 A RANK-BASED FITNESS ASSIGNMENT METHOD FOR MOGAs

Consider an individual x_i at generation t which is dominated by p_i(t) individuals in the current population. Its current position in the individuals' rank can be given by

rank(x_i, t) = 1 + p_i(t).

All non-dominated individuals are assigned rank 1, see Figure 1. This is not unlike a class of selection methods proposed by Fourman (1985) for constrained optimization, and correctly establishes that the individual labelled 3 in the figure is worse than the individual labelled 2, as the latter lies in a region of the trade-off which is less well described by the remaining individuals. The method proposed by Goldberg (1989, p. 201) would treat these two individuals indifferently.

Figure 1: Multiobjective Ranking

Concerning fitness assignment, one should note that not all ranks will necessarily be represented in the population at a particular generation. This is also shown in the example in Figure 1, where rank 4 is absent. The traditional assignment of fitness according to rank may be extended as follows:

1. Sort population according to rank.
2. Assign fitnesses to individuals by interpolating from the best (rank 1) to the worst (rank n* ≤ N) in the usual way, according to some function, usually linear but not necessarily.
3. Average the fitnesses of individuals with the same rank, so that all of them will be sampled at the same rate. Note that this procedure keeps the global population fitness constant while maintaining appropriate selective pressure, as defined by the function used.

The fitness assignment method just described appears as an extension of the standard assignment of fitness according to rank, to which it maps back in the case of a single objective, or that of non-competing objectives.

4 NICHE-FORMATION METHODS FOR MOGAs

Conventional fitness sharing techniques (Goldberg and Richardson, 1987; Deb and Goldberg, 1989) have been shown to be effective in preventing genetic drift in multimodal function optimization. However, they introduce another GA parameter, the niche size σshare, which needs to be set carefully. The existing theory for setting the value of σshare assumes that the solution set is composed of an a priori known finite number of peaks and uniform niche placement. Upon convergence, local optima are occupied by a number of individuals proportional to their fitness values.

On the contrary, the global solution of an MO problem is flat in terms of individual fitness, and there is no way of knowing the size of the solution set beforehand, in terms of a phenotypic metric. Also, local optima are generally not interesting to the designer, who will be more concerned with obtaining a set of globally non-dominated solutions, possibly uniformly spaced and illustrative of the global trade-off surface.
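As an illustration of the ranking and fitness averaging scheme of Section 3, a minimal Python sketch (the linear interpolation bounds s_max and s_min are our own choice; minimization is assumed):

import numpy as np

def moga_rank(objs):
    # rank(x_i, t) = 1 + p_i, with p_i the number of individuals dominating x_i.
    n = len(objs)
    dom = lambda a, b: np.all(a <= b) and np.any(a < b)   # a dominates b
    return np.array([1 + sum(dom(objs[j], objs[i]) for j in range(n) if j != i)
                     for i in range(n)])

def rank_based_fitness(objs, s_max=2.0, s_min=0.0):
    ranks = moga_rank(objs)
    order = np.argsort(ranks, kind="stable")           # step 1: sort by rank
    fit = np.empty(len(objs))
    fit[order] = np.linspace(s_max, s_min, len(objs))  # step 2: interpolate best to worst
    for r in np.unique(ranks):                         # step 3: average within equal ranks
        fit[ranks == r] = fit[ranks == r].mean()
    return fit

Averaging within ranks keeps the total population fitness constant regardless of how individuals are distributed over the ranks, as step 3 requires.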
The use of ranking already forces the search to concentrate only on global optima. By implementing fitness sharing in the objective value domain rather than the decision variable domain, and only between pairwise non-dominated individuals, one can expect to be able to evolve a uniformly distributed representation of the global trade-off surface. Niche counts can be consistently incorporated into the extended fitness assignment method described in the previous section by using them to scale individual fitnesses within each rank. The proportion of fitness allocated to the set of currently non-dominated individuals as a whole will then be independent of their sharing coefficients.

4.1 CHOOSING THE PARAMETER σshare

The sharing parameter σshare establishes how far apart two individuals must be in order for them to decrease each other's fitness. The exact value which would allow a number of points to sample a trade-off surface only tangentially interfering with one another obviously depends on the area of such a surface.

As noted above in this section, the size of the set of solutions to a MO problem expressed in the decision variable domain is not known, since it depends on the objective function mappings. However, when expressed in the objective value domain, and due to the definition of non-dominance, an upper limit for the size of the solution set can be calculated from the minimum and maximum values each objective assumes within that set. Let S be the solution set in the decision variable domain, f(S) the solution set in the objective domain and y = (y1, ..., yq) any objective vector in f(S). Also, let

m = (min_y y1, ..., min_y yq) = (m1, ..., mq)
M = (max_y y1, ..., max_y yq) = (M1, ..., Mq)

as illustrated in Figure 2.

Figure 2: An Example of a Trade-off Surface in 3-Dimensional Space

The definition of trade-off surface implies that any line parallel to any of the axes will have not more than one of its points in f(S), which eliminates the possibility of it being rugged, i.e., each objective is a single-valued function of the remaining objectives. Therefore, the true area of f(S) will be less than the sum of the areas of its projections according to each of the axes. Since the maximum area of each projection will be at most the area of the corresponding face of the hyperparallelogram defined by m and M, the hyperarea of f(S) will be less than

A = Σ_{i=1..q} Π_{j=1..q, j≠i} (Mj − mj)

which is the sum of the areas of each different face of a hyperparallelogram of edges (Mj − mj) (Figure 3).

In accordance with the objectives being non-commensurable, the use of the ∞-norm for measuring the distance between individuals seems to be the most natural one, while also being the simplest to compute.
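A sketch of how such niche counts might be computed (our own illustration, assuming objectives already normalized to comparable scales, ∞-norm distances, the common triangular sharing function, and sharing applied only between pairwise non-dominated individuals):

import numpy as np

def niche_counts(objs, sigma_share):
    # objs: (n, q) array of normalized objective vectors (minimization).
    n = len(objs)
    def mutually_nondominated(a, b):
        dom = lambda u, v: np.all(u <= v) and np.any(u < v)
        return not dom(a, b) and not dom(b, a)
    counts = np.ones(n)                    # each individual shares with itself
    for i in range(n):
        for j in range(i + 1, n):
            if mutually_nondominated(objs[i], objs[j]):
                d = np.max(np.abs(objs[i] - objs[j]))   # infinity-norm distance
                if d < sigma_share:
                    sh = 1.0 - d / sigma_share          # triangular sharing function
                    counts[i] += sh
                    counts[j] += sh
    return counts   # shared fitness would then be fitness / niche count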
In this case, the user is still required to specify an individual σshare for each of the objectives. However, the metric itself does not combine objective values in any way. Assuming that objectives are normalized so that all sharing parameters are the same, the maximum number of points that can sample area A without interfering with each other can be computed as the number of hypercubes of volume σshare^q that can be placed over the hyperparallelogram defined by A (Figure 4). This can be computed as the difference in volume between two hyperparallelograms, one with edges (Mi − mi + σshare) and the other with edges (Mi − mi), divided by the volume of a hypercube of edge σshare, i.e.

N = [ Π_{i=1..q} (Mi − mi + σshare) − Π_{i=1..q} (Mi − mi) ] / σshare^q

Figure 3: Upper Bound for the Area of a Trade-off Surface limited by the Parallelogram defined by (m1, m2, m3) and (M1, M2, M3)

Conversely, for a given population size N, a suitable σshare can be obtained by solving the (q−1)-order polynomial equation

N·σshare^(q−1) − [ Π_{i=1..q} (Mi − mi + σshare) − Π_{i=1..q} (Mi − mi) ] / σshare = 0

for σshare.

5 INCORPORATING GOAL INFORMATION

The search can be directed towards the Pareto set of interest to the DM by providing external information to the selection algorithm. The fitness assignment method described earlier was modified in order to accept such information in the form of goals to be attained, in a similar way to that used by the conventional goal attainment method (Gembicki, 1974), which will now be briefly introduced.

5.1 THE GOAL ATTAINMENT METHOD

The goal attainment method solves the multiobjective optimization problem defined as

min_{x∈Ω} f(x)

where x is the design parameter vector, Ω the feasible parameter space and f the vector objective function, by converting it into the following nonlinear programming problem:

min_{λ, x∈Ω} λ   such that   f_i − w_i λ ≤ g_i

Here, g_i are goals for the design objectives f_i, and w_i ≥ 0 are weights, all of them specified by the designer beforehand. The minimization of the scalar λ leads to the finding of a non-dominated solution which under- or over-attains the specified goals to a degree represented by the quantities w_i λ.
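A minimal numerical sketch of this nonlinear program, assuming SciPy is available; the two-objective function f, the goals and the weights below are our own toy choices, not taken from the paper:

import numpy as np
from scipy.optimize import minimize

def f(x):
    # Toy two-objective function; both components are to be minimized.
    return np.array([x[0]**2 + x[1]**2, (x[0] - 2.0)**2 + x[1]**2])

g = np.array([1.0, 1.0])   # goals g_i (illustrative)
w = np.array([1.0, 1.0])   # weights w_i >= 0 (illustrative)

# Decision vector z = (x_1, x_2, lambda); minimize lambda subject to
# f_i(x) - w_i*lambda <= g_i, written as g_i - f_i(x) + w_i*lambda >= 0.
cons = [{"type": "ineq", "fun": lambda z, i=i: g[i] - f(z[:2])[i] + w[i] * z[2]}
        for i in range(2)]

res = minimize(lambda z: z[2], x0=np.array([0.5, 0.5, 10.0]),
               constraints=cons, method="SLSQP")
print(res.x[:2], res.x[2])   # a non-dominated point and the attainment level lambda

A negative λ at the optimum indicates that the goals are over-attained; a positive one, that they can only be under-attained.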
5.2 A MODIFIED MO RANKING SCHEME TO INCLUDE GOAL INFORMATION

The MO ranking procedure previously described was extended to accommodate goal information by altering the way in which individuals are compared with one another. In fact, degradation in vector components which meet their goals is now acceptable provided it results in the improvement of other components which do not satisfy their goals and it does not go beyond the goal boundaries. This makes it possible for one to prefer one individual to another even though they are both non-dominated. The algorithm will then identify and evolve the relevant region of the trade-off surface.

Still assuming a minimization problem, consider two q-dimensional objective vectors, y_a = (y_{a,1}, ..., y_{a,q}) and y_b = (y_{b,1}, ..., y_{b,q}), and the goal vector g = (g_1, ..., g_q). Also consider that y_a is such that it meets a number, q − k, of the specified goals. Without loss of generality, one can write

∃ k = 1, ..., q−1 : ∀ i = 1, ..., k, ∀ j = k+1, ..., q, (y_{a,i} > g_i) ∧ (y_{a,j} ≤ g_j)    (A)

which assumes a convenient permutation of the objectives. Eventually, y_a will meet none of the goals, i.e.,

∀ i = 1, ..., q, (y_{a,i} > g_i)    (B)

or even all of them, and one can write

∀ j = 1, ..., q, (y_{a,j} ≤ g_j)    (C)

In the first case (A), y_a meets goals k+1, ..., q and, therefore, will be preferable to y_b simply if it dominates y_b with respect to its first k components. For the case where all of the first k components of y_a are equal to those of y_b, y_a will still be preferable to y_b if it dominates y_b with respect to the remaining components, or if the remaining components of y_b do not meet all their goals. Formally, y_a will be preferable to y_b, if and only if

(y_{a,(1,...,k)} p< y_{b,(1,...,k)}) ∨ { (y_{a,(1,...,k)} = y_{b,(1,...,k)}) ∧ [ (y_{a,(k+1,...,q)} p< y_{b,(k+1,...,q)}) ∨ ∼(y_{b,(k+1,...,q)} ≤ g_{(k+1,...,q)}) ] }

In the second case (B), y_a satisfies none of the goals. Then, y_a is preferable to y_b if and only if it dominates y_b, i.e.,

y_a p< y_b

Finally, in the third case (C), y_a meets all of the goals, which means that it is a satisfactory, though not necessarily optimal, solution. In this case, y_a is preferable to y_b if and only if it dominates y_b or y_b is not satisfactory, i.e.,

(y_a p< y_b) ∨ ∼(y_b ≤ g)

The use of the relation preferable to as just described, instead of the simpler relation partially less than, implies that the solution set be delimited by those non-dominated points which tangentially achieve one or more goals. Setting all the goals to ±∞ will make the algorithm try to evolve a discretized description of the whole Pareto set. Such a description, inaccurate though it may be, can guide the DM in refining its requirements. When goals can be supplied interactively at each GA generation, the decision maker can reduce the size of the solution set gradually while learning about the trade-off between objectives. The variability of the goals acts as a changing environment to the GA, and does not impose any constraints on the search space. Note that appropriate sharing coefficients can still be calculated as before, since the size of the solution set changes in a way which is known to the DM.

This strategy of progressively articulating the DM preferences, while the algorithm runs, to guide the search, is not new in operations research. The main disadvantage of the method is that it demands a higher effort from the DM. On the other hand, it potentially reduces the number of function evaluations required when compared to a method for a posteriori articulation of preferences, as well as providing fewer alternative points at each iteration, which are certainly easier for the DM to discriminate between than the whole Pareto set at once.
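A sketch of this preferable-to test in code (our own translation of cases (A)-(C); minimization, with the paper's p< relation implemented as componentwise Pareto dominance over the selected components):

import numpy as np

def p_less(a, b):
    # "partially less than": a dominates b (minimization).
    return np.all(a <= b) and np.any(a < b)

def preferable(ya, yb, g):
    ya, yb, g = map(np.asarray, (ya, yb, g))
    unmet = ya > g                        # components of ya violating their goals
    if not unmet.any():                   # case (C): ya satisfies all goals
        return p_less(ya, yb) or bool(np.any(yb > g))
    if unmet.all():                       # case (B): ya satisfies no goal
        return p_less(ya, yb)
    # case (A): compare first on the violated components
    if p_less(ya[unmet], yb[unmet]):
        return True
    return bool(np.array_equal(ya[unmet], yb[unmet]) and
                (p_less(ya[~unmet], yb[~unmet]) or np.any(yb[~unmet] > g[~unmet])))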
6 THE MOGA AS A METHOD FOR PROGRESSIVE ARTICULATION OF PREFERENCES

The MOGA can be generalized one step further. The DM action can be described as the consecutive evaluation of some not necessarily well defined utility function. The utility function expresses the way in which the DM combines objectives in order to prefer one point to another and, ultimately, is the function which establishes the basis for the GA population to evolve.

Linearly combining objectives to obtain a scalar fitness, on the one hand, and simply ranking individuals according to non-dominance, on the other, both correspond to two different attitudes of the DM. In the first case, it is assumed that the DM knows exactly what to optimize, for example, financial cost. In the second case, the DM is making no decision at all apart from letting the optimizer use the broadest definition of MO optimality. Providing goal information, or using sharing techniques, simply means a more elaborated attitude of the DM, that is, a less straightforward utility function, which may even vary during the GA process, but still just another utility function.

A multiobjective genetic optimizer would, in general, consist of a standard genetic algorithm presenting the DM at each generation with a set of points to be assessed. The DM makes use of the concept of Pareto optimality and of any a priori information available to express its preferences, and communicates them to the GA, which in turn replies with the next generation. At the same time, the DM learns from the data it is presented with and eventually refines its requirements until a suitable solution has been found (Figure 5).

Figure 5: A General Multiobjective Genetic Optimizer

In the case of a human DM, such a set up may require reasonable interaction times for it to become attractive. The natural solution would consist of speeding up the process by running the GA on a parallel architecture. The most appealing of all, however, would be the use of an automated DM, such as an expert system.

7 INITIAL RESULTS

The MOGA is currently being applied to the step response optimization of a Pegasus gas turbine engine. A full non-linear model of the engine (Hancock, 1992), implemented in Simulink (MathWorks, 1992b), is used to simulate the system, given a number of initial conditions and the controller parameter settings. The GA is implemented in Matlab (MathWorks, 1992a; Fleming et al., 1993), which means that all the code actually runs in the same computation environment.

The logarithm of each controller parameter was Gray encoded as a 14-bit string, leading to 70-bit long chromosomes. A random initial population of size 80 and standard two-point reduced surrogate crossover and binary mutation were used. The initial goal values were set according to a number of performance requirements for the engine. Four objectives were used:

tr: The time taken to reach 70% of the final output change. Goal: tr ≤ 0.59 s.
ts: The time taken to settle within ±10% of the final output change. Goal: ts ≤ 1.08 s.
os: Overshoot, measured relative to the final output change. Goal: os ≤ 10%.
err: A measure of the output error 4 seconds after the step, relative to the final output change. Goal: err ≤ 10%.

During the GA run, the DM stores all non-dominated points evaluated up to the current generation. This constitutes acquired knowledge about the trade-offs available in the problem. From these, the relevant points are identified, the size of the trade-off surface estimated and σshare set. At any time in the optimization process, the goal values can be changed, in order to zoom in on the region of interest.

Figure 6: Trade-off Graph for the Pegasus Gas Turbine Engine after 40 Generations (Initial Goals)

A typical trade-off graph, obtained after 40 generations with the initial goals, is presented in Figure 6 and represents the accumulated set of satisfactory non-dominated points.
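As a small tie-in with Section 5.2, the preferable() sketch given earlier could be used to compare candidate responses against these goals (the goal values are those stated above; the two response vectors are hypothetical, chosen only to show the comparison):

# Goals from the text: tr <= 0.59 s, ts <= 1.08 s, os <= 10%, err <= 10%.
goals = [0.59, 1.08, 10.0, 10.0]

# Hypothetical candidate responses, ordered as (tr, ts, os, err).
y_a = [0.55, 1.00, 12.0, 8.0]    # violates only the overshoot goal
y_b = [0.58, 1.05, 15.0, 9.0]    # violates it more severely

print(preferable(y_a, y_b, goals))   # True: y_a is better on the violated component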
At this stage, the setting of a much tighter goal for the output error (err ≤ 0.1%) reveals the graph in Figure 7, which contains a subset of the points in Figure 6. Continuing to run the GA, more definition can be obtained in this area (Figure 8). Figure 9 presents an alternative view of these solutions, illustrating the arising step responses.

Figure 7: Trade-off Graph for the Pegasus Gas Turbine Engine after 40 Generations (New Goals)
Figure 8: Trade-off Graph for the Pegasus Gas Turbine Engine after 60 Generations (New Goals)
Figure 9: Satisfactory Step Responses after 60 Generations (New Goals)

8 CONCLUDING REMARKS

Genetic algorithms, searching from a population of points, seem particularly suited to multiobjective optimization. Their ability to find global optima while being able to cope with discontinuous and noisy functions has motivated an increasing number of applications in engineering and related fields. The development of the MOGA is one expression of our wish to bring decision making into engineering design, in general, and control system design, in particular.

An important problem arising from the simple Pareto-based fitness assignment method is that of the global size of the solution set. Complex problems can be expected to exhibit a large and complex trade-off surface which, to be sampled accurately, would ultimately overload the DM with virtually useless information. Small regions of the trade-off surface, however, can still be sampled in a Pareto-based fashion, while the decision maker learns and refines its requirements. Niche formation methods are transferred to the objective value domain in order to take advantage of the properties of the Pareto set.

Initial results, obtained from a real world engineering problem, show the ability of the MOGA to evolve uniformly sampled versions of trade-off surface regions. They also illustrate how the goals can be changed during the GA run.

Chromosome coding, and the genetic operators themselves, constitute areas for further study. Redundant codings would eventually allow the selection of the appropriate representation while evolving the trade-off surface, as suggested in (Chipperfield et al., 1992). The direct use of real variables to represent an individual together with correlated mutations (Bäck et al., 1991) and some clever recombination operator(s) may also be interesting. In fact, correlated mutations should be able to identify how decision variables relate to each other within the Pareto set.

Acknowledgements

The first author gratefully acknowledges support by Programa CIENCIA, Junta Nacional de Investigação Científica e Tecnológica, Portugal.

References

Bäck, T., Hoffmeister, F., and Schwefel, H.-P. (1991). A survey of evolution strategies. In Belew, R., editor, Proc. Fourth Int. Conf. on Genetic Algorithms, pp. 2–9. Morgan Kaufmann.

Chipperfield, A. J., Fonseca, C. M., and Fleming, P. J. (1992). Development of genetic optimization tools for multi-objective optimization problems in CACSD. In IEE Colloq. on Genetic Algorithms for Control Systems Engineering, pp. 3/1–3/6. The Institution of Electrical Engineers. Digest No. 1992/106.

Deb, K. and Goldberg, D. E. (1989). An investigation of niche and species formation in genetic function optimization. In Schaffer, J. D., editor, Proc. Third Int. Conf. on Genetic Algorithms, pp. 42–50. Morgan Kaufmann.

Farshadnia, R. (1991). CACSD using Multi-Objective Optimization. PhD thesis, University of Wales, Bangor, UK.

Fleming, P. J. (1985). Computer aided design of regulators using multiobjective optimization. In Proc. 5th IFAC Workshop on Control Applications of Nonlinear Programming and Optimization, pp. 47–52, Capri. Pergamon Press.
Fleming, P. J., Crummey, T. P., and Chipperfield, A. J. (1992). Computer assisted control system design and multiobjective optimization. In Proc. ISA Conf. on Industrial Automation, pp. 7.23–7.26, Montreal, Canada.

Fleming, P. J., Fonseca, C. M., and Crummey, T. P. (1993). Matlab: Its toolboxes and open structure. In Linkens, D. A., editor, CAD for Control Systems, chapter 11, pp. 271–286. Marcel-Dekker.

Fourman, M. P. (1985). Compaction of symbolic layout using genetic algorithms. In Grefenstette, J. J., editor, Proc. First Int. Conf. on Genetic Algorithms, pp. 141–153. Lawrence Erlbaum.

Gembicki, F. W. (1974). Vector Optimization for Control with Performance and Parameter Sensitivity Indices. PhD thesis, Case Western Reserve University, Cleveland, Ohio, USA.

Goldberg, D. E. (1989). Genetic Algorithms in Search, Optimization and Machine Learning. Addison-Wesley, Reading, Massachusetts.

Goldberg, D. E. and Richardson, J. (1987). Genetic algorithms with sharing for multimodal function optimization. In Grefenstette, J. J., editor, Proc. Second Int. Conf. on Genetic Algorithms, pp. 41–49. Lawrence Erlbaum.

Hancock, S. D. (1992). Gas Turbine Engine Controller Design Using Multi-Objective Optimization Techniques. PhD thesis, University of Wales, Bangor, UK.

MathWorks (1992a). Matlab Reference Guide. The MathWorks, Inc.

MathWorks (1992b). Simulink User's Guide. The MathWorks, Inc.

Richardson, J. T., Palmer, M. R., Liepins, G., and Hilliard, M. (1989). Some guidelines for genetic algorithms with penalty functions. In Schaffer, J. D., editor, Proc. Third Int. Conf. on Genetic Algorithms, pp. 191–197. Morgan Kaufmann.

Schaffer, J. D. (1985). Multiple objective optimization with vector evaluated genetic algorithms. In Grefenstette, J. J., editor, Proc. First Int. Conf. on Genetic Algorithms, pp. 93–100. Lawrence Erlbaum.

Wienke, D., Lucasius, C., and Kateman, G. (1992). Multicriteria target vector optimization of analytical procedures using a genetic algorithm. Part I. Theory, numerical simulations and application to atomic emission spectroscopy. Analytica Chimica Acta, 265(2):211–225.
