The Fully Informed Particle Swarm: Simpler, Maybe Better
A modified particle swarm optimizer
1. INTRODUCTION
Evolutionary computation techniques (evolutionary programming [4], genetic algorithms [5], evolutionary strategies [9] and genetic programming [8]) are motivated by the evolution of nature. A population of individuals, which encode the problem solutions, is manipulated according to the rule of survival of the fittest through "genetic" operations, such as mutation, crossover and reproduction. A best solution is evolved through the generations. These kinds of techniques have been successfully applied in many areas, and a lot of new applications are expected to appear. In contrast to evolutionary computation techniques, Eberhart and Kennedy developed a different algorithm through simulating social behavior [2,3,6,7]. As in other algorithms, a population of individuals exists. This algorithm is called particle swarm optimization (PSO) since it resembles a school of flying birds. In a particle swarm optimizer, instead of using genetic operators, these individuals are "evolved" by cooperation and competition among the individuals themselves through generations. Each particle adjusts its flying according to its own flying experience and its companions' flying experience. Each individual is named a "particle" which, in fact, represents a potential solution to a problem. Each particle is treated as a point in a D-dimensional space. The ith particle is represented as X_i = (x_i1, x_i2, ..., x_iD).
Smooth Path Planning of a Mobile Robot Using Stochastic Particle Swarm Optimization
Xin Chen and Yangmin Li
Department of Electromechanical Engineering, Faculty of Science and Technology, University of Macau
Av. Padre Tomás Pereira S.J., Taipa, Macao SAR, P.R. China
Email: {ya27407,ymli}@umac.mo

Abstract—This paper proposes a new approach using improved particle swarm optimization (PSO) to optimize the path of a mobile robot through an environment containing static obstacles. Relative to many optimization methods that produce nonsmooth paths, the PSO method developed in this paper can generate smooth paths, which are more preferable for designing continuous control technologies to realize path following using mobile robots. To reduce the computational cost of optimization, a stochastic PSO (S-PSO) with high exploration ability is developed, so that a swarm with small size can accomplish path planning. Simulation results validate the proposed algorithm in mobile robot path planning.

I. INTRODUCTION

Smooth navigation for a mobile robot is a key factor in achieving a task well; therefore this research direction has been very active in recent years. To achieve good navigation, path planning algorithms are necessary. Path planning is the task of generating a safe path connecting the start and the destination in a known or unknown environment according to some requirements, such as the shortest path and obstacle avoidance. Generally, path planning can be divided into two categories. The first category is the real-time reactive way. A path planning algorithm among polyhedral obstacles based on the geometry graph was proposed in 1979 [1]. The artificial potential field (APF) method is widely used, which provides simple and effective motion planning for practical purposes, so that robotic movement is controlled by artificial forces resulting from a virtual potential profile [2][3]. Because it is an entirely real-time computation, it is difficult to predict the path ahead of a motion; even if the path can lead the robot to
reach the destination, whether the path is optimal or not cannot be estimated. Furthermore, local potential minima are an important shortcoming which may induce failure in navigation. The second category is the off-line path planning way. Off-line path planning for a mobile robot depends on pre-mission knowledge of the space features of the landscape between the start of the robot and the final position. It is also the topic discussed in this paper. Compared to real-time path planning, in which APF is viewed as the primary technique, there are more techniques for off-line path planning. For example, a fuzzy terrain-based path planning is proposed where the traversal difficulty of the terrain is described by a traversability map using fuzzy logic techniques [4]. And a neural network is used for complete coverage path planning with obstacle avoidance in a non-stationary environment [5]. There is no doubt that path planning can be viewed as an optimization problem, and the requirements of a path can be described by some evaluation functions, such as the shortest distance under a collision-free condition. So evolutionary computational methods, for instance genetic algorithms (GAs), have been used successfully in solving the optimization of path planning [6][7]. In [8], a knowledge-based genetic algorithm for path planning of a mobile robot is proposed, which uses problem-specific genetic algorithms for robot path planning instead of the standard GAs. Inspired by social behavior in nature, some intelligent techniques simulating the swarm behaviors of ants or birds have been developed to solve optimization problems and path planning. There are two important techniques used in path planning. The first one is Ant Colony Optimization (ACO), in which agents simulate the behavior of ants detecting the intra-class pheromone left by other "ants" to get the shortest path [9]. The second method is particle swarm optimization (PSO), which is abstracted from swarm behavior.
PSO was developed in 1995 [10][11]; it has now become a very important method for solving optimization problems, including path planning [12][13][14][15]. In all references mentioned above, the paths generated by off-line path planning are nonsmooth. Since in practice the mobile robots used are normally a kind of nonholonomic car-like robot, a smooth path will be more suitable for such nonholonomic robots. Hence in this paper, we propose a new path planning method, which is able to generate safe smooth paths described by a high-order polynomial [16] using a PSO algorithm. Consequently the smooth path generated is predictable, and one can estimate the feasibility of the path ahead of robot motion. In addition to the generation of a safe path, another important issue in optimization is the computational cost. Since in the PSO paradigm a lot of particles are employed to search for the solution, the most significant way to reduce computational cost is to decrease the swarm size, i.e. the number of particles, while the performance of PSO is maintained. Hence a new modified PSO named stochastic PSO (S-PSO) is proposed in this paper, which possesses higher exploration ability than the traditional algorithm, so that a swarm with small size can be employed to accomplish path planning.

This paper is organized as follows. Section II presents the basic description of S-PSO. Section III introduces the algorithm for path planning with obstacle avoidance. In Section IV, simulations of the algorithm are presented to test the feasibility of this path planning. Some conclusions are drawn in the last section.

[Proceedings of the 2006 IEEE International Conference on Mechatronics and Automation, June 25-28, 2006, Luoyang, China]

II. STOCHASTIC PSO DESCRIPTION

Ahead of discussing path planning for a mobile robot, we propose a modified particle swarm optimization algorithm, named stochastic PSO (S-PSO), with high exploration ability, and introduce its properties to explain why we adopt such a PSO.

Theorem: A Stochastic PSO (S-PSO) is described
as follows. Let F(n) be a sequence of sub-σ-algebras of F such that F(n) ⊂ F(n+1) for all n. For a swarm including M particles, the position of particle i is defined as X_i = [x_i1 x_i2 ··· x_iD]^T, where D represents the dimension of the swarm space. The updating principle for an individual particle is defined as

v_i(n+1) = ε(n)v_i(n) + c_1i r_1i(n)(Y_i^d(n) − X_i(n)) + c_2i r_2i(n)(Y_i^g(n) − X_i(n)) + ξ_i(n),
X_i(n+1) = αX_i(n) + v_i(n+1) + ((1 − α)/φ_i(n))(c_1i r_1i(n) Y_i^d(n) + c_2i r_2i(n) Y_i^g(n)),   (1)

where c_1i and c_2i are positive constants; r_1i(n) and r_2i(n) are F(n)-measurable random variables; Y_i^d(n) represents the best position that particle i has found so far, which is of the form Y_i^d(n) = arg min_{k≤n} F(X_i(k)), where F(·) represents a fitness function to be decreased; Y_i^g(n) represents the best position found by particle i's neighborhood, which is of the form Y_i^g(n) = arg min_{j∈Π_i} F(X_j(n)); and φ_i(n) = φ_1i(n) + φ_2i(n), where φ_1i(n) = c_1i r_1i(n) and φ_2i(n) = c_2i r_2i(n). Suppose the following assumptions hold: (1) ξ_i(n) is a random variable with continuous uniform distribution, which provides additional exploration velocity; it has constant expectation denoted by Ξ_i = E ξ_i(n); (2) ε(n) → 0 as n increases, and Σ_{n=0}^∞ ε(n) = ∞; (3) 0 < α < 1; (4) r_1i(n) and r_2i(n) are independent random variables with continuous uniform distribution on [0, 1], whose expectations are 0.5; and let Φ_1i = E φ_1i(n) and Φ_2i = E φ_2i(n) respectively. Then the swarm must converge with probability one. Let Y* = arg inf_{λ∈R^D} F(λ) represent the unique optimal position in the solution space. Then the swarm must converge to Y* if lim_n Y_i^d(n) → Y* and lim_n Y_i^g(n) → Y*. Due to page limitations, the proof of this theorem is omitted.

There are two main characteristics of S-PSO:
1) ε(n) looks like a kind of inertia weight used in the traditional updating principle. But here ε(n) decreases to zero, while normally the inertia weight is bounded below by a positive scalar, such as 0.4 [17].
2) A stochastic component
ξ_i(n) is added to the updating principle, which does not depend on any recorded best solution. In many improvements on PSO, the additional methods for avoiding local minima, such as mutation [13] and restarting iteration, depend on the recorded best solution, which increases the complexity of the algorithm and makes the algorithms more difficult to analyze mathematically.

From these two characteristics, the following properties are proposed which are very useful in the application of path planning.

Property 1: At the beginning, or when n is not large enough, the individual updating principle is nonconvergent, so that a particle will move away from the best position recorded by itself and its neighborhood. This phenomenon can be viewed as a strong exploration in which all particles wander through the swarm space, making the swarm distribute as broadly as possible. During this process, the particles still record the best solutions found so far. Therefore all particles are in fact collecting information on the distribution of the best records. And when n > N_k, where N_k is a large enough integer, the swarm starts to aggregate through interactions among particles. Hence a properly designed duration of divergence will let the swarm cover the solution space broadly, so that when the swarm converges, it is more efficient at avoiding local minima.

Property 2: If ξ(n) is ignored, the updating equation of the velocity is almost the same as the original PSO's. Then the exploration ability of a particle depends entirely on the relative differences between the particle's current position and the best positions recorded by itself and its neighborhood. If a swarm aggregates too fast, the intensity of exploration behavior is consequently reduced too fast. Therefore an additional exploration velocity ξ(n) is very useful to maintain the intensity of exploration, while the convergence of the swarm is still guaranteed.

For the task of path planning, in addition to the main purpose of finding a proper path, two other important issues should be
considered: the local minimum and the computational cost. For a practical environment with obstacles, any path planning using an optimization technique can generate more than one path connecting the position of a mobile robot and the destination. But there is always a best path, the one with the best evaluation, or the best fitness. Since S-PSO has high exploration ability, it is more efficient at escaping from local minima, so that it is able to find the best path with high probability. Moreover, since S-PSO has higher exploration ability than traditional PSO, a swarm with smaller size will perform as well as, or even better than, other PSO variants with larger swarm sizes. Therefore a relatively small swarm can be used in path planning to greatly reduce the computational cost.

III. PATH PLANNING USING PSO

To realize the proposed path planning algorithm, some assumptions are made as follows:
• The mobile robot is a kind of car-like robot driven by two wheels, which obeys nonholonomic constraints and whose size cannot be ignored. The mobile robot moves in a two-dimensional space;
• The path should be smooth and continuous;
• There are several static obstacles whose heights are ignored.

Hence the purpose of developing the path planning algorithm is to find the shortest smooth path connecting the robot and the destination, such that when the robot moves along the path, it avoids collisions with any obstacles. The benefit of a smooth path is that it is easy to design a control method to enable a nonholonomic mobile robot to follow it.
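One S-PSO update step per Eq. (1) can be sketched in Python as follows. This is a minimal illustration, not the authors' code: the decreasing schedule ε(n) = 3.5/(n+1)^0.4 and the constants (c_1 = c_2 = 3.5, α = 0.95, the bound on ξ) are taken from the simulation settings of Section IV, and NumPy is assumed.

```python
import numpy as np

def spso_step(x, v, y_d, y_g, n, c1=3.5, c2=3.5, alpha=0.95,
              xi_bound=0.5e-5, rng=None):
    """One S-PSO particle update following Eq. (1).

    eps(n) must decrease to zero while its sum over n diverges;
    the schedule and constants here follow the Section IV settings.
    """
    if rng is None:
        rng = np.random.default_rng()
    eps = 3.5 / (n + 1) ** 0.4                      # decreasing inertia-like term
    r1, r2 = rng.random(x.shape), rng.random(x.shape)
    xi = rng.uniform(-xi_bound, xi_bound, x.shape)  # extra exploration velocity
    phi = c1 * r1 + c2 * r2                         # phi_1i(n) + phi_2i(n)
    v_new = (eps * v + c1 * r1 * (y_d - x)
             + c2 * r2 * (y_g - x) + xi)
    x_new = (alpha * x + v_new
             + (1 - alpha) / phi * (c1 * r1 * y_d + c2 * r2 * y_g))
    return x_new, v_new
```

Each of the M particles applies this step with its own records Y_i^d and Y_i^g; the division by φ_i(n) assumes φ_i(n) > 0, which holds with probability one for uniform r_1i, r_2i.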
In the following subsections, we will introduce the details of the algorithm and propose the architecture of the whole algorithm.

A. Description of the Leader Desired Trajectory

Since the path is a smooth one in the two-dimensional space, it can be described by an algebraic spline. Let p^d = [p^d_x, p^d_y]^T represent the points on a path. Assume that there exists a virtual point moving along the path, and that its coordinate in the X-direction is predetermined with respect to time, i.e. p^d_x(t) = ϕ(t); then the smooth path can be denoted by the algebraic spline

p^d_y(t) = a_N (p^d_x(t))^N + a_{N−1} (p^d_x(t))^{N−1} + ··· + a_0.   (2)

Since the mobile robot satisfies nonholonomic constraints, besides the requirement of reaching the destination, there is another requirement that the robot should reach the destination with a certain heading angle. That means that if the robot follows the path, it reaches the destination with a specified heading angle. Hence the derivative at the end of the path should be specified. Similarly, let the heading angle of the robot ahead of path planning define the derivative at the start of the path. Then the boundary requirements of the path can be expressed as

p^d(t_0) = P_{t0},  p^d(t_f) = P_{tf},  (dp^d_y/dp^d_x)|_{t=t0} = θ_{t0},  (dp^d_y/dp^d_x)|_{t=tf} = θ_{tf},

where [t_0, t_f] represents the duration through which the virtual point moves from the start of the desired trajectory to the end; P_{t0} represents the position of the robot ahead of path planning; P_{tf} represents the destination; θ_{t0} represents the heading angle of the robot ahead of path planning; and θ_{tf} represents the derivative at the end of the path. If only the boundary conditions are considered, the path can be chosen as a third-order polynomial. Instead of a third-order one, a higher-order polynomial is chosen for path planning in order to avoid obstacles. The order of the path can be determined according to the density of obstacles and the processing ability of the platform used for path planning. If N represents the order of the path, there are N+1 parameters, a_0 to a_N, to describe the
path. Since there are four boundary conditions, N−3 of the parameters a_0 to a_N are chosen as free parameters, and the other four parameters are expressed as functions of these free parameters. These free parameters are then the values to be optimized.

B. S-PSO Path Planning Algorithm

(1) The general S-PSO algorithm

In the previous section, the Theorem describes the basic idea of the stochastic PSO. Here we just give the best records Y_i^d and Y_i^g in the practical forms used in the realization of S-PSO. The best position record found by particle i is updated by

Y_i^d(n+1) = Y_i^d(n),   if F(X_i(n+1)) ≥ F(Y_i^d(n));
Y_i^d(n+1) = X_i(n+1),   if F(X_i(n+1)) < F(Y_i^d(n)).   (3)

The global best position found by particle i's neighborhood is modified by

Y_i^g(n+1) = arg min_{j∈Π_i} F(Y_j^d(n+1)),   (4)

where Π_i represents the neighborhood of particle i.

(2) Interaction topology in the swarm

The global best potential solution for particle i is obtained by comparing all best positions recorded by its neighborhood, just as (4) shows. The neighborhood relationship can be described by a graph [18]. Fig. 1 shows the interaction topology used in this paper, in which each particle takes six other particles as its neighborhood.

Fig. 1. A netlike interaction topology.

(3) Fitness evaluation

The goal of PSO is to minimize a fitness function F(·). Therefore the path planning problem should be described by an evaluation function, or fitness function, so that path planning can be transformed into an optimization problem. Here a combined fitness function is proposed with respect to two requirements: (i) arriving at the destination along the trajectory as soon as possible; (ii) avoiding obstacles.

1) Fitness function with respect to the trajectory's length.
The length of the trajectory can be calculated in the direct way F_path = ∫_Γ sqrt(1 + (dp^d_y/dp^d_x)^2) dp^d_x, where Γ represents the path. But it is too difficult to solve this integral analytically due to the square-root operator. Hence another fitness function is chosen. If we let the X-axis of the universal reference frame lie along the straight line connecting the robot and the destination, the fitness function can be expressed by

F_path = ∫_0^{p^d_x(t_f)} (p^d_y)^2 dp^d_x.   (5)

It reflects the intention that the desired path should be as close as possible to the straight line connecting the two ends of the path.

2) Fitness function with respect to obstacle avoidance.

In order to avoid obstacles, the shortest distance between the obstacles and all points on the path should be larger than a critical or safe threshold. If we define such a threshold as ρ_eff, and let Ω and Ψ represent the set of points on the path and the set of obstacles, this intention can be expressed as ∀t, ∀i ∈ Ω, ∀j ∈ Ψ, min{ρ_ij} ≥ ρ_eff, where ρ_ij represents the distance between point i and obstacle j. If we let ρ_j^min = min{ρ_ij}, given ∀t, ∀i ∈ Ω, ∀j ∈ Ψ, an evaluation function for obstacle avoidance is designed as

F_obstacle_j = μ (1/ρ_j^min − 1/ρ_eff),   if ρ_j^min ≤ ρ_eff;
F_obstacle_j = 0,   if ρ_j^min > ρ_eff.   (6)

Therefore the key point is to find ρ_j^min. Fig. 2 shows an example of calculating such a distance, in which a path for optimization is drawn, on which a virtual robot denotes the virtual moving point. Then the minimal distance between obstacle m and the moving point is the minimal distance between the obstacle and the path. A critical point P_m^c is defined such that the straight line through the center of the obstacle intersects the trajectory perpendicularly at it. Then if we can find this critical point, the minimal distance can also be obtained.

Fig. 2. A snapshot of formation with two robots at time t_s.

Hence the key point in constructing a fitness function for obstacle avoidance is to find this critical point. Given obstacle m in Fig. 2, if at instant t_s the moving point
reaches the position denoted by the robot in the figure, a perpendicular (dashed) line is drawn through the position of the moving point, and we draw a straight line connecting the robot with obstacle m. Obviously, if the moving point is at the critical point, the connecting line must coincide with the perpendicular line. Therefore finding the critical point can itself be viewed as an optimization problem, such that the fitness function evaluating the critical point is expressed as

F_criticalpoint_j = [ 1 + ((p^o_jy − p^c_jy)/(p^o_jx − p^c_jx)) · (dp^d_y/dp^d_x)|_{P^d = P^c_j} ]^2,   (7)

where the subscript j indicates that F_criticalpoint_j only represents the fitness function of the critical point with respect to obstacle j, and P^c_j = [p^c_jx p^c_jy]^T and P^o_j = [p^o_jx p^o_jy]^T represent the coordinates of the critical point and the obstacle respectively.

Based upon the above analysis, if the path planning algorithm can handle M obstacles at one time, the fitness function in S-PSO has the following combined form:

F = ω_1 · F_path + ω_2 · Σ_{i=1}^M F_criticalpoint_i + ω_3 · Σ_{i=1}^M F_obstacle_i,   (8)

where ω_1 to ω_3 represent positive weights.

(4) Description of particles in the swarm

Based on the description of the path and the fitness functions, the dimension of the solution space can be determined. First, all free parameters are chosen in order to describe a desired path. For example, if an order-N spline is chosen to describe the desired path, we select a_0 to a_{N−4} as the free parameters. And we need to find the critical point with respect to every obstacle. Since a virtual moving point is introduced to make the desired path a function of time, the critical point for obstacle j can also be expressed by a time T^c_j, meaning that the moving point reaches the critical point at time T^c_j. Therefore, if we assume that M obstacles are handled for path planning, the position of an arbitrary particle has the form

X = [a_0 a_1 ··· a_{N−4} T^c_1 T^c_2 ··· T^c_M]^T.

(5) The realization process of S-PSO path planning
To make the path planning algorithm clearer, we describe it in the following steps:

Step 1: Initialize S-PSO by distributing all particles randomly within the solution space. Take the output of S-PSO as the parameters to construct the initial path according to the boundary conditions. Initiate a moving point at the start of the path. Throughout the whole process, the moving point moves toward the destination at constant velocity. Construct an obstacle set to record the M obstacles nearest the moving point. Initialize the obstacle set to zero.

Step 2: Start the moving point.

Step 3: If the moving point reaches the destination, the path planning is successful and ends. Otherwise go to the next step.

Step 4: Compare the nearest M obstacles with the obstacle set. If they are the same, the moving point keeps going and the program goes to Step 3. Otherwise go to the next step.

Step 5: Update the obstacle set with the new M obstacles and start a path planning process, taking the current pose of the moving point as the boundary conditions. According to the solution of S-PSO, a new path is generated, and the moving point follows the new path. Then go back to Step 3.

The pseudocode of the S-PSO path planning is listed as follows:

For n = 1 to the max iteration
  For i = 1 to the swarm size
    Calculate the fitness F_path using (5)
    For j = 1 to M
      Calculate the fitness F_criticalpoint_j using (7)
      Calculate the fitness F_obstacle_j using (6)
    EndFor
    Calculate the fitness F using (8)
    Update Y_i^d found by particle i using (3)
    Update Y_i^g found by particle i's neighborhood using (4)
    Calculate v_i(n) and X_i(n) using (1)
  EndFor
EndFor

IV. SIMULATION

The proposed algorithm is implemented in MATLAB on an Intel Pentium 4 3.00 GHz computer. The simulation environment is presented as follows.
• The simulation space is a rectangular plane with seven obstacles;
• Just as mentioned for the path-length fitness, the X-axis lies along the line connecting the start point and the destination, so the start position is
(0 m, 0 m) and the destination is (7 m, 0 m); the derivatives at the start point and the end point of the path should be zero;
• For each obstacle there is a safety range, denoted by a red circle around the obstacle, which means that if the robot stays farther from the center of an obstacle than the radius of the safety range, it will avoid collision with this obstacle. To simplify the simulation, we assume that all radii of the safety ranges are identically 0.25 m;
• The splines used to represent paths are sixth-order polynomials. Consequently there are seven parameters, a_0 to a_6, to be determined. According to the boundary conditions, a_6 to a_4 are chosen as free parameters in the optimization process;
• It is assumed that the path planning algorithm can handle four obstacles at one time. When the virtual moving point is moving along the path, the path planning only generates a safe path according to the four nearest obstacles;
• The swarm used in S-PSO has 20 particles. According to the order of the polynomial and the number of obstacles handled at one time, the dimension of the solution space is seven. The parameters used in the S-PSO algorithm are listed below: ε(n) = 3.5/(n+1)^0.4, c_1 = c_2 = 3.5, α = 0.95; ξ(n) is the additional random velocity, which is limited within [−0.5×10^−5, 0.5×10^−5];
• Each path planning run lasts 500 iterations.

The simulation results are shown in Fig. 3 and Fig. 4. In Fig. 3, the blue line represents the path generated by S-PSO, and the seven obstacles are marked 1 to 7. Obviously the path is safe because it does not pass through the safety range of any obstacle. Since there are seven obstacles, as the virtual point moves to the destination, four path planning runs are needed due to the upper limit on the number of obstacles handled at one time. So the whole path consists of four parts, which are separated by the dashed lines and denoted by segments (1) to (4) in Fig. 3. For example, after the first path planning run generates a path connecting the start and
the destination according to obstacles 1 to 4, the virtual point moves along the path, whose trace is segment (1). When it reaches the intersection point between segment (1) and segment (2), because the nearest obstacles change to obstacles 2 to 5, the second path planning run is triggered, and this time the start of the path is the current point of the virtual point. Since the derivative at the end of segment (1) is taken as the derivative at the start for the second path planning run, it is guaranteed that the connection of segment (1) and segment (2) is smooth. Similarly, all connections of segments are smooth, so the whole path must be smooth.

According to the meaning of the coordinates of particle i, x_i1 to x_i3 represent the free parameters a_6 to a_4, which determine the polynomial spline with respect to the boundary conditions. Therefore we pay particular attention to the evolution of these three coordinates. Fig. 4 shows the evolution of a_6 to a_4 in all four path planning runs.
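To illustrate how a particle's free parameters a_6 to a_4 determine a full path: given the four boundary conditions of the first planning run (start (0, 0), goal (7, 0), zero slope at both ends), the dependent coefficients a_0 to a_3 follow from a small linear solve. The sketch below is illustrative, not the authors' code; the function name and the sample free values are assumptions.

```python
import numpy as np

def dependent_coeffs(a_free, x0=0.0, xf=7.0, y0=0.0, yf=0.0, s0=0.0, sf=0.0):
    """Given free coefficients (a4, a5, a6) of a sixth-order polynomial,
    solve for (a0, a1, a2, a3) from the four boundary conditions
    p(x0)=y0, p(xf)=yf, p'(x0)=s0, p'(xf)=sf."""
    a4, a5, a6 = a_free
    # Linear system A u = b with unknowns u = [a0, a1, a2, a3]
    A = np.array([
        [1.0, x0, x0**2, x0**3],   # p(x0)
        [1.0, xf, xf**2, xf**3],   # p(xf)
        [0.0, 1.0, 2*x0, 3*x0**2], # p'(x0)
        [0.0, 1.0, 2*xf, 3*xf**2], # p'(xf)
    ])
    high = lambda x: a4*x**4 + a5*x**5 + a6*x**6          # fixed high-order part
    dhigh = lambda x: 4*a4*x**3 + 5*a5*x**4 + 6*a6*x**5   # its derivative
    b = np.array([y0 - high(x0), yf - high(xf), s0 - dhigh(x0), sf - dhigh(xf)])
    return np.linalg.solve(A, b)
```

With this helper, each particle only carries the three free values (plus the critical-point times), and the full seven-coefficient polynomial is reconstructed whenever the fitness is evaluated.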
Obviously every optimization process is convergent. That means that in every path planning run, all particles of the swarm converge to an optimal value, which guarantees a safe path avoiding obstacles and connecting to the destination. But it should be mentioned that, because the path-length fitness cannot be optimized to zero, we cannot assert that this path must be the globally best path; we can, however, guarantee that the path avoids obstacles while approaching the straight line connecting the start and the destination.

Fig. 3. The path generated by the S-PSO path planning.

The expressions of the four paths generated by the S-PSO path planning are listed below (the span of each segment along the X-axis is given in square brackets):

1. p^d_x ∈ [0 m, 2.78 m]:
p^d_y(t) = 0.2808×10^−3 (p^d_x)^6 − 0.3018×10^−2 (p^d_x)^5 − 0.66142×10^−2 (p^d_x)^4 + 0.1512 (p^d_x)^3 − 0.3758 (p^d_x)^2 + 0.01485 p^d_x − 0.14791×10^−3

2. p^d_x ∈ [2.78 m, 3.54 m]:
p^d_y(t) = 0.14528×10^−2 (p^d_x)^6 − 0.02176 (p^d_x)^5 + 0.8595×10^−2 (p^d_x)^4 + 1.4546 (p^d_x)^3 − 9.6845 (p^d_x)^2 + 24.67 p^d_x − 22.9425

3. p^d_x ∈ [3.54 m, 4.28 m]:
p^d_y(t) = 0.6894×10^−3 (p^d_x)^6 − 0.2884×10^−2 (p^d_x)^5 − 0.05451 (p^d_x)^4 − 0.1910 (p^d_x)^3 + 7.3793 (p^d_x)^2 − 35.355 p^d_x + 49.629

4. p^d_x ∈ [4.28 m, 7.0 m]:
p^d_y(t) = 0.1824×10^−2 (p^d_x)^6 − 0.037592 (p^d_x)^5 + 0.028207 (p^d_x)^4 + 4.8445 (p^d_x)^3 − 47.185 (p^d_x)^2 + 177.11 p^d_x − 239.85

Fig. 4. Evolution process of the free parameters in the four path planning runs (panels (a)-(d): evolution of a_6 to a_4 in the 1st to 4th runs; each panel plots Value against Time (iteration)).

From the above expressions we know that the resulting free parameters are all less than 0.1, even less than 0.001. That is why the additional exploration velocity is limited within [−0.5×10^−5, 0.5×10^−5].

V. CONCLUSIONS

In this paper, an improved path planning technique named stochastic path planning is proposed for mobile robot path planning with obstacle
avoidance. To make the path smooth, a kind of spline expressed by a high-order polynomial is employed, in which some coefficients are chosen as free parameters to construct the particles of the swarm. A combined fitness function is set up for evaluating the length of a path and the performance of obstacle avoidance. Using this S-PSO path planning algorithm, a safe path can be found through a field with static obstacles. Simulation results demonstrate that the proposed algorithm for mobile robot path planning is effective.

REFERENCES

[1] T. Lozano-Perez and M. A. Wesley, "An Algorithm for Planning Collision-Free Paths among Polyhedral Obstacles," Comm. ACM, vol. 22, no. 10, pp. 560-570, 1979.
[2] M. G. Park and M. C. Lee, "Experimental Evaluation of Robot Path Planning by Artificial Potential Field Approach with Simulated Annealing," Proc. of the 41st SICE Annual Conference, vol. 4, Osaka, Japan, August 2002, pp. 2190-2195.
[3] H. Z. Zhuang, S. X. Du, and T. J. Wu, "Real-Time Path Planning for Mobile Robots," Proc. of the 4th Int. Conf. on Machine Learning and Cybernetics, Guangzhou, China, August 2005, pp. 526-531.
[4] A. Howard, H. Seraji, and B. Werger, "Fuzzy Terrain-Based Path Planning for Planetary Rovers," Proc. of the IEEE Int. Conf. on Fuzzy Systems, vol. 1, May 2002, pp. 316-320.
[5] S. X. Yang and C. Luo, "A Neural Network Approach to Complete Coverage Path Planning," IEEE Trans. on Systems, Man, and Cybernetics-Part B: Cybernetics, vol. 34, no. 1, pp. 718-725, February 2004.
[6] W. Tao, M. Zhang, and T. J. Tarn, "A Genetic Algorithm Based Area Coverage Approach for Controlled Drug Delivery Using Micro-Robots," Proc. of the IEEE Int. Conf. on Robotics and Automation, New Orleans, LA, vol. 2, April 2004, pp. 2086-2091.
[7] M. Gerke, "Genetic Path Planning for Mobile Robots," Proc. of the American Control Conference, San Diego, California, June 1999, pp. 2424-2429.
[8] Y. Hu and S. X. Yang, "A Knowledge Based Genetic Algorithm for Path Planning of a Mobile Robot," Proc. of IEEE Int. Conf. on Robotics and Automation, New Orleans, LA, vol. 5, April 2004, pp. 4350-4355.
[9] Y. T. Hsiao, C. L. Chuang, and C. C. Chien, "Ant Colony Optimization for Best Path Planning," Proc. of Int. Symp. on Communications and Information Technologies (ISCIT 2004), Sapporo, Japan, October 2004, pp. 109-113.
[10] R. C. Eberhart and J. Kennedy, "A New Optimizer Using Particle Swarm Theory," Proc. of the 6th Int. Symp. on Micro Machine and Human Science, Nagoya, Japan, pp. 39-43, 1995.
[11] J. Kennedy and R. C. Eberhart, "Particle Swarm Optimization," Proc. of IEEE Int. Conf. on Neural Networks, Perth, Australia, pp. 1942-1948, 1995.
[12] S. Doctor and G. K. Venayagamoorthy, "Unmanned Vehicle Navigation Using Swarm Intelligence," Proc. of Int. Conf. on Intelligent Sensing and Information Processing, 2004, pp. 249-253.
[13] Y. Q. Qin, D. B. Sun, M. Li, and Y. G. Cen, "Path Planning for Mobile Robot Using the Particle Swarm Optimization with Mutation Operator," Proc. of Int. Conf. on Machine Learning and Cybernetics, vol. 4, August 2004, pp. 2473-2478.
[14] Y. Li and X. Chen, "Leader-formation Navigation with Sensor Constraints," Proc. of IEEE Int. Conf. on Information Acquisition, Hong Kong SAR and Macau SAR, China, June 2005, pp. 554-559.
[15] Y. Li and X. Chen, "Mobile Robot Navigation Using Particle Swarm Optimization and Adaptive NN," Advances in Natural Computation, LNCS 3612, Eds. L. Wang, K. Chen and Y. S. Ong, Springer, pp. 628-631, 2005.
[16] E. Dyllong and A. Visioli, "Planning and Real-Time Modifications of a Trajectory Using Spline Techniques," Robotica, vol. 21, pp. 475-482, 2003.
[17] Y. Shi and R. Eberhart, "Parameter Selection in Particle Swarm Optimization," Proc. of the 7th Annual Conf. on Evolutionary Programming, New York, Springer-Verlag, 1998, pp. 591-600.
[18] R. Mendes, J. Kennedy, and J. Neves, "The Fully Informed Particle Swarm: Simpler, Maybe Better," IEEE Transactions on Evolutionary Computation, vol. 8, no. 3, pp. 204-210, 2004.
A Modified Particle Swarm Optimizer for Pipe Route Design
Qiang Liu, Chengen Wang
Key Laboratory of Integrated Automation of Process Industry, Ministry of Education, Northeastern University, Shenyang, 110004, China
liuqiang0505@,wangc@

Abstract

Pipe route design (PRD) is to design an appropriate route meeting various constraints and objectives from various candidates, which is a time-consuming and difficult task even for a skilled designer. This paper proposes a new method for PRD based on the grid method and particle swarm optimization (PSO). First, a modified particle swarm optimization (MPSO) is proposed to improve the performance of PSO. Next, the paper adopts grid theory to establish the workspace model and designs a novel fixed-length encoding mechanism based on the grid to overcome the shortcomings of variable-length encoding, then puts forward the evaluation function for PRD. Finally, the paper adopts MPSO to find the optimal path. Simulation results show this method is feasible and effective.

1. Introduction

Pipe route design (PRD) plays an important role in industry, such as factory layout, aircraft design, ship piping system design, and VLSI design. Its influence on the performance and reliability of products is increasing. PRD has been researched since the 1970s, resulting in diverse approaches [1-10]. Some of them are the maze algorithm [1], the network optimization method [3], the escape algorithm [5], expert systems [8], genetic algorithms [6][10], the cell-generation method [8] and the ant colony algorithm [9], etc. Due to the complexity of the constraints of pipe routing, PRD has not yet developed a mature theory and method [10]. For example, the maze algorithm guarantees a solution, if one exists, but it requires a lot of memory space. The escape algorithm is fast and uses less memory space, but it cannot guarantee an optimal solution [10].
The cell-generation method, based on basic pipe patterns, considers many constraints, but it is not easy to establish the basic pipe patterns when the workspace is complex [10]. References [6] and [10] adopt a genetic algorithm with variable-length encoding, but the genetic operators of [6] usually produce illegal paths and lead to a lot of repair work; in addition, the variable-length coding approach requires problem-specific operator design.

Particle swarm optimization (PSO), a population-based evolutionary computation technique, was first introduced in 1995 by Eberhart and Kennedy [11]. It has been attracting more and more attention because of its simple concept and easy implementation with only a few tuning parameters. The technique simulates social behavior among individuals (particles) "flying" through a multidimensional search space. This paper adopts grid theory to establish the workspace model and designs a novel fixed-length encoding mechanism based on the grid to overcome the shortcomings of variable-length encoding. For better performance, we propose a modified particle swarm optimization (MPSO) and then adopt MPSO to find the optimal path, which provides a new method for PRD.

2. A pipe routing method based on MPSO

2.1. PSO algorithm and its modification

In PSO, the positions of N particles are candidate solutions to a D-dimensional problem, and the moves of the particles are regarded as the search for better solutions. The position of the i-th particle at iteration t is represented by $x_i(t) = (x_{i1}(t), \ldots, x_{il}(t), \ldots, x_{iD}(t))$, and its velocity is represented by $v_i(t) = (v_{i1}(t), \ldots, v_{il}(t), \ldots, v_{iD}(t))$.
(The 11th IEEE International Conference on Computational Science and Engineering - Workshops)

During the search process the particle successively adjusts its position according to two factors: one is the best position found by itself (pbest), denoted by $p_i(t) = (p_{i1}(t), \ldots, p_{il}(t), \ldots, p_{iD}(t))$; the other is the best position found so far by its neighbors (gbest), denoted by $p_g(t) = (p_{g1}(t), \ldots, p_{gl}(t), \ldots, p_{gD}(t))$. The neighbors can be either the whole population (global version) or a small group specified before the run (local version). The velocity update equation (1) and position update equation (2) are as follows:

$$v_{il}(t+1) = \omega \, v_{il}(t) + c_1 r_1 [p_{il}(t) - x_{il}(t)] + c_2 r_2 [p_{gl}(t) - x_{il}(t)] \quad (1)$$

$$x_{il}(t+1) = x_{il}(t) + v_{il}(t+1) \quad (2)$$

where $\omega$ is the inertia weight, $c_1$ and $c_2$ are acceleration constants, usually set to 2, and $r_1$ and $r_2$ are two independent random numbers uniformly distributed in the range [0, 1].

The inertia weight $\omega$ is an important parameter that balances the global search ability and the local search ability, and many methods have been proposed to find this balance. The time-linear decreasing strategy [12] is simple and has good optimization performance; [13] analyzed a concave-function decreasing strategy and achieved better results. The acceleration factors $c_1$ and $c_2$ represent how fast a particle learns from its own experience and from the experience of the group, respectively. References [14][15] proposed a strategy in which $c_1$ declines linearly and $c_2$ increases linearly to improve the performance of PSO.

To summarize, this paper proposes a modified strategy (MPSO): $\omega$ and $c_1$ decline along a concave function, while $c_2$ increases along a convex function, as shown in formulas (3), (4), (5).
The significance is as follows: in the initial stage, $\omega$ and $c_1$ are larger, which strengthens the exploration ability, while $c_2$ stays smaller so that particles maintain strong self-cognition and do not gather at an extreme point too early. As the optimization proceeds, $\omega$ and $c_1$ decrease and $c_2$ increases gradually, strengthening the information interaction among particles.

$$\omega(t) = (\omega_s - \omega_e)(t/T)^2 - 2(\omega_s - \omega_e)(t/T) + \omega_s \quad (3)$$

$$c_1(t) = (c_{1s} - c_{1e})(t/T)^2 - 2(c_{1s} - c_{1e})(t/T) + c_{1s} \quad (4)$$

$$c_2(t) = (c_{2s} - c_{2e})(t/T)^2 - 2(c_{2s} - c_{2e})(t/T) + c_{2s} \quad (5)$$

Here $\omega_s$ and $\omega_e$ are the initial and final values of $\omega$; $c_{1s}$ and $c_{1e}$ are the initial and final values of $c_1$; $c_{2s}$ and $c_{2e}$ are the initial and final values of $c_2$; $T$ is the maximum number of iterations; and $t$ is the current iteration.

2.2. Modeling and rules description

The grid method is applied to PRD in many studies because it not only guarantees that the pipe path always goes straight or turns orthogonally, avoiding diagonals, but also describes many engineering rules conveniently. This paper establishes the workspace model using the grid method, with the grid width representing the pipe width. Obstacles are handled by the rule that an obstacle which does not completely fill a grid cell is still regarded as occupying the full cell.

Figure 1 shows a model in which the workspace is divided into 16×16 cells: the shaded parts represent obstacles, Origin represents the initial position, and Target represents the goal position.

Fig. 1. A workspace for PRD

The major assignment of PRD is to find the shortest path from the initial position to the target position that meets the engineering rules [9]:
(1) Physical constraints: connect terminals at given locations and avoid obstacles.
(2) Economic constraints: minimize the length of pipes and the number of pipe turns.
(3) Safety constraints: keep a minimum clearance from specific equipment.
(4) Production constraints: some pipes must run orthogonally, and pipes should be arranged along walls and some obstacles where possible.

2.3. A fixed-length encoding based on the grid

As noted above, [6] and [10] adopt a genetic algorithm with variable-length encoding, but the genetic operators of [6] usually produce illegal paths and lead to a lot of repair work, and the variable-length coding approach requires problem-specific design. This section designs a novel fixed-length encoding mechanism based on the grid to overcome these shortcomings.

Particle encoding and the structure of the evaluation function are the key problems of PSO [16]. To encode a particle, the workspace is first divided into D+1 parts in the X direction (D is the dimension of the particle, equal to the number of variables), and the longitudinal coordinates of the D nodes between the initial point and the goal point are defined as the code. In Fig. 2(a), the code of particle p is (y1, y2, y3), and particle p represents a working path: the initial point, node (1, y1), node (2, y2), node (3, y3), and the target point Target; these five nodes compose a working path.

Each effective particle represents a working path, and the key problem is how to connect the D nodes into an integrated pipe path. In this paper, we connect the cells located between two adjacent nodes. For example, in Fig. 2(b), the gridded part is the pipe path converted from node (2, y2) and node (3, y3). With this method, the integrated pipe path converted from particle (y1, y2, y3) is shown in Fig. 2(c).

(a) (b) (c) (d)
Fig. 2. Particle encoding based on the grid

2.4. Structure of the evaluation function

PRD looks for the optimal path, meeting as many constraints as possible in the workspace. The following objectives and constraints are taken into consideration in our evaluation function.
They are (1) avoiding obstacles, (2) the shortest path length, (3) the fewest bends, and (4) routing pipes as much as possible along existing equipment, and along walls where possible.

To cope with constraint (1), we take measures in two steps: (i) no component of a particle's position vector may be selected inside an obstacle, which is enforced at initialization, and after each update a particle that drops into an obstacle is made to jump out; (ii) the pipe path converted from two adjacent nodes must not intersect obstacles, which is handled by a penalty function $O(p)$ [9]. If the path converted from two adjacent nodes intersects an obstacle, the corresponding penalty item $O(y_i, y_{i+1}) = 1$; otherwise $O(y_i, y_{i+1}) = 0$. As shown in Fig. 2(d), the path converted from y2 and y3 intersects an obstacle, so y2 and y3 constitute an illegal path and $O(y_2, y_3) = 1$.

To cope with constraints (2) and (3), we introduce the pipe length and the total number of bends into the evaluation function.

To cope with constraint (4), we introduce an energy function [6][10]. Nodes in different regions have different energy values: nodes located next to an obstacle (equipment) are given lower energy values, which means a pipe path is more favorable if it runs along the obstacle.

The evaluation function is therefore

$$F(p) = \alpha L(p) + \beta B(p) + \gamma E(p) + \delta O(p) \quad (6)$$

where $\alpha$, $\beta$, $\gamma$, and $\delta$ are positive constants, and $L(p)$, $B(p)$, $E(p)$, $O(p)$ are defined as follows: $L(p)$ is the total length of path p, determined by the number of cells the path passes through; $B(p)$ is the total number of bends of path p; $E(p) = \sum_{i=1}^{D} e(y_i)$ is the total energy of path p; and $O(p) = \sum_{i=1}^{D} O(y_i, y_{i+1})$ is the penalty function, counting the pipe segments of path p that intersect the forbidden regions (obstacles).

2.5.
Procedure of MPSO for PRD

The procedure of MPSO for PRD is summarized as follows:
Step 1) Initialize a population of particles with random positions and velocities, where each particle contains D variables.
Step 2) Evaluate the objective values of all particles; let the pbest of each particle and its objective value equal its current position and objective value; and let gbest and its objective value equal the position and objective value of the best initial particle.
Step 3) Update the velocity and position of each particle according to formulas (1) and (2).
Step 4) Check whether each particle has dropped into an obstacle; if so, make it jump out of the obstacle.
Step 5) Evaluate the objective values of all particles according to formula (6).
Step 6) For each particle, compare its current objective value with the objective value of its pbest. If the current value is better, update pbest and its objective value with the current position and objective value.
Step 7) Determine the best particle of the current swarm. If its objective value is better than that of gbest, update gbest and its objective value with the position and objective value of this particle.
Step 8) If a stopping criterion is met, output gbest and its objective value; otherwise go back to Step 3).

3. Simulation and results

The MPSO algorithm for PRD is simulated in Matlab on an Intel 2.93 GHz computer, taking the 16×16 model of Fig. 1 as an example. The parameters of MPSO for PRD are: N = 60, D = 15, $\omega_s$ = 0.95, $\omega_e$ = 0.45, $c_{1s}$ = 2.75, $c_{1e}$ = 1.25, $c_{2s}$ = 0.5, $c_{2e}$ = 2.15; the maximum number of generations T is set to 1000; $\alpha$ = 0.25, $\beta$ = 0.5, $\gamma$ = 1, $\delta$ = 3. Each component of the position vector lies in the range [0, 16]. The region next to the underside of obstacle (a) and the regions next to the underside and upper side of obstacle (b) are given lower energy values.
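To make the Step 1)-8) loop concrete, here is a minimal Python sketch combining formulas (1)-(5) with the parameter values just listed. It is an illustrative reconstruction, not the authors' Matlab code: the obstacle repair of Step 4) and the grid decoding of Section 2.3 are omitted, and `sphere_cost` is only a stand-in for the real evaluation function F(p) of formula (6).

```python
import numpy as np

def schedule(s, e, t, T):
    # Quadratic schedule running from s at t = 0 to e at t = T (formulas (3)-(5))
    r = t / T
    return (s - e) * r * r - 2.0 * (s - e) * r + s

def mpso(cost, D=15, N=60, T=1000, lo=0.0, hi=16.0, seed=1):
    rng = np.random.default_rng(seed)
    x = rng.uniform(lo, hi, (N, D))               # Step 1: random positions
    v = rng.uniform(-1.0, 1.0, (N, D))            # ... and velocities
    pbest = x.copy()
    pval = np.array([cost(p) for p in x])         # Step 2: initial pbest / gbest
    g = pval.argmin()
    gbest, gval = pbest[g].copy(), pval[g]
    for t in range(T):
        w = schedule(0.95, 0.45, t, T)            # omega declines
        c1 = schedule(2.75, 1.25, t, T)           # c1 declines
        c2 = schedule(0.50, 2.15, t, T)           # c2 increases
        r1, r2 = rng.random((N, D)), rng.random((N, D))
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)   # formula (1)
        v = np.clip(v, -4.0, 4.0)                 # velocity clamp for stability
        x = np.clip(x + v, lo, hi)                # formula (2), kept in [lo, hi]
        val = np.array([cost(p) for p in x])      # Step 5: evaluate
        improved = val < pval                     # Step 6: update pbest
        pbest[improved], pval[improved] = x[improved], val[improved]
        g = pval.argmin()                         # Step 7: update gbest
        if pval[g] < gval:
            gbest, gval = pbest[g].copy(), pval[g]
    return gbest, gval                            # Step 8

def sphere_cost(p):
    # stand-in objective: squared distance from the grid center (8, ..., 8)
    return float(np.sum((p - 8.0) ** 2))
```

Running, for example, `mpso(sphere_cost, D=3, N=20, T=200)` drives the returned cost to a small value; swapping in the real F(p) only requires decoding each position vector into a grid path first.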
Both MPSO and standard PSO were run 20 times; the results are listed in Table 1.

Table 1. Simulation results

  Subject                                                     PSO     MPSO
  Best value of F(p)                                          18.75   18.25
  Averaged value of F(p)                                      24.0    19.5
  Averaged convergence number of generations                  312     292
  Percentage of convergence to sub-optimal
  solution (F(p) < 20)                                        35%     85%
  Averaged CPU time (sec) per 1000 generations                19.6    19.9

From the results in Table 1, the averaged CPU times for 1000 generations of the two algorithms are nearly the same, because the complexity of MPSO does not increase remarkably. The averaged convergence generation counts of the two algorithms are also close, but the best solution and the average evaluation-function value found by MPSO are much better than those of PSO, which illustrates that MPSO outperforms PSO in solution quality and search ability.

Here are the optimal particle and a sub-optimal particle found by MPSO. The optimal particle = (10.53, 10.96, 10.58, 10.56, 10.97, 10.58, 10.93, 11.02, 11.64, 11.49, 11.89, 15.98, 15.61, 15.76, 15.80); its objective value is 18.25. Fig. 3(a) shows its corresponding pipe layout and Fig. 3(b) shows its convergence curve. A sub-optimal particle = (0.10, 0.01, 0.10, 5.32, 5.19, 5.29, 5.84, 11.25, 11.47, 11.70, 11.93, 15.25, 15.80, 15.78, 15.82); its objective value is 18.75. Its corresponding pipe layout and convergence curve are given in Fig. 4.

Fig. 3. Pipe path of the optimal particle (a) and its convergence curve (b)
Fig. 4. Pipe path of the sub-optimal particle (a) and its convergence curve (b)

Fig. 3(a) shows a pipe of length 31 with 5 bends; Fig. 4(a) shows a pipe of length 31 with 6 bends. The pipe paths avoid obstacles, run orthogonally along the low-energy obstacles as closely as possible, have the shortest pipe length, and have few bends.

4. Conclusion

This paper proposes a new method for pipe routing based on the grid method and a modified particle swarm optimization.
This method adopts grid theory to establish the workspace model for pipe routing and adopts MPSO to find the optimal path. A novel fixed-length encoding mechanism is designed, and an evaluation function for PRD is put forward. The results show that the method is feasible and effective and can find satisfactory paths, providing a new approach to pipe routing. Further work will study multi-pipe parallel routing with more engineering constraints.

References
[1] Rourke, P. W. Development of a three-dimensional pipe routing algorithm. PhD Dissertation, Lehigh University, 1975.
[2] Schmidt-Traub, H., Koster, M., Holtkotter, T., et al. Conceptual plant layout. Computers & Chemical Engineering, 1998, 22 (Suppl.): 499-504.
[3] Yamada, Y., Teraoka, Y. An optimal design of piping route in a CAD system for power plant. Computers and Mathematics with Applications, 1998, 35(6): 137-149.
[4] Burdorf, A., Kampczyk, B., Lederhose, M., et al. CAPD: computer-aided plant design. Computers and Chemical Engineering, 2004, 28: 73-81.
[5] Kniat, A. Optimization of three-dimensional pipe routing. Schiffstechnik (Ship Technology Research), 2000, 47: 111-114.
[6] Ito, T. A genetic algorithm approach to pipe route path planning. Journal of Intelligent Manufacturing, 1999, 10(1): 103-114.
[7] Park, J. H., Storch, R. L. Pipe-routing algorithm development: case of a ship engine room design. Expert Systems With Applications, 2002, 23: 299-309.
[8] Kang, S., Myung, S., Han, S. A design expert system for auto routing of ship pipes. Journal of Ship Production, 1999, 15(1): 1-9.
[9] Fan, X., Lin, Y., Ji, Z. The ant colony optimization for ship pipe route design in 3D space. Proc. of the 6th World Congress on Intelligent Control and Automation, Dalian, China, June 21-23, 2006.
[10] Fan, X. A Study of Optimization Methods for Ship Pipe Routing Design and Applications. PhD Dissertation, Dalian University of Technology, Dalian, China, 2006.
[11] Kennedy, J., Eberhart, R. C.
Particle swarm optimization. Proc. of IEEE Int. Conf. on Neural Networks, Piscataway: IEEE Press, 1995: 1942-1948.
[12] Shi, Y., Eberhart, R. Empirical study of particle swarm optimization. Proc. of the International Conference on Evolutionary Computation, Washington, USA: IEEE, 1999: 1945-1950.
[13] Chen, G., Jia, J., Han, Q. Study on the strategy of decreasing inertia weight in particle swarm optimization algorithm. Journal of Xi'an Jiaotong University, 2006, 40(1): 53-56.
[14] Ratnaweera, A., Halgamuge, S. Self-organizing hierarchical particle swarm optimizer with time-varying acceleration coefficients. IEEE Transactions on Evolutionary Computation, 2004, 8(3): 240-255.
[15] Feng, X., Chen, G., Guo, W. Settings and experimental analysis of acceleration coefficients in particle swarm optimization algorithm. Journal of Jimei University (Natural Science), 2006, 11(2): 146-151 (in Chinese).
[16] Zhao, X., Chang, H., Zeng, J., Gao, Y. Path planning method for mobile robot based on particle swarm algorithm. Application Research of Computers, 2007, 24(3): 181-186 (in Chinese).
IEEE TRANSACTIONS ON EVOLUTIONARY COMPUTATION, VOL. 10, NO. 3, JUNE 2006

Comprehensive Learning Particle Swarm Optimizer for Global Optimization of Multimodal Functions

J. J. Liang, A. K. Qin, Student Member, IEEE, Ponnuthurai Nagaratnam Suganthan, Senior Member, IEEE, and S. Baskar

Abstract—This paper presents a variant of particle swarm optimizers (PSOs) that we call the comprehensive learning particle swarm optimizer (CLPSO), which uses a novel learning strategy whereby all other particles' historical best information is used to update a particle's velocity. This strategy enables the diversity of the swarm to be preserved to discourage premature convergence. Experiments were conducted (using codes available from .sg/home/epnsugan) on multimodal test functions such as Rosenbrock, Griewank, Rastrigin, Ackley, and Schwefel and composition functions, both with and without coordinate rotation. The results demonstrate good performance of the CLPSO in solving multimodal problems when compared with eight other recent variants of the PSO.

Index Terms—Composition benchmark functions, comprehensive learning particle swarm optimizer (CLPSO), global numerical optimization, particle swarm optimizer (PSO).

I. INTRODUCTION

Optimization has been an active area of research for several decades. As many real-world optimization problems become increasingly complex, better optimization algorithms are always needed. PSO uses the experiences of the swarm to search for the global optimum in the D-dimensional solution space. The PSO algorithm is easy to implement and has been empirically shown to perform well on many optimization problems. However, it may easily get trapped in a local optimum when solving complex multimodal problems.
In order to improve PSO's performance on complex multimodal problems, we present the comprehensive learning particle swarm optimizer (CLPSO), utilizing a new learning strategy.

This paper is organized as follows. Section II introduces the original PSO and some of its current variants. Section III describes the comprehensive learning particle swarm optimizer. Section IV presents the test functions, the experimental setting for each algorithm, the results, and discussions. Conclusions are given in Section V.

II. PARTICLE SWARM OPTIMIZERS

A. Particle Swarm Optimizer

Unconstrained optimization problems can be formulated as D-dimensional minimization problems, where D is the number of parameters to be optimized. The particle swarm optimizer (PSO) [1], [2] is a relatively new technique. Although PSO shares many similarities with evolutionary computation techniques, the standard PSO does not use evolution operators such as crossover and mutation. PSO emulates the swarm behavior of insects, herding animals, flocking birds, and schooling fish, in which swarms search for food in a collaborative manner; the individuals represent points in the D-dimensional search space. Each member in the swarm adapts its search pattern by learning from its own experience and other members' experiences. These phenomena are studied, and mathematical models are constructed. In PSO, a member of the swarm, called a particle, represents a potential solution, which is a point in the search space. The global optimum is regarded as the location of food. The velocity and position of the d-th dimension of the i-th particle are updated as follows [1], [2]:

$$V_i^d \leftarrow V_i^d + c_1 \cdot rand1_i^d \cdot (pbest_i^d - X_i^d) + c_2 \cdot rand2_i^d \cdot (gbest^d - X_i^d) \quad (1)$$

$$X_i^d \leftarrow X_i^d + V_i^d \quad (2)$$

where $X_i = (X_i^1, \ldots, X_i^D)$ is the position of the i-th particle; $V_i = (V_i^1, \ldots, V_i^D)$ is its velocity; $pbest_i$ is the position yielding the best fitness value for the i-th particle; $gbest$ is the best position discovered by the whole population; $c_1$ and $c_2$ are acceleration constants reflecting the weighting of the stochastic acceleration terms that pull each particle toward the $pbest_i$ and $gbest$ positions, respectively; and $rand1_i^d$ and $rand2_i^d$ are two random numbers in the range [0, 1]. A particle's velocity on each dimension is clamped to a maximum magnitude $V_{max}$.
Each particle has a fitness value and a velocity to adjust its flying direction according to the best experiences of the swarm. If the velocity on a dimension exceeds $V_{max}$, a positive constant value, the velocity of that dimension is assigned to $\pm V_{max}$, keeping its sign. When updating the velocity of a particle using (1), different dimensions use different random numbers; some researchers use the following updating equation instead, with the same random numbers for all dimensions:

$$V_i \leftarrow V_i + c_1 \cdot rand1_i \cdot (pbest_i - X_i) + c_2 \cdot rand2_i \cdot (gbest - X_i) \quad (3)$$

(Manuscript received May 3, 2005; revised July 15, 2005. The authors are with the School of Electrical and Electronic Engineering, Nanyang Technological University, 639798 Singapore (e-mail: liangjing@.sg; qinkai@.sg; epnsugan@.sg; baskar_mani@). Digital Object Identifier 10.1109/TEVC.2005.857610)

A linearly decreasing inertia weight over the course of the search was proposed by Shi and Eberhart [8]. Parameters in PSO are discussed in [9]. Shi and Eberhart designed fuzzy methods to nonlinearly change the inertia weight [10]. In [11], the inertia weight is set at zero, except at the time of reinitialization. In addition to the time-varying inertia weight, a linearly decreasing $V_{max}$ is introduced in [12]. By analyzing the convergence behavior of the PSO, a PSO variant with a constriction factor was introduced by Clerc and Kennedy [13]; the constriction factor guarantees convergence and improves the convergence velocity.

Improving PSO's performance by designing different types of topologies has been an active research direction. Kennedy [14], [15] claimed that PSO with a small neighborhood might perform better on complex problems, while PSO with a large neighborhood would perform better on simple problems. Suganthan [16] applied a dynamically adjusted neighborhood where the neighborhood of a particle gradually increases until it includes all particles. In [17], Hu and Eberhart also used a dynamic neighborhood where the closest particles in the performance space are selected to be its new neighborhood in each generation.
Parsopoulos and Vrahatis combined the global version and the local version together to construct a unified particle swarm optimizer (UPSO) [18]. Mendes and Kennedy introduced a fully informed PSO in [19]: instead of using the pbest and gbest positions of the standard algorithm, all the neighbors of a particle are used to update its velocity, and the influence of each particle on its neighbors is weighted based on its fitness value and the neighborhood size. In the FDR-PSO algorithm of Veeramachaneni et al., for each velocity dimension the algorithm selects one other particle that has a higher fitness value and is nearer to the particle being updated.

Comparing the two variants in (1) and (3), the former can have a larger search space due to the independent updating of each dimension, while the second is dimension-dependent and has a smaller search space due to the same random numbers being used for all dimensions. Equation (1) always yields better performance on unrotated problems than on the rotated versions of the problems, while (3) performs consistently on unrotated and rotated problems [3]. As the first updating strategy achieves a larger search space and always performs better, we use (1) in this paper. The flowchart of the standard PSO employing (1) is given in Fig. 1.

Fig. 1. Flowchart of the conventional PSO.

B. Some Variants of PSO

Since its introduction in 1995 by Kennedy and Eberhart [1], [2], PSO has attracted a high level of interest [4]-[7]. Many researchers have worked on improving its performance in various ways, thereby deriving many interesting variants.

Some researchers investigated hybridization by combining PSO with other search techniques to improve its performance. Evolutionary operators such as selection, crossover, and mutation have been introduced into the PSO to keep the best particles [21], to increase the diversity of the population, and to improve the ability to escape local minima [22]. Mutation operators are also used to mutate parameters such as the inertia weight [23].
Relocating the particles when they are too close to each other [24], or using collision-avoiding mechanisms [25] to prevent particles from moving too close to each other, has also been used to maintain the diversity and to escape from local optima. In [22], the swarm is divided into subpopulations, a breeding operator is used within a subpopulation or between the subpopulations to increase the diversity of the population, and negative entropy is used to discourage premature convergence. One of the variants [8] introduces a parameter called the inertia weight $\omega$ into the original PSO as follows:

$$V_i^d \leftarrow \omega \cdot V_i^d + c_1 \cdot rand1_i^d \cdot (pbest_i^d - X_i^d) + c_2 \cdot rand2_i^d \cdot (gbest^d - X_i^d) \quad (4)$$

The inertia weight is used to balance the global and local search abilities: a large inertia weight is more appropriate for global search, and a small inertia weight facilitates local search. In other variants, local searches are performed and their results are integrated by a global swarm, significantly improving the performance of the original PSO on multimodal problems.

III. COMPREHENSIVE LEARNING PARTICLE SWARM OPTIMIZER

Although there are numerous variants of the PSO, premature convergence when solving multimodal problems is still its main deficiency. In the original PSO, each particle learns from its pbest and the gbest simultaneously. Restricting the social learning aspect to only the gbest makes the original PSO converge fast. However, because all particles in the swarm learn from the gbest even if the current gbest is far from the global optimum, particles may easily be attracted to the gbest region and get trapped in a local optimum if the search environment is complex with numerous local solutions. Moreover, the fitness value of a particle is determined by the values of all D parameters, so a particle that has discovered the region corresponding to the global optimum in some dimensions may still have a low fitness value because of the poor solutions in the other dimensions. In order to make better use of the beneficial information, we proposed new learning strategies to improve the original PSO [30].
In [30], all particles' pbests are used to update the velocity of any one particle. This novel strategy ensures that the diversity of the swarm is preserved to discourage premature convergence. Three versions of PSO using the comprehensive learning strategy were discussed and demonstrated significantly improved performance on multimodal problems in comparison to several other PSO variants. Among the three, the CLPSO was the best based on the results; hence, we further investigate the CLPSO in this paper.

A. Comprehensive Learning Strategy

In this new learning strategy, we use the following velocity updating equation:

$$V_i^d \leftarrow \omega \cdot V_i^d + c \cdot rand_i^d \cdot (pbest_{f_i(d)}^d - X_i^d) \quad (5)$$

where $f_i = [f_i(1), f_i(2), \ldots, f_i(D)]$ defines which particles' pbests particle i should follow on each dimension. The particle searches the space using the information derived from different particles' historical best positions. To ensure that a particle learns from good exemplars and to minimize the time wasted on poor directions, we allow the particle to learn from its exemplars until it ceases improving for a certain number of generations. $pbest_{f_i(d)}^d$ can be the corresponding dimension of any particle's pbest, including the particle's own, and the decision depends on a probability $Pc$, referred to as the learning probability, which can take different values for different particles. For each dimension of particle i, we generate a random number; if this number is larger than $Pc_i$, the corresponding dimension will learn from the particle's own pbest, otherwise it will learn from another particle's pbest. When a dimension learns from another particle's pbest, we employ the tournament selection procedure as follows.
1) We first randomly choose two particles out of the population, excluding the particle whose velocity is updated.
2) We compare the fitness values of these two particles' pbests and select the better one.

Fig. 2. Selection of exemplar dimensions for particle i.
In CLPSO, we define fitness as "the larger the better," which means that when solving minimization problems we use the negative function value as the fitness value.
3) We use the winner's pbest as the exemplar to learn from for that dimension.
If all the exemplar dimensions of a particle are its own pbest, we randomly choose one dimension to learn from the corresponding dimension of another particle's pbest. The details of choosing $f_i$ are given in Fig. 2. In contrast to the original PSO (1) and (3), in which each particle learns from its pbest and the gbest at the same time in every generation, in CLPSO each dimension of a particle learns from just one exemplar for a few generations, and a particle may learn from the corresponding dimensions of different particles' pbests.

B. CLPSO's Search Behavior

The above operations increase the swarm's diversity. The gbest is more likely to provide a larger momentum, as its distance from the current position is likely to be larger than that of the pbest; hence the gbest may influence the particle to move in its direction even if the gbest lies in a local optimum region far from the global optimum. If the pbest and the gbest are on the same side of the particle's current position and point to a local optimum, the particle will move in that direction, and it may be impossible to jump out of the local optimum area once its pbest falls into the same local optimum region where the gbest is. In our new learning strategy, however, the particle can fly in other directions by learning from other particles' pbests when its own pbest and the gbest fall into the same local optimum region. Hence, the new learning strategy has the ability to jump out of a local optimum via the cooperative behavior of the whole swarm.

In order to compare the original PSO's and CLPSO's potential search spaces, we first omit the old velocity component. If we let $c_1$, $c_2$, $rand1$, $rand2$ in the original PSO and $c$, $rand$ in CLPSO all be equal to one, the update equations of the original PSO and of CLPSO reduce to the following equations:

$$X_i^d \leftarrow pbest_i^d + gbest^d - X_i^d \quad (6)$$

$$X_i^d \leftarrow pbest_{f_i(d)}^d \quad (7)$$

Let us consider the fourth particle in a swarm with five members as an example. The potential search spaces of the original PSO and the CLPSO on one dimension are plotted as a line in Fig. 3, which illustrates three different cases (a), (b), and (c) for the fourth particle. From (6) and (7), the potential search ranges of the i-th particle of PSO and of CLPSO follow, as shown in (8) and (9).

Fig. 3. The CLPSO's and the original PSO's possible search regions per variable in a swarm with five members. Cases (a), (b), and (c).

In order to compare the potential search spaces of PSO and CLPSO empirically, both algorithms are run 20 times on a (unimodal) sphere function and a (multimodal) Rastrigin function defined in Section IV-A. Table I presents the mean value of the potential search range over the 20 runs, and the ranges versus the iterations are plotted in Fig. 4. From Table I and Fig. 4, we observe that CLPSO's updating strategy yields a larger potential search space than that of the original PSO; on the multimodal Rastrigin function the mean range is ten times larger than on the unimodal sphere function. By increasing each particle's potential search space, the diversity is also increased, and as each particle's exemplars are likely to lie in good areas, the search of CLPSO is neither blind nor random. Compared with the original PSO, CLPSO searches more promising regions to find the global optimum; the experimental results in Section IV support this description.

Fig. 4. Comparison of PSO's and CLPSO's potential search space on the sphere and Rastrigin functions.

TABLE I. Mean value of the potential search range for the sphere and Rastrigin functions in 20 runs.

Different $Pc$ values yielded different results on the same problem, and on the rotated problems different $Pc$ values yield the best performance for different problems.
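As a concrete illustration of the comprehensive learning update of Eq. (5) with tournament-based exemplar selection, here is a minimal Python sketch. It is an assumption-laden simplification, not the authors' reference code: the refreshing-gap bookkeeping is omitted, and the function and parameter names are illustrative.

```python
import numpy as np

def clpso_step(x, v, pbest, pval, pc, w=0.7, c=1.49445, vmax=None, rng=None):
    """One CLPSO update per Eq. (5): v_i^d = w*v_i^d + c*rand*(pbest_{f_i(d)}^d - x_i^d).

    pval[i] is the cost of pbest[i] (smaller is better); pc[i] is the learning
    probability of particle i.  Exemplars are re-chosen every call here, whereas
    the paper re-chooses them only after a refreshing gap of stagnation.
    """
    rng = np.random.default_rng() if rng is None else rng
    n, d = x.shape
    exemplar = pbest.copy()                  # default: learn from own pbest
    for i in range(n):
        others = [j for j in range(n) if j != i]
        learned_any = False
        for k in range(d):
            if rng.random() < pc[i]:         # learn this dimension from another pbest
                a, b = rng.choice(others, size=2, replace=False)
                winner = a if pval[a] < pval[b] else b   # size-2 tournament on fitness
                exemplar[i, k] = pbest[winner, k]
                learned_any = True
        if not learned_any:                  # force one dimension from another pbest
            k = rng.integers(d)
            exemplar[i, k] = pbest[rng.choice(others), k]
    v = w * v + c * rng.random((n, d)) * (exemplar - x)
    if vmax is not None:
        v = np.clip(v, -vmax, vmax)          # per-dimension velocity clamp
    return x + v, v
```

Note that each dimension pulls toward a single exemplar, so a particle whose pbest shares a basin with the swarm's best can still be drawn elsewhere on some dimensions, which is the diversity mechanism discussed above.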
Different values yield similar results on simple unimodal problems while seri-ously affecting CLPSO’s performance on multimodal problems.In order to address this problem in a general manner, we pro-pose to set suchthat eachparticle has adifferent Therefore, particles have different levels of exploration and ex-ploitation ability in the population and are able to solve diverseproblems. We empirically developed the following expressionto set a valuefor eachparticle:(10)searchbounds, inmanypracticalproblems,there areboundson thevariables’ranges. Thesearch rangefor a problem Fig. 5 presents an example of assignedfor apopulationof 30. Each particle from 1 to 30 has a0.05 to 0.5.D. Implementation of Search Boundsvalueranging fromThough we have shown in [30] the CLPSO to be ro- fitnessvalue of aparticle andupdate its bust to initialization and independent of upper and lower theparticle is inthe range.Since allexemplars arewithin the286IEEE TRANSACTIONS ON EVOLUTIONARY COMPUTA TION, VOL. 10, NO. 3, JUNE 2006Fi g . 7. C L P S O ’s r e s u l t s o n s i x t e s t f u n c t i o n s w i t h d i f f e r e n t r e f r e s h i n g ga p. Fig. 6.Flowchart of the CLPSO algorithm.range, the particle will eventually return to the search range. The complete flowchart of the CLPSO is given in Fig. 6.E. Adjusting the Refreshing GapThe refreshing gap parameterneeds be tuned. In this sec-tion, six different kinds of ten-dimensional (10-D) test functions are used to investigate the impact of this parameter. They are the sphere, Rosenbrock, Ackley, Griewank, Rastrigin, and ro-tated Rastrigin functions as de fined in Section IV . The CLPSOis run 20 times on each of these functions, and the mean valuesof the final results are plotted in Fig. 7. As all test functions areFig. 8. The landscape maps of Group D problems. (a) Composition function 1 (CF1). (b) Composition function 5 (CF5).and better results on the sphere function. 
From Fig. 7, we can observe that the refreshing gap can influence the results. When the refreshing gap is zero, we obtained a faster convergence velocity and better results on the sphere function; for the other five test functions, better results were obtained with a nonzero refreshing gap.

IV. EXPERIMENTAL RESULTS AND DISCUSSION

A. Test Functions

As we wish to test the CLPSO on diverse test functions, and as our main objective is to improve PSO's performance on multimodal problems, we choose two unimodal functions and 14 multimodal benchmark functions [32]–[35]. All functions are tested on ten and 30 dimensions. According to their properties, these functions are divided into four groups: unimodal problems, unrotated multimodal problems, rotated multimodal problems, and composition problems. The properties and the formulas of these functions are presented below.

Group A: Unimodal and Simple Multimodal Problems:

1) Sphere function:
    f1(x) = Σ_{i=1}^{D} x_i²    (11)

2) Rosenbrock's function:
    f2(x) = Σ_{i=1}^{D-1} [100 (x_i² - x_{i+1})² + (x_i - 1)²]    (12)

The first problem, the sphere function, is easy to solve. The second problem, the Rosenbrock function, can be treated as a multimodal problem: it has a narrow valley from the perceived local optima to the global optimum. In the experiments below, we find that the algorithms that perform well on the sphere function also perform well on the Rosenbrock function.

Group B: Unrotated Multimodal Problems:

3) Ackley's function:
    f3(x) = -20 exp(-0.2 sqrt((1/D) Σ_{i=1}^{D} x_i²)) - exp((1/D) Σ_{i=1}^{D} cos(2π x_i)) + 20 + e    (13)

The Griewank function is more difficult for lower dimensions than for higher dimensions [36]. The Weierstrass function is continuous but differentiable only on a set of points. Rastrigin's function is a complex multimodal problem with a large number of local optima. When attempting to solve Rastrigin's function, algorithms may easily fall into a local optimum. Hence, an algorithm capable of maintaining a larger diversity is likely to yield better results.
The noncontinuous Rastrigin function is constructed based on Rastrigin's function and has the same number of local optima as the continuous Rastrigin's function. The complexity of Schwefel's function is due to its deep local optima being far from the global optimum: it will be hard to find the global optimum if many particles fall into one of the deep local optima.

4) Griewank's function:
    f4(x) = Σ_{i=1}^{D} x_i²/4000 - Π_{i=1}^{D} cos(x_i/sqrt(i)) + 1    (14)

5) Weierstrass function    (15)

6) Rastrigin's function:
    f6(x) = Σ_{i=1}^{D} [x_i² - 10 cos(2π x_i) + 10]    (16)

Group C: Rotated Multimodal Problems: In Group B, some functions are separable and can be solved by using D one-dimensional searches, whereas in Group C we have four rotated multimodal problems. To rotate a function, first an orthogonal matrix M should be generated; the original variable x is left-multiplied by the orthogonal matrix to get the new rotated variable y = M x, which is then used to compute the fitness value. When one dimension in x is changed, all dimensions of y are affected, so the rotated function cannot be solved by D one-dimensional searches. In this paper, we used Salomon's method [37] to generate the orthogonal matrix.

9) Rotated Ackley's function
10) Rotated Griewank's function
11) Rotated Weierstrass function

CF5 is more difficult than CF1 since, even after the global basin is found, the global optimum is not easy to locate. The landscape maps of these two composition functions are illustrated in Fig. 8. The global optima, the corresponding fitness values, and the search ranges of each function are given in Table II.
Biased initializations are used for the functions whose global optimum is at the center of the search range.

12) Rotated Rastrigin's function    (24)
13) Rotated noncontinuous Rastrigin's function
14) Rotated Schwefel's function    (25)

In the rotated Schwefel function, in order to keep the global optimum within the search range after rotation, and noting that the original global optimum of Schwefel's function is at [420.96, 420.96, ..., 420.96], shifted variables are used instead of the rotated variables directly. Since Schwefel's function has better solutions outside of the search range, when a coordinate leaves the range its fitness contribution is set in proportion to the squared distance between the coordinate and the bound.

Group D: Composition Problems: Composition functions are constructed using some basic benchmark functions to obtain more challenging problems with a randomly located global optimum and several randomly located deep local optima.¹ (¹More composition functions can be found at .sg/home/EPNSugan/.)

TABLE II. Global optimum, search ranges, and initialization ranges of the test functions.

Nine PSO algorithms are compared: PSO with inertia weight (PSO-w) [8]; PSO with constriction factor (PSO-cf) [13]; the local version of PSO with inertia weight (PSO-w-local); the local version of PSO with constriction factor (PSO-cf-local); UPSO; FDR-PSO; FIPS; CPSO-H [29]; and CLPSO. Among the PSO local versions, PSO-w-local and PSO-cf-local were chosen with von Neumann neighborhoods, as these versions yielded the best results [15]; in a von Neumann neighborhood, the neighbors above, below, and one on each side on a two-dimensional lattice are connected. FIPS with the U-ring topology, which achieved the highest success rate [19], is used.

When solving the 10-D problems, the population size is set at ten and the maximum number of fitness evaluations (FEs) is set at 30 000. When solving the 30-dimensional (30-D) problems, the population size is set at 40 and the maximum FEs at 200 000. All experiments were run 30 times; the mean values and standard deviations of the results are presented.

When solving real-world problems, usually the fitness calculation accounts for the most time, as the PSO itself is highly computation efficient.
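The Group C rotation procedure can be sketched in code. Since Salomon's method [37] is not reproduced in this excerpt, the sketch substitutes a generic construction (Gram–Schmidt orthonormalization of a random Gaussian matrix, an assumption) to obtain an orthogonal matrix M, and evaluates a benchmark on y = M x:

```python
import math, random

def random_orthogonal(d, rng=random.Random(1)):
    """Random d-by-d orthogonal matrix via Gram-Schmidt on a random
    Gaussian matrix -- a generic stand-in for Salomon's method [37],
    whose details are not reproduced in this excerpt."""
    m = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(d)]
    basis = []
    for row in m:
        # Subtract projections onto previously accepted basis vectors.
        for b in basis:
            dot = sum(r * bi for r, bi in zip(row, b))
            row = [r - dot * bi for r, bi in zip(row, b)]
        norm = math.sqrt(sum(r * r for r in row))
        basis.append([r / norm for r in row])
    return basis

def rotated(func, M):
    """Wrap a benchmark so it is evaluated on the rotated variable y = M x."""
    def f(x):
        y = [sum(Mi[j] * x[j] for j in range(len(x))) for Mi in M]
        return func(y)
    return f
```

Rotation leaves the sphere function unchanged but couples the coordinates of separable functions such as Rastrigin's, which is exactly why the rotated versions resist D one-dimensional searches.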
Hence, the algorithm-related computation times of these algorithms are not compared in this paper. Further, the main difference between the CLPSO and the original PSO is the modified velocity updating equation, which has been made simpler in the CLPSO; the complexity of the new algorithm is similar to that of the original PSO. In the experiments, a serial implementation is used, while it is easy to modify it into a parallel implementation. In a parallel form, the performance is likely not to be affected much, owing to the batch updating of the personal best positions, while the computational efficiency improves.

C. Experimental Results and Discussions

1) Results for the 10-D Problems: Table III presents the means and variances of the 30 runs of the nine algorithms on the sixteen test functions with D = 10. The best results among the nine algorithms are shown in bold. From the results, we observe that for the Group A unimodal problems, since CLPSO has a large potential search space, it could not converge as fast as the original PSO. However, CLPSO achieved better results than the original PSO on all three multimodal groups. CLPSO surpasses all other algorithms on functions 4, 5, 7, 8, 10, 12, 13, 14, 15, and 16, and especially significantly improves the results on functions 7 and 8; according to the statistical tests, these results are different from the second-best results. The CLPSO achieved the same best result as the CPSO-H on function 6, and both are much better than the remaining algorithms. CLPSO performs better on more complex problems, where the other algorithms miss the global optimum basin. Schwefel's function is a good example, as it traps all other algorithms in local optima; the CLPSO successfully avoids falling into the deep local optima far from the global optimum.
In order to determine whether the results obtained by CLPSO are statistically different from the results generated by the other algorithms, nonparametric Wilcoxon rank sum tests are conducted between the CLPSO's result and the best result achieved by the other eight PSO versions for each problem. The h values presented in the last row of Tables III and IV are the outcomes of these tests: an h value of one indicates that the performances of the two algorithms are statistically different with 95% certainty, whereas an h value of zero implies that the performances are not statistically different. Fig. 9 presents the convergence characteristics, in terms of the best fitness value of the median run of each algorithm, for each test function.

On the two composition functions with randomly distributed local and global optima, CLPSO performs the best. Comparing the results and the convergence graphs among these nine PSO algorithms, FDR-PSO has good local search ability and converges fast. PSO with constriction factor converges faster than the PSO with inertia weight, but PSO with inertia weight performs better on multimodal problems; both are global versions where the whole population is the neighborhood. UPSO combines the global PSO and the local PSO together to yield a balanced performance between the global and the local versions. PSO with inertia weight (PSO-w-local), PSO with constriction factor (PSO-cf-local), and FIPS with a U-ring topology are all local versions; they all perform better on multimodal problems than the global versions, and among the three, FIPS yields a comparatively better performance. CPSO-H presents good performance on some unrotated multimodal problems and converges faster when compared to CLPSO; however, its performance is seriously affected after rotation.

TABLE III. Results for the 10-D problems.
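The rank sum comparison described above can be sketched in code. The following is a minimal, pure-Python version of the two-sided Wilcoxon rank-sum test using the normal approximation and no tie correction (a simplification; the paper does not specify its implementation), returning an h value in the same 0/1 convention as the tables:

```python
import math

def rank_sum_test(a, b):
    """Two-sided Wilcoxon rank-sum test (normal approximation, no tie
    correction).  Returns (z, h): h = 1 means the two samples differ
    at the 95% level, matching the h-value convention of the tables."""
    combined = sorted((v, 0 if i < len(a) else 1)
                      for i, v in enumerate(list(a) + list(b)))
    values = [v for v, _ in combined]
    rank_of = [0.0] * len(values)
    i = 0
    while i < len(values):            # assign average ranks to ties
        j = i
        while j < len(values) and values[j] == values[i]:
            j += 1
        avg = (i + j + 1) / 2.0       # average of ranks i+1 .. j
        for k in range(i, j):
            rank_of[k] = avg
        i = j
    W = sum(r for r, (_, lab) in zip(rank_of, combined) if lab == 0)
    n1, n2 = len(a), len(b)
    mu = n1 * (n1 + n2 + 1) / 2.0
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (W - mu) / sigma
    h = 1 if abs(z) > 1.96 else 0     # two-sided 95% threshold
    return z, h
```

In practice each algorithm's 30 final fitness values per function would be fed in as `a` and `b`; a library routine such as SciPy's rank-sum test would normally be preferred over this hand-rolled version.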
Although CLPSO’s performance is alsoaffected by the rotation, it still performs the best on four rotated problems. It can be observed that all PSO variants failed onthe rotated Schwefel’s function, as it becomes much harder to solve after applying rotation.2) Results for the 30-D Problems: The experiments con- ducted on 10-D problems are repeated on the 30-D problems and the results presented in Table IV. As the convergence graphs are similar to the 10-D problems, they are not presented. From the results in Table IV, we can observe that the algorithms。
The Kalman Swarm - A New Approach to Particle Motion
The Kalman Swarm: A New Approach to Particle Motion in Swarm Optimization
Christopher K. Monson and Kevin D. Seppi
Brigham Young University, Provo UT 84602, USA
c@ or kseppi@

Abstract. Particle Swarm Optimization is gaining momentum as a simple and effective optimization technique. We present a new approach to PSO that significantly reduces the number of iterations required to reach good solutions. In contrast with much recent research, the focus of this work is on fundamental particle motion, making use of the Kalman Filter to update particle positions. This enhances exploration without hurting the ability to converge rapidly to good solutions.

1 Introduction

Particle Swarm Optimization (PSO) is an optimization technique inspired by social behavior observable in nature, such as flocks of birds and schools of fish [1]. It is essentially a nonlinear programming technique suitable for optimizing functions with continuous domains (though some work has been done in discrete domains [2]), and has a number of desirable properties, including simplicity of implementation, scalability in dimension, and good empirical performance. It has been compared to evolutionary algorithms such as GAs (both in methodology and performance) and has performed favorably [3,4].

As an algorithm, it is an attractive choice for nonlinear programming because of the characteristics mentioned above. Even so, it is not without problems. PSO suffers from premature convergence, tending to get stuck in local minima [4–7]. We have also found that it suffers from an ineffective exploration strategy, especially around local minima, and thus does not find good solutions as quickly as it could. Moreover, adjusting the tunable parameters of PSO to obtain good performance can be a difficult task [7,8].

Research addressing the shortcomings of PSO is ongoing and includes such changes as dynamic or exotic sociometries [6,9–12], spatially extended particles that bounce [13], increased particle diversity [4,5], evolutionary selection mechanisms [14], and of course tunable
parameters in the velocity update equations [7,8,15]. Some work has been done that alters basic particle motion with some success, but the possibility for improvement in this area is still open [16].

This paper presents an approach to particle motion that significantly speeds the search for optima while simultaneously improving on the premature convergence problems that often plague PSO. The algorithm presented here, KSwarm, bases its particle motion on Kalman filtering and prediction. We compare the performance of KSwarm to that of the basic PSO model.

In the next section, the basic PSO algorithm is reviewed, along with an instructive alternative formulation of PSO and a discussion of some of its shortcomings. Unless otherwise specified, "PSO" refers to the basic algorithm as presented in that section. Section 3 briefly describes Kalman Filters, and Section 4 describes KSwarm in detail. Experiments and their results are contained in Section 5. Finally, conclusions and future research are addressed in Section 6.

2 The Basic PSO Algorithm

PSO is an optimization strategy generally employed to find a global minimum.
The basic PSO algorithm begins by scattering a number of "particles" in the function domain space. Each particle is essentially a data structure that keeps track of its current position x and its current velocity v. Additionally, each particle remembers the "best" (lowest valued) position it has obtained in the past, denoted p. The best of these values among all particles (the global best remembered position) is denoted g. At each time step, a particle updates its position and velocity by the following equations:

    v_{t+1} = χ [v_t + φ₁(p − x) + φ₂(g − x)]    (1)
    x_{t+1} = x_t + v_{t+1}                      (2)

The constriction coefficient χ = 0.729844 is due to Clerc and Kennedy [15] and serves to keep velocities from exploding. The stochastic scalars φ₁ and φ₂ are drawn from a uniform distribution over [0, 2.05) at each time step. Though other coefficients have been proposed in an effort to improve the algorithm [7,8], they will not be discussed here in detail.

2.1 An Alternative Motivation

Although the PSO update model initially evolved from simulated flocking and other natural social behaviors, it is instructive to consider an alternative motivation based on a randomized hill climbing search. A naive implementation may place a single particle in the function domain, then scatter a number of random sample points in the neighborhood, moving toward the best sample point at each new time step: x_{t+1} = g_t. If the particle takes this step by first calculating a velocity, the position is still given by (2) and the velocity update is given by

    v_{t+1} = g_t − x_t.    (3)

As this type of search rapidly becomes trapped in local minima, it is useful to randomly overshoot or undershoot the actual new location in order to do some directed exploration (after all, the value of the new location is already known).
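As a concrete illustration, one constricted update step per equations (1) and (2) might look like the following sketch. This is not the authors' code; it draws a single φ₁ and φ₂ per step (a scalar per particle rather than per dimension, one common reading of the equations):

```python
import random

CHI = 0.729844  # constriction coefficient due to Clerc and Kennedy [15]

def pso_step(x, v, p, g):
    """One constricted PSO update for a single particle, eqs. (1)-(2).

    x, v : current position and velocity (lists of floats)
    p, g : personal best and global best positions
    """
    phi1 = random.uniform(0.0, 2.05)   # stochastic scalars, drawn each step
    phi2 = random.uniform(0.0, 2.05)
    new_v = [CHI * (vi + phi1 * (pi - xi) + phi2 * (gi - xi))
             for xi, vi, pi, gi in zip(x, v, p, g)]
    new_x = [xi + vi for xi, vi in zip(x, new_v)]
    return new_x, new_v
```

A full optimizer would loop this step over all particles, re-evaluating the objective and refreshing p and g after each move.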
For similar reasons, it may be desirable to add momentum to the system, allowing particles to "roll out" of local minima. Choosing a suitable random scalar φ, this yields

    v_{t+1} = v_t + φ(g_t − x_t).    (4)

Equation (4) is strikingly similar to (1). In fact, it is trivial to reformulate the PSO update equation to be of the same form as (1) [15,12]. The fundamental difference between this approach and PSO is the way that g is calculated. In PSO, g is taken from other particles already in the system; in the approach described in this section, g is taken from disposable samples scattered in the neighborhood of a single particle.

This suggests that the basic PSO is a hill climber that uses existing information to reduce function evaluations. It is set apart more by its social aspects than by its motion characteristics, an insight supported by Kennedy but for different reasons [16].

2.2 Particle Motion Issues

Given that PSO is closely related to an approach as simple as randomized hill climbing, it is no surprise that attempts to improve the velocity update equation with various scaling terms have met with marginal success. Instead, more fundamental changes such as increased swarm diversity, selection, and collision-avoiding particles have shown the greatest promise [4,5,14].

Unfortunately, these methods are not without problems either, as they generally fail to reduce the iterations required to reach suitable minima. They focus primarily on eliminating stagnation, eventually finding better answers than the basic PSO without finding them any faster.

It has been pointed out that nonlinear programming is subject to a fundamental tradeoff between convergence speed and final fitness [4], suggesting that it is not generally possible to improve one without hurting the other. Fortunately, this tradeoff point has not yet been reached in the context of particle swarm optimization, as it is still possible to find good solutions more quickly without damaging final solution fitness. For example, the development of a PSO visualization
tool served to expose a particularly interesting inefficiency in the basic PSO algorithm. As the particles close in on g, they tend to lose their lateral momentum very quickly, each settling into a simple periodic linear motion as they repeatedly overshoot (and undershoot) the target. This exploration strategy around local minima is very inefficient, suggesting that a change to particle motion may speed the search by improving exploration.

Such a change should ideally preserve the existing desirable characteristics of the algorithm. PSO is essentially a social algorithm, which gives it useful emergent behavior. Additionally, PSO motion is stochastic, allowing for randomized exploration. Particles also have momentum, adding direction to the random search. The constriction coefficient indicates a need for stability. Alterations to particle motion should presumably maintain these properties, making the Kalman Filter a suitable choice.

3 The Kalman Filter

Kalman filters involve taking noisy observations over time and using model information to estimate the true state of the environment [17]. Kalman filtering is generally applied to motion tracking problems. It may also be used for prediction by applying the system transition model to the filtered estimate.

The Kalman Filter is limited to normal noise distributions and linear transition and sensor functions and is therefore completely described by several constant matrices and vectors. Specifically, given an observation column vector z_{t+1}, the Kalman Filter is used to generate a normal distribution over a belief about the true state. The parameters m_{t+1} and V_{t+1} of this multivariate distribution are determined by the following equations [18]:

    m_{t+1} = F m_t + K_{t+1}(z_{t+1} − H F m_t)                              (5)
    V_{t+1} = (I − K_{t+1} H)(F V_t Fᵀ + V_x)                                 (6)
    K_{t+1} = (F V_t Fᵀ + V_x) Hᵀ [H (F V_t Fᵀ + V_x) Hᵀ + V_z]⁻¹             (7)

In these equations, F and V_x describe the system transition model, while H and V_z describe the sensor model. The equations require a starting point for the filtered belief, represented by a normal
distribution with parameters m₀ and V₀, which must be provided. The filtered or "true" state is then represented by a distribution:

    x_t ∼ Normal(m_t, V_t).    (8)

This distribution may be used in more than one way. In some applications, the mean m_t is assumed to be the true value. In others, the distribution is sampled once to obtain the value. In this work, the latter is done.

The above describes how to do Kalman filtering, yielding m_t from an observation z_t. A simple form of prediction involves applying the transition model to obtain a belief about the next state m_{t+1}:

    m_{t+1} = F m_t.    (9)

There are other forms of prediction, but this simple approach is sufficient for the introduction of the algorithm in the next section, and for its use in particle swarms.

4 The Kalman Swarm (KSwarm)

KSwarm defines particle motion entirely from Kalman prediction. Each particle keeps track of its own m_t, V_t, and K_t. The particle then generates an observation for the Kalman filter with the following formulae:

    z_v = φ(g − x)    (10)
    z_p = x + z_v     (11)

Similar to PSO, φ is drawn uniformly from [0, 2), and the results are row vectors. The full observation vector is given by making a column vector out of the concatenated position and velocity row vectors: z = (z_p, z_v)ᵀ. This observation is then used to generate m_{t+1} and V_{t+1} using (5), (6), and (7). Once the filtered value is obtained, a prediction m_{t+2} is generated using (9). Together, m_{t+2} and V_{t+1} parameterize a normal distribution. We say, then, that

    x_{t+1} ∼ Normal(m_{t+2}, V_{t+1}).    (12)

The new state of the particle is obtained by sampling once from this distribution.
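The motion just described can be sketched as follows. This is not the authors' implementation: it simplifies the joint position/velocity state to an independent scalar filter per position dimension, with F = H = 1 and illustrative noise variances Vx and Vz (all assumptions), but it follows the same observe, filter, predict, and sample cycle of equations (5)–(12):

```python
import math, random

def kalman_step(m, V, z, F=1.0, H=1.0, Vx=0.1, Vz=0.5):
    """Scalar Kalman filter update, eqs. (5)-(7) with every matrix
    reduced to a scalar.  F, H, Vx, Vz are illustrative assumptions."""
    P = F * V * F + Vx                   # predicted variance  F V F^T + Vx
    K = P * H / (H * P * H + Vz)         # Kalman gain, eq. (7)
    m_new = F * m + K * (z - H * F * m)  # filtered mean, eq. (5)
    V_new = (1.0 - K * H) * P            # filtered variance, eq. (6)
    return m_new, V_new

def kswarm_move(x, g, m, V, rng=random.Random(0)):
    """One KSwarm-style move for one particle, simplified to a
    per-dimension scalar filter over positions only (the paper filters
    the joint position/velocity state).  x, g, m, V are equal-length
    lists: position, global best, filtered means, filtered variances."""
    phi = rng.uniform(0.0, 2.0)                    # as in eq. (10)
    z_v = [phi * (gi - xi) for gi, xi in zip(g, x)]
    z_p = [xi + zvi for xi, zvi in zip(x, z_v)]    # eq. (11)
    new_x, new_m, new_V = [], [], []
    for mi, Vi, zi in zip(m, V, z_p):
        mi2, Vi2 = kalman_step(mi, Vi, zi)
        pred = mi2                                 # eq. (9) with F = 1
        new_x.append(rng.gauss(pred, math.sqrt(Vi2)))  # sample, eq. (12)
        new_m.append(mi2)
        new_V.append(Vi2)
    return new_x, new_m, new_V
```

Even in this stripped-down form, the particle's momentum lives in the filter state (m, V) rather than in an explicit velocity term, which is why no constriction coefficient is needed.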
The position of the particle may be obtained from the first half of x_{t+1}, and the velocity (found in the remaining half) is unused.

This method for generating new particle positions has at least one immediately obvious advantage over the original approach: there is no need for a constriction coefficient. Particle momentum comes from the state maintained by the Kalman Filter rather than from the transition model. In our experiments, this eliminated the need for any explicit consideration of velocity explosion.

5 Experiments

KSwarm was compared to PSO in five common test functions: Sphere, DeJongF4, Rosenbrock, Griewank, and Rastrigin. The first three are unimodal while the last two are multimodal. In all experiments, the dimensionality d = 30. The definitions of the five functions are given here:

    Sphere(x)     = Σ_{i=1}^{d} x_i²                                    (13)
    DeJongF4(x)   = Σ_{i=1}^{d} i x_i⁴                                  (14)
    Rosenbrock(x) = Σ_{i=1}^{d−1} [100(x_{i+1} − x_i²)² + (x_i − 1)²]   (15)
    Rastrigin(x)  = Σ_{i=1}^{d} [x_i² + 10 − 10 cos(2π x_i)]            (16)
    Griewank(x)   = (1/4000) Σ_{i=1}^{d} x_i² − Π_{i=1}^{d} cos(x_i/√i) + 1    (17)

Table 1. Domains of Test Functions (partially recovered):
    Sphere: (−20, 20)^d
    Rosenbrock: (−600, 600)^d
    Rastrigin: —

Table 2. PSO vs. KSwarm Final Values (partially recovered):
    Function      PSO        KSwarm
    Sphere        370.041    4.609
    Rosenbrock    2.61e7     0.996
    Rastrigin     106.550    —

These results represent a clear and substantial improvement over the basic PSO, not only in the final solutions, but in the speed with which they are found. It should be noted that much research has been done to improve PSO in other ways and that KSwarm performance in comparison to these methods has not been fully explored. The purpose of this work is to demonstrate a novel approach to particle motion that substantially improves the basic algorithm. The comparison and potential combination of KSwarm with other PSO improvements is part of ongoing research and will be a subject of future work.

Fig. 1. Sphere: best value obtained versus iterations, Kalman vs. standard PSO.

5.3 Notes on Complexity

It is worth noting that the Kalman motion update equations require more computational resources than the original particle motion equations.
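The five test functions above translate directly into code; a minimal sketch, using the formulas as printed (so Rastrigin appears in its Σ(x² + 10 − 10 cos 2πx) form):

```python
import math

def sphere(x):
    return sum(xi * xi for xi in x)

def dejong_f4(x):
    # note the dimension-dependent weight i on each term
    return sum(i * xi ** 4 for i, xi in enumerate(x, start=1))

def rosenbrock(x):
    return sum(100.0 * (x[i + 1] - x[i] ** 2) ** 2 + (x[i] - 1) ** 2
               for i in range(len(x) - 1))

def rastrigin(x):
    return sum(xi * xi + 10.0 - 10.0 * math.cos(2.0 * math.pi * xi) for xi in x)

def griewank(x):
    s = sum(xi * xi for xi in x) / 4000.0
    p = math.prod(math.cos(xi / math.sqrt(i)) for i, xi in enumerate(x, start=1))
    return s - p + 1.0
```

All five have their global minimum value 0 at the origin, except Rosenbrock, whose minimum is at (1, ..., 1); this is a convenient sanity check for any optimizer driver.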
In fact, because of the matrix operations, the complexity is O(d³) in the number of dimensions (d = 30 in our experiments). The importance of this increased complexity, however, appears to diminish when compared to the apparent exponential improvement in the number of iterations required by the algorithm. Additionally, the complexity can be drastically reduced by using matrices that are mostly diagonal or by approximating the essential characteristics of Kalman behavior in a simpler way.

Fig. 2. DeJongF4: best value obtained versus iterations, Kalman vs. standard PSO.
Fig. 3. Rosenbrock: best value obtained versus iterations, Kalman vs. standard PSO.
Fig. 4. Griewank: best value obtained versus iterations, Kalman vs. standard PSO.
Fig. 5. Rastrigin: best value obtained versus iterations, Kalman vs. standard PSO.

6 Conclusions and Future Work

It remains to be seen how KSwarm performs against diversity-increasing approaches, but preliminary work indicates that it will do well in that arena, especially with regard to convergence speed. Since many methods which increase diversity do not fundamentally change particle motion update equations, combining this approach with those methods is simple. It can allow KSwarm to not only find solutions faster, but also to avoid the stagnation to which it is still prone.

Work remains to be done on alternative system transition matrices. The transition model chosen for the motion presented in this work is not the only possible model; other models may produce useful behaviors. Additionally, the complexity of the algorithm should be addressed. It is likely to be easy to improve by simple optimization of matrix manipulation, taking
advantage of the simplicity of the model. More work remains to be done in this area.

KSwarm fundamentally changes particle motion as outlined in PSO while retaining its key properties of sociality, momentum, exploration, and stability. It represents a substantial improvement over the basic algorithm not only in the resulting solutions, but also in the speed with which they are found.

References

1. Kennedy, J., Eberhart, R.C.: Particle swarm optimization. In: International Conference on Neural Networks, IV (Perth, Australia), Piscataway, NJ, IEEE Service Center (1995) 1942–1948
2. Kennedy, J., Eberhart, R.C.: A discrete binary version of the particle swarm algorithm. In: Proceedings of the World Multiconference on Systemics, Cybernetics, and Informatics, Piscataway, New Jersey (1997) 4104–4109
3. Kennedy, J., Spears, W.: Matching algorithms to problems: An experimental test of the particle swarm and some genetic algorithms on the multimodal problem generator. In: Proceedings of the IEEE Congress on Evolutionary Computation (CEC 1998), Anchorage, Alaska (1998)
4. Riget, J., Vesterstroem, J.S.: A diversity-guided particle swarm optimizer - the ARPSO. Technical Report 2002-02, Department of Computer Science, University of Aarhus (2002)
5. Løvbjerg, M.: Improving particle swarm optimization by hybridization of stochastic search heuristics and self-organized criticality. Master's thesis, Department of Computer Science, University of Aarhus (2002)
6. Richards, M., Ventura, D.: Dynamic sociometry in particle swarm optimization. In: International Conference on Computational Intelligence and Natural Computing (2003)
7. Vesterstroem, J.S., Riget, J., Krink, T.: Division of labor in particle swarm optimisation. In: Proceedings of the IEEE Congress on Evolutionary Computation (CEC 2002), Honolulu, Hawaii (2002)
8. Shi, Y., Eberhart, R.C.: Parameter selection in particle swarm optimization. In: Evolutionary Programming VII: Proceedings of the Seventh Annual Conference on Evolutionary Programming, New York (1998) 591–600
9. Kennedy, J., Mendes, R.: Population structure and particle
swarm performance. In: Proceedings of the Congress on Evolutionary Computation (CEC 2002), Honolulu, Hawaii (2002)
10. Kennedy, J., Mendes, R.: Neighborhood topologies in fully-informed and best-of-neighborhood particle swarms. In: Proceedings of the 2003 IEEE SMC Workshop on Soft Computing in Industrial Applications (SMCia03), Binghamton, New York, IEEE Computer Society (2003)
11. Kennedy, J.: Small worlds and mega-minds: Effects of neighborhood topology on particle swarm performance. In Angeline, P.J., Michalewicz, Z., Schoenauer, M., Yao, X., Zalzala, Z., eds.: Proceedings of the Congress of Evolutionary Computation. Volume 3, IEEE Press (1999) 1931–1938
12. Mendes, R., Kennedy, J., Neves, J.: Watch thy neighbor or how the swarm can learn from its environment. In: Proceedings of the IEEE Swarm Intelligence Symposium 2003 (SIS 2003), Indianapolis, Indiana (2003) 88–94
13. Krink, T., Vestertroem, J.S., Riget, J.: Particle swarm optimisation with spatial particle extension. In: Proceedings of the IEEE Congress on Evolutionary Computation (CEC 2002), Honolulu, Hawaii (2002)
14. Angeline, P.J.: Using selection to improve particle swarm optimization. In: Proceedings of the IEEE Congress on Evolutionary Computation (CEC 1998), Anchorage, Alaska (1998)
15. Clerc, M., Kennedy, J.: The particle swarm: Explosion, stability, and convergence in a multidimensional complex space. IEEE Transactions on Evolutionary Computation 6 (2002) 58–73
16. Kennedy, J.: Bare bones particle swarms. In: Proceedings of the IEEE Swarm Intelligence Symposium 2003 (SIS 2003), Indianapolis, Indiana (2003) 80–87
17. Kalman, R.E.: A new approach to linear filtering and prediction problems. Transactions of the ASME - Journal of Basic Engineering 82 (1960) 35–45
18. Russell, S., Norvig, P.: Artificial Intelligence: A Modern Approach. Second edn. Prentice Hall, Englewood Cliffs, New Jersey (2003)
C. N. Yang's Nobel Prize Lecture
CHEN NING YANG
The law of parity conservation and other symmetry laws of physics
Nobel Lecture, December 11, 1957

It is a pleasure and a great privilege to have this opportunity to discuss with you the question of parity conservation and other symmetry laws. We shall be concerned first with the general aspects of the role of the symmetry laws in physics; second, with the development that led to the disproof of parity conservation; and last, with a discussion of some other symmetry laws which physicists have learned through experience, but which do not yet together form an integral and conceptually simple pattern. The interesting and very exciting developments since parity conservation was disproved will be covered by Dr. Lee in his lecture1.

I

The existence of symmetry laws is in full accordance with our daily experience. The simplest of these symmetries, the isotropy and homogeneity of space, are concepts that date back to the early history of human thought. The invariance of physical laws under a coordinate transformation of uniform velocity, also known as the invariance under Galilean transformations, is a more sophisticated symmetry that was early recognized, and formed one of the corner-stones of Newtonian mechanics. Consequences of these symmetry principles were greatly exploited by physicists of the past centuries and gave rise to many important results. A good example in this direction is the theorem that in an isotropic solid there are only two elastic constants.

Another type of consequence of the symmetry laws relates to the conservation laws. It is common knowledge today that in general a symmetry principle (or equivalently an invariance principle) generates a conservation law. For example, the invariance of physical laws under space displacement has as a consequence the conservation of momentum; the invariance under space rotation has as a consequence the conservation of angular momentum.
While the importance of these conservation laws was fully understood, their close relationship with the symmetry laws seemed not to have been clearly recognized until the beginning of the twentieth century2. (Cf. Fig. 1.)

With the advent of special and general relativity, the symmetry laws gained new importance. Their connection with the dynamic laws of physics takes on a much more integrated and interdependent relationship than in classical mechanics, where logically the symmetry laws were only consequences of the dynamical laws that by chance possess the symmetries. Also in the relativity theories the realm of the symmetry laws was greatly enriched to include invariances that were by no means apparent from daily experience. Their validity rather was deduced from, or was later confirmed by, complicated experimentation. Let me emphasize that the conceptual simplicity and intrinsic beauty of the symmetries that so evolve from complex experiments are for the physicists great sources of encouragement. One learns to hope that Nature possesses an order that one may aspire to comprehend.

It was, however, not until the development of quantum mechanics that the use of the symmetry principles began to permeate into the very language of physics. The quantum numbers that designate the states of a system are often identical with those that represent the symmetries of the system. It indeed is scarcely possible to overemphasize the role played by the symmetry principles in quantum mechanics. To quote two examples: The general structure of the Periodic Table is essentially a direct consequence of the isotropy of Coulomb's law. The existence of the antiparticles - namely the positron, the antiproton, and the antineutron - were theoretically anticipated as consequences of the symmetry of physical laws with respect to Lorentz transformations. In both cases Nature seems to take advantage of the simple mathematical representations of the symmetry laws.
When one pauses to consider the elegance and the beautiful perfection of the mathematical reasoning involved, and contrasts it with the complex and far-reaching physical consequences, a deep sense of respect for the power of the symmetry laws never fails to develop.

One of the symmetry principles, the symmetry between the left and the right, is as old as human civilization. The question whether Nature exhibits such symmetry was debated at length by philosophers of the past3. Of course, in daily life, left and right are quite distinct from each other. Our hearts, for example, are on our left sides. The language that people use both in the orient and the occident carries even a connotation that right is good and left is evil. However, the laws of physics have always shown complete symmetry between the left and the right, the asymmetry in daily life being attributed to the accidental asymmetry of the environment, or initial conditions in organic life. To illustrate the point, we mention that if there existed a mirror-image man with his heart on his right side, his internal organs reversed compared to ours, and in fact his body molecules, for example sugar molecules, the mirror image of ours, and if he ate the mirror image of the food that we eat, then according to the laws of physics, he should function as well as we do.

The law of right-left symmetry was used in classical physics, but was not of any great practical importance there. One reason for this derives from the fact that right-left symmetry is a discrete symmetry, unlike rotational symmetry which is continuous. Whereas the continuous symmetries always lead to conservation laws in classical mechanics, a discrete symmetry does not. With the introduction of quantum mechanics, however, this difference between the discrete and continuous symmetries disappears.
The law of right-left symmetry then leads also to a conservation law: the conservation of parity. The discovery of this conservation law dates back to 1924 when Laporte [4] found that energy levels in complex atoms can be classified into « gestrichene » and « ungestrichene » types, or in more recent language, even and odd levels. In transitions between these levels during which one photon is emitted or absorbed, Laporte found that the level always changes from even to odd or vice versa. Anticipating later developments, we remark that the evenness or oddness of the levels was later referred to as the parity of the levels. Even levels are defined to have parity +1, odd levels parity -1. One also defines the photon emitted or absorbed in the usual atomic transitions to have odd parity. Laporte's rule can then be formulated as the statement that in an atomic transition with the emission of a photon, the parity of the initial state is equal to the total parity of the final state, i.e. the product of the parities of the final atomic state and the photon emitted. In other words, parity is conserved, or unchanged, in the transition.

In 1927 Wigner [5] took the critical and profound step to prove that the empirical rule of Laporte is a consequence of the reflection invariance, or right-left symmetry, of the electromagnetic forces in the atom. This fundamental idea was rapidly absorbed into the language of physics. Since right-left symmetry was unquestioned also in other interactions, the idea was further taken over into new domains as the subject matter of physics extended into nuclear reactions and the physics of the new particles. Against this background the ϑ-τ puzzle developed in the last few years. Before explaining the meaning of this puzzle it is best to go a little bit into a classification of the forces that act between subatomic particles, a classification which the physicists have learned through experience to use in the last 50 years. We list the four classes of interactions below.
The strength of these interactions is indicated in the column on the right. The strongest interactions are the nuclear interactions, which include the forces that bind nuclei together and the interaction between the nuclei and the π mesons. The weak interactions were first observed early in this century in the β-radioactivity of nuclei, a phenomenon which especially in the last 25 years has been extensively studied experimentally. With the discovery of µ decay and µ capture it was noticed independently [6] by Klein, by Tiomno and Wheeler, and by Lee, Rosenbluth and me, that these interactions have roughly the same strengths as β-interactions. They are called weak interactions, and in the last few years their rank has been constantly added to through the discovery of many other weak interactions responsible for the decay of the strange particles. The consistent and striking pattern of their almost uniform strength remains today one of the most tantalizing phenomena, a topic to which we shall come back later. About the last class of forces, the gravitational forces, we need only mention that in atomic and nuclear interactions they are so weak as to be completely negligible in all the observations with existing techniques.

Now to return to the ϑ and τ mesons. Their decays supply some information about the spins and parities of the ϑ and τ [7]: since the ϑ decays into two π mesons, it must have the total parity, or in other words, the product parity, of two π mesons, which is even (i.e. = +1). Similarly, the τ meson, which decays into three π mesons, must have the total parity of three π mesons, which is odd. Actually, because of the relative motion of the π mesons the argument was not as simple and unambiguous as we just discussed. To render the argument conclusive and definitive it was necessary to study experimentally the momentum and angular distribution of the π mesons.
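The parity bookkeeping behind both arguments above (Laporte's rule for one-photon transitions, and the product parity of an n-pion state) can be sketched in a few lines. This is a toy illustration using only the conventions stated in the text; as the text itself cautions, the orbital motion of the pions is ignored here:

```python
# A minimal sketch of the parity bookkeeping discussed above.
# Conventions from the text: even atomic levels have parity +1, odd levels -1,
# the emitted photon carries odd parity (-1), and the pi meson contributes an
# intrinsic parity -1 (orbital contributions are ignored in this sketch).

PHOTON_PARITY = -1
PION_PARITY = -1

def laporte_allowed(parity_initial: int, parity_final: int) -> bool:
    """Laporte's rule: initial parity equals the product of final parities."""
    return parity_initial == parity_final * PHOTON_PARITY

def n_pion_parity(n: int) -> int:
    """Product parity of an n-pion state (intrinsic parities only)."""
    return PION_PARITY ** n

# Atomic transitions: even <-> odd is allowed, even <-> even is forbidden.
print(laporte_allowed(+1, -1), laporte_allowed(+1, +1))  # True False

# theta -> 2 pi is even (+1); tau -> 3 pi is odd (-1): opposite parities.
print(n_pion_parity(2), n_pion_parity(3))  # 1 -1
```

The last two lines restate the puzzle in arithmetic form: if ϑ and τ were one particle, it would need parity +1 and -1 at the same time.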
Such studies were made in many laboratories, and by the spring of 1956 the accumulated experimental data seemed to unambiguously indicate, along the lines of reasoning discussed above, that ϑ and τ do not have the same parity, and consequently are not the same particle. This conclusion, however, was in marked contradiction with other experimental results which also became definite at about the same time. The contradiction was known as the ϑ-τ puzzle and was widely discussed. To recapture the atmosphere of that time allow me to quote a paragraph concerning this conclusion from a report entitled « Present Knowledge about the New Particles » which I gave at the International Conference on Theoretical Physics [8] in Seattle, in September 1956.

« However it will not do to jump to hasty conclusions. This is because experimentally the K mesons (i.e. τ and ϑ) seem all to have the same masses and the same lifetimes. The masses are known to an accuracy of, say, from 2 to 10 electron masses, or a fraction of a percent, and the lifetimes are known to an accuracy of, say, 20 percent. Since particles which have different spin and parity values, and which have strong interactions with the nucleons and pions, are not expected to have identical masses and lifetimes, one is forced to keep the question open whether the inference mentioned above that the ϑ and τ are not the same particle is conclusive. Parenthetically, I might add that the inference would certainly have been regarded as conclusive, and in fact more well-founded than many inferences in physics, had it not been for the anomaly of mass and lifetime degeneracies. »

The situation that the physicist found himself in at that time has been likened to a man in a dark room groping for an outlet. He is aware of the fact that in some direction there must be a door which would lead him out of his predicament.
But in which direction? That direction turned out to lie in the faultiness of the law of parity conservation for the weak interactions. But to uproot an accepted concept one must first demonstrate why the previous evidence in its favor was insufficient. Dr. Lee and I [9] examined this question in detail, and in May 1956 we came to the following conclusions: (A) Past experiments on the weak interactions had actually no bearing on the question of parity conservation. (B) In the strong interactions, i.e. interactions of classes 1 and 2 discussed above, there were indeed many experiments that established parity conservation to a high degree of accuracy, but not to a sufficiently high degree to be able to reveal the effects of a lack of parity conservation in the weak interactions.

The fact that parity conservation in the weak interactions was believed for so long without experimental support was very startling. But what was more startling was the prospect that a space-time symmetry law which the physicists have learned so well may be violated. This prospect did not appeal to us. Rather we were, so to speak, driven to it through frustration with the various other efforts at understanding the ϑ-τ puzzle. An approximate symmetry law was, however, not expected of the symmetries related to space and time. In fact one is tempted to speculate, now that parity conservation is found to be violated in the weak interactions, whether in the description of such phenomena the usual concept of space and time is adequate. At the end of our discussion we shall have the occasion to come back to a closely related topic.

Why was it so that among the multitude of experiments on the weak interactions the nonconservation of parity had escaped detection? The reason is that none of them measured a quantity that changes sign under a mirror reflection; to decide the question one must study, for example, the angular distribution of the β-rays emitted by oriented nuclei. This experiment was first performed in the latter half of 1956 and finished early this year by Wu, Ambler, Hayward, Hoppes, and Hudson [12].
The actual experimental setup was very involved, because to eliminate disturbing outside influences the experiment had to be done at very low temperatures. The technique of combining β-decay measurement with low temperature apparatus was unknown before and constituted a major difficulty which was successfully solved by these authors. To their courage and their skill, physicists owe the exciting and clarifying developments concerning parity conservation in the past year. The experiment established a clear asymmetry in the angular distribution of the β-rays emitted by oriented nuclei of cobalt. Very rapidly after these results were made known, many experiments were performed which further demonstrated the violation of parity conservation in various weak interactions. In his lecture Dr. Lee will discuss these interesting and important developments.

III

The breakdown of parity conservation brings into focus a number of questions concerning symmetry laws in physics which we shall now briefly discuss in general terms:

(A) As Dr. Lee [1] will discuss, the experiment of Wu, Ambler, and their collaborators also proves [13,14] that charge conjugation invariance [15] is violated for the weak interactions. The three discrete invariances (reflection invariance, charge conjugation invariance, and time reversal invariance) are connected by an important theorem [17] called the CPT theorem. Through the use of this theorem one can prove [13] a number of general results concerning the experimental manifestations of the possible violations of the three symmetries in the weak interactions.

Of particular interest is the possibility that time reversal invariance in the weak interactions may turn out to be intact. If this is the case, it follows from the CPT theorem that although parity conservation breaks down, right-left symmetry will still hold if [18] one switches all particles into antiparticles in taking a mirror image. In terms of Fig.
2 this means that if one changes all the matter that composes the apparatus at the right into anti-matter, the meter reading would become the same for the two sides if time reversal invariance holds. It is important to notice that in the usual definition of reflection the electric field is a vector and the magnetic field a pseudovector, while in this changed definition their transformation properties are switched. The transformation properties of the electric charge and the magnetic charge are also interchanged. It would be interesting to speculate on the possible relationship between the nonconservation of parity and the symmetrical or unsymmetrical role played by the electric and magnetic fields.

The question of the validity of the continuous space-time symmetry laws has been discussed to some extent in the past year. There is good evidence that these symmetry laws do not break down in the weak interactions.

(B) Another symmetry law that has been widely discussed is that giving rise to the conservation of isotopic spin [19]. In recent years the use of this symmetry law has produced a remarkable empirical order among the phenomena concerning the strange particles [20]. It is however certainly the least understood of all the symmetry laws. Unlike Lorentz invariance or reflection invariance, it is not a « geometrical » symmetry law relating to space-time invariance properties. Unlike charge conjugation invariance [21] it does not seem to originate from the algebraic property of the complex numbers that occurs in quantum mechanics. In these respects it resembles the conservation laws of charge and heavy particles. These latter laws, however, are exact, while the conservation of isotopic spin is violated upon the introduction of electromagnetic interactions and weak interactions.
An understanding of the origin of the conservation of isotopic spin and how to integrate it with the other symmetry laws is undoubtedly one of the outstanding problems in high-energy physics today.

(C) We have mentioned before that all the different varieties of weak interactions share the property of having very closely identical strengths. The experimental work on parity nonconservation in the past year reveals that they very likely also share the property of not respecting parity conservation and charge conjugation invariance. They therefore serve to differentiate between right and left once one fixes one's definition of matter vs. anti-matter. One could also use the weak interactions to differentiate between matter and anti-matter once one chooses a definition of right vs. left. If time reversal invariance is violated, the weak interactions may even serve to differentiate simultaneously right from left, and matter from anti-matter. One senses herein that maybe the origin of the weak interactions is intimately tied in with the question of the differentiability of left from right, and of matter from anti-matter.

1. T. D. Lee, Nobel Lecture, this volume, p. 406.
2. For references to these developments see E. P. Wigner, Proc. Am. Phil. Soc., 93 (1949) 521.
3. Cf. the interesting discussion on bilateral symmetry by H. Weyl, Symmetry, Princeton University Press, 1952.
4. O. Laporte, Z. Physik, 23 (1924) 135.
5. E. P. Wigner, Z. Physik, 43 (1927) 624.
6. O. Klein, Nature, 161 (1948) 897; J. Tiomno and J. A. Wheeler, Rev. Mod. Phys., 21 (1949) 144; T. D. Lee, M. Rosenbluth, and C. N. Yang, Phys. Rev., 75 (1949) 905.
7. R. Dalitz, Phil. Mag., 44 (1953) 1068; E. Fabri, Nuovo Cimento, 11 (1954) 479.
8. C. N. Yang, Rev. Mod. Phys., 29 (1957) 231.
9. T. D. Lee and C. N. Yang, Phys. Rev., 104 (1956) 254.
10. T. D. Lee and J. Orear, Phys. Rev., 100 (1955) 932; T. D. Lee and C. N. Yang, Phys. Rev., 102 (1956) 290; M. Gell-Mann, (unpublished); R. Weinstein, (private communication); a general discussion of these ideas can be found in the Proceedings of the Rochester Conference, April 1956, Session VIII, Interscience, New York, 1957.
11. C. N. Yang and J. Tiomno, Phys. Rev., 79 (1950) 495.
12. C. S. Wu, E. Ambler, R. W. Hayward, D. D. Hoppes, and R. P. Hudson, Phys. Rev., 105 (1957) 1413.
13. T. D. Lee, R. Oehme, and C. N. Yang, Phys. Rev., 106 (1957) 340.
14. B. L. Ioffe, L. B. Okun, and A. P. Rudik, J.E.T.P. (U.S.S.R.), 32 (1957) 396. English translation in Soviet Phys. JETP, 5 (1957) 328.
15. Charge conjugation invariance is very intimately tied with the hole theory interpretation of Dirac's equation. The development of the latter originated with P. A. M. Dirac, Proc. Roy. Soc. London, A126 (1930) 360; J. R. Oppenheimer, Phys. Rev., 35 (1930) 562; and H. Weyl, Gruppentheorie und Quantenmechanik, 2nd ed., 1931, p. 234. An account of these developments is found in P. A. M. Dirac, Proc. Roy. Soc. London, A133 (1931) 60. Detailed formalism and application of charge conjugation invariance started with H. A. Kramers, Proc. Acad. Sci. Amsterdam, 40 (1937) 814 and W. Furry, Phys. Rev., 51 (1937) 125.
16. E. P. Wigner, Nachr. Akad. Wiss. Goettingen, Math.-Physik., 1932, p. 546. This paper explains in terms of time reversal invariance the earlier work of H. Kramers, Proc. Acad. Sci. Amsterdam, 33 (1930) 959.
17. J. Schwinger, Phys. Rev., 91 (1953) 720, 723; G. Lüders, Kgl. Danske Videnskab. Selskab., Mat.-Fys. Medd., 28, No. 5 (1954); W. Pauli's article in Niels Bohr and the Development of Physics, Pergamon Press, London, 1955. See also Ref. 21.
18. This possibility was discussed by T. D. Lee and C. N. Yang and reported by C. N. Yang at the International Conference on Theoretical Physics in Seattle in September 1956. (See Ref. 8.) Its relation with the CPT theorem was also reported in the same conference in one of the discussion sessions. The speculation was later published in T. D. Lee and C. N. Yang, Phys. Rev., 105 (1957) 1671. Independently the possibility has been advanced as the correct one by L. Landau, J.E.T.P. (U.S.S.R.), 32 (1957) 405. An English translation of Landau's article appeared in Soviet Phys. JETP, 5 (1957) 336.
19. The concept of a total isotopic spin quantum number was first discussed by B. Cassen and E. U. Condon, Phys. Rev., 50 (1936) 846 and E. P. Wigner, Phys. Rev., 51 (1937) 106. The physical basis derived from the equivalence of p-p and n-p forces, pointed out by G. Breit, E. U. Condon, and R. D. Present, Phys. Rev., 50 (1936) 825. The isotopic spin was introduced earlier as a formal mathematical parameter by W. Heisenberg, Z. Physik, 77 (1932) 1.
20. A. Pais, Phys. Rev., 86 (1952) 663, introduced the idea of associated production of strange particles. An explanation of this phenomenon in terms of isotopic spin conservation was pointed out by M. Gell-Mann, Phys. Rev., 92 (1953) 833 and by K. Nishijima, Progr. Theoret. Phys. (Kyoto), 12 (1954) 107. These latter authors also showed that isotopic spin conservation leads to a convenient quantum number called strangeness.
21. R. Jost, Helv. Phys. Acta, 30 (1957) 409.
Subthreshold Antiproton Spectra in Relativistic Heavy Ion Collisions
TPR-95-19

Richard Wittmann and Ulrich Heinz
Institut für Theoretische Physik, Universität Regensburg, D-93040 Regensburg, Germany
(February 1, 2008)

Abstract

We study the structure of antiproton spectra at extreme subthreshold bombarding energies using a thermodynamic picture. Antiproton production processes and final state interactions are discussed in detail in order to find out what can be learned about these processes from the observed spectra.

I. INTRODUCTION

There exist numerous examples for the production of particles in heavy ion collisions at bombarding energies well below the single nucleon-nucleon threshold [1]. This phenomenon indicates collective interactions among the many participating nucleons and thus is expected to give information about the hot and dense matter formed in these collisions. At beam energies around 1 GeV per nucleon the most extreme of these subthreshold particles is the antiproton. It is therefore believed to be a very sensitive probe of collective behaviour in nucleus-nucleus collisions. However, presently neither the production mechanism nor the final state interactions of antiprotons in dense nucleonic matter are well understood. The antiproton yields measured at GSI and BEVALAC [2,3] seem to be described equally well by various microscopic models using different assumptions about the production mechanism and particle properties in dense nuclear matter [4-7]. This ambiguity raises the question which kind of information can really be deduced from subthreshold p̄ spectra.

In this paper we use a simple thermodynamic framework as a background on which we can systematically study the influence of different assumptions on the final p̄ spectrum. In the next section we will focus on the production process. Following a discussion of the final state interactions of the antiproton in dense hadronic matter in Section III, we use in Section IV a
one-dimensional hydrodynamic model for the exploding fireball to clarify which features of the production and reabsorption mechanisms should survive in the final spectra in a dynamical environment. We summarize our results in Section V.

II. PRODUCTION OF ANTIPROTONS IN HEAVY ION COLLISIONS

A. The Antiproton Production Rate

Unfortunately very little is known about the production mechanism for antiprotons in dense nuclear matter. Therefore, we are forced to use intuitive arguments to obtain a plausible expression for the production rate. As commonly done in microscopic models [8] we consider only two body collisions and take the experimentally measured cross sections for p̄ production in free NN collisions as input. The problem can then be split into two parts: the distribution of the two colliding nucleons in momentum space and the elementary cross sections for antiproton production. The procedure is later generalized to collisions among other types of particles (Section II C) using phase space arguments.

Formally, the antiproton production rate P, i.e. the number of antiprotons produced in the space-time cell d^4x and momentum space element d^3p, is given by [9]

  P = \sum_{i,j} \int ds\, 2\, w(s, m_i, m_j)\, \frac{d^3\sigma_{ij\to\bar p}}{d^3p}(s) ,   (1)

where the colliding particles are described by covariant local thermal distributions

  f_i(x, p_i) = \frac{g_i}{(2\pi)^3}\, e^{-u\cdot p_i/T}\, \delta(p_i^2 - m_i^2)\, \Theta(p_i^0) .   (2)

Finally, w(s, m_i, m_j) is the probability of finding two colliding particles i and j at a center-of-mass energy \sqrt{s}. For the total production cross section we use a threshold parametrization of the form

  \sigma_{ij\to\bar p}(s) = \sigma^0_{ij}\, (\sqrt{s} - 4m)^\alpha .   (3)

In order to obtain from the total cross section (3) a formula for the differential cross section, we assume like others that the momentum distribution of the produced particles is mainly governed by phase space. This leads to the simple relationship

  \frac{d^3\sigma_{ij\to NNN\bar p}}{d^3p} = \frac{R_3(P - p)}{R_4(P)}\, \sigma_{ij\to NNN\bar p}(s) .   (4)

Here R_n is the volume of the n-particle phase space, which can be given analytically in the non-relativistic limit [12]. P is the total 4-momentum of the 4-particle final state. The rate is proportional to the densities \rho_i, \rho_j of the incoming particles and, because of the steeply falling thermal weight, essentially controlled by the Boltzmann factor \exp(-\sqrt{s_{min}(p)}/T), where \sqrt{s_{min}(p)} is the minimal invariant collision energy needed to produce an antiproton with momentum p. We assume that the cross section \sigma_{ij\to NNN\bar p}(s) is independent of the internal state of excitation of the colliding baryons in our thermal picture. The consequences of this assumption are quite
obvious. While the distance to the p̄ threshold is reduced by the larger rest mass of the resonances, the mean velocity of a heavy resonance state in a thermal system is smaller than that of a nucleon. Both factors counteract each other, and indeed we found that the total rate P is not strongly changed by the inclusion of resonances.

The role of pionic intermediate states for p̄ production in pp collisions was pointed out by Feldman [15]. As mesons are created numerously in the course of a heavy ion collision, mesonic states gain even more importance in this case. In fact, Ko and Ge [13] claimed that ρρ → p p̄ should be the dominant production channel. Relating the ρρ production channel to the p p̄ annihilation channel [13] by detailed balance,

  \sigma_{\rho\rho\to p\bar p}(s) \propto \frac{s - 4m^2}{s - 4m_\rho^2}\, \sigma_{p\bar p\to\rho\rho}(s) ,   (5)

the production rate can be calculated straightforwardly from Eq. (1). Apart from the factor g_\rho^2/(2\pi)^5, where g_\rho = 9 is the spin-isospin degeneracy of the ρ (spin S = 1), the resulting rate (6) is governed by the modified Bessel function K_1(2E/T), with E the energy of the produced antiproton. The modified Bessel function K_1 results from the assumption of local thermal equilibration for the ρ distribution. Expanding the Bessel function for large values of 2E/T we see that the « temperature » T_{\bar p} of the p̄ spectrum is only half the medium temperature:

  T_{\bar p} = \frac{T}{2} .   (7)

For the meson-baryon channel we assume a production cross section with the threshold behaviour

  \sigma_{ij\to\bar p}(s) \propto \frac{\sigma^0_{ij}}{w^2(s, m_i, m_j)} .   (8)

Comparing this form with measured data on \pi^- p \to n p \bar p collisions [16], a value of \sigma^0_{ij} = 0.35 mb is obtained. Due to the threshold behaviour of Eq. (8) and the rather large value of \sigma^0_{ij}, it turns out [17] that this process is by far the most important one in a chemically equilibrated system. However, this chemical equilibration, if achieved at all, is reached only in the final stages of the heavy ion collision when cooling has already started. So it is by no means clear whether the dominance of the meson-baryon channel remains valid in a realistic collision scenario. This point will be further discussed in Section IV.

III. FINAL STATE INTERACTION OF THE ANTIPROTON

Once an antiproton is created in the hot and dense hadronic medium, its state will be
modified by interactions with the surrounding particles. Two fundamentally different cases have to be distinguished: elastic scattering, which leads to a reconfiguration in phase space, driving the momentum distribution towards a thermal one with the temperature of the surrounding medium, and annihilation. Each process will be considered in turn.

A. Elastic Scattering

The time evolution of the distribution function f(p,t) is generally described by the equation [18]

  f(p, t_2) = \int w(p, p'; t_2, t_1)\, f(p', t_1)\, dp' ,   (9)

where w(p, p'; t_2, t_1) is the transition probability from momentum state p' at time t_1 to state p at t_2. Because the number density of antiprotons is negligible compared to the total particle density in the system, the evolution of f(p,t) can be viewed as a Markov process. Assuming furthermore that the duration of a single scattering process τ and the mean free path λ are small compared to the typical time scale δt and length scale δr which measure the variation of the thermodynamic properties of the system, τ ≪ δt, λ ≪ δr, Eq. (9) can be transformed into a master equation. Considering the structure of the differential p p̄ cross section one notices that in the interesting energy range it is strongly peaked in the forward direction [19-21]. Therefore, the master equation can be approximated by a Fokker-Planck equation [22]:

  \frac{\partial f(p,t)}{\partial t} = \frac{\partial}{\partial p}\left[ A(p)\, f(p,t) \right] + \frac{\partial^2}{\partial p^2}\left[ D(p)\, f(p,t) \right] .   (10)

For the evaluation of the friction coefficient A and the diffusion coefficient D we follow the treatment described by Svetitsky [23], taking for the differential cross section a forward-peaked form suggested in [19]. For constant coefficients A and D, which in equilibrium are related by the Einstein relation D = mAT (12), an initially thermal spectrum with temperature T_0 remains of Maxwell-Boltzmann form at all times:

  f(p,t) \propto \exp\!\left( \frac{-p^2}{(2D/A)(1 - e^{-2At}) + 2mT_0\, e^{-2At}} \right) \equiv \exp\!\left( -p^2 / 2mT_{eff}(t) \right) .   (14)

This shows that the exponential shape of the distribution function is maintained throughout the time evolution, but that the slope T_eff(t) gradually evolves from T_0 to the value D/mA which, according to the Einstein relation (12), is the medium temperature T. Looking at Fig. 3 it is clear that after about 10 fm/c the spectrum is practically thermalized. Therefore initial structures of the production spectrum (like the
ones seen in Fig. 2) are washed out quite rapidly, and their experimental observation will be very difficult.

B. Annihilation

The annihilation of antiprotons with baryons is dominated by multi-meson final states X. For the parametrisation of the annihilation cross section σ_ann(s) we take the form given in [14] for the process

  \bar p + B \longrightarrow X, \qquad B = N, \Delta, \ldots   (15)

Using the same philosophy as for the calculation of the production rate, a simple differential equation for the decrease of the antiproton density in phase space can be written down:

  \frac{d}{dt} \frac{d^6 N}{d^3x\, d^3p} = - \left[ \sum_i \int \frac{d^3 p_i}{(2\pi)^3}\, f_i(x, p_i, t)\, v_{i\bar p}\, \sigma^{ann}_{i\bar p} \right] \frac{d^6 N}{d^3x\, d^3p} \equiv - \frac{1}{\tau_{ann}} \frac{d^6 N}{d^3x\, d^3p} .   (16)

At very high densities, however, this purely classical treatment of the annihilation process becomes questionable. More reliable results should be based on a quantum field theoretic calculation which is beyond the scope of this paper.

IV. ANTIPROTON SPECTRA FROM AN EXPLODING FIREBALL

A. A Model for the Heavy Ion Collision

In order to compare the results of the two previous sections with experimental data we connect them through a dynamical model for the heavy ion reaction. In the spirit of our thermodynamic approach the so-called hadrochemical model of Montvay and Zimanyi [24] is applied for the simulation of the heavy ion collision. In this picture the reaction is split into two phases, an ignition and an explosion phase, and particles which have at least scattered once are assumed to follow a local Maxwell-Boltzmann distribution. In addition, a spherically symmetric geometry is assumed for the explosion phase. The included particle species are nucleons, Δ-resonances, pions and ρ-mesons. As initial condition a Fermi-type density distribution, ρ(r) = ρ_0 / (1 + e^{(r-R)/d}), is taken for the nucleons of the incoming nuclei. The resulting phase-space distribution is shown for a collision with momentum p_z = ±700 MeV in the c.m. system. One clearly sees that at the moment of full overlap of the two nuclei a dense zone with hot nucleons, resonances and mesons (not shown) has been formed. Only in the peripheral regions « cold » target and projectile nucleons can still be found. On the other hand, chemical equilibrium of the hot collision zone is not reached in the short available time before the explosion phase sets
in; pions and in particular ρ mesons remain far below their equilibrium abundances [17]. It is important to note that due to the arguments given in Section II, p̄ production is strongly suppressed in this initial stage of the reaction. In our simple model we have in fact neglected this early p̄ production completely. The ignition phase is only needed to obtain the chemical composition of the hot fireball which is expected to be the dominant source for the creation of antiprotons.

For the subsequent expansion of the spherically symmetric fireball into the surrounding vacuum analytical solutions can be given if the equation of state of an ideal gas is taken as input [25]. If excited states are included in the model, an exact analytical solution is no longer possible. Because a small admixture of resonances is not expected to fundamentally change the dynamics of the system, we can account for their effect in first order by adjusting only the thermodynamic parameters of the exploding fireball, but not the expansion velocity profiles.

There is one free parameter in the model [24], α, which controls the density and the temperature profiles, respectively. Small values α → 0 represent δ-function-like density profiles, whereas α → ∞ corresponds to a homogeneous density distribution throughout the fireball (square well profile). The time-dependent temperature profiles for two representative values of α are shown in Fig. 5 for different times t starting at the time t_m of full overlap of the nuclei. Clearly, a small value of α leads to an unreasonably high temperature (T ∼ 200 MeV) in the core of the fireball at the beginning of the explosion phase, and should thus be considered unphysical.

B. Antiproton Spectra from an Exploding Fireball

Based on the time-dependent chemical composition of this hadrochemical model we can calculate the spectrum of the antiprotons created in a heavy ion collision. Let us first concentrate on the influence of the density distribution in the fireball characterized by the shape parameter α. Due to
different temperature profiles connected with different α values (see Fig. 5) the absolute normalization varies substantially when the density profile is changed. For more δ-like shapes an extremely hot fireball core is generated, whereas for increasing α both density and temperature are more and more diffuse and spread uniformly over a wider area. Because of the exponential dependence upon temperature a small but hot core raises the production rate drastically. This fact is illustrated in Fig. 6 for three values of α. Not only the total normalization, but also the asymptotic slope of the spectrum is modified due to the variation of the core temperature with α, as indicated in the Figure.

Comparing the dotted lines, which give the pure production spectrum, to the solid lines representing the asymptotic spectrum at decoupling, the tremendous effect of antiproton absorption in heavy ion collisions is obvious. As one intuitively expects, absorption is more pronounced for low-energetic antiprotons than for the high-energetic ones, which have the opportunity to escape the high density zone earlier. Therefore, the finally observed spectrum is flatter than the original production spectrum.

Interestingly, while the baryon-baryon and the ρρ channels are comparable in their contribution to p̄ production, the pion-baryon channel turned out to be much more effective for all reasonable sets of parameters. This fact is indeed remarkable, because here, contrary to the discussion in Section II, the pions are not in chemical equilibrium; in our hadrochemical model the total time of the ignition phase is too short to saturate the pion channel. The meson-baryon channel is thus crucial for understanding p̄ spectra. Only by including all channels can reliable predictions about the antiprotons be drawn.

We did not mention so far that in our calculations we followed common practice and assumed a finite p̄ formation time of τ = 1 fm/c; this means that during this time interval after a p̄-producing collision the antiproton is assumed to be
not yet fully developed and thus cannot annihilate. However, there are some (although controversial) experimental indications of an extremely long mean free path for antiprotons before final state interactions set in [26]. We have tested the influence of different values for the formation time τ on the p̄ spectrum. Fig. 7 shows that this highly phenomenological and poorly established parameter has a very strong influence in particular on the absolute normalization of the spectra, i.e. the total production yield. In the light of this uncertainty it appears difficult to argue for or against the necessity for medium effects on the antiproton production threshold based on a comparison between theoretical and experimental total yields only.

C. Comparison with Experimental Data

In all the calculations shown above a bombarding energy of 1 GeV/A has been assumed. Experimental data are, however, only available at around 2 GeV/A. At these higher energies thermalization becomes more questionable [27], and our simple model may be stretching its limits. In particular, the temperature in the fireball core becomes extremely high. In order to avoid such an unrealistic situation and in recognition of results from kinetic simulations [4,6,7] we thus assume that only part of the incoming energy is thermalized; in the following a fraction of 70% was taken.

Fig. 8 shows calculations for the antiproton spectrum from Na-Na and Ni-Ni collisions at a kinetic beam energy of 2 GeV/nucleon. The calculation assumes a p̄ formation time of τ = 1 fm/c, and takes for the density and temperature profiles the parameter value α = 1, which corresponds to an upside-down parabola for the density profile. Comparing with the GSI data [2] we see that our model features too weak a dependence on the size of the collision system; the absolute order of magnitude of the antiproton spectrum is, however, correctly reproduced by our simple hadrochemical model, without adjusting any other parameters.
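The combined effect of annihilation and formation time on the overall normalization can be illustrated with a toy survival-probability estimate: the antiproton survives with probability exp(-∫ dt ρ_B σ_ann v), and a formation time τ_f cuts the integral off below τ_f. All parameter values and the exponential density profile below are illustrative assumptions, not the paper's hydrodynamic solution:

```python
import numpy as np

# Toy estimate of p-bar survival against annihilation: the opacity is
# ∫ dt ρ_B(t) σ_ann v, integrated from the formation time τ_f onward
# (before τ_f the antiproton is assumed not yet formed and cannot
# annihilate). All numbers are illustrative assumptions.

sigma_ann = 5.0   # annihilation cross section in fm^2 (roughly 50 mb)
v = 0.6           # p-bar velocity in units of c
rho0 = 0.3        # initial baryon density in fm^-3
tau_exp = 3.0     # fm/c, expansion time scale of the toy fireball

def survival(tau_f):
    """Survival probability for a p-bar formed at time tau_f."""
    t = np.linspace(tau_f, 30.0, 3001)        # fm/c
    rho = rho0 * np.exp(-t / tau_exp)         # decaying baryon density
    opacity = np.sum(rho * sigma_ann * v) * (t[1] - t[0])
    return float(np.exp(-opacity))

for tau_f in (0.0, 1.0, 2.0):
    print(tau_f, round(survival(tau_f), 3))
```

Even this crude sketch shows the behaviour described above: the survival probability, and hence the absolute yield, grows by a sizable factor as τ_f is increased by only 1-2 fm/c, while the dense early stage dominates the absorption.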
No exotic processes for p̄ production are assumed. As mentioned in the previous subsection, the pion-baryon channel is responsible for getting enough antiprotons in our model, without any need for a reduced effective p̄ mass in the hot and dense medium [4,5]. The existing data do not yet allow a definite conclusion about the shape of the spectrum, and we hope that future experiments [28] will provide additional constraints for the model.

V. CONCLUSIONS

Heavy ion collisions at typical BEVALAC and SIS energies are far below the p̄p production threshold. As a consequence, pre-equilibrium antiproton production in such collisions is strongly suppressed relative to production from the thermalized medium produced in the later stages of the collision. Therefore, p̄ production becomes important only when the heavy ion reaction is sufficiently far progressed, in accordance with microscopic simulations [4]. By assuming a local Maxwell-Boltzmann distribution for the scattered and produced particles forming the medium in the collision zone, one maximizes the p̄ production rate (see Fig. 1). If, contrary to the assumptions made in this work, the extreme states in phase space described by the tails of the thermal Boltzmann distribution are not populated, the antiproton yield could be reduced substantially. We also found that the threshold behaviour of the p̄ production cross section is not only crucial for the total p̄ yield but also introduces structures into the initial p̄ spectrum. This might give rise to the hope that by measuring the p̄ momentum spectrum one may obtain further insight into the p̄ production mechanism. On the other hand, we saw here, using a Fokker-Planck description for the later evolution of the distribution function f_p̄(p,t) in a hot environment, that these structures are largely washed out by subsequent elastic scattering of the p̄ with the hadrons in the medium. In addition, the large annihilation rate reduces the number of observable antiprotons by roughly two orders of magnitude relative to
the initial production spectrum; the exact magnitude of the absorption effect was found to depend sensitively on the choice of the p̄ formation time τ. We have also shown that meson (in particular pion) induced production channels contribute significantly to the final p̄ yield and should thus not be neglected. We were thus able to reproduce the total yield of the measured antiprotons in a simple model for the reaction dynamics without including, for example, medium effects on the hadron masses and cross sections [4,5]. However, we must stress the strong sensitivity of the p̄ yield to various unknown parameters (e.g. the p̄ formation time) and to poorly controlled approximations (e.g. the degree of population of extreme corners in phase space by the particles in the collision region), and emphasize the rapidly thermalizing effects of elastic final state interactions on the p̄ momentum spectrum. We conclude that turning subthreshold antiproton production in heavy ion collisions into a quantitative probe for medium properties and collective dynamics in hot and dense nuclear matter remains a serious challenge.

ACKNOWLEDGMENTS

This work was supported by the Gesellschaft für Schwerionenforschung (GSI) and by the Bundesministerium für Bildung und Forschung (BMBF).

REFERENCES
[1] U. Mosel, Annu. Rev. Nucl. Part. Sci. 41 (1991) 29
[2] A. Schröter et al., Z. Phys. A 350 (1994) 101
[3] A. Shor et al., Phys. Rev. Lett. 63 (1989) 2192
[4] S. Teis, W. Cassing, T. Maruyama, and U. Mosel, Phys. Rev. C 50 (1994) 388
[5] G. Q. Li and C. M. Ko, Phys. Rev. C 50 (1994) 1725
[6] G. Batko et al., J. Phys. G 20 (1994) 461
[7] C. Spieles et al., Mod. Phys. Lett. A 27 (1993) 2547
[8] G. F. Bertsch and S. Das Gupta, Phys. Rep. 160 (1988) 189
[9] P. Koch, B. Müller, and J. Rafelski, Phys. Rep. 42 (1986) 167
[10] B. Schürmann, K. Hartmann, and H. Pirner, Nucl. Phys. A 360 (1981) 435
[11] G. Batko et al., Phys. Lett. B 256 (1991) 331
[12] burn, Rev. Mod. Phys. 27 (1955) 1, and references therein
[13] C. M. Ko and X. Ge, Phys. Lett. B 205 (1988) 195
[14] P. Koch and C. Dover, Phys. Rev. C 40 (1989) 145
[15] G. Feldman, Phys. Rev. 95 (1954) 1697
[16] Landolt-Börnstein, Numerical Data and Functional Relationships in Science and Technology, Vol. 12a and Vol. 12b, Springer-Verlag, Berlin, 1988
[17] R. Wittmann, Ph.D. thesis, Univ. Regensburg, Feb. 1995
[18] N. G. van Kampen, Stochastic Processes in Physics and Chemistry, North-Holland Publishing Company, Amsterdam, 1983
[19] B. Conforto et al., Nuovo Cim. 54A (1968) 441
[20] W. Brückner et al., Phys. Lett. B 166 (1986) 113
[21] E. Eisenhandler et al., Nucl. Phys. B 113 (1976) 1
[22] S. Chandrasekhar, Rev. Mod. Phys. 15 (1943) 1
[23] B. Svetitsky, Phys. Rev. D 37 (1988) 2484
[24] I. Montvay and J. Zimanyi, Nucl. Phys. A 316 (1979) 490
[25] J. P. Bondorf, S. I. A. Garpman, and J. Zimányi, Nucl. Phys. A 296 (1978) 320
[26] A. O. Vaisenberg et al., JETP Lett. 29 (1979) 661
[27] ng et al., Phys. Lett. B 245 (1990) 147
[28] A. Gillitzer et al., talk presented at the XXXIII International Winter Meeting on Nuclear Physics, 23-28 January 1995, Bormio (Italy)

FIGURES
FIG. 1. λ(s,t) at different times t calculated from the model of Ref. [10]. The starting point is a δ-function at s = 5.5 GeV². The dashed line is the asymptotic thermal distribution at t = ∞, corresponding to a temperature T = 133 MeV.
FIG. 2. Antiproton production spectrum for different threshold behaviour of the elementary production process (x = 1/2, ..., 5/2 from top to bottom).
FIG. 3. Effective temperature T_eff for three Maxwell distributions with initial temperatures T₀ = 10 MeV, 50 MeV and 70 MeV, respectively.
FIG. 4. Density distributions ρ(0,0,z) along the beam axis of target and projectile nucleons for a ⁴⁰Ca-⁴⁰Ca collision, normalized to ρ₀ = 0.15 fm⁻³. The solid lines labelled "incoming nuclei" show the two nuclei centered at ±5 fm at time t₀ = 0. The two other solid lines denote the cold nuclear remnants at full overlap time t_m, centered at about ±3 fm. Also shown are fireball nucleons (long-dashed) and Δ-resonances (short-dashed) at time t_m.
FIG. 5. Temperature profiles for α = 0.2 and α = 5 at four different times, t = 0 (beginning of the explosion phase) and t = 3, 6, and 9 fm/c (from top to bottom).
FIG. 6. p̄ spectra for different profile parameters α. The dotted lines mark the initial production spectra. The asymptotic temperatures at an assumed freeze-out density ρ_f = ρ₀/2 corresponding to the solid lines are, from top to bottom, 105 MeV, 87 MeV and 64 MeV, respectively.
FIG. 7. p̄ spectrum for different formation times, for a profile parameter α = 1. The dashed line indicates the original production spectrum.
FIG. 8. Differential p̄ spectrum for Na-Na and Ni-Ni collisions, for a shape parameter α = 1. The data are from GSI experiments [2].
arXiv:astro-ph/0002525v1 29 Feb 2000

Mon. Not. R. Astron. Soc. 000, 1-?? (1999). Printed 1 February 2008 (MN LaTeX style file v1.4)

On the particle acceleration near the light surface of radio pulsars

V. S. Beskin(1,2) and R. R. Rafikov(3)
1 National Astronomical Observatory, Osawa 2-21-1, Mitaka, Tokyo 181-8588, Japan
2 P. N. Lebedev Physical Institute, Leninsky prosp. 53, Moscow, 117924, Russia
3 Princeton University Observatory, Princeton, NJ 08544, USA

Accepted 1999. Received 1999; in original form 1999

ABSTRACT
The two-fluid effects on the radial outflow of relativistic electron-positron plasma are considered. It is shown that for a large enough Michel (1969) magnetization parameter σ ≫ 1 and multiplication parameter λ = n/n_GJ ≫ 1 the one-fluid MHD approximation remains correct in the whole region |E| < |B|. In the case when the longitudinal electric current is smaller than the Goldreich-Julian one, the acceleration of particles near the light surface |E| = |B| is determined. It is shown that, as in the previously considered (Beskin, Gurevich & Istomin 1983) cylindrical geometry, almost all electromagnetic energy is transformed into the energy of particles in the narrow boundary layer Δϖ/ϖ ∼ λ⁻¹.

Key words: two-fluid relativistic MHD: radio pulsars: particle acceleration

1 INTRODUCTION

Despite the fact that the structure of the magnetosphere of radio pulsars remains one of the fundamental astrophysical problems, a common view on the key theoretical question, namely the physical nature of the neutron star braking, is absent (Michel 1991; Beskin, Gurevich & Istomin 1993; Mestel 1999). Nevertheless, very extensive theoretical studies in the seventies and the eighties allowed some model-independent results to be obtained. One of them is the absence of magnetodipole energy loss. This result was first obtained theoretically (Henriksen & Norton 1975; Beskin et al 1983). It was shown that the electric charges filling the magnetosphere fully screen the magnetodipole radiation of a neutron star for an arbitrary inclination angle χ between
the rotational and magnetic axes if there are no longitudinal currents flowing in the magnetosphere. Later this result was also confirmed by observations. The direct detections of the interaction of the pulsar wind with a companion star in close binaries (see e.g. Djorgovsky & Evans 1988, Kulkarni & Hester 1988) have shown that it is impossible to explain the heating of the companion by a low-frequency magnetodipole wave. On the other hand, the detailed mechanism of particle acceleration remains unclear. Indeed, a very high magnetization parameter σ (Michel 1969) in the pulsar magnetosphere demonstrates that within the light cylinder r < R_L = c/Ω the main part of the energy is transported by the Poynting flux. This means that an additional mechanism of particle acceleration must work in the vicinity of the light cylinder. It is necessary to stress that an effective particle acceleration can only take place for small enough longitudinal electric currents I < I_GJ, when the plasma has no possibility to pass smoothly through the fast magnetosonic surface and when the light surface |E| = |B| is located at a finite distance. As to the case of large longitudinal currents I > I_GJ, both analytical (Tomimatsu 1994, Begelman & Li 1994, Beskin et al 1998) and numerical (Bogovalov 1997) considerations demonstrate that the acceleration becomes ineffective outside the fast magnetosonic surface, and the particle-to-Poynting flux ratio remains small: ∼ σ^(-2/3) (Michel 1969, Okamoto 1978). The acceleration of an electron-positron plasma near the light surface was considered by Beskin, Gurevich and Istomin (1983) in the simple 1D cylindrical geometry for I ≪ I_GJ. It was shown that in a narrow boundary layer Δϖ/ϖ ∼ 1/λ almost all electromagnetic energy is actually converted to the particle energy. Nevertheless, cylindrical geometry does not provide the complete picture of particle acceleration. In particular, it was impossible to include self-consistently the disturbance of the poloidal magnetic field and the electric potential, the latter
playing the main role in the problem of the plasma acceleration (for more details see e.g. Mestel & Shibata 1994). Hence, a more careful 2D consideration is necessary. © 1999 RAS

In Sect. 2 we formulate a complete system of 2D two-fluid MHD equations describing the electron-positron outflow from a magnetized body with a monopole magnetic field. The presence of an exact analytical force-free solution (Michel 1973) allows us to linearize this system, which results in the existence of invariants (energy and angular momentum) along unperturbed monopole field lines, similar to the ideal one-fluid MHD flow. In Sect. 3 it is shown that for σ ≫ 1 and λ ≫ 1 (λ = n/n_GJ is the multiplication factor) the one-fluid MHD approximation remains true in the entire region within the light surface. Finally, in Sect. 4 the acceleration of particles near the light surface |E| = |B| is considered. It is shown that, as in the case of cylindrical geometry, in a narrow boundary layer Δϖ/ϖ ∼ λ⁻¹ almost all the electromagnetic energy is converted into the energy of particles.

2 BASIC EQUATIONS

Let us consider a stationary axisymmetric outflow of a two-component plasma in the vicinity of an active object with a monopole magnetic field. It is necessary to stress that, of course, the monopole magnetic field is a rather crude approximation for a pulsar magnetosphere. Nevertheless, even for a dipole magnetic field near the origin, at large distances r ≫ R_L in the wind zone the magnetic field can have a monopole-like structure. For this reason the disturbance of a monopole magnetic field can give us important information concerning particle acceleration far from the neutron star. The structure of the flow is described by the set of Maxwell's equations and the equations of motion

∇·E = 4πρ_e,  ∇×E = 0,  ∇·B = 0,  ∇×B = (4π/c) j,  (1)

(v±·∇) p± = ±e (E + v±/c × B).  (2)

Here E and B are the electric and magnetic fields, ρ_e and j are the charge and current densities, and v± and p± are the speed and momentum of the particles. In the limit of infinite particle energy γ = ∞,

v⁰_r = c,  v⁰_φ = 0,  v⁰_θ = 0,  (3)

and for charge
and current densityρ0e=ρs R2sr2cosθ,(4)the monopole poloidal magneticfieldB0r=B sR2scR sccosθ,(7) and theflux functionΨ(r,θ),so thatB0p=∇Ψ×eϕ2πce R2s2cosθ+η+(r,θ) ,(9)n−=ΩB sr2 λ+1c[−cosθ+δ(r,θ)],(11)Ψ(r,θ)=2πB s R2s[1−cosθ+εf(r,θ)],(12)c 1999RAS,MNRAS000,1–??On the particle acceleration near the light surface of radio pulsars3v ±r=c1−ξ±r (r,θ),v ±θ=cξ±θ(r,θ),v ±ϕ=cξ±ϕ(r,θ),(13)B r =B sR 2ssin θ∂fr sin θ∂f cR s c ∂δcr−sin θ−∂δsin θ∂2cos θξ+r−λ+1∂rr2∂δsin θ∂∂θ=0,(20)∂ζrλ−12cos θξ−θ,(21)−ε∂r 2−ε∂θ 1∂θ=2Ω2cos θξ+ϕ−λ+1∂rξ+θγ++ξ+θγ+r ∂δr −sin θΩr2ξ+ϕ,(23)∂r=−4λσ −1∂θ+ζrξ−r +c∂rγ+=4λσ−∂δr ξ+θ,(25)∂∂r−sin θ∂rξ+ϕγ++ξ+ϕγ+Ωr sin θ∂fΩr 2ξ+θ,(27)∂r=−4λσ −εc∂r−c4λmc 3(29)is the Michel‘s (1969)magnetization parameter,m is the electron mass,and all deflecting functions are supposed to be ≪1.It is necessary to stress that for applications the magnetic field B s is to be taken near the light cylinder R s ≈R L because in the internal region of the pulsar magnetosphere B ∝r −3.As it has already been mentioned,only outside the light cylinder the poloidal magnetic field may have quasi monopole structure.As a result,σ=Ω2eB 0R 34V.S.Beskin and R.R.Rafikov1975,Arons Scharlemann 1979),γin ≤102for secondary plasma.For this reason in what follows we consider in more details the caseγ3in ≪σ,(34)when the additional acceleration of particles inside the fast magnetosonic surface takes place (see e.g.Beskin Kuznetsova Rafikov 1998).It is this case that can be realized for fast pulsars.Moreover,it has more general interest because the relation(34)may be true also for AGNs.As to the case γ3in ≫σcorresponding to ordinary pulsars,the particle energy remains constant (γ=γin )at any way up to the fast magnetosonic surface (see Bogovalov 1997for details).Further,one can putδ(R s ,θ)=0,(35)εf (R s ,θ)=0,(36)η+(R s ,θ)−η−(R s ,θ)=0.(37)These conditions result from the relation c E s +ΩR s e ϕ×B s =0corresponding rigid rotation and perfect conductivity of the surface of a star.Finally,as will be shown in 
Sect.3.2,the derivative ∂δ/∂r actually determines the phase of plasma oscillations only and plays no role in the global structure.Finally,the determination of the electric current and,say,the derivative ∂f/∂r depend on the problem under consideration.Indeed,as is well–known,the cold one–fluid MHD outflow contains two singular surfaces,Alfv´e nic and fast magnetosonic ones.It means that for the transonic flow two latter functions are to be determined from critical conditions (Heyvaerts 1996).In particular,the longitudinal electric current within this approach is not a free parameter.On the other hand,if the electric current is restricted by some physical reason,the flow cannot pass smoothly through the fast magnetosonic surface.In this case,which can be realized in the magnetosphere of radio pulsars (Beskin et al 1983,Beskin &Malyshkin 1998),near the light surface |E |=|B |an effective particle acceleration may take place.Such an acceleration will be considered in Sect.4.3THE ELECTRON–POSITRON OUTFLOW 3.1The MHD LimitIn the general case Eqns.(19)–(28)have several integrals.Firstly,Eqns.(21),(25),and (26)result in ζ−22σλsin θ=1sin θ,(38)where l (θ)describe the disturbance of the electric current at the star surface by the equation I (R,θ)=I A sin 2θ+l (θ).Expression (38)corresponds to conservation of the total energy flux along a magnetic field line.Furthermore,Eqns.(25)–(28)together with the boundary conditions (35),(36)result in δ=εf −1c ξ+ϕ+14λσγ−1−Ωr sin θ4λσγin .(40)They correspond to conservation of the z –component of the angular momentum for both types of particles.It is necessary to stress that the complete nonlinearized system of equations contains no such simple invariants.As σλ≫1,we can neglect in Eqns.(23)–(28)their left-hand sides.In this approximation we have ξ+=ξ−i.e.γ−=γ+=γ,so that −1∂θ+ζr ξr +c Ωr sin θ∂fΩr 2ξθ=0,(42)and γ1−Ωr sin θtan θεf +l (θ)σsin θ(γ−γin ).(45)c1999RAS,MNRAS 000,1–??On the particle acceleration near the light surface of radio 
pulsars5 Substituting these expressions into(41)and using Eqns.(19)–(22),we obtain the following equation describing the disturbance of the magnetic surfacesε(1−x2sin2θ)∂2fx2∂sinθ∂f∂x−2εsinθcosθ∂fsinθdσ(γ−γin)−sinθ∂θ−2λsin2θ(ξ+r−ξ−r)+2λxsinθ(ξ+ϕ−ξ−ϕ)≈0,(47)so actually there is perfect agreement with the MHD approximationε(1−x2sin2θ)∂2fx2∂sinθ∂f∂x−2εsinθcosθ∂fsinθdσ(γ−γin)−sinθ∂θ=0.As was shown earlier(Beskin et al1998),to pass through the fast magnetosonic surface it’s necessary to have|l|<σ−4/3.(49) Hence,within the fast magnetosonic surface r≪r F one can neglect terms containingδ=εf andζ.Then,relations(41)and (42)result inγ(1−x sinθξϕ)=γin,(50)ξr=ξϕ2ξr−ξ2ϕ,(53) we obtain forσ≫γ3in for r≪r Fγ2=γ2in+x2sin2θ,(54)ξϕ= x sinθ x sinθ,(55)ξr= x2sin2θ x2sin2θ,(56)in full agreement with the MHD results.Next,to determine the position of the fast magnetosonic surface r F,one can analyze the algebraic equations(38)and (41)which give−∂δtanθδ−1xξϕ=0.(57) Using now expressions(43)and(53),one canfind2γ3−2σ K+1∂θ.(59) Equation(58)allows us to determine the position of the fast magnetosonic surface and the energy of particles.Indeed, determining the derivative r∂γ/∂r,one can obtainr ∂γ3γ−σ(2K+x−2).(60)c 1999RAS,MNRAS000,1–??6V.S.Beskin and R.R.RafikovAs the fast magnetosonic surface is the X–point,both the nominator and denominator are to be equal to zero here.As a result,evaluating r∂K/∂r as K,we obtainδ∼σ−2/3;(61) r F∼σ1/3R L;(62)γ(r F)=σ1/3sin2/3θ,(63) where the last expression is exact.These equations coincide with those obtained within the MHD consideration.It is the self–consistent analysis whenδ=εf,and hence K depends on the radius r that results in thefinite value for the fast magnetosonic radius r F.On the other hand,in a given monopole magneticfield,whenεf does not depend on the radius,the critical conditions result in r F→∞for a cold outflow(Michel,1969,Li et al1992).Near the fast magnetosonic surface r∼σ1/3R L the MHD solution givesγ∼σ1/3,(64)εf∼σ−2/3.(65) 
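The order-of-magnitude relations (61)-(63) for the fast magnetosonic surface can be collected into a few lines of code. The numbers below (σ = 10⁴ and a light-cylinder radius of 10⁸ cm) are arbitrary illustrative choices, not values taken from the paper.

```python
import math

def fast_surface(sigma, R_L, theta):
    """Scalings at the fast magnetosonic surface:
    delta ~ sigma**(-2/3), r_F ~ sigma**(1/3) * R_L,
    gamma = sigma**(1/3) * sin(theta)**(2/3) (the last relation is exact).
    """
    delta = sigma ** (-2.0 / 3.0)
    r_F = sigma ** (1.0 / 3.0) * R_L
    gamma = sigma ** (1.0 / 3.0) * math.sin(theta) ** (2.0 / 3.0)
    return delta, r_F, gamma

# In the equatorial plane (theta = pi/2) the Lorentz factor is simply sigma^(1/3).
delta, r_F, gamma = fast_surface(1.0e4, 1.0e8, math.pi / 2)
```

For these inputs the surface sits a factor σ^(1/3) ≈ 21.5 beyond the light cylinder, with a comparably modest Lorentz factor, illustrating why the acceleration is ineffective there.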
Hence,Eqns.(53),(55),and(56)result inξr∼σ−2/3,(66)ξθ∼σ−2/3,(67)ξϕ∼σ−1/3.(68) As we see,theθ–component of the velocity plays no role in the determination of theγ.However,analyzing the left-hand sides of the Eqns.(23)–(28)one can evaluate the additional(nonhydrodynamic)varia-tions of the velocity components∆ξ±r∼λ−1σ−4/3,(69)∆ξ±θ∼λ−1σ−2/3,(70)∆ξ±ϕ∼λ−1σ−1.(71) Hence,for nonhydrodynamic velocities∆ξ±r≪ξr and∆ξ±ϕ≪ξϕto be small,it is necessary to have a large magnetizationparameterσ≫1only.On the other hand,∆ξ±θ/ξθ∼λ−1.In other words,for a highly magnetized plasmaσ≫1even outside the fast magnetosonic surface the velocity components(and,hence,the particle energy)can be considered hydrodynamically. The difference∼λ−1appears in theθcomponent only,but it does not affect the particle energy.Finally,one can obtain from (39),(40)thatδ−εftanθδ−σ−11∂θ+sinθξr.(75) Together with(21)one can obtain for r≫r Fγ=σ 2cosθεf−εsinθ∂f∂r r2∂f∂θξr+1∂θ(ξϕsinθ)=0.(77) Together with(76)this equation in the limit r≫r F coincides with the asymptotic version of the trans–field equation (Tomimatsu1994,Beskin et al1998)ε∂2f∂r−sinθD+1∂θ=0,(78)where g(θ)=K(θ)/sin2θ,andc 1999RAS,MNRAS000,1–??On the particle acceleration near the light surface of radio pulsars7D +1=1∂rr 2∂δ∂θ+1∂θ(ξϕsin θ)+2λ(ξ+r −ξ−r )=0.(80)Indeed,one can see from equations (19)and (20)that near the origin x =R s in the case γ+in =γ−in (and for the small variationof the current ζ∼σ−4/3which is necessary,as was already stressed,to pass through a fast magnetosonic surface)the densityvariation on the surface is large enough:(η+−η−)∼γ−2in ≫ζ.Hence,the derivative ∂2δ/∂r 2here is of the order of γ−2in .Onthe other hand,according to (22),the derivative ε∂2f/∂r 2is x 2times smaller.This means that in the two–component system the longitudinal electric field is to appear resulting in a redistribution of the particle energy.Clearly,such a redistribution is impossible for the charge–separated outflow.In other words,for a finite particle energy a 
one–component plasma cannot maintain simultaneously both the Goldreich charge and Goldreich current density (4).In a two–component system with λ≫1it can be realized by a small redistribution of particle energy (Ruderman &Sutherland 1975,Arons &Scharlemann 1989).For simplicity,let us consider only small distances x ≪1.In this case one can neglect the changes of the magnetic ing now (25)and (26),we haveγ+=γin −4λσδ;(81)γ−=γin +4λσδ.(82)Finally,taking into account that ξθand ξϕare small here,one can obtain from (20)r2∂2δ∂r +1∂θsin θ∂δγ2in,(83)where A =16λ2σ16λ2σ,(86)and µ≈√∂rr2∂δ∂θξr +1∂θ(ξϕsin θ)=0.(89)c1999RAS,MNRAS 000,1–??8V.S.Beskin and R.R.RafikovAsδ∼εf≪σ−2/3for r≪r F,andξr∼γ−20≫δ,thefirst term in(89)can be omitted.As a result,the solution of Eqn.(89)coincides exactly with the MHD expression,i.e.γ2=γ2in+x2sin2θ(54).Finally,using(87),(88),and(55)–(56), one can easily check that the nonhydrodynamical terms(47)in the trans–field equation(48)do actually vanish.4THE BOUNDARY LAYERLet us now consider the case when the longitudinal electric current I(R,θ)in the magnetosphere of radio pulsars is too small (i.e.the disturbance l(θ)is too large)for theflow to pass smoothly through the fast magnetosonic surface.First of all,it can be realized when the electric current is much smaller than the Goldreich one.This possibility was already discussed within the Ruderman–Sutherland model of the internal gap(Beskin et al1983,Beskin&Malyshkin1998).But it may take place in the Arons model(Arons&Scharlemann1979)as well.Indeed,within this model the electric current is determined by the gap structure.Hence,in general case this current does not correspond to the critical condition at the fast magnetosonic surface.In particular,it may be smaller than the critical current(of course,the separate consideration is necessary to check this statement).For simplicity let us consider the case l(θ)=h sin2θ.Neglecting now the last terms∝σ−1in the trans–field equation (48),we 
obtainε(1−x2sin2θ)∂2fx2∂sinθ∂f∂x−2εsinθcosθ∂f(2|h|)1/4.(92) As we see,for l(θ)=h sin2θthis surface has the form of a cylinder.It is important that the disturbance of magnetic surfaces εf∼(|h|)1/2remains small here.Comparing now the position of the light surface(92)with that of the fast magnetosonic surface(62),one canfind that the light surface is located inside the fast magnetosonic one ifσ−4/3≪|h|≪1,(93) which is opposite to(49).One can check that the condition(93)just allows to neglect the non force–free term in Eqn.(48).Using now the solution(91)and the MHD conditionδ=εf,one canfind from(58)2γ3−2σ hx2sin4θ+1√x0−x sinθ,(95) wherex0=14(2|h|)1/21On the particle acceleration near the light surface of radio pulsars 9 Figure1.The behavior of the Lorentz factor in the caseσ−4/3≪|h|≪1.One can see that the one–fluid MHD solution(95)existsforγ<σ1/3only.But in the two–fluid approximation in the narrow layer∆̟=̟c/λthe particle energy increases up to the value∼σcorresponding to the full conversion of the electromagnetic energy to the energy of particles.invariants(39)and(40)can be used to defineξ±ϕ:ξ+ϕ=1γ+ ;(99)ξ−ϕ=1γ− .(100)Furthermore,one can define2ξ+r=1(γ−)2+(ξ−ϕ)2+(ξ−θ)2.(102) As to the energy integral(38),it determines the variation of the currentζ.Now it can be rewritten asζ=22σsinθ.(103)Finally,Eqns.(19)–(28)look like̟2c d2δ2cosθ ξ+θ− λ+12cosθ ξ+r− λ+1d̟2=−2sin2θ λ−12cosθ ξ−ϕ ,(105)̟c d2σsinθ−̟ccosθd̟−sinθξ+r+sinθd̟ ξ−θγ− =−4λσ −γ++γ−sinθdδx0ξ−ϕ ,(107)̟cdd̟−sinθξ+θ,(108)c 1999RAS,MNRAS000,1–??10V.S.Beskin and R.R.Rafikov̟c dd̟−sinθξ−θ,(109)where we neglected the terms∝δ/r in(106)and(107).Comparing the leading terms,we have inside the layer∆̟/R L∼λ−1γ±∼h1/2cσ,(110)ξ±θ∼h1/4c,(111)ξ±r∼h1/2c,(112)∆δ∼h3/4c/λ,(113) where h c=|h|.Then the leading terms in(99)–(103)for∆̟>λ−1R L areξ+ϕ=1x0,(114)ξ−ϕ=1x0,(115)2ξ+r=(ξ+ϕ)2+(ξ+θ)2,(116) 2ξ−r=(ξ−ϕ)2+(ξ−θ)2,(117)ζ=−(γ++γ−)d̟2=2λsinθcosθ(ξ+θ−ξ−θ),(119)̟c d2σsinθ−̟ccosθd̟−sinθξ+r,(120)̟c d2σsinθ−̟ccosθd̟−sinθξ−r,(121)̟cdd̟ γ− 
=4λσsinθξ−θ,(123)with all the terms in the right–hand sides of(120)and(121)being of the same order of magnitude.As a result,the nonlinear equations(119)–(123)and(105)give the following simple asymptotic solutionγ±=4sin2θσ(λl)2,(124)ξ±θ=∓2sinθλl,(125)∆δ=−4On the particle acceleration near the light surface of radio pulsars11F(rad) x =−2m2c4γ2 (E y−B z)2+(E z−B y)2 ,(129)which can be important for large enough particle paring(129)with appropriate terms in(120)–(123)one can conclude that the radiation force can be neglected forσ<σcr,whereσcr= cc <3×10−3B−3/712λ2/74(131)which givesP>0.06B3/712λ−2/74s.(132)Hence,for most radio pulsars the radiation force indeed can be neglected.As to the pulsars withσ>σcr,it is clear that for γ>σcr the radiation force becomes larger than the electromagnetic one and strongly inhibits any further acceleration.As a result,we can evaluate the maximum gamma–factor which can be reached during the acceleration asγmax≈σcr≈106.(133)5DISCUSSIONThus,on a simple example it was demonstrated that for real physical parameters of the magnetosphere of radio pulsars (σ≫1andλ≫1)the one–fluid MHD approximation remains true in the whole region within the light surface|E|=|B|. 
On the other hand, it was shown that in a more realistic 2D case the main properties of the boundary layer near the light surface existing for small enough longitudinal currents I < I_GJ (effective energy transformation from the electromagnetic field to particles, current closure in this region, smallness of the disturbance of the electric potential and poloidal magnetic field) remain the same as in the 1D case considered previously (Beskin et al 1983). It is necessary to stress the main astrophysical consequences of our results. First of all, the presence of such a boundary layer explains the effective transformation of electromagnetic energy into the energy of particles. As was already stressed, the existence of such an acceleration is now confirmed by observations of close binaries containing radio pulsars (as to the particle acceleration far from a neutron star, see e.g. Kennel & Coroniti 1984; Hoshino et al 1992; Gallant & Arons 1994). Simultaneously, it allows us to understand the current closure in the pulsar magnetosphere. Finally, particle acceleration results in an additional mechanism of high-energy radiation from the boundary of the magnetosphere (for more details see Beskin et al 1993). Nevertheless, it is clear that the results obtained do not solve the whole pulsar wind problem. Indeed, as in the cylindrical case, it is impossible to describe the particle motion outside the light surface. The point is that, as one can see directly from Eqn. (126), for a complete conversion of electromagnetic energy into the energy of particles it is enough for them to pass only λ⁻¹ of the total potential drop between the pulsar magnetosphere and infinity. It means that the electron-positron wind propagating to infinity has to pass a potential drop which is much larger than their energy. This is possible only in the presence of electromagnetic waves, even in an axisymmetric magnetosphere which is stationary near the origin. Clearly, such a flow cannot be considered even within the two-fluid approximation. In our opinion, it
is only a numerical consideration that can solve the problem completely and determine, in particular, the energy spectrum of particles and the structure of the pulsar wind. Unfortunately, up to now such numerical calculations are absent.

ACKNOWLEDGMENTS

The authors are grateful to I. Okamoto and H. Sol for fruitful discussions. VSB thanks the National Astronomical Observatory, Japan, for hospitality. This work was supported by INTAS Grant 96-154 and by the Russian Foundation for Basic Research (Grant 96-02-18203).

REFERENCES
Arons J., Scharlemann E. T., 1979, ApJ, 231, 854
Begelman M. C., Li Z.-Y., 1994, ApJ, 426, 269
Beskin V. S., Gurevich A. V., Istomin Ya. N., 1983, Soviet Phys. JETP, 58, 235
Beskin V. S., Gurevich A. V., Istomin Ya. N., 1993, Physics of the Pulsar Magnetosphere, Cambridge Univ. Press, Cambridge
Beskin V. S., Kuznetsova I. V., Rafikov R. R., 1998, MNRAS, 299, 341
Beskin V. S., Malyshkin L. M., 1998, MNRAS, 298, 847
Bogovalov S. V., 1997, A&A, 327, 662
Djorgovsky S., Evans C. R., 1988, ApJ, 335, L61
Gallant Y. A., Arons J., 1994, ApJ, 435, 230
Goldreich P., Julian W. H., 1969, ApJ, 157, 869
Henriksen R. N., Norton J. A., 1975, ApJ, 201, 719
Heyvaerts J., 1996, in Chiuderi C., Einaudi G., eds, Plasma Astrophysics, Springer, Berlin, p. 31
Hoshino M., Arons J., Gallant Y. A., Langdon A. B., 1992, ApJ, 390, 454
Kennel C. F., Coroniti F. V., 1984, ApJ, 283, 694
Kulkarni S. R., Hester J., 1988, Nature, 335, 801
Li Zh.-Yu., Chiueh T., Begelman M. C., 1992, ApJ, 394, 459
Mestel L., 1999, Cosmical Magnetism, Clarendon Press, Oxford
Mestel L., Shibata S., 1994, MNRAS, 271, 621
Michel F. C., 1969, ApJ, 158, 727
Michel F. C., 1973, ApJ, 180, L133
Michel F. C., 1991, Theory of Neutron Star Magnetospheres, The Univ. of Chicago Press, Chicago
Okamoto I., 1978, MNRAS, 185, 69
Ruderman M. A., Sutherland P. G., 1975, ApJ, 196, 51
Shibata S., 1997, MNRAS, 287, 262
Tomimatsu A., 1994, Publ. Astron. Soc. Japan, 46, 123
The Properties of Doping in Materials
Doping is the process of introducing impurity atoms into a semiconducting material in order to alter and enhance its electrical properties. It is a widely used technique in the electronics industry to create faster, smaller, and more powerful devices. In this article, we will explore the properties of doping in materials and how it affects conductivity, resistivity, and the band gap.

Conductivity is a material's ability to conduct electricity, and doping changes it by changing the number of mobile charge carriers. In an intrinsic (undoped) semiconductor at low temperature, the valence band is fully occupied and the conduction band is empty. When a dopant such as boron (B) is introduced, it has one less electron in its valence shell than the host atoms and readily accepts an electron from a neighbor. This creates positively charged holes that move as if they were particles of positive charge, and the resulting p-type semiconductor can conduct electricity.

On the other hand, doping with elements such as phosphorus (P) or arsenic (As) increases the number of electrons in the conduction band, making the material an n-type semiconductor. These elements have one more electron in their valence shell than the host; the extra electron easily populates the conduction band, providing an excess of negatively charged mobile electrons, so this material can also conduct electricity.

Resistivity is the inverse of conductivity and measures a material's tendency to oppose electric current. Doping influences resistivity as well: in an n-type semiconductor, the addition of dopants such as P or As will decrease the resistivity, as more electrons become available for conduction through the material.
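The carrier bookkeeping described above can be sketched numerically. The silicon parameter values below (intrinsic carrier density and mobilities) are rough room-temperature textbook numbers assumed for illustration; the conductivity formula σ = q(nμₙ + pμₚ) and the mass-action law np = nᵢ² are standard.

```python
Q = 1.602e-19      # elementary charge, C
NI = 1.0e16        # approx. intrinsic carrier density of Si at 300 K, m^-3
MU_N = 0.135       # approx. electron mobility in Si, m^2/(V*s)
MU_P = 0.048       # approx. hole mobility in Si, m^2/(V*s)

def conductivity(n, p):
    """sigma = q * (n * mu_n + p * mu_p), in S/m."""
    return Q * (n * MU_N + p * MU_P)

def n_type(Nd):
    """Full-ionization approximation: n ~ Nd donors, holes from n*p = ni^2."""
    n = Nd
    p = NI ** 2 / n
    return n, p

n, p = n_type(1.0e22)            # phosphorus doping at 1e22 donors per m^3
sigma_doped = conductivity(n, p)
sigma_intrinsic = conductivity(NI, NI)
rho_doped = 1.0 / sigma_doped    # resistivity falls as carriers are added
```

Running the sketch shows the doped conductivity exceeding the intrinsic value by many orders of magnitude, which is the whole point of doping.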
Conversely, doping an intrinsic semiconductor with B creates an opposite effect, increasing the resistivity as fewer electrons are available for conduction.The band gap is the energy difference between the valence band and the conduction band in a semiconductor. Intrinsic semiconductors have a well-defined band gap, but doped semiconductors have their band gap altered. Doping with P or As results in areduction of the band gap, making the material more conductive and allowing for more efficient electronic devices. In contrast, doping with B results in a wider band gap and a subsequent reduction in the intrinsic carrier concentration. This leads to a lower conductivity and increased resistivity.In summary, doping is a process used in the electronic industry to modify the electrical properties of materials. The addition of impurity atoms can result in either increased or decreased conductivity, resistivity, and altered band gap. Understanding these properties is crucial in the design and fabrication of various electronic devices, such as transistors and solar cells.。
The Mysteries of the Atom: Quantum Mechanics
Quantum mechanics is a branch of physics that deals with the behavior of matter and energy at the atomic and subatomic levels. It is a field that has been shrouded in mystery for decades, with scientists trying to unravel the secrets of the atom. The mysteries of the atom have fascinated scientists for centuries, and the discovery of quantum mechanics has only added to the intrigue. In this essay, we will explore the mysteries of the atom and the role that quantum mechanics plays in our understanding of them.

One of the most striking facts about the atom is that it is mostly empty space. The atom consists of a nucleus, made up of protons and neutrons, and electrons that surround the nucleus. The nucleus, however, is incredibly small compared to the atom as a whole, which means that the atom is mostly empty space. This fact puzzled scientists for years, and it was not until the development of quantum mechanics that we began to understand why this is the case.

Quantum mechanics has also revealed that particles at the atomic level do not behave in the same way as larger objects in the classical world. In the classical world, objects have a definite position and momentum. In the quantum world, by contrast, particles do not have a definite position or momentum until they are measured. This behavior is bound up with wave-particle duality, one of the most significant discoveries of quantum mechanics, and it has led to many new technologies, including quantum computing and cryptography.

Another mystery of the atom is that particles at the atomic level can be in multiple states at the same time. This is known as superposition, and it is a fundamental concept in quantum mechanics: a quantum system can exist in a combination of states simultaneously, and it is this property that underpins many quantum technologies.
For example, quantum computers rely on superposition to perform certain calculations that would be intractable for classical computers.

Entanglement is another mystery of the atom revealed by quantum mechanics. Entanglement occurs when two particles become linked in such a way that the state of one particle is correlated with the state of the other. This phenomenon has been demonstrated in many experiments and underlies protocols such as quantum teleportation. Entanglement is still not fully understood, and scientists continue to probe its implications.

Quantum mechanics has also revealed that the act of measurement can affect the behavior of particles at the atomic level. This is known as the observer effect, and it has been the subject of much debate in the scientific community. Some have speculated about a role for consciousness in measurement, while most physicists regard the effect simply as a property of how quantum systems interact with measuring devices.

In conclusion, the mysteries of the atom are many, and quantum mechanics has played a central role in our understanding of them. The emptiness of the atom, wave-particle duality, superposition, entanglement, and the observer effect are all phenomena illuminated by quantum mechanics. While quantum mechanics has given us many new technologies, it has also raised new questions that scientists are still trying to answer. The mysteries of the atom will continue to fascinate scientists for generations to come, and new discoveries in the quantum world will undoubtedly add to the intrigue.
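Superposition has a simple quantitative face: a qubit in the state a|0⟩ + b|1⟩ yields outcome 0 with probability |a|² and outcome 1 with probability |b|² (the Born rule), and the probabilities always sum to one. A minimal sketch:

```python
# Born-rule measurement probabilities for a single qubit a|0> + b|1>.
import math

def measurement_probs(a, b):
    """Return (P(0), P(1)) = (|a|^2, |b|^2), normalized."""
    norm = abs(a) ** 2 + abs(b) ** 2
    return abs(a) ** 2 / norm, abs(b) ** 2 / norm

# Equal superposition (1/sqrt(2))(|0> + |1>): each outcome is equally likely.
p0, p1 = measurement_probs(1 / math.sqrt(2), 1 / math.sqrt(2))
```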
Chaos Particle Swarm Optimization Algorithm

Gao Ying 1,2, Xie Shengli 1
(1. College of Electronic and Information Engineering, South China University of Technology, Guangzhou 510641; 2. Dept. of Computer Science and Technology, Guangzhou University, Guangzhou 510405)
Computer Science, 2004, Vol. 31, No. 8

Abstract: Particle swarm optimization is a new stochastic global optimization evolutionary algorithm. This paper introduces chaotic optimization into the particle swarm optimizer. Exploiting the randomness, ergodicity, and regularity of chaotic motion, the method first performs a chaotic search around the best particle of the current swarm, and then randomly replaces one particle of the swarm with the result of that chaotic search. This treatment speeds up the evolution of the swarm, improving the algorithm's ability to escape local extrema and raising its convergence speed and accuracy. Simulation results show that the convergence performance of the chaos particle swarm optimization algorithm is clearly superior to that of the standard particle swarm optimization algorithm.

Keywords: particle swarm optimization, chaotic optimization, optimization
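The chaotic search step described in the abstract can be sketched using the logistic map, the most common generator of chaotic sequences in this literature; the exact mapping, scaling, and neighborhood radius used by the paper may differ, so the names and defaults below are illustrative assumptions:

```python
# Sketch of a chaotic local search around the swarm's global best, assuming
# the common logistic-map formulation (the paper's exact scheme may differ).
import random

def logistic_sequence(z0, steps, mu=4.0):
    """Chaotic sequence z_{k+1} = mu * z_k * (1 - z_k) on (0, 1)."""
    z = z0
    for _ in range(steps):
        z = mu * z * (1.0 - z)
        yield z

def chaotic_refine(gbest, fitness, radius=0.1, steps=50, seed_z=0.37):
    """Probe a small neighborhood of gbest along a chaotic trajectory."""
    best, best_f = list(gbest), fitness(gbest)
    for z in logistic_sequence(seed_z, steps):
        cand = [g + radius * (2.0 * z - 1.0) for g in gbest]  # map z into [-r, r]
        f = fitness(cand)
        if f < best_f:
            best, best_f = cand, f
    return best, best_f

sphere = lambda x: sum(v * v for v in x)
refined, f = chaotic_refine([0.05, -0.04], sphere)

# The abstract's replacement step: overwrite one randomly chosen particle.
swarm = [[1.0, 1.0], [2.0, -1.0]]
swarm[random.randrange(len(swarm))] = refined
```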
English original -- An Improved Particle Swarm Optimization Algorithm
An Improved Particle Swarm Optimization Algorithm

Lin Lu, Qi Luo, Jun-yong Liu, Chuan Long
School of Electrical Information, Sichuan University, Chengdu, Sichuan, China, 610065
lvlin@

Abstract

A hierarchical structure poly-particle swarm optimization (HSPPSO) approach using the hierarchical structure concept of control theory is presented. In the bottom layer, parallel optimization is performed on multiple particle swarms, which enlarges the particle search domain. In the top layer, each particle swarm of the bottom layer is treated as a particle of a single particle swarm. The best position found by each bottom-layer swarm is regarded as the best position of a single particle of the top-layer swarm. The result of the optimization on the top-layer swarm is fed back to the bottom layer. If some particles tend toward a local extremum during particle swarm optimization (PSO), their velocities are re-initialized. Tests of the proposed method on four typical functions show that HSPPSO outperforms PSO in both convergence rate and accuracy.

1. Introduction

The particle swarm optimization algorithm, first proposed by Kennedy and Eberhart in 1995, is an intelligent optimization algorithm developed in recent years that simulates the migration and aggregation of a bird flock seeking food [1]. Similar to evolutionary algorithms, PSO adopts a strategy based on a particle swarm and parallel global random search. PSO determines its search path according to the velocity and current position of each particle, without more complicated evolutionary operations.
Compared with earlier intelligent algorithms, PSO offers faster computation and lower memory occupation, has fewer parameters, and is easier to implement.

At present, PSO is attracting more and more attention from researchers and has been widely used in fields such as function optimization, combinatorial optimization, neural network training, robot path planning, pattern recognition, and fuzzy system control [2]. The study of PSO has also spread into the electricity, communications, and economics fields. Like other random search algorithms, PSO suffers from a certain degree of premature convergence. To improve optimization efficiency, many scholars have proposed improvements to the basic PSO algorithm, such as PSO with inertia weight [3], PSO with a constriction factor [4], and hybrids with other intelligent algorithms [5]. These modified algorithms improve on aspects such as computational efficiency and convergence rate.

Proper coordination between global search and local search is critical for an algorithm to converge to the global optimum. Basic PSO is conceptually simple and its parameters are easy to control, but it treats the entire optimization as a whole without finer division and searches at the same intensity throughout, which to a certain extent leads to premature convergence. This paper presents a hierarchical structure poly-particle swarm optimization approach utilizing the hierarchy concept of control theory, in which parallel optimization employs multiple particle swarms in the bottom layer; this is equivalent to increasing the number of particles and enlarges the particle search domain.
To prevent the algorithm from getting trapped in a local optimum and converging prematurely, a disturbance strategy is introduced: a particle's velocity is re-initialized whenever it falls below the minimum restriction during iteration. The best position found by each poly-particle swarm in the bottom layer is regarded as the best position of a single particle in the top layer. The top layer performs PSO optimization and feeds the global optimal solution back to the bottom layer. Independent search by the poly-particle swarms on the bottom layer ensures that optimization is carried out over a wider area, while on the top layer, the swarm's tracking of the current global optimal solution ensures the convergence of the algorithm. Several benchmark functions are used to test the algorithm in this paper; the results show that the new algorithm performs well in both optimization result and convergence characteristics, and can avoid premature convergence effectively.

2. Basic PSO algorithm

PSO is a swarm-intelligence-based evolutionary computation technique. Each individual in the swarm is a volumeless particle in a multidimensional search space. Its position represents a potential solution of the optimization problem, and its flying velocity determines the direction and step size of the search. The particle flies through the search space at a velocity that is dynamically adjusted according to its own flying experience and that of its companions, i.e., it constantly adjusts its direction and velocity by tracking the best position found so far by itself and by the whole swarm, which forms the positive feedback of swarm optimization.
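The velocity-and-position update just described, with its inertia, cognitive, and social terms plus velocity clamping, can be sketched for one particle as follows. The parameter values w = 0.7298 and c1 = c2 = 1.4962 are those used later in the paper's experiments; the clamp bounds here are illustrative:

```python
# One PSO update for a single particle: inertia + cognitive + social terms,
# with the velocity clamped to [v_min, v_max] before the position update.
import random

def pso_step(x, v, p_i, p_g, w=0.7298, c1=1.4962, c2=1.4962,
             v_min=-1.0, v_max=1.0):
    """Return the particle's new position and velocity."""
    x_new, v_new = [], []
    for d in range(len(x)):
        r1, r2 = random.random(), random.random()
        vd = (w * v[d] + c1 * r1 * (p_i[d] - x[d])
              + c2 * r2 * (p_g[d] - x[d]))
        vd = max(v_min, min(v_max, vd))   # clamp velocity
        v_new.append(vd)
        x_new.append(x[d] + vd)
    return x_new, v_new

x, v = pso_step([5.0, -3.0], [0.0, 0.0], [1.0, 1.0], [0.0, 0.0])
```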
The particle swarm tracks these two best current positions, gradually moves to better regions, and finally arrives at the best position of the whole search space.

Suppose the objective search space is D-dimensional. PSO randomly initializes a swarm of m particles. The position X_i of the ith particle (a potential solution of the optimization problem) is represented as {x_i1, x_i2, ..., x_iD}; substituting it into the objective function yields an adaptive value used to evaluate the solution. The flying velocity is represented as {v_i1, v_i2, ..., v_iD}. The individual extremum P_i = {p_i1, p_i2, ..., p_iD} is the best previous position of the ith particle, and the global extremum P_g = {p_g1, p_g2, ..., p_gD} is the best previous position of the swarm. Velocity and position are updated at each step according to:

  v_id^(k+1) = w*v_id^k + c1*r1*(p_id - x_id^k) + c2*r2*(p_gd - x_id^k)
  if v_id^(k+1) > v_max, then v_id^(k+1) = v_max; if v_id^(k+1) < v_min, then v_id^(k+1) = v_min
  x_id^(k+1) = x_id^k + v_id^(k+1)                                            (1)

In the formula, k is the iteration number; w is the inertia weight; c1 and c2 are learning factors; r1 and r2 are two random numbers in the range [0, 1]. Iteration ends when the greatest iteration number is reached or the best previous position attains the minimum adaptive value.

The first part of formula (1) is the particle's previous velocity, reflecting its memory. The second part is the cognitive component, reflecting the particle's own thinking. The third part is the social component, reflecting information sharing and mutual cooperation between particles.

3. Hierarchical structure poly-particle swarm optimization

During PSO search, particles always track the current global optimum or their own optimum, which makes it easy to fall into a local minimum [6].
Aiming at this problem in traditional PSO, this paper proposes a hierarchical structure poly-particle swarm optimization approach, as follows.

(1) Based on the hierarchical control concept of control theory, a two-layer poly-particle swarm optimization is introduced. There are L swarms on the bottom layer, with p_ig representing the global optimum of the ith swarm; there is one swarm on the top layer, with p_g representing its global optimum. Figure 1 shows the scheme of HSPPSO.

Figure 1. Scheme of HSPPSO (bottom layer of L swarms, top layer of one swarm)

Suppose there are L swarms in the bottom layer with m particles each; the parallel computation of L swarms is then equivalent to increasing the number of particles to L*m, which expands the search space, while the parallel computation time does not increase with the number of particles. Besides the individual and global extrema of each swarm, the poly-particle-swarm global extremum is also considered when adjusting the velocity and position of particles in the L swarms. A correction term c3*r3*(p_g - x_ij^k) is added to the algorithm, and the iteration formulas become:

  v_ij^(k+1) = w*v_ij^k + c1*r1*(p_ij - x_ij^k) + c2*r2*(p_ig - x_ij^k) + c3*r3*(p_g - x_ij^k)
  if v_ij^(k+1) > v_max, then v_ij^(k+1) = v_max; if v_ij^(k+1) < v_min, then v_ij^(k+1) = v_min
  x_ij^(k+1) = x_ij^k + v_ij^(k+1)                                            (2)

In the formula, c3 is a learning factor and r3 is a random number in the range [0, 1]; i indexes the swarm (i = 1, ..., L) and j indexes the particle (j = 1, ..., m); x_ij is the position of the jth particle in the ith swarm, and p_ij is the individual extremum of the jth particle in the ith swarm. The fourth part of formula (2) represents the influence of global experience on the particle, reflecting information sharing and cooperation between particles and the global extremum.

The top layer performs a secondary optimization after the poly-particle swarm optimization on the bottom layer, taking each of the L swarms as a particle and the swarm optimum p_ig as the individual optimum of the corresponding particle.
The iteration formulas for the top-layer particle velocity update are:

  v_i^(k+1) = w*v_i^k + c1*r1*(p_ig - x_i^k) + c2*r2*(p_g - x_i^k)
  if v_i^(k+1) > v_max, then v_i^(k+1) = v_max; if v_i^(k+1) < v_min, then v_i^(k+1) = v_min
  x_i^(k+1) = x_i^k + v_i^(k+1)                                               (3)

The top-layer PSO adjusts particle velocities according to the global optimum of each swarm on the bottom layer. Independent search by the L poly-particle swarms on the bottom layer ensures that the optimization is carried out over a wider area; on the top layer, the swarm's tracking of the current global optimal solution ensures the convergence of the algorithm. In this way, attention is paid to both the precision and the efficiency of the optimization process.

(2) Introduce a disturbance strategy. Optimization in PSO is guided by cooperation and competition between particles. Once a particle finds a position that is currently optimal, other particles quickly move toward it and gather around the point. The swarm then loses diversity, and there is no longer any mutual influence between particles. If particles have no variability, the whole swarm stagnates at that point; if the point is a local optimum, the swarm cannot search other areas and gets trapped, the so-called premature phenomenon. To avoid this, when the velocity of a bottom-layer particle is less than the boundary value, so that its position can no longer be effectively updated, its velocity is re-initialized without regard to the former strategy.

4. Algorithm flow

(1) Initialization. Set the number of swarms L, the swarm scale m, and the algorithm parameters: inertia weight, learning factors, velocity boundary value, and the largest iteration number.

(2) Each swarm on the bottom layer randomly generates m original solutions, which are regarded as the current optimum solutions p_ij of the particles. The adaptive values of all particles are computed.
The optimum adaptive value among all particles is regarded as the current optimum solution p_ig of the swarm, which is transferred to the top layer.

(3) The top layer accepts the L values of p_ig from the bottom layer as original values of its particles, which are regarded as their own current optimum solutions. The optimum adaptive value among all particles is regarded as the current optimum solution p_g of the swarm, which is transferred to the bottom layer.

(4) The bottom layer accepts p_g from the top layer and updates particle velocities and positions according to formula (2); if a velocity is less than the boundary value, it is re-initialized.

(5) The bottom layer computes the adaptive values of the particles and compares them with the current individual extrema. The better value is taken as the current optimum solution p_ij of each particle. The minimum p_ij is compared with the swarm's global extremum, and the better value is taken as the current global extremum p_ig of the swarm, which is transferred to the top layer.

(6) The top layer accepts p_ig from the bottom layer. The optimum adaptive value among the p_ig is compared with the current global extremum p_g, and the better value becomes the current global extremum p_g. The velocity and position of each top-layer particle are updated according to formula (3).

(7) The top layer computes the adaptive values of its particles and compares them with the current individual extrema. The better value is taken as the current optimum solution p_ig of each particle. The minimum p_ig is compared with the global extremum p_g, and the better value becomes the current global extremum p_g, which is transferred to the bottom layer.

(8) Evaluate the end condition of the iteration: if satisfied, output the result p_g; otherwise, go to (4).

5. Example

To study the algorithm's performance, tests were done on four common benchmark functions: Sphere, Rosenbrock, Griewank, and Rastrigin. The optimal adaptive value of all four functions is zero.
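The two-layer flow of Section 4 can be sketched compactly. This is a minimal sketch assuming a minimization task; for brevity it folds the top layer's own PSO step into a direct best-of-bests update, and all names and default values are illustrative:

```python
# Compact HSPPSO sketch: L bottom-layer swarms with the extra social term
# toward the top-layer best p_g, plus the velocity re-initialization
# disturbance when a velocity falls below v_eps.
import random

def hsppso(fitness, dim=2, L=3, m=5, iters=100, w=0.7298, c=1.4962,
           v_max=1.0, v_eps=1e-4, span=10.0):
    swarms = [[{"x": [random.uniform(-span, span) for _ in range(dim)],
                "v": [0.0] * dim} for _ in range(m)] for _ in range(L)]
    for s in swarms:
        for p in s:
            p["p"] = p["x"][:]                       # individual best p_ij
    p_ig = [min(s, key=lambda q: fitness(q["p"]))["p"][:] for s in swarms]
    p_g = min(p_ig, key=fitness)[:]                  # global best p_g
    for _ in range(iters):
        for i, s in enumerate(swarms):
            for p in s:
                for d in range(dim):
                    vd = (w * p["v"][d]
                          + c * random.random() * (p["p"][d] - p["x"][d])
                          + c * random.random() * (p_ig[i][d] - p["x"][d])
                          + c * random.random() * (p_g[d] - p["x"][d]))
                    if abs(vd) < v_eps:              # disturbance: re-initialize
                        vd = random.uniform(-v_max, v_max)
                    p["v"][d] = max(-v_max, min(v_max, vd))
                    p["x"][d] += p["v"][d]
                if fitness(p["x"]) < fitness(p["p"]):
                    p["p"] = p["x"][:]
            cand = min(s, key=lambda q: fitness(q["p"]))["p"]
            if fitness(cand) < fitness(p_ig[i]):
                p_ig[i] = cand[:]
        p_g = min(p_ig + [p_g], key=fitness)[:]      # simplified top layer
    return p_g

sphere = lambda x: sum(v * v for v in x)
best = hsppso(sphere)
```

Since p_g only ever improves, the returned value is never worse than the best of the initial random particles.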
Sphere and Rosenbrock are unimodal functions, while Griewank and Rastrigin are multimodal functions with a great number of local minima.

f1: Sphere function
  f1(x) = sum_{i=1..n} x_i^2,  -100 <= x_i <= 100

f2: Rosenbrock function
  f2(x) = sum_{i=1..n-1} [100*(x_{i+1} - x_i^2)^2 + (x_i - 1)^2],  -100 <= x_i <= 100

f3: Griewank function
  f3(x) = (1/4000)*sum_{i=1..n} (x_i - 100)^2 - prod_{i=1..n} cos((x_i - 100)/sqrt(i)) + 1,  -100 <= x_i <= 100

f4: Rastrigin function
  f4(x) = sum_{i=1..n} [x_i^2 - 10*cos(2*pi*x_i) + 10],  -100 <= x_i <= 100

The dimension of the four benchmark functions is set to 10, the maximum iteration number to 500, the swarm scale m to 40, the number of swarms to 5, the inertia weight to 0.7298, and c1, c2, c3 to 1.4962. Each test is run 10 times with random initialization.

Table 1 presents the results of the four benchmark test functions with PSO and HSPPSO.

Table 1. Comparison of simulation results

  Function | Algorithm | Minimum    | Maximum   | Average
  f1       | PSO       | 3.0083e-9  | 8.9688e-5 | 3.5601e-8
  f1       | HSPPSO    | 3.5860e-12 | 1.9550e-7 | 4.2709e-11
  f2       | PSO       | 5.7441     | 8.8759    | 7.65997
  f2       | HSPPSO    | 4.2580     | 7.8538    | 5.5342
  f3       | PSO       | 0          | 24.9412   | 7.36575
  f3       | HSPPSO    | 0          | 2.3861    | 0.23861
  f4       | PSO       | 4.9750     | 13.9392   | 10.15267
  f4       | HSPPSO    | 0.9950     | 7.9597    | 4.4806

Table 1 shows that HSPPSO outperforms PSO in solution quality, obtaining better results in both the local-search and global-search tests. Although the difference between HSPPSO and PSO on the unimodal functions is not obvious, HSPPSO clearly outperforms PSO on the multimodal functions, which indicates that HSPPSO has better prospects for avoiding premature convergence and finding the global optimum.

Figures 2 to 9 show the adaptive-value evolution curves over 10 random runs for the four benchmark functions with HSPPSO and PSO; they indicate that HSPPSO outperforms PSO in both initial convergence velocity and iteration time.
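The four test functions can be coded directly. This sketch uses the shifted Griewank form suggested by the (x_i - 100) offset in the formula above; the original paper should be consulted for the exact variant:

```python
# The four benchmark functions; all have an optimal value of zero.
import math

def f1_sphere(x):
    return sum(v * v for v in x)

def f2_rosenbrock(x):
    return sum(100.0 * (x[i + 1] - x[i] ** 2) ** 2 + (x[i] - 1.0) ** 2
               for i in range(len(x) - 1))

def f3_griewank(x):
    # Shifted Griewank: minimum at x_i = 100 for all i.
    s = sum((v - 100.0) ** 2 for v in x) / 4000.0
    p = math.prod(math.cos((v - 100.0) / math.sqrt(i + 1))
                  for i, v in enumerate(x))
    return s - p + 1.0

def f4_rastrigin(x):
    return sum(v * v - 10.0 * math.cos(2.0 * math.pi * v) + 10.0 for v in x)
```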
In some tests, PSO could not find the optimum value at all.

Figure 2. HSPPSO on test function f1
Figure 3. PSO on test function f1
Figure 4. HSPPSO on test function f2
Figure 5. PSO on test function f2
Figure 6. HSPPSO on test function f3
Figure 7. PSO on test function f3
Figure 8. HSPPSO on test function f4
Figure 9. PSO on test function f4
(Each figure plots adaptive value against iteration number.)

6. Conclusion

Compared with basic PSO, the HSPPSO proposed in this paper obtains better adaptive values and has a faster convergence rate under the same number of iteration steps. HSPPSO provides a new idea for large-scale system optimization problems.

References
[1] J. Kennedy, R. Eberhart. Particle Swarm Optimization. In: Proceedings of the IEEE International Conference on Neural Networks, 1995, Vol. 4, 1942-1948.
[2] F. van den Bergh, A. P. Engelbrecht. Training Product Unit Networks Using Cooperative Particle Swarm Optimization. IEEE International Joint Conference on Neural Networks, Washington DC, USA, 2001, 126-131.
[3] Y. Shi, R. C. Eberhart. A Modified Particle Swarm Optimizer. IEEE World Congress on Computational Intelligence, 1998: 69-73.
[4] R. C. Eberhart, Y. Shi. Comparing Inertia Weights and Constriction Factors in Particle Swarm Optimization. Proceedings of the IEEE Conference on Evolutionary Computation, ICEC, 2001: 84-88.
[5] M. Lovbjerg, T. K. Rasmussen, T. Krink. Hybrid Particle Swarm Optimiser with Breeding and Subpopulations. Third Genetic and Evolutionary Computation Conference. Piscataway, NJ: IEEE Press, 2001.
[6] L. Wang. Intelligent Optimization Algorithms with Applications. Beijing: Tsinghua University Press, 2001.
The Particle Swarm—Explosion, Stability, and Convergence in a Multidimensional Complex Space
Maurice Clerc and James Kennedy

Abstract—The particle swarm is an algorithm for finding optimal regions of complex search spaces through the interaction of individuals in a population of particles. Even though the algorithm, which is based on a metaphor of social interaction, has been shown to perform well, researchers have not adequately explained how it works. Further, traditional versions of the algorithm have had some undesirable dynamical properties; notably, the particles' velocities needed to be limited in order to control their trajectories. The present paper analyzes a particle's trajectory as it moves in discrete time (the algebraic view), then progresses to the view of it in continuous time (the analytical view). A five-dimensional depiction is developed, which describes the system completely. These analyses lead to a generalized model of the algorithm, containing a set of coefficients to control the system's convergence tendencies.
Some results of the particle swarm optimizer, implementing modifications derived from the analysis, suggest methods for altering the original algorithm in ways that eliminate problems and increase the ability of the particle swarm to find optima of some well-studied test functions.

Index Terms—Convergence, evolutionary computation, optimization, particle swarm, stability.

I. INTRODUCTION

Particle swarm adaptation has been shown to successfully optimize a wide range of continuous functions [1]-[5]. The algorithm, which is based on a metaphor of social interaction, searches a space by adjusting the trajectories of individual vectors, called "particles" as they are conceptualized as moving points in multidimensional space. The individual particles are drawn stochastically toward the positions of their own previous best performance and the best previous performance of their neighbors.

While empirical evidence has accumulated that the algorithm "works," e.g., it is a useful tool for optimization, there has thus far been little insight into how it works. The present analysis begins with a highly simplified deterministic version of the particle swarm in order to provide an understanding of how it searches the problem space [4], then continues on to analyze the full stochastic system. A generalized model is proposed, including methods for controlling the convergence properties of the particle system. Finally, some empirical results are given, showing the performance of various implementations of the algorithm on a suite of test functions.

Manuscript received January 24, 2000; revised October 30, 2000 and April 30, 2001. M. Clerc is with France Télécom, 74988 Annecy, France (e-mail: Maurice.
Clerc@). J. Kennedy is with the Bureau of Labor Statistics, Washington, DC 20212 USA (e-mail: Kennedy_jim@). Publisher Item Identifier S 1089-778X(02)02209-9.

A. The Particle Swarm

A population of particles is initialized with random positions and velocities, and a function is evaluated using each particle's positional coordinates as input values. Traditional implementations limit step size, or velocity, with a parameter. The current paper, however, demonstrates that the implementation of properly defined constriction coefficients can prevent explosion; further, these coefficients can induce particles to converge on local optima.

An important source of the swarm's search capability is the interactions among particles as they react to one another's findings. Analysis of interparticle effects is beyond the scope of this paper, which focuses on the trajectories of single particles.

B. Simplification of the System

We begin the analysis by stripping the algorithm down to a most simple form; we will add things back in later. The particle swarm formula adjusts the velocity v(t) by adding to it a term proportional to the difference between p, the best position found so far (by the individual particle in the first term, or by any neighbor in the second term), and the current position x(t):

  v(t+1) = v(t) + phi*(p - x(t))
  x(t+1) = x(t) + v(t+1)

The formula can be shortened by redefining y(t) = p - x(t), as follows:

  v(t+1) = v(t) + phi*y(t)
  y(t+1) = -v(t) + (1 - phi)*y(t)

We also treat phi as a constant; where normally it is defined as a random number between zero and a constant upper limit, we remove the stochastic component initially and reintroduce it in later sections. Thus, we begin by considering this reduced deterministic system. Earlier work recognized that randomness was responsible for the explosion of the system, although the mechanism that caused the explosion was not understood. Ozcan and Mohan [6], [7] further analyzed the system and concluded that the particle as seen in discrete time "surfs" on an underlying continuous foundation of sine waves.

The present paper analyzes the particle swarm as it moves in discrete time (the algebraic view), then progresses to the view of it in continuous time (the analytical view). A five-dimensional (5-D) depiction is developed, which completely describes the system. These analyses lead to a generalized model of the
algorithm, containing a set of coefficients to control the system's convergence tendencies. When randomness is reintroduced to the full model with constriction coefficients, the deleterious effects of randomness are seen to be controlled. Some results of the particle swarm optimizer, using modifications derived from the analysis, are presented; these results suggest methods for altering the original algorithm in ways that eliminate some problems and increase the optimization power of the particle swarm.

II. ALGEBRAIC POINT OF VIEW

The basic simplified dynamic system is defined by

  v(t+1) = v(t) + phi*y(t)
  y(t+1) = -v(t) + (1 - phi)*y(t)

Let P(t) = (v(t), y(t)) be the current point in R^2. The system is then defined completely by the matrix

  M = [  1      phi
        -1    1 - phi ]

whose eigenvalues are

  lambda_{1,2} = 1 - phi/2 +/- sqrt(phi^2 - 4*phi)/2        (2.2)

We can immediately see that the value phi = 4 is special; below, we will see what this implies. For 0 < phi < 4 the eigenvalues are complex conjugates (2.3).

TABLE I. SOME phi VALUES FOR WHICH THE SYSTEM IS CYCLIC

If we diagonalize M, the iterated system is obtained simply by raising the diagonal matrix of eigenvalues to the power t. More precisely, one can solve for the phi values for which the system is cyclic. Figs. 1(a)-(d) show the trajectories of a particle in phase space for various values of phi: when phi takes on one of the values from Table I, the trajectory is cyclical; for any other value in ]0, 4[, the system is just quasi-cyclic, as in Fig. 1(d). We can be a little more precise: examining the candidate cases shows that each is either impossible or inconsistent with the hypothesis, and this is the point: there is no cyclic behavior for phi > 4. In fact, for phi > 4 the distance of P(t) from the fixed point increases like the dominant eigenvalue raised to the power t. In Section IV, this result is used to prevent the explosion of the system, which can occur when particle velocities increase without control.

C. Case phi = 4

In this situation the two eigenvalues coincide and there is just one family of eigenvectors. Depending on the initial conditions, the state magnitude decreases and/or increases. By recurrence, the following form is derived:

  (2.18)

where the integer coefficients can be negative, zero, or positive. For a particular trajectory, the relevant quantity is positive if and only if the initial point is not between (or
equal to) the roots of the corresponding quadratic; this result means that the quantity is then also positive. So, as soon as it begins to increase, it does so indefinitely, though it can decrease at the beginning. The question to be answered next is: how long can it decrease before it begins increasing? Working through the recurrence for this case gives expressions (2.20) and (2.21).(1)

(1) Note that the present paper uses the Bourbaki convention of representing open intervals with reversed brackets. Thus, ]a, b[ is equivalent to the parenthetical notation (a, b).

The quantity decreases as long as the derived bound holds; taking the integer part of that bound, we have (2.23). Thus, it can be concluded that the quantity decreases/increases almost linearly in this case.

III. ANALYTIC POINT OF VIEW

Treating time as continuous leads to a differential form of the system (3.4), whose general solution is (3.5). A similar kind of expression for the second state variable is now produced, where the coefficients depend on the initial conditions. Condition (3.9) is imposed in order to prevent a discontinuity.

Regarding the eigenvalue expressions of the matrix, observations can be made, particularly about the (non)existence of cycles. The above results provide a guideline for preventing the explosion of the system, for we can immediately see that it depends on condition (3.10).

B. A Posteriori Proof

One can directly verify that the derived expressions are, indeed, solutions of the initial system: on one hand from their expressions (3.11), and on the other hand from (3.13).

C. General Implicit and Explicit Representations

A more general implicit representation (IR) is obtained by adding five coefficients:

  (3.14)

The matrix of the system is generalized accordingly; let its eigenvalues be lambda_1 and lambda_2. The (analytic) explicit representation (ER) is then derived as (3.15) and (3.16).
,respectively,by is always an integerandandare real numbers.In the ER,real numbers are obtained if and only if,in whichcaseandbecome true complex numbers.This fact will provide an elegant way of explaining the system’s behavior,by conceptualizing it in a 5-D space,as discussed in Section IV .Note 3.1:Ifvalue,there must be some relations among the five real coef-ficientssignsign(3.21)The two equalities of (3.20)can be combined and simplified asfollows:.Inorder to satisfy these equations,a set of possible conditionsis.In thiscase,value and (3.20)is always satisfied.D.From ER to IRThe ER will be useful to find convergence conditions.Nev-ertheless,in practice,the iterative form obtained from (3.19)isvery useful,as shown in (3.24)at the bottom of the page.Although there are an infinity of solutions in terms of the fiveparameters(3.25)In this particularcase,are.Under this additional condition,a class of solution is simply givenbyand,for ex-ample,and acorrespondingsignsign(3.20)3)Class Model:A second model related to the Class 1formula is definedbyhasbeen well studied.See Section IV-C for further discussion.4)Class 2Model:A second class of models is defined by therelationsandvalue is tohave,,andclassdue to the presence of thetermnumberwhen.Nevertheless,if the max-imumand.Now theconditionsbecomeandvalues.Fig.2(a)shows an example ofconvergence(.I.Reality and ConvergenceThe quick convergence seen in the above example suggestsan interesting question.Does reality—using real-valued vari-ables—imply convergence?In other words,does the followinghold for real-valued systemparameters:(for in-stance),sinceand are usually true complex numbers.Thus,the whole system can be represented in a5-DspaceIn this section,we study some examples of the most simple class of constricted cases:the ones with just one constriction coefficient.These will allow us to devise methods for control-ling the behavior of the swarm in ways that are desirable for optimization.A.Constriction for Model Type1Model 
Type 1 is described as follows: by requiring the eigenvalues to remain within the unit disk, the constriction coefficient below is produced:

  chi = 2 / | 2 - phi - sqrt(phi^2 - 4*phi) |        (4.3)

B. Constriction for Model Type 1'

Using the Type 1' recurrence instead of Type 1, can analogous coefficient values be found? The answer is no, for in this case the relevant eigenvalue is a real number and its absolute value is: 1) strictly decreasing on one subinterval (and greater than 1); 2) strictly decreasing on the other, so the coefficient cannot be too small, depending on phi; the coefficient is less than 1.0 when phi < 4.0. These coefficients identify the conditions for convergence of the particle system.

Note 4.1: The above analysis is for the deterministic case. When phi is random, it is nevertheless possible to have convergence, even with a small constriction coefficient, when at least one of the conditions holds. Referring to the Class model, in the particular case considered, we use the IR of (4.9), so it may be interesting to detail how, in practice, the constriction coefficient is found and its convergence properties proven.

Step 1) Matrix of the System: We have immediately the trace (4.11) and the discriminant. In the region where the discriminant is negative, the eigenvalues are true complex numbers and their absolute value (i.e., modulus) is simple to compute, so that the eigenvalues are true complex numbers for a large field of phi values: 1) as soon as the discriminant condition holds; 2) this is the same as in Constriction Type 1; 3) we know from the algebraic point of view that the system is (eventually) convergent when the dominant eigenvalue has modulus less than one, and the discriminant is negative only between its two roots. The general algebraic form of these values is found by solving the corresponding equation; this relation is valid for the two values given in Table II.

D. Moderate Constriction

While it is desirable for the particle's trajectory to converge, relaxing the constriction allows the particle to oscillate through the problem space initially, searching for improvement. Therefore, it is desirable to constrict the system moderately, preventing explosion while still allowing for exploration. To demonstrate how to produce moderate constriction, a suitable ER is used. In other words, there are many different IRs that produce the same explicit
one. For example, see (4.20).

Fig. 5. Real parts of y and v, varying over 50 units of time, for a range of phi values.

We can clearly see the three cases: 1) "spiral" easy convergence toward a nontrivial attractor for phi = 2.5 [see Fig. 6(a)]; 2) difficult convergence for phi = 3.99 [see Fig. 6(b)]; and 3) quick, almost linear convergence for phi = 6.0 [see Fig. 6(c)]. Nevertheless, it is interesting to have a look at the true system, including the complex dimensions. Fig. 6(d)-(f) shows some other sections of the whole surface, in a circle of center (0, 0) whose radius has been precisely chosen for the purpose.

Fig. 6. Trajectories of a particle in phase space with three different values of phi. (a), (c), and (e) Real parts of the velocity v and position relative to the previous best y. (b), (d), and (f) Real and imaginary parts of v. (a) and (d) show the attractor for a particle with phi = 2.5; the particle tends to orbit, rather than converging to 0.0. (b) and (e) show the same views with phi = 3.99. (c) and (f) depict the "easy" convergence toward 0.0 of a constricted particle with phi = 6.0; the particle oscillates with quickly decaying amplitude toward a point in the phase space (and the search space).

As the imaginary part tends to zero, this provides an intuitive way to transform this stabilization into a true convergence.

Fig. 7. "Trumpet" global attractor when phi < 4. Axes (Re(v), Im(v), phi). (a) Effect on phi of the real and imaginary parts of v. (b) Effects of the real and imaginary parts of y.

We just have to use a second coefficient in order to reduce the attractor, so that (4.22) holds. The models studied here have only one constriction coefficient. If one sets phi to be random, two vector terms are added to the velocity. In this section the results are generalized back to the original system as defined by (5.1). The parameters are now defined as in (5.2), to obtain exactly the original nonrandom system described in Section I. For instance, if there is a cycle, then (5.3) holds. Coming back to the original system, the parameters are given by (5.4). The use of the constriction coefficient can be viewed as a recommendation to the particle to
"take smaller steps." The convergence is toward the previous best point.2

2Convergence implies velocity = 0, but the convergent point is not necessarily the one we want, particularly if the system is too constricted. We hope to show in a later paper how to cope with this problem, by defining the optimal parameters.

Fig. 8. Example of the trajectory of a particle with the "original" formula containing two phi(p - x) terms, where phi is the upper limit of a uniform random variable. As can be seen, velocity v converges to 0.0 and the particle's position x converges on the previous best point p.

Calculate chi from the chosen coefficients
Initialize population: random x
Do
    For i = 1 to population size
        if f(x_i) < f(p_i) then p_i = x_i
        For d = 1 to dimension
            phi_1 = rand() * (phi / 2)
            phi_2 = rand() * (phi / 2)
            phi = phi_1 + phi_2
            p_m = (phi_1 p_id + phi_2 p_gd) / phi
            v_id = chi (v_id + phi (p_m - x_id))
            x_id = x_id + v_id
        Next d
    Next i
Until termination criterion is met.

In this generalized version of the algorithm, the user selects the version and chooses values for the coefficients that are consistent with it. Then the two eigenvalues are computed and the greater one is taken. This operation can be performed as follows.

a = trace / 2
discrim = a^2 - det
if (discrim > 0) then
    neprim1 = abs(a + sqrt(discrim))
    neprim2 = abs(a - sqrt(discrim))
else
    neprim1 = neprim2 = sqrt(det)

FUNCTIONS USED TO TEST THE EFFECTS OF THE CONSTRICTION COEFFICIENTS

TABLE IV: FUNCTION PARAMETERS FOR THE TEST PROBLEMS

A. Algorithm
Variations Used

Three variations of the generalized particle swarm were used on the problem suite.

Type 1: The first version applied the constriction coefficient to all terms of the formula, using a chi computed from a "generic" value of phi.

Type 1'': The second version tested was a simple constriction, which was not designed to converge, but not to explode either, as the coefficient was assigned a value of 1.0.

Experimental: The third version's coefficient was initially defined by the model; if it proved unsatisfactory, it was multiplied by 0.9 iteratively. Once a satisfactory value was found, the corresponding model was implemented. As in the first version, a "generic" value of phi was used.

Table IV displays the problem-specific parameters implemented in the experimental trials.

B. Results

Table V compares various constricted particle swarms' performance to that of the traditional particle swarm and the evolutionary optimization (EO) results reported by [1]. All particle swarm populations comprised 20 individuals. Functions were implemented in 30 dimensions, except for f2, f5, and f6, which are given for two dimensions. In all cases except f5, the globally optimal function result is 0.0. For f5, the best known result is 0.998004. The limit Vmax of the control parameter was set to the range of the initial domain for the function. Function results were saved with six decimal places of precision.

As can be seen, the Type 1 and Type 1'' constricted versions outperformed the traditional versions in almost every case; the experimental version was sometimes better, sometimes not. Further, the Type 1 and Type 1'' constricted particle swarms performed better than the comparison evolutionary method on three of the four functions. With some caution, we can at least consider the performances to be comparable.

Eberhart and Shi's suggestion to hedge the search by retaining Vmax with Type 1'' constriction does seem to result in good performance on all functions. It is the best on all except the Rosenbrock function, where performance was still respectable. An analysis of variance was performed comparing the "E&S" version with Type 1'', standardizing data within functions. It was
found that the algorithm had a significant main effect.

TABLE V: EMPIRICAL RESULTS. Mean best evaluations at the end of 2000 iterations for various versions of the particle swarm and Angeline's evolutionary algorithm [1].

It is extremely likely that features of the implementation are responsible for some variance in the observed results. The comparison, though, does allow the reader to confirm that constricted particle swarms are comparable in performance to at least one evolutionary algorithm on these test functions.

As has long been noted, the particle swarm succeeds at finding optimal regions of the search space, but has no feature that enables it to converge on optima (e.g., [1]). The constriction techniques reported in this paper solve this problem; they do force convergence. The data clearly indicate an increase in the ability of the algorithm to find optimal points in the search space for these problems as a result.

No algorithmic parameters were adjusted for any of the particle swarm trials. The method, in fact, requires only the addition of a single coefficient, calculated once at the start of the program, with almost no increase in time or memory resources.

In the current analysis, the sine waves identified by Ozcan and Mohan [6], [7] turn out to be the real parts of the 5-D attractor. In complex number space, e.g., in continuous time, the particle is seen to spiral toward an attractor, which turns out to be quite simple in form: a circle. The real-number section by which this is observed when time is treated discretely is a sine wave. The 5-D perspective summarizes the behavior of a particle completely and permits the development of methods for controlling the explosion that results from randomness in the system. Coefficients can be applied to various parts of the formula in order to guarantee convergence, while encouraging exploration. Several kinds of coefficient adjustments are suggested in the present paper, but we have barely scratched the surface and plenty of experiments should be
prompted by these findings. Simple modifications based on the present analysis resulted in an optimizer which appears, from these preliminary results, to be able to find the minima of some extremely complex benchmark functions. These modifications can guarantee convergence, which the traditional particle swarm does not. In fact, the present analysis suggests that no problem-specific parameters may need to be specified.

We remind the reader that the real strength of the particle swarm derives from the interactions among particles as they search the space collaboratively. The second term added to the velocity is derived from the successes of others; it is considered a "social influence" term, and when this effect is removed from the algorithm, performance is abysmal [3].

[5] Y. Shi and R. C. Eberhart, "Parameter selection in particle swarm adaptation," in Evolutionary Programming VII, V. W. Porto, N. Saravanan, D. Waagen, and A. E. Eiben, Eds. Berlin, Germany: Springer-Verlag, 1997, pp. 591-600.
[6] E. Ozcan and C. K. Mohan, "Analysis of a simple particle swarm optimization problem," in Proc. Conf. Artificial Neural Networks in Engineering, C. Dagli et al., Eds., St. Louis, MO, Nov. 1998, pp. 253-258.
[7] E. Ozcan and C. K. Mohan, "Particle swarm optimization: Surfing the waves," in Proc. 1999 Congr. Evolutionary Computation, Washington, DC, July 1999, pp. 1939-1944.
[8] R. C. Eberhart and Y. Shi, "Comparing inertia weights and constriction factors in particle swarm optimization," in Proc. 2000 Congr. Evolutionary Computation, San Diego, CA, July 2000, pp. 84-88.
[9] K. De Jong, "An analysis of the behavior of a class of genetic adaptive systems," Ph.D. dissertation, Dept. Comput. Sci., Univ. Michigan, Ann Arbor, MI, 1975.
[10] R. G. Reynolds and C.-J. Chung, "Knowledge-based self-adaptation in evolutionary programming using cultural algorithms," in Proc. IEEE Int. Conf. Evolutionary Computation, Indianapolis, IN, Apr. 1997, pp. 71-76.
[11] L. Davis, Ed., Handbook of Genetic Algorithms. New York: Van Nostrand Reinhold, 1991.

Maurice Clerc received the M.S. degree in mathematics (algebra and
complex functions) from the Université de Villeneuve, France, and the Eng. degree in computer science from the Institut Industriel du Nord, Villeneuve d'Ascq, France, in 1972. He is currently with Research and Design, France Télécom, Annecy, France. His current research interests include cognitive science, nonclassical logics, and artificial intelligence. Mr. Clerc is a Member of the French Association for Artificial Intelligence and the Internet Society.

James Kennedy received the Master's degree in psychology from the California State University, Fresno, in 1990, and the Doctorate from the University of North Carolina, Chapel Hill, in 1992. He is currently a Social Psychologist with the Bureau of Labor Statistics, Washington, DC, working in data collection research. He has been working with particle swarms since 1994.
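The constriction coefficient discussed above can be obtained either from the familiar closed form or from the eigenvalues of the simplified deterministic system. The sketch below assumes the standard 2x2 system matrix [[1, phi], [-1, 1 - phi]] for v' = v + phi*y, y' = -v + (1 - phi)*y, and kappa = 1; the function names are ours, not the paper's.

```python
import math
import cmath

def constriction(phi: float, kappa: float = 1.0) -> float:
    """Closed-form Type 1'' constriction coefficient, valid for phi > 4."""
    return 2.0 * kappa / abs(2.0 - phi - math.sqrt(phi * phi - 4.0 * phi))

def constriction_via_eigenvalues(phi: float, kappa: float = 1.0) -> float:
    """Same coefficient obtained as kappa / |largest eigenvalue| of the
    2x2 system matrix [[1, phi], [-1, 1 - phi]] (trace 2 - phi, det 1)."""
    a = 1.0 - phi / 2.0                       # trace / 2
    discrim = (phi * phi - 4.0 * phi) / 4.0   # (trace/2)^2 - det
    root = cmath.sqrt(discrim)                # complex when phi < 4
    lam1, lam2 = a + root, a - root           # the two eigenvalues
    return kappa / max(abs(lam1), abs(lam2))
```

For phi = 4.1 both routes give the widely quoted value of roughly 0.7298, which is consistent with the experiments reported above.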
A sample English abstract on particle swarm optimization
Particle Swarm Optimization (PSO) is a powerful and widely used computational method inspired by the social behavior of birds and fish. Developed by Russell Eberhart and James Kennedy in 1995, it falls under the category of swarm intelligence algorithms, which leverage the collective behavior of decentralized systems to solve optimization problems. PSO is particularly effective for continuous optimization tasks, where it seeks to find optimal or near-optimal solutions within a defined search space.

The core concept of PSO involves a group of individual entities, referred to as particles, which traverse the solution space by adjusting their positions based on their own experiences and those of their neighbors. Each particle maintains a record of its best-known position, as well as the best-known position of its group. This dual memory guides the particles in their movement and allows them to iteratively converge toward the optimal solution. The movement of each particle is influenced by two primary components: cognitive and social. The cognitive component draws from the particle's own history, while the social component draws from the group's best-known position.

A distinctive feature of PSO is its simplicity and ease of implementation, which makes it suitable for a broad range of applications, including function optimization, neural network training, and engineering design problems. The PSO algorithm can be adjusted with parameters such as the number of particles, their velocities, and the coefficients controlling the cognitive and social influences. These parameters can significantly affect the convergence speed and accuracy of the method.

Moreover, PSO has been adapted and enhanced through various strategies, such as employing different topologies for particle interactions, incorporating constraints for specific problem domains, and hybridizing with other optimization techniques to improve performance.
These modifications allow PSO to effectively tackle more complex and multi-dimensional problems.

Despite its advantages, PSO has some limitations, such as the potential for premature convergence and sensitivity to parameter settings. Researchers continue to investigate ways to overcome these challenges, leading to the development of improved variants and hybrid methodologies that combine the strengths of PSO with other optimization strategies.

In summary, Particle Swarm Optimization represents a significant advancement in the field of optimization algorithms. Its biologically inspired mechanism, adaptability, and versatility have made it a popular choice for solving a wide array of real-world problems across diverse disciplines. As ongoing research continues to refine and enhance PSO, it is poised to remain an essential tool for optimization in the future.
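The cognitive and social components described in the abstract can be illustrated with a minimal global-best PSO sketch. All names and parameter values here are illustrative choices of ours, not taken from the text.

```python
import random

def pso_minimize(f, dim, n_particles=20, iters=200,
                 w=0.7, c1=1.5, c2=1.5, bound=5.0, seed=1):
    """Minimal global-best PSO: each particle blends its own memory
    (cognitive term) with the swarm's best-known position (social term)."""
    rng = random.Random(seed)
    x = [[rng.uniform(-bound, bound) for _ in range(dim)] for _ in range(n_particles)]
    v = [[0.0] * dim for _ in range(n_particles)]
    pbest = [xi[:] for xi in x]                 # personal best positions
    pval = [f(xi) for xi in x]                  # personal best values
    g = min(range(n_particles), key=lambda i: pval[i])
    gbest, gval = pbest[g][:], pval[g]          # swarm-wide best
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                v[i][d] = (w * v[i][d]
                           + c1 * r1 * (pbest[i][d] - x[i][d])   # cognitive
                           + c2 * r2 * (gbest[d] - x[i][d]))     # social
                x[i][d] += v[i][d]
            val = f(x[i])
            if val < pval[i]:                   # update memories
                pbest[i], pval[i] = x[i][:], val
                if val < gval:
                    gbest, gval = x[i][:], val
    return gbest, gval

# Example: minimize the sphere function in 5 dimensions.
best, value = pso_minimize(lambda p: sum(t * t for t in p), dim=5)
```

With the inertia weight below 1 and moderate acceleration coefficients, the swarm contracts around the best region it has found, which is the convergence behavior the abstract describes.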
An integrated energy management strategy for traction photovoltaic generation systems in complex electrified railways, considering braking energy transfer
GUO Yuejin, DING Can, YOU Haichuan, ZHANG Hongrong (College of Electrical Engineering and New Energy, China Three Gorges University, Yichang 443000, Hubei, China). Abstract: This study proposes a new photovoltaic generation system to effectively relieve the pressure on the external power grid and achieve effective recovery of regenerative braking energy.
First, the energy control modes of the system are analyzed in detail. Second, an integrated energy management strategy that takes mutual braking energy transfer into account is designed. The compensation power reference of the back-to-back converter is then calculated with both the particle swarm optimization (PSO) algorithm and an optimized PSO algorithm. Finally, experiments verify the effectiveness of the strategy, and the convergence of the simulation results of the two algorithms is compared.
A model of a photovoltaic-integrated traction power supply system based on a back-to-back converter is built in MATLAB/Simulink, together with an effective energy management strategy and coding algorithm; energy utilization and interphase braking energy transfer under a series of typical operating conditions are examined and compared with the results obtained without the algorithm.
The study finds that, with the proposed system architecture and integrated energy management strategy, photovoltaic generation can be used effectively, the use of interphase braking energy can be optimized to the greatest extent, and convergence after optimization is also better.
Key words: traction photovoltaic system; braking energy; energy management; back-to-back converter; particle swarm optimization. CLC number: TM73. DOI: 10.16786/ki.1671-8887.eem.2024.02.014

Integrated Energy Management Strategy for Complex Railway Electrification System Traction Photovoltaic Systems Considering Braking Energy Transfer

GUO Yuejin, DING Can, YOU Haichuan, ZHANG Hongrong (College of Electrical Engineering and New Energy, China Three Gorges University, Yichang 443000, China)

Abstract: A new photovoltaic power generation system is proposed in this paper to effectively relieve external power grid pressure and achieve effective regenerative braking energy recovery. First, the energy control mode of the system was analyzed in detail; second, a comprehensive energy management strategy was designed which takes into account the transfer of braking energy. Next, the compensation power reference value of the back-to-back converter was calculated using the particle swarm optimization (PSO) algorithm and an optimized PSO algorithm. We verified the efficacy of the strategy through experiments and then contrasted the simulation outcomes of the PSO algorithm and the optimized PSO algorithm. In a MATLAB/Simulink simulation, we established a model of a photovoltaic-integrated traction power supply system based on a back-to-back converter, designed an effective energy management strategy and coding algorithm, and examined energy utilization and interphase braking energy transfer under a series of typical conditions, for comparison with the results obtained without the algorithm. The results show that the proposed system architecture and integrated energy management strategy can effectively utilize photovoltaic power generation, that the use of interphase braking energy can be optimized to the maximum extent, and that convergence after optimization is better.

Key words: photovoltaic system for traction; braking energy; energy management; back-to-back converter; particle swarm algorithm

About the author: GUO Yuejin (1999-), male (Han nationality), from Changzhi, Shanxi, is a Master's student mainly engaged in research on traction systems.
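The abstract above compares a standard PSO with an "optimized" PSO for computing the converter's compensation power reference, but does not spell out the modification or the objective function. As a purely illustrative sketch, the code below uses one common "improved PSO" variant, a linearly decreasing inertia weight, on an assumed quadratic cost; the objective, bounds, and all numbers are our assumptions, not the paper's.

```python
import random

def pso_1d(cost, lo, hi, n=15, iters=100,
           w_start=0.9, w_end=0.4, c1=2.0, c2=2.0, seed=0):
    """1-D PSO with a linearly decreasing inertia weight (a common
    'improved PSO' variant; the paper's own modification is not given here)."""
    rng = random.Random(seed)
    x = [rng.uniform(lo, hi) for _ in range(n)]
    v = [0.0] * n
    pbest, pval = x[:], [cost(xi) for xi in x]
    g = min(range(n), key=lambda i: pval[i])
    gbest, gval = pbest[g], pval[g]
    for t in range(iters):
        w = w_start + (w_end - w_start) * t / (iters - 1)  # inertia schedule
        for i in range(n):
            r1, r2 = rng.random(), rng.random()
            v[i] = w * v[i] + c1 * r1 * (pbest[i] - x[i]) + c2 * r2 * (gbest - x[i])
            x[i] = min(max(x[i] + v[i], lo), hi)           # keep inside bounds
            val = cost(x[i])
            if val < pval[i]:
                pbest[i], pval[i] = x[i], val
                if val < gval:
                    gbest, gval = x[i], val
    return gbest, gval

# Assumed toy objective: deviation of the compensation power reference
# p_ref (kW) from a hypothetical ideal balance point at 120 kW.
p_ref, cost_val = pso_1d(lambda p: (p - 120.0) ** 2, lo=0.0, hi=500.0)
```

In the paper's setting the cost would instead encode grid pressure relief and interphase braking energy use; the decreasing inertia weight simply trades early exploration for late exploitation, which is one way convergence can be improved.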
40 multiple-choice questions on reading English scientific and technical papers (senior high school level)
1. The term "nanotechnology" refers to the manipulation of matter on an extremely small _____.
A. scale
B. level
C. degree
D. range
Answer: A.
This question tests distinguishing the meanings of nouns.
"Scale" can mean size, extent, or degree; "on a small scale" means "in a small way," and in "nanotechnology" it emphasizes the size at which matter is manipulated. "Level" stresses a rank or standard; "degree" refers to extent or degrees; "range" denotes a span or scope.
Here, "nanotechnology" involves manipulating matter at an extremely small scale, so A is the answer.
2. In the field of artificial intelligence, the concept of "machine learning" involves the ability of computers to ______ patterns in data.
A. detect
B. discover
C. expose
D. reveal
Answer: A.
"Detect" means to notice, sense, or find, stressing discovery through observation or testing; "discover" stresses finding something that already existed but was unknown; "expose" means to lay bare or bring to light; "reveal" means to disclose or show.
In "machine learning," the computer detects patterns by analyzing data, so option A best fits the context.
3. The "quantum physics" phenomenon is characterized by the behavior of particles at the ______ level.
A. atomic
B. molecular
C. subatomic
D. microscopic
Answer: C.
The Fully Informed Particle Swarm: Simpler, Maybe Better

Rui Mendes, Member, IEEE, James Kennedy, and José Neves

Abstract—The canonical particle swarm algorithm is a new approach to optimization, drawing inspiration from group behavior and the establishment of social norms. It is gaining popularity, especially because of the speed of convergence and the fact that it is easy to use. However, we feel that each individual is not simply influenced by the best performer among his neighbors. We, thus, decided to make the individuals "fully informed." The results are very promising, as informed individuals seem to find better solutions in all the benchmark functions.

Index Terms—Optimization, particle swarm optimization, social networks.

I. INTRODUCTION

THE CANONICAL particle swarm algorithm works by searching iteratively in a region that is defined by each particle's best previous success, the best previous success of any of its neighbors, the particle's current position, and its previous velocity. The current paper proposes an alternative that is conceptually more concise and promises to perform more effectively than the traditional particle swarm algorithm. In this new version, the particle uses information from all its neighbors, rather than just the best one.

The standard algorithm is given in some form resembling the following:

v_t = alpha v_{t-1} + U(0, beta) (*) (p_i - x_{t-1}) + U(0, beta) (*) (p_g - x_{t-1})    (1)
x_t = x_{t-1} + v_t    (2)

where (*) denotes point-wise vector multiplication, U(min, max) is a function that returns a vector whose positions are randomly generated following the uniform distribution between min and max, alpha is called the inertia weight and is less than 1, v and x represent the speed and position of the particle at time t, p_i refers to the best position found by the particle, and p_g refers to the position found by the member of its neighborhood that has had the best performance so far. The Type 1 constriction coefficient is often used [1]:

v_t = chi ( v_{t-1} + U(0, phi_1) (*) (p_i - x_{t-1}) + U(0, phi_2) (*) (p_g - x_{t-1}) )    (3)
chi = 2 / |2 - phi - sqrt(phi^2 - 4 phi)|, where phi = phi_1 + phi_2, phi > 4    (4)

Manuscript received July 10, 2002; revised August 29, 2003. The work of R. Mendes was supported in part by PRAXIS XXI, ref. BD/3107/9 and POSI/ROBO/43904/2002. R. Mendes and J. Neves are with the Departamento de
Informática, Universidade do Minho, Braga 4710-057, Portugal (e-mail: azuki@di.uminho.pt; jneves@di.uminho.pt). J. Kennedy is with the Bureau of Labor Statistics, Washington, DC 20212 USA (e-mail: Kennedy.Jim@). Digital Object Identifier 10.1109/TEVC.2004.826074

The two versions are equivalent, but are simply implemented differently. The second form is used in the present investigations. Other versions exist, but all are fairly close to the models given above.

A particle searches through its neighbors in order to identify the one with the best result so far, and uses information from that one source to bias its search in a promising direction. There is no assumption, however, that the best neighbor at time t actually found a better region than the second or third best neighbors. Important information about the search space may be neglected through overemphasis on the single best neighbor.

When constriction is implemented as in the second version above, lightening the right-hand side of the velocity formula, the constriction coefficient chi is calculated from the values of the acceleration coefficient limits phi_1 and phi_2; importantly, it is the sum of these two coefficients that determines what chi to use. This fact implies that the particle's velocity can be adjusted by any number of terms, as long as the acceleration coefficients sum to an appropriate value. For instance, the algorithm given above is often used with phi_1 = phi_2 = 2.05; the coefficients must sum, for that value of chi, to 4.1. Clerc's analysis was worked out using a condensed form of the formula

v_{t+1} = chi ( v_t + phi (p_m - x_t) )    (5)
x_{t+1} = x_t + v_{t+1}    (6)

which was then expanded to partition the acceleration weight phi between the particle's own previous success and the neighborhood's, such that phi = phi_1 + phi_2. Note that in this deterministic model p_m is calculated as p_m = (phi_1 p_i + phi_2 p_g) / (phi_1 + phi_2).

II. VARIATION AND PARTITIONING OF phi

The search of particle i converges on a point in the search space. Variation is introduced in several ways.

•First, obviously, the term is weighted by a random number. This in itself, however, would not prevent the velocity from approaching a zero limit. For instance, if the difference equals
zero, the velocity will still converge to zero.

•Thus, another important source of variation is the difference between x and p_i. As long as the position of the particle differs from the previous best position, then there will be movement. In a constricted algorithm, however, this difference tends toward zero over time as p_i is updated.

•Of course, it is hoped in practice that p_i does not remain fixed, and a key source of variation is the updating of p_i over time as new points are found in the search space which are better than those previous ones. It is not necessary for p_g to be i's own previous best point in order for i's trajectory to converge to it. For convergence, it is only necessary for p_g to remain fixed.

•In the traditional particle swarm, the very most important source of variation is the difference between i's own previous best and the neighborhood's previous best, that is, between p_i and p_g. Random weighting of the two terms keeps the particle searching between and beyond a region defined by the two points. While some investigators have looked at schemes for differentially weighting the two terms (e.g., [2]), the limits for the two uniform distributions are usually the same. That is, the total weight phi is partitioned into two equal components.

1089-778X/04$20.00 © 2004 IEEE

Clerc's method, however, does not require that the velocity adjustments be shared between two terms. It is only necessary that the parts sum to a value that is appropriate for the constriction weight. The algorithm will behave properly, at least as far as its convergence and explosion characteristics, whether all of phi is allocated to one term, or it is divided into thirds, fourths, etc. We propose an alternate form of calculating p_m:

p_m = ( sum_{k in N} W(k) phi_k (*) p_k ) / ( sum_{k in N} W(k) phi_k )    (7)
phi_k = U[0, phi_max / |N|]  for all k in N    (8)

where N is the set of neighbors of the particle and p_k is the best position found by individual k. The function W may describe any aspect of the particle that is hypothesized to be relevant; in the experiments reported below, we use the fitness of the best position found by the particle, and the distance from that particle to the current individual, or have W
return a constant value. Because all the neighbors contribute to the velocity adjustment, we say that the particle is fully informed.

III. SOCIOMETRY IN THE FULLY INFORMED PARTICLE SWARM

In the traditional particle swarm, a particle with k neighbors selects one to be a source of influence and ignores the others. In that situation, neighborhood size means how many other particles you can choose among, and the more there are, the better the one you pick is likely to be. In the fully informed neighborhood, however, all neighbors are a source of influence. Thus, neighborhood size determines how diverse your influences will be, and in an optimization algorithm diverse influences might mean that search is diluted rather than enhanced.

The rest of the paper will describe experiments with various neighborhoods, where all the neighbors' previous best values are used to modify the velocity of the particle. These arrangements of the neighborhoods can be thought of as social networks.

It should be appreciated that the topological structure of the population controls its exploration versus exploitation tendencies [3], [4].

Fig. 1. Topologies used in the paper are presented in the following order: All, where all vertexes are connected to every other; Ring, where every vertex is connected to two others; Four clusters, with four cliques connected among themselves by gateways; Pyramid, a triangular wire-frame pyramid; and Square, which is a mesh where every vertex has four neighbors and that wraps around on the edges as a torus.

The behavior of each particle is affected by its local neighborhood, which can be seen as a single region in the population topology. Thus, the topology affects search at a low level by defining neighborhoods. Particles that are acquainted with one another tend to explore the same region of the search space. It also affects search at a higher level, by defining the relationships between the local neighborhoods.

The current study tested five different social networks that had given good results in a previous
study [3]. The networks are encoded in binary matrices for input into the program, and are depicted graphically in Fig. 1.

A social network can be characterized by a series of statistics that convey some information about its structure and the speed of communication flow. The most descriptive statistics are the graph's average distance, its diameter, and the distribution sequence. The average distance measures the average number of edges between any two nodes. The diameter is the largest distance between two nodes in the graph. The distribution sequence is a descriptive statistic of the form (d_1, d_2, ...), where d_n is the average number of nodes reachable from a vertex of the graph by traversing exactly n arcs, without cycles. Note that the first value of the distribution sequence, d_1, is the average degree of the graph.

Whenever a particle discovers a good region of the search space, it only directly influences its neighbors. Its second-degree neighbors will only be influenced after those directly connected to it become highly successful themselves. Thus, there is a delay in the information spread through the graph. This delay can be characterized by the distribution sequence statistic. The average distance and the diameter of the graph are two simple statistics that represent the average and the maximum, respectively, number of cycles of influence needed to broadcast information throughout the graph.

TABLE I: TOPOLOGIES USED IN THE STUDY AND THE ASSOCIATED GRAPH STATISTICS

By studying Table I, we can extract a number of conclusions about the topologies used. The all topology was the one used when the algorithm was developed and is still widely used by researchers. It represents a fully connected graph, and, based on all three statistics, we conjecture that information spreads quickly. Sociologically, it could represent a small and closed community where decisions are taken in consensus.

The ring topology, the usual alternative to all, represents a regular graph with a minimum number of edges between its nodes. The graph statistics show that
information travels slowly along the graph. This allows for different regions of the search space to be explored at the same time, as information about successful regions takes a long time to travel to the other side of the graph.

The four clusters topology represents four cliques connected among themselves by several gateways. Sociologically, it resembles four mostly isolated communities, where a few individuals have an acquaintance outside their group. This graph is characterized by the large number of individuals three hops away, despite the fact that its diameter is only 3.

The pyramid represents a three-dimensional wire-frame pyramid. It has the lowest average distance of all the graphs and the highest number of first and second degree neighbors.

The square is a graph representing a rectangular lattice that folds like a torus. This structure, albeit artificial, is commonly used to represent neighborhoods in the Evolutionary Computation and Cellular Automata communities, and is referred to as the von Neumann neighborhood.

IV. DEPENDENT VARIABLES AND FREE LUNCH

The present experiments extracted three kinds of measures of performance on a standard suite of test functions. The functions were the sphere or parabolic function in 30 dimensions, Rastrigin's function in 30 dimensions, Griewank's function in 10 and 30 dimensions (the importance of the local minima is much higher in 10 dimensions, due to the product of cosines, making it much harder to find the global minimum), Rosenbrock's function in 30 dimensions, and Schaffer's f6, which is in 2 dimensions. Formulas can be found in the literature, e.g., in [5].

It does not seem interesting to us to demonstrate that an algorithm is good on some functions and not on others. What we hope for is a problem-solver that can work well with a wide range of problems. This line of reasoning drives us head-on into the no free lunch (NFL) theorem [6], [7].

A. Free Lunch

NFL asserts that no algorithm can be better than any other, over all possible functions. This seems
to be true because of two classes of functions: deceptive ones, and random ones. Deceptive functions lead a hill-climber away from the optimum; for instance, there may be gradients that lead away from a discontinuous point that is the global optimum. Over all possible functions, it must be true that gradients lead away from the optimum at least as often as they lead the searcher toward it. The second class, random functions, contains very many more members than the first [8]. In fact, when all possible functions are considered, it seems certain, indeed it can be proven, that most of them are nonsense. Where gradients exist, they are unrelated to real solutions. On these very numerous function landscapes, a hill-climber will do no better than a hill-descender, no matter whether you are trying to maximize or minimize. It is like finding a needle in a haystack; no method of search can be any better than dumb luck. These two classes of functions explain why there is NFL.

But there is a third class of functions. These are functions where regularities on the fitness landscape do provide clues as to the location of a problem solution. Speaking of dumb luck, it is lucky for us that this third class contains most of the kinds of functions that we call problems. Problems are a special subclass of functions; they are special because somebody thinks there may be a solution to them, and wants to find it.

It is interesting to consider whether this third class of functions is actually more common in the world, perhaps because of correlations forced by physical laws, or whether they are merely more salient because of some idiosyncrasy of human attention. As we cannot count up instances of real function landscapes (like the set of "all possible functions," they are innumerable and meaningless), we will never be able to satisfy our curiosity regarding this question.

How do we know if a function has a solution or not? Of course, we have known since Turing that we cannot tell with certainty whether an algorithm
will ever reach finality [9], that is, in this case, whether a problem can be solved. But even though there is no certainty, there are clues. For instance, if it is believed that a cause and effect relationship exists among variables in the function, then we may expect to find some exploitable regularities in the fitness landscape. Even if the causal relationship is noisy, or if the relationship involves variables not mentioned in the function (e.g., the "third variable problem" in correlational research [10]), it is often possible to find useful features on the function landscape.

Another clue that a function might be solvable is when it is compressible. The easiest-to-spot form of this clue exists when the problem is given as a mathematical formula, rather than a lookup table. If the formula is shorter than the table of all possible input-output matches, then we have been given a hint that it might be useful to watch for regularities. The evidence of this is seen in the difficulty of the search for functions that produce random outputs [11]; it is not easy to produce an unpredictable series out of a mathematical formula, e.g., a good random number generator, even though random functions are known to comprise the larger share of the universe of all possible functions.

In some hard cases, we may have only hypotheses and intuitions to provide ideas for how to search for patterns that will reveal the problem solution. Sometimes we are wrong, and a problem is reassigned to one of the first two classes of functions. We reiterate the important point that a function is only a problem if someone thinks it is a problem. That means that a
does turn out to be deceptive or random, and they give up on it, and it stops being a problem, it is a problem during the time they are trying to solve it. It may remain a problem if something about it gives the researcher hope of solving it. This is all a way of saying that the NFL theorem, eyeball-popping as it is, is not especially relevant to the task of problem-solving. NFL does not say that the search for a general problem-solver is futile; it does say that the search for a general function optimizer is futile. As researchers, it is our aim to minimize the amount of time we devote to searching for optima on deceptive and random function spaces.

Thus, in the current exercises we combined results from all the test functions, all of which are commonly used in experimentation with optimization algorithms, with the goal in mind of finding versions of the particle swarm algorithm that perform well on all of them. If we are successful in this, then we will naturally extend the range of problems until we have widened the applicability of the particle swarm to its broadest extent.

B. Performance

The first dependent variable is simply the best function result after some arbitrary number of iterations; here, we use 1,000.
Basically, this is a measure of sloppy speed. It does not necessarily indicate whether the algorithm is close to the global optimum; a relatively high score can be obtained on some of these multimodal functions simply by finding the best part of a locally optimal region.

It is not possible to combine raw results from different functions, as they are all scaled differently. For instance, almost any decent algorithm will find a function result less than 0.01 on the sphere function, but a result of 40.0 on Rosenbrock is considered good. In order to combine the function outputs, we standardized the results of each function to a mean of 0.0 and standard deviation of 1.0. All results of all trials for a single function are standardized to the same scale; as all of these problems involve minimization, a lower result is better, and after standardization that means that a negative result is better than average. After standardizing each function separately, we can combine them and find the average for a single condition.

One comment about combining data from different functions: when a very good performance is combined with a very bad one, the result is a moderate average. On the other hand, a very good average can only be attained through combining very good scores. In this paper, we are interested in discovering very good performers and will neglect the confusion found in the middle.

C. Iterations to Criteria

The second dependent variable is the number of iterations required to reach a criterion. Function criteria are given in Table II.
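The standardization and combination of performance scores described under the first dependent variable can be sketched as follows. This is a minimal illustration; the raw values and condition names are hypothetical, not results from the paper.

```python
from statistics import mean, pstdev

# Hypothetical best-function-values after 1,000 iterations,
# indexed as function -> condition -> list of trial results.
raw = {
    "sphere":     {"A": [0.002, 0.004], "B": [0.010, 0.008]},
    "rosenbrock": {"A": [38.0, 45.0],   "B": [70.0, 66.0]},
}

def standardize(results):
    """Z-score every trial of one function to mean 0.0, s.d. 1.0."""
    allvals = [v for trials in results.values() for v in trials]
    mu, sd = mean(allvals), pstdev(allvals)
    return {cond: [(v - mu) / sd for v in trials]
            for cond, trials in results.items()}

# Standardize each function separately, then average per condition.
# Because the tasks are minimization, a more negative combined
# score indicates better-than-average performance.
z = {fn: standardize(res) for fn, res in raw.items()}
score = {cond: mean(v for fn in z for v in z[fn][cond])
         for cond in ["A", "B"]}
```

With these hypothetical numbers, condition A scores below the mean on both functions, so its combined standardized score is negative.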
This is also a measure of speed, but in this case the criteria are intended to indicate that the searcher has arrived in the region of the global optimum.

TABLE II
PARAMETERS AND CRITERIA FOR THE TEST FUNCTIONS

There is, however, a problem with this measure, too. That is, some trials might never reach the criteria. Many hours have been lost waiting, trying to give each version a fair chance to find the global optimum, often in vain. Trials where the criteria are not met after a reasonable time (here, we use 10,000 iterations) must be coded as infinite, which means among other things that the mean is meaningless. The proper measure of central tendency for such a data set is the median. If the majority of trials are coded as infinite, then the median is represented as infinity, shown in the results tables with the lemniscus. In order to combine iteration data, we used the mean of the medians, with the caveat that if any median were infinite, the mean would be infinite, too.

Note that the first measure, performance, considers data after a short run of 1,000 iterations, and is a speed measure. The trials were run for as many as 10,000 iterations, however, to determine whether the criterion would be met at all. Thus, one measure was taken at 1,000 iterations, and then if the criterion had not been met, the trial ran for as many iterations as were necessary.
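The combination rule just described (a median per function, failures coded as infinite, then the mean of the medians) can be sketched as follows. The iteration counts here are hypothetical, chosen only to show how a single infinite median makes the combined value infinite.

```python
import math
from statistics import median

# Hypothetical iterations-to-criterion; math.inf marks trials that
# failed to reach the criterion within 10,000 iterations.
iters = {
    "sphere":    [311, 402, 377, math.inf],       # one failure
    "rastrigin": [math.inf, math.inf, 950, 880],  # half failed
}

# The median per function is robust to a minority of failures, but
# becomes infinite once half or more of the trials are coded infinite.
medians = {fn: median(vals) for fn, vals in iters.items()}

# Mean of medians, with the caveat from the text: if any median is
# infinite, the combined measure is infinite too.
combined = (math.inf if any(math.isinf(m) for m in medians.values())
            else sum(medians.values()) / len(medians))
```

Here the sphere median is finite (389.5) despite one failed trial, but the rastrigin median, and therefore the combined value, is infinite.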
If the criterion was not met by 10,000 iterations, the trial was treated as if the criterion would never be met. In most cases this is true, as failure after 10,000 iterations suggests that the population has converged in an area of the search space that is not globally optimal. The first measure determines whether the algorithm can get a good solution fast, e.g., after only 1,000 iterations, while the second and third measures determine how long it takes to find the global optimum if left to run, or whether it can find it at all.

D. Proportion Reaching Criteria

The third dependent measure is perhaps the most important one. This is a simple binary code indicating whether the criteria were met within 10,000 iterations or not. Averaged over all function trials, this gives the proportion of trials that successfully found the global optimum. There is no trick to this one; the mean of the ones and zeroes, where one indicates success and zero failure, gives the proportion of successes.

V. METHOD

The experiment manipulated neighborhood topologies, initialization strategies, and algorithm details. The types of topologies have been described and are shown in Fig. 1. Two kinds of initialization strategies were used, which we called, after Shi and Eberhart [12], "symmetrical" and "asymmetrical." Symmetrical initialization is performed over the entire spectrum of valid solutions, while asymmetrical initialization started particles with an offset, so they were off-center. This eliminated any advantage that might be gained when function optima were located near the center of the parameter space.

TABLE III
STANDARDIZED PERFORMANCE OF THE TOPOLOGIES AND ALGORITHMS. NEGATIVE VALUES ARE BELOW THE MEAN WHILE POSITIVE VALUES ARE ABOVE. AS THE TASKS INVOLVE MINIMIZATION, THE BEST PERFORMANCES ARE THE MOST NEGATIVE. IN BOLD ARE THE BEST RESULTS FOR EACH ALGORITHM/INITIALIZATION PAIR

There were six kinds of algorithm types:

- Canonical: the traditional particle swarm, with Type 1 constriction;
- FIPS: the fully informed
particle swarm with the weighting returning a constant, i.e., where all contributions have the same value;
- wFIPS: a fully informed swarm, where the contribution of each neighbor was weighted by the goodness of its previous best;
- wdFIPS: also fully informed, with the contribution of each neighbor weighted by its distance in the search space from the target particle;
- Self: a fully informed model, where the particle's own previous best received half the weight;
- wSelf: a fully informed model, where the particle's own previous best received half the weight and the contribution of each neighbor was weighted by the goodness of its previous best.

Canonical, FIPS, and wFIPS were tested with both symmetrical and asymmetrical initializing. The five types of topologies shown in Fig. 1 were tested. As some were tested with and without including the target particle in the neighborhood, there were nine topology conditions: Square, Ring, Pyramid, and All were tested both ways, and FourClusters was only tested with the self excluded. Conditions without the self are written with a "U" prefix, e.g., USquare is the Square topology, with the reference to the particle's own index removed from the neighborhood.

VI. RESULTS

We present the results on the three dependent measures separately. Following that we look at patterns across the measures, and finally we discuss the implications of the results.

A. Performance

Table III shows the pattern of standardized averages across the topologies and algorithms. Recalling that positive values indicate bad performance and negative ones good for the minimization problem, we notice some patterns immediately. For instance, four of the nine wdFIPS algorithm conditions are quite bad (more than three standard deviations worse than the mean); one cell of the All topology was more than 3 s.d., and two cells more than one s.d. worse than the mean; and two of the UAll topology conditions were farther than one s.d. worse than the mean. Two other cells in the Square, and two in the Pyramid topology, were less than one
s.d. worse than the mean. These account for all of the worse-than-average cells in the design. Looking for excellence, we note that of the eight conditions resulting in a performance 0.4 standard deviations or farther below the mean, five of them occurred when the neighborhood was the unselfed square. The other three appear in selfless pyramid conditions. The best performance of all occurred in the selfless-square FIPS configuration. In light of the results presented below, it is noteworthy that problem solving using the URing topology was rather slow, relative to the others, while the USquare was rather fast.

The performance measure tells us how well a problem-solver is able to do within a limited amount of time. Many times in real-world applications it is "good enough" to find a good point on a local optimum; this first dependent variable tells us how high an algorithm is able to get on a fitness peak, but says nothing about whether it is the globally best peak.

B. Iterations to Criteria

How quickly does an algorithm reach a criterion that presumably reflects the presence of a global optimum? In Table IV, we see that some algorithm conditions cannot reach the criterion, even after 10,000 iterations. In particular, the wdFIPS tends not to reach it, especially with topologies that showed badly on the performance measure, as well; the All and UAll measures also failed in all cases with the FIPS variations, though they displayed about average success on the canonical algorithms. A few other topologies had trouble with the asymmetrical initializations.

Again, the URing was relatively slow and the USquare relatively faster than others. The canonical versions were moderately slow. The configurations that converged the fastest were the UPyramid on both FIPS and wFIPS, and the Four-Cluster topology on wFIPS.

Medians are used in this measure to account for failures to meet the criterion at all. A cell may have as many as half its trials fail to meet the standard, but if the remaining trials went
quickly, the median iterations will suggest erroneously that something good has happened. Fast convergence of a configuration to the performance criterion on a large percentage of trials would suggest good problem solving qualities; fast convergence on half the trials would not. The next dependent variable tells us how often the criteria were met.

TABLE IV
MEDIAN NUMBER OF ITERATIONS TO CRITERIA. THESE REPRESENT THE NUMBER OF ITERATIONS THE ALGORITHM TOOK TO REACH THE CRITERIA. AN INFINITE VALUE MEANS THAT AT LEAST HALF THE EXPERIMENTS WERE UNSUCCESSFUL. IN BOLD ARE THE QUICKEST RESULT FOR EVERY ALGORITHM/INITIALIZATION PAIR

TABLE V
PROPORTION OF EXPERIMENTS REACHING CRITERIA. THEY REPRESENT FOR EACH CONFIGURATION THE PROPORTION OF RUNS THAT WERE ABLE TO REACH THE REGION SPECIFIED BY THE CRITERIA. IN BOLD ARE THE BEST RESULTS FOR EACH ALGORITHM/INITIALIZATION PAIR

C. Proportion of Trials Reaching Criteria

For us, the third dependent measure is the most important. With today's computer speeds, the difference of a few thousand iterations may be a matter of seconds, and slight speed advantages are not usually crucial. The proportion measure tells, though, in black and white, whether the given algorithm/topology configuration can solve the problems (Table V).

The first result that jumps out is that the URing topology with the wFIPS algorithm found the global optimum (as measured by meeting the criteria) on 100% of its trials, that is, 40 trials each on 6 functions, amounting to 240 total trials. This is obviously a remarkable performance. We note also that 24 algorithm/topology combinations, out of 81, met the criterion 90% of the time or more. The canonical algorithm harbored performances greater than 0.90 on five of nine topologies, and the wFIPS on five of nine. wFIPS beat the 90% mark in three of the asymmetric initialization conditions, while the canonical algorithm never did. Unweighted FIPS was above 0.90 four times in the symmetric and three times in
the asymmetric initialization conditions, and the weighted and unweighted Self algorithm broke the 0.90 standard one time each.

Looking at topologies, we see that none of the Square, Pyramid, All, or UAll conditions met the criterion 90% of the time. The Ring did it five times; the Four-Clusters thrice; USquare six times; URing eight times; and UPyramid twice. It appears that the USquare and URing topologies were the most successful vehicles for the communication among particles, at least across the algorithms tested here.

D. Combining the Measures

Our prejudice is that the most weight should be given to the last measure, the proportion of successes, though the other measures should be taken into account. By this rule of thumb, we could recommend always using the URing, which never failed when implemented with the wFIPS algorithm. We remember, however, that it was relatively slow by both the first two measures; plus, weighting the FIPS adds some computational cost. If speed is a requirement, and the URing's relative slowness may create problems, then we would suggest the USquare with the unweighted FIPS algorithm. This combination succeeded approximately 98.9% of the time, meaning that it failed three times out of 240. The USquare/FIPS also had the best score at 1,000 iterations and was the ninth fastest to reach the criteria.

VII. CONCLUSION

The canonical particle swarm algorithm showed itself to be a journeyman problem solver, finding the global optimum a respectable proportion of the time, depending on topology, and getting there in a respectably fast time. It was outperformed, though, by the FIPS versions, on every dependent measure.

It should be mentioned that the FIPS versions were the only ones able to consistently find the minimum to the Griewank function in ten dimensions. The best result obtained by a canonical configuration was the USquare with a proportion of 72.5%, followed by much lower proportions using the other topologies (lower than 60%). Both FIPS algorithms with all the social topologies
except the All versions were able to find the minimum 100% of the time.
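For readers who wish to experiment with the variants compared above, the unweighted fully informed velocity update can be sketched as follows. This is a minimal, hedged illustration in the constriction form: the function name is ours, and the parameter values (chi near 0.7298 with a total acceleration of 4.1) are the values commonly used with Type 1 constriction, assumed here rather than quoted from the tables.

```python
import random

CHI = 0.7298  # constriction coefficient (assumed, from the usual Type 1 setting)
PHI = 4.1     # total acceleration, split evenly across informants

def fips_velocity(v, x, neighbor_bests):
    """One unweighted FIPS update for a single particle.

    Every informant's previous best contributes, each scaled by a
    random weight drawn from U(0, PHI/K); in wFIPS these weights
    would additionally reflect the goodness of each informant's best.
    """
    k = len(neighbor_bests)
    new_v = []
    for d in range(len(x)):
        social = sum(random.uniform(0, PHI / k) * (p[d] - x[d])
                     for p in neighbor_bests)
        new_v.append(CHI * (v[d] + social))
    return new_v

# Usage: a 2-D particle at (1, 1), at rest, with two informants.
random.seed(1)
v = fips_velocity([0.0, 0.0], [1.0, 1.0], [[0.5, 2.0], [1.5, 0.0]])
```

The key contrast with the canonical algorithm is that no single neighborhood best dominates: all informants pull on the particle at every step, which is why the choice of topology matters so much in the results above.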