Enslaving random fluctuations in nonequilibrium systems
Adaptive backstepping Elman-based neural control for unknown nonlinear systems
Department of Electrical Engineering, Tamkang University, No. 151, Yingzhuan Road, Tamsui District, New Taipei City 25137, Taiwan
1. Introduction
Because uncertainty in the system dynamics can severely degrade system performance, constructing a controller that achieves favorable control performance is an important issue. System uncertainties, including unmodeled system dynamics and external disturbances, unavoidably exist in practical systems. To overcome this problem, many neural-network-based adaptive controllers that require no knowledge of the controlled plant have been proposed [1–9]. The key to their success is the self-learning ability of neural networks, which allows them to be used to develop controllers and to describe system dynamics without preliminary offline tuning. With an adequate choice of network structure, training method, and sufficient input data, neural-network-based adaptive controllers can compensate for the effects of nonlinearities and system uncertainties, so that the stability, error convergence, and robustness of the control system can be guaranteed. Because an RBF neural network has a simple structure, there has been considerable interest in applying it to the nonlinearity and uncertainty of control systems [10–13]. From the controller design viewpoint, however, the output of a dynamic nonlinear system is a function of past outputs, past inputs, or both, so controlling such a system is not a static control problem. The RBF neural network is a static neural network.
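The contrast between a static network and a recurrent one can be sketched in a few lines. The following is only an illustrative toy, not the controller of the paper: the function `rbf_output`, the class `ElmanUnit`, and all weights are hypothetical names and values chosen for the example. It shows that a static RBF map returns the same output whenever the input repeats, whereas an Elman-style unit, which feeds its previous hidden activation back through a context unit, produces an output that depends on the input history.

```python
import math


def rbf_output(x, centers, width, weights):
    """Static RBF network: the output depends only on the current input x."""
    return sum(
        w * math.exp(-((x - c) ** 2) / (2.0 * width ** 2))
        for w, c in zip(weights, centers)
    )


class ElmanUnit:
    """One-neuron Elman-style layer (hypothetical toy, not the paper's network).

    The context unit stores the previous hidden activation and feeds it
    back into the hidden neuron at the next step.
    """

    def __init__(self, w_in, w_ctx, w_out):
        self.w_in = w_in
        self.w_ctx = w_ctx
        self.w_out = w_out
        self.context = 0.0  # previous hidden activation

    def step(self, x):
        hidden = math.tanh(self.w_in * x + self.w_ctx * self.context)
        self.context = hidden  # remembered for the next call
        return self.w_out * hidden


# Same input presented twice in a row.
static_outputs = [rbf_output(0.5, [0.0, 1.0], 0.7, [1.0, -0.5]) for _ in range(2)]
elman = ElmanUnit(w_in=1.0, w_ctx=0.8, w_out=1.0)
recurrent_outputs = [elman.step(0.5) for _ in range(2)]
print(static_outputs, recurrent_outputs)
```

The static outputs are identical, while the two recurrent outputs differ because the second step sees a nonzero context value.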
arXiv:math/1186v2 [math.PR] 1 Oct 21

Randomness

Paul Vitányi
CWI and Universiteit van Amsterdam

Abstract

Here we present in a single essay a combination and completion of the several aspects of the problem of randomness of individual objects which of necessity occur scattered in our text [10]. The reader can consult different arrangements of parts of the material in [7, 20].

Contents

1 Introduction
1.1 Occam’s Razor Revisited
1.2 Lacuna of Classical Probability Theory
1.3 Lacuna of Information Theory
2 Randomness as Unpredictability
2.1 Von Mises’ Collectives
2.2 Wald-Church Place Selection
3 Randomness as Incompressibility
3.1 Kolmogorov Complexity
3.2 Complexity Oscillations
3.3 Relation with Unpredictability
3.4 Kolmogorov-Loveland Place Selection
4 Randomness as Membership of All Large Majorities
4.1 Typicality
4.2 Randomness in Martin-Löf’s Sense
4.3 Random Finite Sequences
4.4 Random Infinite Sequences
4.5 Randomness of Individual Sequences Resolved
5 Applications
5.1 Prediction
5.2 Gödel’s incompleteness result
5.3 Lower bounds
5.4 Statistical Properties of Finite Sequences
5.5 Chaos and Predictability

1 Introduction

Pierre-Simon Laplace (1749–1827) has pointed out the following reason why intuitively a regular outcome of a random event is unlikely.

“We arrange in our thought all possible events in various classes; and we regard as extraordinary those classes which include a very small number. In the game of heads and tails, if head comes up a hundred times in a row then this appears to us extraordinary, because the almost infinite number of combinations that can arise in a hundred throws are divided in regular sequences, or those in which we observe a rule that is easy to
grasp, and in irregular sequences, that are incomparably more numerous”. [Laplace, A Philosophical Essay on Probabilities, Dover, 1952. Originally published in 1819. Translated from 6th French edition. Pages 16-17.]

If by ‘regularity’ we mean that the complexity is significantly less than maximal, then the number of all regular events is small (because by simple counting the number of different objects of low complexity is small). Therefore, the event that any one of them occurs has small probability (in the uniform distribution). Yet, the classical calculus of probabilities tells us that 100 heads are just as probable as any other sequence of heads and tails, even though our intuition tells us that it is less ‘random’ than some others. Listen to the redoubtable Dr. Samuel Johnson (1709–1784):

“Dr. Beattie observed, as something remarkable which had happened to him, that he chanced to see both the No. 1 and the No. 1000, of the hackney-coaches, the first and the last; ‘Why, Sir’, said Johnson, ‘there is an equal chance for one’s seeing those two numbers as any other two.’ He was clearly right; yet the seeing of two extremes, each of which is in some degree more conspicuous than the rest, could not but strike one in a stronger manner than the sight of any other two numbers.” [James Boswell (1740–1795), Life of Johnson, Oxford University Press, Oxford, UK, 1970. (Edited by R.W. Chapman, 1904 Oxford edition, as corrected by J.D. Fleeman, third edition. Originally published in 1791.) Pages 1319-1320.]

Laplace distinguishes between the object itself and a cause of the object.

“The regular combinations occur more rarely only because they are less numerous. If we seek a cause wherever we perceive symmetry, it is not that we regard the symmetrical event as less possible than the others, but, since this event ought to be the effect of a regular cause or that of chance, the first of these suppositions is more probable than the second. On a table we see letters arranged in this order C o n s t a n t i n o p l e, and we judge that this arrangement is not the
result of chance, not because it is less possible than others, for if this word were not employed in any language we would not suspect it came from any particular cause, but this word being in use among us, it is incomparably more probable that some person has thus arranged the aforesaid letters than that this arrangement is due to chance.” [Laplace, Ibid.]

Let us try to turn Laplace’s argument into a formal one. First we introduce some notation. If $x$ is a finite binary sequence, then $l(x)$ denotes the length (number of occurrences of binary digits) in $x$. For example, $l(010) = 3$.

1.1 Occam’s Razor Revisited

Suppose we observe a binary string $x$ of length $l(x) = n$ and want to know whether we must attribute the occurrence of $x$ to pure chance or to a cause. To put things in a mathematical framework, we define chance to mean that the literal $x$ is produced by independent tosses of a fair coin. More subtle is the interpretation of cause as meaning that the computer on our desk computes $x$ from a program provided by independent tosses of a fair coin. The chance of generating $x$ literally is about $2^{-n}$. But the chance of generating $x$ in the form of a short program $x^*$, the cause from which our computer computes $x$, is at least $2^{-l(x^*)}$. In other words, if $x$ is regular, then $l(x^*) \ll n$, and it is about $2^{n - l(x^*)}$ times more likely that $x$ arose as the result of computation from some simple cause (like a short program $x^*$) than literally by a random process.

This approach will lead to an objective and absolute version of the classic maxim of William of Ockham (1290?–1349?), known as Occam’s razor: “if there are alternative explanations for a phenomenon, then, all other things being equal, we should select the simplest one”. One identifies ‘simplicity of an object’ with ‘an object having a short effective description’. In other words, a priori we consider objects with short descriptions more likely than objects with only long descriptions. That is, objects with low complexity have high probability while objects with high complexity have low
probability. This principle is intimately related with problems in both probability theory and information theory. These problems as outlined below can be interpreted as saying that the related disciplines are not ‘tight’ enough; they leave things unspecified which our intuition tells us should be dealt with.

1.2 Lacuna of Classical Probability Theory

An adversary claims to have a true random coin and invites us to bet on the outcome. The coin produces a hundred heads in a row. We say that the coin cannot be fair. The adversary, however, appeals to probability theory which says that each sequence of outcomes of a hundred coin flips is equally likely, $1/2^{100}$, and one sequence had to come up. Probability theory gives us no basis to challenge an outcome after it has happened. We could only exclude unfairness in advance by putting a penalty side-bet on an outcome of 100 heads. But what about 1010...? What about an initial segment of the binary expansion of $\pi$?

Regular sequence: $\Pr(00000000000000000000000000) = 1/2^{26}$
Random sequence: $\Pr(10010011011000111011010000) = 1/2^{26}$

1.3 Lacuna of Information Theory

[...] being equally probable, this quantity is the number of bits needed to count all possibilities.

This expresses the fact that each message in the ensemble can be communicated using this number of bits. However, it does not say anything about the number of bits needed to convey any individual message in the ensemble. To illustrate this, consider the ensemble consisting of all binary strings of length 9999999999999999. By Shannon’s measure, we require 9999999999999999 bits on the average to encode a string in such an ensemble. However, the string consisting of 9999999999999999 1’s can be encoded in about 55 bits by expressing 9999999999999999 in binary and adding the repeated pattern ‘1’. A requirement for this to work is that we have agreed on an algorithm that decodes the encoded string.
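The figure of about 55 bits can be checked directly. The sketch below is one possible encoding, assumed here for illustration (the names `describe_run_of_ones` and `decode` are hypothetical): the description is the run length written in binary followed by the one-symbol pattern, and the agreed decoder simply repeats the pattern that many times.

```python
def describe_run_of_ones(n):
    """Describe the string of n ones as (n written in binary, repeated pattern)."""
    return format(n, "b"), "1"


def decode(description):
    """Agreed-upon decoder: repeat the pattern as many times as the count says."""
    count_bits, pattern = description
    return pattern * int(count_bits, 2)


n = 9999999999999999
count_bits, pattern = describe_run_of_ones(n)
description_length = len(count_bits) + len(pattern)
print(description_length)  # 54 bits for the count plus 1 symbol for the pattern: 55

# The decoder is exercised on a small run only; expanding the full string
# would need roughly 10^16 characters of memory.
print(decode(describe_run_of_ones(12)))
```

The count needs 54 binary digits because $2^{53} < 9999999999999999 < 2^{54}$, which together with the pattern gives the 55 bits quoted in the text.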
We can compress the string still further when we note that 9999999999999999 equals $3^2 \times 1111111111111111$, and that 1111111111111111 consists of $2^4$ 1’s.

Thus, we have discovered an interesting phenomenon: the description of some strings can be compressed considerably, provided they exhibit enough regularity. This observation, of course, is the basis of all systems to express very large numbers and was exploited early on by Archimedes (287 BC–212 BC) in his treatise The Sand-Reckoner, in which he proposes a system to name very large numbers:

“There are some, King Gelon, who think that the number of sand is infinite in multitude [...or] that no number has been named which is great enough to exceed its multitude. [...] But I will try to show you, by geometrical proofs, which you will be able to follow, that, of the numbers named by me [...] some exceed not only the mass of sand equal in magnitude to the earth filled up in the way described, but also that of a mass equal in magnitude to the universe.” [Archimedes, The Sand-Reckoner, pp. 420-429 in: The World of Mathematics, Vol. 1, J.R. Newman, Ed., Simon and Schuster, New York, 1956. Page 420.]

However, if regularity is lacking, it becomes more cumbersome to express large numbers. For instance, it seems easier to compress the number ‘one billion,’ than the number ‘one billion seven hundred thirty-five million two hundred sixty-eight thousand and three hundred ninety-four,’ even though they are of the same order of magnitude.

The above example shows that we need too many bits to transmit regular objects. The converse problem, too few bits, arises as well since Shannon’s theory of information and communication deals with the specific technology problem of data transmission. That is, with the information that needs to be transmitted in order to select an object from a previously agreed upon set of alternatives; agreed upon by both the sender and the receiver of the message.
If we have an ensemble consisting of the Odyssey and the sentence “let’s go drink a beer” then we can transmit the Odyssey using only one bit. Yet Greeks feel that Homer’s book has more information content. Our task is to widen the limited set of alternatives until it is universal. We aim at a notion of ‘absolute’ information of individual objects, which is the information which by itself describes the object completely. Formulation of these considerations in an objective manner leads again to the notion of shortest programs and Kolmogorov complexity.

2 Randomness as Unpredictability

What is the proper definition of a random sequence, the ‘lacuna in probability theory’ we have identified above? Let us consider how mathematicians test randomness of individual sequences. To measure randomness, criteria have been developed which certify this quality. Yet, in recognition that they do not measure ‘true’ randomness, we call these criteria ‘pseudo’ randomness tests. For instance, statistical surveys of initial segments of the sequence of decimal digits of $\pi$ have failed to disclose any significant deviations from randomness. But clearly, this sequence is so regular that it can be described by a simple program to compute it, and this program can be expressed in a few bits.

“Any one who considers arithmetical methods of producing random digits is, of course, in a state of sin. For, as has been pointed out several times, there is no such thing as a random number – there are only methods to produce random numbers, and a strict arithmetical procedure is of course not such a method. (It is true that a problem we suspect of being solvable by random methods may be solvable by some rigorously defined sequence, but this is a deeper mathematical question than we can go into now.)” [John Louis von Neumann (1903–1957), Various techniques used in connection with random digits, J. Res. Nat. Bur. Stand. Appl. Math. Series, 3 (1951), pp. 36-38. Page 36. Also, Collected Works, Vol. 1, A.H. Taub, Ed., Pergamon Press, Oxford, 1963, pp. 768-770. Page 768.]

This fact
prompts more sophisticated definitions of randomness. In his famous address to the International Congress of Mathematicians in 1900, David Hilbert (1862–1943) proposed twenty-three mathematical problems as a program to direct the mathematical efforts in the twentieth century. The 6th problem asks for “To treat (in the same manner as geometry) by means of axioms, those physical sciences in which mathematics plays an important part; in the first rank are the theory of probability ...”. Thus, Hilbert views probability theory as a physical applied theory. This raises the question about the properties one can expect from typical outcomes of physical random sources, which a priori has no relation whatsoever with an axiomatic mathematical theory of probabilities. That is, a mathematical system has no direct relation with physical reality. To obtain a mathematical system that is an appropriate model of physical phenomena one needs to identify and codify essential properties of the phenomena under consideration by empirical observations.

Notably Richard von Mises (1883–1953) proposed notions that approach the very essence of true randomness of physical phenomena. This is related with the construction of a formal mathematical theory of probability, to form a basis for real applications, in the early part of this century. While von Mises’ objective was to justify the applications to the real phenomena, Andrei Nikolaevitch Kolmogorov’s (1903–1987) classic 1933 treatment constructs a purely axiomatic theory of probability on the basis of set theoretic axioms.

“This theory was so successful, that the problem of finding the basis of real applications of the results of the mathematical theory of probability became rather secondary to many investigators. ... [however] the basis for the applicability of the results of the mathematical theory of probability to real ‘random phenomena’ must depend in some form on the frequency concept of probability, the unavoidable nature of which has been established by von Mises in a
spirited manner.” [A.N. Kolmogorov, On tables of random numbers, Sankhyā, Series A, 25 (1963), 369-376. Page 369.]

The point made is that the axioms of probability theory are designed so that abstract probabilities can be computed, but nothing is said about what probability really means, or how the concept can be applied meaningfully to the actual world. Von Mises analyzed this issue in detail, and suggested that a proper definition of probability depends on obtaining a proper definition of a random sequence. This makes him a ‘frequentist’, a supporter of the frequency theory.

The following interpretation and formulation of this theory is due to John Edensor Littlewood (1885–1977), The dilemma of probability theory, Littlewood’s Miscellany, Revised Edition, B. Bollobás, Ed., Cambridge University Press, 1986, pp. 71-73. The frequency theory to interpret probability says, roughly, that if we perform an experiment many times, then the ratio of favorable outcomes to the total number $n$ of experiments will, with certainty, tend to a limit, $p$ say, as $n \to \infty$. This tells us something about the meaning of probability, namely, the measure of the positive outcomes is $p$. But suppose we throw a coin 1000 times and wish to know what to expect. Is 1000 enough for convergence to happen? The statement above does not say. So we have to add something about the rate of convergence. But we cannot assert a certainty about a particular number of $n$ throws, such as ‘the proportion of heads will be $p \pm \epsilon$ for large enough $n$ (with $\epsilon$ depending on $n$)’. We can at best say ‘the proportion will lie between $p \pm \epsilon$ with at least such and such probability (depending on $\epsilon$ and $n_0$) whenever $n > n_0$’. But now we have defined probability in an obviously circular fashion.

2.1 Von Mises’ Collectives

In 1919 von Mises proposed to eliminate the problem by simply dividing all infinite sequences into special random sequences (called collectives), having relative frequency limits, which are the proper subject of the calculus of probabilities and other sequences. He postulates the existence of
random sequences (thereby circumventing circularity) as certified by abundant empirical evidence, in the manner of physical laws and derives mathematical laws of probability as a consequence. In his view a naturally occurring sequence can be nonrandom or unlawful in the sense that it is not a proper collective.

Von Mises views the theory of probabilities insofar as they are numerically representable as a physical theory of definitely observable phenomena, repetitive or mass events, for instance, as found in games of chance, population statistics, Brownian motion. ‘Probability’ is a primitive notion of the theory comparable to those of ‘energy’ or ‘mass’ in other physical theories.

Whereas energy or mass exist in fields or material objects, probabilities exist only in the similarly mathematical idealization of collectives (random sequences). All problems of the theory of probability consist of deriving, according to certain rules, new collectives from given ones and calculating the distributions of these new collectives. The exact formulation of the properties of the collectives is secondary and must be based on empirical evidence. These properties are the existence of a limiting relative frequency and randomness.

The property of randomness is a generalization of the abundant experience in gambling houses, namely, the impossibility of a successful gambling system. Including this principle in the foundation of probability, von Mises argues, we proceed in the same way as the physicists did in the case of the energy principle. Here too, the experience of hunters of fortune is complemented by solid experience of insurance companies and so forth.

A fundamentally different approach is to justify a posteriori the application of a purely mathematically constructed theory of probability, such as the theory resulting from the Kolmogorov axioms. Suppose, we can show that the appropriately defined random sequences form a set of measure one, and without exception satisfy all laws of a given axiomatic theory of
probability. Then it appears practically justifiable to assume that as a result of an (infinite) experiment only random sequences appear.

Von Mises’ notion of infinite random sequence of 0’s and 1’s (collective) essentially appeals to the idea that no gambler, making a fixed number of wagers of ‘heads’, at fixed odds [say $p$ versus $1-p$] and in fixed amounts, on the flips of a coin [with bias $p$ versus $1-p$], can have profit in the long run from betting according to a system instead of betting at random. Says Alonzo Church (1903–): “this definition [below] ... while clear as to general intent, is too inexact in form to serve satisfactorily as the basis of a mathematical theory.” [A. Church, On the concept of a random sequence, Bull. Amer. Math. Soc., 46 (1940), pp. 130-135. Page 130.]

Definition 1 An infinite sequence $a_1, a_2, \ldots$ of 0’s and 1’s is a random sequence in the special meaning of collective if the following two conditions are satisfied.

1. Let $f_n$ be the number of 1’s among the first $n$ terms of the sequence. Then $\lim_{n \to \infty} f_n / n = p$ [...]

“[...] we should distinguish between randomness proper (as absence of any regularity) and stochastic randomness (which is the subject of probability theory). There emerges the problem of finding reasons for the applicability of the mathematical theory of probability to the real world.” [A.N. Kolmogorov, On logical foundations of probability theory, Probability Theory and Mathematical Statistics, Lecture Notes in Mathematics, Vol. 1021, K. Itô and J.V. Prokhorov, Eds., Springer-Verlag, Heidelberg, 1983, pp. 1-5. Page 1.]

Intuitively, we can distinguish between sequences that are irregular and do not satisfy the regularity implicit in stochastic randomness, and sequences that are irregular but do satisfy the regularities associated with stochastic randomness. Formally, we will distinguish the second type from the first type by whether or not a certain complexity measure of the initial segments goes to a definite limit.
The complexity measure referred to is the length of the shortest description of the prefix (in the precise sense of Kolmogorov complexity) divided by its length. It will turn out that almost all infinite strings are irregular of the second type and satisfy all regularities of stochastic randomness.

“In applying probability theory we do not confine ourselves to negating regularity, but from the hypothesis of randomness of the observed phenomena we draw definite positive conclusions.” [A.N. Kolmogorov, Combinatorial foundations of information theory and the calculus of probabilities, Russian Mathematical Surveys, 38:4 (1983), pp. 29-40. Page 34.]

Considering the sequence as fair coin tosses with $p = 1/2$, the second condition in Definition 1 says there is no strategy $\phi$ (principle of excluded gambling system) which assures a player betting at fixed odds and in fixed amounts, on the tosses of the coin, to make infinite gain. That is, no advantage is gained in the long run by following some system, such as betting ‘head’ after each run of seven consecutive tails, or (more plausibly) by placing the $n$th bet ‘head’ after the appearance of $n+7$ tails in succession. According to von Mises, the above conditions are sufficiently familiar and an uncontroverted empirical generalization to serve as the basis of an applicable calculus of probabilities.

Example 1 It turns out that the naive mathematical approach to a concrete formulation, admitting simply all partial functions, comes to grief as follows. Let $a = a_1 a_2 \ldots$ be any collective. Define $\phi_1$ as $\phi_1(a_1 \ldots a_{i-1}) = 1$ if $a_i = 1$, and undefined otherwise. But then $p = 1$. Defining $\phi_0$ by $\phi_0(a_1 \ldots a_{i-1}) = b_i$, with $b_i$ the complement of $a_i$, for all $i$, we obtain by the second condition of Definition 1 that $p = 0$. Consequently, if we allow functions like $\phi_1$ and $\phi_0$ as strategy, then von Mises’ definition cannot be satisfied at all.

2.2 Wald-Church Place Selection

In the thirties, Abraham Wald (1902–1950) proposed to restrict the a priori admissible $\phi$ to any fixed countable set of functions. Then collectives do exist.
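The gambling system mentioned above, betting ‘head’ after each run of seven consecutive tails, is an example of a place-selection rule, and its effect can be explored numerically. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, 1 stands for ‘head’ and 0 for ‘tail’, and the tosses come from Python's pseudo-random generator, which is itself a deterministic program and therefore only a stand-in for a genuine collective.

```python
import random


def select_after_zero_run(bits, k):
    """Place-selection rule: keep bit i exactly when the k bits before it are all 0."""
    selected = []
    run = 0  # length of the run of zeros ending just before the current bit
    for b in bits:
        if run >= k:
            selected.append(b)
        run = run + 1 if b == 0 else 0
    return selected


def relative_frequency(bits):
    """Fraction of ones in a nonempty list of bits."""
    return sum(bits) / len(bits)


rng = random.Random(2024)  # fixed seed so that the run is repeatable
tosses = [rng.randint(0, 1) for _ in range(200_000)]
chosen = select_after_zero_run(tosses, 7)

# For a sequence that behaves like a collective with p = 1/2, both
# frequencies should be close to 0.5: the system gains nothing.
print(relative_frequency(tosses), relative_frequency(chosen))
```

The selected subsequence is much shorter than the original, since a run of seven tails occurs before only a small fraction of positions, so its relative frequency fluctuates more than that of the full sequence.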
But which countable set? In 1940, Alonzo Church proposed to choose a set of functions representing ‘computable’ strategies. According to Church’s Thesis, this is precisely the set of recursive functions. With recursive $\phi$, not only is the definition completely rigorous, and random infinite sequences do exist, but moreover they are abundant since the infinite random sequences with $p = 1/2$ form a set of measure one. From the existence of random sequences with probability 1/2, the existence of random sequences associated with other probabilities can be derived. Let us call sequences satisfying Definition 1 with recursive $\phi$ Mises-Wald-Church random. That is, the involved Mises-Wald-Church place-selection rules consist of the partial recursive functions.

Appeal to a theorem by Wald yields as a corollary that the set of Mises-Wald-Church random sequences associated with any fixed probability has the cardinality of the continuum. Moreover, each Mises-Wald-Church random sequence qualifies as a normal number. (A number is normal in the sense of Émile Félix Édouard Justin Borel (1871–1956) if each digit of the base, and each block of digits of any length, occurs with equal asymptotic frequency.) Note however, that not every normal number is Mises-Wald-Church random. This follows, for instance, from Champernowne’s sequence (or number),

0.1234567891011121314151617181920...

due to David G. Champernowne (1912–), which is normal in the scale of 10 and where the $i$th digit is easily calculated from $i$. The definition of a Mises-Wald-Church random sequence implies that its consecutive digits cannot be effectively computed. Thus, an existence proof for Mises-Wald-Church random sequences is necessarily nonconstructive.

Unfortunately, the von Mises-Wald-Church definition is not yet good enough, as was shown by Jean Ville in 1939. There exist sequences that satisfy the Mises-Wald-Church definition of randomness, with limiting relative frequency of ones of 1/2, but nonetheless have the property that $f_n / n \geq 1/2$ for all $n$. The probability of such
a sequence of outcomes in random flips of a fair coin is zero. Intuition: if you bet ‘1’ all the time against such a sequence of outcomes, then your accumulated gain is always positive! Similarly, other properties of randomness in probability theory such as the Law of the Iterated Logarithm do not follow from the Mises-Wald-Church definition. An extensive survey on these issues (and parts of the sequel) is given in [8].

3 Randomness as Incompressibility

Above it turned out that describing ‘randomness’ in terms of ‘unpredictability’ is problematic and possibly unsatisfactory. Therefore, Kolmogorov tried another approach. The antithesis of ‘randomness’ is ‘regularity’, and a finite string which is regular can be described more shortly than giving it literally. Consequently, a string which is ‘incompressible’ is ‘random’ in this sense. With respect to infinite binary sequences it is seductive to call an infinite sequence ‘random’ if all of its initial segments are ‘random’ in the above sense of being ‘incompressible’. Let us see how this intuition can be made formal, and whether it leads to a satisfactory solution.

Intuitively, the amount of effectively usable information in a finite string is the size (number of binary digits or bits) of the shortest program that, without additional data, computes the string and terminates. A similar definition can be given for infinite strings, but in this case the program produces element after element forever. Thus, a long sequence of 1’s such as

$\underbrace{11111 \ldots 1}_{10{,}000 \text{ times}}$

contains little information because a program of size about $\log 10{,}000$ bits outputs it:

for i := 1 to 10,000 print 1

Likewise, the transcendental number $\pi = 3.1415\ldots$, an infinite sequence of seemingly ‘random’ decimal digits, contains but a few bits of information. (There is a short program that produces the consecutive digits of $\pi$ forever.) Such a definition would appear to make the amount of information in a string (or other object) depend on the particular programming language used. Fortunately, it can be shown that all reasonable choices of
programming languages lead to quantification of the amount of ‘absolute’ information in individual objects that is invariant up to an additive constant. We call this quantity the ‘Kolmogorov complexity’ of the object. If an object is regular, then it has a shorter description than itself. We call such an object ‘compressible’.

More precisely, suppose we want to describe a given object by a finite binary string. We do not care whether the object has many descriptions; however, each description should describe but one object. From among all descriptions of an object we can take the length of the shortest description as a measure of the object’s complexity. It is natural to call an object ‘simple’ if it has at least one short description, and to call it ‘complex’ if all of its descriptions are long.

But now we are in danger of falling in the trap so eloquently described in the Richard-Berry paradox, where we define a natural number as “the least natural number that cannot be described in less than twenty words”. If this number does exist, we have just described it in thirteen words, contradicting its definitional statement. If such a number does not exist, then all natural numbers can be described in less than twenty words. (This paradox is described in [Bertrand Russell (1872–1970) and Alfred North Whitehead, Principia Mathematica, Oxford, 1917]. In a footnote they state that it “was suggested to us by Mr. G.G.
Berry of the Bodleian Library”.) We need to look very carefully at the notion of ‘description’.

Assume that each description describes at most one object. That is, there is a specification method $D$ which associates at most one object $x$ with a description $y$. This means that $D$ is a function from the set of descriptions, say $Y$, into the set of objects, say $X$. It seems also reasonable to require that, for each object $x$ in $X$, there is a description $y$ in $Y$ such that $D(y) = x$. (Each object has a description.) To make descriptions useful we like them to be finite. This means that there are only countably many descriptions. Since there is a description for each object, there are also only countably many describable objects. How do we measure the complexity of descriptions?

Taking our cue from the theory of computation, we express descriptions as finite sequences of 0’s and 1’s. In communication technology, if the specification method $D$ is known to both a sender and a receiver, then a message $x$ can be transmitted from sender to receiver by transmitting the sequence of 0’s and 1’s of a description $y$ with $D(y) = x$. The cost of this transmission is measured by the number of occurrences of 0’s and 1’s in $y$, that is, by the length of $y$.
The least cost of transmission of $x$ is given by the length of a shortest $y$ such that $D(y) = x$. We choose this least cost of transmission as the ‘descriptional’ complexity of $x$ under specification method $D$.

Obviously, this descriptional complexity of $x$ depends crucially on $D$. The general principle involved is that the syntactic framework of the description language determines the succinctness of description.

In order to objectively compare descriptional complexities of objects, to be able to say “$x$ is more complex than $z$”, the descriptional complexity of $x$ should depend on $x$ alone. This complexity can be viewed as related to a universal description method which is a priori assumed by all senders and receivers. This complexity is optimal if no other description method assigns a lower complexity to any object.

We are not really interested in optimality with respect to all description methods. For specifications to be useful at all it is necessary that the mapping from $y$ to $D(y)$ can be executed in an effective manner. That is, it can at least in principle be performed by humans or machines. This notion has been formalized as ‘partial recursive functions’. According to generally accepted mathematical viewpoints it coincides with the intuitive notion of effective computation.

The set of partial recursive functions contains an optimal function which minimizes description length of every other such function. We denote this function by $D_0$. [...] for any other recursive function $D$, for all objects $x$, there is a description $y$ of $x$ under $D_0$ which is shorter than any description $z$ of $x$ [...]
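The dependence of descriptional complexity on the specification method can be made concrete by fixing one particular effective method. In the sketch below, which is an illustration and not part of the essay, the decoder `zlib.decompress` plays the role of a specification method $D$, and the length of a compressed string is the length of one description under that $D$. It is therefore only an upper bound on the descriptional complexity under this method, and says nothing directly about the optimal method. The helper name `description_length` is hypothetical.

```python
import os
import zlib


def description_length(data: bytes) -> int:
    """Length in bytes of one description of `data` under the zlib decoder."""
    return len(zlib.compress(data, 9))


regular = b"1" * 10_000          # a long run of the same symbol
irregular = os.urandom(10_000)   # bytes from the operating system's random source

# Each description really does describe its object: decoding recovers it.
assert zlib.decompress(zlib.compress(regular, 9)) == regular

print(description_length(regular), description_length(irregular))
```

The regular string gets a description far shorter than its literal length, while the irregular one does not compress at all under this method.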
Mendelian Randomization Research Guidelines (Chinese and English versions)

Mendelian randomization (MR) has emerged as an important tool in epidemiology and biostatistics for investigating causal relationships between risk factors and disease outcomes. To ensure the validity and reliability of MR studies, researchers need to follow a standardized set of guidelines. The 'Mendelian Randomization Reporting Guidelines (MR-REWG)' provide detailed recommendations for conducting, reporting, and appraising MR studies. These guidelines cover key aspects such as study design, instrument selection, data sources, statistical analysis, and result interpretation. By adhering to these guidelines, researchers can minimize bias and confounding, and produce more robust evidence for causal inference in epidemiological research.
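Research on Optimizing the Random Forest Algorithm, Part 1
I. Introduction
Random forest is an ensemble learning algorithm built on decision trees. Because of its strong performance and robust behavior, it is widely used in machine learning and data mining.
III. Problems with the random forest algorithm
Although the random forest algorithm has achieved notable results in many fields, some problems remain:
1. Overfitting: when the dataset is large or the feature dimensionality is high, the random forest algorithm is prone to overfitting.
2. Computational efficiency: as the dataset grows, the computational efficiency of the random forest algorithm gradually declines.
3. Feature selection: when building a decision tree, choosing suitable features is a key issue.
IV. Optimization methods for the random forest algorithm
To address the problems above, this article proposes the following optimization methods:
1. Introduce ensemble learning techniques: ensembling multiple random forest models can effectively improve the model's generalization ability and its resistance to overfitting.
2. Optimize decision tree construction: when building a decision tree, feature selection methods, pruning techniques and similar measures can be used to improve its accuracy and generalization ability.
3. Evaluate and select features by importance: when building a random forest, feature importance measures can be used to identify the features that contribute most to the model's predictions.
4. Optimize model parameters: for a given problem and dataset, methods such as cross-validation can be used to tune the parameters of the random forest algorithm, such as the number of decision trees and the number of features used by each tree.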
English Essay: Random Forest Construction Method
Random Forest is a popular machine learning algorithm that is used to solve a wide range of problems, including classification and regression. It is a type of ensemble learning method that combines multiple decision trees to produce a more accurate and robust model. In this article, we will discuss the construction method of Random Forest.
Step 1: Data Preparation
The first step in building a Random Forest model is to prepare the data. This involves cleaning the data, removing any missing values, and transforming the data into a suitable format for the algorithm. The data should be split into a training set and a testing set, with the training set used to train the model and the testing set used to evaluate its performance.
Step 2: Random Sampling
Random Forest uses a technique called bagging, which involves randomly sampling the data with replacement to create multiple subsets of the data. Each subset is used to train a decision tree, and the results are combined to produce the final model. The number of subsets is determined by the user and is typically set to a value between 100 and 1000.
Step 3: Decision Tree Construction
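As a rough illustration of the bagging and tree-building steps, the sketch below trains one very small tree (a single-split "decision stump") on each bootstrap sample and combines the trees by majority vote. It is a minimal sketch rather than a full Random Forest: the function names, the toy data and the choice of 25 trees are assumptions made for this example, and a real implementation would grow deeper trees and also subsample the features at each split.

```python
import random
from collections import Counter


def bootstrap_sample(rows, rng):
    """Draw len(rows) rows uniformly at random WITH replacement (bagging)."""
    return [rng.choice(rows) for _ in rows]


def fit_stump(rows):
    """Fit a one-split tree on (features, label) rows.

    Every feature/threshold pair is tried and the split with the fewest
    training errors is kept.
    """
    best = None
    for f in range(len(rows[0][0])):
        for threshold in sorted({x[f] for x, _ in rows}):
            left = [y for x, y in rows if x[f] <= threshold]
            right = [y for x, y in rows if x[f] > threshold]
            left_label = Counter(left).most_common(1)[0][0]
            right_label = Counter(right).most_common(1)[0][0] if right else left_label
            errors = (sum(y != left_label for y in left)
                      + sum(y != right_label for y in right))
            if best is None or errors < best[0]:
                best = (errors, f, threshold, left_label, right_label)
    _, f, threshold, left_label, right_label = best
    return lambda x: left_label if x[f] <= threshold else right_label


def fit_forest(rows, n_trees=25, seed=0):
    """Train n_trees stumps, each on its own bootstrap sample."""
    rng = random.Random(seed)
    return [fit_stump(bootstrap_sample(rows, rng)) for _ in range(n_trees)]


def predict(forest, x):
    """Combine the individual trees by majority vote."""
    return Counter(tree(x) for tree in forest).most_common(1)[0][0]


# Hypothetical toy data: one feature, two classes.
rows = [((0.1,), "a"), ((0.2,), "a"), ((0.3,), "a"),
        ((0.7,), "b"), ((0.8,), "b"), ((0.9,), "b")]
forest = fit_forest(rows)
print(predict(forest, (0.0,)), predict(forest, (1.0,)))
```

Because each tree sees a different resample of the data, individual trees can disagree; the vote is what makes the combined prediction more stable than any single tree.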
für Mathematik in den Naturwissenschaften Leipzig
Random perturbations of spiking activity in a pair of coupled neurons
by Boris Gutkin, Jürgen Jost, and Henry C. Tuckwell
Preprint no.: 49 2007
May 14, 2007
Abstract
We examine the effects of stochastic input currents on the firing behaviour of two coupled Type 1 or Type 2 neurons. In Hodgkin-Huxley model neurons with standard parameters, which are Type 2, in the bistable regime, synaptic transmission can initiate oscillatory joint spiking, but white noise can terminate it. In Type 1 cells (models), typified by a quadratic integrate and fire model, synaptic coupling can cause oscillatory behaviour in excitatory cells, but Gaussian white noise can again terminate it. We locally determine an approximate basin of attraction, A, of the periodic orbit and explain the firing behaviour in terms of the effects of noise on the probability of escape of trajectories from A.
1 Introduction
Hodgkin (1948) found that various squid axon preparations responded in qualitatively different ways to applied currents. Some preparations gave a frequency of firing which rose smoothly from zero as the current increased whereas others manifested the sudden appearance of a train of spikes at a particular input current. Cells that responded in the first manner were called Class 1 (which we refer to as Type 1) whereas cells with a discontinuous frequency-current curve were called Class 2 (Type 2). Mathematical explanations for the two types are found in the bifurcation which accompanies the transition from rest state to a periodic firing mode. For Type 1 behaviour, a resting potential vanishes via a saddle-node bifurcation whereas for Type 2 behaviour the instability of the rest point is due to an Andronov-Hopf bifurcation, see Rinzel and Ermentrout (1989).
Stochastic effects in the firing behaviour of neurons have been widely reported, discussed and analyzed since their discovery in
the 1940's. One of the first reports for the central nervous system was by Frank and Fuortes (1955) for cat spinal neurons. Although there have been many single neuron studies, the effects of noise on systems of coupled neurons have not been extensively investigated. Some preliminary studies are those of Gutkin, Hely and Jost (2004) and Casado and Baltanás (2003).
Figure 1: On the left are shown the solutions of (1)-(4) for two coupled QIF model neurons with the standard parameters. X1 and X2 are the potential variables of neurons 1 and 2 and X3 and X4 are the inputs to neurons 1 and 2, respectively. On the right is shown the periodic orbit in the (x1, x2)-plane. The square marked P was explored in detail in reference to the extent of the basin of attraction of the periodic orbit.
2 The quadratic integrate and fire model
A relatively simple neural model which exhibits Type 1 firing behaviour is the quadratic integrate and fire (QIF) model. We couple two model neurons in the following manner (Gutkin, Hely and Jost, 2004). Let {X1(t), X2(t), t ≥ 0} be the depolarizations of neurons 1 and 2, where t is the time index. Then the model equations are, for subthreshold states of two identical neurons,
$$dX_1 = \left[(X_1 - x_R)^2 + \beta + g_s X_3\right]dt + \sigma\,dW_1 \tag{1}$$
$$dX_2 = \left[(X_2 - x_R)^2 + \beta + g_s X_4\right]dt + \sigma\,dW_2 \tag{2}$$
$$dX_3 = \left[-\frac{X_3}{\tau} + F(X_2)\right]dt \tag{3}$$
$$dX_4 = \left[-\frac{X_4}{\tau} + F(X_1)\right]dt \tag{4}$$
where X3 is the synaptic input to neuron 1 from neuron 2 and X4 is the synaptic input to neuron 2 from neuron 1. The quantity x_R is a resting value, g_s is the coupling strength, and β is the mean background input. W1 and W2 are independent standard Wiener processes which enter with strength σ. This term may model variations in nonspecific inputs to the circuit as well as possibly intrinsic membrane and channel noise. By construction, we take this term to be much weaker than the mutual coupling between the cells in our circuit. The function F is given by
$$F(x) = 1 + \tanh(\alpha(x - \theta))$$
where θ characterizes the threshold effect of synaptic activation. Since when a QIF neuron is excited and it receives no inhibition, its potential reaches an infinite value in a finite time, for
numerical simulations a cutoff value x_max is introduced so that the above model equations for the potential apply only if X1 or X2 are below x_max. To complete a "spike" in any neuron, taken as occurring when its potential reaches x_max, its potential is instantaneously reset to some value x_reset which may be taken as −x_max. At the bifurcation point g_s = g_s*, two heteroclinic orbits between unstable rest points turn into a periodic orbit of antiphase oscillations.
3 Results and theory
In the numerical work, the following constants are employed throughout: x_R = 0, x_max = 20, θ = 10, α = 1, β = −1, g_s = 100 and τ = 0.25. The initial values of the neural potentials are X1(0) = 1.1, X2(0) = 0 and the initial values of the synaptic variables are X3(0) = X4(0) = 0. When there is no noise, σ = 0, the results of Figure 1 are obtained. The spike trains of the two coupled neurons and their synaptic inputs are shown on the left. The firing settles down to be quite regular and the periodic orbit, S, is shown on the right. The patch marked P is the location of the region explored in detail below. The effects of a small amount of noise are shown in Figure 2. The neural excitation variables are shown on the left and the corresponding trajectories in the (x1, x2)-plane are shown on the right. In the top portion an example of the trajectory for σ = 0.1 is shown. Here three spikes arise in neuron 1 and two in neuron 2, but the time between spikes increases and eventually the orbit collapses away from the periodic orbit. In the example (lower part) for σ = 0.2 there are no spikes in either neuron. In 10 trials, the average numbers of spikes obtained for the pair of neurons were (2.5, 2.2) for σ = 0.1, (1.4, 1.1) for σ = 0.2 and (1.3, 0.9) for σ = 0.3; these may be compared with (5, 5) for zero noise.
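A simulation of this kind can be sketched with a simple Euler-Maruyama scheme. The constants and initial values below are the ones quoted in the text; the step size dt, the duration t_end, the seed, and the exact form assumed for the synaptic equations (each synaptic variable decays with time constant τ and is driven by F of the other neuron's potential) are assumptions of this sketch. The spike counts it produces therefore need not match the averages quoted above.

```python
import math
import random

# Constants quoted in the text.
X_R, X_MAX, THETA, ALPHA, BETA, G_S, TAU = 0.0, 20.0, 10.0, 1.0, -1.0, 100.0, 0.25
X_RESET = -X_MAX  # the text suggests x_reset may be taken as -x_max


def F(x):
    """Synaptic activation F(x) = 1 + tanh(alpha * (x - theta))."""
    return 1.0 + math.tanh(ALPHA * (x - THETA))


def simulate(sigma, t_end=5.0, dt=1e-4, seed=0):
    """Euler-Maruyama integration of the coupled QIF pair.

    Assumed here (not stated in the text): dt, t_end, seed, and
    dX3 = (-X3/tau + F(X2)) dt, dX4 = (-X4/tau + F(X1)) dt.
    Returns (spikes of neuron 1, spikes of neuron 2).
    """
    rng = random.Random(seed)
    x1, x2, x3, x4 = 1.1, 0.0, 0.0, 0.0  # initial values from the text
    spikes = [0, 0]
    sqrt_dt = math.sqrt(dt)
    for _ in range(int(t_end / dt)):
        dw1 = rng.gauss(0.0, 1.0) * sqrt_dt  # Wiener increments
        dw2 = rng.gauss(0.0, 1.0) * sqrt_dt
        n1 = x1 + ((x1 - X_R) ** 2 + BETA + G_S * x3) * dt + sigma * dw1
        n2 = x2 + ((x2 - X_R) ** 2 + BETA + G_S * x4) * dt + sigma * dw2
        n3 = x3 + (-x3 / TAU + F(x2)) * dt
        n4 = x4 + (-x4 / TAU + F(x1)) * dt
        x1, x2, x3, x4 = n1, n2, n3, n4
        if x1 >= X_MAX:  # spike: count it and reset the potential
            spikes[0] += 1
            x1 = X_RESET
        if x2 >= X_MAX:
            spikes[1] += 1
            x2 = X_RESET
    return tuple(spikes)


for sigma in (0.0, 0.1, 0.2):
    print(sigma, simulate(sigma))
```

Averaging the counts over several seeds for each σ would give a rough analogue of the 10-trial averages reported in the text.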
Figure 2: On the left are shown examples of the neuronal potentials for neurons 1 and 2 (QIF model) for two values of the noise, σ = 0.1 and σ = 0.2. On the right are shown the trajectories corresponding to the results on the left, showing how noise pushes or keeps the trajectories out of the basin of attraction of the periodic orbit.
3.1 Exit-time and orbit stability
If a basin of attraction for a periodic orbit can be found, then the probability that the process with noise escapes from the region of attraction gives the probability, in the present context, that spiking will cease. Since the system (1)-(4) is Markovian, we may apply standard first-exit time theory (Tuckwell, 1989). Letting A be a set in R^4 and letting x = (x1, x2, x3, x4) ∈ A be the values of (X1, X2, X3, X4) at some given time, the probability p(x1, x2, x3, x4) that the process ever escapes from A is given by
$$\mathcal{L}p \equiv \frac{\sigma^2}{2}\frac{\partial^2 p}{\partial x_1^2} + \frac{\sigma^2}{2}\frac{\partial^2 p}{\partial x_2^2} + \left[(x_1 - x_R)^2 + \beta + g_s x_3\right]\frac{\partial p}{\partial x_1} + \left[(x_2 - x_R)^2 + \beta + g_s x_4\right]\frac{\partial p}{\partial x_2} + \left[F(x_2) - \frac{x_3}{\tau}\right]\frac{\partial p}{\partial x_3} + \left[F(x_1) - \frac{x_4}{\tau}\right]\frac{\partial p}{\partial x_4} = 0, \quad x \in A \tag{5}$$
with boundary condition that p = 1 on the boundary of A (since the process is continuous). If one also adds an arbitrarily small amount of noise for X3 and X4 (or considers those solutions of (5) that arise from the limit of vanishing noise for X3, X4), the solution of the linear elliptic partial differential equation (5) is unique and ≡ 1, that is, the process will eventually escape from A with probability 1. Hence, the expected time f(x) of exit of the process from A satisfies L f = −1, x ∈ A with boundary condition f = 0 on the boundary of A. In fact, for small noise, the logarithm of the expected exit time from A, that is, the time at which firing stops, behaves like the inverse of the square of the noise amplitude (Freidlin and Wentzell, 1998). These linear partial differential equations can be solved numerically, for example by Monte-Carlo techniques. The basin of attraction A must be found in order to identify the domain of (5). We have done this approximately for the square P in Figure 1. The effects of perturbations of the periodic orbit S within P on the spiking activity were found
by solving (1)-(4) with various initial conditions in the absence of noise. The values of x1 were from −0.43 to 1.57 in steps of 0.2 and the values of x2 were from −4 to 2 also in steps of 0.2. For this particular region, as expected from geometrical considerations, the system responded sensitively to variations in x1 but not x2. For example, to the left of S there tended to be no spiking activity whereas just to the right there was a full complement of spikes and further to the right (but still inside P) one spike.
4 Coupled Hodgkin-Huxley neurons
As an example of a Type 2 neuron, we use the standard Hodgkin-Huxley (HH) model augmented with synaptic input variables as in the model for coupled QIF neurons given by equations (3) and (4), but with different parameter values. It has long been known that additive noise has a facilitative effect on single HH neurons (Yu and Lewis, 1989). Coupled pairs of HH neurons have been employed with a different approach using conductance noise in order to analyze synchronization properties (e.g. Casado and Baltanás, 2003). For the present approach, with X1 and X2 as the depolarizations of the two cells, we put
dX1 = 1 g_K n^4 (V_K − X1) + ...
... it was found that transient synchronization can terminate sustained activity.
For Type 2 neurons, we have investigated coupled Hodgkin-Huxley neurons and found that in the bistable regime, noise can again terminate sustained spiking activity initiated by synaptic connections. We have investigated a minimal circuit model of sustained neural activity. Such sustained activity in the prefrontal cortex has been proposed as a neural correlate of working memory (Fuster and Alexander, 1971).
References
Casado, J.M., Baltanás, J.P. (2003). Phase switching in a system of two noisy Hodgkin-Huxley neurons coupled by a diffusive interaction. Phys. Rev. E 68, 061917.
Frank, K., Fuortes, M.G. (1955). Potentials recorded from the spinal cord with microelectrodes. J. Physiol. 130, 625-654.
Freidlin, M.I., Wentzell, A.D. (1998). Random Perturbations of Dynamical Systems, 2nd ed. Springer, New York.
Fuster, J.M., Alexander, G.E. (1971). Neuron activity related to short-term memory. Science, 652-654.
Gutkin, B., Ermentrout, G.B. (1998). Dynamics of membrane excitability determine interval variability: a link between spike generation mechanisms and cortical spike train statistics. Neural Comp. 10, 1047-1065.
Gutkin, B.S. et al. (2001). Turning on and off with p.Neurosc. 11:2, 121-134.
Gutkin, B., Hely, T., Jost, J. (2004). Noise delays onset of sustained firing in a minimal model of persistent activity. Neurocomputing 58-60, 753-760.
Hodgkin, A.L. (1948). The local changes associated with repetitive action in a non-medullated axon. J. Physiol. 107, 165-181.
Rinzel, J., Ermentrout, G.B. (1989). Analysis of neural excitability and oscillations; in: Koch, C. & Segev, I., eds. MIT Press.
Tateno, T., Harsch, A., Robinson, H.P.C. (2004). Threshold firing frequency-current relationships of neurons in rat somatosensory cortex: Type 1 and Type 2 dynamics. J. Neurophysiol. 92, 2283-2294.
Tuckwell, H.C. (1989). Stochastic Processes in the Neurosciences. SIAM, Philadelphia.
Yu, X., Lewis, E.R. (1989). Studies with spike initiators: linearization by noise allows continuous signal modulation in neural networks. IEEE Trans. Biomed. Eng. 36, 36-43.
An Efficient Approach to Nondominated Sorting for Evolutionary Multiobjective Optimization
Xingyi Zhang, Ye Tian, Ran Cheng, and Yaochu Jin, Senior Member, IEEE
Abstract: Evolutionary algorithms have been shown to be powerful for solving multiobjective optimization problems, in which nondominated sorting is a widely adopted technique in selection. This technique, however, can be computationally expensive, especially when the number of individuals in the population becomes large. This is mainly because in most existing nondominated sorting algorithms, a solution needs to be compared with all other solutions before it can be assigned to a front. In this paper we propose a novel, computationally efficient approach to nondominated sorting, termed efficient nondominated sort (ENS). In ENS, a solution to be assigned to a front needs to be compared only with those that have already been assigned to a front, thereby avoiding many unnecessary dominance comparisons. Based on this new approach, two nondominated sorting algorithms have been suggested. Both theoretical analysis and empirical results show that the ENS-based sorting algorithms are computationally more efficient than the state-of-the-art nondominated sorting methods.
Index Terms: Computational complexity, evolutionary multiobjective optimization, nondominated sorting, Pareto-optimality.
I. INTRODUCTION
Most real-world optimization problems are characterized by multiple objectives that often conflict with each other. For solving such multiobjective optimization problems (MOPs), a set of optimal solutions, known as Pareto-optimal solutions, instead of a single optimal solution, are to be achieved. Most classical optimization methods are inefficient in solving MOPs, since they can typically find only one Pareto-optimal solution in one run, which means that this kind of method has to be applied multiple times to achieve a Pareto-optimal solution set.
Manuscript received July 25, 2013; revised November 10, 2013; accepted February 6, 2014. Date of publication
March 13, 2014; date of current version March 27, 2015. This work was supported in part by the National Natural Science Foundation of China under Projects 61272152, 61033003, 91130034, 61373066, 61073116, 61003131, and 61202011; in part by the Ph.D. Programs Foundation, Ministry of Education of China under Project 20100142110072; in part by the Fundamental Research Funds for the Central Universities under Project 2010ZD001; in part by the Natural Science Foundation of Anhui Higher Education Institutions of China under Projects KJ2012A010 and KJ2013A007; and in part by the Scientific Research Foundation for Doctor of Anhui University under Project 02203104.
X. Zhang and Y. Tian are with the Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, School of Computer Science and Technology, Anhui University, Hefei 230039, China (e-mail: xyzhanghust@; field910921@).
R. Cheng and Y. Jin are with the Department of Computing, University of Surrey, Guildford, Surrey GU2 7XH, U.K. (e-mail: r.cheng@; yaochu.jin@).
Digital Object Identifier 10.1109/TEVC.2014.2308305
Over the past 20 years, a variety of evolutionary algorithms have been developed to tackle MOPs, for example, Pareto envelope-based selection algorithm II (PESA-II) [1], nondominated sorting genetic algorithm-II (NSGA-II) [2], strength Pareto evolutionary algorithm 2 (SPEA2) [3], and memetic Pareto archived evolution strategy (M-PAES) [4], to name just a few. These multiobjective evolutionary algorithms (MOEAs) are able to find a set of Pareto-optimal solutions in one single run.
Although various approaches have been adopted for selection [5], most MOEAs adopt the Pareto-based approach, that is, the qualities of the candidate solutions are compared using Pareto dominance. Among various dominance comparison mechanisms, nondominated sorting [2] has been shown to be very effective for finding Pareto-optimal solutions. Also much work has been done to efficiently store nondominated solutions found during search in an archive [6], [7].
Nondominated sorting is a
procedure where solutions in the population are assigned to different fronts based on their dominance relationships. Without loss of generality, we assume that the individuals in population P can be categorized into K Pareto fronts, denoted as F_i, i = 1, ..., K. According to nondominated sorting, all nondominated solutions in population P are assigned to front F_1; then the nondominated solutions in P − F_1, which is the set of solutions obtained by removing the solutions assigned to front F_1, are assigned to front F_2. This procedure repeats until all solutions in P are assigned to a front F_i, i = 1, ..., K. Note that each solution belonging to front F_j is dominated by at least one solution belonging to front F_i, if i < j, i, j = 1, 2, ..., K. Fig. 1 provides an illustrative example of a population of 13 solutions composed of four fronts, where both objectives are to be minimized.
Nondominated sorting is computationally intensive, in particular, when the population size increases. To address this problem, much research work has been dedicated to the improvement of the computational efficiency of this procedure. The idea of nondominated sorting was first suggested in [8] as a selection strategy for evolutionary multiobjective optimization, which was implemented in a multiobjective GA, termed NSGA [9]. Nondominated sorting in NSGA has a time complexity of O(MN^3) and a space complexity of O(N), where M is the number of objectives and N is the number of solutions in the population. A faster version of nondominated sorting, termed fast nondominated sort, was proposed in [2], where the time complexity is reduced to O(MN^2).
The fast nondominated sort, however, requires a larger storage space than the nondominated sorting in NSGA, which is increased to O(N^2). Jensen [10] adopted a divide-and-conquer strategy for nondominated sorting, the time complexity of which is O(N log^{M−1} N). Tang et al. [11] proposed a novel nondominated sorting approach based on arena's principle, that is, each winner will be the next arena host to be challenged. This approach has been proved to have the same time complexity as the fast nondominated sort, while empirical results show that it outperforms the fast nondominated sort in terms of computational efficiency, since it can achieve a time complexity O(MN√N) in some best cases. Clymont and Keedwell [12] proposed two improved approaches to nondominated sorting, called climbing sort and deductive sort, where some dominance relationships between solutions can be inferred based on recorded comparison results.
1089-778X © 2014 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See /publications_standards/publications/rights/index.html for more information.
Fig. 1. Population with 13 solutions of a biobjective minimization problem. The individuals can be divided into four fronts.
In this paper, we propose a new, computationally efficient approach to nondominated sorting, called efficient nondominated sort (ENS). ENS adopts an idea different from those used in the above-mentioned methods. The main difference lies in the fact that existing nondominated sorting approaches usually compare a solution with all other solutions in the population before assigning it to a front, while ENS compares it only with those that have already been assigned to a front. This is made possible by the fact that in ENS, the population is sorted in one objective before the ENS is applied. Thus, a solution added to the fronts cannot dominate any solutions that are added before. As a result, ENS can avoid a large number of redundant dominance comparisons, which significantly improves
the computational efficiency. Theoretical analysis shows that the ENS approach has a space complexity of O(1), which is smaller than that of all existing nondominated sorting methods. Meanwhile, the time complexity of ENS will be O(MN log N) in good cases, which is much lower than that of all existing algorithms. Even in the worst case, ENS has a complexity of O(MN^2), which is the same as the fast nondominated sort. Experimental results confirm that ENS has better computational efficiency than the state of the art.
The remainder of this paper is organized as follows. In Section II, we briefly review a few widely used nondominated sorting approaches and analyze their computational complexity. In Section III, we propose a new approach to nondominated sorting, ENS, based on which two nondominated sorting algorithms are developed. The computational complexities of the two algorithms are then analyzed. Simulation results are presented in Section IV to empirically compare the two ENS-based nondominated sorting algorithms with three state-of-the-art methods. Finally, conclusions and remarks are given in Section V.
II. RELATED WORK
In this section, we review a few popular nondominated sorting approaches together with an analysis of their computational complexities.
A. Nondominated Sorting Methods
Since Goldberg [8] suggested the use of nondominated sorting for selection in MOEAs, a number of nondominated sorting methods have been reported in the literature over the past years. In the following, we review a few nondominated sorting approaches widely used in MOEAs.
The nondominated sorting strategy was first adopted for selecting parents from offspring in NSGA for multiobjective optimization [9]. The nondominated sorting in NSGA is carried out as follows. Each solution is compared with all other solutions in the population, and solutions that are not dominated by any other solutions are assigned to front F_1. All solutions assigned to F_1 are temporarily removed from the population. Then each solution in the
remaining population is compared with others and all nondominated solutions are assigned to front F_2. This operation is repeated until all solutions have been assigned to a front. This approach contains many redundant comparisons in the sense that the comparison between two solutions may be performed more than once. The time complexity of this approach is O(MN^3), which makes NSGA highly time-consuming and computationally inefficient for large populations.
As an improved version of NSGA, Deb et al. [2] proposed a computationally more efficient nondominated sorting approach, called fast nondominated sort, where the comparison between any two solutions is performed only once. Fast nondominated sort has a time complexity of O(MN^2), albeit at the cost of an increased space complexity from O(N) to O(N^2).
A recursive nondominated sorting approach [10], usually called Jensen's sort, was suggested based on the divide-and-conquer mechanism, which reduces the time complexity to O(MN log N) for biobjective MOPs and to O(N log^{M−1} N) for MOPs having more than two objectives. The space complexity of this approach is O(1) and O(N) for MOPs with two objectives and more than two objectives, respectively. As the expression O(N log^{M−1} N) shows, the time complexity of Jensen's sort grows exponentially with the number of objectives. This means that Jensen's sorting method will not work efficiently for MOPs with a large number of objectives. Actually, this sorting method will likely consume more runtime in simulation due to its recursive nature. In addition, as Clymont and Keedwell [12] and Fang et al. [13] pointed out, Jensen's sorting algorithm is not applicable in many cases, for instance, when strong-dominance [14] or ε-dominance [15] is used in comparison or when the population contains duplicate solutions.
Tang et al. [11] used arena's principle to assign solutions to a front, which has been shown to have a better computational efficiency than the fast nondominated sort and Jensen's sort in empirical evaluations. This approach randomly selects one solution from the population, regarded as an arena host, and all the remaining solutions in the population are compared with the arena host. The solution that dominates the arena host becomes the new arena host to replace the current one. The time complexity and space complexity of this approach are O(MN^2) and O(N), respectively.
Clymont and Keedwell [12] proposed two nondominated sorting approaches: climbing sort and deductive sort. As shown in [12], deductive sort often performs better than climbing sort. Deductive sort infers the dominance relationship between solutions by recording the results of comparisons, thereby avoiding some unnecessary comparisons. Deductive sort holds a time complexity of O(MN^2) and a space complexity of O(N), which outperforms other approaches, for instance, the fast nondominated sort.
There are a few other nondominated sorting approaches inspired by different ideas, such as the nondominated rank sort of the omni-optimizer [16], better nondominated sort [17], immune recognition-based algorithm [18], quick sort [19], sorting-based algorithm [20], and divide-and-conquer-based nondominated sorting algorithm [13]. Most of these approaches are effective in dealing with MOPs that have a small number of objectives; however, their efficiency often seriously degrades as the number of objectives increases.
Fig. 2. Illustration of the commonly used strategy for nondominated sorting in most existing nondominated sorting approaches, which determines the front number of all solutions on the same front all at once, and solutions on different fronts sequentially.
B. Analysis of Existing Methods
Although existing nondominated sorting approaches perform front assignments based on various ideas, most of them can be described in a generic framework as shown in Fig. 2.
In this framework, solutions in different fronts are assigned front by front. For example, for a population P containing K fronts F_i, 1 ≤ i ≤ K, all nondominated solutions in P are first assigned to front F_1. Once this is done, the nondominated solutions in P − F_1 (the remaining population with all solutions assigned to F_1 being removed) can then be assigned to F_2. In other words, solutions belonging to front F_{i+1} cannot be assigned until all solutions belonging to F_i have been assigned.
Dominance comparisons between the solutions are the main operation in nondominated sorting, that is, the number of needed comparisons determines the efficiency of a nondominated sorting approach. Most existing nondominated sorting methods focus on the reduction of the number of comparisons to improve their computational efficiencies. The reason is that some dominance comparisons between solutions are unnecessary and can be spared. Taking a closer look, we find that the result of one dominance comparison can be categorized into the following four cases, assuming that solution p_m is compared with solution p_n.
Fig. 3. Categorization of dominance comparison results between two solutions in nondominated sorting.
1) Case 1: p_m is dominated by p_n, or p_n is dominated by p_m.
2) Case 2: p_m and p_n are nondominated, and they belong to the same front F_i, where F_i is the current front (i.e., the front that the solutions are being assigned to).
3) Case 3: p_m and p_n are nondominated, and they belong to the same front F_i, where F_i is not the current front.
4) Case 4: p_m and p_n are nondominated, but they belong to different fronts.
Recall that a solution is assigned to the current front if it is not dominated by any other solutions in the current population.
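The front-by-front framework can be sketched in a few lines of Python. This is a naive illustration for a minimization problem, not any of the specific algorithms reviewed above; the function names and the small test population are made up for the example.

```python
def dominates(p, q):
    """True if p Pareto-dominates q (minimization): p is no worse in every
    objective and strictly better in at least one."""
    return (all(a <= b for a, b in zip(p, q))
            and any(a < b for a, b in zip(p, q)))


def sort_front_by_front(population):
    """Naive nondominated sorting that peels off one front at a time.

    Returns (fronts, n_comparisons); each front is a list of indices
    into `population`.
    """
    remaining = list(range(len(population)))
    fronts, n_comparisons = [], 0
    while remaining:
        front = []
        for i in remaining:
            dominated = False
            for j in remaining:
                if i == j:
                    continue
                n_comparisons += 1
                if dominates(population[j], population[i]):
                    dominated = True
                    break  # Case 1: i is not in the current front
            if not dominated:
                front.append(i)
        fronts.append(front)
        remaining = [i for i in remaining if i not in front]
    return fronts, n_comparisons


# Hypothetical biobjective population.
population = [(1, 5), (2, 3), (4, 1), (3, 4), (5, 2), (6, 6)]
fronts, n_comparisons = sort_front_by_front(population)
print(fronts)  # [[0, 1, 2], [3, 4], [5]]
print(n_comparisons)
```

The counter makes the cost of this strategy visible: solutions that end up in later fronts are compared again in every pass, which is where the unnecessary comparisons of Cases 3 and 4 come from.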
In Case 1, if solution p_m dominates solution p_n, then p_n does not belong to the current front and we no longer need to perform additional comparisons between p_n and all other solutions in the current population, which means that such comparisons, if performed, are redundant. In Case 2, both p_m and p_n belong to the current front, so there does not exist any solution that can dominate p_m or p_n, and the comparison between p_m and p_n should be done to verify whether one dominates the other. In fact, all solutions belonging to the current front should be compared with each other to ensure that they are all nondominated with each other. In Case 3, neither p_m nor p_n belongs to the current front, which means that there exists at least one solution dominating p_m and a solution dominating p_n, and the comparison between p_m and p_n can be skipped. In Case 4, since there exists at least one solution dominating p_m or p_n, a comparison between p_m and p_n is unnecessary. The four cases of possible comparisons are illustrated in Fig. 3.
IEEE TRANSACTIONS ON EVOLUTIONARY COMPUTATION, VOL. 19, NO. 2, APRIL 2015
Fig. 4. Population containing six solutions for a biobjective minimization problem.
As shown in Fig. 3, for nondominated sorting, comparisons in Case 1 and comparisons in Case 2 cannot be avoided, which are termed necessary comparisons. The necessary comparisons in Case 1 refer to the comparisons between solutions in different fronts, while the necessary comparisons in Case 2 refer to comparisons between solutions in the same front. The number of necessary comparisons in Cases 1 and 2 is the theoretical minimum number of needed dominance comparisons for any nondominated sorting algorithm. If a nondominated sorting approach determines the front of a solution by starting to check from the first front to the last one, for example, from F_1 to F_4 in Fig. 1, the number of necessary comparisons in Case 1 can be calculated in the following way. Given a population consisting of N solutions that can be divided into K fronts,
assume front F_i contains N_i solutions, where 1 ≤ i ≤ K. So, we have N_1 + N_2 + ... + N_K = N. If a solution p_n belongs to front F_i, then at least one solution dominating p_n will be found in each of the preceding i − 1 fronts. This means that i − 1 comparisons in Case 1 are needed for solution p_n. Since there are N_i solutions belonging to F_i, a total of (i − 1)N_i comparisons in Case 1 are needed for front F_i. Therefore, the total number of necessary dominance comparisons between solutions in different fronts is
$$\mathrm{Num\_Comp1} = \sum_{i=1}^{K} (i-1) N_i. \tag{1}$$
Regarding the minimum number of needed comparisons in Case 2, since each of the N_i solutions in front F_i should be compared with the other solutions in front F_i, which needs a total of N_i(N_i − 1)/2 comparisons, the total number of comparisons between solutions in the same front is
$$\mathrm{Num\_Comp2} = \sum_{i=1}^{K} \frac{N_i (N_i - 1)}{2}. \tag{2}$$
Unfortunately, most popular nondominated sorting approaches have a total number of dominance comparisons much higher than the sum of Num_Comp1 and Num_Comp2.
From the above discussions, we can find that further improvements of the computational efficiencies of nondominated sorting approaches should concentrate on the reduction of unnecessary comparisons in Cases 1, 3, and 4. In fact, most existing improved nondominated sorting approaches have successfully removed unnecessary dominance comparisons in Case 1; however, they still perform many unnecessary comparisons belonging to Cases 3 and 4.
Fig. 5. Illustration of the proposed dominance comparison strategy, where solutions in the population can be assigned to the fronts one by one.
As an example, we consider the comparisons performed in deductive sort over a population shown in Fig. 4. This population contains six candidate solutions of a biobjective minimization problem, each being denoted by p_i(f_1, f_2), i = 1, 2, ..., 6, where f_1 and f_2 are the values of the two objectives of solution p_i, respectively. In this example, the six candidate solutions are p_1(5, 4), p_2(6, 3), p_3(7, 2), p_4(1, 6), p_5(2, 5), and p_6(3, 1).
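To make formulas (1) and (2) concrete, the following sketch evaluates them for the six-solution example. The fronts are obtained here by a simple repeated-removal procedure, which is an illustrative choice rather than any of the algorithms discussed; the function names are made up for this example.

```python
def dominates(p, q):
    """True if p Pareto-dominates q in a minimization problem."""
    return (all(a <= b for a, b in zip(p, q))
            and any(a < b for a, b in zip(p, q)))


def fronts_of(points):
    """Split points into fronts by repeatedly removing the nondominated ones."""
    remaining = list(points)
    fronts = []
    while remaining:
        front = [p for p in remaining
                 if not any(dominates(q, p) for q in remaining)]
        fronts.append(front)
        remaining = [p for p in remaining if p not in front]
    return fronts


def necessary_comparisons(front_sizes):
    """Evaluate (1) and (2) for front sizes N_1, ..., N_K."""
    num_comp1 = sum((i - 1) * n for i, n in enumerate(front_sizes, start=1))
    num_comp2 = sum(n * (n - 1) // 2 for n in front_sizes)
    return num_comp1, num_comp2


# The six candidate solutions p1, ..., p6 from the example.
points = [(5, 4), (6, 3), (7, 2), (1, 6), (2, 5), (3, 1)]
sizes = [len(front) for front in fronts_of(points)]
num_comp1, num_comp2 = necessary_comparisons(sizes)
print(sizes)                 # [3, 3]
print(num_comp1, num_comp2)  # 3 6
```

For this population the theoretical minimum is therefore 3 + 6 = 9 dominance comparisons, compared with 15 if every pair of the six solutions were compared exactly once.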
As shown in Fig. 4, there are two fronts in this population, where solutions p_4, p_5, and p_6 belong to the first front F_1, and solutions p_1, p_2, and p_3 belong to F_2. Deductive sort performs the following comparisons to assign the solutions to one of the two fronts. It begins with comparing solution p_1 with all other solutions in the population one by one. Solution p_1 is first compared with solution p_2. Since p_1 is not dominated by p_2, deductive sort continues to perform the comparison between p_1 and p_3. The comparison result indicates that solution p_1 is not dominated by p_3 either. Similarly, p_1 will be further compared with p_4 and p_5, and it is concluded that p_1 is dominated neither by p_4 nor by p_5. Then, p_1 is compared with p_6, and it will be found that p_1 is dominated by p_6, which means that p_1 does not belong to the current front F_1. By then, solution p_1 will no longer be involved in further comparisons of front F_1 in deductive sort.
In the above procedure, the following dominance comparisons have been made: two comparisons of Case 3 (comparisons between p_1 and p_2, or p_3), two comparisons of Case 4 (comparisons between p_1 and p_4, or p_5), and one comparison of Case 1 (a comparison between p_1 and p_6). After p_1 is compared with all the other solutions, deductive sort starts to consider solution p_2. Dominance comparison will continue until all solutions are assigned to a front. Table I lists all comparisons performed by deductive sort for the population shown in Fig. 4.
From Table I, it is not difficult to see that there exist many unnecessary comparisons performed by deductive sort, which belong to Cases 3 and 4. Among these unnecessary comparisons, several of them are duplicate comparisons, such as the comparisons between p_1 and p_2. In fact, all duplicate comparisons in deductive sort belong to Case 3. In this paper, we propose a new nondominated sorting algorithm using a strategy different from the one illustrated in Fig. 2, which aims to avoid duplicate comparisons, thereby considerably reducing the number of unnecessary
comparisons.

III. EFFICIENT NONDOMINATED SORT FRAMEWORK

Here, we present a new efficient nondominated sorting strategy, termed ENS, which is conceptually different from most existing nondominated sorting methods. The main idea of the ENS approach is shown in Fig. 5. By comparing Figs. 2 and 5, we can see that the ENS approach determines the front each solution belongs to one by one, while most existing nondominated sorting approaches determine the front of all solutions on the same front as a whole. The main merit of determining the front to which each solution belongs separately is that it can avoid duplicate comparisons, since in this approach, a solution to be assigned only needs to be compared with solutions that have already been assigned to a front. The details of ENS are given in Algorithm 1.

ZHANG et al.: EFFICIENT APPROACH TO NONDOMINATED SORTING FOR EVOLUTIONARY MULTIOBJECTIVE OPTIMIZATION

TABLE I. COMPARISONS PERFORMED BY DEDUCTIVE SORT FOR THE POPULATION SHOWN IN FIG. 4

Algorithm 1 Main Steps of ENS for Nondominated Sorting
Input: population P
Output: the set of fronts F
    F = empty;
    Sort P in an ascending order of the first objective value;
    for all P[n] ∈ sorted P do
        Assign solution P[n] into F by Algorithm 2 or Algorithm 3;
    end for
    return F;

For a minimization problem, this approach first sorts the $N$ solutions in population $P$ in an ascending order according to the first objective value, where $N$ is the population size. If the first objective values of two solutions are the same, then they are sorted according to the second objective value. This procedure continues until all individuals in the population are sorted. If solutions have the same value in all objectives, their order can be arbitrary. For this sorted population $P$, a solution $p_m$ will never be dominated by a solution $p_n$, if $m < n$, since there exists at least one objective in $p_m$ whose value is smaller than that of the same objective in $p_n$. This means that there exist only two possible relationships between the two solutions: either $p_m$ dominates $p_n$, or $p_m$ and $p_n$ are not comparable.

Algorithm 2 Sequential Search Strategy for Finding the Front of a Solution
Input: solution P[n], the set of fronts F
Output: the front number of solution P[n]
    x = size(F); {the number of fronts having been found}
    k = 1; {the front now checked}
    while true do
        compare P[n] with the solutions in F[k] starting from the last one and ending with the first one;
        if F[k] contains no solution dominating P[n] then
            return k; {move P[n] to F[k]}
            break;
        else
            k++;
            if k > x then
                return x + 1; {move P[n] to a new front}
                break;
            end if
        end if
    end while

After finishing sorting the individuals in population $P$, ENS begins to assign solutions to fronts in the sorted population $P$ one by one, starting from the first solution $p_1$ and ending with the last one $p_N$. As we know, if a solution is assigned to a front, it is dominated by at least one solution in the preceding front. As pointed out earlier, a solution can never be dominated by any succeeding solution in the sorted population $P$. Therefore, it is sufficient to compare a solution with those that have already been assigned to a front to determine the front of this solution. The possible relationships between a solution to be assigned and those that have been assigned to a front are shown in Fig. 6. Actually, if a solution $p_n$ is assigned to front $F_i$, $F_i$ must satisfy the following two conditions.

1) There exists at least one solution in each front $F_j$ that has been assigned and dominates $p_n$, for $1 \le j \le i-1$.
2) There exists no solution in any of the assigned fronts $F_k$ that dominates $p_n$, for $k \ge i$.

In this way, the front to which a solution belongs can be determined by finding out the front that satisfies the above two conditions. In what follows, we present two strategies for searching for the front satisfying the above two conditions within the ENS framework, one using a sequential search strategy (termed ENS-SS) and the other using a binary search
strategy (ENS-BS).

A. Sequential Search Strategy

The pseudocode of the sequential search is presented in Algorithm 2. The idea in this search strategy is quite straightforward. For solution $p_n$, the algorithm checks at first whether there exists a solution that has been assigned to the first front $F_1$ and dominates $p_n$. If such a solution does not exist, assign $p_n$ to front $F_1$. If $p_n$ is dominated by any solution in $F_1$, start comparing $p_n$ with the solutions assigned to $F_2$. If no solution in front $F_2$ dominates $p_n$, assign $p_n$ to front $F_2$. If $p_n$ is not assigned to any of the existing fronts, create a new front and assign $p_n$ to this new front.

There is a little trick in checking whether a front has a solution dominating $p_n$. Recall that solutions assigned to an existing front are also sorted in the same order as the population. Therefore, the comparisons between $p_n$ and the solutions assigned to the front should start with the last one in the front and end with the first one. This trick often leads to fewer comparisons if a solution assigned to this front dominates $p_n$, since solutions in the end of the sorted front are more likely to dominate $p_n$. As a result, unnecessary comparisons can be avoided.

IEEE TRANSACTIONS ON EVOLUTIONARY COMPUTATION, VOL. 19, NO. 2, APRIL 2015

Fig. 6. Relationships between $p_n$ and the solutions having been assigned to a front.

For biobjective optimization problems, the idea presented here is computationally more efficient than existing nondominated sorting methods. In fact, as shown in Algorithm 2, only one comparison is sufficient for determining whether a solution to be assigned belongs to an existing front. The reason is as follows. In the sorting, solutions assigned to a front are sorted in the ascending order of the first objective, which means that the second objectives of these solutions are in the descending order, since these solutions are nondominated with each other. This means that, if $p_n$, a solution to be assigned, is dominated by a solution in an existing
front, it should be dominated by the last solution in the front, since the last solution has the smallest value in the second objective among all solutions in the front. Therefore, for biobjective optimization problems, this method can determine the front number of a solution by performing only the comparisons between this solution and the last solution in each front.

In Table II, we list the comparisons performed by ENS using the sequential search strategy (ENS-SS) for nondominated sorting of the population given in Fig. 4. As shown in the table, ENS-SS needs only nine comparisons in total, which is much smaller than the number of comparisons needed by deductive sort, refer to Table I. It is not difficult for the reader to check that there does not exist any comparison belonging to Case 3 in ENS-SS. This means that ENS-SS does not perform any duplicate comparisons, which can be attributed to the ENS strategy shown in Fig. 5. It should be noted that although ENS-SS does not perform any comparison belonging to Case 4 for the population shown in Fig. 4, such comparisons may occur for other populations.

Algorithm 3 Binary Search Strategy for Finding the Front of a Solution
Input: solution P[n], the set of fronts F
Output: the front number of solution P[n]
    x = size(F); {the number of fronts having been found}
    kmin = 0; {the lower bound for checking}
    kmax = x; {the upper bound for checking}
    k = ⌊(kmax + kmin)/2 + 1/2⌋; {the front now checked}
    while true do
        Compare P[n] with the solutions in F[k] starting from the last one and ending with the first one;
        if F[k] has no solution dominating P[n] then
            if k == kmin + 1 then
                return k; {move P[n] to F[k]}
                break;
            else
                kmax = k;
                k = ⌊(kmax + kmin)/2 + 1/2⌋;
            end if
        else
            kmin = k;
            if (kmax == kmin + 1) and (kmax < x) then
                return kmax; {move P[n] to F[kmax]}
                break;
            else if kmin == x then
                return x + 1; {move P[n] to a new front}
                break;
            else
                k = ⌊(kmax + kmin)/2 + 1/2⌋;
            end if
        end if
    end while

B. Binary Search Strategy

The pseudocode of the binary search strategy is presented in
Algorithm 3. Different from sequential search, the binary search strategy starts with checking the intermediate front $F_{\lceil L/2 \rceil}$ instead of the first front $F_1$, where $L$ is the number of fronts that have been created thus far, that is, before solution $p_n$ is assigned. If solution $p_n$ is not dominated by any solution in front $F_{\lceil L/2 \rceil}$, then solution $p_n$ will be compared with the solutions in front $F_{\lceil L/4 \rceil}$. Otherwise, $p_n$ is compared with the solutions in front $F_{\lceil 3L/4 \rceil}$. In this way, the binary search can determine the front to which solution $p_n$ belongs after checking $\lceil \log(L+1) \rceil$ fronts. If the last existing front $F_L$ has been checked and $p_n$ does not belong to this front, a new front $F_{L+1}$ will be created and solution $p_n$ is assigned to this new front.

The binary search strategy adopted here usually outperforms the sequential search strategy in that it requires checking fewer fronts in a population. But this does not mean that the binary search strategy can always perform fewer comparisons than the sequential search strategy in front assignment. In binary search, it can happen that more than one front that does not have any solution dominating $p_n$ needs to be checked. All solutions in these checked fronts have to be compared with $p_n$. In sequential search, at most one front containing no solution dominating $p_n$ needs to be checked. Therefore, the
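The ENS framework and its two search strategies can be sketched compactly in Python. This is an illustrative sketch under stated assumptions, not a line-by-line transcription of Algorithms 1 to 3: solutions are assumed to be tuples of objective values to be minimized, fronts are indexed from 0 rather than 1, the function names are hypothetical, and `binary_search` uses a standard bisection over the fronts in place of the exact index arithmetic of Algorithm 3.

```python
def dominates(a, b):
    """True if a Pareto-dominates b (minimization)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def front_dominates(front, s):
    """Scan a front from its last member to its first, as the text suggests."""
    return any(dominates(q, s) for q in reversed(front))


def sequential_search(fronts, s):
    """0-based index of the first front with no member dominating s."""
    for k, front in enumerate(fronts):
        if not front_dominates(front, s):
            return k
    return len(fronts)  # dominated in every existing front: new front needed


def binary_search(fronts, s):
    """Same answer as sequential_search, found by bisecting over the fronts.

    Valid because dominance is transitive: if front k holds a solution
    dominating s, so does every earlier front.
    """
    lo, hi = 0, len(fronts)
    while lo < hi:
        mid = (lo + hi) // 2
        if front_dominates(fronts[mid], s):
            lo = mid + 1   # s belongs to a later front
        else:
            hi = mid       # s belongs to this front or an earlier one
    return lo


def ens(population, search=sequential_search):
    """Assign solutions to fronts one by one after a lexicographic sort."""
    fronts = []
    for s in sorted(population):  # first objective, ties broken by the next
        k = search(fronts, s)
        if k == len(fronts):
            fronts.append([])
        fronts[k].append(s)
    return fronts
```

Calling `ens` on the six-solution example population given earlier, with either search strategy, groups $p_4$, $p_5$, $p_6$ into the first front and $p_1$, $p_2$, $p_3$ into the second. Passing the search routine as a parameter mirrors the way Algorithm 1 delegates the front lookup to Algorithm 2 or Algorithm 3.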
Aspects of Gravitational Clustering
mode, labeled by a wave vector $\mathbf{k}$. Here $\hat{L}_{\mathbf{k}}$ is a linear second order differential operator in time. Solving this set of ordinary differential equations, with given initial conditions, we can determine the evolution of each mode separately. [Similar procedure, of course, works for the case with $\Omega \neq 1$. In this case, the mode functions will be more complicated than the plane waves; but, with a suitable choice of orthonormal functions, we can obtain a similar set of equations]. This solves the problem of linear gravitational clustering completely. There is, however, one major conceptual difficulty in interpreting the results of this program. In general relativity, the form (and numerical value) of the metric coefficients $g_{\alpha\beta}$ (or the stress-tensor components $T_{\alpha\beta}$) can be changed by a relabeling of coordinates $x^{\alpha} \to x^{\alpha'}$. By such a trivial change we can make a small $\delta T_{\alpha\beta}$ large or even generate a component which was originally absent. Thus the perturbations may grow at different rates, or even decay, when we relabel coordinates. It is necessary to tackle this ambiguity before we can meaningfully talk about the growth of inhomogeneities. There are two different approaches to handling such difficulties in general relativity. The first method is to resolve the problem by force: We may choose a particular coordinate system and compute everything in that coordinate system. If the coordinate system is physically well motivated, then the quantities computed in that system can be interpreted easily; for example, we will treat $\delta T^{0}_{0}$ to be the perturbed mass (energy) density even though it is coordinate dependent. The difficulty with this method is that one cannot fix the gauge completely by simple physical arguments; the residual gauge ambiguities do create some problems. The second approach is to construct quantities (linear combinations of various perturbed physical variables) which are scalars under coordinate transformations. [see e.g.
the contribution by Brandenberger to this volume and references cited therein] Einstein's equations are then rewritten as equations for these gauge invariant quantities. This approach, of course, is manifestly gauge invariant from start to finish. However, it is more complicated than the first one; besides, the gauge invariant objects do not, in general, possess any simple physical interpretation. In these lectures, we shall be mainly concerned with the first approach. Since the gauge ambiguity is a purely general relativistic effect, it is necessary to determine when such effects are significant. The effects due to the curvature of space-time will be important at length scales bigger than (or comparable to) the Hubble radius, defined as $d_H(t) \equiv (\dot{a}/a)^{-1}$. Writing
PACS No.:05.40.+j, 82.20.Mj
I. Introduction

There has been much discussion recently [1-10] on how to model a physical system that could extract work out of random fluctuations without having to apply directly an obvious biased force, taking, in fact, a cue from (or perhaps yearning to explain) an experimentally observed phenomenon [11] of predominantly unidirectional motion of macromolecules (biological motors) along microtubules. We reason in the present work that system inhomogeneity may provide a clear and unifying framework to approach the problem of macroscopic motion under discussion. Macroscopic unidirectional motion of a particle is not possible thermodynamically in the presence of equilibrium fluctuations. However, such a motion can be obtained in a nonequilibrium situation where the principle of detailed balance does not hold. The existing popular models [1-10], currently in the literature, mostly take the fluctuations to be nonequilibrium, that is, consider nonwhite or at least non-Gaussian white (colored) noise together with a ratchetlike periodic system potential to aid asymmetric motion of an overdamped Brownian particle. The ratchetlike periodic system potentials, $V(q)$, obviously violate parity $V(q) = V(-q)$. For such a ratchetlike potential one can readily calculate the steady current flow $J(F)$ of a Brownian particle in the presence of an external field $F$. It turns out that $J(F)$ is not an odd function of $F$ and, in general, $J(F) \neq -J(-F)$. In other words, reversal of the external force may not lead to a reversed current of the same magnitude, in sharp contrast to the case of a nonratchetlike (symmetric) periodic potential system where $J(F) = -J(-F)$ follows. From this general observation, in a ratchetlike potential, it can be easily concluded that on application of a zero time averaged periodic field, say $F = F_0 \sin \omega t$, one can obtain net unidirectional current. Of course, the direction and magnitude of the average velocity depend on the external field parameters, $F_0$ and $\omega$.
A careful tuning of the relevant parameters may even result in the reversal of the macroscopic current [2]. This is the basic physics behind some of the physical models used to obtain current rectification in a periodic potential system. There are models, however, that do not use oscillating external fields. Instead, colored noise of zero average strength (dichotomous, Ornstein-Uhlenbeck, Kangaroo processes, ... [3-5]) is used to drive the Brownian particle to obtain macroscopic motion in a ratchetlike potential system. There are further interesting models where the potential barriers themselves are allowed to fluctuate, for instance, with finite time correlations between two states under the influence of a noise source. An example is an overdamped Brownian particle subjected to a ratchetlike periodic potential, where the ratchetlike saw-tooth potential is switched on to its full strength for time $\tau_{\mathrm{on}}$, during which the Brownian particle slides down the potential slope to the bottom of the potential trench. At the end of $\tau_{\mathrm{on}}$, the system is put in the other (off) state, during which the potential is set equal to a constant (say, 0) for an interval $\tau_{\mathrm{off}}$ and the particle executes force-free diffusive motion. At the end of $\tau_{\mathrm{off}}$ the system is put back in the on state for an interval $\tau_{\mathrm{on}}$. This process of flipping of states is repeated ad infinitum. If $\tau_{\mathrm{off}}$ is adjusted in such a way that by the end of $\tau_{\mathrm{off}}$ the diffusive motion just takes the particle out of the (now nonexistent) potential trench in the steeper slope direction (smaller distance) of the saw-tooth potential but fails to do so in the gentler slope direction (larger distance), the immediate next on interval will take the particle to the adjacent trench minimum on the steeper slope side of the saw-tooth potential. Repetition of such sequential flippings of states a large number of times leads to a net unidirectional macroscopic current of the Brownian particle.
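The on/off argument above can be turned into a rough estimate of the net hopping probability per cycle. The Python sketch below is an illustration under simplifying assumptions that are not taken from the paper: the particle is taken to sit exactly at the trench minimum when the potential is switched off, free diffusion over the off interval is Gaussian with variance $2D\tau_{\mathrm{off}}$, and any excursion beyond a barrier position is counted as a single hop to the neighbouring trench. All names and parameters (`asymmetry`, `period`, `diffusion`, `t_off`) are hypothetical.

```python
import math


def crossing_probability(distance, diffusion, t_off):
    """Probability that free 1-D diffusion ends farther than `distance`
    to one side of the starting point after time t_off."""
    # Displacement is Gaussian with variance 2*D*t_off, so the one-sided
    # tail probability is 0.5 * erfc(distance / sqrt(4*D*t_off)).
    return 0.5 * math.erfc(distance / math.sqrt(4.0 * diffusion * t_off))


def net_step_probability(asymmetry, period, diffusion, t_off):
    """Estimated net probability per on/off cycle of hopping one trench
    toward the steeper-slope side.

    `asymmetry` is the fraction of the period separating a minimum from the
    nearest barrier top on the steep side (0.5 means a symmetric potential).
    """
    steep = crossing_probability(asymmetry * period, diffusion, t_off)
    gentle = crossing_probability((1.0 - asymmetry) * period, diffusion, t_off)
    return steep - gentle
```

With `asymmetry` below 0.5 the barrier on the steep side is closer, so `steep` exceeds `gentle` and the estimate is positive; with `asymmetry` equal to 0.5 the two terms cancel, which is the sketch's counterpart of the statement that a symmetric potential gives no net current.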
It should be noted that a symmetrical nonratchetlike potential would, instead, have yielded symmetrical excursions of the Brownian particle and, hence, no net unidirectional
Enslaving random fluctuations in nonequilibrium systems
arXiv:cond-mat/9603103v1 14 Mar 1996
Mangal C. Mahato, T. P. Pareek and A. M. Jayannavar Institute of Physics, Sachivalaya Marg, Bhubaneswar-751005, INDIA.
Abstract

Several physical models have recently been proposed to obtain unidirectional motion of an overdamped Brownian particle in a periodic potential system. The asymmetric ratchetlike form of the periodic potential and the presence of correlated nonequilibrium fluctuating forces are considered essential to obtain such a macroscopic motion in homogeneous systems. In the present work, instead, inhomogeneous systems are considered, wherein the friction coefficient and/or temperature could vary in space. We show that unidirectional motion can be obtained even in a symmetric nonratchetlike periodic potential system in the presence of white noise fluctuations. We consider four different cases of system inhomogeneity. We argue that all these different models work under the same basic principle of alteration of relative stability of otherwise locally stable states in the presence of temperature inhomogeneity.