Derivatives of repeated eigenvalues and corresponding eigenvectors of damped systems


Cleve Moler - MATLAB Numerical Computing, Chapter 10: Eigenvalues and Singular Values


10.1 Eigenvalue and Singular Value Decompositions
An eigenvalue and eigenvector of a square matrix A are a scalar λ and a nonzero vector x so that Ax = λx. A singular value and pair of singular vectors of a square or rectangular matrix A are a nonnegative scalar σ and two nonzero vectors u and v so that

  Av = σu,  A^H u = σv.

The superscript on A^H stands for Hermitian transpose and denotes the complex conjugate transpose of a complex matrix. If the matrix is real, then A^T denotes the same matrix. In MATLAB, these transposed matrices are denoted by A'.

The term "eigenvalue" is a partial translation of the German "Eigenwert." A complete translation would be something like "own value" or "characteristic value," but these are rarely used. The term "singular value" relates to the distance between a matrix and the set of singular matrices.

Eigenvalues play an important role in situations where the matrix is a transformation from one vector space onto itself. Systems of linear ordinary differential equations are the primary examples. The values of λ can correspond to frequencies of vibration, or critical values of stability parameters, or energy levels of atoms. Singular values play an important role where the matrix is a transformation from one vector space to a different vector space, possibly with a different dimension. Systems of over- or underdetermined algebraic equations are the primary examples.
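The defining relations above can be checked numerically. A minimal sketch with NumPy (the matrices are arbitrary illustrative choices, not from the text):

```python
import numpy as np

# A rectangular matrix has singular values only: A v = sigma * u, A^H u = sigma * v
A = np.array([[2.0, 1.0],
              [1.0, 3.0],
              [0.0, 1.0]])
U, s, Vh = np.linalg.svd(A, full_matrices=False)
u, v, sigma = U[:, 0], Vh[0, :], s[0]
assert np.allclose(A @ v, sigma * u)
assert np.allclose(A.conj().T @ u, sigma * v)

# Eigenvalues need a square matrix: A x = lambda * x
B = np.array([[2.0, 1.0],
              [1.0, 3.0]])
lam, X = np.linalg.eig(B)
assert np.allclose(B @ X[:, 0], lam[0] * X[:, 0])
print("eigenvalue and singular value relations verified")
```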

Fractional gradient and Hamiltonian systems


Institute of Physics Publishing, Journal of Physics A: Mathematical and General
J. Phys. A: Math. Gen. 38 (2005) 5929–5943    doi:10.1088/0305-4470/38/26/007

Fractional generalization of gradient and Hamiltonian systems

Vasily E Tarasov
Skobeltsyn Institute of Nuclear Physics, Moscow State University, Moscow 119992, Russia
E-mail: tarasov@theory.sinp.msu.ru

Received 11 April 2005, in final form 23 May 2005
Published 15 June 2005
Online at /JPhysA/38/5929

Abstract
We consider a fractional generalization of Hamiltonian and gradient systems. We use differential forms and exterior derivatives of fractional orders. We derive a fractional generalization of the Helmholtz conditions for phase space. Examples of fractional gradient and Hamiltonian systems are considered. The stationary states for these systems are derived.

PACS numbers: 45.20.−d, 05.45.−a

1. Introduction

Derivatives and integrals of fractional order [1, 2] have found many applications in recent studies in physics. The interest in fractional analysis has been growing continually during the past few years. Fractional analysis has numerous applications: kinetic theories [3, 4, 9], statistical mechanics [10–12], dynamics in complex media [13–17] and many others [5–8]. The theory of derivatives of non-integer order goes back to Leibniz, Liouville, Grünwald, Letnikov and Riemann. In the past few decades, many authors have pointed out that fractional-order models are more appropriate than integer-order models for various real materials.
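Before the formal development below, a numerical illustration may help fix ideas. The Riemann–Liouville fractional derivative (defined in equation (8) of section 3 of this paper) can be computed by quadrature and compared against the closed-form power rule (equation (9)). A minimal sketch with SciPy; the values of α, k and the evaluation point are arbitrary illustrative choices:

```python
import math
from scipy.integrate import quad

alpha, k = 0.5, 2.0  # fractional order (0 < alpha < 1, so m = 1) and power

def frac_integral(x):
    # (1/Gamma(1-alpha)) * integral_0^x y^k (x-y)^(-alpha) dy, the inner
    # integral of (8) for m = 1; the endpoint singularity (x-y)^(-alpha)
    # is handled by quad's algebraic weight.
    val, _ = quad(lambda y: y**k, 0.0, x, weight='alg', wvar=(0.0, -alpha))
    return val / math.gamma(1.0 - alpha)

# Riemann-Liouville derivative = d/dx of the fractional integral (central difference)
x0, h = 1.5, 1e-4
numeric = (frac_integral(x0 + h) - frac_integral(x0 - h)) / (2.0 * h)

# Power rule (9): D^alpha_x x^k = Gamma(k+1)/Gamma(k+1-alpha) * x^(k-alpha)
exact = math.gamma(k + 1.0) / math.gamma(k + 1.0 - alpha) * x0**(k - alpha)
assert abs(numeric - exact) < 1e-3 * abs(exact)
print("power rule (9) confirmed by quadrature")
```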
Fractional derivatives provide an excellent instrument for the description of memory and hereditary properties of various materials and processes. This is the main advantage of fractional derivatives in comparison with classical integer-order models, in which such effects are, in fact, neglected. The advantages of fractional derivatives become apparent in modelling mechanical and electrical properties of real materials, as well as in the description of rheological properties of rocks, and in many other fields.

In this paper, we use a fractional generalization of exterior calculus that was suggested in [18, 19]. Fractional generalizations of differential forms and exterior derivatives were defined in [18]. This allows us to consider the fractional generalization of Hamiltonian and gradient dynamical systems [20, 21]. The suggested class of fractional gradient and Hamiltonian systems is wider than the usual class of gradient and Hamiltonian dynamical systems. The gradient and Hamiltonian systems can be considered as a special case of fractional gradient and Hamiltonian systems.

In section 2, a brief review of gradient systems and exterior calculus is given to fix notation and provide a convenient reference. In section 3, a brief review of fractional (exterior) calculus is given. In section 4, a definition of the fractional generalization of gradient systems is suggested. In section 5, we consider a fractional gradient system that cannot be considered as a gradient system. In section 6, we prove that the dynamical system defined by the well-known Lorenz equations [23, 24] can be considered as a fractional gradient system. In section 7, a brief review of Hamiltonian systems is given. In section 8, we consider the fractional generalization of Hamiltonian systems and the Helmholtz conditions. In section 9, a simple
example of fractional Hamiltonian systems is discussed. Finally, a short conclusion is given in section 10.

2. Gradient systems

In this section, a brief review of gradient systems and exterior calculus [21] is given to fix notation and provide a convenient reference. Gradient systems arise in dynamical systems theory [20–22]. They are described by the equation dx/dt = −grad V(x), where x ∈ R^n. In Cartesian coordinates, the gradient is given by grad V = e_i ∂V/∂x_i, where x = e_i x_i. Here and later, we sum over the repeated indices i and j from 1 to n.

Definition 1. A dynamical system described by the equations

  dx_i/dt = F_i(x)   (i = 1, ..., n)   (1)

is called a gradient system in R^n if the differential 1-form

  ω = F_i(x) dx_i   (2)

is an exact form ω = −dV, where V = V(x) is a continuously differentiable function (0-form). Here, d is the exterior derivative [21].

Let V = V(x) be a real, continuously differentiable function on R^n. The exterior derivative of the function V is the 1-form dV = dx_i ∂V/∂x_i, written in a coordinate chart (x_1, ..., x_n).

In mathematics [21], the concepts of closed form and exact form are defined for differential forms: dω = 0 for a given form ω to be a closed form, and ω = dh for an exact form. It is known that to be exact is a sufficient condition to be closed. In abstract terms, the question of whether this is also a necessary condition is a way of detecting topological information by differential conditions.

Let us consider the 1-form (2). The formula for the exterior derivative d of the differential form (2) is

  dω = (1/2)(∂F_i/∂x_j − ∂F_j/∂x_i) dx_j ∧ dx_i,

where ∧ is the wedge product. Therefore, the condition for ω to be closed is

  ∂F_i/∂x_j − ∂F_j/∂x_i = 0.

In this case, if V(x) is a potential function, then dV = dx_i ∂V/∂x_i. The implication from 'exact' to 'closed' is then a consequence of the symmetry of the second derivatives:

  ∂²V/∂x_i∂x_j = ∂²V/∂x_j∂x_i.   (3)

If the function V = V(x) is a smooth function, then the second derivatives commute, and
equation (3) holds.

The fundamental topological result here is the Poincaré lemma. It states that for a contractible open subset X of R^n, any smooth p-form β defined on X that is closed is also exact, for any integer p > 0 (this has content only when p is at most n). This is not true for an open annulus in the plane, for some 1-forms ω that fail to extend smoothly to the whole disc, so some topological condition is necessary. A space X is contractible if the identity map on X is homotopic to a constant map. Every contractible space is simply connected. A space is simply connected if it is path connected and every loop is homotopic to a constant map.

Proposition 1. If a smooth vector field F = e_i F_i(x) of system (1) satisfies the relations

  ∂F_i/∂x_j − ∂F_j/∂x_i = 0   (4)

on a contractible open subset X of R^n, then the dynamical system (1) is a gradient system such that

  dx_i/dt = −∂V(x)/∂x_i.   (5)

This proposition is a corollary of the Poincaré lemma, which states that for a contractible open subset X of R^n any smooth 1-form (2) defined on X that is closed is also exact. The equations of motion for a gradient system on a contractible open subset X of R^n can be represented in the form (5). Therefore, gradient systems can be defined by the potential function V = V(x). If the exact differential 1-form ω is equal to zero (dV = 0), then we get the equation

  V(x) − C = 0,   (6)

which defines the stationary states of the gradient dynamical system (5). Here, C is a constant.

3. Fractional differential forms

If the partial derivatives in the definition of the exterior derivative d = dx_i ∂/∂x_i are allowed to assume fractional order, a fractional exterior derivative can be defined [18] by the equation

  d^α = (dx_i)^α D^α_{x_i}.   (7)

Here, we use the fractional derivative D^α_x in the Riemann–Liouville form [1], defined by the equation

  D^α_x f(x) = (1/Γ(m−α)) (∂^m/∂x^m) ∫_0^x f(y) dy / (x−y)^{α−m+1},   (8)

where m is the first whole number greater than or equal to α. The initial point of the fractional derivative [1] is set to zero. The derivative of the power k of x
is

  D^α_x x^k = (Γ(k+1)/Γ(k+1−α)) x^{k−α},   (9)

where k ≥ 1 and α ≥ 0. The derivative of a constant C need not be zero:

  D^α_x C = (x^{−α}/Γ(1−α)) C.   (10)

For example, the fractional exterior derivative of order α of x_1^k, with the initial point taken to be zero and n = 2, is given by

  d^α x_1^k = (dx_1)^α D^α_{x_1} x_1^k + (dx_2)^α D^α_{x_2} x_1^k.   (11)

Using equation (9), we get the following relation for the fractional exterior derivative of x_1^k:

  d^α x_1^k = (dx_1)^α (Γ(k+1)/Γ(k+1−α)) x_1^{k−α} + (dx_2)^α x_1^k x_2^{−α}/Γ(1−α).

4. Fractional gradient systems

A fractional generalization of exterior calculus was suggested in [18, 19]. A fractional exterior derivative and fractional differential forms were defined [18]. This allows us to consider the fractional generalization of gradient systems. Let us consider a dynamical system defined by the equation dx/dt = F on a subset X of R^n. In Cartesian coordinates, we can use the following equation:

  dx_i/dt = F_i(x),   (12)

where i = 1, ..., n, x = e_i x_i and F = e_i F_i(x). The fractional analogue of definition 1 has the following form.

Definition 2. A dynamical system (12) is called a fractional gradient system if the fractional differential 1-form

  ω_α = F_i(x) (dx_i)^α   (13)

is an exact fractional form ω_α = −d^α V, where V = V(x) is a continuously differentiable function.

Using the definition of the fractional exterior derivative, equation (13) can be represented as

  ω_α = −d^α V = −(dx_i)^α D^α_{x_i} V.

Therefore, we have F_i(x) = −D^α_{x_i} V. Note that equation (13) is a fractional generalization of equation (2). If α = 1, then equation (13) leads us to equation (2). Obviously, a fractional 1-form ω_α can be closed when the 1-form ω = ω_1 is not closed. The fractional generalization of the Poincaré lemma is considered in [19]. Therefore, we have the following proposition.

Proposition 2. If a smooth vector field F = e_i F_i(x) on a contractible open subset X of R^n satisfies the relations

  D^α_{x_j} F_i − D^α_{x_i} F_j = 0,   (14)

then the dynamical system (12) is a fractional gradient system such that

  dx_i/dt = −D^α_{x_i} V(x),   (15)

where V(x) is a continuously differentiable function and D^α_{x_i} V = −F_i.
Proof. This proposition is a corollary of the fractional generalization of the Poincaré lemma [19]. The Poincaré lemma is shown [18, 19] to be true for the fractional exterior derivative. Relations (14) are the fractional generalization of relations (4).

Note that the fractional derivative of a constant need not be zero (10). Therefore, constants C in the equation V(x) = C cannot define a stationary state of the gradient system (15). It is easy to see that

  D^α_{x_i} V(x) = D^α_{x_i} C = (x_i^{−α}/Γ(1−α)) C ≠ 0.

In order to define stationary states of fractional gradient systems, we consider the solutions of the system of equations

  D^α_{x_i} V(x) = 0.

Proposition 3. The stationary states of the gradient system (15) are defined by the equation

  V(x) − |x_1 ··· x_n|^{α−m} Σ_{k_1=0}^{m−1} ··· Σ_{k_n=0}^{m−1} C_{k_1,...,k_n} (x_1)^{k_1} ··· (x_n)^{k_n} = 0.   (16)

The C_{k_1,...,k_n} are constants and m is the first whole number greater than or equal to α.

Proof. In order to define the stationary states of a fractional gradient system, we consider the solution of the equation

  D^α_{x_i} V(x) = 0.   (17)

This equation can be solved by using equation (8). Let m be the first whole number greater than or equal to α; then we have the solution [1, 2] of equation (17) in the form

  V(x) = |x_i|^{α−m} Σ_{k=0}^{m−1} a_k(x_1, ..., x_{i−1}, x_{i+1}, ..., x_n) (x_i)^k,   (18)

where the a_k are functions of the other coordinates. Using equation (18) for i = 1, ..., n, we get the solution of the system of equations (17) in the form (16).

If we consider n = 2 such that x = x_1 and y = x_2, we have the equations of motion for the fractional gradient system

  dx/dt = −D^α_x V(x,y),  dy/dt = −D^α_y V(x,y).   (19)

The stationary states of this system are defined by the equation

  V(x,y) − |xy|^{α−m} Σ_{k=0}^{m−1} Σ_{l=0}^{m−1} C_{kl} x^k y^l = 0.

The C_{kl} are constants and m is the first whole number greater than or equal to α.

5. Examples of fractional gradient systems

In this section, we consider fractional gradient systems that cannot be considered as gradient systems. We prove that the class of fractional gradient systems is wider
than the usual class of gradient dynamical systems. The gradient systems can be considered as a special case of fractional gradient systems.

Example 1. Let us consider the dynamical system defined by the equations

  dx/dt = F_x,  dy/dt = F_y,   (20)

where the right-hand sides have the form

  F_x = a c x^{1−k} + b x^{−k},  F_y = (ax + b) y^{−k},   (21)

where a ≠ 0. This system cannot be considered as a gradient dynamical system. Using

  ∂F_x/∂y − ∂F_y/∂x = −a y^{−k} ≠ 0,

we get that ω = F_x dx + F_y dy is not a closed form:

  dω = a y^{−k} dx ∧ dy ≠ 0.

Note that relation (14) in the form

  D^α_y F_x − D^α_x F_y = 0

is satisfied for the system (21) if α = k and the constant c is defined by

  c = Γ(1−α)/Γ(2−α).

Therefore, this system can be considered as a fractional gradient system with the linear potential function

  V(x,y) = −Γ(1−α)(ax + b),

where α = k.

Example 2. Let us consider the dynamical system defined by equation (20) with

  F_x = a n(n−1) x^{n−2} + c k(k−1) x^{k−2} y^l,   (22)
  F_y = b m(m−1) y^{m−2} + c l(l−1) x^k y^{l−2},   (23)

where k ≠ 1 and l ≠ 1. It is easy to derive that

  ∂F_x/∂y − ∂F_y/∂x = c k l x^{k−2} y^{l−2} [(k−1)y − (l−1)x] ≠ 0,

so the differential form ω = F_x dx + F_y dy is not closed, dω ≠ 0. Therefore, this system is not a gradient dynamical system. Using conditions (14) in the form

  D²_y F_x − D²_x F_y = ∂²F_x/∂y² − ∂²F_y/∂x² = 0,

we get d^α ω_α = 0 for α = 2. As a result, this system can be considered as a fractional gradient system with the potential function

  V(x,y) = −(a x^n + b y^m + c x^k y^l).

In the general case, a fractional gradient system cannot be considered as a gradient system. The gradient systems can be considered as a special case of fractional gradient systems with α = 1.

6. Lorenz system as a fractional gradient system

In this section, we prove that the dynamical system defined by the well-known Lorenz equations [23, 24] is a fractional gradient system. The Lorenz equations [23, 24] are defined by

  dx/dt = F_x,  dy/dt = F_y,  dz/dt = F_z,

where the right-hand sides F_x, F_y and F_z have the forms

  F_x = σ(y − x),  F_y = (r − z)x − y,  F_z = xy − bz.

The parameters σ, r and b
can be equal to the following values: σ = 10, b = 8/3, r = 470/19 ≈ 24.74. The dynamical system defined by the Lorenz equations cannot be considered as a gradient dynamical system. It is easy to see that

  ∂F_x/∂y − ∂F_y/∂x = z + σ − r,
  ∂F_x/∂z − ∂F_z/∂x = −y,
  ∂F_y/∂z − ∂F_z/∂y = −2x.

Therefore, ω = F_x dx + F_y dy + F_z dz is not a closed 1-form, and we have

  dω = −(z + σ − r) dx ∧ dy + y dx ∧ dz + 2x dy ∧ dz.

For the Lorenz equations, conditions (14) can be satisfied in the form

  D²_y F_x − D²_x F_y = 0,  D²_z F_x − D²_x F_z = 0,  D²_z F_y − D²_y F_z = 0.

As a result, the Lorenz system can be considered as a fractional gradient dynamical system with the potential function

  V(x,y,z) = (1/6)σx³ − (1/2)σx²y + (1/2)(z − r)xy² + (1/6)y³ − (1/2)xyz² + (b/6)z³.   (24)

The potential (24) uniquely defines the Lorenz system. Using equation (16), we get that the stationary states of the Lorenz system are defined by the equation

  V(x,y,z) + C₀₀ + C_x x + C_y y + C_z z + C_{xy} xy + C_{xz} xz + C_{yz} yz = 0,   (25)

where C₀₀, C_x, C_y, C_z, C_{xy}, C_{xz} and C_{yz} are constants and α = m = 2. The plot of the stationary states of the Lorenz system with the constants C₀₀ = 1, C_x = C_y = C_z = C_{xy} = C_{xz} = C_{yz} = 0 and parameters σ = 10, b = 3 and r = 25 is shown in figures 1 and 2.

Note that the Rössler system [25], which is defined by the equations

  dx/dt = −(y + z),  dy/dt = x + 0.2y,  dz/dt = 0.2 + (x − c)z,

can be considered as a fractional gradient system with the potential function

  V(x,y,z) = (1/2)(y + z)x² − (1/2)xy² − (1/30)y³ − (1/10)z² − (1/6)(x − c)z³.   (26)

This potential uniquely defines the Rössler system. The stationary states of the Rössler system are defined by equation (25), where the potential function is defined by (26). The plot of the stationary states of the Rössler system for the constants C₀₀ = 1, C_x = C_y = C_z = C_{xy} = C_{xz} = C_{yz} = 0 and parameter c = 1 is shown in figures 3 and 4.

Let us note an interesting qualitative property of the surfaces (25) which is difficult to see from the figures. The surfaces of the stationary states of the Lorenz and Rössler systems separate the three-dimensional Euclidean space into a number of regions. We have eight regions for the Lorenz system and four regions for the Rössler
system. This separation has an interesting property for some values of the parameters: all regions are connected with each other. Beginning from one of the regions, it is possible to reach any other region without crossing a surface; any two points from different regions can be connected by a curve which does not cross a surface. It is difficult to see this property from figures 1–4.

Figure 1. Stationary states of the Lorenz system.
Figure 2. Stationary states of the Lorenz system.
Figure 3. Stationary states of the Rössler system.
Figure 4. Stationary states of the Rössler system.

7. Hamiltonian systems

In this section, a brief review of Hamiltonian systems is given to fix notation and provide a convenient reference. Let us consider the canonical coordinates (q_1, ..., q_n, p_1, ..., p_n) in the phase space R^{2n}. We consider a dynamical system defined by the equations

  dq_i/dt = G_i(q,p),  dp_i/dt = F_i(q,p).   (27)

The definition of Hamiltonian systems can be realized in the following form [27, 28].

Definition 3. A dynamical system (27) on the phase space R^{2n} is called a Hamiltonian system if the differential 1-form

  β = G_i dp_i − F_i dq_i   (28)

is a closed form, dβ = 0, where d is the exterior derivative. A dynamical system is called a non-Hamiltonian system if the differential 1-form β is nonclosed, dβ ≠ 0.

The exterior derivative for the phase space is defined as

  d = dq_i ∂/∂q_i + dp_i ∂/∂p_i.   (29)

Here and later, we sum over the repeated indices i and j from 1 to n.

Proposition 4. If the right-hand sides of equations (27) satisfy the Helmholtz conditions [26–28] for the phase space, which have the following forms:

  ∂G_i/∂p_j − ∂G_j/∂p_i = 0,   (30)
  ∂G_j/∂q_i + ∂F_i/∂p_j = 0,   (31)
  ∂F_i/∂q_j − ∂F_j/∂q_i = 0,   (32)

then the dynamical system (27) is a Hamiltonian system.

Proof. In the canonical coordinates (q,p), the vector fields that define the system have the components (G_i, F_i), which are used in equation
(27). Let us consider the 1-form defined by the equation

  β = G_i dp_i − F_i dq_i.

The exterior derivative of this form can be written as

  dβ = d(G_i dp_i) − d(F_i dq_i).

It now follows that

  dβ = (∂G_i/∂q_j) dq_j ∧ dp_i + (∂G_i/∂p_j) dp_j ∧ dp_i − (∂F_i/∂q_j) dq_j ∧ dq_i − (∂F_i/∂p_j) dp_j ∧ dq_i.

Here, ∧ is the wedge product. This equation can be rewritten in the equivalent form

  dβ = (∂G_j/∂q_i + ∂F_i/∂p_j) dq_i ∧ dp_j + (1/2)(∂G_j/∂p_i − ∂G_i/∂p_j) dp_i ∧ dp_j + (1/2)(∂F_i/∂q_j − ∂F_j/∂q_i) dq_i ∧ dq_j.

Here, we use the skew symmetry of dq_i ∧ dq_j and dp_i ∧ dp_j with respect to the indices i and j. It is obvious that conditions (30)–(32) lead to the equation dβ = 0.

Some Hamiltonian systems can be defined by a unique function.

Proposition 5. A dynamical system (27) on the phase space R^{2n} is a Hamiltonian system defined by the Hamiltonian H = H(q,p) if the differential 1-form

  β = G_i dp_i − F_i dq_i

is an exact form β = dH, where d is the exterior derivative and H = H(q,p) is a continuously differentiable unique function on the phase space.

Proof. Suppose that the differential 1-form β, defined by equation (28), has the form

  β = dH = (∂H/∂p_i) dp_i + (∂H/∂q_i) dq_i.

In this case, the vector fields (G_i, F_i) can be represented in the form

  G_i(q,p) = ∂H/∂p_i,  F_i(q,p) = −∂H/∂q_i.

If H = H(q,p) is a continuously differentiable function, then conditions (30)–(32) are satisfied.
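The Helmholtz conditions (30)–(32) can be checked mechanically for a concrete Hamiltonian. A minimal SymPy sketch (the Hamiltonian below, two coupled oscillators, is a hypothetical example chosen for illustration, not one from the paper):

```python
import sympy as sp

n = 2
q = sp.symbols('q1 q2')
p = sp.symbols('p1 p2')

# Hypothetical Hamiltonian: two harmonic oscillators with a bilinear coupling
H = (p[0]**2 + p[1]**2)/2 + (q[0]**2 + q[1]**2)/2 + q[0]*q[1]

G = [sp.diff(H, p[i]) for i in range(n)]   # G_i = dH/dp_i, as in Proposition 5
F = [-sp.diff(H, q[i]) for i in range(n)]  # F_i = -dH/dq_i

# Verify the Helmholtz conditions (30)-(32) for every index pair
for i in range(n):
    for j in range(n):
        assert sp.simplify(sp.diff(G[i], p[j]) - sp.diff(G[j], p[i])) == 0  # (30)
        assert sp.simplify(sp.diff(G[j], q[i]) + sp.diff(F[i], p[j])) == 0  # (31)
        assert sp.simplify(sp.diff(F[i], q[j]) - sp.diff(F[j], q[i])) == 0  # (32)
print("Helmholtz conditions (30)-(32) hold")
```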
Using proposition 4, we get that this system is a Hamiltonian system.

The equations of motion for the Hamiltonian system (27) can be written in the form

  dq_i/dt = ∂H/∂p_i,  dp_i/dt = −∂H/∂q_i,   (33)

which is uniquely defined by the Hamiltonian H. If the exact differential 1-form β is equal to zero (dH = 0), then the equation

  H(q,p) − C = 0   (34)

defines the stationary states of the Hamiltonian system (27). Here, C is a constant.

8. Fractional Hamiltonian systems

A fractional generalization of the differential form (28), which is used in the definition of a Hamiltonian system, can be defined in the following form:

  β_α = G_i (dp_i)^α − F_i (dq_i)^α.

Let us consider the canonical coordinates (x_1, ..., x_n, x_{n+1}, ..., x_{2n}) = (q_1, ..., q_n, p_1, ..., p_n) in the phase space R^{2n} and a dynamical system defined by the equations

  dq_i/dt = G_i(q,p),  dp_i/dt = F_i(q,p).   (35)

The fractional generalization of Hamiltonian systems can be defined by using the fractional generalization of differential forms [18].

Definition 4. A dynamical system (35) on the phase space R^{2n} is called a fractional Hamiltonian system if the fractional differential 1-form

  β_α = G_i (dp_i)^α − F_i (dq_i)^α

is a closed fractional form,

  d^α β_α = 0,   (36)

where d^α is the fractional exterior derivative. A dynamical system is called a fractional non-Hamiltonian system if the fractional differential 1-form β_α is a nonclosed fractional form, d^α β_α ≠ 0.

The fractional exterior derivative for the phase space R^{2n} is defined as

  d^α = (dq_i)^α D^α_{q_i} + (dp_i)^α D^α_{p_i}.   (37)

For example, the fractional exterior derivative of order α of q^k, with the initial point taken to be zero and n = 2, is given by

  d^α q^k = (dq)^α D^α_q q^k + (dp)^α D^α_p q^k.   (38)

Using equations (9) and (10), we have the following relation for the fractional exterior derivative (37):

  d^α q^k = (dq)^α (Γ(k+1)/Γ(k+1−α)) q^{k−α} + (dp)^α q^k p^{−α}/Γ(1−α).

Let us consider a fractional generalization of the Helmholtz conditions.

Proposition 6. If the right-hand sides of equations (35) satisfy the fractional generalization of the Helmholtz conditions in the following form:

  D^α_{p_j} G_i − D^α_{p_i} G_j = 0,   (39)
  D^α_{q_i} G_j + D^α_{p_j} F_i = 0,   (40)
  D^α_{q_j} F_i − D^α_{q_i} F_j = 0,   (41)

then the dynamical system (35) is a fractional Hamiltonian system.

Proof. In the canonical coordinates (q,p), the vector fields that define the system have the components (G_i, F_i). The 1-form β_α is defined by the equation

  β_α = G_i (dp_i)^α − F_i (dq_i)^α.   (42)

The exterior derivative of this form can now be given by the relation

  d^α β_α = d^α (G_i (dp_i)^α) − d^α (F_i (dq_i)^α).

Using the rule

  D^α_x (fg) = Σ_{k=0}^∞ (α choose k) (D^{α−k}_x f) ∂^k g/∂x^k

and the relation

  ∂^k (dx)^α / ∂x^k = 0   (k ≥ 1),

we get

  d^α (A_i (dx_i)^α) = Σ_{k=0}^∞ (dx_j)^α ∧ (α choose k) (D^{α−k}_{x_j} A_i) ∂^k (dx_i)^α / ∂x_j^k = (dx_j)^α ∧ (dx_i)^α D^α_{x_j} A_i.

Here, we use

  (α choose k) = (−1)^{k−1} α Γ(k−α) / (Γ(1−α) Γ(k+1)).

Therefore, we have

  d^α β_α = (D^α_{q_j} G_i) (dq_j)^α ∧ (dp_i)^α + (D^α_{p_j} G_i) (dp_j)^α ∧ (dp_i)^α − (D^α_{q_j} F_i) (dq_j)^α ∧ (dq_i)^α − (D^α_{p_j} F_i) (dp_j)^α ∧ (dq_i)^α.

This equation can be rewritten in the equivalent form

  d^α β_α = (D^α_{q_i} G_j + D^α_{p_j} F_i) (dq_i)^α ∧ (dp_j)^α + (1/2)(D^α_{p_i} G_j − D^α_{p_j} G_i) (dp_i)^α ∧ (dp_j)^α + (1/2)(D^α_{q_j} F_i − D^α_{q_i} F_j) (dq_i)^α ∧ (dq_j)^α.

Here, we use the skew symmetry of ∧. It is obvious that conditions (39)–(41) lead to the equation d^α β_α = 0, i.e., β_α is a closed fractional form.

Let us define the Hamiltonian for fractional Hamiltonian systems.

Proposition 7. A dynamical system (35) on the phase space R^{2n} is a fractional Hamiltonian system defined by the Hamiltonian H = H(q,p) if the fractional differential 1-form

  β_α = G_i (dp_i)^α − F_i (dq_i)^α

is an exact fractional form

  β_α = d^α H,   (43)

where d^α is the fractional exterior derivative and H = H(q,p) is a continuously differentiable function on the phase space.

Proof. Suppose that the fractional differential 1-form β_α, defined by equation (42), has the form

  β_α = d^α H = (dp_i)^α D^α_{p_i} H + (dq_i)^α D^α_{q_i} H.

In this case, the vector fields (G_i, F_i) can be represented in the form

  G_i(q,p) = D^α_{p_i} H,  F_i(q,p) = −D^α_{q_i} H.

Therefore, the equations of motion for fractional Hamiltonian systems can be written in the form

  dq_i/dt = D^α_{p_i} H,  dp_i/dt = −D^α_{q_i} H.   (44)

The fractional differential 1-form β_α for the fractional Hamiltonian
system with Hamiltonian H can be written in the form β_α = d^α H. If the exact fractional differential 1-form β_α is equal to zero (d^α H = 0), then we can get the equation that defines the stationary states of the Hamiltonian system.

Proposition 8. The stationary states of the fractional Hamiltonian system (44) are defined by the equation

  H(q,p) − |q_1 p_1 ··· q_n p_n|^{α−m} Σ_{k_1,l_1=0}^{m−1} ··· Σ_{k_n,l_n=0}^{m−1} C_{k_1,...,k_n,l_1,...,l_n} (q_1)^{k_1}(p_1)^{l_1} ··· (q_n)^{k_n}(p_n)^{l_n} = 0,   (45)

where the C_{k_1,...,k_n,l_1,...,l_n} are constants and m is the first whole number greater than or equal to α.

Proof. This proposition is a corollary of proposition 3.

9. Example of a fractional Hamiltonian system

Let us consider a dynamical system in the phase space R² (n = 1) that is defined by the equations

  dq/dt = D^α_p H,  dp/dt = −D^α_q H,   (46)

where the fractional order 0 < α ≤ 1 and the Hamiltonian H(q,p) has the form

  H(q,p) = ap² + bq².   (47)

If α = 1, then equation (46) describes the linear harmonic oscillator. If the exact fractional differential 1-form

  β_α = d^α H = (dp)^α D^α_p H + (dq)^α D^α_q H

is equal to zero (d^α H = 0), then the equation

  H(q,p) − C|qp|^{α−1} = 0

defines the stationary states of the system (46). Here, C is a constant. If α = 1, we get the usual stationary-state equation (34). Using equation (47), we get the following equation for the stationary states:

  |qp|^{1−α} (ap² + bq²) = C.   (48)

If α = 1, then we get the equation ap² + bq² = C, which describes an ellipse.

10. Conclusion

Fractional derivatives and integrals [1, 2] have found many applications in recent studies in physics. The interest in fractional analysis has been growing continually during the past few years [3–17]. Using fractional derivatives and fractional differential forms, we have considered the fractional generalization of gradient and Hamiltonian systems. In the general case, fractional gradient and Hamiltonian systems cannot be considered as gradient and Hamiltonian systems. The class of fractional gradient and Hamiltonian systems is wider than the usual class of gradient and Hamiltonian dynamical systems. The gradient and Hamiltonian systems
can be considered as a special case of fractional gradient and Hamiltonian systems. Therefore, it is possible to generalize the application of catastrophe and bifurcation theory from gradient systems to the wider class of fractional gradient dynamical systems. Note that quantization of fractional Hamiltonian systems can be realized by the method suggested in [29–32].

References

[1] Samko S G, Kilbas A A and Marichev O I 1993 Fractional Integrals and Derivatives Theory and Applications (New York: Gordon and Breach)
[2] Oldham K B and Spanier J 1974 The Fractional Calculus (New York: Academic)
[3] Zaslavsky G M 2002 Phys. Rep. 371 461–580
[4] Zaslavsky G M 2005 Hamiltonian Chaos and Fractional Dynamics (Oxford: Oxford University Press)
[5] Metzler R and Klafter J 2000 Phys. Rep. 339 1–77
[6] Metzler R and Klafter J 2004 J. Phys. A: Math. Gen. 37 R161–208
[7] Hilfer R (ed) 2000 Applications of Fractional Calculus in Physics (Singapore: World Scientific)
[8] Carpinteri A and Mainardi F 1997 Fractals and Fractional Calculus in Continuum Mechanics (New York: Springer)
[9] Tarasov V E and Zaslavsky G M 2005 Fractional Ginzburg–Landau equation for fractal media Physica A 354 249–61
[10] Tarasov V E 2004 Chaos 14 123–7
[11] Tarasov V E 2005 Phys. Rev. E 71 011102
[12] Tarasov V E 2005 J. Phys.: Conf. Ser. 7 17–33
[13] Nigmatullin R 1986 Phys. Status Solidi b 133 425–30
[14] Tarasov V E 2005 Phys. Lett. A 336 167–74
[15] Tarasov V E 2005 Possible experimental test of continuous medium model for fractal media Phys. Lett. A at press
[16] Tarasov V E 2005 Fractional hydrodynamic equations for fractal media Ann. Phys. 318 at press
[17] Tarasov V E 2005 Chaos 15 023102
[18] Cottrill-Shepherd K and Naber M 2001 J. Math. Phys. 42 2203–12
[19] Cottrill-Shepherd K and Naber M 2003 Fractional differential forms II Preprint math-ph/0301016
[20] Gilmore R 1981 Catastrophe Theory for Scientists and Engineers (New York: Wiley) section 14
[21] Dubrovin B A, Fomenko A N and Novikov S P 1992 Modern Geometry - Methods and Applications: Part I (New York: Springer)
[22] Hirsch M and Smale S 1974 Differential Equations, Dynamical Systems and Linear Algebra (New
York: Academic)
[23] Lorenz E N 1963 J. Atmos. Sci. 20 130–41
[24] Sparrow C 1982 The Lorenz Equations (New York: Springer)
[25] Rossler O E 1976 Phys. Lett. A 57 397–8
[26] Helmholtz H 1886 J. Reine Angew. Math. 100 137–66
[27] Tarasov V E 1997 Theor. Math. Phys. 110 57–67
[28] Tarasov V E 2005 J. Phys. A: Math. Gen. 38 2145–55
[29] Tarasov V E 2001 Phys. Lett. A 288 173–83
[30] Tarasov V E 2001 Moscow Univ. Phys. Bull. 56/6 5–9
[31] Tarasov V E 2002 Theor. Phys. 2 150–60
[32] Tarasov V E 2004 J. Phys. A: Math. Gen. 37 3241–57
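The central claim of section 6 above, that the potential (24) reproduces the Lorenz right-hand sides through dx_i/dt = −D²_{x_i} V (α = 2, so D² is an ordinary second derivative), can be verified symbolically. A minimal SymPy sketch, with the parameters kept as symbols:

```python
import sympy as sp

x, y, z = sp.symbols('x y z')
sigma, r, b = sp.symbols('sigma r b')

# Potential (24) for the Lorenz system
V = (sp.Rational(1, 6)*sigma*x**3 - sp.Rational(1, 2)*sigma*x**2*y
     + sp.Rational(1, 2)*(z - r)*x*y**2 + sp.Rational(1, 6)*y**3
     - sp.Rational(1, 2)*x*y*z**2 + b*z**3/6)

# Lorenz right-hand sides
Fx, Fy, Fz = sigma*(y - x), (r - z)*x - y, x*y - b*z

# Check F_i = -d^2 V / dx_i^2 for each coordinate
assert sp.simplify(sp.diff(V, x, 2) + Fx) == 0
assert sp.simplify(sp.diff(V, y, 2) + Fy) == 0
assert sp.simplify(sp.diff(V, z, 2) + Fz) == 0
print("potential (24) reproduces the Lorenz equations")
```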

A short course in effective Lagrangians


arXiv:hep-ph/0002180v1 16 Feb 2000    UCRHEP-T270

A short course in effective Lagrangians*

José Wudka†
Physics Department, UC Riverside, Riverside CA 92521-0413, USA

Abstract
These lectures provide an introduction to effective theories, concentrating on the basic ideas and providing some simple applications.

I. INTRODUCTION

When studying a physical system it is often the case that there is not enough information to provide a fundamental description of some of its properties. In such cases one must parameterize the corresponding effects by introducing new interactions with coefficients to be determined phenomenologically. Experimental limits on, or measurements of, these parameters then (hopefully) provide the information needed for a more satisfactory description.

A standard procedure for doing this is to first determine the dynamical degrees of freedom involved and the symmetries obeyed, and then construct the most general Lagrangian, the effective Lagrangian, for these degrees of freedom which respects the required symmetries. The method is straightforward, quite general and, most importantly, it works!

In following this approach one must be wary of several facts. First, it is clear that the relevant degrees of freedom can change with scale (e.g. mesons are a good description of low-energy QCD, but at higher energies one should use quarks and gluons); in addition, physics at different scales may respect different symmetries (e.g. mass conservation is violated at sufficiently high energies). It follows that the effective Lagrangian formalism is in general applicable only for a limited range of scales. It is often the case (but not always!) that there is a scale Λ such that the results obtained using an effective Lagrangian are invalid for energies above Λ.

The formalism has two potentially serious drawbacks. First, the effective Lagrangian has an infinite number of terms, suggesting a lack of predictability. Second, even though the model has a UV cutoff Λ and will not suffer from actual divergences, simple calculations
show that it is possible for this type of theory to generate radiative corrections that grow with Λ, becoming increasingly important for higher and higher order graphs. Either of these problems can render the approach useless. It is also necessary to verify that the model is unitary. I will discuss below how these problems are solved, and provide several applications of the formalism. The aim is to give a flavour of the versatility of the approach, not to provide an exhaustive review of all known applications.

II. FAMILIAR EXAMPLES

A. Euler-Heisenberg effective Lagrangian

This Lagrangian summarizes QED at low energies (below the electron mass) [1]. At these energies only photons appear in real processes, and the effective Lagrangian will then be constructed using the photon field A_µ; it must satisfy U(1) gauge and Lorentz invariance. Thus it can be constructed in terms of the field strength F_µν or the loop variables A(Γ) = ∮_Γ A·dx. The latter are non-local, so a local description would involve only F, namely

  L_eff = L_eff(F) = aF² + bF⁴ + c(FF̃)² + dF²(FF̃) + ···   (1)

FIG. 1. Graph generating the leading terms in the Euler-Heisenberg effective Lagrangian.

One can arbitrarily normalize the fields and so choose a = −1/4. The constants b, c and d have units of mass⁻². Note that the term ∝ d violates CP. Though we know QED respects C and P, it is possible for other interactions to violate these symmetries; there is nothing in the discussion above that disallows such terms and, in fact, weak effects will generate them.

For this system we are in a privileged position, for we know the underlying physics, and so we can calculate b, c, d, .... The leading effects come from QED, which yields b, c ∼ 1/(4πm_e)² at 1 loop [1]. The parameters b and c summarize all the leading virtual electron effects (see Fig. 1). Forgetting about this underlying structure, we could have simply defined a scale M and taken b, c ∼ 1/M² (so that M = 4πm_e); while this is perfectly viable, M is not relevant phenomenologically speaking, as it does not correspond to a
physical scale. In order to extract information about the physics underlying the effective Lagrangian from a measurement of b and c, we must be able to at least estimate the relation between these constants and the underlying scales. In addition we also know that d ∼ ξ/(4πv)⁴, with v ∼ 246 GeV, where ξ is a very small constant proportional to the Jarlskog determinant [2]. The effective Lagrangian can hold terms with radically different scales, and limits on some constants cannot, in general, be translated to others. In this case the terms are characterized by different CP transformation properties, and it is often the case that such global symmetries are useful in differentiating terms in the effective Lagrangian. The point being that a term violating a given global symmetry at scale Λ will generate, through radiative corrections, all terms in the effective Lagrangian with the same symmetry properties. The caveat in the argument is that the underlying theory might have some additional symmetries not apparent at low energies, which might further segregate interactions and so provide different scales for operators with the same properties under all low-energy symmetries.

When calculating with the effective Lagrangian, the effects produced by the new terms proportional to b, c are suppressed by a factor ∼ (E/4πm_e)⁴, where E is the typical energy of the process and E ≪ m_e. Thus the effects of these terms are tiny, yet they are noticeable because they generate a new effect: γ–γ scattering.

B. (Standard) Superconductivity

This is a brief summary of the very nice treatment provided by Polchinski [3]. The system under consideration has the electron field ψ as its only dynamical variable (the phonons are assumed to have been integrated out, generating a series of electron self-interactions); it respects U(1) electromagnetic gauge invariance, as well as Galilean invariance and fermion number conservation. Assuming a local description, the first few terms in the effective Lagrangian expansion are (neglecting those containing photons
for simplicity)

L_eff = Σ_k ψ*_k [i∂_t − e_k + µ] ψ_k + Σ_{klqp} ψ*_k ψ_l ψ_q ψ*_p δ(k − l − q + p) V_{klq} + ···   (2)

In this equation the relation e_k = µ determines the Fermi surface (FS), while V ∼ (electron–phonon coupling)² on the FS. If p is near the FS one can write p = k + ℓn̂ (with e_k = µ). Scaling towards the FS implies ℓ → sℓ with s → 0. Then, assuming ψ → s^d ψ, the quadratic terms in the action will be scale invariant provided d = −1/2. The quartic terms in the action then scale as s and become negligible near the FS, except when the pairing condition q + l = 0 is obeyed. In this case the quartic term scales as s⁰ and cannot be ignored. In fact this term determines the most interesting behavior of the system at low temperatures (see [3] for full details).

C. Electroweak interactions

Again I will follow the general recipe. I will concentrate only on the (low-energy) interactions involving lepton fields, which are then the degrees of freedom. Since I assume the energy to be well below the Fermi scale, the only relevant symmetries are U(1) gauge and Lorentz invariance. In addition there is the question whether the heavy physics will respect the discrete symmetries C, P or CP; using perfect hindsight I will retain terms that violate these symmetries. Assuming a local description I have [1]

L_eff = Σ_i ψ̄_i (iD̸ − m_i) ψ_i + Σ f_{ijkl} (ψ̄_i Γ^a ψ_j)(ψ̄_k Γ_a ψ_l) + ···   (3)

where the ellipsis indicates terms containing operators of higher dimension, or those involving the electromagnetic field. The matrices Γ are to be chosen among the 16 independent basis matrices Γ^a = {1, γ^µ, σ^{µν}, γ^µγ_5, γ_5}. The coefficients of the first two terms can be fixed by normalization requirements. A SM calculation gives f ∼ g²/m_W² = 1/v² (v ≃ 246 GeV), generated by tree-level graphs (see Fig. 2); because of this, the scale 1/√f corresponds to the physical scale v. These interactions can be observed (or bounded) despite the E ≪ v condition because they generate new effects: C and P (and, some of them, chirality) violation.

FIG. 2. Standard model processes generating four-fermion interactions at low energies (e.g.
Bhabha scattering)

D. Strong interactions at low energies

In this case we are interested in the description of the interactions among the lightest hadrons, the meson multiplet. The most convenient parameterization of these degrees of freedom is in terms of a unitary field [9] U, such that U = exp(λ_a π_a/F), where π_a denote the eight meson fields, λ_a the Gell-Mann matrices, and F is a constant (related to the pion decay constant). The symmetries obeyed by the system are chiral SU(3)_L × SU(3)_R, Lorentz invariance, C and P. With these constraints the effective Lagrangian takes the form

L_eff = a tr(∂U†·∂U) + [ b tr(∂_µU† ∂_νU ∂^µU† ∂^νU) + ... ] + ···   (4)

I can set a ∼ F² by properly normalizing the fields. In this case the leading term in the effective Lagrangian will determine all (leading) low-energy pion interactions in terms of the single constant F. The effects from the higher-order terms have been measured, and the data require b ∼ 1/(4π)². This result is also predicted by the consistency of this approach, which requires that radiative corrections to a, b, etc. should be at most of the same size as their tree-level values.

III. BASIC IDEAS ON THE APPLICABILITY OF THE FORMALISM

Being a model with an intrinsic cutoff, there are no actual ultraviolet divergences in most effective Lagrangian computations. Still, there are interesting renormalizability issues that arise when doing effective Lagrangian loop computations. Imagine doing a loop calculation including some vertices of (mass) dimension higher than the dimension of space-time. These must have coefficients with dimensions of mass to some negative power. The loop integrations will in general produce terms growing with Λ, the UV cutoff, which are polynomials in the external momenta and which preserve the symmetries of the model [4]. Hence these terms, which may grow with Λ, correspond to vertices appearing in the most general effective Lagrangian and can be absorbed in a renormalization of the corresponding coefficients. They have no observable effects (though they can be used in naturality
arguments [5]).

Effective theories will also be unitary provided one stays within the limits of their applicability. Should one exceed them, new channels will open (corresponding to the production of the heavy excitations) and unitarity-violating effects will occur. This is not produced by real unitarity-violating interactions, but by our using the model beyond its range of applicability (e.g. when the typical energy of the process under consideration reaches or exceeds Λ). One can, of course, extend the model, but this necessarily introduces ad-hoc elements and will dilute the generality gained using effective theories. For example, consider WWZ interactions with an effective Lagrangian of the form

L_eff = λ(p,k) W_{µν}(k) W^{νρ}(p) Z_ρ{}^µ(−p−k) + ···   (5)

(where V_{αβ} = ∂_α V_β − ∂_β V_α). One can then choose λ to ensure unitarity is preserved (at least in some processes), for example [6]

λ(p,k) = λ₀/(···)   (6)

Another common situation where effective Lagrangians appear occurs when some heavy excitations are integrated out. This can be illustrated by the following toy model

S = ∫dⁿx [ ψ̄(i∂̸ − m)ψ + ½(∂φ)² − ½Λ²φ² + fφ ψ̄ψ ]   (9)

where φ is heavy. A simple calculation gives

S_eff = ∫dⁿx [ ψ̄(i∂̸ − m)ψ + (f²/2) ψ̄ψ (∂² + Λ²)⁻¹ ψ̄ψ ]   (10)

and

L_eff = ψ̄(i∂̸ − m)ψ + (f²/2Λ²) Σ_n ψ̄ψ (−∂²/Λ²)ⁿ ψ̄ψ   (11)

Note that terms with a large number of derivatives will be suppressed by a large power of the small factor (E/Λ); if we are interested in energies E ∼ Λ, the whole infinite set of vertices must be included in order to reproduce the φ pole.

A. How to parameterize ignorance

If one knows the underlying theory we can, in principle, calculate L_eff (or do a full calculation). Yet there are many cases where the underlying theory is not known. In these cases an effective theory is obtained by writing all possible interactions among the light excitations. The model then has an infinite number of terms, each with an unknown parameter, and these constants then parameterize all possible underlying theories. The terms which dominate are those usually called renormalizable (or, equivalently, marginal or relevant). The other terms are called non-renormalizable, or
irrelevant,since their effects become smaller as the energy decreasesThis recipe for writing effective theories must be supplemented with some symmetry re-strictions.The most important being that the all the terms in the effective Lagrangian mustrespect the local gauge invariance of the low-energy physics(more technically,the one re-spected by the renormalizable terms in the effective action)[7].The reason is that the presence of a gauge variant term will generate all gauge variant interactions thorough renor-malization group evolution.a.Gauge invariantizing Using a simple argument it is possible to turn any theory into a gauge theory[8]and so it appears that the requirement of gauge invariance is empty.That this is not the case is explained here.Ifirst describe the trick which grafts gauge invariance onto a theory and then discuss the implications.Consider an arbitrary theory with matterfields(spin0and1/2)and vectorfields V nµ, n=1,...N.Then•Choose a(gauge)group G with N generators{T n}.Define a covariant derivative Dµ=∂µ+V nµT n and assume that the V nµare gaugefields.•Invent a unitaryfield U transforming according to the fundamental representation ofG and construct the gauge invariant compositefieldsV nµ=−tr T n U†DµU(12) Taking tr T n T m=−δnm,it is easy to see that in the unitary gauge U=1,V nµ=V nµ.Thus if simply replace V→V in the original theory we get a gauge theory.Does this mean that gauge invariance irrelevant since it can be added at will?In my opinion this is not the case.In the above process all matterfields are assumed gauge singlets(none are minimally coupled to the gaugefields).In the case of the standard model,for example,the universal coupling of fermions to the gauge bosons would be accidental in this approach.In order to recover the full predictive power commonly associated with gauge theories,the matterfields must transform non-trivially under G which can be done only if there are strong correlations among some of the couplings.It is not 
trivial to say that the standard model group is SU(3)×SU(2)×U(1) with left-handed quarks transforming as (3,2,1/6), left-handed leptons as (1,2,−1/2), etc., as opposed to a U(1)¹² with all fermions transforming as singlets [10].

B. How to estimate ignorance

A problem which I have not addressed so far is the fact that effective theories have an infinite number of coefficients, with the (possible) problem of requiring an infinite number of data points in order to make any predictions. On the other hand, if this is the case, why is it that the Fermi theory of the weak interactions is so successful? The answer to this question lies in the fact that not all coefficients are created equal; there is a hierarchy [9,10]. As a result, given any desired level of accuracy, only a finite number of terms need to be included. Moreover, even though the effective Lagrangian coefficients cannot be calculated without knowing the underlying theory, they can still be bounded using but a minimal set of assumptions about the heavy interactions. It is then also possible to estimate the errors made in neglecting all but the finite number of terms used.

As an example, consider the standard model at low energies and calculate two processes: the Bhabha cross section and the anomalous magnetic moment of the electron. For Bhabha scattering there is a contribution due to Z-boson exchange (see Fig. 2): e⁺e⁻ → Z → e⁺e⁻ generates O = ··· with a tree-level coefficient ∼ 1/v². The electron anomalous magnetic moment, in contrast, corresponds to an operator which is generated only through loops (see Fig. 3), so its coefficient carries a loop suppression ∼ 1/(4π)²; in addition the coefficient is suppressed by a factor of m_e, since it violates chirality.

FIG. 3. Weak contributions to the electron anomalous magnetic moment.

The point of this exercise is to illustrate the fact that, for weakly coupled theories, loop-generated operators have smaller coefficients than operators generated at tree level. Leading effects are produced by operators which are generated at tree level.

C. Coefficient estimates

In this section I will provide arguments which can be used to estimate (or, at least, bound) the coefficients in the effective Lagrangian. These are order of magnitude
calculations and might be offby a factor of a few;it is worth noting that no single calculation has provided a significant deviation from these results.The estimate calculations should be done separately for weakly and strongly interacting theories.I will characterize thefirst as those where radiative corrections are smaller than the tree-level contributions.Strongly interacting theories will have radiative corrections of the same size at any order51.Weakly interacting theoriesIn this case leading terms in the effective Lagrangian are those which can be generated at tree level by the heavy physics.Thus the dominating effects are produced by operators which have the lowest dimension(leading to the smallest suppression from inverse powers ofΛ)and which are tree-level generated(TLG)operators can be determined[11].When the heavy physics is described by a gauge theory it is possible to obtained all TLG operators[11].The corresponding vertices fall into3categories,symbolically •vertices with4fermions.•vertices with2fermions and k bosons;k=2,3•vertices with n bosons;n=4,6.A particular theory may not generate one or more of these vertices,the only claim is that there is a gauge theory which does.In the case of the standard model with lepton number conservation the leading operators have dimension6[12,11].Subleading operators are either dimension8and their contribu-tions are suppressed by an additional factor(E/Λ)2in processes with typical energy E. 
Other subleading contributions are suppressed by a loop factor ∼ 1/(4π)². Note that it is possible to have situations where the only two effects are produced by either dimension-8 TLG operators or loop-generated dimension-6 operators. In this case the former dominate only when Λ > 4πE.

a. Triple gauge bosons. The terms in the electroweak effective Lagrangian which describe the interactions of the W and Z bosons generated by some heavy physics underlying the standard model have received considerable attention recently [13]. In terms of the SU(2) and U(1) gauge fields W and B and the scalar doublet φ these interactions are L_eff = ···, and a measurement of the corresponding coefficients provides information about the heavy physics.

2. Strongly interacting theories

I will imagine a theory containing scalars and fermions which interact strongly. Gauge couplings are assumed to be small and will be ignored. This calculation is useful for low-energy chiral theories but not for low-energy QCD [14,15,9]. A generic effective operator in this type of theory takes the form

O_abc ∼ λΛ⁴ (φ/Λ_φ)^a (ψ/Λ_ψ^{3/2})^b (∂/Λ)^c,   Λ_ψ = Λ/(4π)^{2/3},   Λ_φ = Λ/(4π),   λ = 1/(16π²)   (17)

In terms of U ∼ exp(φ/Λ_φ), the operators take the form O_abc = ··· . For the case where φ represents the interpolating field for the lightest mesons, PCAC implies Λ_φ = f_π [14,9]. Then, schematically, ψ⁴ ∝ 1/(16π²)··· and ψ²∂²U² ∝ 1/Λ²··· .

As an example of how these estimates are used, consider Bhabha scattering in the presence of a four-fermion interaction,

L_eff = L_SM + (f/Λ²)(ψ̄γ^µψ)(ψ̄γ_µψ) + ···   (20)

where ψ denotes the electron field. The calculation is illustrated in Fig. 4, where the loops involving the 4-fermion operator are cut off at a scale Λ. The SM and new physics (NP) contributions are, symbolically,
FIG. 4. Radiative corrections to Bhabha scattering in the presence of a 4-fermion interaction.

SM: 1/(16π²) + ··· ,   NP: f/(16π²) + ···   (21)

Note that this consistent behavior (that the new physics effects disappear as Λ → ∞) results from having the physical scale of new physics, Λ, in the coefficient of the operator. Had we used f′/v² instead of f/Λ², the new physics effects would appear to be enormous, and growing with each new loop. It is not that the use of f′/v² is wrong; it is only that it is misleading to believe f′ can be of order one: it must be suppressed by the small factor (v/Λ)². Using these results we see that this reaction is sensitive to Λ provided f(v/Λ)² > sensitivity. If the sensitivity is, say, 1%, this corresponds to Λ/√f ≲ 10v ≈ 2.5 TeV.

V. APPLICATIONS TO ELECTROWEAK PHYSICS

With the above results one can determine, for any given process, the leading contributions (as parameterized by the various effective operator coefficients). Using the coefficient estimates one can then provide the expected magnitude of the new physics effects with only Λ as an unknown parameter, and so estimate the sensitivity to the scale of new physics.

It is important to note that this is sometimes a rather involved calculation, as all contributing operators must be included. For example, in order to determine the heavy physics effects on the oblique parameters one must calculate not only those affecting the vector boson polarization tensors, but also those which modify the Fermi constant, the fine structure constant, etc., as these quantities are used when extracting S, T and U from the data [18].

A. Effective Lagrangian

In the following I will assume that the underlying physics is weakly coupled and derive the leading operators that can be expected from the existence of heavy excitations at scale Λ. The complete list of dimension-6 operators was cataloged a long time ago for the case where the low-energy spectrum includes a single scalar doublet [12]. It is then straightforward to determine the subset of operators which can be TLG; they are [11]

• Fermions:
(ψ̄_iΓ^aψ_j)(ψ̄_kΓ_aψ_l)

• Scalars: |φ|⁶, (∂|φ|²)²

• Scalars and fermions: |φ|² × Yukawa term

• Scalars and vectors: |φ|²|Dφ|², |φ†Dφ|²

• Fermions, scalars and vectors: (φ† T^n D_µ φ)(ψ̄_i T^n γ^µ ψ_j)

where T denotes a group generator and Γ a product of a group generator and a gamma matrix. Observables affected by the operators in this list provide the highest sensitivity to new physics effects, provided that the standard model effects are themselves small (or that the experimental sensitivity is large enough to observe small deviations). I will illustrate this with two (incomplete) examples.

B. b-parity

This is a proposed method for probing new flavor physics [19]. Its virtue lies in the fact that it is very simple and sensitive (though it does not provide the highest sensitivity for all observables). The basic idea is based on the observation that the standard model acquires an additional global U(1)_b symmetry in the limit V_ub = V_cb = V_td = V_ts = 0 (given the experimental values 0.002 < |V_ub| < 0.005, 0.036 < |V_cb| < 0.046, 0.004 < |V_td| < 0.014, 0.034 < |V_ts| < 0.046, deviations from exact U(1)_b invariance will be small). Then for any standard model interaction a reaction of the type

n_i b-jets + X → n_f b-jets + Y   (22)

will obey

(−1)^{n_i} = (−1)^{n_f}   (23)

to very high accuracy. The number (−1)^{# of b jets} defines the b-parity of a state (it being understood that the top quarks have decayed). The standard model is then b-parity even, and the idea is to consider a lepton collider and simply count the number of b jets in the final state; new physics effects will show up as events with an odd number of b jets. The standard model produces no measurable irreducible background, yet there are significant reducible backgrounds which reduce the sensitivity to Λ. To estimate these effects I define

• ε_b = b-jet tagging efficiency

• t_c = c-jet mistagging efficiency (probability of mistaking a c-jet for a b-jet)

• t_j = light-jet mistagging efficiency (probability of mistaking a light jet for a b-jet)

so that the measured cross section with k b-jets is

σ̄_k = Σ_{u+v+w=k} C(n,u) ε_b^u (1−ε_b)^{n−u} C(m,v) t_c^v (1−t_c)^{m−v} C(ℓ,w) t_j^w (1−t_j)^{ℓ−w} σ_{nmℓ}   (24)

where σ_{nmℓ} denotes the cross section for the final state with n b-jets, m c-jets and ℓ light jets, and C(n,u) denotes the binomial coefficient. Note that C(n,u) ε_b^u (1−ε_b)^{n−u} is the probability of tagging u and missing n−u b-jets out of the n available. As an example consider

L_eff = L_SM + f_ij ···

Limits from e⁺e⁻ → tc̄ + t̄c + bs̄ + b̄s → 1 b-jet + X (500 GeV):
2.5 fb⁻¹: 1.5 TeV (ε_b = 50%), 5.0 TeV (ε_b = 70%)
200 fb⁻¹: 5.5 TeV (ε_b = 50%), 10.0 TeV (ε_b = 70%)

These results are promising, yet they will be degraded in a realistic calculation. First one must include the effects of having t_{c,j} ≠ 0. In addition there are complications in using inclusive reactions such as e⁺e⁻ → b + X, since the contributions from events with a large number of jets can be very hard to evaluate (aside from the calculational difficulties there are additional complications when defining what a jet is). A more realistic approach is to restrict the calculation to a sample with a fixed number of jets (2 and 4 are the simplest) and determine the sensitivity to Λ for various choices of ε_b and t_j using this population only.

C. CP violation

Just as for b-parity, the CP-violating effects are small within the standard model, and so precise measurements of CP-violating observables might be very sensitive to new physics effects. In order to study CP violation it is useful to first define what the CP transformation is.
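Before turning to CP violation, the tagging convolution of Eq. (24) above can be sketched numerically. The toy final state and the efficiency values below are illustrative assumptions, not numbers from the text; the sketch only checks how true jet-flavour cross sections mix into measured b-jet multiplicities.

```python
from math import comb

def tag_prob(n_true, n_tagged, p):
    """Binomial probability of tagging n_tagged jets out of n_true, each with probability p."""
    if n_tagged > n_true:
        return 0.0
    return comb(n_true, n_tagged) * p**n_tagged * (1 - p)**(n_true - n_tagged)

def measured_sigma(k, sigma_true, eps_b, t_c, t_j):
    """Eq. (24): cross section for observing k tagged b-jets, given true cross
    sections sigma_true[(n, m, l)] for final states with n b-, m c- and l light jets."""
    total = 0.0
    for (n, m, l), s in sigma_true.items():
        for u in range(min(k, n) + 1):          # real b-jets tagged
            for v in range(min(k - u, m) + 1):  # c-jets mistagged as b
                w = k - u - v                   # light jets mistagged as b
                total += (tag_prob(n, u, eps_b) *
                          tag_prob(m, v, t_c) *
                          tag_prob(l, w, t_j) * s)
    return total

# toy example: a purely 2-b-jet final state, perfect mistag rejection
sigma_true = {(2, 0, 0): 1.0}
print(measured_sigma(1, sigma_true, eps_b=0.5, t_c=0.0, t_j=0.0))  # → 0.5
```

Summing the measured cross sections over all k recovers the total true cross section, which is a useful sanity check on any implementation of Eq. (24).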
In order to do this in general, denote the Cartan group generators by H_i and the root generators by E_α; then it is possible to find a basis where all the group generators are real and, in addition, the H_i are diagonal [20]. Then define the CP transformation by

ψ → Cψ*   (fermions)
φ → φ*   (scalars)
A^(i)_µ → −A^(i)_µ   (i: Cartan generator)
A^(α)_µ → −A^(−α)_µ   (α: root)

It is easy to see that the field strengths and currents transform as A_µ, while Dφ → (Dφ)*. It then follows that in this basis the whole gauge sector of any gauge theory is CP conserving; using this basis, CP violation can arise only in the scalar potential and the fermion–scalar interactions.

In order to apply this to electroweak physics I will need the list of TLG operators of dimension 6 which violate CP; they are given by

(ℓ̄e)(d̄q) − h.c.,   (q̄u)ε(q̄d) − h.c.,   (q̄λ^A u)ε(q̄λ^A d) − h.c.,
(ℓ̄e)ε(q̄u) − h.c.,   (ℓ̄u)ε(q̄e) − h.c.,
|φ|²(ℓ̄eφ) − h.c.,   |φ|²(q̄u φ̃) − h.c.,   |φ|²(q̄dφ − h.c.),
|φ|²∂_µ(ℓ̄γ^µℓ),   |φ|²∂_µ(ēγ^µe),   |φ|²∂_µ(q̄γ^µq),   |φ|²∂_µ(ūγ^µu),   |φ|²∂_µ(d̄γ^µd)

O₁ = (φ†τ^Iφ) D^{IJ}_µ (ℓ̄γ^µτ^Jℓ)
O₂ = (φ†τ^Iφ) D^{IJ}_µ (q̄γ^µτ^J q)
O₃ = (φ†εD_µφ)(ūγ^µd) − h.c.

All operators except O_{1,2,3} violate chirality and their coefficients are strongly bounded by their contributions to the strong CP parameter θ; in addition, some chirality-violating operators contribute to meson decays (which again provide strong bounds for fermions in the first generation) and, finally, in natural theories some contribute radiatively to fermion masses and will then be suppressed by the smaller of the corresponding Yukawa couplings.
For these reasons I will not consider them further. Moreover, since I will be interested in limits that can be obtained using current data, I will ignore operators whose only observable effects involve Higgs particles. With these restrictions only O_{1,2,3} remain; their terms not involving scalars are

O₁ → −i(gv²/2)(ν̄_L W̸⁺ e_L) − h.c.
O₂ → −i(gv²/2)(ū_L W̸⁺ d_L) − h.c.
O₃ → −i(gv²/8)(ū_R W̸⁺ d_R) − h.c.

The contributions from O_{1,2} can be absorbed in a renormalization of standard model coefficients, whence only O₃ produces observable effects, corresponding to a right-handed quark current. Existing data (from τ decays and m_W measurements) imply Λ ≳ 500 GeV. One can also determine the type of new interactions which might be probed using these operators [11]. The heavy physics which can generate O₃ at tree level is described in Fig. 5. If the underlying theory is natural we conclude that there will be no super-renormalizable couplings; in this case O₃ will be generated by heavy fermion exchanges only.

FIG. 5. Heavy physics generating the CP-violating operators. Wavy lines denote vectors, solid lines fermions, and dashed ones scalars. Heavy lines denote heavy excitations.

Note finally that these arguments are only valid for weakly coupled heavy physics; for strongly coupled theories other CP-violating operators can be important. (It is true that vertices involving light fermions, light scalars and heavy fermions produce mixings between the light and heavy scales, but this occurs at the one-loop level. In contrast, cubic terms of order Λ in the scalar potential would shift v at tree level.)
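The sensitivity logic running through these lectures reduces to the condition quoted earlier: a new physics effect of relative size f(v/Λ)² must exceed the experimental sensitivity. A minimal numerical sketch of the resulting reach in Λ follows; the sensitivity values and coefficient choices are illustrative, contrasting a tree-level-generated operator (f of order one) with a loop-generated one (f suppressed by 1/16π²).

```python
from math import sqrt, pi

v = 246.0  # GeV, electroweak scale

def reach(f, delta):
    """Largest scale Lambda probed when an effect of relative size
    f * (v / Lambda)**2 must exceed the experimental sensitivity delta."""
    return v * sqrt(f / delta)

# 1% measurement, tree-level-generated operator (f ~ 1):
print(reach(1.0, 0.01))                    # → 2460.0 GeV, i.e. ~2.5 TeV

# 1% measurement, loop-generated operator (f ~ 1/16pi^2):
print(reach(1.0 / (16 * pi**2), 0.01))     # ≈ 196 GeV
```

The two outputs illustrate the hierarchy discussed in the coefficient estimates: with the same experimental precision, loop-generated operators probe a much smaller scale than tree-level-generated ones.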


On the Limits of Coded Transmission over Fading Channels with CDMA

Alexander Lampe and Johannes B. Huber
Telecommunications Institute II, University of Erlangen
Cauerstraße 7/NT, D-91058 Erlangen, Germany
Phone: +49-9131-8527114, Fax: +49-9131-8528919, Email: alampe@lnt.de

In this paper we investigate the synchronous transmission over time-variant multipath Rayleigh-fading channels employing Direct Sequence Code Division Multiple Access (DS-CDMA). Assuming ideal knowledge of the actual channel state, we introduce an equivalent transmission model for the case of high processing gain. Based on this model, an analytical solution for the spectral efficiency achievable by application of various nonlinear receivers can be given. Here, we consider the application of linear interference suppression by means of MMSE filters combined with successive cancellation and single-user decoding. We show that in this way the system's spectral efficiency approaches that of the AWGN channel if the number of users is much larger than the spreading factor. This holds for the flat Rayleigh fading channel as well as for frequency-selective channels with a large number of propagation paths.

I. INTRODUCTION

Recently, the search for information-theoretic bounds regarding the transmission of users with code division multiple access (CDMA) to a single receiver has attracted considerable attention. So, in [1,2] the system's capacity resulting for several linear multiuser receivers with and without decision feedback was derived for synchronous transmission over additive white Gaussian noise (AWGN) channels. Further, for transmission over attenuated single-path channels with constant path gains which are known at the receiver, it was shown that linear interference suppression by means of an MMSE filter combined with single-user decoding and successive interference cancellation achieves the same capacity as the overall joint optimum decoder (see [3,2,4]).
Next, assuming a random distribution of the users' powers as well as random spreading sequences, it was shown by Tse and Hanly [5] that the signal-to-noise ratio at the output of a linear MMSE filter reaches a nonrandom limit under the condition of infinite processing gain and constant system load. This result was derived from the distribution of eigenvalues of large random covariance matrices (see also [6]).

In this work we address the problem of reliable transmission over frequency-selective fading channels with synchronous CDMA. Taking into account the practical relevance of large spreading factors for forthcoming communications systems, an equivalent model for the synchronous transmission over multipath fading channels with randomly chosen spreading sequences is derived in the limit of large processing gain. Based on this model, and assuming perfect channel state information, a closed solution for the system's spectral efficiency achievable with MMSE Successive Interference Cancellation (MMSE-SIC) is given. With this analytical result, the influence of the system load and the number of propagation paths on the spectral efficiency is studied. In addition, we show that the spectral efficiency of the optimal joint decoder can be achieved by MMSE-SIC in the case of fading channels with known path gains, too.

The paper is arranged as follows. In Section II, the actual and the equivalent transmission model for high processing gain are given. Based on the equivalent model, the capacity for CDMA employing MMSE-SIC at the receiver is derived in Section III. Finally, Section IV points out conclusions.

II. TRANSMISSION MODELS

We consider the transmission of users over independent frequency-selective channels with CDMA to a single receiver. Assuming that all users transmit synchronously, the underlying discrete-time equivalent complex baseband transmission model is illustrated in Fig. 1a.

Fig. 1: a) Transmission model of CDMA system with users; b) Tapped-delay-line channel model with paths.
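The model of Fig. 1 — each user's effective spreading sequence being the convolution of its random chip sequence with the channel impulse response — can be sketched as follows. The spreading factor, user count, path count and noise level are illustrative assumptions; the path gains are i.i.d. zero-mean complex Gaussian with variance 1/L, matching the equal-gain channel described in the text.

```python
import numpy as np

rng = np.random.default_rng(0)
N, K, L = 64, 8, 4   # spreading factor, users, propagation paths (illustrative)

# random unit-energy binary spreading sequences, one column per user
S = rng.choice([-1.0, 1.0], size=(N, K)) / np.sqrt(N)

# i.i.d. zero-mean complex Gaussian path gains with variance 1/L ("equal gain" channel)
H = (rng.standard_normal((L, K)) + 1j * rng.standard_normal((L, K))) / np.sqrt(2 * L)

# effective sequences: convolution of spreading sequence and channel impulse response
S_eff = np.stack([np.convolve(S[:, k], H[:, k]) for k in range(K)], axis=1)  # (N+L-1, K)

# one transmitted symbol per user, plus additive complex Gaussian noise
b = rng.choice([-1.0, 1.0], size=K)
sigma2 = 0.01
noise = np.sqrt(sigma2 / 2) * (rng.standard_normal(N + L - 1)
                               + 1j * rng.standard_normal(N + L - 1))
y = S_eff @ b + noise   # received chip vector for one symbol interval
print(y.shape)          # → (67,)
```

The N+L−1 received samples make visible the point used in the text: for fixed L, the fraction of each user's energy in the first and last L−1 samples vanishes as the spreading factor grows.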
Here,anddenote the th user’s transmitted channel symbol and his/her spreading sequence in the th transmission interval,respectively.The users’channel symbols are chosen from the set and the elements of the unit energy spreading sequence are drawn randomly aspaths.is the length of one chip interval.As usual we assume the path gainsto be independent and identically zero mean proper complex Gaussian distributed with variance,Thus,in order to simplify the derivations we restrict our consider-ations to so–called equal gain channels.1Supposing the path weights to be constant over one transmis-sion interval the received signal resulting fromcan be written as, where the th user’s effective spreading sequenceresults from the convolution of the actual spreading sequenceand the channel’s impulse response i.e.,.In the following we look at the behavior of this multi-ple access system in the case of large spreading factors. In fact,we consider the situation while keeping the physical bandwidth allocated to each user constant, what means that the data symbol interval is considerably longer than the chip duration.Of course,this implies that the data rate of each single user is relatively small compared to total system rate for sufficiently large load .However,it is to be expected that the data rates re-quired in the uplink of future mobile CDMA systems will be much lower than in the downlink.So,this as-sumption can be used to evaluate analytically the sys-tem’s performance describing the uplink scenario in mo-bile telecommunication systems.But,before we can proceed with that calculation we have to simplify our transmission modelfirst.For this, we consider the th transmission interval.The average expected power of the intersymbol interference(ISI)due to a single user’s transmission in the st as well as st interval affecting the th data symbol interval of interest is obtained as1and denote the expectation value and the absolute mag-nitude of,respectively.2Note,that the number of 
paths needed to be resolved in this system is limited since isfixed.The dimensional vectorrepresents the additive channel noise,where the i.i.d.samplesare zero mean complex Gaussian variables with variance .Further,andconsist of the users’transmitted symbols and their effective spreading sequences as well.It can be seen that the ratio of each user’s signal en-ergy contained in thefirst as well as last samples of the received signal to the whole energy transmitted by each user in one symbol interval tends to zero for rising.In addition,the same holds for the energy of the channel noise while the relation between the total channel noise power and signal power as well as the multiuser interference in each transmission interval remains unchanged.So,focusing on the elementswefind that their distribution(and hereby the system’s spectral efficiency)is not changed for if each element is randomly chosen as(3)where is zero complex Gaussian distributed with variance and(4)Here,denotes the unit step function.Further, the independent and identically distributed elements of can be randomly chosen from any arbitrary com-plex distribution with zero mean and variance.So, for we end up with the model(5)Note it is still assumed that andare perfectly known to the receiver in each transmission interval.Further,it should be stressed that this model is valid even forfinite processing gains in the cases of synchronous transmission over aflat Rayleigh fading channel as well as the additive white Gaussian noise channel.III.C APACITY OF CDMA APPLYING MMSE–SIC In order to maximize the mutual informationof the considered multiple access system all channel symbols have to be drawn i.i.d.from a Gaussian dis-tribution with equal power,i.e.,.We investigate the spectral efficiency achievable by applica-tion of appropriately chosen MMSEfilters in conjunc-tion with successive cancellation and single user decod-ing at the receiver.For this,wefirst determine the signalto noise ratio(SNR)provided at the 
output of an MMSE filter for a specific user,supposing that the number of uncancelled interfering users is.Let us consider a system with users defining user as user of interest.Then,the signal to noise ratio in the th transmission interval at the output of an MMSE–filter adapted to user is[9]where denotes the identity matrix and the con-jugate transpose.Now,for(implying) and regarding that the users amplitudes are drawn from identical wide sense stationary random pro-cesses the empirical distribution of the eigenvalues ofconverges for each to.So,we get with Theorem3.1of[5]for the signal to noise ratio of user(see also[6])(6)(7)Introducing the normalized signal to noise ratio depend-ing on the ratio and being de-noted as(9)it-eratively.Equipped with(11)(13) where.To illustrate the above result we consider the case ,i.e.,the single path Rayleigh fading channel and the case which is well–known to be equivalent to the transmission over an AWGN channel.First,for Eq.(10)simplifies to(14)whereas for using( denotes the Dirac impulse)is solvedThus,for the required SNR can be obtained ex-plicitly only relying on the load and the signal as wellas noise power and,respectively(see also[5],[1])(17)(18) and integration with respect to(see Eq.(13))leads tothe desired spectral efficiencies(19)log(22) Now,assuming again while const.it turns out that approaches in probability(23)log.Thus,we get for(24) Finally,averaging over the various transmission inter-vals we solve(25)More explicitly,the spectral efficiency of the opti-mum decoder can be achieved by MMSE–SIC with sin-gle user decoding for time variant channels supposing perfect channel knowledge,too.So,it can be shown that the result given in Eq.(20)is equal to that found in[1] for the AWGN channel assuming equal power users anda closed analytical solution of the integral is provided in[11].In Fig.2the spectral efficiencies versus power effi-ciency given in terms of the required energy per bit to noise ratio for with as well as 
the analytical solution for are depicted.

Fig. 2: Spectral efficiency vs. [dB] for with and Shannon bound.

The plot shows that the spectral efficiencies depend strongly on the chosen load. For an orthogonal transmission scheme the corresponding curve is given, too. The spectral efficiency resulting for an orthogonal scheme is [12]

(27)

Fig. 3: Spectral efficiency vs. for with , orthogonal system and Shannon bound.

The figure shows that for rising the spectral efficiency of synchronous CDMA employing random spreading sequences for transmission over a flat Rayleigh fading channel converges to the Shannon bound. It has to be emphasized that this is not caused by using the equivalent transmission model, as Eq. (5) represents exactly the actual system for the case for all values of . Instead, this result can be explained by the fact that the power allocated to each dimension of the dimensional space spanned by the spreading sequences reaches an invariant limit for , being equal for all dimensions. Considering , it can be seen that it is only marginally larger than with . Thus, a relatively small overload suffices to outperform orthogonal transmission. Finally, in Fig. 4 the spectral efficiency for and is depicted. Studying the curves given in this figure we find again that the increase of is very moderate for rising .
Regarding the previous results this stresses again the difference of the two parameters and . While allows a transmission close to the Shannon bound, the gap to the Shannon bound does not vanish even for if is finite.

Fig. 4: Spectral efficiency vs. for load with and Shannon bound.

IV. CONCLUSIONS

In this paper we derived an analytical formula for the spectral efficiency achievable in the limit by synchronous CDMA systems employing MMSE–SIC combined with single user decoding as well as randomly chosen spreading sequences. Equipped with this formula we studied the influence of system load as well as number of propagation paths on the system's spectral efficiency. It was shown that the gain resulting from resolving more and more paths supposing an equal gain channel model is quite moderate. In contrast, the increase in spectral efficiency due to rising load is significant. So, the results indicate that independently of the Shannon bound is reached for . Moreover, we found that for transmission over fading channels and load the spectral efficiency of a nonorthogonal CDMA system with equal power users can exceed that of an orthogonal access scheme. In addition, imposing a limit on the total transmit power the superiority of nonorthogonal CDMA compared to an orthogonal access scheme can be shown, too.

In the same way as above it is also possible to derive the spectral efficiencies of other nonlinear multiuser receivers like the matched filter/decorrelating decision feedback receiver. However, as these receivers are not as good as the MMSE–SIC scheme except for some special cases, this has been omitted here. Finally, it is worth pointing out that the possibility to ignore the intersymbol interference for while can also be shown by calculating the capacities of a CDMA system with processing gains where the first and last received samples in each transmission interval are dumped, leading to a lower bound as well as upper bound on the actual system's capacity which merge for .

V. REFERENCES

[1] S. Verdú and
S. Shamai (Shitz), "Spectral efficiency of CDMA with random spreading," IEEE Transactions on Information Theory, vol. 45, pp. 622–640, Mar. 1999.

[2] R. Müller, Power and Bandwidth Efficiency of Multiuser Systems with Random Spreading. Aachen: Shaker-Verlag, 1999.

[3] M. K. Varanasi and T. Guess, "Achieving vertices of the capacity region of the synchronous Gaussian correlated-waveform multiple-access channel with decision-feedback receivers," in Proc. IEEE International Symposium on Information Theory (ISIT), Ulm, Germany, p. 270, June/July 1997.

[4] A. Lampe, R. R. Müller, and J. B. Huber, "Transmit Power Allocation for Gaussian Multiple Access Channels with Diversity," in Proc. IEEE Information Theory Workshop (ITW), South Africa, p. 101, June 1999.

[5] D. Tse and S. Hanly, "Linear multiuser receivers: Effective interference, effective bandwidth and capacity," IEEE Transactions on Information Theory, vol. 45, pp. 641–657, Mar. 1999.

[6] J. W. Silverstein and Z. Bai, "On the empirical distribution of eigenvalues of a class of large dimensional random matrices," Journal of Multivariate Analysis, vol. 54, pp. 175–192, 1995.

[7] J. S. Evans and D. N. Tse, "Linear multiuser receivers for multipath fading channels," in Proc. IEEE Information Theory Workshop (ITW), South Africa, pp. 30–32, June 1999.

[8] A. Papoulis, Probability, Random Variables, and Stochastic Processes. New York: McGraw-Hill, 3rd ed., 1991.

[9] S. Haykin, Adaptive Filter Theory. Englewood Cliffs, NJ: Prentice-Hall, 3rd ed., 1996.

[10] S. Shamai (Shitz) and A. D. Wyner, "Information-theoretic considerations for symmetric, cellular, multiple-access fading channels (Part I)," IEEE Transactions on Information Theory, vol. 43, pp. 1877–1894, Nov. 1997.

[11] P. Rapajic and D. Popescu, "Derivation of the closed form information capacity equation of the random signature multiple-input multiple-output Gaussian channel," in Proc. IEEE Information Theory Workshop (ITW), South Africa, p. 96, June 1999.

[12] I. Gradshteyn and I. Ryzhik, Table of Integrals, Series, and Products. New York: Academic Press, 5th ed., 1994.

Interpreting eigenvalues


Eigenvalues and eigenvectors are essential concepts in linear algebra and have extensive applications in various fields, such as physics, engineering, computer science, and statistics. In this article, we will delve into the topic of eigenvalues and provide a step-by-step explanation to enhance our understanding.

First, let's start by defining what eigenvalues and eigenvectors are. An eigenvector of a square matrix A is a non-zero vector "v" such that when A is multiplied by "v," the result is a scaled version of "v." In other words, Av = λv, where λ is the eigenvalue associated with "v."

1. Intuitive Explanation:

To better grasp the concept, let's visualize the transformation of a matrix A on a vector "v" geometrically using a simple 2x2 matrix. Consider a matrix A that acts as a transformation on the space, and "v" is a vector in that space. When A is applied to "v," it stretches, rotates, or skews "v."

Now, if there exists a vector "v" that only stretches but does not change direction after applying A, we call it an eigenvector. The factor by which the vector stretches is the corresponding eigenvalue. Hence, eigenvalues represent the scaling factor associated with each eigenvector.

2. Calculation of Eigenvalues:

To calculate eigenvalues, we solve the characteristic equation, det(A - λI) = 0, where A is the given matrix, λ is the eigenvalue, and I is the identity matrix of the same size as A. This equation arises from the fact that (A - λI)v = 0 for an eigenvector "v."

Let's consider an example to illustrate the calculation. Suppose we have a 2x2 matrix A = [a b; c d]. To find the eigenvalues, we solve the equation det(A - λI) = 0, which becomes:

(a - λ)(d - λ) - bc = 0.

Simplifying this equation, we get a quadratic equation in λ: λ² - tr(A)λ + det(A) = 0, where tr(A) represents the trace of A (the sum of diagonal elements). Solving this equation yields the eigenvalues.

3. Properties and Interpretation:

Eigenvalues possess several important properties.
Firstly, the sum of the eigenvalues equals the trace of the matrix (tr(A)). Secondly, the product of the eigenvalues equals the determinant of the matrix (det(A)). These properties hold true for square matrices of any size.

Eigenvalues also have essential implications in various fields. In physics, eigenvalues represent the energy levels of a system, and eigenvectors correspond to the stationary states. For example, in quantum mechanics, the wave function of a particle can be expressed as a linear combination of eigenvectors of the Hamiltonian operator.

In engineering applications, eigenvalues play a crucial role in structural dynamics and vibrations. By computing the eigenvalues of a structure, we can determine the natural frequencies and mode shapes, which help design robust and stable structures.

Eigenvalues find applications in image recognition, where eigenfaces are used to represent facial features and classify images. Similarly, in recommendation systems, we can use eigenvectors to represent user preferences and suggest personalized recommendations.

4. Eigendecomposition:

Another essential concept related to eigenvalues is the eigendecomposition of a matrix. Eigendecomposition decomposes a matrix A into a product of eigenvectors and a diagonal matrix. It can be expressed as A = PDP^(-1), where P is a matrix with eigenvectors as columns, and D is a diagonal matrix with the corresponding eigenvalues.

Eigendecomposition allows us to simplify calculations involving matrix powers, exponentials, and matrix inverses. Moreover, it provides insight into the underlying structure of the matrix, revealing important patterns and relationships.

In summary, eigenvalues and eigenvectors are fundamental concepts in linear algebra that have vast applications in various disciplines. Understanding these concepts and their applications can facilitate efficient problem-solving and provide valuable insights into the behavior and properties of matrices and systems.
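The characteristic-equation recipe of section 2 and the trace/determinant properties of section 3 can be checked with a few lines of code. This is a minimal sketch for the real 2x2 case only; the matrix entries used below are illustrative.

```python
import math

# Eigenvalues of a 2x2 matrix A = [[a, b], [c, d]] via the characteristic
# equation lambda^2 - tr(A)*lambda + det(A) = 0 (real eigenvalues only).
def eigenvalues_2x2(a, b, c, d):
    tr = a + d                # trace of A
    det = a * d - b * c      # determinant of A
    disc = tr * tr - 4 * det # discriminant of the quadratic
    if disc < 0:
        raise ValueError("complex eigenvalues; handle separately")
    root = math.sqrt(disc)
    return (tr + root) / 2, (tr - root) / 2

lam1, lam2 = eigenvalues_2x2(4, 1, 2, 3)
print(lam1, lam2)                    # 5.0 2.0
print(lam1 + lam2 == 4 + 3)          # sum equals the trace -> True
print(lam1 * lam2 == 4 * 3 - 1 * 2)  # product equals the determinant -> True
```

For larger matrices one would use a library eigensolver rather than closed-form roots; the quadratic shortcut exists only in the 2x2 case.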

A Fast Leading Eigenvector Approximation for Segmentation and Grouping


based grouping algorithms more efficient. The approximation is based on a linear perturbation analysis and applies to matrices that are non-sparse, non-negative and symmetric.
In order to obtain the vector we re-write equation 3 using a series expansion of the N x N matrix.
We demonstrate our method on image segmentation problems.
1 Introduction
Recently, there has been considerable interest in the use of graph-spectral [1] methods for computer vision for segmentation and grouping [2, 3, 4]. These methods all share the feature that they use the eigenvectors of a weighted adjacency matrix to locate salient groupings of objects. Although elegant, one of the criticisms that can be leveled at these methods is that they are computationally demanding because they rely on the numerical determination of eigenvectors of large matrices. The problem of computing the eigenvalues and eigenvectors of a matrix is one of classical linear algebra which arises in many practical problems in science and engineering. However, it is frequently the determination of the leading eigenvector which turns out to be of pivotal importance. The reason for this is that it is intimately related to the Rayleigh quotient [3, 5].
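Since the leading eigenvector maximizes the Rayleigh quotient, the baseline way to compute it is power iteration. The sketch below is a hedged illustration of that baseline (not the approximation scheme proposed in the paper), using a small made-up symmetric non-negative matrix.

```python
# Power iteration for the leading eigenvector of a small symmetric
# non-negative matrix; the Rayleigh quotient estimates the eigenvalue.
def power_iteration(A, iters=200):
    n = len(A)
    v = [1.0] * n                       # arbitrary non-zero start vector
    for _ in range(iters):
        w = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
        m = max(abs(x) for x in w)      # renormalize to avoid overflow
        v = [x / m for x in w]
    # Rayleigh quotient v^T A v / v^T v
    Av = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
    lam = sum(v[i] * Av[i] for i in range(n)) / sum(x * x for x in v)
    return lam, v

lam, v = power_iteration([[2.0, 1.0], [1.0, 2.0]])
print(round(lam, 6))   # 3.0, the largest eigenvalue of [[2, 1], [1, 2]]
```

Each iteration costs one matrix-vector product, which is exactly the cost the perturbation-based approximation discussed here aims to reduce further for large affinity matrices.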

Second-derivative analysis


Second-derivative analysis, also known as second-order differentiation or double differentiation, involves taking the derivative of a function with respect to a variable and then differentiating the resulting expression again. This analysis is commonly used in calculus to study the rates of change of rates of change, such as acceleration or curvature.

Let's consider a function \( f(x) \) and its derivative \( f'(x) \). The derivative of \( f'(x) \) with respect to \( x \) is called the second derivative of \( f(x) \) and is denoted as \( f''(x) \) or \( \frac{d^2f}{dx^2} \).

Here's a step-by-step explanation of how to find the second derivative:

1. **Find the first derivative**: Differentiate \( f(x) \) with respect to \( x \) to find \( f'(x) \).
2. **Find the second derivative**: Differentiate \( f'(x) \) with respect to \( x \) to find \( f''(x) \).

For example, let's find the second derivative of the function \( f(x) = x^3 \):

1. **First derivative**: \( f'(x) = 3x^2 \)
2. **Second derivative**: \( f''(x) = \frac{d}{dx}(3x^2) = 6x \)

The second derivative of \( f(x) = x^3 \) is \( f''(x) = 6x \).

The second derivative provides information about the concavity of the graph of the function. If \( f''(x) > 0 \), the graph is concave up (like a "U"), and if \( f''(x) < 0 \), the graph is concave down (like an "n"). A second derivative of zero at a specific \( x \) value indicates a possible inflection point, where the concavity of the graph changes.

Second-derivative analysis can be extended to higher orders, where you differentiate a function multiple times. Each subsequent derivative provides increasingly detailed information about the function's behavior.
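The worked example above (f(x) = x^3, f''(x) = 6x) can be verified numerically with a central finite-difference approximation. This is a small sketch; the step size h is an arbitrary choice.

```python
# Central-difference estimate of the second derivative:
# f''(x) ~ (f(x + h) - 2 f(x) + f(x - h)) / h^2
def second_derivative(f, x, h=1e-4):
    return (f(x + h) - 2 * f(x) + f(x - h)) / (h * h)

f = lambda x: x ** 3
for x in (1.0, 2.0, -3.0):
    print(round(second_derivative(f, x), 3))   # matches 6x: 6.0, 12.0, -18.0
```

The positive value at x = 2 (concave up) and negative value at x = -3 (concave down) match the concavity rule stated above, with the sign flipping at the inflection point x = 0.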

Algebraic eigenvalue problems


Chapter 6. Algebraic eigenvalue problems

"Das also war des Pudels Kern!" ("So that was the poodle's core!") GOETHE.

6.0. Introduction

Determination of eigenvalues and eigenvectors of matrices is one of the most important problems of numerical analysis. Theoretically, the problem has been reduced to finding the roots of an algebraic equation and to solving linear homogeneous systems of equations. In practical computation, as a rule, this method is unsuitable, and better methods must be applied.

When there is a choice between different methods, the following questions should be answered:

(a) Are both eigenvalues and eigenvectors asked for, or are eigenvalues alone sufficient?
(b) Are only the absolutely largest eigenvalue(s) of interest?
(c) Does the matrix have special properties (real symmetric, Hermitian, and so on)?

If the eigenvectors are not needed, less memory space is necessary, and further, if only the largest eigenvalue is wanted, a particularly simple technique can be used. Except for a few special cases, a direct method for computation of the eigenvalues from the characteristic equation is never used. Further, it turns out that practically all methods depend on transforming the initial matrix one way or another without affecting the eigenvalues. The table on p. 114 presents a survey of the most important methods, giving initial matrix, type of transformation, and transformation matrix. As a rule, the transformation matrix is built up successively, but the resulting matrix need not have any simple properties, and if so, this is indicated by a horizontal line. It is obvious that such a compact table can give only a superficial picture; moreover, in some cases the computation is performed in two steps. Thus a complex matrix can be transformed to a normal matrix following Eberlein, while a normal matrix can be diagonalized following Goldstine-Horwitz. Incidentally, both these procedures can be performed simultaneously, giving a unified method as a result.
Further, in some cases we have recursive techniques which differ somewhat in principle from the other methods. It is not possible to give here a complete description of all these methods because of the great number of special cases which often give rise to difficulties. However, methods which are important in principle will be treated carefully, and in other cases at least the main features will be discussed. On the whole we can distinguish four principal groups with respect to the kind of transformation used initially:

1. Diagonalization,
2. Almost diagonalization (tridiagonalization),
3. Triangularization,
4. Almost triangularization (reduction to Hessenberg form).

The determination of the eigenvectors is trivial in the first case and almost trivial in the third case. In the other two cases a recursive technique is easily established which will work without difficulties in nondegenerate cases. To a certain amount we shall discuss the determination of eigenvectors, for example, Wilkinson's technique which tries to avoid a dangerous error accumulation. Also Wielandt's method, aiming at an improved determination of approximate eigenvectors, will be treated.

6.1. The power method

We assume that the eigenvalues of are where Now we let operate repeatedly on a vector which we express as a linear combination of the eigenvectors

(6.1.1)

Then we have and through iteration we obtain

(6.1.2)

For large values of the vector will converge toward that is, the eigenvector of The eigenvalue is obtained as

(6.1.3)

where the index signifies the component in the corresponding vector. The rate of convergence is determined by the quotient; convergence is faster the smaller is. For numerical purposes the algorithm just described can be formulated in the following way.
Given a vector we form two other vectors and

(6.1.4)

The initial vector should be chosen in a convenient way; often one tries a vector with all components equal to 1.

Example

Starting from we find that and After round-off, we get

If the matrix is Hermitian and all eigenvalues are different, the eigenvectors, as shown before, are orthogonal. Let be the vector obtained after iterations: We suppose that all are normalized: Then we have and Further, When increases, all tend to zero, and with we get Rayleigh's quotient

(6.1.5)

Example

With and we obtain for and 3: , and respectively, compared with the correct value The corresponding eigenvector is The quotients of the individual vector components give much slower convergence; for example,

The power method can easily be modified in such a way that certain other eigenvalues can also be computed. If, for example, has an eigenvalue then has an eigenvalue Using this principle, we can produce the two outermost eigenvalues. Further, we know that is an eigenvalue of and analogously that is an eigenvalue of If we know that an eigenvalue is close to we can concentrate on that, since becomes large as soon as is close to

We will now discuss how the absolutely next largest eigenvalue can be calculated if we know the largest eigenvalue and the corresponding eigenvector Let be the first row vector of and form

(6.1.6)

Here is supposed to be normalized in such a way that the first component is Hence the first row of is zero. Now let and be an eigenvalue and the corresponding eigenvector with the first component of equal to Then we have since and (note that the first component of as well as of is 1). Thus is an eigenvalue and is an eigenvector of Since has the first component equal to 0, the first column of is irrelevant, and in fact we need consider only the matrix which is obtained when the first row and first column of are removed.
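For the real symmetric case, the combination of the power method with deflation can be sketched in a few lines. This is a hedged illustration using the rank-one (Hotelling-style) subtraction rather than the book's exact row-vector construction, with a made-up 2x2 example.

```python
# Power iteration plus rank-one deflation for a symmetric matrix: after
# the dominant pair (lam1, x1) is found, iterate on
# B = A - lam1 * x1 x1^T / (x1^T x1), whose dominant eigenvalue is lam2.
def matvec(A, v):
    return [sum(A[i][j] * v[j] for j in range(len(v))) for i in range(len(A))]

def power(A, iters=500):
    v = [1.0] + [0.0] * (len(A) - 1)      # simple start vector
    for _ in range(iters):
        w = matvec(A, v)
        m = max(abs(x) for x in w)        # renormalize each step
        v = [x / m for x in w]
    Av = matvec(A, v)                     # Rayleigh quotient estimate
    return sum(v[i] * Av[i] for i in range(len(v))) / sum(x * x for x in v), v

A = [[2.0, 1.0], [1.0, 2.0]]
lam1, x1 = power(A)
s = sum(x * x for x in x1)
B = [[A[i][j] - lam1 * x1[i] * x1[j] / s for j in range(len(A))]
     for i in range(len(A))]
lam2, _ = power(B)
print(round(lam1), round(lam2))   # 3 1
```

As the text notes, the process can be repeated once each pair has been determined, though accumulated round-off limits how many eigenpairs can be peeled off reliably this way.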
We determine an eigenvector of this matrix, and by adding a zero as first component, we get a vector Then we obtain from the relation Multiplying with we find and hence When and have been determined, the process, which is called deflation, can be repeated.

Example

The matrix has an eigenvalue and the corresponding eigenvector or, normalized, Without difficulty we find Now we need consider only and we find the eigenvalues which are also eigenvalues of the original matrix The two-dimensional eigenvector belonging to is and hence Since we get and With we find and Hence and and all eigenvalues and eigenvectors are known.

If is Hermitian, we have when Now suppose that and form

(6.1.7)

It is easily understood that the matrix has the same eigenvalues and eigenvectors as except which has been replaced by zero. In fact, we have and and so on. Then we can again use the power method on the matrix With the starting vector we find the following values for Rayleigh's quotient: and compared with the correct value

If the numerically largest eigenvalue of a real matrix is complex, then must also be an eigenvalue. It is also clear that if is the eigenvector belonging to then is the eigenvector belonging to Now suppose that we use the power method with a real starting vector Then we form with so large that the contributions from all the other eigenvectors can be neglected. Further, a certain component of is denoted by Then where and the initial component of corresponding to is Hence where we have put Now we form Hence

(6.1.8)

Then we easily find In particular, if that is, if the numerically largest eigenvalues are of the form with real then we have the simpler formula

(6.1.10)

6.2. Jacobi's method

In many applications we meet the problem of diagonalizing real, symmetric matrices.
This problem is particularly important in quantum mechanics. In Chapter 3 we proved that for a real symmetric matrix all eigenvalues are real, and that there exists a real orthogonal matrix such that is diagonal. We shall now try to produce the desired orthogonal matrix as a product of very special orthogonal matrices. Among the off-diagonal elements we choose the numerically largest element: The elements and form a submatrix which can easily be transformed to diagonal form. We put and get

(6.2.1)

Now choose the angle such that that is, tan This equation gives 4 different values of and in order to get as small rotations as possible we claim Putting and we obtain: since the angle must belong to the first quadrant if tan and to the fourth quadrant if tan Hence we have for the angle where the value of the arctan function is chosen between After a few simple calculations we get finally:

(6.2.2)

(Note that and ) We perform a series of such two-dimensional rotations; the transformation matrices have the form given above in the elements and and are identical with the unit matrix elsewhere. Each time we choose such values and that We shall show that with the notation the matrix for increasing will approach a diagonal matrix with the eigenvalues of along the main diagonal. Then it is obvious that we get the eigenvectors as the corresponding columns of since we have that is, Let be the column vector of and the diagonal element of Then we have If is denoted by we know from Gershgorin's theorem that for some value of and if the process has been brought sufficiently far, every circle defined in this way contains exactly one eigenvalue. Thus it is easy to see when sufficient accuracy has been attained and the procedure can be discontinued.

The convergence of the method has been examined by von Neumann and Goldstine in the following way.
We put and, as before, The orthogonal transformation affects only the row and column and the row and column. Taking only off-diagonal elements into account, we find for and relations of the form and hence Thus will be changed only through the cancellation of the elements and that is, Since was the absolutely largest of all off-diagonal elements, we have and Hence we get the final estimate

(6.2.3)

After iterations, has decreased with at least the factor and for a sufficiently large we come arbitrarily close to the diagonal matrix containing the eigenvalues.

In a slightly different modification, we go through the matrix row by row, performing a rotation as soon as Here is a prescribed tolerance which, of course, has to be changed each time the whole matrix has been passed. This modification seems to be more powerful than the preceding one. The method was first suggested by Jacobi. It has proved very efficient for diagonalization of real symmetric matrices on automatic computers.

Example

Choosing we obtain tan and After the first rotation, we have Here we take and obtain tan and After the second rotation we have and after 10 rotations we have After rotations the diagonal elements are and while the remaining elements are equal to to decimals accuracy. The sum of the diagonal elements is and the product in good agreement with the exact characteristic equation:

Generalization to Hermitian matrices, which are very important in modern physics, is quite natural. As has been proved before, to a given Hermitian matrix we can find a unitary matrix such that becomes a diagonal matrix. Apart from trivial factors, a two-dimensional unitary matrix has the form A two-dimensional Hermitian matrix is transformed to diagonal form by where Putting we separate the real and imaginary parts and then multiply the resulting equations, first by and then by and and finally add them together.
Using well-known trigonometric formulas, we get

(6.2.4)

In principle we obtain from the first equation and then can be solved from the second. Rather arbitrarily we demand and hence where Since the remaining equation has the solution with and Now we want to choose according to in order to get as small a rotation as possible, which implies The following explicit solution is now obtained (note that and cannot both be equal to because then would already be diagonal):

(6.2.5)

As usual the value of the arctan function must be chosen between and The element can now be written and consequently:

(6.2.6)

If we get and recover the result in Jacobi's method. This procedure can be used repeatedly on larger Hermitian matrices, where the unitary matrices differ from the unit matrix only in four places. In the places and we introduce the elements of our two-dimensional matrix. The product of the special matrices is a new unitary matrix approaching when is increased.

Finally we mention that a normal matrix (defined through ) can always be diagonalized with a unitary matrix. The process can be performed following a technique suggested by Goldstine and Horwitz which is similar to the method just described for Hermitian matrices. The reduction of an arbitrary complex matrix to normal form can be accomplished through a method given by Patricia Eberlein. In practice, both these processes are performed simultaneously.

6.3. Givens' method

Again we assume that the matrix is real and symmetric. In Givens' method we can distinguish among three different phases. The first phase is concerned with orthogonal transformations, giving as result a band matrix with unchanged characteristic equation. In the second phase a sequence of functions is generated, and it is shown that it forms a Sturm sequence, the last member of which is the characteristic polynomial.
With the aid of the sign changes in this sequence, we can directly state how many roots larger than the inserted value the characteristic equation has. By testing for a number of suitable values we can obtain all the roots. During the third phase, the eigenvectors are computed.

The orthogonal transformations are performed in the following order. The elements and define a two-dimensional subspace, and we start by performing a rotation in this subspace. This rotation affects all elements in the second and third rows and in the second and third columns. However, the quantity defining the orthogonal matrix is now determined from the condition and not, as in Jacobi's method, by We have and The next rotation is performed in the (2, 4)-plane with the new determined from, that is, tan (note that the element was changed during the preceding rotation). Now all elements in the second and fourth rows and in the second and fourth columns are changed, and it should be particularly observed that the element is not affected. In the same way, we make the elements equal to zero by rotations in the planes. Now we pass to the elements and they are all set to zero by rotations in the planes. During the first of these rotations, the elements in the third and fourth rows and in the third and fourth columns are changed, and we must examine what happens to the elements and which were made equal to zero earlier.
We find Further, we get and By now the procedure should be clear, and it is easily understood that we finally obtain a band matrix, that is, such a matrix that In this special case we have Now we put

(6.3.1)

has been obtained from by a series of orthogonal transformations, with In Chapter it was proved that and have the same eigenvalues and further that, if is an eigenvector of and an eigenvector of (both with the same eigenvalue), then we have Thus the problem has been reduced to the computation of eigenvalues and eigenvectors of the band matrix. We can suppose that all otherwise could be split into two determinants of lower order. Now we form the following sequence of functions:

(6.3.2)

with and We find at once that which can be interpreted as the determinant of the element in the matrix. Analogously, we have which is the minor of By induction, it is an easy matter to prove that is the characteristic polynomial.

Next we shall examine the roots of the equation For we have the only root. For we observe that Hence we have two real roots and with, for example, For we will use a method which can easily be generalized to an induction proof. Then we write and obtain from (6.3.2): Now it suffices to examine the sign of in a few suitable points: We see at once that the equation has three real roots and such that In general, if has the roots and the roots then where By successively putting and we find that has different signs in two arbitrary consecutive points. Hence has real roots, separated by the roots of

We are now going to study the number of sign changes in the sequence It is evident that and Suppose that and are two such real numbers that in the closed interval Then obviously First we examine what happens if the equation has a root in the interval.
From it follows for that Hence and have different signs, and clearly this is also true in an interval Suppose, for example, that then we may have the following combination of signs: Hence, the number of sign changes does not change when we pass through a root of When however, the situation is different. Suppose, for example, that is odd. Denoting the roots of by and the roots of by we have Then we see that Now we let increase until it reaches the neighborhood of where we find the following scheme: Hence Then we let increase again (now a sign change of may appear, but, as shown before, this does not affect) until we reach the neighborhood of where we have and hence Proceeding in the same way through all the roots we infer that the number of sign changes decreases by one unit each time a root is passed. Hence we have proved that if is the number of eigenvalues of the matrix which are larger than then

(6.3.3)

The sequence is called a Sturm sequence. The described technique makes it possible to compute all eigenvalues in a given interval ("telescope method").

For the third phase, computation of the eigenvectors, we shall follow J. H. Wilkinson. Let be an exact eigenvalue of Thus we search for a vector such that Since this is a homogeneous system in variables, and since we can obtain a nontrivial solution by choosing equations and determining the components of (apart from a constant factor); the remaining equation must then be automatically satisfied. In practical work it turns out, even for quite well-behaved matrices, that the result to a large extent depends on which equation was excluded from the system. Essentially, we can say that the serious errors which appear on an unsuitable choice of equation to be excluded depend on numerical compensations; thus round-off errors achieve a dominant influence.

Let us assume that the equation is excluded, while the others are solved by elimination.
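Returning to the sign-count rule (6.3.3): for a symmetric tridiagonal matrix it follows directly from the recursion (6.3.2). The code below is a hedged sketch (diagonal entries in a, off-diagonal entries in b, and the usual convention that a zero sequence value takes the sign opposite to its predecessor); the test matrices are illustrative.

```python
# Sturm-sequence count for a symmetric tridiagonal matrix:
# p_0 = 1, p_1 = a_0 - x, p_k = (a_{k-1} - x) p_{k-1} - b_{k-2}^2 p_{k-2};
# the number of eigenvalues larger than x is n minus the sign changes.
def eigs_greater_than(a, b, x):
    n = len(a)
    sign_changes = 0
    s_prev = 1                    # sign of p_0(x) = 1
    p_prev, p = 1.0, a[0] - x
    for k in range(n):
        if k > 0:
            p_prev, p = p, (a[k] - x) * p - b[k - 1] ** 2 * p_prev
        s = -s_prev if p == 0 else (1 if p > 0 else -1)
        if s != s_prev:
            sign_changes += 1
        s_prev = s
    return n - sign_changes

# The tridiagonal matrix [[2, 1], [1, 2]] has eigenvalues 1 and 3:
print(eigs_greater_than([2.0, 2.0], [1.0], 2.0))   # 1
print(eigs_greater_than([2.0, 2.0], [1.0], 0.0))   # 2
```

Bisecting on x with this counter is exactly the "telescope method" mentioned above: each count halves the interval known to contain a given eigenvalue.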
The solution (supposed to be exact) satisfies the equations used for elimination but gives an error when inserted into the excluded equation. Actually, we have solved the system (We had to use an approximation instead of the exact eigenvalue.) Since constant factors may be omitted, this system can be written in a simpler way:

(6.3.5)

where is a column vector with the component equal to and the others equal to If the eigenvectors of are this vector can be expressed as a linear combination, that is,

(6.3.6)

and from (6.3.5) we get

(6.3.7)

Now let and we obtain

(6.3.8)

Under the assumption that our solution approaches as (apart from trivial factors). However, it may well happen that is of the same order of magnitude as (that is, the vector is almost orthogonal to ), and under such circumstances it is clear that the vector in (6.3.8) cannot be a good approximation of Wilkinson suggests that (6.3.5) be replaced by

(6.3.9)

where we have the vector at our disposal. This system is solved by Gaussian elimination, where it should be observed that the equations are permuted properly to make the pivot element as large as possible. The resulting system is written:

(6.3.10)

As a rule, most of the coefficients are zero. Since the have been obtained from the which we had at our disposal, we could as well choose the constants deliberately. It seems to be a reasonable choice to take all equal to; no eigenvector should then be disregarded. Thus we choose

(6.3.11)

The system is solved, as usual, by back-substitution, and last, the vector is normalized. Even on rather pathological matrices, good results have been obtained by Givens' method.

6.4. Householder's method

This method, also, has been designed for real, symmetric matrices. We shall essentially follow the presentation given by Wilkinson. The first step consists of reducing the given matrix to a band matrix. This is done by orthogonal transformations representing reflections.
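A single reflection of this kind can be sketched in code. This is a hedged illustration of the mapping x -> (I - 2 w w^T) x with |w| = 1, where w is chosen so that x lands on a multiple of the first unit vector (the zero-introducing step of the reduction); the input vector is illustrative.

```python
import math

# Build w from x so that (I - 2 w w^T) x = alpha * e1, choosing the sign
# of alpha to avoid cancellation; apply the reflection without forming
# the matrix explicitly.
def householder_reflect(x):
    n = len(x)
    alpha = math.sqrt(sum(v * v for v in x))   # |x| is preserved
    if x[0] > 0:
        alpha = -alpha                          # sign choice for accuracy
    w = x[:]
    w[0] -= alpha                               # w = x - alpha * e1
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]
    wx = sum(w[i] * x[i] for i in range(n))
    return [x[i] - 2 * wx * w[i] for i in range(n)]

y = householder_reflect([3.0, 4.0])
print(round(y[0], 6), round(abs(y[1]), 6))   # -5.0 0.0
```

The sign test on x[0] mirrors the accuracy remark in the text: the reflection target is chosen so that the subtraction forming w adds magnitudes instead of cancelling them.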
The orthogonal matrices will be denoted by, with the general structure (6.4.1). Here is a column vector such that (6.4.2). It is evident that is symmetric. Further, we have

that is, is also orthogonal.

The matrix, acting as an operator, can be given a simple geometric interpretation. Let it operate on a vector from the left:

In Fig. 6.4 the line is perpendicular to the unit vector in a plane defined by and. The distance from the endpoint of to is, and the mapping means a reflection in a plane perpendicular to.

Figure 6.4

Those vectors which will be used are constructed with the first components zero, or

With this choice we form. Further, by (6.4.2) we have

Now put and form successively (6.4.3).

At the first transformation, we get zeros in the positions and in the corresponding places in the first column. The final result will be a band matrix, as in Givens' method. The matrix contains elements in the row which must be reduced to zero by transformation with; this gives equations for the elements, and further we have the condition that the sum of the squares must be.

We carry through one step of the computation in an example:

The transformation must now produce zeros instead of and. Obviously, the matrix has the following form:

Since in the first row of only the first element is not zero, for example, the -element of can become zero only if the corresponding element is zero already in. Putting, we find that the first row of has the following elements:

Now we claim that (6.4.4). Since we are performing an orthogonal transformation, the sum of the squares
is large. This is accomplished by choosing a suitable sign for the square-root extraction for. Thus the quantities ought to be defined as follows: (6.4.7). The sign of this square root is irrelevant, and we choose plus. Hence we obtain for and (6.4.8).

The end result is a band matrix whose eigenvalues and eigenvectors are computed exactly as in Givens' method. In order to get an eigenvector of, an eigenvector of the band matrix has to be multiplied by the matrix; this should be done by iteration: (6.4.9)

6.5. Lanczos' method

The reduction of real symmetric matrices to tridiagonal form can be accomplished through the methods devised by Givens and Householder. For arbitrary matrices a similar reduction can be performed by a technique suggested by Lanczos. In this method two systems of vectors are constructed, and, which are biorthogonal; that is, for we have

The initial vectors and can be chosen arbitrarily, though in such a way that. The new vectors are formed according to the rules

The coefficients are determined from the biorthogonality condition, and for we form:

If we get

Analogously

Let us now consider the numerator in the expression for when:

because of the biorthogonality. Hence we have for, and similarly we also have under the same condition.
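The Householder reduction described in Section 6.4 can be sketched in NumPy as below. This is a minimal didactic version, not the book's: it forms each reflector P = I − 2ww^T explicitly for clarity (a production code applies the reflection without building P), and it uses the sign convention above, choosing the sign of the pivot so that the denominator quantity is large. The 4×4 test matrix is an arbitrary choice.

```python
import numpy as np

def householder_tridiag(A):
    """Reduce a real symmetric matrix to tridiagonal form by
    successive Householder reflections P = I - 2 w w^T, |w| = 1."""
    A = A.astype(float).copy()
    n = A.shape[0]
    for k in range(n - 2):
        x = A[k + 1:, k]
        # Sign chosen opposite to x[0] so w[0] = x[0] - alpha is large.
        alpha = -np.sign(x[0] if x[0] != 0 else 1.0) * np.linalg.norm(x)
        if alpha == 0.0:
            continue                         # column already reduced
        w = x.copy()
        w[0] -= alpha                        # w = x - alpha * e1
        w /= np.linalg.norm(w)
        P = np.eye(n)
        P[k + 1:, k + 1:] -= 2.0 * np.outer(w, w)
        A = P @ A @ P                        # P = P^T = P^-1: similarity
    return A

A = np.array([[4.0, 1.0, 2.0, 2.0],
              [1.0, 2.0, 0.0, 1.0],
              [2.0, 0.0, 3.0, 2.0],
              [2.0, 1.0, 2.0, 1.0]])
T = householder_tridiag(A)
```

Each similarity transform preserves the spectrum, so the tridiagonal result can be handed straight to the Givens-style bisection of Section 6.3.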
In this way the following simpler formulas are obtained:

If the vectors are considered as columns in a matrix, and if further a tridiagonal matrix is formed from the coefficients and, with ones in the remaining diagonal:

then we can simply write and, provided the vectors are linearly independent,

If similar matrices are formed from the vectors and from the coefficients, we get

Certain complications may arise; for example, some or may become zero, but it can also happen that even if and. The simplest way out is to choose other initial vectors, even if it is sometimes possible to get around the difficulties by modifying the formulas themselves.

Obviously, Lanczos' method can be used also with real symmetric or Hermitian matrices. Then one chooses just one sequence of vectors, which must form an orthogonal system. For closer details, particularly concerning the determination of the eigenvectors, Lanczos' paper should be consulted; a detailed discussion of the degenerate cases is given by Causey and Gregory.

Here we also mention one more method for tridiagonalization of arbitrary real matrices, first given by La Budde. Space limitations prevent us from a closer discussion, and instead we refer to the original paper.

6.6. Other methods

Among other interesting methods we mention the LR method. Starting from a matrix, we split it into two triangular matrices, with, and then we form. Since, the new matrix has the same eigenvalues as. Then we treat in the same way as, and so on, obtaining a sequence of matrices which in general converges toward an upper triangular matrix. If the eigenvalues are real, they will appear in the main diagonal. Even the case in which complex eigenvalues are present can be treated without serious complications. Closer details are given in the reference, where the method is described by its inventor, H. Rutishauser. Here we shall also examine the more general eigenvalue problem, where and are symmetric and, further, is positive definite.
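The two-sided Lanczos process can be sketched as follows. This is a hedged NumPy illustration following one common normalization (Saad's), not necessarily the book's; the start vectors and the random test matrix are arbitrary, and the serious-breakdown case mentioned above is only detected, not repaired (a real code restarts or uses look-ahead).

```python
import numpy as np

def two_sided_lanczos(A, v, w, m):
    """Two-sided Lanczos sketch: builds biorthogonal bases V, W
    (W^T V = I) such that W^T A V is tridiagonal."""
    n = A.shape[0]
    v = v / (w @ v)                          # enforce w . v = 1
    V, W = [v], [w]
    beta = delta = 0.0
    v_old = w_old = np.zeros(n)
    for j in range(m - 1):
        alpha = W[j] @ (A @ V[j])
        vh = A @ V[j] - alpha * V[j] - beta * v_old
        wh = A.T @ W[j] - alpha * W[j] - delta * w_old
        s = vh @ wh
        if abs(s) < 1e-14:
            break                            # serious breakdown
        delta = np.sqrt(abs(s))              # split s between the pair
        beta = s / delta
        v_old, w_old = V[j], W[j]
        V.append(vh / delta)
        W.append(wh / beta)
    return np.array(V).T, np.array(W).T

rng = np.random.default_rng(1)
A = rng.standard_normal((6, 6))
V, W = two_sided_lanczos(A, np.ones(6), np.ones(6), 5)
T = W.T @ A @ V
```

For a symmetric matrix, taking the two start vectors equal collapses this to the single orthogonal sequence mentioned in the text.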
Then we can split according to, where is a lower triangular matrix. Hence and, where. Since, the problem has been reduced to the usual type treated before.

6.7. Complex matrices

For computing eigenvalues and eigenvectors of arbitrary complex matrices (real nonsymmetric matrices also fall naturally into this group), we shall first discuss a triangularization method suggested by Lotkin and Greenstadt.

The method depends on the lemma by Schur stating that for each square matrix there exists a unitary matrix such that, where is a (lower or upper) triangular matrix (see Section 3.7). In practical computation one tries to find as a product of essentially two-dimensional unitary matrices, using a procedure similar to that described for Hermitian matrices in Section 6.2. It is possible to give examples for which the method does not converge (the sum of the squares of the absolute values of the subdiagonal elements is not monotonically decreasing; cf. the literature), but in practice convergence is obtained in many cases. We start by examining the two-dimensional case and put (6.7.1).

From we get. Further, we suppose that, where, and obtain (6.7.2). Clearly we have. Claiming, we find, with and (6.7.3). Here we conveniently choose the sign that makes as small as possible; with and we get. Hence is obtained directly from the elements and. Normally, we must take the square root of a complex number, and this can be done by the formula

where. When has been determined, we get and from (6.7.4).

Now we pass to the main problem and assume that is an arbitrary complex matrix. We choose that element below the main diagonal which is largest.
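Returning to the generalized problem with symmetric A and positive definite B, the Cholesky reduction just described can be sketched in NumPy. The split B = LL^T with L lower triangular turns A x = λ B x into the ordinary symmetric problem (L^{-1} A L^{-T}) y = λ y with y = L^T x; the 2×2 matrices are arbitrary examples.

```python
import numpy as np

def generalized_eig_sym(A, B):
    """Reduce A x = lam B x (A, B symmetric, B positive definite) to an
    ordinary symmetric eigenproblem via the Cholesky split B = L L^T."""
    L = np.linalg.cholesky(B)
    Linv = np.linalg.inv(L)
    C = Linv @ A @ Linv.T            # symmetric: the 'usual type'
    lam, Y = np.linalg.eigh(C)
    X = np.linalg.solve(L.T, Y)      # back-transform: x = L^-T y
    return lam, X

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
B = np.array([[2.0, 0.5],
              [0.5, 1.0]])
lam, X = generalized_eig_sym(A, B)
```

The congruence with L^{-1} preserves symmetry, which is exactly why the reduced problem can be handed to a symmetric solver.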

Eigenvalue inequalities for Klein-Gordon Operators

The plan of attack is to use trace identities to derive universal spectral bounds and geometric spectral bounds for Hm,Ω. The generator of the Cauchy process, corresponding to the case m = 0, is often referred to as the fractional Laplacian and designated √−∆. The latter is, unfortunately, ambiguous notation, since this operator is distinct from the operator √−∆Ω as defined by the functional calculus for the Dirichlet Laplacian −∆Ω, except when Ω is all of Rd. For this reason we shall avoid the ambiguous notation when speaking of compact Ω. (For the spectral theorem and the functional calculus, see, e.g., [47].) Whereas several universal eigenvalue bounds, mostly of unknown or indifferent sharpness, have been obtained for higher-order partial differential operators such as the bilaplacian (e.g., [32,25,14,54,57]), and for some first-order Dirac operators [11], universal bounds for pseudodifferential operators appear not to have been studied before. In a final section we study interacting Klein-Gordon operators of the form

    H = Hm,Ω + V(x),    (1.2)


The Matrix Cookbook
Kaare Brandt Petersen Michael Syskind Pedersen Version: November 14, 2008
What is this? These pages are a collection of facts (identities, approximations, inequalities, relations, ...) about matrices and matters relating to them. It is collected in this form for the convenience of anyone who wants a quick desktop reference.

Disclaimer: The identities, approximations and relations presented here were obviously not invented but collected, borrowed and copied from a large amount of sources. These sources include similar but shorter notes found on the internet and appendices in books - see the references for a full list.

Errors: Very likely there are errors, typos, and mistakes for which we apologize and would be grateful to receive corrections at cookbook@2302.dk.

It's ongoing: The project of keeping a large repository of relations involving matrices is naturally ongoing, and the version will be apparent from the date in the header.

Suggestions: Your suggestion for additional content or elaboration of some topics is most welcome at cookbook@2302.dk.

Keywords: Matrix algebra, matrix relations, matrix identities, derivative of determinant, derivative of inverse matrix, differentiate a matrix.

Acknowledgements: We would like to thank the following for contributions and suggestions: Bill Baxter, Brian Templeton, Christian Rishøj, Christian Schröppel, Douglas L. Theobald, Esben Hoegh-Rasmussen, Glynne Casteel, Jan Larsen, Jun Bin Gao, Jürgen Struckmeier, Kamil Dedecius, Korbinian Strimmer, Lars Christiansen, Lars Kai Hansen, Leland Wilkinson, Liguo He, Loic Thibaut, Miguel Barão, Ole Winther, Pavel Sakov, Stephan Hattinger, Vasile Sima, Vincent Rabaud, Zhaoshui He. We would also like to thank The Oticon Foundation for funding our PhD studies.

Monotone method for periodic boundary value problems of Caputo fractional differential equations


1. INTRODUCTION

The derivative of arbitrary order, or fractional derivative, was introduced almost 300 years ago with a query posed by L'Hospital to Leibniz. The fractional calculus was reasonably well developed by the 19th century. It was realized only in the past few decades that these derivatives are better models for studying physical phenomena in the transient state. This gave a fresh lease of life to the field, and there is a growing interest in studying the theory of fractional differential equations [1, 3, 4, 5, 6, 8, 9, 10]. The monotone iterative technique [7] is an effective and flexible mechanism that offers theoretical as well as constructive results in a closed set, namely, the sector. The generalized monotone iterative technique is a generalization and a refinement of the monotone method. In this paper, the PBVP for a Caputo fractional differential equation is considered, and the generalized monotone iterative technique is developed to cater to the situation where the function on the right-hand side is split into two functions: a function that can be made into a non-decreasing function, and a non-increasing function.

Eigenvalues of a real supersymmetric tensor


Abstract. In this paper, we define the symmetric hyperdeterminant, eigenvalues and E-eigenvalues of a real supersymmetric tensor. We show that eigenvalues are roots of a one-dimensional polynomial, and when the order of the tensor is even, E-eigenvalues are roots of another one-dimensional polynomial. These two one-dimensional polynomials are associated with the symmetric hyperdeterminant. We call them the characteristic polynomial and the E-characteristic polynomial of that supersymmetric tensor. Real eigenvalues (E-eigenvalues) with real eigenvectors (E-eigenvectors) are called H-eigenvalues (Z-eigenvalues). When the order of the supersymmetric tensor is even, H-eigenvalues (Z-eigenvalues) exist, and the supersymmetric tensor is positive definite if and only if all of its H-eigenvalues (Z-eigenvalues) are positive. An mth-order n-dimensional supersymmetric tensor, where m is even, has exactly n(m − 1)^(n−1) eigenvalues, and the number of its E-eigenvalues is strictly less than n(m − 1)^(n−1) when m ≥ 4. We show that the product of all the eigenvalues is equal to the value of the symmetric hyperdeterminant, while the sum of all the eigenvalues is equal to the sum of the diagonal elements of that supersymmetric tensor, multiplied by (m − 1)^(n−1). The n(m − 1)^(n−1) eigenvalues are distributed in n disks in C. The centers and radii of these n disks are the diagonal elements, and the sums of the absolute values of the corresponding off-diagonal elements, of that supersymmetric tensor. On the other hand, E-eigenvalues are invariant under orthogonal transformations. © 2005 Elsevier Ltd. All rights reserved.

Linear Algebra (English Lecture Notes)


A^k = X D^k X^(−1) = X diag(λ_1^k, λ_2^k, …, λ_n^k) X^(−1)
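The identity A^k = X D^k X^(−1) can be checked numerically; a short NumPy sketch on a small diagonalizable matrix (an arbitrary example with distinct real eigenvalues):

```python
import numpy as np

def matrix_power_by_diagonalization(A, k):
    """Compute A^k = X D^k X^-1 from the eigendecomposition A = X D X^-1
    (valid when A is diagonalizable)."""
    lam, X = np.linalg.eig(A)
    # X * lam**k scales column j of X by lam_j**k, i.e. X @ diag(lam**k).
    return (X * lam**k) @ np.linalg.inv(X)

A = np.array([[2.0, 3.0],
              [2.0, 5.0]])
P = matrix_power_by_diagonalization(A, 5)
```

The payoff of diagonalization is that raising A to any power reduces to raising n scalars to that power.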
Example. Let

    A = | 2  3 |
        | 2  5 |

Determine whether the matrix is diagonalizable or not.
Example. Let

    A = | 3  1  2 |
        | 2  0  2 |
        | 2  1  1 |
Let A be an n×n matrix and λ be a scalar. The following statements are equivalent:
(a) λ is an eigenvalue of A.
(b) (A − λI)x = 0 has a nontrivial solution.
(c) N(A − λI) ≠ {0}
(d) A − λI is singular.
(e) det(A − λI) = 0

The Product and Sum of the Eigenvalues
    p(λ) = det(A − λI) = | a11−λ   a12   ⋯   a1n  |
                         |  a21   a22−λ  ⋯   a2n  |
                         |   ⋮      ⋮          ⋮   |
                         |  an1    an2   ⋯  ann−λ |
Expanding along the first column, we get
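The expansion yields, in particular, the standard facts that the constant term of p(λ) gives det A as the product of the eigenvalues, while the coefficient of λ^(n−1) gives their sum as tr A. A quick numerical check on an arbitrary 3×3 matrix:

```python
import numpy as np

A = np.array([[3.0, 1.0, 2.0],
              [2.0, 0.0, 2.0],
              [2.0, 1.0, 1.0]])
lam = np.linalg.eigvals(A)
# Product of the eigenvalues equals det(A); their sum equals tr(A).
prod_ok = np.isclose(np.prod(lam), np.linalg.det(A))
trace_ok = np.isclose(np.sum(lam), np.trace(A))
```

Both hold for any square matrix over C, whether or not it is diagonalizable, since they are statements about the characteristic polynomial alone.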
1. The eigenvalues of a real symmetric matrix are all real.
2. If λ1, λ2, …, λk are distinct eigenvalues of an n×n real symmetric matrix A with corresponding eigenvectors x1, x2, …, xk, then x1, …, xk are orthogonal.
3. If A is a real symmetric matrix, then there is an orthogonal matrix U that diagonalizes A, that is, U^(−1)AU = U^T A U = D, where D is diagonal.
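Fact 3 can be verified numerically with `numpy.linalg.eigh`, which returns an orthogonal matrix of eigenvectors for a symmetric input; the test matrix below is an arbitrary choice.

```python
import numpy as np

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])     # real symmetric

eigvals, U = np.linalg.eigh(A)      # columns of U are orthonormal eigenvectors
D = U.T @ A @ U                     # should be diag(eigvals), facts 1-3
```

Note that `eigh` exploits symmetry and always returns real eigenvalues, unlike the general `eig`.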

Covariant Time Derivatives for Dynamical Systems


arXiv:nlin/0102038v2 [nlin.CD] 23 Jul 2001

Covariant Time Derivatives for Dynamical Systems

Jean-Luc Thiffeault
Department of Applied Physics and Applied Mathematics
Columbia University, New York, NY 10027, USA

PACS numbers: 83.10.Bb, 05.45.-a, 47.50.+d

1 Introduction

In physics, choosing an appropriate coordinate system can make the difference between a tractable problem and one that defies analytical study. In fluid dynamics, two main types of coordinates are used, each representing a natural setting in which to study fluid motion: the Eulerian coordinates, also known as the laboratory frame, are time-independent and fixed in space; in contrast, the Lagrangian (or material) coordinates are constructed to move with fluid elements. In between these extremes, other types of coordinates are used, such as rotating coordinates in geophysical fluid dynamics. Such coordinates usually have a nontrivial spatial and temporal dependence. The situation becomes more complicated when dealing with moving surfaces: here the metric itself has intrinsic time dependence. This time dependence incorporates the strain imposed on a 2D surface flow as the surface deforms.
To properly formulate fluid equations on thin films and other surfaces, one needs a covariant description, that is, a description of the building blocks of equations of motion—spatial and temporal derivatives—that obey tensorial transformation laws. There are other reasons than inherent deformation of the space to introduce a time-dependent, nontrivial metric. For instance, the advection-diffusion equation can have an anisotropic, time-dependent diffusion tensor, perhaps arising from some inhomogeneous turbulent process. In that case, it is advantageous to use the diffusion tensor as a metric, for then the characteristic directions of stretching, given by the eigenvectors of the metric tensor in Lagrangian coordinates, correspond to directions of suppressed or enhanced diffusion associated with positive or negative Lyapunov exponents, respectively [1,2]. Local physical quantities can be viewed as tensors (scalars, vectors, or higher-order tensors) evaluated along fluid trajectories. For instance, we may be interested in how the temperature (scalar) of a fluid element varies along a trajectory, or how the magnetic field (vector) associated with a fluid element evolves. Characterising the evolution of these tensors in complicated coordinates is again best done using some form of covariant time derivative, also called an objective time derivative. The covariant spatial derivative is a familiar tool of differential geometry [3–5]. The emphasis is usually on covariance under coordinate transformations of the full space-time. In fluid dynamics and general dynamical systems, however, the time coordinate is not included in the metric (though the metric components may depend on time), and the required covariance is less restrictive: we seek covariance under time-dependent transformations of the coordinates, but the new time is the same as the old and does not depend on the coordinates.
Time derivatives lead to non-tensorial terms because of time-dependent basis vectors—the same reason that ordinary derivatives are not covariant. There are many ways of choosing a covariant time derivative. The most familiar is the convective derivative introduced by Oldroyd [6,7] in formulating rheological equations of state. This derivative was then used by Scriven [8] to develop a theory of fluid motion on an interface. The convective derivative of a tensor is essentially its Lie derivative along the velocity vector. In spite of its economical elegance, the convective derivative has drawbacks. Firstly, unlike the usual covariant spatial derivative, it is not compatible with the metric tensor. A compatible operator vanishes when acting on the metric. Because the covariant derivative also has the Leibniz property, compatibility allows the raising and lowering of indices "through" the operator. This is convenient for some applications [1], and implies that the equation of motion for a contravariant tensor has the same form as the covariant one. A second drawback of the convective derivative is that it involves gradients of the velocity, and so is not directional. The commutator of the convective derivative and the spatial derivative thus involves second derivatives of the velocity, requiring it to be at least of class C2. A second common type of derivative is the corotational or Jaumann derivative (see Refs. [9] and [10, p. 342], and references therein), where the local vorticity of the flow is incorporated into the derivative operator. The corotational derivative is compatible with the metric, but like the convective derivative it depends on gradients of the velocity. The third type of derivative we discuss is a new, time-dependent version of the usual directional derivative along a curve used to define parallel transport [3–5]. The curve here is the actual trajectory of a fluid particle, with tangent vector given by the Eulerian velocity field. The directional derivative does not depend on gradients of
the velocity field. The concept of time-dependent parallel transport can be introduced using this derivative, and is equivalent to a covariant description of advection without stretching. A directional derivative was introduced in the context of fluid motion by Truesdell [11, p. 42], but it does not allow for time-dependence in the coordinates or metric. (Truesdell calls it the material derivative because of its connexion to fluid elements.) In this paper, we present a unified derivation of these different types of covariant time derivatives. We do not restrict ourselves to Eulerian and Lagrangian coordinates, as this obscures the general covariance of the theory: both these descriptions lack certain terms that vanish because of the special nature of the coordinates. From a dynamical system defined in some Eulerian frame, we transform to general time-dependent coordinates. We then find a transformation law between two time-dependent frames with no explicit reference to the Eulerian coordinates. The Eulerian velocity of the flow is not a tensor, but the move to general coordinates allows the identification of a velocity tensor that transforms appropriately (Section 2). We also derive a time evolution equation for the Jacobian matrix of a coordinate transformation between two arbitrary time-dependent frames. This time evolution equation facilitates the construction of the covariant time derivative in Section 3. After a discussion of the rate-of-strain tensor in Section 4, we present in Section 5 the three types of covariant time derivatives mentioned above: convective, corotational, and directional. Section 6 addresses a fundamental issue when dealing with generalised coordinates: the problem of commuting derivatives. In manipulating fluid equations it is often necessary to commute the order of time and space derivation.
When commuting two covariant spatial derivatives, the Riemann curvature tensor must be taken into account. Similarly, when commuting a covariant time derivative with a spatial derivative, there arises a tensor we call the time-curvature. This tensor vanishes for sufficiently simple time-dependence of the metric, and satisfies many properties similar to the Riemann tensor. Throughout this paper, we will usually refer to the "fluid," "fluid elements," and "velocity," but this is merely a useful concretion. The methods developed apply to general dynamical systems where the velocity is some arbitrary vector field defined on a manifold. The covariant time derivative still refers to the rate of change of tensors along the trajectory, but the tensors do not necessarily correspond to identifiable physical quantities. For example, the covariant time derivative is useful in formulating methods for finding Lyapunov exponents on manifolds with nontrivial metrics [1].

2 Time-dependent Coordinates

We consider the dynamical system on an n-dimensional smooth manifold U,

    ẋ = v(t, x),    (1)

where the overdot indicates a time derivative and v is a differentiable vector field. (For simplicity, we restrict ourselves to a given chart.) A solution x(t) defines a curve C in U with tangent v. We view the x as special coordinates, called the Eulerian coordinates, and denote vectors expressed in the Eulerian coordinate basis {∂/∂x^i} by the indices i, j, k. A time-dependent coordinate change z(t, x) satisfies

    ż^a(t, x(t)) = (∂z^a/∂x^i) ẋ^i + ∂z^a/∂t|_x,    (2)

where ∂/∂t|_x is taken at constant x. Here and throughout the rest of the paper, we assume the usual Einstein convention of summing over repeated indices. We denote vectors expressed in the general coordinate basis {∂/∂z^a} by the indices a, b, c, d. We use the shorthand notation that the index on a vector X characterises the components of that vector in the corresponding basis: thus X^a and X^i are the components of X in the bases {∂/∂z^a} and {∂/∂x^i}, respectively. The components X^a and X^i are also
understood to be functions of z and x, respectively, in addition to depending explicitly on time. Defining v := ż, we can regard Eq. (2) as a transformation law for v,

    v^a = (∂z^a/∂x^i) v^i + ∂z^a/∂t|_x.    (3)

This last term prevents v from transforming like a tensor. (We refer the reader to standard texts in differential geometry for a more detailed discussion of tensors [3–5].) Now consider a second coordinate system z̄(t, x), also defined in terms of x. We can use Eq. (3) and the chain rule to derive a transformation law between z and z̄,

    v^a − ∂z^a/∂t|_x = (∂z^a/∂z̄^ā)(v^ā − ∂z̄^ā/∂t|_x),    (4)

which leads us to define

    V^a := v^a − ∂z^a/∂t|_x,    (5)

which we call the velocity tensor. The velocity tensor is the absolute velocity of the fluid v with the velocity of the coordinates subtracted. In addition to the coordinates x, characterised by ∂x^i/∂t|_x = 0, we introduce another special set of coordinates, the Lagrangian coordinates a, defined by ȧ = 0. We denote vectors expressed in the Lagrangian coordinate basis {∂/∂a^q} by the indices p and q. From Eq. (3), we have

    v^q(t, x(t)) = (∂a^q/∂x^i) v^i + ∂a^q/∂t|_x = 0.    (6)

The initial conditions for a are chosen such that Eulerian and Lagrangian coordinates coincide at t = 0: a(0, x) = x. Lagrangian and Eulerian coordinates have the advantage that the time evolution of their Jacobian matrix is easily obtained. The Jacobian matrix ∂x^i/∂a^q satisfies [6]

    (d/dt)(∂x^i/∂a^q) = (∂v^i/∂x^j)(∂x^j/∂a^q).    (7)

By using the identity (∂a^q/∂x^i)(∂x^i/∂a^p) = δ^q_p, so that (d/dt)(δ^q_p) = 0, which follows from the chain rule, and using the Leibniz property and Eq. (7), we find

    (d/dt)(∂a^q/∂x^i) = −(∂a^q/∂x^j)(∂v^j/∂x^i).

The Leibniz property can be used again to find the time evolution of the Jacobian matrix of two arbitrary time-dependent transformations z(t, x) and z̄(t, x),

    (d/dt)(∂z^a/∂z̄^ā) = (∂v^a/∂z^b)(∂z^b/∂z̄^ā) − (∂z^a/∂z̄^b̄)(∂v̄^b̄/∂z̄^ā).    (8)

All reference to Eulerian and Lagrangian coordinates has disappeared from Eq. (8); this equation is crucial when deriving the covariant time derivative of Section 3.

3 The Covariant Time Derivative

The standard time derivative operator, which we have been denoting by an overdot, is defined for a vector field X as

    Ẋ^a := ∂X^a/∂t + (∂X^a/∂z^b) v^b,    (9)

where recall that ż = v. The first term is the change in X due to any explicit
time-dependence it might have; the second term is the change in X due to its dependence on z. The time derivative is not covariant, because a time-dependent change of basis will modify the form of Eq. (9). We define the covariant time derivative D by

    D X^a := Ẋ^a + α^a_b X^b,    (10)

where the α^a_b are time-dependent quantities that are chosen to make D X^a covariant. In order that the operator D have the Leibniz property, and that it reduce to the ordinary derivative (9) when acting on scalars, we require

    D Y_a = Ẏ_a − α^b_a Y_b

when acting on a 1-form Y. When D acts on mixed tensors of higher rank, an α must be added for each superscript, and one must be subtracted for each subscript. We refer to the α as connexions, by analogy with the spatial derivative case. By enforcing covariance of D, we can derive a general expression for α^a_b. Since X is a tensor, we can write

    D X^a = D[(∂z^a/∂z̄^ā) X^ā] = (∂z^a/∂z̄^ā) D X^ā,

because D X^a is by definition covariant. Hence, we require the α's to transform as

    α^ā_b̄ = (∂z̄^ā/∂z^a)(∂z^b/∂z̄^b̄) α^a_b + (∂z̄^ā/∂z^c)(d/dt)(∂z^c/∂z̄^b̄).

Using Eq. (8), the general solution can be written

    α^a_b = −∂v^a/∂z^b + H^a_b,    (13)

where H is an arbitrary tensor. Equation (13) is the most general form of the connexions α. In Section 5, we consider three convenient choices of the tensor H. But first, in Section 4, we examine the action of the covariant derivative on the metric tensor.

4 The Rate-of-strain Tensor

Our development so far has not made use of a metric tensor. We now introduce such a tensor, specifically a Riemannian metric g: TU × TU → ℜ. The components g_ab of the metric are functions of z and t, but the indices a and b run over the dimension n of TU, and so do not include a time component. It is informative to consider the derivative of the metric tensor,

    D g_ab = ġ_ab − α^c_a g_bc − α^c_b g_ac
           = ∂g_ab/∂t|_z + v^c ∂g_ab/∂z^c + g_ac ∂v^c/∂z^b + g_bc ∂v^c/∂z^a − (H_ab + H_ba),

where we have used the metric to lower the indices on H. We define the intrinsic rate-of-strain or rate-of-deformation tensor γ [8,7] as

    γ_ab := (1/2)(∇_a V_b + ∇_b V_a + ∂g_ab/∂t|_z).    (14)

Here we denote by ∇_a the covariant derivative with respect to z^a,

    ∇_b X^a := ∂X^a/∂z^b + Γ^a_bc X^c,

where the usual formula for the Christoffel symbols,

    Γ^a_bc = (1/2) g^ad (∂g_bd/∂z^c + ∂g_cd/∂z^b − ∂g_bc/∂z^d),

holds. The
covariant time derivative of the metric can thus be rewritten

    D g_ab = 2γ_ab − (H_ab + H_ba),    (16)

where (1/2)(H_ab + H_ba) denotes the symmetric part of H. The rate-of-strain tensor γ describes the stretching of fluid elements. The time derivative of the metric in its definition (14) is necessary for covariance under time-dependent transformations; the term describes straining motion that is inherent to the space, as embodied by the metric. The trace of the rate-of-strain tensor is a scalar,

    γ^c_c = g^ac γ_ac = ∇_c V^c + (1/2) ∂/∂t|_z log |g|,    (17)

where |g| is the determinant of g_ab and we have used the identity

    g^ac ∂g_ac/∂t|_z = ∂/∂t|_z log |g|.    (18)

The rate-of-strain tensor can be decomposed as

    γ_ab = γ′_ab + (1/n) γ^c_c g_ab,

where γ′_ab is traceless and represents a straining motion without change of volume, and γ^c_c g_ab/n is an isotropic expansion. We see from the trace (17) that for a time-dependent metric there can be an isotropic expansion even for an incompressible flow, if ∂|g|/∂t|_z ≠ 0. Note also that in Lagrangian coordinates (characterised by v^q = 0), the rate-of-strain tensor reduces to

    γ_pq = (1/2) ∂g_pq/∂t|_a,

so that the deformation of the space is contained entirely in the metric tensor.

5 Three Covariant Derivatives

As mentioned in Section 2, the requirement of covariance only fixes the covariant time derivative up to an arbitrary tensor [Eq. (13)]. That tensor may be chosen to suit the problem at hand, but there are three particular choices that merit special attention. In Section 5.1 we treat the convective derivative, and in Section 5.2 we examine two types of compatible derivatives: corotational and directional.

5.1 The Convective Derivative

The choice H ≡ 0 is equivalent to the convective derivative of Oldroyd [6,7].
The connexion, Eq. (13), reduces to the simple form

    α^a_b = −∂v^a/∂z^b,

so that

    D_c X^a = ∂X^a/∂t|_z + (∂X^a/∂z^b) v^b − (∂v^a/∂z^b) X^b.    (20)

When acting on a contravariant vector X^a, as in Eq. (20), D_c is sometimes called the upper convected derivative [9]; D_c acting on a covariant vector Y_a, D_c Y_a = Ẏ_a + (∂v^b/∂z^a) Y_b, is then called the lower convected derivative. In general, for an arbitrary tensor T,

    D_c T = ∂T/∂t|_z + L_v T,

where L_v T is the Lie derivative of T with respect to v [4,5].

Table 1. Comparison of the equation of motion for the components of an advected and stretched vector field B. The equations for the covariant and contravariant components of D_c B differ because of the lack of compatibility with the metric. (Columns: type, contravariant components, covariant components.)

In Lagrangian coordinates, we have v^q ≡ 0, so the convective derivative reduces to

    D_c X^q = ∂X^q/∂t|_a.

It follows that D_c g_ab = 2γ_ab, which does not vanish unless the velocity field is strain-free. The convective derivative is ideally suited to problems of advection with stretching, where a tensor is carried and stretched by a velocity field. Table 1 summarises the form of the equation for advection with stretching of a vector field B (B is "frozen in" the flow [12]) for the three different types of derivatives introduced here. The equation for the contravariant component B^a is simply D_c B^a = 0, but the equation for the covariant component B_a = g_ac B^c is D_c B_a = 2B^c γ_ca. These two equations differ because the operator D_c is not compatible with the metric.

5.2 Compatible Derivatives

Another way to fix H is to require that the operator D be compatible with the metric, that is, D g_ab = 0. This allows us to raise and lower indices through the operator D, a property possessed by the covariant spatial derivative. From Eq. (16), the requirement D g_ab = 0 uniquely specifies the symmetric part of H_ab, so that H_S = γ. Using Eqs. (13) and (14), we then find the compatible connexion

    α_ab = g_ac Γ^c_bd v^d + (1/2) ∂g_ab/∂t|_z − ω_ab + H^A_ab,    (21)

where H^A := (1/2)(H_ab − H_ba) is the antisymmetric part of H, the vorticity tensor is

    ω_ab := (1/2)(g_ac ∇_b V^c − g_bc ∇_a V^c),    (22)

and the symmetric coordinate rate-of-strain tensor is

    κ_ab := (1/2)(g_ac ∇_b ∂z^c/∂t|_x + g_bc ∇_a ∂z^c/∂t|_x).    (23)

In Eulerian coordinates, we have κ_ij = (1/2) ∂g_ij/∂t|_x, and the compatible connexion can be written

    α_ab = (1/2) ∂g_ab/∂t|_x + κ_ab − ω_ab + H^A_ab.    (24)

Since H^A_ab is antisymmetric, we can use it to cancel the vorticity, or we can set it to zero. The two choices are discussed separately in Sections 5.2.1 and 5.2.2. The decomposition of the velocity gradient tensor ∇V into the rate-of-strain and vorticity tensors has the form

    g_ac ∇_b V^c = [γ_ab − κ_ab] + ω_ab    (25)

in general time-dependent coordinates. When the coordinates have no time dependence, the tensor κ vanishes, as do the derivatives ∂z^c/∂t|_x, and we recover the usual decomposition of the velocity gradient tensor into the rate-of-strain and the vorticity. We can think of κ as the contribution to the rate-of-strain tensor that is due to coordinate deformation and not to gradients of the velocity field. However, the term ∂g/∂t|_z is a "real" effect representing the deformation due to a time-dependent metric, and is thus also included in the definition of the intrinsic rate-of-strain tensor, γ, defined by Eq. (14). In Euclidean space, when the rate-of-strain tensor γ vanishes everywhere we are left with rigid-body rotation at a constant rate given by ω [7]. With an arbitrary metric and time-dependent coordinates the situation is not so simple: the very concept of rigid-body rotation is not well-defined. Hence, even when γ ≡ 0, we cannot expect to be able to solve for v in closed form.

5.2.1 The Corotational Derivative

In this instance we choose the antisymmetric part H^A_ab to be zero. We call the resulting covariant derivative corotational, and denote it by D_J (the subscript J stands for Jaumann). The appellation "corotational" really applies to the Euclidean limit, g_ij = δ_ij, for which the compatible connexion Eq. (21) reduces to α_ij = −ω_ij. It is then clear that the covariant derivative is designed to include the effects of local rotation of the flow, as embodied by the vorticity. (See Refs. [9] and [10, p. 342], and references therein.) The derivative (21) with H^A ≡ 0 is thus a generalisation of the corotational derivative to include the effect of time-dependent non-Euclidean coordinates. In Table 1, we can see that, written
using D_J, the equation for advection with stretching of a vector B^a has the rate-of-strain tensor on the right-hand side. The "rotational" effects are included in D_J, hence the terms that remain involve only the strain.

5.2.2 The Directional Derivative

Another convenient choice is to set H^A_ab = ω_ab, thus cancelling the vorticity in Eq. (24). The resulting covariant time derivative then has the property that, in the absence of any explicit time-dependence, it reduces to the covariant derivative along the curve C [4,5], or directional derivative, where C is the trajectory of the dynamical system in the general coordinates z (Section 2). The derivative is called directional because it depends only on v, and not on gradients of v.

The form of the equation for advection with stretching of a vector B^a written using D_v is shown in Table 1. The ∇V term on the right-hand side is the "stretching" term [12] (called vortex stretching when B is the vorticity vector [13]). The κ term represents coordinate stretching, and does not appear in Euclidean space with time-independent coordinates.

Because the directional derivative depends only on v and not its gradients, it can be used to define time-dependent parallel transport of tensors. A vector X is said to be parallel-transported along v if it satisfies D_v X = 0, or equivalently

∂X^a/∂t|_z + v^b ∇_b X^a = −κ^a_c X^c.    (26)

This can be readily generalised to tensors of higher rank. In Euclidean space, with time-independent coordinates, the right-hand side of Eq. (26) vanishes, leaving only advection of the components of X. Thus, parallel transport is closely related to advection without stretching; Equation (26) is the covariant formulation of the passive advection equation.

6 Time-curvature

A hallmark of generalised coordinates is the possibility of having nonzero curvature. The curvature reflects the lack of commutativity of covariant derivatives, and is tied to parallel transport of vectors along curves [4,5]. An analogous curvature arises when we try to commute D and ∇, respectively the covariant
time and space derivatives:

∇_a[D X^b] − D[∇_a X^b] = H^c_a ∇_c X^b + g^bc [∇_a(H_cd − γ_cd − ω_cd) + R_cdae V^e + (1/2) S_cda] X^d,    (27)

where the time-curvature tensor S_abc is defined in terms of ∂g/∂t|_z and covariant derivatives of the coordinate velocities ∂z^e/∂t|_x (Eq. (28)), and the Riemann curvature tensor R obeys [5]

(∇_c ∇_d − ∇_d ∇_c) X^a = R^a_bcd X^b.    (29)

The time-curvature tensor satisfies S_abc = −S_bac, and the Riemann curvature tensor satisfies R_abcd = −R_bacd, R_abcd = R_cdab.

Even for trivial (Euclidean) coordinates, we do not expect D and ∇ to commute, because of the derivatives of v in the ∇_a(H_cd − γ_cd − ω_cd) term of Eq. (27). Note that the coordinate rate-of-strain tensor κ, defined by Eq. (23), does not appear in Eq. (27).

The ∇X term in Eq. (27) vanishes for the convective derivative of Section 5.1, since then H ≡ 0. For the directional derivative of Section 5.2.2, we have H_cd = γ_cd + ω_cd, so the commutation relation simplifies to

∇_a[D X^b] − D[∇_a X^b] = (γ^c_a + ω^c_a) ∇_c X^b + g^bc [R_cdae V^e + (1/2) S_cda] X^d.

The time-curvature satisfies the cyclic identity

S_abc + S_cab + S_bca = 0,    (30)

which corresponds to the first Bianchi identity of the Riemann tensor. The time-curvature does not appear to satisfy an analogue of the second Bianchi identity.

The property S_abc = −S_bac, together with the Bianchi identity (30), implies that S has n(n² − 1)/3 independent components, compared to the n²(n² − 1)/12 components of R, where n is the dimension of the space. Thus one-dimensional manifolds have vanishing S and R. For 1 ≤ n ≤ 3, S has more independent components than R; for n = 4, they both have 20. For n > 4, R has more independent components than S.

The time-curvature S vanishes for a time-independent metric and coordinates.
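The component counts quoted above are easy to check directly; a minimal sketch (the function names are ours, not the paper's):

```python
# Independent components of the time-curvature tensor S, n(n^2-1)/3,
# versus the Riemann tensor R, n^2(n^2-1)/12, as functions of dimension n.

def s_components(n):
    return n * (n**2 - 1) // 3

def r_components(n):
    return n**2 * (n**2 - 1) // 12

for n in range(1, 7):
    print(n, s_components(n), r_components(n))
```

For n = 4 the two formulas agree at 20 components, which is the crossover noted in the text: below four dimensions S has more independent components, above four R does.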
It also vanishes for a metric of the form g_ij(t, x) = β(t) h_ij(x), where h is a time-independent metric and x are the Eulerian coordinates. It follows from its tensorial nature that the time-curvature must then vanish in any time-dependent coordinates. In general, it is convenient to find S in Eulerian coordinates (where ∂x/∂t|_x = 0),

S_ijk := ∇_i (∂g_kj/∂t),    (31)

and then transform S_ijk to arbitrary time-dependent coordinates using the tensorial law.

7 Discussion

In this paper, we aimed to provide a systematic framework to handle complicated time-dependent metrics and coordinate systems on manifolds. The explicit form of the relevant tensors is often fairly involved, but the advantage is that they can be evaluated in time-independent Eulerian coordinates and then transformed to arbitrary coordinate systems using the usual tensorial transformation laws.

The covariance of the time derivatives is made explicit by using arbitrary time-dependent coordinates. The results for the Eulerian coordinates x^i are recovered by setting ∂x^i/∂t|_x = 0, and those for the Lagrangian coordinates a^q by setting v^q = 0.

The introduction of the time-curvature tensor allows us to treat the temporal dependence of the metric tensor in a manner analogous to its spatial dependence. For simple time-dependence the time-curvature vanishes, such as for the case of a time-independent metric multiplied by a time-dependent scalar.
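The vanishing of S for a conformally time-dependent metric can be checked symbolically. A minimal sketch with sympy, using S_ijk = ∇_i(∂g_kj/∂t) in fixed spatial coordinates; the choices β(t) = t² + 1 and the polar-like static metric h are illustrative assumptions, not taken from the paper:

```python
# Sketch (sympy): verify that S_ijk = cov_deriv_i(d g_kj / dt) vanishes
# for a metric of the form g_ij = beta(t) * h_ij(x), h time-independent.
import sympy as sp

t, x1, x2 = sp.symbols('t x1 x2')
coords = [x1, x2]
beta = t**2 + 1                     # assumed time factor (illustrative)
h = sp.diag(1, x1**2)               # assumed static metric (illustrative)
g = beta * h
ginv = g.inv()

def gamma(l, i, k):
    # Christoffel symbols of g (spatial derivatives only)
    return sp.Rational(1, 2) * sum(
        ginv[l, m] * (sp.diff(g[m, i], coords[k])
                      + sp.diff(g[m, k], coords[i])
                      - sp.diff(g[i, k], coords[m]))
        for m in range(2))

gdot = sp.diff(g, t)                # d g_kj / dt at fixed x

def S(i, j, k):
    # covariant derivative of the (0,2) tensor gdot
    expr = sp.diff(gdot[k, j], coords[i])
    expr -= sum(gamma(l, i, k) * gdot[l, j] for l in range(2))
    expr -= sum(gamma(l, i, j) * gdot[k, l] for l in range(2))
    return sp.simplify(expr)

assert all(S(i, j, k) == 0 for i in range(2) for j in range(2) for k in range(2))
print("S_ijk vanishes for g = beta(t) h(x)")
```

The computation works because the spatial Christoffel symbols of βh coincide with those of h, so ∇(∂g/∂t) = β̇ ∇h = 0.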
As for the (spatial) Riemann curvature tensor, the components of the time-curvature can be computed for a given metric, and then inserted whenever a temporal and a spatial derivative need to be commuted.

We have only addressed the kinematics of fluid motion. The dynamical equations relating the rate of change of quantities to the forces in play have not been discussed (see Refs. [9,6,8,7]), and depend on the specifics of the problem at hand. Nevertheless, covariant time derivatives provide a powerful framework in which to formulate such dynamical equations.

Acknowledgements

The author thanks Chris Wiggins for pointing out a useful reference, and Tom Yudichak for an illuminating discussion. This work was supported by an NSF/DOE Partnership in Basic Plasma Science grant, No. DE-FG02-97ER54441.

References

[1] J.-L. Thiffeault, Differential constraints in chaotic flows on curved manifolds, Physica D (2001), in submission. arXiv:nlin.CD/0105010.
[2] J.-L. Thiffeault, The one-dimensional nature of the advection–diffusion equation, Physical Review Letters (2001), in submission. arXiv:nlin.CD/0105026.
[3] C. W. Misner, K. S. Thorne, J. A. Wheeler, Gravitation, W. H. Freeman & Co., San Francisco, 1973.
[4] B. Schutz, Differential Geometry, Cambridge University Press, Cambridge, U.K., 1980.
[5] R. M. Wald, General Relativity, University of Chicago Press, Chicago, 1984.
[6] J. G. Oldroyd, On the formulation of rheological equations of state, Proc. R. Soc. Lond. A 200 (1950) 523–541.
[7] R. Aris, Vectors, Tensors, and the Basic Equations of Fluid Mechanics, Dover, New York, 1989.
[8] L. E. Scriven, Dynamics of a fluid interface: Equations of motion for Newtonian surface fluids, Chem. Eng. Sci. 12 (1960) 98–108.
[9] D. Jou, J. Casas-Vázquez, G. Lebon, Extended Irreversible Thermodynamics, 2nd Edition, Springer-Verlag, Berlin, 1996.
[10] R. B. Bird, R. C. Armstrong, O. Hassager, Dynamics of Polymeric Liquids, 2nd Edition, Vol. 1, John Wiley & Sons, New York, 1987.
[11] C. Truesdell, The Kinematics of Vorticity, Indiana University Press, Bloomington, 1954.
[12] S. Childress, A. D. Gilbert, Stretch, Twist, Fold: The Fast Dynamo, Springer-Verlag, Berlin, 1995.
[13] D. J. Tritton, Physical Fluid Dynamics, 2nd Edition, Oxford University Press, Oxford, U.K., 1988.

1996-Eigenvalues of regular Sturm-Liouville problems


2 Notation
Consider the differential equation

−(py′)′ + qy = λwy on (a′, b′), −∞ ≤ a′ < b′ ≤ ∞,

with λ ∈ R, where

p, q, w : (a′, b′) → R, 1/p, q, w ∈ L¹_loc(a′, b′), w > 0 a.e. on (a′, b′).

Let I = [a, b], a′ < a < b < b′, and consider the BC
In [9] we gave a different proof of the Dauge-Helffer theorem with substantially weaker hypotheses, and we obtained a similar result for coupled BC. Here we show that the eigenvalues of regular SL problems are differentiable functions of all the data: the endpoints and the boundary conditions, as well as the coefficients and the weight function, and we find expressions for their derivatives. Differentiability with respect to a coefficient p, q or the weight function w is in the sense of the Fréchet derivative in the Banach space L1(a, b). We maintain that L1(a, b), not L2(a, b), is the "natural" setting for the regular SL theory. This is because the condition 1/p, q, w ∈ L1(a, b) is necessary and sufficient for initial value problems to have unique solutions; see Everitt and Race [8] and [10]. Our proof is elementary, given the continuous dependence of the eigenvalues. The latter seems to be part of the folklore of mathematics, and so we provide only an outline of a proof. Besides its theoretical importance, the continuous dependence of the eigenvalues and the eigenfunctions on the data is fundamental from the numerical point of view. The major general-purpose codes for the numerical computation of the eigenvalues and eigenfunctions of SL problems (SLEIGN [5], the Fulton and Pruess code SLEDGE, the NAG library code [14], and SLEIGN2 [2], [3] and [4]) are based on it. As a consequence of our main result, Theorem 4.2, it follows that the convergence of approximations based on small changes of the data is at least of order o(h) as h → 0. In Section 2 we establish the notation; the continuity of the eigenvalues and eigenfunctions is discussed in Section 3, followed by our main result on the differentiability of the eigenvalues in Section 4.
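The differentiability of an eigenvalue with respect to the coefficients can be illustrated numerically. The sketch below (grid size, perturbation direction and tolerances are our own choices, not part of the paper) discretizes −y″ + qy = λy on (0, 1) with Dirichlet boundary conditions and checks that the lowest eigenvalue responds to a perturbation q → q + εh at first order by ε∫h u², the directional form of the Fréchet derivative with respect to q:

```python
# Sketch (numpy): first-order response of the lowest Dirichlet eigenvalue
# of -y'' + q y = lambda y on (0,1) to a perturbation q -> q + eps*h.
import numpy as np

N = 400
x = np.linspace(0, 1, N + 2)[1:-1]    # interior grid points
dx = 1.0 / (N + 1)

def lowest_eig(q):
    # tridiagonal finite-difference discretization of -y'' + q y
    main = 2.0 / dx**2 + q
    off = -1.0 / dx**2 * np.ones(N - 1)
    A = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)
    w, v = np.linalg.eigh(A)
    u = v[:, 0]
    u /= np.sqrt(np.sum(u**2) * dx)   # normalize so that integral of u^2 is 1
    return w[0], u

q0 = np.zeros(N)
lam0, u0 = lowest_eig(q0)
print(lam0, np.pi**2)                 # lowest eigenvalue of -y'' is pi^2

h = np.sin(np.pi * x)                 # perturbation direction (our choice)
eps = 1e-3
lam1, _ = lowest_eig(q0 + eps * h)
predicted = eps * np.sum(h * u0**2) * dx
print(lam1 - lam0, predicted)         # actual shift vs first-order prediction
```

Halving ε halves both the actual shift and the prediction, and their difference shrinks like ε², which is the numerical face of the o(h) convergence statement above.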

The low-energy nuclear density of states and the saddle point approximation


arXiv:nucl-th/0107074v1 30 Jul 2001

The low-energy nuclear density of states and the saddle point approximation

Sanjay K. Ghosh* and Byron K. Jennings†
TRIUMF, 4004 Wesbrook Mall, Vancouver, British Columbia, Canada V6T 2A3
(February 8, 2008)

The nuclear density of states plays an important role in nuclear reactions. At high energies, above a few MeV, the nuclear density of states is well described by a formula that depends on the smooth single-particle density of states at the Fermi surface, the nuclear shell correction and the pairing energy. In this paper we present an analysis of the low-energy behaviour of the nuclear density of states using the saddle point approximation and extensions to it. Furthermore, we prescribe a simple parabolic form for the excitation energy, in the low-energy limit, which may facilitate an easy computation of level densities.

21.10.-k, 21.10.Ma, 26.50.+x

I. INTRODUCTION

One of the important ingredients in the Hauser-Feshbach approach to the calculation of nuclear reaction rates of astrophysical interest is the nuclear density of states [1]. In fact, uncertainty in the nuclear density of states is a leading cause of errors [1] in these calculations. The study of nuclear level densities dates back to the 1950s, with work by Rosenzweig [2] and by Gilbert and Cameron [3,4]. The usual technique is to calculate the partition function and then invert the Laplace transform using the saddle point approximation. At energies sufficiently high for shell and pairing effects to be washed out, the density of states is given in terms of the single-particle density of states at the Fermi surface (and its derivatives), the shell correction energy and the pairing energy. Most statistical model calculations use the back-shifted Fermi gas description [4]. Monte Carlo shell model calculations [5] as well as combinatorial approaches [6] show excellent agreement with this phenomenological approach. At lower energies the results are more problematic and typically crude
extrapolations from the higher energy region are used. In this paper we study the nuclear level density, with an emphasis on the lower energy region, using a single-particle shell model. The dependences in the two regimes are rather different. In contrast to higher energies, where the density of states depends on the shell correction and the smooth single-particle density of states, in the lower energy regime the density of states depends on the separation of single-particle levels and their degeneracy. Moreover, at very low energy the saddle point approximation itself breaks down. We show that in this region the correction suggested by Grossjean and Feldmeier [7] gives dramatic improvements.

In the next section we review the saddle point approximation in the context of nuclear level density calculations. We use thermodynamic identities to rewrite the equations in simpler form compared to the usual ones [8]. Furthermore, we discuss possible ways to simplify the evaluation of level densities at low energies. A temperature-dependent parabolic equation for the excitation energy seems to be a good choice. The corrections suggested in [7] and the corresponding modifications to the equations are discussed in Section 3. Finally, in Section 4 we discuss our calculation and the results.

II. THE SADDLE POINT APPROXIMATION

The grand canonical partition function for two types of particles can be written as

e^Ω = Σ_{N′,Z′,E′} exp(α_N N′ + α_Z Z′ − βE′),    (2.1)

where the sum is over all nuclei with N′ neutrons, Z′ protons and over all energy eigenstates E′. Here τ = β^{−1} is the temperature and μ_{N(Z)} = α_{N(Z)}/β. (*Email: sanjay@triumf.ca. †Email: jennings@triumf.ca.) Equivalently,

e^Ω = Σ_{N′,Z′} ∫ dE′ ρ(E′, N′, Z′) exp(α_N N′ + α_Z Z′ − βE′),    (2.2)

where ρ(E′, N′, Z′) is the nuclear density of states. It represents the density of energy eigenvalues for the nucleus (N′, Z′) at the energy E′. The above equation also shows that the grand partition function can be considered a Laplace transform of the nuclear density of states. The inversion integral is

ρ(E′, N′, Z′) = (1/(2πi)³) ∫ dα_N ∫ dα_Z ∫ dβ e^S,    (2.3)

with S = Ω − α_N N′ − α_Z Z′ + βE′. The saddle point is determined by

dΩ/dβ = −E;  dΩ/dα_N = N;  dΩ/dα_Z = Z.    (2.4)

The path of the integration can be chosen to pass through this point. By expanding the exponent S in a Taylor series about the saddle point and retaining only the quadratic terms, the nuclear density of states in the saddle point approximation can be written as

ρ = e^S / [(2π)^{3/2} D^{1/2}],    (2.5)

where D is the determinant of the second derivatives of S with respect to β, α_N and α_Z,

D = det | d²S/dβ²  d²S/dβdα_N  d²S/dβdα_Z ;  d²S/dβdα_N  d²S/dα_N²  d²S/dα_N dα_Z ;  d²S/dβdα_Z  d²S/dα_N dα_Z  d²S/dα_Z² |.    (2.6)

To simplify the above determinant we change the independent variables to τ = 1/β, μ_N = τα_N and μ_Z = τα_Z, and change the dependent variable to Ω′ = τΩ = τS + μ_N N + μ_Z Z − E. In terms of the new variables the equations determining the saddle point are

dΩ′/dμ_N = −N;  dΩ′/dμ_Z = −Z;  dΩ′/dτ = −S,    (2.7)

and D can be written as a 3 × 3 determinant whose rows are the derivatives of S, N and Z with respect to τ, μ_N and μ_Z (Eq. (2.8)). In deriving this result we have used the fact that in a determinant the addition of a multiple of a row (column) to another row (column) does not change the value of the determinant. In the first row the derivatives are at constant μ_N and μ_Z; in the second at constant τ and μ_Z; and in the third at constant τ and μ_N. The variables which are held constant can be changed using standard thermodynamic identities (Eqs. (2.9)–(2.12)). This procedure can be repeated to yield

D = −τ⁵ (dS/dτ)|_{μ_N,μ_Z} (dN/dμ_N)|_{τ,Z} (dZ/dμ_Z)|_{τ,N}.    (2.13)

For a Fermi gas with a constant single-particle level density g, the entropy is S = 2√(aE), where a = π²g/6, and the temperature is τ = √(E/a). For a low-lying spectrum consisting of two levels with energies ε₁ < ε₂ and degeneracies g₁ and g₂, one finds instead a factor g₁g₂ exp[(ε₁ − ε₂)/(2τ)]/τ. This goes to zero exponentially fast as τ goes to zero. Note that in neither of the cases above is the shell correction involved.

The above discussion is useful since S is a function of the energy. Here again one may use a few tricks. It turns out to be easier to parameterize E as a function of τ. Since τ dS = dE, one can write

S = ∫₀^τ (1/τ′)(dE/dτ′) dτ′ + S(τ = 0).

The last term is the integration constant and is given once the degeneracy of the ground state is known. It contributes to the exponent but not to the denominator, where derivatives are taken. Next one needs E as a function of τ. For many systems there is a quite reliable approximation. For very low temperature, much less than the level spacing, the energy does not change significantly. However, above some critical temperature
, τ₀, it starts to increase rapidly. For temperatures near this region the energy can be parametrized quite simply by

E − E₀ = c (τ − τ₀)² θ(τ − τ₀).    (2.14)

We have checked this approximation using a simple shell model and found that it works quite well except if there is more than one level approximately equidistant from the Fermi surface. The parameters τ₀ and c depend on the level spacing and degeneracy near the Fermi surface. Again they do not depend on the shell correction.

Before being useful at very low energies, a shortcoming of the saddle point approximation must be overcome. It is well known that at low energies the saddle point approximation tends to diverge as the denominator goes to zero. In many cases this problem can be fixed by using a technique from Ref. [7], which handles the contribution to the nuclear density of states from the ground state delta function explicitly.

III. MODIFIED SADDLE POINT

In Ref. [7] Grossjean and Feldmeier have proposed a modification of the saddle point method to remove the divergences of the level density at the ground state. Introducing explicitly the ground state energy E_g(A) as the lower boundary, the density of states becomes

ω̃(E*, A) = ω(E* + E_g(A), A) − δ(E*) δ(A − A₀),    (3.1)

where E* = E − E_g is the excitation energy and A₀ is the mean particle number. The corresponding modified grand canonical potential is given by

Ω̃ = Ω + βE_g + ln(1 − Y),    (3.2)

where

Y = d₀ e^{α_N N + α_Z Z − βE_g − Ω},    (3.3)

d₀ being the ground state occupancy. The chemical potentials for neutrons and protons are given by μ_N = α_N/β and μ_Z = α_Z/β, respectively. The nuclear level density in the modified saddle point approximation becomes

ρ = e^{S̃} / [(2π)^{3/2} D̃^{1/2}],    (3.4)

with

D̃ = −τ⁵ (dS̃/dτ)|_{N,Z} (dÑ/dμ_N)|_{τ,Z} (dZ̃/dμ_Z)|_{τ,μ_N}.    (3.5)

The derivation of Eq. (3.5) is straightforward, as it depends only on the thermodynamic relations of the quantities involved and not on their explicit forms. The computation of the level density using Eq. (3.4) will depend on the relations between the usual and the modified thermodynamic quantities, as the usual quantities are directly related to the single-particle shell model states. The
modified saddle point conditions in terms of the usual thermodynamic potential become

dΩ/dβ|_{α₀,β₀} = −(E_g + Ẽ*),    (3.6)

where Ẽ* = E*(1 − Y) (see Eq. (3.3)). For α = α_{N(Z)}, A = N(Z). The derivatives of the entropy and of the particle numbers in the modified scheme are related to the usual ones through factors involving Y and powers of βE* (Eqs. (3.7) and (3.8)).

IV. DISCUSSION

The single-particle energies required for the evaluation of the different thermodynamic quantities are obtained from the Nilsson shell model. The values of the constants associated with l² and l·s are taken from Ref. [9]. Here it should be mentioned that the level densities are strongly dependent on the single-particle energy levels. Hence for astrophysical applications one should make a judicious choice of the model as well as the constants.

We first calculate the level densities for different nuclei, for both the usual and the modified saddle point approximations, using all the filled levels and an equal number of unfilled levels. It should be noted that for low excitation energies (of the order of the first excitation level or less) the inclusion of only the last filled and the first unfilled level, as described in Section 2, is sufficient for the evaluation of the level densities. A comparison of the level densities from Eq. (2.3) and Eq. (3.4) for the nuclei ³²S, ⁸⁸Sr and ²⁰⁸Pb is shown in Fig. 1, Fig. 2 and Fig. 3, respectively. The modified saddle point results are shown by curve (a) and the usual saddle point results by curve (c) in the above figures. As evident from the graphs, the usual saddle point does show a divergence at low energies, whereas the modified version goes to zero smoothly. This is because the entropy in the modified version goes to zero much faster, as it takes into account the unavailability of states below the ground state energy. Moreover, at low energies the differences are more pronounced for lighter nuclei like ³²S compared to ²⁰⁸Pb. The differences can be attributed to the fact that in the modified prescription the thermodynamic potential gets an additive contribution compared to the usual one, as
shown by Eq. (3.2).

As discussed in Section 2, we try to fit the excitation energies with the parabolic form given in Eq. (2.14). These fits for different nuclei are shown in Fig. 4, Fig. 5 and Fig. 6, where we have plotted the variation of the excitation energy with temperature. For low energies, the fitted values are in good agreement with the exact values. Next we calculate the entropy and its derivative, using the steps given in Section 2, from this fitted expression for the excitation energy, the derivatives of N and Z being the same as in the preceding section. Using these we calculate the level densities for different nuclei. A comparison of the level densities from the full modified saddle point calculation (curve (a)) and the one using fitted excitation energies (curve (b)) is shown in Fig. 1, Fig. 2 and Fig. 3. It is evident from the graphs that the fitted excitation energy gives better agreement with the exact calculation for heavier nuclei.

To conclude, we have shown that the modification of the saddle point approximation is necessary for the correct evaluation of the level densities at lower energies. One can simplify the equations substantially using thermodynamic identities. Furthermore, a parabolic prescription for the excitation energy may be useful for easier computation of the level densities. More work in this direction is needed to make the methodology useful for direct application to astrophysical reactions.

FIG. 1. Level density for ³²S: (a) modified saddle point, (b) fitted excitation energy as in Fig. 4, and (c) usual saddle point. The vertical dashed line gives the position of the 1st excitation level.

FIG. 2. Level density for ⁸⁸Sr: (a) modified saddle point, (b) fitted excitation energy as in Fig. 5, and (c) usual saddle point. The vertical dashed line gives the position of the 1st excitation level.

FIG. 3. Level density for ²⁰⁸Pb: (a) modified saddle point, (b) fitted excitation energy as in Fig. 6, and (c) usual saddle point. The vertical dashed line gives the position of the 1st excitation level.

FIG. 4. Excitation energy for ³²S: (a) actual excitation energy and (b) the fitted form Eq. (2.14) with E₀ = 5.62 and τ₀ = 0.11.

FIG. 5. Excitation energy for ⁸⁸Sr: (a) actual excitation energy and (b) the fitted form Eq. (2.14) with E₀ = 15.71 and τ₀ = 0.14.

FIG. 6. Excitation energy for ²⁰⁸Pb: (a) actual excitation energy and (b) the fitted form Eq. (2.14) with E₀ = 40.0 and τ₀ = 0.20.
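The entropy implied by the parabolic parametrization, Eq. (2.14), follows from S = ∫(1/τ′)(dE/dτ′) dτ′. A minimal numerical sketch; the closed form 2c[(τ − τ₀) − τ₀ ln(τ/τ₀)] is our own integration of Eq. (2.14), and the values of c and τ₀ are illustrative, not fitted to any nucleus:

```python
# Sketch: entropy S(tau) implied by E - E0 = c*(tau - tau0)^2 for tau > tau0
# (Eq. (2.14)), via S = integral from tau0 to tau of (1/t) dE/dt dt.
# The closed form and the values of c, tau0 are our own illustration.
import math

c, tau0 = 40.0, 0.2

def dE_dtau(tau):
    return 2.0 * c * (tau - tau0) if tau > tau0 else 0.0

def S_numeric(tau, n=100000):
    # midpoint-rule quadrature of (1/t) dE/dt from tau0 to tau
    dt = (tau - tau0) / n
    return sum(dE_dtau(tau0 + (k + 0.5) * dt) / (tau0 + (k + 0.5) * dt) * dt
               for k in range(n))

def S_closed(tau):
    return 2.0 * c * ((tau - tau0) - tau0 * math.log(tau / tau0))

tau = 0.8
print(S_numeric(tau), S_closed(tau))   # the two agree
```

Because dE/dτ vanishes below τ₀, the entropy (beyond the ground-state constant S(τ = 0)) only starts to grow above the critical temperature, which is the qualitative behaviour the parametrization is designed to capture.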

Minimal_Critical_Sets_of_Refined_Inertias_for_Irre


Advances in Linear Algebra & Matrix Theory, 2013, 3, 7-10
doi:10.4236/alamt.2013.32002 Published Online June 2013 (/journal/alamt)

Minimal Critical Sets of Refined Inertias for Irreducible Sign Patterns of Order 2

Ber-Lin Yu
Faculty of Mathematics and Physics, Huaiyin Institute of Technology, Huai'an, China
Received May 1, 2013; revised June 1, 2013; accepted June 8, 2013

Copyright © 2013 Ber-Lin Yu. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

ABSTRACT

Let S be a nonempty, proper subset of the set of all refined inertias. Then S is called a critical set of refined inertias for irreducible sign patterns of order n if S ⊆ ri(Â) is sufficient for any sign pattern Â to be refined inertially arbitrary. If no proper subset of S is a critical set of refined inertias, then S is a minimal critical set of refined inertias for sign patterns of order n. In this paper, all minimal critical sets of refined inertias for irreducible sign patterns of order 2 are identified. As a by-product, a new approach is presented to identify all minimal critical sets of inertias for irreducible sign patterns of order 2.

Keywords: Associated Digraph; Inertially Arbitrary Sign Pattern; Refined Inertia; Critical Set of Refined Inertias

1. Introduction

A sign pattern is an n × n matrix Â = [α_ij] with entries from the set {+, −, 0}, where + (respectively, −) denotes a positive (respectively, negative) real number; see, e.g., [1]. The set of all n × n real matrices with the same sign pattern as Â is the qualitative class

Q(Â) = {A = [a_ij] ∈ M_n(R) : sign(a_ij) = α_ij}.

A subpattern B̂ of an n × n sign pattern Â is a sign pattern obtained by replacing some (possibly empty) subset of the nonzero entries of Â with zeros. If B̂ is a subpattern of Â, then Â is a superpattern of B̂.
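The definitions of the sign pattern of a matrix and of the qualitative class Q(Â) can be sketched in a few lines of code; the function names and the test matrix are ours, not the paper's:

```python
# Sketch: the sign pattern of a real matrix, and membership in the
# qualitative class Q(A_hat).  Function names are our own.
import numpy as np

def sign_pattern(A):
    return [['+' if a > 0 else '-' if a < 0 else '0' for a in row] for row in A]

def in_qualitative_class(A, pattern):
    return sign_pattern(A) == pattern

A_hat = [['+', '+'], ['-', '-']]            # an illustrative 2x2 sign pattern
A = np.array([[2.0, 0.5], [-1.0, -3.0]])    # one realization: A lies in Q(A_hat)
print(in_qualitative_class(A, A_hat))
```

Any positive rescaling of the entries of A stays inside Q(A_hat), which is exactly why sign-pattern results are statements about whole qualitative classes rather than individual matrices.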
Two square zero-nonzero patterns Â and B̂ are equivalent if one can be obtained from the other by any combination of transposition and permutation similarity. A sign pattern Â is reducible if it is permutation similar to a pattern of the form

[ Â₁₁  Â₁₂ ]
[ 0    Â₂₂ ],

where Â₁₁ and Â₂₂ are square and non-vacuous. A pattern is irreducible if it is not reducible.

The inertia of a matrix A is the ordered triple i(A) = (n₊, n₋, n₀), where n₊, n₋ and n₀ are the numbers of eigenvalues of A with positive, negative and zero real part, respectively; see, e.g., [2]. The refined inertia of A is the ordered quadruple ri(A) = (n₊, n₋, n_z, 2n_p) of nonnegative integers that sum to n, where (n₊, n₋, n_z + 2n_p) is the inertia of A, n_z is the number of 0 eigenvalues of A, and 2n_p is the number of nonzero pure imaginary eigenvalues of A; see, e.g., [3]. The inertia (respectively, refined inertia) of a sign pattern Â is i(Â) = {i(A) : A ∈ Q(Â)} (respectively, ri(Â) = {ri(A) : A ∈ Q(Â)}). An n × n sign pattern Â is an inertially arbitrary pattern (IAP) if, given any ordered triple (n₊, n₋, n₀) of nonnegative integers with n₊ + n₋ + n₀ = n, there exists a real matrix A ∈ Q(Â) such that i(A) = (n₊, n₋, n₀); see, e.g., [4,5] and the references therein. Similarly, Â is a refined inertially arbitrary pattern (rIAP) if, given any ordered quadruple (n₊, n₋, n_z, 2n_p) of nonnegative integers that sum to n, there exists a real matrix A ∈ Q(Â) such that ri(A) = (n₊, n₋, n_z, 2n_p); see, e.g., [3].

Let S be a nonempty, proper subset of the set of all inertias for an n × n zero-nonzero (or sign) pattern Â. If S ⊆ i(Â) is sufficient for Â to be inertially arbitrary, then S is said to be a critical set of inertias for zero-nonzero (or sign) patterns of order n, and if no proper subset of S is a critical set of inertias, then S is said to be a minimal critical set of inertias for zero-nonzero (or sign) patterns of order n; see, e.g., [6].
Critical sets of refined inertias for irreducible zero-nonzero patterns are defined in [7]. Similarly, we introduce the concept of a critical set of refined inertias for sign patterns. Let S be a nonempty, proper subset of the set of all refined inertias. Then S is called a critical set of refined inertias for sign patterns of order n if S ⊆ ri(Â) is sufficient for any sign pattern Â to be refined inertially arbitrary. If no proper subset of S is a critical set of refined inertias, then S is a minimal critical set of refined inertias for irreducible sign patterns of order n. We note that all minimal critical sets of inertias for irreducible sign patterns of order 2 have been identified in [6], but identifying all minimal critical sets of inertias for irreducible zero-nonzero (or sign) patterns of order n ≥ 3 has been posed as an open question in [6]. Also open is the minimum cardinality of such a set. The concept of critical sets of refined inertias for sign patterns is introduced here for the first time. In this work, we concentrate on the minimal critical sets of refined inertias for irreducible 2 × 2 sign patterns.

Our work is organized as follows. Section 2 describes some preliminary results on the refined inertias of sign patterns. The minimal critical sets of refined inertias for irreducible sign patterns of order 2 are identified in Section 3. In Section 4, as a by-product, an alternative proof is given to identify all minimal critical sets of inertias for irreducible 2 × 2 sign patterns. Some concluding remarks are given in Section 5.

2. Preliminaries

Recall that a sign pattern Â = [α_ij] has an associated digraph D(Â) with vertex set {1, 2, …, n} and, for all i and j, a positive (resp., negative) arc from i to j if and only if α_ij = + (resp., α_ij = −). A (directed) simple cycle (or a k-cycle) γ of length k is a sequence of k arcs (i₁, i₂), (i₂, i₃), …, (i_k, i₁) such that the vertices i₁, …, i_k are distinct; see, e.g., [1].
The sign, positive or negative, of a simple cycle in a sign pattern Â is the product of the entries in the cycle, following the obvious rules that multiplication is commutative and associative, and (+)(+) = +, (+)(−) = −.

Lemma 2.1. Let Â be an irreducible sign pattern of order 2. Then the following are equivalent:
(1) Â is spectrally arbitrary;
(2) Â is inertially arbitrary;
(3) Up to equivalence, Â = [+ +; − −];
(4) The associated digraph of Â, D(Â), has two loops of opposite sign and a negative 2-cycle.

Proof. The equivalences (1) ⇔ (2) ⇔ (3) follow from Proposition 3 in [6]. The equivalence (3) ⇔ (4) can be verified directly.

It is known that there are seven refined inertias for 2 × 2 sign patterns. To identify all minimal critical sets of refined inertias for irreducible sign patterns of order 2, the following three sign patterns need to be investigated.

Lemma 2.2. Let M = [− +; + −]. Then M allows only the refined inertias (0, 1, 1, 0), (0, 2, 0, 0) and (1, 1, 0, 0).

Proof. Since M requires every realization to have a negative trace, the refined inertias (2, 0, 0, 0), (0, 0, 2, 0), (0, 0, 0, 2) and (1, 0, 1, 0) cannot be allowed by M. For the remaining refined inertias, consider the realizations [−1 1; 1 −1], [−2 1; 1 −1] and [−1 2; 2 −1] of M, with refined inertias (0, 1, 1, 0), (0, 2, 0, 0) and (1, 1, 0, 0), respectively. It follows that M allows only the refined inertias (0, 1, 1, 0), (0, 2, 0, 0) and (1, 1, 0, 0).

Lemma 2.3. Let N = [+ +; + +]. Then N allows all refined inertias except (0, 1, 1, 0), (0, 2, 0, 0), (0, 0, 2, 0) and (0, 0, 0, 2).

Proof. Since N requires every realization to have a positive trace, the refined inertias (0, 1, 1, 0), (0, 2, 0, 0), (0, 0, 2, 0) and (0, 0, 0, 2) cannot be allowed by N. Consider the realizations [1 1; 1 2], [1 1; 1 1] and [1 1; 4 1] of N, with refined inertias (2, 0, 0, 0), (1, 0, 1, 0) and (1, 1, 0, 0), respectively. It follows that N allows all refined inertias except (0, 1, 1, 0), (0, 2, 0, 0), (0, 0, 2, 0) and (0, 0, 0, 2).

Lemma 2.4. Let P = [0 +; − 0]. Then P allows the refined inertia (0, 0, 0, 2).

Proof. Lemma 2.4 follows from the fact that the realization [0 1; −1 0] of P has (0, 0, 0, 2) as its refined inertia.

3. Minimal Critical Sets of Refined Inertias for Irreducible Sign Patterns of Order 2

We are now ready to identify all minimal critical sets of refined inertias for irreducible sign patterns of order 2.

Theorem 3.1. The set {(0, 0, 2, 0)} is the only minimal critical set with a single refined inertia for irreducible 2 × 2 sign patterns.

Proof. Lemma 2.2 indicates that {(0, 2, 0, 0)}, {(1, 1, 0, 0)} and {(0, 1, 1, 0)} cannot be minimal critical sets of refined inertias. Lemma 2.3 indicates that {(2, 0, 0, 0)} and {(1, 0, 1, 0)} cannot be minimal critical sets of refined inertias. Lemma 2.4 indicates that {(0, 0, 0, 2)} cannot be a minimal critical set of refined inertias. So it suffices to show that the set {(0, 0, 2, 0)} is a critical set of refined inertias.

If {(0, 0, 2, 0)} is allowed by an arbitrary irreducible sign pattern Â of order 2, then all the main diagonal entries of Â must be nonzero. Since Â allows a realization with a zero trace, the two diagonal entries of Â are of opposite sign; that is to say, the associated digraph D(Â) has a positive loop and a negative loop. Since Â allows a realization with zero determinant, D(Â) has a negative 2-cycle. It follows from Lemma 2.1 that Â is refined inertially arbitrary.

Theorem 3.2. The refined inertia sets {(0, 0, 0, 2), (1, 0, 1, 0)}, {(0, 0, 0, 2), (0, 1, 1, 0)}, {(0, 0, 0, 2), (2, 0, 0, 0)}, {(0, 0, 0, 2), (0, 2, 0, 0)}, {(0, 0, 0, 2), (1, 1, 0, 0)}, {(1, 0, 1, 0), (0, 1, 1, 0)}, {(1, 0, 1, 0), (0, 2, 0, 0)}, {(2, 0, 0, 0), (0, 1, 1, 0)} and {(2, 0, 0, 0), (0, 2, 0, 0)} are minimal critical sets of refined inertias for irreducible sign patterns of order 2.

Proof.
Let Â be an arbitrary irreducible sign pattern of order 2. If {(0, 0, 0, 2), (1, 0, 1, 0)} ⊆ ri(Â), then Â allows a realization with a positive trace and a realization with a zero trace. It follows that D(Â) has a positive loop and a negative loop. Since Â allows a realization with zero determinant, D(Â) has a negative 2-cycle. By Lemma 2.1, Â is refined inertially arbitrary, and {(0, 0, 0, 2), (1, 0, 1, 0)} is a minimal critical set of refined inertias for irreducible 2 × 2 sign patterns. Similarly, we can show that the refined inertia sets {(0, 0, 0, 2), (0, 1, 1, 0)}, {(0, 0, 0, 2), (2, 0, 0, 0)}, {(0, 0, 0, 2), (0, 2, 0, 0)}, {(1, 0, 1, 0), (0, 1, 1, 0)}, {(1, 0, 1, 0), (0, 2, 0, 0)}, {(0, 1, 1, 0), (2, 0, 0, 0)} and {(2, 0, 0, 0), (0, 2, 0, 0)} are minimal critical sets of refined inertias for irreducible 2 × 2 sign patterns.

For the refined inertia set {(0, 0, 0, 2), (1, 1, 0, 0)} ⊆ ri(Â), we claim that the diagonal entries of Â must be nonzero. In fact, assume that there exists at least one zero diagonal entry of Â. Then the determinant of every realization of Â has the same nonzero sign, so Â requires nonsingularity. This contradicts the fact that Â allows two realizations with a positive and a negative determinant, respectively. So the diagonal entries of Â must be nonzero. The fact that the diagonal entries of Â are of opposite sign follows from the fact that Â allows a realization with a zero trace, and D(Â) has a negative simple cycle of length 2 because (0, 0, 0, 2) ∈ ri(Â). By Lemma 2.1, Â is refined inertially arbitrary, so {(0, 0, 0, 2), (1, 1, 0, 0)} is a minimal critical set of refined inertias.

Next we identify all minimal critical sets of refined inertias for irreducible 2 × 2 sign patterns.

Theorem 3.3. The sets {(0, 0, 2, 0)}, {(0, 0, 0, 2), (1, 0, 1, 0)}, {(0, 0, 0, 2), (0, 1, 1, 0)}, {(0, 0, 0, 2), (2, 0, 0, 0)}, {(0, 0, 0, 2), (0, 2, 0, 0)}, {(0, 0, 0, 2), (1, 1, 0, 0)}, {(1, 0, 1, 0), (0, 1, 1, 0)}, {(1, 0, 1, 0), (0, 2, 0, 0)}, {(0, 1, 1, 0), (2, 0, 0, 0)} and {(2, 0, 0, 0), (0, 2, 0, 0)} are the only minimal critical sets of refined inertias for irreducible 2 × 2 sign patterns.

Proof.
By Theorems 3.1 and 3.2, the refined inertia sets stated in Theorem 3.3 are minimal critical sets of refined inertias for irreducible sign patterns of order 2. To show that there exist no other minimal critical sets of refined inertias, it suffices to show that the remaining six refined inertia sets with cardinality 2, namely {(1, 0, 1, 0), (2, 0, 0, 0)}, {(1, 0, 1, 0), (1, 1, 0, 0)}, {(2, 0, 0, 0), (1, 1, 0, 0)}, {(0, 1, 1, 0), (0, 2, 0, 0)}, {(0, 1, 1, 0), (1, 1, 0, 0)} and {(1, 1, 0, 0), (0, 2, 0, 0)}, and the two refined inertia sets with cardinality 3, {(1, 0, 1, 0), (2, 0, 0, 0), (1, 1, 0, 0)} and {(0, 1, 1, 0), (0, 2, 0, 0), (1, 1, 0, 0)}, are not critical sets of refined inertias. By Lemma 2.3, {(1, 0, 1, 0), (2, 0, 0, 0)}, {(1, 0, 1, 0), (1, 1, 0, 0)}, {(2, 0, 0, 0), (1, 1, 0, 0)} and {(1, 0, 1, 0), (2, 0, 0, 0), (1, 1, 0, 0)} are not critical sets of refined inertias. By Lemma 2.2, {(0, 1, 1, 0), (0, 2, 0, 0)}, {(0, 1, 1, 0), (1, 1, 0, 0)}, {(1, 1, 0, 0), (0, 2, 0, 0)} and {(0, 1, 1, 0), (0, 2, 0, 0), (1, 1, 0, 0)} are not critical sets of refined inertias.

The following theorem follows directly from Theorem 3.3.

Theorem 3.4. Let Â be an irreducible sign pattern of order 2. Then the following are equivalent:
1) Â is refined inertially arbitrary;
2) Â allows (0, 0, 2, 0);
3) Â allows (0, 0, 0, 2) and (1, 0, 1, 0);
4) Â allows (0, 0, 0, 2) and (0, 1, 1, 0);
5) Â allows (0, 0, 0, 2) and (2, 0, 0, 0);
6) Â allows (0, 0, 0, 2) and (0, 2, 0, 0);
7) Â allows (0, 0, 0, 2) and (1, 1, 0, 0);
8) Â allows (1, 0, 1, 0) and (0, 1, 1, 0);
9) Â allows (1, 0, 1, 0) and (0, 2, 0, 0);
10) Â allows (0, 1, 1, 0) and (2, 0, 0, 0);
11) Â allows (2, 0, 0, 0) and (0, 2, 0, 0).

4. Minimal Critical Sets of Inertias for Irreducible Sign Patterns of Order 2

In [6], all minimal critical sets of inertias for irreducible 2 × 2 sign patterns, which are restated here as Theorem 4.1, have been identified.
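The proof of Theorem 4.1 below converts statements about inertias into statements about refined inertias. The conversion in the other direction is just a merge of the last two entries of the refined inertia; a small helper (our own naming, not from the paper) makes that correspondence concrete.

```python
def inertia_from_refined(ri):
    """Collapse a refined inertia (n+, n-, nz, 2np) to the inertia
    (n+, n-, n0), where n0 counts all eigenvalues with zero real part."""
    n_pos, n_neg, nz, two_np = ri
    return (n_pos, n_neg, nz + two_np)

# The inertia (0, 0, 2) arises from exactly two refined inertias:
assert inertia_from_refined((0, 0, 2, 0)) == (0, 0, 2)
assert inertia_from_refined((0, 0, 0, 2)) == (0, 0, 2)
# whereas, e.g., the inertia (1, 0, 1) comes only from (1, 0, 1, 0),
# since pure-imaginary eigenvalues of a real matrix come in pairs:
assert inertia_from_refined((1, 0, 1, 0)) == (1, 0, 1)
```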
In this section, we present an alternative proof in terms of critical sets of refined inertias, as a by-product.

Theorem 4.1. The sets {(2, 0, 0), (0, 0, 2)}, {(0, 2, 0), (0, 0, 2)}, {(2, 0, 0), (0, 1, 1)}, {(0, 2, 0), (1, 0, 1)}, {(1, 1, 0), (0, 0, 2)}, {(1, 0, 1), (0, 0, 2)}, {(0, 1, 1), (0, 0, 2)}, {(2, 0, 0), (0, 2, 0)} and {(1, 0, 1), (0, 1, 1)} are the only minimal critical sets of inertias for irreducible 2 × 2 sign patterns.

Proof. Note that an irreducible 2 × 2 sign pattern:
allows {(0, 0, 2), (2, 0, 0)} if and only if it allows {(2, 0, 0, 0), (0, 0, 2, 0)} or {(2, 0, 0, 0), (0, 0, 0, 2)};
allows {(0, 0, 2), (0, 2, 0)} if and only if it allows {(0, 2, 0, 0), (0, 0, 2, 0)} or {(0, 2, 0, 0), (0, 0, 0, 2)};
allows {(2, 0, 0), (0, 1, 1)} if and only if it allows {(2, 0, 0, 0), (0, 1, 1, 0)};
allows {(0, 2, 0), (1, 0, 1)} if and only if it allows {(0, 2, 0, 0), (1, 0, 1, 0)};
allows {(1, 1, 0), (0, 0, 2)} if and only if it allows {(0, 0, 0, 2), (1, 1, 0, 0)} or {(0, 0, 2, 0), (1, 1, 0, 0)};
allows {(1, 0, 1), (0, 0, 2)} if and only if it allows {(0, 0, 0, 2), (1, 0, 1, 0)} or {(0, 0, 2, 0), (1, 0, 1, 0)};
allows {(0, 1, 1), (0, 0, 2)} if and only if it allows {(0, 0, 0, 2), (0, 1, 1, 0)} or {(0, 0, 2, 0), (0, 1, 1, 0)};
allows {(2, 0, 0), (0, 2, 0)} if and only if it allows {(2, 0, 0, 0), (0, 2, 0, 0)};
allows {(1, 0, 1), (0, 1, 1)} if and only if it allows {(0, 1, 1, 0), (1, 0, 1, 0)}.

Since all of the refined inertia sets above are critical sets of refined inertias (not necessarily minimal), the above sets of inertias are critical sets, and indeed are also easily seen to be minimal (since at least one of the corresponding sets of refined inertias is minimal in each case). It follows that the above sets of inertias are the only minimal critical sets of inertias because the corresponding sets of refined inertias are the only minimal critical sets of refined inertias.

5.
Concluding Remarks

We have identified all minimal critical sets of refined inertias for irreducible sign patterns of order 2. As a by-product, all minimal critical sets of inertias for irreducible sign patterns of order 2 have also been identified via a new proof. In a follow-up paper, we will consider other cases, e.g., n = 3, though the identification of all critical sets of inertias for irreducible sign patterns of order n ≥ 3 has been posed as an open question in [6].

6. Acknowledgements

The authors would like to express their great gratitude to the referees and the editor for their constructive comments and suggestions that led to the enhancement of this paper. This research was supported in part by the Sci. & Tech. Research Fund of Huaiyin Institute of Technology (HGB1111), the National Natural Science Foundation of China (11201168) and the Natural Science Foundation of the Higher Education Institutions of Jiangsu Province (Grant No. 12KJB110001).

REFERENCES

[1] F. Hall and Z. Li, "Sign Pattern Matrices," In: L. Hogben, Ed., Handbook of Linear Algebra, Chapman & Hall/CRC Press, Boca Raton, 2007.
[2] R. A. Horn and C. R. Johnson, "Matrix Analysis," Cambridge University Press, New York, 1985. doi:10.1017/CBO9780511810817
[3] L. Deaett, D. D. Olesky and P. van den Driessche, "Refined Inertially and Spectrally Arbitrary Zero-Nonzero Patterns," The Electronic Journal of Linear Algebra, Vol. 20, 2010, pp. 449-467.
[4] M. S. Cavers and K. N. Vander Meulen, "Spectrally and Inertially Arbitrary Sign Patterns," Linear Algebra and Its Applications, Vol. 394, No. 1, 2005, pp. 53-72. doi:10.1016/j.laa.2004.06.003
[5] M. S. Cavers, K. N. Vander Meulen and L. Vanderspek, "Sparse Inertially Arbitrary Patterns," Linear Algebra and Its Applications, Vol. 431, No. 11, 2009, pp. 2024-2034. doi:10.1016/j.laa.2009.06.040
[6] I. J. Kim, D. D. Olesky and P. van den Driessche, "Critical Sets of Inertias for Matrix Patterns," Linear and Multilinear Algebra, Vol. 57, No. 3, 2009, pp. 293-306.
doi:10.1080/03081080701616672
[7] B. L. Yu, T. Z. Huang, J. Luo and H. B. Hua, "Critical Sets of Refined Inertias for Irreducible Zero-Nonzero Patterns of Orders 2 and 3," Linear Algebra and Its Applications, Vol. 437, No. 2, 2012, pp. 490-498. doi:10.1016/j.laa.2012.03.007
