sharp lower bounds of the least eigenvalue of planar graphs

合集下载

微分流形上Laplace型算子的主特征值估计

1.2. 特征值估计问题的发展
对 Laplace 算子的特征值进行估计时，由于边界条件的不同主要分为 Dirichlet 边界问题与 Neumann 边界问题： Dirichlet 边界问题：
−λ u , ∆u = = u 0,
Neumann 边界问题：
on M on ∂M on M on ∂M
Open Access
1. 引言
特征值问题的主要目标之一是通过尽可能明确的几何量，例如等周常数、流形的体积、流形的直径以及有关的曲率条件对特征值的上下界进行估计[1]。Laplace 算子为最重要的微分算子之一，其特征值估计对几何、分析及物理等都有极其重要的作用。丘成桐(Yau)等著名数学家对微分算子的特征值估计问题做出了许多重要的贡献，直到现在特征值估计依然是流形分析上的重要问题，在许多数学家的不懈努力下，特征值估计问题迅速发展并拓展出许多新的更加精确的结果。本节将首先介绍微分流形上特征值估计问题的研究背景与研究意义，然后介绍特征值估计问题的基本发展。
1.1. 微分流形上特征值估计问题的研究背景与意义
爱因斯坦广义相对论需要用弯曲空间来描述物理世界，黎曼流形为描述解释广义相对论提供了工具，因此研究黎曼流形上的特殊算子有助于解决广义相对论及爱因斯坦场方程等问题。 Laplace 算子为微分流形上最重要的一类微分算子，是因为很多重要的非线性算子在线性化之后就是某个黎曼度量的 Laplace 算子，因此 Laplace 算子对微分流形的谱(即 Laplace 算子的特征值全体) [2]的研究具有重要的意义。丘成桐在 2000 年发表的《几何与分析回顾》中提出了十个有待解决的问题，其中第二个问题就是：理解一个完备流形的 Laplace 算子的谱。微分流形上特征值问题的研究始于 20 世纪 60 年代，经过多年的发展获得了丰富有效的成果[1] [3] [4]，已经成为流形分析的重要研究课题之一，谱几何也成为大范围几何分析的一个重要分支。 20 世纪，微分几何与物理学及数学中的分析数学、代数几何、拓扑学等相互影响相互促进，得到了迅猛发展。分析方法的引入更是对微分几何的发展产生了深远的影响，促进了许多著名问题的解决。一个微分流形的全体特征值能够反映出很多几何或拓扑信息？众所周知：一个等距的流形必然是等谱的。但是等谱的流形是否必然等距呢？1911 年，Weyl 证明了平面区域的面积可以由谱来决定；1964 年，著名数学家 Milnor 构造出等谱而不等距的 16 维平坦环面；1966 年，Kac 提出一个问题“你能听出鼓的形状吗”化为数学问题即两个平面等谱区域是否等距同构？并证明答案是否定的，即等谱的流形不一定等

inequalities for graph eigenvalues

1
Introduction
We consider the Laplacian and eigenvalues of graphs and induced subgraphs. Although an induceas a graph in its own right, it is natural to consider an induced subgraph S as having a boundary (formed by edges joining vertices in S and vertices not in S but in the “ host ” graph). The host graph then can be regarded as a special case of a subgraph with no boundary. This paper consists of three parts. In the ﬁrst part (Section 2-5), we give deﬁnitions and describe basic properties for the Laplacian of graphs. We introduce the Neumann eigenvalues for induced subgraphs and the heat kernel for graphs and induced subgraphs. Then we establish the following lower bound for the Neumann eigenvalues of induced subgraphs. 1
Abstract For an induced subgraph S of a graph, we show that its Neumann eigenvalue λS can be lower-bounded by using the heat kernel Ht (x, y ) of the subgraph. Namely, √ 1 Ht (x, y ) dx λS ≥ inf y ∈S 2t dy x ∈S where dx denotes the degree of the vertex x. In particular, we derive lower bounds of eigenvalues for convex subgraphs which consist of lattice points in an d-dimensional Riemannian manifolds M with convex boundary. The techniques involve both the (discrete) heat kernels of graphs and improved estimates of the (continuous) heat kernels of Riemannian manifolds. We prove eigenvalue lower bounds for convex subgraphs of the form cǫ2 /(dD(M ))2 where ǫ denotes the distance between two closest lattice points, D(M ) denotes the diameter of the manifold M and c is a constant (independent of the dimension d and the number of vertices in S , but depending on the how “dense” the lattice points are). This eigenvalue bound is useful for bounding the rates of convergence for various random walk problems. Since many enumeration problems can be approximated by considering random walks in convex subgraphs of some appropriate host graph, the eigenvalue inequalities here have many applications.

A sharp Trudinger- Moser type inequality

u
D ≤1
(eαu − 1)dx = c(Ω) < +∞ for α ≤ 4π ,
Ω
2
2
The inequality is optimal: for any growth eαu with α > 4π the corresponding supremum is +∞. The supremum (1.1) becomes inﬁnite for domains Ω with |Ω| = ∞, and therefore the Trudinger-Moser inequality is not available for unbounded domains. Related inequalities for unbounded domains have been 2 proposed by Cao [5] and Tanaka [2], however they assume a growth eαu with α < 4π , i.e. with subcritical growth. In this paper we show that replacing the Dirichlet norm u 1 Sobolev norm on H0 (Ω), namely (1.2) u
un
D ≤1
(e4πun − 1)dx = cc−lim
B1 un
2
S ≤1
(e4πun − 1)dx = e π .
R2
2
In the ﬁnal result of the paper we prove Theorem 1.3 For any ball Ω = BR (0) and for Ω = R2 holds (1.9) sup

On the parabolic kernel of the Schrodinger operator

(0.2)
where q is a function on M alone. We shall point out that for the heat equation (q=0), upper estimates for the heat kernel were obtained in [7] and [5]. However the estimate which we obtain is so far the sharpest, especially for large time. When the Ricci curvature is nonnegative the sharpness is apparent, since a comparable lower bound is also obtained in w4. A lower bound for the fundamental solution of (0.2) is also derived for some special situations. Applications of these estimates for the heat kernel are discussed in w5. A generalization of Widder's [25] uniqueness theorem for positive solutions of the heat equation is proved (2) (Theorem 5.1). In fact, the condition on the curvature is best possible due to the counter-example of Azencott [2]. We also point out that generalizations of Widder's theorem to general elliptic operators in R n were derived in [21], [1 l] and [l]. When M has nonnegative Ricci curvature, sharp upper and lower bounds of Green's function are derived. This can also be viewed as a necessary and sufficient condition for the existence of Green's function which was studied in [23]. In fact, in [24], our estimates on Green's function were proved for nonnegatively Ricci curved manifolds with pole and with nonnegative radial sectional curvatures. In [13], Gromov proved lower bounds for all the eigenvalues of the Laplacian on a compact nonnegatively Ricci curved manifold without boundary. We generalized these estimates to allow the manifold to have convex boundaries with either Dirichlet or Neumann boundary conditions. These lower bounds can also be viewed as a generalization of the lower bound for the first eigenvalue obtained in [16]. Another application is to derive an upper bound of the first Betti number, bi, on a compact manifold in terms of its dimension, a lower bound of the Ricci curvature, and an upper bound of the diameter. The manifold is allowed to have convex boundaries, in which case bl can be taken to be the dimension of either HI(M) of HI(M, aM). It was proved in [14] that if M has no boundary, then bk can be estimated from above by the

(0,1)矩阵矩阵积和式的上下界

n
PerA ≤
i=1
(ri !) ri .
1
If A ∈ Bn , let τ = n2 − A be the number of zeros of A. For 0 ≤ τ ≤ n2 , let U (n, τ ) = {A ∈ Bn : A = n2 − τ }, In 1988, Brualdi et al considered upper bounds of PerA when A ∈ Bn with restrictions on A . The following are the two results of A ∈ U (n, τ ). Theorem 1.3(Brualdi, Goldwasser and Michael, [5]) Let A ∈ U (n, τ ) with τ ≥ n2 − n. Let σ = n2 − τ and r = σ/n . Then PerA ≤ (r!)

The upper bound and lower bound for the permanent of (0, 1)-matrices
Zhang Xueyuan Zhu Xiaoying Wang Cuiqi ∗ (School of Science, China University of Mining and Technology, Xuzhou, Jiangsu, 221008)

A=
0 ··· 1 ··· . . . . . . 1 ···
0 1 . . .
1 ··· 1 ··· . . . . . . 1 1 ···
1 1 . . . 1
∈ U (n, τ )
Now using Laplace (2) expand PerA by the 1st row: 1 ··· . . PerA = τ · 0 · Per . .. . 1 ··· 1 ··· . . +(n − τ ) · 1 · Per . . .. 1 ··· = (n − τ ) · (n − 1)!.

wavelet image inpainting

1
coding JPEG2000 image ﬁles and data loss during wireless transmission processes, for example, could both result in the need for ﬁlling-in the missing information in the wavelet domain instead of the pixel domain. Working on the wavelet domain, instead of the pixel domain, changes the nature of the inpainting problem, since damages to wavelet coeﬃcients can create correlated damage patterns in the pixel domain. For instance, for wavelets based image inpainting problems, there is usually no corresponding clear cut inpainting regions, which is however necessary for most existing PDE based inpainting models in pixel domains. A direct consequence of this lack of geometric regularity of inpainting regions is the prohibition of geometric interpolation techniques in pixel domains [39]. On the other hand, direct interpolation in the wavelet domain is also problematic, as wavelet coeﬃcients (except the low frequencies) are calculated to decouple the correlation between neighboring pixels. Retained high frequency coeﬃcients provide minimal information for the missing coeﬃcients. For this reason, many contemporary error concealment methods for JPEG (built upon discrete cosine transforms (DCT)) or JPEG2000 (upon wavelets) images require the additional control of regularity in pixel domains, in addition to direct operations in transformed domains (i.e., DCT or wavelets). Such examples include HemamiGray’s bi-cubic Coons surface [34], non-uniform rational B-spline (NURBS) by ParkLee [44] and by Cheng et. al. [18], Niu-Poston’s harmonic postprocessing techniques [43], Sapiro and collaberators’ separate reconstruction techniques for structures and textures [47], and least square minimization in wavelet-domain reconstructions [46]. However, these JPEG based error concealment methods usually work on images that have already been partitioned into 8 × 8 or 16 × 16 blocks. Each missing or damaged block corresponds to a well deﬁned square region to be ﬁlled-in in the pixel domain. This is diﬀerent from our current work, for which no assumption is made of the block partitioning. Depending on the scales or resolutions, missing or damaged wavelet coeﬃcients could cause degradation wide spread in the pixel domain. That is, even a few coeﬃcients can potentially aﬀect all pixels. Moreover, unlike denoising problems in which the perturbation in the pixel domain is mostly homogeneous, the degradation in wavelet inpainting problems is usually inhomogeneous (diﬀerent regions can suﬀer diﬀerent level of damages). This new phenomenon demands diﬀerent treatments in diﬀerent regions in the pixel domain. These novel features and challenges call for new models and methods of image inpainting in the wavelet domain. An important guiding principle for us is that even though the primary goal is to ﬁll in the missing coeﬃcients in the wavelet domain, it is important to control the regularity in the pixel domain, so that the inpainted images retain important geometrical features, especially when noise is present. Such considerations have motivated the variational PDE approach in our current work. The variational PDE technique has been widely used in numerous applications such as image segmentation [7] [13] [52], restoration [14], [48], and compression [16] [26]. Even for traditional image inpainting, PDE methods have been well studied [3] [9] [11] [28] [2]. The growing impact of PDE techniques in image processing is mainly due to their capability in controlling geometrical features of images. PDE’s (many of them are derived from variational principles) are usually designed to possess certain desirable geometrical properties. For example, total variation (TV) minimization, which leads to a curvature term in the corresponding Euler-Lagrange equation, can retain sharp

图的特征值交错方法的几个应用

The Laplace matrix L(G) of a graph G is singular and positive semideﬁnite with eigenvalues
θ1 ≥ θ2 ≥ · · · ≥ θn = 0, say Laplace eigenvalues of G. Particularly, θ1 is called the Laplace spectral radius of G.
complete graph, the star and the cycle , respectively, each on n vertices. Two edges of a graph are said to be independent if they are not adjacent. An m-matching M of G is a set of m mutually independent edges. A vertex v is said to be M -saturated, if some edge of M is incident with v ; otherwise, v is M -unsaturated. If every vertex of G is M -saturated, the matching M is perfect. Consider two sequences of real numbers: λ1 ≥ λ2 ≥ · · · ≥ λn and µ1 ≥ µ2 ≥ · · · ≥ µm with (m < n). The second sequence is said to interlace the ﬁrst one whenever λi ≥ µi ≥ λn−m+i for i = 1, 2, . . . , m. Lemma 1[3] If B is a principal submatrix of a symmetric matrix A, then the eigenvalues of B interlace the eigenvalues of A. Suppose that the rows and columns of the matrix A11 A12 · · · A = ··· ··· ··· Am1 Am2 ···

Asymptotic Quantum Search and a Quantum Algorithm for Calculation of a Lower Bound of the P

ASYMPTOTIC QUANTUM SEARCH AND A QUANTUM ALGORITHM FOR CALCULATION OF A LOWER BOUND OF THE PROBABILITY OF FINDING A DIOPHANTINE EQUATIONTHAT ACCEPTS INTEGER SOLUTIONSR. V. Ramos and J. L. de Oliveirarubens@deti.ufc.br jluzeilton@deti.ufc.brDepartment of Teleinformatic Engineering, Federal University of CearaCampus do Pici, C. P. 6007, 60455-740, Fortaleza, Brazil.Several mathematical problems can be modeled as a search in a database. An example is the problem of finding the minimum of a function. Quantum algorithms for solving this problem have been proposed and all of them use the quantum search algorithm as a subroutine and several intermediate measurements are realized. In this work, it is proposed a new quantum algorithm for finding the minimum of a function in which quantum search is not used as a subroutine and only one measurement is needed. This is also named asymptotic quantum search. As an example, we propose a quantum algorithm based on asymptotic quantum search and quantum counting able to calculate a lower bound of the probability of finding a Diophantine equation with integer solution.Keywords:Quantum algorithms, quantum search, optimization, Diophantine equations.1. IntroductionThe Grover’s quantum search algorithm is an important result in quantum computation that proves that quantum superposition can speed-up the task of finding a specific value within an unordered database. The quantum search is proved to use O(N1/2) operations of the oracle (in comparison with the O(N) operations of the best classical algorithm), indicating a quadratic speed-up [1-3]. Several mathematical problems can be modeled as a search, like the problem of finding the minimum (or the maximum) of a function. Thus, some algorithms for finding the minimum or maximum using quantum search have been proposed [4,5], using the generalization of the quantum search proposed in [6] as a subroutine that is called several times. Every time the quantum search is called, at the end a measurement is realized. The number of measurements in the algorithm proposed in [4] is (log2N), where N is the number of elements in the database [7]. Aiming to reduce the number of measurements, Kowada at al [7] proposed a new quantum algorithm for finding the minimum of a function that realizes O(log N) measurements. Here, we go beyond, proposing a quantum algorithm for finding the minimum that realizes only one measurement, at the final of the algorithm. Furthermore, conversely the already proposed quantum algorithms for finding the minimum, the quantum search in the proposed algorithm is not used as a subroutine, indeed it is a part of the algorithm as any other quantum gate. In this work, as an example, we apply asymptotic quantum search together with quantum counting algorithm [3,8] in order to create a quantum algorithm able to calculate a lower bound of the probability of finding a Diophantine equation that accepts integer solutions. We are going to call this number by D.(1)(2) (3)2. Quantum algorithm for finding the minimum of a functionThe circuit for the proposed quantum algorithm is shown in Fig. 1.Fig. 1- Circuit for asymptotic quantum search.In Fig. 1, U f is the quantum gate that implements the function f , U f x 0 = x f (x ) , the gate QBSC is a quantum bit string comparator [9], QBSC x y 0 = x y b , where b =0 if x >y and b =1 if x y . At last, the oracle in the amplitude amplification recognizes if the quantum registers 4 and 5 are in the total state 0 N 0 . In order to understand the quantum circuit in Fig. 1, we firstly show the operation of the quantum circuit U , following the states in the marked positions, as shown below:1152432151452243331451452233000000000000NNNx xNN NNNx f x x xy Nx f x xy x y c xc U x Hc xf x c xf x yc x f x f y(4.a)(4.b)(5)(6.a)41451452323, ,415243 212431101 N x y x yf x f y f x f y n x x jj xj f x f y n x x jj j xf x yf y f x yf y c x f x f y c x f x f y50xf x f y.In (1)-(4.b), N is the number of qubits, hence, there are 2N elements in the database. In (4.b), 1 n (x ) 2N is the number of elements that obey f (x ) f (y ) for a given x and all y ’s. Thus, the better the solution the larger is the value of n (x ). At last, y i can assume any y value.min 515243 215243 16min min 101 00Nn x N x j x j f x f y n x N x j xj f x f y x c x f x c x f x c x f xminmin 1522443321423 10001 0Nn x N N N N x j xj x x f x f y n x N Nx kxk x x f x f y c x fx y c xf x H y5(6.b)(7)(8)min minmin 6min min 115224343 154231 021421002001001N j NNNN NNx x xx x n x NN Nx j x j x x H y f x f y n x N x kk c x f x c n x xf x c x y c x f x H ymin53 00N x x x f x f y.Finallymin 71515243243 15423 1 0214210000200000000N N j NN NN NNNx x xxf x f y n xNNNx j xj x x H y f x f y NN x k k U c xc n x x c x y c x y53 01Nn x N xf x f yThe worse the solution the lower is the value of n (x ) and the faster is the decay. Now, using k -times the gate U together with the multi-controlled CNOTs, the quantum state just before the quantum amplification is:min 15243 1154231 1 021142120000200020N j N k N N N Nx x f x f y n x k r N N NN x j r x j x x H y f x f y n r N N Nx k k c n x x c n x x H y c n x x H y31 501x k N r x f x f y(9)At the amplitude amplification, the oracle recognizes the state 0 N 0 in the fourth and fifth quantum registers. Therefore, only the first term in (8) will have their amplitudes amplified. Looking closer the first term in (8), one sees that the amplitude of the searched answer is min x c , since n (x )=2N for x =x min . The second term with largest amplitude has n (x )=2N -1, hence, after the k -th application of U its amplitude will be c x /2k . Thus, if k is large enough only the term corresponding to the searched answer will have considerable amplitude and, after amplitude amplification, the searched answer will be obtained with high probability ina single set of measurements. For example, choosing x c for all x , the largest amplitude after kusage of U(before amplitude amplification) is min x c while the second largest amplitude will bemin 2k x x c c . The oracle in the amplitude amplification is applied /(4 ) times, where [9]sin k.If k is large enough, then 2Ngood p . Obviously, the efficiency of the proposed algorithm is equal to theefficiency of the amplitude amplification.3. Quantum algorithm for finding a lower bound of DA Diophantine equation is a polynomial equation with any number of unknowns and with integercoefficients. For example, (x +1)n +(y +2)n +(z +3)n -Cxyz =0, where C and n . Since there is not a universal process to determine whether any Diophantine equation accepts or not integer solutions, the best that one can do is to use a brute-force method. In this case, one can not determine the probability of finding a Diophantine equation that accepts integer solutions, D , exactly, but it is possible to determine a lower bound for it. In this direction, the quantum algorithm proposed can be used to implement a brute-force attack (using the intrinsic quantum parallelism) and the accuracy of the answer obtained is limited by the amount of qubits used, the larger the amount of qubits used the closer to D is the obtained answer. Since only a finite amount of qubits can be used, the answer obtained is simply a lower bound of D . Since the (infinite) set of Diophantine equations considered is enumerable, we will use integer numbers to enumerate the Diophantine equations. The quantum circuit for the proposed quantum algorithm is shown in Fig. 2. In this quantum circuit, we assume that the set of Diophantine equations considered has only three unknowns (x ,y ,z ), however, the extension to a larger number of unknowns is straightforward.(10)(11)(12)(13)(15)(16)(14)(17)31126354,,32126543,,3125643,,,,,,0000,,,,000,,,,,,00lM lxyz px y z Ml xyz p px y z lxyz p px y zx y z p c x y z c x y z D x y z p c x y z D x y z y z4125634,,,,51126534,, ,,,,,,,,,,,,,,0,,,,,,,,1p p i j mxyz p p px y z x y z xyzp i j mp i j mpx y zi j mD x y z D x y z c x y z D x y z y z D x y z cx y zD x y z y z D x y z !62126543,, ,,,,,,352317132,,,,,,01,,,,0,,2,,,,p p i j mpp s s s s s s opts lxyzp i j mpx y zi j mD x y z D x y z t p p p p p pM opt opt opt p opt opt opt xyz s M xyz p pcx y zD x y z y z c x y z D x y z c n x y z x y z D x y !4634353,,,,,,010p p popt opt opt l Mpx y z x y z x y z z !!Finally38126354,,2133163542 ,,,,,,,,0000,,0000,,2,,pps s s opt sp p popt opt opt lM lxyz px y zt p p popt opt opt xyz s l MlM pxyz x y zx y z x y z U c x y z c x y z p c n x y z x y z !In (17), 1 n (x ,y ,z ) 23M is the number of elements that obey |D p (x,y,z )| |D p (x’,y’,z’)| for a given x ,y ,z and all x’,y’,z’. Thus, the better the solution the larger is the value of n (x ,y ,z ). We assume that, for the Diophantine equation number p there are t p minimums. Moreover3,,2s s sp p pMopt opt opt n x y z for s =1,2,…,t p . Thestates !i i =1,2,3,4 and ! are states with undesired terms. Now, using k -times the gate U together with the multi-controlled CNOT gates, the quantum state just before the quantum amplification is:(20)(19)(18)3312635421 ,,,,,,,,,,2,,0000p p s s s opt sp p popt opt optt kp p p M l Mlopt opt optxyz xyz ps x y zx y z x y z p c x y z c n x y z x y z"The state " represents the uninteresting terms. At the amplitude amplification, the oracle recognizes the state 0 3M 0 in the fifth and sixth quantum registers. Therefore, only the first term in (18) will have their amplitudes amplified. Looking closer the first term in (18), one sees that the amplitude of one of the searched answers is p opt sxyz c (comparing to the single variable case in Section 2, we have n (x,y,z )=23M for,,,,p p popt opt opt x y z x y z ). The second term with largest amplitude has n (x ,y ,z )=23M -1, hence, after the k -thapplication of U its amplitude will be c xyz /2k . Thus, if k is large enough only the term corresponding to the searched answer will have considerable amplitude and, after amplitude amplification, it will be obtained with high probability in a single measurement. For example,choosing xyz c for all xyz , thelargest amplitude after k usage of U (before amplitude amplification)is p opt sxyz c thesecond largest amplitude will be 2popt sk xyz xyz c c . The oracle in the amplitude amplification is applied/(4 ) times, wheresin k.If k is large enough, then 32Mgood p p t . The efficiency of the proposed algorithm is equal to theefficiency of the amplitude amplification. Thus, the final state after the amplitude amplification is1,,,,ps s ss s st p p pp p pend opt opt opt p opt opt opt ps y z D x y z errIn (20) one can see that, for each one of the 2N Diophantine equations, the algorithm calculate the minimums values testing 2M different values for each unknown. Moreover, the term err represents the remained unwanted states due to the non ideal amplitude amplification.In order to calculate the lower bound of D , we use the state (20) as input of a quantum countingalgorithm whose oracle recognizes a zero in the last register of (20), that is,,,0p p pp opt opt optD x y z .The quantum counting algorithm returns the number of elements in the database that pass in the oracle test. Let us name this value by r (N ,M ). Hence,(21),lim ,2N D N M r N M #$and a lower bound for D is simply ,2estND r N M .4. DiscussionsThe problem to decide if a given Diophantine equation has or not integer solutions is, according toChurch-Turing computability thesis, an undecidable problem. In fact, this problem is related to the halting problem for Turing machine: The Diophantine equation problem could be solved if and only if the Turing halting problem could be [10]. Since there is not a universal process to decide if a given program will halt or not a Turing machine, the best that one can do is to run the program and wait for a halt. The point is: one can never know if he/she waited enough time in order to observe the halt. The Diophantine equation problem has a similar statement. Since there is not a universal procedure to decide if a given Diophantine equation accepts an integer solution, the best that one can do in order to check if it accepts or not an integer solution, is to test the integer numbers. The point here is: one can never know the answer because there are infinite integers to be tested. Associated to the halting problem is the Chaitin number , the probability of a given program to halt the Turing machine [11-17]. The number is an incompressible number and it can not be calculated. Similarly, one can define the probability of finding a Diophantine equation with integer solutions, D . Obviously, D is also a number that can not be calculated, because the knowledge of D implies in the knowledge of a way to decide if any given Diophantine equation accepts or not any integer solution that, by its turn, implies in the knowledge of and in the solution of the halting problem. As happens with , it is possible to calculate classically a lower bound for D , however, it must be a brute-force algorithm. Classical algorithms for calculation of D will differ basically in the method that the zeros of the Diophantine equation are obtained. The quantum algorithm here proposed is also a brute-force algorithm but its advantage is the quantum parallelism, since several Diophantine equations are tested simultaneously.4. ConclusionsIt was proposed a new quantum algorithm for finding the minimum of a function that requires only one measurement. This is important since the complete circuit is simplified and measurements of qubits are not noise free. This algorithm is an asymptotic quantum search. As an example of its application, we constructed a quantum algorithm for finding a lower bound for the probability of finding a Diophantine equation that accepts integer solutions. A lower bound can also be calculated using a classical computer,however, our algorithm is more efficient since it can test 2N(for N qubits) Diophantine equations simultaneously. At last, the algorithm proposed requires only one measurement, after the quantum counting stage.AcknowledgeThe authors thank the anonymous reviewer for useful comments.References1.Grover, L. V.: A fast quantum mechanical algorithm for database search, in Proc., 28th Annual ACM Symposium on the Theory of Computing, New York, pp. 212 (1996).2. Grover, L. V.: Quantum mechanics helps in searching for a needle in a haystack, Phys. Rev. Lett., 79, pp. 325 (1997).3. M. A. Nielsen, and I. L. Chuang, Quantum Computation and Quantum Information, Cambridge Univ. Press, Cambridge, ch. 4 and 6 (2000).4. Dürr C., and Høyer, P.: A quantum algorithm for finding the minimum, arXive:quant-ph/9607014 (1999).5. Ahuja, A., and Kapoor, S.: A quantum algorithm for finding the maximum, arXive:quant-ph/9911082 (1999).6. Boyer, M., Brassard, G., Høyer, P., and Tapp, A.: Tight bounds on quantum searching, arXive:quant-ph/9605034 (1996).7. Kowada, L. A. B., Lavor, C., Portugal, R., and de Figueiredo, C. M. H.: A new quantum algorithm to solve the minimum searching problem, International Journal of Quantum Information, Vol. 6, no. 3, 427 - 436, 2008.8. Kaye, P., Laflamme, R., and Mosca, M.: An introduction to Quantum Computing, 1st Ed., Oxford University Press, 2007.9. Oliveira, D. S., and Ramos, R. V.: Quantum Bit String Comparator: Circuits and Applications, Quantum Computers and Computing, vol. 7, no. 1, 17-26, 2008.10. T. D. Kieu, Quantum algorithms for Hilbert’s tenth problem, – quant-ph/0110136, 2003.11. C. Calude, Theories of Computational Complexity, North-Holland, Amsterdam, 1998.12. G. Chaitin, Meta Math!: the quest for omega, Pantheon Books, 2005.13. G. Chaitin, Irreducible Complexity in Pure Mathematics, Preprint at (/math.HO/0411091), 2004.14. G. Chaitin, Information-theoretic computational complexity, IEEE Trans. on Inf. Theory, IT-20, 10, 1974.15. C. Calude, Information and Randomness, Spring-Verlag, Berlin, 1994.16. G. Chaitin, Algorithmic Information Theory,1a Ed. Cambridge University Press, 1987.17. G. Markowsky, Introduction to algorithmic information theory, J. of Universal Comp. Science, 2, no. 5, 245, 1996.。

Cubature Kalman Filters

1254IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. 54, NO. 6, JUNE 2009Cubature Kalman FiltersIenkaran Arasaratnam and Simon Haykin, Life Fellow, IEEEAbstract—In this paper, we present a new nonlinear ﬁlter for high-dimensional state estimation, which we have named the cubature Kalman ﬁlter (CKF). The heart of the CKF is a spherical-radial cubature rule, which makes it possible to numerically compute multivariate moment integrals encountered in the nonlinear Bayesian ﬁlter. Speciﬁcally, we derive a third-degree spherical-radial cubature rule that provides a set of cubature points scaling linearly with the state-vector dimension. The CKF may therefore provide a systematic solution for high-dimensional nonlinear ﬁltering problems. The paper also includes the derivation of a square-root version of the CKF for improved numerical stability. The CKF is tested experimentally in two nonlinear state estimation problems. In the ﬁrst problem, the proposed cubature rule is used to compute the second-order statistics of a nonlinearly transformed Gaussian random variable. The second problem addresses the use of the CKF for tracking a maneuvering aircraft. The results of both experiments demonstrate the improved performance of the CKF over conventional nonlinear ﬁlters. Index Terms—Bayesian ﬁlters, cubature rules, Gaussian quadrature rules, invariant theory, Kalman ﬁlter, nonlinear ﬁltering.• Time update, which involves computing the predictive density(3)where denotes the history of input; is the measurement pairs up to time and the state transition old posterior density at time is obtained from (1). density • Measurement update, which involves computing the posterior density of the current stateI. INTRODUCTIONUsing the state-space model (1), (2) and Bayes’ rule we have (4) where the normalizing constant is given byIN this paper, we consider the ﬁltering problem of a nonlinear dynamic system with additive noise, whose statespace model is deﬁned by the pair of difference equations in discrete-time [1] (1) (2)is the state of the dynamic system at discrete where and are time ; is the known control input, some known functions; which may be derived from a compensator as in Fig. 1; is the measurement; and are independent process and measurement Gaussian noise sequences with zero and , respectively. means and covariances In the Bayesian ﬁltering paradigm, the posterior density of the state provides a complete statistical description of the state at that time. On the receipt of a new measurement at time , we in update the old posterior density of the state at time two basic steps:Manuscript received July 02, 2008; revised July 02, 2008, August 29, 2008, and September 16, 2008. First published May 27, 2009; current version published June 10, 2009. This work was supported by the Natural Sciences & Engineering Research Council (NSERC) of Canada. Recommended by Associate Editor S. Celikovsky. The authors are with the Cognitive Systems Laboratory, Department of Electrical and Computer Engineering, McMaster University, Hamilton, ON L8S 4K1, Canada (e-mail: aienkaran@grads.ece.mcmaster.ca; haykin@mcmaster. ca). Color versions of one or more of the ﬁgures in this paper are available online at . Digital Object Identiﬁer 10.1109/TAC.2009.2019800To develop a recursive relationship between the predictive density and the posterior density in (4), the inputs have to satisfy the relationshipwhich is also called the natural condition of control [2]. has sufﬁcient This condition therefore suggests that information to generate the input . To be speciﬁc, the can be generated using . Under this condiinput tion, we may equivalently write (5) Hence, substituting (5) into (4) yields (6) as desired, where (7) and the measurement likelihood function obtained from (2). is0018-9286/$25.00 © 2009 IEEEARASARATNAM AND HAYKIN: CUBATURE KALMAN FILTERS1255Fig. 1. Signal-ﬂow diagram of a dynamic state-space model driven by the feedback control input. The observer may employ a Bayesian ﬁlter. The label denotes the unit delay.The Bayesian ﬁlter solution given by (3), (6), and (7) provides a uniﬁed recursive approach for nonlinear ﬁltering problems, at least conceptually. From a practical perspective, however, we ﬁnd that the multi-dimensional integrals involved in (3) and (7) are typically intractable. Notable exceptions arise in the following restricted cases: 1) A linear-Gaussian dynamic system, the optimal solution for which is given by the celebrated Kalman ﬁlter [3]. 2) A discrete-valued state-space with a ﬁxed number of states, the optimal solution for which is given by the grid ﬁlter (Hidden-Markov model ﬁlter) [4]. 3) A “Benes type” of nonlinearity, the optimal solution for which is also tractable [5]. In general, when we are confronted with a nonlinear ﬁltering problem, we have to abandon the idea of seeking an optimal or analytical solution and be content with a suboptimal solution to the Bayesian ﬁlter [6]. In computational terms, suboptimal solutions to the posterior density can be obtained using one of two approximate approaches: 1) Local approach. Here, we derive nonlinear ﬁlters by ﬁxing the posterior density to take a priori form. For example, we may assume it to be Gaussian; the nonlinear ﬁlters, namely, the extended Kalman ﬁlter (EKF) [7], the central-difference Kalman ﬁlter (CDKF) [8], [9], the unscented Kalman ﬁlter (UKF) [10], and the quadrature Kalman ﬁlter (QKF) [11], [12], fall under this ﬁrst category. The emphasis on locality makes the design of the ﬁlter simple and fast to execute. 2) Global approach. Here, we do not make any explicit assumption about the posterior density form. For example, the point-mass ﬁlter using adaptive grids [13], the Gaussian mixture ﬁlter [14], and particle ﬁlters using Monte Carlo integrations with the importance sampling [15], [16] fall under this second category. Typically, the global methods suffer from enormous computational demands. Unfortunately, the presently known nonlinear ﬁlters mentioned above suffer from the curse of dimensionality [17] or divergence or both. The effect of curse of dimensionality may often become detrimental in high-dimensional state-space models with state-vectors of size 20 or more. The divergence may occur for several reasons including i) inaccurate or incomplete model of the underlying physical system, ii) informationloss in capturing the true evolving posterior density completely, e.g., a nonlinear ﬁlter designed under the Gaussian assumption may fail to capture the key features of a multi-modal posterior density, iii) high degree of nonlinearities in the equations that describe the state-space model, and iv) numerical errors. Indeed, each of the above-mentioned ﬁlters has its own domain of applicability and it is doubtful that a single ﬁlter exists that would be considered effective for a complete range of applications. For example, the EKF, which has been the method of choice for nonlinear ﬁltering problems in many practical applications for the last four decades, works well only in a ‘mild’ nonlinear environment owing to the ﬁrst-order Taylor series approximation for nonlinear functions. The motivation for this paper has been to derive a more accurate nonlinear ﬁlter that could be applied to solve a wide range (from low to high dimensions) of nonlinear ﬁltering problems. Here, we take the local approach to build a new ﬁlter, which we have named the cubature Kalman ﬁlter (CKF). It is known that the Bayesian ﬁlter is rendered tractable when all conditional densities are assumed to be Gaussian. In this case, the Bayesian ﬁlter solution reduces to computing multi-dimensional integrals, whose integrands are all of the form nonlinear function Gaussian. The CKF exploits the properties of highly efﬁcient numerical integration methods known as cubature rules for those multi-dimensional integrals [18]. With the cubature rules at our disposal, we may describe the underlying philosophy behind the derivation of the new ﬁlter as nonlinear ﬁltering through linear estimation theory, hence the name “cubature Kalman ﬁlter.” The CKF is numerically accurate and easily extendable to high-dimensional problems. The rest of the paper is organized as follows: Section II derives the Bayesian ﬁlter theory in the Gaussian domain. Section III describes numerical methods available for moment integrals encountered in the Bayesian ﬁlter. The cubature Kalman ﬁlter, using a third-degree spherical-radial cubature rule, is derived in Section IV. Our argument for choosing a third-degree rule is articulated in Section V. We go on to derive a square-root version of the CKF for improved numerical stability in Section VI. The existing sigma-point approach is compared with the cubature method in Section VII. We apply the CKF in two nonlinear state estimation problems in Section VIII. Section IX concludes the paper with a possible extension of the CKF algorithm for a more general setting.1256IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. 54, NO. 6, JUNE 2009II. BAYESIAN FILTER THEORY IN THE GAUSSIAN DOMAIN The key approximation taken to develop the Bayesian ﬁlter theory under the Gaussian domain is that the predictive density and the ﬁlter likelihood density are both Gaussian, which eventually leads to a Gaussian posterior den. The Gaussian is the most convenient and widely sity used density function for the following reasons: • It has many distinctive mathematical properties. — The Gaussian family is closed under linear transformation and conditioning. — Uncorrelated jointly Gaussian random variables are independent. • It approximates many physical random phenomena by virtue of the central limit theorem of probability theory (see Sections 5.7 and 6.7 in [19] for more details). Under the Gaussian approximation, the functional recursion of the Bayesian ﬁlter reduces to an algebraic recursion operating only on means and covariances of various conditional densities encountered in the time and the measurement updates. A. Time Update In the time update, the Bayesian ﬁlter computes the mean and the associated covariance of the Gaussian predictive density as follows: (8) where is the statistical expectation operator. Substituting (1) into (8) yieldsTABLE I KALMAN FILTERING FRAMEWORKB. Measurement Update It is well known that the errors in the predicted measurements are zero-mean white sequences [2], [20]. Under the assumption that these errors can be well approximated by the Gaussian, we write the ﬁlter likelihood density (12) where the predicted measurement (13) and the associated covariance(14) Hence, we write the conditional Gaussian density of the joint state and the measurement(15) (9) where the cross-covariance is assumed to be zero-mean and uncorrelated Because with the past measurements, we get (16) On the receipt of a new measurement , the Bayesian ﬁlter from (15) yielding computes the posterior density (17) (10) where is the conventional symbol for a Gaussian density. Similarly, we obtain the error covariance where (18) (19) (20) If and are linear functions of the state, the Bayesian ﬁlter under the Gaussian assumption reduces to the Kalman ﬁlter. Table I shows how quantities derived above are called in the Kalman ﬁltering framework. The signal-ﬂow diagram in Fig. 2 summarizes the steps involved in the recursion cycle of the Bayesian ﬁlter. The heart of the Bayesian ﬁlter is therefore how to compute Gaussian(11)ARASARATNAM AND HAYKIN: CUBATURE KALMAN FILTERS1257Fig. 2. Signal-ﬂow diagram of the recursive Bayesian ﬁlter under the Gaussian assumption, where “G-” stands for “Gaussian-.”weighted integrals whose integrands are all of the form nonGaussian density that are present in (10), linear function (11), (13), (14) and (16). The next section describes numerical integration methods to compute multi-dimensional weighted integrals. III. REVIEW ON NUMERICAL METHODS FOR MOMENT INTEGRALS Consider a multi-dimensional weighted integral of the form (21) is some arbitrary function, is the region of where for all integration, and the known weighting function . In a Gaussian-weighted integral, for example, is a Gaussian density and satisﬁes the nonnegativity condition in the entire region. If the solution to the above integral (21) is difﬁcult to obtain, we may seek numerical integration methods to compute it. The basic task of numerically computing the integral (21) is to ﬁnd a set of points and weights that approximates by a weighted sum of function evaluations the integral (22) The methods used to ﬁnd can be divided into product rules and non-product rules, as described next. A. Product Rules ), we For the simplest one-dimensional case (that is, may apply the quadrature rule to compute the integral (21) numerically [21], [22]. In the context of the Bayesian ﬁlter, we mention the Gauss-Hermite quadrature rule; when the is in the form of a Gaussian density weighting functionis well approximated by a polynomial and the integrand in , the Gauss-Hermite quadrature rule is used to compute the Gaussian-weighted integral efﬁciently [12]. The quadrature rule may be extended to compute multidimensional integrals by successively applying it in a tensorproduct of one-dimensional integrals. Consider an -point per dimension quadrature rule that is exact for polynomials of points for functional degree up to . We set up a grid of evaluations and numerically compute an -dimensional integral while retaining the accuracy for polynomials of degree up to only. Hence, the computational complexity of the product quadrature rule increases exponentially with , and therefore , suffers from the curse of dimensionality. Typically for the product Gauss-Hermite quadrature rule is not a reasonable choice to approximate a recursive optimal Bayesian ﬁlter. B. Non-Product Rules To mitigate the curse of dimensionality issue in the product rules, we may seek non-product rules for integrals of arbitrary dimensions by choosing points directly from the domain of integration [18], [23]. Some of the well-known non-product rules include randomized Monte Carlo methods [4], quasi-Monte Carlo methods [24], [25], lattice rules [26] and sparse grids [27]–[29]. The randomized Monte Carlo methods evaluate the integration using a set of equally-weighted sample points drawn randomly, whereas in quasi-Monte Carlo methods and lattice rules the points are generated from a unit hyper-cube region using deterministically deﬁned mechanisms. On the other hand, the sparse grids based on Smolyak formula in principle, combine a quadrature (univariate) routine for high-dimensional integrals more sophisticatedly; they detect important dimensions automatically and place more grid points there. Although the non-product methods mentioned here are powerful numerical integration tools to compute a given integral with a prescribed accuracy, they do suffer from the curse of dimensionality to certain extent [30].1258IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. 54, NO. 6, JUNE 2009C. Proposed Method In the recursive Bayesian estimation paradigm, we are interested in non-product rules that i) yield reasonable accuracy, ii) require small number of function evaluations, and iii) are easily extendable to arbitrarily high dimensions. In this paper we derive an efﬁcient non-product cubature rule for Gaussianweighted integrals. Speciﬁcally, we obtain a third-degree fullysymmetric cubature rule, whose complexity in terms of function evaluations increases linearly with the dimension . Typically, a set of cubature points and weights are chosen so that the cubature rule is exact for a set of monomials of degree or less, as shown by (23)Gaussian density. Speciﬁcally, we consider an integral of the form (24)deﬁned in the Cartesian coordinate system. To compute the above integral numerically we take the following two steps: i) We transform it into a more familiar spherical-radial integration form ii) subsequently, we propose a third-degree spherical-radial rule. A. Transformation In the spherical-radial transformation, the key step is a change of variable from the Cartesian vector to a radius and with , so direction vector as follows: Let for . Then the integral (24) can be that rewritten in a spherical-radial coordinate system as (25) is the surface of the sphere deﬁned by and is the spherical surface measure or the area element on . We may thus write the radial integral (26) is deﬁned by the spherical integral with the unit where weighting function (27) The spherical and the radial integrals are numerically computed by the spherical cubature rule (Section IV-B below) and the Gaussian quadrature rule (Section IV-C below), respectively. Before proceeding further, we introduce a number of notations and deﬁnitions when constructing such rules as follows: • A cubature rule is said to be fully symmetric if the following two conditions hold: implies , where is any point obtainable 1) from by permutations and/or sign changes of the coordinates of . on the region . That is, all points in 2) the fully symmetric set yield the same weight value. For example, in the one-dimensional space, a point in the fully symmetric set implies that and . • In a fully symmetric region, we call a point as a generator , where if , . The new should not be confused with the control input . zero coordinates and use • For brevity, we suppress to represent a complete fully the notation symmetric set of points that can be obtained by permutating and changing the sign of the generator in all possible ways. Of course, the complete set entails where; are non-negative integers and . Here, an important quality criterion of a cubature rule is its degree; the higher the degree of the cubature rule is, the more accurate solution it yields. To ﬁnd the unknowns of the cubature rule of degree , we solve a set of moment equations. However, solving the system of moment equations may be more tedious with increasing polynomial degree and/or dimension of the integration domain. For example, an -point cubature rule entails unknown parameters from its points and weights. In general, we may form a system of equations with respect to unknowns from distinct monomials of degree up to . For the nonlinear system to have at least one solution (in this case, the system is said to be consistent), we use at least as many unknowns as equations [31]. That is, we choose to be . Suppose we obtain a cu. In this case, we solve bature rule of degree three for nonlinear moment equations; the re) sulting rule may consist of more than 85 ( weighted cubature points. To reduce the size of the system of algebraically independent equations or equivalently the number of cubature points markedly, Sobolev proposed the invariant theory in 1962 [32] (see also [31] and the references therein for a recent account of the invariant theory). The invariant theory, in principle, discusses how to restrict the structure of a cubature rule by exploiting symmetries of the region of integration and the weighting function. For example, integration regions such as the unit hypercube, the unit hypersphere, and the unit simplex exhibit symmetry. Hence, it is reasonable to look for cubature rules sharing the same symmetry. For the case considered above and ), using the invariant theory, we may con( cubature points struct a cubature rule consisting of by solving only a pair of moment equations (see Section IV). Note that the points and weights of the cubature rule are in. Hence, they can be computed dependent of the integrand off-line and stored in advance to speed up the ﬁlter execution. where IV. CUBATURE KALMAN FILTER As described in Section II, nonlinear ﬁltering in the Gaussian domain reduces to a problem of how to compute integrals, whose integrands are all of the form nonlinear functionARASARATNAM AND HAYKIN: CUBATURE KALMAN FILTERS1259points when are all distinct. For example, represents the following set of points:Here, the generator is • We use . set B. Spherical Cubature Rule. to denote the -th point from theWe ﬁrst postulate a third-degree spherical cubature rule that takes the simplest structure due to the invariant theory (28) The point set due to is invariant under permutations and sign changes. For the above choice of the rule (28), the monomials with being an odd integer, are integrated exactly. In order that this rule is exact for all monomials of degree up to three, it remains to require that the rule is exact , 2. Equivalently, to for all monomials for which ﬁnd the unknown parameters and , it sufﬁces to consider , and due to the fully symmonomials metric cubature rule (29) (30) where the surface area of the unit sphere with . Solving (29) and (30) , and . Hence, the cubature points are yields located at the intersection of the unit sphere and its axes. C. Radial Rule We next propose a Gaussian quadrature for the radial integration. The Gaussian quadrature is known to be the most efﬁcient numerical method to compute a one-dimensional integration [21], [22]. An -point Gaussian quadrature is exact and constructed as up to polynomials of degree follows: (31) where is a known weighting function and non-negative on ; the points and the associated weights the interval are unknowns to be determined uniquely. In our case, a comparison of (26) and (31) yields the weighting function and and , respecthe interval to be tively. To transform this integral into an integral for which the solution is familiar, we make another change of variable via yielding. The integral on the right-hand side of where (32) is now in the form of the well-known generalized GaussLaguerre formula. The points and weights for the generalized Gauss-Laguerre quadrature are readily obtained as discussed elsewhere [21]. A ﬁrst-degree Gauss-Laguerre rule is exact for . Equivalently, the rule is exact for ; it . is not exact for odd degree polynomials such as Fortunately, when the radial-rule is combined with the spherical rule to compute the integral (24), the (combined) spherical-radial rule vanishes for all odd-degree polynomials; the reason is that the spherical rule vanishes by symmetry for any odd-degree polynomial (see (25)). Hence, the spherical-radial rule for (24) is exact for all odd degrees. Following this argument, for a spherical-radial rule to be exact for all third-degree polyno, it sufﬁces to consider the ﬁrst-degree genermials in alized Gauss-Laguerre rule entailing a single point and weight. We may thus write (33) where the point is chosen to be the square-root of the root of the ﬁrst-order generalized Laguerre polynomial, which is orthogonal with respect to the modiﬁed weighting function ; subsequently, we ﬁnd by solving the zeroth-order moment equation appropriately. In this case, we , and . A detailed account have of computing the points and weights of a Gaussian quadrature with the classical and nonclassical weighting function is presented in [33]. D. Spherical-Radial Rule In this subsection, we describe two useful results that are used to i) combine the spherical and radial rule obtained separately, and ii) extend the spherical-radial rule for a Gaussian weighted integral. The respective results are presented as two propositions: Proposition 4.1: Let the radial integral be computed numer-point Gaussian quadrature rule ically by theLet the spherical integral be computed numerically by the -point spherical ruleThen, an by-point spherical-radial cubature rule is given(34) Proof: Because cubature rules are devised to be exact for a subspace of monomials of some degree, we consider an integrand of the form(32)1260IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. 54, NO. 6, JUNE 2009where are some positive integers. Hence, we write the integral of interestwhereFor the moment, we assume the above integrand to be a mono. Making the mial of degree exactly; that is, change of variable as described in Section IV-A, we getWe use the cubature-point set to numerically compute integrals (10), (11), and (13)–(16) and obtain the CKF algorithm, details of which are presented in Appendix A. Note that the above cubature-point set is now deﬁned in the Cartesian coordinate system. V. IS THERE A NEED FOR HIGHER-DEGREE CUBATURE RULES? In this section, we emphasize the importance of third-degree cubature rules over higher-degree rules (degree more than three), when they are embedded into the cubature Kalman ﬁltering framework for the following reasons: • Sufﬁcient approximation. The CKF recursively propagates the ﬁrst two-order moments, namely, the mean and covariance of the state variable. A third-degree cubature rule is also constructed using up to the second-order moment. Moreover, a natural assumption for a nonlinearly transformed variable to be closed in the Gaussian domain is that the nonlinear function involved is reasonably smooth. In this case, it may be reasonable to assume that the given nonlinear function can be well-approximated by a quadratic function near the prior mean. Because the third-degree rule is exact up to third-degree polynomials, it computes the posterior mean accurately in this case. However, it computes the error covariance approximately; for the covariance estimate to be more accurate, a cubature rule is required to be exact at least up to a fourth degree polynomial. Nevertheless, a higher-degree rule will translate to higher accuracy only if the integrand is well-behaved in the sense of being approximated by a higher-degree polynomial, and the weighting function is known to be a Gaussian density exactly. In practice, these two requirements are hardly met. However, considering in the cubature Kalman ﬁltering framework, our experience with higher-degree rules has indicated that they yield no improvement or make the performance worse. • Efﬁcient and robust computation. The theoretical lower bound for the number of cubature points of a third-degree centrally symmetric cubature rule is given by twice the dimension of an integration region [34]. Hence, the proposed spherical-radial cubature rule is considered to be the most efﬁcient third-degree cubature rule. Because the number of points or function evaluations in the proposed cubature rules scales linearly with the dimension, it may be considered as a practical step for easing the curse of dimensionality. According to [35] and Section 1.5 in [18], a ‘good’ cubature rule has the following two properties: (i) all the cubature points lie inside the region of integration, and (ii) all the cubature weights are positive. The proposed rule equal, positive weights for an -dimensional entails unbounded region and hence belongs to a good cubature family. Of course, we hardly ﬁnd higher-degree cubature rules belonging to a good cubature family especially for high-dimensional integrations.Decomposing the above integration into the radial and spherical integrals yieldsApplying the numerical rules appropriately, we haveas desired. As we may extend the above results for monomials of degree less than , the proposition holds for any arbitrary integrand that can be written as a linear combination of monomials of degree up to (see also [18, Section 2.8]). Proposition 4.2: Let the weighting functions and be and . such that , we Then for every square matrix have (35) Proof: Consider the left-hand side of (35). Because a positive deﬁnite matrix, we factorize to be , we get Making a change of variable via is .which proves the proposition. For the third-degree spherical-radial rule, and . Hence, it entails a total of cubature points. Using the above propositions, we extend this third-degree spherical-radial rule to compute a standard Gaussian weighted integral as follows:ARASARATNAM AND HAYKIN: CUBATURE KALMAN FILTERS1261In the ﬁnal analysis, the use of higher-degree cubature rules in the design of the CKF may marginally improve its performance at the expense of a reduced numerical stability and an increased computational cost. VI. SQUARE-ROOT CUBATURE KALMAN FILTER This section addresses i) the rationale for why we need a square-root extension of the standard CKF and ii) how the square-root solution can be developed systematically. The two basic properties of an error covariance matrix are i) symmetry and ii) positive deﬁniteness. It is important that we preserve these two properties in each update cycle. The reason is that the use of a forced symmetry on the solution of the matrix Ricatti equation improves the numerical stability of the Kalman ﬁlter [36], whereas the underlying meaning of the covariance is embedded in the positive deﬁniteness. In practice, due to errors introduced by arithmetic operations performed on ﬁnite word-length digital computers, these two properties are often lost. Speciﬁcally, the loss of the positive deﬁniteness may probably be more hazardous as it stops the CKF to run continuously. In each update cycle of the CKF, we mention the following numerically sensitive operations that may catalyze to destroy the properties of the covariance: • Matrix square-rooting [see (38) and (43)]. • Matrix inversion [see (49)]. • Matrix squared-form amplifying roundoff errors [see (42), (47) and (48)]. • Substraction of the two positive deﬁnite matrices present in the covariant update [see (51)]. Moreover, some nonlinear ﬁltering problems may be numerically ill-conditioned. For example, the covariance is likely to turn out to be non-positive deﬁnite when i) very accurate measurements are processed, or ii) a linear combination of state vector components is known with greater accuracy while other combinations are essentially unobservable [37]. As a systematic solution to mitigate ill effects that may eventually lead to an unstable or even divergent behavior, the logical procedure is to go for a square-root version of the CKF, hereafter called square-root cubature Kalman ﬁlter (SCKF). The SCKF essentially propagates square-root factors of the predictive and posterior error covariances. Hence, we avoid matrix square-rooting operations. In addition, the SCKF offers the following beneﬁts [38]: • Preservation of symmetry and positive (semi)deﬁniteness of the covariance. Improved numerical accuracy owing to the fact that , where the symbol denotes the condition number. • Doubled-order precision. To develop the SCKF, we use (i) the least-squares method for the Kalman gain and (ii) matrix triangular factorizations or triangularizations (e.g., the QR decomposition) for covariance updates. The least-squares method avoids to compute a matrix inversion explicitly, whereas the triangularization essentially computes a triangular square-root factor of the covariance without square-rooting a squared-matrix form of the covariance. Appendix B presents the SCKF algorithm, where all of the steps can be deduced directly from the CKF except for the update of the posterior error covariance; hence we derive it in a squared-equivalent form of the covariance in the appendix.The computational complexity of the SCKF in terms of ﬂops, grows as the cube of the state dimension, hence it is comparable to that of the CKF or the EKF. We may reduce the complexity signiﬁcantly by (i) manipulating sparsity of the square-root covariance carefully and (ii) coding triangularization algorithms for distributed processor-memory architectures. VII. A COMPARISON OF UKF WITH CKF Similarly to the CKF, the unscented Kalman ﬁlter (UKF) is another approximate Bayesian ﬁlter built in the Gaussian domain, but uses a completely different set of deterministic weighted points [10], [39]. To elaborate the approach taken in the UKF, consider an -dimensional random variable having with mean and covariance a symmetric prior density , within which the Gaussian is a special case. Then a set of sample points and weights, are chosen to satisfy the following moment-matching conditions:Among many candidate sets, one symmetrically distributed sample point set, hereafter called the sigma-point set, is picked up as follows:where and the -th column of a matrix is denoted ; the parameter is used to scale the spread of sigma points by from the prior mean , hence the name “scaling parameter”. Due to its symmetry, the sigma-point set matches the skewness. Moreover, to capture the kurtosis of the prior density closely, it is sug(Appendix I of [10], gested that we choose to be [39]). This choice preserves moments up to the ﬁfth order exactly in the simple one-dimensional Gaussian case. In summary, the sigma-point set is chosen to capture a number as correctly as of low-order moments of the prior density possible. Then the unscented transformation is introduced as a method that are related to of computing posterior statistics of by a nonlinear transformation . It approximates the mean and the covariance of by a weighted sum of projected space, as shown by sigma points in the(36)(37)。

【1】Lambertian Reflectance and Linear Subspaces

Lambertian Reflectance and Linear Subspaces Ronen Basri,Member,IEEE,and David W.Jacobs,Member,IEEE Abstract—We prove that the set of all Lambertian reflectance functions(the mapping from surface normals to intensities)obtained with arbitrary distant light sources lies close to a9D linear subspace.This implies that,in general,the set of images of a convex Lambertian object obtained under a wide variety of lighting conditions can be approximated accurately by a low-dimensional linear subspace,explaining prior empirical results.We also provide a simple analytic characterization of this linear space.We obtain these results byrepresenting lighting using spherical harmonics and describing the effects of Lambertian materials as the analog of a convolution.These results allow us to construct algorithms for object recognition based on linear methods as well as algorithms that use convex optimization to enforce nonnegative lighting functions.We also show a simple way to enforce nonnegative lighting when the images of an object lie near a 4D linear space.We apply these algorithms to perform face recognition by finding the3D model that best matches a2D query image.Index Terms—Face recognition,illumination,Lambertian,linear subspaces,object recognition,specular,spherical harmonics.æ1I NTRODUCTIONV ARIABILITY in lighting has a large effect on the appearance of objects in images,as is illustrated in Fig.1.But we show in this paper that the set of images an object produces under different lighting conditions can,in some cases,be simply characterized as a nine dimensional subspace in the space of all possible images.This characterization can be used to construct efficient recogni-tion algorithms that handle lighting variations.Under normal conditions,light coming from all direc-tions illuminates an object.When the sources of light are distant from an object,we may describe the lighting conditions by specifying the intensity of light as a function of its direction.Light,then,can be thought of as a nonnegative function on the surface of a sphere.This allows us to represent scenes in which light comes from multiple sources,such as a room with a few lamps and, also,to represent light that comes from extended sources, such as light from the sky,or light reflected off a wall.Our analysis begins by representing these lighting func-tions using spherical harmonics.This is analogous to Fourier analysis,but on the surface of the sphere.With this representation,low-frequency light,for example,means light whose intensity varies slowly as a function of direction.To model the way diffuse surfaces turn light into an image,we look at the amount of light reflected as a function of the surface normal(assuming unit albedo),for each lighting condition.We show that these reflectance functions are produced through the analog of a convolution of the lighting function using a kernel that represents Lambert’s reflection. This kernel acts as a low-pass filter with99.2percent of its energy in the first nine components,the zero,first,and second order harmonics.(This part of our analysis was derived independently also by Ramamoorthi and Hanrahan[31].)We use this and the nonnegativity of light to prove that under any lighting conditions,a nine-dimensional linear subspace,for example,accounts for at least98percent of the variability in the reflectance function.This suggests that in general the set of images of a convex,Lambertian object can be approxi-mated accurately by a low-dimensional linear subspace.We further show how to analytically derive this subspace from a model of an object that includes3D structure and albedo.To provide some intuition about these results,consider the example shown in Fig.2.The figure shows a white sphere made of diffuse material,illuminated by three distant lights.The lighting function can be described in this case as the sum of three delta functions.The image of the sphere, however,is smoothly shaded.If we look at a cross-section of the reflectance function,describing how the sphere reflects light,we can see that it is a very smoothed version of three delta functions.The diffuse material acts as a filter,so that the reflected light varies much more slowly than the incoming light.Our results help to explain recent experimental work(e.g., Epstein et al.[10],Hallinan[15],Yuille et al.[40])that has indicated that the set of images produced by an object under a wide range of lighting conditions lies near a low dimensional linear subspace in the space of all possible images.Our results also allow us to better understand several existing recogni-tion methods.For example,previous work showed that,if we restrict every point on the surface of a diffuse object to face every light source(that is,ignoring attached shadows),then the set of images of the object lies in a3D linear space(e.g., Shashua[34]and Moses[26]).Our analysis shows that,in fact, this approach uses the linear space spanned by the three first order harmonics,but omits the significant zeroth order(DC) component.Koenderink and van Doorn[21]augmented this space in order to account for an additional,perfect diffuse component.The additional component in their method is the missing DC component.Our analysis also leads us to new methods of recognizing objects with unknown pose and lighting conditions.In particular,we discuss how the harmonic basis,which is derived analytically from a model of an object,can be used in a linear subspace-based object recognition algorithm,in place.R.Basri is with the Department of Computer Science,The WeizmannInstitute of Science,Rehovot,76100Israel.E-mail:ronen.basri@weizmann.ac.il.. D.W.Jacobs is with NEC Research Institute,4Independence Way,Princeton,NJ08540.E-mail:dwj@.Manuscript received19June2001;revised31Dec.2001;accepted30Apr.2002.Recommended for acceptance by P.Belhumeur.For information on obtaining reprints of this article,please send e-mail to:tpami@,and reference IEEECS Log Number114379.0162-8828/03/$17.00ß2003IEEE Published by the IEEE Computer Societyof a basis derived by performing SVD on large collections of rendered images.Furthermore,we show how we can enforce the constraint that light is nonnegative everywhere by projecting this constraint to the space spanned by the harmonic basis.With this constraint recognition is expressed as a nonnegative least-squares problem that can be solved using convex optimization.This leads to an algorithm for recognizing objects under varying pose and illumination that resembles Georghiades et al.[12],but works in a low-dimensional space that is derived analytically from a model.The use of the harmonic basis,in this case,allows us to rapidly produce a representation to the images of an object in poses determined at runtime.Finally,we discuss the case in which a first order approximation provides an adequate approxima-tion to the images of an object.The set of images then lies near a 4D linear subspace.In this case,we can express the nonnegative lighting constraint analytically.We use this expression to perform recognition in a particularly efficient way,without complex,iterative optimization techniques.The paper is divided as follows:Section 2briefly reviews the relevant studies.Section 3presents our analysis of Lambertian reflectance.Section 4uses this analysis to derive new algorithms for object recognition.Finally,Section 5discusses extensions to specular reflectance.2P AST A PPROACHESOur work is related to a number of recent approaches to object recognition that represent the set of images that an object can produce using low-dimensional linear subspaces of the space of all images.Ullman and Basri [38]analytically derive such a representation for sets of 3D points undergoing scaled orthographic projection.Shashua [34]and Moses [26](see also Nayar and Murase [28]and Zhao and Yang [41])derive a 3D linear representation of the set of images produced by a Lambertian object as lighting changes,but ignoring shadows.Hayakawa [16]uses factorization to build 3D models using this linear representation.Koenderink and van Doorn [21]extend this to a 4D space by allowing the light to include a diffuse component.Our work differs from these in that ourrepresentation accounts for attached shadows.These shadows occur when a surface faces away from a light source.We do not account for cast shadows,which occur when an intervening part of an object blocks the light from reaching a different part of the surface.For convex objects,only attached shadows occur.As is mentioned in Section 1,we show below that the 4D space used by Koenderink and van Doorn is in fact the space obtained by a first order harmonic approximation of the images of the object.The 3D space used by Shashua,Moses,and Hayakawa is the same space,but it lacks the significant DC component.Researchers have collected large sets of images and performed PCA to build representations that capture within class variations (e.g.,Kirby and Sirovich [19],Turk and Pentland [37],and Cootes et al.[7])and variations due to pose and lighting (Murase and Nayar [27],Hallinan [15],Belhumeur et al.[3],and Yuille et al.[40];see also Malzbender et al.[24]).This approach and its variations have been extremely popular in the last decade,particularly in applications to face recognition.Hallinan [15],Epstein et al.[10],and Yuille et al.[40]perform experiments that show that large numbers of images of real,Lambertian objects,taken with varied lighting conditions,do lie near a low-dimensional linear space,justifying this representation.Belhumeur and Kriegman [4]have shown that the set of images of an object under arbitrary illumination forms a convex cone in the space of all possible images.This analysis accounts for attached shadows.In addition,for convex,Lambertian objects,they have shown that this cone (called the illumination cone )may have unbounded dimension.They have further shown how to construct the cone from as few as three images.Georghiades et al.[11],[12]use this representa-tion for object recognition.To simplify the representation (an accurate representation of the illumination cone requires all the images that can be obtained with a single directional source),they further projected the images to low-dimen-sional subspaces obtained by rendering the objects and applying PCA to the rendered images.Our analysis allows us to further simplify this process by using instead the harmonic basis,which is derived analytically from a model of the object.This leads to a significant speed up of the recognition process (see Section 4).Spherical harmonics have been used in graphics to efficiently represent the bidirectional reflection distribution function (BRDF)of different materials by,e.g.,Cabral et al.[6]and Westin et al.[39].Koenderink and van Doorn [20]proposed replacing the spherical harmonics basis with a basis for functions on the half-sphere that is derived from the Zernike polynomials,since BRDFs are defined over a half sphere.Nimeroff et al.[29],Dobashi et al.[8],and Teo et al.Fig.1.The same face,under two different lighting conditions.Fig.2.On the left,a white sphere illuminated by three directional (distant point)sources of light.All the lights are parallel to the image plane,one source illuminates the sphere from above and the two others illuminate the sphere from diagonal directions.In the middle,a cross-section of the lighting function with three peaks corresponding to the three light sources.On the right,a cross-section indicating how the sphere reflects light.We will make precise the intuition that the material acts as a low-pass filtering,smoothing the light as it reflects it.[35]explore specific lighting configurations(e.g.,daylight) that can be represented efficiently as a linear combination of basis lightings.Dobashi et al.[8],in particular,use spherical harmonics to form such a basis.Miller and Hoffman[25]were first to describe the process of turning incoming light into reflection as a convolution. D’Zmura[9]describes this process in terms of spherical harmonics.With this representation,after truncating high order components,the reflection process can be written as a linear transformation and,so,the low-order components of the lighting can be recovered by inverting the transformation. He used this analysis to explore ambiguities in lighting.We extend this work by deriving subspace results for the reflectance function,providing analytic descriptions of the basis images,and constructing new recognition algorithms that use this analysis while enforcing nonnegative lighting.Independent of and contemporaneous with our work, Ramamoorthi and Hanrahan[31],[32],[33]have described the effect of Lambertian reflectance as a convolution and analyzed it in terms of spherical harmonics.Like D’Zmura, they use this analysis to explore the problem of recovering lighting from reflectances.Both the work of Ramamoorthi and Hanrahan and ours(first described in[1])show that Lambertian reflectance acts as a low-pass filter with most of the energy in the first nine components.In addition to this,we show that the space spanned by the first nine harmonics accurately approximates the reflectance function under any light configuration,even when the light is dominated by high frequencies.Furthermore,we show how to use this space for object recognition.Since the first introduction of our work,a number of related papers have further used and extended these ideas in a number of directions.Specifically,Ramamoorthi[30] analyzed the relationship between the principal components of the images produced by an object and the first nine harmonics.Lee et al.[23]constructed approximations to this space using physically realizable lighting.Basri and Jacobs[2] used the harmonic formulation to construct algorithms for photometric stereo under unknown,arbitrary lighting. Finally,Thornber and Jacobs[36]and Ramamoorthi and Hanrahan[32]further examined the effect of specularity and cast shadows.3M ODELING I MAGE F ORMATIONIn this section,we construct an analytically derived repre-sentation of the images produced by a convex,Lambertian object illuminated by distant light sources.We restrict ourselves to convex objects,so we can ignore the effect of shadows cast by one part of the object on another part of it.We assume that the surface of the object reflects light according to Lambert’s law[22],which states that materials absorb light and reflect it uniformly in all directions.The only parameter of this model is the albedo at each point on the object,which describes the fraction of the light reflected at that point.This relatively simple model applies to diffuse(nonshiny) materials.It has been analyzed and used effectively in a number of vision applications.By a“distant”light source we mean that it is valid to make the approximation that a light shines on each point in the scene from the same angle,and with the same intensity(this also rules out,for example,slide projectors).Lighting, however,may come from multiple sources,including diffuse sources such as the sky.We can therefore describe the intensity of the light as a single function of its direction that does not depend on the position in the scene.It is important to note that our analysis accounts for attached shadows,which occur when a point in the scene faces away from a light source.While we are interested in understanding the images created by an object,we simplify this problem by breaking it into two parts.We use an intermediate representation,the reflectance function(also called the reflectance map,see Horn [17,chapters10,11]).Given our assumptions,the amount of light reflected by a white surface patch(a patch with albedo of one)depends on the surface normal at that point,but not on its spatial position.For a specific lighting condition,the reflectance function describes how much light is reflected by each surface normal.In the first part of our analysis,we consider the set of possible reflectance functions produced under different illumination conditions.This analysis is independent of the structure of the particular object we are looking at;it depends only on lighting conditions and the properties of Lambertian reflectance.Then,we discuss the relationship between the reflectance function and the image. This depends on object structure and albedo,but not on lighting,except as it determines the reflectance function.We begin by discussing the relation of lighting and reflectance.Before we proceed,we would like to clarify the relation between the reflectance function and the bidirectional reflection distribution function(BRDF).The BRDF of a surface material is a function that describes the ratio of radiance,the amount of light reflected by the surface in every direction (measured in power per unit area per solid angle),to irradiance,the amount of light falling on the surface in every direction(measured in power per unit area).BRDF is commonly specified in a local coordinate frame,in which the surface normal is fixed at the north pole.The BRDF of a Lambertian surface is constant,since such a surface reflects light equally in all direction,and it is equal to1=%.In contrast, thereflectancefunctiondescribestheradianceofaunitsurface area given the entire distribution of light in the scene.The reflectance function is obtained by integrating the BRDF over all directions of incident light,weighting the intensity of the light by the foreshortening of the surface as seen from each lightsource.Inaddition,thereflectancefunctionisspecifiedin a global,viewer centered coordinate frame in which the viewing direction is fixed at the north pole.For example,if a scene is illuminated by a single directional source(a distant point source)of unit intensity,the reflectance function for every surface normal will contain the appropriate foreshortening of the surface with respect to the light source direction scaled by 1=%.(For surface normals that face away from the light source the reflectance function will vanish.)For simplicity,we omit below the extra factor of1=%that arises from the Lambertian BRDF since it only scales the intensities in the image by a constant factor.3.1Image Formation as the Analog of aConvolutionBoth lighting and reflectance can be described as functions on the surface of the sphere.We describe the intensity of light as a function of its direction.This formulation allows us to consider multiple light sources that illuminate an object simultaneously from many directions.We describe reflec-tance as a function of the direction of the surface normal.To begin,we introduce notation for describing such functions.Let S 2denote the surface of a unit sphere centered at the origin.We will use u;v to denote unit vectors.We denote their Cartesian coordinates as ðx;y;z Þ,with x 2þy 2þz 2¼1.When appropriate,we will denote such vectors by a pair of angles,ð ;0Þ,withu ¼ðx;y;z Þ¼ðcos 0sin ;sin 0sin ;cos Þ;ð1Þwhere 0 %and 0 0 2%.In this coordinate frame,the poles are set at ð0;0;Æ1Þ, denotes the angle between u and ð0;0;1Þ,and it varies with latitude,and 0varies with longitude.We will use ð l ;0l Þto denote a direction of light and ð r ;0r Þto denote a direction of reflectance,although we will drop this subscript when there is no ambiguity.Similarly,we may express the lighting or reflectance directions using unit vectors such as u l or v r .Since we assume that the sphere is illuminated by a distant set of lights all points are illuminated by identical lighting conditions.Consequently,the configuration of lights that illuminate the sphere can be expressed as a nonnegative function ‘ð l ;0l Þ,giving the intensity of the light reaching the sphere from each direction ð l ;0l Þ.We may also write this as ‘ðu l Þ,describing lighting direction with a unit vector.According to Lambert’s law,if a light ray of intensity l and coming from the direction u l reaches a surface point with albedo &and normal direction v r ,then the intensity,i ,reflected by the point due to this light is given byi ¼l ðu l Þ&max ðu l Áv r ;0Þ:ð2ÞIf we fix the lighting,and ignore &for now,then the reflected light is a function of the surface normal alone.We write this function as r ð r ;0r Þ,or r ðv r Þ.If light reaches a point from a multitude of directions,then the light reflected by the point would be the sum of (or in the continuous case the integral over)the contribution for each direction.If we denote k ðu Áv Þ¼max ðu Áv;0Þ,then we can write:r ðv r Þ¼ZS 2k ðu l Áv r Þ‘ðu l Þdu l ;ð3Þwhere RS 2denotes integration over the surface of the sphere.Below,we will occasionally abuse notation and write k ðu Þto denote the max of zero and the cosine of the angle between u and the north pole (that is,omitting v means that v is the north pole).We therefore call k the half-cosine function.We can also write k ð Þ,where is the latitude of u ,since k only depends on the component of u .For any fixed v ,as we vary u (as we do while integrating (3)),then k ðu Áv Þcomputes the half cosine function centered around v instead of the north pole.That is,since v r is fixed inside the integral,we can think of k as a function just of u ,which gives the max of zero and the cosine of the angle between u and v r .Thus,intuitively,(3)is analogous to a convolution,in which we center a kernel (the half-cosine function defined by k ),and integrate its product with a signal (‘).In fact,we will call this a convolution,and writer ðv r Þ¼k Ã‘¼defZ S 2k ðu l Áv r Þ‘ðu l Þdu l :ð4ÞNote that there is some subtlety here since we cannot,ingeneral,speak of convolving a function on the surface of the sphere with an arbitrary kernel.This is because we have three degrees of freedom in how we position a convolution kernel on the surface of the sphere,but the output of theconvolution should be a function on the surface of the sphere,which has only two degrees of freedom.However,since k is rotationally symmetric this ambiguity disappears.In fact,we have been careful to only define convolution for rotationally symmetric k .3.2Spherical Harmonics and the Funk-Hecke TheoremJust as the Fourier basis is convenient for examining the results of convolutions in the plane,similar tools exist for understanding the results of the analog of convolutions on the sphere.We now introduce these tools,and use them to show that in producing reflectance,k acts as a low-pass filter.The surface spherical harmonics are a set of functions that form an orthonormal basis for the set of all functions on the surface of the sphere.We denote these functions by Y nm ,with n ¼0;1;2;...and Àn m n :Y nm ð ;0Þ¼ﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃð2n þ1Þ4%ðn Àj m jÞ!ðn þj m jÞ!s P n j m j ðcos Þe im0;ð5Þwhere P nm are the associated Legendre functions ,defined asP nm ðz Þ¼1Àz 2ðÞm=22n !d n þm dz z 2À1ÀÁn :ð6ÞWe say that Y nm is an n th order harmonic.In the course of this paper,it will sometimes be convenient to parameterize Y nm as a function of space coordinates ðx;y;z Þrather than angles.The spherical harmonics,written Y nm ðx;y;z Þ,then become polynomials of degree n in ðx;y;z Þ.The first nine harmonics then becomeY 00¼1ﬃﬃﬃﬃ4%p Y 10¼ﬃﬃﬃﬃ34%q z Y e 11¼ﬃﬃﬃﬃ34%q x Y o11¼ﬃﬃﬃﬃ34%q yY 20¼12ﬃﬃﬃﬃ54%q ð3z 2À1ÞY e 21¼3ﬃﬃﬃﬃﬃﬃ512%q xz Y o 21¼3ﬃﬃﬃﬃﬃﬃ5q yz Y e 22¼3ﬃﬃﬃﬃﬃﬃ5q x 2Ày 2ðÞY o 22¼3ﬃﬃﬃﬃﬃﬃ512%q xy;ð7Þwhere the superscripts e and o denote the even and the odd components of the harmonics,respectively,(soY nm ¼Y e n j m j ÆiY on j m j ,according to the sign of m ;in fact the even and odd versions of the harmonics are more convenient to use in practice since the reflectance function is real).Because the spherical harmonics form an orthonormal basis,thismeansthatany piecewisecontinuousfunction,f ,on the surface of the sphere can be written as a linear combination of an infinite series of harmonics.Specifically,for any f ,f ðu Þ¼X 1n ¼0X n m ¼Ànf nm Y nm ðu Þ;ð8Þwhere f nm is a scalar value,computed as:f nm ¼ZS 2f ðu ÞY Ãnm ðu Þdu;ð9Þand Y Ãnmðu Þdenotes the complex conjugate of Y nm ðu Þ.If we rotate a function f,this acts as a phase shift.Define for every n the n th order amplitude of f asA n¼defﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃ12nþ1X nm¼Ànf2nms:ð10ÞThen,rotating f does not change the amplitude of a particular order.It may shuffle values of the coefficients, f nm,for a particular order,but it does not shift energy between harmonics of different orders.For example, consider a delta function.As in the case of the Fourier transform,the harmonic transform of a delta function has equal amplitude in every order.If the delta function is at the north pole,its transform is nonzero only for the zonal harmonics,in which m¼0.If the delta function is,in general,position,it has some energy in all harmonics.But in either case,the n th order amplitude is the same for all n.Both the lighting function,‘,and the Lambertian kernel, k,can be written as sums of spherical harmonics.Denote by‘¼X1n¼0X nm¼Ànl nm Y nm;ð11Þthe harmonic expansion of‘,and bykðuÞ¼X1n¼0k n Y n0:ð12ÞNote that,because kðuÞis circularly symmetric about the north pole,only the zonal harmonics participate in this expansion,andZ S2kðuÞYÃnmðuÞdu¼0;m¼0:ð13ÞSpherical harmonics are useful in understanding the effect of convolution by k because of the Funk-Hecke theorem, which is analogous to the convolution theorem.Loosely speaking,the theorem states that we can expand‘and k in terms of spherical harmonics and,then,convolving them is equivalent to multiplication of the coefficients of this expansion.We will state the Funk-Hecke theorem here in a form that is specialized to our specific concerns.Our treatment is based on Groemer[13],but Groemer presents a more general discussion in which,for example,the theorem is stated for spaces of arbitrary dimension.Theorem1(Funk-Hecke).Let kðuÁvÞbe a bounded,integrable function on[-1,1].Then:kÃY nm¼ n Y nmwithn¼ﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃ4%rk n:That is,the theorem states that the convolution of a (circularly symmetric)function k with a spherical harmonic Y mn(as defined in(4))results in the same harmonic,scaled by a scalar n. n depends on k and is tied directly to k n,the n th order coefficient of the harmonic expansion of k.Following the Funk-Hecke theorem,the harmonic ex-pansion of the reflectance function,r,can be written as:r¼kÃ‘¼X1n¼0X nm¼Ànð n l nmÞY nm:ð14ÞThis is the chief implication of the Funk-Hecke theorem for our purposes.3.3Properties of the Convolution KernelThe Funk-Hecke theorem implies that in producing the reflectance function,r,the amplitude of the light,‘,at every order n is scaled by a factor n that depends only on the convolution kernel,k.We can use this to infer analytically what frequencies will dominate r.To achieve this,we treat‘as a signal and k as a filter,and ask how the amplitudes of ‘change as it passes through the filter.The harmonic expansion of the Lambertian kernel(12) can be derived(with some tedious manipulation detailed in Appendix A)yieldingk n¼ﬃﬃ%p2n¼0ﬃﬃ%3pn¼1ðÀ1Þn2þ1ﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃð2nþ1Þ%pnnn2n!2;even0n!2;odd:8>>>><>>>>:ð15ÞThe first few coefficients,for example,arek0¼ﬃﬃ%p2%0:8862k1¼ﬃﬃ%3p%1:0233k2¼ﬃﬃﬃﬃ5%p8%0:4954k4¼Àﬃﬃ%p16%À0:1108k6¼ﬃﬃﬃﬃﬃﬃ13%p128%0:0499k8¼ﬃﬃﬃﬃﬃﬃ17%p256%À0:0285:ð16Þ(k3¼k5¼k7¼0),j k n j approaches zero as OðnÀ2Þ.A graph representation of the coefficients is shown in Fig.3.The energy captured by every harmonic term is measured commonly by the square of its respective coefficient divided by the total squared energy of the transformed function.The total squared energy in the half cosine function is given byZ2%Z%k2ð Þsin d d0¼2%Z%2cos2 sin d ¼2%:ð17ÞFig.3.From left to right:A graph representation of the first11coefficients of the Lambertian kernel,the relative energy captured by each of the coefficients,and the cumulative energy.。

Eigenvalue inequalities for Klein-Gordon Operators

2
The plan of attack is to use trace identities to derive universal spectral bounds and geometric spectral bounds for Hm,Ω . The generator of the Cauchy process, corresponding to the√ case m = 0, is often referred to as the fractional Laplacian and designated −∆. The latter is, unfortunately, ambiguous notation, √ since this operator is distinct from the operator −∆Ω as deﬁned by the functional calculus for the Dirichlet Laplacian −∆Ω , except when Ω is all of Rd . For this reason we shall avoid the ambiguous notation when speaking of compact Ω. (For the spectral theorem and the functional calculus, see, e.g., [47].) Whereas several universal eigenvalue bounds, mostly of unknown or indiﬀerent sharpness, have been obtained for higher-order partial diﬀerential operators such as the bilaplacian (e.g., [32,25,14,54,57]), and for some ﬁrst-order Dirac operators [11], universal bounds for pseudodiﬀerential operators appear not to have been studied before. In a ﬁnal section we study interacting Klein-Gordon operators of the form H = Hm,Ω + V (x), (1.2)

On the Mapping of Time-Dependent Densities onto Potentials in Quantum Mechanics

combined with Eq. (2) gives the DoMP9. Before we analyze more fully the real space case, let us first consider a simple two level system with wavefunction
t F ˆ ˆ rd 3r t ; i v r, t n 0 0
(2)
ˆ r t . yields a TD wavefunction t for which n r, t n
From Heisenberg’s equation of motion for the second timederivative of the density, we find an implicit equation for v :
ˆ and interaction ˆ T ˆ as the sum of kinetic energy T ˆ U F ˆ 1 v r r of the particles we want to energy U 12 i j 2 i, j
find the potential v r, t for which the S.E
nt 1 0.8 0.6 0.4 0.2 0.01 0.02 0.03 0.04 t
x t
(8)
1
0.8 0.6 0.4 0.2
0.2
ˆ is useful in finite basis calculations. The integral form for D In real space and Fourier grid applications the operator is in fact essentially positive definite. Because of this latter propˆ is invertible) and Eq. (3) erty, VR is assured in TDDFT ( D

How large can the first eigenvalue be on a surface of genus two

a rX iv:mat h /59398v1[mat h.SP]18Se p25How large can the ﬁrst eigenvalue be on a surface of genus two?Dmitry Jakobson ∗Michael Levitin †Nikolai Nadirashvili ‡Nilima Nigam §Iosif Polterovich ¶11September 2005Abstract Sharp upper bounds for the ﬁrst eigenvalue of the Laplacian on a surface of a ﬁxed area are known only in genera zero and one.We investigate the genus two case and conjecture that the ﬁrst eigenvalue is maximized on a singular surface which is realized as a double branched covering over a sphere.The six ramiﬁcation points are chosen in such a way that this surface has a complex structure of the Bolza surface.We prove that our conjecture follows from a lower bound on the ﬁrst eigenvalue of a certain mixed Dirichlet-Neumann boundary value problem on a half-disk.The latter can be studied numerically,and we present conclusive evidence supporting the conjecture.Keywords:Laplacian,ﬁrst eigenvalue,surface of genus two,mixed boundary value problem.1Introduction and main results1.1Upper bounds on theﬁrst eigenvalueLet M be a closed surface of genusγand let g be the Riemannian metric on M.Denote by∆the Laplace-Beltrami operator on M,and byλ1the smallest positive eigenvalue of the Laplacian.Let the area Area(M)beﬁxed.How large canλ1be on such a surface?Sharp bounds for theﬁrst eigenvalue are known only for the sphere([H],see also[SY]),the projective plane([LY]),the torus([Ber],[N]),and the Klein bottle ([JNP],[EGJ]).The present paper is concerned with the surface of genus2.Let M be orientable and letΠ:M→S2be a non-constant holomorphic map(or,conformal branched covering)of degree d.It was proved in[YY]thatλ1Area(M)≤8πd.(1.1.1) Any Riemann surface of genusγcan be represented as a branched cover over S2of degree d= γ+32 .(1.1.2) In general,(1.1.2)is not sharp,for example forγ=1([N]).Let M=P be a surface of genusγ=2.Then(1.1.2)impliesλ1Area(P)≤16π.(1.1.3) The aim of this paper is to show,using a mixture of analytic and numerical tools, that(1.1.3)is sharp.Main results of this paper were announced(without proofs) in[JLNP,section4].1.2The Bolza surfaceLetΠ:P→S2be a branched covering of degree d=2.The Riemann-Hurwitz formula(see[GH])implies that this cover is ramiﬁed at6points.We choose these points to be the intersections of the round sphere S2centered at the origin with the coordinate axes in R3.The surface P can be realized as(z,w)∈C2:w2=F(z):=z(z−1)(z−i)This surface has the conformal structure of the Bolza surface.It has an octahe-dral group of holomorphic automorphisms and its symmetry group is the largest among surfaces of genus two[I,KW].Interestingly enough,the Bolza surface appears in some other extremal problems,in particular for systoles(see[KS]).To simplify calculations it is convenient to rotate the equatorial plane byπ/4. The equation of P becomesP:= (z,w)∈C2:w2=F(z):=z(z−eπi/4)(z−e3πi/4)u 2,where the scalar product ·,· and the norm · are taken in the space L2(P,g0). The Sobolev space H10(P,g0)of functions supported away from the singularities is obtained by the closure of C∞0(P,g0):={v∈C∞(P,g0)|Since(P,g0)is a double cover of the standard S2,we haveArea(P,g0)=2Area(S2)=8π.Therefore,in order to prove Conjecture1.3.1it suﬃces to show thatλ1(P,g0)=λ1(S2)=2.(1.3.2)Unfortunately,we are unable to prove(1.3.2),and therefore establish Con-jecture1.3.1.We can however reduce the conjecture to the following spectral problem on a quarter-sphere Q⊂S2that can be treated using numerical ly,let,in usual spherical coordinates(φ,θ),Q={(φ,θ):0<φ<π/2,0<θ<π}.We split the boundary∂Q into two parts:∂Q=Finally,we note that the spectral problem(1.3.3)easily reduces via the stere-ographic projection to the following mixed Dirichlet-Neumann problem on a half-disk D:={(r,ψ)∈R2:r<1,0<ψ<π}(here(r,ψ)are usual planar polar coordinates):−∆v=4Λ(1+r2)2v on D,v|∂2D=0,(∂v/∂n)|∂1D=0.(1.3.7)We refer to[JLNP]for a further discussion on Dirichlet-Neumann swap isospec-trality.Figure1:Geometry of boundary value problems(1.3.6)(left)and(1.3.7)(right). Here and further on,the solid red line denotes Dirichlet boundary condition and the dashed blue line—the Neumann one.Remark1.3.8.One can check that a surface with aﬁnite number of conical singularities can be approximated by a sequence of smooth surfaces of the same genus and area in such a way that the corresponding sequence of theﬁrst non-zero eigenvalues converges toλ1on the original surface.Thus,Conjecture1.3.1 means that(1.1.3)is sharp in the class of smooth metrics,although the equality is not necessarily attained.For a general result about the convergence of the whole spectrum see[Ro].52Symmetries2.1Hyperelliptic involutionLet T:P→P,T2=Id be a map that intertwines the preimages of points of S2under a two-sheeted coveringΠ:P→S2.Clearly,the Laplace operator∆commutes with T.By the spectral theorem,we can consider separately the restrictions of the Laplacian onto the spaces of functions which are either even or odd with respect to T.The even functions on P can be identiﬁed with the functions on S2. Therefore,asλ1(S2)=2,we haveλ1(P)≤2,and the equality in(1.3.2)will be achieved if and only if theﬁrst eigenvalueλodd1of the Laplacian acting on the odd subspace satisﬁesλodd1≥2.2.2Isometries of PConsider the following isometries of S2(as usual,we identify S2and C by stere-ographic projection):σ1:z→¯z or(χ,η,ξ)→(χ,−η,ξ),σ2:z→−¯z or(χ,η,ξ)→(−χ,η,ξ),(2.2.1)σ3:z→1/¯z or(χ,η,ξ)→(χ,η,−ξ).Here z=x+iy is a point in the equatorial plane upon which a point(χ,η,ξ)∈S2 is stereographically projected.The hyperelliptic involution T is given by T:(z,w)→(z,−w).For1≤j≤3,a symmetryσj of S2has two corresponding symmetries s j and T◦s j satisfyingΠ◦s j=Π◦T◦s j=σj◦Π.(2.2.2) Those symmetries,with account of(1.2.1)are given by the explicit formulaes1:(z,w)→(¯z,¯z/¯w),s2:(z,w)→(−¯z,i¯w),(2.2.3)s3:(z,w)→(1/¯z,¯w/¯z).As an illustration,we demonstrate how the last of these formulae is obtained:if w2=F(z),then by(1.2.1),F 1¯z(1/¯z−eπi/4)(1/¯z−e3πi/4)F(z)thus giving the expression for s3.It easily seen that all s j commute with T and satisfys2j=Id,j=1,2,3;(2.2.4)s1s3=s3s1,s2s3=s3s2;s2s1=T s1s2.Remark2.2.5.In the proof of Theorem1.3.5we will use only the symmetries s1and s3.Calculations for s2are presented for the sake of completeness(see Remark3.4.1).2.3Fixed point sets of isometriesLet Fix(S)denote aﬁxed point set of a mapping S.As easily seen from(2.2.1), the sets Fix(σj),for j=1,2,3,lie in the union of the coordinate lines and a unit circle of C,and we introduce the following notation for future reference.The coordinate lines are divided into two rays each by the ramiﬁcation point r0:=0, and we denotea1:={z=t,t>0},a2:={z=it,t>0},a3:={z=t,t<0},a4:={z=it,t<0}.The circle is divided into four arcs by the ramiﬁcation points r1:=e−πi/4, r2:=eπi/4,r3:=e3πi/4,and r4:=e−3πi/4,and we denote the arcs bya k+4:={z=e tπi/4,t∈(2k−3,2k−1)},k=1,2,3,4,so that the arc a5goes from r1to r2,the arc a6goes from r2to r3,the arc a7 goes from r3to r4,andﬁnally a8goes from r4to r1.In this notation,theﬁxed point sets Fix(σj)are written asFix(σ1)=a1∪a3,Fix(σ2)=a2∪a4,Fix(σ3)=a5∪a6∪a7∪a8.(2.3.1) Note that each of the rays a j(j=1,2,3,4)intersects an arc a j+4at a single point which we denote z j:z1=1,z2=i,z3=−1,z4=−i,see Figure2.7Figure2:Ramiﬁcation points,rays,arcs and intersections 2.4Fixed point sets of s1,s2,s3Each of the points z j has exactly two pre-images p(m)j :=(w(m)j,z j)∈Π−1z j,m=1,2,where w(1,2)j are the solutions of the equation(w j)2=F(z j),with Fgiven in(1.2.1).These solutions are easily found from(1.2.1);we are of course at liberty to choose which of the two solutions is denoted w(1)j and which is8denoted w(2)j.For deﬁniteness we setw(1)1=i,w(2)1=−i;w(1)2=1+i2,w(2)2=−1+i2; w(1)3=1,w(2)3=−1;w(1)4=1−i2,w(2)4=−1−i2.(2.4.1)For future use,we need to know the images of points p(m)j under the sym-metries s l,l=1,2,3.These are easily calculated from(2.2.3);it turns out thats l p(m)j =p(n)kwith some indices k∈{1,2,3,4},n∈{1,2}.The results of thecalculations are summarized in the following Table1.(j,m)(k,n)l=2(1,1)(1,1)(1,2)(1,2)(1,2)(1,1)(2,1)(2,2)(3,1)(3,2)(3,2)(3,2)(3,1)(3,1)(4,2)(4,1)the notation we postulate that b(m)k∋w(m)k′,k′=((k−1)mod4)+1,e.g. w(1)1∈b(1)1∩b(1)5,w(2)3∈b(2)3∩b(2)7,etc.We now have at our disposal all the information we need in order to obtain theﬁxed point sets of s1,s2,s3.We start with the following two simple Lemmas. Lemma2.4.2.ΠFix(s j)⊆Fix(σj).Proof.Let z∈ΠFix(s j).Then there exists p∈P such thatΠp=z and s j p= p.ThusΠs j p=z and by(2.2.2)σjΠp=σj z=z,so that z∈Fix(σj). Lemma2.4.3.Let a k⊆Fix(σj).Then,for m=1,2,either b(m)k⊆Fix(s j)or b(m)k⊆Fix(T◦s j).Proof.We haveΠb(m)k =a k,so thatσjΠb(m)k=σj a k=a k,and so by(2.2.2)Πs j b(m)k =a k=Πb(m)k=ΠT b(m)k.The result follows from the obvious observa-tion:ifΠα=Πβ,then eitherα=βorα=Tβ.The lemmas lead to the followingProposition2.4.4.Fix(s1)=b(1)1∪b(2)1,Fix(T s1)=b(1)3∪b(2)3,Fix(s2)=b(1)2∪b(2)2,Fix(T s2)=b(1)4∪b(2)4,Fix(s3)=b(1)6∪b(2)6∪b(1)8∪b(2)8,Fix(T s3)=b(1)5∪b(2)5∪b(1)7∪b(2)7.Proof.By Lemmas2.4.2and2.4.3,for any given j theﬁxed sets Fix(s j)and Fix(T s j)consist only of the pre-images of the components a k of the correspond-ingﬁxed sets Fix(σj)(given by(2.3.1)).However we still need to describe whichcomponent b(m)k,m=1,2,lies in Fix(s j)and which in Fix(T s j).As each com-ponent b k(m)is uniquely determined by the point w(m)kgiven by(2.4.1),it issuﬃcient just to check in Table1whether s j w(m)k =w(m)kor T s j w(m)k=w(m)k.For example,toﬁnd Fix(s2)we need only to inspect b(m)2and b(m)4.As,byTable1,s2w(m)2=w(m)2and T s2w(m)4=w(m)4,we have Fix(s2)=b(1)2∪b(2)2andFix(T s2)=b(1)4∪b(2)4.The rest of Proposition2.4.4is obtained in the same manner.103Proof of Theorem1.3.5We divide the proof of Theorem1.3.5into several steps.3.1Even eigenfunctions with respect to TConsider the subspace V+⊂L2(P)consisting of all even eigenfunctions with respect to T.Any such eigenfunction has a well-deﬁned projection on S2.There-fore,if there exists aﬁrst eigenfunction of P that belongs to V+,its projection is an eigenfunction on S2and hence the corresponding eigenvalue is greater or equal than two(recall thatλ1(S2)=2).Hence,in this case the Conjecture1.3.1 is veriﬁed.3.2Use of symmetries s1,s3.Denote by G13the subgroup of the automorphism group of P generated by the symmetries{T,s1,s3}.It follows from(2.2.4)that G13is commutative.Note also that all the elements of G13commute with the Laplacian on P.Therefore,we can choose a basis of L2(P)consisting of joint eigenfunctions of all s∈G13and∆.Given a joint eigenfunction f of all s∈G13,we denote byµ(f,s)the corresponding eigenvalue of s,i.e.f(sx)=µ(f,s)f(x).Since s2j=T2=Id for j=1,3,we see thatµ(f,s)=±1for all s∈G13.3.3Odd eigenfunctions with respect to TConsider now the space V−⊂L2(P)consisting of all eigenfunctions of the Laplacian which are odd with respect to T.Letφ1be a joint eigenfunction of {T,s1,s3,∆},corresponding to the smallest eigenvalue of∆ V−Now,sinceµ(φ1,T)=−1and s23T=T,we haveµ(φ1,s1)µ(φ1,s1T)=µ(φ1,T)=−1,and similarlyµ(φ1,s3)µ(φ1,s3T)=−1.Without loss of generality we may assume thatµ(φ1,s1)=−1.We recall from section2.3that theﬁxed point set Fix s1consists of the arcs b(1)1,b(2)1. Thusφ1must vanish on these arcs.11Consider now the symmetries s3,s3T.We must have one of the following two cases:i)µ(φ1,s3T)=−1,µ(φ1,s3)=1;ii)µ(φ1,s3)=−1,µ(φ1,s3T)=1.Considerﬁrst Case i).Proposition3.3.1.In Case i)the functionφ1vanishes on the arcsb(1)1,b(2)1,b(1)5,b(2)5,b(1)7,b(2)7,and its normal derivative∂nφ1vanishes on the arcsb(1)3,b(2)3,b(1)6,b(2)6,b(1)8,b(2)8.Proof.By Proposition2.4.4,theﬁxed-point set of s3T consists of the arcs b15,b25,b17,b27.Accordingly,φ1vanishes on all those arcs,as well as on b11,b21. Moreover,φ1hasµ(φ1,s3)=µ(φ1,s1T)=1.It follows that the normal deriva-tive of∂nφ1vanishes on theﬁxed-point sets of those symmetries.It remains to apply once more Proposition2.4.4in order to complete the proof.Consider next Case ii).Proposition3.3.2.In Case ii)the functionφ1vanishes on the arcsb(1)3,b(2)3,b(1)6,b(2)6,b(1)8,b(2)8,and its normal derivative∂nφ1vanishes on the arcsb(1)1,b(2)1,b(1)5,b(2)5,b(1)7,b(2)7.Proposition3.3.2is proved in the same way as Proposition3.3.1.3.4Final step of the proofSinceφ1is an odd function with respect to the hyperelliptic involution T,its projection upon S2is not well-deﬁned.However,the projection of|φ1|to S2is well-deﬁned.Denote it byψ1.In Case i),the functionψ1can be chosen as a test function for the mixed Dirichlet-Neumann boundary value problem(1.3.3).Assume now Conjecture121.3.4is true and theﬁrst eigenvalue of(1.3.3)satisﬁesΛ1≥2.Then the Rayleigh quotient ofψ1and hence ofφ1satisﬁes the same inequality.But this means thatψ1cannot be theﬁrst eigenfunction on P since we get a contradiction with(1.1.3).Therefore,theﬁrst eigenfunction of P is even with respect to T, and as was shown in section3.1this implies Conjecture1.3.1.Similarly,in Case ii),the functionψ1can be chosen as a test function for the mixed Dirichlet-Neumann boundary volume problem which is obtained from (1.3.3)by swapping the Dirichlet and the Neumann conditions.However,it was shown in[JLNP]that this problem is isospectral to(1.3.3).Therefore, repeating the same arguments as above we prove that Conjecture1.3.1holds. This completes the proof of Theorem1.3.5.Remark3.4.1.In the proof of Theorem1.3.5we have used only the symmetries s1and s3.Alternatively,we could have used s2and s3.One can check directly using Proposition2.4.4that applying s2one obtains a mixed Dirichlet-Neumann boundary value problem which is equivalent to(1.3.3)and hence no additional information about theﬁrst eigenfunction is obtained.3.5A family of extremal surfaces of genus twoThe purpose of this section is to prove the followingCorollary3.5.1.Conjecture1.3.4implies that there exists a continuous family P t of surfaces of genus2such thatλ1Area(P t)=16π.Proof.Consider the Riemann surface P t deﬁned by the equation(z,w):w2=z z−e i(π/2−t) z−e i(π/2+t)(1+r2)2v on D,v|∂1(t)=0,(∂v/∂n)|∂2(t)=0.(3.5.2)and−∆v=4ΛHere∂1(t):={(r,0):r∈(0,1)}∪{(1,ψ):|ψ−π/2|<t}and∂(t)D:= {(r,π):r∈(0,1)}∪{(1,ψ):π/2>|ψ−π/2|>t}.We remark that for t=π/4these two problems are not ing Dirichlet-Neumann bracketing it is easy to see that(3.5.2)has a smallerﬁrst eigenvalue than(3.5.3)if t<π/4and a larger one if t>π/4.Denote the minimalﬁrst eigenvalue of the two problems byΛ1(t).According to Conjec-ture2and numerical calculations,Λ1(π/4)>2.Since theﬁrst eigenvalues of both problems depend continuously and monotonically on parameter t,and since Λ1(0)=Λ1(π/2)=0.75(see section1.3),there exist numbers t∗1∈(0,π/4) and t∗2∈(π/4,π/2)such thatΛ1(t∗1)=Λ1(t∗2)=2and soΛ1(t)≥2for t∈[t∗1,t∗2].Arguing is above,we deduce that for all surfaces P t corresponding to these values of t,estimate(1.1.3)is sharp.This completes the proof of the theorem.Corollary3.5.1implies that16πis a degenerate maximum forλ1Area(M) for surfaces of genus two.This is not the case for surfaces of lower genus on which the metric maximizing theﬁrst eigenvalue is unique.Note also that the extremal metrics in genera zero and one are analytic,while the surfaces P t have singular points.4Numerical investigations4.1Basics of the Finite Element MethodIn this section we describe the numerical experiments used to estimate theﬁrst eigenvalue of(1.3.7).We deﬁne the space H as the closure of{v∈C∞(D)|dD.(4.1.1)(1+r2)2We useﬁnite elements to approximate the eigenvalues and eigenfunctions of (4.1.1).The general procedure we follow is:1.Discretize the region D using a triangular mesh T h= N h i=1τi,with a sizeof an individual triangleτ∈T h parameterized by h>0.142.Introduce aﬁnite-dimensional subspace V h of H,consisting ofﬁnite ele-ment basis functions{φi}N h i=1on T h;3.Denote(v h,λh)∈V h×R,with v h=(v1,v2,...,v N h)t,the solution oftheﬁnite-dimensional generalized eigenvalue problemA h v h=λhB h v h,(4.1.2)where(A h)ij:= D∇φi·∇φj dD,(B h)ij:= D4φiφjFigure3:Aﬁnite element meshλh N h No.ofArnoldi iterates2.00434573363e-052882.36301118569625118.16135748742e-0846082.301112381849409112.79565739833e-1073728dD,(1+r2)2satisﬁed|Err|<5×10−10.The results are tabulated in Table2.Experiment2:This experiment was conducted using MATLAB’sﬁnite el-ement package PDEToolbox,and the eigenvalue solve was performed using ARPACK routines.A sequence of triangular meshes was created,starting from the coarsest mesh,and reﬁning5times.The major diﬀerence between this and the previous experiment is in the manner in which the zero Dirichlet data is enforced.16λh No.ofTriangles772.4040011835691850410612.305829341498988064163372.28276090970583129024258881dD,(1+r2)2satisﬁed|Err|<5×10−10.The results are presented in Table4.In each of the experiments above,we found that the computed eigenvalues appeared to converge to a value greater than2.27.The associated eigenfunctions also appear to converge to a function whose contour lines are shown in Figure4.17λh N h No.ofArnoldi iterates -1.5494060025e-05288-7.89999667122e-071152-3.8455570927e-084608-1.87263030138e-0918432-8.78059287464e-1173728fellowship and Alfred P.Sloan Foundation fellowship.The research of N.Nig. and I.P.was partially supported by NSERC and FQRNT.References[ArDu]M.Armentano and R.Duran,Asymptotic lower bounds for eigenvalues by nonconformingﬁnite element methods,Electron.Trans.Numer.Anal.17 (2004),93–101.[Bab]I.Babuska and J.Osborn,Eigenvalue problems,in Handbook of Numerical Analysis Vol.II,Finite Element Methods(Part1).Edited by P.G.Ciarlet and J.L.Lions.1991,Elsevier.[Ber]M.Berger,Sur les premi`e res valeurs propres des vari´e t´e s riemanniennes, Compositio Math.26(1973),129–149.[Br]D.Braess,Finite elements,Cambridge University Press,1997.[EGJ]A.El Souﬁ,H.Giacomini,M.Jazar,Greatest least eigenvalue of the Laplacian on the Klein bottle,preprint math.MG/0506585(2005).[GovL]G.Golub,C.van Loan,Matrix Computations,3rd Ed.,John Hopkins University Press,Baltimore,1996.[GH]P.Griﬃths,J.Harris,Principles of algebraic geometry,Wiley,N.Y.,1978.[Gun]R.Gunning,Lectures on Riemann surfaces,Jacobi varieties,Mathematical Notes,No.12.,Princeton Univ.Press,1972.[H]J.Hersch,Quatre propri´e t´e s isop´e rim´e triques de membranes sph´e rique ho-mog`e nes,C.R.Acad.Paris270(1970),1645–1648.[I]J.Igusa,Arithmetic varierty of moduli for genus two,Annals of Math.72(1960),612–649.[JLNP].D.Jakobson,M.Levitin,N.Nadirashvili,I.Polterovich,Spectral prob-lems with mixed Dirichlet-Neumann boundary conditions:isospectrality and beyond,to appear in p.Appl.Math.(2005).[JNP]D.Jakobson,N.Nadirashvili,I.Polterovich,Extremal metric for theﬁrst eigenvalue on the Klein bottle,to appear in Canadian J.Math.(2005).19[KW]H.Karcher,M.Weber,The geometry of Klein’s Riemann surface.The eightfold way,MSRI Publ.35(1999),9–49,Cambridge Univ.Press.[KS]M.Katz,S.Sabourau,An optimal systolic inequality for CAT(0)metrics in genus two,preprint math.DG/0501017(2005).[Ke]J.Keller,Singularities at the tip of a plane angular sector,J.Math.Phys.40(1999),no.2,1087–1092.[LY]P.Li,S.-T.Yau.A new conformal invariant and its applications to the Willmore conjecture and theﬁrst eigenvalue of compact surfaces,Invent.Math.69(1982),269–291.[N]N.Nadirashvili,Berger’s isoperimetric problem and minimal immersions of surfaces,GAFA6(1996),877–897.[Ro]J.Rowlett,Spectral convergence of the Laplacian on a compact manifold with degenerating metric,to appear as a part of the Stanford University Ph.D.thesis under the supervision of R.Mazzeo.[SY]R.Schoen and S.-T.Yau,Lectures on Diﬀerential Geometry,International Press,1994.[TrBa]L.N.Trefethen and D.Bau III,Numerical Linear Algebra,SIAM,1997.[YY]P.Yang,S.-T.Yau,Eigenvalues of the Laplacian of compact Riemann surfaces and minimal submanifolds,Ann.Scuola Norm.Sup.Pisa Cl.Sci.(4) 7(1980),no.1,55–63.20。

Low-momentum interactions with smooth cutoffs

a rXiv:n ucl-t h /693v11Sep26Low-momentum interactions with smooth cutoﬀs S.K.Bogner 1,R.J.Furnstahl 1,S.Ramanan 1and A.Schwenk 2,31Department of Physics,The Ohio State University,Columbus,OH 432102Department of Physics,University of Washington,Seattle,WA 98195-15603TRIUMF,4004Wesbrook Mall,Vancouver,BC,Canada,V6T 2A31IntroductionInternucleon potentials with variable momentum cutoﬀs,known generically as “V low k ,”show great promise for few-and many-body calculations [1,2,3,4,5,6,7].Changing the cutoﬀleaves observables unchanged by construction,but shiftscontributions between the potential and the sums over intermediate states in loop integrals.These shifts can weaken or largely eliminate sources of non-perturbative behavior such as strong short-range repulsion or the ten-sor force[8].An additional bonus is that the corresponding three-nucleon interactions become perturbative at lower cutoﬀs[4].As a result,it is found in practice that few-and many-body calculations can be greatly simpliﬁed or converge more rapidly by lowering the cutoﬀ.This has been observed with few-body variational methods[9,10],the coupled-cluster approach[11],and for nuclear matter[5].Low-momentum nucleon-nucleon interactions were originally derived and con-structed in energy-independent form using model-space methods(such as Lee-Suzuki[1,2]or Okubo[12])applied in momentum space.These approaches deﬁne orthogonal subspaces with projection operators P and Q,such that P+Q=1and P Q=QP=0.In momentum space,the latter condition implies a sharp cutoﬀΛin(relative)momentum,so that P-space integrals run from0toΛ,while Q-space integrals run fromΛto∞(or to a large “bare”cutoﬀ).Subsequently,this V low k construction was shown to be equiva-lent to a Renormalization Group(RG)treatment,derived by requiring cutoﬀindependence of the half-or fully-on-shell T matrix[3].With a sharp cutoﬀ, the equations take a particularly simple form and two-body observables are preserved for all momenta up to the cutoﬀ.However,a sharp cutoﬀalso leads to cusp-like behavior for the interaction (close to the cutoﬀ)in some channels and for the deuteron wave function, which becomes increasingly evident as the cutoﬀis lowered below2fm−1.In some applications,this leads to slow convergence,for example at the10–100keV level in few-body calculations using harmonic oscillator bases(see Fig.1).The reduction of the repulsive short-range interaction simultaneously reduces short-range correlations in the wave functions,which means that vari-ational calculations can be eﬀective with much simpler trial ans¨a tze[9].One would expect that such calculations for the deuteron and the triton,which are particularly low-energy bound states,should show improvement for cutoﬀs well below2fm−1,but instead a degradation was observed in Ref.[9].This result was attributed to the use of sharp cutoﬀs and a preliminary study[10]showed that these problems are alleviated by using a smooth-cutoﬀlow-momentum interaction.In this paper,we verify this conclusion and explore in detail the construction and application of V low k interactions with smooth cutoﬀs. While smooth cutoﬀs seem incompatible with methods requiring P Q=0,it is not a conceptual problem for the RG approach.Indeed,there is an appreciable literature on smooth-cutoﬀregulators for applications of the functional or ex-act RG[14].The functional RG keeps invariant the full generating functional, which translates into preserving all matrix elements of the inter-nucleon T matrix.While this straightforwardly leads to RG equations,it also impliesFig.1.The tharmonic oscillator basis of the low-momentum Hamiltonian derived from the Ar-gonne v18potential[13]with cutoﬀΛ=2fm−1,as a function of the size of the oscillator space(N max ωexcitations).The open circles are calculated with a sharp cutoﬀfor aﬁxed oscillator parameter b while theﬁlled ones correspond to optimiz-ing b at each N max.The dashed line indicates the exact Faddeev result using the sharp-cutoﬀinteraction[4],and shows the slow convergence of the diagonalization at the100kev level.The squares are for a smooth Fermi-Dirac regulator,Eq.(32), that solves the convergence problem.an energy-dependent interaction,which is undesirable for practical few-and many-body calculations.We resolve this conﬂict in Sect.2by constructing a low-momentum,energy-independent interaction with smooth cutoﬀs in three steps:(1)Evolve a large-cutoﬀpotential to a lower,smooth cutoﬀwhile preserv-ing the full oﬀ-shell T matrix.This generates an energy-dependent low-momentum interaction.(2)Convert the energy dependence to momentum dependence,which resultsin a non-hermitian smooth-cutoﬀinteraction.(3)Perform a similarity transformation to hermitize the low-momentum in-teraction.Along with the possibility of many diﬀerent functional forms for the smooth regulators,various hermitization schemes are possible,which reﬂects the gen-eral freedom in low-energy eﬀective theories[15].The freedom in deﬁning low-energy potentials has consequences in practical calculations.This is already evident in Fig.1,where the(converged)triton binding energy for a sharp cutoﬀis50keV less than for a particular Fermi-Dirac regulator.The diﬀerence reﬂects the diﬀerent contribution from the (neglected)short-range three-body force.Just as changing a cutoﬀwith a sharp regulator moves one along a Tjon line[4],we expect that changing the form of the regulator and the hermitization scheme atﬁxed cutoﬀalso does. Thus,each combination of regulator(speciﬁed by one or more“sharpness”parameters)and hermitization scheme will have diﬀerent corresponding three-body(and higher many-body)interactions.In Sect.3,we derive alternative energy-independent RG equations that are generalizations from sharp cutoﬀs.This approach has the advantage of a one-step construction,but accurate solutions of the resulting RG equations for low-momentum cutoﬀs are technically more challenging.The low-momentum interactions with smooth cutoﬀs,with diﬀerent regulator types and diﬀerent hermitization schemes,are applied in the two-body sector and to calcula-tions of the triton in Sect.4.We test the convergence properties and doc-ument the distortions of various combinations.In most cases we derive the low-momentum interactions starting from chiral eﬀectiveﬁeld theory(EFT) potentials at N3LO[16,17],but also use the Argonne v18potential[13]for comparison.We summarize our conclusions in Sect.5and give an outlook for future applications.2Smooth cutoﬀinteractions via an energy-dependent RGOur goal is to construct a smooth cutoﬀversion of the energy-independent and hermitian low-momentum interaction V low k.In this section,we describe a three-step method that utilizes an energy-dependent RG equation to lower the cutoﬀ,followed by two transformations that remove the energy dependence and the resulting non-hermiticity in the interaction.In the next section,we derive an equivalent method that uses a hermitian and energy-independent RG equation to construct the low-momentum interaction in one step. First,we derive the RG equation for an energy-dependent low-momentum interaction V eﬀ(E),which is cut oﬀby smooth regulators.It is convenient and eﬃcient for numerical calculations to deﬁne the partial-wave interaction V eﬀand the corresponding T eﬀmatrix in terms of a reduced potential v and a reduced t matrix asV eﬀ(k′,k;E)=f(k′)v(k′,k;E)f(k),(1) T eﬀ(k′,k;E)=f(k′)t(k′,k;E)f(k),(2)where f(k)is a smooth cutoﬀfunction satisfyingf(k)k≪Λ−→1and f(k)k≫Λ−→0.(3) The regulator functions are the same in each partial wave and possible choices are discussed in Sect.4.Given the low-momentum interaction V eﬀ(E),the reduced fully-oﬀ-shell t ma-trix satisﬁes the Lippmann-Schwinger equation,t(k′,k;E)=v(k′,k;E)+2E−p2,(4)where we use units with =c=m=1and a principal value integral is implicit here and in the following.Note that the smooth cutoﬀis on the loop momentum but not on external momenta,and that v and f are cutoﬀdependent.We impose that the fully-oﬀ-shell t(k′,k;E)is independent of the cutoﬀ,dt(k′,k;E)/dΛ=0.This leads to an energy-dependent RG equation for the change in the reduced low-momentum interaction v(k′,k;E)as the cutoﬀis lowered.In operator form,we have t=v(1+GΛ0t)and thus v=t(1+GΛ0t)−1. With dU−1=−U−1dU U−1,we obtain(see also Ref.[18])dvdΛt(1+GΛ0t)−1=−vdGΛ0dΛv(k′,k;E)=2dΛ[f2(p)]v(p,k;E)2w k/π|k ,where w k are the Gauss-Legendre weights.The new basis states are normalized as ¯k|¯p =δk,p.Tak-ing matrix elements of v(E)between the discretized plane-wave states gives v k′,k;E≡2w k′w k v(k′,k;E),and Eq.(6)takes the simple matrix formddΛ[f2p]If we take a nucleon-nucleon(NN)potential model V NN as the large-cutoﬀinitial condition and numerically integrate the RG equation,the resulting energy-dependent v(k′,k;E)preserves the fully-oﬀ-shell T NN for all external momenta and energies,t(k′,k;E)=T NN(k′,k;E).Therefore,V eﬀ(k′,k;E)= f(k′)v(k′,k;E)f(k)preserves the low momentum fully-oﬀ-shell T NN matrix up to factors of the smooth cutoﬀfunction,T eﬀ(k′,k;E)=f(k′)T NN(k′,k;E)f(k). In the second step,we convert the energy dependence of V eﬀ(E)to momen-tum dependence by using a method similar toﬁeld redeﬁnitions.For this pur-pose,we introduce an energy-independent(but non-hermitian)V low k(k′,k) that reproduces the half-on-shell T eﬀmatrix(and hence wave functions)as V eﬀ(k′,k;E),k′|T eﬀ(p2)|p = k′|V eﬀ(p2)|χp ≡ k′|V low k|χp ,(8) where|χp are the eigenstates of the energy-dependent Hamiltonian H eﬀ(p2)= T+V eﬀ(p2)with relative kinetic energy ing the completeness of the interacting eigenstates,this leads toV low k(k′,k)= ∞0p2dp2of the energy-independent RG equation derived from half-on-shell t matrix equivalence in the next section.The method can be improved further by integrating the energy-dependent RG equation,Eq.(6),formally instead of a numerical integration.That is,using dv−1=−v−1dv v−1=dGΛ0,we obtain v−1−V−1NN=GΛ0−G0,and we recover the Bloch-Horowitz equation generalized to a smooth cutoﬀ,v(k′,k;E)=V NN(k′,k)+2E−p2.(12)Converting the RG equation to this integral equation and solving by matrix methods speeds up the calculation considerably and reduces the accumulation of numerical errors.1Finally,we note that the initial energy-dependent RG equation is not needed if one starts directly from the smooth-cutoﬀgeneralization of the Bloch-Horowitz equation.For this purpose,we separate the free two-nucleon propagator into a smooth-cutoﬀlow-momentum GΛ0and a high-momentum partGΛ0(E))T NN(E),and a rearrangement gives the fully-oﬀ-shell T NN(E)matrix with only low-momentum propagators,T NN(E)=v(E)+v(E)GΛ0(E)T NN(E),(13) where v(E)is the solution to the smooth-cutoﬀBloch-Horowitz equation, Eq.(12).2.1Hermitian low-momentum interactionsIn the third step,we remove the non-hermiticity of V low k by a similarity transformation that orthogonalizes the set of eigenvectors{|χp }of the non-hermitian Hamiltonian H low k=T+V low k.Following Holt et al.[20],we deﬁne a transformation Z byZ|χp =|ξp and ξp|ξp′ =δpp′,(14) where the Kronecker delta normalization implies the discretization procedure of the previous section has been carried out.The hermitian low-momentum interaction is then given by1One recovers the folded diagram series,if one applies the same trick to the energy-independent RG equation discussed in the next section.However,this series does not sum to a simple linear integral equation.There is not a unique choice for the transformation Z,which is a reﬂection of the general freedom in low-energy eﬀective theories.In Ref.[20],it was shown how several common hermitization methods correspond to diﬀerent choices for Z.For example,within the Lee-Suzuki framework[21,22]for deriving eﬀective interactions(which implies a sharp cutoﬀcorresponding to orthogonal P Q=0 projection operators),the eigenstates of the non-hermitian V low k obeyχp|P+ω†ω|χp′ =δpp′,(16) whereω=QωP is the wave operator that parameterizes the Lee-Suzuki decoupling transformation.Identifying Z†Z=P+ω†ωdeﬁnes a class of valid√transformations for Z.For example,setting Z=V low k that preserve the low-momentum fully-on-shell T NN matrix,up to factors of the regulator function T low k(k,k;k2)=f2(k)T NN(k,k;k2),and the deuteron binding energy.The corresponding three-body interactions will diﬀer,however, as discussed below.The Gram-Schmidt method can be applied directly to the smooth-cutoﬀV low k. As in Ref.[20],the orthogonal basis{|ξp }is constructed via|ξ1 =Z11|χ1|ξ2 =Z21|χ1 +Z22|χ2 (17) |ξ3 =Z31|χ1 +Z32|χ2 +Z33|χ3...where the Z pp′are determined sequentially so that ξ′p|ξp =δpp′.We have chosen to take the eigenstate of H low k with lowest energy as the starting vector, although any other linear combination could have been used.In practice,the modiﬁed Gram-Schmidt algorithm with re-orthogonalization of Ref.[26]is utilized to guard against round-oﬀerrors.The transformation Z corresponding to the Gram-Schmidt hermitization is then given byZ= p|ξp χp|and Z−1= p|χp ξp|.(18)In contrast to the Gram-Schmidt method,the other hermitization schemes are formulated in terms of the Lee-Suzuki wave-operatorω,which apparently relies on the use of orthogonal projection operators P Q=0corresponding to sharp cutoﬀs.In order to generalize the Okubo and Andreozzi hermitization schemes to smooth cutoﬀs,it is necessary to eliminate all references toω.This is easily done by noting that the bi-orthogonal complement vectors are deﬁned through χp|χp′ =δpp′.For a sharp cutoﬀ,Eq.(16)implies| χp =(P+ω†ω)|χp ,and thusP+ω†ω= p| χp χp|.(19)For smooth cutoﬀs,the obvious generalization is to construct the operator p| χp χp|and decompose it asZ†Z= p| χp χp|.(20)As for sharp cutoﬀs,the generalized Okubo transformation corresponds to Z=π ∞0p2dp v(k′,p)f2(p)t(p,k;k2)choice preserves the on-shell t matrix while also maintaining energy indepen-dence.The result is0=2dΛ πδ(p−k)k2−p2−2dΛ[f2(p)]t(p,k;k2)π ∞0p2dp dv(k′,p)f(k)χk(p)=2dΛv(k′,p)f(p)t(p,k;k2)π ∞0p2dp dv(k′,p)π ∞0p2dp2d f(p)p2−k2 p|V low k|χk=2dΛv(k′,p) p|V low k G(p2)|χk ,(26)where G(E)=(E−H low k)−1is the interacting two-nucleon Green’s ing the completeness relation, k2dk χ∗k(p′)χk(p)=πdΛv(k′,k)=2dΛ[f2(p)]t(p,k;p2)dΛV low k(k′,k)=21−(k/Λ)2.(28)Finally,we can use the Okubo transformation to hermitize V low k.In order to generalize the hermitization to smooth cutoﬀs,we consider the sharp-cutoﬀOkubo transformation under an inﬁnitesimal change of the cutoﬀ.In this case,one can show that the RG equation for the hermitiandΛπ T low k(Λ,k;Λ2)T low k(k′,Λ;Λ2)1−(k′/Λ)2 .(29)A simple generalization of the Okubo transformation to smooth cutoﬀs is therefore obtained by symmetrizing the smooth-cutoﬀRG equation,Eq.(27), to obtain1v(k′,p)d t(p,k;p2)dv(k′,k)=t(k′,p;p2)d v(p,k)V low k preserves the low-momentum fully-on-shell T NN matrix,up to factors of the regulator function T low k(k,k;k2)=f2(k)T NN(k,k;k2),and the deuteron bind-ing energy.The freedom in the hermitization method can be expressed as an auxiliary condition dt(k′,k;k2)/dΛ=(k2−k′2)Φ(k′,k),whereΦ(k′,k)is a function with lim k→k′(k2−k′2)Φ(k′,k)=0.The above RG equation derived from the Okubo transformation makes a particular choice forΦ(k′,k).The numerical solution of the energy-independent RG equation is compli-cated by the t matrix calculation involved in each step.The computational overhead slows down the ODE solver signiﬁcantly.In addition,the RG equa-tion involves two-dimensional interpolations and principal-value integrals over narrowly peaked functions.Therefore,it is easy to introduce small errors at each step that can accumulate as the cutoﬀis lowered.Finally,some potential models exhibit spurious resonances(at order GeV energies and momenta)[27] and these need to be subtracted before solving the energy-independent RG equation for these potentials.Because of these diﬃculties,we have exclusively used the three-step method from Sect.2to generate the results presented in the next section.4ResultsIn this section,we apply the formalism discussed in Sect.2to derive hermitian, low-momentum interactions with smooth cutoﬀs.Electromagnetic contribu-tions are included in the evolution to low momenta.Since all of the following results are for the hermitian V low k,we drop the overbar hereafter.In select-ing a smooth regulator function satisfying the conditions of Eq.(3),there is obviously much freedom,which parallels the freedom ofﬁeld redeﬁnitions in low-energy eﬀective theories and the functional RG[28].However,there are trade-oﬀs in the choice.A smoother cutoﬀwill dampen more the artifacts of a theta-function regulator but will distort more the phase shifts for momenta near the cutoﬀ.We present results for two choices for f(k),each with a range of parameters. These are the exponential form used in current chiral EFT potentials[16,17] with integer n determining the smoothness,f(k)=e−(k2/Λ2)n,(31) and a Fermi-Dirac form with a sharper cutoﬀachieved with smallerǫ,f(k)=11+(k2/Λ2)n,(33)a hyperbolic tangent form with anǫparameter that plays a similar role as in the Fermi-Dirac function,f(k)=1Λkǫ ,(34)a complementary error function withǫparameter,f(k)=1ǫ ,(35)and a Strutinsky averaging withǫparameter,f(k)=1ǫ2 +erfΛ2−k2Fig.2.Plots of momentum k forΛ=2fm−1and a range of parameters n andǫ.In each case,the function and its derivatives are continuous,and the parameter n orǫcontrols the smoothness.In Fig.2,the quantity[f(k)]2is plotted against k forﬁxed cutoﬀΛ=2fm−1 using some candidate parameterizations of the exponential and Fermi-Dirac forms.While a sharp cutoﬀ(for which f(k)=θ(Λ−k))preserves the on-shell T matrix up to the cutoﬀ,the T matrix for a smooth cutoﬀis multiplied by [f(k)]2,leading to distortions in the phase shifts near the cutoﬀ.The expo-nential regulator(“exp”)from Eq.(31)with n=3corresponds to what is used in N3LO chiral potentials[16,17];it is evident that this regulator applied in the present context will signiﬁcantly distort the phase shifts for momenta well belowΛ.As n is increased,the regulator gets sharper;for numerical rea-sons,n=10is probably the practical upper limit.The Fermi-Dirac form (“FD”)interpolates smoothly between sharp and smooth as a function of the ǫparameter.It causes less distortion for lower momenta than the exponential regulator.The eﬀects of diﬀerent regulators on V low k potentials with the same starting (“bare”)potential and the same cutoﬀare illustrated in Fig.3.In channels where the potential is close to zero atΛ,such as the3S1partial wave,the diﬀerences between sharp and smooth are slight,particularly as the regulator gets sharper(n exp is smoother thanǫFD=0.5fm−1;see Fig.2).However,the diﬀerence in other channels can be striking,as seen in Fig.3for the3D2partial wave.The observed cusp-like behavior is due to reproducing the phase shifts for momenta up to the cutoﬀ.The existence of sharp regulator artifacts is not a problem in principle,as the potential is not an observable,but in practice it can lead to convergence problems at the10–100keV level in the deuteron and triton.Fig.3.Diagonal matrix elements V low k(k,k)forΛ=2fm,derived from the N LO chiral potential of Ref.[16]with a sharp and two smooth regulators.One of the striking properties of V low k with a sharp cutoﬀis the“collapse”of diﬀerent high-precision NN potentials to almost the same low-momentum potential as the cutoﬀis lowered below3fm−1[2].This behavior is expected as a consequence of the same long-range pion-exchange interaction together with phase shift equivalence of the potentials up to k≈2.1fm−1.Therefore, it is not surprising that it remains a property of V low k interactions derived with smooth regulators,as shown in Fig.4for a set of chiral potentials with diﬀerent initial(“bare”)cutoﬀs.Note that the low-momentum interactions from diﬀerent starting potentials are close but not identical.The diﬀerences will be paralleled by diﬀerences in the corresponding short-range three-body interactions.Low-momentum interactions with smooth cutoﬀs reproduce the initial phase shifts up to factors of the regulator function.The error in phase shifts due to the regulator alone is illustrated in Fig.5for some representative two-body phase shifts as a function of laboratory energy.In particular,for each energy,the on-shell T-matrix from a bare potential(in this case the N3LO chiral potential from Ref.[16])is multiplied by[f(k)]2using the momentum k corresponding to that energy.This corresponds to the distortion that would be present if numerical errors in constructing V low k were negligible.The latter are documented next and are small.We have no universal rule for deciding whether a distortion is acceptable;it depends on how it propagates to the observable in question.For example,for the low-energy bound-state of the deuteron,none of the distortions in Fig.5is important.The distortion is analogous to the error band from a chiral EFT truncation(but we have not formulated a corresponding power counting rule).Therefore,we expect that there is no concern if the distortion is comparable to the EFT truncation error.In addition,for low-energy properties,the error incurred here can be absorbed by the short-range part of the corresponding three-body interactions.Fig.4.with a Fermi-Dirac regulator(ǫFD=0.5fm−1)as the cutoﬀis lowered in the1S0 and3S1channels.The diagonal matrix elements V low k(k,k)are shown,but as in Ref.[2],weﬁnd similar results for the oﬀ-diagonal matrix elements.The diﬀerent lines correspond to diﬀerent starting potentials with the corresponding cutoﬀs,Λ[16] orΛ/ Λ[17],in MeV given in the legends.For Ref.[17], Λis the spectral function cutoﬀ.Consequently,the propagation of phase-shift errors to many-body observables can only be studied after including three-nucleon forces.It is evident,however, that n exp=8andǫFD=0.5only distort minimally.We will show below that these are also good choices for convergence.In future applications,the regulator eﬀects can be tested by varying the parameters determining thesmoothness.tization methods using the exponential regulator withΛ=2fm−1.The potentials used on the left were derived from Argonne v18[13]with n exp=4and those used on the right were derived from the N3LO chiral potential from Ref.[16]with n exp=8. Phase shifts can also diﬀer from those calculated from the input(“bare”) potential because of numerical errors in generating the low-momentum in-teraction.In Fig.6,the eﬀects of numerical errors on the phase shifts for diﬀerent hermitization methods(butﬁxed regulator)are isolated by plotting the diﬀerence of the calculated low-momentum phase shifts and the distortedspace( u(k)and w(k)respectively)for the bare N3LO chiral potential from Ref.[16] and those derived using smooth and sharp cutoﬀs atΛ=2.0fm−1and1.5fm−1. bare phase shifts.We conclude that the Okubo hermitization is numericallymore robust than the Gram-Schmidt or Cholesky methods,with errors in the phase shifts of order10−4degrees below200MeV and then varying up to10−3degrees depending on the regulator and the initial potential.Consequently, we use the Okubo hermitization in the following unless otherwise speciﬁed.Forﬁxed hermitization but diﬀerent regulators,we have found that relativelysmooth cutoﬀs(e.g.,ǫFD=0.5fm−1)achieve10−4degree accuracy with a moderate(of order50)Gauss points,but sharper cutoﬀs(e.g.,ǫFD=0.2fm−1)have errors for energies above E lab=200MeV that can grow as large as10−1degrees.Greater accuracy can be obtained with a more carefully prescribed distribution of points.2The deuteron wave functions for smooth and sharp cutoﬀs are contrasted inmomentum space in Fig.7and in coordinate space in Fig.8.We follow thenotation of Ref.[29],with S–wave and D–wave components denoted in coordi-nate space by u and w respectively,and with tildes in momentum space.Theyare normalized as ∞0dr[u(r)2+w(r)2]=1and ∞0dk k2[ u(k)2+ w(k)2]=1. The sharp-cutoﬀwave functions develop cusp-like structure in momentumspace below2fm−1(see inset in Fig.7),which are removed by the Fermi-Dirac(or any other smooth)regulator.The diﬀerent momentum-space behav-ior is evident in the coordinate-space wave functions as smaller amplitudescutoﬀfor selected partial waves.The V low k interactions are derived from the N 3LO chiral potential of Ref.[16].in the large-distance oscillations with increasing smoothing (for the S–state component this is visible only under additional magniﬁcation).Note that the “wound”in the N 3LO coordinate-space wave function is removed by running down the cutoﬀ(we plot u (r )/r to make the suppression near the origin more explicit).This feature of the wave functions leads to more perturbative be-havior in nuclear matter [5]as well as in few-body systems [8,10].The “perturbativeness”of V low k interactions with sharp cutoﬀs was exam-ined in Ref.[8]using Weinberg eigenvalues as a diagnostic.For details on theFig.10.D–state probability P D (left axis),binding energy E d (lower right axis),and asymptotic D/S ratio ηd (upper right axis)of the deuteron as a function of the cutoﬀ,starting from the Argonne v 18[13](left)and the N 3LO chiral potential of Ref.[16](right)with diﬀerent smooth regulators.Weinberg analysis we refer the reader to Refs.[8,30].In Fig.9,we show the largest repulsive Weinberg eigenvalues as a function of the cutoﬀfor selected channels,using the N 3LO chiral potential from Ref.[16],which is constructed with a cutoﬀof 500MeV.Although this is already a fairly soft potential,we still observe the characteristic decrease with low-momentum cutoﬀstarting as high as 3.5fm −1(rather than at 2.5fm −1,as one might naively expect).This translates into weaker correlations in many-body wave functions (and there-fore better convergence).The rate of decrease is largely independent of the smoothness of the cutoﬀ.Various deuteron properties are shown in Fig.10as a function of the cutoﬀusing potentials derived from the Argonne v 18potential [13]with exponential regulator n exp =8on the left,and from the N 3LO chiral potential of Ref.[16]with exponential regulator n exp =6on the right.Plotted on the left axis is the D–state probability P D ,deﬁned asP D ≡∞0dr w (r )2=∞dk k 2 w(k )2.(37)The cutoﬀdependence reﬂects the fact that P D is not an observable [31,32].The D–state probability evolves with the short-range part of the potential and,in particular,the short-range tensor interaction decreases as the cutoﬀis lowered.This decrease is desirable to reduce correlations in many-body wave functions [5].The qualitative change in P D is the same for other regulators and hermitization schemes (the Gram-Schmidt procedure is used in Fig.10).In contrast to the D–state probability,the deuteron binding energy E d and the asymptotic D/S ratio ηd are observables and thus cutoﬀindependent (up to nu-merical tolerances),as shown by the right axes in Fig.10.The asymptotic D/S ratio is calculated here by an extrapolation of the ratio of the deuteron wavederator as a function of the cutoﬀfor diﬀerent smoothness regulators(left)and for diﬀerent hermitization schemes(right).The low-momentum interactions are derived from the Argonne v18potential[13],and the experimental quadrupole moment is indicated with an arrow.function D–to S–state components to the deuteron pole at k2=−mE d[31], rather than the conventional approach of extrapolating the3S1–3D1mixing angleε1.That is,we evaluate− w(k2)/ u(k2)on the Gauss mesh for positive k2,and then make a(near-linear)extrapolation to k2=−mE d.The constancy ofηd in Fig.10directly refutes the claim of Ref.[33]that this quantity cannot be reproduced using low-momentum interactions.Matrix elements of operators that are dominated by distance scales larger than the inverse cutoﬀare to a good approximation preserved as the cutoﬀis lowered.We investigate the evolution of operators by studying the expectation values in the deuteron of the bare quadrupole moment,rms radius and1/r operators.The relevant formulas are[29]Q d=18u(r)−w(r))=−18 k2d u(k)dk+3k w(k)d u(k)dk2+6 w(k)2 ,(38)r d=12∞0dk k d u(k)dk 2+6 w(k)2 1/2,(39)。

Consensus and Cooperation in Networked Multi-Agent Systems

Consensus and Cooperation in Networked Multi-Agent SystemsAlgorithms that provide rapid agreement and teamwork between all participants allow effective task performance by self-organizing networked systems.By Reza Olfati-Saber,Member IEEE,J.Alex Fax,and Richard M.Murray,Fellow IEEEABSTRACT|This paper provides a theoretical framework for analysis of consensus algorithms for multi-agent networked systems with an emphasis on the role of directed information flow,robustness to changes in network topology due to link/node failures,time-delays,and performance guarantees. An overview of basic concepts of information consensus in networks and methods of convergence and performance analysis for the algorithms are provided.Our analysis frame-work is based on tools from matrix theory,algebraic graph theory,and control theory.We discuss the connections between consensus problems in networked dynamic systems and diverse applications including synchronization of coupled oscillators,flocking,formation control,fast consensus in small-world networks,Markov processes and gossip-based algo-rithms,load balancing in networks,rendezvous in space, distributed sensor fusion in sensor networks,and belief propagation.We establish direct connections between spectral and structural properties of complex networks and the speed of information diffusion of consensus algorithms.A brief introduction is provided on networked systems with nonlocal information flow that are considerably faster than distributed systems with lattice-type nearest neighbor interactions.Simu-lation results are presented that demonstrate the role of small-world effects on the speed of consensus algorithms and cooperative control of multivehicle formations.KEYWORDS|Consensus algorithms;cooperative control; flocking;graph Laplacians;information fusion;multi-agent systems;networked control systems;synchronization of cou-pled oscillators I.INTRODUCTIONConsensus problems have a long history in computer science and form the foundation of the field of distributed computing[1].Formal study of consensus problems in groups of experts originated in management science and statistics in1960s(see DeGroot[2]and references therein). The ideas of statistical consensus theory by DeGroot re-appeared two decades later in aggregation of information with uncertainty obtained from multiple sensors1[3]and medical experts[4].Distributed computation over networks has a tradition in systems and control theory starting with the pioneering work of Borkar and Varaiya[5]and Tsitsiklis[6]and Tsitsiklis,Bertsekas,and Athans[7]on asynchronous asymptotic agreement problem for distributed decision-making systems and parallel computing[8].In networks of agents(or dynamic systems),B con-sensus[means to reach an agreement regarding a certain quantity of interest that depends on the state of all agents.A B consensus algorithm[(or protocol)is an interaction rule that specifies the information exchange between an agent and all of its neighbors on the network.2 The theoretical framework for posing and solving consensus problems for networked dynamic systems was introduced by Olfati-Saber and Murray in[9]and[10] building on the earlier work of Fax and Murray[11],[12]. The study of the alignment problem involving reaching an agreement V without computing any objective functions V appeared in the work of Jadbabaie et al.[13].Further theoretical extensions of this work were presented in[14] and[15]with a look toward treatment of directed infor-mation flow in networks as shown in Fig.1(a).Manuscript received August8,2005;revised September7,2006.This work was supported in part by the Army Research Office(ARO)under Grant W911NF-04-1-0316. R.Olfati-Saber is with Dartmouth College,Thayer School of Engineering,Hanover,NH03755USA(e-mail:olfati@).J.A.Fax is with Northrop Grumman Corp.,Woodland Hills,CA91367USA(e-mail:alex.fax@).R.M.Murray is with the California Institute of Technology,Control and Dynamical Systems,Pasadena,CA91125USA(e-mail:murray@).Digital Object Identifier:10.1109/JPROC.2006.8872931This is known as sensor fusion and is an important application of modern consensus algorithms that will be discussed later.2The term B nearest neighbors[is more commonly used in physics than B neighbors[when applied to particle/spin interactions over a lattice (e.g.,Ising model).Vol.95,No.1,January2007|Proceedings of the IEEE2150018-9219/$25.00Ó2007IEEEThe common motivation behind the work in [5],[6],and [10]is the rich history of consensus protocols in com-puter science [1],whereas Jadbabaie et al.[13]attempted to provide a formal analysis of emergence of alignment in the simplified model of flocking by Vicsek et al.[16].The setup in [10]was originally created with the vision of de-signing agent-based amorphous computers [17],[18]for collaborative information processing in ter,[10]was used in development of flocking algorithms with guaranteed convergence and the capability to deal with obstacles and adversarial agents [19].Graph Laplacians and their spectral properties [20]–[23]are important graph-related matrices that play a crucial role in convergence analysis of consensus and alignment algo-rithms.Graph Laplacians are an important point of focus of this paper.It is worth mentioning that the second smallest eigenvalue of graph Laplacians called algebraic connectivity quantifies the speed of convergence of consensus algo-rithms.The notion of algebraic connectivity of graphs has appeared in a variety of other areas including low-density parity-check codes (LDPC)in information theory and com-munications [24],Ramanujan graphs [25]in number theory and quantum chaos,and combinatorial optimization prob-lems such as the max-cut problem [21].More recently,there has been a tremendous surge of interest V among researchers from various disciplines of engineering and science V in problems related to multia-gent networked systems with close ties to consensus prob-lems.This includes subjects such as consensus [26]–[32],collective behavior of flocks and swarms [19],[33]–[37],sensor fusion [38]–[40],random networks [41],[42],syn-chronization of coupled oscillators [42]–[46],algebraic connectivity 3of complex networks [47]–[49],asynchro-nous distributed algorithms [30],[50],formation control for multirobot systems [51]–[59],optimization-based co-operative control [60]–[63],dynamic graphs [64]–[67],complexity of coordinated tasks [68]–[71],and consensus-based belief propagation in Bayesian networks [72],[73].A detailed discussion of selected applications will be pre-sented shortly.In this paper,we focus on the work described in five key papers V namely,Jadbabaie,Lin,and Morse [13],Olfati-Saber and Murray [10],Fax and Murray [12],Moreau [14],and Ren and Beard [15]V that have been instrumental in paving the way for more recent advances in study of self-organizing networked systems ,or swarms .These networked systems are comprised of locally interacting mobile/static agents equipped with dedicated sensing,computing,and communication devices.As a result,we now have a better understanding of complex phenomena such as flocking [19],or design of novel information fusion algorithms for sensor networks that are robust to node and link failures [38],[72]–[76].Gossip-based algorithms such as the push-sum protocol [77]are important alternatives in computer science to Laplacian-based consensus algorithms in this paper.Markov processes establish an interesting connection between the information propagation speed in these two categories of algorithms proposed by computer scientists and control theorists [78].The contribution of this paper is to present a cohesive overview of the key results on theory and applications of consensus problems in networked systems in a unified framework.This includes basic notions in information consensus and control theoretic methods for convergence and performance analysis of consensus protocols that heavily rely on matrix theory and spectral graph theory.A byproduct of this framework is to demonstrate that seem-ingly different consensus algorithms in the literature [10],[12]–[15]are closely related.Applications of consensus problems in areas of interest to researchers in computer science,physics,biology,mathematics,robotics,and con-trol theory are discussed in this introduction.A.Consensus in NetworksThe interaction topology of a network of agents is rep-resented using a directed graph G ¼ðV ;E Þwith the set of nodes V ¼f 1;2;...;n g and edges E V ÂV .TheFig.1.Two equivalent forms of consensus algorithms:(a)a networkof integrator agents in which agent i receives the state x j of its neighbor,agent j ,if there is a link ði ;j Þconnecting the two nodes;and (b)the block diagram for a network of interconnecteddynamic systems all with identical transfer functions P ðs Þ¼1=s .The collective networked system has a diagonal transfer function and is a multiple-input multiple-output (MIMO)linear system.3To be defined in Section II-A.Olfati-Saber et al.:Consensus and Cooperation in Networked Multi-Agent Systems216Proceedings of the IEEE |Vol.95,No.1,January 2007neighbors of agent i are denoted by N i ¼f j 2V :ði ;j Þ2E g .According to [10],a simple consensus algorithm to reach an agreement regarding the state of n integrator agents with dynamics _x i ¼u i can be expressed as an n th-order linear system on a graph_x i ðt Þ¼X j 2N ix j ðt ÞÀx i ðt ÞÀÁþb i ðt Þ;x i ð0Þ¼z i2R ;b i ðt Þ¼0:(1)The collective dynamics of the group of agents following protocol (1)can be written as_x ¼ÀLx(2)where L ¼½l ij is the graph Laplacian of the network and itselements are defined as follows:l ij ¼À1;j 2N i j N i j ;j ¼i :&(3)Here,j N i j denotes the number of neighbors of node i (or out-degree of node i ).Fig.1shows two equivalent forms of the consensus algorithm in (1)and (2)for agents with a scalar state.The role of the input bias b in Fig.1(b)is defined later.According to the definition of graph Laplacian in (3),all row-sums of L are zero because of P j l ij ¼0.Therefore,L always has a zero eigenvalue 1¼0.This zero eigenvalues corresponds to the eigenvector 1¼ð1;...;1ÞT because 1belongs to the null-space of L ðL 1¼0Þ.In other words,an equilibrium of system (2)is a state in the form x Ã¼ð ;...; ÞT ¼ 1where all nodes agree.Based on ana-lytical tools from algebraic graph theory [23],we later show that x Ãis a unique equilibrium of (2)(up to a constant multiplicative factor)for connected graphs.One can show that for a connected network,the equilibrium x Ã¼ð ;...; ÞT is globally exponentially stable.Moreover,the consensus value is ¼1=n P i z i that is equal to the average of the initial values.This im-plies that irrespective of the initial value of the state of each agent,all agents reach an asymptotic consensus regarding the value of the function f ðz Þ¼1=n P i z i .While the calculation of f ðz Þis simple for small net-works,its implications for very large networks is more interesting.For example,if a network has n ¼106nodes and each node can only talk to log 10ðn Þ¼6neighbors,finding the average value of the initial conditions of the nodes is more complicated.The role of protocol (1)is to provide a systematic consensus mechanism in such a largenetwork to compute the average.There are a variety of functions that can be computed in a similar fashion using synchronous or asynchronous distributed algorithms (see [10],[28],[30],[73],and [76]).B.The f -Consensus Problem and Meaning of CooperationTo understand the role of cooperation in performing coordinated tasks,we need to distinguish between un-constrained and constrained consensus problems.An unconstrained consensus problem is simply the alignment problem in which it suffices that the state of all agents asymptotically be the same.In contrast,in distributed computation of a function f ðz Þ,the state of all agents has to asymptotically become equal to f ðz Þ,meaning that the consensus problem is constrained.We refer to this con-strained consensus problem as the f -consensus problem .Solving the f -consensus problem is a cooperative task and requires willing participation of all the agents.To demonstrate this fact,suppose a single agent decides not to cooperate with the rest of the agents and keep its state unchanged.Then,the overall task cannot be performed despite the fact that the rest of the agents reach an agree-ment.Furthermore,there could be scenarios in which multiple agents that form a coalition do not cooperate with the rest and removal of this coalition of agents and their links might render the network disconnected.In a dis-connected network,it is impossible for all nodes to reach an agreement (unless all nodes initially agree which is a trivial case).From the above discussion,cooperation can be infor-mally interpreted as B giving consent to providing one’s state and following a common protocol that serves the group objective.[One might think that solving the alignment problem is not a cooperative task.The justification is that if a single agent (called a leader)leaves its value unchanged,all others will asymptotically agree with the leader according to the consensus protocol and an alignment is reached.However,if there are multiple leaders where two of whom are in disagreement,then no consensus can be asymptot-ically reached.Therefore,alignment is in general a coop-erative task as well.Formal analysis of the behavior of systems that involve more than one type of agent is more complicated,partic-ularly,in presence of adversarial agents in noncooperative games [79],[80].The focus of this paper is on cooperative multi-agent systems.C.Iterative Consensus and Markov ChainsIn Section II,we show how an iterative consensus algorithm that corresponds to the discrete-time version of system (1)is a Markov chainðk þ1Þ¼ ðk ÞP(4)Olfati-Saber et al.:Consensus and Cooperation in Networked Multi-Agent SystemsVol.95,No.1,January 2007|Proceedings of the IEEE217with P ¼I À L and a small 90.Here,the i th element of the row vector ðk Þdenotes the probability of being in state i at iteration k .It turns out that for any arbitrary graph G with Laplacian L and a sufficiently small ,the matrix P satisfies the property Pj p ij ¼1with p ij !0;8i ;j .Hence,P is a valid transition probability matrix for the Markov chain in (4).The reason matrix theory [81]is so widely used in analysis of consensus algorithms [10],[12]–[15],[64]is primarily due to the structure of P in (4)and its connection to graphs.4There are interesting connections between this Markov chain and the speed of information diffusion in gossip-based averaging algorithms [77],[78].One of the early applications of consensus problems was dynamic load balancing [82]for parallel processors with the same structure as system (4).To this date,load balancing in networks proves to be an active area of research in computer science.D.ApplicationsMany seemingly different problems that involve inter-connection of dynamic systems in various areas of science and engineering happen to be closely related to consensus problems for multi-agent systems.In this section,we pro-vide an account of the existing connections.1)Synchronization of Coupled Oscillators:The problem of synchronization of coupled oscillators has attracted numer-ous scientists from diverse fields including physics,biology,neuroscience,and mathematics [83]–[86].This is partly due to the emergence of synchronous oscillations in coupled neural oscillators.Let us consider the generalized Kuramoto model of coupled oscillators on a graph with dynamics_i ¼ Xj 2N isin ð j À i Þþ!i (5)where i and !i are the phase and frequency of the i thoscillator.This model is the natural nonlinear extension of the consensus algorithm in (1)and its linearization around the aligned state 1¼...¼ n is identical to system (2)plus a nonzero input bias b i ¼ð!i À"!Þ= with "!¼1=n P i !i after a change of variables x i ¼ð i À"!t Þ= .In [43],Sepulchre et al.show that if is sufficiently large,then for a network with all-to-all links,synchroni-zation to the aligned state is globally achieved for all ini-tial states.Recently,synchronization of networked oscillators under variable time-delays was studied in [45].We believe that the use of convergence analysis methods that utilize the spectral properties of graph Laplacians willshed light on performance and convergence analysis of self-synchrony in oscillator networks [42].2)Flocking Theory:Flocks of mobile agents equipped with sensing and communication devices can serve as mobile sensor networks for massive distributed sensing in an environment [87].A theoretical framework for design and analysis of flocking algorithms for mobile agents with obstacle-avoidance capabilities is developed by Olfati-Saber [19].The role of consensus algorithms in particle-based flocking is for an agent to achieve velocity matching with respect to its neighbors.In [19],it is demonstrated that flocks are networks of dynamic systems with a dynamic topology.This topology is a proximity graph that depends on the state of all agents and is determined locally for each agent,i.e.,the topology of flocks is a state-dependent graph.The notion of state-dependent graphs was introduced by Mesbahi [64]in a context that is independent of flocking.3)Fast Consensus in Small-Worlds:In recent years,network design problems for achieving faster consensus algorithms has attracted considerable attention from a number of researchers.In Xiao and Boyd [88],design of the weights of a network is considered and solved using semi-definite convex programming.This leads to a slight increase in algebraic connectivity of a network that is a measure of speed of convergence of consensus algorithms.An alternative approach is to keep the weights fixed and design the topology of the network to achieve a relatively high algebraic connectivity.A randomized algorithm for network design is proposed by Olfati-Saber [47]based on random rewiring idea of Watts and Strogatz [89]that led to creation of their celebrated small-world model .The random rewiring of existing links of a network gives rise to considerably faster consensus algorithms.This is due to multiple orders of magnitude increase in algebraic connectivity of the network in comparison to a lattice-type nearest-neighbort graph.4)Rendezvous in Space:Another common form of consensus problems is rendezvous in space [90],[91].This is equivalent to reaching a consensus in position by a num-ber of agents with an interaction topology that is position induced (i.e.,a proximity graph).We refer the reader to [92]and references therein for a detailed discussion.This type of rendezvous is an unconstrained consensus problem that becomes challenging under variations in the network topology.Flocking is somewhat more challenging than rendezvous in space because it requires both interagent and agent-to-obstacle collision avoidance.5)Distributed Sensor Fusion in Sensor Networks:The most recent application of consensus problems is distrib-uted sensor fusion in sensor networks.This is done by posing various distributed averaging problems require to4In honor of the pioneering contributions of Oscar Perron (1907)to the theory of nonnegative matrices,were refer to P as the Perron Matrix of graph G (See Section II-C for details).Olfati-Saber et al.:Consensus and Cooperation in Networked Multi-Agent Systems218Proceedings of the IEEE |Vol.95,No.1,January 2007implement a Kalman filter [38],[39],approximate Kalman filter [74],or linear least-squares estimator [75]as average-consensus problems .Novel low-pass and high-pass consensus filters are also developed that dynamically calculate the average of their inputs in sensor networks [39],[93].6)Distributed Formation Control:Multivehicle systems are an important category of networked systems due to their commercial and military applications.There are two broad approaches to distributed formation control:i)rep-resentation of formations as rigid structures [53],[94]and the use of gradient-based controls obtained from their structural potentials [52]and ii)representation of form-ations using the vectors of relative positions of neighboring vehicles and the use of consensus-based controllers with input bias.We discuss the later approach here.A theoretical framework for design and analysis of distributed controllers for multivehicle formations of type ii)was developed by Fax and Murray [12].Moving in formation is a cooperative task and requires consent and collaboration of every agent in the formation.In [12],graph Laplacians and matrix theory were extensively used which makes one wonder whether relative-position-based formation control is a consensus problem.The answer is yes.To see this,consider a network of self-interested agents whose individual desire is to minimize their local cost U i ðx Þ¼Pj 2N i k x j Àx i Àr ij k 2via a distributed algorithm (x i is the position of vehicle i with dynamics _x i ¼u i and r ij is a desired intervehicle relative-position vector).Instead,if the agents use gradient-descent algorithm on the collective cost P n i ¼1U i ðx Þusing the following protocol:_x i ¼Xj 2N iðx j Àx i Àr ij Þ¼Xj 2N iðx j Àx i Þþb i (6)with input bias b i ¼Pj 2N i r ji [see Fig.1(b)],the objective of every agent will be achieved.This is the same as the consensus algorithm in (1)up to the nonzero bias terms b i .This nonzero bias plays no role in stability analysis of sys-tem (6).Thus,distributed formation control for integrator agents is a consensus problem.The main contribution of the work by Fax and Murray is to extend this scenario to the case where all agents are multiinput multioutput linear systems _x i ¼Ax i þBu i .Stability analysis of relative-position-based formation control for multivehicle systems is extensively covered in Section IV.E.OutlineThe outline of the paper is as follows.Basic concepts and theoretical results in information consensus are presented in Section II.Convergence and performance analysis of consensus on networks with switching topology are given in Section III.A theoretical framework for cooperative control of formations of networked multi-vehicle systems is provided in Section IV.Some simulationresults related to consensus in complex networks including small-worlds are presented in Section V.Finally,some concluding remarks are stated in Section VI.RMATION CONSENSUSConsider a network of decision-making agents with dynamics _x i ¼u i interested in reaching a consensus via local communication with their neighbors on a graph G ¼ðV ;E Þ.By reaching a consensus,we mean asymptot-ically converging to a one-dimensional agreement space characterized by the following equation:x 1¼x 2¼...¼x n :This agreement space can be expressed as x ¼ 1where 1¼ð1;...;1ÞT and 2R is the collective decision of the group of agents.Let A ¼½a ij be the adjacency matrix of graph G .The set of neighbors of a agent i is N i and defined byN i ¼f j 2V :a ij ¼0g ;V ¼f 1;...;n g :Agent i communicates with agent j if j is a neighbor of i (or a ij ¼0).The set of all nodes and their neighbors defines the edge set of the graph as E ¼fði ;j Þ2V ÂV :a ij ¼0g .A dynamic graph G ðt Þ¼ðV ;E ðt ÞÞis a graph in which the set of edges E ðt Þand the adjacency matrix A ðt Þare time-varying.Clearly,the set of neighbors N i ðt Þof every agent in a dynamic graph is a time-varying set as well.Dynamic graphs are useful for describing the network topology of mobile sensor networks and flocks [19].It is shown in [10]that the linear system_x i ðt Þ¼Xj 2N ia ij x j ðt ÞÀx i ðt ÞÀÁ(7)is a distributed consensus algorithm ,i.e.,guarantees con-vergence to a collective decision via local interagent interactions.Assuming that the graph is undirected (a ij ¼a ji for all i ;j ),it follows that the sum of the state of all nodes is an invariant quantity,or P i _xi ¼0.In particular,applying this condition twice at times t ¼0and t ¼1gives the following result¼1n Xix i ð0Þ:In other words,if a consensus is asymptotically reached,then necessarily the collective decision is equal to theOlfati-Saber et al.:Consensus and Cooperation in Networked Multi-Agent SystemsVol.95,No.1,January 2007|Proceedings of the IEEE219average of the initial state of all nodes.A consensus algo-rithm with this specific invariance property is called an average-consensus algorithm [9]and has broad applications in distributed computing on networks (e.g.,sensor fusion in sensor networks).The dynamics of system (7)can be expressed in a compact form as_x ¼ÀLx(8)where L is known as the graph Laplacian of G .The graph Laplacian is defined asL ¼D ÀA(9)where D ¼diag ðd 1;...;d n Þis the degree matrix of G with elements d i ¼Pj ¼i a ij and zero off-diagonal elements.By definition,L has a right eigenvector of 1associated with the zero eigenvalue 5because of the identity L 1¼0.For the case of undirected graphs,graph Laplacian satisfies the following sum-of-squares (SOS)property:x T Lx ¼12Xði ;j Þ2Ea ij ðx j Àx i Þ2:(10)By defining a quadratic disagreement function as’ðx Þ¼12x T Lx(11)it becomes apparent that algorithm (7)is the same as_x ¼Àr ’ðx Þor the gradient-descent algorithm.This algorithm globallyasymptotically converges to the agreement space provided that two conditions hold:1)L is a positive semidefinite matrix;2)the only equilibrium of (7)is 1for some .Both of these conditions hold for a connected graph and follow from the SOS property of graph Laplacian in (10).Therefore,an average-consensus is asymptotically reached for all initial states.This fact is summarized in the following lemma.Lemma 1:Let G be a connected undirected graph.Then,the algorithm in (7)asymptotically solves an average-consensus problem for all initial states.A.Algebraic Connectivity and Spectral Propertiesof GraphsSpectral properties of Laplacian matrix are instrumen-tal in analysis of convergence of the class of linear consensus algorithms in (7).According to Gershgorin theorem [81],all eigenvalues of L in the complex plane are located in a closed disk centered at Áþ0j with a radius of Á¼max i d i ,i.e.,the maximum degree of a graph.For undirected graphs,L is a symmetric matrix with real eigenvalues and,therefore,the set of eigenvalues of L can be ordered sequentially in an ascending order as0¼ 1 2 ÁÁÁ n 2Á:(12)The zero eigenvalue is known as the trivial eigenvalue of L .For a connected graph G , 290(i.e.,the zero eigenvalue is isolated).The second smallest eigenvalue of Laplacian 2is called algebraic connectivity of a graph [20].Algebraic connectivity of the network topology is a measure of performance/speed of consensus algorithms [10].Example 1:Fig.2shows two examples of networks of integrator agents with different topologies.Both graphs are undirected and have 0–1weights.Every node of the graph in Fig.2(a)is connected to its 4nearest neighbors on a ring.The other graph is a proximity graph of points that are distributed uniformly at random in a square.Every node is connected to all of its spatial neighbors within a closed ball of radius r 90.Here are the important degree information and Laplacian eigenvalues of these graphsa Þ 1¼0; 2¼0:48; n ¼6:24;Á¼4b Þ 1¼0; 2¼0:25; n ¼9:37;Á¼8:(13)In both cases, i G 2Áfor all i .B.Convergence Analysis for Directed Networks The convergence analysis of the consensus algorithm in (7)is equivalent to proving that the agreement space characterized by x ¼ 1; 2R is an asymptotically stable equilibrium of system (7).The stability properties of system (7)is completely determined by the location of the Laplacian eigenvalues of the network.The eigenvalues of the adjacency matrix are irrelevant to the stability analysis of system (7),unless the network is k -regular (all of its nodes have the same degree k ).The following lemma combines a well-known rank property of graph Laplacians with Gershgorin theorem to provide spectral characterization of Laplacian of a fixed directed network G .Before stating the lemma,we need to define the notion of strong connectivity of graphs.A graph5These properties were discussed earlier in the introduction for graphs with 0–1weights.Olfati-Saber et al.:Consensus and Cooperation in Networked Multi-Agent Systems220Proceedings of the IEEE |Vol.95,No.1,January 2007。

Eigenvalues of a real supersymmetric tensor

Abstract In this paper, we deﬁne the symmetric hyperdeterminant, eigenvalues and E-eigenvalues of a real supersymmetric tensor. We show that eigenvalues are roots of a one-dimensional polynomial, and when the order of the tensor is even, E-eigenvalues are roots of another one-dimensional polynomial. These two one-dimensional polynomials are associated with the symmetric hyperdeterminant. We call them the characteristic polynomial and the E-characteristic polynomial of that supersymmetric tensor. Real eigenvalues (E-eigenvalues) with real eigenvectors (E-eigenvectors) are called H-eigenvalues (Z-eigenvalues). When the order of the supersymmetric tensor is even, H-eigenvalues (Z-eigenvalues) exist and the supersymmetric tensor is positive deﬁnite if and only if all of its H-eigenvalues (Z-eigenvalues) are positive. An m th-order n -dimensional supersymmetric tensor where m is even has exactly n (m − 1)n −1 eigenvalues, and the number of its E-eigenvalues is strictly less than n (m − 1)n −1 when m ≥ 4. We show that the product of all the eigenvalues is equal to the value of the symmetric hyperdeterminant, while the sum of all the eigenvalues is equal to the sum of the diagonal elements of that supersymmetric tensor, multiplied by (m − 1)n −1 . The n (m − 1)n −1 eigenvalues are distributed in n disks in C. The centers and radii of these n disks are the diagonal elements, and the sums of the absolute values of the corresponding off-diagonal elements, of that supersymmetric tensor. On the other hand, E-eigenvalues are invariant under orthogonal transformations. © 2005 Elsevier Ltd. All rights reserved.

Breaking RSA Generically is Equivalent to Factoring

2
A function f (κ) is considered a non-negligible function in κ if there exists c > 0 and 1 k0 ∈ N such that for all κ > k0 , |f (κ)| > κ c.
Breaking RSA Generically is Equivalent to Factoring
Divesh Aggarwal and Ueli Maurer
Department of Computer Science ETH Zurich CH-8092 Zurich, Switzerland {divesha,maurer}@inf.ethz.ch
We give a brief description of the model of [17]. The model is characterized by a black-box B which can store values from a certain setБайду номын сангаасT in internal state variables V0 , V1 , V2 , · · · . The initial state (the input of the problem to be solved) consists of the values of [V0 , . . ., V ] for some positive integer , which are set according to some probability distribution (e.g. the uniform distribution). The black box B allows two types of operations: – Computation operations. For a set Π of operations of some arities on T , a computation operation consists of selecting f ∈ Π (say t-ary) as well as the indices i1 , . . . , it+1 of t+1 state variables. B computes f (Vi1 , . . . , Vit ) and stores the result in Vit+1 . – Relation Queries. For a set Σ of relations (of some arities) on T , a query consists of selecting a relation ρ ∈ Σ (say t-ary) as well as the indices i1 , . . . , it of t state variables. The query is replied by the binary output ρ(Vi1 , . . . , Vit ) that takes the value 1 if the relation is satisﬁed, and 0 otherwise. For this paper, we only consider the case t = 2 and the only relation queries we consider are equality queries.

A Lower Bound for Quantum Phase Estimation

a r X i v :q u a n t -p h /0412008v 2 26 J a n 2005A Lower Bound for Quantum Phase EstimationArvid J.Bessen ∗Columbia UniversityDepartment of Computer Science(Dated:February 1,2008)We obtain a query lower bound for quantum algorithms solving the phase estimation problem.Our analysis generalizes existing lower bound approaches to the case where the oracle Q is given by controlled powers Q p of Q ,as it is for example in Shor’s order ﬁnding algorithm.In this setting we will prove a Ω(log 1/ǫ)lower bound for the number of applications of Q p 1,Q p 2,....This bound is tight due to a matching upper bound.We obtain the lower bound using a new technique based on frequency analysis.PACS numbers:03.67.LxKeywords:Quantum computing,complexity theory,lower boundsI.INTRODUCTIONWe study lower bounds for the phase estimation prob-lem.In this problem we are given a unitary transfor-mation Q as a black-box and we know that |q is an eigenvector of Q ,i.e.Q |q =e 2πiϕ|q ,ϕ∈[0,1).(1)We want to determine the phase ϕup to precision ǫ.The quantum phase estimation algorithm approxi-mates ϕgiven |q and is the main building block in Shor’s factoring algorithm,the counting algorithm,and the eigenvalue estimation algorithm [1–7].The main element of this algorithm are controlled pow-ers of Q ,which we deﬁne as follows.Let Q be a t qubit unitary transformation and |ψ an arbitrary t qubit state.For l =1,...,c ,and p ∈N we deﬁne the c +t qubit trans-formationW p l (Q )|x 1...x c |ψ =|x 1...x c |ψ x l =0|x 1...x c Q p |ψ x l =1.(2)We call W p l (Q )a (controlled)power query .If the trans-formation Q is clear from the context,we will just write W p l =W p l (Q ).This notation allows us to write the phase estimation algorithm in a compact form,see ﬁgure 1.The algorithm returns an approximation ϕof the phase ϕof |q .|0...F −12TW 2T...|qFIG.1:The quantum phase estimation algorithm in powerquery notation.H is the Hadamard gate,W p l a power queryas in equation (2),and F −12T is the inverse quantum Fourier transform on T qubits.ǫ)power queries.We prove Theorem 1in section IV.The query cost of the phase estimation algorithm may be given by counting each application of W p l (Q )as p applications of Q ,i.e.the query cost of this algorithm is1+2+4+...+2T −1=2T −1.For certain problems,like order-ﬁnding,it is possible to exploit some knowledge about Q .Here Q |y =|xy mod N for a certain ﬁxed x ,and thereforeQ 2j y = x 2j y mod N = x2j −1 2y mod N is easy to compute by repeated squaring and modular multiplication,see e.g.[2].In this case we can use powerqueries W 2kl with essentially the same cost as an applica-tion of Q .Thus we can execute the phase estimation al-gorithm with query cost T and have exponential speedup for the query cost:from 2T −1to T .Let us stress thatthis speedup only applies if the cost for computing Q 2jis similar to that for computing Q .II.PRIOR WORKQuantum query complexity has been important in quantum computing since Grover’s search algorithm,2which is provably superior to classical algorithms [8,9]in the number of queries.The ﬁrst lower bound result was given in [10],which used an adversary argument.Our lower bound approach is based on the ideas of the “polynomial approach”of Beals et.al.,[11–14].Other approaches include the quantum adversary argument and its generalizations [15–17].These approaches only cover problems concerning Boo-lean functions,so we have to extend them to numerical problems.This has been done through extensions of the polynomial method [18,19]and has been applied widely to integration [20–22],path integration [23],approxima-tion [24–26],and eigenvalue estimation [6,7].In this paper we apply the approach of [19]to the phase estimation problem.Instead of using a maximum degree argument,which is not applicable in the case of arbitrary powers,we will develop a new lower bound technique based on frequency analysis.III.QUANTUM ALGORITHMS WITH POWER QUERIES FOR PHASE ESTIMATIONWe would like to derive lower bounds for any quantumalgorithm with power queries that estimates the phase of a matrix Q for a given eigenvector |q .In other words the set of allowed inputs for our problem isQ |q ,t =Q :Q is a unitary t qubit transform ,|q is an eigenvector of Q.We now give a framework that is general enough toallow us to analyze any algorithm with power queries W p l (Q )that solves this problem.The most general algo-rithm will be of the following form:ψ(T )(Q )=U T W p T l T(Q )U T −1...W p 1l 1(Q )U 0 ψ(0) .(4)Here the U 0,U 1,...,U T are arbitrary but ﬁxed c +t qubitunitary transformations and ψ(0) a ﬁxed c +t qubit state,for example |0 |q .In our analysis we neglect the cost to implement the U j or to prepare ψ(0) .The varying parts of algorithm (4)are the W pj l j =W pj l j (Q ):power queries of Q for p j ∈N ,l j =1,...,c ,and c ∈N arbitrary.A measurement of the ﬁnal state ψ(T )(Q ) in the standard basis yields a state |k ,k =0,...,2c +t −1,with probability p k,Q ,from which we get a solution ϕ(k )∈[0,1).If for all Q ∈Q |q ,t the probability to get an ǫ-estimate to the correct phase ϕof Qk : ϕ− ϕ(k ) <ǫp k,Q ≥34in T power queries.We are interested in the smallest number T such that a quantum power query algorithm of form (4)fulﬁlls con-dition (5)for all Q ∈Q |q ,t .IV.GENERAL CONTROLLED ARBITRARYPOWER QUERIESWe consider arbitrary powers p 1,...,p T .This requires us to introduce a new proof technique.To illustrate the idea consider the phase estimation algorithm,performed as in ﬁgure 1with c =T =3control qubits.(F −123⊗I )W 41W 22W 13(H ⊗3⊗I ) 0 |q .Let us trace through each of the steps in this algorithm(we neglect normalization factors).1.(|0 +|1 +|2 +|3 +...+|7 )|q2.(|0 +e 2πiϕ|1 +|2 +e 2πiϕ|3 +...+e 2πiϕ|7 )|q3.(|0 +e 2πiϕ|1 +e 2πi 2ϕ|2 +...+e 2πi 3ϕ|7 )|q The possible multiplicities j of ϕin the coeﬃcients e 2πijϕareJ 2={0,p 1,p 2,p 1+p 2}={0,1,2,3}.4.(|0 +e 2πiϕ|1 +e 2πi 2ϕ|2 +...+e 2πi 7ϕ|7 )|q The possible multiplicities after this step are J 3={j,j +p 3:j ∈J 2}={0,1,...,7}.The ﬁnal step,the inverse Fourier transform,does not depend on ϕ.It also does not change the possible mul-tiplicities of ϕ,but just creates linear combinations of them.Consider,e.g.,the coeﬃcient of the state |2 :7 j =0e−2πi 2j/8e 2πijϕ|2 |q =7 j =0e 2πij (ϕ−1/4)|2 |q ,which gives the probability p 2(ϕ)of measuring |2 :p 2(ϕ)= 7 j =0e 2πij (ϕ−1/4) 2=7 j,l =0e 2πi (j −l )(ϕ−1/4),which is plotted in ﬁgure 2.Figure 2shows that the probability that |2 is measured is high if ϕis close to 0.25,which is the value represented by |2 .The ﬁgure indicates that the width of this probability peak depends on the frequencies present in the proba-bility function:higher frequencies allow sharper peaks.The goal of this paper is to prove that every halving of the width of the probability peak requires one additional step of the algorithm.The proof consists of three steps.The ﬁrst is to quantify the inﬂuence of each additional application of W p l on the frequencies present in the probability func-tion (Theorem 2).Now consider a probability function as in ﬁgure 2.It must have a high peak ≥3/4with small width ǫand should be close to zero everywhere else.We will show that such a function requires a large range of frequencies to be present (lemma 3).Finally we3ΦProb.to FIG.2:(Color online)The probability of measuring the state|2 depending on ϕfor the algorithm depicted in ﬁgure 1with T =2and T =3.will show an upper bound on the number of frequencies that a power query algorithm can generate with T power queries,which proves Theorem 1.In Theorem 2consider a general Q ,which acts on t qubits and therefore has 2t eigenvectors |ψs with eigen-values e 2πiϕs .We assume that the eigenvectors |ψs are ﬁxed,but that the eigenvalues change.We will prove that after T steps only coeﬃcients likeαe 2πi (j 1ϕ1+...+j 2t ϕ2t )will occur.Here (j 1,...,j 2t )is from the set J T deﬁned by the recursionJ T +1:=(j 1,...,j 2t ),(j 1+p T +1,...,j 2t ),...,(j 1,...,j 2t +p T +1):(j 1,...,j 2t )∈J T(6)and J 0={(0,...,0)}.Theorem 2.Let Q be a unitary operation with eigen-vectors |ψs and corresponding eigenvalues e 2πiϕs ,s =1,...,2t .Let the |ψs be ﬁxed and vary the phases ϕs ∈[0,1).Any quantum algorithm with power queries W p l =W p l (Q ),ﬁxed unitary transformations U j and starting state ψ(0) ,can be written as U T W p T l T...U 1W p 1l 1U 0ψ(0) =kS (T )k (ϕ1,...,ϕ2t )|k(7)for all ϕs ∈[0,1),where the S (T )k (ϕ1,...,ϕ2t )aretrigonometric polynomials of the following form:S (T )k (ϕ1,...,ϕ2t )=(j 1,...,j 2t )∈J Tα(T )k,(j 1,...,j 2t )e 2πi (j 1ϕ1+...+j 2t ϕ2t ),(8)with α(T )k,(j 1,...,j2t )∈C and J T deﬁned by (6).To shorten our proofs we will use the following short-hand notations.We say that a trigonometric polynomial S (T )k (ϕ1,...,ϕ2t )is of powers J T ,indicated by the (T )superscript,if it can be written as a sum over J T as in (8).Furthermore we abbreviate the vectors j =(j 1,...,j 2t ), ϕ=(ϕ1,...,ϕ2t ),and deﬁne the following notation:j · ϕ=j 1ϕ1+...+j 2t ϕ2t .Proof of Theorem 2:To simplify the proof we choose a diﬀerent basis instead of the standard basis.We divide the quantum state into a control part |m and an eigen-vector part |ψs and show that we can writeU T W p T l T ...W p 1l 1U 0 ψ(0) =2c−1 m =02ts =1S (T )m,s ( ϕ)|m,ψs ,(9)for trigonometric polynomials of powers J T ,S (T )m,s ( ϕ)=j ∈J Tα(T )m,s, je 2πij ·ϕ.(10)The proof is by induction on the number of queries T .For T =0we have no dependence on ϕand S (0)m,s ( ϕ)isjust a constant since Q was not applied:U 0 ψ(0)=2c−1 m =02t s =1m,ψs U 0 ψ(0) |m,ψs =:2c−1m =02t s =1α(0)m,s,(0,...,0)|m,ψs =2c −1m =02t s =1j ∈J 0α(0)m,s, je 2πij ·ϕ|m,ψs .Let now T be arbitrary and let equations (9)and (10)hold.If we apply W p T +1l T +1to (9),only those states |m,ψs are aﬀected for which the l T +1-th control bit is set,i.e.m l T +1=1.For these states we getW pT +1l T +1|m,ψs =|m Q p T +1|ψs =|m e 2πip T +1ϕs |ψs and therefore the coeﬃcient of |m,ψs changes toS (T )m,s ( ϕ)e2πip T +1ϕs =j ∈J Tα(T )m,s, je 2πij · ϕe 2πip T +1ϕs=j ∈J Tα(T )m,s, je 2πi (j 1ϕ1+...+(j s +p T +1)ϕs +...j 2t ϕ2t ).But this is a trigonometric polynomial of powers J T +1(recall eqn.(6)),and we can write it asS (T +1)m,s ( ϕ)=j ∈J T +1α(T +1)m,s, je 2πij ·ϕ4if we deﬁne the coeﬃcients α(T +1)m,s, jproperly: α(T +1)m,s,(j 1,...,j s ,...,j2t )= α(T )m,s,(j 1,...,j s −p T +1,...,j2t )for p T +1≤j s and 0otherwise.Finally we deﬁne S(T +1)m,s ( ϕ):= S (T )m,s ( ϕ)for the states for which the control bit is not set (m l t +1=0)and we can writeW p T +1l T +1U T ...W p 1l 1U 0 ψ(0) =2c−1 m =02ts =1S (T +1)m,s ( ϕ)|m,ψs .Now we use that the transformation U T +1and theeigenvectors |ψs are ﬁxed for all algorithms we considerand deﬁne u (T +1)m,s ;n,t = m,ψs |U T +1|n,ψt .Then U T +1n,tS (T +1)n,t ( ϕ)|n,ψt=m,s,n,tS (T +1)n,t( ϕ)u (T +1)m,s ;n,t |m,ψs ,(11)which gives the following coeﬃcient for |m,ψs :n,t j ∈J T +1α(T +1)n,t, j e 2πij ·ϕu (T +1)m,s ;n,t= j ∈J T +1n,tu (T +1)m,s ;n,t α(T +1)n,t, je 2πij ·ϕ=:j ∈J T +1α(T +1)m,s, je 2πi j · ϕ=: S (T +1)m,s ( ϕ).(12)This completes the induction and establishes equations(9)and (10).Using the same argumentation as in equations (11)and (12)we can ﬁnally rewrite the state in equation (9)in the standard basis |k ∈{|0 ,|1 ,...,|2c +t −1 }throughα(T +1)k,j=m,sk |m,ψs α(T +1)m,s,jand S (T )k ( ϕ)is of the same powers as S(T )m,s ( ϕ),which isof powers J T .This proves equation (7)and (8).We now focus on the speciﬁc problem of phase esti-mation.The next lemma provides us with a necessary condition on the powers p 1,p 2,...such that a quantum algorithm with power queries can solve the phase estima-tion problem with precision ǫ.Lemma 3.Any quantum algorithm estimating the phase ϕof an eigenvector |q of matrices Q from the class Q |q ,t =Q :Q is a unitary t qubit transform ,|q is an eigenvector of Q.up to precision ǫhas to use power queries W p 1j 1,W p 2j 2,...,W p Tj T such that the setM T = l −l ′|l,l ′= k ∈Kp k K ⊆{1,...,T }(13)has more than |M T |≥1Nfor some N ∈N .(14)We will analyze the behavior of all possible algorithmsfor the phase estimation problem on a special subset of Q |q ,t .Fix some arbitrary vectors |ψ2 ,...,|ψ2t such that |q ,|ψ2 ,...,|ψ2t form an orthonormal basis and consider the following input:Q r :=e 2πi 2rǫ|q q |+2ts =2e 2πiϕs |ψs ψs |.(15)The phase ϕwe are interested in is ϕ=2rǫfor input Q r .Since the diﬀerence between the phases of the matrices Q r is 2ǫand we require ǫcorrectness,a measurement will yield states |k from the distinct sets B r :B r = k : 2rǫ− k <ǫ .(16)Depending on the number of qubits we use in our quan-tum algorithm,the sets B r can contain one or morestates.By Theorem 2we know that we can write the coeﬃ-cient of a state |k ∈B r after T queries as S (T )k (ϕ,ϕ2,...,ϕ2t )=(j 1,...,j 2t )∈J T α(T )k,(j 1,...,j 2t )e 2πi (j 1ϕ+j 2ϕ2+...+j 2t ϕ2t ).In this proof we are only interested in the behavior for the Q r .Therefore we can drop the dependence on ϕ2,...,ϕ2t and letS (T )k (ϕ):=S (T )k (ϕ,ϕ2,...,ϕ2t )=l ∈L Tβ(T )k,l e 2πilϕ,where L 0={0}andL T =j 1:(j 1,...,j 2t )∈J T=j 1,j 1+p T :(j 1,...,j 2t )∈J T −1 = k ∈Kp k K ⊆{1,...,T }5and the coeﬃcients (note that l is ﬁxed on the right side)β(T )k,l =(l,j 2,...,j 2t )∈J Tα(T )k,(l,j 2,...,j2t )e 2πi (j 2ϕ2+...+j 2t ϕ2t )The probability p B r (ϕ)of measuring a state from the set B r deﬁned in (16)of all ǫapproximations to ϕ=2rǫis now given by:p B r (ϕ):= k ∈B rS (T )k (ϕ)2=k ∈B r l ∈L Tl ′∈L Tβ(T )k,lβ(T )k,l β(T )k,l ′.For illustration recall ﬁgure 2,which shows exactly one of these probability functions p B r (ϕ)and also their highly oscillatory behavior.In the case of the phase estimation algorithm,B r ={|r }and the ﬁgure shows p B 2(ϕ).We apply the Discrete Inverse Fourier Transform to p B r (ϕ)(evaluated at the points ϕ=n/N for n =0,...,N −1)and get for the k -th coeﬃcientN −1 n =0p B r (n/N )e −2πikn/N=N −1n =0m ∈M Tη(T )r,me 2πi (m −k )n/N=Nm ∈M Tm ≡k mod Nη(T )r,m ,(18)since for m ≡k mod NN −1 n =0e2πi (m −k )n/N=1−e 2πi (m −k )N/NN )e −2πiknN )e −2πikrN)e −2πikn4−N −1 n =0n =rp B r (n4(recall the deﬁnitions of p B r and Q r ).If we knew that for the second term in (20)N −1 n =0n =rp B r (n/N )<3N=Nm ∈M Tm ≡k mod Nη(T )r,m>0.We will show that this property,while not necessarily always true,will be true for at least most of the p B r (ϕ).There are N diﬀerent possible outcome sets B 0,...,B N −1.We know that for any ϕ=n/N all mutually exclusive probabilities of measuring a state from B r for r =0,...,N −1have to add up to at most 1:N −1 r =0p B r (n/N )≤1for n =0,...,N −1,Let R <⊆{0,...,N −1}be the set of all r for which(21)holds and R ≥the set for which it does not.|R <|has to be greater than 1since we can splitN =N −1 n =01≥N −1 n =0N −1 r =0p B r (nN )+ r ∈R <N−1 n =0n =rp B r (nN)≥34R ≥ =34R ≥and therefore R ≥≤13N .Thus there is an r ∈R <for which eqn.(21)holds and0< N m ∈M Tm ≡k mod Nη(T )r,m≤N m ∈M Tm ≡k mod Nη(T )r,m (22)for all k =0,...,N −1.This means at least N of the η(T )r,m have to be nonzero and thus p B r (ϕ)from eqn.(17)must have at least N nonzero terms.In other words|M T |≥N =1ǫ).Proof of Theorem 1:From lemma 3we know that the set M T has to have |M T |≥12ǫ,and the number of queries T must grow likeT ≥12ǫ=Ω(log1[18]S.Heinrich,Journal of Complexity18,1(2002),quant-ph/0105116.[19]A.J.Bessen,Journal of Complexity20,699(2004),quant-ph/0308140.[20]E.Novak,Journal of Complexity17,2(2001),quant-ph/0008124.[21]S.Heinrich,Journal of Complexity19,19(2003),quant-ph/0112153.[22]S.Heinrich,M.Kwas,and H.Wozniakowski,in MonteCarlo and Quasi-Monte Carlo Methods2002,edited byH.Niederreiter(Springer Verlag,2003),pp.243–258,quant-ph/0311036.[23]J.F.Traub and H.Wozniakowski,Quantum InformationProcessing1,365(2002),quant-ph/0109113.[24]E.Novak,I.H.Sloan,and H.Wo´z niakowski,Found.Comput.Math.4,121(2004),quant-ph/0206023. [25]S.Heinrich,Journal of Complexity20,5(2004),quant-ph/0305030.[26]S.Heinrich,Journal of Complexity20,27(2004),quant-ph/0305031.。