Sparsity and Incoherence in Compressive Sampling
On Recovery of Sparse Signals Via ℓ1 Minimization

3388IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 55, NO. 7, JULY 2009On Recovery of Sparse Signals Via `1 MinimizationT. Tony Cai, Guangwu Xu, and Jun Zhang, Senior Member, IEEEAbstract—This paper considers constrained `1 minimization methods in a unified framework for the recovery of high-dimensional sparse signals in three settings: noiseless, bounded error, and Gaussian noise. Both `1 minimization with an ` constraint (Dantzig selector) and `1 minimization under an `2 constraint are considered. The results of this paper improve the existing results in the literature by weakening the conditions and tightening the error bounds. The improvement on the conditions shows that signals with larger support can be recovered accurately. In particular, our results illustrate the relationship between `1 minimization with an `2 constraint and `1 minimization with an ` constraint. This paper also establishes connections between restricted isometry property and the mutual incoherence property. Some results of Candes, Romberg, and Tao (2006), Candes and Tao (2007), and Donoho, Elad, and Temlyakov (2006) are extended.11linear equations with more variables than the number of equations. It is clear that the problem is ill-posed and there are generally infinite many solutions. However, in many applications the vector is known to be sparse or nearly sparse in the sense that it contains only a small number of nonzero entries. This sparsity assumption fundamentally changes the problem. Although there are infinitely many general solutions, under regularity conditions there is a unique sparse solution. Indeed, in many cases the unique sparse solution can be found exactly through minimization subject to (I.2)Index Terms—Dantzig selector`1 minimization, restricted isometry property, sparse recovery, sparsity.I. INTRODUCTION HE problem of recovering a high-dimensional sparse signal based on a small number of measurements, possibly corrupted by noise, has attracted much recent attention. This problem arises in many different settings, including compressed sensing, constructive approximation, model selection in linear regression, and inverse problems. Suppose we have observations of the formTThis minimization problem has been studied, for example, in Fuchs [13], Candes and Tao [5], and Donoho [8]. Understanding the noiseless case is not only of significant interest in its own right, it also provides deep insight into the problem of reconstructing sparse signals in the noisy case. See, for example, Candes and Tao [5], [6] and Donoho [8], [9]. minWhen noise is present, there are two well-known imization methods. One is minimization under an constraint on the residuals -Constraint subject to (I.3)(I.1) with is given and where the matrix is a vector of measurement errors. The goal is to reconstruct the unknown vector . Depending on settings, the error vector can either be zero (in the noiseless case), bounded, or . It is now well understood that Gaussian where minimization provides an effective way for reconstructing a sparse signal in all three settings. See, for example, Fuchs [13], Candes and Tao [5], [6], Candes, Romberg, and Tao [4], Tropp [18], and Donoho, Elad, and Temlyakov [10]. A special case of particular interest is when no noise is present . This is then an underdetermined system of in (I.1) andManuscript received May 01, 2008; revised November 13, 2008. Current version published June 24, 2009. The work of T. 
Cai was supported in part by the National Science Foundation (NSF) under Grant DMS-0604954, the work of G. Xu was supported in part by the National 973 Project of China (No. 2007CB807903). T. T. Cai is with the Department of Statistics, The Wharton School, University of Pennsylvania, Philadelphia, PA 19104 USA (e-mail: tcai@wharton.upenn. edu). G. Xu and J. Zhang are with the Department of Electrical Engineering and Computer Science, University of Wisconsin-Milwaukee, Milwaukee, WI 53211 USA (e-mail: gxu4uwm@; junzhang@uwm. edu). Communicated by J. Romberg, Associate Editor for Signal Processing. Digital Object Identifier 10.1109/TIT.2009.2021377Writing in terms of the Lagrangian function of ( -Constraint), this is closely related to finding the solution to the regularized least squares (I.4) The latter is often called the Lasso in the statistics literature (Tibshirani [16]). Tropp [18] gave a detailed treatment of the regularized least squares problem. Another method, called the Dantzig selector, was recently proposed by Candes and Tao [6]. The Dantzig selector solves the sparse recovery problem through -minimization with a constraint on the correlation between the residuals and the column vectors of subject to (I.5)Candes and Tao [6] showed that the Dantzig selector can be computed by solving a linear program subject to and where the optimization variables are . Candes and Tao [6] also showed that the Dantzig selector mimics the perfor. mance of an oracle procedure up to a logarithmic factor0018-9448/$25.00 © 2009 IEEEAuthorized licensed use limited to: University of Pennsylvania. Downloaded on June 17, 2009 at 09:48 from IEEE Xplore. Restrictions apply.CAI et al.: ON RECOVERY OF SPARSE SIGNALS VIAMINIMIZATION3389It is clear that some regularity conditions are needed in order for these problems to be well behaved. Over the last few years, many interesting results for recovering sparse signals have been obtained in the framework of the Restricted Isometry Property (RIP). In their seminal work [5], [6], Candes and Tao considered sparse recovery problems in the RIP framework. They provided beautiful solutions to the problem under some conditions on the so-called restricted isometry constant and restricted orthogonality constant (defined in Section II). These conditions essentially require that every set of columns of with certain cardinality approximately behaves like an orthonormal system. Several different conditions have been imposed in various settings. For example, the condition was used in Candes and in Candes, Romberg, and Tao [5], Tao [4], in Candes and Tao [6], and in Candes [3], where is the sparsity index. A natural question is: Can these conditions be weakened in a unified way? Another widely used condition for sparse recovery is the Mutual Incoherence Property (MIP) which requires the to be pairwise correlations among the column vectors of small. See [10], [11], [13], [14], [18]. minimization methods in a In this paper, we consider single unified framework for sparse recovery in three cases, minnoiseless, bounded error, and Gaussian noise. Both constraint (DS) and minimization imization with an constraint ( -Constraint) are considered. Our under the results improve on the existing results in [3]–[6] by weakening the conditions and tightening the error bounds. In particular, miniour results clearly illustrate the relationship between mization with an constraint and minimization with an constraint (the Dantzig selector). 
In addition, we also establish connections between the concepts of RIP and MIP. As an application, we present an improvement to a recent result of Donoho, Elad, and Temlyakov [10]. In all cases, we solve the problems under the weaker condition (I.6) The improvement on the condition shows that for fixed and , signals with larger support can be recovered. Although our main interest is on recovering sparse signals, we state the results in the general setting of reconstructing an arbitrary signal. It is sometimes convenient to impose conditions that involve only the restricted isometry constant . Efforts have been made in this direction in the literature. In [7], the recovery result was . In [3], the weaker established under the condition was used. Similar conditions have condition also been used in the construction of (random) compressed and sensing matrices. For example, conditions were used in [15] and [1], respectively. We shall remark that, our results implies that the weaker conditionsparse recovery problem. We begin the analysis of minimization methods for sparse recovery by considering the exact recovery in the noiseless case in Section III. Our result improves the main result in Candes and Tao [5] by using weaker conditions and providing tighter error bounds. The analysis of the noiseless case provides insight to the case when the observations are contaminated by noise. We then consider the case of bounded error in Section IV. The connections between the RIP and MIP are also explored. Sparse recovery with Gaussian noise is treated in Section V. Appendices A–D contain the proofs of technical results. II. PRELIMINARIES In this section, we first introduce basic notation and definitions, and then present a technical inequality which will be used in proving our main results. . Let be a vector. The Let support of is the subset of defined byFor an integer , a vector is said to be -sparse if . For a given vector we shall denote by the vector with all but the -largest entries (in absolute value) set to zero and define , the vector with the -largest entries (in absolute value) set to zero. We shall use to denote the -norm of the vector . the standard notation Let the matrix and , the -restricted is defined to be the smallest constant isometry constant such that (II.1) for every -sparse vector . If , we can define another quantity, the -restricted orthogonality constant , as the smallest number that satisfies (II.2) for all and such that and are -sparse and -sparse, respectively, and have disjoint supports. Roughly speaking, the and restricted orthogonality constant isometry constant measure how close subsets of cardinality of columns of are to an orthonormal system. and For notational simplicity we shall write for for hereafter. It is easy to see that and are monotone. That is if if Candes and Tao [5] showed that the constants related by the following inequalities and (II.3) (II.4) are (II.5)suffices in sparse signal reconstruction. The paper is organized as follows. In Section II, after basic notation and definitions are reviewed, we introduce an elementary inequality, which allow us to make finer analysis of theAs mentioned in the Introduction, different conditions on and have been used in the literature. It is not always immediately transparent which condition is stronger and which is weaker. We shall present another important property on andAuthorized licensed use limited to: University of Pennsylvania. Downloaded on June 17, 2009 at 09:48 from IEEE Xplore. 
Restrictions apply.3390IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 55, NO. 7, JULY 2009which can be used to compare the conditions. In addition, it is especially useful in producing simplified recovery conditions. Proposition 2.1: If , then (II.6)Theorem 3.1 (Candes and Tao [5]): Let satisfies. Suppose (III.1)Let be a -sparse vector and minimizer to the problem. Then .is the uniqueIn particular,.A proof of the proposition is provided in Appendix A. Remark: Candes and Tao [6] imposes in a more recent paper Candes [3] uses consequence of Proposition 2.1 is that a strictly stronger condition than sition 2.1 yields implies that and . A direct is in fact since Propowhich means .As mentioned in the Introduction, other conditions on and have also been used in the literature. Candes, Romberg, and . Candes and Tao Tao [4] uses the condition [6] considers the Gaussian noise case. A special case with noise of Theorem 1.1 in that paper improves Theorem 3.1 level to by weakening the condition from . Candes [3] imposes the condition . We shall show below that these conditions can be uniformly improved by a transparent argument. A direct application of Proposition 2.2 yields the following result which weakens the above conditions toWe now introduce a useful elementary inequality. This inequality allows us to perform finer estimation on , norms. It will be used in proving our main results. Proposition 2.2: Let be a positive integer. Then any descending chain of real numbersNote that it follows from (II.3) and (II.4) that , and . So the condition is weaker . It is also easy to see from (II.5) and than is also weaker than (II.6) that the condition and the other conditions mentioned above. Theorem 3.2: Let . Suppose satisfiessatisfiesand obeys The proof of Proposition 2.2 is given in Appendix B. where III. SIGNAL RECOVERY IN THE NOISELESS CASE As mentioned in the Introduction, we shall give a unified minimization with an contreatment for the methods of straint and minimization with an constraint for recovery of sparse signals in three cases: noiseless, bounded error, and Gaussian noise. We begin in this section by considering the simplest setting: exact recovery of sparse signals when no noise is present. This is an interesting problem by itself and has been considered in a number of papers. See, for example, Fuchs [13], Donoho [8], and Candes and Tao [5]. More importantly, the solutions to this “clean” problem shed light on the noisy case. Our result improves the main result given in Candes and Tao [5]. The improvement is obtained by using the technical inequalities we developed in previous section. Although the focus is on recovering sparse signals, our results are stated in the general setting of reconstructing an arbitrary signal. with and suppose we are given and Let where for some unknown vector . The goal is to recover exactly when it is sparse. Candes and Tao [5] showed that a sparse solution can be obtained by minimization which is then solved via linear programming.. Then the minimizerto the problem., i.e., the In particular, if is a -sparse vector, then minimization recovers exactly. This theorem improves the results in [5], [6]. The improvement on the condition shows that for fixed and , signals with larger support can be recovered accurately. Remark: It is sometimes more convenient to use conditions only involving the restricted isometry constant . Note that the condition (III.2) implies . This is due to the factby Proposition 2.1. Hence, Theorem 3.2 holds under the condican also be used. 
tion (III.2). The condition Proof of Theorem 3.2: The proof relies on Proposition 2.2 and makes use of the ideas from [4]–[6]. In this proof, we shall also identify a vector as a function by assigning .Authorized licensed use limited to: University of Pennsylvania. Downloaded on June 17, 2009 at 09:48 from IEEE Xplore. Restrictions apply.CAI et al.: ON RECOVERY OF SPARSE SIGNALS VIAMINIMIZATION3391Let Letbe a solution to the and letminimization problem (Exact). be the support of . WriteClaim:such that. Let In fact, from Proposition 2.2 and the fact that , we have(III.4)For a subset characteristic function of, we use , i.e., if if .to denote theFor each , let . Then is decomposed to . Note that ’s are pairwise disjoint, , and for . We first consider the case where is divisible by . For each , we divide into two halves in the following manner: with where is the first half of , i.e., andIt then follows from Proposition 2.2 thatProposition 2.2 also yieldsand We shall treat four equal parts. as a sum of four functions and divide withintofor any and. ThereforeWe then define that Note thatfor .by. It is clear(III.3)In fact, since, we haveSince, this yieldsIn the rest of our proof we write . Note that . So we get the equation at the top of the following page. This yieldsThe following claim follows from our Proposition 2.2.Authorized licensed use limited to: University of Pennsylvania. Downloaded on June 17, 2009 at 09:48 from IEEE Xplore. Restrictions apply.3392IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 55, NO. 7, JULY 2009It then follows from (III.4) thatIV. RECOVERY OF SPARSE SIGNALS IN BOUNDED ERROR We now turn to the case of bounded error. The results obtained in this setting have direct implication for the case of Gaussian noise which will be discussed in Section V. and let Letwhere the noise is bounded, i.e., for some bounded set . In this case the noise can either be stochastic or deterministic. The minimization approach is to estimate by the minimizer of subject to We now turn to the case that is not divisible by . Let . Note that in this case and are understood as and , respectively. So the proof for the previous case works if we set and for and We specifically consider two cases: and . The first case is closely connected to the Dantzig selector in the Gaussian noise setting which will be discussed in more detail in Section V. Our results improve the results in Candes, Romberg, and Tao [4], Candes and Tao [6], and Donoho, Elad, and Temlyakov [10]. We shall first consider where satisfies shallLet be the solution to the (DS) problem given in (I.1). The Dantzig selector has the following property. Theorem 4.1: Suppose satisfying . If and with (IV.1) then the solution In this case, we need to use the following inequality whose proof is essentially the same as Proposition 2.2: For any descending chain of real numbers , we have to (DS) obeys (IV.2) with In particular, if . and is a -sparse vector, then .andRemark: Theorem 4.1 is comparable to Theorem 1.1 of Candes and Tao [6], but the result here is a deterministic one instead of a probabilistic one as bounded errors are considered. The proof in Candes and Tao [6] can be adapted to yield aAuthorized licensed use limited to: University of Pennsylvania. Downloaded on June 17, 2009 at 09:48 from IEEE Xplore. Restrictions apply.CAI et al.: ON RECOVERY OF SPARSE SIGNALS VIAMINIMIZATION3393similar result for bounded errors under the stronger condition . Proof of Theorem 4.1: We shall use the same notation as in the proof of Theorem 3.2. 
Since , letting and following essentially the same steps as in the first part of the proof of Theorem 3.2, we getwhere the noise satisfies . Once again, this problem can be solved through constrained minimization subject to (IV.3)An alternative to the constrained minimization approach is the so-called Lasso given in (I.4). The Lasso recovers a sparse regularized least squares. It is closely connected signal via minimization. The Lasso is a popular to the -constrained method in statistics literature (Tibshirani [16]). See Tropp [18] regularized least squares for a detailed treatment of the problem. By using a similar argument, we have the following result on the solution of the minimization (IV.3). Theorem 4.2: Let vector and with . Suppose . If is a -sparseIf that, then and for every , and we have. The latter forces . Otherwise(IV.4) then for any To finish the proof, we observe the following. 1) . be the submatrix obIn fact, let tained by extracting the columns of according to the in, as in [6]. Then dices in , the minimizer to the problem (IV.3) obeys (IV.5) with .imProof of Theorem 4.2: Notice that the condition , so we can use the first part of the proof plies that of Theorem 3.2. The notation used here is the same as that in the proof of Theorem 3.2. First, we haveand2) In factNote thatSoWe get the result by combining 1) and 2). This completes the proof. We now turn to the second case where the noise is bounded with . The problem is to in -norm. Let from recover the sparse signalRemark: Candes, Romberg, and Tao [4] showed that, if , then(The was set to be This impliesin [4].) Now suppose which yields. ,Authorized licensed use limited to: University of Pennsylvania. Downloaded on June 17, 2009 at 09:48 from IEEE Xplore. Restrictions apply.3394IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 55, NO. 7, JULY 2009since and Theorem 4.2 that, with. It then follows fromProof of Theorem 4.3: It follows from Proposition 4.1 thatfor all -sparse vector where . Therefore, Theorem 4.2 improves the above result in Candes, Romberg, and Tao [4] by enlarging the support of by 60%. Remark: Similar to Theorems 3.2 and 4.1, we can have the estimation without assuming that is -sparse. In the general case, we haveSince Theorem 4.2, the conditionholds. ByA. Connections Between RIP and MIP In addition to the restricted isometry property (RIP), another commonly used condition in the sparse recovery literature is the mutual incoherence property (MIP). The mutual incoherence property of requires that the coherence bound (IV.6) be small, where are the columns of ( ’s are also assumed to be of length in -norm). Many interesting results on sparse recovery have been obtained by imposing conand the sparsity , see [10], ditions on the coherence bound [11], [13], [14], [18]. For example, a recent paper, Donoho, is a -sparse Elad, and Temlyakov [10] proved that if with , then for any , the vector and minimizer to the problem ( -Constraint) satisfiesRemarks: In this theorem, the result of Donoho, Elad, and Temlyakov [10] is improved in the following ways. to 1) The sparsity is relaxed from . So roughly speaking, Theorem 4.3 improves the result in Donoho, Elad, and Temlyakov [10] by enlarging the support of by 47%. is usually very 2) It is clear that larger is preferred. Since small, the bound is tightened from to , as is close to .V. RECOVERY OF SPARSE SIGNALS IN GAUSSIAN NOISE We now turn to the case where the noise is Gaussian. Suppose we observe (V.1) and wish to recover from and . 
We assume that is known and that the columns of are standardized to have unit norm. This is a case of significant interest, in particular in statistics. Many methods, including the Lasso (Tibshirani [16]), LARS (Efron, Hastie, Johnstone, and Tibshirani [12]) and Dantzig selector (Candes and Tao [6]), have been introduced and studied. The following results show that, with large probability, the Gaussian noise belongs to bounded sets. Lemma 5.1: The Gaussian error satisfies (V.2) and (V.3) Inequality (V.2) follows from standard probability calculations and inequality (V.3) is proved in Appendix D. Lemma 5.1 suggests that one can apply the results obtained in the previous section for the bounded error case to solve the Gaussian noise problem. Candes and Tao [6] introduced the Dantzig selector for sparse recovery in the Gaussian noise setting. Given the observations in (V.1), the Dantzig selector is the minimizer of subject to where . (V.4)with, provided.We shall now establish some connections between the RIP and MIP and show that the result of Donoho, Elad, and Temlyakov [10] can be improved under the RIP framework, by using Theorem 4.2. The following is a simple result that gives RIP constants from MIP. The proof can be found in Appendix C. It is remarked that the first inequality in the next proposition can be found in [17]. Proposition 4.1: Let be the coherence bound for and Now we are able to show the following result. Theorem 4.3: Suppose with satisfying (or, equivalently, the minimizer is a -sparse vector and . Let . If ), then for any . Then (IV.7),to the problem ( -Constraint) obeys (IV.8)with.Authorized licensed use limited to: University of Pennsylvania. Downloaded on June 17, 2009 at 09:48 from IEEE Xplore. Restrictions apply.CAI et al.: ON RECOVERY OF SPARSE SIGNALS VIAMINIMIZATION3395In the classical linear regression problem when , the least squares estimator is the solution to the normal equation (V.5) in the convex program The constraint (DS) can thus be viewed as a relaxation of the normal (V.3). And similar to the noiseless case, minimization leads to the “sparsest” solution over the space of all feasible solutions. Candes and Tao [6] showed the following result. Theorem 5.1 (Candes and Tao [6]): Suppose -sparse vector. Let be such that Choose the Dantzig selector is asince . The improvement on the error bound is minor. The improvement on the condition is more significant as it shows signals with larger support can be recovered accurately for fixed and . Remark: Similar to the results obtained in the previous sections, if is not necessarily -sparse, in general we have, with probabilitywhere probabilityand, and within (I.1). Then with large probability, obeys (V.6) where and .with.1As mentioned earlier, the Lasso is another commonly used method in statistics. The Lasso solves the regularized least squares problem (I.4) and is closely related to the -constrained minimization problem ( -Constraint). In the Gaussian error be the mincase, we shall consider a particular setting. Let imizer of subject to (V.7)Remark: Candes and Tao [6] also proved an Oracle Inequality in the Gaussian noise setting under for the Dantzig selector . With some additional work, our the condition approach can be used to improve [6, Theorems 1.2 and 1.3] by . weakening the condition to APPENDIX A PROOF OF PROPOSITION 2.1 Let be -sparse and be supports are disjoint. Decompose such that is -sparse. Suppose their aswhere . 
Combining our results from the last section together with Lemma 5.1, we have the following results on the Dantzig seand the estimator obtained from minimizalector tion under the constraint. Again, these results improve the previous results in the literature by weakening the conditions and providing more precise bounds. Theorem 5.2: Suppose matrix satisfies Then with probability obeys (V.8) with obeys (V.9) with . , and with probability at least , is a -sparse vector and the-sparse for and for . Using the Cauchy–Schwartz inequality, we have, the Dantzig selectorThis yields we also have. Since . APPENDIX B PROOF OF PROPOSITION 2.2,Remark: In comparison to Theorem 5.1, our result in Theto orem 5.2 weakens the condition from and improves the constant in the bound from to . Note thatCandes and Tao [6], the constant C was stated as C appears that there was a typo and the constant C should be C .1InLet)= . It = 4=(1 0 0Authorized licensed use limited to: University of Pennsylvania. Downloaded on June 17, 2009 at 09:48 from IEEE Xplore. Restrictions apply.3396IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 55, NO. 7, JULY 2009where eachis given (and bounded) byThereforeand the inequality is proved. Without loss of generality, we assume that is even. Write APPENDIX C PROOF OF PROPOSITION 4.1 Let be a -sparse vector. Without loss of generality, we as. A direct calculation shows sume that thatwhereandNow let us bound the second term. Note thatNowThese give usand henceFor the second inequality, we notice that follows from Proposition 2.1 that. It thenand APPENDIX D PROOF OF LEMMA 5.1 The first inequality is standard. For completeness, we give . Then each a short proof here. Let . Hence marginally has Gaussian distributionAuthorized licensed use limited to: University of Pennsylvania. Downloaded on June 17, 2009 at 09:48 from IEEE Xplore. Restrictions apply.CAI et al.: ON RECOVERY OF SPARSE SIGNALS VIAMINIMIZATION3397where the last step follows from the Gaussian tail probability bound that for a standard Gaussian variable and any constant , . is We now prove inequality (V.3). Note that a random variable. It follows from Lemma 4 in Cai [2] that for anyHencewhere. It now follows from the fact that[6] E. J. Candes and T. Tao, “The Dantzig selector: Statistical estimation when p is much larger than n (with discussion),” Ann. Statist., vol. 35, pp. 2313–2351, 2007. [7] A. Cohen, W. Dahmen, and R. Devore, “Compressed Sensing and Best k -Term Approximation” 2006, preprint. [8] D. L. Donoho, “For most large underdetermined systems of linear equations the minimal ` -norm solution is also the sparsest solution,” Commun. Pure Appl. Math., vol. 59, pp. 797–829, 2006. [9] D. L. Donoho, “For most large underdetermined systems of equations, the minimal ` -norm near-solution approximates the sparsest near-solution,” Commun. Pure Appl. Math., vol. 59, pp. 907–934, 2006. [10] D. L. Donoho, M. Elad, and V. N. Temlyakov, “Stable recovery of sparse overcomplete representations in the presence of noise,” IEEE Trans. Inf. Theory, vol. 52, no. 1, pp. 6–18, Jan. 2006. [11] D. L. Donoho and X. Huo, “Uncertainty principles and ideal atomic decomposition,” IEEE Trans. Inf. Theory, vol. 47, no. 7, pp. 2845–2862, Nov. 2001. [12] B. Efron, T. Hastie, I. Johnstone, and R. Tibshirani, “Least angle regression (with discussion,” Ann. Statist., vol. 32, pp. 407–451, 2004. [13] J.-J. Fuchs, “On sparse representations in arbitrary redundant bases,” IEEE Trans. Inf. Theory, vol. 50, no. 6, pp. 1341–1344, Jun. 2004. [14] J.-J. 
Fuchs, “Recovery of exact sparse representations in the presence of bounded noise,” IEEE Trans. Inf. Theory, vol. 51, no. 10, pp. 3601–3608, Oct. 2005. [15] M. Rudelson and R. Vershynin, “Sparse reconstruction by convex relaxation: Fourier and Gaussian measurements,” in Proc. 40th Annu. Conf. Information Sciences and Systems, Princeton, NJ, Mar. 2006, pp. 207–212. [16] R. Tibshirani, “Regression shrinkage and selection via the lasso,” J. Roy. Statist. Soc. Ser. B, vol. 58, pp. 267–288, 1996. [17] J. Tropp, “Greed is good: Algorithmic results for sparse approximation,” IEEE Trans. Inf. Theory, vol. 50, no. 10, pp. 2231–2242, Oct. 2004. [18] J. Tropp, “Just relax: Convex programming methods for identifying sparse signals in noise,” IEEE Trans. Inf. Theory, vol. 52, no. 3, pp. 1030–1051, Mar. 2006. T. Tony Cai received the Ph.D. degree from Cornell University, Ithaca, NY, in 1996. He is currently the Dorothy Silberberg Professor of Statistics at the Wharton School of the University of Pennsylvania, Philadelphia. His research interests include high-dimensional inference, large-scale multiple testing, nonparametric function estimation, functional data analysis, and statistical decision theory. Prof. Cai is the recipient of the 2008 COPSS Presidents’ Award and a fellow of the Institute of Mathematical Statistics.Inequality (V.3) now follows by verifying directly that for all . ACKNOWLEDGMENT The authors wish to thank the referees for thorough and useful comments which have helped to improve the presentation of the paper. REFERENCES[1] W. Bajwa, J. Haupt, J. Raz, S. Wright, and R. Nowak, “Toeplitz-structured compressed sensing matrices,” in Proc. IEEE SSP Workshop, Madison, WI, Aug. 2007, pp. 294–298. [2] T. Cai, “On block thresholding in wavelet regression: Adaptivity, block size and threshold level,” Statist. Sinica, vol. 12, pp. 1241–1273, 2002. [3] E. J. Candes, “The restricted isometry property and its implications for compressed sensing,” Compte Rendus de l’ Academie des Sciences Paris, ser. I, vol. 346, pp. 589–592, 2008. [4] E. J. Candes, J. Romberg, and T. Tao, “Stable signal recovery from incomplete and inaccurate measurements,” Commun. Pure Appl. Math., vol. 59, pp. 1207–1223, 2006. [5] E. J. Candes and T. Tao, “Decoding by linear programming,” IEEE Trans. Inf. Theory, vol. 51, no. 12, pp. 4203–4215, Dec. 2005.Guangwu Xu received the Ph.D. degree in mathematics from the State University of New York (SUNY), Buffalo. He is now with the Department of Electrical engineering and Computer Science, University of Wisconsin-Milwaukee. His research interests include cryptography and information security, computational number theory, algorithms, and functional analysis.Jun Zhang (S’85–M’88–SM’01) received the B.S. degree in electrical and computer engineering from Harbin Shipbuilding Engineering Institute, Harbin, China, in 1982 and was admitted to the graduate program of the Radio Electronic Department of Tsinghua University. After a brief stay at Tsinghua, he came to the U.S. for graduate study on a scholarship from the Li Foundation, Glen Cover, New York. He received the M.S. and Ph.D. degrees, both in electrical engineering, from Rensselaer Polytechnic Institute, Troy, NY, in 1985 and 1988, respectively. He joined the faculty of the Department of Electrical Engineering and Computer Science, University of Wisconsin-Milwaukee, and currently is a Professor. His research interests include image processing and computer vision, signal processing and digital communications. Prof. 
Zhang has been an Associate Editor of the IEEE TRANSACTIONS ON IMAGE PROCESSING.
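As a concrete companion to the paper's remark that the Dantzig selector can be computed by solving a linear program, the following is a minimal sketch of that LP formulation: minimize ||b||_1 subject to ||X^T(y - X b)||_inf <= lam. The variable splitting, the choice of scipy.optimize.linprog as solver, the notation (X, y, lam), and the toy problem sizes are illustrative assumptions, not part of the paper.

```python
import numpy as np
from scipy.optimize import linprog

def dantzig_selector(X, y, lam):
    """Solve min ||b||_1  s.t.  ||X.T @ (y - X @ b)||_inf <= lam  as an LP.

    Variables are z = [b, u] with -u <= b <= u, so ||b||_1 = sum(u)."""
    n, p = X.shape
    G = X.T @ X
    c = np.concatenate([np.zeros(p), np.ones(p)])        # minimize sum(u)
    I = np.eye(p)
    Z = np.zeros((p, p))
    A_ub = np.block([
        [ I, -I],        #  b - u <= 0
        [-I, -I],        # -b - u <= 0
        [ G,  Z],        #  X^T X b <= X^T y + lam
        [-G,  Z],        # -X^T X b <= lam - X^T y
    ])
    b_ub = np.concatenate([np.zeros(2 * p), X.T @ y + lam, lam - X.T @ y])
    bounds = [(None, None)] * p + [(0, None)] * p         # b free, u >= 0
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    return res.x[:p]

# small sanity check on a sparse recovery instance (sizes are arbitrary)
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 100)) / np.sqrt(50)
beta = np.zeros(100); beta[[3, 20, 77]] = [2.0, -1.5, 1.0]
y = X @ beta + 0.01 * rng.standard_normal(50)
beta_hat = dantzig_selector(X, y, lam=0.05)
print(np.flatnonzero(np.abs(beta_hat) > 0.1))
```

Dedicated ℓ1 solvers scale better than a generic LP solver, but the LP form makes the ℓ∞ constraint on the correlation between the residual and the columns of X explicit.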
A Compressed Sensing Method Based on a Hybrid Strategy

• The proposed algorithm
The E-step in the EM algorithm
• Compressed sensing in MRI
• BM3D-based spatially adaptive filter
• Multiscale L0-continuation method
• Proposed algorithm
Simulation results
[Figure 5 image residue: two PSNR (dB) vs. iteration curves, with legend entries bf, BM3D, and bfBM3D; see the caption below.]
Figure 5: PSNR vs. iteration number for the multiscale L0-continuation reconstruction, the spatially adaptive filtering reconstruction, and our proposed method. Left: PSNR for Experiment 1; right: PSNR for Experiment 2.
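For reference, the quality measure plotted in Figure 5 is the standard peak signal-to-noise ratio; a minimal sketch follows, where the 8-bit peak value of 255 is an assumption about the image format.

```python
import numpy as np

def psnr(reference, reconstruction, peak=255.0):
    """Peak signal-to-noise ratio in dB between a reference image and its
    reconstruction; `peak` is the maximum possible pixel value."""
    mse = np.mean((reference.astype(float) - reconstruction.astype(float)) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)
```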
Simulation results 1: Compressive sensing of an MRI image
Simulation results 2: Compressive sensing of a natural image
Compressive sensing of a natural image
Recovering Dense Disparity Maps from Sparse Disparity Measurements

Dense Disparity Maps from Sparse Disparity Measurements

Simon Hawe, Martin Kleinsteuber, and Klaus Diepold
Department of Electrical Engineering and Information Technology, Technische Universität München, Arcisstraße 21, 80290 München, Germany
{simon.hawe, kleinsteuber, kldi}@tum.de

Abstract

In this work we propose a method for estimating disparity maps from very few measurements. Based on the theory of Compressive Sensing, our algorithm accurately reconstructs disparity maps using only about 5% of the entire map. We propose a conjugate subgradient method for the arising optimization problem that is applicable to large-scale systems and recovers the disparity map efficiently. Experiments are provided that show the effectiveness of the proposed approach and its robust behavior under noisy conditions.

1. Introduction

One of the oldest and also most important research topics in computer vision is the stereo correspondence problem. Given two or more images of the same scene taken from different viewpoints, the correspondence problem aims at determining which image points in the different images are projections of the same physical point in space. Essentially, correspondences are found by comparing image intensities, feature points, or certain shapes extracted from the images at hand. A common simplification for this search is to rectify the stereo images, as after this transformation corresponding points lie on the same horizontal scanline and the search space is therefore reduced by one dimension. When correspondences for all points of an image have been established, a dense disparity map of the observed scene can be created. The term disparity denotes the difference in coordinates of corresponding points within two images and is an inverse measurement of distance. Disparity maps are essential for various applications such as 3D television, image-based rendering, and robotic navigation. In the following we briefly review the state-of-the-art concepts used for disparity estimation.

The most basic tool needed for finding corresponding points is a matching cost function that measures image similarity. Most widely used are matching cost functions that compare image intensities by their absolute or squared differences, but there exist several other techniques [16].
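To make the intensity-difference matching cost concrete, here is a minimal local block-matching sketch along the lines just described. Rectified grayscale float images, the SSD cost, the window radius, and the disparity range are illustrative assumptions of this sketch, not choices made in the paper.

```python
import numpy as np

def ssd_block_matching(left, right, max_disp=64, r=3):
    """Local winner-take-all stereo: for each pixel, pick the disparity whose SSD
    cost, aggregated over a (2r+1)x(2r+1) support window, is smallest.
    Assumes rectified, single-channel float images of equal size."""
    h, w = left.shape
    k = 2 * r + 1
    cost = np.empty((max_disp + 1, h, w))
    for d in range(max_disp + 1):
        diff = np.full((h, w), 1e6)                        # penalize missing overlap
        diff[:, d:] = (left[:, d:] - right[:, :w - d]) ** 2  # per-pixel squared diff
        # aggregate the per-pixel cost over the support window via an integral image
        padded = np.pad(diff, r, mode="edge")
        S = np.zeros((h + k, w + k))
        S[1:, 1:] = padded.cumsum(0).cumsum(1)
        cost[d] = S[k:, k:] - S[:-k, k:] - S[k:, :-k] + S[:-k, :-k]
    return cost.argmin(axis=0)                             # disparity per pixel
```

This is exactly the kind of local, winner-take-all scheme whose ambiguity and boundary-fattening problems are discussed next.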
Unfortunately,only considering a simple matching cost is highly ambiguous due to occlusions,or either missing or repetitive textures.To overcome these ambiguities,a vari-ety of dense stereo methods have been proposed throughout the last40years,see[25]for an excellent overview and per-formance comparison.Generally,these approaches can be divided into two groups:local methods that assign a dispar-ity to each position individually,and global methods that minimize an energy function either along a single scanline, or over the entire disparity map.Local methods aggregate the matching cost of all pix-els lying in a support window surrounding a point q,and assign the disparity to q that results in the lowest aggre-gated cost.These methods assume constant disparity over the support,which does not hold at disparity discontinuities and leads to boundary fattening artifacts.Explicit occlusion detection[11],multiple windowing[14],or adaptive local support weighting[29]reduce this effect,but cannot avoid it completely.A further problem appears in regions showing no texture or repetitive textures,due to many indistinguishable possi-ble matches.Global methods tackle these ambiguities by assuming that disparities vary smoothly in general.This is enforced by minimizing an energy function consisting of the matching cost and a discontinuity penalty.In that way,global support is provided for local regions lacking textures.Scanline methods minimize an energy function along each horizontal scanline separately,using dynamic programming[23,25].In regions fulfilling the smoothness assumption,this approach results in accurate disparity es-timation,but leads to horizontal streaking artifacts around discontinuities.This problem is bypassed by methods that enforce the smoothness constraint in both vertical and hori-zontal direction.The most prominent stereo algorithms for minimizing the2D cost function are based on belief propa-gation[13,17,26]and graph cuts[2,18].Besides the dense stereo methods reviewed above,al-gorithms that explicitly produce reliable sparse disparity maps have been proposed in the literature[24].However,to the best of our knowledge,no technique has been pro-posed so far that aims at reconstructing a dense disparity map from these sparse measurements,except simple inter-polation which is prone to errors and creates bulky disparity maps,especially at object boarders.In this paper,we present a method that accurately recon-structs dense disparity maps by only using a small set of reliable support points.It is based upon the theory of Com-pressive Sensing and incorporates prior knowledge about both the general structure of disparity maps and the specific image at hand into the optimization scheme.We propose a conjugate subgradient method that abandons the use of a smoothing parameter and thus leads to unbiased solutions. 
Our experiments show,that only aboutfive percent of the entire disparities are necessary for our method to produce very accurate results.Hence,the proposed algorithm is suit-able for a combined approach in creating and compressing disparity maps.The paper is organized as follows.In the next section, we shortly recall the basics of Compressive Sensing and in-troduce the required mathematical notation.In Section3, we explain the shortcomings of random sampling and how to overcome them.A conjugate subgradient method is pre-sented in Section4that efficiently solves the arising non-smooth optimization problem,and in Section5we provide numerical simulations that show our algorithm’s capability.pressive SensingA common ground to many interesting signals is that they have a sparse representation or are compressible in some transform domain,which means that many transform coefficient are zero or close to zero.For example,the stan-dard representation of an image of a natural scene by its in-tensities is dense,whereas the wavelet basis admits a sparse representation.The important information is contained in only a few dominant coefficients.To reconstruct the en-tire image without severe loss of quality,it is sufficient to know these large wavelet coefficients together with their re-spective positions.If we could directly sample the impor-tant coefficients in the sparse domain,we could bypass the inefficient process offirst sampling densely,then calculate the sparse representation to afterwards compress.The con-cept of Compressive Sensing(CS)[4,8]offers a joint sam-pling and compression mechanism,which exploits a sig-nal’s sparsity to perfectly reconstruct it from a small number of acquired measurements.Let s∈R n be a column vector that represents a dis-crete n-dimensional none-sparse real-valued signal.We de-note its k-sparse representation by x∈R n,where k-sparse means that only k<n entries of x are nonzero.We write the corresponding linear transformation ass=Ψx,(1)whereΨ∈R n×n is an orthonormal basis of R n,called representation basis.Furthermore,letΦ∈R m×n be the sampling basis that transforms s into the vector y∈R m that contains m<n linear measurementsy=Φs=ΦΨx.(2)We aim to reconstruct s given only the measurements y by computing x from equation(2)and exploiting the fact that x is rmally speaking,we are seeking the sparsest vector x that is compatible with the acquired mea-surements.Formally,this leads to the following minimiza-tion problemminimizex∈R nx 0subject to y=ΦΨx,(3)where x 0is theℓ0-pseudo norm of x,i.e.the number of nonzero entries.Unfortunately,solving(3)is computationally intractable as it is a combinatorial NP-hard problem[20].Instead,it has been shown in[9]that under some generic assumptions on the matrixΦΨit is equivalent to replace theℓ0-pseudo norm by theℓ1-norm x 1:= i|x(i)|,which leads to the so called Basis Pursuit[7]minimizex∈R nx 1subject to y=ΦΨx.(4)This is a convex optimization problem that can be recast into a linear program,which is solved in polynomial time. 
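As a minimal sketch of how the Basis Pursuit problem (4) can be recast as the linear program mentioned above, one can use the standard split x = x⁺ − x⁻ with x⁺, x⁻ ≥ 0; the use of scipy's LP solver and the toy dimensions are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import linprog

def basis_pursuit(A, y):
    """Solve min ||x||_1 s.t. A x = y by splitting x = xp - xm (xp, xm >= 0),
    which turns problem (4) into the LP  min 1^T(xp + xm)  s.t. [A, -A][xp; xm] = y."""
    m, n = A.shape
    c = np.ones(2 * n)
    A_eq = np.hstack([A, -A])
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=[(0, None)] * (2 * n), method="highs")
    xp, xm = res.x[:n], res.x[n:]
    return xp - xm

# toy example: recover a 5-sparse vector from 40 random measurements
rng = np.random.default_rng(1)
n, m, k = 128, 40, 5
Psi = np.eye(n)                                   # signal already sparse in this basis
Phi = rng.standard_normal((m, n)) / np.sqrt(m)    # random sampling matrix
x_true = np.zeros(n)
x_true[rng.choice(n, k, replace=False)] = rng.standard_normal(k)
y = Phi @ Psi @ x_true
x_hat = basis_pursuit(Phi @ Psi, y)
print(np.max(np.abs(x_hat - x_true)))             # close to zero when recovery succeeds
```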
The theory of CS says that if the number of measurements m is high enough compared to the sparsity factor k,the so-lution to equation(4)is exact[5],and consequently the sig-nal is perfectly reconstructed by solving equation(1),using the computed x.When the signal is sampled in the same basis in which it is sparse,a lot of samples are required for reconstruction (since most of the samples would be zero).Hence,it is intuitively clear that sampling and representation basis have to be as disjoint as possible.This is measured by the mutual coherence betweenΦandΨµ(Φ,Ψ)=√where C is some positive constant.Interestingly,randomly drawn sampling matrices provide a low coherence with high probability,[10].A slightly modified coherence measure is introduced in[12],which is used tofind optimized sam-pling matrices.We emphasize,that this concept is not ap-plicable to our problem at hand,since our sampling space is restricted to the pixel basis.3.Sampling and Representing Disparity MapsLet D∈R h×w be a disparity map having n=hw en-tries and assume that m≪n disparities are known.In con-trast to methods that rely on particular camera set-ups,these disparities can be computed by any method.The accuracy of our recovery algorithm only depends on the sampling po-sitions.In general,disparity maps mainly consist of large ho-mogenous regions of equal disparity with only a few dis-continuities at the transitions between those regions.Re-garding the wavelet transform,large homogenous regions are represented by only a small number of wavelet coeffi-cients,while the important coefficients cluster around dis-continuities.For this,we can assume the wavelet transform of disparity maps to be sparse,and use the wavelet domain as the representation domain.Let s∈R n be the vectorized unknown disparity map D, and let y∈R m denote the disparity measurements.More-over,letΨdenote a Daubechies Wavelet basis,then s=Ψx with x∈R n is the sparse vector of wavelet coefficients. To each measurement y i there corresponds a standard basis vector e i∈R n such thaty i=e⊤i s=e⊤iΨx,(7)where(·)⊤denotes transpose.Generally,we denote by v(i) the i-th entry of the vector v.Let p∈N m be a vector con-taining the indices of the measured disparities,the sampling basis for our problem reads asΦ=[e p(1),...,e p(m)]⊤.(8)Unfortunately,the mutual coherence between the canonical basis and the wavelet basis is high.According to equation (6),this requires a high number of random measurements. However,by selecting particular sampling positions,we can achieve accurate reconstruction results using only a small number of measurements,even though the bases do not ful-fill the low coherence requirement.Motivated by the property of the wavelet transform that the relevant coefficients coincide with discontinuities,the particular sampling positions are precisely those disparities lying at the discontinuities.As we do not know the po-sitions of the discontinuities before knowing the disparity map,we use the assumption that disparity discontinues co-incide with image intensity edges.Therefore,we applytheFigure1:Sampling pattern covering5%of the image. Cannyfilter[6]to the reference image,and take the posi-tions of the detected edges as the sampling positions.Note, that also other edge detectors are thinkable.Furthermore, to control the minimum sampling density,we divide the im-age into non-overlapping tiles and select one sampling po-sition inside each tile where no edge has been previously detected,see Figure1for an exemplarily sampling pattern. 
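A minimal sketch of the sampling-position selection just described: pixels on Canny edges of the reference image, plus one fill-in position inside every non-overlapping tile that contains no detected edge. The use of skimage.feature.canny, the tile size, the sigma value, and the random fill-in choice are assumptions of this sketch.

```python
import numpy as np
from skimage import feature

def sampling_positions(reference, tile=16, sigma=2.0):
    """Boolean mask of sampling positions: Canny edge pixels of the reference
    image plus one position per edge-free (tile x tile) block, which enforces a
    minimum sampling density."""
    edges = feature.canny(reference, sigma=sigma)
    mask = edges.copy()
    h, w = reference.shape
    rng = np.random.default_rng(0)
    for i in range(0, h, tile):
        for j in range(0, w, tile):
            block = edges[i:i + tile, j:j + tile]
            if not block.any():                      # no edge inside this tile
                bi = rng.integers(block.shape[0])
                bj = rng.integers(block.shape[1])
                mask[i + bi, j + bj] = True          # add one fill-in sample
    return mask
```

The indices np.flatnonzero(mask) then play the role of the index vector p that defines the sampling basis Φ in (8).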
For these selected positions,we calculate the respective dis-parities,which are the input to the reconstruction algorithm described in the next section.4.Reconstructing Dense Disparity MapsVarious generic solvers based on second-order methods [15],and algorithms based onfirst-order methods[1,21] to tackle problem(4)have been proposed in the litera-ture.Second-order methods are very accurate but require too much computational resources to solve the problem at hand.Due to its large scale,we suggest afirst-order method, which roughly follows the ideas in[19],but uses a new sub-gradient method.In order to enhance legibility,we stick to the matrix vector notation.However,regarding the imple-mentation,we want to mention that it is not possible to sim-ply use these matrix vector multiplications,due to the large size of the involved matrices.Fortunately,all matrix op-erations we are performing can be efficiently implemented without explicitly creating the matrices by using imagefil-tering.A Matlab-code of the algorithm is available at the author’s webpage1.4.1.PrerequisitesThe local variation of the disparity map D at entry(i,j) is measured by∇D(i,j):=[D(i,j)−D(i,j+1),D(i,j)−D(i+1,j)].(9) From this,we define the total variation(TV)norm of D asD TV=h−1i=1w−1 j=1 ∇D(i,j) 2.(10)The TV-norm is commonly used by various image recon-struction algorithms as a discontinuity preserving smooth-ness prior,and we also employ it here.It is easily seen that by means of two suitable matrices G x,G y∈R n×n we haves TV:= D TV=n j=12 A x−y 22+λ( W x 1+γ Ψx TV).(13) The Lagrange multiplierλweighs between the contributionof the objective and the constraint.4.2.Conjugate SubgradientMinimizing the objective of(13)by afirst-order method requires to compute its gradient.A common procedure to overcome its non-smoothness is to approximate the abso-lute value with the smooth differentiable function|x|≈√|(W x)(i)|if(W x)(i)=0[−1,1]otherwise.(14)An implementation of a subgradient for the prior Ψx TV is computationally unfeasible due to the compu-tation of the subgradient with smallest Euclidean norm. Therefore,we use the Huber functionalhν(x)= |x|−ν2νotherwise,(15) for a smooth approximation of the TV-norms TV≈ s ν,TV:=nj=1hν2 A x−y 22+λ( W x 1+γ Ψx ν,TV)(19) is the set∂f(x)=A⊤(A x−y)+λ(∂ W x 1+γ∇ Ψx ν,TV).(20) Let us denoteb=λ−1A⊤(A x−y)+γ∇ Ψx ν,TV.(21) It is easily verified that thefinal subgradient with smallest Euclidean norm is given byg(x)=A⊤(A x−y)+λ(∇ W x 1+γ∇ Ψx ν,TV),(22)Tsukuba Teddy179337500874374701Table1:Number of iterations until convergence for differ-ent update formulas.where∇ W x 1(i):= (W x)(i)h⊤i(g i+1−g i)h i,(25) where g i:=g(x i)and initial value h0=−g0.Equations(24)and(25)are iterated either until conver-gence is achieved,or a maximum number of iterations has been reached.As a stopping criterion we choose the norm of g ually,conjugate gradient methods use a reset af-ter n iterations,i.e.h i:=−g i if i mod n=0.Since our algorithm converges in relatively few steps compared to the dimension n,we do not stress the issue of resetting. 
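The following is only a generic conjugate-subgradient sketch in the spirit of the update rules (24)-(25) described in this section: the search direction is recombined with a Hestenes-Stiefel-type factor β, the initial direction is h0 = −g0, and iteration stops when the norm of the subgradient is small. The backtracking line search and the exact form of β are assumptions of this sketch, not the paper's precise rules.

```python
import numpy as np

def conjugate_subgradient(f_and_g, x0, max_iter=200, tol=1e-6):
    """Minimize a (possibly nonsmooth) objective given f_and_g(x), which returns
    the objective value and a subgradient g(x). Directions are combined with a
    Hestenes-Stiefel-type factor; the step size uses simple backtracking."""
    x = np.asarray(x0, dtype=float).copy()
    fx, g = f_and_g(x)
    h = -g                                            # initial direction h0 = -g0
    for _ in range(max_iter):
        if np.linalg.norm(g) < tol:                   # stopping criterion on ||g||
            break
        t = 1.0
        # Armijo-style backtracking along the current direction h
        while f_and_g(x + t * h)[0] > fx + 1e-4 * t * np.dot(g, h) and t > 1e-12:
            t *= 0.5
        x = x + t * h
        fx, g_new = f_and_g(x)
        dg = g_new - g
        beta = np.dot(g_new, dg) / (np.dot(h, dg) + 1e-12)   # Hestenes-Stiefel-type
        h = -g_new + beta * h
        g = g_new
    return x
```

Plugging in the smoothed objective (19), with the gradient of the data term, a subgradient of the ℓ1 term, and the gradient of the Huber-smoothed TV term, yields the kind of iteration described above.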
The output of this algorithm are the reconstructed wavelet coefficients x from which wefinally compute the dense dis-parity map using equation(1).5.ResultsIn this section we evaluate the accuracy of our recon-struction algorithm on the standard Middlebury dataset [25],which provides stereo images together with the respec-tive ground truth disparity maps.As usual,the disparity is given in units between0and255.We measure the recon-struction quality in two ways:•the amount of defective pixels,i.e.the percentage of pixels whose reconstructed value differs by more than one from its true value•the mean absolute disparity error over the entire dis-parity map.All experiments were performed with the same parame-ter setting.We used Daubechies four tap wavelets(db2-wavelets)due to their low computational cost and their qual-itatively high results.In contrast,the usage of Haar wavelets led to pronged edges in the reconstructed disparity map.We chose the wavelet decomposition level one,since higher de-composition levels did not improve the reconstruction qual-ity.The scaling factors from equation(19)were experimen-tally determined toλ=0.01,γ=10,andν=0.01.We compare reconstruction by interpolation via Delau-nay triangulation(RDT)with:1.our Compressive Sensing based reconstruction algo-rithm using Canny Edge(CSC)detector as explained in Section32.our Compressive Sensing based reconstruction algo-rithm using random samplings(CSR).For RDT we use the same sampling positions as for CSC. The two quality criteria are plotted against the number of measurements m in Figure2.The results were gathered from reconstructing the standard images Tsukuba,Venus, Teddy,and Cones[25].The plots in each row correspond to the reconstruction of the respective disparity map shown in thefirst column.To show the effectiveness of our method,wefirst discuss the reconstruction accuracy using ideal disparity measure-ments extracted from the ground truth disparity maps,cf. 
column(b)and(d)of Figure2.It can be seen that all three algorithms perform reasonably well if the number of mea-surements is high,and that CSC always performs best.The advantage of CSC over the other methods becomes signifi-cantly high when few measurements are used.There is another very important advantage of the CSC method.While RDT leads to fringy reconstruction results in the neighborhood of discontinuities,CSC does not yield this highly unwanted effect and produces very accurate disparity maps,in particular in the neighborhood of edges,cf.Figure 3.To evaluate the influence of corrupted measurements on the reconstruction results,we use the same sampling po-sitions as for the ideal case and add uniformly distributednoise drawn from the interval[−15,15]to randomly cho-sen25%of the data.Columns(c)and(e)of Figure2il-lustrate the performance of the tested methods.While CSC and CSR are very robust to corrupted measurements,the performance of RDT dramatically drops if corrupted mea-surements are present.From column(c)it can be noticed that the reconstruction error slightly increases once a cer-tain number of measurements has been reached.The reason for this is that the total number of corrupted measurements increases,which consequently leads to a higher percentage of badly reconstructed disparities.Finally,we evaluate our method using real estimated dis-parities.To compute the disparities,we used the adaptive support window approach combined with a left right con-sistency check[29].The last column of Figure3shows the reconstructed disparity maps with computed disparities. Our results are compared with the dense approach from[29] and RDT using the Middlebury criteria,cf.Table2.The results of our algorithm are comparable to the dense ref-erence method,while RDT is significantly worse.Recall, that we did not implement a specific sparse algorithm.By improving the quality of the disparity measurements for ex-ample by incorporating accurate feature matchers like SIFT, the accuracy of the measurements will increase and conse-quently,the reconstruction quality will also improve.6.Conclusion and OutlookIn this paper we present an algorithm for dense dispar-ity map reconstruction using sparse reliable disparity mea-surements.Based on the theory of Compressive Sensing, we perform the reconstruction by exploiting the sparsity of disparity maps in the Wavelet domain and a particular sam-pling strategy.To efficiently solve the arising large scale optimization problem,we introduce a conjugate subgradi-ent method.Our numerical results show that we achieve accurate disparity map reconstruction by using only5%of the disparity information.Moreover,they illustrate the al-gorithm’s robustness to corrupted measurements.Yet,we have not optimized the computation time of our algorithm.Our Matlab implementation reconstructs a640×480disparity map in approximately one minute on a standard desktop computer.As the computationally most demanding part is the wavelet transform which has to be performed at each iteration,a massive speedup can be achieved when parallelizing the computations within each iteration.Subject matter of future research is to investigate if the reconstruction can be improved by further incorporating weights based on confidence measurements of the com-puted disparities into the datafidelity term.Furthermore, we currently develop a disparity estimator explicitly de-signed for sparse estimation which also delivers confidence measurements.As we do not pose any requirements on how the disparities are 
computed,applications like single image 2D-3D conversion where disparity cues are sparsely spread over an image could benefit from our approach. AcknowledgementsThis work has been partially supported by the Cluster of Excellence CoTeSys-Cognition for Technical Systems, funded by the Deutsche Forschungsgemeinschaft(DFG). References[1]S.Becker,J.Bobin,and E.Cand`e s.Nesta:a fast and accu-ratefirst-order method for sparse recovery.SIAM Journal on Imaging Sciences,4(1):1–39,2009.3[2]Y.Boykov,O.Veksler,and R.Zabih.Fast approximate en-ergy minimization via graph cuts.IEEE Transactions on Pat-tern Analysis and Machine Intelligence,23(11):1222–1239, 2001.1[3] E.J.Cand`e s and J.Romberg.Sparsity and incoherence incompressive sampling.Inverse Problems,23(3):969–985, 2007.2[4] E.J.Cand`e s,J.Romberg,and T.Tao.Robust uncertaintyprinciples:exact signal reconstruction from highly incom-plete frequency information.IEEE Transactions on Infor-mation Theory,52(2):489–509,2006.2[5] E.J.Cand`e s and T.Tao.Near-optimal signal recoveryfrom random projections:Universal encoding strategies?IEEE Transactions on Information Theory,52(12):5406–5425,2006.2[6]J.Canny.A computational approach to edge detection.IEEETransactions on Pattern Analysis and Machine Intelligence, 8(6):679–698,1986.3[7]S.S.Chen,D.L.Donoho,and M.A.Saunders.AtomicDecomposition by Basis Pursuit.SIAM Journal on Scientific Computing,1999.2[8] pressed sensing.IEEE Transactions onInformation Theory,52(4):1289–1306,2006.2[9] D.L.Donoho and M.Elad.Optimally sparse representa-tion in general(nonorthogonal)dictionaries viaℓ1minimiza-tion.Proceedings of the National Academy of Sciences of the United States of America,100(5):2197–2202,2003.2 [10] D.L.Donoho and X.Huo.Uncertainty principles and idealatomic decomposition.IEEE Transactions on Information Theory,47(7):2845–2862,2001.2,3[11]G.Egnal and R.P.Wildes.Detecting binocular half-occlusions:empirical comparisons offive approaches.IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(8):1127–1133,2002.1[12]M.Elad.Optimized projections for compressed sens-ing.IEEE Transactions on Signal Processing,55(12):5695–5702,2007.3[13]P.F.Felzenszwalb and D.P.Huttenlocher.Efficient beliefpropagation for early vision.International Journal of Com-puter Vision,70(1):41–54,2006.1[14] A.Fusiello,V.Roberto,and E.Trucco.Efficient stereo withmultiple windowing.In Proceedings of the10th IEEE Con-Algorithm Tsukuba Venus Teddy Conesnonocc all disc nonocc all disc[29]0.71 1.19 6.13 3.979.798.26(a)Ground truth MapsFigure2:Comparisonalgorithm using Canny Edge detector(CSC)and our CS based reconstruction algorithm using random sampling(CSR).In column(a):ground truth images;(b)and(c):the percentage of pixels whose reconstructed value differs by more than one from its true value,(b)with ground truth samplings and(c)from corrupted measurements;(d)and(e):the mean absolute disparity error over the entire disparity map,(d)with ground truth samplings and(e)from corrupted measurements.ference on Computer Vision and Pattern Recognition,pages 858–863,1997.1[15]M.Grant and S.Boyd.Graph implementations for nons-mooth convex programs.In Recent Advances in Learning and Control,Lecture Notes in Control and Information Sci-ences.3Figure 3:Reconstruction of the four test disparity maps from 5%measurements:CSC (1st column),CSR (2nd column),RDT (3rd column),reconstruction from computed disparities (4th column).[16]H.Hirschm¨u ller and D.Scharstein.Evaluation of cost func-tions for stereo matching.In Proceedings of the 20th IEEEConference 
on Computer Vision and Pattern Recognition ,pages 1–8,2007.1[17] A.Klaus,M.Sormann,and K.Karner.Segment-based stereomatching using belief propagation and a self-adapting dis-similarity measure.In 18th International Conference on Pat-tern Recognition ,pages 15–18,2006.1[18]V .Kolmogorov and R.Zabih.Multi-camera scene recon-struction via graph cuts.In Proceedings of the 7th European Conference on Computer Vision ,pages 82–96,2002.1[19]M.Lustig,D.Donoho,and J.M.Pauly.Sparse MRI:Theapplication of compressed sensing for rapid MR imaging.Magnetic Resonance in Medicine ,58(6):1182–1195,2007.3,4,5[20] B.K.Natarajan.Sparse approximate solutions to linear sys-tems.SIAM Journal on Computing ,24(2):227–234,1995.2[21]Y .Nesterov.Smooth minimization of non-smooth functions.Mathematical Programming ,103(1):127–152,2005.3[22]J.Nocedal and S.J.Wright.Numerical Optimization,2ndEd.Springer,New York,2006.5[23]Y .Ohta and T.Kanade.Stereo by intra-and inter-scanlinesearch using dynamic programming.IEEE Transactions onPattern Analysis and Machine Intelligence ,7(2):139–154,1985.1[24]M.Sarkis and K.Diepold.Sparse stereo matching using be-lief propagation.In IEEE International Conference on Image Processing (ICIP),pages 1780–1783,2008.1[25]D.Scharstein and R.Szeliski.A taxonomy and evaluation of dense two-frame stereo correspondence algorithms.Inter-national Journal of Computer Vision ,47(1-3):7–42,2002.1,5[26]J.Sun,N.-N.Zheng,and H.-Y .Shum.Stereo matching using belief propagation.IEEE Transactions on Pattern Analysis and Machine Intelligence ,25(7):787–800,2003.1[27]N.Vaswani and W.Lu.Modified-CS:modifying com-pressive sensing for problems with partially known support.IEEE Transactions on Signal Processing ,58(9):4595–4607,2010.4[28]P.Wolfe.A method of conjugate subgradients for minimiz-ing nondifferentiable functions.In Nondifferentiable Opti-mization ,volume 3of Mathematical Programming Studies ,pages 145–173.1975.4[29]K.-J.Yoon and I.S.Kweon.Adaptive support-weight ap-proach for correspondence search.IEEE Transactions on Pattern Analysis and Machine Intelligence ,28(4):650–656,2006.1,6,7。
English Abbreviations in Magnetic Resonance Imaging

MRI: Magnetic Resonance Imaging
NMRI: Nuclear Magnetic Resonance Imaging
MRA: Magnetic Resonance Angiography
CE-MRA: Contrast-Enhanced Magnetic Resonance Angiography
MRV: Magnetic Resonance Venography
VW-MRI: Vessel Wall Magnetic Resonance Imaging
MRCP: Magnetic Resonance Cholangiopancreatography
MRM: Magnetic Resonance Myelography
MRU: Magnetic Resonance Urography
MRN: Magnetic Resonance Neurography
CMR: Cardiovascular MR
fMRI: functional Magnetic Resonance Imaging
MRE: Magnetic Resonance Elastography
T1WI: T1-weighted imaging
T2WI: T2-weighted imaging
PDWI: proton density weighted imaging
EPI: echo planar imaging
MS-EPI: multi-shot echo planar imaging
DWI: diffusion weighted imaging (reduced-FOV diffusion: Philips ZOOM / Siemens ZOOMit / GE FOCUS)
ADC: apparent diffusion coefficient
DWIBS: diffusion weighted imaging with background suppression
RESOLVE: readout segmentation of long variable echo trains, multi-shot DWI with k-space segmented along the readout direction (Siemens)
MUSE: multi-slab parallel EPI, a multi-shot segmented EPI acquisition with sensitivity-encoded image reconstruction (GE)
DTI: diffusion tensor imaging
PWI: perfusion weighted imaging
BOLD: blood oxygenation level dependent
RF: radio frequency
TR: repetition time
TE: echo time (effective TE); Minimum TE: partial-echo technique
TI: inversion time
ES: echo spacing
ETL: echo train length
BW: bandwidth
FA: flip angle
TA: acquisition time
NA: number of acquisitions
NSA: number of signals averaged
NEX: number of excitations
TD: time of delay
WFS: water-fat shift
FC: flow compensation
TOF: time of flight
TRICKS: time-resolved imaging of contrast kinetics (dynamic contrast-enhanced angiography)
PC: phase contrast
VENC: velocity encoding
NPW: no phase wrap
IR: inversion recovery
MT: magnetization transfer
FT: Fourier transform
VPS: views per segment
BSP TI: blood suppression TI, the inversion time used for blood suppression (an IFIR parameter)
Sequences:
SE: spin echo
FSE: fast spin echo
TSE: turbo spin echo
FRFSE: fast recovery fast spin echo (GE)
TSE-Restore: fast-recovery fast spin echo (Siemens)
TSE DRIVE (driven equilibrium, DE): fast-recovery fast spin echo (Philips)
SSFSE: single-shot fast spin echo
HALF-SS-TSE: half-Fourier acquisition single-shot turbo spin echo (Philips)
HASTE: half-Fourier acquisition single-shot turbo spin echo (Siemens)
FLAIR: fluid attenuated inversion recovery
ASL: arterial spin labeling
BPAS: basi-parallel anatomical scanning, a scan oriented parallel to the vertebrobasilar system
FIR: fast inversion recovery (TIR: turbo inversion recovery)
DIR: dual (double) inversion recovery
The next three entries (VISTA / CUBE / SPACE) are vendor names for the same family of 3D fast-spin-echo acquisitions:
VISTA (3D VIEW): volume isotropic turbo spin echo acquisition (Philips)
CUBE: 3D fast spin echo with an extended echo train acquisition (GE)
SPACE: sampling perfection with application-optimized contrasts using different flip angle evolution (Siemens)
Gradient echo:
GRE: gradient recalled echo (GE)
FFE: fast field echo (Philips)
GE: gradient echo (Siemens)
TFE: turbo field echo
FISP: fast imaging with steady-state precession (Siemens)
PSIF (Siemens): a GRE sequence that acquires the stimulated echo; its timing is the reverse of FISP, hence the name (Philips: T2-FFE; GE: CE-GRASS, contrast-enhanced GRASS)
DESS: dual echo steady state (a Siemens-only 3D sequence that acquires the FISP and PSIF signals simultaneously; well suited to cartilage imaging)
MEDIC: multiple echo data image combination (Siemens)
MERGE: multiple echo recalled gradient echo (GE, 2D)
COSMIC: coherent oscillatory state acquisition for the manipulation of imaging contrast (GE, 3D multi-echo combined imaging)
mFFE: multiple fast field echo (Philips multi-echo)
SWI: susceptibility weighted imaging
QSM: quantitative susceptibility mapping
SSFP: steady state free precession (GE's GRE and Fast GRE belong to this type; Siemens: FISP; Philips: conventional FFE)
Balanced SSFP: balanced steady state free precession (Philips)
FIESTA: fast imaging employing steady-state acquisition (GE)
FIESTA-C: FIESTA with cycled phases, a dual-excitation variant (GE)
TrueFISP: true fast imaging with steady-state precession (Siemens)
CISS: constructive interference in the steady state (dual excitation)
B-FFE: balanced fast field echo (Philips)
TRANCE: triggered angiography non-contrast enhanced (Philips; Siemens: NATIVE TrueFISP; GE: IFIR, inflow inversion recovery)
QISS: quiescent-interval single-shot MR angiography, a non-contrast MRA technique used for peripheral MRA (Siemens)
Compressive_Sensing

An Introduction to "Compressive Sensing"
Efficient Computation Methods Research Group, Department of Electrical and Electronic Engineering, The University of Hong Kong
Wei Sha, November 20, 2008
Email: wsha@eee.hku.hk  Personal website: http://www.eee.hku.hk/~wsha

After coming to Hong Kong as a postdoc, my research remained focused on computational electromagnetics. I have been away from wavelets and computational harmonic analysis for a long time. Even so, I still receive letters from young scholars and engineers from time to time who want to discuss related questions with me. My theoretical contributions to this field are very modest; the only things I can show for it are a few posts published online and some simple, entry-level code. Still, from studying wavelets and the related theory I genuinely learned some interesting things and benefited a great deal. I strongly sense that more and more scholars and engineers are starting to use this tool to solve practical problems, and I have also noticed that discussions of the topic are multiplying on the Internet. But we can never stop at one stage, because knowledge in today's academic world is refreshed far too quickly. Just as after learning first-generation wavelets we must learn second-generation wavelets, and after that move on to directional wavelets (X-lets), I too keep picking up new knowledge through particular coincidences. Recently, for example, when a friend asked me to help her look into the topic of "compressive sensing", my interest was kindled once again. I spent a week reading the literature, thinking about the problems, and writing programs, until I had produced today's post. I hope this post can help scholars and engineers who have not yet entered this field but are eager to do so. I also hope that readers who, like me a week ago, know nothing at all about this "Big Idea" of the signal processing community will give it a try. Even more, I hope people will discuss the subject with me, because even now I am not entirely sure that some of my views on compressive sensing are correct, although my simple code of fewer than 50 lines works well. In this field the Chinese-born mathematician Terence Tao and the Stanford statistician Professor David Donoho have made important contributions. In this introduction I use simple keywords to explain why we need to understand, or even study, this field: from it we can learn matrix analysis, probability and statistics, topology and geometry, optimization and operations research, functional analysis, harmonic analysis (Fourier transforms, wavelet transforms, directional wavelet transforms, wavelet frames), signal processing, image processing, and more.
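The 50-line demo mentioned above is not reproduced in this post. As a rough stand-in, here is a minimal Python sketch (my own illustration, not the author's MATLAB code) that builds a sparse signal, takes far fewer random Gaussian measurements than its length, and recovers it with orthogonal matching pursuit. All sizes and names are illustrative.

```python
import numpy as np

def omp(A, y, k, tol=1e-8):
    """Recover a k-sparse x from y = A x by orthogonal matching pursuit."""
    m, n = A.shape
    residual = y.copy()
    support = []
    x_hat = np.zeros(n)
    for _ in range(k):
        # Pick the column most correlated with the current residual.
        j = int(np.argmax(np.abs(A.T @ residual)))
        if j not in support:
            support.append(j)
        # Least-squares fit on the current support.
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
        if np.linalg.norm(residual) < tol:
            break
    x_hat[support] = coef
    return x_hat

rng = np.random.default_rng(0)
n, m, k = 256, 64, 8                               # signal length, measurements, sparsity
x = np.zeros(n)
x[rng.choice(n, k, replace=False)] = rng.standard_normal(k)
A = rng.standard_normal((m, n)) / np.sqrt(m)       # random Gaussian sensing matrix
y = A @ x                                          # m << n linear measurements
x_rec = omp(A, y, k)
print("relative error:", np.linalg.norm(x_rec - x) / np.linalg.norm(x))
```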
Terence Tao at 10 and 30 (2020)

A plain-language explanation of compressed sensing. In my view, compressed sensing is one of the most dazzling achievements of signal processing since the start of the 21st century, and it has found effective applications in magnetic resonance imaging, image processing, and other areas. Behind its complicated mathematical formulation, compressed sensing theory carries a remarkably elegant idea. Starting from an imaginative insight and backed by rigorous mathematical proof, compressed sensing achieves a seemingly magical result: it breaks through the golden rule of signal processing, the Nyquist sampling theorem. That is, in the course of sampling a signal, it achieves with very few samples the same effect as full sampling.

I. What is compressed sensing (CS)?
Compressed sensing is also called compressed sampling; the latter name is perhaps the more intuitive of the two. Indeed, CS is a technique aimed at signal sampling: through certain means it realizes "compressed sampling"; more precisely, the data compression is accomplished during the sampling process itself. We therefore have to start from signal sampling.
1. We know that converting an analog signal into a digital signal a computer can process necessarily involves sampling. The question is how high the sampling frequency should be, that is, how densely or sparsely the samples should be taken, so that all the information in the original signal is preserved.
2. Nyquist gave the answer: twice the highest frequency of the signal. Ever since, the Nyquist sampling theorem has been regarded as the golden rule of digital signal processing.
3. As for why the factor is two: anyone who has studied signal processing knows that sampling in the time domain at intervals of τ causes the spectrum to be periodically extended in the frequency domain with period 1/τ. If the sampling frequency is lower than twice the highest frequency of the signal, the shifted spectral replicas overlap and aliasing occurs.
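To make the aliasing argument concrete, the following small Python example (an added illustration, not part of the original explanation) samples a 9 Hz sine at 12 Hz, below its 18 Hz Nyquist rate, and checks that the samples coincide with those of a 3 Hz sine, so the two signals are indistinguishable after sampling.

```python
import numpy as np

f_sig = 9.0                       # signal frequency (Hz)
fs = 12.0                         # sampling rate (Hz), below the Nyquist rate of 2 * 9 = 18 Hz
f_alias = abs(f_sig - fs)         # expected alias frequency: |9 - 12| = 3 Hz

t = np.arange(0, 2.0, 1.0 / fs)                 # two seconds of samples
samples = np.sin(2 * np.pi * f_sig * t)
alias = np.sin(2 * np.pi * (-f_alias) * t)      # a 3 Hz sine (with flipped sign)

# The undersampled 9 Hz sine produces essentially the same samples as the 3 Hz sine:
print(np.max(np.abs(samples - alias)))          # prints a value near machine precision
```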
4. Yet this seemingly unquestionable law was challenged by a few giants of the field. Candès was the first to realize that a breakthrough was possible and, with the help of the rare mathematical genius Terence Tao and of Candès's teacher Donoho, proposed the theory of compressed sensing, which asserts that if a signal is sparse, it can be reconstructed from far fewer samples than the sampling theorem requires.
Compressed Sensing: Theory and Applications

A vector is called sparse when only k of its N entries are nonzero, with k much smaller than N. In terms of equation (1), if the coefficient vector is sparse, then the signal x is said to have a sparse representation in that transform domain; that is, x is compressible.
[1] R. Baraniuk. A lecture on compressive sensing. IEEE Signal Processing Magazine, 2007, 24(4): 118–121.
At present, research on CS theory and applications is ongoing:
Well-known universities in the United States, Europe, and elsewhere, such as MIT, Rice, Stanford, and Duke, have set up dedicated research groups on CS. Rice University, for example, maintains a dedicated Compressive Sensing website (/cs) with a large collection of resources on the theory and the latest results in this direction.
Extending from a single orthogonal basis to a dictionary made of several orthogonal bases: within such a dictionary one adaptively searches for the orthogonal basis that best approximates a given class of signal features, chooses for each signal the basis best matched to its characteristics, and transforms the signal to obtain its sparsest representation.
Replacing basis functions with an overcomplete, redundant collection of functions, called a redundant dictionary: the elements of the dictionary are called atoms. The dictionary should be chosen to match the structure of the signals being approximated as closely as possible, and its construction is essentially unrestricted. Finding the best linear combination of K atoms from a redundant dictionary to represent a signal is called sparse approximation, or highly nonlinear approximation, of the signal.
This raises the question: is there a new way of acquiring and processing data such that, without losing information, a signal can be sampled at a rate far below what the Nyquist sampling theorem requires and reconstructed from only a small amount of data?
A theory that has emerged in recent years, compressed sensing (CS), shows that this is indeed possible.
Compressed sensing theory states that if a signal is compressible, or sparse in some transform domain, then a measurement matrix incoherent with the transform basis can be used to project the high-dimensional signal onto a low-dimensional space, and the original signal can then be reconstructed with high probability from this small number of projections by solving an optimization problem.
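As a concrete, hedged sketch of this measure-then-optimize recipe (my own example, not from the original slides), the following Python code takes random ±1 measurements of a sparse vector and reconstructs it by basis pursuit, i.e. min ||x||_1 subject to Phi x = y, rewritten as a linear program and solved with scipy.optimize.linprog.

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(1)
n, m, k = 128, 48, 6
x_true = np.zeros(n)
x_true[rng.choice(n, k, replace=False)] = rng.standard_normal(k)

Phi = rng.choice([-1.0, 1.0], size=(m, n)) / np.sqrt(m)   # random Bernoulli measurement matrix
y = Phi @ x_true                                          # few, non-adaptive projections

# Basis pursuit as an LP: write x = u - v with u, v >= 0 and minimize sum(u) + sum(v)
# subject to Phi (u - v) = y.
c = np.ones(2 * n)
A_eq = np.hstack([Phi, -Phi])
res = linprog(c, A_eq=A_eq, b_eq=y, bounds=[(0, None)] * (2 * n), method="highs")
x_hat = res.x[:n] - res.x[n:]

print("reconstruction error:", np.linalg.norm(x_hat - x_true))
```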
Compressive Sampling Techniques and Their Applications

Journal of Electronics & Information Technology, Vol. 32, No. 2, February 2010
An Introduction to Compressive Sampling and Its Applications
Jin Jian, Gu Yuan-tao, Mei Shun-liang (Department of Electronic Engineering, Tsinghua University, Beijing 100084, China)
Abstract: How to reduce the sampling rate when digitizing wideband analog signals, and how to compress large volumes of data effectively for storage, have long concerned researchers. This paper surveys a recently proposed signal processing method that addresses these problems, Compressive Sampling (CS), also known as Compressive Sensing. By taking non-adaptive linear measurements that preserve the structure of a sparse signal, rather than sampling it, the method can reconstruct the original signal accurately from only a small number of measurements, at a rate far below the Nyquist sampling rate. Besides introducing the basic principles and main implementation approaches, the paper lists a variety of applications and points out several open problems.
Keywords: Compressive Sampling (CS); sparsity; measurement matrix; signal reconstruction
CLC number: TN919.5; Document code: A; Article ID: 1009-5896(2010)02-0470-06; DOI: 10.3724/SP.J.1146.2009.00497
1 Introduction
The Shannon/Nyquist sampling theorem states that, to avoid information loss and recover the original signal without distortion, the sampling rate must be at least twice the signal bandwidth.
Chapter 09 Graphical Models Concepts in Compressed Sensing

Andrea Montanari*

Abstract: This paper surveys recent work on applying ideas from graphical models and message passing algorithms to solve large-scale regularized regression problems. In particular, the focus is on compressed sensing reconstruction via ℓ1-penalized least-squares (known as LASSO or BPDN). We discuss how to derive fast approximate message passing algorithms to solve this problem. Surprisingly, the analysis of such algorithms makes it possible to prove exact high-dimensional limit results for the LASSO risk. This paper will appear as a chapter in a book on 'Compressed Sensing' edited by Yonina Eldar and Gitta Kutyniok.

1 Introduction

The problem of reconstructing a high-dimensional vector x ∈ R^n from a collection of observations y ∈ R^m arises in a number of contexts, ranging from statistical learning to signal processing. It is often assumed that the measurement process is approximately linear, i.e., that

    y = Ax + w,    (1.1)
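The chapter's approximate message passing derivation is not reproduced here. As a simple stand-in for solving the ℓ1-penalized least-squares problem just mentioned, the sketch below uses plain iterative soft-thresholding (ISTA), a slower but easy-to-verify relative of the fast algorithms discussed in the chapter; all sizes and the penalty weight are illustrative.

```python
import numpy as np

def soft_threshold(z, t):
    """Entrywise soft-thresholding, the proximal operator of t * ||.||_1."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def ista(A, y, lam, n_iter=500):
    """Approximately minimize 0.5 * ||y - A x||_2^2 + lam * ||x||_1 by ISTA."""
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the smooth part's gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - y)
        x = soft_threshold(x - grad / L, lam / L)
    return x

# Small synthetic instance of y = A x + w with a sparse x.
rng = np.random.default_rng(2)
n, m = 200, 80
x_true = np.zeros(n)
x_true[rng.choice(n, 10, replace=False)] = 1.0
A = rng.standard_normal((m, n)) / np.sqrt(m)
y = A @ x_true + 0.01 * rng.standard_normal(m)

x_hat = ista(A, y, lam=0.02)
print("relative error:", np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true))
```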
Compressive-Sensing

Fortunately, These Problems Are Often Sparse. Compressive Sensing Exploits This Property
• Spectrum sensing: usually, at any given moment only a small percentage of the available spectrum is occupied.
• IC Trojan detection: Trojans clearly reveal themselves only under a small fraction of test vectors; further, they cover only a very small portion of a chip.
• Straggler detection: we only need to detect stragglers, which are a minority.
"Anomalies are the norm": compressive sensing exploits this.
97.5% of coefficients thrown out
Source: [Candès et al., 2008]
Traditional Signal Acquisition
[Diagram: signal → take N measurements → compress → keep K coefficients, K << N]
Take N measurements (at the Nyquist rate or higher) and perform an N-point transform, but only save K pieces of information
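A tiny sketch of this sample-everything-then-keep-K pipeline (an added illustration, with an arbitrary test signal and sizes): take all N samples, compute an N-point DCT, and retain only the K largest coefficients.

```python
import numpy as np
from scipy.fft import dct, idct

N, K = 1024, 32
t = np.linspace(0, 1, N, endpoint=False)
signal = np.sin(2 * np.pi * 5 * t) + 0.5 * np.sin(2 * np.pi * 12 * t)

coeffs = dct(signal, norm="ortho")             # N-point transform of all N samples
keep = np.argsort(np.abs(coeffs))[-K:]         # indices of the K largest coefficients
compressed = np.zeros(N)
compressed[keep] = coeffs[keep]                # save only K pieces of information

reconstructed = idct(compressed, norm="ortho")
# The error is small because this signal is compressible in the DCT domain.
print("relative error:", np.linalg.norm(reconstructed - signal) / np.linalg.norm(signal))
```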
Compressive Sensing(2)

Zhuliang Yu, College of Automation Science and Engineering, South China University of Technology
Motivation
• Digital revolution
• Demands from both applications and theory
CS framework
Mathematical Problems
• How to find suitable sparse representation of signal?
• How to design the sensing matrix?
• How to get the "measurements"?
• How to recover the sparse signal from the measurements?
• Can this apply to more kinds of signals?
• A general signal may not be sparse!
• It has been shown that most signals are sparse in some transform domain (DCT, wavelet, FT, ...)
• increasing numbers of modalities (acoustic, RF, visual, IR, UV)
• => deluge of data (how to acquire, store, fuse, process efficiently?)
Thesis Proposal: Compressed Sensing

Robust Sparse and Low-Rank Matrix Decomposition Based on S_{1/2} and l_a Modeling

Here ‖·‖_* denotes the sum of the singular values of a matrix, i.e., its nuclear norm. In practice, however, the observed system is usually affected by noise, and the matrix D cannot be written exactly as the sum of a low-rank matrix and a sparse matrix. For this reason, under the assumption ‖D − A − E‖_F ≤ δ (with δ > 0), reference [7] proposed Stable Principal Component Pursuit (SPCP) and showed that the squared error of the matrices A and E obtained by solving optimization problem (3) is bounded above by O(δ²).
    min_{A,E ∈ R^{m×n}}  ‖A‖_{S_{1/2}}^{1/2} + λ ‖E‖_{l_a}^{a}    s.t.  ‖D − A − E‖_F ≤ δ        (4)
where ‖A‖_{S_{1/2}} = (Σ_{i=1}^{r} σ_i^{1/2})^{2} denotes the l_{1/2} norm of the vector formed by the singular values {σ_i}_{i=1}^{r} of A, i.e., the Schatten-p norm with p = 1/2, abbreviated in this paper as the S_{1/2} norm, and ‖E‖_{l_a} = (Σ_{i=1}^{m} Σ_{j=1}^{n} |E_{ij}|^{a})^{1/a} denotes the l_a norm of the vectorization of E. The parameter a can be chosen by the following heuristic: δ > 0 corresponds to an approximate decomposition of D in the presence of noise, and one takes a = 1, i.e., the l_1 norm is used to describe the entrywise sparsity of the matrix; δ = 0 corresponds to an exact decomposition of D, and one takes a = 1/2, i.e., the l_{1/2} norm is used to further promote entrywise sparsity.
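For readers who want to check these definitions numerically, the following short Python snippet (an added illustration) evaluates the S_{1/2} quasi-norm and the l_a quasi-norm of a small matrix exactly as defined above.

```python
import numpy as np

def schatten_half_norm(A):
    """S_{1/2} (quasi-)norm: the l_{1/2} norm of the singular value vector."""
    s = np.linalg.svd(A, compute_uv=False)
    return np.sum(np.sqrt(s)) ** 2

def la_norm(E, a):
    """l_a (quasi-)norm of the vectorized matrix E."""
    return np.sum(np.abs(E) ** a) ** (1.0 / a)

A = np.outer(np.arange(1, 5), np.ones(3))      # a rank-1 example matrix
print(schatten_half_norm(A), la_norm(A, 1.0), la_norm(A, 0.5))
```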
    min_{x,z}  f(x) + g(z)
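The paper's S_{1/2} / l_a solver is not reproduced here. As a rough illustration of splitting D into a low-rank part A plus a sparse part E, the sketch below alternates singular-value soft-thresholding and entrywise soft-thresholding, i.e. it minimizes the standard nuclear-norm plus l_1 surrogate (stable PCP in Lagrangian form) rather than the S_{1/2} and l_a penalties of problem (4); sizes and weights are illustrative.

```python
import numpy as np

def svt(X, tau):
    """Singular value thresholding: proximal operator of tau * nuclear norm."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def soft(X, tau):
    """Entrywise soft thresholding: proximal operator of tau * l1 norm."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def lowrank_sparse_split(D, lam, tau=1.0, n_iter=200):
    """Alternate proximal steps to split D into a low-rank A plus a sparse E."""
    A = np.zeros_like(D)
    E = np.zeros_like(D)
    for _ in range(n_iter):
        A = svt(D - E, tau)          # low-rank update
        E = soft(D - A, tau * lam)   # sparse update
    return A, E

# Synthetic test: a rank-2 matrix corrupted by sparse, large-magnitude errors.
rng = np.random.default_rng(3)
m, n, r = 60, 50, 2
A_true = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))
E_true = np.zeros((m, n))
mask = rng.random((m, n)) < 0.05
E_true[mask] = 10.0 * rng.standard_normal(mask.sum())
D = A_true + E_true

A_hat, E_hat = lowrank_sparse_split(D, lam=1.0 / np.sqrt(max(m, n)))
# Prints the relative error of the recovered low-rank part (expected to be small).
print(np.linalg.norm(A_hat - A_true) / np.linalg.norm(A_true))
```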
The Sampling Theorem: Proof and Practical Application

The Sampling Theorem: Proof and Practical Application. By "假人" (pseudonym), May 2014.
Abstract: The spectrum of the sampled signal f_s(t) was observed experimentally and its characteristics were studied. In order to reconstruct the original signal f(t) from the sampled signal f_s(t) without distortion and without aliasing, the sampling frequency was increased so that only the desired signal spectrum F(ω) appears at the output of the filter, thereby further verifying the famous sampling theorem.
Keywords: sampling theorem; discrete spectrum; active low-pass filter.
1. The concept of sampling. Sampling means taking an instantaneous amplitude value (a sample) of a continuous-time signal at fixed time intervals T; the sampling is performed by a sampling gate. For a continuous-time signal f(t) band-limited to (0, f_h), if it is sampled at intervals no greater than 1/(2 f_h), then the original signal can be completely recovered from these samples. In other words, if the highest frequency in the spectrum of a continuous signal f(t) does not exceed f_h, then as long as the sampling frequency satisfies f_s ≥ 2 f_h, the sampled signal contains all the information of the original continuous signal with no loss, and the original continuous signal can be recovered from these samples whenever needed. This property is what makes analog-to-digital and digital-to-analog conversion possible.
2. Verification method. This project uses MATLAB to verify the time-domain sampling theorem. The goals are to become familiar with MATLAB and to learn its simulation techniques, with emphasis on flexibly applying the relevant theory and on mastering, understanding, and analyzing a few key commands; to gain a first grasp of linear-system design methods and develop the ability to work independently; and to deepen understanding of the time-domain sampling theorem while learning how to use MATLAB to analyze a system's frequency response and to implement continuous-signal sampling, spectral analysis, and recovery of the sampled signal. The reconstruction error is computed under three conditions, critical sampling, oversampling, and undersampling, and from this the effect of the sampling frequency on the reconstruction error is summarized, thereby verifying the time-domain sampling theorem.
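The original project uses MATLAB; as an equivalent sketch of the planned experiment (written here in Python to match the other examples in this collection), the code below samples a band-limited test signal under oversampling, critical sampling, and undersampling, reconstructs it by ideal sinc interpolation, and prints the relative reconstruction error in each case. The test signal and rates are illustrative, and some edge error remains even when oversampling because the signal is truncated to a finite duration.

```python
import numpy as np

def sample_and_reconstruct(f_max, fs, duration=1.0):
    """Sample a test signal at rate fs and reconstruct it by sinc interpolation."""
    t_fine = np.linspace(0, duration, 2000, endpoint=False)   # dense "continuous" grid
    signal = lambda t: np.cos(2 * np.pi * f_max * t) + 0.5 * np.cos(2 * np.pi * 0.4 * f_max * t)

    ts = np.arange(0, duration, 1.0 / fs)                     # sampling instants
    samples = signal(ts)

    # Ideal reconstruction: x(t) = sum_n x(nT) * sinc((t - nT) * fs)
    recon = np.sum(samples[:, None] * np.sinc((t_fine[None, :] - ts[:, None]) * fs), axis=0)
    return np.linalg.norm(recon - signal(t_fine)) / np.linalg.norm(signal(t_fine))

f_max = 10.0
for name, fs in [("oversampling", 4 * f_max), ("critical", 2 * f_max), ("undersampling", 1.2 * f_max)]:
    print(name, "fs =", fs, "relative error =", round(sample_and_reconstruct(f_max, fs), 3))
```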
MATLAB is a very powerful package for engineering computation and data analysis, widely used across many industries. The name MATLAB stands for "matrix laboratory". Besides outstanding numerical computation, it offers professional-grade symbolic computation, text processing, visual modeling and simulation, real-time control, and other capabilities. MATLAB's basic data unit is the matrix, and its expressions closely resemble the forms commonly used in mathematics and engineering, so solving a problem in MATLAB is much more concise than doing exactly the same thing in C or FORTRAN. Newer versions have also added support for C, C++, FORTRAN, and Java.
Introduction to Compressed Sensing

Chapter 1. Introduction to Compressed Sensing
images, videos, and other data can be exactly recovered from a set of uniformly spaced samples taken at the so-called Nyquist rate of twice the highest frequency present in the signal of interest. Capitalizing on this discovery, much of signal processing has moved from the analog to the digital domain and ridden the wave of Moore’s law. Digitization has enabled the creation of sensing and processing systems that are more robust, flexible, cheaper and, consequently, more widely used than their analog counterparts. As a result of this success, the amount of data generated by sensing systems has grown from a trickle to a torrent. Unfortunately, in many important and emerging applications, the resulting Nyquist rate is so high that we end up with far too many samples. Alternatively, it may simply be too costly, or even physically impossible, to build devices capable of acquiring samples at the necessary rate [146, 241]. Thus, despite extraordinary advances in computational power, the acquisition and processing of signals in application areas such as imaging, video, medical imaging, remote surveillance, spectroscopy, and genomic data analysis continues to pose a tremendous challenge. To address the logistical and computational challenges involved in dealing with such high-dimensional data, we often depend on compression, which aims at finding the most concise representation of a signal that is able to achieve a target level of acceptable distortion. One of the most popular techniques for signal compression is known as transform coding, and typically relies on finding a basis or frame that provides sparse or compressible representations for signals in a class of interest [31, 77, 106]. By a sparse representation, we mean that for a signal of length n, we can represent it with k n nonzero coefficients; by a compressible representation, we mean that the signal is well-approximated by a signal with only k nonzero coefficients. Both sparse and compressible signals can be represented with high fidelity by preserving only the values and locations of the largest coefficients of the signal. This process is called sparse approximation, and forms the foundation of transform coding schemes that exploit signal sparsity and compressibility, including the JPEG, JPEG2000, MPEG, and MP3 standards. Leveraging the concept of transform coding, compressed sensing (CS) has emerged as a new framework for signal acquisition and sensor design. CS enables a potentially large reduction in the sampling and computation costs for sensing signals that have a sparse or compressible representation. While the NyquistShannon sampling theorem states that a certain minimum number of samples is required in order to perfectly capture an arbitrary bandlimited signal, when the signal is sparse in a known basis we can vastly reduce the number of measurements that need to be stored. Consequently, when sensing sparse signals we might be able to do better than suggested by classical results. This is the fundamental idea behind CS: rather than first sampling at a high rate and then compressing the sampled data, we would like to find ways to directly sense the data in a compressed form — i.e., at a lower sampling rate. The field of CS grew out of the work of Cand` es, Romberg, and Tao and of Donoho, who showed that
1-Bit Compressive Sensing

Abstract—Compressive sensing is a new signal acquisition technology with the potential to reduce the number of measurements required to acquire signals that are sparse or compressible in some basis. Rather than uniformly sampling the signal, compressive sensing computes inner products with a randomized dictionary of test functions. The signal is then recovered by a convex optimization that ensures the recovered signal is both consistent with the measurements and sparse. Compressive sensing reconstruction has been shown to be robust to multi-level quantization of the measurements, in which the reconstruction algorithm is modified to recover a sparse signal consistent to the quantization measurements. In this paper we consider the limiting case of 1-bit measurements, which preserve only the sign information of the random measurements. Although it is possible to reconstruct using the classical compressive sensing approach by treating the 1-bit measurements as ±1 measurement values, in this paper we reformulate the problem by treating the 1bit measurements as sign constraints and further constraining the optimization to recover a signal on the unit sphere. Thus the sparse signal is recovered within a scaling factor. We demonstrate that this approach performs significantly better compared to the classical compressive sensing reconstruction methods, even as the signal becomes less sparse and as the number of measurements increases.
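The convex reconstruction described in the abstract is not reproduced here. As a rough illustration of working with sign-only measurements, the sketch below uses binary iterative hard thresholding (BIHT), a simple later algorithm from the 1-bit compressive sensing literature rather than the paper's formulation, to recover a unit-norm sparse vector consistent with the measurement signs; all sizes and the step size are illustrative.

```python
import numpy as np

def hard_threshold(x, k):
    """Keep the k largest-magnitude entries of x and zero out the rest."""
    out = np.zeros_like(x)
    idx = np.argsort(np.abs(x))[-k:]
    out[idx] = x[idx]
    return out

def biht(A, y_sign, k, n_iter=300, step=1.0):
    """Binary iterative hard thresholding for y_sign = sign(A x), with ||x|| fixed to 1."""
    m, n = A.shape
    x = np.zeros(n)
    x[0] = 1.0                                     # arbitrary nonzero starting point
    for _ in range(n_iter):
        grad = A.T @ (y_sign - np.sign(A @ x))     # push measurement signs toward consistency
        x = hard_threshold(x + (step / m) * grad, k)
        x /= np.linalg.norm(x)                     # the scale is unrecoverable; fix it to 1
    return x

rng = np.random.default_rng(4)
n, m, k = 128, 600, 5
x_true = np.zeros(n)
x_true[rng.choice(n, k, replace=False)] = rng.standard_normal(k)
x_true /= np.linalg.norm(x_true)

A = rng.standard_normal((m, n))
y = np.sign(A @ x_true)                            # only the signs of the measurements are kept
x_hat = biht(A, y, k)
# Correlation close to 1 indicates successful recovery up to the unknown scale.
print("correlation with truth:", float(x_true @ x_hat))
```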
a r X i v :m a t h /0611957v 1 [m a t h .S T ] 30 N o v 2006Sparsity and Incoherence in Compressive SamplingEmmanuel Cand`e s †and Justin Romberg ♯†Applied and Computational Mathematics,Caltech,Pasadena,CA 91125♯Electrical and Computer Engineering,Georgia Tech,Atlanta,GA 90332November 2006Abstract We consider the problem of reconstructing a sparse signal x 0∈R n from a limited number of linear measurements.Given m randomly selected samples of Ux 0,where U is an orthonormal matrix,we show that ℓ1minimization recovers x 0exactly when the number of measurements exceeds m ≥Const ·µ2(U )·S ·log n,where S is the number of nonzero components in x 0,and µis the largest entry in U properly normalized:µ(U )=√transform coefficients of x0.For an orthogonal matrix1U withU∗U=nI,(1.1) these transform coefficients are given by y0=Ux0.Of course,if all n of the coefficients y0are observed,recovering x0is trivial:we simply apply11On afirst reading,our choice of normalization of U may seem a bit strange.The advantages of taking the row√vectors of U to have Euclidean norm—m measurements can recover signals with sparsity S m/log(n/m),and all of the constants appearing in the analysis are small[13].Similar bounds have also appeared using greedy[28]and complexity-based[17]recovery algorithms in place ofℓ1minimization.Although theoretically powerful,the practical relevance of results for completely random measure-ments is limited in two ways.Thefirst is that we are not always at liberty to choose the types of measurements we use to acquire a signal.For example,in magnetic resonance imaging(MRI), subtle physical properties of nuclei are exploited to collect samples in the Fourier domain of a two-or three-dimensional object of interest.While we have control over which Fourier coefficients are sampled,the measurements are inherently frequency based.A similar statement can be made about tomographic imaging;the machinery in place measures Radon slices,and these are what we must use to reconstruct an image.The second drawback to completely unstructured measurement systems is computational.Random (i.e.unstructured)measurement ensembles are unwieldy numerically;for large values of m and n, simply storing or applyingΦ(tasks which are necessary to solve theℓ1minimization program)are nearly impossible.If,for example,we want to reconstruct a megapixel image(n=1,000,000)from m=25,000measurements(see the numerical experiment in Section2),we would need more than 3gigabytes of memory just to store the measurement matrix,and on the order of gigaflops to apply it.The goal from this point of view,then,is to have similar recovery bounds for measurement matricesΦwhich can be applied quickly(in O(n)or O(n log n)time)and implicitly(allowing us to use a“matrix free”recovery algorithm).Our main theorem,stated precisely in Section1.2and proven in Section3,states that bounds analogous to(1.3)hold for sampling with general orthogonal systems.We will show that for afixed signal support T of size|T|=S,the programminx x ℓ1subject to UΩx=UΩx 0(1.5)recovers the overwhelming majority of x0supported on T and observation subsetsΩof size|Ω|≥C·µ2(U)·S·log n,(1.6) whereµ(U)is simply the largest magnitude among the entries in U:µ(U)=maxk,j|U k,j|.(1.7) It is important to understand the relevance of the parameterµ(U)in(1.6).µ(U)can be interpreted as a rough measure of how concentrated the rows of U are.Since each row(or column)of U necessarily has anℓ2-norm equal to√n.When the rows of U are perfectlyflat—|U k,j|=1for each k,j,as in the case when U is the 
discrete Fourier transform,we will haveµ(U)=1,and(1.6)is essentially as good as(1.3).If a row of U is maximally concentrated—all the row entries but one vanish—thenµ2(U)=n,and(1.6)offers us no guarantees for recovery from a limited number of samples.This result is very intuitive.Suppose indeed that U k0,j0=√components of Ux0,which is just about the content of(1.6).This shows informally that(1.6)is fairly tight on both ends of the range of the parameterµ.For a particular application,U can be decomposed as a product of a sparsity basisΨ,and an orthogonal measurement systemΦ.Suppose for instance that we wish to recover a signal f∈R n from m measurements of the form y=Φf.The signal may not be sparse in the time domain but its expansion in the basisΨmay bef(t)=nj=1x0jψj(t),f=Ψx(the columns ofΨare the discrete waveformsψj).Our program searches for the coefficient sequence in theΨ-domain with minimumℓ1norm that explains the samples in the measurement domainΦ. In short,it solves(1.6)withU=ΦΨ,Ψ∗Ψ=I,Φ∗Φ=nI.The result(1.6)then tells us how the relationship between the sensing modality(Φ)and signal model (Ψ)affects the number of measurements required to reconstruct a sparse signal.The parameterµcan be rewritten asµ(ΦΨ)=maxk,j| φk,ψj |,and serves as a rough characterization of the degree of similarity between the sparsity and mea-surement systems.Forµto be close to its minimum value of1,each of the measurement vectors (rows ofΦ)must be“spread out”in theΨdomain.To emphasize this relationship,µ(U)is often referred to as the mutual coherence[11,12].The bound(1.6)tells us that an S-sparse signal can be reconstructed from∼S log n samples in any domain in which the test vectors are“flat”,i.e.the coherence parameter is O(1).1.2Main resultThe ability of theℓ1-minimization program(1.5)to recover a given signal x0depends only on1) the support set T of x0,and2)the sign sequence z0of x0on T.2For afixed support T,our main theorem shows that perfect recovery is achieved for the overwhelming majority of the combinations of sign sequences on T,and sample locations(in the U domain)of size m obeying(1.6).The language“overwhelming majority”is made precise by introducing a probability model on the setΩand the sign sequence z.The model is simple:selectΩuniformly at random from the set of all subsets of the given size m;choose each z(t),t∈T to be±1with probability1/2.Our main result is:Theorem1.1Let U be an n×n orthogonal matrix(U∗U=nI)with|U k,j|≤µ(U).Fix a subset T of the signal domain.Choose a subsetΩof the measurement domain of size|Ω|=m,and a sign sequence z on T uniformly at random.Suppose thatm≥C0·|T|·µ2(U)·log(n/δ)(1.8)and also m≥C′0·log2(n/δ)for somefixed numerical constants C0and C′0.Then with probability exceeding1−δ,every signal x0supported on T with signs matching z can be recovered from y=UΩx0 by solving(1.5).The hinge of Theorem1.1is a new weak uncertainty principle for general orthobases.Given T and Ωas above,it is impossible tofind a signal which is concentrated on T and onΩin the U domain. 
In the example above where U=ΦΨ,this says that one cannot be concentrated on small sets in theΨandΦdomains simultaneously.As noted in previous publications[3,4],this is a statement about the eigenvalues of minors of the matrix U.Let U T be the n×|T|matrix corresponding to the columns of U indexed by T,and let UΩT be the m×|T|matrix corresponding to the rows of U T indexed byΩ.In Section3,we will prove the following:Theorem1.2Let U,T,andΩbe as in Theorem1.1.Suppose that the number of measurements m obeysm≥|T|·µ2(U)·max(C1log|T|,C2log(3/δ)),(1.9) for some positive constants C1,C2.ThenP 13m2 x 2ℓ2≤ UΩx 2ℓ2≤papers ask that all sparse signals to be simultaneously recoverable from the same set of samples (which is stronger than our goal here),and their bounds have log factors of(log n)6and(log n)5 respectively.These results are based on uniform uncertainty principles,which require(1.10)to hold for all sets T of a certain size simultaneously onceΩis chosen.Whether or not this log power can be reduced in this context remains an open question.A contribution of this paper is to show that if one is only interested in the recovery of nearly all signals on afixed set T,these extra log factors can indeed be removed.We show that to guarantee exact recovery,we only require UΩT to be well behaved for thisfixed T as opposed to all T’s of the same size,which is a significantly weaker requirement.By examining the singular values of UΩT, one can check whether or not(1.11)holds.Our method of proof,as the reader will see in Section3,relies on a variation of the powerful results presented in[24]about the expected spectral norm of certain random matrices.We also introduce a novel large-deviation inequality,similar in spirit to those reviewed in[19,20]but carefully tailored for our purposes,to turn this statement about expectation into one about high probability. Finally,we would like to contrast this work with[29],which also draws on the results from[24]. 
First,there is a difference in how the problem is framed.In[29],the m×n measurement system is fixed,and bounds for perfect recovery are derived when the support and sign sequence are chosen at random,i.e.afixed measurement system works for most signal supports of a certain size.In this paper,wefix an arbitrary signal support,and show that we will be able to recover from most sets of measurements of a certain size in afixed domain.Second,although slightly more general class of measurement systems is considered in[29],thefinal bounds for sparse recovery in the context of(1.5)do not fundamentally improve on the uniform bounds cited above;[29]draws weaker conclusions since the results are not shown to be universal in the sense that all sparse signals are recovered as in[7]and[25].2ApplicationsIn the1990s,image compression algorithms were revolutionized by the introduction of the wavelet transform.The reasons for this can be summarized with two major points:the wavelet trans-form is a much sparser representation for photograph-like images than traditional Fourier-based representations,and it can be applied and inverted in O(n)computations.To exploit this wavelet-domain sparsity in acquisition,we must have a measurement system which is incoherent with the wavelet representation(so thatµin(1.6)is small)and that can be applied quickly and implicitly(so that large-scale recovery is computationally feasible).In this section,we present numerical experiments for two such measurement strategies.2.1Fourier sampling of sparse wavelet subbandsOurfirst measurement strategy takes advantage of the fact that atfine scales,wavelets are very much spread out in frequency.We will illustrate this in1D;the ideas are readily applied to2D image.6Figure 1:Wavelets in the magnitude of the discrete Fourier transform (2.1)of Daubechies-8wavelets for n =1024and j =1,2,3.The magnitude of ˆψj,·over the subband (2.3)is shown in bold.Labeling the scales of the wavelet transform by j =1,2,...,J ,where j =1is the finest scale and j =J the coarsest,the wavelets 3ψj,k at scale j are almost flat in the Fourier domain over a band of size n j =n 2−j .The magnitude of the Fourier transformˆψj,k (ω)=n t =1ψ(t )e −i 2π(t −1)ω/n ,ω=−n/2+1,...,n/2,(2.1)is the same for each wavelet at scale j ,sinceˆψj,k (ω)=e −i 2π(k −1)ω/n j ˆψj,1(ω).(2.2)These spectrum magnitudes are shown for the Daubechies-8wavelet in Figure 1.We see that over frequencies in the j th subbandω∈B j :={n j /2+1,...,n j }∪{−n j +1,...,−n j /2},(2.3)we have max ω∈B j |ˆψj,k (ω)|2.Suppose now that a signal x 0is a superposition of S wavelets at scale j ,that is,we can writex 0=Ψj w 0where w 0∈R n j is S -sparse,and Ψj is the n ×n j matrix whose columns are the ψj,k (t )for k =1,...,n j .We will measure x 0by selecting Fourier coefficients from the band B j at random.To see how this scheme fits into the domain of the results in the introduction,let ωindex the subband B j ,let F j be the n j ×n matrix whose rows are the Fourier vectors for frequencies in B j ,let D j be a diagonal matrix with (D j )ω,ω=ˆψj,1(ω),ω∈B j ,Table1:Number of measurements required to reconstruct a sparse subband.Here,n=1024,S is the sparsity of the subband,and M(S,j)is the smallest number of measurements so that the S-sparse subband at wavelet level j was recovered perfectly in1000/1000trials.j=1j=2j=3M(S,j)S M(S,j)1002535681524498-and consider the n j×n j systemU=D−1j F jΨj.The columns of F jΨj are just the Fourier transforms of the wavelets given in(2.2),(F jΨj)ω,k=e−i2π(k−1)ω/n 
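The coherence parameter µ(U) discussed above is easy to evaluate numerically. The following sketch (an added illustration, not from the paper) builds the DFT measurement system normalized so that Φ*Φ = nI, pairs it with the canonical spike basis and with an orthonormal Haar wavelet basis, and prints µ(U) = max |U_{k,j}|. The spike/Fourier pair attains the minimal value 1, while the full Haar/Fourier pair is maximally coherent because the coarsest Haar functions are concentrated at a single frequency, which is exactly why the measurement strategy above samples within fine-scale wavelet subbands.

```python
import numpy as np

def haar_basis(n):
    """Orthonormal Haar wavelet basis (as columns) for n a power of two."""
    H = np.array([[1.0]])
    while H.shape[0] < n:
        top = np.kron(H, [1.0, 1.0])
        bot = np.kron(np.eye(H.shape[0]), [1.0, -1.0])
        H = np.vstack([top, bot]) / np.sqrt(2.0)
    return H.T                                   # columns are the basis vectors

n = 64
k = np.arange(n)
Phi = np.exp(-2j * np.pi * np.outer(k, k) / n)   # DFT with |Phi_{k,t}| = 1, so Phi* Phi = n I
Psi_spike = np.eye(n)                            # canonical (spike) basis
Psi_haar = haar_basis(n)

for name, Psi in [("spike", Psi_spike), ("Haar", Psi_haar)]:
    U = Phi @ Psi
    print(name, "mu(U) =", np.max(np.abs(U)))    # 1.0 for spikes, about sqrt(n) for Haar
```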
jˆψj,1(ω)⇒Uω,k=e−i2π(k−1)ω/n j,and so U is just a n j×n j Fourier system.In fact,one can easily check that U∗U=U∗U=n j I. We choose a set of Fouier coefficientsΩof size m in the band B j,and measurey=FΩx0=FΩΨj w0,which can easily be turned into a set of samples in the U domain y′=UΩw0just by re-weightingy.Since the mutual incoherence of D−1F jΨj isµ=1,we can recover w0from∼S log n samples. Table1summarizes the results of the following experiment:Fix the scale j,sparsity S,and a number of measurements m.Perform a trial for(S,j,m)byfirst generating a signal support T of size S,a sign sequence on that support,and a measurement setΩj of size m uniformly at random,and then measuring y=FΩjΨj x0(x0is just the sign sequence on T and zero elsewhere),solving (1.5),and declaring success if the solution matches x0.A thousand trials were performed for each (S,j,m).The value M(S,j)recorded in the table is the smallest value of m such that the recovery was successful in all1000trials.As with the partial Fourier ensemble(see the numerical results in[4]),we can recover from m≈2S to3S measurements.To use the above results in an imaging system,we wouldfirst separate the signal/image into wavelet subband,measure Fourier coefficients in each subband as above,then reconstruct each subband independently.In other words,if P Wjis the projection operator onto the space spanned by the columns ofΨj,we measurey j=FΩj P Wjx0for j=1,...,J,then set w j to be the solution tomin w ℓ1subject to FΩjΨj w=y j.If all of the wavelet subbands of the object we are imaging are appropriately sparse,we will be able to recover the image perfectly.Finally,we would like to note that this projection onto W j in the measurement process can be avoided by constructing the wavelet and sampling systems a little more carefully.In[12],a“bi-sinusoidal”measurement system is introduced which complements the orthonormal Meyer wavelet8transform.These bi-sinusoids are an alternative orthobasis to the W j spanned by Meyer wavelets at a given scale(with perfect mutual incoherence),so sampling in the bi-sinusoidal basis isolates a given wavelet subband automatically.In the next section,we examine an orthogonal measurement system which allows us to forgo this subband separation all together.2.2Noiselet measurementsIn[8],a complex“noiselet”system is constructed that is perfectly incoherent with the Haar wavelet representation.IfΨis an orthonormal system of Haar wavelets,andΦis the orthogonal noiselet system(renormalized so thatΦ∗Φ=nI),then U=ΦΨhas entries of constant magnitude:|U k,j|=1,∀k,j which impliesµ(U)=1.Just as the canonical basis is maximally incoherent with the Fourier basis,so is the noiselet system with Haar wavelets.Thus if an n-pixel image is S-sparse in the Haar wavelet domain,it can be recovered(with high probability)from∼S log n randomly selected noiselet coefficients.In addition to perfect incoherence with the Haar transform,noiselets have two additional properties that make them ideal for coded image acquisition:1.The noiselet matrixΦcan be decomposed as a multiscalefilterbank.As a result,it can beapplied O(n log n)time.2.The real and imaginary parts of each noiselet function are binary valued.A noiselet mea-surement of an image is just an inner product with a sign pattern,which make their imple-mentation in an actual acquisition system easier.(It would be straightforward to use them in the imaging architecture proposed in[26],for example.)A large-scale numerical example is shown in Figure2.The n=10242pixel synthetic image in panel(a)is 
an exact superposition of S=25,000Haar wavelets4.The observation vector y was created from m=70,000randomly chosen noiselet coefficients(each noiselet coefficient has a real and imaginary part,so there are really140,000real numbers recorded).From y,we are able to recover the image exactly by solving(1.5).This result is a nice demonstration of the compressed sensing paradigm.A traditional acquisition process would measure all n∼106pixels,transform into the wavelet domain,and record the S that are important.Many measurements are made,but comparably very few numbers are recorded.Here we take only a fraction of the number of measurements,and are able tofind the S active wavelets coefficients without any prior knowledge of their locations.The measurement process can be adjusted slightly in a practical setting.We know that almost all of the coarse-scale wavelet coefficients will be important(see Figure2(b)),so we can potentiallyFigure2:Sparse image recovery from noiselet measurements.(a)Synthetic n=10242-pixel image with S=25,000non-zero Haar wavelet coefficients.(b)Locations(in the wavelet quadtree)of sig-nificant wavelet coefficients.(c)Image recovered from m=70,000complex noiselet measurements. The recovery matches(a)exactly.reduce the number of measurements needed for perfect recovery by measuring these directly.In fact,if we measure the128×128block of coarse wavelet coefficients for the image in Figure2 directly(equivalent to measuring averages over8×8blocks of pixels,16,384measurement total), we are able to recover the image perfectly from an additional41,808complex noiselet measurements (the total number of real numbers recorded is100,000).3Proofs3.1General strategyThe proof of Theorem1.1follows the program set forth in[4,15].As detailed in these references, the signal x0is the unique solution to(1.5)if and only if there exists a dual vectorπ∈R n with the following properties:•πis in the row space of UΩ,•π(t)=sgn x0(t)for t∈T,and•|π(t)|<1for t∈T c.We consider the candidateπ=U∗ΩUΩT(U∗ΩT UΩT)−1z0,(3.1) where z0is a|T|-dimensional vector whose entries are the signs of x0on T,and show that under the conditions in the theorem1)πis well defined(i.e.U∗ΩT UΩT is invertible),and given this2) |π(t)|<1on T c(we automatically have thatπis in the row space of UΩandπ(t)=sgn x(t)on T).10We want to show that with the supportfixed,a dual vector exists with high probability when selectingΩuniformly at random.Following[4],it is enough to show that the desired properties whenΩis sampled using a Bernoulli model.SupposeΩ1of size m is sampled uniformly at random, andΩ2is sampled by settingΩ2:={k:δk=1},where here and belowδ1,δ2,...,δn is a sequence of independent identically distributed0/1Bernoulli random variables withP(δk=1)=m/n.ThenP(Failure(Ω1))≤2P(Failure(Ω2))(3.2) (see[4]for details).With this established,we will establish the existence of a dual vector for x0 with high probability forΩsampled using the Bernoulli model.The matrix U∗ΩT UΩT is now a random variable,which can be written asU∗ΩT UΩT=nk=1δk u k⊗u k,where the u k are the row vectors of U T;u k=(U t,k)t∈T.3.2Proof of Theorem1.2Ourfirst result,which is an analog to a theorem of Rudelson[24,Th.1],states that if m is large enough,then on average the matrix m−1U∗ΩT UΩT deviates little from the identity.Theorem3.1Let U be an orthogonal matrix obeying(1.1).Consider afixed set T and letΩbe a random set sampled using the Bernoulli model.ThenE 1log|T|mmax1≤k≤n uk (3.3)for some positive constant C R,provided the right-hand side is less than1.Since the 
coherenceµ(U) obeysmax1≤k≤nu k ≤µ(U)mU∗ΩT UΩT−I ≤C R·µ(U) √mnk=1δk u k⊗u k−I.11Note that since U∗U=nI,E Y=1nu k⊗u k−I=1mnk=1δ′k u k⊗u k−I,(3.5)whereδ′1,...,δ′n are independent copies ofδ1,...,δn,and writeE Y ≤E Y−Y′ ,which follows from Jensen’s inequality and the law of iterated expectation(also known as Fubini’s theorem).Now letǫ1,...,ǫn be a sequence of Bernoulli variables taking values±1with probability 1/2(and independent of the sequencesδandδ′).We haveE Y ≤Eδ,δ′1m 1≤k≤nǫk(δk−δ′k)u k⊗u k≤2EǫEδ 1log|T|·maxk:δk=1 uk ·log|T|nk=1δk u k⊗u k≤C R/2· m·max1≤k≤n u k ·where the second inequality uses the fact that for a nonnegative random variable Z,E√E Z.Observe now thatEnk=1δk u k⊗u k =E mY+mI ≤m(E Y +1)and,therefore,(3.7)givesE Y ≤a· log|T|m·max1≤k≤n u k .It then follows that if a≤1,E Y ≤2a,which concludes the proof of the theorem.KBlog 1+BtmU∗ΩT UΩT−I and recall that1n u k⊗u kn u k⊗u kNote that E Y k=0.We are interested in the spectral norm Y .By definition,Y =supf1,f2 f1,Y f2 =supf1,f2nk=1 f1,Y k f2 ,where the supremum is over a countable collection of unit vectors.For afixed pair of unit vectors (f1,f2),let f(Y k)denote the mapping f1,Y k f2 .Since E f(Y k)=0,we can apply Theorem3.2 with B obeying|f(Y k)|≤| f1,u k u k,f2 |m≤B,for all k.As such,we can take B=max1≤k≤n u k 2/m.We now computeE f2(Y k)=mn | f1,u k u k,f2 |2 n 1−m m2| u k,f2 |2.Since 1≤k≤n| u k,f2 |2=n,we proved that1≤k≤n E f2(Y k)≤ 1−m m max1≤k≤n u k 2≤B.In conclusion,with Z= Y =¯Z,Theorem3.2shows thatP(| Y −E Y |>t)≤3exp −t1+E Y .(3.10)Take m large enough so that E Y ≤1/4in(3.4),and pick t=1/4.Since B≤µ2(U)|T|/m, (3.10)givesP( Y >1/2)≤3e−m3.3Proof of Theorem1.1With Theorem1.2established,we know that with high probability the eigenvalues of U∗ΩT UΩT will be tightly controlled—they are all between m/2and3m/2.Under these conditions,the inverse of(U∗ΩT UΩT)not only exists,but we can guarantee that (U∗ΩT UΩT)−1 ≤2/m,a fact which we will use to show|π(t)|<1for t∈T c.For a particular t0∈T c,we can rewriteπ(t0)asπ(t0)= v0,(U∗ΩT UΩT)−1z = w0,z ,where v0is the row vector of U∗ΩUΩT with row index t0,and w0=(U∗ΩT UΩT)−1v0.The following three lemmas give estimates for the sizes of these vectors.From now on and for simplicity,we drop the dependence on U inµ(U).14Lemma 3.1The second moment of Z 0:= v 0 obeysE Z 20≤µ2m |T |.(3.11)Proof Set λ0k =u k,t 0.The vector v 0is given by v 0=n k =1δk λ0k u k =n k =1(δk −E δk )λ0k u k ,where the second equality holds due to the orthogonality of the rows of U : 1≤k ≤n λ0k u k,t = 1≤k ≤n u k,t 0u k,t =0.We thus can view v 0as a sum of independent random variables:v 0=n k =1Y k ,Y k =(δk −m/n )λ0k u k ,(3.12)where we note that E Y k =0.It follows thatE Z 20= k E Y k ,Y k + k ′=k E Y k ,Y k ′ = kE Y k ,Y k .NowE Y k 2=m n |λ0k |2 u k 2≤m n |λ0k |2µ2|T |.Since k |λ0k |2=n ,we proved that E Z 20≤ 1−m The next result shows that the tail of Z 0exhibits a Gaussian behavior.Lemma 3.2Fix t 0∈T c and let Z 0= v 0 .Define σ2=µ2m ·max(1,µ|T |/√m >1and a ≤(m/µ2|T |)1/2otherwise.Then P (Z 0≥µ σ)≤e −γa 2,(3.13)for some positive constant γ>0.The proof of this lemma uses the powerful concentration inequality (3.9).Proof By definition,Z 0is given byZ 0=sup f =1 v 0,f =sup f =1n k =1Y k ,f 15(and observe Z 0=¯Z 0).For a fixed unit vector f ,let f (Y k )denote the mapping Y k ,f .Since E f (Y k )=0,we can apply Theorem 3.2with B obeying|f (Y k )|≤|λ0k || f,u k |≤|λ0k | u k ≤µ2|T |1/2:=B.Before we do this,we also need bounds on σ2and E Z 0.For the latter,we simply useE Z 0≤ m |T |.(3.14)For theformer E f 2(Y k )=m n |λ0k |2| 
u k ,f |2≤mnµ2| u k ,f |2.Since 1≤k ≤n | u k ,f |2=n ,we proved that 1≤k ≤n E f 2(Y k )≤mµ2 1−mKB log 1+Btm |T | .(3.15)Suppose now m |T |≥µ2m ,and fix t =aσ2.The same is true if m |T |and Bt ≤µ2m .We omit thedetails.The lemma follows from (3.14).|T |/m +2am T |+am (µσ)The claim follows.Lemma 3.4Assume that z (t ),t ∈T is an i.i.d.sequence of symmetric Bernoulli random variables.For each λ>0,we haveP sup t ∈T c |π(t )|>1 ≤2ne −1/2λ2+P sup t 0∈T cw 0 >λ .(3.17)Proof The proof is essentially an application of Hoeffding’s inequality [2].Conditioned on the w 0,this inequality states thatP | w 0,z |>1|w 0 ≤2e −12λ2,which proves the result.|T |/m+2a λ2≥2log(2n/δ).(3.19)Suppose µ|T |≥√σ≤µλ2≥1µ2|T |.(3.20)Suppose now that µ|T |≤√σ≤µσ/m and 116m16µ2min 1a 2 ≥2log(2n/δ).17This analysis shows that the second term is less thanδifm≥K1µ2max(|T|,log(n/δ))log(n/δ)for some constant K1.Finally,by Theorem1.2,the last term will be bounded byδifm≥K2µ2|T|log(n/δ)for some constant K2.In conclusion,we proved that there exists a constant K3such that the reconstruction is exact with probability at least1−δprovided that the number of measurements m obeysm≥K3µ2max(|T|,log(n/δ))log(n/δ).The theorem is proved.4DiscussionIt is possible that a version of Theorem1.1exists that holds for all sign sequences on a set T simultaneously,i.e.we can remove the condition that the signs are chosen uniformly at random. Proving such a theorem with the methods above would require showing that the random vector w0=(U∗ΩT UΩT)−1v0,where v0is as in(3.12),will not be aligned with thefixed sign sequence z.We conjecture that this is indeed true,but proving such a statement seems considerably more involved.The new large-deviation inequality of Theorem1.2can also be used to sharpen results presented in[3]about usingℓ1minimization tofind the sparsest decomposition of a signal in a union of bases.Consider a signal f∈R n that can be written as a sparse superposition of the columns of a dictionary D=(Ψ1Ψ2)where eachΨi is an orthonormal basis.In other words f=Dx0,where x0∈R2n has small support.Given such an f,we attempt to recover x0by solvingminx x ℓ1subject to Dx=f.(4.1) Combining Theorem1.2with the methods used in[3],we can establish that if|supp x|≤Const·nReferences[1]R.G.Baraniuk,M.Davenport,R.DeVore,and M.Wakin.The Johnson-Lindenstrauss lemma meetscompressed sensing.Submitted manuscript,June2006.[2]S.Boucheron,O.Bousquet,and G.Lugosi.Concentration inequalities.In O.Bousquet and G.R¨a tsch,editors,Advanced Lectures in Machine Learning,pages208–240.Springer,2004.[3]E.J.Cand`e s and J.Romberg.Quantitative robust uncertainty principles and optimally sparse decom-positions.Foundations of Comput.Math.,6(2):227–254,2006.[4]E.J.Cand`e s,J.Romberg,and T.Tao.Robust uncertainty principles:Exact signal reconstruction fromhighly incomplete frequency information.IEEE rm.Theory,52(2):489–509,February2006.[5]E.J.Cand`e s,J.Romberg,and T.Tao.Stable signal recovery from incomplete and inaccurate measure-m.on Pure and Applied Math.,59(8):1207–1223,2006.[6]E.J.Cand`e s and T.Tao.Decoding by linear programming.IEEE rm.Theory,51(12):4203–4215,December2005.[7]E.J.Cand`e s and T.Tao.Near-optimal signal recovery from random projections and universal encodingstrategies.IEEE rm.Theory,52(12),December2006.[8]R.Coifman,F.Geshwind,and p.Harmonic Analysis,10:27–44,2001.[9]pressed sensing.IEEE rm.Theory,52(4):1289–1306,April2006.[10]D.L.Donoho.For most large underdetermined systems of linear equations the minimalℓ1-norm solutionis also the sparsest m.Pure 
Appl.Math.,59(6):797–829,2006.[11]D.L.Donoho and M.Elad.Optimally sparse representation in general(non-orthogonal)dictionariesviaℓA,100:2197–2202,2003.[12]D.L.Donoho and X.Huo.Uncertainty principles and ideal atomic decomposition.IEEE rm.Theory,47:2845–2862,2001.[13]D.L.Donoho and J.Tanner.Neighborliness of randomly-projected simplices in high dimensions.Proc.A,102(27):9452–9457,2005.[14]M.Elad and A.M.Bruckstein.A generalized uncertainty principle and sparse representation in pairsof R N bases.IEEE rm.Theory,48:2558–2567,2002.[15]J.J.Fuchs.On sparse representations in arbitrary redundant bases.IEEE rm.Theory,50:1341–1344,2004.[16]R.Gribonval and M.Nielsen.Sparse representations in unions of bases.IEEE rm.Theory,49:3320–3325,2003.[17]J.Haupt and R.Nowak.Signal reconstruction from noisy random projections.IEEE rm.Theory,52(9):4036–4048,September2006.[18]T.Klein and E.Rio.Concentration around the mean for maxima of empirical processes.Ann.Probab.,33(3):1060–1077,2005.[19]M.Ledoux.The Concentration of Measure Phenomenon.American Mathematical Society,2001.[20]M.Ledoux and M.Talagrand.Probability in Banach Spaces,volume23of Ergebnisse der Mathematikund ihrer Grenzgegiete.Springer-Verlag,Berlin,1991.[21]S.Mallat.A Wavelet Tour of Signal Processing.Academic Press,San Diego,second edition,1999.19。