Robust Subspace Segmentation by Low-Rank Representation
CT-Based Segmentation of Cerebral Edema After Spontaneous Intracerebral Hemorrhage Using a Bimodal Classification Model Built from CT and T2-Weighted MR Images
陈明扬; 朱时才; 贾富仓; 李晓东; Ahmed Elazab; 胡庆茂. [Abstract] Segmentation of cerebral edema from computed tomography (CT) scans for patients with intracranial hemorrhage (ICH) is challenging as edema does not show a clear boundary on CT. By exploiting the clear boundary on T2-weighted magnetic resonance images, a method was proposed to segment edema on CT images through a model learned from 14 patients having both CT and T2-weighted images, using ground-truth edema from the T2-weighted images to train a classifier on features extracted from the CT images. By constructing negative samples around the positive samples, employing feature selection based on common subspace measures, and using a support vector machine, the classification model corresponding to the optimum segmentation accuracy was obtained. The method has been validated against 36 clinical head CT scans presenting ICH, yielding a mean Dice coefficient of 0.859±0.037, which is significantly higher than those of the region-growing method (0.789±0.036, P<0.0001), the semi-automated level-set method (0.712±0.118, P<0.0001), and the threshold-based method (0.649±0.147, P<0.0001). Comparative experiments show that a classifier trained purely from CT yields a significantly lower Dice coefficient (0.686±0.136, P<0.0001). The higher segmentation accuracy suggests that the clear boundaries of edema on T2-weighted images provide implicit constraints on CT images that help differentiate edema from neighboring brain tissues more accurately. The proposed method could provide a potential tool to quantify edema, evaluate the severity of pathological changes, and guide therapy of patients with ICH. % The blurred boundary that cerebral edema after spontaneous intracerebral hemorrhage presents on CT images is a severe challenge for automatic segmentation of the edema on CT.
Analysis of the Distribution of Neighboring Pixels in Images
谭啸, 冯久超. [Abstract] Starting from the imaging principles of digital images, the distribution of neighboring pixels is analyzed, and two new function models are proposed for the distribution of jumps between neighboring pixel values.
In the experiments, nonlinear least squares is used to fit functions to the distributions obtained from test images, and a difference energy function (DPF) is used to compare the newly proposed function models with the distribution functions of references [8,12].
The results show that the newly proposed function models achieve a better goodness of fit and better match the distribution of neighboring pixel values.
[Journal] Journal of Jilin University (Engineering and Technology Edition) [Year (Volume), Issue] 2011(041)002 [Pages] 6 [Keywords] information processing; distribution function; nonlinear least squares; neighborhood; pixel; jump. In recent years, researchers have carried out a large amount of work based on neighborhood properties, which have been widely applied to fields such as image segmentation and image reconstruction [1-5].
Meanwhile, based on the property that steganography weakens the correlation between neighboring pixels, watermarking and steganalysis researchers analyze steganographic behavior either directly through the jump relations between neighboring pixel values or by examining changes in the gray-level co-occurrence matrix (GLCM) and the histogram characteristic function (HCF) [6-11].
Clearly, the relationship between neighboring pixels plays an important role in image analysis.
Some researchers [8,12] hold that the distribution of jumps between neighboring pixel values follows a Gaussian or Laplacian distribution, and have obtained good fitting results in different experiments.
However, no existing work has analyzed whether this assumed distribution is actually correct.
This paper analyzes the distribution function of jumps between neighboring pixel values in digital images.
First, starting from the imaging principles of digital devices, the distribution of pixel values within a local region is analyzed, and by the law of large numbers these pixel values are shown to follow a Gaussian distribution.
Then, the differences in pixel distributions between regions are analyzed; through mathematical analysis and derivation the final expression of the distribution function is obtained, showing that jumps between neighboring pixel values are better described by a Laplacian distribution, and a more general distribution function is also given.
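To make the fitting experiment concrete, here is a minimal sketch (not the paper's exact procedure): it histograms horizontal jumps between neighboring pixel values of a test image and fits an unnormalized Gaussian and an unnormalized Laplacian model by nonlinear least squares; the sum of squared errors is used as a simple stand-in for the DPF criterion, and the test image, bin settings, and initial guesses are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import curve_fit
from skimage import data

def gaussian(d, a, sigma):
    # unnormalized Gaussian model of the jump histogram
    return a * np.exp(-d**2 / (2.0 * sigma**2))

def laplacian(d, a, b):
    # unnormalized Laplacian model of the jump histogram
    return a * np.exp(-np.abs(d) / b)

img = data.camera().astype(np.float64)
# horizontal jumps between neighboring pixel values
jumps = (img[:, 1:] - img[:, :-1]).ravel()

# empirical histogram of the jumps
hist, edges = np.histogram(jumps, bins=101, range=(-50, 50), density=True)
centers = 0.5 * (edges[:-1] + edges[1:])

pg, _ = curve_fit(gaussian, centers, hist, p0=[hist.max(), 5.0])
pl, _ = curve_fit(laplacian, centers, hist, p0=[hist.max(), 5.0])

# sum of squared errors of each fit, a simple stand-in for the paper's DPF comparison
sse_gauss = np.sum((gaussian(centers, *pg) - hist) ** 2)
sse_lap = np.sum((laplacian(centers, *pl) - hist) ** 2)
print(f"Gaussian fit SSE: {sse_gauss:.3e}, Laplacian fit SSE: {sse_lap:.3e}")
```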
1 Principles of digital image formation. The imaging process of a digital imaging device is as follows: a photosensitive component (CCD or CMOS) samples the natural scene to obtain a sampled signal, which is then processed by a certain algorithm (different devices use different algorithms) to produce the corresponding digital image [13].
Low-Rank Models in Signal and Data Processing: Theory, Algorithms, and Applications
\[
\min_{A}\ \operatorname{rank}(A),\quad \text{s.t.}\ \ \|\mathcal{P}(D)-\mathcal{P}(A)\|_F^2\le\varepsilon, \tag{2}
\]
so as to handle the case where the measurement data are noisy. Now consider how to recover the low-rank structure when the data contain strong noise. At first sight this problem seems solvable by traditional PCA, but in fact traditional PCA can accurately recover the underlying low-rank structure only when the noise is Gaussian. For non-Gaussian noise, if the noise is strong, even a very small number of corrupted entries can make traditional principal component analysis fail. Because PCA is extremely important in applications, many researchers have put great effort into improving its robustness and have proposed many methods claimed to be "robust", but none of them was rigorously proven to exactly recover the low-rank structure under suitable conditions. In 2009, Chandrasekaran et al. [CSPW2009] and Wright et al. [WGRM2009] independently proposed Robust PCA (RPCA). They considered how to recover the low-rank structure of data contaminated by sparse, gross noise:
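RPCA is commonly formulated as splitting the observed matrix into a low-rank part plus a sparse error part, min_{A,E} ‖A‖_* + λ‖E‖_1 s.t. D = A + E. Below is a minimal ADMM/ALM-style sketch of this decomposition, assuming the usual default λ = 1/√max(m,n) and a heuristic choice of the penalty μ; it is an illustration, not the reference implementation of [CSPW2009] or [WGRM2009].

```python
import numpy as np

def svt(M, tau):
    """Singular value thresholding: prox of tau * nuclear norm."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def shrink(M, tau):
    """Entrywise soft thresholding: prox of tau * l1 norm."""
    return np.sign(M) * np.maximum(np.abs(M) - tau, 0.0)

def rpca(D, lam=None, mu=None, n_iter=200, tol=1e-7):
    """Recover a low-rank A and a sparse E with D = A + E (inexact ALM sketch)."""
    m, n = D.shape
    lam = lam or 1.0 / np.sqrt(max(m, n))
    mu = mu or 0.25 * m * n / np.sum(np.abs(D))
    A = np.zeros_like(D); E = np.zeros_like(D); Y = np.zeros_like(D)
    for _ in range(n_iter):
        A = svt(D - E + Y / mu, 1.0 / mu)          # low-rank update
        E = shrink(D - A + Y / mu, lam / mu)       # sparse update
        R = D - A - E
        Y = Y + mu * R                             # dual (multiplier) update
        if np.linalg.norm(R) <= tol * np.linalg.norm(D):
            break
    return A, E

# small demo: rank-2 matrix plus sparse gross corruptions
rng = np.random.default_rng(0)
L = rng.standard_normal((100, 2)) @ rng.standard_normal((2, 80))
S = np.zeros_like(L); idx = rng.random(L.shape) < 0.05
S[idx] = 10 * rng.standard_normal(idx.sum())
A, E = rpca(L + S)
print("recovered rank:", np.linalg.matrix_rank(A, tol=1e-3))
```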
b) Multi-subspace models
RPCA can extract only a single subspace from the data; it cannot characterize finer structures of the data within that subspace. The simplest case of such fine structure is the multi-subspace model, in which the data are distributed near several subspaces and we need to find those subspaces. Yi Ma et al. call this the Generalized PCA (GPCA) problem [VMS2015]. Many algorithms existed before, such as algebraic methods and RANSAC, but none of them had theoretical guarantees. The emergence of sparse representation provided a new way of thinking about this problem. In 2009, E. Elhamifar and R. Vidal exploited the mutual representation among samples and proposed the Sparse Subspace Clustering (SSC) model [EV2009] under the objective that the representation coefficient matrix is sparse (replacing rank(Z) in (6) with ‖Z‖_1).
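A minimal sketch of the SSC idea described above: each point is written as a sparse combination of the other points (here via a Lasso solver rather than the exact ℓ1 program of [EV2009]), the coefficient matrix is symmetrized into an affinity, and spectral clustering segments the points; the regularization strength and the toy data are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.cluster import SpectralClustering

def ssc_affinity(X, alpha=0.01):
    """X: D x N data matrix; returns an N x N affinity from sparse self-representation."""
    D, N = X.shape
    C = np.zeros((N, N))
    for j in range(N):
        idx = [i for i in range(N) if i != j]          # exclude the point itself (c_jj = 0)
        lasso = Lasso(alpha=alpha, fit_intercept=False, max_iter=5000)
        lasso.fit(X[:, idx], X[:, j])
        C[idx, j] = lasso.coef_
    return np.abs(C) + np.abs(C).T                      # symmetrized affinity |C| + |C|^T

# demo: points drawn from two 1-D subspaces (lines) in R^3
rng = np.random.default_rng(1)
u, v = rng.standard_normal(3), rng.standard_normal(3)
X = np.hstack([np.outer(u, rng.standard_normal(30)), np.outer(v, rng.standard_normal(30))])
W = ssc_affinity(X)
labels = SpectralClustering(n_clusters=2, affinity="precomputed", random_state=0).fit_predict(W)
print(labels)
```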
* This work was supported by the National Natural Science Foundation of China (61272341, 61231002).
An FCM Clustering Algorithm with Iterative Tikhonov Regularization
蒋莉芳; 苏一丹; 覃华. [Abstract] The fuzzy C-means (FCM) clustering algorithm suffers from an ill-posedness problem: noise in the data can distort the clustering. To address this, an iterative Tikhonov-regularized FCM algorithm is proposed: a regularization penalty term is introduced into the FCM objective, an iterative formula for the optimal regularization parameter is derived, and the L-curve method is used to search for the optimal parameter during the iterations, which improves the noise resistance of FCM and overcomes the ill-posedness. Experiments on UCI and synthetic data sets show that the proposed algorithm achieves higher clustering accuracy than traditional FCM, needs over 10 times fewer iterations, and is more robust to noise, so using iterative Tikhonov regularization to overcome the ill-posedness of traditional FCM is feasible. % The FCM algorithm has an ill-posed problem. Regularization can reduce the distortion of the model solution caused by fluctuations in the data, and it can improve the precision and robustness of FCM by controlling the solution error caused by the ill-posedness. An iterative Tikhonov regularization term was introduced into the objective (ITR-FCM), the L-curve method was used to select the optimal regularization parameter iteratively, and the convergence rate of the algorithm was further improved using a dynamic Tikhonov scheme. Five UCI datasets and five artificial datasets were chosen for the test. The results show that iterative Tikhonov regularization is an effective solution to the ill-posed problem, and that ITR-FCM has better convergence speed, accuracy, and robustness. [Journal] Computer Engineering and Design [Year (Volume), Issue] 2017(038)009 [Pages] 5 (P2391-2395) [Keywords] fuzzy C-means clustering; ill-posed problem; Tikhonov regularization; regularization parameter; L-curve [Authors] 蒋莉芳; 苏一丹; 覃华 [Affiliation] School of Computer, Electronics and Information, Guangxi University, Nanning, Guangxi 530004, China (all three authors) [Language] Chinese [CLC Number] TP389.1. The fuzzy C-means algorithm has been widely applied in fields such as image segmentation, pattern recognition, and fault diagnosis [1-6].
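As a rough illustration of regularized FCM (not the paper's ITR-FCM: the exact penalty, its derived update formulas, and the L-curve parameter selection are not reproduced here), the sketch below runs standard FCM but damps the center update toward the previous centers with a Tikhonov-style weight lam, which is an assumption made purely for illustration.

```python
import numpy as np

def fcm_tikhonov(X, c, m=2.0, lam=0.1, n_iter=100, tol=1e-6, seed=0):
    """X: N x d data; c: number of clusters; lam: illustrative Tikhonov-style damping weight."""
    rng = np.random.default_rng(seed)
    N, d = X.shape
    U = rng.random((c, N)); U /= U.sum(axis=0)          # fuzzy memberships, columns sum to 1
    V = X[rng.choice(N, c, replace=False)]              # initial cluster centers
    for _ in range(n_iter):
        Um = U ** m
        # center update: weighted mean, damped toward the previous centers (Tikhonov-style term)
        V_new = (Um @ X + lam * V) / (Um.sum(axis=1, keepdims=True) + lam)
        dist = np.linalg.norm(X[None, :, :] - V_new[:, None, :], axis=2) + 1e-10
        U = 1.0 / (dist ** (2.0 / (m - 1.0)))
        U /= U.sum(axis=0)                               # standard FCM membership update
        if np.linalg.norm(V_new - V) < tol:
            V = V_new
            break
        V = V_new
    return U, V

# demo on two Gaussian blobs
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.3, (50, 2)), rng.normal(3, 0.3, (50, 2))])
U, V = fcm_tikhonov(X, c=2)
print("centers:\n", V)
```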
Sparse Discriminant Analysis
Abstract: To address the problems that, in manifold-embedding dimensionality-reduction methods, constructing the neighborhood graph in the high-dimensional original space tends to work poorly for subsequent tasks, and that it is difficult to assign suitable values to the neighborhood size and the heat-kernel parameter, a sparse discriminant analysis algorithm (SEDA) is proposed.
First, sparse representation is used to build a sparse graph that preserves the global information and geometric structure of the data, overcoming the shortcomings of manifold-embedding methods; second, with sparsity preservation as a regularization term, the Fisher discriminant criterion is applied to obtain the optimal projection.
Experimental results on a set of high-dimensional data sets show that SEDA is a very effective semi-supervised dimensionality-reduction method.
Keywords: discriminant analysis; sparse representation; neighborhood graph; sparse graph. Sparse Discriminant Analysis. CHEN Xiao-dong1*, LIN Huan-xiang2. 1. School of Information and Engineering, Zhejiang Radio and Television University, Hangzhou Zhejiang 310030, China; 2. School of Information and Electronic Engineering, Zhejiang University of Science and Technology, Hangzhou Zhejiang 310023, China. Abstract: Methods for manifold embedding have the following issues: on one hand, the neighborhood graph is constructed in the high dimensionality of the original space, where it tends to work poorly; on the other hand, appropriate values for the neighborhood size and the heat-kernel parameter involved in graph construction are generally difficult to assign. To address these problems, a new semi-supervised dimensionality reduction algorithm called sparse discriminant analysis (SEDA) was proposed. Firstly, SEDA sets up a sparse graph to preserve the global information and geometric structure of the data based on sparse representation. Secondly, it applies both the sparse graph and the Fisher criterion to seek the optimal projection. Experiments on a broad range of data sets show that SEDA is superior to many popular dimensionality reduction methods. Key words: discriminant analysis; sparse representation; neighborhood graph; sparse graph. 0 Introduction. In applications such as information retrieval, text classification, image processing, and biological computing, the data encountered are high-dimensional.
ICML_NIPS_ICCV_CVPR(14~18)
ICML2014ICML20151. An embarrassingly simple approach to zero-shot learning2. Learning Transferable Features with Deep Adaptation Networks3. A Theoretical Analysis of Metric Hypothesis Transfer Learning4. Gradient-based hyperparameter optimization through reversible learningICML20161. One-Shot Generalization in Deep Generative Models2. Meta-Learning with Memory-Augmented Neural Networks3. Meta-gradient boosted decision tree model for weight and target learning4. Asymmetric Multi-task Learning based on Task Relatedness and ConfidenceICML20171. DARLA: Improving Zero-Shot Transfer in Reinforcement Learning2. Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks3. Meta Networks4. Learning to learn without gradient descent by gradient descentICML20181. MSplit LBI: Realizing Feature Selection and Dense Estimation Simultaneously in Few-shotand Zero-shot Learning2. Understanding and Simplifying One-Shot Architecture Search3. One-Shot Segmentation in Clutter4. Meta-Learning by Adjusting Priors Based on Extended PAC-Bayes Theory5. Bilevel Programming for Hyperparameter Optimization and Meta-Learning6. Gradient-Based Meta-Learning with Learned Layerwise Metric and Subspace7. Been There, Done That: Meta-Learning with Episodic Recall8. Learning to Explore via Meta-Policy Gradient9. Transfer Learning via Learning to Transfer10. Rapid adaptation with conditionally shifted neuronsNIPS20141. Zero-shot recognition with unreliable attributesNIPS2015NIPS20161. Learning feed-forward one-shot learners2. Matching Networks for One Shot Learning3. Learning from Small Sample Sets by Combining Unsupervised Meta-Training with CNNs NIPS20171. One-Shot Imitation Learning2. Few-Shot Learning Through an Information Retrieval Lens3. Prototypical Networks for Few-shot Learning4. Few-Shot Adversarial Domain Adaptation5. A Meta-Learning Perspective on Cold-Start Recommendations for Items6. Neural Program Meta-InductionNIPS20181. Bayesian Model-Agnostic Meta-Learning2. The Importance of Sampling inMeta-Reinforcement Learning3. MetaAnchor: Learning to Detect Objects with Customized Anchors4. MetaGAN: An Adversarial Approach to Few-Shot Learning5. Regret bounds for meta Bayesian optimization with an unknown Gaussian process prior6. Meta-Gradient Reinforcement Learning7. Meta-Reinforcement Learning of Structured Exploration Strategies8. Meta-Learning MCMC Proposals9. Probabilistic Model-Agnostic Meta-Learning10. MetaReg: Towards Domain Generalization using Meta-Regularization11. Zero-Shot Transfer with Deictic Object-Oriented Representation in Reinforcement Learning12. Uncertainty-Aware Few-Shot Learning with Probabilistic Model-Agnostic Meta-Learning13. Multitask Reinforcement Learning for Zero-shot Generalization with Subtask Dependencies14. Stacked Semantics-Guided Attention Model for Fine-Grained Zero-Shot Learning15. Delta-encoder: an effective sample synthesis method for few-shot object recognition16. One-Shot Unsupervised Cross Domain Translation17. Generalized Zero-Shot Learning with Deep Calibration Network18. Domain-Invariant Projection Learning for Zero-Shot Recognition19. Low-shot Learning via Covariance-Preserving Adversarial Augmentation Network20. Improved few-shot learning with task conditioning and metric scaling21. Adapted Deep Embeddings: A Synthesis of Methods for k-Shot Inductive Transfer Learning22. Learning to Play with Intrinsically-Motivated Self-Aware Agents23. Learning to Teach with Dynamic Loss Functiaons24. 
Memory Replay GANs: learning to generate images from new categories without forgettingICCV20151. One Shot Learning via Compositions of Meaningful Patches2. Unsupervised Domain Adaptation for Zero-Shot Learning3. Active Transfer Learning With Zero-Shot Priors: Reusing Past Datasets for Future Tasks4. Zero-Shot Learning via Semantic Similarity Embedding5. Semi-Supervised Zero-Shot Classification With Label Representation Learning6. Predicting Deep Zero-Shot Convolutional Neural Networks Using Textual Descriptions7. Learning to Transfer: Transferring Latent Task Structures and Its Application to Person-Specific Facial Action Unit DetectionICCV20171. Supplementary Meta-Learning: Towards a Dynamic Model for Deep Neural Networks2. Attributes2Classname: A Discriminative Model for Attribute-Based Unsupervised Zero-ShotLearning3. Low-Shot Visual Recognition by Shrinking and Hallucinating Features4. Predicting Visual Exemplars of Unseen Classes for Zero-Shot Learning5. Learning Discriminative Latent Attributes for Zero-Shot Classification6. Spatial-Aware Object Embeddings for Zero-Shot Localization and Classification of ActionsCVPR20141. COSTA: Co-Occurrence Statistics for Zero-Shot Classification2. Zero-shot Event Detection using Multi-modal Fusion of Weakly Supervised Concepts3. Learning to Learn, from Transfer Learning to Domain Adaptation: A Unifying Perspective CVPR20151. Zero-Shot Object Recognition by Semantic Manifold DistanceCVPR20162. Multi-Cue Zero-Shot Learning With Strong Supervision3. Latent Embeddings for Zero-Shot Classification4. One-Shot Learning of Scene Locations via Feature Trajectory Transfer5. Less Is More: Zero-Shot Learning From Online Textual Documents With Noise Suppression6. Synthesized Classifiers for Zero-Shot Learning7. Recovering the Missing Link: Predicting Class-Attribute Associations for UnsupervisedZero-Shot Learning8. Fast Zero-Shot Image Tagging9. Zero-Shot Learning via Joint Latent Similarity Embedding10. Learning to Read Chest X-Rays: Recurrent Neural Cascade Model for Automated ImageAnnotation11. Learning to Co-Generate Object Proposals With a Deep Structured Network12. Learning to Select Pre-Trained Deep Representations With Bayesian Evidence Framework13. DeepStereo: Learning to Predict New Views From the World’s ImageryCVPR20171. One-Shot Video Object Segmentation2. FastMask: Segment Multi-Scale Object Candidates in One Shot3. Few-Shot Object Recognition From Machine-Labeled Web Images4. From Zero-Shot Learning to Conventional Supervised Classification: Unseen Visual DataSynthesis5. Learning a Deep Embedding Model for Zero-Shot Learning6. Low-Rank Embedded Ensemble Semantic Dictionary for Zero-Shot Learning7. Multi-Attention Network for One Shot Learning8. Zero-Shot Action Recognition With Error-Correcting Output Codes9. One-Shot Metric Learning for Person Re-Identification10. Semantic Autoencoder for Zero-Shot Learning11. Zero-Shot Recognition Using Dual Visual-Semantic Mapping Paths12. Matrix Tri-Factorization With Manifold Regularizations for Zero-Shot Learning13. One-Shot Hyperspectral Imaging Using Faced Reflectors14. Gaze Embeddings for Zero-Shot Image Classification15. Zero-Shot Learning - the Good, the Bad and the Ugly16. Link the Head to the “Beak”: Zero Shot Learning From Noisy Text Description at PartPrecision17. Semantically Consistent Regularization for Zero-Shot Recognition18. Semantically Consistent Regularization for Zero-Shot Recognition19. Zero-Shot Classification With Discriminative Semantic Representation Learning20. 
Learning to Detect Salient Objects With Image-Level Supervision21. Quad-Networks: Unsupervised Learning to Rank for Interest Point DetectionCVPR20181. A Generative Adversarial Approach for Zero-Shot Learning From Noisy Texts2. Transductive Unbiased Embedding for Zero-Shot Learning3. Zero-Shot Visual Recognition Using Semantics-Preserving Adversarial EmbeddingNetworks4. Learning to Compare: Relation Network for Few-Shot Learning5. One-Shot Action Localization by Learning Sequence Matching Network6. Multi-Label Zero-Shot Learning With Structured Knowledge Graphs7. “Zero-Shot” Super-Resolution Using Deep Internal Learning8. Low-Shot Learning With Large-Scale Diffusion9. CLEAR: Cumulative LEARning for One-Shot One-Class Image Recognition10. Zero-Shot Sketch-Image Hashing11. Structured Set Matching Networks for One-Shot Part Labeling12. Memory Matching Networks for One-Shot Image Recognition13. Generalized Zero-Shot Learning via Synthesized Examples14. Dynamic Few-Shot Visual Learning Without Forgetting15. Exploit the Unknown Gradually: One-Shot Video-Based Person Re-Identification byStepwise Learning16. Feature Generating Networks for Zero-Shot Learning17. Low-Shot Learning With Imprinted Weights18. Zero-Shot Recognition via Semantic Embeddings and Knowledge Graphs19. Webly Supervised Learning Meets Zero-Shot Learning: A Hybrid Approach for Fine-Grained Classification20. Few-Shot Image Recognition by Predicting Parameters From Activations21. Low-Shot Learning From Imaginary Data22. Discriminative Learning of Latent Features for Zero-Shot Recognition23. Multi-Content GAN for Few-Shot Font Style Transfer24. Preserving Semantic Relations for Zero-Shot Learning25. Zero-Shot Kernel Learning26. Neural Style Transfer via Meta Networks27. Learning to Estimate 3D Human Pose and Shape From a Single Color Image28. Learning to Segment Every Thing29. Leveraging Unlabeled Data for Crowd Counting by Learning to Rank。
Subspace Clustering
[Ren e Vidal][Applications in motionsegmentation andface clustering]T he past few years have witnessed an explo-sion in the availability of data from multi-ple sources and modalities.For example,millions of cameras have been installedin buildings,streets,airports,and citiesaround the world.This has generated extraordinary advan-ces on how to acquire,compress,store,transmit,and processmassive amounts of complex high-dimensional data.Many of these advances have relied on the observationthat,even though these data sets are high dimensional,theirintrinsic dimension is often much smaller than the dimension of theambient space.In computer vision,for example,the number of pixels in an ©DIGITAL STOCK&LUSPHIXimage can be rather large,yet most computer vision models use only a few parameters to describe the appearance,geometry,and dynamics of a scene.This has motivated the development of a number of techniques for finding a low-dimensional representation of a high-dimen-sional data set.Conventional techniques,such as principal component analysis(PCA),assume that the data are drawn from a single low-dimensional subspace of a high-dimensional space.Such approaches have found widespread applications in many fields,e.g.,pattern recognition,data compression,image processing,and bioinformatics.In practice,however,the data points could be drawn from multiple subspaces,and the mem-bership of the data points to the subspaces might be unknown.For instance,a video sequence could contain several moving objects,and different subspaces might be needed to describe the motion of different objects in the scene.Therefore,there is a need to simultaneously cluster the data into multiple subspaces and find a low-dimensional subspace fitting each group of points. This problem,known as subspace clustering,has found numerous applications in computer vision(e.g.,image segmentation[1],motion segmentation[2],and face clustering[3]),image pro-cessing(e.g.,image representation and compression[4]),and systems theory(e.g.,hybrid system identification[5]).Digital Object Identifier10.1109/MSP.2010.939739Date of publication:17February2011methods from the machine learningand computer vision communities,including algebraic methods[7]–[10],iterative methods [11]–[15],statistical methods [16]–[20],and spectral clustering-based methods [7],[21]–[27].We review these methods,discuss their advantages and disadvantages,and evaluate their performance on the motion segmentation and face-clustering problems.THE SUBSPACE CLUSTERING PROBLEM Consider the problem of modeling a collection of data points with a union of subspaces,as illustrated in Figure 1.Specifically,let f x j 2R D g Nj ¼1be a given set of points drawn from an unknown union of n 1linear or affine subspaces f S i g ni ¼1of unknowndimensions d i ¼dim (S i ),05d i 5D ,i ¼1;...;n .The subspa-ces can be described as S i ¼f x 2R D:x ¼l i þU i y g ,i ¼1,...,n ,(1)where l i 2R Dis an arbitrary point in subspace S i that can bechosen as l i ¼0for linear subspaces,U i 2R D 3d i is a basis forsubspace S i ,and y 2R d iis a low-dimensional representation forpoint x .The goal of subspace clustering is to find the number ofsubspaces n ,their dimensions f d i g ni ¼1,the subspace basesf U ig n i ¼1,the points f l i g ni ¼1,and the segmentation of the pointsaccording to the subspaces.When the number of subspaces is equal to one,this problemreduces to finding a vector l 2R D ,a basis U 2R D 3d,a low-dimensional representation Y ¼½y 1;...;y N 2R d 3N ,and the dimension d .This problem is known as 
PCA [28].(The problem ofmatrix factorization dates back to the work of Beltrami [29]and Jordan [30].In the context of stochastic signal process-ing,PCA is also known as Karhunen-Loeve transform [31].In the applied statistics literature,PCA is also known as Eck-art-Young decomposition [32].)PCA can be solved in a remark-ably simple way:l ¼(1=N )P Nj ¼1x j is the mean of the data points (U ;Y )can be obtained from the rank-d singular value decomposition (SVD)of the (mean-subtracted)data matrix X ¼½x 1Àl ,x 2Àl ,...,x N Àl 2R D 3N asU ¼UandY ¼R V >,whereX ¼U R V >,(2)and d can be obtained as d ¼rank(X )with noise-free data or using model-selection techniques when the data are noisy [28].When n 41,the subspace clustering problem becomes sig-nificantly more difficult due to a number of challenges.n First,there is a strong coupling between data segmenta-tion and model estimation.Specifically,if the segmentation of the data is known,one could easily fit a single subspaceneither the segmentation of the data nor the subspace parameters are known,and one needs to solve both problems simultaneously.n Second,the distribution of the data inside the subspaces is generally unknown.If the data within each subspace are distributed around a cluster center and the cluster centersfor different subspaces are far apart,the subspace clusteringproblem reduces to the simpler and well-studied centralclustering problem.However,if the distribution of the data points in the subspaces is arbitrary,the subspace clustering problem cannot be solved by central clustering techniques.In addition,the problem becomes more difficult when manypoints lie close to the intersection of two or more subspaces.n Third,the position and orientation of the subspaces rela-tive to each other can be arbitrary.As we will show later,when the subspaces are disjoint or independent,the sub-space clustering problem can be solved more easily.How-ever,when the subspaces are dependent,the subspaceclustering problem becomes much harder.(n linear sub-spaces are disjoint if every two subspaces intersect only at the origin.n linear subspaces are independent if the dimension of their sum is equal to the sum of their dimen-sions.Independent subspaces are disjoint,but the converse is not always true.n affine subspaces are disjoint,inde-pendent,if so are the corresponding linear subspaces in homogeneous coordinates.)n The fourth challenge is that the data can be corrupted by noise,missing entries,and outliers.Although robust estima-tion techniques for handling such nuisances have been devel-oped for the case of a single subspace,the case of multiple subspaces is not well understood.n The fifth challenge is model selection.In classical PCA,the only parameter is subspace dimension,which can befound by searching for the subspace of the smallest dimension[FIG1]A set of sample points in R drawn from a union of three subspaces:two lines and a plane.that fits the data with a given accuracy.In the case of multiple subspaces,one can fit the data with N different subspaces of dimension one,i.e.,one subspace per data point,or with a single subspace of dimension D .Obviously,neither solution is satisfactory.The challenge is to find a model-selection criteria that favors a small number of sub-spaces of small dimensions.In what follows,we present a number of subspace clustering algorithms and show how they try to address these challenges.SUBSPACE CLUSTERING ALGORITHMSALGEBRAIC ALGORITHMSWe first review two algebraic algorithms for clustering noise-free data drawn from 
multiple linear subspaces,i.e.,l i ¼0.The first algorithm is based on linear algebra,specifically matrix factorization,and is provably correct for independent sub-spaces.The second one is based on polynomial algebra and is provably correct for both dependent and independent subspaces.Although these algorithms are designed for linear subspaces,in the case of noiseless data,they can also be applied to affine subspaces by using homogeneous coordinates,thus interpreting an affine subspace of dimension d in R D as a linear subspace of dimension d þ1in R D þ1.(The homogeneous coordinates of x 2R D are given by ½x >1 >2R D þ1.)Also,while these algorithms operate under the assumption of noise-free data,they provide great insights into the geometry and algebra of the subspace clustering problem.Moreover,they can be extended to handle moderate amounts of noise.MATRIX FACTORIZATION-BASED ALGORITHMSThese algorithms obtain the segmentation of the data from a low-rank factorization of the data matri X .Hence,they are a natural extension of PCA from one to multiple independent linear subspaces.Specifically,let X i 2R D 3N i be the matrix containing the N i points in subspace i .The columns of the data matrix can be sorted according to the n subspaces as ½X 1,X 2,...,X n ¼X C ,where C 2R N 3N is an unknown permutation matrix.Because each matrix X i is of rank d i ,it can be factorized asX i ¼U i Y ii ¼1,...,n ,(3)where U i 2R D 3d i is an orthogonal basis for subspace i and Y i 2R d i 3N i is the low-dimensional representation of the points with respect to U i .Therefore,if the subspaces are independent,then r ¼Drank(X )¼P n i ¼1d i min f D ,N g andX C ¼U 1,U 2,ÁÁÁ,U n ½ Y 1Y 2...Y n2666437775¼DUY ,(4)where U 2R D 3r and Y 2R r 3N .The subspace clustering prob-lem is then equivalent to finding a permutation matrix C ,suchthat X C admits a rank-r factorization into a matrix U and a block diagonal matrix Y .This idea is the basis for the algorithms of Boult and Brown [7],Costeira and Kanade [8],and Gear [9],which compute C from the SVD of X [7],[8]or from the row echelon canonical form of X [9].Specifically,the Costeira and Kanade algorithm proceeds as follows.Let X ¼U R V >be the rank-r SVD of the data matrix,i.e.,U 2R D 3r ,R 2R r 3r ,and V 2R N 3r .Also,letQ ¼VV >2R N 3N :ð5ÞAs shown in [2]and [33],the matrix Q is such thatQ jk ¼0if points j and k are in different subspaces :(6)In the absence of noise,(6)can be used to obtain the segmenta-tion of the data by applying spectral clustering to the eigenvectors of Q [7](see the ‘‘Spectral Clustering-Based Methods’’section)or by sorting and thresholding the entries of Q [8],[34].For instance,[8]obtains the segmentation by maximizing the sum of the squared entries of Q in different groups,while [34]finds the groups by thresholding a subset of the rows of Q .However,as noted in [33]and [35],this thresholding process is very sensitive to noise.Also,the construction of Q requires knowledge of the rank of X ,and using the wrong rank can lead to very poor results [9].Wu et al.[35]use an agglomerative process to reduce the effect of noise.The entries of Q are first thresholded to obtain an initial oversegmentation of the data.A subspace is then fit to each group G i ,and two groups are merged when the distance between their subspaces is below a threshold.A similar approach is followed by Kanatani et al.[33],[36],except that the geometric Akaike informa-tion criterion [37]is used to decide when to merge the two groups.Although these approaches indeed reduce the effect 
of noise,in practice,they are not effective because the equation Q jk ¼0holds only when the subspaces are independent.In the case of dependent subspaces,one can use the subset of the columns of V that do not span the intersections of the subspaces.Unfortunately,we do not know which columns to choose a priori.Zelnik-Manor and Irani [38]propose to use the top columns of V to define Q .However,this heuristic is not provably correct.Another issue with factorization-based algorithms is that,with a few exceptions,they do not provide a method for computing the number of subspaces,n ,and their dimensions,f d i g n i ¼1.The first exception is when n is known.In this case,d i can be computed from each group after the segmenta-tion has been obtained.The second exception is for independent subspaces of equal dimension d .In this case rank(X )¼nd ,hence we may determine n when d is known or vice versa.GENERALIZED PCAGeneralized PCA (GPCA;see [10]and [39])is an algebraic-geometric method for clustering data lying in (not necessarily independent)linear subspaces.The main idea behind GPCA is that one can fit a union of n subspaces with a set of polynomials of degree n ,whose derivatives at a point give a vector normal to the subspace containing that point.The segmentation of thedata is then obtained by grouping these normal vectors using several possible techniques.The first step of GPCA,which is not strictly needed,is to project the data points onto a subspace of R D of dimension r¼d maxþ1,where d max¼max f d1,...,d n g.(The value of r is determined using model-selection techniques when the subspace dimensions are unknown.)The rationale behind this step is as fol-lows.Since the maximum dimension of each subspace is d max,a projection onto a generic subspace of R D of dimension d maxþ1 preserves the number and dimensions of the subspaces with probabilty one.As a by-product,the subspace clustering problem is reduced to clustering subspa-ces of dimension at most d maxin R d maxþ1.As we shall see,thisstep is very important to reducethe computational complexityof the GPCA algorithm.With anabuse of notation,we will denotethe original and projected sub-spaces as S i,and the original and projected data matrix asX¼½x1,...,x N 2R D3N or R r3N:(7)The second step is to fit a homogeneous polynomial of degree n to the(projected)data.The rationale behind this step is as fol-lows.Imagine,for instance,that the data came from the union of two planes in R3,each one with normal vector b i2R3.The union of the two planes can be represented as a set of points, such that p(x)¼(b>1x)(b>2x)¼0.This equation is nothing but the equation of a conic of the formc1x21þc2x1x2þc3x1x3þc4x22þc5x2x3þc6x23¼0:(8)Imagine now that the data came from the plane b>x¼0or the line b>1x¼b>2x¼0.The union of the plane and the line is the set of points,such that p1(x)¼(b>x)(b>1x)¼0and p2(x)¼(b>x)(b>2x)¼0.More generally,data drawn from the union of n subspaces of R r can be represented with polynomialsof the form p(x)¼(b>1x)ÁÁÁ(b>n x)¼0,where the vector b i2R r is orthogonal to S i.Each polynomial is of degree n in x and can be written as c>m n(x),where c is the vector of coefficients and m n(x) is the vector of all monomials of degree n in x.There areM n(r)¼nþrÀ1nindependent monomials;hence,c2R M n(r).In the case of noiseless data,the vector of coefficients c of each polynomial can be computed fromc>½m n(x1),m n(x2),ÁÁÁ,m n(x N) ¼D c>V n¼0>(9)and the number of polynomials is simply the dimension of the null space of V n.While in general the relationship 
between the number of subspaces,n,their dimensions,f d i g n i¼1,and the number of polynomials involves the theory of Hilbert functions[40],in the particular case where all the dimensions are equal tod and r¼dþ1,there is a unique polynomial that fits the data. This fact can be exploited to determine both n and d.For exam-ple,given d,n can be computed asn¼min f i:rank(V i)¼M i(r)À1g:(10)In the case of data contaminated with small-to-moderate amounts of noise,the polynomial coefficients(9)can be found using least squares—the vectors c are the left singular vectors ofV n corresponding to the small-est singular values.To handlelarger amounts of noise in theestimation of the polynomialcoefficients,one can resort totechniques from robust statis-tics[20]or rank minimization[41].Model-selection techni-ques can be used to determine the rank of V n and,hence,the number of polynomials,as shown in[42].Model-selection techniques can also be used to determine the number of sub-spaces of equal dimensions in(10),as shown in[10].However, determining n and f d i g n i¼1for subspaces of different dimen-sions from noisy data remains a challenge.The reader is referred to[43]for a model-selection criteria called minimum effective dimension,which measures the complexity of fitting n subspaces of dimensions f d i g n i¼1to a given data set within a certain tolerance,and to[40]and[42]for algebraic relation-ships among n,f d i g n i¼1and the number of polynomials,which can be used for model-selection purposes.The last step is to compute the normal vectors b i from the vec-tor of coefficients c.This can be done by taking the derivatives of the polynomials at a data point.For example,if n¼2,we have r p(x)¼(b>2x)b1þ(b>1x)b2.Thus,if x belongs to the first sub-space,then r p(x)$b1.More generally,in the case of n subspaces,we have p(x)¼(b>1x)ÁÁÁ(b>n x)and r p(x)$b i if x2S i.We can use this result to obtain the set of all normal vectors to S i from the derivatives of all the polynomials at x2S i.This gives us a basis for the orthogonal complement of S i from which we can obtain a basis U i for S i.Therefore,if we knew one point per sub-space,f y i2S i g n i¼1,we could compute the n subspace basesf U ig ni¼1from the gradient of the polynomials at f y i g n i¼1and then obtain the segmentation by assigning each point f x j g N j¼1to its clos-est subspace.A simple method for choosing the points f y i g n i¼1is to select any data point as y1to obtain the basis U1for the first sub-space S1.After removing the points that belong to S1from the data set,we can choose any of the remaining data points as y2to obtain U2,hence S2,and then repeat this process until all the subspaces are found.In the‘‘Spectral Clustering-Based Methods’’section,we will describe an alternative method based on spectral clustering.The first advantage of GPCA is that it is an algebraic algo-rithm;thus,it is computationally cheap when n and d are small.Second,intersections between subspaces are automati-cally allowed;hence,GPCA can deal with both independent andGENERALIZED PCA IS AN ALGEBRAIC-GEOMETRIC METHOD FORCLUSTERING DATA LYING IN (NOT NECESSARILY INDEPENDENT)LINEAR SUBSPACES.dependent subspaces.Third,in the noiseless case,it does notrequire the number of subspaces or their dimensions to be known beforehand.Specifically,the theory of Hilbert functions may be used to determine n and f d i g ,as shown in [40].The first drawback of GPCA is that its complexity increases expo-nentially with n and f d i g .Specifically,each vector c is of dimension O(M n (r 
)),while there are only O(r P ni ¼1ðr Àd i )Þunknowns inthe n sets of normal vectors.Second,the vector c is computed usingleast squares;thus,the computation of c is sensitive to outliers.Third,the least-squares fit does not take into account nonlinearconstraints among the entries of c (recall that p ðx )must factorize as a product of linear factors).These issues cause the performance of GPCA to deteriorate as n increases.Fourth,the method in [40]to determine n and f d i g ni ¼1does not handle noisy data.Fifth,while GPCA can be applied to affinesubspaces by using homogene-ous coordinates,in our experi-ence,this does not work very well when the data are conta-minated with noise.ITERATIVE METHODSA very simple way of improvingthe performance of algebraic algorithms in the case of noisy datais to use iterative refinement.Intuitively,given an initial seg-mentation,we can fit a subspace to each group using classicalPCA.Then,given a PCA model for each subspace,we can assigneach data point to its closest subspace.By iterating these two steps,we can obtain a refined estimate of the subspaces and seg-mentation.This is the basic idea behind the K -planes [11]algo-rithm,which generalizes the K -means algorithm [44]from data distributed around multiple cluster centers to data drawn from multiple hyperplanes.The K -subspaces algorithm [12],[13]further generalizes K -planes from multiple hyperplanes to multi-ple affine subspaces of any dimensions and proceeds as follows.Let w ij ¼1if point j belongs to subspace i and w ij ¼0otherwise.Referring back to (1),assume that the number of subspaces n andthe subspace dimensions f d i g ni ¼1are known.Our goal is to findthe points f l i 2R D g n i ¼1,the subspace bases f U i 2RD 3d i g ni ¼1,the low-dimensional representations f Y i 2R d i 3N i g ni ¼1,and thesegmentation of the data f w ij g j ¼1,...,Ni ¼1,...,n .We can do so by minimiz-ing the sum of the squared distances from each data point toits own subspacemin f l i g ,f U i g ,f y i g ,f w ij gXn i ¼1X N j ¼1w ij k x j Àl i ÀU i y j k 2subject to w ij 2f 0,1g and Xn i ¼1w ij ¼1:(11)Given f l i g ,f U i g ,and f y j g ,the optimal value for w ij isw ij ¼1if i ¼arg min k ¼1,...,nk x j Àl k ÀU k y j k 20else:((12)Given f w ij g ,the cost function in (11)decouples as the sum of n cost functions,one per subspace.Since each cost function is identical to that minimized by standard PCA,the optimal values for l i ,U i ,and y j are obtained by applying PCA to each group of points.The K -subspaces algorithm then proceeds by alternatingbetween assigning points to subspaces and reestimating the sub-spaces.Since the number of possible assignments of points to subspaces is finite,the algorithm is guaranteed to converge to a local minimum in a finite number of iterations.The main advantage of K -subspaces is its simplicity since it alternates between assigning points to subspaces and estimating the subspaces via PCA.Another advantage is that it can handle both linear and affine subspaces explicitly.The third advantageis that it converges to a local optimum in a finite number ofiterations.However,K -subspaces suffers from a number of draw-backs.First,its convergence to the global optimum depends on a good initialization.If a random initialization is used,several restarts are often neededto find the global optimum.In practice,one may use any of the algorithms described in this article to reduce the number of restarts needed.We refer the reader to [22]and [45]for two addi-tional initialization methods.Second,K -subspaces is 
sensitive to outliers,partly due to the use of the ‘2-norm.This issue can be addressed using a robust norm,such as the ‘1-norm,as done by the median K -flat algorithm [15].However,this results in a morecomplex algorithm,which requires solving a robust PCA problemat each iteration.Alternatively,one can resort to nonlinear mini-mization techniques,which are only guaranteed to converge to a local minimum.Third,K -subspaces requires n and f d i g n i ¼1to be known beforehand.One possible avenue to be explored is to usethe model-selection criteria for mixtures of subspaces proposed in [43].We refer the reader to [45]and [46]for a more detailed analy-sis of some of the aforementioned issues.STATISTICAL METHODS The approaches described so far seek to cluster the data according to multiple subspaces using mostly algebraic and geometric proper-ties of a union of subspaces.While these approaches can handle noise in the data,they do not make explicit assumptions about the distribution of data inside the subspaces or about the distribution ofnoise.Therefore,the estimates they provide are not optimal,e.g.,in a maximum likelihood (ML)sense.This issue can be addressed by defining a proper generative model for the data,as described next.MIXTURE OF PROBABILISTIC PCA Resorting back to the geometric PCA model (1),probabilistic PCA (PPCA)[47]assumes that the data within a subspace S is generated as x ¼l þU y þ ,(13)where y and are independent zero-mean Gaussian random vec-tors with covariance matrices I and r 2I ,respectively.Therefore,A VERY SIMPLE WAY OF IMPROVINGTHE PERFORMANCE OF ALGEBRAICALGORITHMS IN THE CASE OF NOISYDATA IS TO USE ITERATIVEREFINEMENT.x is also Gaussian with mean l and covariance matrix R¼UU>þr2I.It can be shown that the ML estimate of l is the mean of the data,and ML estimates of U and r can be obtained from the SVD of the data matrix X.PPCA can be naturally extended to a generative model for a union of subspaces[ni¼1S i by using a mixture of PPCA(MPPCA) model[16].Let G(x;l,R)be the probability density function of a D-dimensional Gaussian with mean l and covariance matrix R. 
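Before the mixture model below, a minimal sketch of the single-subspace PPCA maximum-likelihood estimates mentioned above (the Tipping–Bishop closed form): the mean is the sample mean, the noise variance is the average of the discarded eigenvalues of the sample covariance, and the basis is built from the top-d eigenvectors; the toy data are an illustrative assumption.

```python
import numpy as np

def ppca_ml(X, d):
    """X: D x N data matrix; returns ML estimates (mu, W, sigma2) of the PPCA model x = mu + W y + eps."""
    D, N = X.shape
    mu = X.mean(axis=1, keepdims=True)
    Xc = X - mu
    # eigen-decomposition of the sample covariance (equivalently, SVD of the centered data)
    evals, evecs = np.linalg.eigh(Xc @ Xc.T / N)
    order = np.argsort(evals)[::-1]
    evals, evecs = evals[order], evecs[:, order]
    sigma2 = evals[d:].mean()                              # noise variance: mean of discarded eigenvalues
    W = evecs[:, :d] @ np.diag(np.sqrt(np.maximum(evals[:d] - sigma2, 0.0)))
    return mu, W, sigma2

# demo: a 3-D subspace in R^10 plus isotropic noise
rng = np.random.default_rng(0)
X = rng.standard_normal((10, 3)) @ rng.standard_normal((3, 500)) + 0.1 * rng.standard_normal((10, 500))
mu, W, sigma2 = ppca_ml(X, d=3)
print("estimated noise variance:", sigma2)
```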
MPPCA uses a mixture of Gaussians modelp(x)¼X ni¼1p i G(x;l i,U i U>iþr2i I),X ni¼1p i¼1,(14)where the parameter p i,called the mixing proportion,represents the a priori probability of drawinga point from subspace S i.TheML estimates of the parametersof this mixture model can befound using expectation maxi-mization(EM)[48].EM is aniterative procedure that alter-nates between data segmenta-tion and model estimation.Specifically,given initial values (e li,~U i,~r i,~p i)for the model parameters,in the E-step,the proba-bility that x j belongs to subspace i is estimated as~p ij¼G(x j;l i,~Ui~U>iþ~r2iI)~p ip(x j),(15)and in the M-step,the~p ij s are used to recompute the subspace parameters using PPCA.Specifically,p i and liare updated as~p i¼1N X Nj¼1~p ij and e li¼1N~p iX Nj¼1~p ij x j,(16)and r i and U i are updated from the SVD of~R i ¼1N~p iX Nj¼1~p ij(x jÀe li)(x jÀe l i)>:(17)These two steps are iterated until convergence to a local max-ima of the log-likelihood.Notice that MPPCA can be seen as a probabilistic version of K-subspaces that uses soft assignments p ij2½0;1 rather than hard assignments w ij¼f0;1g.As in the case of K-subspaces,the main advantage of MPPCA is that it is a simple and intuitive method,where each iteration can be computed in closed form by using PPCA.More-over,the MPPCA model is applicable to both linear and affine subspaces and can be extended to accommodate outliers[49] and missing entries in the data points[50].However,an impor-tant drawback of MPPCA is that the number and dimensions of the subspaces need to be known beforehand.One way to address this issue is to put a prior on these parameters,as shown in[51].A second drawback is that MPPCA is not optimal when the data inside each subspace or the noise is not Gaussian.A third drawback is that MPPCA often converges to a local maximum;hence,a good initialization is critical.The initiali-zation problem can be addressed by using any of the methods described earlier for K-subspaces.For example,the multistage learning(MSL)algorithm[17]uses the factorization method of [8]followed by the agglomerative refinement steps of[33]and [36]for initialization.AGGLOMERATIVE LOSSY COMPRESSIONThe agglomerative lossy compression(ALC)algorithm[18] assumes that the data are drawn from a mixture of degenerate Gaus-sians.However,unlike MPPCA,ALC does not aim to obtain an ML estimate of the parameters of the mixture model.Instead,it looks for the segmentation of the data that minimizes the coding lengthneeded to fit the points with amixture of degenerate Gaussiansup to a given distortion.Specifically,the number ofbits needed to optimally code Nindependent identically distrib-uted(i.i.d.)samples from a zero-mean D-dimensional Gaussian, i.e.,X2R D3N,up to a distortion d can be approximated as ½(NþD)=2 log2det(Iþ(D=d2N)XX>).Thus,the total number of bits for coding a mixture of Gaussians can be approximated asX ni¼1N iþD2log2det IþDd2N iX i X>iÀN i log2N iN,(18)where X i2R D3N i is the data from subspace i,and the last term is the number of bits needed to code(losslessly)the membership of the N samples to the n groups.The minimization of(18)over all possible segmentations of the data is,in general,an intractable problem.ALC deals with this issue by using an agglomerative clustering method.Ini-tially,each data point is considered as a separate group.At each iteration,two groups are merged if doing so results in the great-est decrease of the coding length.The algorithm terminates when the coding length cannot be further decreased.Similar 
agglomerative techniques have been used[52],[53],though with a different criterion for merging subspaces.ALC can naturally handle noise and outliers in the data.Specif-ically,it is shown in[18]that outliers tend to cluster either as a single group or as small separate groups depending on the dimen-sion of the ambient space.Also,in principle,ALC does not need to know the number of subspaces and their dimensions.In practice, however,the number of subspaces is directly related to the param-eter d.When d is chosen to be very large,all the points could be merged into a single group.Conversely,when d is very small,each point could end up as a separate group.Since d is related to the variance of the noise,one can use statistics on the data to deter-mine d(see[22]and[33]for possible methods).When the number of subspaces is known,one can run ALC for several values of d,dis-card the values of d that give the wrong number of subspaces,and choose the d that results in the segmentation with the smallestAN IMPORTANT DRAWBACK OF MPPCA IS THAT THE NUMBER AND DIMENSIONS OF THE SUBSPACES NEED TO BE KNOWN BEFOREHAND.。
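A minimal sketch of the ALC coding-length criterion in (18) and a single greedy merge step; dist2 plays the role of the squared distortion in the text, and the exhaustive pairwise search shown here is only the naive form of the agglomerative step, not an optimized implementation.

```python
import numpy as np

def coding_length(groups, N, dist2):
    """Total number of bits (18) for coding a segmentation.
    groups: list of D x N_i arrays; N: total number of points; dist2: squared distortion."""
    total = 0.0
    for Xi in groups:
        D, Ni = Xi.shape
        Sigma = np.eye(D) + (D / (dist2 * Ni)) * (Xi @ Xi.T)
        sign, logdet = np.linalg.slogdet(Sigma)
        total += 0.5 * (Ni + D) * logdet / np.log(2.0)     # bits for coding the points of group i
        total += -Ni * np.log2(Ni / N)                     # bits for coding the membership of group i
    return total

def greedy_merge_once(groups, N, dist2):
    """Merge the pair of groups that most decreases the coding length (one agglomerative step)."""
    best, best_groups = 0.0, None
    base = coding_length(groups, N, dist2)
    for i in range(len(groups)):
        for j in range(i + 1, len(groups)):
            merged = groups[:i] + groups[i + 1:j] + groups[j + 1:] + [np.hstack([groups[i], groups[j]])]
            gain = base - coding_length(merged, N, dist2)
            if gain > best:
                best, best_groups = gain, merged
    return best_groups if best_groups is not None else groups
```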
An Improved Sparse Subspace Clustering Method for Image Segmentation
李小平; 王卫卫; 罗亮; 王斯琪. [Journal] Systems Engineering and Electronics [Year (Volume), Issue] 2015(000)010. [Abstract] An image segmentation method based on improved sparse subspace clustering is proposed.
The image is first over-segmented into uniform regions called superpixels, and the color histogram of each superpixel is extracted as its feature; then an improved sparse subspace representation of the feature data is built and used to construct a graph affinity matrix; finally, a spectral clustering algorithm yields the clustering of the superpixels, which is taken as the segmentation result.
Experimental results show that the proposed improved sparse subspace clustering method has good clustering performance and is somewhat robust to noise; on natural images it achieves better segmentation results.
% A novel image segmentation method based on improved sparse subspace clustering is presented. The image to be segmented is over-partitioned into some uniform sub-regions called superpixels, and the color histogram of each superpixel is computed as its feature data. Then, by employing an improved sparse subspace representation model, the sparse representation coefficient matrix is computed and used to construct the affinity matrix of a graph. Finally, the spectral clustering algorithm is used to obtain the image segmentation result. Experiments show that the proposed improved sparse subspace clustering method performs well in clustering and is robust to noise. It can obtain good segmentation results for natural color images. [Pages] 7 (P2418-2424) [Authors] 李小平; 王卫卫; 罗亮; 王斯琪 [Affiliation] School of Mathematics and Statistics, Xidian University, Xi'an, Shaanxi 710171, China (all authors) [Language] Chinese [CLC Number] TP391
Summary of Image Processing Papers by Hajer Fradi
2012. 1. Hajer Fradi, Jean-Luc Dugelay. Robust Foreground Segmentation Using Improved Gaussian Mixture Model and Optical Flow. ICIEV, 248-253, 2012. (foreground segmentation method) Abstract: GMM background subtraction has been widely employed to separate the moving objects from the static part of the scene. However, the background model estimation step is still problematic; the main difficulty is to decide which distributions of the mixture belong to the background. In this paper, the authors propose a new approach based on incorporating a uniform motion model (optical flow) into GMM background subtraction. The paper introduces and compares several foreground segmentation methods.
It also discusses some of the literature on combining optical flow with GMM.
The improved GMM algorithm (reference 15) and the optical flow method based on a quadratic polynomial model (reference 19) are described.
The two methods are combined at the pixel level to separate the foreground.
Results: experiments were conducted on the I2R dataset.
Compared with the methods of references 15 and 19 in terms of recall and precision, the results demonstrate the effectiveness of the method.
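A minimal sketch of a pixel-level combination in the same spirit, using OpenCV's stock MOG2 background subtractor and Farneback optical flow rather than the improved GMM of reference 15 or the quadratic-polynomial flow of reference 19; the input file name, the motion-magnitude threshold, and the AND fusion rule are illustrative assumptions.

```python
import cv2
import numpy as np

cap = cv2.VideoCapture("video.avi")            # hypothetical input file
mog2 = cv2.createBackgroundSubtractorMOG2(history=500, varThreshold=16, detectShadows=False)

ok, prev = cap.read()
prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    fg_mask = mog2.apply(frame)                                 # GMM foreground mask
    flow = cv2.calcOpticalFlowFarneback(prev_gray, gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    mag = np.linalg.norm(flow, axis=2)                          # per-pixel motion magnitude
    motion_mask = (mag > 1.0).astype(np.uint8) * 255            # illustrative threshold
    fused = cv2.bitwise_and(fg_mask, motion_mask)               # pixel-level fusion of the two cues
    cv2.imshow("foreground", fused)
    if cv2.waitKey(1) & 0xFF == 27:
        break
    prev_gray = gray
cap.release()
cv2.destroyAllWindows()
```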
2. Hajer Fradi, Jean-Luc Dugelay. People counting system in crowded scenes based on feature regression. EUSIPCO, 136-140, 2012. (people counting in crowded scenes) Abstract: The authors propose a counting system based on measurements of interest points, where a perspective normalization and a crowd-measure-informed density estimation are introduced into a single feature. Then, the correspondence between this feature and the number of persons is learned by Gaussian Process regression. People-counting methods usually fall into two categories: detection-based and feature-based. The paper describes the problems with both categories and the current state of research.
Thesis Proposal Experiment Report (2 Parts) - Thesis Proposal
1. Purpose and background of the experiment. The main purpose of this experiment is to gain a deep understanding of the requirements for writing a thesis proposal and to improve writing ability through practice.
The thesis proposal is the first formal report of a graduate student's research work; it is important for determining the research direction, clarifying the research content, and demonstrating the feasibility of the research.
When writing the proposal, we need to describe in detail the background and significance of the research, its goals and content, and its feasibility and novelty, so as to provide a foundation and guidance for the subsequent research work.
2. Research background and significance. 2.1 Research background. With the continuous development of science and technology, artificial intelligence is being applied more and more widely in various fields.
In the field of image processing, the continuous development of machine learning algorithms has led to breakthroughs in tasks such as image recognition, image classification, and image segmentation.
However, owing to the special nature of image data, research in image processing still faces many challenges.
Among them, image noise is an important problem: it degrades image quality and distorts image features, thereby affecting the accuracy and reliability of subsequent image processing tasks.
2.2 Research significance. In practical applications, image noise causes loss of information and blurring of details, reducing the accuracy and reliability of image processing algorithms.
Therefore, studying how to denoise images effectively and improve image quality is important for improving the performance of image processing algorithms.
This study will explore a deep-learning-based image denoising method: by building a deep neural network model, image noise is automatically identified and removed, improving image quality and accuracy.
3. Research goals and content. 3.1 Research goals. The main goal of this study is to design a deep-learning-based image denoising method that, by training a deep neural network model, automatically identifies and removes different types of image noise, ultimately improving image quality and accuracy.
3.2 Research content. This study covers the following aspects: 1. collect and organize the literature on image denoising and summarize the state of research in the field; 2. design and implement a deep-learning-based image denoising method and build a deep neural network model (a minimal sketch of such a model is given below); 3. train the model on the collected image datasets, and evaluate and test the training results; 4. compare the proposed method with traditional image denoising methods and analyze their advantages, disadvantages, and differences.
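A minimal PyTorch sketch of item 2, i.e., a small DnCNN-style residual denoiser trained on synthetically noised patches; the architecture, depth, noise level, and training loop are illustrative assumptions rather than the proposal's final design.

```python
import torch
import torch.nn as nn

class SmallDenoiser(nn.Module):
    """A small DnCNN-style network: predicts the noise residual, which is subtracted from the input."""
    def __init__(self, channels=1, features=32, depth=5):
        super().__init__()
        layers = [nn.Conv2d(channels, features, 3, padding=1), nn.ReLU(inplace=True)]
        for _ in range(depth - 2):
            layers += [nn.Conv2d(features, features, 3, padding=1),
                       nn.BatchNorm2d(features), nn.ReLU(inplace=True)]
        layers += [nn.Conv2d(features, channels, 3, padding=1)]
        self.body = nn.Sequential(*layers)

    def forward(self, x):
        return x - self.body(x)          # residual learning: clean = noisy - predicted noise

# toy training loop on synthetic Gaussian noise
model = SmallDenoiser()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()
for step in range(100):
    clean = torch.rand(8, 1, 40, 40)                 # stand-in for patches from a real dataset
    noisy = clean + 0.1 * torch.randn_like(clean)
    opt.zero_grad()
    loss = loss_fn(model(noisy), clean)
    loss.backward()
    opt.step()
```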
4. Feasibility and novelty of the research. 4.1 Feasibility. This study's deep-learning-based image denoising approach already has a certain research foundation in the image processing field, and the related theory and techniques have gradually matured.
Defeng Sun (50 pages)
⟨(u, w), (v, z)⟩ := ⟨u, v⟩_H + ⟨w, z⟩_X, ∀ (u, w) and (v, z) ∈ H × X.
Let A∗ : Y → IR^p be the adjoint of A. Let c be a given vector in IR^p and b an element in Y. The matrix cone programming (MCP) we consider in this paper takes the following form
min c^T x   s.t.   Ax ∈ b + Q × K.   (1)
(i) f(·) = ‖·‖_F, the Frobenius norm, i.e., for each X ∈ IR^{m×n}, ‖X‖_F = (Σ_{i=1}^{m} Σ_{j=1}^{n} |x_{ij}|^2)^{1/2}; (ii) f(·) = ‖·‖_∞, the l_∞ norm, i.e., for each X ∈ IR^{m×n}, ‖X‖_∞ = max{|x_{ij}| : 1 ≤ i ≤ m, 1 ≤ j ≤ n}; (iii) f(·) = ‖·‖_2, the spectral norm, i.e., the largest singular value of X.
⟨(t, X), (τ, Y)⟩ := tτ + ⟨X, Y⟩, ∀ (t, X) and (τ, Y) ∈ IR × IR^{m_i×n_i}.
Denote the natural inner product of X by ⟨·,·⟩_X. Note that for each i ≥ 1, except for the case when f_i(·) = ‖·‖_F, the cone epi f_i is not self-dual unless min{m_i, n_i} = 1. So, in general the above defined closed convex cone K is not self-dual, i.e., K ≠ K∗ := {W ∈ X | ⟨W, Z⟩_X ≥ 0 ∀ Z ∈ K}, the dual cone of K. When f(·) = ‖·‖_F, epi f actually turns out to be the second order cone (SOC) if we treat a matrix X ∈ IR^{m×n} as a vector in IR^{mn} by stacking up the columns of X, from the first to the n-th column, on top of each other. The SOC is a well understood convex cone in the literature and thus is not the focus of this paper. We include it here for the sake of convenience in subsequent discussions. Let H be a finite-dimensional real Euclidean space endowed with an inner product ⟨·,·⟩_H and its induced norm ‖·‖_H. Let Q ⊆ H be the cross product of the origin {0} and a symmetric cone in
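To make the SOC remark above concrete, here is the short standard derivation (not specific to this paper): identifying X ∈ IR^{m×n} with vec(X) ∈ IR^{mn},
\[
\operatorname{epi}\|\cdot\|_F=\{(t,X)\in\mathbb{R}\times\mathbb{R}^{m\times n}:\|X\|_F\le t\}\;\cong\;\{(t,x)\in\mathbb{R}\times\mathbb{R}^{mn}:\|x\|_2\le t\},
\]
which is exactly the second order cone of dimension mn + 1.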
Low-Rank Sparse Subspace Clustering in a Latent Space
刘建华. [Abstract] A low-rank sparse subspace clustering algorithm based on a latent space is proposed. It reduces the dimensionality of high-dimensional data during clustering and, at the same time, clusters the data in the low-dimensional space using sparse and low-rank representations, which greatly reduces the time complexity of the algorithm.
Experiments on motion segmentation and face clustering demonstrate the effectiveness of the algorithm.
%T his paper proposed a novel algorithm named low‐rank sparse subspace clustering in latent space (LatLRSSC ) , it can reduce the dimension and cluster the data lying in a union of subspaces simultaneously . The main advatages of our method is that it is computationally efficient . The effectiveness of the algorithm is demonstrated through experiments on motion segmentation and face clustering .【期刊名称】《西北师范大学学报(自然科学版)》【年(卷),期】2015(000)003【总页数】5页(P49-53)【关键词】子空间聚类;稀疏表示;低秩表示;运动分割;人脸聚类【作者】刘建华【作者单位】浙江工商职业技术学院电子与信息工程学院,浙江宁波 315012【正文语种】中文【中图分类】TP391过去的几十年人们见证了数据的爆炸式增长,这对于数据的处理工作提出了巨大的挑战,特别是这些数据集通常都是高维数据.数据的高维特性不仅增加了计算时间,而且由于噪声和环境空间降低了算法的性能.实际上,这些数据的内在尺度往往比实际空间中小得多,这就促使人们运用一些技术发现高维数据的低维表示,比如低秩近似和稀疏表示等[1-3].实际上,在许多问题中,高维空间中的数据往往可以用低维子空间进行表示.子空间聚类算法就是挖掘数据低维子空间的一种聚类算法[4],它已经被广泛地应用在许多领域,如计算机视觉中的运动分割和人脸聚类,控制领域的混合系统辨识,社交网络中的社区集群.为了解决高维数据聚类问题,目前已经提出了很多聚类算法,如混合高斯模型、NMF和一些代数方法(如k-subspace)、混合概率主成分分析(MPPCA)、多阶段学习与RANSAC.这些方法取得了一定的效果,但是还有很多局限性,如计算复杂度太高,对噪音敏感等.最近,利用稀疏表示和低秩表示进行子空间聚类的研究得到了广泛的关注,研究人员提出了一系列相关的新型子空间聚类算法,如稀疏子空间聚类(SSC)[5,6]、低秩表示(LRR)[4,7]、低秩子空间聚类(LRSC)[8]和低秩稀疏子空间聚类(LRSSC)[9],这些方法的本质是每一个数据点可以通过其他数据点稀疏表示或者低秩表示得到.尽管稀疏子空间聚类(SSC)和低秩表示(LRR)取得了巨大的成功,仍然有很多问题没有解决.特别是稀疏表示和低秩表示的计算复杂度相当高,尤其是当数据的维数很高的时候[6].为了解决这个问题,通常的做法是在应用这类聚类算法之前对数据进行降维预处理.一些降维方法如主成分分析(PCA)或者随机投影(RP)可以有效降低数据维数.然而,一个良好学习的投影矩阵可以在更低的数据维度上得到更好的聚类效果.基于低维隐空间的稀疏表示已经有学者提出了一些方法[10,11],但是这些方法都是为分类问题进行设计,而非针对聚类问题.基于上述问题,文中提出一种基于低维隐空间的低秩稀疏子空间聚类方法(LatLRSSC),在数据降维的同时,发掘数据的稀疏和低秩表示.首先算法学习得到数据从原始空间到低维隐空间的变换矩阵,同时在这个低维的隐空间中得到数据的稀疏和低秩系数,最后利用谱聚类算法对数据样本进行分割.为了验证文中提出方法的有效性,分别在HOPKINS 155 数据集和extended Yale B 数据集上进行运动分割和人脸聚类的实验,实验结果表明,文中提出的LatLRSSC算法具有较好的聚类性能.根据文献[5,6],每一个数据点可以表示为其他数据点的稀疏线性组合,通过这些稀疏系数构造清河矩阵进行子空间聚类.也就是说,给定一个数据集X,希望找到一个系数矩阵C,满足X=XC并且diag(C)=0.可以通过求解(1)式得到解.当数据集被噪声G污染时,SSC算法假设每个数据点可以表示为X=XC+G,可以通过求解凸优化问题(2)得到解.1.2 低秩表示(LRR)低秩表示(LRR)算法和稀疏子空间聚类(SSC)算法非常类似,区别在于LRR算法的目标是寻找数据的低秩表示,而SSC算法在于寻找数据的稀疏表示.LRR通过求解凸优化问题(3)得到解.当数据集被噪声G污染时,LRR通过求解凸优化问题(4)得到解.最后,通过得到的稀疏矩阵(利用SSC或者LRR),构造亲和矩阵,在这个亲和矩阵上利用谱聚类算法,就可以得到最终的聚类结果.不同于传统的稀疏子空间聚类算法(SSC)和低秩表示(LRR),文中将数据映射到一个低维的隐空间中,同时在这个低维空间中寻求数据的低秩稀疏系数.令P∈Rt×D为一个线性变换矩阵,它将数据从原始空间RD映射到一个维数为t的隐空间中.通过目标函数的最小化,可以同时得到变换矩阵和数据集的低秩稀疏系数:其中(6)式的第一项为求取数据集的低秩系数;第二项为求取数据集的稀疏系数;第三项的主要目的是去除噪声影响;最后一项是类似于PCA的正则项,主要目的是保证映射变换不能过多丢失一些原始空间的信息;λ1和λ2为非负常数.另外,要求P正交并且归一化,这样就避免了解的退化,并且保证了优化方法的计算效率.可以注意到,(6)式是能够进行扩展的,这样就可以对位于仿射子空间中的数据进行处理.可以对优化问题(5)增加一个约束条件得到2.1 优化问题求解根据上面的定义,有下面的命题.命题1 优化问题(5)存在一个最优化的解P*,对于某些Ψ∈RN×t,N为数据样本数,P*具有以下形式直观上,命题1是说投影变换可以写成数据样本的一个线性组合.文献[12]中,这个形式已经被应用在字典学习的框架中.基于命题1,目标函数(6)可以写为其中K=YTY.约束条件变为所以,优化问题(5)可表示为其中这样,可分别通过Ψ和C来求解这个优化问题.首先固定C,目标函数就变为其中Q=ΨΨT∈RN×N.由约束条件ΨTKΨ=I可得到新的约束条件ΨΨTKΨΨT=ΨΨT或者QKQT=Q,目标函数(12)可以进一步简化为使用同样的约束条件,并且知tr(K)为一个常数,利用K=VSVT的特征值分解,得到 ,其中Ψ.这样(13)式就可以表示为利用ΨTKΨ=MTM和变换得到等价于问题(11)的优化问题:优化问题(14)就是经典的最小特征值问题.它的解就是与Δ的前l个最小特征值相关联的l个特征向量.一旦得到了最优的M*,那么最优的Ψ*就可以利用(5)式得到: 2.3 C的优化步骤固定Ψ,通过求解下列优化问题来得到C其中B=ΨTK.接下来,推导了一个解决优化问题(16)的有效方法.在ADMM框架下,引入两个辅助变量C=C1=C2来区分两个不同的范数,引入J来保证每一步都得到闭合解: 则增广拉格朗日方程为其中μ1和μ2为可调参数.每一步中,通过分别求解J,C1和C2的梯度,更新对偶变量Λ1和Λ2,可以得到ADMM每一步的迭代公式.分别定义一个软阈值操作符和奇异值软阈值操作符Πβ(X)=Uπβ(Σ)VT,其中UΣVT为B=ΨTK的瘦型奇异值分解.得到C1和C2的更新规则如下:Λ1和Λ2的更新规则如下:求解完上述优化问题后,可以得到系数矩阵C,则亲和矩阵定义为T,最后利用谱聚类算法即可得到最终聚类结果.分别验证文中提出的LatLRSSC算法在运动分割和人脸聚类两种问题上的性能.对于运动分割问题,采用Hopkins 155数据集,包含155个视屏序列.对于人脸聚类问题,采用Extended Yale B数据集,包含38类人脸图像数据.实验中,采用聚类错误率来评价聚类算法的性能:聚类错误率.对比算法采用了LRR,LRSC,SSC和LRSSC这4种应用较为广泛的子空间聚类算法.运动分割是指从视频序列中对于不同的刚体运动提取一组二维点轨迹,对这些轨迹进行聚类,实现不同运动物体的分割.这里,数据集X为2F×N维,其中N为二维轨迹的数目,F为视频的帧数.在仿射投影模型中,这些与刚体运动相关联的二维轨迹位于维数为1,2或3的仿射子空间R2F中.实验中,采用Hopkins 
155运动分割数据集,其中120个视频序列由2个运动构成,35个视频序列由3个运动构成.平均来说,每一个包含2个运动的视频序列包含N=256个特征轨迹和F=30帧画面,而每一个包含3个运动的视频序列包含N=398个特征轨迹和F=29帧画面.对于每一个视频序列,这些二维轨迹通过跟踪器自动提取,并且噪音点已经手动去除.表1比较了不同算法在Hopkins 155数据集上的聚类表现.实验中,除了文中提出的算法,对于其他算法,利用PCA进行预处理,将数据集降维到4n维(n为子空间数目).从表1 可以看出,对于2个或3个运动,文中提出的算法LatLRSSC相较于其他4种方法具有较好的聚类性能,说明LatLRSSC对于运动分割问题具有很好的效果.对比其他算法可知,相对于直接采用PCA进行降维操作,LatLRSSC通过对数据集的学习能够得到更加合理的映射矩阵.给定多个人在同一角度、不同光照的人脸图像,希望将不同的人脸图像划分开来(图1).在Lambertian假设下,物体图像在固定角度、不同光照条件下位于一个近似的9维子空间中,因此,采集的多个人的人脸图像也位于这样的9维子空间中. 采用Extended Yale B数据集,数据集包含n=38个人的人脸图像(192×168像素),每个人有Ni=64张在不同光照条件下的正面图像.为了降低计算成本和存储代价,将每幅人脸图像采样到48×42像素,并将图像向量化为2 016维,因此维度D=2 016.实验中,除了文中提出的算法,对于其他算法,依然利用PCA进行降维预处理.为了研究这些算法对不同聚类数目的聚类性能,将38类人脸分成4组,前3组分别包含1~10,11~20,21~30个人的人脸图像,第四组包含31~38个人的人脸图像.对于前3组,取n∈{2,3,5,8,10};对最后一组,取n∈{2,3,5,8}.实验结果如表2所示.从表2可以看出,文中提出的LatLRSSC对不同的聚类数目均得到了更低的聚类错误率,说明了该算法优于其他算法.文中提出了一种基于隐空间的低秩稀疏子空间聚类算法.本算法是稀疏子空间聚类和低秩表示的一种扩展,该算法在聚类的过程中可以对高维数据进行降维,同时在低维空间中利用稀疏表示和低秩表示对数据进行聚类.在运动分割和人脸聚类上的实验表明,该算法具有很好的聚类性能.与大多数子空间聚类算法一样,文中假设子空间是线性的,如何将本算法在非线性子空间上进行扩展是接下来需要继续研究的工作.。
An Improved Sparse Subspace Clustering Algorithm
改进的稀疏子空间聚类算法张彩霞;胡红萍;白艳萍【摘要】在现有的稀疏子空间聚类算法理论基础上提出一个改进的稀疏子空间聚类算法:迭代加权的稀疏子空间聚类.稀疏子空间聚类通过解决l1最小化算法并应用谱聚类把高维数据点聚类到不同的子空间,从而聚类数据.迭代加权的l1算法比传统的l1算法有更公平的惩罚值,平衡了数据数量级的影响.此算法应用到稀疏子空间聚类中,改进了传统稀疏子空间聚类对数据聚类的性能.仿真实验对Yale B人脸数据图像进行识别分类,得到了很好的聚类效果,证明了改进算法的优越性.%Based on the existing theory of sparse subspace clustering algorithm,a modified sparse subspace clustering algorithm is put forward:iterative weighted sparse subspace clustering algorithm.In order to cluster data,sparse subspace clustering algorithm clusters high-dimensional data to different subspaces by solving minimization algorithm and applying spectralclustering.Iterative algorithm has more fair punishment value then the traditional algorithm,with balancing the influence of magnitude ofdata.The algorithm is applied to the sparse subspace clustering to improve the traditional sparse subspace clustering performance for data. Simulation experiment recognizing and classify Yale B face data image.The clustering effect is very good,proving the superiority of the improved algorithm.【期刊名称】《火力与指挥控制》【年(卷),期】2017(042)003【总页数】5页(P75-79)【关键词】稀疏子空间聚类;迭代加权;谱聚类算法;人脸识别【作者】张彩霞;胡红萍;白艳萍【作者单位】中北大学理学院,太原 030051;中北大学理学院,太原 030051;中北大学理学院,太原 030051【正文语种】中文【中图分类】TP301.6在很多实际应用中,高维数据无处不在,如计算机视觉,图像处理,运动分割,人脸识别等。
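A minimal sketch of the iteratively reweighted ℓ1 step underlying "iterative weighted" sparse subspace clustering: each outer pass solves a weighted Lasso with weights w_i = 1/(|c_i| + ε), so large coefficients are penalized less on the next pass, balancing the influence of coefficient magnitude. This is the standard reweighting rule; the Lasso-based inner solver and its parameters are illustrative assumptions, not the paper's exact formulation. In SSC this step would be applied once per data point, with the point itself excluded from the dictionary, and the resulting coefficients would form the affinity for spectral clustering.

```python
import numpy as np
from sklearn.linear_model import Lasso

def reweighted_l1(A, y, n_outer=4, alpha=0.01, eps=1e-3):
    """Iteratively reweighted l1 minimization for y ~ A c:
    each outer pass solves a weighted Lasso with weights w_i = 1 / (|c_i| + eps)."""
    n = A.shape[1]
    w = np.ones(n)
    c = np.zeros(n)
    for _ in range(n_outer):
        Aw = A / w                                   # column rescaling implements the weighted l1 penalty
        lasso = Lasso(alpha=alpha, fit_intercept=False, max_iter=5000)
        lasso.fit(Aw, y)
        c = lasso.coef_ / w                          # map back to the original variables
        w = 1.0 / (np.abs(c) + eps)                  # large coefficients get smaller penalties next pass
    return c
```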
Effective Distance Based Low-Rank Representation
2021574基于有效距离的低秩表示陶体伟1,2,刘明霞2,王明亮3,王琳琳4,杨德运2,张强51.桂林理工大学信息与工程学院,广西桂林5410062.泰山学院信息科学技术学院,山东泰安2710213.南京航空航天大学计算机科学与技术学院,南京2111064.泰山学院数学与统计学院,山东泰安2710215.大连理工大学计算机科学与技术学院,辽宁大连116000摘要:低秩表示(Low-Rank Representation,LRR)在探索数据中的低维子空间结构方面具有良好的效果,近年来引起了人们的广泛关注。
However, traditional LRR methods usually use the Euclidean distance to measure the similarity between samples, considering only pairwise distances between neighboring samples, and for data with a manifold structure this often fails to reflect the intrinsic geometric structure.
Recent studies have shown that a probabilistically motivated distance measure (i.e., the effective distance) can effectively model the global information of the data when measuring the similarity between samples.
On this basis, an effective-distance-based low-rank representation model is proposed.
The method uses sparse representation to compute the effective distances between samples and constructs a Laplacian matrix from them, which is then used as a Laplacian regularization constraint on the low-rank representation; the model can not only represent the global low-dimensional structure but also capture the geometric structure information in manifold-structured data.
To evaluate the effectiveness of the method, classification experiments were conducted on three public data sets.
Experimental results show that, compared with methods based on the traditional Euclidean distance, the proposed method achieves higher classification performance and stronger robustness.
关键词:低秩表示(LRR);有效距离;稀疏表示;分类文献标志码:A中图分类号:TP391.4doi:10.3778/j.issn.1002-8331.1912-0015Effective Distance Based Low-Rank RepresentationTAO Tiwei1,2,LIU Mingxia2,WANG Mingliang3,WANG Linlin4,YANG Deyun2,ZHANG Qiang51.School of Information and Engineering,Guilin University of Technology,Guilin,Guangxi541006,China2.School of Information Science and Technology,Taishan University,Tai’an,Shandong271021,China3.College of Computer Science and Technology,Nanjing University of Aeronautics and Astronautics,Nanjing211106,China4.School of Mathematics and Statistics,Taishan University,Tai’an,Shandong271021,China5.College of Computer Science and Technology,Dalian University of Technology,Dalian,Liaoning116000,ChinaAbstract:Low-Rank Representation(LRR)has recently attracted a great deal of attention due to its pleasing efficacy in exploring low-dimensional subspace structures embedded in data.However,conventional LRR-based methods simply use Euclidean distance to measure the similarity of samples,where cannot reflect the inherent geometric structure of data with manifold structure.Meanwhile,recent studies have shown that a probabilistically motivated distance measurement(called effective distance)can effectively model the global information of data to measure the similarity between samples.To this end,this paper proposes an Effective Distance Based Low-Rank Representation(EDLRR)model,which firstly uses the sparse representation method to calculate the effective distance between samples for constructing a Laplacian matrix,and then develops a Laplacian regularized low-rank representation term.Low rank representation model.This method can not only represent the global low-dimensional structure,but also capture the geometric structure information in the data of the manifold structure.To evaluate the effectiveness of the proposed method,this paper conducts classification experiments基金项目:国家自然科学基金(61703301);山东省自然科学省属高校优秀青年联合基金(ZR2019YQ27);泰山学院科研基金(Y-01-2018019);泰山学者青年专家项目。
Biography of Sun Fuchun
孙富春简介pdf孙富春简历孙富春,清华大学计算机科学与技术系教授,博士生导师,国家863计划专家组成员,国家自然基金委重大研究计划“视听觉信息的认知计算”指导专家组成员,计算机科学与技术系学术委员会副主任, 智能技术与系统国家重点实验室常务副主任; 兼任国际刊物《IEEE Trans. on Fuzzy Systems》,《Mechatronics》和《International Journal of Control, Automation, and Systems》副主编或大区主编,《International Journal of Computational Intelligence Systems》和《Robotics and Autonumous Systems》编委;兼任国内刊物《中国科学F:信息科学》和《自动化学报》编委;兼任中国人工智能学会认知系统与信息处理专业委员会主任,IEEE CSS智能控制技术委员会委员。
98年3月在清华大学计算机应用专业获博士学位。
98年1月至2000年1月在清华大学自动化系从事博士后研究,2000年至今在计算机科学与技术系工作。
工作期间获得的主要奖励有:2000年全国优秀博士论文奖,2001年国家863计划十五年先进个人,2002年清华大学“学术新人奖”,2003年韩国第十八届Choon-Gang 国际学术奖一等奖第一名,2004年教育部新世纪人才奖,2005年清华大学校先进个人,2006年国家杰出青年基金。
获奖成果5项,两项分别获2010年教育部自然科学奖二等奖(排名第一)和2004年度北京市科学技术奖(理论类)二等奖(排名第一)、一项获2002年度教育部提名国家科技进步二等奖(排名第二)、三项获省部级科技进步三等奖。
译书一部,专著两部,在国内外重要刊物发表或录用论文150余篇,其中在IEE、IEEE汇刊、Automatica等国际重要刊物发表论文90余篇,80余篇论文收入SCI,SCI期刊他人引用700余次,200多篇论文收入EI,有两篇论文曾被评为国内二级学会的最佳优秀论文奖。
A Sparsity-Adaptive Subspace Pursuit Algorithm for Compressive Sampling
一种压缩采样中的稀疏度自适应子空间追踪算法杨成;冯巍;冯辉;杨涛;胡波【摘要】针对压缩采样中未知稀疏度的信号,本文提出一种自适应子空间追踪算法.首先,采用了一种基于匹配测试的估计方法获取稀疏度的估计值,再通过子空间追踪重构信号.若子空间追踪不能成功重构,则通过渐近增加信号稀疏度的方法实施估计,而上述过程可描述为在弱匹配原则下新原子的选取过程.仿真结果表明,本文的算法可以准确有效重构信号,同时运算量也较低.【期刊名称】《电子学报》【年(卷),期】2010(038)008【总页数】4页(P1914-1917)【关键词】压缩采样;子空间追踪;稀疏分解【作者】杨成;冯巍;冯辉;杨涛;胡波【作者单位】复旦大学电子工程系,上海,200433;复旦大学电子工程系,上海,200433;复旦大学电子工程系,上海,200433;复旦大学电子工程系,上海,200433;复旦大学电子工程系,上海,200433【正文语种】中文【中图分类】TP391.411 引言压缩采样(Compressive Sampling,CS)针对具有稀疏性或在特定域上可转化为稀疏性的信号,通过实施远低于奈奎斯特采样率的的随机采样,可准确完成原始信号的重构[1~4].由于CS有效降低了信号获取、存储及传输的代价,该理论一经出现即得到广大研究人员的密切关注.设x是一个长度为N的实向量,即x∈RN.x中只有不超过K个非零元素,x被称作K-稀疏.研究表明:对x进行随机线性投影,即y=Φ x,其中y是M维观测向量(M≪N),Φ是M×N的测量矩阵,则基于y可有效完成信号x 的重构.如何在已知y和Φ的条件下快速、有效的重建x是CS研究的一个重要方面[5].目前已有的方法可包括组合优化,非凸优化,凸优化,贪婪算法等几类.凸优化方法通过求解一个最小化凸问题逼近信号,如基追踪(Basis Pursuit)[6]等.这些算法中,贪婪算法由于算法结构简单,运算量小等特点受到重视.传统贪婪算法匹配追踪(MP)[7]、正交匹配追踪(OMP)[8,9]已在压缩采样中得到了应用.改进的算法包括分段正交匹配追踪(Stagewise OMP,StOMP)[10]、正则化正交匹配追踪(RegularizedOMP,ROMP)[11].这几种算法只有在信号具有较低的稀疏度时才能较好地重建信号.子空间追踪(Subspace Pursuit,SP)[12]和压缩采样匹配追踪(Compressive Sampling Matching Pursuit,CoSaMP)[13]是两种比较相似的算法,具有较好的性能.但是SP和CoSaMP要求信号稀疏度已知,这在很多应用中难以满足.如果对稀疏度的估计不准确,很多信号不能得到精确重建.文献[14]中提出的稀疏度自适应匹配追踪(Signal Adaptive Matching Pursuit,SAMP)算法在每次迭代中采用增加固定数目原子的方法估计信号稀疏度.当K较大时,SAMP算法由于迭代次数多而导致运算量特别大.2 自适应子空间追踪算法在给出自适应子空间追踪算法构造之前,首先分析约束等距性质(Restricted Isometry Property,RIP),它是重建x的重要基础.文献[4]提出了RIP性质用于描述从y重建x的条件.指定表示l0范数,表示l2范数.如果对所有满足的x,矩阵Φ都满足式(1):那么称Φ是以参数(K,)满足RIP性质的线性算子.目前研究发现当测量次数M足够大时,高斯随机矩阵,伯努利随机矩阵等以极大的概率满足RIP性质[1].CS的信号重建可以看成信号稀疏分解问题.如果把Φ的列向量v i(1≤i≤N)看作原子并归一化,它们的集合构成超完备字典D.由于x中有K个非零元素,观测值y可以看作由D中K个原子线性组合表示的信号.设这K个原子的索引组成集合Γ,y可以表示成y=ΦΓxΓ.其中ΦΓ表示由Φ索引在Γ中的列向量所组成的子矩阵.xΓ表示x中索引在Γ中的元素组成的向量.y在字典D上具有稀疏的表示,即可寻找最少的一组原子{vi|i∈¯Γ},使残差信号r=y-Φ¯Γx¯Γ能量最小.本文提出的SASP算法的目标是在K未知条件下找到这样一组原子{vi|i∈¯Γ}.其基本思想如下(见图1):首先,SASP使用一种稀疏度估计方法得到集合Γ0,Γ0中元素数小于K.SASP通过后续迭代改善估计并重建x.设在第n次迭代中通过子空间追踪选取出原子{vi|i∈Γn}.如Γn中原子能很好地表示y说明对K的估计是准确的.否则选取新的原子将其索引加入Γn,并重复上述过程.下面分别从稀疏度估计,新原子选取,子空间追踪三部分描述SASP算法.2.1 稀疏度估计以下给出一种稀疏度估计方法,其思路是通过匹配测试得到一个原子集合,使其势K0略小于K.令g=Φ*y,Φ*表示Φ的转置矩阵.设信号y的真实支撑集为Γ,用|◦|表示集合的势,则=K.设g中第i个元素为gi,取前K0(1≤K0≤N)个最大值的索引得到的集合为Γ0,|=K0.命题设Φ以参数(K,δK)满足RIP性质.如果证明取|gi|(1≤i≤N)中前K个最大的元素的索引得到集合~Γ,K0≥K时有~Γ⊆Γ0,显然≥根据RIP的定义,ΦΓ的奇异值在和之间.如果用λ(ΦΓ)表示矩阵的特征值,则有1-δK≤λ(ΦΓ)≤1+δK,所以可以得到再由RIP性质的定义可知综合不等式(2)、(3)可以得到得证.根据以上结论的逆否命题,当时,K0<K.从这个命题可以得到对K初始估计方法:K0取初始值1,如果则依次增加K0直到不等式不成立,同时得到的是对Γ的初始估计Γ0.2.2 新原子选取当子空间追踪无法通过迭代使残差能量满足给定阈值时,可选取更多的原子表示y.SASP算法采用弱匹配策略选取与残差信号比较接近的原子,并将其索引加入集合Γn.令g=Φ*r,则g中第i个元素gi是原子向量vi与残差信号r的内积,即<vi,r>.引入实数α为弱匹配参数,α∈(0,1].SASP选取所有满足|gi|≥α|的原子,则Γn的更新可表示为弱匹配可以看作选取残差的投影大于一定阈值的原子,该阈值与|有关.α=1时,与OMP类似每次迭代只选取与残差最匹配的单个原子,Γn中每次增加一个元素.当α比较小时,每次迭代可能选取多个原子.所以通过弱匹配可以根据信号调整增加的原子数.已有实验表明,参数α取值0.7~0.9之间可以兼顾算法性能和运算速度. 
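The weak-matching rule of Section 2.2 is simple enough to state directly in code. The sketch below only illustrates that selection rule; it is not the authors' implementation, and the RIP-based test used in Section 2.1 to grow the sparsity estimate K0 is not reproduced here.

```python
import numpy as np

def weak_match_select(Phi, r, alpha=0.7):
    """Weak matching: return indices i with |<v_i, r>| >= alpha * max_j |<v_j, r>|.

    Phi: (M, N) measurement matrix with (normalized) columns v_i.
    r:   (M,) current residual.
    alpha in (0, 1]; alpha = 1 reduces to picking only the single best-matching atom.
    """
    g = Phi.T @ r
    return np.flatnonzero(np.abs(g) >= alpha * np.max(np.abs(g)))
```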
2.3 子空间追踪子空间追踪部分也通过迭代改进估计结果,每次迭代中采用了一种后退策略[12].算法的结构如图2所示.设经过n次迭代后得到原子的索引集合为Γn,残差信号为rn.在第n次迭代中,将rn-1分别投影到字典D的各个原子向量上,并选出投影最大的|Γn-1|个原子.把它们的索引与Γn-1合并得到集合^Γ,^Γ中有2|Γn-1|个元素.再把y投影到原子向量{vi|i∈^Γ}张成的空间,从^Γ中去除系数最小的|Γn-1|个原子的索引得到~Γ,|~Γ|=|Γn-1|.再次把y投影到原子{vi|i∈~Γ}张成的空间,如果得到的残差能量小于rn-1的能量则更新Γn并重复上述过程.文献[12]证明如果Φ以参数(K,δ3K)满足RIP性质且δ3K<0.165,则有<,子空间追踪可以从y准确重建x.2.4 算法步骤第(1)~(4)步为稀疏度估计部分,得到初始估计集合Γ0和残差r0.第(5)步初始化迭代次数.第(6)~(12)步为SASP算法迭代主体.第(6)~(11)步通过子空间追踪改进估计结果.第(12)步判断终止条件是否满足,如果满足整个算法退出.否则通过弱匹配在Γn中加入新选取的原子的索引.因为rn-1与vi(i∈Γn-1)正交,Γn-1中元素不会被重复选出.算法的终止条件可设为:(1)当残差能量小于一定值时终止;(2)当原子与残差的相关小于某个阈值时终止.2.5 算法分析由可知rn能量单调递减,算法至少收敛到一个局部最小点.稀疏度估计部分主要运算在于求M次投影,计算量相对较小.SASP算法的计算复杂度与外层迭代的次数密切相关,其上限为O(K2MN).整个算法的计算量中对最小二乘问题求解占很大一部分.在外层迭代中每次重复需要求解一次最小二乘问题,即算法步骤(8).子空间追踪部分第(10)、(12)步分别需要求解一次最小二乘问题.算法外层迭代次数与每次选取的原子数和信号的稀疏度相关.3 实验结果3.1 稀疏度估计为了验证SASP算法中稀疏度估计的结果,本节通过实验测试稀疏度估计部分.实验中,M=256,N=512,Φ为M×N高斯随机矩阵,每项元素是独立分布零均值单位方差的高斯随机变量.从x0中随机取K个元素,每一项值为独立分布零均值单位方差高斯随机变量,x0中其它元素值为零.通过y=Φ x0得到观测向量y,图中横坐标表示实验次数,纵坐标表示K的估计值.图3中比较了K=51条件下δK取不同估计值时得到的K估计值.3.2 信号重建实验这一部分比较OMP、StOMP、ROMP、SP、SAMP、SASP算法性能和运算时间.实验测试K取不同值的结果,其它设置与上一节相同.信号准确重建的条件设为x和x0中非零元素位置一样,且误差的能量小于10-15.StOMP中阈值ts取3,SAMP中步长s=1,SASP中α=0.7,δK取估计值0.3.算法中最小二乘问题都采用QR分解法求得,终止条件都设为残差能量小于ε=10-5.算法在Intel Core2 Duo E8400机器上运行,软件版本为Matlab R2008a.实验中OMP和StOMP算法的实现采用的是SparseLab(/)工具箱.SAMP和ROMP的实现采用作者提供的代码.对于不同的K值,所有算法都运行500次来计算重建成功率和平均运行时间.图4中表示不同稀疏度下信号准确重建率.从图4中可以看出本文提出的算法性能超过OMP、StOMP、ROMP和SP算法,与SAMP相比性能相当.当K/N大于0.25时SASP和SAMP才会有较多信号不能成功重建.图5表示了各个算法运行平均时间.相比SP算法,SASP因为迭代次数增加而导致运算时间超过SP.而与SAMP 算法相比,SASP的运算时间远小于SAMP.3.3 图像实验实验中使用的图像是256×256像素的Lena图像,采样矩阵采用结构化随机采样矩阵[15].实验比较了各个算法重建图像的PSNR和运算时间,所有算法选取出约3000个原子后终止.为了加快SAMP运算时间,实验比较了步长分别取s=50和s=100的结果.所有算法中对于大规模最小二乘问题的求解都采用LSQR算法.表1比较了不同算法重建图像的PSNR和运算时间.从表中可以看出SASP重建图像PSNR最高,运算时间也较短.SAMP需要取合适的步长s才能在运算时间和重建效果间取得一个较好的平衡.表1 不同算法重建时间与性能比较算法Lena(M=0.3*N)Lena(M=0.4*N)Lena(M=0.5*N)PSNR(dB)运行时间(s)PSNR(dB)运行时间(s)PSNR(dB)运行时间(s)OMP27.24336.3128.81338.2629.36339.30 StOMP21.620.8123.970.8526.450.91 ROMP23.041.7426.132.5927.022.41 SP27.1719.5727.6710.4729.496.40SAMP(s=50)27.27217.3129.12217.3429.62206.25SAMP(s=100)27.25118.4829.06122.6529.56109.26SASP27.5019.2529.2919.0329.8019.204 结论本文提出了一种自适应子空间追踪算法SASP,该算法可以在未知信号稀疏度的情况下准确重建信号.算法使用一种新的稀疏度估计方法得到稀疏度的初始估计值,然后通过迭代进行估计的更新.在每次迭代中采用弱匹配原则选取新原子,再通过子空间追踪改善结果并重建信号.实验表明,SASP算法可以有效地重建稀疏信号,同时具有较低的运算量.参考文献:【相关文献】[1]D L pressed sensing[J].IEEE Trans Info Theory,2006,52(4):1289-1306.[2]E J Candès,J Romberg,T Tao.Robust uncertainty principles:Exact signal reconstruction from highly incomplete frequency information[J].IEEE Trans Info Theory,2006,52(2):489-509.[3]E J Candès,T Tao.Near-optimal signal recovery from random projections:Universal encoding strategies[J].IEEE Trans Info Theory,2006,52(12):5406-5425.[4]E J Candès,T Tao.Decoding by linear programming[J].IEEE Trans InfoTheory,2005,51(12):4203-4215.[5]石光明,刘丹华,高大化,等.压缩感知理论及其研究进展[J].电子学报,2009,37(5):1070-1081.Shi Guang-min,Liu Dan-hua,Gao Dahua,etc.Advance s in theory and application of compressed sensing[J].Acta Sinica Electronica,2009,37(5):1070-1081.(in Chinese)[6]S S Chen,D LDonoho,M A.Saunders.Atomic decomposition by basis pursuit[J].SIAM Rev,2001,43(1):129-159.[7]S Mallat,Z Zhang.Matching pursuits with time-frequency dictionaries[J].IEEE Trans Signal Process,1993,41(12):3397-3415.[8]J A Tropp.Greed is good:Algorithmic results for sparse approximation[J].IEEE Trans Info Theory,2004,50(10):2231-2242.[9]J A Tropp,AC Gilbert.Signal recovery from random measurements via orthogonal matching pursuit[J].IEEE Trans Info Theory,2007,53(12):4655-4666.[10]D L Donoho,Y Tsaig,I 
Drori,etc.Sparse solution of underdetermined linear equations by stagewise Orthogonal Matching Pursuit[OL].2007,/~donoho/Reports/2006/StOMP-20060403.pdf.[11]D Needell,R Vershynin.Uniform uncertainty principle and signal recovery via regularized orthogonal matching pursuit[OL]./abs/0707.4203,2007-7-28/2008-3-15.[12]W Dai,enkovic.Subspace pursuit for compressive sensing signal reconstruction[OL]./abs/0803.0811,2008-3-6/2009-1-8.[13]D Needell,JA Tropp.CoSaMP:Iterative signal recovery from incomplete and inaccurate samples[J].Applied and Computational Harmonic Analysis,2009,26:301-321.[14]Thong T Do,Lu Gan,Nam Nguyen etc.Sparsity adaptive matching pursuit algorithm for practical compressed sensing[A].Proc Asilomar Conference on Signals,Systems,and Computers[C].Pacific Grove,CA,United States:IEEE Signal Processing Society.2008.10:581-587.[15]T Do,T Tran,L Gan,Fast compressive sampling with structurally random matrices[A].Proc ICASSP[C].Piscataway:Institute of Electrical and Electronics Engineers Inc.2008.5:3369-3372.。
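For readers who want to trace the backtracking refinement of Section 2.3 and the outer loop of Section 2.4, the following Python sketch reconstructs a subspace-pursuit-style pass plus weak-matching support growth from the description above. It is a simplified reading of the paper, not the authors' MATLAB code: the initial support, the stopping thresholds, and the cap on the number of atoms are illustrative, and the least-squares steps use a plain solver rather than QR or LSQR.

```python
import numpy as np

def ls_on_support(Phi, y, support):
    """Least-squares coefficients of y on the columns Phi[:, support]."""
    coef, *_ = np.linalg.lstsq(Phi[:, support], y, rcond=None)
    return coef

def sp_refine(Phi, y, support):
    """One subspace-pursuit style backtracking pass that keeps |support| atoms."""
    k = len(support)
    r = y - Phi[:, support] @ ls_on_support(Phi, y, support)
    while True:
        g = Phi.T @ r
        cand = np.union1d(support, np.argsort(-np.abs(g))[:k])    # merge k best new atoms
        coef = ls_on_support(Phi, y, cand)
        keep = cand[np.argsort(-np.abs(coef))[:k]]                # backtrack: keep k largest
        r_new = y - Phi[:, keep] @ ls_on_support(Phi, y, keep)
        if np.linalg.norm(r_new) >= np.linalg.norm(r):            # no improvement: stop
            return np.sort(support), r
        support, r = np.sort(keep), r_new

def sasp(Phi, y, alpha=0.7, tol=1e-5, max_atoms=None):
    """Sparsity-adaptive reconstruction: grow the support by weak matching,
    refine it by subspace pursuit, stop once the residual is small."""
    M, N = Phi.shape
    max_atoms = max_atoms or M // 2                               # illustrative cap
    g = Phi.T @ y
    support = np.array([np.argmax(np.abs(g))])                    # illustrative initial support
    r = y - Phi[:, support] @ ls_on_support(Phi, y, support)
    while np.linalg.norm(r) > tol and len(support) < max_atoms:
        support, r = sp_refine(Phi, y, support)
        if np.linalg.norm(r) <= tol:
            break
        g = Phi.T @ r
        new = np.flatnonzero(np.abs(g) >= alpha * np.max(np.abs(g)))  # weak matching
        support = np.union1d(support, new)
    x = np.zeros(N)
    x[support] = ls_on_support(Phi, y, support)
    return x
```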
Voting and Parts — UC Berkeley computer vision course, Lecture 15 (slides)
Sparse representation
+ Computationally tractable (10^5 pixels → 10^1–10^2 parts)
+ Generative representation of class
+ Avoid modeling global variability
+ Success in specific object recognition
Multiple features: a wide variety of local feature representations have been proposed, including SIFT [Lowe], Shape context [Belongie et al.], Superpixels [Ren et al.], Maximally Stable Extremal Regions [Matas et al.], Salient regions [Kadir et al.], Harris-Affine [Schmid et al.], Spin images [Johnson and Hebert], and Geometric Blur [Berg et al.].
Region extraction and description:
• Find regions within the image using Kadir and Brady's salient region operator [IJCV '01]
• Location: (x, y) coordinates of the region center
• Scale: diameter of the region (pixels)
• Appearance: normalized region patch
Robust Recovery of Subspace Structures by Low-Rank Representation
Robust Recovery of Subspace Structures by Low-Rank RepresentationGuangcan Liu,Member,IEEE,Zhouchen Lin,Senior Member,IEEE,Shuicheng Yan,Senior Member,IEEE,Ju Sun,Student Member,IEEE,Yong Yu,and Yi Ma,Senior Member,IEEEAbstract—In this paper,we address the subspace clustering problem.Given a set of data samples(vectors)approximately drawn from a union of multiple subspaces,our goal is to cluster the samples into their respective subspaces and remove possible outliers as well.To this end,we propose a novel objective function named Low-Rank Representation(LRR),which seeks the lowest rank representation among all the candidates that can represent the data samples as linear combinations of the bases in a given dictionary.It is shown that the convex program associated with LRR solves the subspace clustering problem in the following sense:When the data is clean,we prove that LRR exactly recovers the true subspace structures;when the data are contaminated by outliers,we prove that under certain conditions LRR can exactly recover the row space of the original data and detect the outlier as well;for data corrupted by arbitrary sparse errors,LRR can also approximately recover the row space with theoretical guarantees.Since the subspace membership is provably determined by the row space,these further imply that LRR can perform robust subspace clustering and error correction in an efficient and effective way.Index Terms—Low-rank representation,subspace clustering,segmentation,outlier detectionÇ1I NTRODUCTIONI N pattern analysis and signal processing,an underlying tenet is that the data often contains some type of structure that enables intelligent representation and processing.So one usually needs a parametric model to characterize a given set of data.To this end,the well-known(linear) subspaces are possibly the most common choice,mainly because they are easy to compute and often effective in real applications.Several types of visual data,such as motion[1],[2],[3],face[4],and texture[5],have been known to be well characterized by subspaces.Moreover,by applying the concept of reproducing kernel Hilbert space,one can easily extend the linear models to handle nonlinear data.So the subspace methods have been gaining much attention in recent years.For example,the widely used Principal Component Analysis(PCA)method and the recently established matrix completion[6]and recovery[7]methods are essentially based on the hypothesis that the data is approximately drawn from a low-rank subspace.However, a given dataset can seldom be well described by a single subspace.A more reasonable model is to consider data as lying near several subspaces,namely,the data is considered as samples approximately drawn from a mixture of several low-rank subspaces,as shown in Fig.1.The generality and importance of subspaces naturally lead to a challenging problem of subspace segmentation(or clustering),whose goal is to segment(cluster or group)data into clusters with each cluster corresponding to a subspace. 
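As a toy illustration of the single-subspace (low-rank) hypothesis mentioned above, the snippet below draws samples near one low-dimensional subspace and recovers it with a truncated SVD; this is ordinary PCA-style recovery, included only for contrast with the multi-subspace setting discussed next, and all sizes and the noise level are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n, r = 100, 500, 5                                 # ambient dim, samples, subspace dim
U = np.linalg.qr(rng.standard_normal((d, r)))[0]      # orthonormal basis of the subspace
X0 = U @ rng.standard_normal((r, n))                  # clean samples in the subspace
X = X0 + 0.01 * rng.standard_normal((d, n))           # slight Gaussian perturbation

# a rank-r truncated SVD recovers the subspace (and a denoised X0) well
Uh, Sh, Vh = np.linalg.svd(X, full_matrices=False)
X_hat = Uh[:, :r] @ np.diag(Sh[:r]) @ Vh[:r, :]
print(np.linalg.norm(X_hat - X0) / np.linalg.norm(X0))   # small relative error
```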
Subspace segmentation is an important data clustering problem and arises in numerous research areas,including computer vision[3],[8],[9],image processing[5],[10],and system identification[11].When the data is clean,i.e.,the samples are strictly drawn from the subspaces,several existing methods(e.g.,[12],[13],[14])are able to exactly solve the subspace segmentation problem.So,as pointed out by Rao et al.[3]and Liu et al.[14],the main challenge of subspace segmentation is to handle the errors(e.g.,noise and corruptions)that possibly exist in data,i.e.,to handle the data that may not strictly follow subspace structures. With this viewpoint,in this paper we therefore study the following robust subspace clustering[15]problem. Problem1.1(Robust Subspace Clustering).Given a set of data samples approximately(i.e.,the data may contain errors).G.Liu is with the Department of Computer Science and Engineering,Shanghai Jiao Tong University,China,the Coordinated Science Labora-tory,University of Illinois,1308West Main Street,Urbana-Champaign,Urbana,IL61801,and the Department of Electrical and ComputerEngineering,National University of Singapore.E-mail:gutty.liu@..Z.Lin is with the Key Laboratory of Machine Perception(MOE),School ofElectronic Engineering and Computer Science,Peking University,No.5Yiheyuan Road,Haidian District,Beijing100871,China.E-mail:zlin@..S.Yan is with the Department of Electrical and Computer Engineering,National University of Singapore,Block E4,#08-27,Engineering Drive3,Singapore117576.E-mail:eleyans@.sg..J.Sun is with the Department of Electrical Engineering,ColumbiaUniversity,1300S.W.Mudd,500West120th Street,New York,NY10027.E-mail:jusun@..Y.Yu is with the Department of Computer Science and Engineering,Shanghai Jiao Tong University,No.800Dongchuan Road,MinhangDistrict,Shanghai200240,China.E-mail:yyu@..Y.Ma is with the Visual Computing Group,Microsoft Research Asia,China,and with the Coordinated Science Laboratory,University of Illinoisat Urbana-Champaign,Room145,1308West Main Street,Urbana,IL61801.Manuscript received14Oct.2010;revised8Sept.2011;accepted24Mar.2012;published online4Apr.2012.Recommended for acceptance by T.Jebara.For information on obtaining reprints of this article,please send e-mail to:tpami@,and reference IEEECS Log NumberTPAMI-2010-10-0786.Digital Object Identifier no.10.1109/TPAMI.2012.88.0162-8828/13/$31.00ß2013IEEE Published by the IEEE Computer Societydrawn from a union of linear subspaces,correct the possibleerrors and segment all samples into their respective subspaces simultaneously.Notice that the word“error”generally refers to the deviation between model assumption(i.e.,subspaces)and data.It could exhibit as noise[6],missed entries[6],outliers [16],and corruptions[7]in reality.Fig.2illustrates three typical types of errors under the context of subspace modeling.In this paper,we shall focus on the sample-specific corruptions(and outliers)shown in Fig.2c,with mild concerns to the cases of Figs.2a and2b.Notice that an outlier is from a different model other than subspaces and is essentially different from a corrupted sample that belongs to the subspaces.We put them into the same category just because they can be handled in the same way,as will be shown in Section5.2.To recover the subspace structures from the data containing errors,we propose a novel method termed Low-Rank Representation(LRR)[14].Given a set of data samples,each of which can be represented as a linear combination of the bases in a dictionary,LRR aims at finding the lowest rank 
representation of all data jointly.The computational procedure of LRR is to solve a nuclear norm [17]regularized optimization problem,which is convex and can be solved in polynomial time.By choosing a specific dictionary,it is shown that LRR can well solve the subspace clustering problem:When the data is clean,we prove that LRR exactly recovers the row space of the data;for the data contaminated by outliers,we prove that under certain conditions LRR can exactly recover the row space of the original data and detect the outlier as well;for the data corrupted by arbitrary errors,LRR can also approximately recover the row space with theoretical guarantees.Since the subspace membership is provably determined by the row space(we will discuss this in Section3.2),these further imply that LRR can perform robust subspace clustering and error correction in an efficient way.In summary,the contributions of this work include:.We develop a simple yet effective method,termed LRR,which has been used to achieve state-of-the-artperformance in several applications such as motionsegmentation[4],image segmentation[18],saliencydetection[19],and face recognition[4]..Our work extends the recovery of corrupted data from a single subspace[7]to multiple subspaces.Compared to[20],which requires the bases ofsubspaces to be known for handling the corrupteddata from multiple subspaces,our method isautonomous,i.e.,no extra clean data is required..Theoretical results for robust recovery are provided.While our analysis shares similar features asprevious work in matrix completion[6]and RobustPCA(RPCA)[7],[16],it is considerably morechallenging due to the fact that there is a dictionarymatrix in LRR.2R ELATED W ORKIn this section,we discuss some existing subspace segmen-tation methods.In general,existing works can be roughly divided into four main categories:mixture of Gaussian, factorization,algebraic,and spectral-type methods.In statistical learning,mixed data is typically modeled as a set of independent samples drawn from a mixture of probabilistic distributions.As a single subspace can be well modeled by a(degenerate)Gaussian distribution,it is straightforward to assume that each probabilistic distribu-tion is Gaussian,i.e.,adopting a mixture of Gaussian models.Then the problem of segmenting the data is converted to a model estimation problem.The estimation can be performed either by using the Expectation Max-imization(EM)algorithm to find a maximum likelihood estimate,as done in[21],or by iteratively finding a min-max estimate,as adopted by K-subspaces[8]and Random Sample Consensus(RANSAC)[10].These methods are sensitive to errors.So several efforts have been made for improving their robustness,e.g.,the Median K-flats[22]for K-subspaces,the work[23]for RANSAC,and[5]use a coding length to characterize a mixture of Gaussian.These refinements may introduce some robustness.Nevertheless, the problem is still not well solved due to the optimization difficulty,which is a bottleneck for these methods.Factorization-based methods[12]seek to approximate the given data matrix as a product of two matrices such that the support pattern for one of the factors reveals the segmenta-tion of the samples.In order to achieve robustness to noise, these methods modify the formulations by adding extra regularization terms.Nevertheless,such modifications usually lead to non-convex optimization problems which need heuristic algorithms(often based on alternating minimization or EM-style algorithms)to solve.Getting stuck at local minima may 
undermine their performances, especially when the data is grossly corrupted.It will be shown that LRR can be regarded as a robust generalization of the method in[12](which is referred to as PCA in thisFig.1.A mixture of subspaces consisting of a2D plane and two1D lines.(a)The samples are strictly drawn from the underlying subspaces.(b)The samples are approximately drawn from the underlyingsubspaces.Fig.2.Illustrating three typical types of errors:(a)noise[6],which indicates the phenomena that the data is slightly perturbed around the subspaces(what we show is a perturbed data matrix whose columns are samples drawn from the subspaces),(b)random corruptions[7], which indicate that a fraction of random entries are grossly corrupted, (c)sample-specific corruptions(and outliers),which indicate the phenomena that a fraction of the data samples(i.e.,columns of the data matrix)are far away from the subspaces.paper).The formulation of LRR is convex and can be solved in polynomial time.Generalized Principal Component Analysis(GPCA)[24] presents an algebraic way to model the data drawn from a union of multiple subspaces.This method describes a subspace containing a data point by using the gradient of a polynomial at that point.Then subspace segmentation is made equivalent to fitting the data with polynomials.GPCA can guarantee the success of the segmentation under certain conditions,and it does not impose any restriction on the subspaces.However,this method is sensitive to noise due to the difficulty of estimating the polynomials from real data, which also causes the high computation cost of GPCA. Recently,Robust Algebraic Segmentation(RAS)[25]has been proposed to resolve the robustness issue of GPCA. However,the computation difficulty for fitting polynomials is unfathomably large.So RAS can make sense only when the data dimension is low and the number of subspaces is small.As a data clustering problem,subspace segmentation can be done by first learning an affinity matrix from the given data and then obtaining the final segmentation results by Spectral Clustering(SC)algorithms such as Normalized Cuts(NCut)[26].Many existing methods,such as Sparse Subspace Clustering(SSC)[13],Spectral Curvature Cluster-ing(SCC)[27],[28],Spectral Local Best-fit Flats(SLBF)[29], [30],the proposed LRR method,and[2],[31],possess such a spectral nature,so-called spectral-type methods.The main difference among various spectral-type methods is the approach for learning the affinity matrix.Under the assumption that the data is clean and the subspaces are independent,Elhamifar and Vidal[13]show that a solution produced by Sparse Representation(SR)[32]could achieve the so-called‘1Subspace Detection Property(‘1-SDP):The within-class affinities are sparse and the between-class affinities are all zeros.In the presence of outliers,it is shown in[15]that the SR method can still obey‘1-SDP.However,‘1-SDP may not be sufficient to ensure the success of subspace segmentation[33].Recently,Lerman and Zhang [34]proved that under certain conditions the multiple subspace structures can be exactly recovered via‘p(p1) minimization.Unfortunately,since the formulation is not convex,it is still unknown how to efficiently obtain the globally optimal solution.In contrast,the formulation of LRR is convex and the corresponding optimization problem can be solved in polynomial time.What is more,even if the data is contaminated by outliers,the proposed LRR method is proven to exactly recover the right row space,which provably determines the subspace 
segmentation results (we shall discuss this in Section3.2).In the presence of arbitrary errors(e.g.,corruptions,outliers,and noise),LRR is also guaranteed to produce near recovery.3P RELIMINARIES AND P ROBLEM S TATEMENT3.1Summary of Main NotationsIn this paper,matrices are represented with capital symbols. In particular,I is used to denote the identity matrix,and the entries of matrices are denoted by using½Á with subscripts. For instance,M is a matrix,½M ij is itsði;jÞth entry,½M i;:is its i th row,and½M :;j is its j th column.For ease of presentation,the horizontal(respectively,vertical)concate-nation of a collection of matrices along row(respectively, column)is denoted by½M1;M2;...;M k (respectively,½M1;M2;...;M k ).The block-diagonal matrix formed by a collection of matrices M1;M2;...;M k is denoted by diag M1;M2;...;M kðÞ¼M10000M20000...000M k2666437775:ð1ÞThe only used vector norm is the‘2norm,denoted by Ák k2.A variety of norms on matrices will be used.The matrix ‘0,‘2;0,‘1,‘2;1norms are defined by Mk k0¼#fði;jÞ:½M ij¼0g,Mk k2;0¼#f i:k½M :;i k2¼0g,Mk k1¼Pi;jj½M ij j,and Mk k2;1¼Pik½M :;i k2,respectively.The matrix‘1norm is defined as Mk k1¼max i;j j½M ij j.The spectral norm of a matrix M is denoted by Mk k,i.e.,Mk k is the largest singular value of M.The Frobenius norm and the nuclear norm(the sum of singular values of a matrix)are denoted by Mk k F and Mk kÃ,respectively.The euclidean inner product between two matrices is h M;N i¼trðM T NÞ,where M T is the transpose of a matrix and trðÁÞis the trace of a matrix.The supports of a matrix M are the indices of its nonzero entries,i.e.,fði;jÞ:½M ij¼0g.Similarly,its column supports are the indices of its nonzero columns.The symbol I (superscripts,subscripts,etc.)is used to denote the column supports of a matrix,i.e.,I¼fðiÞ:k½M :;i k2¼0g.The corresponding complement set(i.e.,zero columns)is I c. There are two projection operators associated with I and I c: P I and P I c.While applying them to a matrix M,the matrix P IðMÞ(respectively,P I cðMÞ)is obtained from M by setting ½M :;i to zero for all i2I(respectively,i2I c).We also adopt the conventions of using spanðMÞto denote the linear space spanned by the columns of a matrix M,using y2spanðMÞto denote that a vector y belongs to the space spanðMÞ,and using Y2spanðMÞto denote that all column vectors of Y belong to spanðMÞ.Finally,in this paper we use several terminologies, including“block-diagonal matrix,”“union and sum of subspaces,”“independent(and disjoint)subspaces,”“full SVD and skinny SVD,”“pseudo-inverse,”“column space and row space,”and“affinity degree.”These terminologies are defined in the Appendix,which can be found in the Computer Society Digital Library at http://doi. /10.1109/TPAMI.2012.88.3.2Relations between Segmentation and RowSpaceLet X0with skinny SVD U0Æ0V T0be a collection of data samples strictly drawn from a union of multiple subspaces (i.e.,X0is clean);the subspace membership of the samples is determined by the row space of X0.Indeed,as shown in[12], when subspaces are independent,V0V T0forms a block-diagonal matrix:Theði;jÞth entry of V0V T0can be nonzero only if the i th and j th samples are from the same subspace. Hence,this matrix,termed Shape Interaction Matrix(SIM) [12],has been widely used for subspace segmentation. 
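Forming the shape interaction matrix from a skinny SVD takes only a few lines. The sketch below assumes the clean, independent-subspace setting of this subsection and a known rank; it is an illustration of the definition, not of LRR itself, which is precisely about avoiding this fragile SVD-of-the-raw-data step when the samples are corrupted.

```python
import numpy as np
from sklearn.cluster import SpectralClustering

def sim_affinity(X0, rank):
    """|V V^T| from the skinny SVD of clean data X0 (d x n), used as an affinity matrix."""
    _, _, Vt = np.linalg.svd(X0, full_matrices=False)
    V = Vt[:rank, :].T                    # n x rank basis of the row space of X0
    return np.abs(V @ V.T)

# usage sketch: cluster the n columns of X0 into k groups with the SIM as affinity
# labels = SpectralClustering(n_clusters=k,
#                             affinity='precomputed').fit_predict(sim_affinity(X0, r))
```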
Previous approaches simply compute the SVD of the data matrix X = U_X Σ_X V_X^T and then use |V_X V_X^T| (for a matrix M, |M| denotes the matrix whose (i, j)th entry is the absolute value of [M]_ij) for subspace segmentation. However, in the presence of outliers and corruptions, V_X can be far away from V_0 and thus the
Suppose DÃis a minimizer with respect to the variable D, then it gives a low-rank recovery to the original data X0.The above formulation is adopted by the recently established the Robust PCA method[7],which has been used to achieve the state-of-the-art performance in several applications(e.g.,[35]).However,this formulation impli-citly assumes that the underlying data structure is a single low-rank subspace.When the data is drawn from a union of multiple subspaces,denoted as S1;S2;...;S k,it actually treats the data as being sampled from a single subspace defined by S¼P ki¼1S i.Since the sumP ki¼1S i can be much larger than the union[k i¼1S i,the specifics of the individual subspaces are not well considered and so the recovery may be inaccurate.To better handle the mixed data,here we suggest a more general rank minimization problem defined as follows: minZ;Erank ZðÞþ Ek k‘;s:t:X¼AZþE;ð3Þwhere A is a“dictionary”that linearly spans the data space. We call the minimizer ZÃ(with regard to the variable Z)the “lowest rank representation”of data X with respect to a dictionary A.After obtaining an optimal solutionðZÃ;EÃÞ, we could recover the original data by using AZÃ(or XÀEÃ). Since rankðAZÃÞrankðZÃÞ,AZÃis also a low-rank recovery to the original data X0.By setting A¼I,the formulation(3) falls back to(2).So LRR could be regarded as a general-ization of RPCA that essentially uses the standard bases as the dictionary.By choosing an appropriate dictionary A,as we will see,the lowest rank representation can recover the underlying row space so as to reveal the true segmentation of data.So,LRR could handle well the data drawn from a union of multiple subspaces.174IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE,VOL.35,NO.1,JANUARY2013Fig. 3.An example of the matrix V0V T0computed from dependent subspaces.In this example,we create11pairwise disjoint subspaces, each of which is of dimension20and draw20samples from each subspace without errors.The ambient dimension is200,which is smaller than the sum of the dimensions of the subspaces.So the subspaces are dependent and V0V T0is not strictly block-diagonal.Nevertheless,it is simple to see that high segmentation accuracy can be achieved by using the above affinity matrix to do spectral clustering.4.2Analysis on the LRR ProblemThe optimization problem(3)is difficult to solve due to the discrete nature of the rank function.For ease of exploration, we begin with the“ideal”case that the data is clean.That is, we consider the following rank minimization problem:min Z rank ZðÞ;s:t:X¼AZ:ð4ÞIt is easy to see that the solution to(4)may not be unique. 
As a common practice in rank minimization problems,we replace the rank function with the nuclear norm,resulting in the following convex optimization problem:min Z Zk kÃ;s:t:X¼AZ:ð5ÞWe will show that the solution to(5)is also a solution to(4) and this special solution is useful for subspace segmentation.In the following,we shall show some general properties of the minimizer to problem(5).These general conclusions form the foundations of LRR(the proofs can be found in Appendix,which is available in the online supplemental material).4.2.1Uniqueness of the MinimizerThe nuclear norm is convex,but not strongly convex.So it is possible that(5)has multiple optimal solutions.Fortunately, it can be proven that the minimizer to(5)is always uniquely defined by a closed form.This is summarized in the following theorem.Theorem 4.1.Assume A¼0and X¼AZ have feasible solution(s),i.e.,X2spanðAÞ.Then,ZüA y Xð6Þis the unique minimizer to(5),where A y is the pseudo-inverse of A.From the above theorem,we have the following corollary which shows that(5)is a good surrogate of(4).Corollary 4.1.Assume A¼0and X¼AZ have feasible solutions.Let ZÃbe the minimizer to(5),then rankðZÃÞ¼rankðXÞand ZÃis also a minimal rank solution to(4).4.2.2Block-Diagonal Property of the MinimizerBy choosing an appropriate dictionary,the lowest rank representation can reveal the true segmentation results. Namely,when the columns of A and X are exactly sampled from independent subspaces,the minimizer to(5)can reveal the subspace membership among the samples.Let fS1;S2;...;S k g be a collection of k subspaces,each of which has a rank(dimension)of r i>0.Also,let A¼½A1; A2;...;A k and X¼½X1;X2;...;X k .Then we have the following theorem.Theorem4.2.Without loss of generality,assume that A i is a collection of m i samples of the i th subspace S i,X i is a collection of n i samples from S i,and the sampling of each A i is sufficient such that rankðA iÞ¼r i(i.e.,A i can be regarded as the bases that span the subspace).If the subspaces are independent,then the minimizer to(5)is block-diagonal:ZüZÃ10000ZÃ20000...000ZÃk2666437775;where ZÃi is an m iÂn i coefficient matrix with rankðZÃiÞ¼rankðX iÞ;8i.Note that the claim of rankðZÃiÞ¼rankðX iÞguarantees the high within-class homogeneity of ZÃi since the low-rank properties generally require ZÃi to be dense.This is different from SR,which is prone to produce a“trivial”solution if A¼X because the sparsest representation is an identity matrix in this case.It is also worth noting that the above block-diagonal property does not require the data samples to have been grouped together according to their subspace memberships.There is no loss of generality to assume that the indices of the samples have been rearranged to satisfy the true subspace memberships,because the solution produced by LRR is globally optimal and does not depend on the arrangements of the data samples.4.3Recovering Low-Rank Matrices by ConvexOptimizationCorollary 4.1suggests that it is appropriate to use the nuclear norm as a surrogate to replace the rank function in (3).Also,the matrix‘1and‘2;1norms are good relaxations of the‘0and‘2;0norms,respectively.So we could obtain a low-rank recovery to X0by solving the following convex optimization problem:minZ;Ek Z kÃþ k E k2;1;s:t:X¼AZþE:ð7ÞHere,the‘2;1norm is adopted to characterize the error term E since we want to model the sample-specific corruptions(and outliers)as shown in Fig.2c.For the small Gaussian noise as shown in Fig.2a,k E k2F should be chosen;for the random corruptions as shown in 
Fig. 2b, ||E||_1 is an appropriate choice. After obtaining the minimizer (Z*, E*), we could use A Z* (or X − E*) to obtain a low-rank recovery to the original data X_0.
The optimization problem (7) is convex and can be solved by various methods. For efficiency, we adopt in this paper the Augmented Lagrange Multiplier (ALM) [36], [37] method. We first convert (7) to the following equivalent problem:
    min_{Z,E,J} ||J||_* + λ||E||_{2,1},  s.t.  X = A Z + E,  Z = J.
This problem can be solved by the ALM method, which minimizes the following augmented Lagrangian function:
    L = ||J||_* + λ||E||_{2,1} + tr(Y_1^T (X − A Z − E)) + tr(Y_2^T (Z − J)) + (μ/2)(||X − A Z − E||_F^2 + ||Z − J||_F^2).
The above problem is unconstrained, so it can be minimized with respect to J, Z, and E, respectively, by fixing the other variables and then updating the Lagrange multipliers Y_1 and Y_2, where μ > 0 is a penalty parameter. The inexact ALM method is also known as the alternating direction method.
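A compact reconstruction of the alternating updates implied by this augmented Lagrangian is sketched below. It is for orientation only, not the authors' released solver: the closed-form J, Z, and E updates follow the standard inexact-ALM pattern, while the default dictionary A = X, the penalty schedule (rho, mu_max), and the stopping tolerance are illustrative, and none of the numerical refinements of the original implementation are attempted.

```python
import numpy as np

def svt(M, tau):
    """Singular value thresholding: proximal operator of tau * nuclear norm."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def l21_shrink(M, tau):
    """Column-wise shrinkage: proximal operator of tau * ||.||_{2,1}."""
    out = np.zeros_like(M)
    for j in range(M.shape[1]):
        nrm = np.linalg.norm(M[:, j])
        if nrm > tau:
            out[:, j] = (1.0 - tau / nrm) * M[:, j]
    return out

def lrr_ialm(X, A=None, lam=0.1, mu=1e-6, mu_max=1e10, rho=1.1, tol=1e-7, max_iter=500):
    """Inexact-ALM sketch for min ||Z||_* + lam*||E||_{2,1}  s.t.  X = A Z + E."""
    A = X if A is None else A                          # common choice: the data itself as dictionary
    d, n = X.shape
    m = A.shape[1]
    Z = np.zeros((m, n)); J = np.zeros((m, n)); E = np.zeros((d, n))
    Y1 = np.zeros((d, n)); Y2 = np.zeros((m, n))
    inv_term = np.linalg.inv(np.eye(m) + A.T @ A)      # fixed linear system for the Z-step
    for _ in range(max_iter):
        J = svt(Z + Y2 / mu, 1.0 / mu)                                     # J-step
        Z = inv_term @ (A.T @ (X - E) + J + (A.T @ Y1 - Y2) / mu)          # Z-step
        E = l21_shrink(X - A @ Z + Y1 / mu, lam / mu)                      # E-step
        R1 = X - A @ Z - E
        R2 = Z - J
        Y1 += mu * R1                                                       # multiplier updates
        Y2 += mu * R2
        mu = min(rho * mu, mu_max)
        if max(np.abs(R1).max(), np.abs(R2).max()) < tol:
            break
    return Z, E
```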
A Smart Video Investigation System Guided by the Spatio-Temporal Subspace of 3D Models
Abstract: To overcome problems such as the "semantic gap" faced by traditional video processing techniques, a smart video investigation technique guided by the spatio-temporal subspace of 3D models is proposed, in which video is processed and analyzed with the help of the information contained in that subspace.
(1) Under the guidance of the body-shape subspace, a 3D target model is matched to the video.
(2) Video events are extracted under the guidance of the 3D-model spatio-temporal subspace: video of the monitored object + spatio-temporal subspace of the 3D model → 3D motion of the monitored object.
(3) Motions are compared and classified against a 3D event library: motion data + 3D event library → type and nature of the video.
The work involves computer graphics, video processing, and criminal investigation technology, and explores a new channel for using 3D graphics techniques to solve video investigation problems.
Keywords: smart video investigation; 3D-model spatio-temporal subspace; motion comparison; rapid assessment; 3D event library; big data. CLC number: TP391.41; Document code: A; Article ID: 1006-8228(2016)05-16-05.
0 Introduction
With the widespread deployment of video surveillance cameras, video investigation that takes video content as its breakthrough point has gradually become an important method for public security organs in solving criminal cases.
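As a rough illustration of step (3) above, motion comparison against a 3D event library, the snippet below matches an extracted joint-angle sequence to labeled sequences in a library by dynamic time warping and nearest-neighbor search. This is a generic sketch introduced here, not the system described in the paper: the library format, the DTW distance, and the classification rule are all assumptions.

```python
import numpy as np

def dtw_distance(a, b):
    """Dynamic time warping distance between two motion sequences.
    a: (Ta, J) and b: (Tb, J) arrays of per-frame joint-angle (or pose) features."""
    Ta, Tb = len(a), len(b)
    D = np.full((Ta + 1, Tb + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, Ta + 1):
        for j in range(1, Tb + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[Ta, Tb]

def classify_motion(query, event_library):
    """Nearest-neighbor event label for a query motion.
    event_library: list of (label, sequence) pairs extracted from the 3D event database."""
    best_label, best_dist = None, np.inf
    for label, seq in event_library:
        dist = dtw_distance(query, seq)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label, best_dist
```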
several efforts have been made for improving their robustness, e.g., the Median K-flats (Zhang et al., 2009) for K-subspaces, the work of (Yang et al., 2006) for RANSAC, and (Ma et al., 2008a; Wright et al., 2008a) use a coding length to characterize a mixture of Gaussian, which may have some robustness. Nevertheless, the problem is still not well solved due to the optimization difficulty, which is a bottleneck for these methods to achieve robustness. Factorization based methods (Costeira & Kanade, 1998; Gruber & Weiss, 2004) seek to represent the given data matrix as a product of two matrices, so that the support pattern of one of the factors reveals the grouping of the points. These methods aim at modifying popular factor analysis algorithms (often based on alternating minimization or EM-style algorithms) to produce such factorizations. Nevertheless, these methods are sensitive to noise and outliers, and it is not easy to modify them to be robust because they usually need iterative optimization algorithms to obtain the factorizations. Generalized Principal Component Analysis (GPCA) (Ma et al., 2008b) presents an algebraic way to model the data drawn from a union of multiple subspaces. By describing a subspace containing a data point by using the gradient of a polynomial at that point, subspace segmentation is then equivalent to fitting the data with polynomials. GPCA can guarantee the success of the segmentation under certain conditions, and it does not impose any restriction on the subspaces. However, this method is sensitive to noise and outliers due to the difficulty of estimating the polynomials from real data, which also causes the high computation cost of GPCA. Recently, Robust Algebraic Segmentation (RAS)(Rao et al., 2010) is proposed to resolve the robustness issue of GPCA. However, the computation difficulty for fitting polynomials is unfathomed. So RAS can make sense only when the data dimension is low and the number of subspaces is small. Recently, the work of (Rao et al., 2009) and Sparse Subspace Clustering (SSC) (Elhamifar & Vidal, 2009) introduced compressed sensing techniques to subspace segmentation. SSC uses the sparsest representation produced by 1 -minimization (Wright et al., 2008b; Eldar & Mishali, 2008) to define the affinity matrix of an undirected graph. Then subspace segmentation is performed by spectral clustering algorithms such as the Normalized Cuts (NCut) (Shi & Malik, 2000). Under the assumption that the subspaces are independent, SSC shows that the sparsest representation is also “block-sparse”. Namely, the within-cluster affinities are sparse (but nonzero) and the between-cluster
‡ Microsoft Research Asia, NO. 49, Zhichun Road, Hai Dian District, Beijing, China, 100190
Abstract
We propose low-rank representation (LRR) to segment data drawn from a union of multiple linear (or affine) subspaces. Given a set of data vectors, LRR seeks the lowest-rank representation among all the candidates that represent all vectors as the linear combination of the bases in a dictionary. Unlike the well-known sparse representation (SR), which computes the sparsest representation of each data vector individually, LRR aims at finding the lowest-rank representation of a collection of vectors jointly. LRR better captures the global structure of data, giving a more effective tool for robust subspace segmentation from corrupted data. Both theoretical and experimental results show that LRR is a promising tool for subspace segmentation.
Robust Subspace Segmentation by Low-Rank Representation
Guangcan Liu † roth@ Zhouchen Lin ‡ zhoulin@ Yong Yu † yyu@ † Shanghai Jiao Tong University, NO. 800, Dongchuan Road, Min Hang District, Shanghai, China, 200240
subspaces have been gaining much attention in recent years. For example, the hotly discussed matrix compees & Recht, 2009; Keshavan et al., 2009; tition (Cand` Cand` es et al., 2009) problem is essentially based on the hypothesis that the data is drawn from a low-rank subspace. However, a given data set is seldom well described by a single subspace. A more reasonable model is to consider data as lying near several subspaces, leading to the challenging problem of subspace segmentation. Here, the goal is to segment (or cluster) data into clusters with each cluster corresponding to a subspace. Subspace segmentation is an important data clustering problem as it arises in numerous research areas, including machine learning (Lu & Vidal, 2006), computer vision (Ho et al., 2003), image processing (Fischler & Bolles, 1981) and system identification. Previous Work. According to their mechanisms of representing the subspaces, existing works can be roughly divided into four main categories: mixture of Gaussian, factorization, algebraic and compressed sensing. In statistical learning, mixed data are typically modeled as a set of independent samples drawn from a mixture of probabilistic distributions. As a single subspace can be well modeled by a Gaussian distribution, it is straightforward to assume that each probabilistic distribution is Gaussian, so known as the mixture of Gaussian model. Then the problem of segmenting the data is converted to a model estimation problem. The estimation can be performed either by using the Expectation Maximization (EM) algorithm to find a maximum likelihood estimate, as done in (Gruber & Weiss, 2004), or by iteratively finding a min-max estimate, as adopted by Ksubspaces (Ho et al., 2003) and Random Sample Consensus (RANSAC) (Fischler & Bolles, 1981). These methods are sensitive to the noise and outliers. So