DETERMINATION OF IMAGE ORIENTATION SUPPORTED BY IMU AND GPS
No-Reference Image Quality Assessment Method Based on Visual Attention Mechanism and Image Sharpness Metric
Journal of Applied Optics, Vol. 39, No. 1, Jan. 2018. Article ID: 1002-2082(2018)01-0051-06. doi:10.5768/JAO201839.0102002
Wang Fan, Ni Jinping, Dong Tao, Guo Rongli (School of Optoelectronics Engineering, Xi'an Technological University, Xi'an 710021, China)
Abstract: A no-reference image quality assessment method combining a bottom-up visual attention mechanism with a top-down image sharpness metric is proposed for the quality evaluation of blurred images. First, color, intensity and orientation saliency maps are computed according to the human visual attention model, and the region the eye attends to first is obtained through a winner-take-all competition. Second, the salient region and the background region are each evaluated with a no-reference image sharpness metric, and the two scores are combined into the final image quality index. The method is applied both to the radially blurred images produced during forward (approaching) motion imaging and to the Gaussian-blurred images of the LIVE image quality database. The results show that for both image classes the correlation between the objective and subjective scores is high; for the radial motion-blur images the correlation coefficient reaches 0.98. The method is therefore suitable for the quality assessment of blurred images.
Keywords: digital images; image quality assessment; visual mechanism; image sharpness
CLC number: TN911.73; TP391.4  Document code: A
Received: 2017-05-24; revised: 2017-08-07. Supported by the National Natural Science Foundation of China (11704302), the Shaanxi Science and Technology Department project (2016JQ6053), the Key Laboratory Research Program of the Shaanxi Education Department (14JS035), and the Dean's Fund of the School of Optoelectronics Engineering, Xi'an Technological University (16GDYJZ02).
Author: Wang Fan (1987-), female, from Xi'an, Shaanxi; Ph.D., lecturer; research interests: optoelectronic imaging and image quality assessment.
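The pipeline the abstract describes (bottom-up saliency to pick the attended region, a no-reference sharpness score computed separately on that region and on the background, then a weighted fusion) can be sketched in a few lines. The snippet below is only an illustration under simplifying assumptions, not the authors' implementation: saliency is approximated by a center-surround difference of Gaussian-blurred intensity rather than the full color/intensity/orientation model, sharpness by mean gradient magnitude, and the fusion weight `w_salient` is a made-up parameter.

```python
import cv2
import numpy as np

def saliency_map(img_bgr):
    """Crude bottom-up saliency: center-surround difference of blurred intensity.
    Stands in for the color/intensity/orientation maps of the attention model."""
    gray = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2GRAY).astype(np.float32) / 255.0
    center = cv2.GaussianBlur(gray, (0, 0), sigmaX=2)
    surround = cv2.GaussianBlur(gray, (0, 0), sigmaX=16)
    sal = np.abs(center - surround)
    return sal / (sal.max() + 1e-8)

def quality_index(img_bgr, w_salient=0.7):
    """Saliency-weighted no-reference quality score (illustrative only)."""
    gray = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2GRAY).astype(np.float32)
    gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0, ksize=3)
    gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1, ksize=3)
    grad = np.sqrt(gx * gx + gy * gy)            # sharpness proxy: gradient magnitude
    sal = saliency_map(img_bgr)
    mask = sal > sal.mean() + sal.std()          # "winner-take-all" stand-in
    s_fg = float(grad[mask].mean()) if mask.any() else 0.0
    s_bg = float(grad[~mask].mean()) if (~mask).any() else 0.0
    return w_salient * s_fg + (1.0 - w_salient) * s_bg

# score = quality_index(cv2.imread("blurred.png"))
```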
Discussion of ADS80 Image Data Production Techniques Based on the PixelGrid Software
Ma Yongchun, Li Long (The Second Surveying and Mapping Institute of Qinghai Province, Xining, Qinghai 810000)
Abstract: Summarizing the technical steps of using ADS80 imagery in a 1:10,000 basic surveying and mapping production project, this paper describes the working methods and points of attention for each process when handling ADS80 data with the PixelGrid software, as a reference for the wider and better use of ADS-series imagery in aerial survey production.
Keywords: ADS80 imagery; block adjustment; DOM
(青海国土经略 · Technical Exchange) The ADS80 aerial camera uses line-array push-broom imaging; each flight simultaneously acquires forward-, nadir- and backward-looking panchromatic, color and infrared imagery with threefold overlap, continuous and seamless coverage, identical resolution, and good spectral information. In stereo observation, vector extraction, and DEM generation and editing, the forward-nadir, forward-backward, and nadir-backward stereo pairs can therefore all be exploited and the optimal intersection angle chosen to obtain high-quality data, which other aerial digital imagery cannot match.
The camera integrates GPS and an inertial measurement unit (IMU), which provide fairly accurate initial exterior-orientation values for every scan line. Consequently, the subsequent aerial triangulation does not require the many horizontal-and-vertical control points of traditional photogrammetry: three-dimensional positioning of ground targets can be completed with only a few additional control points, or with no ground control at all by using PPP (precise point positioning), opening a new path toward automated photogrammetry.
1 Data processing
1.1 Processing workflow. ADS80 image data processing mainly comprises data preparation, project setup, data pre-processing, aerial triangulation, DEM matching, DEM editing, and orthoimage generation. The workflow is shown in Fig. 1.
1.2 Main techniques
1.2.1 Data preparation. The ADS80 imagery collected for this task is Level 0 (L0) data, whereas PixelGrid aerial triangulation and subsequent production require Level 1 (L1) imagery, so an L0-to-L1 conversion is needed. Before converting, confirm that the ".sup" and ".odf" image parameter files of the L0 data exist, and edit the paths of the files referenced inside the ".sup" file so that each points to the actual storage location of the corresponding file. When copying data, it is recommended to copy the "session" file directory as a whole and only change the drive letter; do not move the files inside it, to avoid breaking the file organization.
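The path fix described above (pointing the entries of the copied ".sup" file at the new storage location) is a mechanical find-and-replace. The sketch below assumes the .sup file is a plain-text parameter file containing absolute Windows paths, which should be verified on your own data before use; the file name, drive letters, and encoding are hypothetical.

```python
from pathlib import Path

def retarget_sup(sup_path, old_prefix, new_prefix, encoding="utf-8"):
    """Rewrite the absolute paths referenced in an ADS80 .sup file after the
    'session' directory has been copied to another drive.
    Assumes the .sup file is plain text (check this first); keeps a backup."""
    p = Path(sup_path)
    text = p.read_text(encoding=encoding)
    p.with_suffix(".sup.bak").write_text(text, encoding=encoding)   # backup copy
    p.write_text(text.replace(old_prefix, new_prefix), encoding=encoding)

# Example with hypothetical paths: session folder copied from drive D: to E:
# retarget_sup(r"E:\session\L0\strip01.sup", r"D:\session", r"E:\session")
```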
Image Alignment and Stitching: A Tutorial
Richard Szeliski. Last updated December 10, 2006. Technical Report MSR-TR-2004-92.
This tutorial reviews image alignment and image stitching algorithms. Image alignment algorithms can discover the correspondence relationships among images with varying degrees of overlap. They are ideally suited for applications such as video stabilization, summarization, and the creation of panoramic mosaics. Image stitching algorithms take the alignment estimates produced by such registration algorithms and blend the images in a seamless manner, taking care to deal with potential problems such as blurring or ghosting caused by parallax and scene movement as well as varying image exposures. This tutorial reviews the basic motion models underlying alignment and stitching algorithms, describes effective direct (pixel-based) and feature-based alignment algorithms, and describes blending algorithms used to produce seamless mosaics. It closes with a discussion of open research problems in the area.
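As a concrete illustration of the feature-based alignment step reviewed in the tutorial (not code from the report itself), the following OpenCV sketch estimates a homography between two overlapping images from ORB keypoint matches and warps one onto the other; blending, exposure compensation, and parallax handling are deliberately omitted.

```python
import cv2
import numpy as np

def align_pair(img1, img2):
    """Estimate a homography mapping img2 onto img1 from ORB feature matches."""
    orb = cv2.ORB_create(4000)
    k1, d1 = orb.detectAndCompute(img1, None)
    k2, d2 = orb.detectAndCompute(img2, None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(d2, d1), key=lambda m: m.distance)[:500]
    src = np.float32([k2[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([k1[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    H, inliers = cv2.findHomography(src, dst, cv2.RANSAC, 3.0)   # robust fit
    return H, int(inliers.sum())

def naive_stitch(img1, img2):
    """Warp img2 into img1's frame on a wider canvas (no seam handling)."""
    H, _ = align_pair(img1, img2)
    h, w = img1.shape[:2]
    canvas = cv2.warpPerspective(img2, H, (2 * w, h))
    canvas[:h, :w] = img1           # simple overwrite; blurring/ghosting not addressed
    return canvas

# mosaic = naive_stitch(cv2.imread("left.jpg"), cv2.imread("right.jpg"))
```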
Color Image Segmentation Method Based on HSI Color-Coordinate Similarity
Li Ning, Xu Shucheng, Deng Zhongliang (Beijing University of Posts and Telecommunications, Beijing 100876, China)
Modern Electronics Technique, 2017, 40(2), pp. 30-33, 38. CLC number: TN911.73-34
Abstract: A color image segmentation method based on the HSI color space is presented. Euclidean distance, the usual measure of the color relationship between two pixels in segmentation, does not reflect that relationship well in the HSI coordinate system. The traditional Euclidean distance is therefore abandoned, and a color similarity is proposed as a new measure of the relationship between two pixels. The algorithm determines the dominant component among the H, S and I components, builds a color image segmentation model, creates a color-similarity level map of the same size as the original image, and clusters the pixels using the color information of this similarity map. Experimental results show that the proposed segmentation algorithm is robust and accurate; under otherwise identical conditions, the similarity-based segmentation outperforms segmentation based on Euclidean distance.
Keywords: image segmentation; HSI color space; color similarity; Euclidean distance
Color-based image segmentation algorithms play an important role in computer vision and are widely applied in many fields.
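Since the abstract does not spell out the exact similarity formula, the sketch below only illustrates the general idea: convert RGB to HSI, measure a per-pixel similarity to a seed color that treats hue as circular, and threshold the resulting similarity map. The weighting of the H, S, I terms (`w`) and the thresholds are assumptions, not the paper's model.

```python
import numpy as np

def rgb_to_hsi(rgb):
    """rgb: float array in [0,1], shape (H, W, 3). Returns hue (radians), saturation, intensity."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    i = (r + g + b) / 3.0
    s = 1.0 - np.minimum(np.minimum(r, g), b) / (i + 1e-8)
    num = 0.5 * ((r - g) + (r - b))
    den = np.sqrt((r - g) ** 2 + (r - b) * (g - b)) + 1e-8
    h = np.arccos(np.clip(num / den, -1.0, 1.0))
    h = np.where(b > g, 2.0 * np.pi - h, h)
    return h, s, i

def similarity_map(rgb, seed_yx, w=(0.6, 0.2, 0.2)):
    """Color similarity of every pixel to the seed pixel (1 = identical color)."""
    h, s, i = rgb_to_hsi(rgb)
    hs, ss, is_ = h[seed_yx], s[seed_yx], i[seed_yx]
    dh = np.abs(h - hs)
    dh = np.minimum(dh, 2.0 * np.pi - dh) / np.pi      # circular hue distance in [0,1]
    sim = 1.0 - (w[0] * dh + w[1] * np.abs(s - ss) + w[2] * np.abs(i - is_))
    return np.clip(sim, 0.0, 1.0)

# mask = similarity_map(img.astype(np.float32) / 255.0, seed_yx=(120, 200)) > 0.85
```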
A Label Field Fusion Bayesian Model and Its Penalized Maximum Rand Estimator for Image Segmentation
1610IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 19, NO. 6, JUNE 2010A Label Field Fusion Bayesian Model and Its Penalized Maximum Rand Estimator for Image SegmentationMax MignotteAbstract—This paper presents a novel segmentation approach based on a Markov random field (MRF) fusion model which aims at combining several segmentation results associated with simpler clustering models in order to achieve a more reliable and accurate segmentation result. The proposed fusion model is derived from the recently introduced probabilistic Rand measure for comparing one segmentation result to one or more manual segmentations of the same image. This non-parametric measure allows us to easily derive an appealing fusion model of label fields, easily expressed as a Gibbs distribution, or as a nonstationary MRF model defined on a complete graph. Concretely, this Gibbs energy model encodes the set of binary constraints, in terms of pairs of pixel labels, provided by each segmentation results to be fused. Combined with a prior distribution, this energy-based Gibbs model also allows for definition of an interesting penalized maximum probabilistic rand estimator with which the fusion of simple, quickly estimated, segmentation results appears as an interesting alternative to complex segmentation models existing in the literature. This fusion framework has been successfully applied on the Berkeley image database. The experiments reported in this paper demonstrate that the proposed method is efficient in terms of visual evaluation and quantitative performance measures and performs well compared to the best existing state-of-the-art segmentation methods recently proposed in the literature. Index Terms—Bayesian model, Berkeley image database, color textured image segmentation, energy-based model, label field fusion, Markovian (MRF) model, probabilistic Rand index.I. INTRODUCTIONIMAGE segmentation is a frequent preprocessing step which consists of achieving a compact region-based description of the image scene by decomposing it into spatially coherent regions with similar attributes. This low-level vision task is often the preliminary and also crucial step for many image understanding algorithms and computer vision applications. A number of methods have been proposed and studied in the last decades to solve the difficult problem of textured image segmentation. Among them, we can cite clustering algorithmsManuscript received February 20, 2009; revised February 06, 2010. First published March 11, 2010; current version published May 14, 2010. This work was supported by a NSERC individual research grant. The associate editor coordinating the review of this manuscript and approving it for publication was Prof. Peter C. Doerschuk. The author is with the Département d’Informatique et de Recherche Opérationnelle (DIRO), Université de Montréal, Faculté des Arts et des Sciences, Montréal H3C 3J7 QC, Canada (e-mail: mignotte@iro.umontreal.ca). Color versions of one or more of the figures in this paper are available online at . Digital Object Identifier 10.1109/TIP.2010.2044965[1], spatial-based segmentation methods which exploit the connectivity information between neighboring pixels and have led to Markov Random Field (MRF)-based statistical models [2], mean-shift-based techniques [3], [4], graph-based [5], [6], variational methods [7], [8], or by region-based split and merge procedures, sometimes directly expressed by a global energy function to be optimized [9]. 
Years of research in segmentation have demonstrated that significant improvements on the final segmentation results may be achieved either by using notably more sophisticated feature selection procedures, or more elaborate clustering techniques (sometimes involving a mixture of different or non-Gaussian distributions for the multidimensional texture features [10], [11]) or by taking into account prior distribution on the labels, region process, or the number of classes [9], [12], [13]. In all cases, these improvements lead to computationally expensive segmentation algorithms and, in the case of energy-based segmentation models, to costly optimization techniques. The segmentation approach, proposed in this paper, is conceptually different and explores another strategy initially introduced in [14]. Instead of considering an elaborate and better designed segmentation model of textured natural image, our technique explores the possible alternative of fusing (i.e., efficiently combining) several quickly estimated segmentation maps associated with simpler segmentation models for a final reliable and accurate segmentation result. These initial segmentations to be fused can be given either by different algorithms or by the same algorithm with different values of the internal parameters such as several -means clustering results with different values of , or by several -means results using different distance metrics, and applied on an input image possibly expressed in different color spaces or by other means. The fusion model, presented in this paper, is derived from the recently introduced probabilistic rand index (PRI) [15], [16] which measures the agreement of one segmentation result to multiple (manually generated) ground-truth segmentations. This measure efficiently takes into account the inherent variation existing across hand-labeled possible segmentations. We will show that this non-parametric measure allows us to derive an appealing fusion model of label fields, easily expressed as a Gibbs distribution, or as a nonstationary MRF model defined on a complete graph. Finally, this fusion model emerges as a classical optimization problem in which the Gibbs energy function related to this model has to be minimized. In other words, or analytically expressed in the regularization framework, each quickly estimated segmentation (to be fused) provides a set of constraints in terms of pairs of pixel labels (i.e., binary cliques) that should be equal or not. Finally, our fusion result is found1057-7149/$26.00 © 2010 IEEEMIGNOTTE: LABEL FIELD FUSION BAYESIAN MODEL AND ITS PENALIZED MAXIMUM RAND ESTIMATOR FOR IMAGE SEGMENTATION1611by searching for a segmentation map that minimizes an energy function encoding this precomputed set of binary constraints (thus optimizing the so-called PRI criterion). In our application, this final optimization task is performed by a robust multiresolution coarse-to-fine minimization strategy. This fusion of simple, quickly estimated segmentation results appears as an interesting alternative to complex, computationally demanding segmentation models existing in the literature. This new strategy of segmentation is validated in the Berkeley natural image database (also containing, for quantitative evaluations, ground truth segmentations obtained from human subjects). Conceptually, our fusion strategy is in the framework of the so-called decision fusion approaches recently proposed in clustering or imagery [17]–[21]. 
With these methods, a series of energy functions are first minimized before their outputs (i.e., their decisions) are merged. Following this strategy, Fred et al. [17] have explored the idea of evidence accumulation for combining the results of multiple clusterings. Reed et al. have proposed a Gibbs energy-based fusion model that differs from ours in the likelihood and prior energy design, as final merging procedure (for the fusion of large scale classified sonar image [21]). More precisely, Reed et al. employed a voting scheme-based likelihood regularized by an isotropic Markov random field priorly used to inpaint regions where the likelihood decision is not available. More generally, the concept of combining classifiers for the improvement of the performance of individual classifiers is known, in machine learning field, as a committee machine or mixture of experts [22], [23]. In this context, Dietterich [23] have provided an accessible and informal reasoning, from statistical, computational and representational viewpoints, of why ensembles can improve results. In this recent field of research, two major categories of committee machines are generally found in the literature. Our fusion decision approach is in the category of the committee machine model that utilizes an ensemble of classifiers with a static structure type. In this class of committee machines, the responses of several classifiers are combined by means of a mechanism that does not involve the input data (contrary to the dynamic structure type-based mixture of experts). In order to create an efficient ensemble of classifiers, three major categories of methods have been suggested whose goal is to promote diversity in order to increase efficiency of the final classification result. This can be done either by using different subsets of the input data, either by using a great diversity of the behavior between classifiers on the input data or finally by using the diversity of the behavior of the input data. Conceptually, our ensemble of classifiers is in this third category, since we intend to express the input data in different color spaces, thus encouraging diversity and different properties such as data decorrelation, decoupling effects, perceptually uniform metrics, compaction and invariance to various features, etc. In this framework, the combination itself can be performed according to several strategies or criteria (e.g., weighted majority vote, probability rules: sum, product, mean, median, classifier as combiner, etc.) but, none (to our knowledge) uses the PRI fusion (PRIF) criterion. Our segmentation strategy, based on the fusion of quickly estimated segmentation maps, is similar to the one proposed in [14] but the criterion which is now used in this new fusion model is different. In [14], the fusion strategy can be viewed as a two-stephierarchical segmentation procedure in which the first step remains identical and a set of initial input texton segmentation maps (in each color space) is estimated. Second, a final clustering, taking into account this mixture of textons (expressed in the set of different color space) is then used as a discriminant feature descriptor for a final -mean clustering whose output is the final fused segmentation map. Contrary to the fusion model presented in this paper, this second step (fusion of texton segmentation maps) is thus achieved in the intra-class inertia sense which is also the so-called squared-error criterion of the -mean algorithm. 
Let us add that a conceptually different label field fusion model has been also recently introduced in [24] with the goal of blending a spatial segmentation (region map) and a quickly estimated and to-be-refined application field (e.g., motion estimation/segmentation field, occlusion map, etc.). The goal of the fusion procedure explained in [24] is to locally fuse label fields involving labels of two different natures at different level of abstraction (i.e., pixel-wise and region-wise). More precisely, its goal is to iteratively modify the application field to make its regions fit the color regions of the spatial segmentation with the assumption that the color segmentation is more detailed than the regions of the application field. In this way, misclassified pixels in the application field (false positives and false negatives) are filtered out and blobby shapes are sharpened, resulting in a more accurate final application label field. The remainder of this paper is organized as follows. Section II describes the proposed Bayesian fusion model. Section III describes the optimization strategy used to minimize the Gibbs energy field related to this model and Section IV describes the segmentation model whose outputs will be fused by our model. Finally, Section V presents a set of experimental results and comparisons with existing segmentation techniques.II. PROPOSED FUSION MODEL A. Rand Index The Rand index [25] is a clustering quality metric that measures the agreement of the clustering result with a given ground truth. This non-parametric statistical measure was recently used in image segmentation [16] as a quantitative and perceptually interesting measure to compare automatic segmentation of an image to a ground truth segmentation (e.g., a manually hand-segmented image given by an expert) and/or to objectively evaluate the efficiency of several unsupervised segmentation methods. be the number of pixels assigned to the same region Let (i.e., matched pairs) in both the segmentation to be evaluated and the ground truth segmentation , and be the number of pairs of pixels assigned to different regions (i.e., misand . The Rand index is defined as matched pairs) in to the total number of pixel pairs, i.e., the ratio of for an image of size pixels. More formally [16], and designate the set of region labels respecif tively associated to the segmentation maps and at pixel location and where is an indicator function, the Rand index1612IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 19, NO. 6, JUNE 2010is given by the following relation:given by the empirical proportion (3) where is the delta Kronecker function. In this way, the PRI measure is simply the mean of the Rand index computed between each [16]. As a consequence, the PRI pair measure will favor (i.e., give a high score to) a resulting acceptable segmentation map which is consistent with most of the segmentation results given by human experts. More precisely, the resulting segmentation could result in a compromise or a consensus, in terms of level of details and contour accuracy exhibited by each ground-truth segmentations. Fig. 8 gives a fusion map example, using a set of manually generated segmentations exhibiting a high variation, in terms of level of details. Let us add that this probabilistic metric is not degenerate; all the bad segmentations will give a low score without exception [16]. C. 
Generative Gibbs Distribution Model of Correct Segmentations (i.e., the pairwise empirical As indicated in [15], the set ) defines probabilities for each pixel pair computed over an appealing generative model of correct segmentation for the image, easily expressed as a Gibbs distribution. In this way, the Gibbs distribution, generative model of correct segmentation, which can also be considered as a likelihood of , in the PRI sense, may be expressed as(1) which simply computes the proportion (value ranging from 0 to 1) of pairs of pixels with compatible region label relationships between the two segmentations to be compared. A value of 1 indicates that the two segmentations are identical and a value of 0 indicates that the two segmentations do not agree on any pair of points (e.g., when all the pixels are gathered in a single region in one segmentation whereas the other segmentation assigns each pixel to an individual region). When the number of and are much smaller than the number of data labels in points , a computationally inexpensive estimator of the Rand index can be found in [16]. B. Probabilistic Rand Index (PRI) The PRI was recently introduced by Unnikrishnan [16] to take into accounttheinherentvariabilityofpossible interpretationsbetween human observers of an image, i.e., the multiple acceptable ground truth segmentations associated with each natural image. This variability between observers, recently highlighted by the Berkeley segmentation dataset [26] is due to the fact that each human chooses to segment an image at different levels of detail. This variability is also due image segmentation being an ill-posed problem, which exhibits multiple solutions for the different possible values of the number of classes not known a priori. Hence, in the absence of a unique ground-truth segmentation, the clustering quality measure has to quantify the agreement of an automatic segmentation (i.e., given by an algorithm) with the variation in a set of available manual segmentations representing, in fact, a very small sample of the set of all possible perceptually consistent interpretations of an image [15]. The authors [16] address this concern by soft nonuniform weighting of pixel pairs as a means of accounting for this variability in the ground truth set. More formally, let us consider a set of manually segmented (ground truth) images corresponding to an be the segmentation to be compared image of size . Let with the manually labeled set and designates the set of reat pixel gion labels associated with the segmentation maps location , the probabilistic RI is defined bywhere is the set of second order cliques or binary cliques of a Markov random field (MRF) model defined on a complete graph (each node or pixel is connected to all other pixels of is the temperature factor of the image) and this Boltzmann–Gibbs distribution which is twice less than the normalization factor of the Rand Index in (1) or (2) since there than pairs of pixels for which are twice more binary cliques . is the constant partition function. After simplification, this yields(2) where a good choice for the estimator of (the probability of the pixel and having the same label across ) is simply (4)MIGNOTTE: LABEL FIELD FUSION BAYESIAN MODEL AND ITS PENALIZED MAXIMUM RAND ESTIMATOR FOR IMAGE SEGMENTATION1613where is a constant partition function (with a factor which depends only on the data), namelywhere is the set of all possible (configurations for the) segof size pixels. 
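The garbled relations (1)-(3) above read, in clean form: the Rand index R(S, S_gt) = (1/C(N,2)) * sum over pixel pairs i<j of [ I(l_i = l_j and l'_i = l'_j) + I(l_i != l_j and l'_i != l'_j) ]; the PRI against a set of ground truths, PRI(S, {S_k}) = (1/C(N,2)) * sum_{i<j} [ c_ij * p_ij + (1 - c_ij)(1 - p_ij) ] with c_ij = I(l_i = l_j); and the empirical estimate p_ij = (1/L) * sum_k I(l_i^(k) = l_j^(k)). The snippet below evaluates these on a random sample of pixel pairs (the exact sums are O(N^2)); it is an illustrative reconstruction, not the paper's code.

```python
import numpy as np

def sample_pairs(n_pixels, n_pairs, rng):
    """Random distinct pixel-index pairs (the exact sums run over all C(N,2) pairs)."""
    i = rng.integers(0, n_pixels, size=n_pairs)
    j = rng.integers(0, n_pixels, size=n_pairs)
    keep = i != j
    return i[keep], j[keep]

def rand_index(seg, gt, n_pairs=200000, seed=0):
    """Monte-Carlo estimate of the Rand index between two label maps."""
    a, b = seg.ravel(), gt.ravel()
    i, j = sample_pairs(a.size, n_pairs, np.random.default_rng(seed))
    same_a, same_b = a[i] == a[j], b[i] == b[j]
    return float(np.mean(same_a == same_b))

def probabilistic_rand_index(seg, ground_truths, n_pairs=200000, seed=0):
    """PRI of `seg` against several ground-truth segmentations; p_ij is the
    empirical proportion of ground truths giving pixels i and j the same label."""
    s = seg.ravel()
    i, j = sample_pairs(s.size, n_pairs, np.random.default_rng(seed))
    c = (s[i] == s[j]).astype(np.float64)
    p = np.mean([(g.ravel()[i] == g.ravel()[j]) for g in ground_truths], axis=0)
    return float(np.mean(c * p + (1.0 - c) * (1.0 - p)))
```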
Let us add mentations into regions that, since the number of classes (and thus the number of regions) of this final segmentation is not a priori known, there are possibly, between one and as much as regions that the number of pixels in this image (assigning each pixel to an individual can region is a possible configuration). In this setting, be viewed as the potential of spatially variant binary cliques (or pairwise interaction potentials) of an equivalent nonstationary MRF generative model of correct segmentations in the case is assumed to be a set of representative ground where truth segmentations. Besides, , the segmentation result (to be ), can be considered as a realization of this compared to generative model with PRand, a statistical measure proportional to its negative likelihood energy. In other words, an estimate of , in the maximum likelihood sense of this generative model, will give a resulting segmented map (i.e., a fusion result) with a to be fused. high fidelity to the set of segmentations D. Label Field Fusion Model for Image Segmentation Let us consider that we have at our disposal, a set of segmentations associated to an image of size to be fused (i.e., to efficiently combine) in order to obtain a final reliable and accurate segmentation result. The generative Gibbs distribution model of correct segmentations expressed in (4) gives us an interesting fusion model of segmentation maps, in the maximum PRI sense, or equivalently in the maximum likelihood (ML) sense for the underlying Gibbs model expressed in (4). In this framework, the set of is computed with the empirical proportion estimator [see (3)] on the data . Once has been estimated, the resulting ML fusion segmentation map is thus defined by maximizing the likelihood distributiontions for different possible values of the number of classes which is not a priori known. To render this problem well-posed with a unique solution, some constraints on the segmentation process are necessary, favoring over segmentation or, on the contrary, merging regions. From the probabilistic viewpoint, these regularization constraints can be expressed by a prior distribution of treated as a realization of the unknown segmentation a random field, for example, within a MRF framework [2], [27] or analytically, encoded via a local or global [13], [28] prior energy term added to the likelihood term. In this framework, we consider an energy function that sets a particular global constraint on the fusion process. This term restricts the number of regions (and indirectly, also penalizes small regions) in the resulting segmentation map. So we consider the energy function (6) where designates the number of regions (set of connected pixels belonging to the same class) in the segmented is the Heaviside (or unit step) function, and an image , internal parameter of our fusion model which physically represents the number of classes above which this prior constraint, limiting the number of regions, is taken into account. From the probabilistic viewpoint, this regularization constraint corresponds to a simple shifted (from ) exponential distribution decreasing with the number of regions displayed by the final segmentation. 
In this framework, a regularized solution corresponds to the maximum a posteriori (MAP) solution of our fusion model, i.e., that maximizes the posterior distribution the solution , and thus(7) with is the regularization parameter controlling the contribuexpressing fidelity to the set of segtion of the two terms; encoding our prior knowledge or mentations to be fused and beliefs concerning the types of acceptable final segmentations as estimates (segmentation with a number of limited regions). In this way, the resulting criteria used in this resulting fusion model can be viewed as a penalized maximum rand estimator. III. COARSE-TO-FINE OPTIMIZATION STRATEGY A. Multiresolution Minimization Strategy Our fusion procedure of several label fields emerges as an optimization problem of a complex non-convex cost function with several local extrema over the label parameter space. In order to find a particular configuration of , that efficiently minimizes this complex energy function, we can use a global optimization procedure such as a simulated annealing algorithm [27] whose advantages are twofold. First, it has the capability of avoiding local minima, and second, it does not require a good solution. initial guess in order to estimate the(5) where is the likelihood energy term of our generative fusion . model which has to be minimized in order to find Concretely, encodes the set of constraints, in terms of pairs of pixel labels (identical or not), provided by each of the segmentations to be fused. The minimization of finds the resulting segmentation which also optimizes the PRI criterion. E. Bayesian Fusion Model for Image Segmentation As previously described in Section II-B, the image segmentation problem is an ill-posed problem exhibiting multiple solu-1614IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 19, NO. 6, JUNE 2010Fig. 1. Duplication and “coarse-to-fine” minimization strategy.An alternative approach to this stochastic and computationally expensive procedure is the iterative conditional modes (ICM) introduced by Besag [2]. This method is deterministic and simple, but has the disadvantage of requiring a proper initialization of the segmentation map close to the optimal solution. Otherwise it will converge towards a bad local minima . In order associated with our complex energy function to solve this problem, we could take, as initialization (first such as iteration), the segmentation map (8) i.e., in choosing for the first iteration of the ICM procedure amongst the segmentation to be fused, the one closest to the optimal solution of the Gibbs energy function of our fusion model [see (5)]. A more robust optimization method consists of a multiresolution approach combined with the classical ICM optimization procedure. In this strategy, rather than considering the minimization problem on the full and original configuration space, the original inverse problem is decomposed in a sequence of approximated optimization problems of reduced complexity. This drastically reduces computational effort and provides an accelerated convergence toward improved estimate. Experimentally, estimation results are nearly comparable to those obtained by stochastic optimization procedures as noticed, for example, in [10] and [29]. To this end, a multiresolution pyramid of segmentation maps is preliminarily derived, in order to for each at different resolution levels, and a set estimate a set of of similar spatial models is considered for each resolution level of the pyramidal data structure. 
At the upper level of the pyramidal structure (lower resolution level), the ICM optimization procedure is initialized with the segmentation map given by the procedure defined in (8). It may also be initialized by a random solution and, starting from this initial segmentation, it iterates until convergence. After convergence, the result obtained at this resolution level is interpolated (see Fig. 1) and then used as initialization for the next finer level and so on, until the full resolution level. B. Optimization of the Full Energy Function Experiments have shown that the full energy function of our model, (with the region based-global regularization constraint) is complex for some images. Consequently it is preferable toFig. 2. From top to bottom and left to right; A natural image from the Berkeley database (no. 134052) and the formation of its region process (algorithm PRIF ) at the (l = 3) upper level of the pyramidal structure at iteration [0–6], 8 (the last iteration) of the ICM optimization algorithm. Duplication and result of the ICM relaxation scheme at the finest level of the pyramid at iteration 0, 1, 18 (last iteration) and segmentation result (region level) after the merging of regions and the taking into account of the prior. Bottom: evolution of the Gibbs energy for the different steps of the multiresolution scheme.perform the minimization in two steps. In a first step, the minimization is performed without considering the global constraint (considering only ), with the previously mentioned multiresolution minimization strategy and the ICM optimization procedure until its convergence at full resolution level. At this finest resolution level, the minimization is then refined in a second step by identifying each region of the resulting segmentation map. This creates a region adjacency graph (a RAG is an undirected graph where the nodes represent connected regions of the image domain) and performs a region merging procedure by simply applying the ICM relaxation scheme on each region (i.e., by merging the couple of adjacent regions leading to a reduction of the cost function of the full model [see (7)] until convergence). In the second step, minimization can also be performed . according to the full modelMIGNOTTE: LABEL FIELD FUSION BAYESIAN MODEL AND ITS PENALIZED MAXIMUM RAND ESTIMATOR FOR IMAGE SEGMENTATION1615with its four nearest neighbors and a fixed number of connections (85 in our application), regularly spaced between all other pixels located within a square search window of fixed size 30 pixels centered around . Fig. 3 shows comparison of segmentation results with a fully connected graph computed on a search window two times larger. We decided to initialize the lower (or third upper) level of the pyramid with a sequence of 20 different random segmentations with classes. The full resolution level is then initialized with the duplication (see Fig. 1) of the best segmentation result (i.e., the one associated to the lowest Gibbs energy ) obtained after convergence of the ICM at this lower resolution level (see Fig. 2). We provide details of our optimization strategy in Algorithm 1. Algo I. Multiresolution minimization procedure (see also Fig. 2). Two-Step Multiresolution Minimization Set of segmentations to be fusedPairwise probabilities for each pixel pair computed over at resolution level 1. 
Initialization Step • Build multiresolution Pyramids from • Compute the pairwise probabilities from at resolution level 3 • Compute the pairwise probabilities from at full resolution PIXEL LEVEL Initialization: Random initialization of the upper level of the pyramidal structure with classes • ICM optimization on • Duplication (cf. Fig 1) to the full resolution • ICM optimization on REGION LEVEL for each region at the finest level do • ICM optimization onFig. 4. Segmentation (image no. 385028 from Berkeley database). From top to bottom and left to right; segmentation map respectively obtained by 1] our multiresolution optimization procedure: = 3402965 (algo), 2] SA : = 3206127, 3] rithm PRIF : = 3312794, 4] SA : = 3395572, 5] SA : = 3402162. SAFig. 3. Comparison of two segmentation results of our multiresolution fusion procedure (algorithm PRIF ) using respectively: left] a subsampled and fixed number of connections (85) regularly spaced and located within a square search window of size = 30 pixels. right] a fully connected graph computed on a search window two times larger (and requiring a computational load increased by 100).NUU 0 U 00 U 0 U 0D. Comparison With a Monoresolution Stochastic Relaxation In order to test the efficiency of our two-step multiresolution relaxation (MR) strategy, we have compared it to a standard monoresolution stochastic relaxation algorithm, i.e., a so-called simulated annealing (SA) algorithm based on the Gibbs sampler [27]. In order to restrict the number of iterations to be finite, we have implemented a geometric temperature cooling schedule , where is the [30] of the form starting temperature, is the final temperature, and is the maximal number of iterations. In this stochastic procedure, is crucial. The temperathe choice of the initial temperature ture must be sufficiently high in the first stages of simulatedC. Algorithm In order to decrease the computational load of our multiresolution fusion procedure, we only use two levels of resolution in our pyramidal structure (see Fig. 2): the full resolution and an image eight times smaller (i.e., at the third upper level of classical data pyramidal structure). We do not consider a complete graph: we consider that each node (or pixel) is connected。
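To make the two-step procedure sketched in Algorithm I more concrete, here is a minimal, single-resolution ICM pass over an energy of the form U(S) = sum over precomputed pairs <i,j> of [ p_ij * (1 - d(s_i, s_j)) + (1 - p_ij) * d(s_i, s_j) ], where d is the Kronecker agreement indicator; minimizing it maximizes the sampled PRI. The multiresolution pyramid, the region-level merging step, and the region-count prior of (6)-(7) are all omitted, so this is an illustration of the optimization idea rather than the PRIF algorithm.

```python
import numpy as np
from collections import defaultdict

def fusion_energy(labels_flat, pairs, p):
    """PRI-likelihood energy over precomputed pixel pairs.
    pairs: (M, 2) flat pixel indices; p: (M,) empirical co-label probabilities."""
    same = labels_flat[pairs[:, 0]] == labels_flat[pairs[:, 1]]
    return float(np.sum(np.where(same, 1.0 - p, p)))

def icm_fuse(init_labels, pairs, p, n_classes, sweeps=5):
    """Greedy ICM: each pixel in turn takes the label that lowers its local energy."""
    labels = init_labels.ravel().copy()
    touching = defaultdict(list)                 # pair indices incident to each pixel
    for m, (i, j) in enumerate(pairs):
        touching[i].append(m)
        touching[j].append(m)
    for _ in range(sweeps):
        for px, ms in touching.items():
            costs = np.zeros(n_classes)
            for m in ms:
                i, j = pairs[m]
                other = labels[j] if i == px else labels[i]
                # cost of each candidate label against the neighbour's current label
                costs += np.where(np.arange(n_classes) == other, 1.0 - p[m], p[m])
            labels[px] = int(np.argmin(costs))
    return labels.reshape(init_labels.shape)
```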
New staff orientation procedure (new-employee orientation procedure of a well-known injection molding company)
HI-P INTERNATIONAL LIMITED 赫比国际有限公司
Title 文件名: New Employee Orientation Procedure 新员工入职培训管理程序
Rev 版本号: 2.0
Released Date 发布日期: 11 Jan, 2006
4.1.2.2 To manage and update the training record 培训记录管理
4.2 Responsibilities of the Immediate Supervisor 新员工的直接主管职责
4.2.1 To brief the new employee on the JD, KPI and to highlight the essential knowledge and skills required for the job. 与新员工沟通 JD,KPI,并定义与新员工工作相关的基本知识和技能
作环境,掌握与工作相关的基本知识和技能,从而更好地投入到新的工作中,特制定本程序。
2 SCOPE 适用范围
2.1 This policy is applicable to all Hi-P Group of companies. 所有赫比公司。
3 DEFINITIONS 定义
3.1 The Welcome Package shall be hereafter known as “WP”. 《新员工 Welcome Package》以下将简称为“WP”
5.1.2.2The contents of the new employee orientation programme, shall include but not limited to: Ethics, Company Introduction, Hi-P Culture, Tracker (for Level 3 Employees), company policy and procedure, EHSS, Basic Knowledge on Quality, etc. Training on Hi-P Culture shall be conducted on the first day and Tracker shall be conducted within a month of the new employee’s reporting date. 新员工入职培训内容应包括但不局限于:道德规范,公司简介、企业文化、 Tracker(Level 3 员工)、公司规章制度、EHSS、基本质量意识等。其中企 业文化在员工入职第一天举行, Tracker 在入职一个月内举行。
The ZπM algorithm for interferometric image reconstruction in SAR/SAS
José M. B. Dias and José M. N. Leitão
Abstract The paper presents an effective algorithm for absolute phase (not simply modulo-2π ) estimation from incomplete, noisy, and modulo-2π observations in interferometric aperture radar and sonar (InSAR/InSAS). The adopted framework is also representative of other applications such as optical interferometry, magnetic resonance imaging, and diffraction tomography. The Bayesian viewpoint is adopted; the observation density is 2π -periodic and accounts for the interferometric pair decorrelation and system noise; the a priori probability of the absolute phase is modelled by a Compound Gauss Markov random field (CGMRF) tailored to piecewise smooth absolute phase images. We propose an iterative scheme for the computation of the maximum a posteriori probability (MAP) phase estimate. Each iteration embodies a discrete optimization step (Z-step), implemented by network programming techniques, and an iterative conditional modes (ICM) step (π -step). Accordingly, the algorithm is termed Zπ M, where the letter M stands for maximization. A set of experimental results, comparing the proposed algorithm with alternative approaches, illustrates the effectiveness of the proposed method.
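A quick numerical illustration of what "modulo-2π observations" means (this is not the ZπM algorithm, which performs MAP estimation with a network-programming Z-step and an ICM π-step): a smooth absolute phase is only observed wrapped into (-π, π], and even noiseless 1-D data already need an unwrapping step to recover it.

```python
import numpy as np

# A smooth 1-D "absolute phase" ramp spanning several multiples of 2*pi
x = np.linspace(0.0, 1.0, 500)
phi = 18.0 * x ** 2                      # true absolute phase (radians)

# Interferometric sensors only deliver the principal value in (-pi, pi]
phi_wrapped = np.angle(np.exp(1j * phi))

# In 1-D without noise, integrating the wrapped differences recovers phi
phi_unwrapped = np.unwrap(phi_wrapped)

print(np.max(np.abs(phi_unwrapped - phi)))   # ~1e-12: exact up to rounding
```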
Research on Visual Inspection Algorithms for Defects of Textured Objects -- Outstanding Graduation Thesis
Abstract
In highly competitive industrial automation, machine vision plays a decisive role in guarding product quality, and its application to defect inspection has become increasingly common. Compared with conventional inspection techniques, automated visual inspection systems are more economical, faster, more efficient, and safer. Textured objects are ubiquitous in industrial production: substrates used for semiconductor assembly and packaging, light-emitting diodes, printed circuit boards in modern electronic systems, and cloth and fabrics in the textile industry can all be regarded as objects with texture. This thesis is devoted to defect inspection techniques for textured objects, aiming to provide efficient and reliable algorithms for their automated inspection. Texture is an important feature for describing image content, and texture analysis has been applied successfully to texture segmentation and classification. This work proposes a defect inspection algorithm based on texture analysis and reference comparison. The algorithm tolerates image registration errors caused by object deformation and is robust to the influence of texture. It is designed to give the detected defect regions rich and meaningful physical attributes, such as size, shape, brightness contrast, and spatial distribution. When a reference image is available, the algorithm can inspect both homogeneously and non-homogeneously textured objects, and it also performs well on non-textured objects. Throughout the inspection process we use steerable-pyramid texture analysis and reconstruction. Unlike traditional wavelet texture analysis, a tolerance-control step that handles object deformation and texture influence is added in the wavelet domain, so that the method tolerates object deformation and is robust to texture. The final steerable-pyramid reconstruction guarantees that the physical attributes of the defect regions are recovered accurately. In the experimental stage, a series of images of practical value was tested; the results show that the proposed defect inspection algorithm for textured objects is efficient and easy to implement.
Keywords: defect detection, texture, object distortion, steerable pyramid, reconstruction
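As a toy illustration of the reference-comparison idea described in the abstract (not the thesis's steerable-pyramid method): compare a test image against a defect-free reference, suppress fine texture with Gaussian smoothing so small registration errors do not fire, and threshold the residual to obtain candidate defect regions together with their size and contrast. The smoothing scale and threshold factor below are arbitrary assumptions.

```python
import cv2
import numpy as np

def detect_defects(test_gray, ref_gray, sigma=3.0, k=4.0):
    """Reference-comparison defect detection (simplified stand-in for the
    steerable-pyramid method described in the thesis).
    Returns a binary defect mask plus (area, mean contrast) per region."""
    t = cv2.GaussianBlur(test_gray.astype(np.float32), (0, 0), sigma)
    r = cv2.GaussianBlur(ref_gray.astype(np.float32), (0, 0), sigma)
    resid = np.abs(t - r)
    thresh = resid.mean() + k * resid.std()           # adaptive threshold
    mask = (resid > thresh).astype(np.uint8)
    n, labels, stats, _ = cv2.connectedComponentsWithStats(mask, connectivity=8)
    regions = [(int(stats[i, cv2.CC_STAT_AREA]), float(resid[labels == i].mean()))
               for i in range(1, n)]                   # skip background label 0
    return mask, regions
```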
3D Model Matching with Viewpoint-Invariant Patches (VIP)-CVPR2008
3D Model Matching with Viewpoint-Invariant Patches(VIP)Changchang Wu,1Brian Clipp1,Xiaowei Li1,Jan-Michael Frahm1and Marc Pollefeys1,2 1Department of Computer Science2Department of Computer Science The University of North Carolina at Chapel Hill,NC,USA ETH Zurich,Switzerland {ccwu,bclipp,xwli,jmf}@ marc.pollefeys@inf.ethz.chAbstractThe robust alignment of images and scenes seen from widely different viewpoints is an important challenge for camera and scene reconstruction.This paper introduces a novel class of viewpoint independent local features for robust registration and novel algorithms to use the rich in-formation of the new features for3D scene alignment and large scale scene reconstruction.The key point of our ap-proach consists of leveraging local shape information for the extraction of an invariant feature descriptor.The ad-vantages of the novel viewpoint invariant patch(VIP)are: that the novel features are invariant to3D camera motion and that a single VIP correspondence uniquely defines the 3D similarity transformation between two scenes.In the pa-per we demonstrate how to use the properties of the VIPs in an efficient matching scheme for3D scene alignment. The algorithm is based on a hierarchical matching method which tests the components of the similarity transforma-tion sequentially to allow efficient matching and3D scene alignment.We evaluate the novel features on real data with known ground truth information and show that the features can be used to reconstruct large scale urban scenes.1.IntroductionIn recent years,there have been significant research efforts in fast,large-scale3D scene reconstructionfrom video.Recent systems show real time performance[14]. Large scale reconstruction from only video is a differential technique which can accumulate error over many frames. 
To avoid accumulated errors,the reconstruction system must recognize previously reconstructed scene parts and de-termine the similarity transformation between the current and previous reconstructions.This similarity transforma-tion is equivalent to the accumulated drift.Our novel feature can be used to establish these recognition based links.Tra-ditionally image-based matching is used to provide the loop closing constraints to bundle adjustment.The irregularityFigure1.Two corresponding VIPs.The green and grey view frus-tums are original camera poses.Red view frustums are viewpoint normalized cameras.Lower left and right show patches in the orig-inal images while center patches are the ortho-textures for the fea-ture rotationally aligned to the dominant gradient direction.of3D structure makes matching3D models using only tex-ture information difficult or impossible for large changes in viewing direction.In urban modeling for example,a video’s path often crosses at intersections where the viewing direc-tion differs by about90◦.We propose the novel viewpoint invariant patch(VIP) which provides the necessary properties to determine the similarity transformation between two3D scenes even un-der significant viewpoint changes.VIPs are extracted from images using their known local geometry,and the detection is performed in a rectified image space to achieve robust-ness to projective distortion while maintaining full knowl-edge of texture.This is an essential advantage over invariant mappings.For example,our method is able to distinguish between squares and rectangles which are indistinguishable using affine invariant methods.As less of the local texture variation is sacrificed to achieve invariance,more is left for discrimination.In our method image textures are rectified with respect to the local geometry of the scene.The rectified texture can be seen as an ortho-texture1of the3D model which is view-1Ortho-texture:Representation of the texture that is projected on the1point independent.Thisfirst rectification step is essential to our new concept because rectification using the local geom-etry delivers robustness to changes of viewpoint.We then determine the salient feature points of the ortho-textures and extract the feature description.In this paper we use the well known SIFT-features and their associated descriptor[10]as interest points.The3D models are then transformed to a set of VIPs,made up of the feature’s3D position,patch scale, surface normal,local gradient orientation in the patch plane, and a SIFT descriptor.The rich information in VIP features makes them particularly suited to3D similarity transforma-tion estimation.One VIP correspondence is sufficient to compute a full similarity transformation between two mod-els by comparing the3D positions of the features,their nor-mals,orientations in the ortho-texture and patch scales.The scale and rotation components of the VIP correspondence are consistent with the relative scale and rotation between the two3D-models.Moreover,each putative correspon-dence can be tested separately facilitating efficient,robust feature matching.These advantages lead to a Hierarchical Efficient Hypothesis Testing(HEHT)scheme which deliv-ers a transformation,by which3D textured models can be stitched automatically.The remainder of the paper is organized as follows:Re-lated work is discussed in Section2.Section3introduces the viewpoint-invariant patch and discusses its properties. 
An efficient VIP detector for urban scenes is discussed in Section4.Section5describes our novel hierarchical match-ing scheme.The novel algorithms are evaluated and com-pared to existing state of the art features in Section6.2.Related WorkMany texture based feature detectors and descriptors have been developed for robust wide-baseline matching. One of the most popular is Lowe’s SIFT keypoints[10].The SIFT detector defines a feature’s scale in scale space and a feature orientation from the edge map in the image plane. Using the orientation,the SIFT detector generates normal-ized image patches to achieve2D similarity transformation invariance.Many feature detectors,including affine covari-ant features,use the SIFT descriptor to represent patches. We also use the SIFT-descriptor to encode the VIP.How-ever,our approach can also be applied with other feature descriptors.Affine covariant feature go beyond only achiev-ing invariance to affine transformations.Mikolajczyk et al. give a comparison of several such features in[13].Our proposed feature detection and description method goes be-yond affine invariance to robustness to projective transfor-mations.Critically,our features are not invariant to pro-jective transformations but they are stable under projective transformations.Whereas affine invariant approaches can surface with orthogonal projection.not distinguish between a square and a rectangle,our fea-ture representation is able to distinguish between the two. Our representation has fewer intrinsic ambiguities which improves matching performance.Recent advances in Structure from Motion(SfM)and ac-tive sensors have generated increased interest in the align-ment of3D models.In[2]Fitzgibbon and Zisserman pro-posed a hierarchical SfM method to align local3D scene models from connective triplets.The technique exploits3D correspondences from common2D tracks in consecutive triplets to compute the similarity transformation that aligns the features.Their technique works well for the small view-point changes between triplets typically observed in video.Snavely et al.proposed a framework for the registration of photo collections downloaded from the internet in[16]. 
Their framework also uses SIFT features to automatically extract wide baseline salient feature correspondences from photo collections.Robust matching and bundle adjustment are used to determine camera positions.The method relies on a reasonably dense set of viewpoints.Finally,the cam-eras and images are used to provide an image alignment of the scene.These methods are based only on texture.Goe-sele et al.[3]introduced a method to use the camera regis-tration and3D feature points from[16]to compute the scene geometry.This geometry is bootstrapped from small pla-nar patches and then grown into a larger model.Our novel features could use the small local patches to improve the feature matching for the global registration of local camera clusters.Other approaches are based entirely on geometry and ignore texture information.Iterative closest point(ICP) based methods can be used to compute the alignment by iteratively minimizing the sum of distances between closest points.However,ICP requires an initial approximate scene alignment and local reconstructions which are more accu-rate than are typically available.Another purely geometric approach is to align3D mod-els with the extracted geometric entities called spin im-ages[5].Stamos and Leordeanu used mainly planar regions and3D lines on them to do3D scene alignment[17].The approach uses a pair of matched infinite lines on the two local3D geometries to extract the in-plane rotation of the lines on the planar patches.The translation between the models was computed by estimating it as the vector that connects the mid-points of the matching lines.In general, two pairs of matched3D lines give a unique solution and so can be used efficiently in a RANSAC scheme.There are also methods based on both texture and ge-ometry.Liu et al.in[9]extended their work to align3D points from SfM to range data.Theyfirst register several images independently to range data by matching vanishing points.Then the registered images are used as common points between the range data and a model from SfM.Inthefinal step a robust alignment is computed by minimiz-ing the distance between the range data and the geometry obtained from SfM.After the alignment,photorealistic tex-ture is mapped to3D surface models.An extension of the approach is discussed in[8].King et al.[6]align laser range scans with texture images byfirst matching SIFT keypoints extracted directly from texture images and backprojecting those keypoints onto the range measurements.A single backrojected keypoint correspondence defines the transfor-mation between two models.A region growing variant of ICP is used to refine the model alignment while detecting outlier correspondences.In[24],Zhao and Nist´e r proposed a technique to align 3D point clouds from SfM and3D sensors.They start the method by registering two images,fixing a rough transfor-mation,and use ICP for alignment.ICP is effective because of the precision of3D laser range.Vanden Wyngaerd et al. 
proposed a method to stitch partially reconstructed3D mod-els.In[23],they extract and match bitangent curve pairs from images using their invariant characteristics.Aligning these curves gives an initialization for more precise meth-ods such as ICP.In an extension of this work[21],they use the symmetric characteristics of surface patches to achieve greater matching accuracy.In[22],texture and shape in-formation guide each other while looking for better regions to match.Additionally,Rothanger et.al.[15]proposed a matching technique whichfinds matches between affine in-variant regions and then verifies the matches based on their normal directions.Concurrent with this research Koeser and Koch[7]de-veloped a very similar approach to ours.The main differ-ence between our approaches is that they extract MSER in the original images,backproject these regions onto a depthmap and then extract normalized images using cam-eras with optical axis parallel to the surface normal.They too use SIFT descriptors as theirfinal invariant patch de-scriptor.Wefind keypoints directly in textures from ortho-graphic virtual cameras with viewing direction parallel to the surface normals.3.Viewpoint-Invariant Patch(VIP)In this section we describe our novel features in detail. Viewpoint-Invariant Patches(VIPs)are features that can be extracted from textured3D models which combine images with corresponding depth maps.VIPs are invariant to3D similarity transformations.They can be used to robustly and efficiently align3D models of the same scene from video taken from significantly different viewpoints.In this pa-per we’ll mostly consider3D models obtained from video by SfM,but our method is equally applicable to textured3D models obtained using LIDAR or other sensors.Our robust-ness to3D similarities exactly corresponds to the ambiguity of3D models obtained from images,while the ambiguities of other sensors can often be described by a3D Euclidean transformation or with even fewer degrees of freedom.Our undistortion is based on local scene planes or on lo-cal planar approximations of the scene.Conceptually,for every point on the surface we estimate the local tangent plane’s normal and generate a texture patch by orthogonal projection onto the plane.Within the local ortho-texture patch we determine if the point corresponds to a local ex-tremal response of the Difference-of-Gaussians(DoG)fil-ter in scale space.If it is we determine its orientation in the tangent plane by the dominant gradient direction and extract a SIFT descriptor on the tangent ing the tangent plane avoids the poor repeatability of interest point detection under projective transformations seen in popular feature detectors[13].The next sections will give more details about the differ-ent steps of the VIP feature detection method.Thefirst step in the feature detection is to achieve a viewpoint normalized ortho-texture for each patch.3.1.Viewpoint NormalizationViewpoint-normalized image patches need to be gener-ated to describe VIPs.Viewpoint-normalization is similar to the normalization of image patches according to scale and orientation performed in SIFT and normalization according to ellipsoid in affine covariant feature detectors.The view-point normalization can be divided into the following steps: 1.Warp the image texture onto the local tangentialplane.Non-planar regions are warped to a local planar approximation to the surface which causes little distor-tion over small surface patches.2.Project the texture into an orthographic camerawith viewing 
direction parallel to the local tangential plane’s normal.3.Extract the VIP descriptor from the orthographicpatch projection.Invariance to scale is achieved by normalizing the patch according to local ortho-texture scale.Like[10]a DoGfilter and local extrema sup-pression is used.VIP orientation is found based on the dominant gradient direction in the ortho-texture patch.Figure1demonstrates the effect of viewpoint normal-ization.The2nd and3rd column in thefigure are the nor-malized image patches.The normalized image patches of a matched pair are very similar despite significantly different original images due to the largely different viewing direc-tions.3.2.VIP GenerationWith the virtual camera,the size and orientation of a VIP can be obtained by transforming the the scale and orien-Figure2.VIPs detected on the3D model,the cameras correspond-ing to the textures are shown at the bottomtation of its corresponding image feature to world coordi-nates.A VIP is then fully defined as(x,σ,n,d,s)where •x is its3D position,•σis the patch size,•n is the surface normal at this location,•d is texture’s dominant orientation as a vector in3D, and•s is the SIFT descriptor that describes the viewpoint-normalized patch.Note,a sift feature is a sift descrip-tor plus it’s position,scale and orientation.The above steps extractthe VIP features from images and known local3D ing VIPs extracted from two models we can thenfind areas where the models repre-sent the same surface.3.3.VIP MatchingPutative VIP matches can be obtained with a standard nearest neighbor matching of the descriptors or other more scalable methods.After obtaining all the putative matches between two3D scenes,robust estimation methods can be used to select an optimized scene transformation using the 3D hypotheses from each VIP correspondences.Since VIPs are viewpoint invariant,given a correct camera matrix and 3D structure,we can expect the similarity between correct matches to be more accurate than a transformation derived from viewpoint dependent matching techniques.The richness of the VIP feature allows computation of the3D similarity transformation between two scenes from a single match.The ratio of the scales of two VIPs expresses the relative scale between the3D scenes.Relative rotation is obtained using the normal and orientation of the VIP pair. The translation between the scenes is obtained by exam-ining the rotation and scale compensated feature locations.2Local geometry denotes the geometry that is recovered from the im-ages by using SfM and multi-view stereo methods for example.Please note that the local geometry is usually given in the coordinate system of thefirst camera of the sequence with an arbitrary scale w.r.t.the real world motion.Figure3.VIPs detected on dominant planes.Figure4.Original image(left)and its normalized patch(right) The scale and rotation needed to bring corresponding VIP features into alignment is constant for a complete3D model. 
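The five-tuple (x, σ, n, d, s) defined above maps directly onto a small data structure. The sketch below is just a container plus the descriptor-matching step mentioned in Section 3.3, with SciPy's KD-tree and Lowe's ratio test standing in for "nearest neighbor matching of the descriptors" (the ratio test is an assumption, not stated in the paper).

```python
from dataclasses import dataclass
import numpy as np
from scipy.spatial import cKDTree

@dataclass
class VIP:
    x: np.ndarray      # 3D position
    sigma: float       # patch size (scale)
    n: np.ndarray      # surface normal at x (unit vector)
    d: np.ndarray      # dominant texture orientation as a 3D vector in the patch plane
    s: np.ndarray      # 128-D SIFT descriptor of the viewpoint-normalized patch

def match_vips(vips1, vips2, ratio=0.8):
    """Putative correspondences by nearest-neighbour descriptor matching."""
    desc2 = np.stack([v.s for v in vips2])
    tree = cKDTree(desc2)
    matches = []
    for i, v in enumerate(vips1):
        dist, idx = tree.query(v.s, k=2)
        if dist[0] < ratio * dist[1]:      # keep only distinctive matches
            matches.append((i, int(idx[0])))
    return matches
```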
We will use this property later to set up an Hierarchical Effi-cient Hypothesis Testing(HEHT)scheme to determine the 3D similarity between models.4.Efficient VIP DetectionIn general planar patch detection needs to be executed for every pixel of the image to make the ortho-textures.Each pixel(x,y)together with the camera center C defines a ray, which is intersected with the local3D scene geometry.The point of intersection is the corresponding3D point of the feature.From this point and its spatial neighbors we then compute the tangential planeΠt at the point,which for pla-nar regions coincides with the local plane.For structures that only slightly deviate from a plane we retrieve a pla-nar approximation for local geometry of the patch.Then the extracted plane can be used to compute the VIP feature description with respect to this plane.This method is gen-erally valid for any scene.VIP detection for a set of points that have the same nor-mal can be efficiently done in a single pass.Considering these VIPs,the image coordinate transformations between them are simply2D similarity transformations.This means that the VIP detection for points with the same normal can be done in one pass on a larger planar patch,on which all the points are projected,and the original VIP can be recovered by applying a known similarity transformation.Figure3illustrates a result of detecting VIPs on domi-nant planes.The planes here compensate for the noise in the reconstructed model,and improve VIP localization.Figure 4shows an example of a viewpoint normalized facade.5.Hierarchical Estimation of3D SimilarityTransformationA hierarchical method is proposed in this section to es-timate the3D similarity transformation between two3D models from their putative VIP matches.Each single VIP correspondence gives a unique3D similarity transforma-tion,and so hypothesized matches can be tested efficiently. 
Furthermore, the rotation and scaling components of the similarity transformation are the same in all inlier VIP matches, and they can be tested separately and efficiently with a voting consensus.

5.1. 3D Similarity Transformation from a Single VIP Correspondence
Given a VIP correspondence (x_1, σ_1, n_1, d_1, s_1) and (x_2, σ_2, n_2, d_2, s_2), the scaling between them is given by

σ_s = σ_1 / σ_2.    (1)

The rotation between them satisfies

(n_1, d_1, d_1 × n_1) R_s = (n_2, d_2, d_2 × n_2).    (2)

The translation between them is

T_s = x_1 − σ_s R_s x_2.    (3)

A 3D similarity transformation can be formed from the three components as (σ_s R_s, T_s).

5.2. Hierarchical Efficient Hypothesis-Test (HEHT) Method
The scale, rotation and translation of a VIP are covariant with the global 3D similarity transformation, and the local feature scale change and rotation are the same as the global scaling and rotation. Solving these components separately and hierarchically increases accuracy and dramatically reduces the search space for the correct similarity transformation.
The 3D similarity estimation in this paper is done hierarchically in three steps starting from a set of putative VIP correspondences. First, each VIP correspondence is scored by the number of other VIP correspondences that support its scaling. All VIP correspondences which are inliers to the VIP correspondence with most support are used to calculate a mean scaling, and outliers are removed from the putative set. Second, the same process is repeated with scoring based on support for each correspondence's rotation, and the putative set is again pruned of outliers. Third, the same process is repeated with scoring according to translation to determine the final set of inlier VIP correspondences. A nonlinear optimization is run to find the scaling, rotation, and translation using all of the remaining inlier correspondences.

5.3. Using RANSAC with VIP Features
It is worth noting that in our experiments all possible hypotheses are exhaustively tested, which is very efficient because each VIP correspondence generates one hypothesis and the whole sample space is linear in the number of putative VIP matches. The method described above can easily be extended to a RANSAC scheme by checking only a small set of hypotheses. It is known that RANSAC requires N = log(1 − p) / log(1 − (1 − e)^s) random samples to get at least one sample free of outliers, where e is the ratio of outliers, p is the expected probability, and s is the number of matches needed to establish a hypothesis [4]. In our case s = 1, so that N = log(1 − p) / log(e). For example, when the outlier ratio is 90%, 44 random samples are enough to get at least one inlier match with probability 99%. This leads to an even more efficient estimation of 3D similarity transformations. However, in cases where there are many outliers, an exhaustive test of all transformation hypotheses is the most reliable and still very efficient.
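The following is a minimal sketch of equations (1)–(3) and of the RANSAC sample count for s = 1; it reuses the VIP record sketched earlier. The orthonormalization of the (n, d, d × n) frame is an assumption added to cope with noisy normals and orientations, and the rotation is written so that it maps model-2 coordinates into model 1, consistent with equation (3).

```python
# Sketch of the single-correspondence similarity (eqs. 1-3) and the RANSAC
# sample count for s = 1. The orthonormalization of the (n, d, d x n) frame is
# an added assumption; the rotation maps model-2 coordinates into model 1.
import numpy as np

def frame(n, d):
    """Columns (n, d, d x n), orthonormalized."""
    n = n / np.linalg.norm(n)
    d = d - (d @ n) * n                    # force d into the tangent plane
    d = d / np.linalg.norm(d)
    return np.column_stack([n, d, np.cross(d, n)])

def similarity_from_vip_pair(v1, v2):
    scale = v1.sigma / v2.sigma                      # eq. (1)
    R = frame(v1.n, v1.d) @ frame(v2.n, v2.d).T      # aligns frame 2 with frame 1, cf. eq. (2)
    T = v1.x - scale * R @ v2.x                      # eq. (3)
    return scale, R, T

def ransac_samples(outlier_ratio, p=0.99, s=1):
    """Samples needed so that one all-inlier draw occurs with probability p."""
    return int(np.ceil(np.log(1 - p) / np.log(1 - (1 - outlier_ratio) ** s)))

print(ransac_samples(0.9))   # 44, matching the example in the text
```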
6. Experimental Results and Evaluation
This section compares viewpoint invariant patches to other corner detectors in terms of the number of correct correspondences found and the feature re-detection rate. In addition, we apply the VIP-based 3D alignment to several reconstructed models to demonstrate reliable surface alignment and perform SfM of a large scene completing a loop around a large building using VIPs.

6.1. Evaluation
To measure the performance of the VIP feature we performed an evaluation similar to the method of [13]. Our test data [1] is a sequence of images of a brick wall taken with increasing angles between the optical axis and the wall's normal. Each of the images of the wall has a known homography to the first image, which was taken with the image plane fronto-parallel to the wall. Using this homography we extract a region of overlap between the first image and each other image. We extract features in this area of overlap and use two measures of performance, the number of inlier correspondences and the re-detection rate, to evaluate a number of feature detectors. The number of inliers is the number of feature correspondences which fit the known homography. The re-detection rate is the ratio of inlier correspondences found in these overlapping regions to the number of features found in the fronto-parallel view. The number of inliers is shown in Figure 5 and the re-detection rate is shown in Figure 6. Figure 5 shows that the VIP generates a significantly larger number of inliers over a wide range of angles than the other detectors. The other detectors we compare to are SIFT [10], Harris-Affine [12], Hessian-Affine [12], Intensity Based Region (IBR) [19], Edge Based Region (EBR) [19], and Maximally Stable Extremal Region (MSER) [18]. Our novel VIP feature also has a significantly higher re-detection rate than the other detectors, as seen in Figure 6. This high re-detection rate is a result of the detection of features on the ortho-textures. Even under large viewpoint changes, which often result in a projective transformation between images, the VIP performs well.

Figure 5. Number of inliers under a projective transformation (number of inliers vs. angle in degrees).
Figure 6. Re-detection rate of features under a projective transformation (detection rate vs. angle in degrees).

6.2. Experiments
For our experimental evaluation of the novel detector we used several image sequences of urban scenes. For each image sequence we used SfM to compute its depth maps and camera positions. We used two image sequences of each scene with different viewpoints and camera paths. Camera positions were defined relative to the pose of the first camera in each sequence. The first scene, shown in Fig. 7, consists of two facades of a building reconstructed from two different sets of cameras with significantly different viewing directions (about 45◦). The cameras moved along a path around the building. One can observe reconstruction errors due to trees in front of the building. An offset was added to the second scene model for visualization of the matching VIPs. The red lines connect all of the inlier correspondences. Rotation and scaling have been corrected using transformations calculated using VIPs in this visualization. The HEHT determined 214 inliers out of 2085 putative matches. The number of putative matches is high because putative matches are generated between all features in each of the models.
The second evaluation scene, shown in Figure 8, consists of two local scene models with camera paths that intersect at an angle of 45 degrees. The overlapping region is a small part of the combined models, and it is seen from very different viewpoints in the two videos. Experiments show that our 3D model alignment method can reliably detect the small common surface and align the two models. Videos in the supplemental materials illustrate the details of our algorithm.
We match models reconstructed from camera paths which cross at a 90◦ angle in Figure 9. Note the large difference in viewing directions between the cameras on the left and right in the image. This shows that the VIP can match features reconstructed from widely different viewpoints.
Table 1 shows quantitative results of the HEHT. Note that scale and rotation verification remove a significant portion of the outliers. For evaluation we first measure the distances of the matched points after the first 3 stages and after the nonlinear refinement.
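The error measure reported in Table 1 can be sketched as the median distance between matched 3D points after applying the estimated similarity. This is our reading of the text; the function below is an illustrative assumption rather than the authors' code.

```python
# Median matched-point distance (in model units, i.e. meters here) after
# applying an estimated similarity (scale, R, T) that maps model 2 into model 1.
import numpy as np

def median_match_error(pts1, pts2, scale, R, T):
    """pts1, pts2: Nx3 matched 3D points from model 1 and model 2."""
    pts2_in_1 = scale * (pts2 @ R.T) + T
    return float(np.median(np.linalg.norm(pts1 - pts2_in_1, axis=1)))
```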
To measure the quality of surface alignment, we check the point distances between the overlapping parts of the models. The models are reconstructed with scale matching the real building, and so the error is given in meters. The statistics in Table 1 demonstrate the performance of our matching.

Table 1. HEHT running details in 3 experiments. The second row gives the approximate viewing direction change between the two image sequences. The third row gives the number of putative matches, and the fourth row shows the number of inliers in each stage of the 3-stage HEHT. The errors in the following 3 rows are median errors in meters. The large difference between the surface alignment error and the feature matching error shows the large noise in the stereo reconstruction.

Scene                        | #1 (Fig 7)    | #2 (Fig 8)   | #3 (Fig 9)
Viewing direction change     | 45◦           | 45◦          | 90◦
Putative #                   | 2085          | 494          | 236
Inlier # (per HEHT stage)    | 1224/654/214  | 141/42/38    | 133/108/101
Error after first 3 stages   | 0.0288        | 0.0230       | 0.114
Error after nonlinear ref.   | 0.0128        | 0.018        | 0.0499
Surface alignment error      | 0.434         | 0.135        | 0.629

Additionally we compared the VIP-based alignment with SIFT features and MSER. For SIFT and MSER, the 2D feature locations are projected onto the 3D model surface to get 3D points. The putative match generation for them is the same as for VIP matching since they all use SIFT descriptors. Then a least-squares method [20] and RANSAC are used to estimate the 3D similarity transformation between the point matches. Table 6.2 shows the comparison between SIFT, MSER and VIP. The results show that VIP can handle the large viewpoint changes for which SIFT and MSER do not work.
The advantages of the VIP for wide baseline matching are perhaps best demonstrated by a large scale reconstruc-
Cheirality invariants
Self-calibration from image triplets
Notation We will not distinguish between the Euclidean and similarity cases; both will be loosely referred to as metric. Generally, vectors will be denoted by x, matrices as H, and tensors as T_i^{jk}. Image coordinates are lower-case 3-vectors, e.g. x; world coordinates are upper-case 4-vectors, e.g. X. For homogeneous quantities, = indicates equality up to a non-zero scale factor.
Self-Calibration from Image Triplets
Martin Armstrong1 , Andrew Zisserman1 and Richard Hartley2
1
Robotics Research Group, Department of Engineering Science, Oxford University, England 2 The General Electric Corporate Research and Development Laboratory, Schenectady, NY, USA. This paper addresses the self-calibration of a camera with unchanging internal parameters undergoing planar motion. It is shown that affine calibration is recovered uniquely, and metric calibration up to a two-fold ambiguity. The novel aspects of this work are: first, relating the distinguished objects of 3D Euclidean geometry to fixed entities in the image; second, showing that these fixed entities can be computed uniquely via the trifocal tensor between image triplets; third, a robust and automatic implementation of the method. Results are included of affine and metric calibration and structure recovery using images of real scenes.
《Moments and Moment Invariants in Pattern Recognition》1-Introduction to Moments (p 1-11)
1 Introduction to moments
1.1 Motivation
In our everyday life, each of us almost constantly receives, processes and analyzes a huge amount of information of various kinds, significance and quality and has to make decisions based on this analysis. More than 95% of the information we perceive is optical in character. Image is a very powerful information medium and communication tool capable of representing complex scenes and processes in a compact and efficient way. Thanks to this, images are not only primary sources of information, but are also used for communication among people and for interaction between humans and machines.
Common digital images contain an enormous amount of information. An image you can take and send in a few seconds to your friends by a cellphone contains as much information as several hundred pages of text. This is why there is an urgent need for automatic and powerful image analysis methods.
Analysis and interpretation of an image acquired by a real (i.e. nonideal) imaging system is the key problem in many application areas such as robot vision, remote sensing, astronomy and medicine, to name but a few. Since real imaging systems as well as imaging conditions are usually imperfect, the observed image represents only a degraded version of the original scene. Various kinds of degradation (geometric as well as graylevel/color) are introduced into the image during the acquisition process by such factors as imaging geometry, lens aberration, wrong focus, motion of the scene, systematic and random sensor errors, etc. (see Figures 1.1, 1.2 and 1.3).
In general, the relation between the ideal image f(x,y) and the observed image g(x,y) is described as g = D(f), where D is a degradation operator. Degradation operator D can usually be decomposed into a radiometric (i.e. graylevel or color) degradation operator R and a geometric (i.e. spatial) degradation operator G. In real imaging systems R can usually be modeled by space-variant or space-invariant convolution plus noise, while G is typically a transform of spatial coordinates (for instance, perspective projection). In practice, both operators are typically either unknown or are described by a parametric model with unknown parameters. Our goal is to analyze the unknown scene f(x,y), an ideal image of which is not available, by means of the sensed image g(x,y) and a-priori information about the degradations.
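A minimal numerical sketch of the degradation model g = D(f) may help make this concrete: the radiometric operator R is modeled here as space-invariant blur plus additive noise and the geometric operator G as an affine coordinate transform; the particular blur width, noise level and warp matrix are arbitrary assumptions.

```python
# Sketch of g = D(f) with R = blur + noise and G = affine warp. The specific
# parameters below are arbitrary; real degradations are usually unknown.
import numpy as np
from scipy import ndimage

def degrade(f, sigma_blur=2.0, noise_std=0.01, A=None, rng=None):
    rng = np.random.default_rng(0) if rng is None else rng
    A = np.array([[1.0, 0.1], [0.05, 1.0]]) if A is None else A   # mild shear
    g = ndimage.affine_transform(f, A)            # geometric degradation G
    g = ndimage.gaussian_filter(g, sigma_blur)    # radiometric degradation R: blur ...
    return g + rng.normal(0.0, noise_std, g.shape)   # ... plus sensor noise
```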
Figure 1.1 Perspective distortion of the image caused by a nonperpendicular view.
Figure 1.2 Image blurring caused by wrong focus of the camera.
By the term scene analysis we usually understand a complex process consisting of three basic stages. First, the image is preprocessed, segmented and objects of potential interest are detected. Second, the extracted objects are "recognized", which means they are mathematically described and classified as elements of a certain class from the set of predefined object classes. Finally, spatial relations among the objects can be analyzed. The first stage contains traditional image-processing methods and is exhaustively covered in standard textbooks [1–3]. The classification stage is independent of the original data and is carried out in the space of descriptors. This part is comprehensively reviewed in the famous Duda–Hart–Stork book [4]. For the last stage we again refer to [3].
Figure 1.3 Image distortion caused by a nonlinear deformation of the scene.
1.2 What are invariants?
Recognition of objects and patterns that are deformed in various ways has been a goal of much recent research. There are basically three major approaches to this problem – brute force, image normalization and invariant features. In the brute-force approach we search the parametric space of all possible image degradations. That means the training set of each class should contain not only all class representatives but also all their rotated, scaled, blurred and deformed versions. Clearly, this approach would lead to extreme time complexity and is practically inapplicable.
In the normalization approach, the objects are transformed into a certain standard position before they enter the classifier. This is very efficient in the classification stage, but the object normalization itself usually requires the solving of difficult inverse problems that are often ill-conditioned or ill-posed. For instance, in the case of image blurring, "normalization" means in fact blind deconvolution [5], and in the case of spatial image deformation, "normalization" requires registration of the image to be performed to some reference frame [6].
The approach using invariant features appears to be the most promising and has been used extensively. Its basic idea is to describe the objects by a set of measurable quantities called invariants that are insensitive to particular deformations and that provide enough discrimination power to distinguish objects belonging to different classes. From a mathematical point of view, invariant I is a functional defined on the space of all admissible image functions that does not change its value under degradation operator D, i.e. that satisfies the condition I(f) = I(D(f)) for any image function f. This property is called invariance.
In practice, in order to accommodate the influence of imperfect segmentation, intra-class variability and noise, we usually formulate this requirement as a weaker constraint: I(f) should not be significantly different from I(D(f)). Another desirable property of I, as important as invariance, is discriminability. For objects belonging to different classes, I must have significantly different values. Clearly, these two requirements are antagonistic – the broader the invariance, the less discrimination power and vice versa. Choosing a proper tradeoff between invariance and discrimination power is a very important task in feature-based object recognition (see Figure 1.4 for an example of a desired situation).
Usually, one invariant does not provide enough discrimination power and several invariants I1, ..., In must be used simultaneously. Then, we speak about an invariant vector. In this way, each object is represented by a point in an n-dimensional metric space called feature space or invariant space.
Figure 1.4 Two-dimensional feature space with two classes, almost an ideal example. Each class forms a compact cluster (the features are invariant) and the clusters are well separated (the features are discriminative).
1.2.1 Categories of invariant
The existing invariant features used for describing 2D objects can be categorized from various points of view. Most straightforward is the categorization according to the type of invariance. We recognize translation, rotation, scaling, affine, projective, and elastic geometric invariants. Radiometric invariants exist with respect to linear contrast stretching, nonlinear intensity transforms, and to convolution. Categorization according to the mathematical tools used may be as follows:
• simple shape descriptors – compactness, convexity, elongation, etc. [3];
• transform coefficient features are calculated from a certain transform of the image – Fourier descriptors [7, 8], Hadamard descriptors, Radon transform coefficients, and wavelet-based features [9, 10];
• point set invariants use positions of dominant points [11–14];
• differential invariants employ derivatives of the object boundary [15–19];
• moment invariants are special functions of image moments.
Another viewpoint reflects what part of the object is needed to calculate the invariant.
• Global invariants are calculated from the whole image (including background if no segmentation was performed). Most of them include projections of the image onto certain basis functions and are calculated by integration. Compared to local invariants, global invariants are much more robust with respect to noise, inaccurate boundary detection and other similar factors. On the other hand, their serious drawback is the fact that a local change of the image influences the values of all the invariants and is not "localized" in a few components only. This is why global invariants cannot be used when the object studied is partially occluded by another object and/or when a part of it is out of the field of vision. Moment invariants fall into this category.
• Local invariants are, in contrast, calculated from a certain neighborhood of dominant points only. Differential invariants are typical representatives of this category. The object boundary is detected first and then the invariants are calculated for each boundary point as functions of the boundary derivatives. As a result, the invariants at any given point depend only on the shape of the boundary in its immediate vicinity. If the rest of the object undergoes any change, the local invariants are not
affected. This property makes them a seemingly perfect tool for recognition of partially occluded objects but, due to their extreme vulnerability to discretization errors, segmentation inaccuracies, and noise, it is difficult to actually implement and use them in practice.
• Semilocal invariants attempt to retain the positive properties of the two groups above and to avoid the negative ones. They divide the object into stable parts (most often this division is based on inflection points or vertices of the boundary) and describe each part by some kind of global invariant. The whole object is then characterized by a string of vectors of invariants and recognition under occlusion is performed by maximum substring matching. This modern and practically applicable approach was used in various modifications in references [20–26].
Here, we focus on object description and recognition by means of moments and moment invariants. The history of moment invariants began many years before the appearance of the first computers, in the nineteenth century under the framework of group theory and the theory of algebraic invariants. The theory of algebraic invariants was thoroughly studied by the famous German mathematicians P. A. Gordan and D. Hilbert [27] and was further developed in the twentieth century in references [28] and [29], among others.
Moment invariants were first introduced to the pattern recognition and image processing community in 1962 [30], when Hu employed the results of the theory of algebraic invariants and derived his seven famous invariants to the rotation of 2D objects. Since that time, hundreds of papers have been devoted to various improvements, extensions and generalizations of moment invariants and also to their use in many areas of application. Moment invariants have become one of the most important and most frequently used shape descriptors. Even though they suffer from certain intrinsic limitations (the worst of which is their globalness, which prevents direct utilization for occluded object recognition), they frequently serve as "first-choice descriptors" and as a reference method for evaluating the performance of other shape descriptors. Despite a tremendous effort and a huge number of published papers, many problems remain to be resolved.
1.3 What are moments?
Moments are scalar quantities used to characterize a function and to capture its significant features. They have been widely used for hundreds of years in statistics for description of the shape of a probability density function and in classic rigid-body mechanics to measure the mass distribution of a body. From the mathematical point of view, moments are "projections" of a function onto a polynomial basis (similarly, Fourier transform is a projection onto a basis of harmonic functions). For the sake of clarity, we introduce some basic terms and propositions, which we will use throughout the book.
Definition 1.1 By an image function (or image) we understand any piece-wise continuous real function f(x,y) of two variables defined on a compact support D ⊂ R × R and having a finite nonzero integral.
Definition 1.2¹ General moment $M_{pq}^{(f)}$ of an image f(x,y), where p, q are non-negative integers and r = p + q is called the order of the moment, is defined as

$M_{pq}^{(f)} = \int\int_D p_{pq}(x,y)\, f(x,y)\, dx\, dy,$   (1.1)

where $p_{00}(x,y), p_{10}(x,y), \ldots, p_{kj}(x,y), \ldots$ are polynomial basis functions defined on D. (We omit the superscript (f) if there is no danger of confusion.)
Depending on the polynomial basis used, we recognize various systems of moments.
1.3.1 Geometric and complex moments
The most common choice is a standard power basis $p_{kj}(x,y) = x^k y^j$ that leads to geometric moments

$m_{pq} = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} x^p y^q f(x,y)\, dx\, dy.$   (1.2)

Geometric moments of low orders have an intuitive meaning – $m_{00}$ is a "mass" of the image (for binary images, $m_{00}$ is the area of the object), $m_{10}/m_{00}$ and $m_{01}/m_{00}$ define the center of gravity or centroid of the image. Second-order moments $m_{20}$ and $m_{02}$ describe the "distribution of mass" of the image with respect to the coordinate axes. In mechanics they are called the moments of inertia. Another popular mechanical quantity, the radius of gyration with respect to an axis, can also be expressed in terms of moments as $\sqrt{m_{20}/m_{00}}$ and $\sqrt{m_{02}/m_{00}}$, respectively.
If the image is considered a probability density function (pdf) (i.e. its values are normalized such that $m_{00} = 1$), then $m_{10}$ and $m_{01}$ are the mean values. In case of zero means, $m_{20}$ and $m_{02}$ are variances of the horizontal and vertical projections and $m_{11}$ is a covariance between them. In this way, the second-order moments define the orientation of the image. As will be seen later, second-order geometric moments can be used to find the normalized position of an image. In statistics, two higher-order moment characteristics have been commonly used – the skewness and the kurtosis. The skewness of the horizontal projection is defined as $m_{30}/\sqrt{m_{20}^3}$ and that of the vertical projection as $m_{03}/\sqrt{m_{02}^3}$. The skewness measures the deviation of the respective projection from symmetry. If the projection is symmetric with respect to the mean (i.e. to the origin in this case), then the corresponding skewness equals zero. The kurtosis measures the "peakedness" of the pdf and is again defined separately for each projection – the horizontal kurtosis as $m_{40}/m_{20}^2$ and the vertical kurtosis as $m_{04}/m_{02}^2$.
Characterization of the image by means of geometric moments is complete in the following sense. For any image function, geometric moments of all orders do exist and are finite. The image function can be exactly reconstructed from the set of its moments (this assertion is known as the uniqueness theorem).
Another popular choice of the polynomial basis, $p_{kj}(x,y) = (x+iy)^k (x-iy)^j$, where i is the imaginary unit, leads to complex moments

$c_{pq} = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} (x+iy)^p (x-iy)^q f(x,y)\, dx\, dy.$   (1.3)

Geometric moments and complex moments carry the same amount of information. Each complex moment can be expressed in terms of geometric moments of the same order as

$c_{pq} = \sum_{k=0}^{p}\sum_{j=0}^{q} \binom{p}{k}\binom{q}{j} (-1)^{q-j}\, i^{p+q-k-j}\, m_{k+j,\,p+q-k-j}$   (1.4)

and vice versa²

$m_{pq} = \frac{1}{2^{p+q} i^{q}} \sum_{k=0}^{p}\sum_{j=0}^{q} \binom{p}{k}\binom{q}{j} (-1)^{q-j}\, c_{k+j,\,p+q-k-j}.$   (1.5)

Complex moments are introduced because they behave favorably under image rotation. This property can be advantageously employed when constructing invariants with respect to rotation, as will be shown in the following chapter.
1.3.2 Orthogonal moments
If the polynomial basis $\{p_{kj}(x,y)\}$ is orthogonal, i.e. if its elements satisfy the condition of orthogonality

$\int\int_{\Omega} p_{pq}(x,y)\, p_{mn}(x,y)\, dx\, dy = 0$   (1.6)

or weighted orthogonality

$\int\int_{\Omega} w(x,y)\, p_{pq}(x,y)\, p_{mn}(x,y)\, dx\, dy = 0$   (1.7)

for any indexes p ≠ m or q ≠ n, we speak about orthogonal (OG) moments. Ω is the area of orthogonality.
¹ In some papers one can find extended versions of Definition 1.2 that include various scalar factors and/or weighting functions in the integrand. We introduce such extensions in Chapter 6.
² While the proof of (1.4) is straightforward, the proof of (1.5) requires, first, x and y to be expressed as x = ((x+iy)+(x−iy))/2 and y = ((x+iy)−(x−iy))/2i.
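As a small numerical illustration of Section 1.3.1, the sketch below evaluates geometric and complex moments of a discrete image by summation in place of the integrals in (1.2) and (1.3); the test image is an arbitrary assumption. It also checks the identity c_11 = m_20 + m_02, a special case of relation (1.4).

```python
# Geometric and complex moments of a discrete image (sums approximate the
# integrals in (1.2) and (1.3)); the test image is an arbitrary assumption.
import numpy as np

def geometric_moment(img, p, q):
    y, x = np.mgrid[0:img.shape[0], 0:img.shape[1]]
    return np.sum((x ** p) * (y ** q) * img)

def complex_moment(img, p, q):
    y, x = np.mgrid[0:img.shape[0], 0:img.shape[1]]
    z = x + 1j * y
    return np.sum((z ** p) * (np.conj(z) ** q) * img)

img = np.zeros((64, 64))
img[20:40, 10:50] = 1.0                       # a bright rectangle as test object

m00 = geometric_moment(img, 0, 0)
centroid = (geometric_moment(img, 1, 0) / m00, geometric_moment(img, 0, 1) / m00)
print("mass:", m00, "centroid:", centroid)
# c_11 = m_20 + m_02 (a rotation invariant), a special case of relation (1.4)
print(np.isclose(complex_moment(img, 1, 1).real,
                 geometric_moment(img, 2, 0) + geometric_moment(img, 0, 2)))
```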
In theory, all polynomial bases of the same degree are equivalent because they generate the same space of functions. Any moment with respect to a certain basis can be expressed in terms of moments with respect to any other basis. From this point of view, OG moments of any type are equivalent to geometric moments. However, a significant difference appears when considering stability and computational issues in a discrete domain. Standard powers are nearly dependent both for small and large values of the exponent and increase rapidly in range as the order increases. This leads to correlated geometric moments and to the need for high computational precision. Using lower precision results in unreliable computation of geometric moments. OG moments can capture the image features in an improved, nonredundant way. They also have the advantage of requiring lower computing precision because we can evaluate them using recurrent relations, without expressing them in terms of standard powers.
Unlike geometric moments, OG moments are coordinates of f in the polynomial basis in the common sense used in linear algebra. Thanks to this, the image reconstruction from OG moments can be performed easily as

$f(x,y) = \sum_{k,j} M_{kj}\, p_{kj}(x,y).$

Moreover, this reconstruction is "optimal" because it minimizes the mean-square error when using only a finite set of moments. On the other hand, image reconstruction from geometric moments cannot be performed directly in the spatial domain. It is carried out in the Fourier domain using the fact that geometric moments form Taylor coefficients of the Fourier transform F(u,v):

$F(u,v) = \sum_{p,q} \frac{(-2\pi i)^{p+q}}{p!\, q!}\, m_{pq}\, u^p v^q.$

(To prove this, expand the kernel of the Fourier transform $e^{-2\pi i(ux+vy)}$ into a power series.) Reconstruction of f(x,y) is then achieved via the inverse Fourier transform.
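A minimal sketch of reconstruction from a finite set of orthogonal moments follows, using Legendre moments on [−1, 1]² (one of the families covered in Chapter 6) as a concrete example; the normalization follows the standard Legendre orthogonality relation, the integrals are crude Riemann sums, and the test setup and maximum order are arbitrary assumptions.

```python
# Legendre moments on [-1,1]^2 and reconstruction f ~ sum lam_pq P_p(x) P_q(y).
# Crude Riemann sums stand in for the integrals; max_order is arbitrary.
import numpy as np
from scipy.special import eval_legendre

def legendre_moments(f, max_order):
    M, N = f.shape
    y = np.linspace(-1, 1, M)[:, None]
    x = np.linspace(-1, 1, N)[None, :]
    dxdy = (2.0 / (N - 1)) * (2.0 / (M - 1))
    lam = np.zeros((max_order + 1, max_order + 1))
    for p in range(max_order + 1):
        for q in range(max_order + 1):
            norm = (2 * p + 1) * (2 * q + 1) / 4.0   # Legendre orthogonality constant
            lam[p, q] = norm * np.sum(eval_legendre(p, x) * eval_legendre(q, y) * f) * dxdy
    return lam

def reconstruct(lam, shape):
    M, N = shape
    y = np.linspace(-1, 1, M)[:, None]
    x = np.linspace(-1, 1, N)[None, :]
    f = np.zeros(shape)
    for p in range(lam.shape[0]):
        for q in range(lam.shape[1]):
            f += lam[p, q] * eval_legendre(p, x) * eval_legendre(q, y)
    return f
```

Truncating the sum at a finite order then gives the minimum mean-square-error approximation mentioned above.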
We will discuss various OG moments and their properties in detail in Chapter 6. Their usage for stable implementation of implicit invariants will be shown in Chapter 4 and practical applications will be demonstrated in Chapter 8.
1.4 Outline of the book
This book deals in general with moments and moment invariants of 2D and 3D images and with their use in object description, recognition, and in other applications. Chapters 2–5 are devoted to four classes of moment invariant. In Chapter 2, we introduce moment invariants with respect to the simplest spatial transforms – translation, rotation, and scaling. We recall the classical Hu invariants first and then present a general method for constructing invariants of arbitrary orders by means of complex moments. We prove the existence of a relatively small basis of invariants that is complete and independent. We also show an alternative approach – constructing invariants via normalization. We discuss the difficulties which the recognition of symmetric objects poses and present moment invariants suitable for such cases.
Chapter 3 deals with moment invariants to the affine transform of spatial coordinates. We present three main approaches showing how to derive them – the graph method, the method of normalized moments, and the solution of the Cayley–Aronhold equation. Relationships
Knowing these features,we can recognize objects in the degraded scene without any restoration.Chapter6presents a survey of various types of orthogonal moments.They are divided into two groups,thefirst being moments orthogonal on a rectangle and the second orthogonal on a unit disk.We review Legendre,Chebyshev,Gegenbauer,Jacobi,Laguerre,Hermite, Krawtchouk,dual Hahn,Racah,Zernike,Pseudo–Zernike and Fourier–Mellin polynomials and moments.The use of orthogonal moments on a disk in the capacity of rotation invariants is discussed.The second part of the chapter is devoted to image reconstruction from its moments.We explain why orthogonal moments are more suitable for reconstruction than geometric ones and a comparison of reconstructing power of different orthogonal moments is presented.In Chapter7,we focus on computational issues.Since the computing complexity of all moment invariants is determined by the computing complexity of moments,efficient algorithms for moment calculations are of prime importance.There are basically two major groups of methods.Thefirst one consists of methods that attempt to decompose the object into nonoverlapping regions of a simple shape.These“elementary shapes”can be pixel rows or their segments,square and rectangular blocks,among others.A moment of the object is then calculated as a sum of moments of all regions.The other group is based on Green’s theorem,which evaluates the double integral over the object by means of single integration along the object boundary.We present efficient algorithms for binary and graylevel objects and for geometric as well as selected orthogonal moments.Chapter8is devoted to various applications of moments and moment invariants in image analysis.We demonstrate their use in image registration,object recognition,medical imaging,content-based image retrieval,focus/defocus measurement,forensic applications, robot navigation and digital watermarking.References[1]Gonzalez,R.C.and Woods,R.E.(2007)Digital Image Processing.Prentice Hall,3rd edn.10MOMENTS AND MOMENT INV ARIANTS IN PATTERN RECOGNITION[2]Pratt,W.K.(2007)Digital Image Processing.New York:Wiley Interscience,4th edn.[3]Šonka,M.,Hlaváˇc,V.and Boyle,R.(2007)Image Processing,Analysis and MachineVision.Toronto:Thomson,3rd edn.[4]Duda,R.O.,Hart,P.E.and Stork,D.G.(2001)Pattern Classification.New York:WileyInterscience,2nd edn.[5]Kundur,D.and Hatzinakos,D.(1996)“Blind image deconvolution,”IEEE SignalProcessing Magazine,vol.13,no.3,pp.43–64.[6]Zitová,B.and Flusser,J.(2003)“Image registration methods:A survey,”Image andVision Computing,vol.21,no.11,pp.977–1000.[7]Lin,C.C.and Chellapa,R.(1987)“Classification of partial2-D shapes using Fourierdescriptors,”IEEE Transactions on Pattern Analysis and Machine Intelligence,vol.9, no.5,pp.686–90.[8]Arbter,K.,Snyder,W.E.,Burkhardt,H.and Hirzinger,G.(1990)“Application of affine-invariant Fourier descriptors to recognition of3-D objects,”IEEE Transactions Pattern Analysis and Machine Intelligence,vol.12,no.7,pp.640–47.[9]Tieng,Q.M.and Boles,W.W.(1995)“An application of wavelet-based affine-invariantrepresentation,”Pattern Recognition Letters,vol.16,no.12,pp.1287–96.[10]Khalil,M.and Bayeoumi,M.(2001)“A dyadic wavelet affine invariant function for2Dshape recognition,”IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.23,no.10,pp.1152–63.[11]Mundy,J.L.and Zisserman,A.(1992)Geometric Invariance in Computer Vision.Cambridge,Massachusetts:MIT Press.[12]Suk,T.and Flusser,J.(1996)“Vertex-based features for recognition of projectivelydeformed 
polygons,”Pattern Recognition,vol.29,no.3,pp.361–67.[13]Lenz,R.and Meer,P.(1994)“Point configuration invariants under simultaneousprojective and permutation transformations,”Pattern Recognition,vol.27,no.11, pp.1523–32.[14]Rao,N.S.V.,Wu,W.and Glover,C.W.(1992)“Algorithms for recognizing planarpolygonal configurations using perspective images,”IEEE Transactions on Robotics and Automation,vol.8,no.4,pp.480–86.[15]Wilczynski,E.(1906)Projective Differential Geometry of Curves and Ruled Surfaces.Leipzig:B.G.Teubner.[16]Weiss,I.(1988)“Projective invariants of shapes,”in Proceedings of ComputerVision and Pattern Recognition CVPR’88(Ann Arbor,Michigan),pp.1125–34,IEEE Computer Society.[17]Rothwell,C.A.,Zisserman,A.,Forsyth,D.A.and Mundy,J.L.(1992)“Canonicalframes for planar object recognition,”in Proceedings of the Second European Conference on Computer Vision ECCV’92(St.Margherita,Italy),LNCS vol.588, pp.757–72,Springer.INTRODUCTION TO MOMENTS11 [18]Weiss,I.(1992)“Differential invariants without derivatives,”in Proceedings of theEleventh International Conference on Pattern Recognition ICPR’92(Hague,The Netherlands),pp.394–8,IEEE Computer Society.[19]Mokhtarian, F.and Abbasi,S.(2002)“Shape similarity retrieval under affinetransforms,”Pattern Recognition,vol.35,no.1,pp.31–41.[20]Ibrahim Ali,W.S.and Cohen,F.S.(1998)“Registering coronal histological2-Dsections of a rat brain with coronal sections of a3-D brain atlas using geometric curve invariants and B-spline representation,”IEEE Transactions on Medical Imaging, vol.17,no.6,pp.957–66.[21]Yang,Z.and Cohen,F.(1999)“Image registration and object recognition using affineinvariants and convex hulls,”IEEE Transactions on Image Processing,vol.8,no.7, pp.934–46.[22]Flusser,J.(2002)“Affine invariants of convex polygons,”IEEE Transactions on ImageProcessing,vol.11,no.9,pp.1117–18,[23]Rothwell, C. A.,Zisserman,A.,Forsyth,D. A.and Mundy,J.L.(1992)“Fastrecognition using algebraic invariants,”in Geometric Invariance in Computer Vision (Mundy,J.L.and Zisserman,A.,eds),pp.398–407,MIT Press.[24]Lamdan,Y.,Schwartz,J.and Wolfson,H.(1988)“Object recognition by affine invariantmatching,”in Proceedings of Computer Vision and Pattern Recognition CVPR’88(Ann Arbor,Michigan),pp.335–44,IEEE Computer Society.[25]Krolupper,F.and Flusser,J.(2007)“Polygonal shape description for recognition ofpartially occluded objects,”Pattern Recognition Letters,vol.28,no.9,pp.1002–11. [26]Horáˇc ek,O.,Kamenický,J.and Flusser,J.(2008)“Recognition of partially occludedand deformed binary objects,”Pattern Recognition Letters,vol.29,no.3,pp.360–69.[27]Hilbert,D.(1993)Theory of Algebraic Invariants.Cambridge:Cambridge UniversityPress.[28]Gurevich,G.B.(1964)Foundations of the Theory of Algebraic Invariants.Groningen,The Netherlands:Nordhoff.[29]Schur,I.(1968)Vorlesungenüber Invariantentheorie.Berlin:Springer(in German).[30]Hu,M.-K.(1962)“Visual pattern recognition by moment invariants,”IRE Transactionson Information Theory,vol.8,no.2,pp.179–87.。
Calibration from Images with known Objects
3-D Reconstruction and Camera Calibration from Images with known Objects
Gudrun Socher
Universität Bielefeld, Technische Fakultät, AG Angewandte Informatik, Postfach 100131, 33501 Bielefeld, Germany
2 Model-based 3-D Reconstruction and Camera Parameter Estimation
Model-based 3-D reconstruction is a quantitative method to estimate simultaneously the best viewpoint of all cameras and the object pose parameters by fitting the projection of a three-dimensional model to given two-dimensional features. The model-fitting is accomplished by minimising a cost function measuring all differences between projected model features and detected image features as a function of the objects' pose and the camera parameters. Common features in the scenes we are dealing with are points and circles. The projection of circles results in ellipses.
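To make the model-fitting idea concrete, the following is a minimal sketch that refines an object pose by minimizing reprojection differences for point features. A single calibrated pinhole camera and a rotation-vector pose parameterization are simplifying assumptions; the method described above additionally estimates camera parameters and uses circle/ellipse features.

```python
# Sketch: refine an object pose by least-squares minimization of the difference
# between projected 3-D model points and detected 2-D image points. A single
# calibrated pinhole camera and a rotation-vector pose are assumptions.
import numpy as np
from scipy.optimize import least_squares
from scipy.spatial.transform import Rotation

def project(K, rvec, tvec, X):
    """Project Nx3 model points X with pose (rvec, tvec) and intrinsics K."""
    Xc = Rotation.from_rotvec(rvec).apply(X) + tvec
    x = (K @ Xc.T).T
    return x[:, :2] / x[:, 2:3]

def fit_pose(K, X_model, x_detected, pose0):
    """pose0: initial [rvec (3), tvec (3)]; returns the refined pose vector."""
    def residuals(p):
        return (project(K, p[:3], p[3:], X_model) - x_detected).ravel()
    return least_squares(residuals, pose0).x
```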
English essay: the harms of psychological problems among college students
大学生心理问题危害问题英语作文全文共3篇示例,供读者参考篇1The Mental Health Issues and Hazards Among College StudentsIntroductionMental health has become a critical issue among college students in recent years. As the pressure to excel academically, socially, and personally mounts, many students find themselves struggling with anxiety, depression, and other mental health disorders. These issues not only impact their academic performance but also their overall well-being. In this article, we will explore the various mental health problems faced by college students and the hazards associated with them.Common Mental Health Problems Among College Students1. AnxietyAnxiety is one of the most common mental health issues faced by college students. The constant pressure to perform well in exams, secure internships, and maintain social relationshipscan cause students to feel overwhelmed and anxious. Symptoms of anxiety include panic attacks, irrational fears, and obsessive thoughts. If left untreated, anxiety can severely impact a student's ability to focus on their studies and interact with others.2. DepressionDepression is another major mental health issue affecting college students. Feelings of hopelessness, sadness, and worthlessness can prevent students from engaging in activities they once enjoyed. Depression can also lead to poor academic performance, strained relationships with peers, and even suicidal thoughts. It is crucial for students to seek help if they are experiencing symptoms of depression.3. Eating DisordersMany college students struggle with eating disorders such as anorexia, bulimia, and binge eating. The pressure to maintain a certain body image, combined with the stress of academic responsibilities, can trigger disordered eating habits. Eating disorders not only impact a student's physical health but can also lead to severe mental health issues such as low self-esteem and body dysmorphia.4. Substance AbuseSubstance abuse is a common coping mechanism for college students dealing with stress, anxiety, and depression. Drugs and alcohol may provide temporary relief from negative emotions, but they can also exacerbate mental health issues in the long run. Substance abuse can lead to addiction, impaired judgment, and risky behavior, putting students at risk of academic failure and legal consequences.Hazards Associated with Mental Health Problems1. Academic PerformanceMental health problems can significantly impact a student's academic performance. Anxiety and depression can interfere with a student's ability to concentrate, retain information, and complete assignments on time. As a result, students may experience a decline in grades, difficulty focusing in class, and even dropping out of school.2. Social RelationshipsMental health issues can strain relationships with friends, family members, and romantic partners. Students experiencing anxiety or depression may isolate themselves, avoid social gatherings, or exhibit erratic behavior that alienates thosearound them. This can lead to feelings of loneliness, misunderstanding, and further exacerbate their mental health problems.3. Physical HealthThe connection between mental and physical health iswell-established. Chronic stress, anxiety, and depression can weaken the immune system, disrupt sleep patterns, and increase the risk of developing cardiovascular diseases. Students with mental health problems may neglect their physical well-being, leading to a cycle of poor health outcomes.4. 
Long-term ConsequencesIf left untreated, mental health issues among college students can have long-term consequences that extend beyond their academic years. Untreated anxiety and depression can persist into adulthood, affecting a student's career, relationships, and overall quality of life. Substance abuse can also lead to addiction, dependence, and irreversible damage to one's physical and mental health.ConclusionIn conclusion, mental health problems among college students are a pressing issue that requires immediate attentionand intervention. It is essential for students to prioritize their well-being, seek help when needed, and practice self-care strategies to cope with the challenges of college life. By addressing mental health issues proactively, students can prevent the hazards associated with untreated conditions and lead a fulfilling and successful academic journey.篇2The Psychological Problems and Hazards Faced by College StudentsCollege students are a unique demographic with a variety of pressures and challenges that can lead to psychological issues. The transition from adolescence to adulthood, the pressure to succeed academically and socially, and the independence of living away from home are all factors that can contribute to mental health problems in college students. It is important for universities to address these issues and provide support for students to ensure their well-being.One of the most common psychological problems faced by college students is stress. The demands of academic work, extracurricular activities, and social life can be overwhelming, leading to anxiety, depression, and other mood disorders.According to the American Psychological Association, stress is a major factor in the decline of college students' mental health. It is important for universities to provide resources such as counseling services, support groups, and stress management workshops to help students cope with stress.Another common issue faced by college students is homesickness. Many students are living away from home for the first time and may feel isolated and lonely. This can lead to feelings of depression and anxiety. Universities should provide resources to help students adjust to life on campus, such as orientation programs, roommate matching services, and support groups for homesick students.Substance abuse is another significant problem among college students. Many students turn to drugs and alcohol as a way to cope with stress, anxiety, and other mental health issues. This can lead to addiction, academic problems, and legal issues. Universities should provide education about the dangers of substance abuse, as well as counseling and support for students struggling with addiction.Eating disorders are also a concern for college students. The pressure to be thin, coupled with the stress of academic and social expectations, can lead to disordered eating habits.Universities should provide resources for students struggling with eating disorders, such as support groups, counseling services, and access to nutritionists.Overall, the psychological problems faced by college students can have serious consequences if left untreated. It is important for universities to prioritize student mental health and provide resources to help students cope with stress, homesickness, substance abuse, and eating disorders. 
By addressing these issues, universities can create a supportive and healthy environment for their students to thrive academically and personally.篇3The Hazardous Effects of Psychological Issues on College StudentsCollege life is a time of great excitement, growth, and exploration, but it can also present a variety of challenges to students. One of the most prevalent problems facing college students today is psychological issues. These issues can range from stress and anxiety to depression and even more serious mental health conditions. The impact of these psychologicalissues can be profound, affecting students' academic performance, relationships, and overall well-being.One of the most common psychological issues facing college students is stress. The pressures of academic work, extracurricular activities, and social obligations can all contribute to feelings of stress and overwhelm. If left unchecked, chronic stress can lead to a variety of negative consequences, including decreased focus and concentration, poor sleep quality, and even physical health problems. The constant pressure to perform well in all areas of their lives can take a toll on students' mental health and well-being.Another significant psychological issue facing college students is anxiety. Anxiety disorders are the most common mental health condition in the United States, and college students are not immune to their effects. Students may experience feelings of worry, fear, and unease that can interfere with their daily lives. This can manifest in a variety of ways, including difficulty concentrating, irritability, and even panic attacks. Anxiety can have a significant impact on students' academic performance and overall quality of life.Depression is another serious psychological issue that can affect college students. Depression is a mood disordercharacterized by persistent feelings of sadness, hopelessness, and loss of interest in activities. It can be triggered by a variety of factors, including stress, trauma, and genetic predisposition. College students may be particularly vulnerable to depression due to the pressures of academic work, social expectations, and transitioning to a new environment. Depression can have a profound impact on students' ability to function, affecting their motivation, energy levels, and overall sense of well-being.In addition to stress, anxiety, and depression, college students may also face other psychological issues, such as eating disorders, substance abuse, and self-harm. These issues can stem from a variety of factors, including societal pressures, personal trauma, and genetic predisposition. Left untreated, they can have serious consequences for students' physical and mental health.The impact of psychological issues on college students can be profound. These issues can interfere with students' academic performance, relationships, and overall quality of life. Students may struggle to concentrate on their studies, participate in social activities, and take care of themselves. Left untreated, psychological issues can have long-lasting effects on students' mental health and well-being.It is important for colleges and universities to recognize the significance of psychological issues among their students and provide resources and support to help them cope. This can include counseling services, support groups, and education on mental health and wellness. 
By addressing psychological issues early and effectively, colleges can help students thrive academically, socially, and emotionally.In conclusion, psychological issues pose a significant hazard to college students, affecting their academic performance, relationships, and overall well-being. It is important for colleges and universities to provide resources and support to help students cope with these issues effectively. By addressing psychological issues early and effectively, colleges can help students thrive and succeed in all areas of their lives.。
illumination
Special cases where benefits arise from using the logarithm transform for illumination invariant feature extraction
R. B. Fisher
Department of Artificial Intelligence, Edinburgh University
5 Forrest Hill, Edinburgh EH1 2QL
rbf@
Abstract
One of the goals of image feature extraction is to extract features from an image that are dependent on the scene, rather than the image, which also includes intensity information. In theory, a logarithmic transformation allows the extraction of many different types of image features, with the magnitude of the extracted feature being more representative of the scene property. Unfortunately, the magnitude of the effect is usually dominated by quantisation and image noise. This paper outlines the theory and demonstrates the special cases where there is an advantage to using the log transform.
1 Introduction
One of the goals of image feature extraction is to extract features from an image that are dependent on the scene, rather than the image, which also includes intensity information. A standard image model describes a digitally recorded image as:

P_ij = α (ρ_xy ψ_xy I_xy)^γ    (1)

where:
P_ij — measured intensity value at pixel (i, j)
α — a scale factor
ρ_xy — surface albedo at point (x, y) corresponding to pixel (i, j)
ψ_xy — surface orientation dependent reflectance scaling at point (x, y)
I_xy — incident illumination magnitude at point (x, y)
γ — gamma factor of the camera
Here, we assume that there is a linear relationship between the quantised image value and the analog input value (so that we can amalgamate all of the linear scale factors into α). Also, by reflectance, we mean both albedo (intrinsic light reflecting ability) and surface orientation dependent effects.
What is apparent from Equation (1) is that the measured pixel intensity value is a non-linear function of several factors. However, what is most commonly desired from image feature extraction are estimates of properties related to the surface reflectance ρ_xy, independent of the current illumination. Applying the standard feature detectors, such as edge detectors, bar detectors, blob detectors, corner detectors, etc., to an observed intensity image results in feature strengths that depend on the illumination as well as the underlying surface reflectance. This is almost axiomatic in the computer vision community and is taught to most students in their first course (e.g. [1], Sec 2.2.3 and Sec 3.2.4).
The usual recognition of this issue leads to the topic of color constancy and lightness ([3], Ch 9), whereby one assumes that the illumination is slowly varying, to allow separation of illumination and reflectance. The presumption behind the exploitation of lightness theory is that one uses the theory to reconstruct an image with the illumination removed. (This theory also requires many restrictive assumptions: planar world, no mutual illumination, uniform color patches, etc.)
However, rather than working on the reconstructed image, we can instead extract illumination independent features from a transformed image: by use of the logarithmic transformation exploited in the lightness computation, one can extract many different types of image features (or correspondingly, scene descriptions), with the magnitude of the extracted feature being a function of the underlying scene property. This paper describes this method and demonstrates its performance (or lack thereof) on a variety of scenes illuminated with different illuminants.
2 Theory
The feature extraction process begins with the same local logarithmic transformation as the lightness computation:

log(P_ij) = log(α (ρ_xy ψ_xy I_xy)^γ) = log(α) + γ log(I_xy) + γ log(ψ_xy) + γ log(ρ_xy)    (2)
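The following is a minimal numerical illustration of equation (2), built on an entirely synthetic scene (the albedo pattern, illumination gradient, α and γ below are arbitrary assumptions): after the log transform the illumination term is additive, so an operator that annihilates locally linear signals — here a Laplacian — suppresses a smoothly varying illumination.

```python
# Synthetic illustration of eq. (2): a Laplacian of log(P) suppresses a
# log-linear illumination, while the same operator on P does not. All scene
# values below are arbitrary assumptions.
import numpy as np
from scipy import ndimage

h, w = 64, 64
yy, xx = np.mgrid[0:h, 0:w]
rho = np.where(xx < w // 2, 0.3, 0.8)          # two-albedo "scene"
psi = np.ones((h, w))                          # flat surface, no shading
illum = np.exp(0.01 * xx)                      # illumination whose log is linear in x
alpha, gamma = 100.0, 0.9
P = alpha * (rho * psi * illum) ** gamma       # recorded image, eq. (1)

lap_log = ndimage.laplace(np.log(P))           # T_l applied to the log image
lap_raw = ndimage.laplace(P)                   # same operator on intensities
# Away from the albedo edge the log response is ~0 whatever the illumination,
# while the raw response grows with the local illumination level.
print(abs(lap_log[10, 10]), abs(lap_raw[10, 10]))
print(abs(lap_log[10, 55]), abs(lap_raw[10, 55]))
```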
Taking the logarithm has the effect of changing the multiplicative effect of the illumination into an additive effect. If the local feature extraction/transformation operator T_l satisfies the property:

T_l L = 0

when L is an image whose values fit a linear model over the same scale as the operator T_l, then T_l is able to remove a linear scaling of another image (e.g., as occurs when a scene is illuminated). Then, the operator applied to an image of a scene V with scaling S gives:

T_l log(V S) = T_l (log(V) + log(S)) = T_l log(V) + T_l log(S) = T_l log(V)    (3)

if log(V) is locally linear. Appendix A demonstrates that the logarithm of the 1/R² illumination component of an image (which may also be a constant illumination, as in the case of a distant point light source) is locally linear. T_l is also able to remove the linear intensity variation arising on locally curved Lambertian surfaces. Appendix B demonstrates that the logarithm of the brightness on a curved patch is locally linear. Examples of the T_l class of operator are the Laplacian, its discrete approximations, and the difference of Gaussian operators.
A second class of interesting operators, T_c, have the property:

T_c C = 0

when C is a locally constant image. Then, T_c is able to remove a constant illumination component from an image, as the log(L) component is also constant in Eqn (3). Examples of this class of operator are the Roberts' Cross, Sobel and Canny operators, as well as the operators in T_l.
Most local neighborhood arithmetic operators have approximately linear form. For example, doubling the values of the image pixels typically results in the doubling of output values. A typical example is the basic vertical gradient operator |p_{i+1} − p_i|, where p_{i+1} and p_i are the intensities of adjacent pixels. Thus, the strength of the output is proportional to the contrast between the pixels, and doubling the illumination doubles the pixel values, and hence doubles the contrast and gradient magnitudes. When working in the log domain, a different interpretation arises, wherein the output value has a character more like the contrast ratio (as compared to the absolute contrast). In the case of the vertical gradient operator, the function is now:

|log(p_{i+1}) − log(p_i)| = log( max(p_{i+1}, p_i) / min(p_{i+1}, p_i) ).

Doubling the pixel values now has no effect on the output.
While this theory looks promising, for most real images the potential effects are more limited. For example, assume that there is an illumination contrast across a 256² image with a strong gradient creating a brightness of 255 at one edge and 128 at the other edge. The expected variation in average illumination across a 3×3 operator is 2(255 − 128)/256 ≈ 1, assuming that the noise level is on the order of a few low order bits. However, in the course of evaluating the use of the log transform, we did observe several cases where some slight benefits might be achieved, and some where it is not advised to use the log transform.
3 Experiments
This section will look at several applications of this approach:
1. a small mask vertical gradient operator (a first-order operator), with the goal of assessing if the use of the log function does reduce the illumination gradient effects.
2. the Sobel operator (a first-order operator), with the goal of assessing the elimination of shading effects on curved surfaces.
3. a small mask horizontal dark bar operator (a second-order operator), where we investigate the effect of different strong illumination gradients on the same scene, with the goal of assessing the elimination of the illumination gradient using a second-order operator.
4. a small mask horizontal dark bar operator (a second-order operator), where we investigate the effect of different illumination levels on the same scene, with the goal of assessing the elimination of the illumination differences using a second-order operator.
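Before turning to the individual experiments, here is a quick numerical check of the log-domain gradient property from Section 2 (a sketch; the two pixel values are arbitrary assumptions):

```python
# Doubling the illumination doubles the absolute gradient but leaves the
# log-domain gradient (a contrast ratio) unchanged. Pixel values are arbitrary.
import numpy as np

p = np.array([40.0, 60.0])                # two adjacent pixel intensities
for scale in (1.0, 2.0):                  # simulate doubling the illumination
    q = scale * p
    print(scale, abs(q[1] - q[0]), round(abs(np.log(q[1]) - np.log(q[0])), 4))
# prints: 1.0 20.0 0.4055   then   2.0 40.0 0.4055
```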
In the experiments below, all input images were normal intensity images obtained by a standard CCD camera, which then had the conservative smoothing operator [2] applied to reduce point noise. In all of the comparison experiments there is a problem of establishing the basis for comparison of an operator on two different types of images. The output values, when an algorithm is applied to the log image, are much lower than when it is applied to an intensity image (because of the compression of input values given by the log function). The log image output also has a compression relative to the intensity values, and this requires nonlinear scaling to relate the two output images, as most operators are linear on their input data. In addition, the log image output has a much higher noise distribution arising from quantisation effects in dark areas. Appendix C discusses this effect. The source of the problem can be seen by comparing the effect of adding 1 quantum of noise to an underlying signal of 10 versus adding it to an underlying signal of 100. In the intensity image, the difference is 1 in both cases; in the log image, the difference is log(11/10) = 0.09 versus log(101/100) = 0.009, or about 10 times worse at the low signal levels. In the future, if the data were from a 16-bit scanner, then there would be a larger dynamic range and so a few quanta of noise would have a smaller effect.
So, in general, to relate the two images for display and comparison, we have scaled the log image output so that the features detected in a small test image are comparable to the features detected in the corresponding intensity image (i.e.
The experiments reported here then used this rescale factor when the operators were applied to the test images.

3.1 Vertical Edge Detection

Figure 1 (right) shows an intensity image of a test pattern (left) with bars of intensity 1, 63 and 255, illuminated from above. Figure 2 (left) shows the test image operated on by the |p_{i+1} - p_i| vertical gradient operator (scaled by 4 and clipped at 0 and 255). The right image shows the gradient image thresholded at 75.

Figure 1: Synthetic and illuminated real image.

Figure 2: Vertical edge detector (inverted image) on intensity image and thresholded at 75.

Figure 3 shows the effect of the same operator on the log of the intensity image (scaled by 400). The right image shows the gradient image also thresholded at 75. Obviously, the choice of threshold is relatively arbitrary; however, here it was chosen to be the same as the intensity image threshold, and the operator scaling was adjusted to produce approximately the same set of edges for a thin horizontal test window across the real image. Comparison of the thresholded images shows slightly more edge detected in the log images, which arises because the effect of the illumination gradient has been reduced. Comparison of the two unthresholded edge gradient images also verifies that the log image exposes the weaker edges at the top of the image more clearly, but at the cost of also increasing the noise.

Figure 3: Vertical edge detector (inverted image) on the log of intensity image and thresholded at 75.

Figure 4 (left) shows the histogram of a subset (near the edge at column 113 - out of 256) of the gradient values from the vertical gradient operator applied to the intensity image, and (right) shows the corresponding histogram from the log image. The histograms verify one of the effects of the use of the log operator: the gradient magnitude values along the edge are more tightly clustered in the log image. In theory they should be constant, but in both cases the histogram peaks are spread out because of the aliasing between the edge position and image blurring. In the intensity image case, the gradient values are more spread out (with larger gradient values at the bottom of the image and smaller gradient values at the top of the image), resulting from the varying illumination leading to different contrasts across the edges.

Figure 4: Histogram of intensity image vertical edge at column 109 (left) and histogram of corresponding log image edge (right).

3.2 Shading on Curved Surfaces

Figure 5 (left) shows a matte cylindrical surface illuminated to have shading across its surface. The middle image shows the inverted Sobel operator output on the intensity image and the right image shows the same for the Sobel operator on the log image. The log image output was scaled by 60.0 to produce a nearly identical set of edges when thresholded at the same level on a small test pattern placed near the cylinder. In both the middle and right images, the point to note is that the smoothly shaded region on the cylinder has much lower gradient estimates in the log image. There is also a region on the right of the cylindrical surface where the measured intensities are low and so noise is increased. However, this effect is partly related to the operator scaling.

Figure 5: Original image (left), invert of Sobel on intensity image (middle), invert of Sobel on log image (right).

Applying a Laplacian of Gaussian operator to both the intensity and log images produced virtually identical (approximately) zero mean Gaussian output value distributions, as predicted by Appendix B.
3.3 Horizontal bar detection with varying illumination direction and strong contrast

This experiment looked at applying a 5x5 bar detection operator:

     2   2   2
    -1  -1
    -2  -2  -2
    -1  -1
     2   2   2

to the image in Figure 6 (left), with results from the intensity image in the centre and the log image at the right. Both bar operator images are inverted for clarity. The same operator was applied to another image where the strong illumination contrast was now from the upper right (results not shown here). As seen in the two figures, there is virtually no difference in the bar detector output images.

Figure 6: Bright image with strong illumination from the lower right (left), invert of bar detector on intensity image (middle), invert of bar detector on log image (right).

3.4 Horizontal bar detection with varying constant illumination magnitude

This experiment applied the same 5x5 bar detection operator as in the previous experiment, except to two images with no illumination gradient, but with a large difference in average illumination levels. As seen in Figures 7 and 8, there is virtually no difference in the bar detector outputs, except that there is more noise in the dark image outputs (Figure 8), which is amplified in the log image outputs (Figure 8, right).

Figure 7: Bright image with constant illumination (left), invert of bar detector on intensity image (middle), invert of bar detector on log image (right).

Figure 8: Dark image with constant illumination (left), invert of bar detector on intensity image (middle), invert of bar detector on log image (right).

4 Discussion and Conclusions

To summarise the results of the experiments presented above, these are the areas where operator output differs most significantly between the two approaches:

- Strong illumination gradients: If there is a strong illumination gradient across the image, then the output of a gradient operator on the log image is slightly less dependent upon the illumination than when applied to the intensity image. This can improve the consistency of operator results across an image, or improve the consistency of operator results between images of the same scene under changing illumination (when varying the illumination gradient direction).

- Strong surface shading effects: If there is a strong surface shading variation arising from oblique lighting on a curved surface, then the first-order operator outputs on the log image are more representative of the underlying scene.

- Low intensity values: If the image intensity values are low then, because of the finite quantisation of intensity values, image noise becomes much more significant in the log image.

The mathematical and empirical demonstrations in the paper assumed that the operators were being applied to smooth surface regions, where the illumination function was continuous across all pixels in an operator's input neighbourhood. This assumption is invalid across depth and orientation discontinuities and across illumination discontinuities (e.g. shadow boundaries). However, what we observed in these situations is that operator outputs did vary between the log and intensity images, but not in a significant manner. The consequences in image regions where surfaces are mutually illuminating remain unclear.

In summary, the analysis and experiments presented above demonstrate that there are situations where the use of first and second order operators on the logarithm of the intensity image, rather than on the intensity image directly, should be considered. The cases where the differences are of significance are limited to special situations of strong illumination contrast and strong shading, and require bright images.
The benefits might be even greater if 12 and 16 bit images become common. The paper also showed that, in more typical situations, the output images of the two approaches did not differ in obviously significant ways.

Acknowledgements

Thanks to C. Robertson for comments and suggestions.

References

[1] D.H. Ballard and C.M. Brown. Computer Vision. HTML document published on CDrom by John Wiley and Sons, Chichester, 1996.

[3] B.K.P. Horn. Robot Vision. MIT Press, Cambridge MA, 1986.

A The logarithm of the illumination from a point source is locally linear

Consider three neighbouring surface points at (x - δ, y), (x, y) and (x + δ, y), illuminated by a point light source at the origin. Ignoring terms in δ^2, the 1/R^2 illumination at these points is proportional to:

    1/(y^2 + (x - δ)^2),  1/(y^2 + x^2),  1/(y^2 + (x + δ)^2)

Taking the log gives:

    -log(y^2 + x^2 - 2xδ),  -log(y^2 + x^2),  -log(y^2 + x^2 + 2xδ)

The first order Taylor expansion log(1 + x) ≈ x gives:

    -log(y^2 + x^2) + 2xδ/(x^2 + y^2),  -log(y^2 + x^2),  -log(y^2 + x^2) - 2xδ/(x^2 + y^2)

which is clearly linear for small offsets δ.

B Shading on cylindrical surface is locally linear

Simplify the problem to lie in two dimensions, with the light source direction l = (1, 0). A locally circular surface fragment with local radius R has its center of curvature at the origin (0, 0). The image plane is at (0, d) and orthographic projection occurs perpendicular to the image plane. Suppose that we observe the surface at points (x - δ, y1), (x, y2), (x + δ, y3), where x_i^2 + y_i^2 = R^2. The surface normals n_i at these points are:

    (1/R)(x - δ, y1),  (1/R)(x, y2),  (1/R)(x + δ, y3)

so that the Lambertian brightness n_i · l at the three points is (x - δ)/R, x/R, (x + δ)/R. Thus, the brightness is locally linear. If we now look at the log image, and again apply the Taylor expansion log(1 + x) ≈ x, we get:

    log(x/R) - δ/x,  log(x/R),  log(x/R) + δ/x

which is again locally linear.

C Noise in low intensity regions is more consequential in the log image

We look at how the vertical gradient operator |p_{i+1} - p_i| is affected by image noise. Assume that pixels p_{i+1} and p_i are sampled from a constant image region with mean value µ and Gaussian noise ε_i with variance σ^2. Then, the expected (mean) value of p_{i+1} - p_i is 0, and the expected variance is:

    Var(p_{i+1} - p_i) = 2 σ^2

To calculate the variance of log(p_{i+1}) - log(p_i), we first approximate this function by:

    log(p_{i+1}) - log(p_i) = log((µ + ε_1)/(µ + ε_2)) = log(1 + ε_1/µ) - log(1 + ε_2/µ) ≈ (ε_1 - ε_2)/µ

so that:

    Var(log(p_{i+1}) - log(p_i)) ≈ 2 σ^2 / µ^2

Thus, the noise in the vertical gradient of the log image is a function of the underlying intensity level, with the noise variance increasing as the brightness decreases. The practical implication of this analysis is that, for the operator scaling that produces comparable edges selected in the intensity and log images, the background noise level in the log image is much higher than that of the intensity image in dark regions of the image, and is lower in bright regions.
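The 2σ^2/µ^2 result above, and the factor of about 10 quoted in the experiments section, can be checked numerically. The following short sketch (an illustration added here, not part of the original paper) simulates the vertical gradient on constant patches of brightness 10 and 100 with noise of about one grey level:

import numpy as np

rng = np.random.default_rng(1)
sigma = 1.0                      # noise of roughly one grey level (a few low-order bits)

for mu in (10.0, 100.0):         # dark and bright constant regions
    p = mu + sigma * rng.standard_normal(200_000)
    grad_int = np.diff(p)                 # p_{i+1} - p_i on the intensity image
    grad_log = np.diff(np.log(p))         # the same first difference on the log image
    print(mu, grad_int.std(), grad_log.std(), np.sqrt(2.0) * sigma / mu)

# The intensity-domain spread is about sqrt(2)*sigma for both regions, while the
# log-domain spread is roughly 10 times larger at mu = 10 than at mu = 100,
# as the 2*sigma^2/mu^2 prediction of Appendix C suggests.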
DETERMINATION OF IMAGE ORIENTATION SUPPORTED BY IMU AND GPS

Karsten Jacobsen
University of Hannover
Institute for Photogrammetry and Engineering Surveys
Nienburger Str. 1, D-30167 Hannover
Jacobsen@ipi.uni-hannover.de

KEY WORDS: Inertial Measurement Unit (IMU), GPS, bundle block adjustment

ABSTRACT

For operational use, the photo orientations of a block with more than 1000 images have been determined with an LCR88 inertial measurement unit (IMU) and GPS. The relation of the IMU to the camera and corrections for the GPS-data of the projection centers have been determined and improved by means of a small reference block with 5 - 8 photos. For checking purposes the photo coordinates of 252 photos have been measured and the orientations determined by a combined bundle block adjustment with the GPS-data of the projection centers, based on 9 control points. The achieved accuracy of the photo orientations based on IMU and GPS is sufficient for the creation of orthophotos, but problems still exist with y-parallaxes in the models. The y-parallaxes can be reduced by a combined bundle block adjustment without control points or by a more expensive inertial measurement system.

1. INTRODUCTION

A combined bundle block adjustment without control points is possible for a block if a larger percentage of the projection centers is determined by relative kinematic GPS-positioning (Jacobsen 1997). In the case of a real block structure, attitude data are not required; they can be determined by the combined block adjustment with GPS-data (figure 1), that means, if at least 2 parallel flight strips are available. The flight strips may be located one beside the other or even with a vertical displacement of the flight lines (figure 2). The classical location of one flight axis beside the other has the advantage of the same photo scale; this makes the determination of tie points easier.

Figure 2: block configuration of linear objects - IMU-data not required

Only for a single flight strip or a combination of single flight strips (figure 3) attitude data are required in addition to GPS-coordinates of the projection centers if no control points are available, because of the problem with the lateral tilt. But even for a real block structure, the combined use of GPS and IMU in the aircraft has some advantages. In a combined computation of the positions with IMU- and GPS-data, GPS cycle slips can be determined, and so the problem of shifts and drifts of the GPS-data, different from flight strip to strip, can be solved.

Figure 1: ... with GPS - crossing flight strips every 20 - 30 base

In such a case the crossing flight strips are not directly required, but they do have the advantage of a better control of the block geometry, and they also avoid the problems of an inaccurate lateral tilt of long flight strips.

Figure 3: typical block configuration of linear objects
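The lateral-tilt problem for a single strip mentioned above can be made explicit with a short geometric argument (added here for illustration; it is not spelled out in the paper). All projection centers of one strip lie approximately on the flight axis a, so a rotation R_a(alpha) of the complete strip about this axis leaves them unchanged:

    R_a(alpha) X_0i = X_0i   for every projection center X_0i on a,
    X_g -> R_a(alpha) X_g    for every ground point X_g.

Because cameras and ground points rotate together, all image coordinates are unchanged as well, and the GPS observations of the projection centers are unchanged because the centers lie on the rotation axis. The common roll angle alpha of the strip can therefore not be determined from GPS projection centers alone; it has to come from attitude data or from control points.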
2. PROJECT

Because of a very restricted time frame for a project handled by the company BSF (Berliner Spezialflug Luftbild und Vermessungen GmbH, Diepensee), the bundle adjustment of a block with 1041 photos should be replaced by the direct determination of the photo orientations by means of relative kinematic GPS-positioning in combination with IMU. The relation between the LCR88 inertial measurement system, mounted on top of the camera, and the camera itself should be determined by means of a small reference block flown before and after the project area.

The large block size required a photo flight on 2 days, so the small reference block has been imaged 4 times, with in total 37 photos. The flying height of approximately 1090 m above terrain corresponds, with the focal length of 305 mm, to a photo scale of 1 : 3500. The large photo scale was not required for the accuracy of the ground points, but for the identification of the mapping objects.

For checking purposes a block adjustment of 252 photos (check area in figure 4) has been made, based on 9 control points. 9 control points are not sufficient for such a block of 12 flight strips without crossing lines, so a combined adjustment with coordinates of the projection centers determined by kinematic GPS-positioning was required. Of course this is not a totally independent determination of the photo orientations - the same GPS-data have been used as in the determination of the orientations without control points - but the systematic errors of the GPS-data could be determined independently, based on the control points in the check area.

3. PREPARATION OF THE INERTIAL DATA

The combined determination of the GPS-positions together with the attitudes, based on the LCR88, has been made by IGI, Hilchenbach, by means of Kalman filtering. The conditions for the GPS-positioning were not optimal; partially only 5 satellites have been available and the PDOP was going up to 3.

As a first result only pitch, roll and yaw have been available. With the program IMUPRE of the Hannover program system BLUH these have been converted into the usual photo orientations, respecting the convergence of meridians, the different rotation definition and the dimension of the attitude data (grads instead of degrees). By a comparison of the photo orientations of the reference blocks (figure 5) with the orientations determined by means of GPS and IMU, the relation of the axes between the photogrammetric camera and the IMU has been determined, as well as systematic differences of the GPS-positions. By linear time-depending interpolation, based on the relation before and after the flight over the main area, the photo orientations of the images in the main area have been improved. The improvement of the attitude data was done in the pitch, roll and yaw system, corresponding to the relation of the axes.

Figure 5: one of the reference blocks with control points

                                         roll [grads]   pitch [grads]   yaw [grads]
systematic differences           day 1      -.445           -.469          .534
                                 day 2      -.454           -.462          .571
mean square differences
without systematic differences   day 1       .039            .012          .044
                                 day 2       .029            .016          .049
after linear fitting             day 1       .025            .009          .007
                                 day 2       .021            .009          .010
after fitting by t³              day 1       .011            .009          .007
                                 day 2       .021            .009          .010

Table 1: differences of the attitude data IMU - controlled bundle block adjustment (reference blocks)

Table 1 shows the differences and the mean square differences between the IMU-data and the orientations determined by bundle block adjustment of the small reference blocks, based only on control points. The first and last images of the reference blocks have not been taken into account because they are not so well supported by control points (see also figure 5), so approximately only 6 photos of each of the 4 reference blocks have been used for comparison. A linear time-depending improvement of the attitude data is required because the roll has changed on both days by approximately 0.070 grads between the reference area flown before and after the main area; the yaw has changed by 0.080 grads on the first day and by 0.100 grads on the second day. There was no significant change of the pitch.
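The linear time-depending improvement can be sketched as follows (an illustrative fragment, not the IMUPRE/BLUH implementation; the variable names, the example numbers and the sign convention are assumptions). The misalignment determined from the reference block flown before the main area and from the block flown after it is interpolated over the exposure time of each photo:

import numpy as np

def boresight_at(t, t_ref1, corr_ref1, t_ref2, corr_ref2):
    """Linearly interpolate the IMU-to-camera attitude correction (roll, pitch,
    yaw in grads) between the reference blocks flown before (t_ref1) and after
    (t_ref2) the main area."""
    w = (t - t_ref1) / (t_ref2 - t_ref1)
    return (1.0 - w) * np.asarray(corr_ref1) + w * np.asarray(corr_ref2)

# Hypothetical numbers in the spirit of Table 1 (day 1): the roll offset changes
# by about 0.07 grads between the two reference flights.
corr_ref1 = np.array([-0.41, -0.47, 0.50])    # before the main area
corr_ref2 = np.array([-0.48, -0.47, 0.58])    # after the main area
t_ref1, t_ref2 = 0.0, 5400.0                  # exposure times in seconds (assumed)

t_photo = 1200.0                              # one photo of the main block
rpy_imu = np.array([0.12, -0.03, 51.20])      # measured attitude (hypothetical)
# Sign convention assumed here: the interpolated offset is subtracted from the IMU attitude.
rpy_corrected = rpy_imu - boresight_at(t_photo, t_ref1, corr_ref1, t_ref2, corr_ref2)
print(rpy_corrected)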
The photo orientations determined by bundle block adjustment based on control points are not free of errors. The adjustment gives the following mean square standard deviations as mean values over all photos: Sphi = 0.0017 grads, Somega = 0.0017 grads, Skappa = 0.00042 grads, SX0 = 0.033 m, SY0 = 0.034 m, SZ0 = 0.015 m. But this is only the internal accuracy; it does not show the problems of the strong correlation between phi and X0 and between omega and Y0.

          phi     omega   kappa    X0      Y0      Z0
phi      1.00      .03    -.06    1.00    -.03     .30
omega     .03     1.00     .08     .03   -1.00     .11
kappa    -.06      .08    1.00    -.06    -.09    -.01
X0       1.00      .03    -.06    1.00    -.03     .29
Y0       -.03    -1.00    -.09    -.03    1.00    -.11
Z0        .30      .11    -.01     .29    -.11    1.00

Table 2: correlation matrix of the photo orientations

Table 2 shows the strong correlations listed with 1.00, that means they are larger than 0.995. For this reason a complete separation between the attitude data and the projection center coordinates is not possible. It may happen that a correction of the attitude data is made although the differences belong to the GPS-data, and the reverse. A separation between both is only possible based on opposite flight directions or different flying altitudes (see also Jacobsen 1999).

Figure 6: discrepancies in the projection centers: GPS - bundle block adjustment as a function of the time

The graphic representation of the discrepancies in the projection centers between the GPS-data and the photo orientations of the bundle block adjustments in figure 6 shows problems of the GPS-data. The drift of the X-coordinate in the second part of the second day, in the range of 1.5 m, corresponds to a difference in phi of 0.078 grads. This corresponds very exactly to a drift of phi with a size of 0.079 grads. This demonstrates the problem of the reference data, especially if a normal angle camera (f = 305 mm) is used. Such corresponding values cannot be seen at the other reference blocks.

4. ANALYSIS

Based on the bundle block adjustment of the check area, including photos of 12 flight strips, each with 21 images, the photo orientations based on IMU and GPS, improved by means of the reference blocks, have been analyzed. 9 control points are not sufficient for such a block without crossing flight strips, so a combined adjustment with GPS-data of the projection centers was necessary.

Figure 7: configuration of the check area with the control points

The mean square differences at the control points have been 3 cm for X and Y and 6 cm for the height, together with a sigma0 of 9 µm. Based on the control points, the improved GPS-data have been shifted by 11 cm in X, 15 cm in Y and 59 cm in the height, indicating that the GPS-data improved by the reference blocks still have remarkable systematic errors.

Figure 8: discrepancies of the attitude data corrected IMU - bundle block adjustment f(time)

                         pitch [grads]   roll [grads]   yaw [grads]
absolute                     0.028           0.020         0.059
without shift errors         0.010           0.010         0.013
linear fitting               0.010           0.010         0.007

Table 3: discrepancies of the attitude data corrected IMU - bundle block adjustment

Figure 8 and table 3 show the discrepancies of the attitude data between the IMU-data improved by the reference blocks and the orientations determined by bundle block adjustment, as well as the results after elimination of constant shifts and also of drifts, individually for every flight strip. Especially larger differences in the yaw can be seen. The yaw has only a very small correlation to the other orientation elements and it can be determined more precisely than the other attitude values; that means the determined discrepancies can only be explained by the IMU-data. On the other hand, the influence of errors in yaw on the image and also on the object space is smaller than the influence of the other attitude data.

                         X0 [m]   Y0 [m]   Z0 [m]
absolute                  0.21     0.22     0.64
without shift errors      0.15     0.13     0.05
linear fitting            0.16     0.14     0.05

Table 4: discrepancies of the projection centers corrected GPS-data - bundle block adjustment

The discrepancies at the projection center coordinates between the GPS-data corrected by the reference blocks and the results of the bundle block adjustment of the check area correspond to the discrepancies determined by the combined block adjustment itself. Especially the discrepancies in Z0 are obvious.
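Tables 3 and 4 list the discrepancies once as absolute values, once after removing a constant shift per flight strip, and once after an additional linear (drift) fit per strip. This kind of evaluation can be sketched as follows (illustrative Python, not the BLUH module used in the paper; the data in the example are hypothetical):

import numpy as np

def rms(x):
    return np.sqrt(np.mean(np.square(x)))

def strip_statistics(t, d, strip_id):
    """RMS of discrepancies d(t): absolute, after removing a constant shift per
    flight strip, and after removing a linear (shift + drift) trend per strip."""
    no_shift, linear = np.empty_like(d), np.empty_like(d)
    for s in np.unique(strip_id):
        m = strip_id == s
        no_shift[m] = d[m] - d[m].mean()
        a, b = np.polyfit(t[m], d[m], 1)      # drift and offset of this strip
        linear[m] = d[m] - (a * t[m] + b)
    return rms(d), rms(no_shift), rms(linear)

# Hypothetical example: yaw discrepancies [grads] of three strips with small drifts.
rng = np.random.default_rng(2)
strip_id = np.repeat(np.arange(3), 21)        # 3 strips, 21 photos each
t = np.tile(np.linspace(0.0, 60.0, 21), 3)    # seconds within each strip
d = 0.05 * (strip_id - 1) + 2e-4 * t + 0.007 * rng.standard_normal(t.size)
print(strip_statistics(t, d, strip_id))       # absolute > without shift >= linear fit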
The yaw has only a very small correlation to other orientation elements and it can be determined more precise than the other attitude values, that means the determined discrepancies can only be explained by the IMU-data. On the other hand the influence of errors in yaw to the image and also the object space is smaller than the influence of the other attitude data.X0 [m]Y0 [m]Z0 [m] absolute0.210.220.64 without shifterrors0.150.130.05 linear fitting0.160.140.05 Table 4: discrepancies of the projection centerscorrected GPS-data – bundle block adjustmentThe discrepancies at the projection center coordinates between the GPS-data corrected by the reference blocks and the results of the bundle blockadjustment of the check area are corresponding to the discrepancies determined by the combined block adjustment itself. Especially the discrepancies in Z0 are obvious.More important than the discrepancy of the individual orientation components are the discrepancies at the ground coordinates determined with the improved photo orientations. With the photo coordinates and the photo orientations determined by GPS and IMU a combined intersection has been computed (iteration 0 of program system BLUH) and the resulting ground coordinates have been compared with the results of the controlled bundle blockX, Y RMSX=0.42m RMSY=0.18mThe discrepancies at the ground coordinates shown in figure 9 (only 10% of the 1886 points are plotted) are within the project specifications. Changing systematic errors can be seen, but the relative accuracy is still better.X [m]Y [m]Z [m] RMS of absolutedifferences0.420.180.85systematic differences-0.180.01-0.59 RMS withoutsystematic differences0.380.180.61relative accuracy(<500m)0.190.100.36Table 5: discrepancies at the ground coordinates As it can be seen in figure 10 and also in table 5, the discrepancies of the Z-components of the ground points are dominated by systematic errors. But also if the overall systematic error of –0.59m is respected, the root mean square differences are only reduced to 0.61m. For a comparison with the X and Y-component, the height to base relation of 3.2 has to be respected, that means, the value of 0.61m corresponds to 0.19m and this is still in the range of the X- and Y-component.Figure 10: discrepancies at the ground coordinates Z - plot of 10% of the 1886 pointsThe absolute differences of the ground coordinates are important for the creation of orthophotos. For the setup of models, the y-parallax is more important. If the y-parallax is reaching the size of the floating mark, usually in the range of 30µm, the operator is getting problems with the stereoscopic impression of the floating mark in relation to the model. For the y-parallax only the relative accuracy of the orientations of both used photos are important. The relative accuracy of the attitude data of neighbored photos has been determined by program BLAN of program system BLUH together with the covariance function. The correlation of neighbored phi-values are c=0.81 and for omega it is c=0.57, that means, the values are strongly dependent. The relative accuracy has following values: Sphi rel = 0.011grads, Somega rel= 0.010grads, Skappa rel = 0.005grads. For theinfluence to the model, these values have to bemultiplied by 2, but the influence of the reference data has to be taken out.Just the value omega has an influence in the center of the model of tan 0.010 grads • 305mm = 53 µm, multiplied by 2 it reaching 75µm. 
Corresponding to this, the combined intersection of the photo orientations based on IMU- and GPS-data with the photo coordinates of the check area has had a resulting standard deviation of the photo coordinates of 105µm. Such an amount can not be accepted for a model orientation.5.CONCLUSIONThe determination of the image orientations by means of an LCR88-IMU and GPS has resulted in an accuracy of the ground coordinates of 0.42m for X, 0.18m for Y and 0.85m for Z. This was sufficient for the project. Systematic errors are existing, especially for the height.A problem is existing with the used reference blocks, each with 9 images, required for the determination of the relation between the IMU and the photogrammetric camera, but also for a shift-correction (datum) of the projection center coordinates determined by relative kinematic GPS-positioning. The separation of the influence of the IMU and GPS is a problem especially for normal angle cameras (f=305mm). Such reference blocks have to be flown twice in opposite direction or with a different flying altitude.The achieved image orientations are not sufficient for the setup of a model. If this is required, a more accurate IMU-system, that means a more expensive one, has to be used. But even this does not guarantee today the required quality. The best and save solution is the use of the IMU- and GPS-data in a combined bundle block adjustment. This still requires the determination of photo coordinates for the block adjustment – with automatic aero triangulation the effort is limited. A combined bundle adjustment includes also a better reliability. The main advantage of photo orientations based on IMU- and GPS-data is the possibility to reduce the number of required control points, especially for linear objects. Without control or check points usually such results are not respected. Only for special projects in remote areas or in the coastal zone today such photo orientations are accepted without additional checking possibilities.6.ACKNOWLEDGMENTThanks are going to BSF (Berliner Spezialflug Luftbild und Vermessungen GmbH, Diepensee) and IGI, Hilchenbach for the availability of the data and the fruitful cooperation.7.REFERENCESElias, B., (1998): Bündelblockausgleichung mit INS-Daten, diploma thesis University ofHannover 1998Jacobsen, K. (1997): Operational Block Adjustment without Control Points, ASPRS AnnualConvention 1997, SeattleJacobsen, K. (1999): Combined Bundle Block Adjustment with Attitude Data, ASPRSAnnual Convention 1999, PortlandLee, J.O. (1996): Untersuchung von Verfahren zur kombinierten Aerotriangulation mittelsintegriertem GPS/INS, doctor thesisUniversity of Hannover 1996。