Robust Tracking with Motion Estimation and Local Kernel-Based Color Modeling
Adaptive Non-Singular Fast Terminal Sliding Mode Control of a Manipulator
Electronics Optics & Control, Vol. 28, No. 5, May 2021. Citation: XU B Z, SONG G F, WANG C, et al. Adaptive non-singular fast terminal sliding mode control of manipulator[J]. Electronics Optics & Control, 2021, 28(5): 46-50.

Adaptive Non-Singular Fast Terminal Sliding Mode Control of a Manipulator
XU Baozhen 1, SONG Gongfei 1,2,3, WANG Chao 1, CAO Guangxu 4
(1. School of Automation, Nanjing University of Information Science & Technology, Nanjing 210044, China; 2. Key Laboratory of Advanced Control and Optimization for Chemical Processes, Ministry of Education, Shanghai 200237, China; 3. Collaborative Innovation Center of Atmospheric Environment and Equipment Technology, Nanjing 210044, China; 4. The 28th Research Institute of China Electronics Technology Group Corporation, Nanjing 210044, China)

Abstract: For the finite-time robust control problem of rigid manipulators, a new adaptive non-singular fast terminal sliding mode control method is proposed. The method combines non-singular fast terminal sliding mode control with an adaptive law: the non-singular fast terminal sliding surface speeds up the convergence of the manipulator's trajectory tracking error and avoids the singularity problem of terminal sliding mode; a hyperbolic tangent function replaces the sign function to reduce chattering of the control input; and the adaptive law estimates the unknown external disturbance and the system uncertainties, so that trajectory tracking is achieved even when the lumped disturbance is unknown.
A Lyapunov function is constructed to prove that the manipulator system converges stably in finite time.
Finally, simulation results for a two-degree-of-freedom manipulator verify the effectiveness and robustness of the designed controller.
Keywords: terminal sliding mode control; manipulator; trajectory tracking; adaptive law; finite-time convergence
CLC number: TP242    Document code: A    doi: 10.3969/j.issn.1671-637X.2021.05.011

Adaptive Non-singular Fast Terminal Sliding Mode Control of Manipulator
XU Baozhen1, SONG Gongfei1,2,3, WANG Chao1, CAO Guangxu4
(1. School of Automation, Nanjing University of Information Science & Technology, Nanjing 210044, China; 2. Key Laboratory of Advanced Control and Optimization for Chemical Processes, Shanghai 200237, China; 3. Collaborative Innovation Center of Atmospheric Environment and Equipment Technology, Nanjing 210044, China; 4. The 28th Research Institute of China Electronics Technology Group Corporation, Nanjing 210044, China)
Abstract: For the finite-time robust control of a rigid robot manipulator, a new adaptive non-singular fast terminal sliding mode control method is proposed. This method combines non-singular fast terminal sliding mode control with an adaptive law. Firstly, the non-singular fast terminal sliding surface is selected, which is used to accelerate the convergence rate of the trajectory tracking error of the manipulator and to solve the singularity problem of the terminal sliding surface. Then, a hyperbolic tangent function replaces the sign function to reduce chattering of the control input. Moreover, the adaptive law estimates the unknown external disturbance and uncertainties, so as to achieve trajectory tracking with an unknown lumped disturbance. It is proved that the robot manipulator system can converge stably in finite time by establishing a Lyapunov function. Finally, the simulation results of a two-DOF robot manipulator are presented to illustrate the effectiveness and robustness of the proposed control method.
Key words: terminal sliding mode control; robot manipulator; trajectory tracking; adaptive law; finite-time convergence

0 Introduction
With the rapid development of the materials, electronics and machinery industries, high-performance
Received: 2020-11-06; Revised: 2021-04-26. Funding: General Program of the National Natural Science Foundation of China (61973170); Fundamental Research Funds for the Central Universities (2020ACOCP02). About the first author: XU Baozhen (1997-), female, from Yichun, Jiangxi Province, master's student; research interest: robot trajectory tracking control.
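To make the controller structure described in the abstract concrete, here is a small numerical sketch of a non-singular fast terminal sliding mode law with a tanh-smoothed switching term and a simple adaptive gain. The surface form, all gains, the disturbance signal, and the reduction to a single-joint double-integrator error model are illustrative assumptions, not the paper's actual design.

```python
import numpy as np

k1, k2, g1, g2 = 2.0, 1.0, 2.1, 1.5   # assumed surface parameters, with 1 < g2 < 2 < g1
eps, eta, gam = 0.05, 0.5, 2.0        # tanh boundary-layer width, reaching gain, adaptation rate

def surface(e, de):
    """Assumed non-singular fast terminal sliding surface s(e, e_dot)."""
    return e + k1 * abs(e)**g1 * np.sign(e) + k2 * abs(de)**g2 * np.sign(de)

# Tracking-error dynamics reduced to a double integrator: e_ddot = u + d(t)
dt, steps = 1e-3, 5000
e, de, K_hat = 1.0, 0.0, 0.0          # initial error, error rate, adaptive disturbance bound
for k in range(steps):
    t = k * dt
    s = surface(e, de)
    # Equivalent control; the exponent 2 - g2 lies in (0, 1), so there is no
    # singularity at de = 0 (the non-singular property mentioned in the abstract).
    u_eq = -(1.0 / (k2 * g2)) * abs(de)**(2.0 - g2) * np.sign(de) \
           * (1.0 + k1 * g1 * abs(e)**(g1 - 1.0))
    u = u_eq - (K_hat + eta) * np.tanh(s / eps)   # tanh instead of sign() reduces chattering
    K_hat += gam * abs(s) * dt                    # simple adaptive law for the unknown bound
    d = 0.3 * np.sin(5.0 * t)                     # example unknown lumped disturbance
    de += (u + d) * dt
    e += de * dt

print(f"final |e| = {abs(e):.2e}")   # the error is driven to a small neighbourhood of zero
```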
Robust Control and Estimation
Robust control and estimation are critical aspects of engineering and technology, particularly in the field of automation and control systems. These concepts are essential for ensuring the stability, performance, and reliability of complex systems in the presence of uncertainties and disturbances. Robust control and estimation techniques play a crucial role in various applications, including aerospace, automotive, robotics, manufacturing, and many others.

One of the primary challenges in control and estimation is dealing with uncertainties in the system. These uncertainties can arise from various sources, such as modeling errors, external disturbances, sensor noise, and environmental changes. Robust control and estimation techniques are designed to address these uncertainties and ensure that the system behaves as intended under all operating conditions. This is particularly important in safety-critical applications, where the consequences of system failure can be severe.

From a control perspective, robust control techniques aim to design controllers that can effectively handle uncertainties and variations in the system. This typically involves formulating control laws that are robust to uncertainties, such as H-infinity control, mu-synthesis, and robust model predictive control. These techniques are based on the idea of worst-case analysis, where the controller is designed to perform well under the most adverse conditions. This ensures that the system remains stable and meets performance requirements, even in the presence of uncertainties.

On the other hand, robust estimation techniques are concerned with accurately estimating the state of the system in the presence of uncertainties and disturbances. This is essential for feedback control, where the state estimates are used to compute the control actions. Robust estimation methods, such as Kalman filtering, robust observers, and adaptive estimation, aim to provide accurate and reliable state estimates, even in the presence of noisy measurements and modeling errors. This is crucial for ensuring the effectiveness of the control system and maintaining the desired performance.

In addition to addressing uncertainties, robust control and estimation techniques also play a crucial role in ensuring the stability and performance of networked control systems. With the increasing integration of communication networks in control systems, there is a need to develop techniques that can effectively deal with network-induced delays, packet losses, and communication constraints. Robust control and estimation methods for networked control systems are designed to mitigate the effects of these issues and ensure the overall system stability and performance.

Furthermore, the development of autonomous systems and artificial intelligence has brought new challenges to robust control and estimation. Autonomous systems, such as self-driving cars and unmanned aerial vehicles, require robust control and estimation techniques to ensure safe and reliable operation in dynamic and uncertain environments. Similarly, the integration of artificial intelligence in control systems introduces new uncertainties and complexities that need to be addressed through robust techniques.

In conclusion, robust control and estimation are essential for ensuring the stability, performance, and reliability of complex systems in the presence of uncertainties and disturbances. These techniques play a crucial role in various applications, including aerospace, automotive, robotics, manufacturing, networked control systems, and autonomous systems. As technology continues to advance, the development of new robust control and estimation techniques will be essential for addressing the emerging challenges and ensuring the effectiveness of future control systems.
FragTrack paper (original English version)
Robust Fragments-based Tracking using the Integral HistogramAmit Adam and Ehud RivlinDept.of Computer Science Technion-Israel Institute of TechnologyHaifa32000,Israel{amita,ehudr}@cs.technion.ac.ilIlan ShimshoniDept.of Management Information SystemsHaifa UniversityHaifa31905,Israel{ishimshoni}@mis.haifa.ac.ilAbstractWe present a novel algorithm(which we call“Frag-Track”)for tracking an object in a video sequence.The template object is represented by multiple image fragments or patches.The patches are arbitrary and are not based on an object model(in contrast with traditional use of model-based parts e.g.limbs and torso in human tracking).Every patch votes on the possible positions and scales of the ob-ject in the current frame,by comparing its histogram with the corresponding image patch histogram.We then mini-mize a robust statistic in order to combine the vote maps of the multiple patches.A key tool enabling the application of our algorithm to tracking is the integral histogram data structure[18].Its use allows to extract histograms of multiple rectangular re-gions in the image in a very efficient manner.Our algorithm overcomes several difficulties which can-not be handled by traditional histogram-based algorithms [8,6].First,by robustly combining multiple patch votes,we are able to handle partial occlusions or pose change.Sec-ond,the geometric relations between the template patches allow us to take into account the spatial distribution of the pixel intensities-information which is lost in traditional histogram-based algorithms.Third,as noted by[18],track-ing large targets has the same computational cost as track-ing small targets.We present extensive experimental results on challenging sequences,which demonstrate the robust tracking achieved by our algorithm(even with the use of only gray-scale(non-color)information).1.IntroductionTracking is an important subject in computer vision with a wide range of applications-some of which are surveil-lance,activity analysis,classification and recognition from motion and human-computer interfaces.The three main categories into which most algorithms fall are feature-based tracking(e.g.[3]),contour-based tracking(e.g.[15])and region-based tracking(e.g[13]).In the region-based cate-gory,modeling of the region’s content by a histogram or by other non-parametric descriptions(e.g.kernel-density esti-mate)have become very popular in recent years.In particu-lar,one of the most influential approaches is the mean-shift approach[8,6].With the experience gained by using histograms and the mean shift approach,some difficulties have been studied in recent years.One issue is the local basin of convergence that the mean shift algorithm has.Recently in[22]the au-thors describe a method for converging to the optimum from far-away starting points.A second issue,inherent in the use of histograms,is the loss of spatial information.This issue has been addressed by several works.In[26]the authors introduce a new sim-ilarity measure between the template and image regions, which replaces the original Bhattacharyya metric.This measure takes into account both the intensities and their position in the window.The measure is further computed efficiently by using the Fast Gauss Transform.In[12],the spatial information is taken into account by using“oriented kernels”-this approach is additionally shown to be useful for wide baseline matching.Recently,[4]has addressed this issue by adding the spatial mean and covariance of the pixel positions who contribute to a given bin in the 
histogram-naming this approach as“spatiograms”.A third issue which is not specifically addressed by these previous approaches is occlusions.The template model is global in nature and hence cannot handle well partial occlu-sions.In this work we address the latter two issues(spatial in-formation and occlusion)by using parts or fragments to rep-resent the template.Thefirst issue is addressed by efficient exhaustive search which will be discussed later on.Given a template to be tracked,we represent it by multiple his-tograms of multiple rectangular sub regions(patches)of the template.By measuring histogram similarity with patchesof the target frame,we obtain a vote-map describing the possible positions of each patch in the target frame.We then combine the vote-maps in a robust manner.Spatial in-formation is not lost due to the use of spatial relationships between patches.Occlusions result in some of the patches contributing outlier vote-maps.Due to our robust method for combining the vote maps,the combined estimate of the target’s position is still accurate.The use of parts or components is a well known tech-nique in the object recognition literature(see chapter23in [11]).Examples of works which use the spatial relation-ships between detections of object parts are[21,17,16,2]. In[24]the issue of choosing informative parts which con-tain the most information concerning the presence of an ob-ject class is discussed.A novel application of detecting un-usual events and salient features based on video and image patches has recently been described in[5].In tracking,the use of parts has usually been in the con-text of human body tracking where the parts are based on a model of the human body-see[23]for example.Re-cently,Hager,Dewan and Stewart[14](followed by Fan et al.[10])analyzed the use of multiple kernels for tracking. 
In these works the connection between the intensity struc-ture of the target,the possible transformations it can expe-rience between consecutive frames,and the kernel structure used for kernel tracking was analyzed.This analysis gives insight on the limitations of single-kernel tracking,and on the advantages of multiple-kernel tracking.The parts-based tracking algorithm described in this work differs from these and other previous works in a number of important issues:•Our algorithm is robust to partial occlusions-the works in[14,10]cannot handle occlusions due to the non-robust nature of the objective function.•Our algorithm allows the use of any metric for com-paring two histograms,and not just analytically-tractable ones such as the Bhattacharyya or the equiv-alent Matusita metrics.Specifically,by using non-componentwise metrics the effects of bin-quantization are reduced(see section2.1and Fig.3).•The spatial constraints are handled automatically in our algorithm by the voting mechanism.In contrast, in[10]these constraints have to be coded in(e.g.the fixed length constraint).•The robust nature of our algorithm and the efficient use of the integral histogram allows one to use the algo-rithm without giving too much thought on the choice of multiple patches/kernels.In contrast,in[14,10]the authors carefully chose a small number of multiple ker-nels for each specific sequence.•We present extensive experimental validation,on out-of-the-lab real sequences.We demonstrate good track-ing performance on these challenging scenarios,ob-tained with the use of only gray-scale information.Our algorithm requires the extraction of intensity or color histograms over a large number of sub-windows in the target image and in the object template.Recently Pork-ili[18]extended the integral image[25]data structure to an“integral histogram”data structure.Our algorithm ex-ploits this observation-a necessary step in order to be able to apply the algorithm for real time tracking tasks.We ex-tend the tracking application described in[18]by our use of parts,which is crucial in order to achieve robustness to occlusions.2.Patch TrackingGiven an object O and the current frame I,we wish to locate O in the ually O is represented by a tem-plate image T,and we wish tofind the position and the scale of a region in I which is closest to the template T in some sense.Since we are dealing with tracking,we assume that we have a previous estimate of the position and scale,and we will search in the neighborhood of this estimate.For clarity,we will consider in the following only the search in position(x,y).Let(x0,y0)be the object position estimate from the pre-vious frame,and let r be our search radius.Let P T= (dx,dy,h,w)be a rectangular patch in the template,whose center is displaced(dx,dy)from the template center,and whose half width and height are w and h respectively.Let (x,y)be a hypothesis on the object’s position in the cur-rent frame.Then the patch P T defines a corresponding rectangular patch in the image P I;(x,y)whose center is at (x+dx,y+dy)and whose half width and height are w and h.Figure1describes this correspondence.Given the patch P T and the corresponding image patch P I;(x,y),the similarity between the patches is an indication of the validity of the hypothesis that the object is indeed located at(x,y).If d(Q,P)is some measure of similaritybetween patch Q and patch P,then we defineV PT(x,y)=d(P I;(x,y),P T)(1) When(x,y)runs on the range of hypotheses,we getV PT (·,·)which is the vote map corresponding to the 
tem-plate patch P T.2.1.Patch Similarity MeasuresWe measure similarity between patches by comparing their gray-level or color histograms.This allows moreflexi-bility than the standard normalized correlation or SSD mea-sures.Although for a single patch we lose spatial informa-tion by considering only the histogram,our use of multiple patches and their spatial arrangement in the template com-pensates for this loss.There are a number of known methods for comparing the similarity of two histograms[9].The simplest methods compare the histograms by comparing corresponding bins. For example,one may use the chi-square statistic or sim-ply the norm of the difference between the two histograms when considered as two vectors.The Kolmogorov-Smirnov statistic compares histograms by building the cumulative distribution function(that is cu-mulative sum)of each histogram,and comparing these two functions.The advantage over bin-wise methods is smooth-ing of nearby bin differences due to the quantization of mea-surements into bins.A more appealing approach is the Earth Mover’s Dis-tance(EMD)between two histograms,described in[20]. In this approach the actual dissimilarity between the bins themselves is also taken into account.The idea is to com-pute how much probability has to move between the various bins in order to transform thefirst histogram into the second. In doing so,bin dissimilarity is used:for example,in gray scale it costs more to move0.1probability from the[16,32) bin to the[128,144)bin,than to move it to the[32,47) bin.In thefirst case,the movement of probability is re-quired because of a true difference in the distributions,and in the second case it might be due simply to quantization errors.This is exactly the transportation problem of linear programming.In this problem the bases are always triangu-lar and therefore the problem may be solved efficiently.See [20]for more details and advantages of this approach.We have experimented with two similarity measures. Thefirst is the naive measure which treats the histograms as vectors and just computes the norm of their difference. The second is the EMD measure.For gray scale images, we used16bins.The EMD calculation is very fast and poses no problem.For color images,the number of bins is much larger(with only8bins per channel we get512bins). 
Therefore when using the EMD we took the K=10bins which obtained maximal counts,normalized them tounityPatch vote map − naive (dis)similarity Patch vote map − EMD (dis)similarity(a)(b)Figure3.V ote maps for the example patch using the the naive mea-sure and the EMD measure.The lower(darker)the vote-the more likely the position.Left(a)-naive measure.Right(b)-EMD.The EMD surface has a less blurred minimum,and is smoother at the same time.and then used the EMD.We used the original EMD code developed by Rubner[19].Figure2shows an example patch(we use gray scale in this example).We computed the patch vote map for all the locations around the patch center which are up to30pixels above or below and up to20pixels to the left or right.Fig-ure3shows the resulting vote maps when using the naive measure and the EMD measure.Note that in both measures the lower the value(darker in the image),the more simi-lar the histograms.The EMD surface is smoother and has a more distinct minimum than the surface obtained when using the naive measure.bining Vote MapsIn the last section we saw how to obtain a vote map V PT(·,·)for every template patch P T.The vote map gives a scalar score for every possible position(x,y)of the tar-get in the current frame I,given the information from patch P T.We now want to combine the vote maps obtained from all template patches.Basically we could sum the vote maps and look for the position which obtained the minimal sum(recall that our vote maps actually measure dissimilarity between patches). The drawback of this approach is that an occlusion affecting even a single patch may contribute a high value to the sum at the correct position,resulting in a wrong estimate.In other words,we would like to use a robust estimator which couldhandle outliers resulting from occluded patches or other rea-sons(e.g.partial pose change-for example a person turns his head).One way to make the sum robust to outliers is to bound the possible contribution of an outlierC(x,y)=PV P(x,y)V P(x,y)<TT V P(x,y)>=T(2)by some threshold T.If we adopt a probabilistic view ofthe measurement process-by transforming the vote mapto a likelihood map(e.g.by setting L P(x,y)=K∗exp−α∗V P(x,y))-then this method is equivalent to addinga uniform outlier density to the true(inlier)density.Min-imizing the value of C(·,·)is then equivalent to obtaininga maximum likelihood estimate of the position,but without letting an outlier take the likelihood down to0.However,we found that choosing the threshold T isnot very intuitive,and that the results are sensitive to this choice.A different approach is to use a LMedS-type es-timator.At each point(x,y)we order the obtained val-ues{V P(x,y)|patches P}and we choose the Q’th smallest score:C(x,y)=Q th value in the sorted set{V P(x,y)|patches P}(3) The parameter Q is much more intuitive:it should be the maximal number of patches that we always expect to yield inlier measurements.For example,if we think that we are guaranteed that occlusions will always leave at least a quar-ter of the target visible,than we will choose Q to be25%of the number of patches(to be precise-we assume that at least a quarter of the patches will be visible).The additional computational burden when using esti-mate(3)instead of(2)is not significant(the number of patches is less than40).ing the Integral HistogramThe algorithm that we have described requires multiple extractions of histograms from multiple rectangular regions.We extract histograms for each template patch,and then we compare these histograms with those 
extracted from mul-tiple regions in the target image.The tool enabling this tobe done in real time,as required by tracking,is the integral histogram described in[18].The method is an extension of the integral image data structure described in[25].The integral image holds at the point(x,y)in the image the sum of all the pixels containedin the rectangular region defined by the top-left corner ofthe image and the point(x,y).This image allows to com-pute the sum of the pixels on arbitrary rectangular regionsby considering the4integral image values at the cornersFigure4.outer partof the regionof the region-in other words in(very short)constant timeindependent of the size of the region.In order to extract histograms over arbitrary rectangu-lar regions,in the integral histogram we build for each binof the histogram an integral image counting the cumula-tive number of pixels falling into that bin.Then by access-ing these integral images we can immediately compute thenumber of pixels in a given region which fall into every bin,and hence we obtain the histogram of that rectangular re-gion.Once the integral histogram data structure is computed(with cost proportional to the image(or actually search re-gion)size times the number of bins),extraction of a his-togram over a region is very cheap.Therefore evaluatinga hypothesis on the current object’s position(and scale)isrelatively cheap-basically it is the cost of comparing twohistograms.As noted previously,a tracking application of the integralhistogram was suggested in[18].We extend that examplewith the parts-based approach.4.1.Weighting Pixel ContributionsAn important feature in the traditional mean shift algo-rithm is the use of a kernel function which assigns lowerweights to pixels which are further away from the target’scenter.These pixels are more likely to contain backgroundinformation or occluding objects,and hence their contribu-tion to the histogram is diminished.However,when usingthe integral histogram,it is not clear how one may includethis feature.The following discrete approximation scheme may beused instead of the more continuous kernel weighting(seeFigure4).If we want to extract a weighted histogram inthe rectangular region R,we may define an inner rectangleR1and subtract the integral histogram counts of R1from those of R to obtain the counts in the ring R−R1.Thesecounts and the R1counts may be weighted differently andcombined to give a weighted histogram on R.Of course,anadditional inner rectangle R2may be used and so forth.The additional cost is the access and arithmetic involvedwith4additional pixels for every added inner rectangle.Formedium and large targets this cost is negligible when com-pared to trying to weigh the pixels in a straightforward man-ner.4.2.ScaleAs noted in[25,18],an advantage of the integral im-age/histogram is that the computational cost for large re-gions is not higher than the cost for small regions.This makes our search for the proper scale of the target not harder than our search for the proper location.Just as a hypothesis on the position(x,y)of the object defines a correspondence between a template patch P T and an image patch P I;(x,y),if we add a scale s to the hypoth-esis,it is straightforward tofind the corresponding image patch P I;(x,y,s):we just scale the displacement vector of the patch and its height and width by s.The cost of extract-ing the histogram for this larger(or smaller)image patch is the same as for the same-size patch.We have implemented the standard approach(suggested in[8]and adopted by 
e.g.[4])of enlarging and shrinking the template by10%,and choosing the position and scale which give the lowest score in(3).The next section will present some results obtained with this approach.We remark that as noted in[7],this method has some limitations.For example,if the object being tracked is uniform in color,then there is a tendency for the target to shrink.In the case of partial occlusions of the target,we are faced with an additional dilemma:suppose that a uniform colored target is partially occluded.We get a good score by shrinking the target and locating it around the non-occluded part.Due to our robust approach,we also get a reasonable score by keeping the target at the correct size and locating it at the correct position,which includes some occluded parts of the target.However,there is no guarantee that the correct explanation will yield a better score than the partial expla-nation.A full treatment of this problem is out of the scope of the current work.5.ResultsNote:The full video clips are available at the authors’websites.We now present our experimental results.The tracker was run on gray scale images and the histograms we used contained16bins.Note that the integral histogram data structure requires an image for every bin in the histogram, and therefore on color images the application can become quite memory-consuming.We used vertical and horizontal patches as shown in Fig-ure5.The vertical patches are of half the template height, and about one tenth of the template’s width.The horizon-tal patches are defined in a similar manner.Over all we had around36patches(the number slightly varies with template size because of rounding to integer sizes).We note thatthis choice of patches was arbitrary-we just tried it and found it was good enough.In the discussion we return to this issue.The search radius was set to7pixels from the previous target position.The template wasfixed at thefirst frame and not updated during the sequence(more on this in the discussion).We used the25’th percent quantile for the value of Q in(3).These settings of the algorithm’s parameters werefixed for all the sequences.Thefirst two sequences(“face”and“woman”)show the robustness to occlusions.For these sequences we manually marked the ground truth(everyfifth frame),and plotted the position error of our tracker and of the mean-shift tracker. 
In both cases our tracker was not affected by the occlusions, while the mean-shift tracker did drift away.Figures6and 7show the errors with respect to the ground truth.Figure8 shows the initial templates and a few frames from these se-quences.Note the last frame of the woman sequence(sec-ond row)where one can see an example of the use of spatial information(seefigure caption also).We additionally note that we ran our tracker on these ex-amples with only a single patch containing the whole tem-plate,and it failed(this is actually the example tracker de-scribed in[18]).The next sequence-“living room”in Figure9-shows performance under partial pose change.When the tracked woman turns her head the mean shift tracker drifts,and then together with an occlusion it gets lost.Our tracker is robust to these interferences.In Figure10we present more samples from three more sequences.In these frames we marked only our tracker.The first two sequences are from the CA VIAR database[1].The first is an occlusion clip and the second shows target scale changes.The third sequence is again an occlusion clip.We bring it to demonstrate how our tracker uses spatial informa-tion(which is generally lost in histogram-based methods). Both persons have globally similar histograms(half dark and half bright).Our tracker“knows”that the bright pixels should be in the upper part of the target and therefore does not drift to the left person when the two persons are close.6.Discussion and ConclusionsIn this work we present a novel approach(“FragTrack”) to tracking.Our approach combines fragments-based repre-initial template frame 222frame 539frame 849initial template frame 66frame 134frame 456Figure 8.Occlusions -frames from “face”and “woman”sequences.Our tracker -solid red.Mean-shift tracker -dashed blue.Note in frame 456how the spatial information -bright in the upper part,dark in the lower part -helps our tracker.The mean-shift tracker which does not have this information chooses a region witha dark upper part and a bright lower part.initial template frame 29frame 141frame 209Figure 9.Pose change and occlusions -frames from “living room”sequence.Our tracker -solid red.Mean-shift tracker -dashed blue.40506070Position error w.r.t. ground truthn p i x e l s )our trackermean shift trackerground truth.Our tracker -solid red.Mean shift -dashed blue.Please see videos for additional impression sentation and voting known from the recognition literature,with the integral histogram tool.The result is a real time tracking algorithm which is robust to partial occlusions and 2530354045Position error w.r.t. 
ground truthn p i x e l s )our trackermean shift trackermarked ground truth.Our tracker -solid red.Mean shift -dashed blue.Please see videos for additional impressionpose changes.In contrast with other tracking works,our parts or frag-ments approach is model-free:the fragments are choseninitial template frame48frame82frame110initial template frame30frame100frame180initial template frame35frame65frame90Figure10.Additional examples.Thefirst two rows are from the CA VIAR database.No background subtraction/frame differencing was used.In the last row note again the use of spatial information-both persons have the same global histogram.arbitrarily and not by reference to a pre-determined parts-based description of the target(say limbs and torso in hu-man tracking,or eyes and nose in face tracking).Without the integral histogram’s efficient data structure it would not have been possible to compute each fragment’s votes map.On the other hand,without using a fragments-based algorithm,robustness to partial occlusions or pose changes would not have been possible.We demonstrate the validity of our approach by accu-rate tracking of targets under partial occlusions and pose changes in several video clips.The tracking is achieved without any use of color information.There are several interesting issues for current and fu-ture work.Thefirst is the question of template updating. We want to avoid introduction of occluding objects into the template.The use of the various fragments’similarity scores may be useful towards meeting this goal.A second issue is the partial versus full explanation dilemma described earlier and in[7]when choosing scale. This dilemma is even more significant under partial occlu-sions.Lastly,we may also consider disconnected rectangular fragments.It would be interesting tofind a way to choose the most informative fragments[24]with respect to the tracking task.References[1]Caviar datasets available at/vision/caviar/caviardata1/.[2]S.Agarwal,A.Awan,and D.Roth.Learning to detect ob-jects in images via a sparse,part-based representation.IEEE Transactions on Pattern Analysis and Machine Intelligence, 20(11):1475–1490,2004.[3] D.Beymer,P.McLauchlan,B.Coifman,and J.Malik.Areal-time computer vision system for measuring traffic pa-rameters.In Proc.IEEE Conf.on Computer Vision and Pat-tern Recognition(CVPR),1997.[4]S.Birchfield and S.Rangarajan.Spatiograms vs.histogramsfor region based tracking.In Proc.IEEE Conf.on Computer Vision and Pattern Recognition(CVPR),2005.[5]O.Boiman and M.Irani.Detecting irregularities in imagesand video.In Proc.IEEE Int.Conf.on Computer Vision (ICCV),2005.[6]G.Bradski.Real time face and object tracking as a compo-nent of a perceptual user interface.In Proc.IEEE WACV, pages214–219,1998.[7]R.Collins.Mean shift blob tracking through scale space.InProc.IEEE Conf.on Computer Vision and Pattern Recogni-tion(CVPR),pages II:234–240,2003.[8] aniciu,R.Visvanathan,and P.Meer.Kernel basedobject tracking.IEEE Transactions on Pattern Analysis and Machine Intelligence,25(5):564–575,2003.[9]W.Conover.Practical Nonparamteric Statistics.Wiley,1998.[10]Z.Fan,Y.Wu,and M.Yang.Multiple collaborative kerneltracking.In Proc.IEEE Conf.on Computer Vision and Pat-tern Recognition(CVPR),2005.[11] D.Forsyth and puter Vision:A Modern Ap-proach.Prentice-Hall,2001.[12] B.Georgescu and P.Meer.Point matching under large imagedeformations and illumination changes.IEEE Transactions on Pattern Analysis and Machine Intelligence,26:674–689, 2004.[13]G.Hager and P.Belhumeur.Efficient region tracking 
withparamteric models of geometry and illumination.IEEE Transactions on Pattern Analysis and Machine Intelligence, 20(10):1125–1139,1998.[14]G.Hager,M.Dewan,and C.Stewart.Multiple kernel track-ing with ssd.In Proc.IEEE Conf.on Computer Vision and Pattern Recognition(CVPR),2004.[15]M.Isard and A.Blake.Condensation:Conditional densitypropagation for visual tracking.Int.Journal of Computer Vision(IJCV),29(1):5–28,1998.[16]K.Mikolajczyk,C.Schmid,and A.Zisserman.Humandetection based on a probabilistic assembly of robust part detectors.In Proc.Eurpoean Conf.on Computer Vision (ECCV),2004.[17] A.Mohan, C.Papageorgiou,and T.Poggio.Example-based object detection in images by components.IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(4):349–361,2001.[18] F.Porkili.Integral histogram:A fast way to extract his-tograms in cartesian spaces.In Proc.IEEE Conf.on Com-puter Vision and Pattern Recognition(CVPR),2005.[19]Y.Rubner.Code available at /∼rubner/.[20]Y.Rubner,C.Tomasi,and L.Guibas.The earth mover’sdistance as a metric for image retrieval.Int.Journal of Com-puter Vision(IJCV),40(2):91–121,2000.[21] C.Schmid and R.Mohr.Local gray-value invariants for im-age retrieval.IEEE Transactions on Pattern Analysis and Machine Intelligence,1997.[22] C.Shen,M.Brooks,and A.Hengel.Fast global kernel den-sity mode seeking with application to localisation and track-ing.In Proc.IEEE Int.Conf.on Computer Vision(ICCV), 2005.[23]L.Sigal et al.Tracking loose-limbed people.In Proc.IEEEConf.on Computer Vision and Pattern Recognition(CVPR), 2004.[24]S.Ullman,E.Sali,and M.Vidal-Naquet.A fragment-basedapproach to object representation and classification.In Proc.IWVF4,LNCS2059,pages85–100,2001.[25]P.Viola and M.Jones.Robust real time object detection.In IEEE ICCV Workshop on Statistical and Computational Theories of Vision,2001.[26] C.Yang,R.Duraiswami,and L.Davis.Efficient mean-shifttracking via a new similarity measure.In Proc.IEEE Conf.on Computer Vision and Pattern Recognition(CVPR),2005.。
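To make the voting scheme of the paper above concrete, the following is a minimal sketch of per-patch vote maps combined by the Q-th smallest score (Eq. (3)). It uses a naive L1 histogram distance, brute-force histogram extraction instead of the integral histogram, and an arbitrary 2x2 patch grid, so it only illustrates the idea; the search window plus patches is assumed to stay inside the frame.

```python
import numpy as np

def hist16(patch):
    """16-bin grayscale histogram, normalized to unit sum."""
    h, _ = np.histogram(patch, bins=16, range=(0, 256))
    return h / max(h.sum(), 1)

def vote_maps(frame, template, center, radius=7):
    """One vote map per template patch; lower value = more likely position."""
    th, tw = template.shape
    # 2x2 grid of template patches: (dy, dx, half-height, half-width) w.r.t. template center
    patches = [(dy, dx, th // 4, tw // 4)
               for dy in (-th // 4, th // 4) for dx in (-tw // 4, tw // 4)]
    cy, cx = center
    ys = range(cy - radius, cy + radius + 1)   # assumes the window stays inside the frame
    xs = range(cx - radius, cx + radius + 1)
    maps = []
    for (dy, dx, hh, hw) in patches:
        t_hist = hist16(template[th//2+dy-hh:th//2+dy+hh, tw//2+dx-hw:tw//2+dx+hw])
        vm = np.zeros((len(ys), len(xs)))
        for i, y in enumerate(ys):
            for j, x in enumerate(xs):
                p = frame[y+dy-hh:y+dy+hh, x+dx-hw:x+dx+hw]
                vm[i, j] = np.abs(hist16(p) - t_hist).sum()   # naive (dis)similarity
        maps.append(vm)
    return np.stack(maps), ys, xs

def fragtrack_step(frame, template, center, q=0.25):
    """Combine patch votes robustly via the Q-th smallest score, as in Eq. (3)."""
    maps, ys, xs = vote_maps(frame, template, center)
    k = max(int(q * maps.shape[0]) - 1, 0)
    combined = np.sort(maps, axis=0)[k]        # per-position Q-th smallest vote
    i, j = np.unravel_index(np.argmin(combined), combined.shape)
    return list(ys)[i], list(xs)[j]
```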
Chapter 5: Robust Estimation
Z 0.463 0.435 0.457 0.462
Sum 0.715 0.679 0.682 0.670
X 5.334 5.335 5.248 0.720
Y 4.661 4.661 4.629 1.077
Z 5.367 5.367 5.193 0.420
Sum 5.131 5.131 5.031 0.787
3. Application Example: Datum Transformation
Background
➢ Datum transformation parameters are generally solved from the coordinates of common points;
➢ The common-point coordinates inevitably contain errors, and sometimes even gross errors;
➢ China's classical geodetic datum was propagated by triangulation and traverse surveying, so the accumulated errors are considerable, reaching several metres or even tens of metres in remote areas;
➢ Solving for the transformation parameters from such common-point coordinates necessarily distorts the true relationship between the coordinate datums;
➢ and this in turn distorts the transformed geodetic network.
➢ Then an efficient weight function is used in the iterative computation to improve the efficiency of the transformation-parameter estimates.
Initial parameter values

$\hat{t}_x^0 = \mathrm{med}(W_{x_i}), \qquad \hat{t}_y^0 = \mathrm{med}(W_{y_i}), \qquad \hat{t}_z^0 = \mathrm{med}(W_{z_i})$

$\hat{\sigma}_{wx}^0 = \mathrm{med}\big(|e^0_{x_i}|\big)/0.6745, \qquad e^0_{x_i} = W_{x_i} - \hat{t}_x^0$

$\sigma_X = \Big[\tfrac{1}{5}\sum_{i=9}^{13} \Delta X_i^2\Big]^{1/2}, \qquad \sigma_{\mathrm{sum}} = \Big[\tfrac{1}{15}\sum_{i=9}^{13} \big(\Delta X_i^2 + \Delta Y_i^2 + \Delta Z_i^2\big)\Big]^{1/2}$
[Figure: bar chart of the RMS errors for the case without additional gross errors; y-axis from 0 to 1, groups 1 to 7 on the x-axis; legend includes LS and LS+Test.]
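The median-based initial values and the MAD scale estimate reconstructed above translate into a few lines of code. The sketch below is a hypothetical illustration for a translation-only datum shift; it uses a Huber-type weight function for the reweighting iteration, since the particular weight function used in the slides is not specified here.

```python
import numpy as np

def robust_translation(W, c=1.5, iters=10):
    """W: coordinate differences at the common points (one component, e.g. X)."""
    t = np.median(W)                            # initial value  t^0 = med(W_i)
    sigma = np.median(np.abs(W - t)) / 0.6745   # robust scale from the MAD
    for _ in range(iters):
        r = (W - t) / max(sigma, 1e-12)                          # standardized residuals
        w = np.minimum(1.0, c / np.maximum(np.abs(r), 1e-12))    # Huber-type weights
        t = np.sum(w * W) / np.sum(w)                            # reweighted estimate
    return t, sigma

# Example: 8 common points, one of them carrying a gross error of about 5 m
W = np.array([0.12, 0.10, 0.14, 0.09, 0.11, 0.13, 0.10, 5.10])
t_hat, sigma0 = robust_translation(W)
print(t_hat, sigma0)   # t_hat stays near 0.11 despite the outlier
```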
Motion Estimation and Motion Compensation
Motion compensation predicts and compensates the current local image from a previous local image; it is an effective way of reducing the redundant information in a frame sequence.
Motion estimation is the set of techniques for extracting motion information from a video sequence.
Motion estimation and compensation techniques: MPEG-4 uses three frame types, I-VOP, P-VOP and B-VOP, to represent the different types of motion compensation.
It adopts the half-pixel searching and overlapped motion compensation techniques of H.263, and introduces repetitive padding and modified block (polygon) matching to support arbitrarily shaped VOP regions.
In addition, to improve the accuracy of motion estimation, MPEG-4 adopts MVFAST (Motion Vector Field Adaptive Search Technique) and the improved PMVFAST (Predictive MVFAST).
For global motion estimation, the feature-based FFRGMET (Feature-based Fast and Robust Global Motion Estimation Technique) is used.
Video codecs use it to reduce redundancy in video sequences.
It can also be used for deinterlacing.
Definition: Motion compensation predicts and compensates the current local image from a previous local image, and is an effective method of reducing the redundant information in a frame sequence.
Classification: it includes two categories, global motion compensation and block motion compensation.
Motion compensation is a way of describing the difference between adjacent frames ("adjacent" meaning adjacent in coding order, not necessarily in playback order): specifically, it describes how each small block of the previous frame (again, previous in coding order, not necessarily earlier in playback order) moves to a certain position in the current frame.
This method is often used by video compression / video codecs to reduce redundancy in video sequences.
It can also be used for deinterlacing.
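As a concrete illustration of the block-based motion estimation and compensation described above, here is a minimal full-search block-matching sketch with a SAD criterion. The MPEG-4 fast searches (MVFAST/PMVFAST), half-pixel refinement and arbitrary-shape support are deliberately omitted, and the block size and search range are arbitrary assumptions.

```python
import numpy as np

def block_match(ref, cur, block=16, search=8):
    """Return one motion vector (dy, dx) per block of the current frame."""
    H, W = cur.shape
    mvs = np.zeros((H // block, W // block, 2), dtype=int)
    for by in range(0, H - block + 1, block):
        for bx in range(0, W - block + 1, block):
            target = cur[by:by+block, bx:bx+block].astype(int)
            best, best_mv = None, (0, 0)
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    y, x = by + dy, bx + dx
                    if y < 0 or x < 0 or y + block > H or x + block > W:
                        continue
                    cand = ref[y:y+block, x:x+block].astype(int)
                    sad = np.abs(cand - target).sum()    # sum of absolute differences
                    if best is None or sad < best:
                        best, best_mv = sad, (dy, dx)
            mvs[by // block, bx // block] = best_mv
    return mvs

def compensate(ref, mvs, block=16):
    """Build the motion-compensated prediction of the current frame from the reference."""
    pred = np.zeros_like(ref)
    for i in range(mvs.shape[0]):
        for j in range(mvs.shape[1]):
            dy, dx = mvs[i, j]
            by, bx = i * block, j * block
            pred[by:by+block, bx:bx+block] = ref[by+dy:by+dy+block, bx+dx:bx+dx+block]
    return pred
```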
Manipulator Path Planning Based on an Improved RRT*-connect Algorithm
With the rapid development of the times, highly autonomous robots play an increasingly important role in human society.
As the most important operating component of a robot, the manipulator's motion planning problems, such as accurately grasping objects and avoiding obstacles during motion, are a current research hotspot, and continued in-depth study of its motion planning is very necessary.
Motion planning for manipulators is mainly carried out in high-dimensional spaces.
The RRT (Rapidly-exploring Random Tree) algorithm [1] is a sampling-based planning method that requires neither an exact description of the obstacles in the configuration space nor any preprocessing, so it is widely used in high-dimensional spaces.
In recent years there has been much research on the RRT algorithm. In 2000, Kuffner et al. proposed the RRT-connect algorithm [2], which grows two random trees simultaneously from the start and the goal, speeding up convergence, although the step length of the searched path can be long.
In 2002, Bruce et al. proposed the ERRT (Extend RRT) algorithm [3].
In 2006, Ferguson et al. proposed the DRRT (Dynamic RRT) algorithm [4].
In 2011, Karaman and Frazzoli proposed the improved RRT* algorithm [5], which inherits the probabilistic completeness of the traditional RRT while adding asymptotic optimality, guaranteeing a better path at the cost of longer search time.
In 2012, Islam et al. proposed the fast-converging RRT*-smart algorithm [6], which uses intelligent sampling and path optimization to approach the optimal solution; however, the path has few sampling points, producing sharp corners that are unfavourable in practice.
In 2013, Jordan et al. made RRT* bidirectional and proposed the B-RRT* algorithm [7], which speeds up the search.
In the same year, Salzman et al. proposed an asymptotically optimal algorithm based on continuous interpolation in the lower-bound tree LBT-RRT [8].
In 2015, Qureshi et al. proposed the IB-RRT* algorithm [9], which inserts an intelligent function into B-RRT* to increase the search speed.
In the same year, Klemm et al. combined the asymptotic optimality of RRT* with the bidirectional search of RRT-connect.

Manipulator Path Planning Based on an Improved RRT*-connect Algorithm
LIU Jianyu, FAN Pingqing (School of Mechanical and Automotive Engineering, Shanghai University of Engineering Science, Shanghai 201620, China)
Abstract: Based on the bidirectional, asymptotically optimal RRT*-connect algorithm, motion planning for a high-dimensional manipulator is analyzed, so that the search path found during planning is shorter and the planning more efficient.
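For reference, a compact 2-D sketch of the basic RRT-connect idea cited above (two trees grown toward each other and swapped each iteration) is given below. It is not the paper's improved RRT*-connect: there is no rewiring or asymptotic optimality, the collision check is a stand-in, and the step size and workspace bounds are arbitrary assumptions.

```python
import random, math

STEP = 0.5

def collision_free(p, q):
    return True   # stand-in: insert a real obstacle check here

def steer(p, q):
    d = math.dist(p, q)
    if d <= STEP:
        return q
    t = STEP / d
    return (p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1]))

def extend(tree, target):
    near = min(tree, key=lambda n: math.dist(n, target))
    new = steer(near, target)
    if collision_free(near, new):
        tree[new] = near          # store the parent pointer
        return new
    return None

def rrt_connect(start, goal, iters=5000, bounds=(0.0, 10.0)):
    ta, tb = {start: None}, {goal: None}
    for _ in range(iters):
        rnd = (random.uniform(*bounds), random.uniform(*bounds))
        qa = extend(ta, rnd)
        if qa is not None:
            qb = extend(tb, qa)
            while qb is not None and qb != qa:    # "connect" heuristic: keep extending
                qb = extend(tb, qa)
            if qb == qa:                          # the two trees met: a path exists
                return True, ta, tb
        ta, tb = tb, ta                           # swap the roles of the trees
    return False, ta, tb

ok, _, _ = rrt_connect((1.0, 1.0), (9.0, 9.0))
print("path found:", ok)
```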
Vision-Based Ground Target Tracking for a Rotor UAV (original English version)
The SIFT algorithm is used to recognize the ground target in this paper. The SIFT algorithm, first proposed by David G. Lowe in 1999 [5] and improved in 2004 [6], is currently an active area of feature matching; its effectiveness is invariant to image rotation, scale zoom and brightness transformations, and it also maintains a certain degree of stability under perspective transformation and affine transformation. SIFT feature points are scale-invariant local points of an image, with the characteristics of good uniqueness, rich information, large quantity, high speed, scalability, and so on.
A. SIFT Algorithm
The SIFT algorithm consists of four parts. The process of SIFT feature construction is shown in Fig. 1.
I. INTRODUCTION UAV is one of the best platforms to perform dull, dirty or dangerous (3D) tasks [1]. UAV can be used in various applications where human is impossible to intervene. It greatly expands the application space of visual tracking. Research on the technology of vision based ground target tracking for UAV has been a great concern among cybernetic experts and robotic experts, and has become one of the most active research directions in UAV applications. Currently, researchers from America, Britain, France and Sweden are on the cutting edge in this field [2]. Typical visual tracking platforms for UAV include Scan Eagle, GTMax, RQ-11, RQ-16, DragonFly, etc. Because of many advantages, such as small size, light weight, flexible, easy to carry and low cost, rotor UAV has a broad application prospect in the fields of traffic monitoring, resource exploration, electricity patrol, forest fire prevention, aerial photography, atmospheric monitoring, etc [3]. Vision based ground target tracking system for rotor UAV is such a system that gets images by the camera installed on a low-flying rotor UAV, then recognizes the target in the images and estimates the motion state of the target, and finally according to the visual information regulates the pan-tilt-zoom (PTZ) camera automatically to keep the target at the center of the camera view. In view of the current situation of international researches, the study of ground target tracking system for
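A minimal OpenCV sketch of the SIFT-based recognition step described above follows; the image file names, Lowe ratio threshold and RANSAC reprojection error are placeholder assumptions rather than values from the paper.

```python
import cv2
import numpy as np

target = cv2.imread("target_template.png", cv2.IMREAD_GRAYSCALE)   # placeholder path
frame = cv2.imread("uav_frame.png", cv2.IMREAD_GRAYSCALE)          # placeholder path

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(target, None)
kp2, des2 = sift.detectAndCompute(frame, None)

# Match descriptors and keep only matches passing Lowe's ratio test
matcher = cv2.BFMatcher(cv2.NORM_L2)
good = [m for m, n in matcher.knnMatch(des1, des2, k=2) if m.distance < 0.75 * n.distance]

# Locate the target in the frame with a RANSAC-estimated homography
if len(good) >= 4:
    src = np.float32([kp1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst = np.float32([kp2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
    H, mask = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    h, w = target.shape
    corners = cv2.perspectiveTransform(
        np.float32([[[0, 0]], [[w, 0]], [[w, h]], [[0, h]]]), H)
    print("target corners in the frame:", corners.reshape(-1, 2))
```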
Robust Motion Controller Design for High-Accuracy Positioning Systems
There are several considerations in choosing Q(s). Note that (1 - Q(s)) and Q(s) can be regarded, respectively, as a sensitivity function and a complementary sensitivity function of the velocity feedback loop. In this study a third-order binomial filter has been chosen to satisfy these properties. The general form and the selection of a binomial filter to obtain Q(s) were presented by Umeno and Hori [14].

The disturbance observer can be implemented digitally in several ways. The experimental results in Section V were obtained by converting it into the structure shown in Fig. 5, applying the bilinear transformation to G1(s) and G2(s) and converting them into digital filters. The feedforward friction compensator is explained later. An alternative implementation is given in [2].
I. Introduction
Modern mechanical systems, such as machine tools, microelectronics manufacturing equipment, manipulators, and automatic inspection machines, must be supported by motion controllers to ensure robust, high-speed, and high-accuracy positioning/tracking performance. High productivity usually requires high-speed operation. Accuracy requirements have become more stringent as the feature sizes of components in modern machinery and microelectronic products keep shrinking; the achievable accuracy is ultimately set by the measurement resolution. With an encoder in use, the goal is to keep the positioning error, including transients, close to the encoder resolution. Robustness implies not only robust stability but also robustness of performance. An important requirement is to avoid overly sensitive tuning of the controller parameters when the dynamic characteristics change from one unit to another, or within one unit during different operations.
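As a sketch of the digital implementation described in this excerpt (Q-filter discretized with the bilinear transformation), the code below assumes the simplest third-order low-pass form Q(s) = 1/(tau*s + 1)^3; the binomial filter of Umeno and Hori [14] may include additional numerator terms, and the bandwidth and sampling rate are example values.

```python
import numpy as np
from scipy import signal

tau = 1.0 / (2 * np.pi * 50.0)   # assumed Q-filter time constant (about 50 Hz bandwidth)
fs = 2000.0                       # assumed sampling rate in Hz

# Assumed Q(s) = 1 / (tau*s + 1)^3, written as numerator/denominator polynomials in s
num = [1.0]
den = [tau**3, 3 * tau**2, 3 * tau, 1.0]   # (tau*s + 1)^3 expanded

# Bilinear (Tustin) transformation to a digital filter, as in the implementation above
bz, az = signal.bilinear(num, den, fs)
print("Q(z) numerator:  ", bz)
print("Q(z) denominator:", az)

# Inside the observer loop the filter is applied sample by sample, e.g.
#   q_out = signal.lfilter(bz, az, q_in)
```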
Embedding Motion in Model-Based Stochastic Tracking
algorithms that keep only one configuration state [5], which are therefore sensitive to single failures in the presence of ambiguities or fast or erratic motion. In this paper, we address two important issues related to tracking with a particle filter. The first issue refers to the specific form of the observation likelihood, that relies on the conditional independence of observations given the state sequence. The second one refers to the choice of an appropriate proposal distribution, which, unlike the prior dynamical model, should take into account the new observations. To handle these issues, we propose a new particle filter tracking method based on visual motion. Our method relies on a new graphical model allowing for the natural introduction of implicit or explicit motion information in the likelihood term, and on the exploitation of explicit motion measurements in the proposal distribution. the above issues, our approach, and their benefits, is given in the following paragraphs. The definition of the observation likelihood distribution is perhaps the most important element in visual tracking with a particle filter. This distribution allows for the evaluation of the likelihood of the current observation given the current object state, and relies on the specific object representation. The object representation corresponds to all the information that characterizes the object like the target position, geometry, appearance, color, etc. Parametrized shapes like splines [2] or ellipses [6], and color distributions [5]–[8], are often used as target representation. One drawback of these generic representations is that they can be quite unspecific, which augments the chances of ambiguities. One way to improve the robustness of a tracker consists of combining low-level measurements such as shape and color [6]. The generic conditional form of the likelihood term relies on a standard hypothesis in probabilistic visual tracking, namely the independence of observations given the state sequence [2], [6], [9]–[13]. In this paper, we argue that this assumption can be inaccurate in the case of visual tracking. As a remedy, we propose a new model that assumes that the current observation depends on the current and previous object configurations as well as on the past observation. We show that under this more general assumption, the obtained particle filtering algorithm has similar equations than the algorithm based on the standard hypothesis. To our knowledge, this has not been shown before, and so it represents the first contribution of this article. The new assumption can thus be used to naturally introduce implicit or explicit motion information in the observation likelihood term. The introduction of such data correlation between successive images will turn generic trackers like shape or color histogram trackers into more specifi in Model-Based Stochastic Tracking
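To make the particle-filter prediction/weighting/resampling cycle discussed above concrete, here is a generic bootstrap filter step for a 1-D position. The paper's motion-informed proposal and its specific observation likelihood are not reproduced; the Gaussian dynamic and measurement models and all parameters below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def pf_step(particles, weights, observation, motion_std=2.0, obs_std=1.0):
    # 1) Prediction: propagate each particle through a simple random-walk dynamic model
    particles = particles + rng.normal(0.0, motion_std, size=particles.shape)
    # 2) Update: weight each particle by the likelihood of the new observation
    weights = weights * np.exp(-0.5 * ((observation - particles) / obs_std) ** 2)
    weights = weights / weights.sum()
    # 3) Resample when the effective sample size collapses
    if 1.0 / np.sum(weights ** 2) < 0.5 * len(particles):
        idx = rng.choice(len(particles), size=len(particles), p=weights)
        particles = particles[idx]
        weights = np.full(len(particles), 1.0 / len(particles))
    return particles, weights, np.sum(particles * weights)   # estimate = weighted mean

particles = rng.normal(0.0, 5.0, size=500)
weights = np.full(500, 1.0 / 500)
for true_pos in np.linspace(0, 20, 21):                      # object drifting to the right
    obs = true_pos + rng.normal(0.0, 1.0)
    particles, weights, est = pf_step(particles, weights, obs)
print("final estimate:", est)
```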
A Survey of Object Tracking Algorithms
Lu Huchuan, Dalian University of Technology

1. Introduction
Object tracking is an important problem in computer vision, with wide applications in many research directions such as motion analysis, video compression, action recognition, video surveillance, intelligent transportation, and robot navigation.
The main task of object tracking is: given the position of the target object in the first video frame, estimate the target's state in the subsequent frames by means of an appearance model and a motion model.
This is illustrated in Fig. 1.
Object tracking can be divided into five parts: the motion model, feature extraction, the appearance model, target localization, and model updating.
The motion model predicts, from the target's position in the previous frame, the region in which the target may appear in the current frame; most current algorithms use particle filtering or correlation filtering to model the target's motion.
Features are then extracted from the candidate (particle) image patches, and the appearance model is used to assess how likely each region predicted by the motion model is to be the tracked target, which gives the target location.
Because prior information about the tracked object is lacking, the model must be updated online during tracking so that the tracker can adapt to changes in the target's appearance and in the environment.
Although research on online object tracking has made great progress over the past few decades, the difficulties caused by changes in the appearance of the tracked target and of its surroundings mean that designing a robust online tracking algorithm is still a challenging topic.
This article surveys the relevant algorithms of recent years in this field.

2. Current research on object tracking
1. Correlation-filter-based tracking algorithms
Before correlation-filter trackers appeared, most tracking algorithms used the particle filter framework, and the number of particles was often an important factor limiting their speed.
Correlation filtering introduced a novel circular sampling method and uses the circular samples to build a circulant matrix.
By exploiting the special properties of circulant matrices under the time-frequency domain transform, the computation is carried out in the frequency domain, which greatly speeds up the training of the classifier.
Meanwhile, in the detection stage the classifier obtains in one pass the response map formed by the scores of all circular samples, and the target is located at the position of the maximum value.
The earliest use of correlation filtering for object tracking was in the MOSSE algorithm [1].
Since then, many improvements based on correlation filtering have achieved encouraging results in object tracking.
1.1 Improvements to features
Both the MOSSE [1] algorithm and the CSK [2] algorithm, which built on it by introducing fast computation with circulant matrices, use simple grayscale features; such features are easily disturbed by the external environment, which leads to inaccurate tracking.
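As a concrete illustration of the frequency-domain training and response computation described above, here is a minimal MOSSE-style correlation filter sketch. The preprocessing (log transform, cosine window), multi-frame training and online update rate of the original MOSSE paper are omitted, and the regularization constant and Gaussian label width are assumed values.

```python
import numpy as np

def gaussian_label(shape, sigma=2.0):
    """Desired response: a Gaussian peak, rolled so the peak sits at (0, 0)."""
    h, w = shape
    y, x = np.mgrid[0:h, 0:w]
    g = np.exp(-((x - w // 2) ** 2 + (y - h // 2) ** 2) / (2 * sigma ** 2))
    return np.roll(np.roll(g, -(h // 2), axis=0), -(w // 2), axis=1)

def train_filter(patch, lam=1e-3):
    """Closed-form filter in the Fourier domain: H* = (G * conj(F)) / (F * conj(F) + lam)."""
    F = np.fft.fft2(patch)
    G = np.fft.fft2(gaussian_label(patch.shape))
    return (G * np.conj(F)) / (F * np.conj(F) + lam)

def respond(H_conj, patch):
    """Response over all cyclic shifts at once; the argmax gives the target displacement."""
    F = np.fft.fft2(patch)
    resp = np.real(np.fft.ifft2(H_conj * F))
    dy, dx = np.unravel_index(np.argmax(resp), resp.shape)
    return resp, (dy, dx)

# Toy usage: train on a template patch, then locate it in a cyclically shifted patch.
template = np.random.rand(64, 64)
H_conj = train_filter(template)
shifted = np.roll(np.roll(template, 5, axis=0), 3, axis=1)
_, (dy, dx) = respond(H_conj, shifted)
print("estimated shift:", dy, dx)      # expected to be close to (5, 3)
```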
Bosch Security Systems, Video Systems 1-3: Product Description and FAQ Manual
Intrusion Detection with FW 6.60 Technical NoteTable of contents1Introduction31.1Applications (3)1.2Common product platform (CPP) (3)1.3Limitations (3)2Technical Background & FAQ32.1What is an intruder? (3)2.2How can I distinguish intruders from animals? (3)2.3What is a camera calibration and when do I need it? (4)2.4Where do false alerts in trees come from? (4)2.5Why are objects detected so late? (5)2.6How should I set up the camera view? (5)2.7How far into the distance can Intelligent Video Analytics / Essential Video Analytics detect objects? (5)3Setup63.1General advice (6)3.2Small, controlled environments, e.g. indoor (6)3.3Climbing walls or throwing across walls (6)3.4Façade protection (7)3.5Medium areas (7)3.6Large areas (7)3.7Infrared illumination & insect swarms (7)3.8Shaking / vibrating camera (8)3.9Optimization via forensic search (8)1IntroductionThis technical note describes the best practices for and answers commonquestions regarding intrusion detection using Intelligent Video Analytics orEssential Video Analytics with FW 6.60.1.1Applications④Perimeter protection④Sterile zones④Warehouse after hours④Solar plants④Façade protection④… and wherever and whenever no one is supposed to be within an area during a certain time1.2Common product platform (CPP)Bosch cameras can be clustered by their common product platform. As different platforms offer a different amount of processing power, this can make differences in the performance. For an overview of the different product platforms and the cameras belonging to them, see the tech note on Video Content Analysis (VCA) Capabilities per Device.1.3Limitations④Different performance and setup options on Intelligent Video Analytics on CPP4 and Essential Video Analytics on theone hand, and Intelligent Video Analytics on CPP6/7/7.3 on the other hand④Does not work for well-populated and crowded scenes④Does not work on elevators or other conveyance belts④Does not work if camera is moving2Technical Background & FAQ2.1What is an intruder?If we talk about intruders, we typically mean people entering areas which are off-limits to them. Depending on the application, however, the people may also sit in vehicles or bikes. Furthermore, professional intruders typically do not walk into the area, but crawl or roll to present the camera the least view of them that is possible.2.2How can I distinguish intruders from animals?You can’t. It is possible to separate standing and walking people from smaller animalslike dogs, foxes or rabbits by their size, but if we talk about professional intruderscrawling or rolling into the scene, then most of the time the difference to the animal in question is not large enough for a robust classification. There is currently no videoanalytic for intrusion detection on the market that can really solve this problem. If youare only interested in walking / standing persons, the automatic object classification canbe used. See the tech note on object classification for configuration details.Intruders: Walking, crawling, rollingA crawling person looks similar to a dog2.3 What is a camera calibration and when do I need it?A camera calibration teaches the camera about the perspective in the scene. Due to perspective, in the rear of the video images the persons appear smaller, they cover less pixel in the image, though their real size is the same. Perspective is thus needed whenever the real size and speed of objects is needed, as well as an automatic perspection correction of object sizes. 
Calibration becomes more important the larger the area covered by a single camera is. For small areas (10-20m distance), the perspective effect is typically neglectable, for larger areas, it becomesessential for robust performance. Note that the longer the distance, the less reliable object sizeand speed estimations become, as less pixel are available per meter.To calibrate, the position of the camera in relation to a single, planar ground plane is described by the elevation of the camera, the angles (tilt, roll) towards the ground plane and the focal length of the lens. As calibration is only done in reference to a single, planar ground plane, scenes with stairs, escalators, several ground levels, facades or rising ground cannot be calibrated correctly. If the rising ground only differs a little from the planar ground plane, a best effort calibration can be tried. In all other cases, please refrain completely from using a calibration and set the object size filters, if needed, for the different image regions by hand.Further information on calibration can be found in the VCA tech note on geolocation.2.4 Where do false alerts in trees come from?All video analytics algorithms for intrusion detection are based on three coretechnologies: Change detection, motion estimation and background subtraction.∙ Change detection evaluates per block whether the content of the block has changeswithin the last second. This technology is used in MOTION+. The disadvantage isthat every change in the image, be it an illumination change or actual motion,triggers the blocks as changed. Note that the after-image of the object also triggers the blocks.∙ For motion estimation, also called optical flow, it is evaluated for every part in theimage where this part was in the last image, or a second ago. Flow is based primarily on this technology.∙ For background subtraction, one, multiple or stochastic background images arelearned over time and updated continuously. Every difference to the learnedbackground is then extracted as a moving foreground object and tracked over time.Intelligent Video Analytics and Essential Video Analytics use this technology in combination with motion estimation.Manual object size filter (yellow) can also compensate perspective Perspective: Objects in the rear appear smaller ChangedetectionMotionestimationBackgroundsubtraction & trackingNow for all three technologies, if the wind moves a branch, then there is a distinct change to the position of the branch a second ago and the branch is detected as a potential change / motion / moving object. By combining the different technologies, tracking the objects over time and evaluating the consistency of motion, robustness can be achieved to suppress these kind of moving objects the user is typically not interested in.2.5 Why are objects detected so late?Objects are actually detected as soon as they appear, however, to validate that they are interesting objects with consistent motion and not spurious detections by wind in trees or flags or the falling of rain drops and snowflakes, Intelligent Video Analytics and Essential Video Analytics hold the detection back for a few frames. To get the objects as soon as theyappear, please go to MetadataGeneration->Tracking and disable the noise suppression. For Intelligent Video Analytics on CPP6/7/7.3, in addition raise the sensitivity to max.2.6 How should I set up the camera view?If possible make sure that intruders cross the field of view instead of walking towards the camera. 
Due to the perspective, a person walking toward the camera does not cross as many pixel in the image and does not have much apparent motion as a person crossing the camera view. Thus it is more difficult to detect and separate from noise. Higher elevation is preferred due to the same reason. Though higher poles are more expensive and prone to shaking, the lower a camera is mounted the less apparent motion objects walking toward the camera have, and the harder they are to detect. Note also that the more area is covered by the selected lens, the farther an object must travel to cross the same amount of pixel.2.7 How far into the distance can Intelligent Video Analytics / Essential VideoAnalytics detect objects?A general answer cannot be given, as this depends on the chosen camera, the chosen video aspect ratio, the camera perspective, on the focal length and on the light and weather conditions. Furthermore, both Intelligent Video Analytics and Essential Video Analytics are not directly computed on the original camera resolution but on a reduced one due to computational power limits. For an overview of which resolution is used on which camera, please see the tech note on VCA Capabilities per Device.Generally, a larger focal length indicates a larger zoom factor and a smaller width of the field of view. So one can see farther into the distance but less far to the left and right than with a smaller focal length. Furthermore, with a larger focal length, the unobserved area in front of the camera is much larger as well. Another trade-off for the larger detection range of larger focal length is that motion towards the camera takes longer to detect.The detection distance also depends on the size of the object, with longer detection ranges for larger objects. Wind in trees: branches moveConsistency of motionMotion towards thecamera is lessDistance resolution is best near the camera, and degrades heavily into the distance where a single pixel in the internal resolution often covers several meters of ground. Here an example for Essential Video Analytics:Use the Bosch Video Analytics and Lens Calculator at /LensCalculator/html/lens-calculator.html to determine video analytics detection distances for specific cameras, lenses and focal lengths.3SetupConfiguration of intrusion detection is separated into several applications, which all have their own characters and require different optimizations. For detection of intruding ships, please see the tech note on IVA 6.10 Ship Tracking. If you are interested in an application where you need to be alerted already if only a part of the intruder, like a hand or an arm, trespasses into the alarm zone, please see the tech note on Museum Mode FW 6.30.3.1General advice-Alarm field preferable to line crossing: For a line crossing to trigger, the object needs to be detected before and after the line. In combination with challenging situations like storm in trees, where detection can be delayed in favour of false alarm suppression, the placement of the line needs to be done with care. An alarm field, which triggers whenever an object is detected inside, covers larger areas and is thus inherently more robust.-Scheduling: This is available on DINION & FLEXIDOME cameras. Configure one or both VCA profiles, then change the VCA configuration to “Scheduled” and define the times where which VCA profile should run.-Alarm-based recording / adjustment of recording frame rates: Can be triggered by any alarm. 
Configurable via Recording -> Profiles, or via the alarm task editor for full flexibility3.2Small, controlled environments, e.g. indoor-Object size: Intelligent Video Analytics and Essential Video Analytics do not know whether a small object in the image is actually a small object, or a large object that is far away. Therefore they will detect and track even the smallest objects like leaves and small garbage blown by the wind in the foreground or small animals in the default settings. To avoid that, set the min object size, either in the MetadataGeneration -> Tracking, where it will completely suppress them, or filter them out in your alarm task.-Calibration: Perspective correction of the object size is not necessary for small environments for intrusion detection. -Fast object detection: In indoor applications, always go to MetadataGeneration->Tracking, disable the noise suppression and (Intelligent Video Analytics on CPP6/7/7.3) raise the sensitivity to max. Noise suppression and object validation is only needed for outdoor environments. In controlled, small outdoor applications, that is much tarmac and no trees, this can also be disabled.3.3Climbing walls or throwing across walls-Calibration: Don’t use any calibration. Flying and climbing objects will not be detected & tracked well if camera is calibrated.3.4Façade protection-Calibration: As the façade is vertical, calibration is not possible at all.-Fast object detection: As the façade is mostly tarmac, and if no trees are nearby who’s shadows can fall on the façade, then you don’t need noise suppression and object validation. Go to MetadataGeneration->Tracking, disable the noise suppression and (Intelligent Video Analytics on CPP6/7/7.3) raise the sensitivity to max.3.5Medium areas-Calibration: If the ground is more or less flat, then add calibration to teach the camera about perspective. Only thus is the most robust detection possible.-Noise suppression: Noise suppression is enabled by default and should stay that way. Use Intelligent Video Analytics on CPP6/7/7.3 cameras with noise suppression STRONG for best noise suppression.-Use scenario default “Intrusion (one field)” available from FW 6.60 onwards. Automatically enables 3D tracking, sets noise suppression to STRONG, and adds object filters targeted for suppressing non-persons. Needs calibrated camera.3.6Large areas-Calibration: If the ground is more or less flat, then add calibration to teach the camera about perspective. Only thus is the most robust detection with largest detection ranges possible.-Noise suppression: Noise suppression is enabled by default and should stay that way. Use noise suppression MEDIUM where available for longest detection distances while still suppressing many false alerts.-Double detection distance: For Intelligent Video Analytics on CPP6/7/7.3 with FW ≥6.30, detection distance for moving objects can be doubled by calibration the camera, using a 3D tracking mode, and setting noise suppression to OFF or MEDIUM.-Use scenario default “Intrusion (two fields)” available from FW 6.60 onwards. Automatically sets 3D tracking and noise suppression to MEDIUM where available and STRONG otherwise, thus enabling longest detection distances.Also adds a second detection field which also needs to be entered by alarm objects as well as object filters targeted for suppressing non-persons to further false alarm suppression. Needs calibrated camera.3.7Infrared illumination & insect swarmsInsects are drawn to the light of infrared illuminators. 
If the infrared illuminators are built into the camera or positioned close to it, a myriad of insects will flutter through the video and cause false alerts. Therefore, always position the illuminator at a distance of at least 80 cm from the camera.

Though false alerts due to insects cannot be suppressed completely, they are already greatly reduced in Intelligent Video Analytics on CPP6/7/7.3 cameras from FW 6.10 onward. With Intelligent Video Analytics on CPP4 cameras, with Essential Video Analytics, or to generally further reduce false alerts, use multi-line crossing or combine several single alarm rules and detections via the VCA task script language. From FW 6.60 onwards, alarming on an object inside, entering or leaving up to three alarm fields in a specified order is possible directly via the GUI, and is also part of the scenario default "Intrusion (two fields)". For FW < 6.60, VCA task scripting is needed.

For scripting, go to the video analytics configuration and open the task page. Right-click on the video and select Advanced -> VCA Task Editor. A separate popup with the current VCA task script will appear. Below is an example configuration which can be copied and pasted into the VCA task script editor. Further information about the VCA task script language and example scripts is available in the VCA task script language manual and a separate tech note.

VCA task script example: alarm if an object enters the field and afterwards, within 30 seconds, crosses the line in the middle of the field:

// Definition of task primitives
Resolution := { Min (-1, -1) Max (1, 1) };
Field #1 := { Point (-0.6, 0.95) Point (-0.25, -0.95) Point (0.25, -0.95) Point (0.6, 0.95) DebounceTime (0.50) };
Line #1 := { Point (0.0, -0.95) Point (0.0, 0.95) DebounceTime (0.50) };
//@Task T:0 V:0 I:1 "Enter Field and Line" {
external Event #1 := { EnteredField #1 before (*,30) CrossedLine #1 where first.oid == second.oid };
//@}

3.8 Shaking / vibrating camera

When the camera shakes, the content of the whole image shakes with it. The effects are especially visible around edges, as these cause the most change. Thus, false alerts can occur and the tracking of existing objects can be disrupted. Compensation for shaking and vibrating cameras within Intelligent Video Analytics on CPP6/7/7.3 cameras was introduced with FW 6.10 and refined with FW 6.20; it is always active there. With Intelligent Video Analytics on CPP4 cameras and with Essential Video Analytics, no compensation for shaking cameras is possible on the video analytics side.

3.9 Optimization via forensic search

The configuration of Intelligent Video Analytics and Essential Video Analytics consists of two parts. The first part defines object detection and tracking, also called metadata generation. This includes camera calibration, selection of the tracking mode, masking areas out of the processing, and defining idle / removed debounce times. This first part must be set up initially and cannot be re-evaluated retroactively on already recorded material. The second part of the configuration evaluates the metadata and includes tasks like line crossing, object in field and more. This second part can be fully evaluated and optimized using forensic search. To do so, record video including the events to be detected, then use a forensic-search-capable viewing client. Define or adapt your alarm tasks, and evaluate whether all events are detected correctly and how many false alerts remain.

Bosch Sicherheitssysteme GmbH, Robert-Bosch-Ring 5, 85630 Grasbrunn, Germany
© Bosch Sicherheitssysteme GmbH, 2018
Vehicle State Parameter Estimation Based on Unscented Kalman Filtering

ZHAO Wanzhong; ZHANG Han; WANG Chunyan

[Abstract] Since some vehicle state parameters cannot be obtained directly from sensors, a vehicle state parameter estimation method based on the unscented Kalman filter is proposed to improve the estimation accuracy of these parameters, so that state changes during driving can be judged accurately and the robustness of the control system can be enhanced.
Building on the traditional Kalman filter algorithm, the method applies the unscented Kalman filter to estimate state parameters such as the side-slip angle, yaw rate and road adhesion coefficient, and a joint simulation is carried out with Simulink and CarSim.
The results show that the unscented Kalman filter responds quickly, achieves higher estimation accuracy than the extended Kalman filter, and meets the control requirements of advanced vehicle dynamics control systems.
In order to improve the estimation accuracy of some vehicle state parameters that cannot be obtained by sensors directly, and thus to estimate the state variation of running vehicles accurately, a method on the basis of unscented Kalman filtering (UKF) is proposed, which helps enhance the robustness of the vehicle control system. In this method, a UKF algorithm on the basis of traditional Kalman filtering is developed to estimate such vehicle state parameters as side slip angle, yaw rate and road adhesion coefficient, and a simulation using both Simulink and CarSim software is carried out. The results indicate that the proposed UKF is superior to the extended Kalman filtering for its short response time and high estimation accuracy. Thus, it can meet the requirements of advanced dynamic control systems of vehicles.

[Journal] Journal of South China University of Technology (Natural Science Edition)
[Year (Volume), Issue] 2016, 44(3)
[Pages] 6 pages (pp. 76-80, 88)
[Keywords] unscented Kalman filtering; parameter estimation; side-slip angle; yaw rate; road adhesion coefficient
[Authors] ZHAO Wanzhong; ZHANG Han; WANG Chunyan
[Affiliations] College of Energy and Power Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China; State Key Laboratory of Mechanical System and Vibration, Shanghai Jiao Tong University, Shanghai 200240, China
[Language] Chinese
[CLC number] U461.6

The wide application of advanced vehicle dynamics control systems provides good handling performance and greatly improves driving safety [1-2]. To control the vehicle in a simpler, more precise and more intelligent way, the control unit must be able to acquire more parameters with higher accuracy. Using a limited set of sensors and an effective dynamics model, parameter estimation can provide as many state parameters of adequate accuracy as possible; this makes it possible to judge the vehicle's state changes accurately, improves the robustness of the control system [3-4], and reduces production cost, so it is an economical and effective approach. Existing parameter estimation methods [5-7] include state observers [8], fuzzy-logic estimation [9-10], neural networks [11], system identification, and Kalman filter estimation [12]. However, neural networks require a large number of training samples, and the weighting coefficients of fuzzy-logic estimation [13] depend strongly on engineering experience, so Kalman filter estimation is the most widely used. Within Kalman filtering, the extended Kalman filter (EKF) is most commonly adopted; but since the vehicle is a strongly nonlinear system, the first-order Taylor expansion in the EKF introduces truncation errors, so that under nonlinear driving conditions the estimates may lack accuracy or even diverge. The unscented Kalman filter (UKF) does not require the Jacobian of the nonlinear functions, can handle non-differentiable nonlinearities, and achieves higher estimation accuracy than the EKF, making it better suited to parameter estimation for nonlinear systems. Accordingly, this paper uses the UKF to estimate the side-slip angle, yaw rate, road adhesion coefficient and other state parameters, performs a joint simulation with Matlab/Simulink and CarSim, and compares the estimates with the actual CarSim outputs and with EKF estimates to verify the estimation accuracy.

1.1 Vehicle dynamics model. This paper focuses on the motion of a vehicle on a flat road. A longitudinal degree of freedom is added to the linear two-DOF model, giving a model with three degrees of freedom: lateral, yaw and longitudinal. Its equations of motion are as follows: (1) where vy is the lateral velocity, vx the longitudinal velocity, γ the yaw rate, d1 the distance from the center of mass to the front axle, d2 the distance from the center of mass to the rear axle, m the vehicle mass, δ the front-wheel steering angle, k1 and k2 the total cornering stiffnesses of the front and rear tires, β the side-slip angle, ax the longitudinal acceleration, ay the lateral acceleration, and Iz the moment of inertia about the z-axis.

1.2 Tire model. To simplify the computation and improve efficiency while still describing the tire forces accurately under different road adhesion coefficients and slip angles, the Dugoff tire model [14], which has few parameters, is used. The longitudinal and lateral tire forces Fx and Fy of a single wheel are expressed as follows, where Fz is the vertical tire load, μ0 the road adhesion coefficient, s the slip ratio, Cx and Cy the longitudinal and cornering stiffnesses of the tire, α the tire slip angle, L a boundary value describing the tire nonlinearity, and ε a velocity factor correcting for the influence of slip speed on the tire force. The Dugoff model can be simplified into a normalized form in which the normalized longitudinal and lateral tire forces are independent of the road adhesion coefficient. The vertical loads of the four wheels are expressed in terms of h, the height of the center of mass, df the front track width, dr the rear track width, and l = d1 + d2 the wheelbase.

1.3 Four-wheel vehicle dynamics model. To obtain a state model in terms of the road adhesion coefficient, a four-wheel vehicle dynamics model is built on top of the Dugoff tire model and used to estimate the road adhesion coefficient in real time during driving. In the dynamics equations, μfl, μfr, μrl and μrr are the road adhesion coefficients of the four wheels, and the corresponding terms are the normalized longitudinal and lateral forces of the four wheels.

The Kalman filter [15] is an optimal state estimation algorithm applicable to all kinds of dynamic systems subject to random disturbances. It provides a highly efficient recursive algorithm that produces a linear, unbiased, minimum-error-variance optimal estimate of the system state from a sequence of noisy discrete observations obtained in real time.
The unscented Kalman filter [16] is a newer class of nonlinear filtering algorithm. Rather than approximating the nonlinear functions, it approximates the random distribution directly by a weighted sum of sample points, while the measurement update follows the update principle of the Kalman filter. For a nonlinear discrete-time system of the form (8), the sigma points are constructed as in (9), with weights that depend on n, the dimension of the state vector to be estimated. Assuming the state estimate and covariance matrix at the previous time step are x̂(k-1) and Px(k-1), filtering the nonlinear system (8) with the UKF proceeds as follows: (1) set the initial values; (2) time update: for k > 1, construct the 2n+1 sigma points χ according to Eq. (9), propagate them through the process model to obtain the predicted sigma points, and compute the mean and covariance of the predicted points; (3) measurement update: when a new measurement z(k) is obtained, update the state mean and covariance.

To estimate the state changes of the vehicle accurately, the yaw rate, side-slip angle and longitudinal velocity are taken as state variables, the front-wheel steering angle δ and the longitudinal acceleration ax as control inputs, and the lateral acceleration ay as the output, i.e. y = ay. Combining these with the dynamics equations (1), a joint Simulink/CarSim simulation is set up (Fig. 1) and the estimates are compared with the CarSim outputs. The simulation parameters are taken from CarSim: k1 = -143 583 N/rad, k2 = -111 200 N/rad, Iz = 4607.47 kg·m2, m = 1529.98 kg, a steering ratio of 17 from the steering wheel to the front wheels, d1 = 1.14 m, d2 = 1.64 m, df = dr = 1.55 m, and h = 0.518 m.

Case 1: The CarSim speed is set to 65 km/h as the initial state and a step steering-wheel input with an amplitude of 1 rad is applied; the results are shown in Fig. 2. When the step input is applied, the driving state changes; the estimates deviate somewhat from the true values at the beginning, but over time they follow the true values well, with a steady error of about 2%.

Case 2: The speed is kept at 65 km/h with the same initial state, while the steering input is changed to a sine-sweep steer; the results are shown in Fig. 3. Under the sinusoidal input the driving state changes continuously, yet the estimates track the true values well with very small errors, so the accuracy is sufficient for the subsequent road adhesion coefficient estimation.

To estimate the road adhesion coefficient, Eq. (7) of the four-wheel vehicle dynamics model is chosen as the measurement equation, since all of its variables can be measured by sensors or estimated indirectly. The state variables are now the adhesion coefficients of the four wheels, x = [μfl, μfr, μrl, μrr], the control input is u = δ, and the normalized tire forces in the measurement equation are obtained from the Dugoff tire model. Besides the vertical loads, the slip ratios and the tire slip angles α are required; they are computed from Eq. (17), where sij, vij, αij and ωij (ij = fl, fr, rl, rr) are the slip ratio, speed, slip angle and rotational speed of each wheel, and vcog is the speed of the vehicle's center of mass. The inputs to the tire model are the front-wheel steering angle δ (obtained from the steering-wheel angle and the steering ratio), the four wheel speeds ωfl, ωfr, ωrl, ωrr (from wheel-speed sensors), the longitudinal and lateral accelerations ax and ay (from accelerometers), and the side-slip angle β, yaw rate γ and longitudinal velocity vx estimated in the previous step.

Combining the vehicle state estimates with the Dugoff tire model, a joint Simulink/CarSim simulation is run (Fig. 4). In the high-adhesion case the road adhesion coefficient is set to 0.85 and a step steering input is simulated; as Fig. 5 shows, the UKF estimates of the adhesion coefficient agree well with the true values. The mean total estimation error over the four tires is 0.0070, i.e. about 0.8%, which is accurate enough for use on real vehicles. On low-adhesion roads the vehicle slips easily when steering, so to verify the accuracy in this situation the steering input is set to a sinusoidal signal and the adhesion coefficient to 0.3; for comparison, both the UKF and the EKF are applied, with results shown in Fig. 6. Both estimates follow the true values, but the UKF is clearly better than the EKF: the EKF has a mean error of 0.0015 and a standard deviation of 0.0159, while the UKF has a mean error of 0.0003 and a standard deviation of 0.0059, an accuracy improvement of about 3%.

In this paper the unscented Kalman filter is used to estimate the side-slip angle, yaw rate, road adhesion coefficient and other vehicle states and parameters. The results show that the UKF can obtain the real-time state and parameter changes of the vehicle from simple and effective models, which verifies the efficiency and accuracy of the UKF for estimating handling-stability states and parameters, and that its estimation accuracy is higher than that of the EKF. Using the proposed estimates to control the driving or braking torque can therefore effectively mitigate wheel slip during driving and wheel lock-up during braking, and thus improve driving safety.

[References]
[1] KURISHIGE M, WADA S, KIFUKU T, et al. A new EPS control strategy to improve steering wheel returnability [R]. Warrendale: SAE International, 2000.
[2] JIANG F, GAO Z, JIANG F. An adaptive nonlinear filter approach to the vehicle velocity estimation for ABS [C]∥Proceedings of IEEE International Conference on Control Applications. Anchorage: IEEE, 2000: 490-495.
[3] LI L, WANG F Y, ZHOU Q Z. A robust observer designed for vehicle lateral motion estimation [C]∥Proceedings of the IEEE Intelligent Vehicles Symposium. Las Vegas: IEEE, 2005: 417-422.
[4] 刘伟. 车辆电子稳定性控制系统质心侧偏角非线性状态估计的研究 [D]. 长春: 吉林大学, 2009.
[5] 林棻, 赵又群. 汽车侧偏角估计方法比较 [J]. 南京理工大学学报(自然科学版), 2009, 33(1): 122-126. LIN Fen, ZHAO Youqun. Comparison of methods for estimating vehicle sideslip angle [J]. Journal of Nanjing University of Science and Technology (Natural Science), 2009, 33(1): 122-126.
[6] JIN X, YIN G, LIN Y. Interacting multiple model filter-based estimation of lateral tire-road forces for electric vehicles [R]. Warrendale: SAE International, 2014.
[7] BIAN M, CHEN L, LUO Y, et al. A dynamic model for tire/road friction estimation under combined longitudinal/lateral slip situation [R]. Warrendale: SAE International, 2014.
[8] IMSLAND L, JOHANSEN T A, FOSSEN T I, et al. Vehicle velocity estimation using nonlinear observers [J].
Automatica,2006,42(12):2091-2103.[9] DAISSA A,KIENCKE U.Estimation of vehicle speed fuzzy-estimation in comparision with Kalman-filtering [C]∥Proceedings of the 4th IEEE Conference on Control Applications.Albany:IEEE,2002:281-284.[10] 施树明,HENK L,PAUL B,等.基于模糊逻辑的车辆侧偏角估计方法 [J].汽车工程,2005,27(4):426- 470.SHI Shu-ming,HENK L,PAUL B,etal.Estimation of vehicle side slip angle based on fuzzy logic [J].Automotive Engineering,2005,27(4):426- 470.[11] SASAKI H,NISHIMAKI T.A side-slip angle estimation using neural network for a wheeled vehicle [R].Warrendale:SAE International,2000.[12] LI L,SONG J,LI H Z.A variable structure adaptive extended Kalman filter for vehicle slip angle estimation [J].International Journal of Vehicle Design,2011,56 (1/2/3/4):161-185.[13] 李刚,宗长富,张强,等.基于模糊路面识别的4WID电动车驱动防滑控制 [J].华南理工学报(自然科学版),2012,40(12):99-106.LI Gang,ZONG Chang-fu,ZHANG Qiang,et al.Anti slip control of 4WID electric vehicle based on fussy road identification [J].Journal of South China University of Technology(Natural Science Edition),2012,40(12):99-106.[14] 周磊,张向文.基于Dugoff轮胎模型的爆胎车辆运动学仿真 [J].计算机仿真,2012,29(6):308-385.ZHOU Lei,ZHANG Xiang-wen.Simulation of vehicle dynamics in tire blow-out process based on Dugoff tire model [J].Computer Simulation,2012,29(6):308-385.[15] WENZEL T A,BURNHAMK J,BLUNDELLM V,et al.Dual extended Kalman filter for vehicle state and para-meter estimation [J].Vehicle System Dynamics,2006,44(2):153-171.[16] 刘胜.最优估计理论 [M].北京:高等教育出版社,2009.。
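The UKF recursion outlined in the article above can be illustrated with a short Python sketch of the unscented transform and one predict/update step. This is a generic illustration only, not the paper's implementation: the process model f, measurement model h, noise covariances Q and R, and the scaling parameter kappa are placeholders supplied by the user.

import numpy as np

def sigma_points(x, P, kappa=0.0):
    """Construct the 2n+1 sigma points and weights for mean x and covariance P."""
    n = x.size
    S = np.linalg.cholesky((n + kappa) * P)        # matrix square root of (n+kappa)P
    pts = [x] + [x + S[:, i] for i in range(n)] + [x - S[:, i] for i in range(n)]
    w = np.full(2 * n + 1, 1.0 / (2 * (n + kappa)))
    w[0] = kappa / (n + kappa)
    return np.array(pts), w

def ukf_step(x, P, u, z, f, h, Q, R, kappa=0.0):
    """One UKF time update and measurement update (f: process model, h: measurement model)."""
    # time update: propagate the sigma points through the process model
    X, w = sigma_points(x, P, kappa)
    Xp = np.array([f(xi, u) for xi in X])
    x_pred = w @ Xp
    P_pred = Q + sum(wi * np.outer(d, d) for wi, d in zip(w, Xp - x_pred))
    # measurement update: map the predicted points into measurement space
    Zp = np.array([h(xi) for xi in Xp])
    z_pred = w @ Zp
    Pzz = R + sum(wi * np.outer(d, d) for wi, d in zip(w, Zp - z_pred))
    Pxz = sum(wi * np.outer(dx, dz) for wi, dx, dz in zip(w, Xp - x_pred, Zp - z_pred))
    K = Pxz @ np.linalg.inv(Pzz)                   # Kalman gain
    return x_pred + K @ (z - z_pred), P_pred - K @ Pzz @ K.T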
SLAM Feature Tracking Methods
slam特征跟踪方法From a technical standpoint, SLAM feature tracking methods play a vital role in accurately estimating therobot's pose and mapping the environment. These methods typically rely on extracting and matching visual or geometric features across consecutive frames to establish correspondences and compute the robot's motion. Feature tracking algorithms should be robust to changes in lighting conditions, viewpoint variations, occlusions, and dynamic objects. Moreover, they should be able to handle large-scale environments and real-time processing requirements. Achieving these objectives is challenging due to the complexity and dynamic nature of real-world environments.One popular approach to SLAM feature tracking is theuse of feature descriptors, such as SIFT (Scale-Invariant Feature Transform) or ORB (Oriented FAST and Rotated BRIEF). These descriptors encode distinctive information about the features, allowing for reliable matching across frames. However, feature descriptors alone may not be sufficient inchallenging scenarios with significant viewpoint changes or occlusions. To address this, researchers have proposed methods that combine feature descriptors with geometric constraints, such as epipolar geometry or 3D point cloud information. These methods leverage the geometric relationships between the features to improve tracking accuracy and robustness.Another important aspect of SLAM feature tracking is the initialization of the tracking process. When a robot starts exploring a new environment, it needs to identify and track features from scratch. This initialization step is crucial for accurate motion estimation and subsequent mapping. Various methods have been proposed to address this challenge, including keypoint detection algorithms, such as Harris corners or FAST (Features from Accelerated Segment Test), which aim to identify salient features in the scene. Once the initial set of features is obtained, the tracking process can be initialized and refined using feature matching and motion estimation techniques.In recent years, deep learning-based approaches havealso shown promise in SLAM feature tracking. Convolutional neural networks (CNNs) have been employed to learn feature representations directly from raw image data, eliminating the need for handcrafted descriptors. These learned features can be more robust to variations in lighting and viewpoint, potentially improving tracking performance. Additionally, recurrent neural networks (RNNs) and long short-term memory (LSTM) networks have been explored for modeling temporal dependencies in feature tracking, enabling better handling of motion blur or fast camera movements.Despite the advancements in SLAM feature tracking methods, several challenges remain. One major challenge is the trade-off between tracking accuracy and computational efficiency. SLAM systems often operate in real-time, and the feature tracking component should be able to process frames at high frame rates while maintaining accurate estimates. This requires efficient feature detection, matching, and motion estimation algorithms. Another challenge is the robustness of feature tracking in dynamic environments. Moving objects or changes in the scene candisrupt feature correspondences and lead to tracking failures. 
Developing methods that can handle dynamic environments and recover from failures is an ongoing research topic.

In conclusion, SLAM feature tracking methods are crucial for enabling mobile robots to navigate and map their surroundings simultaneously. These methods involve extracting, matching, and tracking distinctive features in the environment to estimate the robot's motion and build a map. While feature descriptors and geometric constraints have traditionally been used, recent advances in deep learning have opened new possibilities for improving tracking accuracy and robustness. However, challenges such as real-time processing, dynamic environments, and tracking initialization still need to be addressed. Continued research and development in SLAM feature tracking methods will contribute to the advancement of robotics and computer vision, enabling robots to operate autonomously in complex and dynamic environments.
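As a concrete illustration of the descriptor-based tracking discussed above, the sketch below matches ORB features between two consecutive grayscale frames using OpenCV and a brute-force Hamming matcher. It is a generic example, not tied to any particular SLAM system; the feature budget and the ratio-test threshold are assumed values.

import cv2

def track_features(prev_gray, curr_gray, ratio=0.75):
    """Match ORB features between two grayscale frames and return point correspondences."""
    orb = cv2.ORB_create(nfeatures=1000)
    kp1, des1 = orb.detectAndCompute(prev_gray, None)
    kp2, des2 = orb.detectAndCompute(curr_gray, None)
    if des1 is None or des2 is None:
        return []
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING)
    matches = matcher.knnMatch(des1, des2, k=2)
    pairs = []
    for pair in matches:
        if len(pair) < 2:
            continue
        m, n = pair
        if m.distance < ratio * n.distance:          # Lowe's ratio test
            pairs.append((kp1[m.queryIdx].pt, kp2[m.trainIdx].pt))
    return pairs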
Research Progress in Moving Object Detection and Tracking Algorithms

0 Introduction

Most of the environmental information humans perceive is obtained through vision, and among all the visual information received, people tend to be more interested in dynamic information. With the development of multimedia technology, people are exposed to an ever-growing amount of video information. On the one hand, high compression ratios are needed to store this information; on the other hand, regions or objects of interest need to be operated on [1]. Extracting, classifying, recognizing and tracking moving objects in video images has therefore become the main means of understanding the behavior of moving objects and describing the dynamic information in video imagery. Technically, moving object detection and tracking combines knowledge from computer vision, video and image processing, pattern recognition, automatic control and other related fields [2]. It is an important research direction in video technology with very wide applications, including traffic flow monitoring, security surveillance, military guidance, visual navigation and video coding. Many results have been achieved so far, and new techniques and algorithms keep emerging. In real environments, however, the complexity of natural conditions (changes in illumination, weather, etc.) and the high maneuverability of targets interfere with detection and tracking, leading to inaccurate detection and low tracking efficiency. Research on improving moving object detection and tracking algorithms therefore has real practical significance and application value.
1 Common Algorithms for Moving Object Detection

Moving object detection extracts the changing regions from the background of video images. According to the relationship between the target and the camera, such algorithms can be divided into motion detection under a static background and motion detection under a dynamic background. Under a static background, only the monitored target moves within the camera's field of view; under a dynamic background, the camera itself also moves, which produces complex relative motion between target and background and makes detection and tracking much more difficult. Research on detection and tracking under dynamic backgrounds is still limited, so this paper does not cover moving object detection and tracking under a moving background. Under a static background, there are three main detection algorithms: frame differencing, background subtraction and optical flow. These three algorithms are analyzed below.

1.1 Frame differencing

The basic principle of frame differencing [3] is to subtract the gray values of corresponding pixels in adjacent frames and binarize the difference image to determine the moving objects. Its main advantages are: the algorithm is simple to implement with low programming complexity; there is no need to acquire, update or store a background model; it is not very sensitive to lighting changes in the scene; and it has good real-time performance.
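A minimal OpenCV sketch of the frame-differencing principle described in Section 1.1 follows; the binarization threshold, morphology kernel and minimum blob area are arbitrary illustration values, not taken from the text.

import cv2

def frame_difference(prev_gray, curr_gray, thresh=25):
    """Detect motion by differencing two consecutive grayscale frames."""
    diff = cv2.absdiff(curr_gray, prev_gray)                         # per-pixel gray-level difference
    _, mask = cv2.threshold(diff, thresh, 255, cv2.THRESH_BINARY)    # binarize the difference image
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)            # remove isolated noise pixels
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    return [cv2.boundingRect(c) for c in contours if cv2.contourArea(c) > 50]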
W4: Who? When? Where? What? (translation of the original paper)
College Park, MD 20742
Abstract
W4 is a real time visual surveillance system for detecting and tracking people and monitoring their activities in an outdoor environment. It operates on monocular grayscale video imagery, or on video imagery from an infrared camera. Unlike many systems for tracking people, W4 makes no use of color cues. Instead, W4 employs a combination of shape analysis and tracking to locate people and their parts (head, hands, feet, torso) and to create models of people's appearance so that they can be tracked through interactions such as occlusions. W4 is capable of simultaneously tracking multiple people even with occlusion. It runs at 25 Hz for 320x240 resolution images on a dual-Pentium PC.

W4 has been designed to work with only monochromatic video sources, either visible or infrared. While most previous work on detection and tracking of people has relied heavily on color cues, W4 is designed for outdoor surveillance tasks, and particularly for nighttime or other low light level situations. In such cases, color will not be available, and people need to be detected and tracked based on weaker appearance and motion cues. W4 is a real time system. It is currently implemented on a dual-processor Pentium PC and can process between 20-30 frames per second, depending on the image resolution (typically lower for IR sensors than video sensors) and the number of people in its field of view. In the long run, W4 will be extended with models to recognize the actions of the people it tracks. Specifically, we are interested in interactions between people and objects, e.g. people exchanging objects, leaving objects in the scene, or taking objects from the scene. The descriptions of people (their global motions and the motions of their parts) developed by W4 are designed to support such activity recognition.
HSfM: Hybrid Structure-from-Motion (study notes)

Abstract: To estimate the initial camera poses, SfM methods can be broadly categorized as incremental or global. Although incremental systems have advanced in both robustness and accuracy, efficiency remains their main challenge. To address this, global reconstruction systems estimate the poses of all cameras simultaneously from the epipolar geometry graph, but they are sensitive to outliers. In this work, a hybrid SfM method is proposed that addresses efficiency, accuracy and robustness in a unified framework. More specifically, a community-based adaptive averaging approach is proposed: camera rotations are first estimated in a global manner, and then, based on these estimated rotations, the camera centers are computed incrementally. Extensive experiments show that, in terms of computational efficiency, the hybrid method performs similarly to or better than many state-of-the-art global SfM methods, while achieving reconstruction accuracy and robustness similar to state-of-the-art incremental SfM methods.
Introduction

SfM refers to estimating the 3D scene structure and camera poses from a collection of images. It typically comprises three modules: feature extraction and matching, initial camera pose estimation, and bundle adjustment (BA). Depending on how the initial camera poses are estimated, SfM can be roughly divided into two categories: incremental and global. For incremental methods, one approach is to select a few seed images for an initial reconstruction and then repeatedly add new images. Another approach first clusters the images into atomic models, reconstructs each atomic model, and then merges them progressively. Arguably, the incremental approach is the most popular strategy for 3D reconstruction. However, it is sensitive to the initial seed reconstruction and to the way the model is grown. In addition, reconstruction errors accumulate as the iterations proceed. For large-scale scene reconstruction, the reconstructed structure may suffer from scene drift. Moreover, the time-consuming bundle adjustment is performed repeatedly, which greatly reduces system stability and efficiency. To overcome these shortcomings, global SfM methods have become more popular in recent years. In global methods, the initial camera poses are estimated simultaneously from the epipolar geometry (EG) graph, whose vertices correspond to images and whose edges link matched image pairs; BA is performed only once, which brings greater potential in terms of system efficiency and scalability. The general pipeline of global camera pose estimation consists of two steps: rotation averaging and translation averaging.
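As a rough illustration of the "global rotations first, incremental centers second" idea, the sketch below initializes absolute camera rotations by chaining pairwise relative rotations over the EG graph. The data layout and the convention R_j = R_ij R_i are assumptions, and a real system would refine this crude initialization with a proper rotation-averaging step over all edges.

import numpy as np
from collections import deque

def chain_rotations(n_cams, rel_rot, root=0):
    """Initialize absolute camera rotations from pairwise relative rotations by walking
    a spanning tree of the view graph (convention assumed: R_j = R_ij @ R_i)."""
    R = {root: np.eye(3)}
    queue = deque([root])
    while queue:
        i = queue.popleft()
        for (a, b), R_ab in rel_rot.items():
            if a == i and b not in R:
                R[b] = R_ab @ R[i]                 # forward edge i -> b
                queue.append(b)
            elif b == i and a not in R:
                R[a] = R_ab.T @ R[i]               # traverse the edge in reverse
                queue.append(a)
    return [R.get(i, np.eye(3)) for i in range(n_cams)]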
A Graph-Optimization-Based Incremental Monocular SFM 3D Reconstruction Method

No. 6, February 2019. DUAN Jianwei (Key Laboratory of Mine Spatial Information Technology of the National Administration of Surveying, Mapping and Geoinformation, Henan Polytechnic University, Jiaozuo 454003, China)

0 Introduction

Structure from Motion (SFM) is a new technique for recovering 3D structure from video sequences or image data, and it is also the core of multi-view 3D reconstruction in computer vision [1].
Compared with existing 3D modeling software (such as 3D MAX and PhotoScan), SFM is convenient to operate and requires no professional modeling skills [2], and it has become a research hotspot for scholars at home and abroad. Xue Wu et al. [3] found that SFM reaches the accuracy of POS-assisted bundle adjustment and can meet 1:500 mapping requirements. To improve the performance of SFM systems, Wu [4] of the University of Washington proposed a new BA strategy that strikes a good balance between speed and accuracy. Relatively mature SFM-based software has been developed abroad, such as VisualSFM shared by Wu [5], which uses CMVS and PMVS [6] to obtain dense point clouds with better visual quality. The Institute of Automation of the Chinese Academy of Sciences has also developed an image-based reconstruction system, CVSuite [7]. SFM is now widely used in 3D reconstruction [8-9], augmented reality [10], 3D map reconstruction [11], image inpainting [12], autonomous driving [13] and other fields. SFM can be divided into global SFM and incremental SFM. The former is particularly sensitive to false matches; even a single erroneous point may cause the solution to fail [2]. Although the iterative optimization steps of the latter can remove most false matches and reduce the influence of outliers on the estimates [14], the iterative process of adding images accumulates errors in the estimated relative camera poses, so the 3D reconstruction suffers from drift [4]. This paper proposes a graph-optimization-based incremental SFM 3D reconstruction method: a graph optimization [15] model is built with the minimization of the reprojection error as the cost function to refine the estimated camera poses and the reconstructed 3D point cloud, and experimental validation shows that the method meets SfM reconstruction requirements.
1 SFM and Graph Optimization Theory

1.1 SFM principle

In computer vision, a 3D point P and its projection p satisfy the relation

p_ij = K_i (R_i P_j + t_i)    (1)

where R is the rotation matrix of the image, t its translation vector, and K the intrinsic camera matrix.
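A small numerical sketch of the projection relation in Eq. (1) and of the reprojection error that the graph optimization minimizes is given below; the intrinsics, pose and points are made-up example values, not data from the paper.

import numpy as np

def project(K, R, t, P):
    """Project a 3D point P into the image: p = K (R P + t), then dehomogenize."""
    p = K @ (R @ P + t)
    return p[:2] / p[2]

def reprojection_error(K, R, t, points_3d, observations):
    """Mean squared distance between observed pixels and reprojected 3D points."""
    errs = [np.linalg.norm(project(K, R, t, P) - z) ** 2
            for P, z in zip(points_3d, observations)]
    return float(np.mean(errs))

# toy example with assumed intrinsics and an identity pose
K = np.array([[800.0, 0, 320], [0, 800.0, 240], [0, 0, 1]])
R, t = np.eye(3), np.zeros(3)
pts = [np.array([0.1, -0.2, 4.0]), np.array([-0.3, 0.1, 5.0])]
obs = [project(K, R, t, P) + 0.5 for P in pts]    # observations with a small artificial offset
print(reprojection_error(K, R, t, pts, obs))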
A Keyframe Selection Method for Laser Positioning and Navigation

Laser positioning and navigation is a crucial technology in various fields such as robotics, autonomous vehicles, and augmented reality. One of the key components of laser positioning and navigation is the selection of keyframes. Keyframes are frames of video or images that are selected as the most representative or informative frames in a sequence. The selection of keyframes is essential for efficient and accurate laser positioning and navigation. In this article, we will explore different methods for selecting keyframes in laser positioning and navigation. One method for selecting keyframes in laser positioning and navigation is based on feature detection and matching.
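One simple way to turn that idea into a rule is to declare a new keyframe whenever the fraction of features from the last keyframe that can still be matched in the current frame drops below a threshold. The sketch below is a generic illustration; the match_features routine and the 0.4 overlap threshold are assumptions rather than part of any specific system.

def select_keyframes(frames, match_features, min_overlap=0.4):
    """Keep a frame as a keyframe when too few features of the last keyframe survive.
    match_features(a, b) is expected to return (number of matches, number of features in a)."""
    if not frames:
        return []
    keyframes = [0]                                # the first frame always becomes a keyframe
    for i in range(1, len(frames)):
        matches, total = match_features(frames[keyframes[-1]], frames[i])
        overlap = matches / max(total, 1)          # fraction of keyframe features still matched
        if overlap < min_overlap:
            keyframes.append(i)
    return keyframes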
Implementation of MoveIt Motion Planning

The MoveIt Motion Planning Framework is a robust and flexible solution for robot motion planning tasks. It leverages the capabilities of the Robot Operating System (ROS) to provide a comprehensive set of tools for generating collision-free paths for robotic arms and other mobile robots. The core of MoveIt lies in its ability to create and manipulate a robot's kinematic model. This model represents the robot's physical structure and the relationships between its joints and links. MoveIt uses this model to understand the robot's workspace and potential collisions with obstacles.
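A rough sketch of how a collision-free motion could be requested through MoveIt's Python interface (ROS 1, moveit_commander) is shown below. The planning-group name "manipulator" and the target pose are assumptions that depend on the robot's own configuration, not values prescribed by MoveIt.

import sys
import rospy
import moveit_commander
from geometry_msgs.msg import Pose

moveit_commander.roscpp_initialize(sys.argv)
rospy.init_node("moveit_motion_planning_demo")

group = moveit_commander.MoveGroupCommander("manipulator")   # planning group name is robot-specific

target = Pose()                                              # an assumed reachable target pose
target.position.x, target.position.y, target.position.z = 0.4, 0.1, 0.4
target.orientation.w = 1.0

group.set_pose_target(target)
success = group.go(wait=True)      # plan a collision-free trajectory to the target and execute it
group.stop()                       # ensure there is no residual movement
group.clear_pose_targets()
moveit_commander.roscpp_shutdown()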
This work was supported by an ERCIM post-doctoral fellowship at IRISA/INRIA, Rennes, France. * Corresponding author. Email addresses: venkatesh.babu@ (R. Venkatesh Babu), perez@irisa.fr (Patrick Pérez), Patrick.Bouthemy@irisa.fr (Patrick Bouthemy).
b IRISA/INRIA-Rennes, France
Abstract

Visual tracking has been a challenging problem in computer vision over the decades. The applications of visual tracking are far-reaching, ranging from surveillance and monitoring to smart rooms. The mean-shift tracker, which has gained attention recently, is known for tracking objects in a cluttered environment. In this work, we propose a new method to track objects by combining two well-known trackers: the sum-of-squared-differences (SSD) tracker and the color-based mean-shift (MS) tracker. In the proposed combination, the two trackers complement each other by overcoming their respective disadvantages. The rapid model change in the SSD tracker is overcome by the MS tracker module, while the inability of the MS tracker to handle large displacements is circumvented by the SSD module. The performance of the combined tracker is shown to be better than that of the individual trackers for tracking fast-moving objects. Since the MS tracker relies on global object parameters such as color, its performance degrades when the object undergoes partial occlusion. To avoid the adverse effects of the global model, we use the MS tracker to track local object properties instead of global ones. Further, likelihood-ratio weighting is used for the SSD tracker to avoid drift during partial occlusion and to update the MS tracking modules. The proposed tracker outperforms the traditional MS tracker, as illustrated.

Key words: Visual Tracking, Mean-Shift, Object Tracking, Kernel Tracking
Preprint submitted to Elsevier Science
19 July 2006
1 Introduction
The objective of object tracking is to faithfully locate the targets in successive video frames. The major challenges encountered in visual tracking are cluttered background, noise, change in illumination, occlusion and scale/appearance change of the objects. Considerable work has already been done in visual tracking to address the aforementioned challenges. Most tracking algorithms can be broadly classified into the following four categories. (1) Gradient-based methods locate target objects in the subsequent frame by minimizing a cost function [1,2]. (2) Feature-based approaches use features extracted from image attributes such as intensity, color, edges and contours for tracking target objects [3-5]. (3) Knowledge-based tracking algorithms use a priori knowledge of target objects such as shape, object skeleton, skin color models and silhouette [6-9]. (4) Learning-based approaches use pattern recognition algorithms to learn the target objects in order to search for them in an image sequence [10-12].

Visual tracking in a cluttered environment has remained one of the challenging problems in computer vision over the past few decades. Various applications like surveillance and monitoring, video indexing and retrieval require the ability to faithfully track objects in a complex scene involving appearance and scale change. Though there exist many techniques for tracking objects, color-based tracking with kernel density estimation, introduced in [13,8], has recently gained more attention among the research community due to its low computational complexity and robustness to appearance change. The work reported in [13] relies on the use of a deterministic gradient ascent (the "mean shift" iteration) starting at the location corresponding to the object's position in the previous frame. A similar work in [8] relies on the use of a global appearance model, e.g. in terms of colors, as opposed to very precise appearance models such as pixel-wise intensity templates [14,15].

The mean-shift algorithm was originally proposed by Fukunaga and Hostetler [16] for clustering data. It was introduced to the image processing community by Cheng [17] a decade ago. The theory became popular among the vision community after its successful application to image segmentation and tracking by Comaniciu and Meer [18,5]. Later, many variants of the mean-shift algorithm were proposed for various applications [19-24]. Though the mean-shift tracker performs well on sequences with relatively small object displacement, its performance is not guaranteed when the objects move fast or when they undergo partial occlusion. Here, we attempt to improve
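As a rough illustration of the two building blocks discussed in this introduction (a color-based mean-shift step and an SSD template search), the following OpenCV sketch shows one possible form of each module. The histogram binning, search radius and termination criteria are arbitrary illustration values; this is not the authors' implementation.

import cv2

def init_color_model(frame_bgr, box):
    """Build a hue histogram of the target region (the MS color model)."""
    x, y, w, h = box
    hsv = cv2.cvtColor(frame_bgr[y:y + h, x:x + w], cv2.COLOR_BGR2HSV)
    hist = cv2.calcHist([hsv], [0], None, [16], [0, 180])
    return cv2.normalize(hist, hist, 0, 255, cv2.NORM_MINMAX)

def ms_track(frame_bgr, box, hist):
    """Mean-shift module: move the window to the mode of the color back-projection."""
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    backproj = cv2.calcBackProject([hsv], [0], hist, [0, 180], 1)
    criteria = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 1)
    _, new_box = cv2.meanShift(backproj, box, criteria)
    return new_box

def ssd_track(prev_gray, curr_gray, box, search=16):
    """SSD module: exhaustive template match of the previous patch in a local search window."""
    x, y, w, h = box
    tpl = prev_gray[y:y + h, x:x + w]
    x0, y0 = max(x - search, 0), max(y - search, 0)
    roi = curr_gray[y0:y0 + h + 2 * search, x0:x0 + w + 2 * search]
    res = cv2.matchTemplate(roi, tpl, cv2.TM_SQDIFF)          # sum of squared differences
    _, _, min_loc, _ = cv2.minMaxLoc(res)                     # minimum of TM_SQDIFF is the best match
    return (x0 + min_loc[0], y0 + min_loc[1], w, h)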