Robust Tracking with Motion Estimation and Local Kernel-Based Color Modeling
Robust Control and Estimation
Robust control and estimation are central to engineering, particularly in automation and control systems. They are essential for ensuring the stability, performance, and reliability of complex systems in the presence of uncertainties and disturbances, and they are applied in aerospace, automotive, robotics, manufacturing, and many other fields.
One of the primary challenges in control and estimation is dealing with uncertainties in the system. These can arise from modeling errors, external disturbances, sensor noise, and environmental changes. Robust control and estimation techniques are designed to address these uncertainties so that the system behaves as intended across its specified operating conditions. This is particularly important in safety-critical applications, where the consequences of system failure can be severe.
From a control perspective, robust control techniques aim to design controllers that can handle uncertainties and variations in the system. This typically involves formulating control laws that are robust to uncertainties, such as H-infinity control, mu-synthesis, and robust model predictive control. These techniques are based on worst-case analysis: the controller is designed to perform acceptably under the most adverse conditions within an assumed uncertainty set, so that the system remains stable and meets performance requirements despite the uncertainties.
Robust estimation techniques, on the other hand, are concerned with accurately estimating the state of the system in the presence of uncertainties and disturbances. This is essential for feedback control, where the state estimates are used to compute the control actions.
Robust estimation methods, such as Kalman filtering, robust observers, and adaptive estimation, aim to provide accurate and reliable state estimates even with noisy measurements and modeling errors. This is crucial for the effectiveness of the control system and for maintaining the desired performance.
Robust control and estimation techniques also help ensure the stability and performance of networked control systems. With the increasing integration of communication networks in control systems, there is a need for techniques that can deal with network-induced delays, packet losses, and communication constraints. Robust control and estimation methods for networked control systems are designed to mitigate the effects of these issues and preserve overall system stability and performance.
Furthermore, the development of autonomous systems and artificial intelligence has brought new challenges to robust control and estimation. Autonomous systems, such as self-driving cars and unmanned aerial vehicles, require robust control and estimation techniques to operate safely and reliably in dynamic and uncertain environments. Similarly, the integration of artificial intelligence in control systems introduces new uncertainties and complexities that need to be addressed through robust techniques.
In conclusion, robust control and estimation are essential for ensuring the stability, performance, and reliability of complex systems in the presence of uncertainties and disturbances. They are used in aerospace, automotive, robotics, manufacturing, networked control systems, and autonomous systems. As technology continues to advance, new robust control and estimation techniques will be needed to address emerging challenges in future control systems.
Robust Fragments-based Tracking using the Integral Histogram
Amit Adam and Ehud Rivlin
Dept. of Computer Science, Technion - Israel Institute of Technology, Haifa 32000, Israel
{amita,ehudr}
Shimshoni
Dept. of Management Information Systems, Haifa University, Haifa 31905, Israel
{ishimshoni}
We present a novel algorithm (which we call “FragTrack”) for tracking an object in a video sequence. The template object is represented by multiple image fragments or patches. The patches are arbitrary and are not based on an object model (in contrast with traditional use of model-based parts, e.g. limbs and torso in human tracking). Every patch votes on the possible positions and scales of the object in the current frame, by comparing its histogram with the corresponding image patch histogram. We then minimize a robust statistic in order to combine the vote maps of the multiple patches.
A key tool enabling the application of our algorithm to tracking is the integral histogram data structure [18]. It allows histograms of multiple rectangular regions in the image to be extracted very efficiently.
Our algorithm overcomes several difficulties which cannot be handled by traditional histogram-based algorithms [8,6]. First, by robustly combining multiple patch votes, we are able to handle partial occlusions or pose change. Second, the geometric relations between the template patches allow us to take into account the spatial distribution of the pixel intensities, information which is lost in traditional histogram-based algorithms. Third, as noted by [18], tracking large targets has the same computational cost as tracking small targets.
We present extensive experimental results on challenging sequences, which demonstrate the robust tracking achieved by our algorithm (even with the use of only gray-scale (non-color) information).
1. Introduction
Tracking is an important subject in computer vision with a wide range of applications, some of which are surveillance, activity analysis, classification and recognition from motion, and human-computer interfaces. The three main categories into which most algorithms fall are feature-based tracking (e.g. [3]), contour-based tracking (e.g. [15]) and region-based tracking (e.g. [13]). In the region-based category, modeling of the region’s content by a histogram or by other non-parametric descriptions (e.g. kernel-density estimate) has become very popular in recent years. In particular, one of the most influential approaches is the mean-shift approach [8,6].
With the experience gained by using histograms and the mean shift approach, some difficulties have been studied in recent years. One issue is the local basin of convergence that the mean shift algorithm has. Recently in [22] the authors describe a method for converging to the optimum from far-away starting points.
A second issue, inherent in the use of histograms, is the loss of spatial information. This issue has been addressed by several works. In [26] the authors introduce a new similarity measure between the template and image regions, which replaces the original Bhattacharyya metric. This measure takes into account both the intensities and their position in the window. The measure is further computed efficiently by using the Fast Gauss Transform. In [12], the spatial information is taken into account by using “oriented kernels”; this approach is additionally shown to be useful for wide baseline matching. Recently, [4] has addressed this issue by adding the spatial mean and covariance of the pixel positions that contribute to a given bin in the histogram, naming this approach “spatiograms”.
A third issue which is not specifically addressed by these previous approaches is occlusions. The template model is global in nature and hence cannot handle partial occlusions well.
In this work we address the latter two issues (spatial information and occlusion) by using parts or fragments to represent the template. The first issue is addressed by efficient exhaustive search which will be discussed later on. Given a template to be tracked, we represent it by multiple histograms of multiple rectangular sub-regions (patches) of the template. By measuring histogram similarity with patches of the target frame, we obtain a vote-map describing the possible positions of each patch in the target frame. We then combine the vote-maps in a robust manner. Spatial information is not lost due to the use of spatial relationships between patches. Occlusions result in some of the patches contributing outlier vote-maps. Due to our robust method for combining the vote maps, the combined estimate of the target’s position is still accurate.
The use of parts or components is a well known technique in the object recognition literature (see chapter 23 in [11]). Examples of works which use the spatial relationships between detections of object parts are [21,17,16,2].
Embedding Motion in Model-Based Stochastic Tracking
algorithms that keep only one configuration state [5], which are therefore sensitive to single failures in the presence of ambiguities or fast or erratic motion. In this paper, we address two important issues related to tracking with a particle filter. The first issue refers to the specific form of the observation likelihood, that relies on the conditional independence of observations given the state sequence. The second one refers to the choice of an appropriate proposal distribution, which, unlike the prior dynamical model, should take into account the new observations. To handle these issues, we propose a new particle filter tracking method based on visual motion. Our method relies on a new graphical model allowing for the natural introduction of implicit or explicit motion information in the likelihood term, and on the exploitation of explicit motion measurements in the proposal distribution. the above issues, our approach, and their benefits, is given in the following paragraphs. The definition of the observation likelihood distribution is perhaps the most important element in visual tracking with a particle filter. This distribution allows for the evaluation of the likelihood of the current observation given the current object state, and relies on the specific object representation. The object representation corresponds to all the information that characterizes the object like the target position, geometry, appearance, color, etc. Parametrized shapes like splines [2] or ellipses [6], and color distributions [5]–[8], are often used as target representation. One drawback of these generic representations is that they can be quite unspecific, which augments the chances of ambiguities. One way to improve the robustness of a tracker consists of combining low-level measurements such as shape and color [6]. 
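To make the roles of the dynamical prior and the observation likelihood concrete, here is a minimal sketch of one step of a generic bootstrap particle filter. It is not the motion-based filter proposed in this paper: the proposal is simply the prior dynamics, which is the baseline choice the text argues should be improved. The 1-D state, the random-walk dynamics, the Gaussian likelihood, and all function and parameter names (`particle_filter_step`, `dynamics_std`, `obs_std`) are assumptions made for illustration.

```python
import math
import random


def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian, used here as a toy observation likelihood."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))


def particle_filter_step(particles, weights, observation, rng,
                         dynamics_std=1.0, obs_std=0.5):
    """One predict / weight / resample step for a 1-D position state."""
    # Predict: draw each particle from the prior dynamical model
    # (an assumed random walk). The new observation is NOT used here.
    predicted = [p + rng.gauss(0.0, dynamics_std) for p in particles]

    # Weight: multiply by the observation likelihood p(z_t | x_t), which
    # assumes observations are independent given the state sequence.
    new_weights = [w * gaussian_pdf(observation, p, obs_std)
                   for w, p in zip(weights, predicted)]
    total = sum(new_weights)
    if total == 0.0:
        # Every likelihood underflowed: fall back to uniform weights.
        new_weights = [1.0 / len(predicted)] * len(predicted)
    else:
        new_weights = [w / total for w in new_weights]

    # Resample: draw particles in proportion to their weights, then
    # reset the weights to uniform.
    resampled = rng.choices(predicted, weights=new_weights, k=len(predicted))
    return resampled, [1.0 / len(resampled)] * len(resampled)


def estimate(particles, weights):
    """Weighted mean of the particle set."""
    return sum(p * w for p, w in zip(particles, weights))


rng = random.Random(0)
particles = [rng.uniform(-10.0, 10.0) for _ in range(500)]
weights = [1.0 / len(particles)] * len(particles)
for _ in range(10):
    particles, weights = particle_filter_step(particles, weights, 3.0, rng)
print(round(estimate(particles, weights), 2))
```

Because the particles are proposed without looking at the observation, many of them can land in regions the likelihood then rejects. A proposal that uses the new measurements would place more particles where the likelihood is high.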
The generic conditional form of the likelihood term relies on a standard hypothesis in probabilistic visual tracking, namely the independence of observations given the state sequence [2], [6], [9]–[13]. In this paper, we argue that this assumption can be inaccurate in the case of visual tracking. As a remedy, we propose a new model that assumes that the current observation depends on the current and previous object configurations as well as on the past observation. We show that under this more general assumption, the resulting particle filtering algorithm has equations similar to those of the algorithm based on the standard hypothesis. To our knowledge, this has not been shown before, and so it represents the first contribution of this article. The new assumption can thus be used to naturally introduce implicit or explicit motion information in the observation likelihood term. The introduction of such data correlation between successive images will turn generic trackers like shape or color histogram trackers into more specific
This work was supported by an ERCIM post-doctoral fellowship at IRISA/INRIA, Rennes, France.
∗ Corresponding Author
Email addresses: venkatesh.babu@ (R. Venkatesh Babu), (Patrick Pérez), (Patrick Bouthemy).
Abstract
Visual tracking has been a challenging problem in computer vision for decades. Its applications are far-reaching, ranging from surveillance and monitoring to smart rooms. The mean-shift tracker, which has gained attention recently, is known for tracking objects in a cluttered environment. In this work, we propose a new method to track objects by combining two well-known trackers: the sum-of-squared differences (SSD) tracker and the color-based mean-shift (MS) tracker. In the proposed combination, the two trackers complement each other by overcoming their respective disadvantages. The rapid model change in the SSD tracker is overcome by the MS tracker module, while the inability of the MS tracker to handle large displacements is circumvented by the SSD module. The combined tracker is shown to perform better than either individual tracker when tracking fast-moving objects. Since the MS tracker relies on global object parameters such as color, its performance degrades when the object undergoes partial occlusion. To avoid the adverse effects of the global model, we use the MS tracker to track local object properties instead of global ones. Further, likelihood ratio weighting is used for the SSD tracker to avoid drift during partial occlusion and to update the MS tracking modules. The proposed tracker is shown to outperform the traditional MS tracker.
Key words: Visual Tracking, Mean-Shift, Object Tracking, Kernel Tracking
Preprint submitted to Elsevier Science
19 July 2006
The objective of object tracking is to faithfully locate the targets in successive video frames. The major challenges encountered in visual tracking are cluttered background, noise, change in illumination, occlusion and scale/appearance change of the objects. Considerable work has already been done in visual tracking to address these challenges. Most tracking algorithms can be broadly classified into the following four categories.
(1) Gradient-based methods locate target objects in the subsequent frame by minimizing a cost function [1,2].
(2) Feature-based approaches use features extracted from image attributes such as intensity, color, edges and contours for tracking target objects [3–5].
(3) Knowledge-based tracking algorithms use a priori knowledge of target objects such as shape, object skeleton, skin color models and silhouette [6–9].
(4) Learning-based approaches use pattern recognition algorithms to learn the target objects in order to search for them in an image sequence [10–12].
Visual tracking in a cluttered environment has remained one of the challenging problems in computer vision for the past few decades. Applications such as surveillance and monitoring, video indexing and retrieval require the ability to faithfully track objects in a complex scene involving appearance and scale change. Though there exist many techniques for tracking objects, color-based tracking with kernel density estimation, introduced in [13,8], has recently gained more attention in the research community due to its low computational complexity and robustness to appearance change. The work reported in [13] uses a deterministic gradient ascent (the “mean shift” iteration) that starts at the location corresponding to the object's position in the previous frame. The similar work in [8] relies on a global appearance model, e.g., in terms of colors, as opposed to very precise appearance models such as pixel-wise intensity templates [14,15].
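The “mean shift” iteration mentioned above can be sketched on a toy 2-D point set as follows. This is a minimal illustration of mode seeking under stated assumptions, not the tracker from [13] or [8]: in a color-based tracker the per-pixel weights would come from comparing color histograms, whereas here every sample point simply gets a Gaussian kernel weight. The function name `mean_shift_mode` and the parameters `bandwidth`, `tol` and `max_iter` are hypothetical.

```python
import math


def mean_shift_mode(points, start, bandwidth=1.0, tol=1e-6, max_iter=100):
    """Climb from `start` to a nearby density mode of `points` (2-D tuples)."""
    x, y = start
    for _ in range(max_iter):
        weight_sum = sum_x = sum_y = 0.0
        for px, py in points:
            # Gaussian kernel weight: points near the current estimate dominate.
            dist_sq = (px - x) ** 2 + (py - y) ** 2
            w = math.exp(-0.5 * dist_sq / bandwidth ** 2)
            weight_sum += w
            sum_x += w * px
            sum_y += w * py
        if weight_sum == 0.0:
            # No point is close enough to contribute; stop where we are.
            break
        # Move to the kernel-weighted mean of the points.
        new_x, new_y = sum_x / weight_sum, sum_y / weight_sum
        shift = math.hypot(new_x - x, new_y - y)
        x, y = new_x, new_y
        if shift < tol:
            break
    return x, y


# Toy data: a tight cluster around (5, 5) and a distant cluster around (20, 20).
near = [(4.8, 5.1), (5.2, 4.9), (5.0, 5.0), (5.1, 5.2), (4.9, 4.8)]
far = [(19.9, 20.1), (20.1, 19.9), (20.0, 20.0)]

# Start at the "previous frame" location, slightly off the target.
mode_x, mode_y = mean_shift_mode(near + far, start=(4.0, 4.0))
print(round(mode_x, 3), round(mode_y, 3))
```

The iteration is deterministic and only climbs the density surface from where it starts. This keeps it cheap, but it also means the result depends on the start position being reasonably close to the target.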
The mean-shift algorithm was originally proposed by Fukunaga and Hostetler [16] for clustering data. It was introduced to the image processing community by Cheng [17] a decade ago. The theory became popular in the vision community after its successful application to image segmentation and tracking by Comaniciu and Meer [18,5]. Later, many variants of the mean-shift algorithm were proposed for various applications [19–24]. Though the mean-shift tracker performs well on sequences with relatively small object displacement, its performance is not guaranteed when the objects move fast or when they undergo partial occlusion. Here, we attempt to im