Abnormal Human Behavior Detection in Video

Scale-invariant feature transform (SIFT)
Histogram of oriented gradients (HOG)
Besides global and local features, some methods model the human body itself for body tracking, pose estimation, and human activity recognition.
The performance of probability model-based algorithms, both generative and discriminative, relies on an extensive training dataset.
Human Pose Estimation
Huo et al. propose a system for human pose recognition. First, a 2D model is used for torso detection and tracking, and a skin color model is used for hand tracking. 3D reconstruction is performed from multiple views captured by synchronized cameras. The 2D and 3D coordinates are then converted into a normalized feature space and classified by nearest mean classifiers (NMC) to recognize key poses.
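The nearest mean classification step can be illustrated with a minimal sketch. This is not the implementation of Huo et al.; the function names, the two-dimensional feature vectors, and the pose labels below are hypothetical and chosen only for illustration. Each class is summarized by the mean of its training vectors, and a new vector receives the label of the closest mean.

```python
import math
from collections import defaultdict

def fit_class_means(features, labels):
    """Compute the mean feature vector of each class."""
    grouped = defaultdict(list)
    for vector, label in zip(features, labels):
        grouped[label].append(vector)
    return {
        label: [sum(column) / len(vectors) for column in zip(*vectors)]
        for label, vectors in grouped.items()
    }

def predict_nearest_mean(class_means, vector):
    """Return the label whose class mean is closest in Euclidean distance."""
    return min(class_means, key=lambda label: math.dist(class_means[label], vector))

# Hypothetical normalized features for two made-up key poses.
train_features = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [6.0, 5.0]]
train_labels = ["arms_down", "arms_down", "arms_up", "arms_up"]

class_means = fit_class_means(train_features, train_labels)
print(predict_nearest_mean(class_means, [0.2, 0.4]))
```

A nearest mean classifier stores only one vector per class, which keeps prediction cheap but assumes that each key pose forms a single compact cluster in the feature space.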
Indirect Model
In an indirect model scheme, an a priori model is used only indirectly to guide the interpretation of measured data.
Model-Free
There is no a priori model; only simple blobs are used to represent human poses, for example the head and hands.
Nakazawa et al. represent the human body with an ellipse and track it in four steps: (1) extraction of the human region from the image, (2) generation of a simulated image, (3) matching of the human position, and (4) updating of the human position.
Video-Based Human Activity Recognition
a person with a bag loitering at an airport or station
the automatic recognition of different players' actions during a tennis game, so that a computer avatar can be created to play tennis for the player
The main difference between generative and discriminative models is that generative classifiers learn a model of the joint probability p(x, y) of the input x and the label y (equivalently, the likelihood p(x|y) together with the prior p(y)) and obtain the posterior p(y|x) through Bayes' rule, while discriminative classifiers model the posterior p(y|x) directly.
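Discriminative models cannot generate samples; they can only decide the class, e.g., SVM.
Generative model: infinite samples => probability density model = generative model => prediction
Discriminative model: finite samples => discriminant function = predictive model => prediction
An example with four samples (x, y): (1, 0), (1, 0), (2, 0), (2, 1)
Generative model, P(x, y): P(1, 0) = 1/2, P(1, 1) = 0, P(2, 0) = 1/4, P(2, 1) = 1/4.
Discriminative model, P(y | x): P(0|1) = 1, P(1|1) = 0, P(0|2) = 1/2, P(1|2) = 1/2.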
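The numbers in this example can be checked by counting. The sketch below estimates both tables from the four samples by relative frequency; the function names are illustrative, and exact fractions are used so the results can be compared with the values above.

```python
from collections import Counter
from fractions import Fraction

# The four (x, y) samples from the example.
samples = [(1, 0), (1, 0), (2, 0), (2, 1)]

def joint_distribution(samples):
    """Estimate P(x, y) by relative frequency, keyed by (x, y)."""
    counts = Counter(samples)
    total = len(samples)
    return {pair: Fraction(n, total) for pair, n in counts.items()}

def conditional_distribution(samples):
    """Estimate P(y | x) by relative frequency, keyed by (y, x)."""
    pair_counts = Counter(samples)
    x_counts = Counter(x for x, _ in samples)
    return {(y, x): Fraction(n, x_counts[x]) for (x, y), n in pair_counts.items()}

joint = joint_distribution(samples)
conditional = conditional_distribution(samples)

# Pairs that never occur are absent from the tables, so read them with a default of 0.
print(joint.get((1, 0), 0), joint.get((1, 1), 0))
print(conditional.get((0, 2), 0), conditional.get((1, 2), 0))
```

The joint table sums to 1 over all (x, y) pairs, whereas the conditional table sums to 1 separately for each value of x. That is why the generative view can produce new (x, y) samples and the discriminative view can only score labels for a given x.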
Leung and Yang use U-shaped edges to describe the outline of a moving human body from a video sequence.
Direct Model
In the direct model scheme, an a priori human model (i.e., the explicit 3D geometric representation of human space and kinematic structure) is used directly as the model; it represents the observed subject and is continuously updated by the observations.
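Generative Model vs. Discriminative Model
Let o and s denote the observation sequence and the label sequence, respectively.
• Generative model: models the joint distribution p(s, o) of o and s, so samples can be generated from the joint probability, e.g., HMM.
• Discriminative model: models the conditional distribution p(s|o) of s given o; it has no knowledge of s on its own.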
The STV and DFT features are global features that consider the whole image, so they are sensitive to viewpoint changes and occlusion. Hence, some local descriptors are considered.
Depending on whether an explicit a priori body model is used, body modeling can be roughly divided into three classes: model-free, indirect model, and direct model.
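Dynamic time warping (DTW)
Generative models
Discriminative models
Hidden Markov model (HMM), hidden semi-Markov model (HSMM), dynamic Bayesian network (DBN)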
Dynamic time warping (DTW), a method for measuring the similarity between two temporal sequences that may vary in time or speed, is one of the most common temporal classification algorithms because of its simplicity. However, DTW is not appropriate for a large number of classes with many variations.
Therefore, other methods have been proposed, such as the Kalman filter, binary tree, multidimensional indexing, and K-nearest neighbor (K-NN).
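To make the DTW idea concrete, here is a minimal sketch of the classic dynamic-programming formulation for one-dimensional sequences. It assumes an absolute-difference local cost and no warping-window constraint; the function name and the toy sequences are illustrative only.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two 1D sequences.

    cost[i][j] holds the cheapest alignment cost of a[:i] with b[:j].
    Runs in O(len(a) * len(b)) time and memory.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: step in a only, step in b only, step in both.
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

# The same up-and-down motion performed at two different speeds.
slow = [0, 0, 1, 1, 2, 2, 1, 1, 0, 0]
fast = [0, 1, 2, 1, 0]
print(dtw_distance(slow, fast))
```

Because every sample of the fast sequence can be matched to two samples of the slow one, the two sequences align with zero cost even though their lengths differ. Classifying a query this way requires one such comparison per stored template, which helps explain why DTW scales poorly to many classes with many variations.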
Single Person Activity Recognition: Trajectory, Falling Detection, Human Pose Estimation
Falling Detection
Falling detection is critical in security and safety settings, especially for elderly people who live alone. Töreyin et al. model human motions with HMMs and fuse audio channel data with the HMM results to detect a falling event.
The space-time volume (STV) is limited for non-periodic activities.
The discrete Fourier transform (DFT) has been widely used to represent information about the geometric structure of the object.
A 3D human body model with the orientations of 10 body parts: torso, head, and the upper and lower left/right arms and legs.
[Figure] Human body models: (a) human image, (b) 13-point model, (c) 3D model.
the automatic recognition of a patient's actions to facilitate the rehabilitation process
Characteristics of the segmented objects, such as shape, silhouette, color, and motion, are extracted and represented as features.
Trajectory
A trajectory is the path that a person follows as a function of time. Lu and Little use a PCA-HOG descriptor for tracking and recognition in sports videos, such as hockey and soccer. Bodor et al. use Kalman filters to analyze pedestrian location and velocity, which are then used to classify a pedestrian as walking, running, loitering, or falling down.
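As a rough illustration of how a Kalman filter yields location and velocity estimates from a trajectory, the sketch below tracks a single coordinate with a constant-velocity model. It is not the method of Bodor et al.: the one-dimensional state, the noise variances q and r, the initial covariance, and the speed thresholds in classify_by_speed are all assumptions made for this example, and falling detection is not covered.

```python
def kalman_step(state, cov, z, dt=1.0, q=1e-3, r=1.0):
    """One predict/update cycle of a 1D constant-velocity Kalman filter.

    state = (position, velocity); cov = 2x2 covariance as nested lists;
    z = measured position. q and r are assumed noise variances.
    """
    p, v = state
    (p00, p01), (p10, p11) = cov

    # Predict with F = [[1, dt], [0, 1]] and process noise Q = q * I.
    p_pred = p + dt * v
    a00 = p00 + dt * (p01 + p10) + dt * dt * p11 + q
    a01 = p01 + dt * p11
    a10 = p10 + dt * p11
    a11 = p11 + q

    # Update with H = [1, 0]: only the position is measured.
    s = a00 + r
    k0, k1 = a00 / s, a10 / s
    residual = z - p_pred
    new_state = (p_pred + k0 * residual, v + k1 * residual)
    new_cov = [[(1 - k0) * a00, (1 - k0) * a01],
               [a10 - k1 * a00, a11 - k1 * a01]]
    return new_state, new_cov

def track(measurements, dt=1.0):
    """Filter a sequence of position measurements; return the final state and covariance."""
    state = (measurements[0], 0.0)
    cov = [[100.0, 0.0], [0.0, 100.0]]  # large initial uncertainty (assumed)
    for z in measurements[1:]:
        state, cov = kalman_step(state, cov, z, dt)
    return state, cov

def classify_by_speed(speed, loiter_max=0.3, walk_max=2.5):
    """Label a pedestrian from estimated speed; the thresholds are hypothetical."""
    speed = abs(speed)
    if speed < loiter_max:
        return "loitering"
    if speed < walk_max:
        return "walking"
    return "running"

# A pedestrian moving at a steady 2 units per time step.
(position, velocity), _ = track([2.0 * t for t in range(50)])
print(classify_by_speed(velocity))
```

The filter never observes velocity directly; it infers it from how successive position measurements deviate from the prediction, and that inferred velocity is what allows a trajectory to be used for coarse activity labels.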