SlideNote04
Introduction
• Definition of SVM
– Two Linear Classifiers with their maximal margin (their distance) in the mapped feature space
– Mapping from the given input space into the feature space
– Kernel Function
• Mention of mapping into another space
• Types and Properties of Kernel Functions
• Properties of the Kernel Matrix
– Weight Optimization
• Linear Equations of Weight Vectors → Primal Problem
• Primal Problem → Dual Problem
• Optimization of Lagrange Multipliers
Minimize $\frac{1}{2}\mathbf{w}^T\mathbf{w}$ subject to $d_i(\mathbf{w}^T\mathbf{x}_i + b) \ge 1$

Setting $\frac{\partial J(\mathbf{w}, b, \boldsymbol{\alpha})}{\partial \mathbf{w}} = 0$ and $\frac{\partial J(\mathbf{w}, b, \boldsymbol{\alpha})}{\partial b} = 0$ gives

$\mathbf{w} = \sum_{i=1}^{N} \alpha_i d_i \mathbf{x}_i, \qquad \sum_{i=1}^{N} \alpha_i d_i = 0$
• Dual Problem
– Substituting $\partial J / \partial \mathbf{w} = 0$ and $\partial J / \partial b = 0$ back into $J(\mathbf{w}, b, \boldsymbol{\alpha})$ yields the dual objective
– SMO Algorithm
– Genetic Algorithm
Summary and Further Discussion
• Summary
– Two Hyperplanes as the SVM Classifier
– Kernel Functions
• Inner Product: Linear Classifier
• Polynomial Function: Nonlinear Classifier
• Gaussian Function: RBF (Radial Basis Function)
• Sigmoid (hyperbolic tangent) Function: MLP (Multilayer Perceptron)
• Inner Product in the unmapped space → Scalar
• Inner Product in the mapped space → Scalar
– Types of Kernel Function
• Inner Product • Polynomial • Gaussian
Two Linear Classifiers
• Comments on the Two Hyperplanes
– Minimize the scale of the weights to maximize the margin
– No guarantee of even linear separability in the mapped space
• Linear separability is more likely in the mapped space (original dimension << mapped dimension)
– Only support vectors are considered for classifying novel examples
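The last point can be made concrete: once training has produced the Lagrange multipliers, classifying a novel example touches only the support vectors, via $f(\mathbf{x}) = \operatorname{sign}\left(\sum_i \alpha_i d_i K(\mathbf{x}_i, \mathbf{x}) + b\right)$. A minimal sketch (the function name and toy values are illustrative, not from the lecture):

```python
import numpy as np

def svm_decision(x, support_vectors, d, alpha, b, kernel):
    """Classify a novel example x using only the support vectors:
    f(x) = sign(sum_i alpha_i * d_i * K(x_i, x) + b)."""
    s = sum(a * di * kernel(xi, x)
            for a, di, xi in zip(alpha, d, support_vectors))
    return np.sign(s + b)

# Toy example with a linear (inner-product) kernel.
linear = lambda u, v: float(np.dot(u, v))
sv = [np.array([1.0, 1.0]), np.array([-1.0, -1.0])]
d = [1.0, -1.0]
alpha = [0.5, 0.5]
b = 0.0
print(svm_decision(np.array([2.0, 2.0]), sv, d, alpha, b, linear))  # 1.0
```

Only examples with $\alpha_i > 0$ need to appear in `support_vectors`; all other training examples contribute nothing to the sum.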
Kernel Functions
• Overview of Kernel Functions
– Inner product between two input vectors in the mapped space
– No need to map each training example into the mapped space explicitly
Introduction
• Organization of this Lecture
– Two Parallel Hyperplanes as the SVM Classifier
• Beginning with the Perceptron
• Expansion from a single hyperplane into two parallel hyperplanes
• In the unmapped input space
$b_o = 1 - \mathbf{w}_o^T \mathbf{x}^{(s)}$ for a support vector $\mathbf{x}^{(s)}$ with $d^{(s)} = 1$
Weight Optimization
• Optimization Techniques
– Heuristic Algorithm
• Select a positive example and a negative example at random
• Assign 1.0 to their corresponding Lagrange multipliers
• Assign 0.0 to the others
• Classify the training examples
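The heuristic initialization above can be sketched directly; the function name and the seeded RNG are illustrative assumptions:

```python
import random

def heuristic_init(d, seed=0):
    """Heuristic initialization of Lagrange multipliers (sketch):
    pick one positive and one negative example at random, assign 1.0
    to their multipliers, and 0.0 to all the others."""
    rng = random.Random(seed)
    pos = [i for i, di in enumerate(d) if di > 0]
    neg = [i for i, di in enumerate(d) if di < 0]
    alpha = [0.0] * len(d)
    alpha[rng.choice(pos)] = 1.0
    alpha[rng.choice(neg)] = 1.0
    return alpha

alpha = heuristic_init([1.0, -1.0, 1.0, -1.0])
print(alpha)  # exactly one positive and one negative example get 1.0
```

The resulting multipliers already satisfy the constraint $\sum_i \alpha_i d_i = 0$, which makes them a feasible starting point for the dual optimization.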
Introduction
• Involved Notations
– Input Vector: $\mathbf{x}_i$
– Weight Vector: $\mathbf{w}$, $\mathbf{w}_o$
– Bias: $b$, $b_o$
– Desirable (Target) Output: $d_i$
– Training Examples: $\{(\mathbf{x}_i, d_i)\}_{i=1}^{N}$
– Lagrange Multipliers: $\alpha_i$
– Kernel Function: $K(\mathbf{x}, \mathbf{x}_i)$
$\mathbf{w}^T \mathbf{x}_i + b \ge 1$

$d_i(\mathbf{w}^T \mathbf{x}_i + b) \ge 1$

Minimize $\frac{1}{2}\mathbf{w}^T\mathbf{w}$
Weight Optimization
• Primal Problem
Minimize $\frac{1}{2}\mathbf{w}^T\mathbf{w}$ subject to $d_i(\mathbf{w}^T \mathbf{x}_i + b) \ge 1$

Minimize $J(\mathbf{w}, b, \boldsymbol{\alpha})$
• Depends only on training examples that are close to the two hyperplanes
– Two hyperplanes defined by the convex hull of the support vectors
• The original input space: not linearly separable
• The mapped space: linearly separable
– Definition of the two hyperplanes
– Input vectors close to the hyperplanes → Support Vectors
– Only support vectors are counted, not the other vectors
Weight Optimization
• Constraints of Weight Optimization
$\mathbf{w}^T \mathbf{x}_i + b \ge +1$ for $d_i = +1$
$\mathbf{w}^T \mathbf{x}_i + b \le -1$ for $d_i = -1$

$\mathbf{w}_o^T \mathbf{x}_i + b_o \ge +1$ for $d_i = +1$
$\mathbf{w}_o^T \mathbf{x}_i + b_o \le -1$ for $d_i = -1$
– Measures how similar two input vectors are to each other
Kernel Functions
• Typical Kernel Functions
– Basic Inner Product: $K(\mathbf{x}, \mathbf{x}_i) = \mathbf{x}^T \mathbf{x}_i$
Kernel Functions
• Properties of Kernel Functions
$K(\mathbf{x}, \mathbf{y}) = K_1(\mathbf{x}, \mathbf{y}) + K_2(\mathbf{x}, \mathbf{y})$
$K(\mathbf{x}, \mathbf{y}) = a_1 K_1(\mathbf{x}, \mathbf{y})$
$K(\mathbf{x}, \mathbf{y}) = K_1(\mathbf{x}, \mathbf{y}) K_2(\mathbf{x}, \mathbf{y})$
$K(\mathbf{x}, \mathbf{y}) = f(\mathbf{x}) f(\mathbf{y})$
$K(\mathbf{x}, \mathbf{y}) = K_3(\boldsymbol{\phi}(\mathbf{x}), \boldsymbol{\phi}(\mathbf{y}))$
$K(\mathbf{x}, \mathbf{y}) = \mathbf{x}^T \mathbf{B} \mathbf{y}$

– Polynomial Function: $K(\mathbf{x}, \mathbf{x}_i) = (\mathbf{x}^T \mathbf{x}_i + 1)^q$
– Radial Basis Function: $K(\mathbf{x}, \mathbf{x}_i) = \exp\left(-\frac{\lVert\mathbf{x} - \mathbf{x}_i\rVert^2}{2\sigma^2}\right)$
– Sigmoidal Function: $K(\mathbf{x}, \mathbf{x}_i) = \tanh(\beta_0 \mathbf{x}^T \mathbf{x}_i + \beta_1)$
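The polynomial, RBF, and sigmoidal kernels can each be written in a few lines of numpy; the default parameter values here ($q = 2$, $\sigma = 1$, the $\beta$ coefficients) are illustrative choices, not values from the lecture:

```python
import numpy as np

def polynomial(x, xi, q=2):
    """Polynomial kernel K(x, x_i) = (x^T x_i + 1)^q."""
    return (np.dot(x, xi) + 1.0) ** q

def rbf(x, xi, sigma=1.0):
    """Gaussian (RBF) kernel K(x, x_i) = exp(-||x - x_i||^2 / (2 sigma^2))."""
    diff = np.asarray(x) - np.asarray(xi)
    return np.exp(-np.dot(diff, diff) / (2.0 * sigma ** 2))

def sigmoidal(x, xi, beta0=1.0, beta1=-1.0):
    """Sigmoidal kernel K(x, x_i) = tanh(beta0 * x^T x_i + beta1)."""
    return np.tanh(beta0 * np.dot(x, xi) + beta1)

x = np.array([1.0, 0.0])
print(polynomial(x, x))  # (1 + 1)^2 = 4.0
print(rbf(x, x))         # exp(0) = 1.0
```

Note that the RBF kernel of a vector with itself is always 1, while the sigmoidal kernel is not positive semidefinite for every $(\beta_0, \beta_1)$ choice.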
SVM (Support Vector Machine)
Week 04
Outline
• Introduction
• Two Linear Classifiers
• Kernel Functions
• Weight Optimization
• Summary and Further Discussion
Weight Optimization
$J(\mathbf{w}, b, \boldsymbol{\alpha}) = \frac{1}{2}\mathbf{w}^T\mathbf{w} - \sum_{i=1}^{N} \alpha_i \left[ d_i(\mathbf{w}^T\mathbf{x}_i + b) - 1 \right]$

$= \frac{1}{2}\mathbf{w}^T\mathbf{w} - \sum_{i=1}^{N} \alpha_i d_i \mathbf{w}^T\mathbf{x}_i - b \sum_{i=1}^{N} \alpha_i d_i + \sum_{i=1}^{N} \alpha_i$

From the optimality conditions, $\mathbf{w} = \sum_{i=1}^{N} \alpha_i d_i \mathbf{x}_i$ and $\sum_{i=1}^{N} \alpha_i d_i = 0$, so

$\mathbf{w}^T\mathbf{w} = \sum_{i=1}^{N} \alpha_i d_i \mathbf{w}^T\mathbf{x}_i = \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i \alpha_j d_i d_j \mathbf{x}_i^T \mathbf{x}_j$

$Q(\boldsymbol{\alpha}) = \sum_{i=1}^{N} \alpha_i - \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i \alpha_j d_i d_j \mathbf{x}_i^T \mathbf{x}_j$
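The dual objective $Q(\boldsymbol{\alpha}) = \sum_i \alpha_i - \frac{1}{2}\sum_i\sum_j \alpha_i\alpha_j d_i d_j \mathbf{x}_i^T\mathbf{x}_j$ can be evaluated with a Gram matrix of inner products; a minimal numpy sketch (the function name is an illustrative assumption):

```python
import numpy as np

def dual_objective(alpha, d, X):
    """Evaluate Q(alpha) = sum_i alpha_i
    - 1/2 * sum_i sum_j alpha_i alpha_j d_i d_j x_i^T x_j."""
    alpha = np.asarray(alpha)
    d = np.asarray(d)
    X = np.asarray(X)
    G = X @ X.T          # Gram matrix of inner products x_i^T x_j
    ad = alpha * d
    return float(alpha.sum() - 0.5 * ad @ G @ ad)

# Two orthogonal unit vectors with opposite labels.
print(dual_objective([1.0, 1.0], [1.0, -1.0], [[1.0, 0.0], [0.0, 1.0]]))  # 1.0
```

Replacing `G = X @ X.T` with a matrix of kernel values $K(\mathbf{x}_i, \mathbf{x}_j)$ gives the dual objective in the mapped feature space without computing the mapping itself.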
Weight Optimization
Optimize $\alpha_i$: Maximize $Q(\boldsymbol{\alpha})$ subject to the following constraints
$Q(\boldsymbol{\alpha}) = \sum_{i=1}^{N} \alpha_i - \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i \alpha_j d_i d_j \mathbf{x}_i^T \mathbf{x}_j$

Constraints: $\sum_{i=1}^{N} \alpha_i d_i = 0, \qquad \alpha_i \ge 0$
When the optimized Lagrange multipliers $\alpha_{o,i}$ are found,

$\mathbf{w}_o = \sum_{i=1}^{N_S} \alpha_{o,i} d_i \mathbf{x}_i$

where $N_S$ is the number of support vectors.
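Given optimized multipliers, both $\mathbf{w}_o$ and the bias can be recovered; a sketch assuming the hard-margin, linearly separable case, where the bias follows from $b_o = 1 - \mathbf{w}_o^T\mathbf{x}^{(s)}$ for a support vector with $d = +1$ (the function name is illustrative):

```python
import numpy as np

def recover_weights(alpha, d, X):
    """Recover w_o = sum_i alpha_{o,i} d_i x_i (only support vectors,
    alpha_i > 0, contribute) and b_o = 1 - w_o^T x^(s), where x^(s)
    is any support vector with target output +1."""
    alpha = np.asarray(alpha)
    d = np.asarray(d)
    X = np.asarray(X)
    w = (alpha * d) @ X
    s = next(i for i in range(len(d)) if alpha[i] > 0 and d[i] > 0)
    b = 1.0 - float(w @ X[s])
    return w, b

# Two support vectors sitting exactly on the two margin hyperplanes.
w, b = recover_weights([0.5, 0.5], [1.0, -1.0], [[1.0, 0.0], [-1.0, 0.0]])
print(w, b)
```

For this toy data both constraints $d_i(\mathbf{w}_o^T\mathbf{x}_i + b_o) \ge 1$ hold with equality, as expected for support vectors.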
– Weight Optimization
• Optimization of Lagrange Multipliers
• Primal Problem → Dual Problem
Summary and Further Discussion
Kernel Functions
• Kernel Matrix: for training examples $\{(\mathbf{x}_i, d_i)\}_{i=1}^{N}$

$\mathbf{K} = \begin{bmatrix} K(\mathbf{x}_1, \mathbf{x}_1) & K(\mathbf{x}_1, \mathbf{x}_2) & \cdots & K(\mathbf{x}_1, \mathbf{x}_N) \\ K(\mathbf{x}_2, \mathbf{x}_1) & K(\mathbf{x}_2, \mathbf{x}_2) & \cdots & K(\mathbf{x}_2, \mathbf{x}_N) \\ \vdots & \vdots & \ddots & \vdots \\ K(\mathbf{x}_N, \mathbf{x}_1) & K(\mathbf{x}_N, \mathbf{x}_2) & \cdots & K(\mathbf{x}_N, \mathbf{x}_N) \end{bmatrix}$
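For a valid (Mercer) kernel, this $N \times N$ matrix is symmetric and positive semidefinite; a small numpy check, with an illustrative linear kernel and toy data:

```python
import numpy as np

def kernel_matrix(X, kernel):
    """Build the N x N kernel (Gram) matrix K[i, j] = K(x_i, x_j)."""
    N = len(X)
    K = np.empty((N, N))
    for i in range(N):
        for j in range(N):
            K[i, j] = kernel(X[i], X[j])
    return K

X = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([1.0, 1.0])]
K = kernel_matrix(X, lambda u, v: float(np.dot(u, v)))
# A Mercer kernel matrix is symmetric positive semidefinite.
print(np.allclose(K, K.T))                     # True
print(np.all(np.linalg.eigvalsh(K) >= -1e-9))  # True
```

Checking the eigenvalues this way is a quick sanity test when experimenting with a hand-rolled kernel function.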
• SVM for Multiple Classification or Regression
– This lecture treats the SVM as a binary classifier
– Multiple Classification: #Class > 2
– Regression: $\mathbf{w}^T\mathbf{x}_i + b$ within a given range (a continuous output value)
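A one-vs-rest reduction is one common way to extend the binary SVM to #Class > 2; this sketch assumes a hypothetical `train_binary(X, d)` helper that returns a real-valued decision function $f(\mathbf{x})$:

```python
def one_vs_rest(classes, train_binary, X, y):
    """One-vs-rest reduction of multi-class SVM to binary SVMs (sketch):
    train one binary classifier per class (that class = +1, the rest = -1)
    and predict the class whose decision value f_c(x) is largest."""
    models = {}
    for c in classes:
        d = [1.0 if yi == c else -1.0 for yi in y]
        models[c] = train_binary(X, d)  # returns a scoring function f(x)

    def predict(x):
        return max(classes, key=lambda c: models[c](x))

    return predict
```

Any binary trainer that produces a decision function, including the dual-form SVM described above, can be plugged in as `train_binary`; the ties between nearly equal decision values are broken by class order here.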