Research on Semi-Supervised Support Vector Machines
S3VMs: Formulation
http://lamda.nju.edu.cn
Control model complexity
Losses on labeled and unlabeled data
The labels of the unlabeled data are unknown and need to be optimized
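For reference, the annotations above match the commonly used S3VM objective, which can be sketched as follows. The notation here is an assumption, not taken from the slide: $l$ labeled and $u$ unlabeled examples, feature map $\phi$, slack variables $\xi$, trade-off parameters $C_1$ and $C_2$, and a user-specified class balance $r$.

```latex
\begin{aligned}
\min_{\hat{\mathbf{y}} \in \mathcal{B}} \; \min_{\mathbf{w}, b, \boldsymbol{\xi}} \quad
  & \tfrac{1}{2}\|\mathbf{w}\|^2
    + C_1 \sum_{i=1}^{l} \xi_i
    + C_2 \sum_{j=l+1}^{l+u} \xi_j \\
\text{s.t.} \quad
  & y_i \bigl(\mathbf{w}^\top \phi(\mathbf{x}_i) + b\bigr) \ge 1 - \xi_i,
    \quad i = 1, \dots, l, \\
  & \hat{y}_j \bigl(\mathbf{w}^\top \phi(\mathbf{x}_j) + b\bigr) \ge 1 - \xi_j,
    \quad j = l+1, \dots, l+u, \\
  & \xi_i \ge 0, \quad i = 1, \dots, l+u, \\
\mathcal{B} = \Bigl\{ \hat{\mathbf{y}} \in \{-1,+1\}^{u} :
  & \; \textstyle\sum_{j=l+1}^{l+u} \hat{y}_j = r \Bigr\}.
\end{aligned}
```

In this form, the first term controls model complexity, the two sums are the losses on labeled and unlabeled data, and the unknown labels $\hat{\mathbf{y}}$ are optimization variables restricted by the balance constraint $\mathcal{B}$.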
Human efforts and material resources
Exploiting Unlabeled Data
Collection of unlabeled data is usually cheaper
Two popular schemes for exploiting unlabeled data to help
The seminal work [Blum & Mitchell, 1998] has won the ‘10-year best paper’ award in the 25th International Conference on Machine Learning (ICML’08).
Graph-based methods [Blum & Chawla, 2001; Zhu et al., 2003; Zhou et al., 2005; Belkin et al., 2006]
URL: http://lamda.nju.edu.cn/liyf/
Email: liyf@lamda.nju.edu.cn
MLA’13, Shanghai
Joint work with
Zhi-Hua Zhou
Nanjing University
James Kwok
Hong Kong University of Science and Technology
Predict
Unseen Data
To achieve good generalization performance, supervised learning methods often assume that a large amount of labeled data is available.
supervised learning
Semi-supervised learning: the learner tries to exploit the unlabeled examples by itself.
Active learning: the learner actively selects some unlabeled examples to query from an oracle.
thousand examples)
Can we have a scalable and convex S3VM?
WellSVM [Li et al., JMLR13]
Observation
Balance constraint
Both labeled and unlabeled data have large margin
S3VMs: Formulation
SVM
Prior knowledge
Training an S3VM is a mixed-integer program and is thus intractable in general.
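To illustrate why the integer label variables make the problem hard, here is a minimal sketch that solves a hard-margin S3VM on 1-D data by exhaustive search. The function names, the 1-D simplification, and the `n_pos` form of the balance constraint are assumptions made for illustration; they are not from the slides. The loop visits all 2^u labelings of the u unlabeled points, which is only feasible for tiny u.

```python
from itertools import product


def margin_1d(xs, ys):
    """Half-width of the gap between the two classes if a single
    threshold separates them on the line, otherwise None."""
    pos = [x for x, y in zip(xs, ys) if y == +1]
    neg = [x for x, y in zip(xs, ys) if y == -1]
    if not pos or not neg:
        return None
    if max(neg) < min(pos):          # negatives left, positives right
        return (min(pos) - max(neg)) / 2
    if max(pos) < min(neg):          # positives left, negatives right
        return (min(neg) - max(pos)) / 2
    return None                      # classes interleave: not separable


def brute_force_s3vm_1d(x_lab, y_lab, x_unl, n_pos):
    """Try every labeling of the unlabeled points that has exactly
    n_pos positives (a simple balance constraint) and keep the one
    with the largest margin over labeled and unlabeled data."""
    best_labels, best_margin = None, -1.0
    for y_unl in product((-1, +1), repeat=len(x_unl)):
        if y_unl.count(+1) != n_pos:
            continue
        m = margin_1d(list(x_lab) + list(x_unl), list(y_lab) + list(y_unl))
        if m is not None and m > best_margin:
            best_labels, best_margin = y_unl, m
    return best_labels, best_margin


# Two labeled points at the extremes, four unlabeled points in two clusters.
labels, margin = brute_force_s3vm_1d(
    [0.0, 10.0], [-1, +1], [1.0, 2.0, 8.0, 9.0], n_pos=2
)
print(labels, margin)  # (-1, -1, 1, 1) 3.0
```

The selected labeling places the decision boundary in the empty region between the two clusters, which is the low-density separation that S3VMs aim for.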
Ivor Tsang
Nanyang Technological University
Acknowledgments
Teng Zhang, Linli Xu and Kai Zhang
Supervised Learning
Labeled Data
Train
Learning Methods
Bioinformatics [Kasabov & Pang, 2004]
Named Entity Recognition [Goutte et al., 2002]
…
Outline
Scalability of S3VMs
WellSVM [Li et al., JMLR13]
Efficiency of S3VMs
Labeled Data Is Expensive
However, the labeling process is expensive in many real tasks
Disease diagnosis
Drug detection
Image classification
Text categorization
…
“Faster”
“Better”
“Cheaper”
Related Works
Global optimization
Branch-and-Bound [Chapelle et al., NIPS2006]
Deterministic Annealing [Sindhwani et al., ICML2006]
Continuation Method [Chapelle et al., ICML2006]
Local optimization
Local Combinatorial Search [Joachims, ICML1999]
Alternating Optimization [Zhang et al., ICML2009]
Constrained Convex-Concave Procedure (CCCP) [Collobert et
S3VMs
Unlabeled Data
Large-margin separator (or, low-density separator)
Labeled Data
[Vapnik, SLT’98] shows that a large margin can help improve the generalization bound.
Relax S3VMs into a convex semi-definite program (SDP)
SDP typically scales as O(n^6.5), where n is the sample size [Zhang et al., TNN2011].
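As a rough arithmetic illustration of what an exponent of 6.5 implies, the sketch below computes how much the cost grows when the sample size increases. The helper name is hypothetical, and the calculation ignores constant factors and lower-order terms, so it indicates a growth trend rather than actual running time.

```python
def sdp_cost_ratio(n_small, n_large, exponent=6.5):
    """Relative growth of an O(n^exponent) cost when the sample
    size goes from n_small to n_large (constants ignored)."""
    return (n_large / n_small) ** exponent


# Doubling the sample size multiplies the cost by 2**6.5, roughly 90.5.
print(round(sdp_cost_ratio(1000, 2000), 1))  # 90.5
```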
Pro: promising performance
Con: poor scalability (i.e., cannot handle more than several
Semi-Supervised Learning
SLearner
SSLearner
Several Surveys and Books
O. Chapelle et al. Semi-supervised learning. MIT Press, Cambridge, 2006.
X. Zhu and A. Goldberg. Introduction to semi-supervised learning. Morgan & Claypool Publishers, 2009.
Z.-H. Zhou and M. Li. Semi-supervised learning by disagreement. Knowledge and Information Systems, 24(3):415–439, 2010.
Z.-H. Zhou. Disagreement-based semi-supervised learning (invited survey). Acta Automatica Sinica, November 2013.
The seminal work [Zhu et al., 2003] has won the ‘10-year best paper’ award in the 30th International Conference on Machine Learning (ICML’13).
Semi-supervised support vector machines (S3VMs) [Vapnik, 1998;
Bennett & Demiriz, 1999; Joachims, 1999; Chapelle & Zien, 2005]
The seminal work [Joachims, 1999] has won the ‘10-year best paper’ award in the 26th International Conference on Machine Learning (ICML’09).
Pro: good performance on very small data sets
Con: poor scalability (i.e., cannot handle more than several hundred examples)
Related Works
“More”
MeanS3VM [Li et al., ICML09]
Safeness of S3VMs
S4VM [Li and Zhou, ICML11]
Cost sensitivity of S3VMs
CS4VM [Li et al., AAAI10]
“Faster”
“Better”
“Cheaper”
Outline
al., JMLR2006]
Pro: good scalability
Con: suffer from local optima, suboptimal performance
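The local-optimization idea can be sketched as a simple alternating scheme: fix the guessed labels of the unlabeled points and train an SVM, then fix the SVM and relabel the unlabeled points, and repeat until the labels stop changing. The code below is an illustrative assumption, not the exact procedure of any paper cited above: the function names are hypothetical, the SVM is linear and trained by plain subgradient descent, and no balance constraint is enforced. The result depends on the initial labeling, which is how such methods end up in local optima.

```python
def train_linear_svm(X, y, C=1.0, epochs=200, lr=0.01):
    """Subgradient descent on 0.5*||w||^2 + C * sum of hinge losses."""
    d = len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = list(w), 0.0                  # gradient of the regularizer
        for xi, yi in zip(X, y):
            score = sum(wj * xj for wj, xj in zip(w, xi)) + b
            if yi * score < 1:                 # margin violated: hinge is active
                for j in range(d):
                    gw[j] -= C * yi * xi[j]
                gb -= C * yi
        w = [wj - lr * gj for wj, gj in zip(w, gw)]
        b -= lr * gb
    return w, b


def predict(w, b, x):
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


def alternating_s3vm(X_lab, y_lab, X_unl, max_rounds=10):
    """Alternate between training the SVM and relabeling unlabeled data."""
    w, b = train_linear_svm(X_lab, y_lab)      # initialize from labeled data only
    y_unl = [predict(w, b, x) for x in X_unl]
    for _ in range(max_rounds):
        w, b = train_linear_svm(X_lab + X_unl, y_lab + y_unl)
        new_y_unl = [predict(w, b, x) for x in X_unl]
        if new_y_unl == y_unl:                 # labels stable: local optimum reached
            break
        y_unl = new_y_unl
    return w, b, y_unl


X_lab, y_lab = [[-2.0], [2.0]], [-1, 1]
X_unl = [[-3.0], [-2.5], [2.5], [3.0]]
w, b, y_unl = alternating_s3vm(X_lab, y_lab, X_unl)
print(y_unl)  # [-1, -1, 1, 1]
```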
Related Works
SDP convex relaxation [Xu et al., 2005; De Bie and Cristianini, 2006]
Research on Semi-Supervised SVMs (半监督支持向量机的研究)
Yu-Feng Li
National Key Laboratory for Novel Software Technology, Nanjing University, China
Four Major Paradigms of SSL
Generative methods [Miller & Uyar, 1997; Nigam et al., 2000; Cozman & Cohen, 2002]
Co-training/Disagreement-based methods [Blum & Mitchell, 1998; Balcan et al., 2005; Zhou & Li, 2010]
Scalability of S3VMs
WellSVM [Li et al., JMLR13]
Efficiency of S3VMs
“More”
MeanS3VM [Li et al., ICML09]
Safeness of S3VMs
S4VM [Li and Zhou, ICML11]
Cost sensitivity of S3VMs
CS4VM [Li et al., AAAI10]
S3VMs: Applications
Text Categorization [Joachims, 1999; Joachims, 2002]
Email Classification [Kockelkorn et al., 2003]
Image Retrieval [Wang et al., 2003]