Big Data Machine Learning


Challenge in Big Data Applications:
Curse of dimensionality
Storage cost
Query speed
Li (http://cs.nju.edu.cn/lwj) Big Learning CS, NJU 10 / 79
Flickr: 6 billion photos
Wal-Mart: 267 million items/day; 4 PB data warehouse
Sloan Digital Sky Survey: the New Mexico telescope captures 200 GB of image data per day
Introduction
Challenge
Storage: memory and disk
Computation: CPU
Communication: network
Introduction
Big Data Machine Learning
Definition: perform machine learning from big data.
Role: a key enabling technology for big data
The ultimate goal of big data processing is to mine value from data. Machine learning provides the fundamental theory and computational techniques for big data mining and analysis.
Introduction
Definition of Big Data
Gartner (2012): "Big data is high volume, high velocity, and/or high variety information assets that require new forms of processing to enable enhanced decision making, insight discovery and process optimization." ("3Vs")
International Data Corporation (IDC) (2011): "Big data technologies describe a new generation of technologies and architectures, designed to economically extract value from very large volumes of a wide variety of data, by enabling high-velocity capture, discovery, and/or analysis." ("4Vs")
McKinsey Global Institute (MGI) (2011): "Big data refers to datasets whose size is beyond the ability of typical database software tools to capture, store, manage, and analyze."
Why did it not become hot until recent years? Consider big data itself, cloud computing, and big data machine learning.
Introduction
Our Contribution
Learning to hash: reduces memory/disk/CPU/communication cost
Distributed learning: reduces memory/disk/CPU cost, but increases communication cost
Stochastic learning: reduces memory/disk/CPU cost
Big Learning
CS, NJU
14 / 79
Learning to Hash
Querying
Learning to Hash
Fast Query Speed
By using a hashing scheme, we can achieve constant or sub-linear search time complexity.
Exhaustive search is also acceptable because the distance calculation cost is now cheap.

Outline
1. Introduction
2. Learning to Hash: Isotropic Hashing; Supervised Hashing with Latent Factor Models; Supervised Multimodal Hashing with SCM; Multiple-Bit Quantization
3. Distributed Learning: Coupled Group Lasso for Web-Scale CTR Prediction; Distributed Power-Law Graph Computing
4. Stochastic Learning: Distributed Stochastic ADMM for Matrix Factorization
5. Conclusion
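The "Fast Query Speed" idea can be sketched with a plain Python dict playing the role of the hash table. The 4-bit codes and image ids below are hypothetical; the probe flips up to `radius` bits of the query code to reach nearby buckets, so each query touches only a few buckets instead of scanning the whole database.

```python
from itertools import combinations

def neighbors_within_radius(code, n_bits, radius=1):
    """Enumerate all codes within the given Hamming radius of `code`."""
    yield code
    for r in range(1, radius + 1):
        for bits in combinations(range(n_bits), r):
            flipped = code
            for b in bits:
                flipped ^= (1 << b)  # flip one bit of the code
            yield flipped

def hash_lookup(table, query_code, n_bits, radius=1):
    """Probe only the buckets whose codes lie within `radius` of the query."""
    results = []
    for c in neighbors_within_radius(query_code, n_bits, radius):
        results.extend(table.get(c, []))
    return results
```

With n items spread over 2^c buckets, a probe of small radius inspects a constant number of buckets, which is how constant or sub-linear query time is obtained.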
Big Data Machine Learning
Wu-Jun Li
LAMDA Group, National Key Laboratory for Novel Software Technology, Nanjing University
liwujun@nju.edu.cn
Dec 23, 2014
Learning to Hash
Similarity Preserving Hashing
Learning to Hash
Reduce Dimensionality and Storage Cost
Projected with a real-valued projection function: given a point x, each projected dimension i is associated with a real-valued projection function f_i(x) (e.g., f_i(x) = w_i^T x).
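A minimal sketch of this projection stage, assuming the common random-projection choice in which each w_i is drawn from a Gaussian and the i-th bit is taken as the sign of f_i(x) = w_i^T x; the dimensions and seed below are arbitrary illustration values.

```python
import random

def make_projection(dim, n_bits, seed=0):
    """Draw n_bits random Gaussian projection vectors w_i (one per hash bit)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]

def hash_code(x, W):
    """Bit i is 1 iff f_i(x) = w_i^T x >= 0 (sign quantization)."""
    bits = 0
    for i, w in enumerate(W):
        proj = sum(wj * xj for wj, xj in zip(w, x))
        if proj >= 0.0:
            bits |= (1 << i)
    return bits
```

Note that sign quantization is invariant to positive scaling of x, so a point and its rescaled copy always share the same code.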
Learning to Hash
Two Stages of Hash Function Learning
Projection Stage (Dimension Reduction)
Learning to Hash
Nearest Neighbor Search (Retrieval)
Given a query point q, return the points in the database (e.g., images) closest (most similar) to q.
This underlies many machine learning, data mining, and information retrieval problems.
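As a baseline, exhaustive nearest-neighbor search scans every database point, which costs O(nd) per query and is what motivates hashing at billion scale. The toy 2-D points below are made up for illustration.

```python
def nearest_neighbor(query, database):
    """Exhaustive search: compare the query against every point (O(n*d))."""
    def sq_dist(p, q):
        # squared Euclidean distance; no sqrt needed for argmin
        return sum((pi - qi) ** 2 for pi, qi in zip(p, q))
    return min(database, key=lambda p: sq_dist(p, query))
```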
Introduction
Big Data
Big data has attracted much attention from both academia and industry.
Facebook: 750 million users
Learning to Hash
Querying
Hamming distance:
||01101110, 00101101||_H = 3
||11011, 01011||_H = 1
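These distances can be checked mechanically: the Hamming distance of two binary codes is the popcount of their XOR. A small helper, string-based for readability:

```python
def hamming(a: str, b: str) -> int:
    """Hamming distance of two equal-length binary strings: popcount of XOR."""
    assert len(a) == len(b), "codes must have equal length"
    return bin(int(a, 2) ^ int(b, 2)).count("1")
```

This reproduces both examples above: hamming("01101110", "00101101") is 3 and hamming("11011", "01011") is 1.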