Introduction to Machine Learning (English version; the Chinese translation is in the notes)

2020/9/13
If the output labels come from a finite set of classes or named (categorical) values, the task is a classification problem. If the output label is a continuous variable, the task is a regression problem.
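The distinction can be illustrated with a minimal sketch. The data (study hours, exam scores, pass/fail labels) and the helper names `fit_line`, `fit_class_means` and `predict_class` are hypothetical and not from the slides; the two models (a least-squares line and a nearest-class-mean rule) are just simple stand-ins for a regressor and a classifier.

```python
from statistics import mean

def fit_line(xs, ys):
    """Least-squares line y = a*x + b for a continuous target (regression)."""
    mx, my = mean(xs), mean(ys)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return a, my - a * mx

def fit_class_means(xs, labels):
    """Mean of x for each class of a discrete target (classification)."""
    return {c: mean(x for x, l in zip(xs, labels) if l == c) for c in set(labels)}

def predict_class(class_means, x):
    """Assign x to the class whose mean is closest."""
    return min(class_means, key=lambda c: abs(x - class_means[c]))

hours = [1.0, 2.0, 3.0, 4.0]          # hypothetical input feature

scores = [52.0, 54.0, 56.0, 58.0]     # continuous output -> regression
a, b = fit_line(hours, scores)
predicted_score = a * 5.0 + b         # a number on a continuous scale

passed = ["fail", "fail", "pass", "pass"]  # finite label set -> classification
class_means = fit_class_means(hours, passed)
predicted_label = predict_class(class_means, 3.6)  # one of the known labels
```

The same input feature is used in both cases; only the type of the output decides which kind of problem it is.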
Independent component analysis, ICA
The basic idea of ICA is to extract independent source signals from a set of mixed observed signals, or to represent the observed signals in terms of independent signals.
Principal Component Analysis, PCA
PCA is the most common linear dimensionality reduction method. Its goal is to map high-dimensional data to a low-dimensional space through a linear projection, chosen so that the variance of the data along the projected dimensions is maximized. This uses fewer dimensions while retaining the main characteristics of the raw data.
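As a rough sketch of this idea, the code below finds the first principal component of a small 2-D data set by power iteration on the covariance matrix and projects the data onto it. The data values and the function name `first_principal_component` are made up for illustration, and power iteration is only one of several ways to obtain the component; it assumes the starting vector is not orthogonal to the leading direction.

```python
import math

def first_principal_component(rows, iterations=200):
    """Return the max-variance unit direction and the 1-D projection of rows."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the centered data.
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration: repeated multiplication converges to the eigenvector
    # with the largest eigenvalue, i.e. the direction of maximum variance.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    projected = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, projected

# Hypothetical 2-D points lying close to the line y = x.
data = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.8)]
direction, projected = first_principal_component(data)
print(direction)   # roughly along the diagonal
print(projected)   # the same points described by one coordinate
```

Here two coordinates per point are reduced to one, and the variance along the chosen direction is larger than the variance along either original axis.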
Artificial neural network, ANN
An ANN is a mathematical model that processes information with a structure similar to synaptic connections. In this model, a large number of nodes form a network, the neural network, which carries out the information processing. A neural network usually has to be trained, and training is how the network learns: it changes the connection weights between the nodes so that the network acquires the ability to classify. The trained network is then used to recognize objects.
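How training changes the connection weights can be sketched with the smallest possible case, a single artificial neuron (a perceptron) rather than a full network. The task (learning the logical AND of two inputs), the learning rate and the name `train_perceptron` are illustrative assumptions, not part of the slides.

```python
def train_perceptron(samples, labels, epochs=20, lr=1.0):
    """Train one neuron with the perceptron rule; returns (weights, bias)."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, target in zip(samples, labels):
            activation = sum(w * xi for w, xi in zip(weights, x)) + bias
            output = 1 if activation > 0 else 0
            error = target - output
            # Training adjusts the connection weights in proportion to the error.
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            bias += lr * error
    return weights, bias

def predict(weights, bias, x):
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0

inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
targets = [0, 0, 0, 1]                      # logical AND
weights, bias = train_perceptron(inputs, targets)
predictions = [predict(weights, bias, x) for x in inputs]
```

After training, the weights encode the classification rule, and the neuron can be applied to inputs without any further weight changes.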
The core goal of machine learning is to generalize from known experience. Generalization is the ability of a system that has learned from known data to make predictions on new data.
Linear discriminant analysis, LDA
The basic idea of LDA is projection: map the N-dimensional data to a low-dimensional space in which the groups are separated as much as possible, i.e. the space with optimal separability. The criterion is that the new subspace has the maximum between-class distance and the minimum within-class distance.
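For two classes in two dimensions, this criterion leads to Fisher's classical direction w = Sw^-1 (m_b - m_a), where Sw is the within-class scatter matrix and m_a, m_b are the class means. The sketch below computes it for hypothetical data; the function names and the sample points are assumptions made for illustration, and the code handles only the 2-D, two-class case.

```python
def mean_vec(points):
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def fisher_direction(class_a, class_b):
    """Projection direction w = Sw^-1 (m_b - m_a) for two classes in 2-D."""
    ma, mb = mean_vec(class_a), mean_vec(class_b)
    # Within-class scatter: summed outer products of deviations from each class mean.
    s = [[0.0, 0.0], [0.0, 0.0]]
    for points, m in ((class_a, ma), (class_b, mb)):
        for p in points:
            d = (p[0] - m[0], p[1] - m[1])
            for i in range(2):
                for j in range(2):
                    s[i][j] += d[i] * d[j]
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]  # assumed non-zero
    inv = [[s[1][1] / det, -s[0][1] / det], [-s[1][0] / det, s[0][0] / det]]
    diff = (mb[0] - ma[0], mb[1] - ma[1])
    return (inv[0][0] * diff[0] + inv[0][1] * diff[1],
            inv[1][0] * diff[0] + inv[1][1] * diff[1])

def project(points, w):
    return [p[0] * w[0] + p[1] * w[1] for p in points]

class_a = [(1.0, 2.0), (2.0, 1.0), (2.0, 3.0), (3.0, 2.0)]
class_b = [(6.0, 2.0), (7.0, 1.0), (7.0, 3.0), (8.0, 2.0)]
w = fisher_direction(class_a, class_b)
proj_a, proj_b = project(class_a, w), project(class_b, w)
```

On this data the two projected groups do not overlap on the one-dimensional axis, which is the separability the criterion aims for.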
[Figure: Google search index of the three concepts since 2004]
A computer-based machine learning system mainly contains two core parts: representation and generalization. The first step in learning from data is to represent the data, i.e. to detect the patterns in it. A generalized model of the data space is then built from a set of known data in order to predict new data.
Machine Learning
Machine learning, a branch of artificial intelligence, is a general term for a family of analytical methods. It mainly uses computers to simulate or reproduce human learning behavior.
1) Machine learning is like a true champion, holding its head high; 2) Pattern recognition is in decline and dying out; 3) Deep learning is a brand-new and rapidly rising field.
Naive Bayes, NB
The NB classification algorithm is a statistical classification method that uses probability theory to classify. It can be applied to large databases and offers high classification accuracy and high speed.
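A minimal sketch of the idea for categorical features follows: the class score is the log prior plus the sum of log conditional probabilities of each feature value, treating features as independent given the class. The toy message data, the add-one (Laplace) smoothing and the names `train_nb` and `predict_nb` are assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict

def train_nb(rows, labels):
    """Count classes and per-class feature values for categorical data."""
    class_counts = Counter(labels)
    feature_counts = defaultdict(Counter)   # (feature index, class) -> value counts
    values = defaultdict(set)               # feature index -> set of seen values
    for row, label in zip(rows, labels):
        for i, v in enumerate(row):
            feature_counts[(i, label)][v] += 1
            values[i].add(v)
    return class_counts, feature_counts, values

def predict_nb(model, row):
    """Pick the class with the highest log posterior (up to a constant)."""
    class_counts, feature_counts, values = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)      # log prior
        for i, v in enumerate(row):
            # Add-one smoothing keeps unseen values from giving probability 0.
            likelihood = (feature_counts[(i, label)][v] + 1) / (count + len(values[i]))
            score += math.log(likelihood)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical messages described by two categorical features.
rows = [("link", "no_greeting"), ("link", "no_greeting"), ("link", "greeting"),
        ("no_link", "greeting"), ("no_link", "greeting"), ("no_link", "no_greeting")]
labels = ["spam", "spam", "spam", "ham", "ham", "ham"]
model = train_nb(rows, labels)
first = predict_nb(model, ("link", "no_greeting"))
second = predict_nb(model, ("no_link", "greeting"))
```

Training is a single counting pass over the data and prediction is a few multiplications per class, which is why the method scales to large databases.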
Classification algorithm
Decision tree
A decision tree is a tree structure. Each non-leaf node represents a test on a feature attribute, each branch represents the outcome of that test for a certain range of values, and each leaf node stores a class. To make a decision, start at the root node, test the corresponding feature attribute of the object to be classified, and follow the branch that matches its value; repeat until a leaf node is reached, and take the class stored in that leaf as the decision result.
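The decision procedure described above can be sketched as a walk over a hand-built tree. The tree itself (a weather example with features `outlook`, `humidity` and `windy`) is hypothetical; the sketch shows only classification with an existing tree, not how a tree is learned from data.

```python
# Non-leaf nodes are dicts holding the tested feature and one branch per
# outcome; leaf nodes are plain strings holding the class.
tree = {
    "feature": "outlook",
    "branches": {
        "sunny": {
            "feature": "humidity",
            "branches": {"high": "stay in", "normal": "play"},
        },
        "overcast": "play",
        "rainy": {
            "feature": "windy",
            "branches": {"yes": "stay in", "no": "play"},
        },
    },
}

def classify(node, sample):
    """Start at the root and follow matching branches until a leaf is reached."""
    while isinstance(node, dict):
        value = sample[node["feature"]]     # test the feature at this node
        node = node["branches"][value]      # follow the branch for its value
    return node                             # the class stored in the leaf

result_sunny = classify(tree, {"outlook": "sunny", "humidity": "normal", "windy": "no"})
result_rainy = classify(tree, {"outlook": "rainy", "humidity": "high", "windy": "yes"})
```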
Recursive feature elimination algorithm, RFE
RFE is a greedy algorithm that selects features by removing insignificant ones step by step. First, the features are repeatedly ranked by their weights in the classifier, and the lowest-ranked feature is removed in each round; this yields a final feature ranking. Then, following that ranking from front to back, nested feature subsets of different sizes are selected. Finally, the classification performance of each subset is assessed to obtain the optimal feature subset.
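The two stages can be sketched as follows. This is a simplified illustration under stated assumptions: the classifier is a nearest-centroid linear model (weight = difference of class means), chosen only because it is short; its weights do not change when other features are removed, unlike the classifiers RFE is normally paired with. The data, the training-set accuracy used as the assessment, and all function names are hypothetical.

```python
def centroid_model(rows, labels, features):
    """Nearest-centroid linear model restricted to the given feature indices."""
    pos = [r for r, y in zip(rows, labels) if y == 1]
    neg = [r for r, y in zip(rows, labels) if y == 0]
    weights, midpoints = {}, {}
    for f in features:
        mean_pos = sum(r[f] for r in pos) / len(pos)
        mean_neg = sum(r[f] for r in neg) / len(neg)
        weights[f] = mean_pos - mean_neg
        midpoints[f] = (mean_pos + mean_neg) / 2
    return weights, midpoints

def accuracy(rows, labels, features):
    """Training-set accuracy of the model on a feature subset."""
    weights, mid = centroid_model(rows, labels, features)
    correct = 0
    for r, y in zip(rows, labels):
        score = sum(weights[f] * (r[f] - mid[f]) for f in features)
        correct += int((1 if score > 0 else 0) == y)
    return correct / len(rows)

def rfe_ranking(rows, labels):
    """Stage 1: refit, drop the feature with the smallest |weight|, repeat."""
    remaining = list(range(len(rows[0])))
    eliminated = []
    while remaining:
        weights, _ = centroid_model(rows, labels, remaining)
        worst = min(remaining, key=lambda f: abs(weights[f]))
        remaining.remove(worst)
        eliminated.append(worst)
    return eliminated[::-1]                 # most important feature first

def best_subset(rows, labels, ranking):
    """Stage 2: assess nested subsets front to back, keep the best (smallest on ties)."""
    best, best_acc = None, -1.0
    for k in range(1, len(ranking) + 1):
        acc = accuracy(rows, labels, ranking[:k])
        if acc > best_acc:
            best, best_acc = ranking[:k], acc
    return best, best_acc

# Hypothetical data: feature 0 is informative, feature 1 is not, feature 2 weakly.
rows = [(0.0, 5.0, 1.0), (1.0, 3.0, 3.0), (0.0, 4.0, 2.0),
        (4.0, 4.0, 2.0), (5.0, 5.0, 3.0), (4.0, 3.0, 4.0)]
labels = [0, 0, 0, 1, 1, 1]
ranking = rfe_ranking(rows, labels)
subset, subset_accuracy = best_subset(rows, labels, ranking)
```

In practice the assessment in the second stage would use held-out data or cross-validation rather than training accuracy.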
Classification steps
Raw data
Feature extraction
Feature selection
Model training
New data
Classification and prediction
Feature selection (feature reduction)
Unsupervised learning
The input data has no labels. This setting corresponds to another kind of learning algorithm, clustering: the process of dividing a set of physical or abstract objects into multiple classes, each consisting of similar objects.
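Clustering can be sketched with k-means, one common clustering algorithm (the slides do not name a specific one). The one-dimensional data, the starting centers and the name `kmeans_1d` are illustrative assumptions.

```python
def kmeans_1d(values, centers, iterations=10):
    """Group unlabeled numbers around the given number of centers."""
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for v in values:
            # Assign each value to its nearest center.
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Move each center to the mean of its cluster (unchanged if empty).
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters

values = [1.0, 1.5, 2.0, 9.0, 9.5, 10.0]    # no labels are given
centers, clusters = kmeans_1d(values, centers=[0.0, 5.0])
```

No labels are used at any point; the two groups emerge only from the similarity (here, distance) between the values.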
Curse of dimensionality: usually refers to a problem that arises in computations on vectors. As the number of dimensions increases, the amount of computation grows exponentially.
Cortical features of different brain regions contribute differently during classification, and some features may be redundant. In particular, after multimodal fusion, the increase in feature dimension can cause the “curse of dimensionality”.
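One way to see the exponential growth: covering a feature space at a fixed resolution needs a number of cells that is the resolution raised to the power of the dimension. The resolution of 10 bins per axis and the name `grid_cells` are arbitrary choices for illustration.

```python
def grid_cells(bins_per_axis, dimensions):
    """Number of cells needed to cover the space at a fixed resolution."""
    return bins_per_axis ** dimensions

counts = {d: grid_cells(10, d) for d in (1, 2, 3, 6)}
for d, n in counts.items():
    print(f"{d} dimensions -> {n} cells")
```

Going from 3 to 6 dimensions multiplies the number of cells by a thousand, so the amount of data needed to sample the space evenly grows just as fast. This is the motivation for feature selection.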
Supervised learning
The input data has labels. The most common kind of learning algorithm here is classification. The model is trained on the correspondence between the features and the labels of the input data. Then, when unknown data that has features but no label is given as input, the trained model can predict its label.
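A minimal sketch of this workflow uses a k-nearest-neighbor classifier, chosen here only for brevity (the slides do not prescribe it). The labeled points, the value k = 3 and the name `knn_predict` are hypothetical. It requires Python 3.8 or later for `math.dist`.

```python
import math
from collections import Counter

def knn_predict(train_features, train_labels, query, k=3):
    """Predict the label of `query` by majority vote of its k nearest labeled points."""
    order = sorted(range(len(train_features)),
                   key=lambda i: math.dist(train_features[i], query))
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# Labeled input data: each feature vector comes with a known label.
train_features = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0),
                  (6.0, 6.0), (7.0, 6.0), (6.0, 7.0)]
train_labels = ["a", "a", "a", "b", "b", "b"]

# Unknown data: features only, the label is predicted.
label_near_a = knn_predict(train_features, train_labels, (2.0, 2.0))
label_near_b = knn_predict(train_features, train_labels, (6.5, 6.5))
```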