Hierarchical Text Classification and Evaluation

The British Science Abstracts (SA) and How to Search It

1. Overview. Science Abstracts (SA), founded in 1898, is a comprehensive scientific and technical abstracting journal covering physics, electrical and electronic engineering, and computing and control. It is now edited and published by INSPEC (International Information Services for the Physics and Engineering Communities), a unit of the Institution of Electrical Engineers (IEE) in the United Kingdom.

When it was launched, SA was titled Science Abstracts: Physics and Electrical Engineering.

After repeated reorganization and expansion, it has been edited by INSPEC since 1969 into the three series we see today:

Series A: Physics Abstracts (PA), semimonthly.

Series B: Electrical and Electronics Abstracts (EEA), monthly.

Series C: Computer & Control Abstracts (CCA), monthly.

The semimonthly or monthly issues of Series A, B, and C are referred to as the issue abstracts and form the main body of SA.

Every six months, and at intervals of three to five years, SA also publishes companion semiannual or multi-year cumulative indexes.

In addition to these printed publications, SA is issued in microform and in machine-readable form (magnetic tape or CD-ROM); the CD-ROM edition (the INSPEC disc) is especially convenient and fast to use.

The literature covered by SA is drawn from more than 50 countries and many languages, and includes journal articles, conference papers, technical reports, dissertations, books and monographs, and standards.

Before 1971, SA also covered patent specifications.

At present, SA abstracts roughly 300,000 documents per year.

Of these, Series A accounts for about 150,000, Series B about 80,000, and Series C about 70,000.

There is some overlap among the documents covered by the three series.

Chinese-English Glossary of Academic Qualifications and Degrees

Academic qualifications and degrees vary from country to country. Following the Regulations Concerning Academic Degrees in the People's Republic of China, the standard English renderings of Chinese qualifications are as follows:

结业证书 - Certificate of Completion
毕业证书 - Certificate of Graduation
肄业证书 - Certificate of Completion/Incompletion/Attendance/Study
教育学院 - College/Institute of Education
中学 - Middle [Secondary] School
师范学校 - Normal School [upper secondary level]
师范专科学校 - Normal Specialized Postsecondary College
师范大学 - Normal [Teachers] University
公证书 - Notarial Certificate
专科学校 - Postsecondary Specialized College
广播电视大学 - Radio and Television University
中等专科学校 - Secondary Specialized School
自学考试 - Self-Study Examination
技工学校 - Skilled Workers [Training] School
业余大学 - Spare-Time University
职工大学 - Staff and Workers University
大学 - University (regular, degree-granting)
职业大学 - Vocational University

By contrast, schools in the United States award many kinds of degrees, which differ according to the field of study. A survey of US qualifications begins as follows: Ph.D. (Doctor of Philosophy): doctoral degree.

Bag of Tricks for Efficient Text Classification

arXiv:1607.01759v2 [cs.CL] 7 Jul 2016

Armand Joulin, Edouard Grave, Piotr Bojanowski, Tomas Mikolov
Facebook AI Research
{ajoulin, egrave, bojanowski, tmikolov}@...

Abstract

This paper proposes a simple and efficient approach for text classification and representation learning. Our experiments show that our fast text classifier fastText is often on par with deep learning classifiers in terms of accuracy, and many orders of magnitude faster for training and evaluation. We can train fastText on more than one billion words in less than ten minutes using a standard multicore CPU, and classify half a million sentences among 312K classes in less than a minute.

1 Introduction

Building good representations for text classification is an important task with many applications, such as web search, information retrieval, ranking and document classification (Deerwester et al., 1990; Pang and Lee, 2008). Recently, models based on neural networks have become increasingly popular for computing sentence representations (Bengio et al., 2003; Collobert and Weston, 2008). While these models achieve very good performance in practice (Kim, 2014; Zhang and LeCun, 2015; Zhang et al., 2015), they tend to be relatively slow both at train and test time, limiting their use on very large datasets.

At the same time, simple linear models have also shown impressive performance while being very computationally efficient (Mikolov et al., 2013; Levy et al., 2015). They usually learn word level representations that are later combined to form sentence representations. In this work, we propose an extension of these models to directly learn sentence representations. We show that by incorporating additional statistics such as using bag of n-grams, we reduce the gap in accuracy between linear and deep models, while being many orders of magnitude faster.

Our work is closely related to standard linear text classifiers (Joachims, 1998; McCallum and Nigam, 1998; Fan et al., 2008). Similar to Wang and Manning (2012), our motivation is to explore simple baselines inspired by models used for learning unsupervised word representations. As opposed to Le and Mikolov (2014), our approach does not require sophisticated inference at test time, making its learned representations easily reusable on different problems. We evaluate the quality of our model on two different tasks, namely tag prediction and sentiment analysis.

2 Model architecture

A simple and efficient baseline for sentence classification is to represent sentences as bag of words (BoW) and train a linear classifier, for example a logistic regression or support vector machine (Joachims, 1998; Fan et al., 2008). However, linear classifiers do not share parameters among features and classes, possibly limiting their generalization. Common solutions to this problem are to factorize the linear classifier into low rank matrices (Schutze, 1992; Mikolov et al., 2013) or to use multilayer neural networks (Collobert and Weston, 2008; Zhang et al., 2015). In the case of neural networks, the information is shared via the hidden layers.

[Figure 1: Model architecture for fast sentence classification.]

Figure 1 shows a simple model with one hidden layer. The first weight matrix can be seen as a look-up table over the words of a sentence. The word representations are averaged into a text representation, which is in turn fed to a linear classifier. This architecture is similar to the cbow model of Mikolov et al. (2013), where the middle word is replaced by a label. The model takes a sequence of words as an input and produces a probability distribution over the predefined classes. We use a softmax function to compute these probabilities. Training such a model is similar in nature to word2vec, i.e., we use stochastic gradient descent and backpropagation (Rumelhart et al., 1986) with a linearly decaying learning rate. Our model is trained asynchronously on multiple CPUs.

2.1 Hierarchical softmax

When the number of targets is large, computing the linear classifier is computationally expensive. More precisely, the computational complexity is O(Kd) where K is the number of targets and d the dimension of the hidden layer. In order to improve our running time, we use a hierarchical softmax (Goodman, 2001) based on a Huffman coding tree (Mikolov et al., 2013). During training, the computational complexity drops to O(d log2(K)). In this tree, the targets are the leaves.

The hierarchical softmax is also advantageous at test time when searching for the most likely class. Each node is associated with a probability that is the probability of the path from the root to that node. If the node is at depth l+1 with parents n_1, ..., n_l, its probability is

$P(n_{l+1}) = \prod_{i=1}^{l} P(n_i)$.

This means that the probability of a node is always lower than the one of its parent. Exploring the tree with a depth-first search and tracking the maximum probability among the leaves allows us to discard any branch associated with a smaller probability. In practice, we observe a reduction of the complexity to O(d log2(K)) at test time. This approach is further extended to compute the T-top targets at the cost of O(log(T)), using a binary heap.

2.2 N-gram features

Bag of words is invariant to word order, but taking this order into account explicitly is often computationally very expensive. Instead, we use a bag of n-grams as additional features to capture some partial information about the local word order. This is very efficient in practice while achieving comparable results to methods that explicitly use the order (Wang and Manning, 2012).

We maintain a fast and memory efficient mapping of the n-grams by using the hashing trick (Weinberger et al., 2009) with the same hashing function as in Mikolov et al. (2011), and 10M bins if we only use bigrams, and 100M otherwise.
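The two ideas above (averaged word and n-gram embeddings feeding a linear softmax layer, with n-grams mapped to a fixed number of bins by hashing) can be summarized in a short sketch. This is a rough illustration rather than the paper's implementation: the hash function, the small bin counts, and the untrained random weights are assumptions made for the example.

```python
# Minimal sketch of a fastText-style classifier: hashed bigram features,
# averaged embeddings, and a linear softmax output layer (illustrative only).
import numpy as np

N_WORD_BINS = 2 ** 16     # hashed unigram vocabulary (tiny here; assumption)
N_BIGRAM_BINS = 2 ** 18   # the paper uses 10M-100M bins; kept small for the toy
DIM = 10                  # hidden size used in the paper's sentiment experiments
N_CLASSES = 4

rng = np.random.default_rng(0)
emb = rng.normal(scale=0.1, size=(N_WORD_BINS + N_BIGRAM_BINS, DIM))
W_out = rng.normal(scale=0.1, size=(DIM, N_CLASSES))  # untrained, for shape only

def feature_ids(tokens):
    """Map unigrams and hashed bigrams to rows of the embedding table."""
    ids = [hash(t) % N_WORD_BINS for t in tokens]
    ids += [N_WORD_BINS + hash(b) % N_BIGRAM_BINS for b in zip(tokens, tokens[1:])]
    return ids

def predict_proba(tokens):
    """Average all feature embeddings, then apply the linear layer and softmax."""
    h = emb[feature_ids(tokens)].mean(axis=0)   # text representation
    logits = h @ W_out
    p = np.exp(logits - logits.max())
    return p / p.sum()

print(predict_proba("the movie was surprisingly good".split()))
```

Training would update emb and W_out with stochastic gradient descent as described above; in the hierarchical-softmax variant the flat output layer is replaced by a Huffman tree over the labels.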
3 Experiments

3.1 Sentiment analysis

Datasets and baselines. We employ the same 8 datasets and evaluation protocol of Zhang et al. (2015). We report the N-grams and TFIDF baselines from Zhang et al. (2015), as well as the character level convolutional model (char-CNN) of Zhang and LeCun (2015) and the very deep convolutional network (VDCNN) of Conneau et al. (2016). We also compare to Tang et al. (2015) following their evaluation protocol, and report their main baselines.

Table 1: Test accuracy [%] on sentiment datasets. FastText has been run with the same parameters for all the datasets. It has 10 hidden units and we evaluate it with and without bigrams. For VDCNN and char-CNN, we show the best reported numbers without data augmentation.

Model | AG | Sogou | DBpedia | Yelp P. | Yelp F. | Yah. A. | Amz. F. | Amz. P.
BoW (Zhang et al., 2015) | 88.8 | 92.9 | 96.6 | 92.2 | 58.0 | 68.9 | 54.6 | 90.4
ngrams (Zhang et al., 2015) | 92.0 | 97.1 | 98.6 | 95.6 | 56.3 | 68.5 | 54.3 | 92.0
ngrams TFIDF (Zhang et al., 2015) | 92.4 | 97.2 | 98.7 | 95.4 | 54.8 | 68.5 | 52.4 | 91.5
char-CNN (Zhang and LeCun, 2015) | 87.2 | 95.1 | 98.3 | 94.7 | 62.0 | 71.2 | 59.5 | 94.5
VDCNN (Conneau et al., 2016) | 91.3 | 96.8 | 98.7 | 95.7 | 64.7 | 73.4 | 63.0 | 95.7
[The fastText rows of Table 1 were not recovered from this extraction.]

[Table of per-dataset training times: the convolutional baselines take hours to days per epoch, while the fastText column is measured in seconds; the detailed columns are garbled in this extraction.]

Model | Yelp'13 | Yelp'14 | Yelp'15 | IMDB
fastText | 64.2 | 66.2 | 66.6 | 45.2

Examples from Table 4 (captions with their associated hashtags):
- "taiyoucon 2011 digitals: individuals digital photos from the anime convention taiyoucon 2011 in mesa, arizona. if you know the model and/or the character, please comment." #cosplay; #24mm #anime #animeconvention #arizona #canon #con #convention #cos #cosplay #costume #mesa #play #taiyou #taiyoucon
- "beagle enjoys the snowfall" #snow; #2007 #beagle #hillsboro #january #maddison #maddy #oregon #snow
- "euclid avenue" #newyorkcity; #cleveland #euclidavenue

Table 5: Prec@1 on the test set for tag prediction on YFCC100M. We also report the training time and test time. Test time is reported for a single thread, while training uses 20 threads for both models.

Model | prec@1 | Train time | Test time
Freq. baseline | 2.2 | -- | --
Tagspace, h = 50 | 30.1 | 3h8 | 6h
Tagspace, h = 200 | 35.6 | 5h32 | 15h

Table 4 shows some qualitative examples. FastText learns to associate words in the caption with their hashtags, e.g., "christmas" with "#christmas". It also captures simple relations between words, such as "snowfall" and "#snow". Finally, using bigrams also allows it to capture relations such as "twin cities" and "#minneapolis".

4 Discussion and conclusion

In this work, we have developed fastText, which extends word2vec to tackle sentence and document classification. Unlike unsupervisedly trained word vectors from word2vec, our word features can be averaged together to form good sentence representations. In several tasks, we have obtained performance on par with recently proposed methods inspired by deep learning, while observing a massive speed-up. Although deep neural networks have in theory much higher representational power than shallow models, it is not clear if simple text classification problems such as sentiment analysis are the right ones to evaluate them. We will publish our code so that the research community can easily build on top of our work.

References

[Bengio et al. 2003] Yoshua Bengio, Réjean Ducharme, Pascal Vincent, and Christian Jauvin. 2003. A neural probabilistic language model. JMLR.
[Collobert and Weston 2008] Ronan Collobert and Jason Weston. 2008. A unified architecture for natural language processing: Deep neural networks with multitask learning. In ICML.
[Conneau et al. 2016] Alexis Conneau, Holger Schwenk, Loïc Barrault, and Yann Lecun. 2016. Very deep convolutional networks for natural language processing. arXiv preprint arXiv:1606.01781.
[Deerwester et al. 1990] Scott Deerwester, Susan T. Dumais, George W. Furnas, Thomas K. Landauer, and Richard Harshman. 1990. Indexing by latent semantic analysis. Journal of the American Society for Information Science.
[Fan et al. 2008] Rong-En Fan, Kai-Wei Chang, Cho-Jui Hsieh, Xiang-Rui Wang, and Chih-Jen Lin. 2008. Liblinear: A library for large linear classification. JMLR.
[Goodman 2001] Joshua Goodman. 2001. Classes for fast maximum entropy training. In ICASSP.
[Joachims 1998] Thorsten Joachims. 1998. Text categorization with support vector machines: Learning with many relevant features. Springer.
[Kim 2014] Yoon Kim. 2014. Convolutional neural networks for sentence classification. In EMNLP.
[Le and Mikolov 2014] Quoc V. Le and Tomas Mikolov. 2014. Distributed representations of sentences and documents. arXiv preprint arXiv:1405.4053.
[Levy et al. 2015] Omer Levy, Yoav Goldberg, and Ido Dagan. 2015. Improving distributional similarity with lessons learned from word embeddings. TACL.
[McCallum and Nigam 1998] Andrew McCallum and Kamal Nigam. 1998. A comparison of event models for naive Bayes text classification. In AAAI workshop on learning for text categorization.
[Mikolov et al. 2011] Tomáš Mikolov, Anoop Deoras, Daniel Povey, Lukáš Burget, and Jan Černocký. 2011. Strategies for training large scale neural network language models. In Workshop on Automatic Speech Recognition and Understanding. IEEE.
[Mikolov et al. 2013] Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013. Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781.
[Ni et al. 2015] Karl Ni, Roger Pearce, Kofi Boakye, Brian Van Essen, Damian Borth, Barry Chen, and Eric Wang. 2015. Large-scale deep learning on the YFCC100M dataset. arXiv preprint arXiv:1502.03409.
[Pang and Lee 2008] Bo Pang and Lillian Lee. 2008. Opinion mining and sentiment analysis. Foundations and Trends in Information Retrieval.
[Rumelhart et al. 1986] David E. Rumelhart, Geoffrey E. Hinton, and Ronald J. Williams. 1986. Learning internal representations by error propagation. In Parallel Distributed Processing: Explorations in the Microstructure of Cognition. MIT Press.
[Schutze 1992] Hinrich Schutze. 1992. Dimensions of meaning. In Supercomputing.
[Tang et al. 2015] Duyu Tang, Bing Qin, and Ting Liu. 2015. Document modeling with gated recurrent neural network for sentiment classification. In EMNLP.
[Wang and Manning 2012] Sida Wang and Christopher D. Manning. 2012. Baselines and bigrams: Simple, good sentiment and topic classification. In ACL.
[Weinberger et al. 2009] Kilian Weinberger, Anirban Dasgupta, John Langford, Alex Smola, and Josh Attenberg. 2009. Feature hashing for large scale multitask learning. In ICML.
[Weston et al. 2011] Jason Weston, Samy Bengio, and Nicolas Usunier. 2011. Wsabie: Scaling up to large vocabulary image annotation. In IJCAI.
[Weston et al. 2014] Jason Weston, Sumit Chopra, and Keith Adams. 2014. #tagspace: Semantic embeddings from hashtags. In EMNLP.
[Zhang and LeCun 2015] Xiang Zhang and Yann LeCun. 2015. Text understanding from scratch. arXiv preprint arXiv:1502.01710.
[Zhang et al. 2015] Xiang Zhang, Junbo Zhao, and Yann LeCun. 2015. Character-level convolutional networks for text classification. In NIPS.

Common English Vocabulary in Machine Learning and Artificial Intelligence

机器学习与人工智能领域中常用的英语词汇1.General Concepts (基础概念)•Artificial Intelligence (AI) - 人工智能1)Artificial Intelligence (AI) - 人工智能2)Machine Learning (ML) - 机器学习3)Deep Learning (DL) - 深度学习4)Neural Network - 神经网络5)Natural Language Processing (NLP) - 自然语言处理6)Computer Vision - 计算机视觉7)Robotics - 机器人技术8)Speech Recognition - 语音识别9)Expert Systems - 专家系统10)Knowledge Representation - 知识表示11)Pattern Recognition - 模式识别12)Cognitive Computing - 认知计算13)Autonomous Systems - 自主系统14)Human-Machine Interaction - 人机交互15)Intelligent Agents - 智能代理16)Machine Translation - 机器翻译17)Swarm Intelligence - 群体智能18)Genetic Algorithms - 遗传算法19)Fuzzy Logic - 模糊逻辑20)Reinforcement Learning - 强化学习•Machine Learning (ML) - 机器学习1)Machine Learning (ML) - 机器学习2)Artificial Neural Network - 人工神经网络3)Deep Learning - 深度学习4)Supervised Learning - 有监督学习5)Unsupervised Learning - 无监督学习6)Reinforcement Learning - 强化学习7)Semi-Supervised Learning - 半监督学习8)Training Data - 训练数据9)Test Data - 测试数据10)Validation Data - 验证数据11)Feature - 特征12)Label - 标签13)Model - 模型14)Algorithm - 算法15)Regression - 回归16)Classification - 分类17)Clustering - 聚类18)Dimensionality Reduction - 降维19)Overfitting - 过拟合20)Underfitting - 欠拟合•Deep Learning (DL) - 深度学习1)Deep Learning - 深度学习2)Neural Network - 神经网络3)Artificial Neural Network (ANN) - 人工神经网络4)Convolutional Neural Network (CNN) - 卷积神经网络5)Recurrent Neural Network (RNN) - 循环神经网络6)Long Short-Term Memory (LSTM) - 长短期记忆网络7)Gated Recurrent Unit (GRU) - 门控循环单元8)Autoencoder - 自编码器9)Generative Adversarial Network (GAN) - 生成对抗网络10)Transfer Learning - 迁移学习11)Pre-trained Model - 预训练模型12)Fine-tuning - 微调13)Feature Extraction - 特征提取14)Activation Function - 激活函数15)Loss Function - 损失函数16)Gradient Descent - 梯度下降17)Backpropagation - 反向传播18)Epoch - 训练周期19)Batch Size - 批量大小20)Dropout - 丢弃法•Neural Network - 神经网络1)Neural Network - 神经网络2)Artificial Neural Network (ANN) - 人工神经网络3)Deep Neural Network (DNN) - 深度神经网络4)Convolutional Neural Network (CNN) - 卷积神经网络5)Recurrent Neural Network (RNN) - 循环神经网络6)Long Short-Term Memory (LSTM) - 长短期记忆网络7)Gated Recurrent Unit (GRU) - 门控循环单元8)Feedforward Neural Network - 前馈神经网络9)Multi-layer Perceptron (MLP) - 多层感知器10)Radial Basis Function Network (RBFN) - 径向基函数网络11)Hopfield Network - 霍普菲尔德网络12)Boltzmann Machine - 玻尔兹曼机13)Autoencoder - 自编码器14)Spiking Neural Network (SNN) - 脉冲神经网络15)Self-organizing Map (SOM) - 自组织映射16)Restricted Boltzmann Machine (RBM) - 受限玻尔兹曼机17)Hebbian Learning - 海比安学习18)Competitive Learning - 竞争学习19)Neuroevolutionary - 神经进化20)Neuron - 神经元•Algorithm - 算法1)Algorithm - 算法2)Supervised Learning Algorithm - 有监督学习算法3)Unsupervised Learning Algorithm - 无监督学习算法4)Reinforcement Learning Algorithm - 强化学习算法5)Classification Algorithm - 分类算法6)Regression Algorithm - 回归算法7)Clustering Algorithm - 聚类算法8)Dimensionality Reduction Algorithm - 降维算法9)Decision Tree Algorithm - 决策树算法10)Random Forest Algorithm - 随机森林算法11)Support Vector Machine (SVM) Algorithm - 支持向量机算法12)K-Nearest Neighbors (KNN) Algorithm - K近邻算法13)Naive Bayes Algorithm - 朴素贝叶斯算法14)Gradient Descent Algorithm - 梯度下降算法15)Genetic Algorithm - 遗传算法16)Neural Network Algorithm - 神经网络算法17)Deep Learning Algorithm - 深度学习算法18)Ensemble Learning Algorithm - 集成学习算法19)Reinforcement Learning Algorithm - 强化学习算法20)Metaheuristic Algorithm - 元启发式算法•Model - 模型1)Model - 模型2)Machine Learning Model - 机器学习模型3)Artificial Intelligence Model - 人工智能模型4)Predictive Model - 预测模型5)Classification Model - 分类模型6)Regression Model - 回归模型7)Generative Model - 生成模型8)Discriminative Model - 判别模型9)Probabilistic Model - 概率模型10)Statistical Model - 统计模型11)Neural Network Model - 神经网络模型12)Deep Learning Model - 
深度学习模型13)Ensemble Model - 集成模型14)Reinforcement Learning Model - 强化学习模型15)Support Vector Machine (SVM) Model - 支持向量机模型16)Decision Tree Model - 决策树模型17)Random Forest Model - 随机森林模型18)Naive Bayes Model - 朴素贝叶斯模型19)Autoencoder Model - 自编码器模型20)Convolutional Neural Network (CNN) Model - 卷积神经网络模型•Dataset - 数据集1)Dataset - 数据集2)Training Dataset - 训练数据集3)Test Dataset - 测试数据集4)Validation Dataset - 验证数据集5)Balanced Dataset - 平衡数据集6)Imbalanced Dataset - 不平衡数据集7)Synthetic Dataset - 合成数据集8)Benchmark Dataset - 基准数据集9)Open Dataset - 开放数据集10)Labeled Dataset - 标记数据集11)Unlabeled Dataset - 未标记数据集12)Semi-Supervised Dataset - 半监督数据集13)Multiclass Dataset - 多分类数据集14)Feature Set - 特征集15)Data Augmentation - 数据增强16)Data Preprocessing - 数据预处理17)Missing Data - 缺失数据18)Outlier Detection - 异常值检测19)Data Imputation - 数据插补20)Metadata - 元数据•Training - 训练1)Training - 训练2)Training Data - 训练数据3)Training Phase - 训练阶段4)Training Set - 训练集5)Training Examples - 训练样本6)Training Instance - 训练实例7)Training Algorithm - 训练算法8)Training Model - 训练模型9)Training Process - 训练过程10)Training Loss - 训练损失11)Training Epoch - 训练周期12)Training Batch - 训练批次13)Online Training - 在线训练14)Offline Training - 离线训练15)Continuous Training - 连续训练16)Transfer Learning - 迁移学习17)Fine-Tuning - 微调18)Curriculum Learning - 课程学习19)Self-Supervised Learning - 自监督学习20)Active Learning - 主动学习•Testing - 测试1)Testing - 测试2)Test Data - 测试数据3)Test Set - 测试集4)Test Examples - 测试样本5)Test Instance - 测试实例6)Test Phase - 测试阶段7)Test Accuracy - 测试准确率8)Test Loss - 测试损失9)Test Error - 测试错误10)Test Metrics - 测试指标11)Test Suite - 测试套件12)Test Case - 测试用例13)Test Coverage - 测试覆盖率14)Cross-Validation - 交叉验证15)Holdout Validation - 留出验证16)K-Fold Cross-Validation - K折交叉验证17)Stratified Cross-Validation - 分层交叉验证18)Test Driven Development (TDD) - 测试驱动开发19)A/B Testing - A/B 测试20)Model Evaluation - 模型评估•Validation - 验证1)Validation - 验证2)Validation Data - 验证数据3)Validation Set - 验证集4)Validation Examples - 验证样本5)Validation Instance - 验证实例6)Validation Phase - 验证阶段7)Validation Accuracy - 验证准确率8)Validation Loss - 验证损失9)Validation Error - 验证错误10)Validation Metrics - 验证指标11)Cross-Validation - 交叉验证12)Holdout Validation - 留出验证13)K-Fold Cross-Validation - K折交叉验证14)Stratified Cross-Validation - 分层交叉验证15)Leave-One-Out Cross-Validation - 留一法交叉验证16)Validation Curve - 验证曲线17)Hyperparameter Validation - 超参数验证18)Model Validation - 模型验证19)Early Stopping - 提前停止20)Validation Strategy - 验证策略•Supervised Learning - 有监督学习1)Supervised Learning - 有监督学习2)Label - 标签3)Feature - 特征4)Target - 目标5)Training Labels - 训练标签6)Training Features - 训练特征7)Training Targets - 训练目标8)Training Examples - 训练样本9)Training Instance - 训练实例10)Regression - 回归11)Classification - 分类12)Predictor - 预测器13)Regression Model - 回归模型14)Classifier - 分类器15)Decision Tree - 决策树16)Support Vector Machine (SVM) - 支持向量机17)Neural Network - 神经网络18)Feature Engineering - 特征工程19)Model Evaluation - 模型评估20)Overfitting - 过拟合21)Underfitting - 欠拟合22)Bias-Variance Tradeoff - 偏差-方差权衡•Unsupervised Learning - 无监督学习1)Unsupervised Learning - 无监督学习2)Clustering - 聚类3)Dimensionality Reduction - 降维4)Anomaly Detection - 异常检测5)Association Rule Learning - 关联规则学习6)Feature Extraction - 特征提取7)Feature Selection - 特征选择8)K-Means - K均值9)Hierarchical Clustering - 层次聚类10)Density-Based Clustering - 基于密度的聚类11)Principal Component Analysis (PCA) - 主成分分析12)Independent Component Analysis (ICA) - 独立成分分析13)T-distributed Stochastic Neighbor Embedding (t-SNE) - t分布随机邻居嵌入14)Gaussian Mixture Model (GMM) - 高斯混合模型15)Self-Organizing Maps (SOM) - 自组织映射16)Autoencoder - 自动编码器17)Latent Variable - 潜变量18)Data Preprocessing - 
数据预处理19)Outlier Detection - 异常值检测20)Clustering Algorithm - 聚类算法•Reinforcement Learning - 强化学习1)Reinforcement Learning - 强化学习2)Agent - 代理3)Environment - 环境4)State - 状态5)Action - 动作6)Reward - 奖励7)Policy - 策略8)Value Function - 值函数9)Q-Learning - Q学习10)Deep Q-Network (DQN) - 深度Q网络11)Policy Gradient - 策略梯度12)Actor-Critic - 演员-评论家13)Exploration - 探索14)Exploitation - 开发15)Temporal Difference (TD) - 时间差分16)Markov Decision Process (MDP) - 马尔可夫决策过程17)State-Action-Reward-State-Action (SARSA) - 状态-动作-奖励-状态-动作18)Policy Iteration - 策略迭代19)Value Iteration - 值迭代20)Monte Carlo Methods - 蒙特卡洛方法•Semi-Supervised Learning - 半监督学习1)Semi-Supervised Learning - 半监督学习2)Labeled Data - 有标签数据3)Unlabeled Data - 无标签数据4)Label Propagation - 标签传播5)Self-Training - 自训练6)Co-Training - 协同训练7)Transudative Learning - 传导学习8)Inductive Learning - 归纳学习9)Manifold Regularization - 流形正则化10)Graph-based Methods - 基于图的方法11)Cluster Assumption - 聚类假设12)Low-Density Separation - 低密度分离13)Semi-Supervised Support Vector Machines (S3VM) - 半监督支持向量机14)Expectation-Maximization (EM) - 期望最大化15)Co-EM - 协同期望最大化16)Entropy-Regularized EM - 熵正则化EM17)Mean Teacher - 平均教师18)Virtual Adversarial Training - 虚拟对抗训练19)Tri-training - 三重训练20)Mix Match - 混合匹配•Feature - 特征1)Feature - 特征2)Feature Engineering - 特征工程3)Feature Extraction - 特征提取4)Feature Selection - 特征选择5)Input Features - 输入特征6)Output Features - 输出特征7)Feature Vector - 特征向量8)Feature Space - 特征空间9)Feature Representation - 特征表示10)Feature Transformation - 特征转换11)Feature Importance - 特征重要性12)Feature Scaling - 特征缩放13)Feature Normalization - 特征归一化14)Feature Encoding - 特征编码15)Feature Fusion - 特征融合16)Feature Dimensionality Reduction - 特征维度减少17)Continuous Feature - 连续特征18)Categorical Feature - 分类特征19)Nominal Feature - 名义特征20)Ordinal Feature - 有序特征•Label - 标签1)Label - 标签2)Labeling - 标注3)Ground Truth - 地面真值4)Class Label - 类别标签5)Target Variable - 目标变量6)Labeling Scheme - 标注方案7)Multi-class Labeling - 多类别标注8)Binary Labeling - 二分类标注9)Label Noise - 标签噪声10)Labeling Error - 标注错误11)Label Propagation - 标签传播12)Unlabeled Data - 无标签数据13)Labeled Data - 有标签数据14)Semi-supervised Learning - 半监督学习15)Active Learning - 主动学习16)Weakly Supervised Learning - 弱监督学习17)Noisy Label Learning - 噪声标签学习18)Self-training - 自训练19)Crowdsourcing Labeling - 众包标注20)Label Smoothing - 标签平滑化•Prediction - 预测1)Prediction - 预测2)Forecasting - 预测3)Regression - 回归4)Classification - 分类5)Time Series Prediction - 时间序列预测6)Forecast Accuracy - 预测准确性7)Predictive Modeling - 预测建模8)Predictive Analytics - 预测分析9)Forecasting Method - 预测方法10)Predictive Performance - 预测性能11)Predictive Power - 预测能力12)Prediction Error - 预测误差13)Prediction Interval - 预测区间14)Prediction Model - 预测模型15)Predictive Uncertainty - 预测不确定性16)Forecast Horizon - 预测时间跨度17)Predictive Maintenance - 预测性维护18)Predictive Policing - 预测式警务19)Predictive Healthcare - 预测性医疗20)Predictive Maintenance - 预测性维护•Classification - 分类1)Classification - 分类2)Classifier - 分类器3)Class - 类别4)Classify - 对数据进行分类5)Class Label - 类别标签6)Binary Classification - 二元分类7)Multiclass Classification - 多类分类8)Class Probability - 类别概率9)Decision Boundary - 决策边界10)Decision Tree - 决策树11)Support Vector Machine (SVM) - 支持向量机12)K-Nearest Neighbors (KNN) - K最近邻算法13)Naive Bayes - 朴素贝叶斯14)Logistic Regression - 逻辑回归15)Random Forest - 随机森林16)Neural Network - 神经网络17)SoftMax Function - SoftMax函数18)One-vs-All (One-vs-Rest) - 一对多(一对剩余)19)Ensemble Learning - 集成学习20)Confusion Matrix - 混淆矩阵•Regression - 回归1)Regression Analysis - 回归分析2)Linear Regression - 线性回归3)Multiple Regression - 多元回归4)Polynomial Regression - 多项式回归5)Logistic Regression - 逻辑回归6)Ridge Regression - 
岭回归7)Lasso Regression - Lasso回归8)Elastic Net Regression - 弹性网络回归9)Regression Coefficients - 回归系数10)Residuals - 残差11)Ordinary Least Squares (OLS) - 普通最小二乘法12)Ridge Regression Coefficient - 岭回归系数13)Lasso Regression Coefficient - Lasso回归系数14)Elastic Net Regression Coefficient - 弹性网络回归系数15)Regression Line - 回归线16)Prediction Error - 预测误差17)Regression Model - 回归模型18)Nonlinear Regression - 非线性回归19)Generalized Linear Models (GLM) - 广义线性模型20)Coefficient of Determination (R-squared) - 决定系数21)F-test - F检验22)Homoscedasticity - 同方差性23)Heteroscedasticity - 异方差性24)Autocorrelation - 自相关25)Multicollinearity - 多重共线性26)Outliers - 异常值27)Cross-validation - 交叉验证28)Feature Selection - 特征选择29)Feature Engineering - 特征工程30)Regularization - 正则化2.Neural Networks and Deep Learning (神经网络与深度学习)•Convolutional Neural Network (CNN) - 卷积神经网络1)Convolutional Neural Network (CNN) - 卷积神经网络2)Convolution Layer - 卷积层3)Feature Map - 特征图4)Convolution Operation - 卷积操作5)Stride - 步幅6)Padding - 填充7)Pooling Layer - 池化层8)Max Pooling - 最大池化9)Average Pooling - 平均池化10)Fully Connected Layer - 全连接层11)Activation Function - 激活函数12)Rectified Linear Unit (ReLU) - 线性修正单元13)Dropout - 随机失活14)Batch Normalization - 批量归一化15)Transfer Learning - 迁移学习16)Fine-Tuning - 微调17)Image Classification - 图像分类18)Object Detection - 物体检测19)Semantic Segmentation - 语义分割20)Instance Segmentation - 实例分割21)Generative Adversarial Network (GAN) - 生成对抗网络22)Image Generation - 图像生成23)Style Transfer - 风格迁移24)Convolutional Autoencoder - 卷积自编码器25)Recurrent Neural Network (RNN) - 循环神经网络•Recurrent Neural Network (RNN) - 循环神经网络1)Recurrent Neural Network (RNN) - 循环神经网络2)Long Short-Term Memory (LSTM) - 长短期记忆网络3)Gated Recurrent Unit (GRU) - 门控循环单元4)Sequence Modeling - 序列建模5)Time Series Prediction - 时间序列预测6)Natural Language Processing (NLP) - 自然语言处理7)Text Generation - 文本生成8)Sentiment Analysis - 情感分析9)Named Entity Recognition (NER) - 命名实体识别10)Part-of-Speech Tagging (POS Tagging) - 词性标注11)Sequence-to-Sequence (Seq2Seq) - 序列到序列12)Attention Mechanism - 注意力机制13)Encoder-Decoder Architecture - 编码器-解码器架构14)Bidirectional RNN - 双向循环神经网络15)Teacher Forcing - 强制教师法16)Backpropagation Through Time (BPTT) - 通过时间的反向传播17)Vanishing Gradient Problem - 梯度消失问题18)Exploding Gradient Problem - 梯度爆炸问题19)Language Modeling - 语言建模20)Speech Recognition - 语音识别•Long Short-Term Memory (LSTM) - 长短期记忆网络1)Long Short-Term Memory (LSTM) - 长短期记忆网络2)Cell State - 细胞状态3)Hidden State - 隐藏状态4)Forget Gate - 遗忘门5)Input Gate - 输入门6)Output Gate - 输出门7)Peephole Connections - 窥视孔连接8)Gated Recurrent Unit (GRU) - 门控循环单元9)Vanishing Gradient Problem - 梯度消失问题10)Exploding Gradient Problem - 梯度爆炸问题11)Sequence Modeling - 序列建模12)Time Series Prediction - 时间序列预测13)Natural Language Processing (NLP) - 自然语言处理14)Text Generation - 文本生成15)Sentiment Analysis - 情感分析16)Named Entity Recognition (NER) - 命名实体识别17)Part-of-Speech Tagging (POS Tagging) - 词性标注18)Attention Mechanism - 注意力机制19)Encoder-Decoder Architecture - 编码器-解码器架构20)Bidirectional LSTM - 双向长短期记忆网络•Attention Mechanism - 注意力机制1)Attention Mechanism - 注意力机制2)Self-Attention - 自注意力3)Multi-Head Attention - 多头注意力4)Transformer - 变换器5)Query - 查询6)Key - 键7)Value - 值8)Query-Value Attention - 查询-值注意力9)Dot-Product Attention - 点积注意力10)Scaled Dot-Product Attention - 缩放点积注意力11)Additive Attention - 加性注意力12)Context Vector - 上下文向量13)Attention Score - 注意力分数14)SoftMax Function - SoftMax函数15)Attention Weight - 注意力权重16)Global Attention - 全局注意力17)Local Attention - 局部注意力18)Positional Encoding - 位置编码19)Encoder-Decoder Attention - 编码器-解码器注意力20)Cross-Modal Attention - 跨模态注意力•Generative Adversarial Network (GAN) - 
生成对抗网络1)Generative Adversarial Network (GAN) - 生成对抗网络2)Generator - 生成器3)Discriminator - 判别器4)Adversarial Training - 对抗训练5)Minimax Game - 极小极大博弈6)Nash Equilibrium - 纳什均衡7)Mode Collapse - 模式崩溃8)Training Stability - 训练稳定性9)Loss Function - 损失函数10)Discriminative Loss - 判别损失11)Generative Loss - 生成损失12)Wasserstein GAN (WGAN) - Wasserstein GAN(WGAN)13)Deep Convolutional GAN (DCGAN) - 深度卷积生成对抗网络(DCGAN)14)Conditional GAN (c GAN) - 条件生成对抗网络(c GAN)15)Style GAN - 风格生成对抗网络16)Cycle GAN - 循环生成对抗网络17)Progressive Growing GAN (PGGAN) - 渐进式增长生成对抗网络(PGGAN)18)Self-Attention GAN (SAGAN) - 自注意力生成对抗网络(SAGAN)19)Big GAN - 大规模生成对抗网络20)Adversarial Examples - 对抗样本•Encoder-Decoder - 编码器-解码器1)Encoder-Decoder Architecture - 编码器-解码器架构2)Encoder - 编码器3)Decoder - 解码器4)Sequence-to-Sequence Model (Seq2Seq) - 序列到序列模型5)State Vector - 状态向量6)Context Vector - 上下文向量7)Hidden State - 隐藏状态8)Attention Mechanism - 注意力机制9)Teacher Forcing - 强制教师法10)Beam Search - 束搜索11)Recurrent Neural Network (RNN) - 循环神经网络12)Long Short-Term Memory (LSTM) - 长短期记忆网络13)Gated Recurrent Unit (GRU) - 门控循环单元14)Bidirectional Encoder - 双向编码器15)Greedy Decoding - 贪婪解码16)Masking - 遮盖17)Dropout - 随机失活18)Embedding Layer - 嵌入层19)Cross-Entropy Loss - 交叉熵损失20)Tokenization - 令牌化•Transfer Learning - 迁移学习1)Transfer Learning - 迁移学习2)Source Domain - 源领域3)Target Domain - 目标领域4)Fine-Tuning - 微调5)Domain Adaptation - 领域自适应6)Pre-Trained Model - 预训练模型7)Feature Extraction - 特征提取8)Knowledge Transfer - 知识迁移9)Unsupervised Domain Adaptation - 无监督领域自适应10)Semi-Supervised Domain Adaptation - 半监督领域自适应11)Multi-Task Learning - 多任务学习12)Data Augmentation - 数据增强13)Task Transfer - 任务迁移14)Model Agnostic Meta-Learning (MAML) - 与模型无关的元学习(MAML)15)One-Shot Learning - 单样本学习16)Zero-Shot Learning - 零样本学习17)Few-Shot Learning - 少样本学习18)Knowledge Distillation - 知识蒸馏19)Representation Learning - 表征学习20)Adversarial Transfer Learning - 对抗迁移学习•Pre-trained Models - 预训练模型1)Pre-trained Model - 预训练模型2)Transfer Learning - 迁移学习3)Fine-Tuning - 微调4)Knowledge Transfer - 知识迁移5)Domain Adaptation - 领域自适应6)Feature Extraction - 特征提取7)Representation Learning - 表征学习8)Language Model - 语言模型9)Bidirectional Encoder Representations from Transformers (BERT) - 双向编码器结构转换器10)Generative Pre-trained Transformer (GPT) - 生成式预训练转换器11)Transformer-based Models - 基于转换器的模型12)Masked Language Model (MLM) - 掩蔽语言模型13)Cloze Task - 填空任务14)Tokenization - 令牌化15)Word Embeddings - 词嵌入16)Sentence Embeddings - 句子嵌入17)Contextual Embeddings - 上下文嵌入18)Self-Supervised Learning - 自监督学习19)Large-Scale Pre-trained Models - 大规模预训练模型•Loss Function - 损失函数1)Loss Function - 损失函数2)Mean Squared Error (MSE) - 均方误差3)Mean Absolute Error (MAE) - 平均绝对误差4)Cross-Entropy Loss - 交叉熵损失5)Binary Cross-Entropy Loss - 二元交叉熵损失6)Categorical Cross-Entropy Loss - 分类交叉熵损失7)Hinge Loss - 合页损失8)Huber Loss - Huber损失9)Wasserstein Distance - Wasserstein距离10)Triplet Loss - 三元组损失11)Contrastive Loss - 对比损失12)Dice Loss - Dice损失13)Focal Loss - 焦点损失14)GAN Loss - GAN损失15)Adversarial Loss - 对抗损失16)L1 Loss - L1损失17)L2 Loss - L2损失18)Huber Loss - Huber损失19)Quantile Loss - 分位数损失•Activation Function - 激活函数1)Activation Function - 激活函数2)Sigmoid Function - Sigmoid函数3)Hyperbolic Tangent Function (Tanh) - 双曲正切函数4)Rectified Linear Unit (Re LU) - 矩形线性单元5)Parametric Re LU (P Re LU) - 参数化Re LU6)Exponential Linear Unit (ELU) - 指数线性单元7)Swish Function - Swish函数8)Softplus Function - Soft plus函数9)Softmax Function - SoftMax函数10)Hard Tanh Function - 硬双曲正切函数11)Softsign Function - Softsign函数12)GELU (Gaussian Error Linear Unit) - GELU(高斯误差线性单元)13)Mish Function - Mish函数14)CELU (Continuous Exponential Linear Unit) - 
CELU(连续指数线性单元)15)Bent Identity Function - 弯曲恒等函数16)Gaussian Error Linear Units (GELUs) - 高斯误差线性单元17)Adaptive Piecewise Linear (APL) - 自适应分段线性函数18)Radial Basis Function (RBF) - 径向基函数•Backpropagation - 反向传播1)Backpropagation - 反向传播2)Gradient Descent - 梯度下降3)Partial Derivative - 偏导数4)Chain Rule - 链式法则5)Forward Pass - 前向传播6)Backward Pass - 反向传播7)Computational Graph - 计算图8)Neural Network - 神经网络9)Loss Function - 损失函数10)Gradient Calculation - 梯度计算11)Weight Update - 权重更新12)Activation Function - 激活函数13)Optimizer - 优化器14)Learning Rate - 学习率15)Mini-Batch Gradient Descent - 小批量梯度下降16)Stochastic Gradient Descent (SGD) - 随机梯度下降17)Batch Gradient Descent - 批量梯度下降18)Momentum - 动量19)Adam Optimizer - Adam优化器20)Learning Rate Decay - 学习率衰减•Gradient Descent - 梯度下降1)Gradient Descent - 梯度下降2)Stochastic Gradient Descent (SGD) - 随机梯度下降3)Mini-Batch Gradient Descent - 小批量梯度下降4)Batch Gradient Descent - 批量梯度下降5)Learning Rate - 学习率6)Momentum - 动量7)Adaptive Moment Estimation (Adam) - 自适应矩估计8)RMSprop - 均方根传播9)Learning Rate Schedule - 学习率调度10)Convergence - 收敛11)Divergence - 发散12)Adagrad - 自适应学习速率方法13)Adadelta - 自适应增量学习率方法14)Adamax - 自适应矩估计的扩展版本15)Nadam - Nesterov Accelerated Adaptive Moment Estimation16)Learning Rate Decay - 学习率衰减17)Step Size - 步长18)Conjugate Gradient Descent - 共轭梯度下降19)Line Search - 线搜索20)Newton's Method - 牛顿法•Learning Rate - 学习率1)Learning Rate - 学习率2)Adaptive Learning Rate - 自适应学习率3)Learning Rate Decay - 学习率衰减4)Initial Learning Rate - 初始学习率5)Step Size - 步长6)Momentum - 动量7)Exponential Decay - 指数衰减8)Annealing - 退火9)Cyclical Learning Rate - 循环学习率10)Learning Rate Schedule - 学习率调度11)Warm-up - 预热12)Learning Rate Policy - 学习率策略13)Learning Rate Annealing - 学习率退火14)Cosine Annealing - 余弦退火15)Gradient Clipping - 梯度裁剪16)Adapting Learning Rate - 适应学习率17)Learning Rate Multiplier - 学习率倍增器18)Learning Rate Reduction - 学习率降低19)Learning Rate Update - 学习率更新20)Scheduled Learning Rate - 定期学习率•Batch Size - 批量大小1)Batch Size - 批量大小2)Mini-Batch - 小批量3)Batch Gradient Descent - 批量梯度下降4)Stochastic Gradient Descent (SGD) - 随机梯度下降5)Mini-Batch Gradient Descent - 小批量梯度下降6)Online Learning - 在线学习7)Full-Batch - 全批量8)Data Batch - 数据批次9)Training Batch - 训练批次10)Batch Normalization - 批量归一化11)Batch-wise Optimization - 批量优化12)Batch Processing - 批量处理13)Batch Sampling - 批量采样14)Adaptive Batch Size - 自适应批量大小15)Batch Splitting - 批量分割16)Dynamic Batch Size - 动态批量大小17)Fixed Batch Size - 固定批量大小18)Batch-wise Inference - 批量推理19)Batch-wise Training - 批量训练20)Batch Shuffling - 批量洗牌•Epoch - 训练周期1)Training Epoch - 训练周期2)Epoch Size - 周期大小3)Early Stopping - 提前停止4)Validation Set - 验证集5)Training Set - 训练集6)Test Set - 测试集7)Overfitting - 过拟合8)Underfitting - 欠拟合9)Model Evaluation - 模型评估10)Model Selection - 模型选择11)Hyperparameter Tuning - 超参数调优12)Cross-Validation - 交叉验证13)K-fold Cross-Validation - K折交叉验证14)Stratified Cross-Validation - 分层交叉验证15)Leave-One-Out Cross-Validation (LOOCV) - 留一法交叉验证16)Grid Search - 网格搜索17)Random Search - 随机搜索18)Model Complexity - 模型复杂度19)Learning Curve - 学习曲线20)Convergence - 收敛3.Machine Learning Techniques and Algorithms (机器学习技术与算法)•Decision Tree - 决策树1)Decision Tree - 决策树2)Node - 节点3)Root Node - 根节点4)Leaf Node - 叶节点5)Internal Node - 内部节点6)Splitting Criterion - 分裂准则7)Gini Impurity - 基尼不纯度8)Entropy - 熵9)Information Gain - 信息增益10)Gain Ratio - 增益率11)Pruning - 剪枝12)Recursive Partitioning - 递归分割13)CART (Classification and Regression Trees) - 分类回归树14)ID3 (Iterative Dichotomiser 3) - 迭代二叉树315)C4.5 (successor of ID3) - C4.5(ID3的后继者)16)C5.0 (successor of C4.5) - C5.0(C4.5的后继者)17)Split Point - 分裂点18)Decision Boundary - 决策边界19)Pruned Tree - 
剪枝后的树20)Decision Tree Ensemble - 决策树集成•Random Forest - 随机森林1)Random Forest - 随机森林2)Ensemble Learning - 集成学习3)Bootstrap Sampling - 自助采样4)Bagging (Bootstrap Aggregating) - 装袋法5)Out-of-Bag (OOB) Error - 袋外误差6)Feature Subset - 特征子集7)Decision Tree - 决策树8)Base Estimator - 基础估计器9)Tree Depth - 树深度10)Randomization - 随机化11)Majority Voting - 多数投票12)Feature Importance - 特征重要性13)OOB Score - 袋外得分14)Forest Size - 森林大小15)Max Features - 最大特征数16)Min Samples Split - 最小分裂样本数17)Min Samples Leaf - 最小叶节点样本数18)Gini Impurity - 基尼不纯度19)Entropy - 熵20)Variable Importance - 变量重要性•Support Vector Machine (SVM) - 支持向量机1)Support Vector Machine (SVM) - 支持向量机2)Hyperplane - 超平面3)Kernel Trick - 核技巧4)Kernel Function - 核函数5)Margin - 间隔6)Support Vectors - 支持向量7)Decision Boundary - 决策边界8)Maximum Margin Classifier - 最大间隔分类器9)Soft Margin Classifier - 软间隔分类器10) C Parameter - C参数11)Radial Basis Function (RBF) Kernel - 径向基函数核12)Polynomial Kernel - 多项式核13)Linear Kernel - 线性核14)Quadratic Kernel - 二次核15)Gaussian Kernel - 高斯核16)Regularization - 正则化17)Dual Problem - 对偶问题18)Primal Problem - 原始问题19)Kernelized SVM - 核化支持向量机20)Multiclass SVM - 多类支持向量机•K-Nearest Neighbors (KNN) - K-最近邻1)K-Nearest Neighbors (KNN) - K-最近邻2)Nearest Neighbor - 最近邻3)Distance Metric - 距离度量4)Euclidean Distance - 欧氏距离5)Manhattan Distance - 曼哈顿距离6)Minkowski Distance - 闵可夫斯基距离7)Cosine Similarity - 余弦相似度8)K Value - K值9)Majority Voting - 多数投票10)Weighted KNN - 加权KNN11)Radius Neighbors - 半径邻居12)Ball Tree - 球树13)KD Tree - KD树14)Locality-Sensitive Hashing (LSH) - 局部敏感哈希15)Curse of Dimensionality - 维度灾难16)Class Label - 类标签17)Training Set - 训练集18)Test Set - 测试集19)Validation Set - 验证集20)Cross-Validation - 交叉验证•Naive Bayes - 朴素贝叶斯1)Naive Bayes - 朴素贝叶斯2)Bayes' Theorem - 贝叶斯定理3)Prior Probability - 先验概率4)Posterior Probability - 后验概率5)Likelihood - 似然6)Class Conditional Probability - 类条件概率7)Feature Independence Assumption - 特征独立假设8)Multinomial Naive Bayes - 多项式朴素贝叶斯9)Gaussian Naive Bayes - 高斯朴素贝叶斯10)Bernoulli Naive Bayes - 伯努利朴素贝叶斯11)Laplace Smoothing - 拉普拉斯平滑12)Add-One Smoothing - 加一平滑13)Maximum A Posteriori (MAP) - 最大后验概率14)Maximum Likelihood Estimation (MLE) - 最大似然估计15)Classification - 分类16)Feature Vectors - 特征向量17)Training Set - 训练集18)Test Set - 测试集19)Class Label - 类标签20)Confusion Matrix - 混淆矩阵•Clustering - 聚类1)Clustering - 聚类2)Centroid - 质心3)Cluster Analysis - 聚类分析4)Partitioning Clustering - 划分式聚类5)Hierarchical Clustering - 层次聚类6)Density-Based Clustering - 基于密度的聚类7)K-Means Clustering - K均值聚类8)K-Medoids Clustering - K中心点聚类9)DBSCAN (Density-Based Spatial Clustering of Applications with Noise) - 基于密度的空间聚类算法10)Agglomerative Clustering - 聚合式聚类11)Dendrogram - 系统树图12)Silhouette Score - 轮廓系数13)Elbow Method - 肘部法则14)Clustering Validation - 聚类验证15)Intra-cluster Distance - 类内距离16)Inter-cluster Distance - 类间距离17)Cluster Cohesion - 类内连贯性18)Cluster Separation - 类间分离度19)Cluster Assignment - 聚类分配20)Cluster Label - 聚类标签•K-Means - K-均值1)K-Means - K-均值2)Centroid - 质心3)Cluster - 聚类4)Cluster Center - 聚类中心5)Cluster Assignment - 聚类分配6)Cluster Analysis - 聚类分析7)K Value - K值8)Elbow Method - 肘部法则9)Inertia - 惯性10)Silhouette Score - 轮廓系数11)Convergence - 收敛12)Initialization - 初始化13)Euclidean Distance - 欧氏距离14)Manhattan Distance - 曼哈顿距离15)Distance Metric - 距离度量16)Cluster Radius - 聚类半径17)Within-Cluster Variation - 类内变异18)Cluster Quality - 聚类质量19)Clustering Algorithm - 聚类算法20)Clustering Validation - 聚类验证•Dimensionality Reduction - 降维1)Dimensionality Reduction - 降维2)Feature Extraction - 特征提取3)Feature Selection - 特征选择4)Principal Component Analysis (PCA) - 主成分分析5)Singular Value Decomposition (SVD) - 奇异值分解6)Linear 
Discriminant Analysis (LDA) - 线性判别分析7)t-Distributed Stochastic Neighbor Embedding (t-SNE) - t-分布随机邻域嵌入8)Autoencoder - 自编码器9)Manifold Learning - 流形学习10)Locally Linear Embedding (LLE) - 局部线性嵌入11)Isomap - 等度量映射12)Uniform Manifold Approximation and Projection (UMAP) - 均匀流形逼近与投影13)Kernel PCA - 核主成分分析14)Non-negative Matrix Factorization (NMF) - 非负矩阵分解15)Independent Component Analysis (ICA) - 独立成分分析16)Variational Autoencoder (VAE) - 变分自编码器17)Sparse Coding - 稀疏编码18)Random Projection - 随机投影19)Neighborhood Preserving Embedding (NPE) - 保持邻域结构的嵌入20)Curvilinear Component Analysis (CCA) - 曲线成分分析•Principal Component Analysis (PCA) - 主成分分析1)Principal Component Analysis (PCA) - 主成分分析2)Eigenvector - 特征向量3)Eigenvalue - 特征值4)Covariance Matrix - 协方差矩阵。

A Survey of Hierarchical Text Classification

The requested topic is a survey of hierarchical text classification; the article below works through it step by step, in roughly 1,500-2,000 characters.

Article title: A Survey of Hierarchical Text Classification: Analyzing and Exploring the Hierarchical Practice of Text Classification.

Introduction: In the information age, vast amounts of text data are generated and stored.

Text classification is an important technique for grouping texts into specific categories so that this mass of data can be organized and managed effectively.

However, traditional text classification methods can only assign documents to categories at a single, flat level.

With the continuing growth of stored information and the rapid development of deep learning, hierarchical text classification has become increasingly important.

This article surveys hierarchical text classification, examining its basic principles, methods, and applications, as well as its prospects for future development.

1. Basic principles. 1.1 Definition and purpose of text classification. Text classification is the task of assigning given text data to predefined categories.

It is a supervised learning task: a model is trained on labeled data and then predicts the categories of unlabeled text.

The purpose of text classification is to categorize texts by their content so that information can be better understood and organized.

1.2 The concept of hierarchical text classification. Hierarchical text classification assigns text data to categories organized in multiple levels.

This provides a finer-grained and more structured form of organization, making the classification results more flexible and interpretable.

For example, a hierarchical classification scheme may contain several levels, refining the assignment step by step from broad classes down to fine-grained subclasses.

2. Basic methods. 2.1 Feature extraction and representation. Traditional methods usually represent text with statistical features such as term frequency and tf-idf.

Deep learning methods instead use word embedding techniques such as Word2Vec and FastText to learn semantic representations of text.

Either kind of representation can be used for hierarchical text classification, but the feature representations must be kept consistent across the levels of the hierarchy; a brief sketch of the statistical representation follows.
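As a concrete illustration of the statistical representation mentioned above, the sketch below builds a tf-idf feature matrix with scikit-learn (one common choice; the three-document corpus is invented for the example):

```python
# Build a tf-idf document-term matrix for a toy corpus (illustrative only).
from sklearn.feature_extraction.text import TfidfVectorizer

corpus = [
    "hierarchical text classification assigns documents to a taxonomy",
    "flat text classification uses a single level of categories",
    "tf-idf weighs terms by their frequency and rarity across documents",
]

vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(corpus)   # sparse matrix: documents x vocabulary terms

print(X.shape)
print(vectorizer.get_feature_names_out()[:5])
```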

2.2 Classifier selection and training. Commonly used classifiers include naive Bayes, support vector machines (SVM), decision trees, and deep neural networks.

In hierarchical text classification, a top-down strategy is typically adopted: a document is first classified into a high-level category and then refined level by level into its subcategories, as in the sketch below.
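A minimal sketch of this top-down strategy, assuming a two-level label hierarchy and scikit-learn; the documents, labels, and hierarchy are toy examples rather than a benchmark:

```python
# Top-down hierarchical classification: one classifier for the coarse level,
# then one classifier per coarse category for its subcategories (toy data).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

docs = [
    "the team won the final match",
    "she served an ace to win the tennis title",
    "parliament passed the new budget",
    "voters head to the polls tomorrow",
]
top_labels = ["sports", "sports", "politics", "politics"]
sub_labels = ["sports/soccer", "sports/tennis", "politics/finance", "politics/elections"]

vec = TfidfVectorizer()
X = vec.fit_transform(docs)

top_clf = LogisticRegression().fit(X, top_labels)           # level 1

sub_clf = {}                                                 # level 2, per branch
for parent in set(top_labels):
    idx = [i for i, t in enumerate(top_labels) if t == parent]
    sub_clf[parent] = LogisticRegression().fit(X[idx], [sub_labels[i] for i in idx])

def classify(text):
    x = vec.transform([text])
    parent = top_clf.predict(x)[0]         # coarse decision first
    return sub_clf[parent].predict(x)[0]   # then refine within that branch

print(classify("the team won the tennis match"))
```

A flat classifier over all leaf labels is the obvious alternative; the top-down approach keeps each classifier small, but errors made at the upper level propagate down to the subcategories.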

Cluster Analysis: Translation of an English Data Mining Text

School of Electrical and Information Engineering, foreign literature translation. English title: Data mining - clustering. Chinese title: 数据挖掘—聚类分析. Major: Automation. Name: ****. Class and student number: ****. Supervisor: ******. Source of the translated text: Data Mining, by Ian H. Witten and Eibe Frank. April 26, 2010.

Clustering

5.1 INTRODUCTION

Clustering is similar to classification in that data are grouped. However, unlike classification, the groups are not predefined. Instead, the grouping is accomplished by finding similarities between data according to characteristics found in the actual data. The groups are called clusters. Some authors view clustering as a special type of classification. In this text, however, we follow a more conventional view in that the two are different. Many definitions for clusters have been proposed:

● Set of like elements. Elements from different clusters are not alike.
● The distance between points in a cluster is less than the distance between a point in the cluster and any point outside it.

A term similar to clustering is database segmentation, where like tuples (records) in a database are grouped together. This is done to partition or segment the database into components that then give the user a more general view of the data. In this text, we do not differentiate between segmentation and clustering. A simple example of clustering is found in Example 5.1. This example illustrates the fact that determining how to do the clustering is not straightforward.

As illustrated in Figure 5.1, a given set of data may be clustered on different attributes. Here a group of homes in a geographic area is shown. The first type of clustering is based on the location of the home: homes that are geographically close to each other are clustered together. In the second clustering, homes are grouped based on the size of the house.

Clustering has been used in many application domains, including biology, medicine, anthropology, marketing, and economics. Clustering applications include plant and animal classification, disease classification, image processing, pattern recognition, and document retrieval. One of the first domains in which clustering was used was biological taxonomy. Recent uses include examining Web log data to detect usage patterns.

When clustering is applied to a real-world database, many interesting problems occur:

● Outlier handling is difficult. Here the elements do not naturally fall into any cluster. They can be viewed as solitary clusters. However, if a clustering algorithm attempts to find larger clusters, these outliers will be forced to be placed in some cluster. This process may result in the creation of poor clusters by combining two existing clusters and leaving the outlier in its own cluster.
● Dynamic data in the database implies that cluster membership may change over time.
● Interpreting the semantic meaning of each cluster may be difficult. With classification, the labeling of the classes is known ahead of time. However, with clustering, this may not be the case. Thus, when the clustering process finishes creating a set of clusters, the exact meaning of each cluster may not be obvious. Here is where a domain expert is needed to assign a label or interpretation for each cluster.
● There is no one correct answer to a clustering problem. In fact, many answers may be found. The exact number of clusters required is not easy to determine. Again, a domain expert may be required. For example, suppose we have a set of data about plants that have been collected during a field trip.
Without any prior knowledge of plant classification, if we attempt to divide this set of data into similar groupings, it would not be clear how many groups should be created.

● Another related issue is what data should be used for clustering. Unlike learning during a classification process, where there is some a priori knowledge concerning what the attributes of each classification should be, in clustering we have no supervised learning to aid the process. Indeed, clustering can be viewed as similar to unsupervised learning.

We can then summarize some basic features of clustering (as opposed to classification):

● The (best) number of clusters is not known.
● There may not be any a priori knowledge concerning the clusters.
● Cluster results are dynamic.

The clustering problem is stated as shown in Definition 5.1. Here we assume that the number of clusters to be created is an input value, k. The actual content (and interpretation) of each cluster, $K_j$, $1 \le j \le k$, is determined as a result of the function definition. Without loss of generality, we will view the result of solving a clustering problem as a set of clusters that is created: $K = \{K_1, K_2, \ldots, K_k\}$.

DEFINITION 5.1. Given a database $D = \{t_1, t_2, \ldots, t_n\}$ of tuples and an integer value $k$, the clustering problem is to define a mapping $f: D \to \{1, \ldots, k\}$ where each $t_i$ is assigned to one cluster $K_j$, $1 \le j \le k$. A cluster $K_j$ contains precisely those tuples mapped to it; that is, $K_j = \{t_i \mid f(t_i) = K_j,\ 1 \le i \le n,\ \text{and } t_i \in D\}$.
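To make Definition 5.1 concrete, the sketch below lets a partitional algorithm (k-means via scikit-learn, used here only as one convenient choice) produce the mapping f from each tuple to one of k clusters; the data and k are toy assumptions:

```python
# A partitional clustering defines a mapping f: D -> {1, ..., k} (Definition 5.1).
import numpy as np
from sklearn.cluster import KMeans

D = np.array([[1.0, 1.0], [1.2, 0.8], [8.0, 8.0], [8.2, 7.9], [0.9, 1.1]])
k = 2

f = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(D)
print(f)   # the cluster index assigned to each tuple, e.g. [0 0 1 1 0]
```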
A classification of the different types of clustering algorithms is shown in Figure 5.2. Clustering algorithms themselves may be viewed as hierarchical or partitional. With hierarchical clustering, a nested set of clusters is created. Each level in the hierarchy has a separate set of clusters. At the lowest level, each item is in its own unique cluster. At the highest level, all items belong to the same cluster. With hierarchical clustering, the desired number of clusters is not input. With partitional clustering, the algorithm creates only one set of clusters. These approaches use the desired number of clusters to drive how the final set is created. Traditional clustering algorithms tend to be targeted to small numeric databases that fit into memory. There are, however, more recent clustering algorithms that look at categorical data and are targeted to larger, perhaps dynamic, databases. Algorithms targeted to larger databases may adapt to memory constraints by either sampling the database or using data structures, which can be compressed or pruned to fit into memory regardless of the size of the database. Clustering algorithms may also differ based on whether they produce overlapping or nonoverlapping clusters. Even though we consider only nonoverlapping clusters, it is possible to place an item in multiple clusters. In turn, nonoverlapping clusters can be viewed as extrinsic or intrinsic. Extrinsic techniques use labeling of the items to assist in the classification process. These algorithms are the traditional classification supervised learning algorithms in which a special input training set is used. Intrinsic algorithms do not use any a priori category labels, but depend only on the adjacency matrix containing the distance between objects. All algorithms we examine in this chapter fall into the intrinsic class.

The types of clustering algorithms can be further classified based on the implementation technique used. Hierarchical algorithms can be categorized as agglomerative or divisive. "Agglomerative" implies that the clusters are created in a bottom-up fashion, while divisive algorithms work in a top-down fashion. Although both hierarchical and partitional algorithms could be described using the agglomerative vs. divisive label, it typically is more associated with hierarchical algorithms. Another descriptive tag indicates whether each individual element is handled one by one, serial (sometimes called incremental), or whether all items are examined together, simultaneous. If a specific tuple is viewed as having attribute values for all attributes in the schema, then clustering algorithms could differ as to how the attribute values are examined. As is usually done with decision tree classification techniques, some algorithms examine attribute values one at a time, monothetic. Polythetic algorithms consider all attribute values at one time. Finally, clustering algorithms can be labeled based on the mathematical formulation given to the algorithm: graph theoretic or matrix algebra. In this chapter we generally use the graph approach and describe the input to the clustering algorithm as an adjacency matrix labeled with distance measure.

We discuss many clustering algorithms in the following sections. This is only a representative subset of the many algorithms that have been proposed in the literature. Before looking at these algorithms, we first examine possible similarity measures and examine the impact of outliers.

5.2 SIMILARITY AND DISTANCE MEASURES

There are many desirable properties for the clusters created by a solution to a specific clustering problem. The most important one is that a tuple within one cluster is more like tuples within that cluster than it is similar to tuples outside it. As with classification, then, we assume the definition of a similarity measure, $sim(t_i, t_l)$, defined between any two tuples $t_i, t_l \in D$. This provides a more strict and alternative clustering definition, as found in Definition 5.2. Unless otherwise stated, we use the first definition rather than the second. Keep in mind that the similarity relationship stated within the second definition is a desirable, although not always obtainable, property.

A distance measure, $dis(t_i, t_j)$, as opposed to similarity, is often used in clustering. The clustering problem then has the desirable property that, given a cluster $K_j$, $\forall t_{jl}, t_{jm} \in K_j$ and $t_i \notin K_j$, $dis(t_{jl}, t_{jm}) \le dis(t_{jl}, t_i)$.

Some clustering algorithms look only at numeric data, usually assuming metric data points. Metric attributes satisfy the triangular inequality. The cluster can then be described by using several characteristic values. Given a cluster $K_m$ of $N$ points $\{t_{m1}, t_{m2}, \ldots, t_{mN}\}$, we make the following definitions [ZRL96]:

centroid: $C_m = \frac{1}{N}\sum_{i=1}^{N} t_{mi}$

radius: $R_m = \sqrt{\frac{1}{N}\sum_{i=1}^{N} (t_{mi} - C_m)^2}$

diameter: $D_m = \sqrt{\frac{1}{N(N-1)}\sum_{i=1}^{N}\sum_{j=1}^{N} (t_{mi} - t_{mj})^2}$

Here the centroid is the "middle" of the cluster; it need not be an actual point in the cluster. Some clustering algorithms alternatively assume that the cluster is represented by one centrally located object in the cluster called a medoid. The radius is the square root of the average mean squared distance from any point in the cluster to the centroid, and the diameter is the square root of the average mean squared distance between all pairs of points in the cluster. We use the notation $M_m$ to indicate the medoid for cluster $K_m$. A small computational sketch of these summaries follows.
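A minimal numeric sketch of these cluster summaries (centroid, radius, and medoid), using NumPy on a made-up four-point cluster; it illustrates the definitions above rather than any library's implementation:

```python
# Compute the centroid, radius, and medoid of one cluster of numeric points.
import numpy as np

cluster = np.array([[1.0, 2.0], [2.0, 1.0], [1.5, 1.8], [2.2, 2.4]])

centroid = cluster.mean(axis=0)                     # the "middle" of the cluster
radius = np.sqrt(((cluster - centroid) ** 2).sum(axis=1).mean())

# The medoid is the actual member with the smallest total distance to the others.
pairwise = np.linalg.norm(cluster[:, None, :] - cluster[None, :, :], axis=-1)
medoid = cluster[pairwise.sum(axis=1).argmin()]

print(centroid, radius, medoid)
```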
Many clustering algorithms require that the distance between clusters (rather than elements) be determined. This is not an easy task given that there are many interpretations for distance between clusters. Given clusters $K_i$ and $K_j$, there are several standard alternatives to calculate the distance between clusters. A representative list is (a numeric sketch of these measures appears at the end of this excerpt):

● Single link: smallest distance between an element in one cluster and an element in the other. We thus have $dis(K_i, K_j) = \min(dis(t_{il}, t_{jm}))$, $\forall t_{il} \in K_i \notin K_j$ and $\forall t_{jm} \in K_j \notin K_i$.
● Complete link: largest distance between an element in one cluster and an element in the other. We thus have $dis(K_i, K_j) = \max(dis(t_{il}, t_{jm}))$, $\forall t_{il} \in K_i \notin K_j$ and $\forall t_{jm} \in K_j \notin K_i$.
● Average: average distance between an element in one cluster and an element in the other. We thus have $dis(K_i, K_j) = \mathrm{mean}(dis(t_{il}, t_{jm}))$, $\forall t_{il} \in K_i \notin K_j$ and $\forall t_{jm} \in K_j \notin K_i$.
● Centroid: If clusters have a representative centroid, then the centroid distance is defined as the distance between the centroids. We thus have $dis(K_i, K_j) = dis(C_i, C_j)$, where $C_i$ is the centroid for $K_i$ and similarly for $C_j$.
● Medoid: Using a medoid to represent each cluster, the distance between the clusters can be defined by the distance between the medoids: $dis(K_i, K_j) = dis(M_i, M_j)$.

5.3 OUTLIERS

As mentioned earlier, outliers are sample points with values much different from those of the remaining set of data. Outliers may represent errors in the data (perhaps a malfunctioning sensor recorded an incorrect data value) or could be correct data values that are simply much different from the remaining data. A person who is 2.5 meters tall is much taller than most people. In analyzing the height of individuals, this value probably would be viewed as an outlier.

Some clustering techniques do not perform well with the presence of outliers. This problem is illustrated in Figure 5.3. Here if three clusters are found (solid line), the outlier will occur in a cluster by itself. However, if two clusters are found (dashed line), the two (obviously) different sets of data will be placed in one cluster because they are closer together than the outlier. This problem is complicated by the fact that many clustering algorithms actually have as input the number of desired clusters to be found.

Clustering algorithms may actually find and remove outliers to ensure that they perform better. However, care must be taken in actually removing outliers. For example, suppose that the data mining problem is to predict flooding. Extremely high water level values occur very infrequently, and when compared with the normal water level values may seem to be outliers. However, removing these values may not allow the data mining algorithms to work effectively because there would be no data that showed that floods ever actually occurred.

Outlier detection, or outlier mining, is the process of identifying outliers in a set of data. Clustering, or other data mining, algorithms may then choose to remove or treat these values differently. Some outlier detection techniques are based on statistical techniques. These usually assume that the set of data follows a known distribution and that outliers can be detected by well-known tests such as discordancy tests. However, these tests are not very realistic for real-world data because real-world data values may not follow well-defined data distributions. Also, most of these tests assume a single attribute value, and many attributes are involved in real-world datasets. Alternative detection techniques may be based on distance measures.
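Returning to the inter-cluster distance measures listed above, here is a minimal NumPy sketch on two made-up clusters; it illustrates single link, complete link, average, and centroid distance under those toy assumptions:

```python
# Inter-cluster distances between two toy clusters of 2-D points.
import numpy as np

K_i = np.array([[0.0, 0.0], [1.0, 0.0]])
K_j = np.array([[4.0, 3.0], [5.0, 5.0]])

# All pairwise distances between an element of K_i and an element of K_j.
d = np.linalg.norm(K_i[:, None, :] - K_j[None, :, :], axis=-1)

single_link = d.min()      # smallest cross-cluster distance
complete_link = d.max()    # largest cross-cluster distance
average_link = d.mean()    # mean cross-cluster distance
centroid_dist = np.linalg.norm(K_i.mean(axis=0) - K_j.mean(axis=0))

print(single_link, complete_link, average_link, centroid_dist)
```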

The term "hierarchical"

hierarchical (等级制) - a term used to describe a system or structure in which various levels or ranks of authority exist, with higher levels having more power and control than lower levels.

1. The military operates under a hierarchical system, with each rank having specific duties and responsibilities. 军队运作在一个等级制度下,每个军阶有着特定的职责和责任。

2. In many companies, there is a hierarchical organizational structure, with senior management making decisions that are implemented by lower-level employees.在许多公司中,存在着一种等级制的组织结构,高层管理人员做出决策,由低层员工来执行。

3. The Catholic Church has a hierarchical structure, with the Pope as the highest authority and bishops and priests at lower levels.天主教会有一个等级制度,教皇是最高权威,而主教和神父则处于较低的层级。

4. The feudal system was a hierarchical social structure in medieval Europe, with kings, nobles, and peasants occupying different levels of power and status.封建制度是中世纪欧洲的一种等级社会结构,国王、贵族和农民在权力和地位上处于不同的层次。

Professional English for Reliability Engineering

可靠性工程质量专业英语词汇集Absolute deviation, 绝对离差Absolute number, 绝对数Absolute residuals, 绝对残差Acceleration array, 加速度立体阵Acceleration in an arbitrary direction, 任意方向上的加速度Acceleration normal, 法向加速度Acceleration space dimension, 加速度空间的维数Acceleration tangential, 切向加速度Acceleration vector, 加速度向量Acceptable hypothesis, 可接受假设Accumulation, 累积Accuracy, 准确度Actual frequency, 实际频数Adaptive estimator, 自适应估计量Addition, 相加Addition theorem, 加法定理Additivity, 可加性Adjusted rate, 调整率Adjusted value, 校正值Admissible error, 容许误差Aggregation, 聚集性Alternative hypothesis, 备择假设Among groups, 组间Amounts, 总量Analysis of correlation, 相关分析Analysis of covariance, 协方差分析Analysis of regression, 回归分析Analysis of time series, 时间序列分析Analysis of variance, 方差分析Angular transformation, 角转换ANOV A (analysis of variance), 方差分析ANOV A Models, 方差分析模型Arcing, 弧弧旋Arcsine transformation, 反正弦变换Area under the curve, 曲线面积AREG , 评估从一个时间点到下一个时间点回归相关时的误差ARIMA, 季节和非季节性单变量模型的极大似然估计Arithmetic grid paper, 算术格纸Arithmetic mean, 算术平均数Arrhenius relation, 艾恩尼斯关系Assessing fit, 拟合的评估Associative laws, 结合律Asymmetric distribution, 非对称分布Asymptotic bias, 渐近偏倚Asymptotic efficiency, 渐近效率Asymptotic variance, 渐近方差Attributable risk, 归因危险度Attribute data, 属性资料Attribution, 属性Autocorrelation, 自相关Autocorrelation of residuals, 残差的自相关Average, 平均数Average confidence interval length, 平均置信区间长度Average growth rate, 平均增长率Bar chart, 条形图Bar graph, 条形图Base period, 基期Bayes' theorem , Bayes定理Bell-shaped curve, 钟形曲线Bernoulli distribution, 伯努力分布Best-trim estimator, 最好切尾估计量Bias, 偏性Binary logistic regression, 二元逻辑斯蒂回归Binomial distribution, 二项分布Bisquare, 双平方Bivariate Correlate, 二变量相关Bivariate normal distribution, 双变量正态分布Bivariate normal population, 双变量正态总体Biweight interval, 双权区间Biweight M-estimator, 双权M估计量Block, 区组配伍组BMDP(Biomedical computer programs), BMDP统计软件包Boxplots, 箱线图箱尾图Breakdown bound, 崩溃界崩溃点Canonical correlation, 典型相关Caption, 纵标目Case-control study, 病例对照研究Categorical variable, 分类变量Catenary, 悬链线Cauchy distribution, 柯西分布Cause-and-effect relationship, 因果关系Cell, 单元Censoring, 终检Center of symmetry, 对称中心Centering and scaling, 中心化和定标Central tendency, 集中趋势Central value, 中心值CHAID -χ2 Automatic Interaction Detector, 卡方自动交互检测Chance, 机遇Chance error, 随机误差Chance variable, 随机变量Characteristic equation, 特征方程Characteristic root, 特征根Characteristic vector, 特征向量Chebshev criterion of fit, 拟合的切比雪夫准则Chernoff faces, 切尔诺夫脸谱图Chi-square test, 卡方检验χ2检验Choleskey decomposition, 乔洛斯基分解Circle chart, 圆图Class interval, 组距Class mid-value, 组中值Class upper limit, 组上限Classified variable, 分类变量Cluster analysis, 聚类分析Cluster sampling, 整群抽样Code, 代码Coded data, 编码数据Coding, 编码Coefficient of contingency, 列联系数Coefficient of determination, 决定系数Coefficient of multiple correlation, 多重相关系数Coefficient of partial correlation, 偏相关系数Coefficient of production-moment correlation, 积差相关系数Coefficient of rank correlation, 等级相关系数Coefficient of regression, 回归系数Coefficient of skewness, 偏度系数Coefficient of variation, 变异系数Cohort study, 队列研究Column, 列Column effect, 列效应Column factor, 列因素Combination pool, 合并Combinative table, 组合表Common factor, 共性因子Common regression coefficient, 公共回归系数Common value, 共同值Common variance, 公共方差Common variation, 公共变异Communality variance, 共性方差Comparability, 可比性Comparison of bathes, 批比较Comparison value, 比较值Compartment model, 分部模型Compassion, 伸缩Complement of an event, 补事件Complete association, 完全正相关Complete dissociation, 完全不相关Complete statistics, 完备统计量Completely randomized design, 完全随机化设计Composite event, 联合事件Composite events, 复合事件Concavity, 凹性Conditional expectation, 条件期望Conditional likelihood, 条件似然Conditional probability, 条件概率Conditionally linear, 
依条件线性Confidence interval, 置信区间Confidence limit, 置信限Confidence lower limit, 置信下限Confidence upper limit, 置信上限Confirmatory Factor Analysis , 验证性因子分析Confirmatory research, 证实性实验研究Confounding factor, 混杂因素Conjoint, 联合分析Consistency, 相合性Consistency check, 一致性检验Consistent asymptotically normal estimate, 相合渐近正态估计Consistent estimate, 相合估计Constrained nonlinear regression, 受约束非线性回归Constraint, 约束Contaminated distribution, 污染分布Contaminated Gausssian, 污染高斯分布Contaminated normal distribution, 污染正态分布Contamination, 污染Contamination model, 污染模型Contingency table, 列联表Contour, 边界线Contribution rate, 贡献率Control, 对照Controlled experiments, 对照实验Conventional depth, 常规深度Convolution, 卷积Corrected factor, 校正因子Corrected mean, 校正均值Correction coefficient, 校正系数Correctness, 正确性Correlation coefficient, 相关系数Correlation index, 相关指数Correspondence, 对应Counting, 计数Counts, 计数频数Covariance, 协方差Covariant, 共变Cox Regression, Cox回归Criteria for fitting, 拟合准则Criteria of least squares, 最小二乘准则Critical ratio, 临界比Critical region, 拒绝域Critical value, 临界值Cross-over design, 交叉设计Cross-section analysis, 横断面分析Cross-section survey, 横断面调查Crosstabs , 交叉表Cross-tabulation table, 复合表Cube root, 立方根Cumulative distribution function, 分布函数Cumulative probability, 累计概率Curvature, 曲率弯曲Curvature, 曲率Curve fit , 曲线拟和Curve fitting, 曲线拟合Curvilinear regression, 曲线回归Curvilinear relation, 曲线关系Cut-and-try method, 尝试法Cycle, 周期Cyclist, 周期性D test, D检验Data acquisition, 资料收集Data bank, 数据库Data capacity, 数据容量Data deficiencies, 数据缺乏Data handling, 数据处理Data manipulation, 数据处理Data processing, 数据处理Data reduction, 数据缩减Data set, 数据集Data sources, 数据来源Data transformation, 数据变换Data validity, 数据有效性Data-in, 数据输入Data-out, 数据输出Dead time, 停滞期Degree of freedom, 自由度Degree of precision, 精密度Degree of reliability, 可靠性程度Degression, 递减Density function, 密度函数Density of data points, 数据点的密度Dependent variable, 应变量依变量因变量Dependent variable, 因变量Depth, 深度Derivative matrix, 导数矩阵Derivative-free methods, 无导数方法Design, 设计Determinacy, 确定性Determinant, 行列式Determinant, 决定因素Deviation, 离差Deviation from average, 离均差Diagnostic plot, 诊断图Dichotomous variable, 二分变量Differential equation, 微分方程Direct standardization, 直接标准化法Discrete variable, 离散型变量DISCRIMINANT, 判断Discriminant analysis, 判别分析Discriminant coefficient, 判别系数Discriminant function, 判别值Dispersion, 散布分散度Disproportional, 不成比例的Disproportionate sub-class numbers, 不成比例次级组含量Distribution free, 分布无关性免分布Distribution shape, 分布形状Distribution-free method, 任意分布法Distributive laws, 分配律Disturbance, 随机扰动项Dose response curve, 剂量反应曲线Double blind method, 双盲法Double blind trial, 双盲试验Double exponential distribution, 双指数分布Double logarithmic, 双对数Downward rank, 降秩Dual-space plot, 对偶空间图DUD, 无导数方法Duncan's new multiple range method, 新复极差法Duncan新法E-LEffect, 实验效应Eigenvalue, 特征值Eigenvector, 特征向量Ellipse, 椭圆Empirical distribution, 经验分布Empirical probability, 经验概率单位Enumeration data, 计数资料Equal sun-class number, 相等次级组含量Equally likely, 等可能Equivariance, 同变性Error, 误差错误Error of estimate, 估计误差Error type I, 第一类错误Error type II, 第二类错误Estimand, 被估量Estimated error mean squares, 估计误差均方Estimated error sum of squares, 估计误差平方和Euclidean distance, 欧式距离Event, 事件Event, 事件Exceptional data point, 异常数据点Expectation plane, 期望平面Expectation surface, 期望曲面Expected values, 期望值Experiment, 实验Experimental sampling, 试验抽样Experimental unit, 试验单位Explanatory variable, 说明变量Exploratory data analysis, 探索性数据分析Explore Summarize, 探索-摘要Exponential curve, 指数曲线Exponential growth, 指数式增长EXSMOOTH, 指数平滑方法Extended fit, 扩充拟合Extra parameter, 附加参数Extrapolation, 外推法Extreme observation, 末端观测值Extremes, 极端值极值F distribution, F分布F test, F检验Factor, 因素因子Factor analysis, 
因子分析Factor Analysis, 因子分析Factor score, 因子得分Factorial, 阶乘Factorial design, 析因试验设计False negative, 假阴性False negative error, 假阴性错误Family of distributions, 分布族Family of estimators, 估计量族Fanning, 扇面Fatality rate, 病死率Field investigation, 现场调查Field survey, 现场调查Finite population, 有限总体Finite-sample, 有限样本First derivative, 一阶导数First principal component, 第一主成分First quartile, 第一四分位数Fisher information, 费雪信息量Fitted value, 拟合值Fitting a curve, 曲线拟合Fixed base, 定基Fluctuation, 随机起伏Forecast, 预测Four fold table, 四格表Fourth, 四分点Fraction blow, 左侧比率Fractional error, 相对误差Frequency, 频率Frequency polygon, 频数多边图Frontier point, 界限点Function relationship, 泛函关系Gamma distribution, 伽玛分布Gauss increment, 高斯增量Gaussian distribution, 高斯分布正态分布Gauss-Newton increment, 高斯-牛顿增量General census, 全面普查GENLOG (Generalized liner models), 广义线性模型Geometric mean, 几何平均数Gini's mean difference, 基尼均差GLM (General liner models), 通用线性模型Goodness of fit, 拟和优度配合度Gradient of determinant, 行列式的梯度Graeco-Latin square, 希腊拉丁方Grand mean, 总均值Gross errors, 重大错误Gross-error sensitivity, 大错敏感度Group averages, 分组平均Grouped data, 分组资料Guessed mean, 假定平均数Half-life, 半衰期Hampel M-estimators, 汉佩尔M估计量Happenstance, 偶然事件Harmonic mean, 调和均数Hazard function, 风险均数Hazard rate, 风险率Heading, 标目Heavy-tailed distribution, 重尾分布Hessian array, 海森立体阵Heterogeneity, 不同质Heterogeneity of variance, 方差不齐Hierarchical classification, 组内分组Hierarchical clustering method, 系统聚类法High-leverage point, 高杠杆率点HILOGLINEAR, 多维列联表的层次对数线性模型Hinge, 折叶点Histogram, 直方图Historical cohort study, 历史性队列研究Holes, 空洞HOMALS, 多重响应分析Homogeneity of variance, 方差齐性Homogeneity test, 齐性检验Huber M-estimators, 休伯M估计量Hyperbola, 双曲线Hypothesis testing, 假设检验Hypothetical universe, 假设总体Impossible event, 不可能事件Independence, 独立性Independent variable, 自变量Index, 指标指数Indirect standardization, 间接标准化法Individual, 个体Inference band, 推断带Infinite population, 无限总体Infinitely great, 无穷大Infinitely small, 无穷小Influence curve, 影响曲线Information capacity, 信息容量Initial condition, 初始条件Initial estimate, 初始估计值Initial level, 最初水平Interaction, 交互作用Interaction terms, 交互作用项Intercept, 截距Interpolation, 内插法Interquartile range, 四分位距Interval estimation, 区间估计Intervals of equal probability, 等概率区间Intrinsic curvature, 固有曲率Invariance, 不变性Inverse matrix, 逆矩阵Inverse probability, 逆概率Inverse sine transformation, 反正弦变换Iteration, 迭代Jacobian determinant, 雅可比行列式Joint distribution function, 分布函数Joint probability, 联合概率Joint probability distribution, 联合概率分布K means method, 逐步聚类法Kaplan-Meier, 评估事件的时间长度Kaplan-Merier chart, Kaplan-Merier图Kendall's rank correlation, Kendall等级相关Kinetic, 动力学Kolmogorov-Smirnove test, 柯尔莫哥洛夫-斯米尔诺夫检验Kruskal and Wallis test, Kruskal及Wallis检验多样本的秩和检验H检验Kurtosis, 峰度Lack of fit, 失拟Ladder of powers, 幂阶梯Lag, 滞后Large sample, 大样本Large sample test, 大样本检验Latin square, 拉丁方Latin square design, 拉丁方设计Leakage, 泄漏Least favorable configuration, 最不利构形Least favorable distribution, 最不利分布Least significant difference, 最小显著差法Least square method, 最小二乘法Least-absolute-residuals estimates, 最小绝对残差估计Least-absolute-residuals fit, 最小绝对残差拟合Least-absolute-residuals line, 最小绝对残差线Legend, 图例L-estimator, L估计量L-estimator of location, 位置L估计量L-estimator of scale, 尺度L估计量Level, 水平Life expectance, 预期期望寿命Life table, 寿命表Life table method, 生命表法Light-tailed distribution, 轻尾分布Likelihood function, 似然函数Likelihood ratio, 似然比line graph, 线图Linear correlation, 直线相关Linear equation, 线性方程Linear programming, 线性规划Linear regression, 直线回归Linear Regression, 线性回归Linear trend, 线性趋势Loading, 载荷Location and scale equivariance, 位置尺度同变性Location equivariance, 位置同变性Location invariance, 位置不变性Location scale family, 位置尺度族Log rank test, 时序检验Logarithmic 
curve, 对数曲线Logarithmic normal distribution, 对数正态分布Logarithmic scale, 对数尺度Logarithmic transformation, 对数变换Logic check, 逻辑检查Logistic distribution, 逻辑斯特分布Logit transformation, Logit转换LOGLINEAR, 多维列联表通用模型Lognormal distribution, 对数正态分布Lost function, 损失函数Low correlation, 低度相关Lower limit, 下限Lowest-attained variance, 最小可达方差LSD, 最小显著差法的简称Lurking variable, 潜在变量M-RMain effect, 主效应Major heading, 主辞标目Marginal density function, 边缘密度函数Marginal probability, 边缘概率Marginal probability distribution, 边缘概率分布Matched data, 配对资料Matched distribution, 匹配过分布Matching of distribution, 分布的匹配Matching of transformation, 变换的匹配Mathematical expectation, 数学期望Mathematical model, 数学模型Maximum L-estimator, 极大极小L 估计量Maximum likelihood method, 最大似然法Mean, 均数Mean squares between groups, 组间均方Mean squares within group, 组内均方Means (Compare means), 均值-均值比较Median, 中位数Median effective dose, 半数效量Median lethal dose, 半数致死量Median polish, 中位数平滑Median test, 中位数检验Minimal sufficient statistic, 最小充分统计量Minimum distance estimation, 最小距离估计Minimum effective dose, 最小有效量Minimum lethal dose, 最小致死量Minimum variance estimator, 最小方差估计量MINITAB, 统计软件包Minor heading, 宾词标目Missing data, 缺失值Model specification, 模型的确定Modeling Statistics , 模型统计Models for outliers, 离群值模型Modifying the model, 模型的修正Modulus of continuity, 连续性模Morbidity, 发病率Most favorable configuration, 最有利构形Multidimensional Scaling (ASCAL), 多维尺度多维标度Multinomial Logistic Regression , 多项逻辑斯蒂回归Multiple comparison, 多重比较Multiple correlation , 复相关Multiple covariance, 多元协方差Multiple linear regression, 多元线性回归Multiple response , 多重选项Multiple solutions, 多解Multiplication theorem, 乘法定理Multiresponse, 多元响应Multi-stage sampling, 多阶段抽样Multivariate T distribution, 多元T分布Mutual exclusive, 互不相容Mutual independence, 互相独立Natural boundary, 自然边界Natural dead, 自然死亡Natural zero, 自然零Negative correlation, 负相关Negative linear correlation, 负线性相关Negatively skewed, 负偏Newman-Keuls method, q检验NK method, q检验No statistical significance, 无统计意义Nominal variable, 名义变量Nonconstancy of variability, 变异的非定常性Nonlinear regression, 非线性相关Nonparametric statistics, 非参数统计Nonparametric test, 非参数检验Nonparametric tests, 非参数检验Normal deviate, 正态离差Normal distribution, 正态分布Normal equation, 正规方程组Normal ranges, 正常范围Normal value, 正常值Nuisance parameter, 多余参数讨厌参数Null hypothesis, 无效假设Numerical variable, 数值变量Objective function, 目标函数Observation unit, 观察单位Observed value, 观察值One sided test, 单侧检验One-way analysis of variance, 单因素方差分析Oneway ANOV A , 单因素方差分析Open sequential trial, 开放型序贯设计Optrim, 优切尾Optrim efficiency, 优切尾效率Order statistics, 顺序统计量Ordered categories, 有序分类Ordinal logistic regression , 序数逻辑斯蒂回归Ordinal variable, 有序变量Orthogonal basis, 正交基Orthogonal design, 正交试验设计Orthogonality conditions, 正交条件ORTHOPLAN, 正交设计Outlier cutoffs, 离群值截断点Outliers, 极端值OVERALS , 多组变量的非线性正规相关Overshoot, 迭代过度Paired design, 配对设计Paired sample, 配对样本Pairwise slopes, 成对斜率Parabola, 抛物线Parallel tests, 平行试验Parameter, 参数Parametric statistics, 参数统计Parametric test, 参数检验Partial correlation, 偏相关Partial regression, 偏回归Partial sorting, 偏排序Partials residuals, 偏残差Pattern, 模式Pearson curves, 皮尔逊曲线Peeling, 退层Percent bar graph, 百分条形图Percentage, 百分比Percentile, 百分位数Percentile curves, 百分位曲线Periodicity, 周期性Permutation, 排列P-estimator, P估计量Pie graph, 饼图Pitman estimator, 皮特曼估计量Pivot, 枢轴量Planar, 平坦Planar assumption, 平面的假设PLANCARDS, 生成试验的计划卡Point estimation, 点估计Poisson distribution, 泊松分布Polishing, 平滑Polled standard deviation, 合并标准差Polled variance, 合并方差Polygon, 多边图Polynomial, 多项式Polynomial curve, 多项式曲线Population, 总体Population attributable risk, 人群归因危险度Positive correlation, 正相关Positively skewed, 正偏Posterior distribution, 后验分布Power of a 
test, 检验效能Precision, 精密度Predicted value, 预测值Preliminary analysis, 预备性分析Principal component analysis, 主成分分析Prior distribution, 先验分布Prior probability, 先验概率Probabilistic model, 概率模型probability, 概率Probability density, 概率密度Product moment, 乘积矩协方差Profile trace, 截面迹图Proportion, 比构成比Proportion allocation in stratified random sampling, 按比例分层随机抽样Proportionate, 成比例Proportionate sub-class numbers, 成比例次级组含量Prospective study, 前瞻性调查Proximities, 亲近性Pseudo F test, 近似F检验Pseudo model, 近似模型Pseudosigma, 伪标准差Purposive sampling, 有目的抽样QR decomposition, QR分解Quadratic approximation, 二次近似Qualitative classification, 属性分类Qualitative method, 定性方法Quantile-quantile plot, 分位数-分位数图Q-Q图Quantitative analysis, 定量分析Quartile, 四分位数Quick Cluster, 快速聚类Radix sort, 基数排序Random allocation, 随机化分组Random blocks design, 随机区组设计Random event, 随机事件Randomization, 随机化Range, 极差全距Rank correlation, 等级相关Rank sum test, 秩和检验Rank test, 秩检验Ranked data, 等级资料Rate, 比率Ratio, 比例Raw data, 原始资料Raw residual, 原始残差Rayleigh's test, 雷氏检验Rayleigh's Z, 雷氏Z值Reciprocal, 倒数Reciprocal transformation, 倒数变换Recording, 记录Redescending estimators, 回降估计量Reducing dimensions, 降维Re-expression, 重新表达Reference set, 标准组Region of acceptance, 接受域Regression coefficient, 回归系数Regression sum of square, 回归平方和Rejection point, 拒绝点Relative dispersion, 相对离散度Relative number, 相对数Reliability, 可靠性Reparametrization, 重新设置参数Replication, 重复Report Summaries, 报告摘要Residual sum of square, 剩余平方和Resistance, 耐抗性Resistant line, 耐抗线Resistant technique, 耐抗技术R-estimator of location, 位置R估计量R-estimator of scale, 尺度R估计量Retrospective study, 回顾性调查Ridge trace, 岭迹Ridit analysis, Ridit分析Rotation, 旋转Rounding, 舍入Row, 行Row effects, 行效应Row factor, 行因素RXC table, RXC表S-ZSample, 样本Sample regression coefficient, 样本回归系数Sample size, 样本量Sample standard deviation, 样本标准差Sampling error, 抽样误差SAS(Statistical analysis system ), SAS统计软件包Scale, 尺度量表Scatter diagram, 散点图Schematic plot, 示意图简图Score test, 计分检验Screening, 筛检SEASON, 季节分析Second derivative, 二阶导数Second principal component, 第二主成分SEM (Structural equation modeling), 结构化方程模型Semi-logarithmic graph, 半对数图Semi-logarithmic paper, 半对数格纸Sensitivity curve, 敏感度曲线Sequential analysis, 贯序分析Sequential data set, 顺序数据集Sequential design, 贯序设计Sequential method, 贯序法Sequential test, 贯序检验法Serial tests, 系列试验Short-cut method, 简捷法Sigmoid curve, S形曲线Sign function, 正负号函数Sign test, 符号检验Signed rank, 符号秩Significance test, 显著性检验Significant figure, 有效数字Simple cluster sampling, 简单整群抽样Simple correlation, 简单相关Simple random sampling, 简单随机抽样Simple regression, 简单回归simple table, 简单表Sine estimator, 正弦估计量Single-valued estimate, 单值估计Singular matrix, 奇异矩阵Skewed distribution, 偏斜分布Skewness, 偏度Slash distribution, 斜线分布Slope, 斜率Smirnov test, 斯米尔诺夫检验Source of variation, 变异来源Spearman rank correlation, 斯皮尔曼等级相关Specific factor, 特殊因子Specific factor variance, 特殊因子方差Spectra , 频谱Spherical distribution, 球型正态分布Spread, 展布SPSS(Statistical package for the social science), SPSS统计软件包Spurious correlation, 假性相关Square root transformation, 平方根变换Stabilizing variance, 稳定方差Standard deviation, 标准差Standard error, 标准误Standard error of difference, 差别的标准误Standard error of estimate, 标准估计误差Standard error of rate, 率的标准误Standard normal distribution, 标准正态分布Standardization, 标准化Starting value, 起始值Statistic, 统计量Statistical control, 统计控制Statistical graph, 统计图Statistical inference, 统计推断Statistical table, 统计表Steepest descent, 最速下降法Stem and leaf display, 茎叶图Step factor, 步长因子Stepwise regression, 逐步回归Storage, 存Strata, 层(复数)Stratified sampling, 分层抽样Stratified sampling, 分层抽样Strength, 强度Stringency, 严密性Structural relationship, 结构关系Studentized residual, 学生化残差t化残差Sub-class 
numbers, 次级组含量Subdividing, 分割Sufficient statistic, 充分统计量Sum of products, 积和Sum of squares, 离差平方和Sum of squares about regression, 回归平方和Sum of squares between groups, 组间平方和Sum of squares of partial regression, 偏回归平方和Sure event, 必然事件Survey, 调查Survival, 生存分析Survival rate, 生存率Suspended root gram, 悬吊根图Symmetry, 对称Systematic error, 系统误差Systematic sampling, 系统抽样Tags, 标签Tail area, 尾部面积Tail length, 尾长Tail weight, 尾重Tangent line, 切线Target distribution, 目标分布Taylor series, 泰勒级数Tendency of dispersion, 离散趋势Testing of hypotheses, 假设检验Theoretical frequency, 理论频数Time series, 时间序列Tolerance interval, 容忍区间Tolerance lower limit, 容忍下限Tolerance upper limit, 容忍上限Torsion, 扰率Total sum of square, 总平方和Total variation, 总变异Transformation, 转换Treatment, 处理Trend, 趋势Trend of percentage, 百分比趋势Trial, 试验Trial and error method, 试错法Tuning constant, 细调常数Two sided test, 双向检验Two-stage least squares, 二阶最小平方Two-stage sampling, 二阶段抽样Two-tailed test, 双侧检验Two-way analysis of variance, 双因素方差分析Two-way table, 双向表Type I error, 一类错误α错误Type II error, 二类错误β错误UMVU, 方差一致最小无偏估计简称Unbiased estimate, 无偏估计Unconstrained nonlinear regression , 无约束非线性回归Unequal subclass number, 不等次级组含量Ungrouped data, 不分组资料Uniform coordinate, 均匀坐标Uniform distribution, 均匀分布Uniformly minimum variance unbiased estimate, 方差一致最小无偏估计Unit, 单元Unordered categories, 无序分类Upper limit, 上限Upward rank, 升秩Vague concept, 模糊概念Validity, 有效性V ARCOMP (Variance component estimation), 方差元素估计Variability, 变异性Variable, 变量Variance, 方差Variation, 变异Varimax orthogonal rotation, 方差最大正交旋转V olume of distribution, 容积W test, W检验Weibull distribution, 威布尔分布Weight, 权数Weighted Chi-square test, 加权卡方检验Cochran检验Weighted linear regression method, 加权直线回归Weighted mean, 加权平均数Weighted mean square, 加权平均方差Weighted sum of square, 加权平方和Weighting coefficient, 权重系数Weighting method, 加权法W-estimation, W估计量W-estimation of location, 位置W估计量Width, 宽度Wilcoxon paired test, 威斯康星配对法配对符号秩和检验Wild point, 野点狂点Wild value, 野值狂值Winsorized mean, 缩尾均值Withdraw, 失访Youden's index, 尤登指数Z test, Z检验Zero correlation, 零相关Z-transformation, Z变换。

English Translations of Statistics Vocabulary

统计学专业词汇英语翻译H?lder不等式H?1der's inequalityHaldane差异测度Haldane's discrepancy measure半常态分布;半常态分配half normal distribution半常态描点图half normal plots半柯西分布;半柯西分配half-Cauchy distribution半不变式(量) half-invariant半常态机率纸half-normal probability paper半周期half-period半格子方阵half-plaid square半重复设计half-replicate design半宽度half-widthHammersley-Chapman-Robbins不等式Hammersley-Chapman-Robbins inequalityHardy总和法Hardy summation methodHardy公式Hardy's formulaHardy-Weinberg比例式Hardy-Weinberg proportions Harley逼近Harley approximation调和分析harmonic analysis调和分布;调和分配harmonic distribution调和平均数harmonic mean调和过程harmonic process国际商品统一分类Harmonized Commodity Description and Coding SystemHarris漫步Harris walkHarrison法Harrison's methodHartley检定Hartley's testHartley-Rao方案Hartley-Rao scheme故障分析hazard analysis故障函数hazard function故障图hazard plots故障率(分布) hazard rate (distribution)卫生统计health statistics重尾分布heavy tail distributionHellinger距离Hellinger distanceHelly第一定理Helly's first theoremHelly-Bray定理Helly-Bray theoremsHelmert准则Helmert criterionHelmert分布;Helmert分配Helmert distribution Helmert变换Helmert transformationHermite分布;Hermite分配Hermite distributions Hess矩阵Hessian matrix异偏态的heteroclitic异质性;非均齐性heterogeneity异质母体heterogeneous population异质层heterogeneous strata属量变数heterograde异峰度的heterokurtic不等变异的heteroscedastic不等变异数模型heteroscedastic model不等变异heteroscedastic variation不等变异性heteroscedasticity异质型的heterotypic潜伏周期hidden periodicity层次生灭过程hierarchical birth and death process 层次分类hierarchical classification层次可分组设计hierarchical group divisible design 层次hierarchy高度相关high correlation高低图high-low graph高阶自身回归方案higher order autoregressive scheme高阶混同higher order confounding高阶滞后结构higher order lag structures高阶递移矩阵higher transition matrix最高事后密度区间highest posterior density intervals登山法hill-climbing method直方图;矩形图histogram命中点hitting pointHodges二变量符号检定Hodges bivariate sign testHodges-Ajne检定Hodges-Ajne's testHodges-Lehmann估计值Hodges-Lehmann estimate Hodges-Lehmann估计量Hodges-Lehmann estimator Hodges-Lehmann单样本估计量Hodges-Lehmannone-sample estimatorHoeffding独立性检定Hoeffding's independence test Hoeffding不等式Hoeffding's inequalityHollander二变量对称性检定Hollander's bivariate symmetry testHollander平行性检定Hollander's parallelism test Hollander-Proschan「新比旧佳」检定Hollander-Proschan ' new better than used' testHolt法Holt's method同偏态的homoclitic均齐性;同质性homogeneity实验条件的均齐性homogeneity of experimental conditions变异数的均齐性homogeneity of variances均齐的;齐次的;同质的homogeneous均齐过程homogeneous process均齐层homogeneous stratum均齐递移机率homogeneous transition probability属性变数homograde等峰度的homokurtic等变异的homoscedastic等变异数模型homoscedastic model均齐变异homoscedastic variation等变异性homoscedasticity正当过程honest process横条图horizontal bar chart横尺度horizontal scaleHorvitz-Thompson估计量Horvitz-Thompson estimator Hotelling的T统计量Hotelling's T statisticHotelling的T2统计量Hotelling's T2 statisticHotelling检定(相依相关) Hotelling's test (depen dent correlations)Hotelling-Pabst检定Hotelling-Pabst test住户household住宅单位housing unitHuber估计值Huber estimateHuber损失函数Huber loss function人体工学;人因工程human engineering人为误差human error潮湿试验humidity test高峰hump百分比条图hundred-percent bar chart全数检验hundred-percent inspectionHunt-Stein定理Hunt-Stein theorem混合比率hybrid ratio混合制hybrid system超方格hyper square超希腊拉丁方格hyper-Graeco Latin square超卜瓦松分布;超卜瓦松分配hyper-Poisson distribution超球面常态分布;超球面常态分配hyper-spherical normal distribution双曲正割分布;双曲正割分配hyperbolic secant distribution 超立方体hypercube超几何分布;超几何分布hypergeometric distribution超几何机率hypergeometric probability超几何数列hypergeometric series超常态离势hypernormal dispersion超常态分布;超常态分配hypernormal distribution超常态性hypernormality假设;拟说hypothesis假设检定hypothesis 
testing统计假设hypothesis;statistical假设母体hypothetical population假设全域hypothetical universe。

Overview of Deep Learning

Deep learning is a relatively new area of machine learning research. Its motivation is to build neural networks that simulate the analytical learning of the human brain; such networks imitate the brain's mechanisms to interpret data such as images, sound and text.

As with other machine learning methods, deep learning methods are divided into supervised and unsupervised learning, and the models built under different learning frameworks differ considerably. For example, convolutional neural networks (CNNs) are deep models trained under supervision, whereas Deep Belief Nets (DBNs) are models trained without supervision.

The concept of deep learning originated from research on artificial neural networks.

A multilayer perceptron with multiple hidden layers is one example of a deep learning architecture.

Deep learning combines low-level features to form more abstract high-level representations of attribute categories or features, in order to discover distributed feature representations of data.

[2] The concept of deep learning was proposed by Hinton et al. in 2006.

They proposed an unsupervised greedy layer-wise training algorithm based on the Deep Belief Net (DBN), which brought hope for solving the optimization problems associated with deep architectures, and subsequently proposed the multilayer auto-encoder as another deep architecture.

In addition, the convolutional neural network proposed by LeCun et al. was the first true multilayer structured learning algorithm; it exploits spatial locality to reduce the number of parameters and improve training performance.

[2] Basic concepts. Depth: the computation involved in producing an output from an input can be represented by a flow graph, a graph in which each node represents an elementary computation and a value (the result of the computation applied to the values of that node's children).

Consider the set of computations allowed at each node together with the possible graph structures; these define a family of functions.

Input nodes have no children and output nodes have no parents.

A particular property of such a flow graph is its depth: the length of the longest path from an input to an output.

A traditional feedforward neural network can be regarded as having a depth equal to its number of layers (for example, the number of hidden layers plus one for the output layer).

SVMs have depth 2: one level corresponds to the kernel outputs or the feature space, and the other to the linear combination that produces the output.
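To make the notion of depth concrete, here is a minimal sketch that computes the depth of a flow graph as the length of the longest input-to-output path. The tiny graph is hypothetical and is not taken from the text above; it follows the convention used there, where a node's value is computed from the values of its children.

```python
from functools import lru_cache

# Each node maps to the nodes whose values it consumes (its "children").
graph = {
    "x1": [], "x2": [],        # input nodes have no children
    "h1": ["x1", "x2"],        # hidden computations
    "h2": ["h1"],
    "y":  ["h2"],              # output node
}

@lru_cache(maxsize=None)
def depth(node: str) -> int:
    """Length of the longest path from an input node to `node`."""
    children = graph[node]
    if not children:           # an input node
        return 0
    return 1 + max(depth(c) for c in children)

print(depth("y"))  # 3, matching a feedforward net with two hidden layers plus the output layer
```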

Tencent Dialogue Robot

Knowledge
Understanding
Generation
Planning
• Structured • Unstructured • Real world
• Annotation • Semantics • Matching
User Interests
• Predefined ontology • Automatically extracted tags • User behavior based user interests • …
Technology
Recommendation system
News characteristics
Environmental characteristics
User characteristics
Context characteristics
Article score: Score(u,d) = f(class, topic, tag, time, …)
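As an illustration of the scoring formula above, the following sketch combines a few user, article and context features linearly. The feature names and weights are invented for illustration only and are not Tencent's actual model.

```python
def article_score(features: dict, weights: dict) -> float:
    """Simple linear combination of user/article/context features."""
    return sum(weights.get(name, 0.0) * value for name, value in features.items())

# Hypothetical features for one (user, article) pair and hand-picked weights.
features = {"class_match": 1.0, "topic_match": 0.7, "tag_match": 0.4, "freshness": 0.9}
weights  = {"class_match": 2.0, "topic_match": 1.5, "tag_match": 1.0, "freshness": 0.5}
print(article_score(features, weights))
```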
Linear Model
Shallow CNN of (J&Z 15)
Deep Pyramid CNN (J&Z 17)
Example: Tencent Vertical Search Applications
Internet
Mobile
Science
Jack Ma
Robin Li
iPhone
NASA
Basketball
Kobe
Lakers
User
Classification + tag
Sport

Survey of Deep Learning for Text Recognition

Deep learning is a new area of machine learning research whose motivation is to build neural networks that simulate the analytical learning of the human brain; it imitates the brain's mechanisms to interpret data such as images, sound and text.

Deep learning, which can also be applied in unsupervised settings, adopts the layered structure of neural networks: the system is a multilayer network composed of an input layer, (multiple) hidden layers and an output layer; only nodes in adjacent layers are connected, while nodes within the same layer or across non-adjacent layers are not.

By building a hierarchical model structure analogous to the human brain, deep learning extracts features from the input data level by level, from low to high, and thus establishes a good mapping from low-level signals to high-level semantics.

In recent years, technology companies with large amounts of data, such as Google, Microsoft and Baidu, have invested heavily in deep learning research and development and have made significant progress in speech, image, natural language and online advertising applications.

In terms of contribution to practical applications, deep learning has arguably been the most successful research direction in machine learning over the past decade.

Deep learning models have not only greatly improved the accuracy of image recognition but have also removed the need for time-consuming manual feature extraction, greatly improving online computational efficiency.

For text localization, the paper "Thai Text Localization in Natural Scene Images using Convolutional Neural Network" mainly uses a CNN to classify text in natural scenes and applies post-processing based on the characteristics of Thai script to obtain more precise localization.

As shown in Figure 1, the CNN consists of an input layer, two convolutional layers, two subsampling layers and one fully connected layer, and its output is a two-class vector: text or non-text.
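A minimal PyTorch sketch of the architecture just described is given below. The patch size, channel counts and kernel sizes are assumptions rather than the paper's exact settings.

```python
import torch
import torch.nn as nn

class TextPatchCNN(nn.Module):
    """Input, two convolutional layers, two subsampling layers, one fully connected layer."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=5), nn.ReLU(), nn.MaxPool2d(2),    # conv1 + subsample1
            nn.Conv2d(8, 16, kernel_size=5), nn.ReLU(), nn.MaxPool2d(2),   # conv2 + subsample2
        )
        self.classifier = nn.Linear(16 * 5 * 5, 2)   # two-way output: text vs. non-text

    def forward(self, x):            # x: (batch, 1, 32, 32) grayscale patches
        x = self.features(x)
        x = torch.flatten(x, 1)
        return self.classifier(x)

model = TextPatchCNN()
logits = model(torch.randn(4, 1, 32, 32))
print(logits.shape)                  # torch.Size([4, 2])
```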

Figure 1: the CNN model. The main idea of the paper is to split images into patches for training, using manually labeled samples so that the network learns to distinguish text from non-text.

Because labeled samples are scarce, the paper generates a training set from existing fonts, including adding random backgrounds, varying the font style and applying filters.

Figure 2 shows the generated Thai character samples; during labeling, both half characters and whole characters are marked as text, which increases the network's recognition rate for text.

Figure 2: the training sample set. When the trained network is used for text localization, the grouping method adopted in the paper exploits the characteristics of Thai script; Figure 3 shows the initial localization of text in an image, where the marked regions are those the network recognizes as text.


A Survey of Hierarchical Text Classification

Hierarchical Text Classification (HTC) is a special case of Multi-Label Text Classification (MLC) in which the predicted labels correspond to one or more nodes in a classification hierarchy.

The following is a brief survey of hierarchical text classification.

1. Background. Hierarchical text classification is widely used in information retrieval, document organization, sentiment analysis and other areas.

However, because of the complex structure of the label hierarchy, hierarchical text classification is a challenging task.

When handling hierarchical text classification, existing methods often ignore the semantic relationship between the text and the labels and cannot fully exploit the hierarchical information.

2. Existing methods. Local classifier (chain) methods: these methods solve the hierarchical classification problem by training a series of local classifiers.

Each local classifier is responsible for one node in the hierarchy; a document is passed level by level to the classifiers of the next level, realizing hierarchical classification (a minimal sketch is given below).

However, this approach ignores the dependencies between labels and may lead to error propagation.
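The following sketch illustrates the local-classifier-per-node idea using scikit-learn. The tiny hierarchy and training texts are invented for illustration; it is not a production HTC system.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# A toy hierarchy: each node lists its child categories.
children = {"root": ["sports", "finance"], "sports": ["soccer", "tennis"], "finance": []}
# Per-node (positive, negative) training texts, all made up.
train = {
    "sports":  (["great match today", "the league results"], ["stock markets fell", "bond yields rose"]),
    "finance": (["stock markets fell", "bond yields rose"], ["great match today", "the league results"]),
    "soccer":  (["the league results"], ["tennis open final"]),
    "tennis":  (["tennis open final"], ["the league results"]),
}

models = {}
for node, (pos, neg) in train.items():
    vec = TfidfVectorizer().fit(pos + neg)
    clf = LogisticRegression().fit(vec.transform(pos + neg), [1] * len(pos) + [0] * len(neg))
    models[node] = (vec, clf)

def classify(doc: str, node: str = "root") -> list:
    """Pass the document down the tree; keep every node whose local classifier accepts it."""
    accepted = []
    for child in children.get(node, []):
        vec, clf = models[child]
        if clf.predict(vec.transform([doc]))[0] == 1:
            accepted.append(child)
            accepted += classify(doc, child)   # recurse only into accepted branches
    return accepted

print(classify("league results and a great match"))
```

Note that the recursion stops at any node whose classifier rejects the document, which is exactly where the error-propagation problem mentioned above arises.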

Global classifier methods: these methods attempt to train a single unified model over the entire classification hierarchy.

They typically use graphical models or structured-output learning to model the dependencies between labels.

However, global classifier methods may face high computational complexity when handling large-scale hierarchies.

3. Challenges. Label dependency modeling: in hierarchical text classification there are complex dependencies between labels.

How to model these dependencies effectively is a challenging problem.

Text feature extraction: extracting text features that are relevant to the hierarchical structure is key to hierarchical text classification.

Existing methods often ignore the semantic relationship between text and labels when dealing with this problem.

Computational efficiency: global classifier methods may be computationally expensive on large-scale hierarchies, so improving computational efficiency is a problem that needs to be solved.

4. Future directions. Deep learning methods: deep learning has achieved remarkable results in natural language processing, and future work can explore how to apply it to hierarchical text classification.

For example, neural networks can be used to model label dependencies while extracting text features that are relevant to the hierarchical structure.

Transfer learning methods: transfer learning uses knowledge learned from one task to help solve another related task.

In hierarchical text classification, one can explore how to transfer existing classification knowledge to a new hierarchy and thereby improve classification performance.

Military Computer Security Evaluation Criteria

National Military Standard of the People's Republic of China: Military computer security evaluation criteria, GJB2646-96 (issued by the Commission of Science, Technology and Industry for National Defense on June 4, 1996; effective December 1, 1996). 1 Scope. 1.1 Subject matter: this standard specifies the criteria for evaluating computer security, the division into levels, and the security requirements of each level.

1.2 Applicability: this standard applies to the security evaluation of military computers; it is mainly aimed at operating systems but also applies to other computers that require security evaluation.

2 Referenced documents: GJB2255-95, Military computer security terminology. 3 Definitions. 3.1 Terms: for terms not listed in this chapter, see GJB2255.

3.1.1 Discretionary protection: an access control method that identifies users and their needs and restricts the users' access to information.

3.1.2 Mandatory access control: a method of restricting a subject's access to an object according to the sensitivity of the information contained in the object and the subject's authorization to access information of that sensitivity.

3.1.3 Security level: a combination or set of hierarchical divisions of information by degree of confidentiality, used to represent the different sensitivities of information.

3.1.4 Audit: a procedure that records the various activities affecting system security and provides the system security officer with a basis for security management.

3.1.5 Isolation: storing the operating system, user programs and data files independently of one another in order to prevent unauthorized access by other users or programs.

3.1.6 Trusted computing base (TCB): the totality of protection mechanisms within a computer system, including hardware, firmware and software, together with the combination responsible for enforcing the security policy.

It establishes a basic protection environment and provides the additional user services required of a trusted computing system.

3.1.7 Sensitivity label: a set of information that indicates the security level of an object and describes the sensitivity of the object's data; the trusted computing base uses sensitivity labels as the basis for mandatory access control decisions.
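As an illustration of how a sensitivity label can drive a mandatory access control decision, the sketch below implements a simple label-dominance check. The level names and compartments are hypothetical and are not taken from GJB2646-96.

```python
LEVELS = {"unclassified": 0, "confidential": 1, "secret": 2, "top-secret": 3}

def dominates(subject_label, object_label) -> bool:
    """A subject label dominates an object label if its level is at least as high
    and it holds every compartment (category) attached to the object."""
    s_level, s_cats = subject_label
    o_level, o_cats = object_label
    return LEVELS[s_level] >= LEVELS[o_level] and o_cats <= s_cats

subject  = ("secret", {"ops", "logistics"})          # hypothetical subject clearance
document = ("confidential", {"ops"})                 # hypothetical object label
print(dominates(subject, document))                  # True: read access may be granted
```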

3.1.8 System integrity: the property that a system cannot be corrupted or modified by unauthorized means.

Data Classification and Hierarchical Management

Data classification and hierarchical management is an essential aspect of information security in today's digital age.

It involves categorizing data based on its sensitivity, importance, and confidentiality, and then securely storing and managing it according to its level of classification.
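As a simple illustration of managing data according to its classification level, the sketch below maps levels to handling policies. The levels and policies are invented for illustration and are not part of any particular standard.

```python
POLICIES = {
    "public":       {"encrypt": False, "access": "all-staff"},
    "internal":     {"encrypt": False, "access": "employees"},
    "confidential": {"encrypt": True,  "access": "need-to-know"},
    "restricted":   {"encrypt": True,  "access": "named-individuals"},
}

def handling_policy(level: str) -> dict:
    """Look up the storage and access rules for a record's classification level."""
    return POLICIES[level]

record = {"id": 42, "level": "confidential", "body": "quarterly forecast"}
print(handling_policy(record["level"]))
```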

By implementing a robust data classification and hierarchical management system, organizations can better protect their sensitive information from unauthorized access, breaches, and other security threats.

One of the key benefits of data classification and hierarchical management is that it provides a clear framework for organizing and structuring data within an organization.

The English Word "class"

Word: class. 1. Definition. 1.1 Part of speech: noun, verb. 1.2 Meaning: as a noun it denotes a social class, grade, school class, lesson, etc.; as a verb it means to classify or grade something.

1.3 English explanation: As a noun, it can refer to a system of ordering society into hierarchical groups (class), a group of students taught together (class), or a period of time during which students are taught a particular subject (class). As a verb, it means to arrange or group according to quality, rank, or grade. 1.4 Related words: synonyms (noun): grade, rank, category; derivatives: classy (adjective: stylish, high-class), classless (adjective: without class distinctions), classification (noun: the act or result of classifying). 2. Origin and background. 2.1 Etymology: from Latin "classis", which originally denoted a division of Roman citizens, in particular a division by property, connected with military conscription.

2.2 Note: in Britain, social class historically had very clear divisions, with large differences between classes in dress, accent, residential area and social activities.

For example, the upper class often had a distinctive aristocratic accent and lived in grand country estates, while the working class mostly lived in particular urban areas and did manual work.

3. Common collocations and phrases. 3.1 Phrases: classmate. Example: My classmate Tom is very good at math.

classroom. Example: The teacher is waiting for us in the classroom.

Design of a Hierarchical, Differentiated Training Program for Primary and Secondary School English Teachers
Secondary school English teachers are responsible for further developing students' language skills and introducing more complex texts and literary works. They need to have a deep understanding of language structure and features, as well as the ability to guide students in their critical thinking and analysis of texts.
Classroom Management and Lesson Planning
Train junior teachers in effective classroom management techniques and how to plan engaging lessons
Teaching Complex Grammar Structures
Discusses effective methods for assessing student progress and providing feedback.
Teaching Advanced Writing Skills
Focus on helping students develop strong essay-writing skills, including critical thinking and argumentation.
Through hierarchical, differentiated training, teachers at different levels can receive training that is tailored to their needs. This ensures that they are equipped with the most up-to-date teaching methods and resources, thus improving the overall quality of English education in schools.

CRNN: Principles and Workflow

CRNN (Convolutional Recurrent Neural Network) is a type of neural network that combines both convolutional and recurrent neural networks. It is widely used in image-based sequence recognition tasks, such as scene text recognition, handwriting recognition, and so on. CRNN integrates feature extraction, sequence modeling, and transcription into a single trainable end-to-end network, making it particularly effective for tasks that require recognizing sequential patterns, such as continuous text, within images.

The basic architecture of CRNN consists of three main components: the convolutional layers, the recurrent layers, and the transcription layer. The convolutional layers are responsible for extracting feature maps from the input image, capturing both local and global visual patterns. The recurrent layers, typically implemented as a bidirectional LSTM or GRU, process the extracted features along the time axis to capture temporal dependencies within the image sequence. Finally, the transcription layer uses a CTC (Connectionist Temporal Classification) loss function to convert the sequence of feature maps into a sequence of characters or symbols.
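A minimal PyTorch sketch of this three-part pipeline is shown below. The input size, channel counts and character-set size are assumptions, not the settings of any particular CRNN implementation.

```python
import torch
import torch.nn as nn

class MiniCRNN(nn.Module):
    def __init__(self, num_classes: int):
        super().__init__()
        self.cnn = nn.Sequential(                                  # convolutional feature extraction
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),        # H/2, W/2
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d((2, 1)),  # H/4, keep W/2
        )
        self.rnn = nn.LSTM(input_size=64 * 8, hidden_size=128,     # sequence modeling
                           bidirectional=True, batch_first=True)
        self.fc = nn.Linear(2 * 128, num_classes)                  # per-step scores; num_classes includes the CTC blank

    def forward(self, x):                      # x: (batch, 1, 32, W) grayscale text line
        f = self.cnn(x)                        # (batch, 64, 8, W/2)
        b, c, h, w = f.shape
        seq = f.permute(0, 3, 1, 2).reshape(b, w, c * h)   # one time step per horizontal position
        out, _ = self.rnn(seq)
        return self.fc(out)                    # (batch, W/2, num_classes), suitable for nn.CTCLoss

model = MiniCRNN(num_classes=37)               # e.g. 26 letters + 10 digits + blank (assumed charset)
logits = model(torch.randn(2, 1, 32, 128))
print(logits.shape)                            # torch.Size([2, 64, 37])
```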


Hierarchical Text Classification and Evaluation

Aixin Sun and Ee-Peng Lim
Center for Advanced Information Systems, Nanyang Technological University, Nanyang Avenue, Singapore 639798, Singapore
sunaixin@.sg aseplim@.sg

Proceedings of the 2001 IEEE International Conference on Data Mining (ICDM 2001), pages 521-528, California, USA, November 2001.

Abstract

Hierarchical classification refers to assigning one or more suitable categories from a hierarchical category space to a document. While previous work in hierarchical classification focused on virtual category trees where documents are assigned only to the leaf categories, we propose a top-down level-based classification method that can classify documents to both leaf and internal categories. As the standard performance measures assume independence between categories, they have not considered the documents incorrectly classified into categories that are similar or not far from the correct ones in the category tree. We therefore propose the Category-Similarity Measures and Distance-Based Measures to consider the degree of misclassification in measuring the classification performance. An experiment has been carried out to measure the performance of our proposed hierarchical classification method. The results showed that our method performs well for the Reuters text collection when enough training documents are given and that the new measures have indeed considered the contributions of misclassified documents.

1. Introduction

Text classification (TC) or text categorization is the process of automatically assigning one or more predefined categories to text documents. In TC research, most of the studies have focused on flat classification, where the predefined categories are treated in isolation and there is no structure defining the relationships among them [1, 19]. Such categories are also known as flat categories. However, when the number of categories grows to a significantly large number, it becomes much more difficult to browse and search the categories. One way to solve this problem is to organize the categories into a hierarchy like the one developed by Yahoo! [18].

Hierarchical classification allows us to address a large classification problem using a divide-and-conquer approach. At the root level of the category hierarchy, a document can first be classified into one or more sub-categories using some flat classification method(s). The classification can be repeated on the document in each of the sub-categories until the document reaches some leaf categories or cannot be further classified into any sub-categories. A few hierarchical classification methods have been proposed recently [1, 2, 10, 13, 15, 17]. In most of the hierarchical classification methods, the categories are organized in tree-like structures. On the whole, we can identify four distinct category structures for text classification. They are:

1. Virtual category tree: In this category structure, categories are organized as a tree. Each category can belong to at most one parent category, and documents can only be assigned to the leaf categories [2].
2. Category tree: This is an extension of the virtual category tree that allows documents to be assigned to both internal and leaf categories [15].
3. Virtual directed acyclic category graph: In this category structure, categories are organized as a Directed Acyclic Graph (DAG). Similar to the virtual category tree, documents can only be assigned to leaf categories.
4. Directed acyclic category graph: This is perhaps the most commonly-used structure in popular web directory services such as Yahoo! [18] and the Open Directory Project [11]. Documents can be assigned to both internal and leaf categories.
In this paper, we will only focus on hierarchical classification that involves category trees.

To compare different hierarchical classification methods, experiments involving training and test data sets have to be conducted, and some performance measures are used to determine the effectiveness of the methods. In flat classification, performance measures such as precision and recall have been widely used [14, 19]. The same performance measures have also been used to measure the performance of hierarchical classification methods. In this paper, we argue that these performance measures are not adequate, as they have largely ignored the parent-child and sibling relationships between categories in a hierarchy. By not considering the "closeness" of categories, the performance of hierarchical classification may not be accurately captured. In general, the categories from the same subtree share more domain knowledge than the ones from different subtrees; that is, the categories from the same subtree are semantically closer to one another. With the standard precision and recall measures, all these relationships among categories are not accounted for. In this paper, we will present several performance measures applicable to hierarchical classification.

In this paper, we aim to establish a framework to evaluate the performance of hierarchical classification. There are two main contributions:

1. We define a new set of performance measures that consider the semantic relationships and parent-child relationships among categories in a hierarchy. The intuition is that when a document is wrongly classified, one has to examine how different the incorrect category is from the correct category.
2. We develop a top-down level-based hierarchical classification method for category trees using Support Vector Machine (SVM) classifiers. By conducting experiments using the Reuters text collection and the new performance measures, we illustrate how the performance of hierarchical classification can be more accurately determined.

This paper is organized as follows. We first give an overview of the related hierarchical classification work in Section 2. In Section 3, we present several new performance measures for hierarchical classification. The experiment on our proposed hierarchical classification method will be described in Section 5. The results of both the standard performance measures and the new performance measures are presented in that section. Finally, we conclude our work in Section 6.

2. Related work

The existing hierarchical classification methods have mostly assumed a virtual category tree structure [2, 13].
Furthermore, these methods have often been evaluated using the performance measures developed for flat classification. There are basically two approaches adopted by the existing hierarchical classification methods, namely, the big-bang approach and the top-down level-based approach.

In the big-bang approach, only a single classifier is used in the classification process. Given a document, the classifier assigns it to one or more categories in the category tree. The assigned categories can be internal or leaf categories depending on the category structure supported by the method. The big-bang approach has been realized with a Rocchio-like classifier [7], a rule-based classifier [13] and methods built upon association rule mining [15]. The performance measures used in these experiments have been very much based on simple empirical observations of the number of correctly classified documents or the percentage of incorrectly classified documents.

In the top-down level-based approach, one or more classifiers are constructed at each level of the category tree, and each classifier works as a flat classifier at that level. A document will first be classified by the classifier at the root level into one or more lower-level categories. It will then be further classified by the classifier(s) of the lower-level category(ies) until it reaches a final category, which could be a leaf category or an internal category. The top-down level-based classification has been implemented with the ACTION (Automatic Classification for Full-Text Documents) algorithm in [1], multiple Bayesian classifiers in [6] and Support Vector Machine classifiers in [2]. Three performance measures, i.e., precision, recall and F-measure, have been used in these experiments.

Compared to the top-down level-based approach, the big-bang approach can only use the information carried by the category structure during the training phase but not the classification phase. As discriminative features (e.g., terms) at a parent category may not be discriminative at the child categories, it is usually very difficult for a classification method using the big-bang approach to exploit different sets of features at different category levels. Another issue with the big-bang approach is that the classifier constructed may not be flexible enough to cater for changes to the category structure. The classifier needs to be retrained once the category structure is changed.

On the other hand, the top-down level-based classification approach is not problem-free. One of its obvious problems is that a misclassification at a parent (ancestor) category may force a document to be excluded from the child categories before it could be examined by the classifiers of the child categories. Classification methods based on the top-down approach also require more training examples, since multiple classifiers have to be constructed and each requires a different training set. Without adequate training examples, the performance of these classifiers may suffer.

3. Performance measures

To evaluate a hierarchical classification method, one can directly apply the standard precision and recall defined for flat classification.
can effectively measure classi-fication performance in isolation[14].Therefore,the per-formance of the text classification has often been measured by the combination of the two measures.The popular com-binations are listed below:1.Break-Even Point(BEP):BEP,proposed by Lewis[8],defines the point at which precision and recall are equal.However,in some cases,BEP can never be ob-tained.For example,if there are only a few positive test documents compared to a large number of negative ones,the recall value can be so high that the precision can never reach.2.Measure:measure was proposed by Rijsbergen[12].It is a single score computed from precision andrecall values according to the user-defined importance(i.e.,)of precision and recall.Normally,isused[19].The formula is:Besides precision and recall,other commonly-used per-formance measures include Accuracy and Error[14,6],de-noted by and for category respectively.(9) 3.2.Measures based on category similarityIntuitively,if a classification method misclassifies documents into categories similar to the correct categories, it is considered better than another method,say,that mis-classifies the documents into totally unrelated categories. We therefore extend the standard precision and recall defi-nitions to distinguish the performance of and.The Category Similarity between two categories and ,denoted by,can be computed in several ways.In our work,we have chosen to adopt cosine dis-tance between the feature vectors of two categories.It is suggested that the feature vector for a category should be derived by summation of the feature vectors of all training documents under it.The feature vectors of documents are the ones used to build the classifiers.From the category similarities,one can define the Average Category Similarity (ACS).The formulas for and are:(10)(12)Similarly,if is wrongly rejected from,say,the of to depends on the category simi-larities between and the assigned categories of.(17)(19)Macro-Average:(22) Similar to the extended precision and recall,the extended accuracy and error for category can be defined based on document contribution:(24) Note that the sum of extended accuracy and error is1which is the same as the original definitions.As we have discussed in Section3.1,it is not sufficient to evaluate a classification method using only precision or recall.Instead,they have to be considered together.The performance measures that combine both precision and re-call are the Break-Even Point(BEP),and Average11-Point Precision.Among them,can be easily computed using the extended precision and recall.BEP can be ap-plied to classification methods that can rank documents for each category.In hierarchical classification using the big-bang approach,BEP and Average11-Point Precision can be computed for classification methods that can rank all docu-ments in the test set for each category.On the other hand, for those classification methods using top-down level-based approach,the test documents available for classification at a level are determined by the parent classifier as the latter may reject documents before they reach the child classifier(s). 
As we have discussed in Section 3.1, it is not sufficient to evaluate a classification method using only precision or recall. Instead, they have to be considered together. The performance measures that combine both precision and recall are the Break-Even Point (BEP), F_1 and Average 11-Point Precision. Among them, F_1 can be easily computed using the extended precision and recall. BEP can be applied to classification methods that can rank documents for each category. In hierarchical classification using the big-bang approach, BEP and Average 11-Point Precision can be computed for classification methods that can rank all documents in the test set for each category. On the other hand, for those classification methods using the top-down level-based approach, the test documents available for classification at a level are determined by the parent classifier, as the latter may reject documents before they reach the child classifier(s). With such a restriction, it is difficult to compute the BEP and Average 11-Point Precision for each category. Hence, we argue that these two performance measures are less applicable to hierarchical classification methods.

3.3. Measures based on category distance

Instead of using category similarity, we can define performance measures based on the distances between categories in a category tree. The distance between two categories C_i and C_j, denoted by Dis(C_i, C_j), is defined to be the number of links between C_i and C_j in the category tree. Intuitively, the shorter the distance, the closer the two categories.

The distance between categories was first proposed to measure misclassification in [16]. Nevertheless, that work did not define performance measures based on category distance. To define the contribution of misclassified documents, an acceptable distance, denoted as AD, must first be specified by the user. AD must be greater than 0. For example, if AD = 1, a misclassification of a document that involves labelled and assigned categories more than 1 link apart will yield a negative contribution, but zero contribution at exactly 1 link apart. Formally, the contribution of a misclassified document d to category C_i based on category distance is defined as follows. If C_j is the labelled or assigned category of d involved in the misclassification, then

  Con(d, C_i) = 1 - Dis(C_i, C_j) / AD                         (26)

For the same reason as in Section 3.2, the contribution needs to be refined to be in the range [-1, 1]. With this new definition of contribution, the extended precision and recall based on category distance can be defined using formulae (17) and (18) respectively. Similarly, Micro-Average, Macro-Average, F_1, accuracy and error can be extended.
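The distance-based contribution can be illustrated in the same way. In the sketch below, the tree is given as a child-to-parent map and the distance between two categories is the number of links on the path joining them through their lowest common ancestor; the clipping of the contribution to [-1, 1] mirrors the reconstruction above and, together with the example tree, is an illustrative assumption.

    # Sketch of the distance-based contribution. The tree is a child -> parent map;
    # the formula 1 - Dis/AD, clipped to [-1, 1], is an assumed reconstruction.

    def path_to_root(category, parent):
        path = [category]
        while parent[category] is not None:
            category = parent[category]
            path.append(category)
        return path

    def category_distance(a, b, parent):
        """Number of links between two categories in the category tree."""
        depth_a = {c: d for d, c in enumerate(path_to_root(a, parent))}
        for d_b, c in enumerate(path_to_root(b, parent)):
            if c in depth_a:                      # lowest common ancestor
                return depth_a[c] + d_b
        raise ValueError("categories are not in the same tree")

    def distance_contribution(labelled, assigned, parent, acceptable_distance=2):
        con = 1.0 - category_distance(labelled, assigned, parent) / acceptable_distance
        return max(-1.0, min(1.0, con))

    if __name__ == "__main__":
        # Hypothetical tree: root -> grain -> {wheat, corn}; root -> crude -> ship.
        parent = {"root": None, "grain": "root", "wheat": "grain",
                  "corn": "grain", "crude": "root", "ship": "crude"}
        print(category_distance("wheat", "corn", parent))      # 2 links apart
        print(distance_contribution("wheat", "corn", parent))  # 0.0 with AD = 2
        print(distance_contribution("wheat", "ship", parent))  # clipped to -1.0

A misclassification between sibling categories thus costs far less than one between categories in unrelated subtrees, which is the behaviour the distance-based measure is meant to reward.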
4. Hierarchical classification method

In this section, we propose a hierarchical classification method for the category tree structure based on the top-down level-based approach. All the classifiers involved in this method are binary classifiers. Binary classifiers normally need to be trained with both positive and negative training documents.

In the hierarchical classification method, a binary classifier is built for each category. These classifiers, which determine whether a document should belong to the corresponding categories, are known as the local-classifiers. However, an additional binary classifier is built for each internal category to determine whether a document should be given to the classifiers of its sub-categories. This special classifier is known as the subtree-classifier, since it decides whether a document should belong to a subtree. (As normally there is no document directly under the root category, the local-classifier for the root category is not constructed.) This separation of local and subtree classifiers distinguishes our method from that proposed by Dumais and Chen [2].

To build binary classifiers in hierarchical classification, special consideration must be given to the selection of training documents for each classifier. The Subtree of a category C in a given category tree, denoted by Subtree(C), is the set of categories that belong to the subtree rooted at C, including C itself. For example, in Tree(a) shown in Figure 1, Subtree(grain) consists of grain and the categories below it. For any document d, d is a positive example for the local-classifier of C if and only if d belongs to category C; d is a positive example for the subtree-classifier of C if and only if d belongs to any of the categories in Subtree(C).

Figure 1. The three category trees (Hier1, Hier2 and Hier3) derived from Reuters categories such as grain, wheat, corn, crude, nat-gas, ship, livestock, hog, carcass, veg-oil, oil-seed and oil.
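Before turning to the experiments, the routing performed by the local- and subtree-classifiers can be summarised in a short sketch. The Node structure and the boolean predict() interface of the underlying binary classifiers (e.g., trained SVMs) are assumptions made for illustration; this is not the authors' implementation.

    # Sketch of top-down routing with local- and subtree-classifiers.
    # predict(doc) returning a boolean is an assumed classifier interface.

    class Node:
        def __init__(self, name, local_clf=None, subtree_clf=None, children=()):
            self.name = name
            self.local_clf = local_clf      # does the document belong to this category?
            self.subtree_clf = subtree_clf  # should the document be passed to the children?
            self.children = list(children)

    def classify(doc, node, assigned=None):
        """Route a document down the category tree, collecting assigned categories."""
        if assigned is None:
            assigned = []
        if node.local_clf is not None and node.local_clf.predict(doc):
            assigned.append(node.name)
        if node.children and node.subtree_clf is not None and node.subtree_clf.predict(doc):
            for child in node.children:
                classify(doc, child, assigned)
        return assigned

Under this scheme a document can be assigned to internal as well as leaf categories, and a rejection by a subtree-classifier prevents the document from ever reaching the categories below that node, which is the blocking behaviour discussed earlier.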
5. Experiments

In the experiments, we compared, for each category tree, the classification performance indicated by both the standard and extended precision and recall. The SVM classifier used in our experiment is SVMlight Version 3.50 implemented by Joachims [4]. In our experiment, the Reuters-21578 collection was used. To conduct our experiment, category trees need to be manually derived from the 135 categories. Koller and Sahami extracted three category trees from the Reuters-22173 collection by identifying the category labels that suggest parent-child relationships [6]. Three slightly different category trees are derived from the Reuters-21578 collection using a similar approach (see Figure 1). Note that the roots of the three category trees are virtual categories.

Almost all documents in the Reuters collection come with title, dateline and text body. We obtained the index terms from only the title and text body after stopword removal and stemming. The stopwords and the stemming algorithm have been taken directly from the BOW library [9]. A binary term vector is obtained for each document without applying any feature selection. In our experiment, we used the Lewis Split provided by the Reuters collection to obtain training documents and test documents.

All the test documents that belong to one or more categories in a category tree are used as its positive test documents. The same number of test documents that do not belong to the category tree are randomly selected to be the negative test documents. The statistics about our training and test documents are shown in Table 2. For a category C, C:s and C:l refer to the subtree-classifier and local-classifier for C respectively. The count columns in the table give the number of positive training, negative training and positive test documents respectively. The category similarity matrix for Tree(a) is shown in Table 3. As CS(C_i, C_j) = CS(C_j, C_i), only the lower half of the matrix is shown. The matrices for Tree(b) and Tree(c) can be computed in a similar way.

The results of our experiment are shown in Tables 4, 5 and 6 for the three category trees. We computed the standard precision and recall, the precision and recall based on category similarity, and the precision and recall based on category distance for each tree.

Table 2. Number of training and test documents for the three trees.

Table 3. Category similarity matrix for Tree(a) (Average Category Similarity: 0.559).

Table 4. Testing results for Tree(a).

Table 5. Testing results for Tree(b).

Table 6. Testing results for Tree(c).

Although the extended precision and recall based on either category similarity or category distance give different values compared to the standard precision and recall, it is not clear enough to conclude the performance of our method for Tree(b) and Tree(c). In summary, our method performed reasonably well for the Reuters collection when given enough training documents. The extended measures have indeed considered the contributions of wrongly classified documents.

6. Conclusions

In this paper, we give an overview of the hierarchical classification problem and its solutions. While the commonly-used performance measures are the ones for flat classification, the relationships among the categories in a category tree are not accounted for. We propose some novel approaches to include the contributions of wrongly classified documents in the performance measures. We have also developed a top-down level-based classification method using binary classifiers (such as SVM) and evaluated the method using the Reuters collection. The results show that our method works well and that the extended precision and recall can be feasibly implemented.

In our future work, we are going to evaluate the hierarchical classification method using classifiers other than SVM to compare their performance using the extended measures. Since we use category similarity and distance to measure the classification results, we plan to design a new hierarchical classification method that makes use of such information. In the top-down level-based approach, the error made at the parent category is not recoverable at the child category. We will also try to design a more tolerant hierarchical classification method with which the child classifiers are able to recover the errors made by the parent classifier(s).

References

[1] S. D'Alessio, K. Murray, R. Schiaffino, and A. Kershenbaum. The effect of using hierarchical classifiers in text categorization. In Proc. of the 6th Int. Conf. "Recherche d'Information Assistee par Ordinateur", pages 302-313, Paris, FR, 2000.
[2] S. Dumais and H. Chen. Hierarchical classification of Web content. In Proc. of the 23rd ACM Int. Conf. on Research and Development in Information Retrieval, pages 256-263, Athens, GR, 2000.
[3] S. Dumais, J. Platt, D. Heckerman, and M. Sahami. Inductive learning algorithms and representations for text categorization. In Proc. of the 7th Int. Conf. on Information and Knowledge Management, pages 148-155, 1998.
[4] T. Joachims. SVMlight, an implementation of Support Vector Machines (SVMs) in C. http://ais.gmd.de/~thorsten/svm
[15] K. Wang, S. Zhou, and Y. He. Hierarchical classification of real life documents. In Proc. of the 1st SIAM Int. Conf. on Data Mining, 2001.
[16] K. Wang, S. Zhou, and S. C. Liew. Building hierarchical classifiers using class proximity. In Proc. of the 25th Int. Conf. on Very Large Data Bases, pages 363-374, Edinburgh, UK, 1999.
[17] A. S. Weigend, E. D. Wiener, and J. O. Pedersen. Exploiting hierarchy in text categorization. Information Retrieval, 1(3):193-216, 1999.
[19] Y. Yang. An evaluation of statistical approaches to text categorization. Information Retrieval, 1(1-2):69-90, 1999.
