Survey of Data Mining Technology (foreign-literature translation for a graduation thesis)


Summary of Data Mining Technology

Abstract: With the development of computer and network technology, relevant information has become very easy to obtain. Traditional statistical methods, however, cannot cope with the analysis of such large volumes of large-scale data. Data mining technology therefore came into being: an intelligent approach that combines statistical analysis, databases, and intelligent languages to analyze large data sets. This paper introduces the basic concepts and methods of data mining, and also describes its applications and development prospects.

Keywords: data mining; methods; applications; prospects

1 Introduction

With the rapid development of information technology, databases have kept growing in scale and now hold enormous amounts of data. A great deal of important information lies hidden behind this surge of data, and people want to analyze it at a higher level in order to make better use of it. To give decision makers a unified, global view, data warehouses have been established in many fields. The sheer volume of data, however, often makes it impossible to identify the hidden information that could support decision-making, and traditional query and reporting tools cannot meet the need to mine it. A new data analysis technology was therefore needed to process large amounts of data and extract valuable latent knowledge from it, and data mining came into being. Data mining has developed and gradually matured alongside data warehouse technology.

2 Data Mining Technology

2.1 Definition of data mining

Data mining is the non-trivial process of automatically extracting useful information hidden in a data set. The extracted information is expressed as rules, concepts, regularities, and patterns. It helps decision makers analyze historical and current data and discover hidden relationships and patterns, so that they can predict behavior that may occur in the future. The data mining process is also called the knowledge discovery process. It is an interdisciplinary subject that draws on databases, artificial intelligence, mathematical statistics, visualization, and parallel computing. Data mining is a new information processing technology whose main feature is that it extracts, transforms, analyzes, and otherwise models the large amounts of data in a database in order to extract the key data that support decision-making. Data mining is an important technology in KDD (Knowledge Discovery in Databases). It does not simply issue queries in a standard database query language such as SQL; instead, it summarizes patterns in the queried content and searches for its underlying regularities. Traditional querying and reporting only capture the outcome of an event and do not examine why it occurred. Data mining focuses on understanding the causes and, with a certain degree of confidence, forecasts the future, providing effective support for decision-making.

2.2 Methods of data mining

Data mining research combines techniques and results from a number of different disciplines, so current data mining methods take many forms. From the statistical-analysis perspective, the models used include linear and non-linear analysis, regression analysis, logistic regression analysis, univariate analysis, multivariate analysis, time series analysis, recent sequence analysis, nearest-neighbor algorithms, and cluster analysis. These techniques can be used to find data that take unusual forms, and various statistical and mathematical models can then be used to interpret the data and explain the market rules and business opportunities hidden behind them. Knowledge-discovery data mining is entirely different from statistical-analysis data mining; it includes artificial neural networks, support vector machines, decision trees, genetic algorithms, rough sets, rule discovery, and association and sequence analysis.
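
As a minimal illustration of the rule-discovery family mentioned above, the following Python sketch computes the support and confidence of a candidate association rule over a toy transaction list. The transactions, item names, and function names are hypothetical and chosen only for illustration; they do not come from the original paper.

```python
from typing import FrozenSet, List

# Hypothetical toy data: each transaction is the set of items bought together.
TRANSACTIONS: List[FrozenSet[str]] = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]


def support(itemset: FrozenSet[str], transactions: List[FrozenSet[str]]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(
    antecedent: FrozenSet[str],
    consequent: FrozenSet[str],
    transactions: List[FrozenSet[str]],
) -> float:
    """Estimate P(consequent | antecedent) as support(A and C) / support(A)."""
    base = support(antecedent, transactions)
    if base == 0.0:
        # The antecedent never occurs, so the rule has no evidence at all.
        return 0.0
    return support(antecedent | consequent, transactions) / base
```

For example, `confidence(frozenset({"bread"}), frozenset({"milk"}), TRANSACTIONS)` estimates how often milk appears in transactions that contain bread. A practical rule miner would search over many candidate itemsets rather than score a single hand-picked rule.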

2.2.1 Statistical methods

Traditional statistics provides data mining with a number of discriminant and regression analysis methods. Commonly used techniques include Bayesian reasoning, regression analysis, and analysis of variance. Bayesian reasoning is a basic tool for revising the probability distribution of a data set once new information is known, and it is used for classification problems in data mining. Regression analysis is used to find the model that best describes the relationship between input variables and output variables. It includes linear regression, which describes how the trend of one variable relates to the values of other variables, and logistic regression, which is used to predict the occurrence of certain events. Analysis of variance is generally used to assess the performance of the estimated regression line and the influence of the independent variables on the final regression; it is one of the powerful tools in many mining applications.
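
The ideas in this section can be sketched in a few lines of Python. The first function applies Bayes' rule to revise class probabilities after new evidence is observed. The second fits a one-variable least-squares line and reports the share of variance the line explains (R squared), which is the kind of quantity examined when the variance of a regression fit is analyzed. All function names and example numbers are illustrative assumptions, not taken from the paper.

```python
from typing import Dict, Sequence, Tuple


def bayes_update(prior: Dict[str, float], likelihood: Dict[str, float]) -> Dict[str, float]:
    """Return P(class | evidence) from P(class) and P(evidence | class)."""
    unnormalized = {c: prior[c] * likelihood[c] for c in prior}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("evidence has zero probability under every class")
    return {c: v / total for c, v in unnormalized.items()}


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float, float]:
    """Ordinary least squares for y = a + b*x; returns (a, b, r_squared)."""
    n = len(xs)
    if n == 0 or n != len(ys):
        raise ValueError("xs and ys must be non-empty and of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("xs must contain at least two distinct values")
    b = sxy / sxx
    a = mean_y - b * mean_x
    # Variance decomposition: total variation versus what the line leaves unexplained.
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    ss_resid = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    r_squared = 1.0 if ss_total == 0 else 1.0 - ss_resid / ss_total
    return a, b, r_squared


# Hypothetical usage: revise a two-class belief, then fit a small data set.
posterior = bayes_update({"spam": 0.2, "ham": 0.8}, {"spam": 0.9, "ham": 0.1})
intercept, slope, r2 = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
```

In the usage lines, the evidence is far more likely under "spam" than under "ham", so the posterior shifts toward "spam" even though the prior favored "ham". The sample points lie exactly on a line, so the fit leaves no residual variance.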
