WEKA Tutorial: How to Use It


“Meta”-classifiers include:

1/28/2014
University of Waikato
WEKA only deals with “flat” files
@relation heart-disease-simplified

@attribute age numeric
@attribute sex { female, male}
@attribute chest_pain_type { typ_angina, asympt, non_anginal, atyp_angina}
@attribute cholesterol numeric
@attribute exercise_induced_angina { no, yes}
@attribute class { present, not_present}

@data
63,male,typ_angina,233,no,not_present
67,male,asympt,286,yes,present
67,male,asympt,229,yes,present
38,female,non_anginal,?,no,not_present
...
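As a rough illustration of how such a flat file is organized, the following Python sketch reads the header and data rows of a shortened version of the example. The function name `parse_arff` is hypothetical and not part of WEKA. It handles only the simple subset used here (numeric and nominal attributes, `?` for missing values), not the full ARFF format.

```python
import re

def parse_arff(text):
    """Parse a simple ARFF text into (relation, attributes, rows).

    Hypothetical helper for illustration only: supports numeric and nominal
    attributes and '?' as a missing value, nothing more.
    """
    relation = None
    attributes = []   # (name, "numeric") or (name, [nominal labels])
    rows = []
    in_data = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):   # skip blank lines and comments
            continue
        if in_data:
            values = [v.strip() for v in line.split(",")]
            row = []
            for (name, kind), value in zip(attributes, values):
                if value == "?":
                    row.append(None)           # missing value
                elif kind == "numeric":
                    row.append(float(value))
                else:
                    row.append(value)          # nominal label kept as text
            rows.append(row)
            continue
        lower = line.lower()
        if lower.startswith("@relation"):
            relation = line.split(None, 1)[1]
        elif lower.startswith("@attribute"):
            _, name, spec = line.split(None, 2)
            nominal = re.fullmatch(r"\{(.*)\}", spec.strip())
            if nominal:
                kind = [v.strip() for v in nominal.group(1).split(",")]
            else:
                kind = spec.strip().lower()
            attributes.append((name, kind))
        elif lower.startswith("@data"):
            in_data = True
    return relation, attributes, rows


# Shortened copy of the heart-disease example (some attributes omitted).
SAMPLE = """\
@relation heart-disease-simplified
@attribute age numeric
@attribute sex { female, male}
@attribute cholesterol numeric
@attribute class { present, not_present}
@data
63,male,233,not_present
38,female,?,not_present
"""

relation, attributes, rows = parse_arff(SAMPLE)
print(relation)
print(attributes[1])
print(rows[1])
```

Each data row maps positionally onto the declared attributes, which is what "flat" means here: one instance per line, one value per attribute.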
Authors: Ian H. Witten / Eibe Frank
Subtitle: Practical Machine Learning Tools and Techniques, Second Edition (Morgan Kaufmann Series in Data Management Systems)
Pages: 525
Publisher: Morgan Kaufmann
Publication date: 2005-06-08
WEKA: versions

There are several versions of WEKA:
WEKA 3.4: “book version” compatible with description in data mining book
WEKA 3.6: “GUI version” adds graphical user interfaces
WEKA 3.7: “development version” with lots of improvements

Classification and Regression
Clustering
Association Rules
Attribute Selection
Data Visualization
The Experimenter
The Knowledge Flow GUI
Conclusions

Filters for: discretization, normalization, resampling, attribute selection, transforming and combining attributes, …
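Two of the pre-processing operations listed here, normalization and discretization, can be illustrated with a small Python sketch. The function names, the min-max scaling, and the equal-width binning are assumptions chosen for illustration. WEKA's own filters are Java classes with more options and are not reproduced here.

```python
def normalize(values):
    """Min-max scale numeric values into [0, 1]; None (missing) is kept."""
    present = [v for v in values if v is not None]
    lo, hi = min(present), max(present)
    span = hi - lo
    return [None if v is None else (0.0 if span == 0 else (v - lo) / span)
            for v in values]


def discretize(values, bins):
    """Equal-width binning: map each value to a bin index in 0..bins-1."""
    present = [v for v in values if v is not None]
    lo, hi = min(present), max(present)
    width = (hi - lo) / bins
    result = []
    for v in values:
        if v is None:
            result.append(None)                # missing stays missing
        elif width == 0:
            result.append(0)                   # all values identical
        else:
            # the maximum value is clamped into the last bin
            result.append(min(int((v - lo) / width), bins - 1))
    return result


# Cholesterol column from the heart-disease example; '?' becomes None.
cholesterol = [233, 286, 229, None]
print(normalize(cholesterol))
print(discretize(cholesterol, 3))
```

Both functions leave missing values untouched, so a later step can decide how to treat them.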

Learning schemes: decision trees and lists, instance-based classifiers, support vector machines, multi-layer perceptrons, logistic regression, Bayes’ nets, …
“Meta”-classifiers: bagging, boosting, stacking, error-correcting output codes, locally weighted learning, …
WEKA: the bird (a rail)
Copyright: Martin Kramer (mkramer@wxs.nl)
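About WEKA

As an open data mining workbench, WEKA brings together a large collection of machine learning algorithms for data mining tasks, including data pre-processing, classification, regression, clustering, association rules, and visualization in a new interactive interface. Developers can use Java to build further data mining algorithms on top of WEKA's architecture. Users who want to implement their own data mining algorithms can consult WEKA's interface documentation. Integrating your own algorithm into WEKA, or even borrowing its approach to build your own visualization tools, is not very difficult.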
Explorer: building “classifiers”


Classifiers in WEKA are models for predicting nominal or numeric quantities
Implemented learning schemes include:
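About WEKA
WEKA's full name is the Waikato Environment for Knowledge Analysis. It is free, non-commercial (in contrast to Clementine, SPSS's commercial data mining product), open-source machine learning and data mining software that runs on Java. The software and its source code can be downloaded from its official website. Interestingly, the acronym WEKA is also the name of a bird found only in New Zealand, and WEKA's main developers happen to come from the University of Waikato in New Zealand.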

Machine Learning with WEKA
Eibe Frank
Writing a famous book

WEKA: A Machine Learning Toolkit
The Explorer
Department of Computer Science, University of Waikato, New Zealand


This talk is based on a snapshot of WEKA 3.3
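WEKA: Format of the Data

Before using the system, users must first convert their data into the format WEKA requires (the ARFF format). Most ARFF data files consist of a list of all instances, with the attribute values of each instance separated by commas. When the instances are stored in Excel or in a database, export them, convert them to comma-separated form, then add the dataset name (@relation), the attribute information (@attribute), and the values (@data), and save the file in ARFF format. Note that WEKA's classification schemes assume by default that the last attribute in the ARFF file is the class attribute.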
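The conversion described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the CSV has a header row, empty cells and `?` mean missing values, and a column counts as numeric only when all of its non-missing values parse as numbers. The function name `csv_to_arff` is hypothetical, and nominal labels containing spaces or commas would need quoting that this sketch does not do.

```python
import csv
import io

def csv_to_arff(csv_text, relation):
    """Convert CSV text (with a header row) into ARFF text.

    Illustrative sketch only: numeric columns are detected by trying to
    parse every non-missing value as a float; all other columns become
    nominal attributes with their distinct values as labels.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]

    def is_number(s):
        try:
            float(s)
            return True
        except ValueError:
            return False

    lines = [f"@relation {relation}", ""]
    for i, name in enumerate(header):
        column = [r[i] for r in data if r[i] not in ("", "?")]
        if column and all(is_number(v) for v in column):
            lines.append(f"@attribute {name} numeric")
        else:
            labels = sorted(set(column))
            lines.append(f"@attribute {name} {{{', '.join(labels)}}}")
    lines += ["", "@data"]
    for r in data:
        # missing cells are written as '?'
        lines.append(",".join(v if v not in ("", "?") else "?" for v in r))
    return "\n".join(lines) + "\n"


CSV_TEXT = """\
age,sex,cholesterol,class
63,male,233,not_present
67,male,286,present
38,female,,not_present
"""

arff = csv_to_arff(CSV_TEXT, "heart-disease-simplified")
print(arff)
```

The class column is kept last in the CSV so that the resulting file matches the default assumption about the class attribute.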
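History of WEKA's Development

WEKA has been developed since 1993 at the University of Waikato in New Zealand; the original software was implemented in C. In 1997 the development team rewrote the software in Java and made substantial improvements to the data mining algorithms. In August 2005, at the 11th ACM SIGKDD International Conference, the WEKA team at the University of Waikato received the highest service award in the field of data mining and knowledge discovery. The WEKA system has gained wide recognition, has been hailed as a milestone in the history of data mining and machine learning, and is one of the most complete data mining tools available today.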
Explorer: pre-processing the data



Data can be imported from a file in various formats: ARFF, CSV, C4.5, binary
Data can also be read from a URL or from an SQL database (using JDBC)
Pre-processing tools in WEKA are called “filters”
WEKA contains filters for: