WEKA Data Analysis Experiment

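1. Experiment Overview

Using Weka 3.6, we run several algorithms on a data sample. The classification methods tested are naive Bayes, decision tree, and random tree; the clustering methods tested are DBSCAN and k-means.

2. Data Sample

The goals of this experiment are to become familiar with common data classification algorithms and to learn how to use Weka. The data sample is the "Vote" dataset that ships with Weka, as shown in the figure:

[Figure: the Vote sample]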

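3. Association Rule Analysis

1) Procedure:

a) Click the "Explorer" button to open the "Weka Explorer" window.

b) Select the "Associate" tab.

c) Click the "Choose" button and select the "Apriori" associator.

d) Click the parameter text box and set the parameters in the options dialog (the values used are listed in the Scheme line of the run information below).

e) Click the "Start" button on the left.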
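The same run can in principle be launched without the Explorer GUI, through Weka's command-line entry point for the associator. The following is a minimal sketch that only assembles the command; the option string is copied from the Scheme line of the run information, while the jar location `weka.jar` and the dataset path `data/vote.arff` are assumptions that depend on the local installation.

```python
import shlex

# Option string copied from the "Scheme" line of the run information.
APRIORI_OPTIONS = "-I -N 10 -T 0 -C 0.9 -D 0.05 -U 1.0 -M 0.5 -S -1.0 -c -1"


def build_apriori_command(weka_jar, arff_file, options=APRIORI_OPTIONS):
    """Return the argument list for running Apriori from the command line.

    weka_jar and arff_file are hypothetical paths; adjust them to the
    local Weka installation.
    """
    return [
        "java", "-cp", weka_jar,
        "weka.associations.Apriori",
        "-t", arff_file,          # -t names the training file
        *shlex.split(options),
    ]


if __name__ == "__main__":
    cmd = build_apriori_command("weka.jar", "data/vote.arff")
    # Print the command instead of running it, so the sketch works
    # even where Java or Weka is not installed.
    print(" ".join(shlex.quote(part) for part in cmd))
```

Passing the returned list to `subprocess.run(cmd, check=True)` would execute the run, provided Java and Weka are available at the assumed paths.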

2) Results:

=== Run information ===

Scheme: weka.associations.Apriori -I -N 10 -T 0 -C 0.9 -D 0.05 -U 1.0 -M 0.5 -S -1.0 -c -1

Relation: vote

Instances: 435

Attributes: 17

handicapped-infants

water-project-cost-sharing

adoption-of-the-budget-resolution

physician-fee-freeze

el-salvador-aid

religious-groups-in-schools

anti-satellite-test-ban

aid-to-nicaraguan-contras

mx-missile

immigration

synfuels-corporation-cutback

education-spending

superfund-right-to-sue

crime

duty-free-exports

export-administration-act-south-africa

Class

=== Associator model (full training set) ===

Apriori

=======

Minimum support: 0.5 (218 instances)

Minimum metric : 0.9

Number of cycles performed: 10

Generated sets of large itemsets:

Size of set of large itemsets L(1): 12

Large Itemsets L(1):

handicapped-infants=n 236

adoption-of-the-budget-resolution=y 253

physician-fee-freeze=n 247

religious-groups-in-schools=y 272

anti-satellite-test-ban=y 239

aid-to-nicaraguan-contras=y 242

synfuels-corporation-cutback=n 264

education-spending=n 233

crime=y 248

duty-free-exports=n 233

export-administration-act-south-africa=y 269

Class=democrat 267

Size of set of large itemsets L(2): 4

Large Itemsets L(2):

adoption-of-the-budget-resolution=y physician-fee-freeze=n 219

adoption-of-the-budget-resolution=y Class=democrat 231

physician-fee-freeze=n Class=democrat 245

aid-to-nicaraguan-contras=y Class=democrat 218

Size of set of large itemsets L(3): 1

Large Itemsets L(3):

adoption-of-the-budget-resolution=y physician-fee-freeze=n Class=democrat 219

Best rules found:

1. adoption-of-the-budget-resolution=y physician-fee-freeze=n 219 ==> Class=democrat 219 conf:(1)

2. physician-fee-freeze=n 247 ==> Class=democrat 245 conf:(0.99)

3. adoption-of-the-budget-resolution=y Class=democrat 231 ==> physician-fee-freeze=n 219 conf:(0.95)

4. Class=democrat 267 ==> physician-fee-freeze=n 245 conf:(0.92)

5. adoption-of-the-budget-resolution=y 253 ==> Class=democrat 231 conf:(0.91)

6. aid-to-nicaraguan-contras=y 242 ==> Class=democrat 218 conf:(0.9)
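The confidence of each rule can be checked by hand: it is the number of instances matching both sides of the rule divided by the number matching the left-hand side. The sketch below recomputes it from the two counts printed with each rule; the `RULES` table and the `confidence` helper are illustrative names, not part of Weka.

```python
# (antecedent count, count for the whole rule, confidence reported by Weka),
# copied from the "Best rules found" listing.
RULES = [
    (219, 219, 1.00),
    (247, 245, 0.99),
    (231, 219, 0.95),
    (267, 245, 0.92),
    (253, 231, 0.91),
    (242, 218, 0.90),
]


def confidence(antecedent_count, rule_count):
    """Confidence = instances matching the whole rule / instances matching the antecedent."""
    return rule_count / antecedent_count


for number, (antecedent, both, reported) in enumerate(RULES, start=1):
    conf = confidence(antecedent, both)
    print(f"rule {number}: {both}/{antecedent} = {conf:.4f} (reported {reported})")
```

Rounded to two decimals, each quotient agrees with the value in the listing, and all six stay at or above the 0.9 threshold set with `-C 0.9`.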

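3) Analysis of results:

a) The sample contains 435 records and 17 attributes, and Apriori performed 10 cycles.

b) The minimum support is 0.5, so an itemset must occur in at least 218 instances to count as frequent.

c) The minimum confidence is 0.9. Only 6 rules reach this threshold, fewer than the 10 requested with -N 10.

d) After the 10 search cycles, the result contains 12 frequent 1-itemsets, 4 frequent 2-itemsets, and 1 frequent 3-itemset.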
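The figures in this analysis can be cross-checked with a little arithmetic. The sketch below derives the 218-instance threshold from the minimum support, assuming the fractional count is rounded up (an assumption that matches the reported figure), and verifies the property that Apriori relies on: every subset of a frequent itemset is itself frequent, with a count at least as large. The shortened item labels are hypothetical abbreviations of the attribute names in the output.

```python
import math
from itertools import combinations

N_INSTANCES = 435
MIN_SUPPORT = 0.5

# Assumption: the fractional count 217.5 is rounded up to a whole instance.
min_count = math.ceil(MIN_SUPPORT * N_INSTANCES)

# Frequent 2- and 3-itemsets with their counts, copied from the output.
L2 = {
    frozenset({"budget=y", "fee-freeze=n"}): 219,
    frozenset({"budget=y", "democrat"}): 231,
    frozenset({"fee-freeze=n", "democrat"}): 245,
    frozenset({"contras=y", "democrat"}): 218,
}
L3 = {frozenset({"budget=y", "fee-freeze=n", "democrat"}): 219}


def subsets_are_frequent(itemset, smaller_level):
    """True if every subset with one item removed appears in the smaller level."""
    size = len(itemset) - 1
    return all(frozenset(s) in smaller_level for s in combinations(itemset, size))


def support_is_monotone(itemset, count, smaller_level):
    """True if no subset has a smaller count than the itemset itself."""
    size = len(itemset) - 1
    return all(smaller_level[frozenset(s)] >= count
               for s in combinations(itemset, size))


print("minimum count:", min_count)
for itemset, count in L3.items():
    print(sorted(itemset), count,
          subsets_are_frequent(itemset, L2),
          support_is_monotone(itemset, count, L2))
```

The single 3-itemset has the same count (219) as its subset of the two vote attributes, which is why the first rule in the listing has confidence 1.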

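4. Classification Algorithm: Random Tree Analysis

1) Procedure:

a) Click the "Explorer" button to open the "Weka Explorer" window.

b) Select the "Classify" tab.

c) Click the "Choose" button and select "trees", then "RandomTree".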
