Data Mining Lab Report 3
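Data Mining Lab Report
Class: Statistics 121    Student ID:    Name: Hu Yue
Experiment name: Experiment 3: Classification Knowledge Mining    Experiment type: Comprehensive experiment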
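Objectives:
(1) Learn how to perform classification with a decision tree (the C4.5 algorithm).
(2) Learn how to perform classification with the naive Bayes method.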
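As background for objective (1): C4.5 chooses the attribute to split on by gain ratio, that is, information gain divided by the split information of the attribute. The following is a minimal sketch of that criterion for nominal attributes only. The function names and the four-row toy data are hypothetical and are not taken from bankdata.arff; a full C4.5 implementation also handles numeric thresholds, missing values and pruning, which are omitted here.

```python
import math
from collections import Counter, defaultdict


def entropy(items):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(items)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(items).values())


def gain_ratio(values, labels):
    """Gain ratio of one nominal attribute with respect to the class labels."""
    total = len(labels)
    # Group the class labels by attribute value.
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[value].append(label)
    # Expected entropy of the class after splitting on the attribute.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    # Split information is the entropy of the attribute values themselves.
    split_info = entropy(values)
    if split_info == 0:
        return 0.0  # attribute has a single value, so it cannot split the data
    return (entropy(labels) - remainder) / split_info


# Hypothetical toy data: 'married' separates the class perfectly, 'car' not at all.
toy_pep = ["YES", "YES", "NO", "NO"]
toy_married = ["NO", "NO", "YES", "YES"]
toy_car = ["YES", "NO", "YES", "NO"]

print(gain_ratio(toy_married, toy_pep))
print(gain_ratio(toy_car, toy_pep))
```

On this toy data the attribute that separates the classes perfectly receives the higher gain ratio, so it would be chosen as the split.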
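Requirements:
(1) Classify the data set bankdata.arff with a decision tree (the C4.5 algorithm). Report the resulting decision tree and the performance metrics of the classifier, then use the trained model to classify the instances in the table below (the pep column is the class to be predicted).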
age  sex     region      income    married  children  car  save_act  current_act  mortgage  pep
21   MALE    TOWN        5014.21   NO       0         YES  YES       YES          YES
42   MALE    INNER_CITY  17390.1   YES      0         NO   YES       YES          NO
59   FEMALE  RURAL       35610.5   NO       2         YES  NO        NO           NO
45   FEMALE  TOWN        26948     NO       0         NO   YES       YES          YES
58   FEMALE  TOWN        34524.9   YES      2         YES  YES       NO           NO
30   MALE    INNER_CITY  27808.1   NO       3         NO   NO        YES          NO
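(2) Classify the data set bankdata.arff with the naive Bayes method. Report the parameters of the classification model and the performance metrics of the classifier, then use the trained model to classify the instances in the table above.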
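Both requirements ask for the six unlabeled instances to be classified by a trained model. One common way to do this is to write them to a separate ARFF file in which the unknown class is marked with "?". The sketch below builds such a file as text. The attribute declarations are an assumption: they are inferred from the table and must be replaced by the exact header of the real bankdata.arff (same attribute order, same nominal values in the same order). The relation name and the output file name are hypothetical.

```python
# Assumed header; copy the real one from bankdata.arff before using this.
HEADER_LINES = [
    "@relation bankdata_unlabeled",  # hypothetical relation name
    "@attribute age numeric",
    "@attribute sex {FEMALE,MALE}",
    "@attribute region {INNER_CITY,TOWN,RURAL,SUBURBAN}",
    "@attribute income numeric",
    "@attribute married {NO,YES}",
    "@attribute children numeric",
    "@attribute car {NO,YES}",
    "@attribute save_act {NO,YES}",
    "@attribute current_act {NO,YES}",
    "@attribute mortgage {NO,YES}",
    "@attribute pep {YES,NO}",
    "@data",
]

# The six instances from the table, without the class value.
ROWS = [
    (21, "MALE", "TOWN", 5014.21, "NO", 0, "YES", "YES", "YES", "YES"),
    (42, "MALE", "INNER_CITY", 17390.1, "YES", 0, "NO", "YES", "YES", "NO"),
    (59, "FEMALE", "RURAL", 35610.5, "NO", 2, "YES", "NO", "NO", "NO"),
    (45, "FEMALE", "TOWN", 26948, "NO", 0, "NO", "YES", "YES", "YES"),
    (58, "FEMALE", "TOWN", 34524.9, "YES", 2, "YES", "YES", "NO", "NO"),
    (30, "MALE", "INNER_CITY", 27808.1, "NO", 3, "NO", "NO", "YES", "NO"),
]


def build_arff(rows):
    """Return ARFF text in which every row ends with '?' for the unknown class."""
    data_lines = [",".join(str(v) for v in row) + ",?" for row in rows]
    return "\n".join(HEADER_LINES + data_lines) + "\n"


arff_text = build_arff(ROWS)
print(arff_text)
# To save it: open("bankdata_unlabeled.arff", "w").write(arff_text)
```

The resulting file can then be supplied as a test set to the trained classifier, which outputs a predicted pep value for each row.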
Results:
(1)
Classifier performance metric: Kappa statistic 0.7942
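For reference, the Kappa statistic measures agreement between predicted and actual classes after correcting for the agreement expected by chance: kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed accuracy and p_e is the chance agreement computed from the row and column totals of the confusion matrix. The sketch below shows the computation. The confusion matrix in it is hypothetical, chosen only to make the arithmetic easy to follow; it is not the matrix from this experiment, which the report does not list.

```python
def cohen_kappa(confusion):
    """Kappa statistic of a square confusion matrix (rows: actual, columns: predicted)."""
    size = len(confusion)
    total = sum(sum(row) for row in confusion)
    # Observed agreement: fraction of instances on the diagonal.
    observed = sum(confusion[i][i] for i in range(size)) / total
    # Chance agreement: sum over classes of (row total * column total) / total^2.
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion)
        for i in range(size)
    ) / total ** 2
    # Note: undefined (division by zero) when expected == 1.
    return (observed - expected) / (1 - expected)


# Hypothetical 2x2 confusion matrix, not taken from the experiment.
example = [[40, 10],
           [5, 45]]
print(cohen_kappa(example))  # observed 0.85, chance 0.5, kappa about 0.7
```

A value near 1 indicates agreement well beyond chance, while a value near 0 indicates agreement no better than chance.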
age  sex     region      income    married  children  car  save_act  current_act  mortgage  pep
21   MALE    TOWN        5014.21   NO       0         YES  YES       YES          YES       no
42   MALE    INNER_CITY  17390.1   YES      0         NO   YES       YES          NO        no
59   FEMALE  RURAL       35610.5   NO       2         YES  NO        NO           NO        yes
45   FEMALE  TOWN        26948     NO       0         NO   YES       YES          YES       no
58   FEMALE  TOWN        34524.9   YES      2         YES  YES       NO           NO        yes
30   MALE    INNER_CITY  27808.1   NO       3         NO   NO        YES          NO        no
(2)
=== Classifier model (full training set) ===

Naive Bayes Classifier

                       Class
Attribute                YES          NO
                      (0.46)      (0.54)
========================================
age
  mean               45.1277     40.0982
  std. dev.          14.3018     14.1018
  weight sum             274         326
  precision                1           1

sex
  FEMALE               131.0       171.0
  MALE                 145.0       157.0
  [total]              276.0       328.0

region
  INNER_CITY           124.0       147.0
  TOWN                  72.0       103.0
  RURAL                 47.0        51.0
  SUBURBAN              35.0        29.0
  [total]              278.0       330.0

income
  mean            30644.8069  24902.2958
  std. dev.       13585.1095  11640.5073
  weight sum             274         326
  precision          97.1838     97.1838

married
  NO                   121.0        85.0
  YES                  155.0       243.0
  [total]              276.0       328.0

children
  mean                0.9453      1.0675
  std. dev.            0.859      1.1937
  weight sum             274         326
  precision                1           1

car
  NO                   137.0       169.0
  YES                  139.0       159.0
  [total]              276.0       328.0

save_act
  NO                    96.0        92.0
  YES                  180.0       236.0
  [total]              276.0       328.0

current_act
  NO                    64.0        83.0
  YES                  212.0       245.0
  [total]              276.0       328.0

mortgage
  NO                   183.0       210.0
  YES                   93.0       118.0
  [total]              276.0       328.0

Time taken to build model: 0.01 seconds
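The model parameters above are enough to score an instance by hand. The sketch below multiplies the class prior by a Gaussian density for each numeric attribute (age, income, children) and by count / [total] for each nominal attribute, working in log space. It is an approximation under stated assumptions: the priors are taken as weight sum / 600, and the "precision" rows are ignored, so the probabilities may differ slightly from those the original tool reports. The names GAUSSIAN, COUNTS and posterior are hypothetical.

```python
import math

CLASSES = ("YES", "NO")
WEIGHT = {"YES": 274, "NO": 326}  # weight sums from the model output

# attribute -> class -> (mean, std. dev.)
GAUSSIAN = {
    "age": {"YES": (45.1277, 14.3018), "NO": (40.0982, 14.1018)},
    "income": {"YES": (30644.8069, 13585.1095), "NO": (24902.2958, 11640.5073)},
    "children": {"YES": (0.9453, 0.859), "NO": (1.0675, 1.1937)},
}

# attribute -> value -> class -> count as listed in the model output
COUNTS = {
    "sex": {"FEMALE": {"YES": 131, "NO": 171}, "MALE": {"YES": 145, "NO": 157}},
    "region": {"INNER_CITY": {"YES": 124, "NO": 147}, "TOWN": {"YES": 72, "NO": 103},
               "RURAL": {"YES": 47, "NO": 51}, "SUBURBAN": {"YES": 35, "NO": 29}},
    "married": {"NO": {"YES": 121, "NO": 85}, "YES": {"YES": 155, "NO": 243}},
    "car": {"NO": {"YES": 137, "NO": 169}, "YES": {"YES": 139, "NO": 159}},
    "save_act": {"NO": {"YES": 96, "NO": 92}, "YES": {"YES": 180, "NO": 236}},
    "current_act": {"NO": {"YES": 64, "NO": 83}, "YES": {"YES": 212, "NO": 245}},
    "mortgage": {"NO": {"YES": 183, "NO": 210}, "YES": {"YES": 93, "NO": 118}},
}


def log_gaussian(x, mean, std):
    """Log of the normal density N(x; mean, std)."""
    return -0.5 * ((x - mean) / std) ** 2 - math.log(std * math.sqrt(2 * math.pi))


def posterior(instance):
    """Approximate P(class | instance) under the naive Bayes model above."""
    scores = {}
    for c in CLASSES:
        score = math.log(WEIGHT[c] / sum(WEIGHT.values()))  # assumed prior
        for attr, params in GAUSSIAN.items():
            score += log_gaussian(instance[attr], *params[c])
        for attr, table in COUNTS.items():
            total = sum(by_class[c] for by_class in table.values())  # the [total] row
            score += math.log(table[instance[attr]][c] / total)
        scores[c] = score
    # Normalise in a numerically stable way.
    top = max(scores.values())
    weights = {c: math.exp(s - top) for c, s in scores.items()}
    norm = sum(weights.values())
    return {c: w / norm for c, w in weights.items()}


# First instance of the table to be classified.
first = {"age": 21, "sex": "MALE", "region": "TOWN", "income": 5014.21,
         "married": "NO", "children": 0, "car": "YES", "save_act": "YES",
         "current_act": "YES", "mortgage": "YES"}
probs = posterior(first)
print(max(probs, key=probs.get), probs)
```

For the first instance this sketch favours the NO class, mainly because the low age and income are more likely under the NO Gaussians.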
Classifier performance metric: Kappa statistic 0.2851
age  sex     region      income    married  children  car  save_act  current_act  mortgage  pep
21   MALE    TOWN        5014.21   NO       0         YES  YES       YES          YES       no
42   MALE    INNER_CITY  17390.1   YES      0         NO   YES       YES          NO        no
59   FEMALE  RURAL       35610.5   NO       2         YES  NO        NO           NO        yes
45   FEMALE  TOWN        26948     NO       0         NO   YES       YES          YES       no
58   FEMALE  TOWN        34524.9   YES      2         YES  YES       NO           NO        no
30   MALE    INNER_CITY  27808.1   NO       3         NO   NO        YES          NO        no