Experiment Report: Cluster Analysis

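Experiment principle: K-means clustering, medoid-based (K-medoids) clustering, hierarchical clustering, and EM-algorithm clustering.

Experiment topic: run a cluster analysis on the iris data set.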
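As a rough illustration of the EM idea named above, the following base-R sketch fits a two-component Gaussian mixture to a single iris column. The choice of Petal.Length, the two components, the starting values, and the fixed iteration count are all assumptions made for this example; the report itself does not specify an EM implementation.

```r
# Minimal EM sketch for a 1-D, two-component Gaussian mixture (assumed setup).
data(iris)
x <- iris$Petal.Length            # assumed feature, chosen only for illustration

k      <- 2                       # assumed number of mixture components
mu     <- c(min(x), max(x))       # assumed starting means
sigma  <- rep(sd(x), k)           # assumed starting standard deviations
weight <- rep(1 / k, k)           # equal starting mixing weights

for (iter in 1:200) {             # fixed iteration count, no convergence test
  # E-step: responsibility of each component for each observation
  dens <- sapply(1:k, function(j) weight[j] * dnorm(x, mu[j], sigma[j]))
  resp <- dens / rowSums(dens)

  # M-step: re-estimate weights, means and standard deviations
  nk     <- colSums(resp)
  weight <- nk / length(x)
  mu     <- colSums(resp * x) / nk
  sigma  <- sqrt(colSums(resp * outer(x, mu, "-")^2) / nk)
}

# Hard assignment: the component with the largest responsibility
em_cluster <- apply(resp, 1, which.max)
table(iris$Species, em_cluster)
```

A dedicated mixture-model package would normally replace this loop; the sketch only shows the alternation between the E-step and the M-step.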

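Experiment requirements: explore the basic features of the iris data, apply several clustering methods, and state the main conclusions with a brief explanation.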

Experiment topic, analysis report: data(iris)

> rm(list=ls())
> gc()
          used (Mb) gc trigger (Mb) max used (Mb)
Ncells  431730          929718        607591
Vcells  787605         8388608       1592403
> data(iris)
> data <- iris

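> head(data)
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa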
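The requirement to explore the basic features of the data can be sketched with a few base-R summaries. This is one possible set of checks, not the report's own procedure; the name species_means is introduced here only for illustration.

```r
# Exploratory sketch using only base R and the built-in iris data.
data(iris)

dim(iris)              # number of rows and columns
str(iris)              # four numeric measurements plus the Species factor
summary(iris)          # ranges and quartiles per column
table(iris$Species)    # number of flowers per species

# Mean of each measurement within each species (species_means is a
# hypothetical name used only in this sketch)
species_means <- aggregate(. ~ Species, data = iris, FUN = mean)
print(species_means)
```

Comparing the per-species means gives a first impression of which measurements separate the species before any clustering is run.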

# K-means cluster analysis
> newiris <- iris
> newiris$Species <- NULL
> (kc <- kmeans(newiris, 3))

K-means clustering with 3 clusters of sizes 62, 50, 38

Cluster means:
1

Clustering vector:
  [1] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
 [41] 2 2 2 2 2 2 2 2 2 2 1 1 3 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 3 1 1
 [81] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 3 1 3 3 3 3 1 3 3 3 3 3 3 1 1 3 3 3 3 1
[121] 3 1 3 1 3 3 1 1 3 3 3 3 3 1 3 3 3 3 1 3 3 3 1 3 3 3 1 3 3 1

Within cluster sum of squares by cluster:
[1]
 (between_SS / total_SS = %)

Available components:
[1] "cluster"      "centers"      "totss"        "withinss"     "tot.withinss"
[6] "betweenss"    "size"         "iter"         "ifault"

> table(iris$Species, kc$cluster)

              1  2  3
  setosa      0 50  0
  versicolor 48  0  2
  virginica  14  0 36

> plot(newiris[c("", "")], col = kc$cluster)
> points(kc$centers[,c("", "")], col = 1:3, pch = 8, cex=2)
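The K-means steps above can be sketched as a self-contained script. The column names in the report's plot calls are blank, so the Sepal.Length and Sepal.Width pair used below is an assumption, as are the seed, the nstart value, and the names features, km and xy.

```r
# K-means sketch on the four numeric iris columns (base R only).
data(iris)
features <- iris[, 1:4]            # drop Species, as newiris does in the report

set.seed(42)                       # assumed seed, only to make reruns repeatable
km <- kmeans(features, centers = 3, nstart = 25)   # nstart = 25 is an assumption

km$size                            # cluster sizes
km$centers                         # cluster means, one row per cluster
table(iris$Species, km$cluster)    # species against assigned cluster

# Scatter plot coloured by cluster, with centres marked by stars.
# The column pair is assumed; the report's own column names are missing.
xy <- c("Sepal.Length", "Sepal.Width")
plot(features[xy], col = km$cluster)
points(km$centers[, xy], col = 1:3, pch = 8, cex = 2)
```

Cluster labels are arbitrary, so the column order of the cross-table can differ between runs even when the partition is the same. Using several random starts makes it less likely that the run stops in a poor local optimum.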

# K-medoids cluster analysis
> ("cluster")
> library(cluster)
> <-pam(iris,3)
> table(iris$Species,$clustering)

              1  2  3
  setosa     50  0  0
  versicolor  0  3 47
  virginica   0 49  1
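A runnable sketch of the K-medoids step follows. The object name in the transcript is missing, so iris_pam is a hypothetical name. The sketch also passes only the four numeric columns to pam(), which differs from the report's pam(iris, 3) call on the whole data frame; that change is a choice made for this example.

```r
# K-medoids (PAM) sketch; assumes the cluster package is installed.
library(cluster)

data(iris)
iris_pam <- pam(iris[, 1:4], k = 3)   # iris_pam is a hypothetical name

iris_pam$medoids                      # the three representative observations
iris_pam$clusinfo                     # size and dissimilarity summary per cluster
table(iris$Species, iris_pam$clustering)
```

Unlike K-means centres, each medoid is an actual row of the data, which tends to make the result less sensitive to outliers.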

> layout(matrix(c(1,2),1,2))

> plot

[Figure: scatter plot, x-axis Sepal.Length (4.5 to 8.0)]

[Figure: cluster plot with x-axis Component 1, and silhouette plot of pam(x = iris, k = 3): n = 150, 3 clusters, cluster 2 has 52 points with average width 0.41, average silhouette width 0.57]

> layout(matrix(1))
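The two-panel plot and the average silhouette width can be reproduced along these lines. As before, iris_pam is a hypothetical name and the model is fitted on the numeric columns only, so the widths need not match the figures reported for pam(x = iris, k = 3).

```r
# Silhouette sketch for a PAM fit; assumes the cluster package is installed.
library(cluster)

data(iris)
iris_pam <- pam(iris[, 1:4], k = 3)   # hypothetical name, numeric columns only

sil <- silhouette(iris_pam)           # silhouette width of every observation
sil_summary <- summary(sil)
avg_width <- sil_summary$avg.width            # overall average width
cluster_widths <- sil_summary$clus.avg.widths # average width per cluster

# Cluster plot and silhouette plot side by side, then reset the layout
layout(matrix(c(1, 2), 1, 2))
plot(iris_pam)
layout(matrix(1))
```

Silhouette widths near 1 indicate observations that sit well inside their cluster, while values near 0 indicate observations on the border between two clusters.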


# Hierarchical clustering (hc)
> <-hclust(dist(iris[,1:4]))
> plot( , hang = -1)
> plclust( , labels = FALSE, hang = -1)
> re <- , k = 3)
> <-cutree, 3)

[Figure: cluster dendrogram of dist(iris[, 1:4]), hclust (*, "complete")]

# Use the h argument of the pruning function cutree() to output the dendrogram classes at height = 18
> sapply(unique,
+ function(g)iris$Species[==g])

[[1]]

[1] setosa setosa setosa setosa setosa setosa setosa setosa setosa setosa setosa

[12] setosa setosa setosa setosa setosa setosa setosa setosa setosa setosa setosa

[23] setosa setosa setosa setosa setosa setosa setosa setosa setosa setosa setosa

[34] setosa setosa setosa setosa setosa setosa setosa setosa setosa setosa setosa

[45] setosa setosa setosa setosa setosa setosa

Levels: setosa versicolor virginica
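The hierarchical clustering steps can be sketched as follows. The object names in the transcript are missing, so iris_hc, iris_id, members and by_height are hypothetical names, and the cut height of 4 is an arbitrary value chosen for illustration. rect.hclust() is used here to mark the three clusters on the dendrogram.

```r
# Hierarchical clustering sketch (base R only).
data(iris)
iris_hc <- hclust(dist(iris[, 1:4]))   # default linkage is "complete"

# Dendrogram with leaves aligned at the bottom and three clusters boxed
plot(iris_hc, hang = -1, labels = FALSE)
rect.hclust(iris_hc, k = 3)

# Cut the tree into three clusters and compare with the species labels
iris_id <- cutree(iris_hc, k = 3)
table(iris$Species, iris_id)

# Species of the members of each cluster, one list element per cluster
members <- lapply(sort(unique(iris_id)), function(g) iris$Species[iris_id == g])

# Cutting by height instead of by cluster count (h = 4 is arbitrary)
by_height <- cutree(iris_hc, h = 4)
table(by_height)
```

Cutting by k fixes the number of clusters, while cutting by h fixes the dissimilarity level and lets the number of clusters follow from the tree.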
