K-means clustering imputation (English)
K-means clustering is an iterative clustering algorithm. It proceeds as follows. The data are to be partitioned into K clusters, so K objects are first selected at random as the initial cluster centers. The distance between each object and each of these seed centers is then computed, and each object is assigned to its nearest center. A cluster center together with the objects assigned to it represents one cluster. After each assignment pass, every cluster center is recomputed from the objects currently in its cluster. This process is repeated until a termination condition is met. The termination condition can be that no objects (or only a minimal number) are reassigned to different clusters, that no cluster centers (or only a minimal number) change any further, or that the sum of squared errors reaches a local minimum. Clustering, more generally, is the process of dividing data into multiple categories according to the "similarity" of the data.
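Translation (sentence by sentence):
The K-means clustering algorithm is an iterative clustering analysis algorithm.
Its steps are as follows: the data are to be divided into K clusters, K objects are randomly selected as the initial cluster centers, the distance between each object and each seed cluster center is computed, and each object is assigned to its nearest cluster center.
A cluster center and the objects assigned to it represent one cluster.
After each assignment of samples, the cluster center is recomputed from the objects currently in the cluster.
This process is repeated until a termination condition is met.
The termination condition can be that no objects (or only a minimal number) are reassigned to different clusters, that no cluster centers (or only a minimal number) change any further, or that the sum of squared errors reaches a local minimum.
Clustering is the process of dividing data into multiple categories according to the "similarity" of the data.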
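
The iteration described in the passage can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not a reference implementation: the function name `kmeans`, its parameters `max_iter` and `seed`, the use of Euclidean distance, and the choice to keep the old center when a cluster becomes empty are all assumptions made for the example. Of the termination conditions listed in the passage, only the first one (no object is reassigned) is used, with an iteration cap as a safeguard.

```python
import numpy as np


def kmeans(points, k, max_iter=100, seed=0):
    """Plain k-means on an (n, d) array; returns (centers, labels, sse)."""
    rng = np.random.default_rng(seed)
    points = np.asarray(points, dtype=float)
    # Randomly pick K distinct objects as the initial cluster centers.
    centers = points[rng.choice(len(points), size=k, replace=False)]
    labels = np.full(len(points), -1)

    for _ in range(max_iter):
        # Euclidean distance from every object to every center, shape (n, k).
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        new_labels = dists.argmin(axis=1)

        # Termination: no object was reassigned to a different cluster.
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels

        # Recompute each center as the mean of the objects assigned to it.
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # assumption: an empty cluster keeps its old center
                centers[j] = members.mean(axis=0)

    # Sum of squared errors: squared distance of each object to its own center.
    sse = float(((points - centers[labels]) ** 2).sum())
    return centers, labels, sse


# Example call on two well-separated groups of points (illustrative data).
data = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0],
                 [100.0, 100.0], [101.0, 101.0], [102.0, 102.0]])
centers, labels, sse = kmeans(data, k=2)
print(centers, labels, sse)
```

Because the initial centers are chosen at random, the result is in general only a local minimum of the sum of squared errors, and which cluster receives label 0 or 1 is arbitrary. A common remedy is to run the procedure from several seeds and keep the run with the smallest `sse`.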